
Projects

Advanced Reasoning Benchmark (ARB) Project

Status: In Progress

Apply advanced prompting methods and measure how well LLMs solve academic problems.

Multimodal Dataset Project

Status: In Progress

Build a dataset for training multimodal agents.

Minerva-OCR Project

Status: In Progress

Build an OCR model, comparable in quality to proprietary OCR solutions, that generates a bounding box around every word in an image and labels each box with the corresponding word.

Exploring video pretraining for sample-efficient RL (aka #video-rl)

Status: In Progress

Improve pretraining of video models for sample-efficient RL agents.

Non-English GPTs and better datasets and tokenizers (aka #polylingual)

Status: In Progress

Build higher-quality non-English datasets and tokenizers, and fine-tune non-English GPTs from a pretrained English GPT.

Learning Machine Learning (aka #learning)

Status: In Progress

Bring people up to speed on state-of-the-art (SoTA) machine learning.