Announcing ARES - our open-source Agentic Research and Evaluation Suite.
ARES is built around 3 pillars (👇 see the thread) to make reinforcement learning for code agents easy.
We’ve also found it incredibly useful for our own mechanistic interpretability research.
$1,000,000 to understand how LLMs write code.
Announcing: The Martian Interpretability Challenge.
Understanding the inner workings of LLMs is the greatest scientific challenge of our age. Let's solve it.
Apply here:
🧵👇