Announcement_preprint_2024

Check out our new preprint, “Limits to scalable evaluation at the frontier: LLM as Judge won’t beat twice the data”, now on arXiv. :sparkles: