Growing Language Models with Games

A Principled Path Towards Higher Capabilities

We designed Xent games to evaluate and train Language Models on general reasoning skills. These games elicit deep skills from Language Models, such as originality, counterfactual reasoning, critical thinking, cooperation, persuasion, anticipation, anomaly detection, and more.

Xent games form a general space of games for Language Models and are equipped with an elegant mathematical structure. In the Xent game space, agents play sophisticated games with simple rules in an LLM-driven universe; the contexts, rules, and scores are mediated by LLMs, using their probabilistic structure as the foundation.

Implicit Knowledge

Inside LLMs exists a virtually unlimited source of information: their implicit knowledge. This is the central idea at the root of the Xent game space. We access and leverage the implicit knowledge of an LLM by using it as a judge via its cross-entropy loss function (Xent). Xent games have been designed to cover all the constructive, nontrivial ways to tap that resource.
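To make the judging mechanism concrete: a text's Xent under a model is the sum of the negative log-probabilities the model assigns to each of its tokens, so predictable text scores low and surprising text scores high. A minimal sketch of that computation, using made-up toy probabilities rather than output from a real LLM:

```python
import math

def xent(token_probs):
    """Cross-entropy (in nats) of a token sequence, given the
    probability the judging model assigns to each token in context."""
    return -sum(math.log(p) for p in token_probs)

# Hypothetical per-token probabilities a model might assign:
expected = [0.9, 0.8, 0.95]    # a predictable continuation
surprising = [0.1, 0.05, 0.2]  # an original, unexpected one

# The surprising continuation has strictly higher Xent.
assert xent(surprising) > xent(expected)
```

In practice the probabilities come from an actual LLM's next-token distribution, which is what lets its implicit knowledge act as the judge.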

Xent Game Engine

We built an open source engine enabling agents to play Xent games. Agents may play Xent games for benchmarking purposes, skill development, concrete problem solving, or simply for fun! To start playing with the Xent game engine, just `pip install xent` and then `xent serve`.

Xent Game Structure

The quantitative scoring of Xent games aligns with qualities that humans seek in real-world scenarios, such as novelty, creative constraints, counterfactual relevance, and information content. Furthermore, due to their underlying mathematical structure, Xent games exhibit deep links between one another, allowing skill transfer across games. As a result, a handful of carefully curated Xent games can teach models broadly applicable skills.

Xent Solver

We have created computationally efficient solvers to find optimal moves in Xent games, yielding high-value training data. This data enables models to go from “learning to play against yourself” to “learning with hints from a master”. Thanks to the Xent game structure, this synthetic training data allows models to learn transferable skills at a rapid pace.
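As a toy illustration of what "solving" a Xent game means (the actual solvers are not described here, and the vocabulary, probabilities, and objective below are made-up assumptions): a brute-force search over a tiny unigram model, in a hypothetical game where the most surprising sequence is the optimal move:

```python
import itertools
import math

# Toy stand-in for an LM judge: unigram probabilities over a tiny
# vocabulary. (Hypothetical values; a real solver queries an LLM.)
MODEL = {"the": 0.5, "cat": 0.2, "sat": 0.2, "quark": 0.1}

def xent(tokens):
    """Cross-entropy of a token sequence under the toy model (nats)."""
    return -sum(math.log(MODEL[t]) for t in tokens)

def best_move(length):
    """Brute-force solver: among all sequences of the given length,
    return the one with the highest Xent (most surprising move)."""
    candidates = itertools.product(MODEL, repeat=length)
    return max(candidates, key=xent)

print(best_move(2))  # -> ('quark', 'quark'), the rarest pair
```

A real solver would replace exhaustive enumeration with something far more efficient, but the output has the same shape: optimal moves paired with their scores, which is exactly the "hints from a master" training data described above.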

Xent Game Benchmark

As a first product, we have constructed an open-source benchmark based on Xent games. It is transparent, extensible, uncheatable, reproducible, and interpretable: each game trace can be analyzed and visualized. The Xent game benchmark framework gives a coherent, multi-dimensional measure of how far current LLMs are along the path towards acquiring general capabilities (sometimes referred to as AGI or ASI).

Learn More About Xent Labs