Compare models

Explore how different models behave in the Beer Game.
Switch contexts to see how performance changes under stress.

Context selection

Choose the conditions under which models will be compared. Comparisons are always made within the selected context.

Scenario
Scenario sets the demand pattern: classic (steady), random (unpredictable), shock (a sudden jump), or seasonal (swings).
Classic
Visibility
Visibility describes what each tier can see: local only, adjacent neighbors, or full chain.
Local
Memory
Memory is how much history the model sees: none, short window, or full history.
None
Prompt type
Prompt type controls guidance: neutral is minimal, specific adds structured objectives.
Neutral
Game mode
Game mode sets the rules: classic Beer Game or a modern variant in SCM-Arena.
Classic
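
The five settings above can be thought of as one immutable context value. The following is a minimal sketch of that idea, not SCM-Arena's actual data model: every class, field, and string value here is a hypothetical name chosen for illustration. The defaults mirror the values shown on this page (Classic, Local, None, Neutral, Classic).

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


# All names below are hypothetical; they only mirror the options listed on the page.
class Scenario(Enum):
    CLASSIC = "classic"
    RANDOM = "random"
    SHOCK = "shock"
    SEASONAL = "seasonal"


class Visibility(Enum):
    LOCAL = "local"
    ADJACENT = "adjacent"
    FULL = "full"


class Memory(Enum):
    NONE = "none"
    SHORT = "short"
    FULL = "full"


class PromptType(Enum):
    NEUTRAL = "neutral"
    SPECIFIC = "specific"


class GameMode(Enum):
    CLASSIC = "classic"
    MODERN = "modern"


@dataclass(frozen=True)
class Context:
    """One combination of conditions; frozen so it can serve as a dict key."""
    scenario: Scenario = Scenario.CLASSIC
    visibility: Visibility = Visibility.LOCAL
    memory: Memory = Memory.NONE
    prompt_type: PromptType = PromptType.NEUTRAL
    game_mode: GameMode = GameMode.CLASSIC


def all_contexts():
    """Enumerate every combination of the five settings."""
    return [
        Context(s, v, m, p, g)
        for s, v, m, p, g in product(Scenario, Visibility, Memory, PromptType, GameMode)
    ]


default = Context()
print(default.scenario.value, len(all_contexts()))
```

Because comparisons are always made within the selected context, a frozen dataclass is convenient here: two results are comparable only when their `Context` values are equal, and the value can key a lookup of per-context results. With the options listed above there are 4 × 3 × 3 × 2 × 2 = 144 possible contexts.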
Model selection

Select 2–4 models to compare side by side.
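
The 2–4 limit could be enforced with a small check like the one below. This is a sketch under assumptions: the function name, the constants, and the model identifiers are all hypothetical, and duplicate picks are assumed to count once.

```python
MIN_MODELS = 2  # lower bound stated on the page
MAX_MODELS = 4  # upper bound stated on the page


def validate_selection(models):
    """Return the de-duplicated selection, or raise if it is outside 2-4 models."""
    # dict.fromkeys drops duplicates while preserving the order of selection.
    unique = list(dict.fromkeys(models))
    if len(unique) < MIN_MODELS:
        raise ValueError("Select at least two models to compare.")
    if len(unique) > MAX_MODELS:
        raise ValueError("Select at most four models to compare.")
    return unique


# Hypothetical model identifiers, for illustration only.
print(validate_selection(["model-a", "model-b", "model-a"]))
```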

Small (<7B)
Medium (7–20B)
Large (>20B)
Frontier (proprietary)
Closed-weight, API-only models.
Select at least two models to compare.
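
The size groups listed above could be assigned from a parameter count as sketched below. The function name, the `proprietary` flag, and the returned labels are hypothetical. The boundaries follow the page's labels, with 7B and 20B both treated as Medium; how the real page handles those edge values is an assumption.

```python
def size_tier(params_billions=None, proprietary=False):
    """Map a model to one of the four groups shown on the page.

    params_billions: parameter count in billions (may be None for
    closed-weight models, whose size is assumed to be undisclosed).
    """
    if proprietary:
        return "frontier"  # closed-weight, API-only models
    if params_billions is None:
        raise ValueError("open-weight models need a parameter count")
    if params_billions < 7:
        return "small"     # <7B
    if params_billions <= 20:
        return "medium"    # 7-20B, both ends inclusive (assumption)
    return "large"         # >20B


print(size_tier(3.8), size_tier(7), size_tier(70), size_tier(proprietary=True))
```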