16MB Language Model Parameter Golf Visualization

πŸŒοΈβ€β™€οΈ Golfer 1: gzip
πŸŒοΈβ€β™€οΈ Golfer 2: Auto-Regressive baseline, 17M parameters / 15.8MB
πŸŒοΈβ€β™€οΈ Golfer 3: Masked Diffusion Language Model, 19M parameters / 15.8MB
πŸŒοΈβ€β™€οΈ Golfer 4: SmolLM2-135M, 79MB.

BPB (bits per byte) scores each byte of the input and draws it as a colored bar; the brighter the bar, the more bits that method spent on that byte.
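As a rough illustration of what a per-byte score could look like, here is a minimal Python sketch. It is not the code behind this page. It assumes a model reports one probability per token and that a token's cost, -log2(p), is spread evenly over the bytes that token covers; the function names, the even split, and the example probabilities are all hypothetical. For gzip it only computes the whole-input average, since how this page attributes gzip's bits to individual bytes is not stated.

```python
import gzip
import math


def gzip_bits_per_byte(data: bytes) -> float:
    """Whole-input average: compressed size in bits over input size in bytes."""
    if not data:
        raise ValueError("need at least one byte")
    return 8 * len(gzip.compress(data, compresslevel=9)) / len(data)


def per_byte_bits(token_probs, token_byte_lengths):
    """Spread each token's cost, -log2(p), evenly over the bytes it covers.

    The even split is an assumption made for this sketch.
    """
    bits = []
    for p, n in zip(token_probs, token_byte_lengths):
        cost = -math.log2(p)          # bits spent on the whole token
        bits.extend([cost / n] * n)   # one entry per byte of that token
    return bits


# Hypothetical model output: three tokens covering 2, 1, and 4 bytes.
bits = per_byte_bits([0.5, 0.25, 0.0625], [2, 1, 4])
bpb = sum(bits) / len(bits)  # average bits per byte over the 7 bytes
```

Each entry of `bits` would correspond to one bar; a brighter bar means a larger entry. Note that on very short inputs the gzip average can exceed 8 bits per byte, because the gzip header and trailer are counted too.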
Generate continues a prompt under each method, with shared temperature / top-p / repetition-penalty knobs.
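For readers unfamiliar with the three knobs, the sketch below shows one common way they are combined when picking the next token. It is an assumption-laden illustration, not this page's implementation: `sample_next` and its parameters are hypothetical names, the repetition penalty follows the widely used convention of dividing positive logits and multiplying negative ones, and the order of the steps is one reasonable choice among several.

```python
import math
import random


def sample_next(logits, previous, temperature=1.0, top_p=1.0,
                repetition_penalty=1.0, rng=random):
    """Pick one token id from raw logits. Requires temperature > 0."""
    logits = list(logits)

    # Repetition penalty: make tokens that were already emitted less likely.
    for t in set(previous):
        if logits[t] > 0:
            logits[t] /= repetition_penalty
        else:
            logits[t] *= repetition_penalty

    # Temperature: values below 1 sharpen the distribution, above 1 flatten it.
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p: keep the smallest set of most likely tokens whose
    # cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


# Usage: a seeded generator makes a run repeatable.
token = sample_next([2.0, 1.0, 0.1], previous=[1], temperature=0.8,
                    top_p=0.9, repetition_penalty=1.2, rng=random.Random(0))
```

Sharing the same knob values across methods keeps the comparison fair: differences in the generated text then come from the models rather than from the sampling settings.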

- Lee Butterman, 2026
how this was made ⋙

Text to compute bits per byte from

Per-byte bits


Text prompt to generate from