The Predictability Ceiling

Salganik-style forecasting exercise: how well can any model predict which names enter the SSA top 100?

By Namesake ResearchApril 13, 2026

5.11 The Predictability Ceiling

Following Salganik et al. (MusicLab, 2006), we ask: how predictable is baby name success? Models are trained on 2004-2014 data and tested on whether a name entered the SSA top 100 during 2015-2024.

Model	AUC	PR-AUC	Brier	P@50	P@100	P@200
Baseline A: Top-500 rule	0.982	0.199	0.018	0.200	0.230	0.210
Baseline B: AR(1) rank	0.997	0.745	0.003	1.000	0.980	0.770
Full: Logistic Regression	0.999	0.881	0.008	0.980	0.980	0.900

Interpretation

The best model achieves AUC = 0.999, suggesting surprisingly high predictability. This may reflect that structural features (prior popularity, phonetic properties) constrain the possibility space more than Salganik's framework would predict.

The full model improves over the simple top-500 baseline by 0.017 AUC points, suggesting that cultural, phonetic, and adoption-dynamic features carry real predictive power beyond mere historical popularity.

Training: 58409 names from 2004-2014. Positive class: 271 names that entered top 100 during 2015-2024 (0.5% base rate).

View all research