Single Policy Evaluation
Evaluate one candidate policy against historical logs
Upload Data
CSV or Parquet format
Gating Thresholds (Advanced)
Minimum support overlap required. Lower values are more permissive. Default: 20%
Minimum Effective Sample Size (ESS). Lower values are more permissive. Default: 1000
Minimum uplift lower confidence bound (LCB) required to SHIP. Default: 0.005 (0.5%)
An uplift LCB below this value triggers BLOCK. Default: 0.005 (0.5%)
Stress test gate: maximum fraction of exposures that may be dropped before the result is marked INCONCLUSIVE. Default: 30%
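As a rough illustration of how these thresholds could combine, here is a minimal sketch. It is not the tool's actual logic: the function names, the order of the checks, and the rule that any failed overlap, ESS, or stress-test gate yields INCONCLUSIVE are all assumptions. ESS is computed with the Kish formula, a common choice for importance weights, which may differ from what the tool uses.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total) / total_sq


def gate_decision(overlap, ess, uplift_lcb, dropped_fraction,
                  min_overlap=0.20, min_ess=1000,
                  ship_lcb=0.005, block_lcb=0.005, max_dropped=0.30):
    """Hypothetical gate: returns 'SHIP', 'BLOCK', or 'INCONCLUSIVE'.

    Defaults mirror the values listed above; the combination rule is assumed.
    """
    # Assumed: data-quality gates are checked first, and any failure
    # means the estimate is not trusted enough to ship or block.
    if overlap < min_overlap or ess < min_ess or dropped_fraction > max_dropped:
        return "INCONCLUSIVE"
    # Lower confidence bound of uplift below the block threshold.
    if uplift_lcb < block_lcb:
        return "BLOCK"
    # Lower confidence bound of uplift at or above the ship threshold.
    if uplift_lcb >= ship_lcb:
        return "SHIP"
    # Only reachable if block_lcb is set lower than ship_lcb.
    return "INCONCLUSIVE"
```

With the defaults, `gate_decision(0.5, 2000, 0.01, 0.1)` passes every gate and clears the ship threshold, while lowering `overlap` to 0.1 fails the overlap gate regardless of the uplift LCB.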
Propensity Estimation (Advanced)
Warning: Estimated propensities are less reliable than true propensities from your logging policy. Interpret confidence interval (CI) and ESS results with caution. Use estimation only if true propensities are unavailable.
Methods: softmax (if a score or rating column exists) or uniform per user. Available columns are detected automatically.
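The two estimation methods can be sketched as follows. This is an illustrative approximation, not the tool's implementation: the row layout `(user_id, item_id, score)`, the `temperature` parameter, and the choice to treat each user's logged items as that user's full candidate set are all assumptions. Each `(user_id, item_id)` pair is assumed to appear once.

```python
import math
from collections import defaultdict


def softmax_propensities(rows, temperature=1.0):
    """Estimate propensities as a per-user softmax over scores.

    rows: iterable of (user_id, item_id, score) tuples (hypothetical layout).
    Returns a dict mapping (user_id, item_id) to an estimated propensity.
    """
    by_user = defaultdict(list)
    for user, item, score in rows:
        by_user[user].append((item, score))

    propensities = {}
    for user, items in by_user.items():
        # Subtract the per-user maximum for numerical stability.
        max_score = max(score for _, score in items)
        exps = [(item, math.exp((score - max_score) / temperature))
                for item, score in items]
        total = sum(value for _, value in exps)
        for item, value in exps:
            propensities[(user, item)] = value / total
    return propensities


def uniform_propensities(rows):
    """Assign each of a user's logged items the same propensity, 1 / n_items."""
    by_user = defaultdict(list)
    for user, item, _score in rows:
        by_user[user].append(item)

    propensities = {}
    for user, items in by_user.items():
        for item in items:
            propensities[(user, item)] = 1.0 / len(items)
    return propensities
```

Under both methods the estimated propensities for a given user sum to 1. A user with a single logged item receives a propensity of 1.0, which is one reason estimates of this kind deserve the caution noted in the warning above.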