IE University Bachelor's Thesis · 2025

PHANTOM

Revenue management in the age of AI agents.

Private Valuation True Demand Constraints
Written by Daniel Rösel · Advised by Alberto Martín Izquierdo

Powered by Google TPU Research Cloud.

01

The vulnerability

Repeated agent price queries collapse the Cost of Information that dynamic pricing depends on.

02

The signal

Human and agent sessions separate through transition-kernel behavior, not brittle bot flags.

03

The defense

Distributionally robust RL preserves pricing power under contaminated demand.

[Figure: PHANTOM teaser diagram connecting vulnerability, behavioral signal, and robust control]

The thesis, compressed.

Dynamic pricing extracts margin by exploiting the gap between what a platform knows and what a buyer knows. A user who browses a hotel across several sessions signals intent; the platform raises the price accordingly. That information asymmetry — the Cost of Information — is the economic engine behind session-based pricing in travel, hospitality, and e-commerce.

LLM agents break the engine. An agent conducting reconnaissance in isolated sessions accumulates zero demand signal, then routes the purchase through a clean session at the floor price. As the number of independent querying agents grows, the realizable price converges to its minimum order statistic and COI collapses to zero. This is not a future risk; it is a structural failure mode in any pricing system that treats sessions independently.

PHANTOM formalizes the failure, measures it on real human and agent interaction data, and builds a defense. We prove the COI erosion theorem, collect 29 labeled sessions (13 human, 16 agent) across hotel and airline storefronts under goal-driven tasks, learn class-specific Markov transition kernels, and train a Distributionally Robust RL pricing policy over a Wasserstein ambiguity set. Behavioral separability is statistically significant (Mann–Whitney U = 2.0, p = 0.0006). The per-session agent probability signal f(τ) feeds directly into the robust policy reward as a COI-leakage penalty.

New interaction environment of future commerce.

Users

Have new needs and means of research & acquisition.

Agents

Use browsers through computer- and browser-use agents (C/BUA) to look human and create clean sessions.

Platforms

Run standard pricing algorithms and experience revenue loss.

When agents can repeatedly query prices, realizable markup disappears.

COI = E[P] − p

Cost of Information (COI) is the expected premium that dynamic pricing earns over the reservation price: E[P] is the expected realized price and p is the reservation price. COI collapses to zero as the number of independent querying agents grows.
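The collapse can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the thesis: each agent query receives an independent quote drawn uniformly from [floor, floor + spread], the purchase is routed through the cheapest quote, and p is treated as the floor of the quote distribution. The function names and parameter values are hypothetical.

```python
import random

def expected_coi(n_agents, floor=100.0, spread=50.0, trials=5000, seed=0):
    """Monte Carlo estimate of E[min of n quotes] - floor under uniform quotes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The buyer transacts at the cheapest of n independent quotes.
        best = min(rng.uniform(floor, floor + spread) for _ in range(n_agents))
        total += best - floor
    return total / trials

def closed_form_coi(n_agents, spread=50.0):
    """For uniform quotes, the expected minimum sits spread / (n + 1) above the floor."""
    return spread / (n_agents + 1)

for n in (1, 2, 5, 20, 100):
    print(n, round(expected_coi(n), 2), round(closed_form_coi(n), 2))
```

Under the uniform assumption the premium shrinks like 1 / (n + 1), so it is already small with a few dozen independent queries. Other quote distributions would change the rate but not the direction.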

We study behavior, convert it into a control signal, and train a pricing policy that survives contamination.

01

Observe

Human participants and LLM agents complete goal-driven hotel and airline tasks. The storefront records behavior events and price quotes as timestamped trajectories.

02

Distinguish

Session paths become transition kernels. KL divergence to human and agent prototypes yields a continuous agent-probability signal.
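One way to picture this step is the sketch below. It is illustrative only: the event vocabulary, the toy sessions, the Laplace smoothing, the uniform averaging over source states, and the logistic mapping from the KL gap to a probability are all assumptions made here, not the thesis's actual pipeline.

```python
import math

STATES = ["search", "view", "price_check", "checkout"]  # hypothetical event vocabulary

def transition_kernel(sessions, smoothing=1.0):
    """Row-stochastic transition matrix from event paths, with Laplace smoothing."""
    counts = {s: {t: smoothing for t in STATES} for s in STATES}
    for path in sessions:
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1.0
    kernel = {}
    for s, row in counts.items():
        total = sum(row.values())
        kernel[s] = {t: c / total for t, c in row.items()}
    return kernel

def kernel_kl(p, q):
    """Mean over source states of KL(p[s] || q[s])."""
    per_state = [
        sum(p[s][t] * math.log(p[s][t] / q[s][t]) for t in STATES) for s in STATES
    ]
    return sum(per_state) / len(STATES)

def agent_probability(session, human_proto, agent_proto, scale=1.0):
    """Logistic squash of the KL gap: closer to the agent prototype gives a higher score."""
    k = transition_kernel([session])
    gap = kernel_kl(k, human_proto) - kernel_kl(k, agent_proto)
    return 1.0 / (1.0 + math.exp(-scale * gap))

# Toy sessions, invented for illustration only.
human_proto = transition_kernel([
    ["search", "view", "view", "price_check", "view", "checkout"],
    ["search", "view", "price_check", "view", "view", "checkout"],
])
agent_proto = transition_kernel([
    ["search", "price_check", "search", "price_check", "search", "price_check"],
    ["search", "price_check", "price_check", "search", "price_check"],
])

probe = ["search", "price_check", "search", "price_check"]
print(agent_probability(probe, human_proto, agent_proto))
```

Smoothing keeps every kernel entry positive, so the KL terms stay finite even for transitions a short session never makes. The `scale` parameter is a hypothetical knob for how sharply the KL gap maps to a probability.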

03

Defend

A contamination generator mixes human and synthetic agent trajectories. A distributionally robust RL policy optimizes price under worst-case demand shifts.
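The sketch below illustrates the idea with a deliberately simpler stand-in. It mixes sessions at contamination rate α, then picks a price by max-min grid search over a finite set of α values. It is not the Wasserstein-ball DR-RL policy from the thesis, and the linear demand curves, price grid, and function names are hypothetical.

```python
import random

def contaminate(human_sessions, agent_sessions, alpha, n, seed=0):
    """Sample n labeled sessions; each is an agent session with probability alpha."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        if rng.random() < alpha:
            cohort.append(("agent", rng.choice(agent_sessions)))
        else:
            cohort.append(("human", rng.choice(human_sessions)))
    return cohort

def buy_prob(kind, price):
    # Hypothetical linear demand; agents are assumed to be more price-sensitive.
    cap = 200.0 if kind == "human" else 120.0
    return max(0.0, 1.0 - price / cap)

def expected_revenue(price, alpha):
    """Per-session expected revenue for a cohort with agent share alpha."""
    human_part = (1.0 - alpha) * buy_prob("human", price)
    agent_part = alpha * buy_prob("agent", price)
    return price * (human_part + agent_part)

def robust_price(prices, alphas):
    """Max-min: choose the price whose worst-case revenue over alphas is largest."""
    return max(prices, key=lambda p: min(expected_revenue(p, a) for a in alphas))

prices = range(50, 151)
nominal = robust_price(prices, [0.0])                 # assumes a clean cohort
robust = robust_price(prices, [0.0, 0.2, 0.4, 0.6])   # hedges against contamination
print(nominal, robust)
```

In this toy setting the robust choice is lower than the nominal one. It gives up some revenue on a clean cohort in exchange for a better worst case when agents are present.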

Agents distort marketplace signals. PHANTOM uses behavioral distinguishability and DR–RL to preserve pricing power.

  1. We can distinguish humans from agents at the transition-kernel level. Mann–Whitney U = 2.0, p = 0.0006 across 29 labeled sessions.
  2. Revenue declines monotonically in agent-contaminated systems. Each 1.0 step of contamination α removes ~90,140 in cohort revenue (p < 10⁻⁷⁷).
  3. Distributionally robust RL preserves margin under worst-case contamination. The defended policy holds a positive COI gap over baseline across α ∈ [0, 1].
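To make the first statistic concrete, here is a minimal sketch of how a Mann–Whitney U value is computed. The scores are invented toy numbers, not thesis data. With 13 human and 16 agent sessions there are 13 × 16 = 208 cross-group pairs, so, assuming the reported value is the smaller of the two U statistics, U = 2.0 would mean almost every pair is ordered the same way.

```python
def mann_whitney_u(xs, ys):
    """Smaller of the two U statistics; ties count as half a pair."""
    u_x = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in xs
        for y in ys
    )
    u_y = len(xs) * len(ys) - u_x
    return min(u_x, u_y)

# Toy per-session scores, for illustration only.
human_scores = [0.1, 0.2, 0.3]
agent_scores = [0.25, 0.6, 0.7, 0.9]
print(mann_whitney_u(human_scores, agent_scores))
```

For a p-value, `scipy.stats.mannwhitneyu` offers exact and asymptotic methods. The pure-Python version above only shows what the statistic counts.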

Our solution can be forward-deployed to any e-commerce platform to preserve its COI.

WhoClickedIt — published on Hugging Face.

~4k rows of labeled human and agent interaction data across hotel and airline tasks. Open dataset used for training the behavioral kernels.

huggingface.co/datasets/velocitatem/whoclickedit

Full thesis.

BibTeX

@thesis{Rosel2025PHANTOM,
  title={Pricing Heuristics Against Non-human Transaction Orchestration Mechanisms},
  author={Rösel, Daniel},
  school={IE University},
  year={2025},
  address={Madrid, Spain},
  type={Bachelor's Thesis},
  note={Advisor: Alberto Mart{\'i}n Izquierdo}
}