DP7270 Strategic Experimentation with Poisson Bandits

Author(s): R Godfrey Keller, Sven Rady
Publication Date: April 2009
Keyword(s): Bayesian Learning, Differential-Difference Equation, Markov Perfect Equilibrium, Piecewise Deterministic Process, Poisson Process, Strategic Experimentation, Two-Armed Bandit
JEL(s): C73, D83, O32
Programme Areas: Industrial Organization
Link to this Page: www.cepr.org/pubs/dps/DP7270.asp.asp


We study a game of strategic experimentation with two-armed bandits where the risky arm distributes lump-sum payoffs according to a Poisson process. Its intensity is either high or low, and unknown to the players. We consider Markov perfect equilibria with beliefs as the state variable. As the belief process is piecewise deterministic, payoff functions solve differential-difference equations. There is no equilibrium where all players use cut-off strategies, and all equilibria exhibit an 'encouragement effect' relative to the single-agent optimum. We construct asymmetric equilibria in which players have symmetric continuation values at sufficiently optimistic beliefs yet take turns playing the risky arm before all experimentation stops. Owing to the encouragement effect, these equilibria Pareto dominate the unique symmetric one for sufficiently frequent turns. Rewarding the last experimenter with a higher continuation value increases the range of beliefs where players experiment, but may reduce average payoffs at more optimistic beliefs. Some equilibria exhibit an 'anticipation effect': as beliefs become more pessimistic, the continuation value of a single experimenter increases over some range because a lower belief means a shorter wait until another player takes over.
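The piecewise deterministic belief process mentioned in the abstract can be illustrated with a plain application of Bayes' rule to Poisson arrivals. The following is a minimal sketch, not the authors' model or code: the function name and the parameters p, n_arrivals, elapsed, lam_high and lam_low are hypothetical, and it assumes a single observed risky arm whose intensity is one of two known values. With no arrivals the belief drifts down smoothly as time passes; each arrival moves it up by a discrete jump.

```python
import math


def posterior_belief(p, n_arrivals, elapsed, lam_high, lam_low):
    """Posterior probability that the Poisson intensity is the high one.

    p          -- prior probability that the intensity is lam_high (assumed name)
    n_arrivals -- number of lump-sum payoffs observed
    elapsed    -- length of the observation window
    lam_high, lam_low -- the two candidate intensities (illustrative values)
    """
    # Poisson likelihood of n arrivals in the window, up to a factor
    # (elapsed**n / n!) that is common to both hypotheses and cancels.
    like_high = lam_high ** n_arrivals * math.exp(-lam_high * elapsed)
    like_low = lam_low ** n_arrivals * math.exp(-lam_low * elapsed)
    return p * like_high / (p * like_high + (1 - p) * like_low)


# No news is bad news: the belief falls while nothing arrives.
no_news = posterior_belief(0.5, 0, 1.0, lam_high=2.0, lam_low=1.0)

# An arrival is good news: the belief jumps up.
after_jump = posterior_belief(0.5, 1, 0.0, lam_high=2.0, lam_low=1.0)
```

Because the low intensity is positive in this sketch, a single arrival raises the belief but does not reveal the arm's type with certainty.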


