Hi Eric

I was remembering that if you tossed a perfectly balanced coin and got 10 or 100 heads in a row, it says absolutely nothing about future coin tosses, nor does it undermine the initial condition of a perfectly balanced coin. Bayesian or not, the next head has a 50:50 probability of occurring. If you saw a player get a long winning streak, would you really place your bet in the same way on the next spin? I would need to see lots of long runs (data points) to choose which tables to focus my efforts on, and we could then apply Bayesian or formal statistics to the problem.
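
The coin-toss claim can be checked by brute force. The sketch below assumes a coin that is known to be perfectly balanced, so every sequence of tosses is equally likely; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product

def prob_head_after_run(run_length: int) -> Fraction:
    """Exact P(next toss is H | the first run_length tosses were all H), fair coin."""
    # For a fair coin, all sequences of run_length + 1 tosses are equally likely.
    sequences = list(product("HT", repeat=run_length + 1))
    # Keep only the sequences that start with the run of heads.
    after_run = [s for s in sequences if all(t == "H" for t in s[:run_length])]
    heads_next = [s for s in after_run if s[-1] == "H"]
    return Fraction(len(heads_next), len(after_run))

print(prob_head_after_run(10))  # 1/2
```

This only restates the premise: once the coin is taken as known to be fair, no run can move the next toss away from 50:50. It says nothing about the case where the fairness of the coin is itself in question.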

I think your excellent analysis was founded on 'relative wins', which is fine by me for identifying a winning wheel, as against using 'the longer a run of success' to find one, which I'd consider very 'dodgy'.

Thanks Robert

On 12/12/16 1:56 PM, Eric Smith wrote:
Hi Robert,

I worry about mixing technical and informal claims, and making it hard for 
people with different backgrounds to track which level the conversation is 
operating at.

You said:

“A long run is itself a data point and the premise in red (below) is false.”

and the premise in red (I am not using an RTF sender) from Nick was:

“But the longer a run of success continues, the greater is the probability that 
the wheel that produces those successes is biased.”

Whether or not it is false actually depends on what “probability” one means to 
be referring to.  (I am ending many sentences with prepositions; apologies.)

It is hard to say that any “probability” inherently is “the” probability that 
the wheel produces those successes.  A wheel is just a wheel (Freud or no 
Freud); to assign it a probability requires choosing a set and measure within 
which to embed it, and that always involves other assumptions by whoever is 
making the assertion.

Under typical usages, yes, there could be some kind of “a priori” (or, in 
Bayesian-inference language, “prior”) probability that the wheel has a 
property, and yes, that probability would not be changed by testing how many 
wins it produces.

On the other hand, the Bayesian posterior probability, obtained from the prior 
(however arrived-at) and the likelihood function, would indeed put greater 
weight on the hypothesis that a given wheel is loaded the more wins, 
relatively, one had seen from it.  (This holds under yet more assumptions, of 
independence etc., to account for Roger’s comment that long runs are not the 
only possible signature of loading, and your own comments as well.)

I _assume_ that this intuition for how one updates Bayesian posteriors is 
behind Nick’s common-language premise that “the longer a run of success 
continues, the greater is the probability that the wheel that produces those 
successes is biased”.  That would certainly have been what I meant in a 
short-hand for the more laborious Bayesian formula.

For completeness, the Bayesian way of choosing a meaning for probabilities 
updated by observations is the following.

Assume two random variables, M and D, which take values respectively standing 
for a Model or hypothesis, and an observed-value or Datum.  So: hypothesis: 
this wheel and not that one is loaded.  datum: this wheel has produced 
relatively more wins.

Then, by some means, commit to what probability you assign to each value of M 
before you make an observation.  Call it P(M).  This is your Bayesian prior 
(for whether or not a certain wheel is loaded).  Maybe you admit the 
possibility that some wheel is loaded because you have heard it said, and maybe 
you even assume that precisely one wheel in the house is loaded, only you don’t 
know which one.  Lots of forms could be adopted.

Next, we assume that a true, physical property of the wheel is the probability 
distribution with which it produces wins, given whether it is or is not loaded. 
Notation is P(D|M).  This is called the _likelihood function_ for data given a 
model.

The Bayes construction is to say that the structure of unconditioned and 
conditioned probabilities requires that the same joint probability be 
arrivable-at in either of two ways:
P(D,M) = P(D|M)P(M) = P(M|D)P(D).

We have had to introduce a new “conditioned” probability, called the Bayesian 
Posterior, P(M|D), which treats the model as if it depended on the data.  But 
this is just chopping a joint space of models and data two ways, and we are 
always allowed to do that.  The unconditioned probability for data values, 
P(D), is usually expressed as the sum of P(D|M)P(M) over all values that M can 
take.  That is the probability to see that datum any way it can be produced, if 
the prior describes that world correctly.  In any case, if the prior P(M) was 
the best you can do, then P(D) is the best you can produce from it within this 

Bayesian updating says we can consistently assign this posterior probability 
as: P(M|D) = P(D|M) P(M) / P(D).
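
A minimal sketch of this update in Python, assuming just two values of M ("loaded" and "fair") and made-up numbers for the prior and the likelihood of a single win. The function name and all the numbers are illustrative assumptions.

```python
def posterior(prior: dict, likelihood: dict) -> dict:
    """Return P(M|D) for each model M, given P(M) and P(D|M) for one observed datum D."""
    # P(D) is the sum of P(D|M) P(M) over all values that M can take.
    evidence = sum(likelihood[m] * prior[m] for m in prior)
    # Bayes: P(M|D) = P(D|M) P(M) / P(D)
    return {m: likelihood[m] * prior[m] / evidence for m in prior}

# Hypothetical numbers: 10% prior weight on "loaded"; a win has
# probability 0.6 on a loaded wheel and 0.5 on a fair one.
prior = {"loaded": 0.1, "fair": 0.9}
likelihood_of_win = {"loaded": 0.6, "fair": 0.5}

post = posterior(prior, likelihood_of_win)
print(post)
```

With these numbers a single observed win moves a little weight from "fair" to "loaded", and the two posterior values still sum to one.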

P(M|D) obeys the axioms of a probability, and so is eligible to be the referent 
of Nick’s informal claim, and it would have the property he asserts, relative 
to P(M).
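
To see the property concretely, here is a small sketch of how P(M = loaded | D) moves as the observed run of wins gets longer. It assumes independent spins, a prior of 1 in 1000, and win probabilities of 0.6 (loaded) and 0.5 (fair); these values are assumptions chosen for illustration, not properties of any real wheel.

```python
def posterior_loaded(run_length, prior=0.001, p_loaded=0.6, p_fair=0.5):
    """P(loaded | run_length consecutive wins), assuming independent spins."""
    # Joint probabilities P(D|M) P(M) for each of the two models.
    joint_loaded = prior * p_loaded ** run_length
    joint_fair = (1 - prior) * p_fair ** run_length
    # Normalize by P(D), the sum over both models.
    return joint_loaded / (joint_loaded + joint_fair)

for r in (0, 10, 20, 40, 80):
    print(r, round(posterior_loaded(r), 4))
```

The prior argument never changes inside the function; only the posterior grows with the run, which is the distinction between the two readings of “probability” above.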

Of course, none of this ensures that any of these probabilities is empirically 
accurate; that requires efforts at calibrating your whole system.  Cosma 
Shalizi and Andrew Gelman have some lovely write-up of this somewhere, which 
should be easy enough to find (about standard fallacies in use of Bayesian 
updating, and what one can do to avoid committing them naively).   Nonetheless, 
Bayesian updating does have many very desirable properties of converging on 
consistent answers in the limit of long observations, and making you less 
sensitive to mistakes in your original premises (at least under many 
circumstances, including roulette wheels) than you were originally.
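
The reduced sensitivity to the starting prior can be sketched as follows. The example assumes two observers with very different priors (0.001 and 0.5) who watch the same wheel, independent spins, a hypothetical 60% observed win rate, and the same two-model setup (win probability 0.6 if loaded, 0.5 if fair). It works in log-odds to avoid numerical underflow.

```python
import math

def posterior_from_counts(prior, wins, spins, p_loaded=0.6, p_fair=0.5):
    """P(loaded | wins out of spins), computed from posterior log-odds."""
    losses = spins - wins
    log_odds = (
        math.log(prior / (1 - prior))                        # prior odds
        + wins * math.log(p_loaded / p_fair)                 # evidence from wins
        + losses * math.log((1 - p_loaded) / (1 - p_fair))   # evidence from losses
    )
    return 1 / (1 + math.exp(-log_odds))

for spins in (100, 500, 2000):
    wins = spins * 6 // 10  # hypothetical data: 60% of spins are wins
    skeptic = posterior_from_counts(0.001, wins, spins)
    agnostic = posterior_from_counts(0.5, wins, spins)
    print(spins, round(skeptic, 4), round(agnostic, 4))
```

Under these assumptions the two posteriors start far apart and draw together as the number of spins grows. The sketch does not address whether the two-model likelihood is itself well calibrated.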

To my mind, none of this grants probabilities from God, which then end 
discussions.  (So no buying into “objective Bayesianism”.)  What this all does, 
in the best of worlds, is force us to speak in complete sentences about what 
assumptions we are willing to live with to get somewhere in reasoning.

All best,


On Dec 12, 2016, at 12:44 PM, Robert J. Cordingley <rob...@cirrillian.com> wrote:

Based on https://plato.stanford.edu/entries/peirce/#dia - it looks like 
abduction (AAA-2) to me - ie developing an educated guess as to which might be 
the winning wheel. Enough funds should find it with some degree of certainty 
but that may be a different question and should use different statistics 
because the 'longest run' is a poor metric compared to say net winnings or 
average rate of winning. A long run is itself a data point and the premise in 
red (below) is false.
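
One way to compare the two metrics is a small simulation. The sketch below assumes 1000 wheels with exactly one loaded, independent spins, win probabilities of 0.5 (fair) and 0.6 (loaded), and 2000 spins per wheel; every one of these numbers is an assumption chosen for illustration.

```python
import random

def simulate(num_wheels=1000, spins=2000, p_fair=0.5, p_loaded=0.6, seed=0):
    """Spin every wheel and report which one each metric would pick."""
    rng = random.Random(seed)
    loaded_index = rng.randrange(num_wheels)
    total_wins, longest_runs = [], []
    for wheel in range(num_wheels):
        p = p_loaded if wheel == loaded_index else p_fair
        total = best = current = 0
        for _ in range(spins):
            if rng.random() < p:
                total += 1
                current += 1
                if current > best:
                    best = current
            else:
                current = 0  # a loss ends the current run
        total_wins.append(total)
        longest_runs.append(best)
    # Index of the wheel ranked first by each metric (ties go to the lowest index).
    by_wins = max(range(num_wheels), key=total_wins.__getitem__)
    by_run = max(range(num_wheels), key=longest_runs.__getitem__)
    return loaded_index, by_wins, by_run

loaded, by_wins, by_run = simulate()
print("loaded wheel:", loaded)
print("picked by total wins:", by_wins)
print("picked by longest run:", by_run)
```

With this many spins the total-win count separates the loaded wheel from the fair ones by many standard deviations, while the longest run is a much noisier statistic; changing the seed or the number of spins shows how often each metric lands on the right wheel.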

Waiting for wisdom to kick in. R

PS FWIW the article does not contain the phrase 'scientific induction' R

On 12/12/16 12:31 AM, Nick Thompson wrote:
Dear Wise Persons,
Would the following work? Imagine you enter a casino that has a thousand roulette tables. The rumor circulates around the casino that one of the wheels is loaded. So, you call up a thousand of your friends and you all work together to find the loaded wheel. Why? Because if you use your knowledge to play that wheel you will make a LOT of money. Now the problem you all face, of course, is that a run of successes is not an infallible sign of a loaded wheel. In fact, given randomness, it is assured that with a thousand players playing a thousand wheels as fast as they can, there will be random long runs of successes. But the longer a run of success continues, the greater is the probability that the wheel that produces those successes is biased. So, your team of players would be paid, on this account, for beginning to focus its play on those wheels with the longest runs. FWIW, this, I think, is Peirce’s model of scientific induction.

Nick

Nicholas S. Thompson
Emeritus Professor of Psychology and Biology
Clark University
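
The remark that random long runs are assured among many fair wheels can be made quantitative. The sketch below assumes independent spins with win probability 0.5 on a fair wheel and 999 fair wheels; the run length of 10 and the 200 spins per wheel are arbitrary illustrative choices.

```python
def prob_run_at_least(run_length, spins, p_win=0.5):
    """Exact P(at least one run of run_length consecutive wins within `spins` spins)."""
    # state[k] = probability of currently being on a streak of k wins,
    # with no run of the target length completed so far.
    state = [0.0] * run_length
    state[0] = 1.0
    done = 0.0  # probability that the target run has already occurred
    for _ in range(spins):
        new_state = [0.0] * run_length
        for k, prob in enumerate(state):
            new_state[0] += prob * (1 - p_win)    # a loss resets the streak
            if k + 1 == run_length:
                done += prob * p_win              # the streak reaches the target
            else:
                new_state[k + 1] += prob * p_win  # the streak grows by one
        state = new_state
    return done

p_one = prob_run_at_least(run_length=10, spins=200)
p_any = 1 - (1 - p_one) ** 999  # at least one of 999 fair wheels shows such a run
print(p_one, p_any)
```

Comparing `p_one` with `p_any` shows how a run that is unlikely on any single fair wheel can still be close to certain somewhere in a house of a thousand wheels.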

