Re: [Jprogramming] Tea-tasting for non-statisticians

Ian Clark Wed, 25 Feb 2015 07:58:43 -0800

On Wed, Feb 25, 2015 at 3:31 PM, Devon McCormick <[email protected]> wrote:
> Ian - setting up a framework for this sort of experiment could prove
> valuable in a wide variety of fields, not just audio testing.
>
> However, you may be overshadowing the point of the example from the book:
> it's an illustration of the relevance of prior knowledge.  As I remember
> it, not having the book in front of me, it goes something like this:
>
> There are these three examples of evidence supporting a hypothesis:
>
> 1) A lady claims to be able to distinguish, by tasting a cup of tea with
> milk, whether the tea was added before the milk or the milk before the
> tea.  You test her ten times and she is correct every time.
>
> 2) Someone claims to be able to distinguish by ear a score written by
> Mozart from one not written by Mozart.  You test him ten times and he is
> correct every time.
>
> 3) A drunken friend claims to be able to predict the result of a coin
> toss.  You test him ten times and he is correct every time.
>
> Since the empirical evidence in all three cases is identical, why would we
> not believe all three hypotheses to be equally well-proved?


Aha! -- do I detect a Bayesian? Or a disciple of de Finetti?

I recall a Real statistician taking me aside and telling me: "the
trouble with you people is that you don't think anything's happened in
Statistics in the last 20 years." -- and that was way back in 1989.

I wouldn't know how to write a program based on a Bayesian analysis to
replace my classical "H0-rejection" analysis. I suspect it's no use
even trying. De Finetti, I recall, required you to input your a-priori
subjective prediction of the outcome as a normalized distribution
before anything else.
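
That said, the crudest two-hypothesis version is short enough to sketch. What follows is only an illustration in Python, not de Finetti's procedure and not anything in my app: the function name, the assumed hit rate of a skilled subject (0.9) and the three priors are all invented for the example.

```python
def posterior_skill(prior, successes, trials, p_skill=0.9, p_guess=0.5):
    """Posterior probability of 'real skill' after the observed run.

    prior   : subjective probability of skill before testing (your input)
    p_skill : ASSUMED hit rate of a genuinely skilled subject (hypothetical)
    p_guess : hit rate under pure guessing (H0)
    """
    failures = trials - successes
    # The binomial coefficient is the same under both hypotheses, so it cancels.
    like_skill = p_skill ** successes * (1 - p_skill) ** failures
    like_guess = p_guess ** successes * (1 - p_guess) ** failures
    evidence = prior * like_skill + (1 - prior) * like_guess
    return prior * like_skill / evidence


# Same data (10 out of 10), three made-up priors for the three claims.
for label, prior in [("Mozart by ear", 0.5),
                     ("tea lady", 0.1),
                     ("drunken friend", 1e-6)]:
    post = posterior_skill(prior, 10, 10)
    print(f"{label:15s} prior={prior:<8g} posterior={post:.4f}")
```

Identical evidence, very different conclusions: the first two posteriors end up close to 1, while the coin-predictor's stays well under 1%.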

Maybe I'll just stick to a disclaimer…
"This assessment is based on the assumption that you are not an
occultist, a superhero, mentally certifiable or under the influence of
drugs or alcohol."
…so bang goes all my user audience of musicians and songwriters!
 :-D



On Wed, Feb 25, 2015 at 3:31 PM, Devon McCormick <[email protected]> wrote:
>
> On Wed, Feb 25, 2015 at 10:19 AM, Ian Clark <[email protected]> wrote:
>
>> @Chris - just what I've been looking for!
>>
>> However did I miss it? -- I guess it must have eluded my choice of
>> keywords.
>> Next time I'll use Google with site:jsoftware.com
>>
>> On Wed, Feb 25, 2015 at 2:27 PM, chris burke <[email protected]> wrote:
>> >> Has anyone built a standalone Mac app using JQt?
>> >
>> > Please see http://www.jsoftware.com/jwiki/Guides/J8%20Standalone
>> >
>> > On 25 February 2015 at 06:09, Ian Clark <[email protected]> wrote:
>> >
>> >> @Henry -- thanks for your comments. Great!
>> >>
>> >> IMO this is just the sort of discussion I would like to see aired in
>> >> public. Though maybe do the more philosophical stuff in Chat?
>> >> Ideally I would like a summary of the J community's findings
>> >> documented on a Jwiki page for wider consumption.
>> >>
>> >> Further comments in-line…
>> >>
>> >>
>> >> On Wed, Feb 25, 2015 at 4:27 AM, Henry Rich <[email protected]> wrote:
>> >> > We should take this off-group, but I'm replying in public because if I'm
>> >> > wrong I would like to be corrected (and I'm only an amateur statistician):
>> >>
>> >> That's exactly why I'm appealing to the forum too.
>> >> …To the annoyance of Real Statisticians, no doubt, because this must
>> >> be elementary stuff to them.
>> >> But Wikipedia -- which you'd expect to give simple answers to simple
>> >> questions which laypeople want to ask and need to ask -- approaches
>> >> the whole issue like a cat circling a bowl of hot porridge.
>> >> …If you're a layperson, just try working out how to score the "Lady
>> >> Tasting Tea" experiment from these pages…
>> >> https://en.wikipedia.org/wiki/Binomial_test
>> >> https://en.wikipedia.org/wiki/Binomial_distribution
>> >> https://en.wikipedia.org/wiki/Bernoulli_distribution
>> >> https://en.wikipedia.org/wiki/Bernoulli_trial
>> >> https://en.wikipedia.org/wiki/Bernoulli_process
>> >>
>> >> As a Human Factors *engineer* -- I've been a professional *user* of
>> >> hypothesis-testing but an amateur Statistician.
>> >> …Or should that be Probabilist? Or even Epistemologist?
>> >>
>> >> Plus… now I'm retired, I'm getting rusty.
>> >>
>> >> Plus… I can't find precise enough documentation of JAL verb: binomialprob.
>> >> Like… what's the semantics of the 3rd entry of (y) (styled "minimum
>> >> number of successes (s)") when y has only 3 entries? Can it be called
>> >> "minimum" any more? What I've concluded, after a bit of RTFC plus a
>> >> few idiot tests, is:
>> >>
>> >>    (binomialprob 0.5,N,s) -: (binomialprob 0.5,N,s,N)
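
If that reading is right, the three-entry call is an upper-tail probability: the chance of s or more successes in N trials. A sketch of that interpretation in Python follows; `upper_tail` is a made-up name, and this is my guess at the semantics, not a transcription of the JAL source.

```python
from math import comb


def upper_tail(p, n, s):
    """P(at least s successes in n independent trials, each with success rate p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s, n + 1))


print(upper_tail(0.5, 10, 10))  # 10 out of 10 by pure guessing: 1/1024
print(upper_tail(0.5, 10, 9))   # 9 or more out of 10: 11/1024
```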
>> >>
>> >> Plus… has this doggie got 2 tails or just 1??
>> >>
>> >>
>> >> > I think you are calling binomialprob correctly but I have some objections to
>> >> > your use of the result.
>> >> >
>> >> > 1.  I think your rejectH0 should use 1 - -: CONFIDENCE instead of
>> >> > 1-CONFIDENCE.
>> >> >
>> >> >   The question is, "How likely is a result as weird as I am seeing, assuming
>> >> > H0?"  You should not bias "weird" by assuming that weird results will be
>> >> > correct guesses - they could just as likely be incorrect guesses.  To ensure
>> >> > that you reject 95% of the purely-chance deviations of a certain size, that
>> >> > 95% should be centered around the mean, not loaded toward one side.
>> >>
>> >> The "1-tail-or-2?" question -- or so I thought at first.
>> >> But it's deeper than that. It's much more serious. Serious enough to
>> >> be the key issue for me.
>> >> Which is precisely why I want to be *sure*. Sure enough to argue my
>> >> case to a determined layperson. Not merely make an inspired guess, as
>> >> most people would in an industrial situation (…knowing no one else
>> >> knows enough statistics to dare to challenge you!)
>> >>
>> >> What I understand @Henry to be saying is: should the 5% area under the
>> >> binomial distribution curve, which sets the pass/fail threshold, be
>> >> shared equally between both tails? Even if one tail happens to be in
>> >> fairyland?
>> >>
>> >> What I mean by that last remark is…
>> >> If The Lady Tasting Tea (TLTT) gets every trial *wrong*, then she's
>> >> *not* a monkey flipping a fair coin. It's a very biased coin!
>> >> She is sending a strong signal that she can be depended upon (…with X%
>> >> confidence) to make the wrong decision.
>> >> But I don't want to credit her this as evidence to support her claim
>> >> she can tell the difference (…at least, not tell it correctly).
>> >> This is what makes TLTT different from detecting a biased coin by
>> >> repeated tosses.
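
To make the dilemma concrete, here is a small Python sketch comparing the two scorings for a subject who gets all 10 right and one who gets all 10 wrong. The function names are invented for the illustration, and the two-tailed definition used (scores at least as far from N/2 in either direction) is one common convention, assumed here rather than taken from my app.

```python
from math import comb


def pmf(n, k):
    """P(exactly k correct out of n) for a pure guesser (p = 0.5)."""
    return comb(n, k) / 2**n


def one_tailed(n, correct):
    """P(at least `correct` right): only high scores count as 'weird'."""
    return sum(pmf(n, k) for k in range(correct, n + 1))


def two_tailed(n, correct):
    """P(a score at least as far from n/2 as `correct`, in either direction)."""
    dist = abs(correct - n / 2)
    return sum(pmf(n, k) for k in range(n + 1) if abs(k - n / 2) >= dist)


for correct in (10, 0):
    print(correct, one_tailed(10, correct), two_tailed(10, correct))
```

The two-tailed figure is the same 2/1024 for both subjects, so it rejects H0 for the lady who is always wrong. The one-tailed figure for her is 1, so it never does.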
>> >>
>> >> What's to do?
>> >>
>> >>
>> >> > are there really people who think optical might be better than USB??
>> >>
>> >> Oh-ho-ho! -- yes, they can still be found.
>> >> Hi-Fi buffs have not become extinct, and the (undead?) audio industry
>> >> still lives off their lifeblood.
>> >>
>> >>
>> >> > This is digital communication, no?  44K samples/sec, 2 channels, 20 bits/sample,
>> >> > needs 2Mb/sec max out of 480Mb/sec rated USB speed... how could that not be
>> >> > enough?
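
The arithmetic is easy to check. A throwaway Python sketch, assuming the "44K" above means the CD rate of 44,100 samples/sec (an assumption on my part) and ignoring any protocol overhead:

```python
sample_rate = 44_100       # samples per second; "44K" taken as the CD rate (assumption)
channels = 2
bits_per_sample = 20
usb_rated = 480_000_000    # bits per second, the rated USB figure quoted above

audio_bps = sample_rate * channels * bits_per_sample

print(audio_bps / 1e6)         # megabits per second the audio stream needs
print(usb_rated / audio_bps)   # how many such streams fit in the rated speed
```

That comes to a little under 2 Mb/sec, a few hundred times less than the rated speed.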
>> >>
>> >> My interlocutor claims it's like the group was there, in his front
>> >> room, playing "just for him".
>> >> Now this guy is an intelligent chap, a developer of digital musical
>> >> instruments and a sound engineer as well as being an accomplished
>> >> musician. He sends me two MP3s (…yes, lossy MP3s!) to support his
>> >> claim. I drop these into Audacity and inspect the waveform at very
>> >> fine detail and I cannot for the life of me detect any difference.
>> >> So I know, as sure as God made little Apples, that I'm not going to
>> >> *hear* any difference.
>> >> But I've got lo-fi ears. In fact I'm half-deaf. Most of what I hear I
>> >> imagine. Mostly I get it right with people (I think…) But I don't know
>> >> what subliminal cues I'm using to do so. It's the "clever Hans"
>> >> effect.
>> >>
>> >> Maybe there are people who *can* tell the difference? But from my
>> >> pondering the figures, like you have, plus eyeballing the waveforms,
>> >> we're talking about magical superpowers here.
>> >>
>> >>
>> >> > It was ever thus... when I last looked at this sort of thing, 20 years back,
>> >> > the debate was whether big fat expensive cables would make a difference.
>> >> > Bob Pease, a respected analog engineer, pointed out that it was impossible,
>> >> > and James Randi had a bet that no one could discern $7000 cables from
>> >> > ordinary speaker wire, but still the non-EEs have their superstitions...]
>> >>
>> >> That's around the time my son was spending all his pocket-money on big
>> >> fat speaker cables and gold-plated jack-plugs.
>> >> Now he's teaching a Theory of Knowledge course (…yes, Epistemology!)
>> >> at a school in Hong Kong. He is greedy to get his hands on my little
>> >> program, and dispel a few lingering superstitions masquerading as
>> >> received wisdom about science.
>> >>
>> >> I want to package it up and send it to him, but I don't want to ask
>> >> him to install J on his Mac because not only will he grouse like heck
>> >> about fairy software but it will discourage him from sharing the app
>> >> with his colleagues, who share his sentiments.
>> >>
>> >> I know how to package up a standalone Mac app in J602, but J602 and my
>> >> packaged apps no longer work out-of-the-box on the Mac under Yosemite
>> >> (it's to do with 32-bit Java). Has anyone built a standalone Mac app
>> >> using JQt? If so I'd dearly love to see a monkey-see monkey-do page on
>> >> Jwiki. I'll write one myself, but it'll be a year before I can get
>> >> round to it.
>> >>
>> >>
>> >> > 2.  Why 95%?  I would fear that someone who is thinking about optical cable
>> >> > would rest uneasy with a 5-10% chance that they have not spent enough on
>> >> > quality audio.  Why not simply report, "A monkey with a coin to toss would
>> >> > do as well as you y% of the time.  Most researchers accept results as
>> >> > significant only if the monkey would do as well less than 5% of the time.
>> >> > Take more samples if you want less uncertainty."
>> >>
>> >> 95% is just for the sake of argument. 99% is there as an option. IMO
>> >> more options are neither necessary nor advisable.
>> >> The number of trials can be varied too. I'd like to offer 10 or 20
>> >> trials. But 20 gets tedious, so I'm offering the option to give up
>> >> when you're bored and score the number you've done.
>> >> (This is an app for discretionary users -- we're not paying our
>> >> subjects $10 a session.)
>> >>
>> >> Anything under 7 trials fails to reject H0 however many successes,
>> >> though that depends on the value of CONFIDENCE and how it's applied,
>> >> and then only to the tune of 1 or maybe 2 trials.
>> >> I'm finding in practice that with such a low number of trials as 10,
>> >> anything short of 100% correct is statistical hairsplitting when it
>> >> comes to rejecting H0. With 20 trials there's more leeway: you're
>> >> allowed to get 3 or 4 wrong before the app rubbishes you.
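
These thresholds can be tabulated directly. A Python sketch, assuming a one-tailed test that rejects H0 when the upper-tail probability is at or below 1-CONFIDENCE; the function names are invented here and the rejection rule is my assumption about how the comparison is applied.

```python
from math import comb


def upper_tail(n, s):
    """P(at least s correct out of n) for a pure guesser (p = 0.5)."""
    return sum(comb(n, k) for k in range(s, n + 1)) / 2**n


def min_correct(n, confidence):
    """Smallest score that rejects H0 one-tailed, or None if no score can."""
    alpha = 1 - confidence
    for s in range(n + 1):
        if upper_tail(n, s) <= alpha:
            return s
    return None


for confidence in (0.95, 0.99):
    for n in (4, 6, 7, 10, 20):
        print(confidence, n, min_correct(n, confidence))
```

Under these assumptions the 99% setting matches the figures above: nothing under 7 trials can reject H0, 10 trials need 10 out of 10, and 20 trials tolerate 4 wrong. At 95% the rule is looser, with 9 out of 10 being enough.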
>> >>
>> >> As for your wording: it's theoretically sound, but a trifle insulting.
>> >> Performing musicians have sizeable egos and wouldn't like to be rated
>> >> along with performing monkeys. :-)
>> >>
>> >> Ian
>> >> ----------------------------------------------------------------------
>> >> For information about J forums see http://www.jsoftware.com/forums.htm
>> >>
>
>
>
> --
> Devon McCormick, CFA