Re: [R] Inference for R Spam

2009-02-24 Thread Duncan Murdoch

On 24/02/2009 8:06 AM, Dieter Menne wrote:

> Dear List,
>
> I registered for the useR conference in Rennes today; half an hour after the
> confirmation I received a first "requested newsletter" from a company selling a
> product named "Inference for R".
>
> This coincidence might be spurious. Or not, depending on frequency.


I think it was just a coincidence.  I also received the spam, and am not 
yet registered for the conference.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Inference for R Spam

2009-02-24 Thread Wacek Kusnierczyk
Dieter Menne wrote:
> Dear List,
>
> I registered for the useR conference in Rennes today; half an hour after the
> confirmation I received a first "requested newsletter" from a company selling a
> product named "Inference for R".
>
> This coincidence might be spurious. Or not, depending on frequency.
>
>   


i haven't registered, but have received the newsletter too.

vQ



Re: [R] Inference for R Spam

2009-02-24 Thread ronggui
I received it too without the conference registration.

2009/2/24 Wacek Kusnierczyk:
> Dieter Menne wrote:
>> Dear List,
>>
>> I registered for the useR conference in Rennes today; half an hour after the
>> confirmation I received a first "requested newsletter" from a company selling a
>> product named "Inference for R".
>>
>> This coincidence might be spurious. Or not, depending on frequency.
>>
>>
>
>
> i haven't registered, but have received the newsletter too.
>
> vQ



-- 
HUANG Ronggui, Wincent
Tel: (00852) 3442 3832
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html



Re: [R] Inference for R Spam

2009-02-24 Thread Thomas Lumley


The same company caused a complaint about a year ago
https://stat.ethz.ch/pipermail/r-help/2008-March/157423.html

The mailing company they are using (iContact.com) claims to have a tough 
antispam policy. So does everyone, of course.

  -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle



Re: [R] Inference for R Spam

2009-02-24 Thread Simon Pickett
I got the same spam message today and I haven't signed up for anything except
this mailing list.


The software they are trying to sell doesn't seem to cover any new ground
anyway.


Simon.


- Original Message - 
From: "Thomas Lumley" 

To: "ronggui" 
Cc: 
Sent: Tuesday, February 24, 2009 1:39 PM
Subject: Re: [R] Inference for R Spam




The same company caused a complaint about a year ago
https://stat.ethz.ch/pipermail/r-help/2008-March/157423.html

The mailing company they are using (iContact.com) claims to have a tough 
antispam policy. So does everyone, of course.


  -thomas

Thomas Lumley Assoc. Professor, Biostatistics
tlum...@u.washington.edu University of Washington, Seattle



Re: [R] Inference for R Spam

2009-02-24 Thread Martin Maechler
> "TL" == Thomas Lumley 
> on Tue, 24 Feb 2009 05:39:33 -0800 (PST) writes:

TL> The same company caused a complaint about a year ago
TL> https://stat.ethz.ch/pipermail/r-help/2008-March/157423.html

thanks, Thomas.

TL> The mailing company they are using (iContact.com) claims to have a
TL> tough antispam policy. So does everyone, of course.

Indeed.  Note that the 'From' (= Reply-To) address of their spam is valid;
if everyone replies (as I did) that they would never consider
their product because of their completely fraudulent spamming

(They have the infamous

  >> You received this email because you’ve subscribed to the
  >> Inference for R newsletter or downloaded Inference for R.
  >> To unsubscribe, click the "To be removed" link below.
)
that action may help (or then may not ..).

Yes, "of course" they've probably harvested e-mails from the R
mailing lists (mirrors / repositories / ...), but at the moment,
please don't start a public discussion about the
non-appropriateness of mailing lists in the age of spam,
unless *YOU* are offering your time to provide something
"uniformly better" (i.e., for those who want, it *must* work via
e-mail exclusively, not via compulsory web-browser clicking).

Martin Maechler, 
ETH Zurich



Re: [R] Inference for R Spam

2009-02-24 Thread Tony Breyal
Cheers for that information; I've just registered for the useR meeting
in London and then about 10 minutes later got that same bit of spam
too which made me a wee bit suspicious.

On 24 Feb, 13:39, Thomas Lumley  wrote:
> The same company caused a complaint about a year ago
> https://stat.ethz.ch/pipermail/r-help/2008-March/157423.html
>
> The mailing company they are using (iContact.com) claims to have a tough 
> antispam policy. So does everyone, of course.
>
>        -thomas
>
> Thomas Lumley                   Assoc. Professor, Biostatistics
> tlum...@u.washington.edu     University of Washington, Seattle
>


Re: [R] Inference for R Spam

2009-02-24 Thread Gabor Grothendieck
On Tue, Feb 24, 2009 at 9:58 AM, Martin Maechler
 wrote:
>> "TL" == Thomas Lumley 
>>     on Tue, 24 Feb 2009 05:39:33 -0800 (PST) writes:
>
>    TL> The same company caused a complaint about a year ago
>    TL> https://stat.ethz.ch/pipermail/r-help/2008-March/157423.html
>
> thanks, Thomas.
>
>    TL> The mailing company they are using (iContact.com) claims to have a
>    TL> tough antispam policy. So does everyone, of course.
>
> Indeed.  Note that the 'From' (= Reply-To) address of their spam is valid;
> if everyone replies (as I did) that they would never consider
> their product because of their completely fraudulent spamming
>
> (They have the infamous
>
>  >> You received this email because you’ve subscribed to the
>  >> Inference for R newsletter or downloaded Inference for R.
>  >> To unsubscribe, click the "To be removed" link below.
> )
> that action may help (or then may not ..).
>
> Yes, "of course" they've probably harvested e-mails from the R
> mailing lists (mirrors / repositories / ...), but at the moment,
> please don't start a public discussion about the
> non-appropriateness of mailing lists in the age of spam,
> unless *YOU* are offering your time to provide something
> "uniformly better" (i.e., for those who want, it *must* work via
> e-mail exclusively, not via compulsory web-browser clicking).

Perhaps a separate list could be set up for commercial messages
to keep the main list free while providing an outlet for them.



Re: [R] Inference for R Spam

2009-02-24 Thread Albyn Jones
I also received the spam, and have not registered for the conference.
I decided to do the noble experiment, and used their web interface to
unsubscribe from the newsletter to which they claim I had subscribed,
and for good measure added my name to their "do not contact" list.

albyn

On Tue, Feb 24, 2009 at 01:06:11PM +, Dieter Menne wrote:
> Dear List,
> 
> I registered for the useR conference in Rennes today; half an hour after the
> confirmation I received a first "requested newsletter" from a company selling a
> product named "Inference for R".
> 
> This coincidence might be spurious. Or not, depending on frequency.
> 
> Dieter


Re: [R] Inference for R Spam

2009-02-24 Thread Dieter Menne
Tony Breyal writes:

>Cheers for that information; I've just registered for the useR meeting
>in London and then about 10 minutes later got that same bit of spam
>too which made me a wee bit suspicious.

Welcome to the Fooled by Randomness society. 2:0 is a bit away from
significance.  But don't tell this to football/soccer fans.

And, since my son asked me and I am basketball ignorant: Why are
basketball scores mostly much too close to equality? The arguments (lose
power when leading) might suggest that 2:0 might not be significant,
but relevant. I tend to argue the other way round though, in medical
statistics.
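
As a back-of-the-envelope check on the 2:0 remark (an illustration only, not part of the original post): if a 2:0 result is treated as two wins in two independent fair trials, the chance of such a clean sweep can be computed exactly. The fair-coin null and the helper name `one_sided_tail` are assumptions made purely for this sketch.

```python
from math import comb

def one_sided_tail(wins: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= wins) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(wins, trials + 1))

# Two out of two under the assumed fair-coin null.
p_two_nil = one_sided_tail(2, 2)
# A clean sweep needs five trials before it drops below the usual 0.05.
p_four_nil = one_sided_tail(4, 4)
p_five_nil = one_sided_tail(5, 5)
print(p_two_nil, p_four_nil, p_five_nil)
```

Under these assumptions 2:0 has a tail probability of 0.25, so it is indeed "a bit away" from the conventional 0.05 threshold.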

Dieter



Re: [R] Inference for R Spam

2009-02-24 Thread Ted Harding
On 24-Feb-09 17:25:53, Dieter Menne wrote:
> Tony Breyal writes:
> 
>>Cheers for that information; I've just registered for the useR meeting
>>in London and then about 10 minutes later got that same bit of spam
>>too which made me a wee bit suspicious.
> 
> Welcome to the Fooled by Randomness society. 2:0 is a bit away from
> significance.  But don't tell this to football/soccer fans.
>
> And, since my son asked me and I am basketball ignorant: Why are
> basketball scores mostly much too close to equality? The arguments
> (lose power when leading)

Or: Once you are in the lead, become much more defensive against
attacking play by the other side ...
Ted.

> might suggest that 2:0 might not be significant, 
> but relevant. I tend to argue the other way round though, in medical 
> statistics.
> 
> Dieter


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 24-Feb-09   Time: 18:26:36
-- XFMail --



Re: [R] Inference for R Spam

2009-02-24 Thread Dieter Menne

> 
> > And, since my son asked me and I am basketball ignorant: Why are 
> > basketball scores mostly much too close to equality? The arguments
> > (lose power when leading)


Ted Harding writes:
 
> Or: Once you are in the lead, become much more defensive against
> attacking play by the other side ...
> Ted.
> 
> > might suggest that 2:0 might not be significant,  but relevant. 

Well, that's approximately the normal way of reasoning. But wouldn't 
that mean that 2:0 is also far more significant than one would expect?

(currently out of office, so I can afford this loose talk that I would
not let pass with my clients)

Dieter



Re: [R] Inference for R Spam

2009-02-25 Thread Jim Lemon

Dieter Menne wrote:

> And, since my son asked me and I am basketball ignorant: Why are
> basketball scores mostly much too close to equality? The arguments
> (lose power when leading)


Characteristic of the game. Possession of the ball changes rapidly and 
the probability of scoring is much higher given possession than in a 
game like soccer. The difference between winning and losing is largely a 
matter of not missing the frequent opportunities to score, whereas in 
soccer the difference is the ability to penetrate a much more effective 
defense. To see a similar thing within a sport, compare a tennis match 
between two power servers with one between two volley experts.
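
A rough simulation sketch of this argument (an illustration only, not part of the original post): two evenly matched teams, each possession treated as an independent scoring chance. The possession counts, scoring probabilities, and the independence assumption are all invented for the sketch; the point is only that many high-probability chances make the margin small relative to the total score.

```python
import random

def mean_relative_margin(possessions, p_score, games=2000, seed=1):
    """Average |X - Y| / (X + Y) over simulated games between equal teams."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(games):
        # Each team gets the same number of independent scoring chances.
        x = sum(rng.random() < p_score for _ in range(possessions))
        y = sum(rng.random() < p_score for _ in range(possessions))
        total += abs(x - y) / max(x + y, 1)  # guard against 0:0 games
    return total / games

# Hypothetical settings: many easy chances vs. few hard ones.
basketball_like = mean_relative_margin(possessions=100, p_score=0.5)
soccer_like = mean_relative_margin(possessions=10, p_score=0.15)
print(f"basketball-like: {basketball_like:.3f}, soccer-like: {soccer_like:.3f}")
```

With these made-up settings the basketball-like games finish much closer to equality, relative to the total score, than the soccer-like ones.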


Jim



Re: [R] Inference for R Spam

2009-03-03 Thread Michael A. Miller
> "Dieter" == Dieter Menne  writes:

> And, since my son asked me and I am basketball ignorant:
> Why are basketball scores mostly much too close to
> equality? The arguments (lose power when leading) might
> suggest that 2:0 might not be significant, but relevant. I
> tend to argue the other way round though, in medical
> statistics.

Sports scores are not statistics, they are measurements (counts)
of the number of times each team scores.  There is no sampling
and vanishingly small possibility of systematic error in the
measurement.

Mike

-- 
Michael A. Miller mmill...@iupui.edu
  Department of Radiology, Indiana University School of Medicine



Re: [R] Inference for R Spam

2009-03-03 Thread Rolf Turner


On 4/03/2009, at 11:50 AM, Michael A. Miller wrote:


>> "Dieter" == Dieter Menne writes:
>
>> And, since my son asked me and I am basketball ignorant:
>> Why are basketball scores mostly much too close to
>> equality? The arguments (lose power when leading) might
>> suggest that 2:0 might not be significant, but relevant. I
>> tend to argue the other way round though, in medical
>> statistics.
>
> Sports scores are not statistics, they are measurements (counts)
> of the number of times each team scores.  There is no sampling
> and vanishingly small possibility of systematic error in the
> measurement.


I think this comment indicates a fundamental misunderstanding
of the nature of statistics in general and the concept of variability
in particular.  Measurement error is only *one possible* source
of variability and is often a minor --- or as in the case of
sports scores a non-existent --- source.

cheers,

Rolf Turner



Re: [R] Inference for R Spam

2009-03-04 Thread Michael A. Miller
> "Rolf" == Rolf Turner  writes:

> On 4/03/2009, at 11:50 AM, Michael A. Miller wrote:

>> Sports scores are not statistics, they are measurements
>> (counts) of the number of times each team scores.  There
>> is no sampling and vanishingly small possibility of
>> systematic error in the measurement.

> I think this comment indicates a fundamental
> misunderstanding of the nature of statistics in general and
> the concept of variability in particular.  Measurement
> error is only *one possible* source of variability and is
> often a minor --- or as in the case of sports scores a
> non-existent --- source.

Would you elaborate, Rolf?  I was referring to measurements, not
statistics.  Isn't calling scores statistics similar to saying
that the values of some response in an individual subject before
and after treatment are statistics?  I think they are just
measured values and that if they are measured accurately enough,
they can be precisely known.  It is in considering the
distribution of similar measurements obtained in repeated trials
that statistics come into play.

From my perspective as a baseball fan (I know I'm in Indiana and
I ought to be more of a basketball fan, but I grew up as a Cubs
watcher and still can't shake it), it doesn't seem to me that the
purpose of the score is to allow for some inference about the
overall population of teams.  It is about which team beats the
other one and entertainment (and hot dogs) for the fans.

Mike


-- 
Michael A. Miller mmill...@iupui.edu
  Department of Radiology, Indiana University School of Medicine



Re: [R] Inference for R Spam

2009-03-04 Thread Rolf Turner


On 5/03/2009, at 4:54 AM, Michael A. Miller wrote:


>> "Rolf" == Rolf Turner writes:
>
>> On 4/03/2009, at 11:50 AM, Michael A. Miller wrote:
>
>>> Sports scores are not statistics, they are measurements
>>> (counts) of the number of times each team scores.  There
>>> is no sampling and vanishingly small possibility of
>>> systematic error in the measurement.
>
>> I think this comment indicates a fundamental
>> misunderstanding of the nature of statistics in general and
>> the concept of variability in particular.  Measurement
>> error is only *one possible* source of variability and is
>> often a minor --- or as in the case of sports scores a
>> non-existent --- source.
>
> Would you elaborate, Rolf?  I was referring to measurements, not
> statistics.  Isn't calling scores statistics similar to saying
> that the values of some response in an individual subject before
> and after treatment are statistics?  I think they are just
> measured values and that if they are measured accurately enough,
> they can be precisely known.  It is in considering the
> distribution of similar measurements obtained in repeated trials
> that statistics come into play.
>
> From my perspective as a baseball fan (I know I'm in Indiana and
> I ought to be more of a basketball fan, but I grew up as a Cubs
> watcher and still can't shake it), it doesn't seem to me that the
> purpose of the score is to allow for some inference about the
> overall population of teams.  It is about which team beats the
> other one and entertainment (and hot dogs) for the fans.


Well the *purpose* of the score has nothing to do with statistics
as such, but then the ``purpose'' of many (most?) observations
to which the ideas of statistics are applied has nothing to do
with statistics either.

Technically a statistic is any function of a *sample* (sample =
a collection of random variables), including any one of these
random variables themselves.

The purpose of the subject or discipline ``statistics'' is in essence
to answer the question ``could the phenomenon we observed have arisen
simply by chance?'', or to quantify the *uncertainty* in any estimate
that we make of a quantity.

E.g., to stick with the sports idea:  We might ask ``Is there a home
field advantage?'' or ``How big is the home field advantage?'' or
``Is the home field advantage in the Premier Division (English football)
bigger than that in the equivalent division/league in Italian football?''


We would collect a sample or samples of pairs of scores

(X,Y) = (home team score, away team score)

and analyse these scores in some way, possibly on the basis of the
differences X - Y, possibly not, in order to answer these *statistical*
questions.  Note that there is *variability* or *uncertainty* in the
differences X - Y.  Even if we knew exactly that the home field advantage
was 1.576 goals, we would not be able to say that the home team would
always win by exactly 1.576 goals.  In fact the home team would *never*
win by exactly 1.576 goals! :-)
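
A minimal sketch of the home-field example (an illustration only, not part of the original post): the scores below are invented, and the normal-approximation interval is a simplifying assumption. It estimates the advantage from the differences X - Y and shows that the estimate carries uncertainty and need not match any single match result.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical (home, away) scores, invented purely for illustration.
scores = [(2, 1), (0, 0), (3, 1), (1, 2), (2, 0), (1, 1), (4, 2), (0, 1)]

diffs = [home - away for home, away in scores]   # X - Y for each match
advantage = mean(diffs)                          # point estimate of the advantage
std_err = stdev(diffs) / sqrt(len(diffs))        # standard error of that estimate

# Rough 95% interval via the normal approximation; a t-quantile would be
# more appropriate for only eight matches.
low, high = advantage - 1.96 * std_err, advantage + 1.96 * std_err
print(f"estimated advantage {advantage:.3f}, approx. 95% CI ({low:.3f}, {high:.3f})")

# Every observed difference is a whole number, so no match shows the estimate itself.
print(advantage in diffs)
```

For this made-up sample the interval straddles zero, so the data alone would not settle whether a home advantage exists.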


Sports scores are random variables.  You don't know a priori what the
scores are going to be, do you?  (Well, if you do, you must be able to
make a *lot* of money betting on games!)  After the game is over they
aren't random any more; they're just numbers.  But that applies to any
random variable.  A random variable is random only until it is observed,
then POOF! it turns into a number.

The randomness in the scores does not arise from measurement error.
This is usually the case with integer valued random variables.  An
ornithologist counting birds' nests in quadrats does not have to contend
with measurement error.  Well, some ornithologists might --- depends on
how well they were taught to count.  But the quadrat counts are random
variables (statistics) nevertheless --- until they are observed.

cheers,

Rolf Turner



Re: [R] Inference for R Spam

2009-03-04 Thread Bert Gunter

"The purpose of the subject or discipline ``statistics'' is in essence
to answer the question ``could the phenomenon we observed have arisen
simply by chance?'', or to quantify the *uncertainty* in any estimate
that we make of a quantity."


May I take strong issue with this characterization? It is far too narrow and
constraining. We are scientists first and foremost. The most important and
useful thing I do is to collaborate with other scientists to frame good
questions, design good experiments and studies, and gain insight into the
results of those experiments and studies (usually via graphical displays,
for which R is superbly suited). Blessing data with P-values is rarely of
much importance, and is often frankly irrelevant and even misleading (but
that's another rant). 

George Box said this much better than I: "The business of the statistician
is to catalyze the scientific learning process."

This is much much more than you intimate.

Cheers to all,

Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] Inference for R Spam

2009-03-04 Thread Rolf Turner


On 5/03/2009, at 12:13 PM, Bert Gunter wrote:



> "The purpose of the subject or discipline ``statistics'' is in essence
> to answer the question ``could the phenomenon we observed have arisen
> simply by chance?'', or to quantify the *uncertainty* in any estimate
> that we make of a quantity."
>
> May I take strong issue with this characterization? It is far too narrow and
> constraining. We are scientists first and foremost. The most important and
> useful thing I do is to collaborate with other scientists to frame good
> questions, design good experiments and studies, and gain insight into the
> results of those experiments and studies (usually via graphical displays,
> for which R is superbly suited). Blessing data with P-values is rarely of
> much importance, and is often frankly irrelevant and even misleading (but
> that's another rant).
>
> George Box said this much better than I: "The business of the statistician
> is to catalyze the scientific learning process."
>
> This is much much more than you intimate.


I must respectfully disagree.  Far be it from me to argue with George Box,
but nevertheless ... it may be the statistician's *business* to catalyze the
scientific learning process, but that is the business of *any* scientist.
What we bring to the process is our understanding of the essentials of
statistics, just as the chemist brings her understanding of the essentials
of chemistry and the biologist her understanding of the essentials of
biology.

The essentials of statistics consist in answering the question of ``could
this phenomenon have arisen by chance?''  This is where we contribute in a
way that other scientists do not.  They don't understand variability, the
poor dears.  (Unless they have been well taught and thereby have become
in part statisticians themselves.) They have a devastating tendency to treat
an estimated regression line as *the* regression line, the truth.  And so on.

The *way* we address the question of ``could it have happened by chance''
and the way we address the problem of quantifying variability is indeed open
to a broad range of techniques including graphics.

Note that I did not say word one about p-values.  The example I gave was
a scientific question --- is there a difference in the home field advantage
between the English Premier Division and the equivalent division or league
in Italy?  How much of a difference?  You may wish to throw in a p-value,
or you may not.  You will probably wish to look at a confidence interval.
You may wish to look at the question from the point of view of the distribution
of (home) - (away) differences, in which case graphics will most certainly
help.  But it comes down to answering the basic question.  If you have no
ability to answer such questions you are not, or might as well not be, a
statistician.

cheers,

Rolf Turner




Re: [R] Inference for R Spam

2009-03-04 Thread David Winsemius
I mostly agree with you, Rolf (and Gunter). I would challenge your  
joint use of the term "scientists". My quibble arises not regarding  
biomedical practitioners (who may be irredeemable as a group)  but  
rather regarding physicists. At least in that domain, I believe those  
domain experts are at least as likely, and possibly more so, to  
understand issues relating to randomness as are statisticians.  
Randomness has been theoretically embedded in the domain for the last  
90 years or so.


--
David Winsemius, MD




Re: [R] Inference for R Spam

2009-03-04 Thread Rolf Turner


On 5/03/2009, at 3:06 PM, David Winsemius wrote:


> I mostly agree with you, Rolf (and Gunter). I would challenge your
> joint use of the term "scientists". My quibble arises not regarding
> biomedical practitioners (who may be irredeemable as a group) but
> rather regarding physicists. At least in that domain, I believe those
> domain experts are at least as likely, and possibly more so, to
> understand issues relating to randomness as are statisticians.
> Randomness has been theoretically embedded in the domain for the last
> 90 years or so.


My impression --- and I could be wrong --- is that physicists'
understanding of randomness is very narrow and constrained.  They tend to
think along the lines of chaotic dynamical systems (although perhaps not
consciously; and they may not explicitly express themselves in this way).
They also tend to think exclusively in terms of measurement error as the
source of variability.  Which may be appropriate in the applications with
which they are concerned, but is pretty limited.  Also they're a rather
arrogant bunch.  E.g. Rutherford (???):

``If I need statistics to analyze my data I need more data.''

cheers,

Rolf Turner



Re: [R] Inference for R Spam

2009-03-04 Thread Wacek Kusnierczyk
Rolf Turner wrote:
>
> Sports scores are random variables.  You don't know a priori what the
> scores are going to be, do you?  (Well, if you do, you must be able to
> make a *lot* of money betting on games!)  After the game is over they
> aren't random any more; they're just numbers.  But that applies to any
> random variable.  A random variable is random only until it is
> observed, then POOF! it turns into a number.
>

may i respectfully disagree?

to call for a reference, [1] says (p. 26, def. 1.4.1):

a random variable is a function from sample space S into the real
numbers.

and it's a pretty standard definition.

do you really turn a *function* into a *number* by *observing the
function*?  in the example above, you have a sample space, which
consists of possible outcomes of a class of sports events.  you have a
random variable -- a function that maps from the number of goals into,
well, the number of goals. 

after a sports event, the function is no less random, and no more a
number.  you have observed an event, you have computed one realization
of the function (here's your number, which happens to be an integer) --
but the random variable does not turn to anything.

vQ


[1] Casella, Berger. Statistical Inference, 1st 1990



Re: [R] Inference for R Spam

2009-03-05 Thread Michael A. Miller
> "Rolf" == Rolf Turner  writes:

> My impression --- and I could be wrong --- is that
> physicists understanding of randomness is very narrow and
> constrained.  They tend to think along the lines of chaotic
> dynamical systems (although perhaps not consciously; and
> they may not explicitly express themselves in this way).
> They also tend to think exclusively in terms of measurement
> error as the source of variability.  Which may be
> appropriate in the applications with which they are
> concerned, but is pretty limited.  Also they're a rather
> arrogant bunch.  E.g.  Rutherford (???): ``If I need
> statistics to analyze my data I need more data.''


This is an interesting discussion all around, but as one of those
physicists I feel a need to jump back in ;-) Just as in any
multidisciplinary endeavor, much of the fun comes from bridging
communication gaps that arise from our certainty that "everyone
knows" what we mean when we say what we say.

First, I counter with a quote from my list of interesting sayings :-)

 "We must be careful not to confuse data with the abstractions
  we use to analyze them."  --- William James

I went through an interesting transition when I moved from basic
physics (medium energy nuclear/particle physics) to biomedical
applications (cardiology and then imaging sciences/radiology).
There is an important difference between physics-y statistical
analysis and biomedical-y statistical analysis that I was not
aware of before I crossed over to the biomedical side.  That my
biomedical and biostatistician colleagues didn't have the same
background didn't make their perspective invalid, just as my not
having a background in biomedical statistics didn't make me
arrogant.  That we were unaware that we were sometimes speaking
different languages made up of the same words led to some
adventures.

I had to learn two things.  One, that biomedical systems tend to
have broad distributions while many physical systems have very
narrow distributions.  Two, that physics models are based on
physics theories and that biomedical/biostats models are purely
phenomenological and only model the data - they often do not have
a basis in underlying physical theory.  Simple, but not stressed
in my statistical physics or biomedical statistics training.

Perhaps the key example is statistical mechanics, both classical
and quantum mechanical.  A fundamental physics-y concept is that
a single object has no statistical properties.  "Statistical" is
a word reserved for properties of ensembles.  Statistical
mechanics can only be applied to ensembles of objects where their
joint behavior leads to (highly) predictable results.  The
density of states for any macroscopic ensemble of like objects is
extremely sharply peaked, leading to wonderfully reliable
theoretical predictions.  Just the opposite of what we tend to
see in biomedical systems.
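
A minimal sketch of the "sharply peaked" point, assuming a toy ensemble
of independent two-state objects (the function names, the two-state
model and the value p = 0.5 are illustrative assumptions, not taken from
any of the texts mentioned in this thread).  The relative spread of the
ensemble total should shrink roughly like 1/sqrt(N) as the ensemble
grows:

```python
import math
import random


def relative_spread(n_objects, n_trials=200, p=0.5, seed=1):
    """Estimate sd/mean of the number of 'up' objects in a toy ensemble.

    Each of the n_objects is independently 'up' with probability p.
    The ensemble is re-drawn n_trials times to estimate the spread.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        total = sum(1 for _ in range(n_objects) if rng.random() < p)
        totals.append(total)
    mean = sum(totals) / n_trials
    var = sum((t - mean) ** 2 for t in totals) / (n_trials - 1)
    return math.sqrt(var) / mean


def theoretical_relative_spread(n_objects, p=0.5):
    """Binomial sd/mean: sqrt(N p (1 - p)) / (N p) = sqrt((1 - p) / (p N))."""
    return math.sqrt((1 - p) / (p * n_objects))


# Compare the simulated and the theoretical relative spread as N grows.
for n in (10, 100, 1000):
    print(n, round(relative_spread(n), 4), round(theoretical_relative_spread(n), 4))
```

A macroscopic ensemble has N of order 10^23 rather than 10^3, so the
same scaling would make the relative spread far smaller still.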

For those who are interested in a physics-y perspective, I'd
suggest taking a crack at "Statistical Methods in Experimental
Physics" (F. James) and some of the many statistical mechanics
texts out there.  My favorites are still F. Mandl's "Statistical
Physics" and K. Huang's "Statistical Mechanics," but there are
many, many more.

Another nice little book is "Observational Foundations of
Physics" by Cook.  It addresses in part the question of why
mathematics is so startlingly effective in physics.  It is a
result of the correspondence between physical processes in the
natural world and mathematical groups.  As far as I know, a
similar correspondence does not exist in the biomedical realm,
nor in many other domains.  That lack of correspondence leads to
purely phenomenological models that model the data but are not
based on underlying physical theory - all that is left is
statistical modeling.  I suspect this is the source of the sort
of statement you attributed to Rutherford.  I hear him simply
saying that we can do perfectly respectable statistical modeling
without physics, but then it is not physics.  And if our goal is
to do physics, then we ought to get back to the lab and observe
reality some more.  Which is where the fun is for many of us
scientists!

Regards, Mike

-- 
Michael A. Miller mmill...@iupui.edu
  Department of Radiology, Indiana University School of Medicine



Re: [R] Inference for R Spam

2009-03-05 Thread Rolf Turner


On 5/03/2009, at 8:48 PM, Wacek Kusnierczyk wrote:


> Rolf Turner wrote:
>>
>> Sports scores are random variables.  You don't know a priori what the
>> scores are going to be, do you?  (Well, if you do, you must be able to
>> make a *lot* of money betting on games!)  After the game is over they
>> aren't random any more; they're just numbers.  But that applies to any
>> random variable.  A random variable is random only until it is
>> observed, then POOF! it turns into a number.
>
> may i respectfully disagree?
>
> to call for a reference, [1] says (p. 26, def. 1.4.1):
>
> a random variable is a function from sample space S into the real
> numbers.
>
> and it's a pretty standard definition.
>
> do you really turn a *function* into a *number* by *observing the
> function*?  in the example above, you have a sample space, which
> consists of possible outcomes of a class of sports events.  you have a
> random variable -- a function that maps from the number of goals into,
> well, the number of goals.
>
> after a sports event, the function is no less random, and no more a
> number.  you have observed an event, you have computed one realization
> of the function (here's your number, which happens to be an integer) --
> but the random variable does not turn to anything.
>
> vQ
>
> [1] Casella, Berger. Statistical Inference, 1st 1990


I was discussing the issue from an elementary/intuitive point of view.
The rigorous mathematical definition of a random variable as a
(measurable) function from a sample (probability) space is not very
helpful to the beginner.

From the beginner's point of view it is useful to think of random
variables as being unpredictable quantities that you are *going* to
observe.  After you've observed them, you know what they are and
prediction doesn't come into it; they are thus no longer random.

From the more mathematical point of view the distinction is between the
function X : Omega |--> R (the real numbers), say, and a *particular
value* of the function X(omega).

In discussions of statistical inference the viewpoint is always shifting
backwards and forwards between the ``random sample'' X_1, ..., X_n and
the ``realized random sample'' x_1 = X_1(omega), ... x_n = X_n(omega).
Most students --- and I was one of them --- find this shifting point of
view confusing, and I think the elementary heuristic that I introduced
is helpful to many.
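
To make the distinction between the ``random sample'' and the
``realized random sample'' concrete, here is a rough sketch.  Everything
in it is a made-up illustration: the names make_X and draw_omega, the
sample size of 5 games, and the uniform scores between 0 and 4 are all
assumptions.  Each X_i is a function of an outcome omega (a whole
sequence of game scores), and each x_i = X_i(omega) is a plain number:

```python
import random

N_GAMES = 5  # hypothetical sample size n


def make_X(i):
    """Return the random variable X_i: a function that maps an outcome
    omega (a sequence of (home, away) scores) to the home-minus-away
    goal difference in game i."""
    def X_i(omega):
        home, away = omega[i]
        return home - away
    return X_i


# The "random sample" X_1, ..., X_n: functions, defined before any game is played.
random_sample = [make_X(i) for i in range(N_GAMES)]


def draw_omega(rng):
    """Draw one outcome omega; uniform scores 0..4 are a made-up distribution."""
    return [(rng.randint(0, 4), rng.randint(0, 4)) for _ in range(N_GAMES)]


rng = random.Random(2009)
omega = draw_omega(rng)

# The "realized random sample" x_i = X_i(omega): just numbers.
realized_sample = [X(omega) for X in random_sample]
print(realized_sample)

# The functions themselves are unchanged by the observation; applying them
# to a different outcome gives (in general) different numbers.
another_omega = draw_omega(rng)
print([X(another_omega) for X in random_sample])
```

In this picture both views hold at once: the X_i remain functions
after the games are played, and the x_i that get analyzed are the
numbers obtained from the one omega that was observed.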

cheers,

Rolf



Re: [R] Inference for R Spam

2009-03-05 Thread Mark Difford

Hi Rolf,

>> ... From the beginner's point of view it is useful to think of random  
>> variables ...

Who, exactly, is the beginner ? And was not Sir R. A. Fisher pretty arrogant
and fractious ? He also was highly dismissive of Sir Richard Doll's
conclusion that smoking caused cancer (himself being a smoker). Does that
make him a bad statistician, or all statisticians "bad" or arrogant ?

Regards, Mark.




Re: [R] Inference for R Spam

2009-03-05 Thread Rolf Turner


On 6/03/2009, at 10:38 AM, Mark Difford wrote:



> Hi Rolf,
>
>>> ... From the beginner's point of view it is useful to think of
>>> random variables ...
>
> Who, exactly, is the beginner ?

The OP --- well, not the OP, but the person who introduced this
line of discussion to this thread, by saying that sports scores
were not/could not be statistics --- seemed to be pretty much at
the neophyte level.

More generally there seem to be lots of subscribers to this list
who are not sophisticated mathematical statisticians and would benefit
more from the ``random quantity that you are going to observe'' pov
than from the ``measurable function on a probability space'' pov.

> And was not Sir R. A. Fisher pretty arrogant
> and fractious ?

Dunno.  Never met him! :-)

> He also was highly dismissive of Sir Richard Doll's
> conclusion that smoking caused cancer (himself being a smoker).
> Does that make him a bad statistician,

I don't ***think*** so.

> or all statisticians "bad" or arrogant ?

I can think of at least one counter-example, that being of course
my very good self! :-)

What is your point, exactly?

I asserted that in my experience physicists tend to be arrogant
(and dismissive and condescending) in respect of statistics.
That *is* my experience.  I haven't done a carefully designed
survey, but.

Many (most?) statisticians have a similar impression of the
attitudes of pure mathematicians.  That is *not* my experience.

I certainly never said that no statisticians are arrogant; some
of them may well be.  I never met one, but. :-)

cheers,

Rolf Turner



Re: [R] Inference for R Spam

2009-03-05 Thread Wacek Kusnierczyk
Rolf Turner wrote:
>
> I certainly never said that no statisticians are arrogant; some
> of them may well be.  I never met one, but. :-)

this can't be true.

vQ



Re: [R] Inference for R Spam

2009-03-05 Thread Mark Difford

Rolf Turner wrote:

>> What is your point, exactly?

My point is that you are making some very broad generalizations on the basis
of your own (perhaps limited) experience and knowledge of the history of
science. You say:

"My impression --- and I could be wrong --- is that physicists understanding
of randomness is very narrow and constrained."

Well, most informed scientists would not consider Einstein, de Broglie, and
Schrödinger, together with a host of other physicists, "guilty" of
such limited understanding of the randomness of things.

You also criticize physicists for having an approach to measurement error
that "...may be appropriate in the applications with which they are
concerned...," i.e. that is appropriate to their field of study. So what is, or
was, your point with all the fluff, stuff, and nonsense taken out?

Regards, Mark.

