Re: [R] Wilcoxon signed rank test and its requirements

Atte Tenkanen Sat, 26 Jun 2010 02:46:52 -0700

Atte Tenkanen wrote on 26.6.2010 at 5.15:

>
> Greg Snow wrote on 25.6.2010 at 21.55:
>
>> Let me see if I understand.  You actually have the data for the  
>> whole population (the entire piece) but you have some pre-defined  
>> sections that you want to see if they differ from the population,  
>> or more meaningfully they are different from a randomly selected  
>> set of measures.  Is that correct?
>>
>> If so, since you have the entire population of interest you can  
>> create the actual sampling distribution (or a good approximation  
>> of it).  Just take random samples from the population of the given  
>> size (matching the subset you are interested in) and calculate the  
>> means (or other value of interest), probably 10,000 to 1,000,000  
>> samples.  Now compare the value from your predefined subset to the  
>> set of random values you generated to see if it is in the tail or  
>> not.
>
> Let me check: do you mean doing it this way?
>
> t.test(sample(POPUL, length(SAMPLE), replace = FALSE), mu=mean 
> (SAMPLE), alt = "less")


NO, this way:

t.test(POPUL[sample(1:length(POPUL), length(SAMPLE), replace =  
FALSE)], mu=mean(SAMPLE), alt = "less")
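
For comparison, here is a minimal sketch of the resampling scheme Greg describes above, in which the observed mean is compared with the means of many random subsets rather than with a single random draw. POPUL and SAMPLE are simulated stand-ins here (the real data are not shown in this thread), and idx, n_rep, obs_mean, null_means and p_upper are names made up for illustration. The one-sided direction assumes the question is whether the chosen regions are higher than the overall mean.

```r
set.seed(42)

# Simulated stand-ins (assumptions, not the real data from this thread):
# a population of 2418 discrete-valued 'distances' with many ties,
# and a predefined region of 270 consecutive values.
POPUL  <- sample(c(0, 0.5, 1, 1.5, 2, 3), 2418, replace = TRUE)
idx    <- 1001:1270
SAMPLE <- POPUL[idx]

n_rep    <- 10000
obs_mean <- mean(SAMPLE)

# Approximate sampling distribution of the mean for random subsets
# of the same size as SAMPLE, drawn without replacement.
null_means <- replicate(n_rep, mean(sample(POPUL, length(SAMPLE))))

# One-sided Monte Carlo p-value: how often is a random subset's mean
# at least as large as the mean of the predefined region?
p_upper <- (sum(null_means >= obs_mean) + 1) / (n_rep + 1)
p_upper
```

A small p_upper would put the predefined region in the upper tail of the random-subset means. For a vector with more than one element, sample(POPUL, k) already draws k elements without replacement by default, so no explicit indexing is needed in this sketch.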

Atte

>
> Atte
>
>>
>> -- 
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.s...@imail.org
>> 801.408.8111
>>
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Atte Tenkanen
>>> Sent: Thursday, June 24, 2010 11:04 PM
>>> To: David Winsemius
>>> Cc: R mailing list
>>> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>>>
>>> The values come from this kind of process:
>>> The musical composition is segmented into so-called 'pitch-class
>>> segments', and these segments are compared with one reference set
>>> using a distance function. Only some distance values are possible.
>>> These distance values can be averaged over music bars, which produces
>>> a smoother distribution, and the resulting 'comparison curve', which
>>> shows the distances to the reference set through a musical piece, is
>>> more readable (see e.g. http://users.utu.fi/attenka/with6.jpg ), but
>>> I would prefer to use the original values.
>>>
>>> Then I want to pick only some regions from the piece and test whether
>>> the values in those regions are higher than the mean of all values.
>>>
>>> Atte
>>>
>>>> On Jun 24, 2010, at 6:58 PM, Atte Tenkanen wrote:
>>>>
>>>>> Is there anything for me?
>>>>>
>>>>> There is a lot of data, n=2418, but there are also a lot of ties.
>>>>> My sample n≈250-300
>>>>>
>>>>
>>>> I do not understand why there should be so many ties. You have not
>>>> described the measurement process or units. ( ... although you offer
>>>> a glimpse without much background later.)
>>>>
>>>>> I would like to test whether the mean of the sample differs
>>>>> significantly from the population mean.
>>>>
>>>> Why? What is the purpose of this investigation? Why should the mean
>>>> of a sample be that important?
>>>>
>>>>>
>>>>> The population looks like the attached histogram. What test
>>>>> should I use? Are there no choices?
>>>>>
>>>>> This distribution comes from a musical piece and the values are
>>>>> 'tonal distances'.
>>>>>
>>>>> http://users.utu.fi/attenka/Hist.png
>>>>
>>>> That picture does not offer much insight into the features of that
>>>> measurement. It appears to have much more structure than I would
>>>> expect for a sample from a smooth unimodal underlying population.
>>>>
>>>> --
>>>> David.
>>>>
>>>>>
>>>>> Atte
>>>>>
>>>>>> On 06/24/2010 12:40 PM, David Winsemius wrote:
>>>>>>>
>>>>>>> On Jun 23, 2010, at 9:58 PM, Atte Tenkanen wrote:
>>>>>>>
>>>>>>>> Thanks. What I meant to ask is:
>>>>>>>>
>>>>>>>> How do you test whether the data are symmetric enough?
>>>>>>>> If they are not, is it OK to use some data transformation?
>>>>>>>>
>>>>>>>> I ask because it is said:
>>>>>>>>
>>>>>>>> "The Wilcoxon signed rank test does not assume that the data are
>>>>>>>> sampled from a Gaussian distribution. However it does assume that
>>>>>>>> the data are distributed symmetrically around the median. If the
>>>>>>>> distribution is asymmetrical, the P value will not tell you much
>>>>>>>> about whether the median is different than the hypothetical value."
>>>>>>>
>>>>>>> You are being misled. Simply finding a statement on a statistics
>>>>>>> software website, even one as reputable as Graphpad (???), does not
>>>>>>> mean that it is necessarily true. My understanding (confirmed by
>>>>>>> reviewing "Nonparametric statistical methods for complete and
>>>>>>> censored data" by M. M. Desu and Damaraju Raghavarao) is that the
>>>>>>> Wilcoxon signed-rank test does not require that the underlying
>>>>>>> distributions be symmetric. The above quotation is highly
>>>>>>> inaccurate.
>>>>>>>
>>>>>>
>>>>>> To add to what David and others have said, look at the kernel that
>>>>>> the U-statistic associated with the WSR test uses: the indicator
>>>>>> (0/1) of xi + xj > 0.  So WSR tests H0: p = 0.5, where p = the
>>>>>> probability that the average of a randomly chosen pair of values is
>>>>>> positive.  [If there are ties this probably needs to be worded as
>>>>>> P[xi + xj > 0] = P[xi + xj < 0], i neq j.]
>>>>>>
>>>>>> Frank
>>>>>>
>>>>>> --
>>>>>> Frank E Harrell Jr   Professor and Chairman        School of Medicine
>>>>>>                      Department of Biostatistics   Vanderbilt University
>>>>
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
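
Frank's description of the signed-rank kernel, quoted above, can be checked numerically. The following is a minimal sketch on simulated data; x, s, v_pairs, v_test and p_hat are illustrative names, not from the thread. It assumes no ties and no zeros, in which case the statistic V reported by wilcox.test() should equal the number of pairs i <= j with xi + xj > 0.

```r
set.seed(1)

# Hypothetical skewed data, shifted so that both signs occur.
x <- rexp(30) - 0.7

# Matrix of all pairwise sums x_i + x_j.
s <- outer(x, x, "+")

# Pairs with i <= j (diagonal included) reproduce the signed-rank statistic V.
v_pairs <- sum(s[upper.tri(s, diag = TRUE)] > 0)
v_test  <- unname(wilcox.test(x)$statistic)

# Pairs with i < j give the sample estimate of p = P[x_i + x_j > 0].
p_hat <- mean(s[upper.tri(s)] > 0)

c(v_pairs = v_pairs, v_test = v_test, p_hat = p_hat)
```

The data here are deliberately asymmetric, to show that the statistic itself is defined through the sign of pairwise sums. With heavily tied data such as the tonal distances discussed in this thread, the equality of v_pairs and v_test is not guaranteed, so the sketch would need adapting.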



