Re: [R] Wilcoxon signed rank test and its requirements

Atte Tenkanen Sat, 26 Jun 2010 15:37:40 -0700

Thanks! The results were similar to what the t.test p-values show (I have
four samples).
Thank you also for using the replicate function, which I didn't know.
Until now I have just used for-loops, which are not so beautiful... I
don't know about the speed. I will have to test that.
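
A minimal sketch of how that speed check could look, assuming the same iris-based resampling as in the quoted example below. The names pop, n_rep, out_rep and out_loop are illustrative only, and the timings will vary by machine. Since replicate() is a thin wrapper around sapply(), a large difference from a preallocated for-loop should not be expected.

```r
set.seed(1)
pop   <- iris$Sepal.Width   # stand-in for the full population
n_rep <- 10000              # number of resamples (illustrative choice)

# Version 1: replicate()
t_rep <- system.time(
  out_rep <- replicate(n_rep, mean(sample(pop, 50)))
)

# Version 2: for-loop writing into a preallocated vector
t_loop <- system.time({
  out_loop <- numeric(n_rep)
  for (i in seq_len(n_rep)) {
    out_loop[i] <- mean(sample(pop, 50))
  }
})

# Elapsed seconds for each version
print(c(replicate = t_rep[["elapsed"]], for_loop = t_loop[["elapsed"]]))
```

Both versions produce a numeric vector of n_rep resampled means, so the choice is mostly one of readability.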


Atte

Greg Snow wrote on 26.6.2010 at 23.30:

> No, I mean something like this, assuming that the iris dataset
> contains the full population and we want to see if setosa has a
> different mean than the population (the null would be that there is
> no difference in sepal width between species, or that species tells
> nothing about sepal width):
>
>
> out1 <- replicate( 100000, mean(sample(iris$Sepal.Width, 50)) )
> obs1 <- mean( iris$Sepal.Width[1:50] )
>
> hist(out1, xlim=range(out1,obs1))
> abline(v=obs1)
>
> mean( out1 > obs1 )
>
>
> I don't have a reference (other than a textbook that defines
> sampling distributions).
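
The resampling comparison above can also be wrapped in a small function so that it is easy to reuse for several predefined regions. This is only a sketch under stated assumptions: resample_tail and its arguments are hypothetical names, not part of the original message, and it uses >= and <= rather than the strict > in the original code, so that ties with the observed mean count toward the tail.

```r
# Hypothetical helper: compare the mean of a predefined subset with the
# means of random subsets of the same size, drawn without replacement
# from the full population.
resample_tail <- function(population, subset_idx, n_rep = 10000) {
  n    <- length(subset_idx)
  obs  <- mean(population[subset_idx])
  sims <- replicate(n_rep, mean(sample(population, n)))
  list(
    observed = obs,
    p_upper  = mean(sims >= obs),  # share of random means at least as large
    p_lower  = mean(sims <= obs)   # share of random means at least as small
  )
}

set.seed(42)
res <- resample_tail(iris$Sepal.Width, 1:50)  # rows 1:50 are setosa
print(res)
```

A value of p_upper near zero would indicate that the predefined subset sits in the upper tail of the sampling distribution.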
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> From: Atte Tenkanen [mailto:atte...@utu.fi]
> Sent: Friday, June 25, 2010 10:08 PM
> To: Atte Tenkanen
> Cc: Greg Snow; David Winsemius; R mailing list
> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>
>
> Atte Tenkanen wrote on 26.6.2010 at 5.15:
>
>
>
> Greg Snow wrote on 25.6.2010 at 21.55:
>
>
> Let me see if I understand.  You actually have the data for the  
> whole population (the entire piece) but you have some pre-defined  
> sections that you want to see if they differ from the population,  
> or more meaningfully they are different from a randomly selected  
> set of measures.  Is that correct?
>
> If so, since you have the entire population of interest you can  
> create the actual sampling distribution (or a good approximation of  
> it).  Just take random samples from the population of the given  
> size (matching the subset you are interested in) and calculate the  
> means (or other value of interest), probably 10,000 to 1,000,000  
> samples.  Now compare the value from your predefined subset to the  
> set of random values you generated to see if it is in the tail or not.
>
> Let me check: so you mean doing it this way:
>
> t.test(sample(POPUL, length(SAMPLE), replace = FALSE), mu = mean(SAMPLE), alternative = "less")
>
> No, this way:
>
> t.test(POPUL[sample(seq_along(POPUL), length(SAMPLE), replace = FALSE)], mu = mean(SAMPLE), alternative = "less")
>
> Atte
>
>
>
> Atte
>
>
>
>
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Atte Tenkanen
> Sent: Thursday, June 24, 2010 11:04 PM
> To: David Winsemius
> Cc: R mailing list
> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>
> The values come from this kind of process:
> The musical composition is segmented into so-called 'pitch-class
> segments', and these segments are compared with one reference set
> using a distance function. Only some distance values are possible.
> These distance values can be averaged over music bars, which produces
> a smoother distribution, and the 'comparison curve' that illustrates
> the distances from the reference set through a musical piece becomes
> more readable (see e.g. http://users.utu.fi/attenka/with6.jpg ), but I
> would prefer to use the original values.
>
> Then I want to pick only some regions from the piece and test
> whether the values in those regions are higher than the mean of all
> values.
>
> Atte
>
> On Jun 24, 2010, at 6:58 PM, Atte Tenkanen wrote:
>
> Is there anything for me?
>
> There is a lot of data, n=2418, but there are also a lot of ties.
> My sample n≈250-300
>
>
> I do not understand why there should be so many ties. You have not
> described the measurement process or units (although you offer a
> glimpse, without much background, later).
>
> I would like to test whether the mean of the sample differs
> significantly from the population mean.
>
> Why? What is the purpose of this investigation? Why should the mean
> of a sample be that important?
>
>
> The histogram of the population looks like the attached histogram.
> What test should I use? No choices?
>
> This distribution comes from a musical piece and the values are
> 'tonal distances'.
>
> http://users.utu.fi/attenka/Hist.png
>
> That picture does not offer much insight into the features of that
> measurement. It appears to have much more structure than I would
> expect for a sample from a smooth unimodal underlying population.
>
> --
> David.
>
>
> Atte
>
> On 06/24/2010 12:40 PM, David Winsemius wrote:
>
> On Jun 23, 2010, at 9:58 PM, Atte Tenkanen wrote:
>
> Thanks. What I should have asked is:
>
> How do you test that the data are symmetric enough?
> If they are not, is it OK to use some data transformation?
>
> Given that it is said:
>
> "The Wilcoxon signed rank test does not assume that the data are
> sampled from a Gaussian distribution. However it does assume that
> the data are distributed symmetrically around the median. If the
> distribution is asymmetrical, the P value will not tell you much
> about whether the median is different than the hypothetical value."
>
> You are being misled. Simply finding a statement on a statistics
> software website, even one as reputable as Graphpad (???), does not
> mean that it is necessarily true. My understanding (confirmed by
> reviewing "Nonparametric statistical methods for complete and censored
> data" by M. M. Desu and Damaraju Raghavarao) is that the Wilcoxon
> signed-rank test does not require that the underlying distributions
> be symmetric. The above quotation is highly inaccurate.
>
>
> To add to what David and others have said, look at the kernel that
> the U-statistic associated with the WSR test uses: the indicator (0/1)
> of xi + xj > 0.  So WSR tests H0: p = 0.5, where p = the probability
> that the average of a randomly chosen pair of values is positive.
> [If there are ties this probably needs to be worded as
> P[xi + xj > 0] = P[xi + xj < 0], i neq j.]
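
As a rough illustration of the kernel described above (a sketch, not code from the thread): for continuous data with no ties and no zeros, the statistic V reported by wilcox.test() equals the number of pairs with i <= j for which xi + xj > 0, and the proportion of pairs with i < j and a positive sum is a simple estimate of p. The simulated data and the names x, s, v_walsh, v_test and p_hat are assumptions made for the example.

```r
set.seed(1)
x <- rnorm(20, mean = 0.3)   # illustrative continuous data, no ties

# All pairwise sums xi + xj
s <- outer(x, x, "+")

# Count of pairs with i <= j whose sum is positive
v_walsh <- sum(s[upper.tri(s, diag = TRUE)] > 0)

# Signed-rank statistic V as reported by wilcox.test()
v_test <- unname(wilcox.test(x)$statistic)

# Estimate of p = P[xi + xj > 0] over distinct pairs (i < j)
p_hat <- mean(s[upper.tri(s)] > 0)

print(c(v_walsh = v_walsh, v_test = v_test, p_hat = p_hat))
```

With tied or zero values the two counts need not agree, which is the situation the bracketed remark about ties refers to.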
>
> Frank
>
> --
> Frank E Harrell Jr   Professor and Chairman        School of Medicine
>                      Department of Biostatistics   Vanderbilt University
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

