Thanks! The results were similar to what the t.test p-values show (I have four samples). Thank you also for using the replicate function, which I didn't know. Until now I have just used for-loops, which are not so beautiful... I don't know about the speed. I'll have to test that.
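On the speed question, here is a minimal sketch of how the two approaches could be compared. It assumes the iris example from the quoted message; the names n_rep, k, out_rep and out_loop and the seed are made up for illustration, and the timings will depend on the machine.

```r
# Sketch only: same resampling done with replicate() and with a for-loop.
x     <- iris$Sepal.Width   # treated as the full population, as in the quoted example
k     <- 50                 # size of each random sample
n_rep <- 10000              # fewer repetitions than 100000 to keep the run short

# replicate() version
set.seed(42)                # arbitrary seed, only so both versions draw the same samples
t_rep <- system.time(
  out_rep <- replicate(n_rep, mean(sample(x, k)))
)

# for-loop version with a preallocated result vector
set.seed(42)
t_loop <- system.time({
  out_loop <- numeric(n_rep)
  for (i in seq_len(n_rep)) {
    out_loop[i] <- mean(sample(x, k))
  }
})

# With the same seed, both versions should give the same means
all.equal(out_rep, out_loop)

# Elapsed seconds for each version
c(replicate = t_rep[["elapsed"]], loop = t_loop[["elapsed"]])
```

Preallocating out_loop matters for the loop version; growing the vector inside the loop would make the comparison unfair to the for-loop.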
Atte

Greg Snow wrote on 26.6.2010 at 23:30:

> No, I mean something like this, assuming that the iris dataset
> contains the full population and we want to see if setosa has a
> different mean than the population (the null would be that there is
> no difference in sepal width between species, or that species tells
> nothing about sepal width):
>
> out1 <- replicate( 100000, mean(sample(iris$Sepal.Width, 50)) )
> obs1 <- mean( iris$Sepal.Width[1:50] )
>
> hist(out1, xlim=range(out1,obs1))
> abline(v=obs1)
>
> mean( out1 > obs1 )
>
> I don't have a reference (other than a textbook that defines
> sampling distributions).
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> From: Atte Tenkanen [mailto:atte...@utu.fi]
> Sent: Friday, June 25, 2010 10:08 PM
> To: Atte Tenkanen
> Cc: Greg Snow; David Winsemius; R mailing list
> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>
> Atte Tenkanen wrote on 26.6.2010 at 5:15:
>
> Greg Snow wrote on 25.6.2010 at 21:55:
>
> Let me see if I understand. You actually have the data for the
> whole population (the entire piece) but you have some pre-defined
> sections that you want to see if they differ from the population,
> or, more meaningfully, whether they are different from a randomly
> selected set of measures. Is that correct?
>
> If so, since you have the entire population of interest you can
> create the actual sampling distribution (or a good approximation of
> it). Just take random samples from the population of the given
> size (matching the subset you are interested in) and calculate the
> means (or other value of interest), probably 10,000 to 1,000,000
> samples. Now compare the value from your predefined subset to the
> set of random values you generated to see if it is in the tail or not.
>
> I check, so you mean doing it this way:
>
> t.test(sample(POPUL, length(SAMPLE), replace = FALSE), mu=mean(SAMPLE), alt = "less")
>
> NO, this way:
>
> t.test(POPUL[sample(1:length(POPUL), length(SAMPLE), replace = FALSE)], mu=mean(SAMPLE), alt = "less")
>
> Atte
>
> Atte
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Atte Tenkanen
> Sent: Thursday, June 24, 2010 11:04 PM
> To: David Winsemius
> Cc: R mailing list
> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>
> The values come from this kind of process:
> The musical composition is segmented into so-called 'pitch-class
> segments' and these segments are compared with one reference set
> with a distance function. Only some distance values are possible.
> These distance values can be averaged over bars of music, which
> produces a smoother distribution, and the 'comparison curve' that
> illustrates the distances to the reference set through a musical
> piece becomes more readable (see e.g.
> http://users.utu.fi/attenka/with6.jpg ), but I would prefer to use
> the original values.
>
> Then I want to pick only some regions from the piece and test
> whether the values in those regions are higher than the mean of all
> values.
>
> Atte
>
> On Jun 24, 2010, at 6:58 PM, Atte Tenkanen wrote:
>
> Is there anything for me?
>
> There is a lot of data, n=2418, but there are also a lot of ties.
> My sample n≈250-300
>
> I do not understand why there should be so many ties. You have not
> described the measurement process or units. ( ... although you offer
> a glimpse without much background later.)
>
> I would like to test whether the mean of the sample differs
> significantly from the population mean.
>
> Why? What is the purpose of this investigation? Why should the mean
> of a sample be that important?
>
> The histogram of the population looks like the attached histogram.
> What test should I use? No choices?
>
> This distribution comes from a musical piece and the values are
> 'tonal distances'.
>
> http://users.utu.fi/attenka/Hist.png
>
> That picture does not offer much insight into the features of that
> measurement. It appears to have much more structure than I would
> expect for a sample from a smooth unimodal underlying population.
>
> --
> David.
>
> Atte
>
> On 06/24/2010 12:40 PM, David Winsemius wrote:
>
> On Jun 23, 2010, at 9:58 PM, Atte Tenkanen wrote:
>
> Thanks. What I have to ask is: how do you test whether the data are
> symmetric enough? If they are not, is it OK to use some data
> transformation?
>
> when it is said:
>
> "The Wilcoxon signed rank test does not assume that the data are
> sampled from a Gaussian distribution. However it does assume that
> the data are distributed symmetrically around the median. If the
> distribution is asymmetrical, the P value will not tell you much
> about whether the median is different than the hypothetical value."
>
> You are being misled. Simply finding a statement on a statistics
> software website, even one as reputable as Graphpad (???), does not
> mean that it is necessarily true. My understanding (confirmed by
> reviewing "Nonparametric statistical methods for complete and
> censored data" by M. M. Desu and Damaraju Raghavarao) is that the
> Wilcoxon signed-rank test does not require that the underlying
> distributions be symmetric. The above quotation is highly inaccurate.
>
> To add to what David and others have said, look at the kernel that
> the U-statistic associated with the WSR test uses: the indicator
> (0/1) of xi + xj > 0. So WSR tests H0: p = 0.5, where p = the
> probability that the average of a randomly chosen pair of values is
> positive. [If there are ties this probably needs to be worded as
> P[xi + xj > 0] = P[xi + xj < 0], i neq j.]
>
> Frank
>
> --
> Frank E Harrell Jr   Professor and Chairman   School of Medicine
> Department of Biostatistics   Vanderbilt University
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> [[alternative HTML version deleted]]
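The kernel Frank Harrell describes in the quoted thread can be checked numerically. The following is a minimal sketch, assuming simulated tie-free data: x is a hypothetical sample drawn with rnorm, not the tonal-distance data, and the names s, p_hat, v_pairs and v_test are made up for illustration. It estimates p = P[xi + xj > 0] over distinct pairs, and also counts the positive pairwise sums including i = j, which without ties or zeros should match the V statistic reported by wilcox.test.

```r
# Sketch only: simulated, tie-free data standing in for real measurements.
set.seed(1)
x <- rnorm(20, mean = 0.3)

# All pairwise sums x[i] + x[j]; the sign of a sum is the sign of the pair's average.
s <- outer(x, x, "+")

# Estimate of p = P[xi + xj > 0] over distinct pairs (i < j).
p_hat <- mean(s[upper.tri(s)] > 0)

# Including the diagonal (i <= j) counts the positive Walsh averages,
# which equals the signed-rank statistic V when there are no ties or zeros.
v_pairs <- sum(s[upper.tri(s, diag = TRUE)] > 0)
v_test  <- unname(wilcox.test(x)$statistic)

p_hat
c(pairs = v_pairs, wilcox = v_test)
```

A value of p_hat well away from 0.5 is what the test picks up, which is why the null hypothesis is a statement about pairs of values rather than about symmetry of the distribution.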