Atte Tenkanen wrote on 26.6.2010 at 5:15:

> Greg Snow wrote on 25.6.2010 at 21:55:
>
>> Let me see if I understand. You actually have the data for the
>> whole population (the entire piece) but you have some pre-defined
>> sections that you want to see if they differ from the population,
>> or more meaningfully they are different from a randomly selected
>> set of measures. Is that correct?
>>
>> If so, since you have the entire population of interest you can
>> create the actual sampling distribution (or a good approximation
>> of it). Just take random samples from the population of the given
>> size (matching the subset you are interested in) and calculate the
>> means (or other value of interest), probably 10,000 to 1,000,000
>> samples. Now compare the value from your predefined subset to the
>> set of random values you generated to see if it is in the tail or
>> not.
>
> Let me check: so you mean doing it this way?
>
> t.test(sample(POPUL, length(SAMPLE), replace = FALSE), mu = mean(SAMPLE), alt = "less")
NO, this way:

t.test(POPUL[sample(1:length(POPUL), length(SAMPLE), replace = FALSE)], mu = mean(SAMPLE), alt = "less")

Atte

>
> Atte
>
>>
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.s...@imail.org
>> 801.408.8111
>>
>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Atte Tenkanen
>>> Sent: Thursday, June 24, 2010 11:04 PM
>>> To: David Winsemius
>>> Cc: R mailing list
>>> Subject: Re: [R] Wilcoxon signed rank test and its requirements
>>>
>>> The values come from this kind of process:
>>> The musical composition is segmented into so-called 'pitch-class
>>> segments' and these segments are compared with one reference set
>>> using a distance function. Only some distance values are possible.
>>> These distance values can be averaged over musical bars, which
>>> produces a smoother distribution, and the 'comparison curve' that
>>> illustrates the distances from the reference set through a musical
>>> piece becomes more readable (see e.g.
>>> http://users.utu.fi/attenka/with6.jpg ), but I would prefer to use
>>> the original values.
>>>
>>> Then I want to pick only some regions from the piece and test
>>> whether the values in those regions are higher than the mean of
>>> all values.
>>>
>>> Atte
>>>
>>>> On Jun 24, 2010, at 6:58 PM, Atte Tenkanen wrote:
>>>>
>>>>> Is there anything for me?
>>>>>
>>>>> There is a lot of data, n=2418, but there are also a lot of ties.
>>>>> My sample n≈250-300
>>>>>
>>>>
>>>> I do not understand why there should be so many ties. You have not
>>>> described the measurement process or units. ( ... although you
>>>> offer a glimpse without much background later.)
>>>>
>>>>> I would like to test whether the mean of the sample differs
>>>>> significantly from the population mean.
>>>>
>>>> Why? What is the purpose of this investigation?
>>>> Why should the mean of a sample be that important?
>>>>
>>>>>
>>>>> The histogram of the population looks like the attached one.
>>>>> What test should I use? No choices?
>>>>>
>>>>> This distribution comes from a musical piece and the values are
>>>>> 'tonal distances'.
>>>>>
>>>>> http://users.utu.fi/attenka/Hist.png
>>>>
>>>> That picture does not offer much insight into the features of that
>>>> measurement. It appears to have much more structure than I would
>>>> expect for a sample from a smooth unimodal underlying population.
>>>>
>>>> --
>>>> David.
>>>>
>>>>>
>>>>> Atte
>>>>>
>>>>>> On 06/24/2010 12:40 PM, David Winsemius wrote:
>>>>>>>
>>>>>>> On Jun 23, 2010, at 9:58 PM, Atte Tenkanen wrote:
>>>>>>>
>>>>>>>> Thanks. What I have to ask is this:
>>>>>>>>
>>>>>>>> how do you test that the data is symmetric enough?
>>>>>>>> If it is not, is it OK to use some data transformation?
>>>>>>>>
>>>>>>>> I ask because it is said:
>>>>>>>>
>>>>>>>> "The Wilcoxon signed rank test does not assume that the data
>>>>>>>> are sampled from a Gaussian distribution. However it does
>>>>>>>> assume that the data are distributed symmetrically around the
>>>>>>>> median. If the distribution is asymmetrical, the P value will
>>>>>>>> not tell you much about whether the median is different than
>>>>>>>> the hypothetical value."
>>>>>>>
>>>>>>> You are being misled. Simply finding a statement on a statistics
>>>>>>> software website, even one as reputable as Graphpad (???), does
>>>>>>> not mean that it is necessarily true. My understanding (confirmed
>>>>>>> by reviewing "Nonparametric statistical methods for complete and
>>>>>>> censored data" by M. M. Desu and Damaraju Raghavarao) is that the
>>>>>>> Wilcoxon signed-rank test does not require that the underlying
>>>>>>> distributions be symmetric. The above quotation is highly
>>>>>>> inaccurate.
>>>>>>>
>>>>>>
>>>>>> To add to what David and others have said, look at the kernel
>>>>>> that the U-statistic associated with the WSR test uses: the
>>>>>> indicator (0/1) of xi + xj > 0. So WSR tests H0: p = 0.5, where
>>>>>> p = the probability that the average of a randomly chosen pair
>>>>>> of values is positive. [If there are ties this probably needs to
>>>>>> be worded as P[xi + xj > 0] = P[xi + xj < 0], i neq j.]
>>>>>>
>>>>>> Frank
>>>>>>
>>>>>> --
>>>>>> Frank E Harrell Jr   Professor and Chairman   School of Medicine
>>>>>> Department of Biostatistics   Vanderbilt University
>>>>
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
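The kernel Frank Harrell describes in the quoted message can be made concrete with a few lines of R. This is only a sketch: the helper name pair_sum_props and the example values are made up for illustration and do not come from the thread. It computes, over all pairs with i neq j, the proportion of pairwise sums that are positive and the proportion that are negative, which are the sample counterparts of the two probabilities in his wording for tied data.

```r
# Hypothetical helper (not from the thread): proportions of pairs (i < j)
# whose sum x_i + x_j is positive or negative.
pair_sum_props <- function(x) {
  s <- outer(x, x, "+")   # matrix of all pairwise sums x_i + x_j
  s <- s[upper.tri(s)]    # keep each unordered pair once, excluding i == j
  c(positive = mean(s > 0), negative = mean(s < 0))
}

# Made-up values, assumed to be already centred on the hypothesised location
x <- c(-1.5, -0.5, 0.5, 1, 2, 3)
pair_sum_props(x)
```

With ties, some pairwise sums can be exactly zero, so the two proportions need not add up to one; that is why both are returned rather than only the positive share.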
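For comparison with the single t.test call above, here is a minimal sketch of the resampling procedure as Greg Snow words it in the quoted message: draw many random subsets of the same size as the subset of interest, compute the mean of each, and see where the observed mean falls among them. POPUL and SAMPLE are the names used in the thread; the helper name resample_p_value, the simulated stand-in data, and the choice of 10,000 draws are assumptions made only for illustration.

```r
# Hypothetical helper (not from the thread): Monte Carlo approximation of the
# sampling distribution of the mean for random subsets of a fixed size.
resample_p_value <- function(popul, samp, n_draws = 10000) {
  observed <- mean(samp)
  # Means of n_draws random subsets, each the same size as the subset of interest
  random_means <- replicate(
    n_draws,
    mean(sample(popul, length(samp), replace = FALSE))
  )
  # Upper-tail share: how often a random subset has a mean at least as large
  # as the observed one. The +1 terms count the observed subset as one draw.
  p_upper <- (sum(random_means >= observed) + 1) / (n_draws + 1)
  list(observed = observed, random_means = random_means, p_upper = p_upper)
}

# Made-up stand-in data: 2418 values with many ties; the levels are arbitrary
set.seed(1)
POPUL <- sample(c(0, 0.5, 1, 1.5, 2, 3), 2418, replace = TRUE)
SAMPLE <- POPUL[1:275]   # placeholder for the pre-defined regions

res <- resample_p_value(POPUL, SAMPLE)
res$p_upper
```

The upper tail is used here because the stated question is whether the chosen regions are higher than the overall mean; for the opposite direction the comparison would be random_means <= observed. A histogram of res$random_means with the observed mean marked gives the same information visually.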