Re: [ai-geostats] Re: F and T-test for samples drawn from the same p
Dear Isobel,

Thanks for the information. Perhaps I didn't explain my request clearly. What I need is to verify the ideas you suggested in the previous message. Specifically:

(1) Has anybody used the sill values (in geostatistics) to replace the variances (in classical statistics) in the F test?

(2) Has anybody used the global standard errors (in geostatistics) to replace the standard errors of the mean (in classical statistics) in the t-test?

Cheers,
Chaosheng

----- Original Message -----
From: Isobel Clark [EMAIL PROTECTED]
To: Chaosheng Zhang [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, December 06, 2004 6:03 PM
Subject: [ai-geostats] Re: F and T-test for samples drawn from the same p

There was a pretty good paper on global standard errors in the 1984 APCOM proceedings, so I am sure it should be in the major textbooks by now. Comparing the sills is very straightforward, I think.

Isobel
http://geecosse.bizland.com/books.htm

--- Chaosheng Zhang [EMAIL PROTECTED] wrote:

Isobel,

Good idea, and that's a step forward. Any references, or is it still an idea?

Cheers,
Chaosheng

----- Original Message -----
From: Isobel Clark [EMAIL PROTECTED]
To: AI Geostats mailing list [EMAIL PROTECTED]
Sent: Monday, December 06, 2004 1:07 PM
Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p

Dear all

I am having difficulty understanding why none of you want to try a spatial approach to statistics. Everyone is trying to make the 'independent' statistical tests work on spatial data. Try turning this around and look at the spatial aspect first.

(1) Testing variances: the sill on the semi-variogram (total height of model) is theoretically a good estimate for the sample variance when auto-correlation or spatial dependence is present. Do your F test on that. Yes, you still have degrees of freedom problems, but with thousands of samples the 'infinity column' should be sufficient.
(2) Testing means: the classic t-test in the presence of 'equal variances' requires the 'standard error' of each mean. For independent samples, this is s/sqrt(n). For spatially dependent samples, this is the kriging standard error for the global mean. Your only problem then is getting a global standard error.

Isobel
http://geoecosse.bizland.com/whatsnew.htm

--
* By using the ai-geostats mailing list you agree to follow its rules ( see http://www.ai-geostats.org/help_ai-geostats.htm )
* To unsubscribe to ai-geostats, send the following in the subject or in the body (plain text format) of an email message to [EMAIL PROTECTED]
Signoff ai-geostats
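As a rough illustration of the two suggestions in Isobel's message (an F ratio taken on the sills, and a comparison of means that uses a global standard error in place of s/sqrt(n)), here is a minimal Python sketch. It is not a method taken from the thread. The function names and all numbers are hypothetical. The two standard errors are combined in quadrature, which assumes the two global means are estimated independently of each other. A normal reference distribution stands in for Student's t, on the assumption of thousands of samples. How the global kriging standard error is obtained is not shown.

```python
import math
from statistics import NormalDist


def variance_ratio(sill_a, sill_b):
    """F-type ratio with the larger sill on top, so the ratio is >= 1."""
    return max(sill_a, sill_b) / min(sill_a, sill_b)


def z_for_means(mean_a, se_a, mean_b, se_b):
    """Difference of means divided by its standard error.

    Assumes the two mean estimates are independent of each other,
    so their standard errors add in quadrature.
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


def two_sided_p(z):
    """Two-sided p-value from the standard normal (large-sample assumption)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical summary figures for two data sets of 2000 samples each.
mean_a, mean_b = 10.0, 10.5
s, n = 2.0, 2000

# Classical standard error, valid only for independent samples.
se_classical = s / math.sqrt(n)

# Hypothetical global kriging standard error for each mean (made-up value).
se_global = 0.3

p_classical = two_sided_p(z_for_means(mean_a, se_classical, mean_b, se_classical))
p_spatial = two_sided_p(z_for_means(mean_a, se_global, mean_b, se_global))

# Ratio of two hypothetical sills, to be compared against an F table.
ratio = variance_ratio(4.0, 5.0)

print(p_classical, p_spatial, ratio)
```

With these made-up inputs the same difference in means is judged very differently depending on which standard error is used, which is the point at issue in the thread. The choice of degrees of freedom for the F ratio is left open here, as it is in the message.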
Re: [ai-geostats] Re: F and T-test for samples drawn from the same p
Digby

I see where you are coming from on this, but in fact the sill is composed of those pairs of samples which are independent of one another - or, at least, have reached some background correlation. This is why the sill makes a better estimate of the variance than the conventional statistical measures, since it is based on independent sampling.

Isobel
http://geoecosse.bizland.com/whatsnew.htm

--- Digby Millikan [EMAIL PROTECTED] wrote:

While you're talking about sills being the global variance, which I read everywhere, isn't the global variance actually slightly less than the sill, as the values below the range of the variogram are not included? I.e. the sill would be the global variance when you have pure nugget effect.
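The relationship between the sill and the ordinary sample variance raised in this exchange can be sketched numerically. Under second-order stationarity, the expected value of the usual sample variance (n - 1 denominator) equals the average semi-variogram value over all distinct pairs of sample locations. The Python sketch below evaluates that average for a spherical model on a one-dimensional line of samples. The model, its parameters, and the sample layouts are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
def spherical(h, sill, vrange, nugget=0.0):
    """Spherical semi-variogram; `sill` is the total sill (nugget included)."""
    h = abs(h)
    if h == 0.0:
        return 0.0
    if h >= vrange:
        return sill
    r = h / vrange
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def expected_sample_variance(coords, gamma):
    """Average of gamma over all ordered pairs of distinct sample locations.

    For a second-order stationary variable this is the expected value
    of the classical sample variance computed from those samples.
    """
    n = len(coords)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += gamma(coords[i] - coords[j])
    return total / (n * (n - 1))


sill, vrange = 1.0, 10.0

# Illustrative layouts: a line twice the range long, and one 100 times the range.
short_domain = [0.5 * i for i in range(41)]   # 0 to 20
long_domain = [5.0 * i for i in range(201)]   # 0 to 1000

var_short = expected_sample_variance(
    short_domain, lambda h: spherical(h, sill, vrange))
var_long = expected_sample_variance(
    long_domain, lambda h: spherical(h, sill, vrange))

# Pure nugget effect: the whole sill is nugget, so every pair is uncorrelated.
var_nugget = expected_sample_variance(
    short_domain, lambda h: spherical(h, sill, vrange, nugget=sill))

print(var_short, var_long, var_nugget)
```

In this sketch the expected sample variance falls below the sill whenever some pairs are closer than the range, approaches the sill as the sampled domain grows relative to the range, and equals the sill under pure nugget.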
Re: [ai-geostats] Re: F and T-test for samples drawn from the same p
Hi Isobel,

Could you explain why it would be a better estimate of the variance when independence is considered? I'd rather think that we should take the dependence into account when the overall variance is to be estimated, if there actually is dependence between values.

Or are you talking about modeling the sill value from the stabilizing tail of the experimental variogram, instead of from the calculated overall variance?

Or are we talking about different definitions of variance? I'd be concerned if I missed some point of the original definition of variance, for example that the variance should be defined with no dependence between values, or something like that. Frankly, I don't think I took the definition of variance too seriously when I was learning stats.

Meng-ying

Digby

I see where you are coming from on this, but in fact the sill is composed of those pairs of samples which are independent of one another - or, at least, have reached some background correlation. This is why the sill makes a better estimate of the variance than the conventional statistical measures, since it is based on independent sampling.

Isobel
[ai-geostats] RE: F and T-test for samples drawn from the same p
Hence my recommendation to use cross validation

Isobel
http://geoecosse.bizland.com/books.htm

--- Colin Daly [EMAIL PROTECTED] wrote:

Hi

Sorry to repeat myself - but the samples are not independent. Independence is a fundamental assumption of these types of tests - and you cannot interpret the tests if this assumption is violated. In the situation where spatial correlation exists, the true standard error is nothing like as small as the (s/sqrt(n)) that Chaosheng discusses - because the sqrt(n) depends on independence. Again, as I said before, if the data has any type of trend in it, then it is completely meaningless to try and use these tests - and with no trend but some 'ordinary' correlation, you must find a means of taking the data redundancy into account or risk getting hopelessly pessimistic results (in the sense of rejecting the null hypothesis of equal means far too often).

Consider a trivial example. A one-dimensional random function which takes constant values over intervals of length one - so, it takes the value a_0 in the interval [0,1[ then the value a_1 in the interval [1,2[ and so on (let us suppose that each a_n term is drawn at random from a gaussian distribution with the same mean and variance, for example). Next suppose you are given samples on the interval [0,2]. You spot that there seems to be a jump between [0,1[ and [1,2[ - so you test for the difference in the means. If you apply an F test you will easily find that the mean differs (and more convincingly the more samples you have drawn!). However, by construction of the random function, the mean is not different. We have been lulled into the false conclusion of differing means by assuming that all our data are independent.

Regards
Colin Daly

-----Original Message-----
From: Chaosheng Zhang [mailto:[EMAIL PROTECTED]]
Sent: Sun 12/5/2004 11:42 AM
To: [EMAIL PROTECTED]
Cc: Colin Badenhorst; Isobel Clark; Donald E. Myers
Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p

Dear all,

I'm wondering if sample size (number of samples, n) is playing a role here. Since Colin is using Excel to analyse several thousand samples, I have checked the functions of t-tests in Excel. In the Data Analysis Tools help, a function is provided for the "t-Test: Two-Sample Assuming Unequal Variances" analysis. This function is the same as those in many textbooks (there are other forms of the function). Unfortunately, I cannot find the function for assuming equal variances in Excel, but I assume they are similar, and should be the same as those in some textbooks.

From the function, you can see that when the sample size is large you always get a large t value. When the sample size is large enough, even slight differences between the mean values of two data sets (x bar and y bar) can be detected, and this will result in rejection of the null hypothesis. This is in fact quite reasonable. When the sample size is large, you are confident in the mean values (Central Limit Theorem), with a very small standard error (s/sqrt(n)). Therefore, you are confident in detecting the differences between the two data sets. Even though there is only a slight difference, you can still say, yes, they are significantly different.

As you may remember, some time ago we had a discussion on the large sample size problem in tests for normality. When the sample size is large enough, the result can always be expected (for real data sets), that is, rejection of the null hypothesis.

Cheers,
Chaosheng

--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm

----- Original Message -----
From: Isobel Clark [EMAIL PROTECTED]
To: Donald E. Myers [EMAIL PROTECTED]
Cc: Colin Badenhorst [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, December 04, 2004 11:49 AM
Subject: [ai-geostats] F and T-test for samples drawn from the same p

Don

Thank you for the extended clarification of the F and t hypothesis tests. For those unfamiliar with the concept, it is worth noting that the F test for multiple means may be more familiar under the title Analysis of variance.

My own brief answer was in the context of Colin's question, where it was quite clear that he was talking about the simplest F variance-ratio and t comparison of means test.

Isobel
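Colin Daly's piecewise-constant example earlier in this thread, together with Chaosheng's remark about sample size, can be checked by simulation. The Python sketch below is only an illustration under stated assumptions, not anything posted to the list. The function name is hypothetical. A small independent measurement noise (noise_sd) is added to each sample, because exactly constant values within an interval would give a zero sample variance and an undefined test statistic. A normal critical value is used in place of Student's t, and the comparison is the unequal-variance two-sample form.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_rejection_rate(n_per_interval, n_trials=500, noise_sd=0.1,
                            alpha=0.05, seed=0):
    """Fraction of trials in which 'equal means' is rejected.

    Each trial draws a_0 and a_1 from the same N(0, 1) distribution, so the
    underlying means are equal by construction. Samples on [0,1[ all share
    a_0 and samples on [1,2[ all share a_1, plus small assumed noise.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n = n_per_interval
    rejections = 0
    for _ in range(n_trials):
        a0 = rng.gauss(0.0, 1.0)
        a1 = rng.gauss(0.0, 1.0)
        left = [a0 + rng.gauss(0.0, noise_sd) for _ in range(n)]
        right = [a1 + rng.gauss(0.0, noise_sd) for _ in range(n)]
        # Classical standard error of the difference, treating every
        # sample as independent (the assumption under discussion).
        se = math.sqrt(variance(left) / n + variance(right) / n)
        z = (mean(left) - mean(right)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_trials


rate_small = simulate_rejection_rate(10)
rate_large = simulate_rejection_rate(200)
print(rate_small, rate_large)
```

If the classical test were valid here, the rejection rate would sit near the nominal 5%. Comparing the printed rates with that figure shows how far the independence assumption is from holding in this construction.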