Re: [ai-geostats] F and T-test for samples drawn from the same p
Comparisons of the sills of relative variograms may indicate whether the proportional effect is present between the low and high grade zones, so a test on the correlation coefficients could be relevant.

Digby
www.users.on.net/~digbym

* By using the ai-geostats mailing list you agree to follow its rules
( see http://www.ai-geostats.org/help_ai-geostats.htm )

* To unsubscribe to ai-geostats, send the following in the subject or in the body (plain text format) of an email message to [EMAIL PROTECTED]

Signoff ai-geostats
Re: [ai-geostats] F and T-test for samples drawn from the same p
Isobel,

Good idea, and that's a step forward. Any references or is it still an idea?

Cheers,
Chaosheng

----- Original Message -----
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "AI Geostats mailing list" <[EMAIL PROTECTED]>
Sent: Monday, December 06, 2004 1:07 PM
Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p

> Dear all
>
> I am having difficulty understanding why none of you
> want to try a spatial approach to statistics. Everyone
> is trying to make the 'independent' statistical tests
> work on spatial data. Try turning this around and look
> at the spatial aspect first.
>
> (1) Testing variances: the sill on the semi-variogram
> (total height of model) is theoretically a good
> estimate for the sample variance when auto-correlation
> or spatial dependence is present. Do your F test on
> that. Yes, you still have degrees of freedom problems,
> but with thousands of samples the 'infinity column'
> should be sufficient.
>
> (2) Testing means: the classic t-test in the presence
> of 'equal variances' requires the 'standard error' of
> each mean. For independent samples, this is s/sqrt(n).
> For spatially dependent samples, this is the kriging
> standard error for the global mean. Your only problem
> then is getting a global standard error.
>
> Isobel
> http://geoecosse.bizland.com/whatsnew.htm
Re: [ai-geostats] F and T-test for samples drawn from the same p
Dear all

I am having difficulty understanding why none of you want to try a spatial approach to statistics. Everyone is trying to make the 'independent' statistical tests work on spatial data. Try turning this around and look at the spatial aspect first.

(1) Testing variances: the sill on the semi-variogram (total height of model) is theoretically a good estimate for the sample variance when auto-correlation or spatial dependence is present. Do your F test on that. Yes, you still have degrees of freedom problems, but with thousands of samples the 'infinity column' should be sufficient.

(2) Testing means: the classic t-test in the presence of 'equal variances' requires the 'standard error' of each mean. For independent samples, this is s/sqrt(n). For spatially dependent samples, this is the kriging standard error for the global mean. Your only problem then is getting a global standard error.

Isobel
http://geoecosse.bizland.com/whatsnew.htm
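To illustrate point (2), here is a minimal Python sketch of how far the standard error of a mean can drift from s/sqrt(n) once samples are spatially correlated. It is not the kriging standard error Isobel refers to; it is the simpler variance of the plain arithmetic mean, (1/n^2) * sum over all pairs of C(h), under an assumed exponential covariance model. The function names, the 1-D sample layout, the sill of 1.0 and the practical range of 30 units are all illustrative assumptions, not values from the thread.

```python
import math

def exp_cov(h, sill, practical_range):
    """Assumed exponential covariance: falls to about 5% of the sill at the practical range."""
    return sill * math.exp(-3.0 * h / practical_range)

def se_mean_correlated(coords, sill, practical_range):
    """Standard error of the arithmetic mean of samples at 1-D positions `coords`.

    Uses Var(mean) = (1/n^2) * sum_i sum_j C(|x_i - x_j|).
    """
    n = len(coords)
    total = 0.0
    for xi in coords:
        for xj in coords:
            total += exp_cov(abs(xi - xj), sill, practical_range)
    return math.sqrt(total / (n * n))

def se_mean_independent(n, sill):
    """Classical s/sqrt(n), taking the sill as the variance."""
    return math.sqrt(sill / n)

# Hypothetical layout: 200 samples spaced 1 unit apart along a line.
coords = [float(i) for i in range(200)]
naive = se_mean_independent(len(coords), sill=1.0)
spatial = se_mean_correlated(coords, sill=1.0, practical_range=30.0)
print(f"s/sqrt(n)                = {naive:.4f}")
print(f"with spatial correlation = {spatial:.4f}")
```

With positive correlation between neighbouring samples the second figure is always the larger one, which is the reason a t-test built on s/sqrt(n) overstates the evidence for a difference in means.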
Re: [ai-geostats] F and T-test for samples drawn from the same p
Besides the discussions on the theory, I think we need a practical solution for Colin Badenhorst's initial problem (this is not only his problem). He wants to compare two sets of spatial data with several thousand samples.

Spatial autocorrelation (or lack of independence) is a basic feature of spatial data, and thus we cannot ask spatial data to behave well in order to satisfy the statistical requirements. If your spatial data set lacks spatial autocorrelation, you may be asked to go back and take more samples.

The ideal way is perhaps to develop a t-test (or whatever test) for spatial data, something like a "spatially weighted test". If such a test is not available, we have no choice but to use existing methods. They may not be exactly suitable for spatial data, but they are better than nothing. For the time being, the best way to solve the problem is still to use statistical methods, but to explain the results carefully and appropriately. We have to acknowledge the discrepancies between the basic features of spatial data and the statistical requirements.

Meanwhile, when the sample size (well, going back to my initial concern) is large, you will always get the result of rejecting the null hypothesis for REAL data, whether or not there is spatial dependence. In this case, what does such a result mean? I would say this result is not very meaningful, as it just proves the power of statistical tests. Simple graphs (e.g., histogram, box-plot) and percentiles may become helpful for comparison.

Therefore, for Colin's initial problem, the solution is to explain the results properly, and maybe to try some other methods if available.

Cheers,
Chaosheng

--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm
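The suggestion to fall back on percentiles when a significance test says little could look something like the following Python sketch. The two "horizons" are synthetic lognormal samples generated purely for illustration (no data from the thread), and `summary` and `standardized_difference` are hypothetical helper names. The standardized difference is added as one common way to express how large a difference is, independent of sample size; it is not something the message itself proposes.

```python
import math
import random
import statistics

def summary(values):
    """Percentiles, mean and standard deviation of one sample set."""
    q = statistics.quantiles(values, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    return {
        "p05": q[0], "p25": q[4], "p50": q[9], "p75": q[14], "p95": q[18],
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),
    }

def standardized_difference(a, b):
    """Difference in means in units of the average standard deviation."""
    sa, sb = summary(a), summary(b)
    pooled_sd = math.sqrt((sa["sd"] ** 2 + sb["sd"] ** 2) / 2.0)
    return (sb["mean"] - sa["mean"]) / pooled_sd

# Synthetic, made-up grades for two horizons (illustration only).
rng = random.Random(42)
low_grade = [rng.lognormvariate(0.0, 0.5) for _ in range(3000)]
high_grade = [rng.lognormvariate(0.1, 0.5) for _ in range(3000)]

for name, data in (("low", low_grade), ("high", high_grade)):
    s = summary(data)
    print(name, {k: round(v, 3) for k, v in s.items()})
print("standardized difference:", round(standardized_difference(low_grade, high_grade), 3))
```

Reading the two percentile rows side by side shows whether the distributions differ by an amount that matters for grade estimation, which is a separate question from whether a test with thousands of samples can detect the difference.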
Re: [ai-geostats] F and T-test for samples drawn from the same p
In every resource model I have done, I subdivide the populations into those of equal mean and variance, so that stationarity is obeyed. Is this the correct procedure? I haven't read Mining Geostatistics in detail yet, but understood that this was a basic requirement for geostatistical modelling procedures.

http://www.users.on.net/~digbym/about_consulting.htm

Digby
Re: [ai-geostats] F and T-test for samples drawn from the same p
Colin,

Isn't it a basic rule of geostatistics that all populations must follow the intrinsic hypothesis, i.e. stationarity (constant mean and variance), so you should split any populations that do not have the same mean and variance? This is introduced on pp. 33 of Mining Geostatistics, A.G. Journel & Ch. J. Huijbregts.

Regards
Digby

----- Original Message -----
From: "Colin Badenhorst" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Saturday, December 04, 2004 1:28 AM
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p

Hi Ted,

Thanks for your reply. I suspect my original query was too vague, so I will illustrate it with a practical example here.

I have an ore horizon that splits into two separate horizons. One of these split horizons has a lower average grade, and the other has a higher average grade. I need to determine whether I should treat these two horizons as separate entities during grade estimation. My geological observations tell me that these two horizons derive from the same source, and on the face of it are not different from one another in terms of mineral content and genesis. I aim to back it up by proving, or attempting to prove, that statistically these two horizons are the same, and can be treated as such as far as grade estimation goes.

Because the mean grades vary between the two, I suspect that the T-test might fail, but I also suspect that the variance in grade between the two might be very similar, and thus the F-test will pass. Now I have a problem: a T-test tells me the populations differ statistically, but the F-test tells me they don't.

The confidence limit I refer to in (2) by the way is the Alpha value used to determine the confidence level for the test - I am using Excel to do the test.

Thanks,
Colin

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: 03 December 2004 14:15
To: Colin Badenhorst
Cc: [EMAIL PROTECTED]
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p

On 03-Dec-04 Colin Badenhorst wrote:
> Hello everyone,
>
> I have two groups of several thousand samples analysed for various elements, and wish to determine if these samples are drawn from the same statistical population for later variography studies. I propose to test the two groups by using a F-test to test the sample variances, and a T-test to test the group means, at a given confidence limit.
>
> Before I do this, I wonder how I would interpret the results of the test if, for example:
>
> 1. The F-test suggests no significant statistical difference between the variances at a 90% confidence limit, BUT
> 2. The T-test suggests a significant statistical difference between the means at the same, or lower confidence limit.
>
> Has anyone come across this scenario before and how are they interpreted?

On the face of it, the scenario you describe corresponds to a standard t-test (which involves an assumption that the variances of the two populations do not differ), though I'm not sure what you mean in (2) by significant "at the same, or lower confidence limit." (Do I take it that in (1) you mean that the P-value for the F test is 0.1 or less?)

However, if you get significant difference between the variances in (1), then it may not be very good to use the standard t test (depending on how different they are). A modified version, such as the Welch test, should be used instead.

There is an issue with interpreting the results where the samples have initially been screened by one test, before another one is applied, since the sampling distribution of the second test, conditional on the outcome of the first, may not be the same as the sampling distribution of the second test on its own. However, I feel inclined to guess that this may not make any important difference in your case.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 03-Dec-04 Time: 14:15:09
-- XFMail --
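For readers who want to see what the Welch modification mentioned by Ted changes, the Python sketch below computes the variance ratio F, the Welch t statistic and the Welch-Satterthwaite degrees of freedom from two samples. The function names and the toy numbers are illustrative assumptions. Like the tests discussed in the message, this still treats the samples as independent, so it does not address spatial correlation.

```python
import math
import statistics

def variance_ratio(a, b):
    """F statistic for comparing variances: larger sample variance over smaller."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike the standard (pooled) t-test, no equal-variance assumption is made.
    """
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb                       # squared standard error of the difference
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Toy numbers, for illustration only.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0]
t, df = welch_t(a, b)
print(f"F = {variance_ratio(a, b):.2f}, Welch t = {t:.4f}, df = {df:.2f}")
```

The t value and degrees of freedom would then be looked up against a t distribution, for example in a table or a statistics package, to obtain the P-value.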
RE: [ai-geostats]F and T-test for samples drawn from the same p
Sorry if this is somewhat off subject - but I'd like to discuss (and invite further comments) on Colin's comments regarding the effects of independence on standard statistical tests. He mentioned that a lack of independence "typically removes a large part of the usability of basic tests unless corrected for spatial variables".

The standard argument goes something like: 'Spatial autocorrelation means that the sampled values are not independent, so you have less information than you think (i.e. your estimated degrees of freedom are too large). Consequently, the variance is underestimated and confidence intervals are too small (or the type I error is under-reported)'.

My understanding is that this argument is quite valid when you are inferring beyond the area from which you have sampled (or inferring about the stochastic process generating the sample data). However, it's probably worth mentioning that if you are simply looking to compare the parameters of specified areas (or volumes) and you have used a sensible design-based sampling method (e.g. SRS), then autocorrelation poses no problem. i.e. if you have randomly sampled some regionalized variable in volume X and volume Y, and simply wish to determine if, say, the population means of these volumes are different -- then the sample points will be independent (relative to the area of inference). In this scenario, classical statistical tests can be used to compare the realization parameters of the different areas.

The question that often fails to be asked is - What inference space are we interested in? Do we wish to discuss the process that generated the data, or simply make inference about the actual physical realization? Geostatistics avoids many complications with autocorrelation by typically restricting inference to the actual data, rather than the stochastic process.

In your particular case I would expect that statistically showing that:
(a) two horizons exhibit the same mineral content/spatial structure and
(b) two horizons derive from the same process
are very different problems.

Certainly within biology, the difference between these situations does not seem to be well understood - I am curious if geostatisticians distinguish between them as a matter of course?

regards,
Matthew Pawley

--- Colin Daly <[EMAIL PROTECTED]> wrote:
>
> Hi
>
> Sorry to repeat myself - but the samples are not independent.
> Independence is a fundamental assumption of these types of tests - and
> you cannot interpret the tests if this assumption is violated.
> In the situation where spatial correlation exists, the true standard
> error is nothing like as small as the (s/sqrt(n)) that Chaosheng
> discusses - because the sqrt(n) depends on independence.
>
> Again, as I said before, if the data has any type of trend in it, then
> it is completely meaningless to try and use these tests - and with no
> trend but some 'ordinary' correlation, you must find a means of taking
> the data redundancy into account or risk getting hopelessly pessimistic
> results (in the sense of rejecting the null hypothesis of equal means
> far too often)
>
> Consider a trivial example. A one dimensional random function which
> takes constant values over intervals of length one - so, it takes the
> value a_0 in the interval [0,1[ then the value a_1 in the interval
> [1,2[ and so on (let us suppose that each a_n term is drawn at random
> from a gaussian distribution with the same mean and variance for
> example). Next suppose you are given samples on the interval [0,2].
> You spot that there seems to be a jump between [0,1[ and [1,2[ - so
> you test for the difference in the means. If you apply an f test you
> will easily find that the mean differs (and more convincingly the more
> samples you have drawn!). However by construction of the random
> function, the mean is not different. We have been lulled into the
> false conclusion of differing means by assuming that all our data are
> independent.
>
> Regards
>
> Colin Daly
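Colin Daly's trivial example can be simulated directly. The Python sketch below draws a_0 and a_1 from the same Gaussian distribution, samples each unit interval many times, and counts how often a naive two-sample test declares the means different. One assumption is added that is not in the original example: a small independent measurement noise (`noise_sd`), without which every sample in an interval is identical and the within-group variance is zero. The 1.96 cut-off is the usual large-sample approximation for a two-sided 5% test; the function name and default values are illustrative.

```python
import math
import random
import statistics

def rejection_rate(n_per_interval=50, noise_sd=0.1, trials=500, seed=1):
    """Fraction of realizations in which a naive test rejects 'equal means'.

    Both interval values come from the same N(0, 1) distribution, so the
    process mean is identical by construction.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a0, a1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)   # values on [0,1[ and [1,2[
        left = [a0 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]
        right = [a1 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]
        se = math.sqrt(statistics.variance(left) / n_per_interval
                       + statistics.variance(right) / n_per_interval)
        t = (statistics.fmean(left) - statistics.fmean(right)) / se
        if abs(t) > 1.96:
            rejections += 1
    return rejections / trials

print("share of realizations with 'significantly different' means:", rejection_rate())
```

A test that behaved as advertised would reject in roughly 5% of realizations. Under these assumptions the rejection rate is instead close to 100%, and it moves closer still as `n_per_interval` grows, which matches the "more convincingly the more samples you have drawn" remark. In Matthew Pawley's terms, the test is answering a question about the realization, not about the process.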
RE: [ai-geostats] F and T-test for samples drawn from the same p
Hello,

I am currently principal investigator on a major NIH grant that aims to develop software for tests of hypothesis using alternate hypotheses specified by the user that differ from the omnibus "spatial independence"; we called them "spatial neutral models". For example, you can test for clusters of cancer rates "above and beyond" a regional background in exposure. The p-values are computed using randomization, and I applied geostatistical simulation to generate multiple realizations that are then used to derive the empirical distribution of the test statistic.

I presented an example during the last GeoEnv conference and I put a PDF copy of the paper, which is in press for the moment, on my website.

Cheers,
Pierre

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Dr. Pierre Goovaerts
President of PGeostat, LLC
Chief Scientist with Biomedware Inc.
710 Ridgemont Lane
Ann Arbor, Michigan, 48103-1535, U.S.A.
E-mail: [EMAIL PROTECTED]
Phone: (734) 668-9900
Fax: (734) 668-7788
http://alumni.engin.umich.edu/~goovaert/
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

On Sun, 5 Dec 2004, Colin Daly wrote:
>
> Hi
>
> Sorry to repeat myself - but the samples are not independent. Independence
> is a fundamental assumption of these types of tests - and you cannot
> interpret the tests if this assumption is violated. In the situation where
> spatial correlation exists, the true standard error is nothing like as small
> as the (s/sqrt(n)) that Chaosheng discusses - because the sqrt(n) depends on
> independence.
>
> Again, as I said before, if the data has any type of trend in it, then it is
> completely meaningless to try and use these tests - and with no trend but
> some 'ordinary' correlation, you must find a means of taking the data
> redundancy into account or risk getting hopelessly pessimistic results (in the
> sense of rejecting the null hypothesis of equal means far too often)
>
> Consider a trivial example. A one dimensional random function which takes
> constant values over intervals of length one - so, it takes the value a_0 in
> the interval [0,1[ then the value a_1 in the interval [1,2[ and so on (let
> us suppose that each a_n term is drawn at random from a gaussian distribution
> with the same mean and variance for example). Next suppose you are given
> samples on the interval [0,2]. You spot that there seems to be a jump between
> [0,1[ and [1,2[ - so you test for the difference in the means. If you apply
> an f test you will easily find that the mean differs (and more convincingly
> the more samples you have drawn!). However by construction of the random
> function, the mean is not different. We have been lulled into the false
> conclusion of differing means by assuming that all our data are independent.
>
> Regards
>
> Colin Daly
>
> -----Original Message-----
> From: Chaosheng Zhang [mailto:[EMAIL PROTECTED]]
> Sent: Sun 12/5/2004 11:42 AM
> To: [EMAIL PROTECTED]
> Cc: Colin Badenhorst; Isobel Clark; Donald E. Myers
> Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p
>
> Dear all,
>
> I'm wondering if sample size (number of samples, n) is playing a role here.
>
> Since Colin is using Excel to analyse several thousand samples, I have
> checked the functions of t-tests in Excel. In the Data Analysis Tools help, a
> function is provided for "t-Test: Two-Sample Assuming Unequal Variances
> analysis". This function is the same as those from many text books (There are
> other forms of the function). Unfortunately, I cannot find the function for
> "assuming equal variances" in Excel, but I assume they are similar, and
> should be the same as those from some text books.
>
> From the function, you can find that when the sample size is large you always
> get a large t value. When sample size is large enough, even slight
> differences between the mean values of two data sets (x bar and y bar) can be
> detected, and this will result in rejection of the null hypothesis. This is
> in fact quite reasonable. When the sample size is large, you are confident
> with the mean values (Central Limit Theorem), with a very small stand er
RE: [ai-geostats] F and T-test for samples drawn from the same p
Hi

Sorry to repeat myself - but the samples are not independent. Independence is a fundamental assumption of these types of tests - and you cannot interpret the tests if this assumption is violated. In the situation where spatial correlation exists, the true standard error is nothing like as small as the (s/sqrt(n)) that Chaosheng discusses - because the sqrt(n) depends on independence.

Again, as I said before, if the data has any type of trend in it, then it is completely meaningless to try and use these tests - and with no trend but some 'ordinary' correlation, you must find a means of taking the data redundancy into account or risk getting hopelessly pessimistic results (in the sense of rejecting the null hypothesis of equal means far too often)

Consider a trivial example. A one dimensional random function which takes constant values over intervals of length one - so, it takes the value a_0 in the interval [0,1[ then the value a_1 in the interval [1,2[ and so on (let us suppose that each a_n term is drawn at random from a gaussian distribution with the same mean and variance for example). Next suppose you are given samples on the interval [0,2]. You spot that there seems to be a jump between [0,1[ and [1,2[ - so you test for the difference in the means. If you apply an f test you will easily find that the mean differs (and more convincingly the more samples you have drawn!). However by construction of the random function, the mean is not different. We have been lulled into the false conclusion of differing means by assuming that all our data are independent.

Regards

Colin Daly

-----Original Message-----
From: Chaosheng Zhang [mailto:[EMAIL PROTECTED]]
Sent: Sun 12/5/2004 11:42 AM
To: [EMAIL PROTECTED]
Cc: Colin Badenhorst; Isobel Clark; Donald E. Myers
Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p

Dear all,

I'm wondering if sample size (number of samples, n) is playing a role here.

Since Colin is using Excel to analyse several thousand samples, I have checked the functions of t-tests in Excel. In the Data Analysis Tools help, a function is provided for "t-Test: Two-Sample Assuming Unequal Variances analysis". This function is the same as those from many text books (There are other forms of the function). Unfortunately, I cannot find the function for "assuming equal variances" in Excel, but I assume they are similar, and should be the same as those from some text books.

From the function, you can find that when the sample size is large you always get a large t value. When sample size is large enough, even slight differences between the mean values of two data sets (x bar and y bar) can be detected, and this will result in rejection of the null hypothesis. This is in fact quite reasonable. When the sample size is large, you are confident with the mean values (Central Limit Theorem), with a very small standard error (s/sqrt(n)). Therefore, you are confident to detect the differences between the two data sets. Even though there is only a slight difference, you can still say, yes, they are "significantly" different.

If you still remember some time ago, we had a discussion on large sample size problem for tests for normality. When the sample size is large enough, the result can always be expected (for real data sets), that is, rejection of the null hypothesis.

Cheers,
Chaosheng

--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm

----- Original Message -----
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "Donald E. Myers" <[EMAIL PROTECTED]>
Cc: "Colin Badenhorst" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, December 04, 2004 11:49 AM
Subject: [ai-geostats] F and T-test for samples drawn from the same p

> Don
>
> Thank you for the extended clarification of F and t
> hypothesis test. For those unfamiliar with the
> concept, it is worth noting that the F test for
> multiple means may be more familiar under the title
> "Analysis of variance".
>
> My own brief answer was in the context of Colin's
> question, where it was quite clear that he was talking
> about the simplest F variance-ratio and t comparison of
> means test.
>
> Isobel
Re: [ai-geostats] F and T-test for samples drawn from the same p
Dear all,

I'm wondering if sample size (number of samples, n) is playing a role here.

Since Colin is using Excel to analyse several thousand samples, I have checked the functions of t-tests in Excel. In the Data Analysis Tools help, a function is provided for "t-Test: Two-Sample Assuming Unequal Variances analysis". This function is the same as those from many text books (There are other forms of the function). Unfortunately, I cannot find the function for "assuming equal variances" in Excel, but I assume they are similar, and should be the same as those from some text books.

From the function, you can find that when the sample size is large you always get a large t value. When sample size is large enough, even slight differences between the mean values of two data sets (x bar and y bar) can be detected, and this will result in rejection of the null hypothesis. This is in fact quite reasonable. When the sample size is large, you are confident with the mean values (Central Limit Theorem), with a very small standard error (s/sqrt(n)). Therefore, you are confident to detect the differences between the two data sets. Even though there is only a slight difference, you can still say, yes, they are "significantly" different.

If you still remember some time ago, we had a discussion on large sample size problem for tests for normality. When the sample size is large enough, the result can always be expected (for real data sets), that is, rejection of the null hypothesis.

Cheers,
Chaosheng

--
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-524411 x 2375
Direct Tel: +353-91-49 2375
Fax: +353-91-525700
E-mail: [EMAIL PROTECTED]
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm

----- Original Message -----
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "Donald E. Myers" <[EMAIL PROTECTED]>
Cc: "Colin Badenhorst" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, December 04, 2004 11:49 AM
Subject: [ai-geostats] F and T-test for samples drawn from the same p

> Don
>
> Thank you for the extended clarification of F and t
> hypothesis test. For those unfamiliar with the
> concept, it is worth noting that the F test for
> multiple means may be more familiar under the title
> "Analysis of variance".
>
> My own brief answer was in the context of Colin's
> question, where it was quite clear that he was talking
> about the simplest F variance-ratio and t comparison of
> means test.
>
> Isobel
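Chaosheng's observation about sample size follows from the form of the statistic. For two groups of equal size n with a common standard deviation s, t = (difference in means) / (s * sqrt(2/n)), so a fixed difference produces a t value that grows like sqrt(n). The Python sketch below tabulates this for a deliberately small, made-up difference of 0.05 standard deviations; the function name and numbers are illustrative assumptions, and independence of the samples is assumed throughout.

```python
import math

def t_statistic(mean_difference, sd, n_per_group):
    """Two-sample t statistic for equal group sizes and a common standard deviation."""
    return mean_difference / (sd * math.sqrt(2.0 / n_per_group))

# A fixed, practically negligible difference of 0.05 standard deviations.
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}: t = {t_statistic(0.05, 1.0, n):.3f}")
```

The same tiny difference is far from significant at n = 10 and comfortably beyond the usual critical value of about 2 at n = 10000, which is why a significant result with several thousand samples says little about whether the difference matters in practice.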
[ai-geostats] F and T-test for samples drawn from the same p
Don

Thank you for the extended clarification of F and t hypothesis test. For those unfamiliar with the concept, it is worth noting that the F test for multiple means may be more familiar under the title "Analysis of variance".

My own brief answer was in the context of Colin's question, where it was quite clear that he was talking about the simplest F variance-ratio and t comparison of means test.

Isobel
RE: [ai-geostats] F and T-test for samples drawn from the same p
Colin (Daly) is exactly correct. Spatial dependence is the main issue when you apply a t-test to spatial data. You might be able to transform your data for normality or even homogeneity of variance, but the dependence is still there. In this case, you need to incorporate the spatial dependence (described by the variogram) into the t-test. Try generalized least squares for a likelihood approach.

Din Chen

From: Colin Daly [mailto:[EMAIL PROTECTED]]
Sent: Friday, December 03, 2004 8:16 AM
To: Glover, Tim; Colin Badenhorst; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p
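As a rough illustration of the generalized least squares idea Din Chen mentions, the sketch below estimates a constant mean and its variance when the samples are spatially correlated. Everything specific here is an assumption made for the example and not something stated in the thread: the 1-D coordinates, the exponential covariance with practical range `a` and no nugget, and the helper names `exp_cov`, `solve` and `gls_mean`.

```python
import math

def exp_cov(h, sill, a):
    """Exponential covariance (assumed model); 'a' is the practical range."""
    return sill * math.exp(-3.0 * h / a)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gls_mean(xs, ys, sill, a):
    """GLS estimate of a constant mean and the variance of that estimate.

    mean = (1' C^-1 y) / (1' C^-1 1),  variance = 1 / (1' C^-1 1),
    where C is the covariance matrix between sample locations.
    """
    n = len(xs)
    C = [[exp_cov(abs(xs[i] - xs[j]), sill, a) for j in range(n)]
         for i in range(n)]
    w = solve(C, [1.0] * n)          # w = C^-1 1
    denom = sum(w)                   # 1' C^-1 1
    mean = sum(wi * yi for wi, yi in zip(w, ys)) / denom
    return mean, 1.0 / denom
```

With a negligible range the result collapses to the ordinary mean with variance sill/n. With a long range the variance of the mean grows, which is what makes a test on the difference of two such means less eager to declare a difference than the classical t-test.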
RE: [ai-geostats] F and T-test for samples drawn from the same p
There is one other very important assumption behind these standard statistical tests, namely that the samples are independent. For spatial variables this typically removes a large part of the usability of the basic tests unless they are corrected. It is most likely the case that your samples within each horizon are not independent (unless the variogram has zero range), so your typical tests cannot be used. They will tend to give pessimistic results: in other words, you will tend to find differences in means when none exist. So these types of tests don't apply directly.

I don't know if there has been much work on trying to provide 'rigorous' methods, but given that it is impossible to give a statistical test that shows whether a random function is stationary or not (Matheron, 'Estimating and Choosing'), I guess the results would not be completely rigorous.

You may be able to get an intuitive feel for the likely difference in means by working out how many quasi-independent points you have. You could guesstimate this by assuming that points separated by more than a variogram range are independent, counting how many such 'range units' you have, and using that count as the number of 'samples' (actually, you may do better by working with an integral range). But if you have any trends in the data then you will not get reliable estimates of the two means, and so cannot 'prove' that the samples come from the same random function, even if they do.

Regards
Colin Daly

-Original Message-
From: Glover, Tim [mailto:[EMAIL PROTECTED]]
Sent: Fri 12/3/2004 3:15 PM
To: Colin Badenhorst; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p
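Colin Daly's 'range units' guesstimate can be sketched as follows. This is only one possible reading of it, under assumptions that are not in the thread: the samples cover a 2-D area, one quasi-independent sample is counted per range-by-range cell, and the p-value uses a normal approximation (which is crude when the effective count is small). The function names and all numbers in the usage note are hypothetical.

```python
import math
from statistics import NormalDist

def effective_n(n, area, vrange):
    """Rough count of quasi-independent samples.

    Assumes one independent 'sample' per range-sized cell of the sampled
    area, never more than the actual number of samples n.
    """
    return max(1.0, min(float(n), area / vrange ** 2))

def z_test_means(m1, s1, n1, m2, s2, n2):
    """Two-sided test on the difference of two means (normal approximation).

    n1 and n2 may be effective sample counts rather than raw counts.
    """
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    z = (m1 - m2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p
```

For example, 3000 samples over a 1000 m by 1000 m area with a 100 m range give `effective_n(3000, 1_000_000.0, 100.0)` equal to 100. Feeding 100 instead of 3000 into `z_test_means` for two horizons with means 2.0 and 2.2 and unit standard deviation turns a highly 'significant' difference into one that is not significant at the 5% level.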
RE: [ai-geostats] F and T-test for samples drawn from the same p
Standard t-tests make two assumptions:

1. both data sets are normally distributed;
2. they have approximately equal variance.

Test these assumptions before applying a t-test. Violate them at your own risk. If you fail either assumption, you need to consider your options, but you probably should not use a plain-vanilla t-test. You could possibly use a data transform to "fix" the first assumption. You might have to use a modified t-test (such as Satterthwaite's modification), or you might consider a non-parametric approach, such as the Mann-Whitney U-test.

Tim Glover
Senior Environmental Scientist - Geochemistry
Geoenvironmental Department
MACTEC Engineering and Consulting, Inc.
Kennesaw, Georgia, USA
Office 770-421-3310
Fax 770-421-3486
Email [EMAIL PROTECTED]
Web www.mactec.com

-Original Message-
From: Colin Badenhorst [mailto:[EMAIL PROTECTED]]
Sent: Friday, December 03, 2004 9:59 AM
To: '[EMAIL PROTECTED]'
Cc: '[EMAIL PROTECTED]'
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p
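For the non-parametric route Tim Glover mentions, a minimal sketch of the Mann-Whitney U-test is given below. It uses the large-sample normal approximation with a tie correction and no continuity correction, which is an assumption of this example (reasonable for thousands of samples, crude for a handful). Like the t-test, it still assumes independent samples, so the spatial-dependence caveat raised elsewhere in the thread applies here too. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U for sample a, approximate p-value). Tied values receive
    mid-ranks and the variance of U is corrected for ties.
    """
    n1, n2 = len(a), len(b)
    N = n1 + n2
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < N:
        j = i
        while j < N and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                         # size of this tie group
        midrank = (i + 1 + j) / 2.0       # average of ranks i+1 .. j
        in_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += midrank * in_a
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((N + 1) - tie_term / (N * (N - 1)))
    z = (u - mu) / math.sqrt(var)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, p
```

The function divides by zero if every pooled value is identical, a case a production version would need to guard against.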
RE: [ai-geostats] F and T-test for samples drawn from the same p
Hi Ted,

Thanks for your reply. I suspect my original query was too vague, so I will illustrate it with a practical example here.

I have an ore horizon that splits into two separate horizons. One of these split horizons has a lower average grade, and the other has a higher average grade. I need to determine whether I should treat these two horizons as separate entities during grade estimation. My geological observations tell me that these two horizons derive from the same source and, on the face of it, are not different from one another in terms of mineral content and genesis. I aim to back this up by proving, or attempting to prove, that statistically these two horizons are the same and can be treated as such as far as grade estimation goes.

Because the mean grades vary between the two, I suspect that the T-test might fail, but I also suspect that the variance in grade between the two might be very similar, and thus the F-test will pass. Now I have a problem: a T-test tells me the populations differ statistically, but the F-test tells me they don't.

The confidence limit I refer to in (2), by the way, is the Alpha value used to determine the confidence level for the test - I am using Excel to do the test.

Thanks,
Colin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: 03 December 2004 14:15
To: Colin Badenhorst
Cc: [EMAIL PROTECTED]
Subject: RE: [ai-geostats] F and T-test for samples drawn from the same p
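The scenario Colin describes (similar variances, different means) is not a contradiction, because the two tests look at different parameters. The sketch below builds a deliberately artificial pair of samples with identical spread and shifted means, then applies large-sample normal approximations to the F-test and the pooled t-test. The approximations, the function names and the synthetic 'grades' are all assumptions of this example. They stand in for the exact F and t distributions that Excel would use, and they ignore spatial dependence entirely.

```python
import math
from statistics import NormalDist, mean, variance

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def variance_ratio_test(a, b):
    """Large-sample stand-in for the F-test.

    For normal data ln(s1^2 / s2^2) is roughly normal with variance
    2/(n1-1) + 2/(n2-1) under the hypothesis of equal variances.
    """
    f = variance(a) / variance(b)
    se = math.sqrt(2.0 / (len(a) - 1) + 2.0 / (len(b) - 1))
    return f, two_sided_p(math.log(f) / se)

def pooled_t_test(a, b):
    """Equal-variance t statistic, with a normal-approximation p-value."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    return t, two_sided_p(t)

# Synthetic grades: same spread, means differing by 0.05 (hypothetical units).
low = [(i % 10) * 0.1 for i in range(1000)]
high = [x + 0.05 for x in low]

f, p_f = variance_ratio_test(low, high)   # variances equal: large p-value
t, p_t = pooled_t_test(low, high)         # means differ: small p-value
```

Here the variance test passes and the means test fails, which simply says the two samples share a spread but not a location. Whether that matters for grade estimation is a separate question from whether the two tests agree.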
RE: [ai-geostats] F and T-test for samples drawn from the same p
On 03-Dec-04 Colin Badenhorst wrote:
> Hello everyone,
>
> I have two groups of several thousand samples analysed
> for various elements, and wish to determine if these
> samples are drawn from the same statistical population
> for later variography studies. I propose to test the two
> groups by using a F-test to test the sample variances,
> and a T-test to test the group means, at a given confidence limit.
>
> Before I do this, I wonder how I would interpret the results
> of the test if, for example:
>
> 1. The F-test suggests no significant statistical difference
> between the variances at a 90% confidence limit, BUT
> 2. The T-test suggests a significant statistical difference
> between the means at the same, or lower confidence limit.
>
> Has anyone come across this scenario before and how are they
> interpreted?

On the face of it, the scenario you describe corresponds to a standard t-test (which involves an assumption that the variances of the two populations do not differ), though I'm not sure what you mean in (2) by significant "at the same, or lower confidence limit." (Do I take it that in (1) you mean that the P-value for the F test is 0.1 or less?)

However, if you get significant difference between the variances in (1), then it may not be very good to use the standard t test (depending on how different they are). A modified version, such as the Welch test, should be used instead.

There is an issue with interpreting the results where the samples have initially been screened by one test, before another one is applied, since the sampling distribution of the second test, conditional on the outcome of the first, may not be the same as the sampling distribution of the second test on its own. However, I feel inclined to guess that this may not make any important difference in your case.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 03-Dec-04 Time: 14:15:09
-- XFMail --
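The Welch test Ted refers to replaces the pooled variance with separate variances and adjusts the degrees of freedom using the Welch-Satterthwaite formula. A minimal sketch, with a hypothetical function name, is shown below. It returns the statistic and the degrees of freedom and leaves the p-value lookup to a t table or a statistics package, since the Python standard library has no t distribution.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's unequal-variance t statistic and its approximate df.

    t  = (mean_a - mean_b) / sqrt(va/na + vb/nb)
    df = (va/na + vb/nb)^2 / ((va/na)^2/(na-1) + (vb/nb)^2/(nb-1))
    """
    na, nb = len(a), len(b)
    qa = variance(a) / na        # squared standard error of mean(a)
    qb = variance(b) / nb        # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(qa + qb)
    df = (qa + qb) ** 2 / (qa ** 2 / (na - 1) + qb ** 2 / (nb - 1))
    return t, df
```

When the two groups have equal size and equal sample variance, `df` reduces to 2(n - 1), the same as the standard t-test, so little is lost by using the Welch form by default. With several thousand samples per group the degrees of freedom are large enough that the normal table gives essentially the same critical values.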