Re: [ai-geostats] Re: F and T-test for samples drawn from the same p
Dear Isobel,

Thanks for the information. Perhaps I didn't explain my request clearly. What I need is to verify the ideas you suggested in the previous message. Specifically:

(1) Has anybody used the sill values (in geostatistics) to replace the variances (in classical statistics) in the F test?
(2) Has anybody used the global standard errors (in geostatistics) to replace the mean standard errors (in classical statistics) in the t-test?

Cheers,
Chaosheng

- Original Message -
From: Isobel Clark [EMAIL PROTECTED]
To: Chaosheng Zhang [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, December 06, 2004 6:03 PM
Subject: [ai-geostats] Re: F and T-test for samples drawn from the same p

There was a pretty good paper on global standard errors in the 1984 APCOM proceedings, so I am sure it should be in the major textbooks by now. Comparing the sills is very straightforward, I think.

Isobel
http://geecosse.bizland.com/books.htm

--- Chaosheng Zhang [EMAIL PROTECTED] wrote:

Isobel,

Good idea, and that's a step forward. Any references or is it still an idea?

Cheers,
Chaosheng

- Original Message -
From: Isobel Clark [EMAIL PROTECTED]
To: AI Geostats mailing list [EMAIL PROTECTED]
Sent: Monday, December 06, 2004 1:07 PM
Subject: Re: [ai-geostats] F and T-test for samples drawn from the same p

Dear all

I am having difficulty understanding why none of you want to try a spatial approach to statistics. Everyone is trying to make the 'independent' statistical tests work on spatial data. Try turning this around and look at the spatial aspect first.

(1) Testing variances: the sill on the semi-variogram (total height of model) is theoretically a good estimate for the sample variance when auto-correlation or spatial dependence is present. Do your F test on that. Yes, you still have degrees of freedom problems, but with thousands of samples the 'infinity column' should be sufficient.
(2) Testing means: the classic t-test in the presence of 'equal variances' requires the 'standard error' of each mean. For independent samples, this is s/sqrt(n). For spatially dependent samples, this is the kriging standard error for the global mean. Your only problem then is getting a global standard error.

Isobel
http://geoecosse.bizland.com/whatsnew.htm

* By using the ai-geostats mailing list you agree to follow its rules ( see http://www.ai-geostats.org/help_ai-geostats.htm )
* To unsubscribe to ai-geostats, send the following in the subject or in the body (plain text format) of an email message to [EMAIL PROTECTED] Signoff ai-geostats
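The two suggestions in this message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method confirmed anywhere in the thread: the function names are hypothetical, the two global means are assumed independent of each other, the supplied standard errors are assumed to be global (kriging) standard errors, and a normal reference distribution stands in for the 'infinity column' of the tables. The degrees of freedom for the sill ratio remain an open question, so no critical value is attached to it.

```python
from math import sqrt
from statistics import NormalDist


def sill_ratio(sill_a: float, sill_b: float) -> float:
    """F-style statistic: the larger fitted sill divided by the smaller one."""
    if sill_a <= 0 or sill_b <= 0:
        raise ValueError("sills must be positive")
    return max(sill_a, sill_b) / min(sill_a, sill_b)


def global_mean_test(mean_a: float, se_a: float, mean_b: float, se_b: float):
    """Difference of two global means divided by its combined standard error.

    se_a and se_b are assumed to be global (kriging) standard errors, so the
    spatial dependence is taken to be reflected in them already.
    Returns the statistic and a large-sample two-sided p-value.
    """
    z = (mean_a - mean_b) / sqrt(se_a ** 2 + se_b ** 2)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value
```

For example, `global_mean_test(10.0, 0.3, 9.0, 0.4)` divides a difference of 1.0 by a combined standard error of 0.5, giving a statistic of about 2.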
Re: [ai-geostats] Re: F and T-test for samples drawn from the same p
Digby

I see where you are coming from on this, but in fact the sill is composed of those pairs of samples which are independent of one another - or, at least, have reached some background correlation. This is why the sill makes a better estimate of the variance than the conventional statistical measures, since it is based on independent sampling.

Isobel
http://geoecosse.bizland.com/whatsnew.htm

--- Digby Millikan [EMAIL PROTECTED] wrote:

While you're talking about sills being the global variance, which I read everywhere, isn't the global variance actually slightly less than the sill, as the values below the range of the variogram are not included? I.e. the sill would be the global variance when you have a pure nugget effect.
Re: [ai-geostats] Re: F and T-test for samples drawn from the same p
Hi Isobel,

Could you explain why it would be a better estimate of the variance when independence is considered? I'd rather think that we consider the dependence when the overall variance is to be estimated, if there actually is dependence between values.

Or are you talking about modeling the sill value by the stabilizing tail of the experimental variogram, instead of modeling it by the calculated overall variance?

Or are we talking about variances of different definitions? I'd be concerned if I missed some point of the original definition of variance, like the variance should be defined with no dependence between values or something like that. Frankly, I don't think I took the definition of variance too seriously when I was learning stats.

Meng-ying
[ai-geostats] Sill versus least-squares classical variance estimate
Meng-Ying

We are talking about estimating the variance of a set of samples where spatial dependence exists.

The classical statistical unbiased estimator of the population variance is s-squared, which is the sum of the squared deviations from the mean divided by the relevant degrees of freedom. If the samples are not inter-correlated, the relevant degrees of freedom are (n-1). This gives the formula you find in any introductory statistics book or course.

If samples are not independent of one another, the degrees of freedom issue becomes a problem and the classical estimator will be biased (generally too small on average).

In theory, pairs of samples beyond the range of influence on a semi-variogram graph are independent of one another. In theory, the variance of the difference between two values which are uncorrelated is twice the variance of one sample around the population mean. This is thought to be why Matheron defined the semi-variogram (one-half the squared difference) so that the final sill would be (theoretically) equal to the population variance.

There are computer software packages which will draw a line on your experimental semi-variogram at the height equivalent to the classically calculated sample variance. Some people try to force their semi-variogram models to go through this line. This is dumb, as the experimental sill is a better estimate because it does have the degrees of freedom it is supposed to have.

I am not sure whether this is clear enough. If you email me off the list, I can recommend publications which might help you out.

Isobel
http://geoecosse.bizland.com/books.htm
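The step from "variance of a difference" to "sill equals population variance" can be written out as follows. The symbols are introduced here for illustration and are not used in the thread: Z is assumed to be a second-order stationary random function with variance \sigma^2 and covariance function C(h).

```latex
\begin{align*}
\gamma(h) &= \tfrac{1}{2}\,\mathrm{E}\!\left[\bigl(Z(x) - Z(x+h)\bigr)^2\right]
           = \tfrac{1}{2}\,\mathrm{Var}\bigl(Z(x) - Z(x+h)\bigr) \\
          &= \tfrac{1}{2}\bigl(\sigma^2 + \sigma^2 - 2\,C(h)\bigr)
           = \sigma^2 - C(h), \\
\gamma(h) &= \sigma^2 \quad \text{when } C(h) = 0 \text{ (lags beyond the range).}
\end{align*}
```

Under these assumptions the semi-variogram levels off at \sigma^2 once pairs are far enough apart to be uncorrelated, which is the sense in which the sill is read as the population variance.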
[ai-geostats] Continuing discussion on F and t tests
The sample variance (assuming that you use the n-1 divisor) is an unbiased estimator of the population variance provided you use random sampling. Note the "ing" on the word "sampling": it is not quite correct to talk about random samples or independent samples, or at least it may be misleading. Random sampling pertains to how the data is collected, not the end result.

Note moreover that one can always compute a sample variance for a given data set, but this does not show that the random variable or random function has a finite variance.

The sample variance (even when sampling from a normal population) is, relatively speaking, more variable as an estimator of the variance than the sample mean is as an estimator of the population mean. The sampling distribution in this restricted case is chi-square (up to a scale factor), and the chi-square distribution has a fat tail (as contrasted with a normal distribution).

If correctly (or maybe you would want to say adequately) estimated, the sill of a second order stationary random function would be the variance of the random function. In general, the sample variance will not estimate the sill (because you are not using random sampling).

Donald Myers
http://www.u.arizona.edu/~donaldm
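One way to read "relatively speaking more variable" is to compare each estimator's standard deviation with the size of the quantity it estimates. The block below is a sketch of that reading only, assuming n independent draws from a normal population with variance \sigma^2; the notation is introduced here, not in the message.

```latex
\begin{align*}
\frac{(n-1)\,s^2}{\sigma^2} &\sim \chi^2_{n-1}, &
\mathrm{E}[s^2] &= \sigma^2, &
\mathrm{Var}(s^2) &= \frac{2\sigma^4}{n-1}, \\
\frac{\mathrm{sd}(s^2)}{\sigma^2} &= \sqrt{\frac{2}{n-1}}, &
\frac{\mathrm{sd}(\bar{Z})}{\sigma} &= \frac{1}{\sqrt{n}}.
\end{align*}
```

For n = 30 these two ratios are roughly 0.26 and 0.18, so on this scale the sample variance is the noisier of the two estimators.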
[ai-geostats] equivalence of mean and var
It was previously mentioned that a common approach is to subdivide populations into those of equal mean and variance so that stationarity is obeyed. What do you suggest as tests for determining equivalence of mean and variance prior to spatial analysis?

Thanks,
Randy.
[ai-geostats] Re: Sill versus least-squares classical variance estimate
I understand why it is not appropriate to force the sill so it matches the sample variance. My question is, why estimate the overall variance by the sill value when data are actually correlated?

Meng-ying
RE: [ai-geostats] Continuing discussion on F and t tests
I'd agree with Don's point about the sample variance being unbiased under random sampling. Because of the linearity of the estimate, the lack of independence of samples is not a problem here.

This should not be confused with the problem of t tests. There the source of the problem is that the variance of the sample mean, var((1/n) sum_i Z(x_i)), takes the form

sigma**2/n + (1/n**2) sum(C_ij)    (sum over all i, j with i not equal to j)

If the covariance terms for i not equal to j are all zero, then the variance of error reduces to sigma**2/n, and this is where the number of independent samples n comes into it. If the samples are not independent, then the second term above does not necessarily fall away to zero quickly. In particular, in an extreme case, if the covariance falls very slowly we may have C_ij approximately equal to sigma**2, and so the total above acts like

sigma**2/n + ((n-1)/n)*sigma**2 = sigma**2

In other words, the error does not reduce at all with an increasing number of samples, let alone reduce like 1/n.

So, for this t test business, a crude method of getting a number of 'independent' samples would be to take the length of the field divided by the range (provided that we have enough sample data to cover the field at a sampling spacing less than the range). This could be used in place of the raw number of samples n, which as said before will give a very poor result.

Colin Daly
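Colin's formula can be evaluated directly once a covariance function is chosen. The sketch below assumes sample locations along a one-dimensional transect and a covariance supplied as a function of separation distance. The function names, and the idea of reporting sigma**2 divided by the variance of the mean as an "effective" number of samples, are illustrative assumptions rather than anything stated in the thread.

```python
from typing import Callable, Sequence


def variance_of_mean(coords: Sequence[float], cov: Callable[[float], float]) -> float:
    """Variance of the plain average of Z at the given 1-D locations.

    Summing cov over ALL pairs (i, j), including i == j, and dividing by n**2
    gives sigma**2/n + (1/n**2) * sum over i != j of C_ij.
    """
    n = len(coords)
    total = sum(cov(abs(a - b)) for a in coords for b in coords)
    return total / n ** 2


def effective_n(coords: Sequence[float], cov: Callable[[float], float]) -> float:
    """Number of independent samples that would give the same variance of the mean."""
    return cov(0.0) / variance_of_mean(coords, cov)


def crude_n(field_length: float, variogram_range: float) -> float:
    """The crude count from the message: length of the field over the range."""
    return field_length / variogram_range
```

With zero covariance between distinct locations, `effective_n` returns n. With the covariance equal to sigma**2 at every separation, it returns 1, which is the extreme case described in the message.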
Re: [ai-geostats] Continuing discussion on F and t tests
Thanks Donald,

I think what you mean by "adequately" is sampling with CSR (complete spatial randomness); please correct me if I'm wrong.

But I still have a problem with estimating the variance. I mean, even if we sample with CSR, wouldn't the sample variance still be smaller than the sill value? I'd think that unless we restrict our sampling locations to be far enough from each other, the sample variance calculated by S^2 will not reach the sill value...

Meng-ying

If correctly (or maybe you would want to say adequately) estimated, the sill of a second order stationary random function would be the variance of the random function. In general, the sample variance will not estimate the sill (because you are not using random sampling).

Donald Myers
http://www.u.arizona.edu/~donaldm
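The question of whether the sample variance falls short of the sill can be made concrete for a fixed set of sample locations. The derivation below is a sketch under assumptions introduced here: Z is second-order stationary with variance \sigma^2 (the sill), and C_{ij} is the covariance between the values at locations i and j.

```latex
\begin{align*}
\mathrm{E}\Bigl[\sum_{i=1}^{n} (Z_i - \bar{Z})^2\Bigr]
  &= n\,\sigma^2 - n\,\mathrm{Var}(\bar{Z}), \qquad
\mathrm{Var}(\bar{Z}) = \frac{\sigma^2}{n} + \frac{1}{n^2} \sum_{i \neq j} C_{ij}, \\
\mathrm{E}[s^2]
  &= \frac{1}{n-1}\,\mathrm{E}\Bigl[\sum_{i=1}^{n} (Z_i - \bar{Z})^2\Bigr]
   = \sigma^2 - \frac{1}{n(n-1)} \sum_{i \neq j} C_{ij}.
\end{align*}
```

On this reading, the expected sample variance sits below the sill by the average covariance between distinct sample pairs. The shortfall vanishes only when that average is zero, for example when essentially all pairs are separated by more than the range.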
RE: [ai-geostats] variogram analysis
Rajive,

Cyclic variograms indicate that your attribute of interest also fluctuates. I encountered this when working with time series of water levels, in which case the fluctuations were related to seasonality. I am not sure what it would mean in the case of platinum deposits.

Such variograms can be modeled using the hole effect model, but 2-dimensional semivariogram modeling, when you have anisotropy to account for, can be tricky with a hole effect because you cannot apply a hole effect model in more than one direction. It may be better to work with a residual, i.e. to find a correlated cyclic variable, remove the cyclicity for semivariogram and kriging purposes, and add the kriged residual back in at the end.

If you do want to model such a variogram, e.g. if you only encounter the cyclicity in one direction, and you are working with GSLib, then you may have to modify the kriging code, as the dampening factor (if the cyclicity diminishes with lag) is not specified in the parameter file. I don't know what other programs allow you to do with the hole effect model, though ...

Noémi

-Original Message-
From: Rajive Ganguli [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 07, 2004 4:50 PM
To: [EMAIL PROTECTED]
Subject: [ai-geostats] variogram analysis

My question is general. What do you conclude if your variogram is wavy? Cyclic patterns? I have what appears to be high nugget, followed by a wavy pattern.

If you wish, here is more info: an offshore placer platinum deposit, not too many boreholes - just 29 from decades ago spanning several square kilometers. The variogram (from GEOEAS) of the grade (ln) is given in: http://www.faculty.uaf.edu/ffrg/Variogram.zip

The variogram is cyclic. Goes up and down. I tried various lags/directions. I will try to dig up the geological information and see what it says.

-- Rajive
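For readers who have not met a dampened hole effect, the sketch below evaluates one common damped-cosine form along a single direction. It is an assumed parameterisation chosen for illustration, not GSLib's code or parameter file format, and all parameter names are hypothetical: the wavelength is the lag distance between successive peaks and the damping length controls how quickly the oscillation dies out.

```python
from math import cos, exp, inf, pi


def damped_hole_effect(h: float, nugget: float, partial_sill: float,
                       wavelength: float, damping: float = inf) -> float:
    """Semivariogram value at lag h for a 1-D damped-cosine hole effect.

    gamma(h) = nugget + partial_sill * (1 - exp(-h / damping) * cos(2*pi*h / wavelength))
    With damping = inf the oscillation never dies out (a pure hole effect).
    """
    if h < 0:
        raise ValueError("lag must be non-negative")
    if h == 0:
        return 0.0  # a semivariogram is zero at zero lag by definition
    decay = exp(-h / damping)
    return nugget + partial_sill * (1.0 - decay * cos(2.0 * pi * h / wavelength))
```

With no damping the curve keeps swinging between the nugget and the nugget plus twice the partial sill. With a finite damping length it settles towards the nugget plus the partial sill at long lags, which is the "cyclicity diminishes with lag" behaviour mentioned in the message.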
Re: [ai-geostats] Sill versus least-squares classical variance estimate
Dear List,

I think I'd like to state my problem more clearly. What I think to be the estimate of the overall variance is the expected variance in the future samples. This has to do with what kind of sampling scheme we use in the future, however.

If we could assume the future samples to be far enough apart from each other, then I'd have no problem using the sill value we calculated from the experimental variogram. Or, if we're talking about setting up a standard value so we could compare the maximum possible variances to that of other samples, I'd also have little doubt about the estimation using the sill value.

Otherwise I think the sill value would generally be an over-estimation of the variance for a future sample, even for samples collected with complete spatial randomness in the future.

And again, please correct me if I missed any important point along the discussion. I'd really like to be careful about (geo)stats, but probably not as careful about asking questions.

Meng-ying
Re: [ai-geostats] variogram analysis
Dear Rajive:

I cannot conclude with only 328 pairs that the feature is "wavy" because I do not know how those pairs are distributed for each point in the variogram. Try different lag spacings, or create an "equal-n" lag variogram where each lag has the same number of pairs. If that shows the same feature, then perhaps there is a repeating feature (faults, fractures, ore controls, etc.) occurring at regular intervals throughout the sampling domain.

I take it that you have 26 or so sample locations. Using "equal-distance" lags usually gives a large number of pairs to the first couple of lags, and then the n drops off rapidly, and the variogram is harder to interpret than with an "equal-n" type variogram. I wrote my variography codes to work both ways...

Dan

Dan W. McCarn, AIPG CPG #10245, Wyoming PG #3031, EurGeol #46210228 A Admiral Halsey NE; Albuquerque, NM 87111 USA
Home: +1-505-822-1323; Cell: +1-505-710-3600
The College of Santa Fe
4501 Indian School NE Ste. 100; Albuquerque, NM 87110
(505) 884-2732 fax (505) 262-5595
[EMAIL PROTECTED]
Institut für Geowissenschaften; Montanuniversität Leoben
Peter-Tunner-Strasse 5; A8700 Leoben, AUSTRIA
Cell: +43-676/725-6622; Fax: +43-3842-402-4902; Office: +43-3842-402-4903

In a message dated 12/7/2004 3:27:31 PM Mountain Standard Time, [EMAIL PROTECTED] writes:

Usually when I've seen a "wavy" semivariogram, it's because of a local feature superimposed over an existing field function - for instance, a release of mercury in a field of soil with very low "natural" mercury content. The period of the waviness is related to the distance across the feature (the width of the spill, in this case). Of course, this is nothing particularly earth-shattering, but useful none the less.

I've used semivariograms like this in the past to "guestimate" the approximate size of a plume based on sparse data. Not all geostatistics ends up in gridding and estimating at every point! Sometimes just looking at the semivariogram is very useful.

Tim Glover
Senior Environmental Scientist - Geochemistry
Geoenvironmental Department
MACTEC Engineering and Consulting, Inc.
Kennesaw, Georgia, USA
Office 770-421-3310
Fax 770-421-3486
Email [EMAIL PROTECTED]
Web www.mactec.com
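Dan's "equal-n" idea can be sketched as follows: compute every pair once, sort the pairs by separation distance, and cut the sorted list into groups holding the same number of pairs. This is a minimal illustration with hypothetical names, not Dan's own code. It assumes 2-D coordinates, ignores direction, and lets the last group absorb any leftover pairs, so that group can be slightly larger than the rest.

```python
from itertools import combinations
from math import hypot


def equal_n_variogram(points, values, n_bins):
    """Experimental semivariogram with (nearly) the same number of pairs per lag.

    points: sequence of (x, y) tuples; values: one value per point.
    Returns a list of (mean_distance, semivariance, pair_count) per lag group.
    """
    # One entry per pair: (separation distance, half the squared difference).
    pairs = sorted(
        (hypot(xa - xb, ya - yb), 0.5 * (va - vb) ** 2)
        for ((xa, ya), va), ((xb, yb), vb) in combinations(zip(points, values), 2)
    )
    size = len(pairs) // n_bins
    if size == 0:
        raise ValueError("more bins than pairs")
    result = []
    for k in range(n_bins):
        start = k * size
        # The last group takes whatever remains after integer division.
        end = (k + 1) * size if k < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_h = sum(h for h, _ in chunk) / len(chunk)
        gamma = sum(g for _, g in chunk) / len(chunk)
        result.append((mean_h, gamma, len(chunk)))
    return result
```

With 26 sample locations there are 325 distinct pairs, so ten groups would hold 32 pairs each, with the last one holding 37. Each plotted point then rests on a comparable amount of data, which is the interpretive advantage the message describes.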