Title: RE: [ai-geostats] Re: Sill versus least-squares classical variance estimate

Hi Digby

Sorry to say, but suggesting that less data is systematically better is mistaken. This is fundamental, and is covered in the opening pages of any good intro to geostats. If the data are clustered, then you might have to decluster in some sense, but with an unbiased sample you should use all million samples. Please, any new users of geostats lucky enough to have a million samples: don't throw 99.9% of your data away!!

Declustering is about trying to remove the bias that most realistic sampling strategies have (e.g. in petroleum, you tend to drill into the best reservoir regions first...). If your data are an unbiased sample from the true histogram (i.e. what you would get by mining out the resource fully), then you should use all of it for estimating any statistic. This does not mean that the samples have to be far apart - just that they don't cluster in high or low regions.
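
To make the declustering point concrete, here is a minimal sketch of cell declustering, which is one common way to do it (others exist, and I am not claiming it is the best choice for any particular data set). The function name, the 1D layout, the cell size and the toy values are all assumptions made purely for illustration. Samples that share a cell share that cell's weight, so a cluster drilled into a high-value region no longer dominates the mean.

```python
import numpy as np

def cell_decluster_weights(x, cell_size):
    """Cell-declustering weights for 1D sample positions (illustrative helper).

    Every occupied cell gets the same total weight, split equally among the
    samples that fall inside it. The weights sum to 1.
    """
    x = np.asarray(x, dtype=float)
    cell_ids = np.floor(x / cell_size).astype(int)
    # inverse maps each sample to its cell; counts is samples per occupied cell
    _, inverse, counts = np.unique(cell_ids, return_inverse=True, return_counts=True)
    return 1.0 / (counts[inverse] * counts.size)

# Toy data (made up): five of the eight samples cluster in the high-value cell.
x = np.array([0.5, 1.5, 2.1, 2.3, 2.5, 2.7, 2.9, 3.5])
values = np.array([1.0, 1.0, 5.0, 5.0, 5.0, 5.0, 5.0, 1.0])

weights = cell_decluster_weights(x, cell_size=1.0)
naive_mean = values.mean()
declustered_mean = np.sum(weights * values)

print(f"naive mean:       {naive_mean:.2f}")
print(f"declustered mean: {declustered_mean:.2f}")
```

In this toy case the naive mean is pulled up towards the clustered high values, while the declustered mean treats each of the four occupied cells equally. Note that no sample is discarded: all eight are used, just with different weights.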

There seems to be some confusion about independence and estimates. Suppose the mean (and/or variance) is being estimated, with two provisos: 1) the sample data are unbiased, and 2) the field is stationary (so that mean and variance have a meaning). Then the estimate is unbiased, irrespective of the correlation of the data. What does depend on the correlation is the error in the estimate. For a zero correlation length, the variance of the error of the mean drops as 1/n. For a non-zero correlation length it drops more slowly than 1/n - but you do not get quicker convergence by throwing away good data - in fact, virtually always you will get a strictly worse estimate!
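
The effect of correlation on the error of the mean can be checked directly, because the variance of an equal-weight mean is just the average of all entries of the covariance matrix of the samples. The sketch below assumes a 1D regular grid, an exponential covariance with unit sill, and a correlation length of 10 grid spacings; those choices, and the function name, are mine and only for illustration.

```python
import numpy as np

def variance_of_mean(positions, sill=1.0, corr_length=10.0):
    """Exact variance of the equal-weight sample mean.

    Assumes a stationary field with exponential covariance
    C(h) = sill * exp(-|h| / corr_length); corr_length = 0 means
    uncorrelated samples (pure nugget). Positions must be distinct.
    """
    x = np.asarray(positions, dtype=float)
    lags = np.abs(x[:, None] - x[None, :])
    if corr_length > 0:
        cov = sill * np.exp(-lags / corr_length)
    else:
        cov = sill * (lags == 0)
    # Var(mean) = (1 / n^2) * sum over all pairs of C(x_i - x_j)
    return cov.sum() / x.size**2

all_samples = np.arange(1000)        # every sample on the grid
thinned = np.arange(0, 1000, 10)     # keep one sample in ten

var_independent = variance_of_mean(all_samples, corr_length=0.0)
var_all = variance_of_mean(all_samples)
var_thinned = variance_of_mean(thinned)

print(f"uncorrelated, n=1000: {var_independent:.4f}")
print(f"correlated,   n=1000: {var_all:.4f}")
print(f"correlated,   n=100 : {var_thinned:.4f}")
```

Under these assumptions the uncorrelated case gives sill/n, the correlated case with all 1000 samples gives a larger error variance than that, and the thinned set of 100 samples gives a larger error variance still. Thinning reduces the correlation between the samples you keep, but it does not make the estimate of the mean any better.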

Regards

Colin


-----Original Message-----
From:   Digby Millikan [mailto:[EMAIL PROTECTED]]
Sent:   Wed 12/8/2004 5:12 PM
To:     ai-geostats; Meng-Ying  Li
Cc:    
Subject:        Re: [ai-geostats] Re: Sill versus least-squares classical variance estimate
Dear Meng-Ying,

It's not that you are defining variance to be the variance of data to be
data beyond the range of the variogram. Say you have a panel made up of
1 million samples which cover the entire panel, then you select 1000
samples to estimate the variance. If two samples of the thousand are within
range of each other (close and similar value), then you are effectively
doubling up on one of the samples, so to give a better representation of
the 1 million samples you are better to remove the doubled up sample,
giving 999 samples to estimate the variance of the 1 million. This will
give a better estimate of the variance you could calculate from the million
by the least squares classical method, which is what Isobel was saying.

Regards Digby




