[ai-geostats] Sill versus least-squares classical variance estimate

2004-12-07 Thread Isobel Clark
Meng-Ying

We are talking about estimating the variance of a set
of samples where spatial dependence exists. 

The classical statistical unbiased estimator of the
population variance is s-squared which is the sum of
the squared deviations from the mean divided by the
relevant degrees of freedom. If the samples are not
inter-correlated, the relevant degrees of freedom are
(n-1). This gives the formula you find in any
introductory statistics book or course.

If samples are not independent of one another, the
degrees of freedom issue becomes a problem and the
classical estimator will be biased (generally too
small on average). 
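
The size of this bias can be checked with a small simulation. The sketch
below is only an illustration under assumed conditions, not anything from
the original posts: it uses a first-order autoregressive series along a
transect as a stand-in for spatially dependent samples, and the names
ar1_transect and mean_classical_variance, the correlation 0.9 and the
sample size 20 are all hypothetical choices. The population variance is 1
by construction.

```python
import random
import statistics


def ar1_transect(n, phi, rng):
    """n equally spaced samples with population variance 1 and
    correlation phi**k between samples k steps apart (assumed model)."""
    z = rng.gauss(0.0, 1.0)  # start in the stationary distribution
    values = [z]
    innovation_sd = (1.0 - phi * phi) ** 0.5
    for _ in range(n - 1):
        z = phi * z + innovation_sd * rng.gauss(0.0, 1.0)
        values.append(z)
    return values


def mean_classical_variance(n, phi, n_realisations=2000, seed=1):
    """Average of the classical s-squared (divisor n-1) over many realisations."""
    rng = random.Random(seed)
    estimates = [
        statistics.variance(ar1_transect(n, phi, rng))
        for _ in range(n_realisations)
    ]
    return statistics.fmean(estimates)


independent = mean_classical_variance(n=20, phi=0.0)
correlated = mean_classical_variance(n=20, phi=0.9)
print(f"independent samples: mean s^2 = {independent:.2f}")
print(f"correlated samples:  mean s^2 = {correlated:.2f}")
```

With independent samples the average s-squared should sit close to the true
value of 1; with strongly correlated samples it should come out well below 1.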

In theory, pairs of samples beyond the range of
influence on a semi-variogram graph are independent of
one another. In theory, the variance of the difference
between two values which are uncorrelated is twice the
variance of one sample around the population mean.
This is thought to be why Matheron defined the
semi-variogram (one-half the squared difference) so
that the final sill would be (theoretically) equal to
the population variance.
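
The step from "twice the variance" to "sill equals variance" can be written
out as follows. This is a sketch under the usual second-order stationarity
assumption (constant mean, covariance depending only on separation); the
symbols Z(x), h, C(h), \gamma(h) and \sigma^2 are notation introduced here,
not taken from the posts.

```latex
\begin{aligned}
\operatorname{Var}[Z(x+h) - Z(x)]
  &= \operatorname{Var}[Z(x+h)] + \operatorname{Var}[Z(x)]
     - 2\operatorname{Cov}[Z(x+h), Z(x)] \\
  &= 2\sigma^2 - 2C(h), \\
\gamma(h) &= \tfrac{1}{2}\operatorname{Var}[Z(x+h) - Z(x)]
           = \sigma^2 - C(h).
\end{aligned}
```

Beyond the range of influence C(h) = 0, so \gamma(h) = \sigma^2 there, which
is the sill.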

There are computer software packages which will draw a
line on your experimental semi-variogram at the height
equivalent to the classically calculated sample
variance. Some people try to force their
semi-variogram models to go through this line. This is
dumb as the experimental sill is a better estimate
because it does have the degrees of freedom it is
supposed to have.
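
A simulated comparison of the two estimates may help here. The following is
a minimal sketch under stated assumptions, not the method of any particular
software package: samples lie on an equally spaced transect, spatial
dependence comes from a moving average of white noise (so correlation is
exactly zero beyond `window` steps and the true variance is 1), and the
experimental sill is taken as the average semi-variogram value over lags
from the range up to half the transect length. All function names and
parameter values are hypothetical.

```python
import random
import statistics


def moving_average_transect(n, window, rng):
    """n samples with variance 1 whose correlation falls linearly to
    zero at lag `window` (the range of influence in this toy model)."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + window - 1)]
    scale = window ** -0.5
    return [scale * sum(noise[i:i + window]) for i in range(n)]


def semivariogram(values, lag):
    """Experimental semi-variogram: half the mean squared difference
    between samples `lag` steps apart."""
    squared_diffs = [
        (values[i + lag] - values[i]) ** 2 for i in range(len(values) - lag)
    ]
    return 0.5 * statistics.fmean(squared_diffs)


def compare(n=30, window=10, n_realisations=2000, seed=7):
    """Average experimental sill and average classical s-squared."""
    rng = random.Random(seed)
    sills, classical = [], []
    for _ in range(n_realisations):
        z = moving_average_transect(n, window, rng)
        beyond_range = [
            semivariogram(z, lag) for lag in range(window, n // 2 + 1)
        ]
        sills.append(statistics.fmean(beyond_range))
        classical.append(statistics.variance(z))
    return statistics.fmean(sills), statistics.fmean(classical)


mean_sill, mean_s2 = compare()
print(f"mean experimental sill: {mean_sill:.2f}")
print(f"mean classical s^2:     {mean_s2:.2f}")
```

In this setup the experimental sill should average close to the true
variance of 1, while the classical estimate should average noticeably lower.
A single realisation of either estimate is still noisy; the comparison is
about averages.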

I am not sure whether this is clear enough. If you
email me off the list, I can recommend publications
which might help you out.

Isobel
http://geoecosse.bizland.com/books.htm

 --- Meng-Ying  Li <[EMAIL PROTECTED]> wrote: 
> Hi Isobel,
> 
> Could you explain why it would be a better estimate of the variance when
> independence is considered? I'd rather think that we consider the
> dependence when the overall variance is to be estimated -- if there
> actually is dependence between values.
> 
> Or are you talking about modeling the sill value by the stabilizing tail on
> the experimental variogram, instead of modeling by the calculated overall
> variance?
> 
> Or, are we talking about variance of different definitions? I'd be
> concerned if I missed some point of the original definition for variances,
> like, the variance should be defined with no dependence between values or
> something like that. Frankly, I don't think I took the definition of
> variance too seriously when I was learning stats.
> 
> 
> Meng-ying
> 
> > Digby
> >
> > I see where you are coming from on this, but in fact
> > the sill is composed of those pairs of samples which
> > are independent of one another - or, at least, have
> > reached some background correlation. This is why the
> > sill makes a better estimate of the variance than the
> > conventional statistical measures, since it is based
> > on independent sampling.
> >
> > Isobel
>  

* By using the ai-geostats mailing list you agree to follow its rules 
( see http://www.ai-geostats.org/help_ai-geostats.htm )

* To unsubscribe to ai-geostats, send the following in the subject or in the 
body (plain text format) of an email message to [EMAIL PROTECTED]

Signoff ai-geostats

Re: [ai-geostats] Sill versus least-squares classical variance estimate

2004-12-07 Thread Meng-Ying Li
Dear List,

I think I'd like to state my problem more clearly.

What I think of as the estimate of the overall variance is the expected
variance in future samples. This has to do with what kind of sampling
scheme we use in the future, however.

If we could assume the future samples to be far enough apart from each other,
then I'd have no problem using the sill value we calculated from the
experimental variogram. Or, if we're talking about setting up a standard
value so we could compare the maximum possible variances to that of other
samples, I'd also have little doubt about the estimation using the sill
value. Otherwise I think the sill value would generally be an
over-estimate of the variance for a future sample, even for samples
collected with complete spatial randomness in the future.
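
How far the classical estimate and the sill differ for a given sampling
scheme can be made concrete with a standard identity. This is a sketch,
assuming a constant mean and a semi-variogram that depends only on the
separation between locations; the symbols z_i, x_i, n and \gamma are
notation introduced here, not taken from the posts.

```latex
\begin{aligned}
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(z_i - \bar z)^2
     = \frac{1}{2n(n-1)}\sum_{i=1}^{n}\sum_{j=1}^{n}(z_i - z_j)^2, \\
\mathrm{E}[s^2] &= \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j \ne i}
                   \gamma(x_i - x_j).
\end{aligned}
```

So the expected classical variance is the average semi-variogram value over
all pairs of sample locations. For a semi-variogram that rises monotonically
to its sill, this average approaches the sill only when nearly all pairs are
separated by more than the range, and falls below the sill whenever a
noticeable share of pairs are closer than that.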

And again, please correct me if I missed any important point along the
discussion. I'd really like to be careful about (geo)stats, but probably
not as careful about asking questions.


Mng-yng


RE: [ai-geostats] Sill versus least-squares classical variance estimate

2004-12-08 Thread Colin Daly
Meng-Ying

Samples taken beyond the range are, in fact, far enough apart from one another! The sill is - to all intents and purposes - equal to the variance of the data. (This fails if there are trends in the data and/or the data are somehow preferentially clustered in high or low regions.)
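
The trend caveat can be illustrated with a short simulation. This is a
sketch under assumed conditions, not taken from the thread: independent
noise with variance 1 is added to a hypothetical linear trend along an
equally spaced transect, and the slope, the transect length and the lags
are arbitrary choices. For this setup the expected semi-variogram at lag k
is 1 + 0.5 * (slope * k)**2, so it keeps climbing instead of levelling off
at a sill.

```python
import random
import statistics


def semivariogram(values, lag):
    """Experimental semi-variogram of an equally spaced transect at one lag."""
    squared_diffs = [
        (values[i + lag] - values[i]) ** 2 for i in range(len(values) - lag)
    ]
    return 0.5 * statistics.fmean(squared_diffs)


rng = random.Random(3)
slope = 0.1  # hypothetical trend per sample spacing
transect = [slope * i + rng.gauss(0.0, 1.0) for i in range(500)]

gamma_by_lag = {lag: semivariogram(transect, lag) for lag in (1, 20, 40)}
for lag, gamma in gamma_by_lag.items():
    expected = 1.0 + 0.5 * (slope * lag) ** 2
    print(f"lag {lag:2d}: experimental {gamma:.2f}, expected {expected:.2f}")
```

The noise variance is 1 at every lag, yet the experimental values grow with
the square of the lag, so no sill estimate can be read from the graph until
the trend is dealt with.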

If you need further confirmation, you will find this developed in the first pages of most of the standard geostat books, such as the two classics:
    1) Matheron, "The theory of regionalised variables..." (also known as Fascicule 5)
    2) Journel and Huijbregts, "Mining Geostatistics"
But you will almost certainly find it in later books such as a) Isaaks and Srivastava; b) Goovaerts; c) Chiles and Delfiner
