Chaosheng
 
If you are only describing your samples, such concepts as random and independent are irrelevant. They apply to the use of your sample statistics to estimate population parameters. If all you want to do is describe your samples, you can calculate any statistics you like.
 
However, you talk about "normality" and "outliers". These concepts depend on the notion of a population from which the samples were drawn. If you are trying to estimate the parameters of that population, then dependence and non-randomness are as important as potential outliers and the shape of the population.
 
The "optimal weighted average" is usually known as "ordinary kriging" provided there is no significant trend. ;-)
 
Isobel
http://www.kriging.com

Chaosheng Zhang <[EMAIL PROTECTED]> wrote:
Dear Isobel,

Thanks for the helpful reply. In fact, I have been waiting for a reply from
you. -:)

I think the questions are fairly well answered by you. However, I want to
move a step forward or perhaps backward.

A question "forward": What are the methods to calculate the "optimal
weighted average"? Are they widely accepted/used/cited?

A question "backward": Do we really need to care whether the data are
spatially correlated or not when we calculate descriptive statistics, even
though we are aware of such an issue? Do results calculated from only the
non-correlated samples (e.g., sill in a variogram) really reflect the "true"
values of statistics? Generally we only care about outliers and
non-normality. In the spatial context, we care about sampling clusters.

Otherwise, we still have to use conventional statistics.

Best regards,

Chaosheng


----- Original Message -----
From: Isobel Clark
To: Chaosheng Zhang
Cc: ai-geostats@jrc.it
Sent: Thursday, May 25, 2006 3:59 PM
Subject: AI-GEOSTATS: Re: Effects of spatial autocorrelation on descriptive
statistics


Chaosheng

Some thoughts in response to your questions:

1: "Spatially correlated data provide redundant information for the
calculation of mean"

I would not say "redundant". Even if information is correlated, the
correlation is not perfect (=1) which would be "redundant". If the data is
spatially correlated, the correlations should be included in the choice of
weight for each sample and in the calculation of the 'standard error' and
confidence levels. An optimal weighted average of spatially correlated data
will always give a better answer than a smaller subset of non-correlated
data.

As an example, you might try kriging a large block with a set of (internal)
samples spaced at the range of influence and then repeat the exercise with a
handful of samples between these 'independent' ones.
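
A minimal sketch of that comparison, under simplifying assumptions that are not from the original exchange: instead of kriging a block, it computes the optimal (minimum-variance) weights for estimating the mean, with samples on a 1-D line and a spherical model with sill 1, range 1 and no nugget. The function names and sample positions are made up for illustration.

```python
def spherical_cov(h, sill=1.0, a=1.0):
    """Covariance C(h) = sill - gamma(h) for a spherical model, no nugget."""
    h = abs(h)
    if h >= a:
        return 0.0
    r = h / a
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def optimal_mean_weights(xs, sill=1.0, a=1.0):
    """Weights w minimising w'Cw subject to sum(w) = 1, and that variance."""
    n = len(xs)
    C = [[spherical_cov(xi - xj, sill, a) for xj in xs] for xi in xs]
    u = solve(C, [1.0] * n)          # u = C^{-1} 1
    total = sum(u)
    return [ui / total for ui in u], 1.0 / total


# Five samples spaced at the range: mutually uncorrelated under this model.
sparse = [0.0, 1.0, 2.0, 3.0, 4.0]
# The same five plus samples halfway between them: correlated neighbours.
dense = [0.5 * i for i in range(9)]

w_sparse, var_sparse = optimal_mean_weights(sparse)
w_dense, var_dense = optimal_mean_weights(dense)
print("variance of mean, uncorrelated subset:", var_sparse)
print("variance of mean, with in-between samples:", var_dense)
```

Under these assumptions the denser, correlated set gives the smaller estimation variance, because the weights of the sparse solution are always one of the candidates the denser optimisation can choose from.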

2: "In the presence of spatially correlated data, would a dispersion
variance . be the proper calculation for the measure of variance?"

The obvious answer is "yes and no". If by dispersion variance you mean the
standard calculation of variance:

Sum(g_i - gbar)^2/(n-1) often calculated as

{Sum(g_i^2) - n*gbar^2}/(n-1)

where g_i represents each sample value and gbar the arithmetic mean of all
samples, then No, it is not appropriate.
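
For reference, a small sketch of that classical calculation, using made-up sample values. It computes the variance from the definition and from the usual shortcut, {Sum(g_i^2) - n*gbar^2}/(n-1), and cross-checks both against Python's standard library. Neither form uses the sample locations, which is why the correlation between samples never enters.

```python
import statistics

# Hypothetical sample values, for illustration only.
g = [2.0, 4.0, 4.0, 5.0, 7.0]
n = len(g)
gbar = sum(g) / n

# Definition: sum of squared deviations from the mean, divided by n - 1.
var_def = sum((gi - gbar) ** 2 for gi in g) / (n - 1)

# Shortcut: sum of squares minus n times the squared mean, divided by n - 1.
var_short = (sum(gi ** 2 for gi in g) - n * gbar ** 2) / (n - 1)

print(var_def, var_short, statistics.variance(g))
```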

The proper calculation for dispersion variance of a spatially correlated
data set includes all the cross-covariances, not just the squares of sample
values. It also requires a better estimate of the population than gbar (see
1 above). If you are looking for descriptive statistics, then the dispersion
variance can be calculated using the 'middle term' from the full estimation
variance -- the gamma-bar(S_i,S_j) term.
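
As a rough illustration of the gamma-bar(S_i,S_j) term, the sketch below averages a semi-variogram model over all pairs of sample locations with equal weights. The spherical model, its parameters (total sill 1, range 1, no nugget), the 1-D sample positions and the function names are all assumptions made for the example.

```python
def spherical_gamma(h, sill=1.0, a=1.0, nugget=0.0):
    """Spherical semi-variogram; `sill` is the total sill (nugget included)."""
    h = abs(h)
    if h == 0.0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def gamma_bar(xs, **model):
    """Average semi-variogram over all sample pairs, including i == j."""
    n = len(xs)
    total = sum(spherical_gamma(xi - xj, **model) for xi in xs for xj in xs)
    return total / n ** 2


# Samples spaced at the range: every distinct pair sits on the sill.
spread = [0.0, 1.0, 2.0, 3.0, 4.0]
# Samples clustered well inside the range: pairs are strongly correlated.
clustered = [0.0, 0.1, 0.2, 0.3, 0.4]

print("gamma-bar, spread samples:", gamma_bar(spread))
print("gamma-bar, clustered samples:", gamma_bar(clustered))
```

With these assumed inputs the spread samples give (n-1)/n times the sill, while the clustered samples give a much smaller value, which shows how clustering shrinks the apparent dispersion.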

In practice, the most appropriate (and probably simplest) estimate of the
'population' dispersion variance in the presence of spatially correlated
data is the total sill on the semi-variogram model. This is, theoretically,
the dispersion variance as calculated from samples which are non-correlated.

Isobel

Chaosheng Zhang <[EMAIL PROTECTED]> wrote:

Dear All,

I'm looking for answers to effects of spatial autocorrelation on
conventional descriptive statistics. More specifically, any comments on the
following statements?

1. "Spatially correlated data provide redundant information for the
calculation of mean";

2. "In the presence of spatially correlated data, would a dispersion
variance . be the proper calculation for the measure of variance?"

Best regards,

Chaosheng Zhang
------------------
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +353-91-492375
Fax: +353-91-495505
E-mail: [EMAIL PROTECTED]
Web1: www.nuigalway.ie/geography/zhang.html
Web2: www.nuigalway.ie/geography/gis


