Re: AI-GEOSTATS: Extreme values?

2001-12-13 Thread Marcel Vallée


Dear Chaosheng Zhang

The sampling interval is so wide that the high values could easily be related
to "hot spots" of higher-grade contamination, i.e. dumping areas for particular
kinds of slag, mineralized waste, etc.  A property map might help.

Have you contoured the data?  Even if you have, real hot spots of environmental
significance might not show up in the 2D distribution on such a wide sampling
grid.
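
As a rough illustration of the grid-spacing point, the sketch below estimates the
chance that a square sampling grid hits a hot spot at all. It assumes an idealised
circular hot spot whose centre falls at random relative to the grid; the function
name and the example radii are hypothetical, and only the 400 m spacing comes from
the original post.

```python
import math

def detection_probability(radius_m: float, spacing_m: float) -> float:
    """Chance that at least one node of a square grid lies inside a circular
    hot spot of the given radius whose centre is placed at random."""
    r, d = radius_m, spacing_m
    if r <= 0:
        return 0.0
    if r >= d / math.sqrt(2):
        return 1.0  # the circle is large enough to always cover a node
    covered = math.pi * r * r  # area around each node that yields a hit
    if r > d / 2:
        # Discs around neighbouring nodes overlap: remove the double-counted
        # lens area (two lenses per node on a square lattice).
        lens = 2 * r * r * math.acos(d / (2 * r)) - (d / 2) * math.sqrt(4 * r * r - d * d)
        covered -= 2 * lens
    return covered / (d * d)

# Hypothetical hot-spot radii (m) against the 400 m grid of the survey
for radius in (50, 100, 200):
    print(radius, round(detection_probability(radius, 400), 3))
```

Under these assumptions a hot spot of 100 m radius would be sampled only about one
time in five, so several dumping areas could be missed entirely.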

Regards

Marcel Vallée, Eng., Geo.
Geoconseil Marcel Vallée Inc.
706 Routhier Ave
Québec, Québec G1X 3J9
Canada
Tel:(1) 418 652 3497
Fax:(1) 418 652 9148
Email:  [EMAIL PROTECTED]

==
13/12/01 08:01:48, Chaosheng Zhang <[EMAIL PROTECTED]> wrote:

>  Date:    Thu, 13 Dec 2001 13:01:48 +
>  From:    Chaosheng Zhang <[EMAIL PROTECTED]>
>  Subject: AI-GEOSTATS: Extreme values?
>  To:      [EMAIL PROTECTED]
>
>  Dear all,
>
>  My question is: How to deal with the extreme/outlying values in a data set?
>
>  I am dealing with heavy metal concentrations in soils from a mine area. The
>  sample number is 223, and the samples are spatially evenly distributed with
>  the sampling interval of 400 metres. There are several samples with
>  extremely high values, which makes me feel uncomfortable. The percentiles of
>  the dataset are listed as follows (in mg/kg):
>
>          Zn     Cu      Pb     Cd     As
>  Min      4      1      25    0.0      2
>   5%     35      6      35    0.1      6
>  10%     40      7      41    0.2      7
>  25%     65     13      62    0.3      9
>  50%    122     18     168    0.6     15
>  75%    338     27     821    1.5     28
>  90%    907     56    2799    2.8     58
>  95%   1986    116    4490    4.2     80
>  96%   2462    151    4698    4.9     82
>  97%   3493    178    5413    6.2     91
>  98%   4697    207    7609    8.3    111
>  99%   6712    247   11750   12.4    184
>  Max  11473   1293   16305   48.5   1060
>
>  When doing geostatistical and statistical analyses, we need some confidence
>  in dealing with these very high extreme values, which account for less
>  than 2% of the total sample number.
>
>  Any suggestions?
>
>  Cheers,
>
>  Chaosheng Zhang
>  ===
>  Dr. Chaosheng Zhang
>  Department of Geography
>  National University of Ireland
>  Galway
>  IRELAND
>
>  Tel: +353-91-524411 ext. 2375
>  Fax: +353-91-525700
>  Email: [EMAIL PROTECTED]
>  ===




--
* To post a message to the list, send it to [EMAIL PROTECTED]
* As a general service to the users, please remember to post a summary of any useful 
responses to your questions.
* To unsubscribe, send an email to [EMAIL PROTECTED] with no subject and "unsubscribe 
ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND 
Subscribe/Unsubscribe requests to the list
* Support to the list is provided at http://www.ai-geostats.org



Re: AI-GEOSTATS: interannual spatial "stability" of variable

2001-12-13 Thread Isobel Clark

Chris

There are two ways of approaching data which has a
time element:

(1) treat time as a co-variable and use co-kriging.
You would probably want to do this if you have more
than one variable anyway

(2) treat time as a dimension -- as an additional
co-ordinate. If your original data is two-dimensional,
you can use any normal 3d geostat package for this. If
your original data is already 3d, things get a little
more complicated.
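
A minimal sketch of option (2), assuming two-dimensional samples recorded as
(x, y, year, value): the year is rescaled and appended as a third coordinate so
that an ordinary 3D routine can treat it like depth. The helper names and the
`time_scale` factor (how many metres one year "counts for") are hypothetical
choices, not something any particular package prescribes.

```python
import math

def space_time_points(samples, time_scale):
    """Convert (x, y, year, value) records into ((x, y, t), value) pairs,
    where t = year * time_scale acts as the third coordinate."""
    return [((x, y, year * time_scale), value) for x, y, year, value in samples]

def separation(p, q):
    """Euclidean distance between two space-time points."""
    return math.dist(p, q)

# Hypothetical records: same field, two consecutive years
samples = [(0.0, 0.0, 2000, 1.0), (30.0, 40.0, 2001, 2.0)]
points = space_time_points(samples, time_scale=50.0)  # assumed: 1 year ~ 50 m
print(separation(points[0][0], points[1][0]))
```

The choice of `time_scale` matters a great deal, since it fixes how strongly
separation in time is weighed against separation in space; in practice it would be
judged from the variograms in the time and space directions.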

A good place to start would be the Geostat Congress volumes
or Noel Cressie's book.

Isobel Clark




AI-GEOSTATS: interannual spatial "stability" of variable

2001-12-13 Thread Chris Duke

Hi, we have multiyear point data (gridded) of a measured variable, Y,
from farm fields and there are spatial patterns to Y. We want to
measure the degree of similarity of these patterns from year to year.
Could someone please point me in a direction or provide references?
thanks, chris






Re: AI-GEOSTATS: Extreme values?

2001-12-13 Thread Chaosheng Zhang

Dear Isobel,

Thanks for your quick and helpful reply!

(1) I would like to trust both the accuracy and precision of the dataset,
and the real problem is how we "play the computer game". The extreme values
may come from samples which by chance contain many minerals.

(2) From the percentiles I provided in the message, you can see that the
dataset is indeed heavily skewed. A logarithmic transformation can make
some of the variables follow the "normal distribution", but not all.
Moreover, the extreme values still look extreme in the transformed dataset.

(3) There may be two populations: "background" and "mineralised". However,
there is really no way to "dichotomise" the two populations. Geographically
or mathematically? Geographically, there are three areas of high values.
Mathematically, we need some proof. Even if we could properly separate
the dataset into two "populations", the extreme values may still be extreme
in the "mineralised" population.
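
For point (2), a rough way to see how far a variable departs from lognormality
using only the posted percentiles is to regress ln(percentile) on the matching
standard normal quantile: a lognormal variable would give a straight line. The
sketch below does this for Zn. It is an informal diagnostic under that assumption,
not a formal test, and the variable names are hypothetical.

```python
import math
from statistics import NormalDist, fmean

# Zn percentiles (mg/kg) as posted in the original message of this thread
zn = {0.05: 35, 0.10: 40, 0.25: 65, 0.50: 122, 0.75: 338, 0.90: 907,
      0.95: 1986, 0.96: 2462, 0.97: 3493, 0.98: 4697, 0.99: 6712}

z = [NormalDist().inv_cdf(p) for p in zn]      # standard normal quantiles
log_q = [math.log(q) for q in zn.values()]     # natural log of the percentiles

# Ordinary least-squares line: ln(percentile) = intercept + slope * z
mean_z, mean_l = fmean(z), fmean(log_q)
slope = (sum((a - mean_z) * (b - mean_l) for a, b in zip(z, log_q))
         / sum((a - mean_z) ** 2 for a in z))
intercept = mean_l - slope * mean_z

# Systematic runs of same-signed residuals indicate curvature, i.e. a shape
# that a single lognormal distribution does not describe well.
residuals = [b - (intercept + slope * a) for a, b in zip(z, log_q)]
for p, res in zip(zn, residuals):
    print(f"{p:>5.0%}  residual in ln(mg/kg): {res:+.2f}")
```

The slope is a crude estimate of the standard deviation of ln(Zn) and the
intercept of its mean; the same check can be repeated for Cu, Pb, Cd and As.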

Since the really "bad" values are only <2% of the total number (such as 4 or
5 values out of the total of 223, which can also be seen from the
percentiles), I am unwilling to use nonparametric methods unless we cannot
find a way to use parametric methods.

Another problem is that when we carry out spatial interpolation, these values
may produce artificial contour lines around these sampling locations, even
though they can be smoothed. I don't think this reflects the real situation
in the field.

Well, I am still not very confident about what the best way would be ... I know
the worst way is to discard these "outlying" values, and the second worst
way is to use non-parametric methods.

Cheers,

Chaosheng Zhang


- Original Message -
From: "Isobel Clark" <[EMAIL PROTECTED]>
To: "Chaosheng Zhang" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Thursday, December 13, 2001 2:18 PM
Subject: Re: AI-GEOSTATS: Extreme values?


> > My question is: How to deal with the
> > extreme/outlying values in a data set?
> The real priority is to establish why you have extreme
> highs. For example:
>
> (1) is there a high imprecision in measuring the
> values, so that the sample observations are actually
> inaccurate? If so, is it relative to the value or a
> flat error?
>
> (2) do you have a skewed distribution of values?
>
> (3) do you have two (or more) populations, only one of
> which gives the high values?
>
> and there may be others. Once you determine the reason
> for extreme values, then you can more objectively know
> how to deal with them.
>
> For example, if you think (2) is most likely then look
> at transformations or distribution-free approaches to
> geostatistics. You can find some of my papers on
> dealing with positively skewed distributions at:
>
> http://uk.geocities.com/drisobelclark/resume/Publications.html
>
> If (3) is more likely - as may be probable if you are
> looking at an area where samples may be 'background'
> or 'contaminated' - you really need to identify the
> populations first. Then you may be able to apply a
> mixture model together with indicator geostatistical
> approaches.
>
> If (1) is your problem, then you may be able to use a
> rough non-parametric approach to get to cross
> validation. The 'error statistics' in a cross
> validation exercise will often assist in identifying
> erroneous sample measurements.
>
> Hope this helps
> Isobel Clark





Re: AI-GEOSTATS: Extreme values?

2001-12-13 Thread Isobel Clark

> My question is: How to deal with the
> extreme/outlying values in a data set?
The real priority is to establish why you have extreme
highs. For example:

(1) is there a high imprecision in measuring the
values, so that the sample observations are actually
inaccurate? If so, is it relative to the value or a
flat error?

(2) do you have a skewed distribution of values?

(3) do you have two (or more) populations, only one of
which gives the high values?

and there may be others. Once you determine the reason
for extreme values, then you can more objectively know
how to deal with them. 

For example, if you think (2) is most likely then look
at transformations or distribution-free approaches to
geostatistics. You can find some of my papers on
dealing with positively skewed distributions at:

http://uk.geocities.com/drisobelclark/resume/Publications.html

If (3) is more likely - as may be probable if you are
looking at an area where samples may be 'background'
or 'contaminated' - you really need to identify the
populations first. Then you may be able to apply a
mixture model together with indicator geostatistical
approaches.
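
To make the indicator idea concrete, here is a minimal sketch of the coding step
only: each value becomes 1 if it exceeds a chosen cutoff and 0 otherwise, and
those 0/1 values are what an indicator variogram or indicator kriging would then
work on. The cutoff of 1986 mg/kg is the posted 95th percentile for Zn, used here
purely as an assumed example; the sample values are hypothetical, and some
texts define the indicator the other way round (1 at or below the cutoff).

```python
def indicator(values, threshold):
    """Code each value as 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in values]

# Hypothetical Zn values (mg/kg) at a handful of sample locations
zn_values = [4, 122, 338, 1986, 4697, 11473]
codes = indicator(zn_values, threshold=1986)

print(codes)
print("proportion above cutoff:", sum(codes) / len(codes))
```

Because the coded values are bounded between 0 and 1, a single extreme assay
carries no more weight than any other sample above the cutoff.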

If (1) is your problem, then you may be able to use a
rough non-parametric approach to get to cross
validation. The 'error statistics' in a cross
validation exercise will often assist in identifying
erroneous sample measurements.
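
A minimal sketch of such a cross validation, under loud assumptions: each sample
is left out in turn and re-estimated from the others, and large errors flag
values worth checking. Inverse-distance weighting stands in here for the "rough
non-parametric approach" simply because it is short to write; it is not kriging,
and the function names and sample values are hypothetical.

```python
import math

def idw(target, neighbours, power=2.0):
    """Inverse-distance-weighted estimate at target=(x, y) from (x, y, value)."""
    num = den = 0.0
    for x, y, value in neighbours:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # coincident sample: use its value directly
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den

def leave_one_out_errors(samples, power=2.0):
    """Estimate minus observed value for each sample, predicted from the rest."""
    errors = []
    for i, (x, y, value) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        errors.append(idw((x, y), others, power) - value)
    return errors

# Hypothetical values on a 400 m grid, with one suspiciously high sample
samples = [(0, 0, 10.0), (400, 0, 10.0), (0, 400, 10.0), (400, 400, 500.0)]
for (x, y, value), err in zip(samples, leave_one_out_errors(samples)):
    print(f"({x:>3}, {y:>3})  observed {value:6.1f}  error {err:+7.1f}")
```

A sample whose error is far larger in magnitude than those of its neighbours is a
candidate for re-assay, although a genuine isolated hot spot would produce the
same signature.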

Hope this helps
Isobel Clark







AI-GEOSTATS: Extreme values?

2001-12-13 Thread Chaosheng Zhang



Dear all,

My question is: How to deal with the extreme/outlying values in a data set?

I am dealing with heavy metal concentrations in soils from a mine area. The
sample number is 223, and the samples are spatially evenly distributed with
the sampling interval of 400 metres. There are several samples with
extremely high values, which makes me feel uncomfortable. The percentiles of
the dataset are listed as follows (in mg/kg):

        Zn     Cu      Pb     Cd     As
Min      4      1      25    0.0      2
 5%     35      6      35    0.1      6
10%     40      7      41    0.2      7
25%     65     13      62    0.3      9
50%    122     18     168    0.6     15
75%    338     27     821    1.5     28
90%    907     56    2799    2.8     58
95%   1986    116    4490    4.2     80
96%   2462    151    4698    4.9     82
97%   3493    178    5413    6.2     91
98%   4697    207    7609    8.3    111
99%   6712    247   11750   12.4    184
Max  11473   1293   16305   48.5   1060

When doing geostatistical and statistical analyses, we need some confidence
in dealing with these very high extreme values, which account for less
than 2% of the total sample number.

Any suggestions?

Cheers,

Chaosheng Zhang
===
Dr. Chaosheng Zhang
Department of Geography
National University of Ireland
Galway
IRELAND

Tel: +353-91-524411 ext. 2375
Fax: +353-91-525700
Email: [EMAIL PROTECTED]
===