> My question is: How to deal with the
> extreme/outlying values in a data set?

The real priority is to establish why you have extreme highs. For example:

(1) Is there high imprecision in measuring the values, so that the sample observations are actually inaccurate? If so, is the error relative to the value or a flat error?

(2) Do you have a skewed distribution of values?

(3) Do you have two (or more) populations, only one of which gives the high values?

There may be other causes as well.

Once you determine the reason for the extreme values, you can decide more objectively how to deal with them.

For example, if you think (2) is most likely, then look at transformations or distribution-free approaches to geostatistics. You can find some of my papers on dealing with positively skewed distributions at:

http://uk.geocities.com/drisobelclark/resume/Publications.html

If (3) is more likely, as may well be the case if you are looking at an area where samples may be 'background' or 'contaminated', you really need to identify the populations first. Then you may be able to apply a mixture model together with indicator geostatistical approaches.

If (1) is your problem, then you may be able to use a rough non-parametric approach to get to cross validation. The 'error statistics' in a cross validation exercise will often assist in identifying erroneous sample measurements.

Hope this helps

Isobel Clark
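
As a rough illustration of the cross validation idea for case (1), here is a minimal sketch in Python. It rests on several assumptions that are not in the message above: inverse distance weighting stands in for the "rough non-parametric approach", samples are (x, y, value) tuples, and the error statistic is a median-based robust z-score with the conventional 0.6745 scaling and a cut-off of 3.5. The function names are hypothetical. Each sample is left out in turn, estimated from all the others, and the differences between actual and estimated values are screened for unusually large errors.

```python
import math
import statistics


def idw_estimate(target, neighbours, power=2.0):
    """Inverse-distance-weighted estimate at target (x, y) from (x, y, value) neighbours."""
    num = 0.0
    den = 0.0
    for x, y, value in neighbours:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # co-located sample: use its value directly
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den  # needs at least one neighbour


def loo_errors(samples, power=2.0):
    """Leave-one-out cross validation: actual minus estimated value for each sample."""
    errors = []
    for i, (x, y, value) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        errors.append(value - idw_estimate((x, y), others, power))
    return errors


def flag_suspects(samples, threshold=3.5, power=2.0):
    """Indices of samples whose cross validation error is extreme on a robust z-score."""
    errors = loo_errors(samples, power)
    med = statistics.median(errors)
    mad = statistics.median([abs(e - med) for e in errors])
    if mad == 0.0:
        return []  # no spread in the errors, nothing can be ranked
    return [i for i, e in enumerate(errors)
            if abs(0.6745 * (e - med) / mad) > threshold]


# Hypothetical data: a flat 5 x 5 grid of 10s with one suspiciously high reading.
grid = [(float(x), float(y), 10.0) for x in range(5) for y in range(5)]
grid[12] = (2.0, 2.0, 50.0)
suspects = flag_suspects(grid)
```

Flagged samples are only candidates for checking. A bad measurement also inflates the errors of its neighbours, and a genuine high from a skewed or mixed population, as in cases (2) and (3), will be flagged in exactly the same way.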