Re: AI-GEOSTATS: SIC2004: Automatic (one-click) mapping
Dear Gregoire, dear list,

I am not clear whether it is allowed to start a discussion on SIC2004 before it actually starts. Anyway, I would like to promote discussion on the following point.

In one sentence: due to game theory, one of the worst blind algorithms will perform best in SIC2004.

The point is that a fully automatic estimation algorithm has to obey the laws of game theory. In particular, we have the classical problem of statistical optimality. Let L(A,P) describe any negative measure of fitness (the expected loss in statistics) of an algorithm A coping with a truth that has probability distribution P. Then in general there does not exist any algorithm A0 with

L(A0,P) <= L(A,P) for all A and P.

That leads to the definition of an admissible estimator A0 in statistics: there does not exist any A1 that is strictly better, i.e.

there is no A1 such that L(A1,P) <= L(A0,P) for all P, with strict inequality for at least one P.

Comparing two admissible estimators A0 and A1 for P in {P0,P1} leads to the following conclusion:

if L(A0,P0) < L(A1,P0), then L(A0,P1) > L(A1,P1).

Thus the estimator performing best on that one problem will probably be worse on others, because a simple, specific algorithm fit for the specific problem will perform best. The question is just which simple, specific algorithm will win; that depends on the problem, since specific algorithms perform best on their own problem but worse on others.

But what we need for blind mapping is something totally different: it should perform well for all P (or, even better, bail out with an error message when it is not able to give good results). This corresponds to the concept of minimax estimators, which minimize the maximum of L(A,P) over P, or to Bayes estimators, which minimize the mean of L(A,P) over all expected P. However, for a minimax estimator there typically exists, for any fixed P, a better algorithm. And because many algorithms are in the test but only one problem is, we will see one of the naive ones perform best.
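To make the argument above concrete, here is a minimal sketch in Python. The loss table, the algorithm names and the two problems P0 and P1 are all invented for illustration; they are not taken from SIC2004 or from any real comparison. The sketch only shows how the winner on a single problem, the minimax choice and the Bayes choice can differ even when every candidate is admissible.

```python
# Hypothetical expected losses L(A, P): one row per algorithm, one entry per problem.
# All numbers are made up purely for illustration.
LOSS = {
    "specialist_for_P0": {"P0": 1.0, "P1": 9.0},
    "specialist_for_P1": {"P0": 8.0, "P1": 1.5},
    "robust":            {"P0": 2.0, "P1": 2.5},
}


def best_on(problem, loss=LOSS):
    """Algorithm with the smallest loss on one single problem."""
    return min(loss, key=lambda a: loss[a][problem])


def minimax(loss=LOSS):
    """Algorithm minimising the worst-case loss over all problems."""
    return min(loss, key=lambda a: max(loss[a].values()))


def bayes(prior, loss=LOSS):
    """Algorithm minimising the prior-weighted mean loss."""
    return min(loss, key=lambda a: sum(prior[p] * loss[a][p] for p in prior))


def is_admissible(name, loss=LOSS):
    """True if no other algorithm is at least as good on every problem
    and strictly better on at least one."""
    mine = loss[name]
    for other, theirs in loss.items():
        if other == name:
            continue
        no_worse = all(theirs[p] <= mine[p] for p in mine)
        strictly_better = any(theirs[p] < mine[p] for p in mine)
        if no_worse and strictly_better:
            return False
    return True


if __name__ == "__main__":
    for p in ("P0", "P1"):
        print(p, "winner:", best_on(p))
    print("minimax:", minimax())
    print("Bayes (uniform prior):", bayes({"P0": 0.5, "P1": 0.5}))
    print("all admissible:", all(is_admissible(a) for a in LOSS))
```

With this invented table, each single problem is won by its own specialist, while the minimax and uniform-prior Bayes choices are both the "robust" algorithm, which never wins a single-problem contest.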
As an example, compare two algorithms using ordinary kriging with Alg1: a linear variogram, and Alg2: a power variogram. If the data indeed obey a linear variogram, Alg1 is BLUE, and Alg2 estimates a power near one and will be nearly BLUE: Alg1 wins and Alg2 is slightly worse. If the data obey a spherical variogram, Alg2 performs better than Alg1, but will be outperformed by simple inverse-squared-distance methods. However, Alg2 performed well in both cases.

Thus I would propose to modify SIC2004 in the following way: give multiple problems.

Hoping for a nice discussion,
Gerald

On Wednesday 14 April 2004 11:01, Gregoire Dubois wrote:

Good day everyone!

Time is ripe for a new SIC (Spatial Interpolation Comparison) exercise! The second edition of SIC (SIC2004, http://www.ai-geostats.org/events/sic2004.htm ) will be launched by the end of this month. The topic of this year will be automatic mapping, that is, the use of algorithms for spatial interpolation that will not require any intervention or decision from the users. Hence the expression "one-click mapping". Such algorithms would obviously be more than useful in the frame of environmental monitoring networks (e.g. automatic mapping of ozone levels in cities, radioactivity in the environment, etc.). However, SIC97 has shown that it was very difficult to generate good results if one is not using the information provided by the spatial correlation (i.e. semivariograms). Can we today blindly use functions for the automatic fitting of semivariograms? Can machine learning algorithms compete with geostatistical functions?

As for SIC97 (see http://www.ai-geostats.org/events/sic97.htm ), participants in SIC2004 will receive a subset of an environmental data set (typically measurements of an environmental variable plus the spatial coordinates of the sampling places) and will have to estimate the values taken by the variable at the remaining locations of the full data set.
The true values found at these locations will be made public only at the end of the exercise. Various criteria will be used to assess the performance of the interpolation algorithms (time of calculation, minimum errors, etc.). Because everything should be automatic, participants in SIC2004 will have to prepare their algorithms before receiving the data: only sampling locations will be given and no interaction with the algorithm will be allowed during the exercise. No worry, participants will have from the end of this month until the 15th of September to set up their functions.

Participants in SIC2004 will be invited at the end of the exercise to submit a manuscript for publication in the online journal GIDA (Geographic Information and Decision Analysis). A selected number of papers will be published in a book (a European Report hardcopy) with some unpublished material provided by the editorial board.

For more information, please visit
RE: AI-GEOSTATS: SIC2004: Automatic (one-click) mapping
Hi Gerald, hello everyone,

> I am not clear whether it is allowed to start a discussion on SIC2004 before it actually starts.

Any comments or discussion about automatic mapping are of course more than welcome! Actually, they are more than urgent, since I want to respect all deadlines and we have only a few days left. Discussions about automatic mapping would potentially interest anyone working in spatial statistics, and SIC2004 would certainly benefit from them. Discussions about the management and organisation of SIC2004 should be sent directly to me, not to the list. If you want to contribute to SIC2004 without playing the best-estimation game, please do not hesitate to submit a manuscript that would be published in the hardcopy version of SIC2004 (after reviewing, of course).

Only one thing is sure so far: SIC2004 will be about automatic mapping of daily measurements of a variable X (no revelation at this point), and some prior information will be given to tune all the parameters in advance. We still have to define exactly what prior information to give: a histogram, other data (subsets of the whole dataset, basic statistics, repeated measurements of X at the same locations but for other days, ...)? This is where we are now.

Best wishes and thanks in advance for any feedback,
Gregoire

-Original Message-
From: Gerald Boogaart [mailto:[EMAIL PROTECTED]
Sent: 21 April 2004 11:47
To: Gregoire Dubois; [EMAIL PROTECTED]
Subject: Re: AI-GEOSTATS: SIC2004: Automatic (one-click) mapping

Dear Gregoire, dear list,

I am not clear whether it is allowed to start a discussion on SIC2004 before it actually starts. Anyway, I would like to promote discussion on the following point. In one sentence: due to game theory, one of the worst blind algorithms will perform best in SIC2004. The point is that a fully automatic estimation algorithm has to obey the laws of game theory.
RE: AI-GEOSTATS: Help: spatial heterogeneity and autocorrelation indices
Hi,

This is a question I have been scratching my head over lately. In my data, there seems to be a global range, i.e. spatial autocorrelation does imply some sort of homogeneity throughout the whole area. However, if you look in more detail, some of the sub-regions contain outliers and hotspots which have local ranges totally different from the global range. So the questions are as follows:

1. Do we need to break the area up into different sub-regions (heterogeneity) even if there is a global range?
2. If so, how do you break the whole area into sub-regions? Can we use cluster analysis based on minimum-variance algorithms? Or can we optimize the partition of the area into different sub-regions based on the distribution of local ranges? Can we separate the local range from the global range?

The Getis (2001) paper discussed something about this, but I think a lot remains to be done. Does anyone have any comments about this?

Shing

= Original Message From Chunhua Zhang [EMAIL PROTECTED] =

Hello list,

I have a question: I am now interested in heterogeneity. Heterogeneity is related to pattern, and pattern is the absence of randomness. Many indices of spatial autocorrelation have been applied to study heterogeneity. From my understanding, spatial autocorrelation indices deal with spatial dependence, which is a special case of spatial homogeneity. Therefore, is it reasonable to apply indices of homogeneity to pattern study?

Thanks!
Chunhua

Shing-Tzong Lin
Teaching and Research Assistant
Department of Geography
Texas State University, San Marcos
(512)345-1935

--
* To post a message to the list, send it to [EMAIL PROTECTED]
* As a general service to the users, please remember to post a summary of any useful responses to your questions.
* To unsubscribe, send an email to [EMAIL PROTECTED] with no subject and "unsubscribe ai-geostats" followed by "end" on the next line in the message body. DO NOT SEND Subscribe/Unsubscribe requests to the list
* Support to the list is provided at http://www.ai-geostats.org
AI-GEOSTATS: references for spatial weighting matrix based on distance
Hello list,

I have read a thesis where the author considers the matrix W = 1 - Dij/max(Dij) (Dij is the distance between sites i and j) as a spatial weight matrix for spatial analysis purposes (e.g. Moran's I). I am looking for other references dealing with spatial weighting matrices based on distance. Functions such as 1/dij or 1 - dij^2/max(dij^2) can also be considered. I am looking for applications using these kinds of matrices, or papers which provide some comparisons between these different options.

Thanks in advance,
Sincerely.

Stéphane DRAY
--
Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada
Tel : 514 343 6111 poste 1233
E-mail : [EMAIL PROTECTED]
--
Web http://www.steph280.freesurf.fr/
--
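For readers who want to try the three weighting options mentioned in the question, here is a minimal sketch in pure Python. The function names, the `kind` labels and the toy transect are assumptions made for illustration only; the zero diagonal and the global Moran's I formula are the usual conventions, not something specified in the thesis referred to above.

```python
import math


def distance_matrix(points):
    """Euclidean distances D[i][j] between all pairs of sites."""
    return [[math.dist(p, q) for q in points] for p in points]


def weights(D, kind="linear"):
    """Distance-based spatial weights with a zero diagonal.

    kind = "linear"    -> 1 - d / max(d)
    kind = "quadratic" -> 1 - d^2 / max(d^2)
    kind = "inverse"   -> 1 / d   (assumes no two sites coincide)
    """
    n = len(D)
    dmax = max(max(row) for row in D)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a site is not treated as its own neighbour
            d = D[i][j]
            if kind == "linear":
                W[i][j] = 1.0 - d / dmax
            elif kind == "quadratic":
                W[i][j] = 1.0 - (d * d) / (dmax * dmax)
            elif kind == "inverse":
                W[i][j] = 1.0 / d
            else:
                raise ValueError(f"unknown kind: {kind}")
    return W


def morans_i(x, W):
    """Global Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2,
    where z = x - mean(x) and S0 is the sum of all weights."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    s0 = sum(sum(row) for row in W)
    num = sum(W[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den


if __name__ == "__main__":
    # Toy data: four equally spaced sites on a line with a linear trend.
    sites = [(0, 0), (1, 0), (2, 0), (3, 0)]
    values = [1.0, 2.0, 3.0, 4.0]
    D = distance_matrix(sites)
    for kind in ("linear", "quadratic", "inverse"):
        print(kind, round(morans_i(values, weights(D, kind)), 4))
```

On this toy transect the linear and quadratic weights give a positive Moran's I while the 1/d weights give a slightly negative one, which illustrates why a comparison between the options is worth asking for.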