Re: AI-GEOSTATS: SIC2004: Automatic (one-click) mapping

2004-04-21 Thread Gerald Boogaart
Dear Gregoire, Dear List

I am not clear whether it is allowed to start a discussion on SIC2004 before it 
actually starts. Anyway, I would like to promote discussion of the following 
point:

In one sentence: Due to game theory, one of the worst blind algorithms will 
perform best in SIC2004.

The point is:
A fully automatic estimation algorithm has to obey the laws of game theory. 
In particular, we have the classical problem of statistical optimality: 

Let L(A,P) describe any negative measure of fitness (the expected loss in 
statistics) of an algorithm A coping with a truth with probability 
distribution P. Then in general there does not exist any algorithm A0 with 

L(A0,P) <= L(A,P) for all A and P

That leads to the definition of an admissible estimator A0 in statistics, which 
is given by: 
There does not exist any A1 such that A1 is strictly better, i.e. 

there exists no A1 such that for all P: L(A1,P) <= L(A0,P), 
with strict inequality for at least one P. 

Comparing two admissible estimators A0 and A1 for P in {P0,P1} leads to the 
following conclusion:
When L(A0,P0) < L(A1,P0), then L(A0,P1) > L(A1,P1).
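
The admissibility argument can be checked on a small loss table. The 
following is a minimal Python sketch and not part of SIC2004: the loss 
values and the names LOSS, dominates and admissible are made up for 
illustration.

```python
# Hypothetical expected losses L(A, P); the numbers are invented for illustration.
LOSS = {
    "A0": {"P0": 1.0, "P1": 3.0},
    "A1": {"P0": 2.0, "P1": 1.5},
    "A2": {"P0": 2.5, "P1": 3.5},  # worse than A0 and A1 for every P
}


def dominates(a, b, loss=LOSS):
    """True if a is never worse than b and strictly better for at least one P."""
    problems = loss[a].keys()
    never_worse = all(loss[a][p] <= loss[b][p] for p in problems)
    somewhere_better = any(loss[a][p] < loss[b][p] for p in problems)
    return never_worse and somewhere_better


def admissible(loss=LOSS):
    """Algorithms that are not dominated by any other algorithm in the table."""
    return [a for a in loss
            if not any(dominates(b, a, loss) for b in loss if b != a)]


print(admissible())
```

In this table A0 and A1 are both admissible: A0 is better on P0, so it has 
to be worse on P1, which is the conclusion stated above.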

Thus the estimator performing best on that one problem will probably be 
worse on others, because a simple/specific algorithm fit for the specific 
problem will perform best. The question is just which simple/specific 
algorithm will win (that will depend on the problem, since specific 
algorithms perform best on their own problem but worse on others). But what 
we need for blind mapping is something totally different:

It should perform well for all P. (Or even better: bail out with an error 
message when it is not able to give good results.)

This corresponds to the concept of minimax estimators, which minimize the 
maximum of L(A,P) over P, or to Bayes estimators, which minimize the mean of 
L(A,P) over all expected P. 

However, for any minimax estimator and any fixed P, a better algorithm 
typically exists. And because many algorithms are in the test, but only one 
problem is in the test, we will see one of the naive ones perform best.
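
To illustrate the contrast between the winner on a single problem and a 
minimax or Bayes choice, here is a minimal Python sketch. The loss table, 
the algorithm names and the uniform prior are assumptions chosen only for 
illustration; they are not SIC2004 results.

```python
# Hypothetical expected losses L(A, P) for three specialists and one robust method.
LOSS = {
    "specialist_P0": {"P0": 1.0, "P1": 5.0, "P2": 6.0},
    "specialist_P1": {"P0": 5.0, "P1": 1.0, "P2": 6.0},
    "specialist_P2": {"P0": 6.0, "P1": 5.0, "P2": 1.0},
    "robust":        {"P0": 2.0, "P1": 2.0, "P2": 2.0},
}


def minimax_choice(loss):
    """Algorithm with the smallest worst-case loss over all P."""
    return min(loss, key=lambda a: max(loss[a].values()))


def bayes_choice(loss, prior):
    """Algorithm with the smallest prior-weighted mean loss."""
    return min(loss, key=lambda a: sum(prior[p] * loss[a][p] for p in prior))


def winner_on(loss, problem):
    """Algorithm with the smallest loss on one single problem."""
    return min(loss, key=lambda a: loss[a][problem])


uniform_prior = {"P0": 1 / 3, "P1": 1 / 3, "P2": 1 / 3}  # assumed prior
print("minimax:", minimax_choice(LOSS))
print("Bayes:  ", bayes_choice(LOSS, uniform_prior))
for problem in ("P0", "P1", "P2"):
    print("winner on", problem, "->", winner_on(LOSS, problem))
```

With these numbers the robust method is both the minimax and the Bayes 
choice, yet it never wins a contest run on one single problem.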

As an example, compare two algorithms using ordinary kriging with Alg1: a 
linear variogram, Alg2: a power variogram. 

If the data indeed obey a linear variogram, Alg1 is BLUE, and Alg2 
estimates a power near one and will be nearly BLUE. Alg1 wins and Alg2 is 
slightly worse. 

If the data obey a spherical variogram, Alg2 performs better than Alg1, but 
will be beaten by simple inverse squared distance methods. 

However, Alg2 performs well in both cases. 
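
As a rough sketch of the two algorithms in this example, the following 
Python code solves the ordinary kriging system for a power variogram 
gamma(h) = scale * h^exponent; exponent 1 gives the linear variogram of 
Alg1. The function names, the fixed exponent (a real Alg2 would estimate it 
from the data) and the four-point data set are assumptions for illustration 
only.

```python
import math


def power_variogram(h, exponent=1.0, scale=1.0):
    """gamma(h) = scale * h**exponent, valid for 0 < exponent < 2 (1 = linear)."""
    return scale * h ** exponent


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(points, values, target, gamma):
    """Return (estimate, weights) at target for the variogram function gamma."""
    n = len(points)
    # Variogram matrix bordered by ones: the last row enforces sum(weights) = 1.
    a = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # drop the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # made-up sites
values = [1.0, 2.0, 3.0, 4.0]                              # made-up measurements
for exponent in (1.0, 1.5):  # linear variogram, then one power variogram
    gamma = lambda h, e=exponent: power_variogram(h, e)
    estimate, weights = ordinary_kriging(points, values, (0.5, 0.5), gamma)
    print(exponent, round(estimate, 6), [round(w, 6) for w in weights])
```

The dense solver is written out only to keep the sketch free of 
dependencies; it assumes distinct sampling locations.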

Thus I would propose to modify SIC2004 in the following way: 
Give multiple problems.

Hoping for nice discussion,
Gerald





On Wednesday 14 April 2004 11:01, Gregoire Dubois wrote:
 Good day everyone!

 Time is ripe for a new SIC (Spatial Interpolation Comparison) exercise !

 The second edition of SIC (SIC2004
 http://www.ai-geostats.org/events/sic2004.htm ) will be launched by
 the end of this month. The topic of this year will be automatic
 mapping, that is the use of algorithms for spatial interpolation that
 will not require any intervention or decision from the users. Hence the
 expression one-click mapping. Such algorithms would be obviously more
 than useful in the frame of environmental monitoring networks (e.g.
 automatic mapping of ozone levels in cities, radioactivity in the
 environment, etc.).  However, SIC97 has shown that it was very difficult
 to generate good results if one is not using the information provided by
 the spatial correlation (i.e. semivariograms). Can we today blindly use
 functions for the automatic fitting of semivariograms? Can machine
 learning algorithms compete with geostatistical functions?

 As for SIC97 (see http://www.ai-geostats.org/events/sic97.htm ),
 participants to SIC2004
 will receive a subset of an environmental data set (typically
 measurements of an environmental variable + spatial coordinates of the
 sampling places) and will have to estimate the values taken by the
 variable at the remaining locations of the full data set. The true
 values found at these locations will be made public only at the end of
 the exercise. Various criteria will be used to assess the performances
 of the interpolation algorithms (time of calculation, minimum errors,
 etc.).

 Because everything should be automatic, participants to SIC2004 will
 have to prepare their algorithms before receiving the data: only
 sampling locations will be given and no interaction with the algorithm
 will be allowed during the exercise. No worry, participants will have
 from the end of this month until the 15th of September to setup their
 functions.

 Participants to SIC2004 will be invited at the end of the exercise to
 submit a manuscript for publication in the online journal GIDA
 (Geographic Information and Decision Analysis). A selected number of
 papers will be published in a book (a European Report hardcopy) with
 some unpublished material provided by the editorial board.

 For more information, please visit 

RE: AI-GEOSTATS: SIC2004: Automatic (one-click) mapping

2004-04-21 Thread Gregoire Dubois
Hi Gerald, hello everyone,

I am not clear whether it is allowed to start a discussion on SIC2004
before it actually starts.

Any comments or discussion about automatic mapping are of course more than
welcome!! 

Actually they are more than urgent since I want to respect all deadlines
and we have only a few days left.
 
Discussions about automatic mapping would potentially interest anyone
working in spatial statistics and SIC2004 would certainly benefit from
them. Discussions about the management and organisation of SIC2004 should
be sent directly to me, not to the list.

If you want to contribute to SIC2004 without playing the best
estimation game, please do not hesitate to submit a manuscript that
would be published in the hardcopy version of SIC2004 (after reviewing
of course).

One thing only so far is sure: SIC2004 will be about automatic mapping
of daily measurements of a variable X (no revelation at this point) and
some prior information will be given to tune all the parameters in
advance.

We still have to define what prior information to give exactly. A
histogram, other data (subsets of the whole dataset, basic statistics,
repeated measurements of X at the same locations but for other days,
...?)

This is where we are now.

Best wishes and thanks in advance for any feedback,

Gregoire






-Original Message-
From: Gerald Boogaart [mailto:[EMAIL PROTECTED] 
Sent: 21 April 2004 11:47
To: Gregoire Dubois; [EMAIL PROTECTED]
Subject: Re: AI-GEOSTATS: SIC2004: Automatic (one-click) mapping


RE: AI-GEOSTATS: Help: spatial heterogeneity and autocorrelation indices

2004-04-21 Thread sl23349
Hi,

This is a question I have been scratching my head over lately.

In my data, there seems to be a global range, i.e. spatial autocorrelation 
does imply some sort of homogeneity throughout the whole area. However, if you 
look in more detail, some of the sub-regions contain outliers and hotspots 
whose local ranges are totally different from the global range. So the 
questions are as follows:

1. Do we need to break the area up into different sub-regions (heterogeneity) 
even if we do have a global range?

2. If so, how do you break the whole area into sub-regions? Can we use cluster 
analysis based on minimum-variance algorithms? Or can we optimize the division 
of the area into different sub-regions based on the distribution of local 
ranges? Can we separate the local range from the global range? Getis (2001) 
discussed something about this, but I think there is a lot that still needs to 
be done. Does anyone have any comments about this?


Shing




= Original Message From Chunhua Zhang [EMAIL PROTECTED] =
 Hello lists,

I have a question:
I am now interested in heterogeneity.  Heterogeneity is related to pattern 
and pattern is absence of randomness.
Many indices of spatial autocorrelation have been applied to study 
heterogeneity.
From my understanding, spatial autocorrelation indices deal with spatial 
dependence, while it is a special case of spatial homogeneity. Therefore, is 
it reasonable to apply indices of homogeneity to pattern study?

Thanks!

Chunhua

Shing-Tzong Lin
Teaching and Research Assistant
Department of Geography
Texas State University, San Marcos
(512)345-1935


--
* To post a message to the list, send it to [EMAIL PROTECTED]
* As a general service to the users, please remember to post a summary of any useful 
responses to your questions.
* To unsubscribe, send an email to [EMAIL PROTECTED] with no subject and unsubscribe 
ai-geostats followed by end on the next line in the message body. DO NOT SEND 
Subscribe/Unsubscribe requests to the list
* Support to the list is provided at http://www.ai-geostats.org


AI-GEOSTATS: references for spatial weighting matrix based on distance

2004-04-21 Thread Stephane DRAY
Hello list,

I have read a thesis where the author considers the matrix W=1-Dij/max(Dij) 
(Dij is the distance between sites i and j) as a spatial weight matrix for 
spatial analysis purposes (e.g. Moran's I).
I am looking for other references dealing with spatial weighting matrices 
based on distance.
Functions such as 1/dij, 1-dij^2/max(dij^2) can also be considered.
I am looking for applications using this kind of matrix or papers which 
provide some comparisons between these different options.
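
For comparison, the three distance-based weightings mentioned above, 
together with Moran's I, can be sketched in Python as follows. This is a 
minimal illustration and is not taken from the thesis: the function names 
are hypothetical, the data are made up, and setting the diagonal weights to 
zero is an assumption (the usual convention for Moran's I), whereas 
1-Dij/max(Dij) taken literally gives 1 on the diagonal.

```python
import math


def distance_matrix(coords):
    """Euclidean distances d_ij between all pairs of sites."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def _off_diagonal(d, f):
    """Apply f to every off-diagonal distance; keep w_ii = 0 (assumed convention)."""
    n = len(d)
    return [[0.0 if i == j else f(d[i][j]) for j in range(n)] for i in range(n)]


def weights_linear(d):
    """w_ij = 1 - d_ij / max(d)."""
    dmax = max(max(row) for row in d)
    return _off_diagonal(d, lambda h: 1.0 - h / dmax)


def weights_inverse(d):
    """w_ij = 1 / d_ij; requires all sites to be distinct."""
    return _off_diagonal(d, lambda h: 1.0 / h)


def weights_squared(d):
    """w_ij = 1 - d_ij^2 / max(d^2)."""
    dmax2 = max(max(row) for row in d) ** 2
    return _off_diagonal(d, lambda h: 1.0 - h * h / dmax2)


def morans_i(x, w):
    """Moran's I = (n / S0) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(v * v for v in dev)


coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]  # made-up transect
x = [1.0, 2.0, 3.0, 4.0]                                   # made-up values
d = distance_matrix(coords)
for name, build in [("1 - d/max(d)", weights_linear),
                    ("1/d", weights_inverse),
                    ("1 - d^2/max(d^2)", weights_squared)]:
    print(name, round(morans_i(x, build(d)), 4))
```

Running the same data through all three matrices shows how much the choice 
of weighting function alone changes the index.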

Thanks in advance,
Sincerely.
Stéphane DRAY
-- 

Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada
Tel : 514 343 6111 poste 1233
E-mail : [EMAIL PROTECTED]
-- 

Web  http://www.steph280.freesurf.fr/
-- 


