[R-sig-Geo] Looking for a clustering indicator

Adrian.Baddeley Tue, 16 Feb 2010 16:46:25 -0800

Etienne Bellemare Racine etienn...@gmail.com writes:


> I am looking for a way to tell how much clustered or not a process is
> (on a numeric scale). I've tried to test for CSR, but the result is too
> narrow as it only tells if the process is random and uniform inside a
> confidence interval. I would like to have an indicator going e.g. from
> random, to clustered, to very clustered. Do you know any way I could do
> that on a more than 3000 points pattern ?

> Maybe I've overlooked a simple CSR test, which could be interpreted to
> give a non-boolean answer ?

If I'm not mistaken, you are asking about spatial point pattern data (so
Moran's I, which measures autocorrelation among values observed at fixed
locations, is not applicable).

A hypothesis test is designed to give a yes/no answer. To get a measure of 
clustering, you need a summary statistic of some kind.

You could use the K-function or one of the other classical summary statistics
(G-function, F-function, etc.). The values of these functions are indicators of
the degree of clustering or regularity in the point pattern. Choose a
particular distance r. Since K(r) = pi * r^2 under CSR, a value K(r) > pi * r^2
suggests clustering and a value K(r) < pi * r^2 suggests regularity. The value
of K(r), read against the benchmark pi * r^2, is a measure of the degree of
clustering or regularity. Similarly for the other summary functions.
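
As a rough illustration of the comparison with pi * r^2, here is a minimal sketch in base R. The function naive_K is a hypothetical helper written for this example, not part of any package; it ignores edge effects, so it slightly underestimates K(r). The simulated patterns, the window (unit square) and the choice r = 0.05 are all assumptions made for illustration. In practice the edge-corrected estimate from Kest in spatstat would be the better choice.

```r
# Naive K-function estimate at a single distance r (no edge correction).
# Assumption: the points are a two-column matrix of x, y coordinates
# inside a window of known area.
naive_K <- function(xy, r, area = 1) {
  n <- nrow(xy)
  d <- as.vector(dist(xy))   # all n(n-1)/2 unordered pairwise distances
  # Each unordered pair is counted twice, once from each point's side.
  area * 2 * sum(d <= r) / (n * (n - 1))
}

set.seed(42)
n <- 500

# CSR: independent uniform points in the unit square.
csr <- cbind(runif(n), runif(n))

# Clustered: 25 parent points with offspring scattered tightly around them.
parents <- cbind(runif(25), runif(25))
idx <- sample(25, n, replace = TRUE)
clustered <- parents[idx, ] + matrix(rnorm(2 * n, sd = 0.02), ncol = 2)
clustered <- pmin(pmax(clustered, 0), 1)   # keep points inside the window

r <- 0.05
benchmark <- pi * r^2   # value of K(r) under CSR

cat("pi * r^2       :", benchmark, "\n")
cat("K(r), CSR      :", naive_K(csr, r), "\n")
cat("K(r), clustered:", naive_K(clustered, r), "\n")
```

The further K(r) lies above the benchmark, the stronger the clustering at scale r; repeating the calculation for several values of r shows how the answer depends on the scale.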

I assume that you calculated envelopes of the K-function (for example) based on 
simulation from CSR, and plotted these together with the estimated K-function 
from the data point pattern. This is equivalent to a hypothesis test (it is NOT 
equivalent to a confidence interval). The test statistic is the rank of the 
observed value of K(r) amongst the simulated values of K(r). You could use this 
rank as a measure of clustering or repulsion.
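
The rank idea can be sketched as follows in base R, under the same assumptions as before: naive_K is a hypothetical helper without edge correction, the window is the unit square, and the "observed" pattern is simulated here only so that the example runs on its own. Because the same uncorrected estimator is applied to the data and to the simulations, the rank is still a fair comparison.

```r
# Naive K-function estimate at a single distance r (no edge correction).
naive_K <- function(xy, r, area = 1) {
  n <- nrow(xy)
  d <- as.vector(dist(xy))
  area * 2 * sum(d <= r) / (n * (n - 1))
}

set.seed(1)
n    <- 300    # number of points (illustrative)
r    <- 0.05   # chosen distance (illustrative)
nsim <- 99     # number of CSR simulations

# Stand-in for the observed data: a tightly clustered pattern.
parents <- cbind(runif(15), runif(15))
idx <- sample(15, n, replace = TRUE)
obs <- parents[idx, ] + matrix(rnorm(2 * n, sd = 0.02), ncol = 2)
obs <- pmin(pmax(obs, 0), 1)   # keep points inside the unit square

# K(r) for the data and for nsim CSR patterns with the same number of points.
K_obs <- naive_K(obs, r)
K_sim <- replicate(nsim, naive_K(cbind(runif(n), runif(n)), r))

# Rank of the observed value among all nsim + 1 values:
# 1 = smallest (most regular), nsim + 1 = largest (most clustered).
rank_obs <- 1 + sum(K_sim < K_obs)
cat("rank of observed K(r):", rank_obs, "out of", nsim + 1, "\n")
```

Note that the rank can take only nsim + 1 distinct values, so it saturates at the extreme for any strongly clustered pattern; it distinguishes "consistent with CSR" from "not", but not "clustered" from "very clustered".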

However, the most precise way to get an estimate of the degree of clustering is
to fit a model to the data, and use the value of an appropriate parameter in
the model. For example, computing K(r) is equivalent to fitting a Strauss point 
process model with interaction range r. The interaction parameter 'gamma' of 
the Strauss process is a measure of the degree of regularity. There are many 
other models you could use. The Geyer saturation model allows both clustering 
and regularity. The interaction parameter 'gamma' of the Geyer model ranges 
from 0 to infinity with gamma < 1 indicating regularity and gamma > 1 
indicating clustering.

In the package 'spatstat' you can fit the Strauss process model with r=0.2 to a 
point pattern dataset X by typing
      fit <- ppm(X, ~1, Strauss(0.2))
Then printing 'fit' gives the interaction parameter gamma.
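
For readers who want the number itself rather than the printed summary, the following is a sketch of one way to extract gamma, and of the corresponding fit for the Geyer saturation model. It assumes the spatstat package is installed and uses its example datasets cells and redwood as stand-ins for X. The interaction ranges and the saturation value sat = 2 are illustrative choices, not recommendations. The code also assumes that, with a constant trend (~1), the fitted coefficients are the intercept followed by log(gamma); check this against the printed fit.

```r
library(spatstat)

# Strauss model on 'cells', a regular pattern shipped with spatstat.
# r = 0.1 is an illustrative interaction range.
fit_strauss <- ppm(cells ~ 1, Strauss(r = 0.1))

# Geyer saturation model on 'redwood', a clustered pattern shipped
# with spatstat. r = 0.05 and sat = 2 are illustrative values.
fit_geyer <- ppm(redwood ~ 1, Geyer(r = 0.05, sat = 2))

# Assumption: with trend ~1 the coefficient vector is
# (intercept, log gamma), so gamma is exp() of the second entry.
gamma_strauss <- unname(exp(coef(fit_strauss)[2]))
gamma_geyer   <- unname(exp(coef(fit_geyer)[2]))

cat("Strauss gamma (cells)  :", gamma_strauss, "\n")
cat("Geyer gamma   (redwood):", gamma_geyer, "\n")
```

The call ppm(cells ~ 1, ...) is the formula form of the same fit as ppm(X, ~1, ...) above. The fitted gamma depends on the chosen r (and on sat for the Geyer model), so values are comparable between patterns only when these are held fixed.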

For more information please read the spatstat workshop notes 
http://www.csiro.au/resources/pf16h.html

Adrian Baddeley

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
