Dave,
Transforming data that follow a discrete distribution to a continuous
one is always messy, and the back-transform may be even worse.
While you're at the library, try to pick up Diggle & Ribeiro's
Model-based Geostatistics; they describe a model-based approach that
extends GLMs. It seems the most appropriate way for your kind of
data. I'm not sure whether the accompanying software (packages
geoR/geoRglm) supports zero-inflated Poisson models. Even if it does, it
remains to be seen whether prediction will actually improve substantially.
--
Edzer
Dave Depew wrote:
Thanks Edzer,
I've requested Cressie's book from our library (just waiting on it).
My main concern was the many 0 counts. I was also not enthusiastic
about odd transformations which then require appropriate
back-transforms (I imagine the back-transform of the kriging variance
gets messy).
I've tried several linear and non-linear combinations... none of them
improves on the predictions generated by using OK with the
untransformed data. I am confident that the resultant grid outputs do
capture the spatial structure quite well. I've also tried a 10-fold
cross validation of the kriging model - this seems to give reasonable
estimates for mean error, mean squared prediction error and mean
squared normalized error. I had interpreted this to mean that the
chosen variogram model was doing a reasonable job.
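For readers following the thread, the three cross-validation summaries mentioned above can be sketched as below. This is only a minimal illustration in plain Python; the function name and the argument names (`observed`, `predicted`, `kriging_variance`) are hypothetical and not taken from any package. The usual reading is that the mean error should be near 0 and the mean squared normalized error near 1 if the variogram model is reasonable.

```python
def cv_diagnostics(observed, predicted, kriging_variance):
    """Summarise cross-validation output (hypothetical helper, not a package API).

    observed, predicted: values at the held-out locations
    kriging_variance: kriging variance reported for each prediction
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    # Mean error: should be close to 0 (no systematic bias).
    mean_error = sum(residuals) / n
    # Mean squared prediction error: smaller is better.
    mspe = sum(r * r for r in residuals) / n
    # Mean squared normalized error: squared residuals scaled by the
    # kriging variance; close to 1 if the variances are realistic.
    msne = sum(r * r / v for r, v in zip(residuals, kriging_variance)) / n
    return mean_error, mspe, msne


# Made-up numbers, purely for illustration.
me, mspe, msne = cv_diagnostics([1.0, 2.0, 3.0], [1.5, 2.0, 2.5], [0.25, 1.0, 0.25])
print(me, mspe, msne)
```

A mean squared normalized error well below 1 would suggest the kriging variances are too large, well above 1 that they are too small.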
Edzer Pebesma wrote:
Hi Dave,
Dave Depew wrote:
Hi all,
A question for the more experienced geostats users....
I have a data set containing 2-3 variables relating to submerged
plant characteristics inferred from acoustic survey.
The distribution of the % cover variable is bounded (0-100) and
highly right skewed (many 0's). The transect spacing is quite even,
and I can't seem to notice much difference between a run of ordinary
kriging and a variant of RK using a zero-inflated glm for % cover
followed by kriging of the residuals.
None of the other co-variates show much correlation with the data
(i.e. bottom depth, x and y). Is this a possible reason why OK and
RK seem to give more or less the same predictions?
Well, yes, if there's not much of a trend, then RK will essentially
simplify to OK.
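A quick way to see this, sketched in plain Python under the assumption of a single covariate and an ordinary least-squares trend (the helper name and the numbers are made up): if the covariate explains almost nothing, the fitted trend is just a constant close to the data mean, so the "residuals" are the data minus a constant and kriging them amounts to kriging around a constant mean.

```python
def linear_trend_fit(covariate, target):
    """Least-squares fit of target = intercept + slope * covariate.

    Returns (slope, intercept, r_squared). Hypothetical helper for
    illustration only.
    """
    n = len(target)
    mean_x = sum(covariate) / n
    mean_z = sum(target) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    szz = sum((z - mean_z) ** 2 for z in target)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(covariate, target))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    r_squared = sxz ** 2 / (sxx * szz)
    return slope, intercept, r_squared


# Made-up depth and % cover values with no linear relation.
depth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cover = [10.0, 0.0, 20.0, 20.0, 0.0, 10.0]
slope, intercept, r2 = linear_trend_fit(depth, cover)

# With slope ~ 0 the trend is the constant `intercept` (the data mean),
# so the RK residuals are simply the data shifted by a constant.
residuals = [z - (intercept + slope * x) for x, z in zip(depth, cover)]
print(slope, intercept, r2)
```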
My second question relates to transformation of the target
variable... in this case, zero-inflated distributions are difficult to
transform. Is it really a requirement of kriging that the data be
transformed? Or is it just that kriging will generally perform better
with a target variable whose distribution is close to normal?
I believe the argument runs along the following lines: kriging is the
BLUP in any case, but if the data are normally distributed
(around the trend), the BLUP (or more exactly the BLP, simple
kriging) coincides with the conditional expectation, making it the
best possible predictor in the mean squared error sense. In other
cases, i.e. when the data are not normally distributed, it is still
the best linear predictor, but there may well be other, non-linear
predictors that come much closer to the best predictor (the
conditional expectation) under those circumstances.
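To make "linear predictor" concrete, here is a minimal sketch of simple kriging from two observations on a line, in plain Python. The exponential covariance model and its `sill` and `range_par` values are assumptions chosen for illustration, as are the function names. The prediction has the form mean + w1*(z1 - mean) + w2*(z2 - mean); for a jointly Gaussian field the conditional expectation has exactly this form, which is why the two coincide in that case.

```python
import math


def exp_cov(h, sill=1.0, range_par=10.0):
    """Exponential covariance at separation h (assumed model)."""
    return sill * math.exp(-abs(h) / range_par)


def simple_kriging_2pt(x, z, x0, mean, sill=1.0, range_par=10.0):
    """Simple kriging at x0 from two observations z at distinct locations x.

    Solves the 2x2 system C w = c0 by Cramer's rule and returns
    (prediction, kriging variance, weights). Illustration only.
    """
    c11 = exp_cov(0.0, sill, range_par)          # variance of each datum
    c12 = exp_cov(x[0] - x[1], sill, range_par)  # covariance between the data
    c10 = exp_cov(x[0] - x0, sill, range_par)    # covariance datum 1 / target
    c20 = exp_cov(x[1] - x0, sill, range_par)    # covariance datum 2 / target
    det = c11 * c11 - c12 * c12
    w1 = (c10 * c11 - c12 * c20) / det
    w2 = (c11 * c20 - c12 * c10) / det
    prediction = mean + w1 * (z[0] - mean) + w2 * (z[1] - mean)
    variance = sill - (w1 * c10 + w2 * c20)
    return prediction, variance, (w1, w2)


# Made-up example: two observations, prediction halfway between them.
pred, var, weights = simple_kriging_2pt([0.0, 10.0], [30.0, 5.0], 5.0, mean=15.0)
print(pred, var, weights)
```

The weights depend only on the covariances, not on the data values, so the predictor is linear in the data whatever their distribution.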
If there is a transformation for that data that makes them
multivariate Gaussian, then transforming and kriging on that scale is
the way to go. A catch that has gotten very little attention is that
transformation typically looks at marginal distributions, and not at
multivariate distributions, the latter being pretty hard to check
with only one realisation of the random field.
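As an illustration of what a purely marginal transformation does, here is a sketch of a rank-based normal-score transform in plain Python (Python 3.8+ for `statistics.NormalDist`). The function name, the tie handling by average rank and the (rank - 0.5)/n plotting position are my assumptions, not a prescribed method. It only reshapes the histogram and says nothing about the joint distribution; with many zeros it cannot even make the marginal Gaussian, because all the tied zeros end up at the same score.

```python
from statistics import NormalDist


def normal_scores(values):
    """Rank-based normal scores; tied values share their average rank.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    # Map each rank to a standard normal quantile.
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# Made-up % cover values with many zeros.
cover = [0, 0, 0, 0, 0, 0, 5, 20, 60, 90]
scores = normal_scores(cover)
print(scores)
```

In this made-up example the six zeros all receive one identical score, so the transformed data still carry a spike rather than a bell shape.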
Cressie's book is a good source for this material; I lost my copy
when I moved jobs in the spring.
--
Edzer
--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster,
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/