Dear Tom,

Hengl, T. wrote:
Dear Edzer, I have a similar problem (extensive computation time, memory limits). I would like to run an OK interpolation of 60,000 points at 500,000 new locations, with an nmax of the 30 closest points (I am using a Windows XP machine with 2 GB RAM). But I doubt it can ever run in less than a few hours in gstat. I mean, at each new location the algorithm has to derive the distances to all 60,000 observation locations and then find the 30 closest ones. There is no more intelligent way to pick the closest points - or is there?
There is, and it is being used: the keywords are spatial index and quadtree; you'll find links and an explanation at
http://www.gstat.org/manual/node8.html#SECTION00351000000000000000

Both Computers & Geosciences papers on gstat mention this.

To see the effect of this, try enlarging split to a number larger than the number of observations, as in

krige(log(zinc)~1, meuse, meuse[1,], vgm(1, "Exp", 300), set = list(split=1000))

Here the bucket size is 1000, and all observations are ranked by distance to the prediction location (which does not require an n^2 matrix), using quicksort.

Other software, such as GSLIB (IIRC), puts a coarse regular grid over the domain to speed up neighbourhood selection.
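
For the original 60,000-point question, a minimal sketch of local OK
with a search neighbourhood, using the meuse data as a stand-in; the
variogram model is made up for illustration:

library(sp)
library(gstat)
data(meuse); coordinates(meuse) <- ~x+y
data(meuse.grid); coordinates(meuse.grid) <- ~x+y
v <- vgm(0.6, "Sph", 900, 0.05)  # illustrative psill, model, range, nugget
# nmax = 30 limits each prediction to the 30 nearest observations,
# found through the quadtree-based spatial index:
ok <- krige(log(zinc) ~ 1, meuse, meuse.grid, model = v, nmax = 30)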
--
Edzer

Thanx, Tom
________________________________

From: [EMAIL PROTECTED] on behalf of Dave Depew
Sent: Fri 9/19/2008 5:00 PM
To: Edzer Pebesma
Cc: Chris Taylor; r-sig-geo@stat.math.ethz.ch
Subject: Re: [R-sig-Geo] cokriging question



Thanks for the quick responses.
I've often done global OK or UK runs that take ~2-3 days to complete. I
always assumed that was because the matrices were so large; Task
Manager indicated that Rgui.exe only consumed ~800 MB of RAM during
those runs.
I'll try passing maxdist to gstat() rather than to predict.gstat().
The only other time I encounter the memory.c line is when I neglect to
remove duplicate observations.
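
A minimal sketch of that duplicate-removal step with sp::zerodist(),
shown on the meuse data (which happens to have no duplicates, but the
pattern is the same):

library(sp)
data(meuse); coordinates(meuse) <- ~x+y
dup <- zerodist(meuse)            # row pairs at identical locations
if (nrow(dup) > 0)
    meuse <- meuse[-dup[, 2], ]   # drop the second point of each pair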

I think I should have at most 500 or so points within a 100-200 m
maxdist...




Edzer Pebesma wrote:
Good morning Dave (late afternoon here),

Chris Taylor wrote:
Good morning Edzer and Dave,
Thanks for bringing up this point.  I had a similar issue recently
using krige(): observations at 5,800 locations, and attempting to krige()
predictions at 112,000 locations resulted in the same "memory.c"
error message.  Reducing the prediction locations to <<50,000 and reducing
maxdist seemed to help, but the predictions still took a very long
time (>2 hours).  (Running Windows XP with 4 GB of memory.)
Can you clarify your suspicion regarding the "lack of standardization
of coordinates"?
In this message, a trend was modelled based on x and y coordinates, as
follows:
      x                y                DN4          indicator4
Min.   :670462   Min.   :4215236   Min.   :18.00   Min.   :0.0000
1st Qu.:670683   1st Qu.:4215456   1st Qu.:24.00   1st Qu.:0.0000
Median :670904   Median :4215677   Median :32.50   Median :0.0000
Mean   :670904   Mean   :4215677   Mean   :43.26   Mean   :0.4795
3rd Qu.:671125   3rd Qu.:4215898   3rd Qu.:64.00   3rd Qu.:1.0000
Max.   :671346   Max.   :4216119   Max.   :87.00   Max.   :1.0000
g<-gstat(id="indicator6",formula=indicator6~x+y+x*y+sqrt(x)+sqrt(y),location=~x+y,data=band6.data,...


Computing x*y gives numbers many orders of magnitude larger than
sqrt(x) or the intercept. The usual advice is to (somewhat)
standardize the coordinates before using them in a trend, as sketched
below; but I doubt this will help you very much.
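
A minimal sketch of that standardization, assuming the band6.data
data.frame from the quoted call (xs and ys are names introduced here);
shifting to a local origin and rescaling to km keeps the values
positive, so the sqrt() terms in the trend still work:

band6.data$xs <- (band6.data$x - min(band6.data$x)) / 1000  # km east of local origin
band6.data$ys <- (band6.data$y - min(band6.data$y)) / 1000  # km north of local origin
g <- gstat(id = "indicator6",
    formula = indicator6 ~ xs + ys + I(xs * ys) + sqrt(xs) + sqrt(ys),
    locations = ~x+y, data = band6.data)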

I find it hard to consider >2 hours a very long time before I know
all the details (e.g., how many points fall within your maxdist?), the
reason why you need an instant answer, and preferably a comparison
with other software. If you then tell me what your budget is, I might
come up with possible solutions (starting very cheap, e.g. look at
demo(snow) in package gstat, or use an OS that can assign the full
4 GB to a single process).
--
Edzer
Chris

Edzer Pebesma wrote:
Dave,

12,000 observations fit, in the C representation, in less than 1 MB
(64 bytes per observation).

The issue is that you assume passing maxdist to predict.gstat() has
an effect. It doesn't; you need to pass it to gstat(), as sketched
below.
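
A minimal sketch of the difference, using the meuse data and an
illustrative variogram model: maxdist given to gstat() defines the
search radius, whereas handing it to predict.gstat() has no effect on
neighbourhood selection.

library(sp); library(gstat)
data(meuse); coordinates(meuse) <- ~x+y
data(meuse.grid); coordinates(meuse.grid) <- ~x+y
g <- gstat(id = "lzinc", formula = log(zinc) ~ 1, data = meuse,
    model = vgm(0.6, "Sph", 900, 0.05), maxdist = 500)
pr <- predict(g, newdata = meuse.grid)  # 500 m search radius applied here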

The same thing happened in
https://stat.ethz.ch/pipermail/r-sig-geo/2008-September/004182.html
where nmax was passed to predict.gstat() and simulation took
forever. The other issue in that question was, I suspect, a lack of
standardization of the coordinates used in a trend surface.
--
Edzer

Dave Depew wrote:
Is there a limit to the number of observations, or to the size of a
data set, that can be co-kriged in gstat?
I have a ~12,000-observation data set (2 variables); the variograms,
cross-variogram and LMC fit well, and co-kriging starts OK:

Linear Model of Coregionalization found. Good.
[using ordinary cokriging]

then immediately outputs

"memory.c", line 57: can't allocate memory in function m_get()
Error in predict.gstat(fit.ck, newdata = EcoSAV.grid, maxdist = 100) :
 m_get

I've tried different maxdist values from 10 to 1000, with exactly the
same result.
I recently upgraded my RAM to 4 GB and flipped the Windows XP /3GB
switch.




--
David Depew
PhD Candidate
Department of Biology
University of Waterloo
200 University Ave W.
Waterloo, ON. Canada
N2L 3G1

(T) 1-519-888-4567 x33895
(F) 1-519-746-0614

http://www.science.uwaterloo.ca/~ddepew



--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster,
Weseler Straße 253, 48151 Münster, Germany.  Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de/
http://www.springer.com/978-0-387-78170-9
