Dear Edzer,
 
I have a similar problem (extensive computation time, memory limits). I
would like to interpolate from 60,000 observations to 500,000 new locations
using OK with an nmax of the 30 closest points (I am using a Windows XP
machine with 2 GB RAM).
 
But I doubt that it can ever run in less than a few hours in gstat. I mean,
at each new location the algorithm has to compute the distances to all
60,000 observation locations and then find the 30 closest ones. There is no
more intelligent way to pick the closest points - or is there?
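For reference, a minimal sketch of how I am setting this up (the objects
obs and newpts, the variable z, and the variogram parameters are all
placeholders):

library(sp)
library(gstat)

coordinates(obs) <- ~x+y      # 60,000 observations as SpatialPointsDataFrame
coordinates(newpts) <- ~x+y   # 500,000 prediction locations
v <- vgm(psill = 1, model = "Sph", range = 1000, nugget = 0.1)  # assumed fit
# nmax = 30 restricts each kriging system to the 30 nearest observations,
# so no 60,000 x 60,000 distance matrix should ever be formed:
ok <- krige(z ~ 1, obs, newpts, model = v, nmax = 30)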
 
Thanks,
 
Tom
 

________________________________

From: [EMAIL PROTECTED] on behalf of Dave Depew
Sent: Fri 9/19/2008 5:00 PM
To: Edzer Pebesma
Cc: Chris Taylor; r-sig-geo@stat.math.ethz.ch
Subject: Re: [R-sig-Geo] cokriging question



Thanks for the quick responses.
I've often run global OK or UK jobs that take ~2-3 days to complete. I
always assumed that was because the matrices were so large; Task Manager
showed Rgui.exe consuming only ~800 MB of RAM during those runs.
I'll try passing maxdist to gstat() rather than to predict.gstat().
The only other time I see the memory.c error is when I neglect to
remove duplicate observations.

I think I should have at most 500 or so points within a 100-200 m
maxdist...
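
In other words, something like this (zinc, obs, v.fit, grid and the 150 m
cutoff are placeholder names and values):

library(gstat)
# the search neighbourhood is set when the gstat object is built...
g <- gstat(id = "zn", formula = zinc ~ 1, data = obs,
           model = v.fit, maxdist = 150)
# ...not at prediction time:
p <- predict(g, newdata = grid)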




Edzer Pebesma wrote:
> Good morning Dave (late afternoon here),
>
> Chris Taylor wrote:
>> Good morning Edzer and Dave,
>> Thanks for bringing up this point.  I had a similar issue recently
>> using krige(): observations at 5,800 locations, attempting to krige()
>> predictions at 112,000 locations, resulted in the same "memory.c"
>> error message. Reducing the prediction locations to well under 50,000
>> and reducing maxdist seemed to help, but the predictions still took a
>> very long time (>2 hours). (Running Windows XP with 4 GB of memory.)
>> Can you clarify your suspicion regarding the "lack of standardization
>> of coordinates"?
> In this message, a trend was modelled based on x and y coordinates, as
> follows:
>        x                y                DN4          indicator4
> Min.   :670462   Min.   :4215236   Min.   :18.00   Min.   :0.0000
> 1st Qu.:670683   1st Qu.:4215456   1st Qu.:24.00   1st Qu.:0.0000
> Median :670904   Median :4215677   Median :32.50   Median :0.0000
> Mean   :670904   Mean   :4215677   Mean   :43.26   Mean   :0.4795
> 3rd Qu.:671125   3rd Qu.:4215898   3rd Qu.:64.00   3rd Qu.:1.0000
> Max.   :671346   Max.   :4216119   Max.   :87.00   Max.   :1.0000
>
> g <- gstat(id="indicator6",
>            formula=indicator6~x+y+x*y+sqrt(x)+sqrt(y),
>            location=~x+y, data=band6.data, ...
>
> Computing x*y gives numbers many orders of magnitude larger than
> sqrt(x) or the intercept. The usual advice is to (somewhat)
> standardize the coordinates before using them in a trend. But I doubt
> this helps you very much.
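> For example (a sketch only, using the columns from the summary above),
> one could rescale x and y to [0, 1] before building the trend, so that
> x*y, sqrt(x) and the intercept are on comparable scales:
>
> band6.data$xs <- (band6.data$x - min(band6.data$x)) /
>                  diff(range(band6.data$x))
> band6.data$ys <- (band6.data$y - min(band6.data$y)) /
>                  diff(range(band6.data$y))
> g <- gstat(id = "indicator6",
>            formula = indicator6 ~ xs + ys + xs*ys + sqrt(xs) + sqrt(ys),
>            locations = ~x+y, data = band6.data)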
>
> I find it hard to consider >2 hours a very long time before I know
> all the details (e.g. how many points fell within your maxdist?),
> the reason why you want an instant answer, and, preferably, a
> comparison with other software. If you then tell me what your budget
> is, I might come up with possible solutions (starting very cheap,
> e.g. look at demo(snow) in package gstat, or use an OS that can
> assign the full 4 GB to a single process).
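> For the cheap route, a rough sketch (not the gstat demo itself; a
> 4-worker SOCK cluster and the objects g and newpts are assumed) of
> splitting the prediction locations over processes with snow:
>
> library(snow)
> cl <- makeCluster(4, type = "SOCK")
> clusterEvalQ(cl, library(gstat))           # load gstat on each worker
> clusterExport(cl, c("g", "newpts"))        # ship the model and the grid
> parts <- clusterSplit(cl, 1:nrow(newpts))  # split row indices evenly
> res <- clusterApply(cl, parts,
>                     function(i) predict(g, newdata = newpts[i, ]))
> stopCluster(cl)
> pred <- do.call(rbind, res)                # reassemble in original order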
> --
> Edzer
>>
>> Chris
>>
>> Edzer Pebesma wrote:
>>> Dave,
>>>
>>> 12,000 observations fit, in the C representation, in less than 1 MB
>>> (64 bytes per observation).
>>>
>>> The issue is that you think that passing maxdist to predict.gstat
>>> has an effect. It doesn't; you need to pass it to function gstat().
>>>
>>> The same thing happened in this
>>> https://stat.ethz.ch/pipermail/r-sig-geo/2008-September/004182.html
>>> message, where nmax was passed to predict.gstat, and simulation took
>>> forever. The other issue in that question was, I suspect, a lack of
>>> standardization of the coordinates used in the trend surface.
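>>> The same fix in sketch form (object names assumed): for simulation,
>>> nmax goes to gstat() or krige(), not to predict.gstat():
>>>
>>> sims <- krige(z ~ 1, obs, grid, model = v, nsim = 100, nmax = 30)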
>>> --
>>> Edzer
>>>
>>> Dave Depew wrote:
>>>> Is there a limit to the number of observations, or to the size of
>>>> file, that can be cokriged in gstat?
>>>> I have a ~12,000-observation data set (2 variables); the variograms,
>>>> cross-variogram and LMC fit well, and cokriging starts fine:
>>>>
>>>> Linear Model of Coregionalization found. Good.
>>>> [using ordinary cokriging]
>>>>
>>>> then immediately outputs
>>>>
>>>> "memory.c", line 57: can't allocate memory in function m_get()
>>>> Error in predict.gstat(fit.ck, newdata = EcoSAV.grid, maxdist = 100) :
>>>>  m_get
>>>>
>>>> I've tried different maxdist values, from 10 to 1000, with exactly
>>>> the same result.
>>>> I recently upgraded my RAM to 4 GB and flipped the Windows XP /3GB
>>>> switch.
>>>>
>>>>
>>>
>>
>


--
David Depew
PhD Candidate
Department of Biology
University of Waterloo
200 University Ave W.
Waterloo, ON. Canada
N2L 3G1

(T) 1-519-888-4567 x33895
(F) 1-519-746-0614

http://www.science.uwaterloo.ca/~ddepew
