Dear Edzer and Robert, Thank you for your replies.
> Suggestions to improve this would be to use an index structure (for the > generic case of possibly irregularly spread data) such as the PR-bucket > quadtree used in package gstat (nsearch.c). Yes, I thought about that, as I want to process both regularly and irregularly gridded data. Unfortunately, my C experience is a bit short for that. I'll keep looking for something - if I don't find anything I'd try to code that myself. > You can try one of the 'focal' functions of the raster package on > R-Forge. Not particularly fast either, but probably much faster that > what you have now. Thanks Robert (and congrats for such a great package BTW), I installed the last version and will certainly test it, but as I said I would like to process also ungridded data, so I'm still thinking about a more general solution. I was wondering if it could be a solution to use RSAGA or spgrass6 to do that. Feedback, anyone ? Cheers, Pierre 2010/1/29 Edzer Pebesma <edzer.pebe...@uni-muenster.de>: > Pierre Roudier wrote: >> >> Dear list, >> >> I am processing gridded data like DEMs, imported in R from geotiff >> files. Of course, such data is regularly gridded, but very often with >> NA values. In the framwork of some tests, I have to generate a >> neighbourhood of the DEM, so that to extract local extrema of the >> layer. For this task, I wrote a function based on the knearneig() >> function from the spdep package, with k=4 or k=8, to generate and >> analyse the neighbourhood of each point. >> >> Unfortunately, I often have to process big datasets (let's say grids >> with between 50,000 and 350,000 points), and that's working but >> knearneigh() takes *hours* to process. >> >> Does anybody would have any suggestion to improve the efficiency of this >> step ? >> > > the algorithm (knn.c file in the spdep sources) seems to compute all > distances between all grid cells, and for each grid cell find the nearest k > by some kind of insertion sort. That makes it general for all kinds of > spatial data points, including irregularly spread ones, but makes it also > inefficient (because O(n^2) with n the number of grid cells) for large data > sets. > > Suggestions to improve this would be to use an index structure (for the > generic case of possibly irregularly spread data) such as the PR-bucket > quadtree used in package gstat (nsearch.c), or specifically for gridded data > to write a special knearneigh() that exploits the grid indexing already > present: you can explicitly address the nearest 4/8/12/... neighbours by > using row/col indexes. In this latter case you'd probably want to rewrite > the test function for the full grid, instead of looping over the cells. >> >> Thanks, >> >> Pierre >> >> _______________________________________________ >> R-sig-Geo mailing list >> R-sig-Geo@stat.math.ethz.ch >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo >> > > -- > Edzer Pebesma > Institute for Geoinformatics (ifgi), University of Münster Weseler Straße > 253, 48151 Münster, Germany. Phone: +49 251 8333081, Fax: +49 251 8339763 > http://ifgi.uni-muenster.de http://www.52north.org/geostatistics > e.pebe...@wwu.de > > _______________________________________________ R-sig-Geo mailing list R-sig-Geo@stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo