Hi,

A few days ago I asked about a spatial join on the minimum distance between two 
sets of points with coordinates and attributes in two different data frames.

Simon Knapp sent code to do it when calculating distance on a sphere using lat, 
long coordinates, and I've changed his code to use Euclidean distances since my 
data had UTM coordinates. 

Typically one data frame has around 30 000 points and the classification data 
frame has around 4000 points, and the aim is to add to each point from the 
first data frame all the attributes from the second data frame of the point 
that is closest to it. 
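
To make the goal concrete, here is a minimal sketch on made-up toy data. The 
data frames `pts` and `cls`, their column names and their values are 
illustrative assumptions, not my real data. For each row of `pts` it finds the 
closest row of `cls` and attaches that row's attribute.

```r
# Toy data; all names and values here are hypothetical.
pts <- data.frame(xeast = c(0, 10, 4), xnorth = c(0, 10, 5))
cls <- data.frame(yeast = c(1, 9), ynorth = c(1, 9), class = c("A", "B"),
                  stringsAsFactors = FALSE)

# For each point in pts, the index of the nearest point in cls.
# Squared Euclidean distance is enough to locate the minimum.
nearest <- sapply(seq_len(nrow(pts)), function(i) {
  which.min((pts$xeast[i] - cls$yeast)^2 + (pts$xnorth[i] - cls$ynorth)^2)
})

# Attach the attribute of the nearest classification point.
joined <- cbind(pts, class = cls$class[nearest])
```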

On my PC (Dell OptiPlex GX620, x86-based, 4 GB RAM, 3192 MHz processor) 
it took quite a long time to do the join:

   user  system  elapsed 
8166.07    2.98  8194.43

Sys.info()
                     sysname                      release 
                   "Windows"                         "XP" 
                     version                     nodename 
"build 2600, Service Pack 2"              
                     machine                        
                       "x86"                       
I am running R 2.7.1 patched.
I wonder if any of you could suggest how to optimize this code (or have time 
to help) so that it runs faster. My programming skills are not good enough to 
do it myself.

Thanks,

Monica

#### code follows:
#### x: a data frame with over 30000 points with UTM coordinates (xeast, xnorth)
#### y: a data frame with over 4000 points with UTM coordinates (yeast, ynorth)
#### and classification attributes
### calculating Euclidean distance

# note: this definition masks stats::dist() for the rest of the session
dist <- function(xeast, xnorth, yeast, ynorth) {
  sqrt((xeast - yeast)^2 + (xnorth - ynorth)^2)
}

### doing the merge by location with minimum distance

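dist.merge <- function(x, y, xeast, xnorth, yeast, ynorth) {
  # for each row of x: index of the nearest row of y and the distance to it
  tmp <- t(apply(x[, c(xeast, xnorth)], 1, function(x, y) {
    # distances from this x point to every y point
    dists <- apply(y, 1, function(x, y) dist(x[2], x[1], y[2], y[1]), x)
    # keep the first y point at the minimum distance (ties broken by row order)
    cbind(1:nrow(y), dists)[dists == min(dists), , drop = FALSE][1, ]
  }, y[, c(yeast, ynorth)]))
  # append the distance and the non-coordinate attributes of the matched y rows
  tmp <- cbind(x, min.dist = tmp[, 2],
               y[tmp[, 1], -match(c(yeast, ynorth), names(y)), drop = FALSE])
  row.names(tmp) <- NULL
  tmp
}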

#### code end
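
One direction that might help, given only as a sketch: the nested apply() calls 
evaluate the distance one pair of points at a time, while R can compute the 
distances from one x point to all y points in a single vectorized expression. 
The function name `dist.merge2` and the toy data below are hypothetical, and 
this has not been timed on the full data.

```r
# Hypothetical variant of dist.merge(); takes the same arguments.
dist.merge2 <- function(x, y, xeast, xnorth, yeast, ynorth) {
  xe <- x[[xeast]]; xn <- x[[xnorth]]
  ye <- y[[yeast]]; yn <- y[[ynorth]]
  idx <- integer(nrow(x))
  min.dist <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    d2 <- (xe[i] - ye)^2 + (xn[i] - yn)^2  # squared distances to all y points
    j <- which.min(d2)                     # first minimum, ties by row order
    idx[i] <- j
    min.dist[i] <- sqrt(d2[j])
  }
  # attribute columns of y, i.e. everything except its coordinates
  keep <- setdiff(names(y), c(yeast, ynorth))
  out <- cbind(x, min.dist = min.dist, y[idx, keep, drop = FALSE])
  row.names(out) <- NULL
  out
}

# Toy data (made up) to show the call.
x <- data.frame(id = 1:3, xeast = c(0, 10, 4), xnorth = c(0, 10, 5))
y <- data.frame(yeast = c(1, 9), ynorth = c(1, 9), class = c("A", "B"),
                stringsAsFactors = FALSE)
res <- dist.merge2(x, y, "xeast", "xnorth", "yeast", "ynorth")
```

The loop over the rows of x is kept on purpose: a full distance matrix between 
30 000 and 4000 points would hold 120 million numbers, which may not fit 
comfortably in memory on a machine like the one described above.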


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
