Julian,

Thanks for the reply. That seems like an interesting approach. I suppose another 'GIS' way would be to buffer all the points 100m and then find those where the buffers overlap.

My dataset isn't too big (1000's records), and what I came up with seems to do the trick, so I'll stick with it for now.

For the archives, I was a bit hasty in the solution I pasted below - it caught the second of the pairs, where I wanted the first of the pairs (after sorting descending by date). These are the lines that do it correctly.

    b <- a[0,]
    for(i in 1:nrow(a)){
      if(is.na(a[match(a$neigh[i],rownames(a)[1:i]),]$ID)){
        b <- rbind(b,a[i,])
      }
    }

Best to all,
Tim
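A note for readers of the archive: the corrected loop above grows b one row at a time with rbind(). The same rule can probably be written without a loop. The sketch below assumes, as in the dummy data, that neigh holds row names of a and that a is already sorted so the row to keep comes first; the names pos and keep are illustrative and do not appear in the original post.

```r
# Dummy data from the original post: dist and neigh mimic nndist() and nnwhich()
a <- data.frame(ID = c("one", "two", "three", "four"),
                dist = c(2.2, 5.1, 5.1, 2.2),
                neigh = c(4, 3, 2, 1))

# Position of each row's nearest neighbour within a (NA if the neighbour is not in a)
pos <- match(a$neigh, rownames(a))

# Keep a row when its neighbour is absent from a or appears later in the table
keep <- is.na(pos) | pos > seq_len(nrow(a))
b <- a[keep, ]
```

On the dummy data this keeps rows 1 and 2, the first member of each pair, which matches what the loop returns.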
>>> Julian Burgos <jul...@hafro.is> 2/21/2013 3:24 AM >>>

Hi Tim,

Perhaps you should use clustering to identify groups of points that are separated 100m or more from other points. You could:

a) Calculate distances among points
b) Do some type of hierarchical clustering (e.g. the function agnes in the cluster package).
c) Identify as clusters everything with a dissimilarity less than 100m.
d) Randomly select a single point from each cluster.

Julian

--
Julian Mariano Burgos, PhD
Hafrannsóknastofnunin/Marine Research Institute
Skúlagata 4, 121 Reykjavík, Iceland
Sími/Telephone : +354-5752037
Bréfsími/Telefax: +354-5752001
Netfang/Email: jul...@hafro.is

On 02/20/2013 06:39 PM, Tim Howard wrote:

I've found a very inelegant solution that continues on the path I was going. Using the dummy dataset below, this code will strip neighbors as I desire.

    b <- a[0,]
    for(i in 2:nrow(a)){
      if(!is.na(a[match(a$neigh[i],rownames(a)[1:i]),]$ID)){
        b <- rbind(b,a[i,])
      }
    }
    > b
         ID dist neigh
    3 three  5.1     2
    4  four  2.2     1

I'd still be curious if anyone knows a cleaner solution.

Best,
Tim

>>> Tim Howard 2/20/2013 10:19 AM >>>

I am trying to remove spatial 'duplicates' from a point dataset. The coordinates won't be exactly the same and so I can't use the normal methods for removing the second instance of the points. This generalizes to a question about removing points nearby others, either randomly or based on other criteria (in my case, I want to keep the one with a more recent date attribute).

Although my research and fiddling has got me close, I wonder if there already is a solution I'm missing within the various spatial packages so I'm starting with sig-geo, even though I'm stuck at a spot that would use regular R syntax.

My approach (code at bottom of email):

1. Move the full point data set over to SpatialPoints as decimal degrees longlat [package=sp]
2. Reproject to utm, using spTransform
3. convert to ppp
4. find the distance from each point to its nearest using nndist() [package = spatstat]
5. identify that nearest using nnwhich() [package = spatstat]
6. extract those with neighbors closer than 100m

** this is where I'm stuck **

I now have a list of neighbors, for which I'd like to keep the first case of each neighbor but remove the second (and sometimes third). Similar to unique().

Here's a dummy example

    #set up dummy data frame, the dist and neigh columns are from nndist() and nnwhich(), respectively
    a <- data.frame(ID=c("one","two","three","four"),dist=c(2.2,5.1,5.1,2.2),neigh=c(4,3,2,1))

    #here's as far as I've got, I can remove the neighbor to row one with the following line.
    #a looping solution seems problematic as the size of the dataframe changes with each loop
    b <- a[-match(a$neigh[1], rownames(a)),]

Questions:

- Is there already a function in a spatial package that offers a way to remove points within a certain distance of others?
- if not, does anyone have any hints for taking the next step from what I've done?

    ### code for what I've got so far.
    ### dat.wind.tall is the input DF of lat long decimal degree coordinates
    library(sp)
    library(spatstat)
    library(maptools)
    library(rgdal)

    llCRS <- CRS("+proj=longlat +datum=NAD83")
    wind.sp <- SpatialPoints(dat.wind.tall[,c(66,65)], proj4string=llCRS)
    prjNew <- CRS("+proj=utm +zone=18 +datum=NAD83")
    wind.utm <- spTransform(wind.sp, prjNew)
    wind.ppp <- as(as(wind.utm, "SpatialPoints"), "ppp")
    turb.dist <- nndist(wind.ppp)
    turb.nearest <- nnwhich(wind.ppp)
    dat.wind.tall.nbr <- cbind(dat.wind.tall, nneigh=turb.nearest, dist=turb.dist)
    closeNeighbors <- dat.wind.tall.nbr[turb.dist<100,]

    #code for removing neighbor to first row.
    y <- closeNeighbors[-match(closeNeighbors$nneigh[1], rownames(closeNeighbors)),]

Thanks in advance for any help.
Tim

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
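Julian's steps a) to d) earlier in the thread could be sketched in base R roughly as follows. This is only a sketch under stated assumptions: it uses hclust() from the stats package in place of agnes(), it picks single linkage, and it keeps the most recent point in each cluster (Tim's stated criterion) instead of a random one. The data frame pts, with projected coordinates x and y in metres and a date column, is made up for illustration.

```r
# Hypothetical projected coordinates in metres, with a date attribute
pts <- data.frame(id = c("a", "b", "c", "d", "e"),
                  x = c(0, 30, 500, 520, 2000),
                  y = c(0, 40, 500, 500, 2000),
                  date = as.Date(c("2010-01-01", "2012-01-01", "2011-06-01",
                                   "2009-03-01", "2008-01-01")),
                  stringsAsFactors = FALSE)

# a) pairwise Euclidean distances among points
d <- dist(pts[, c("x", "y")])

# b) hierarchical clustering; single linkage merges groups at their closest-pair distance
hc <- hclust(d, method = "single")

# c) cut the tree at 100 m, so every merge at or below that distance stays together
grp <- cutree(hc, h = 100)

# d) sort by date, newest first, and keep the first point met in each cluster
ord <- order(pts$date, decreasing = TRUE)
kept <- pts[ord, ][!duplicated(grp[ord]), ]
```

With single linkage, points can chain: if A is 80m from B and B is 80m from C, all three fall in one cluster even when A and C are 160m apart. Whether that is acceptable depends on how strictly the 100m rule should apply; complete linkage would instead keep every pair within a cluster under the threshold.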