On Sep 25, 2015, at 12:54 PM, Lorenzo Isella wrote:

> Apologies for not letting this thread rest in peace.
> The small script
>
> #########################################################
> set.seed(1234)
>
> x <- rnorm(20)
> y <- rnorm(20)
>
> # assumed: this definition is missing from the quoted script; all
> # 5-point index sets out of 20 match the description below
> mtxcomb <- combn(20, 5)
>
> goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx],
> y[idx])) > 0.9))
>
> mycomb <- mtxcomb[, goodcls]
> #########################################################
>
> is perfect for detecting groups of 5 points whose distances to each other
> are always above 0.9.
> However, in my practical case I have about 500 points and I am looking
> for subsets of several tens of points whose pairwise distances are all
> above a given threshold.
> Unfortunately, the approach above does not scale, so I wonder if
> anybody is aware of an alternative approach.

Find the center of the distribution, eliminate all the points within some reasonable radius of it, perhaps sqrt(sd(x)^2 + sd(y)^2), and then work on the reduced set. If you need to reduce it even further, I could imagine sampling within angular sectors around the center, defined by atan2(y, x) on the centered coordinates.

-- 
David Winsemius
Alameda, CA, USA
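
A minimal sketch of the suggestion above, under stated assumptions: the center is taken as the coordinate means, and the sector count (n_sectors) and per-sector sample size (per_sector) are hypothetical values chosen only for illustration. The final greedy pass is not part of the suggestion. It is a swapped-in heuristic that returns one subset with all pairwise distances above the threshold, not every such subset and not necessarily the largest one.

```r
set.seed(1234)
n <- 500
x <- rnorm(n)
y <- rnorm(n)
threshold <- 0.9

# Step 1: center of the cloud (assumed: coordinate means) and cut-off radius
cx <- mean(x)
cy <- mean(y)
radius <- sqrt(sd(x)^2 + sd(y)^2)

# Step 2: drop every point within the radius of the center
r <- sqrt((x - cx)^2 + (y - cy)^2)
keep <- which(r > radius)

# Step 3 (optional thinning): sample at most per_sector points per angular sector
n_sectors  <- 8    # hypothetical value
per_sector <- 15   # hypothetical value
theta  <- atan2(y[keep] - cy, x[keep] - cx)          # polar angle in [-pi, pi]
sector <- cut(theta, breaks = seq(-pi, pi, length.out = n_sectors + 1),
              include.lowest = TRUE, labels = FALSE)
thinned <- unlist(lapply(split(keep, sector), function(idx) {
  idx[sample.int(length(idx), min(per_sector, length(idx)))]
}), use.names = FALSE)

# Greedy pass (an assumption, not from the thread): accept a point only if it
# is farther than the threshold from every point accepted so far
d <- as.matrix(dist(cbind(x[thinned], y[thinned])))
chosen <- integer(0)
for (i in seq_along(thinned)) {
  if (all(d[i, chosen] > threshold)) chosen <- c(chosen, i)
}
subset_idx <- thinned[chosen]   # indices into the original x and y
```

The distance matrix is built once on the thinned set, so the cost grows with the square of the number of retained points rather than with the number of combinations. The result depends on the order in which points are visited, so shuffling thinned and rerunning the loop gives other valid subsets.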