Re: [R] Sampling the Distance Matrix

David Winsemius Fri, 25 Sep 2015 13:59:44 -0700

On Sep 25, 2015, at 12:54 PM, Lorenzo Isella wrote:

> Apologies for not letting this thread rest in peace.
> The small script
> 
> #########################################################
> set.seed(1234)
> 
> x <- rnorm(20)
> y <- rnorm(20)
> 
> 
> goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx],
> y[idx]) ) > 0.9))
> 
> mycomb <- mtxcomb [ , goodcls]
> #########################################################
> 
> 
> is perfect to detects groups of 5 points whose distances to each other
> are always above 0.9.
> However, in my practical case I have about 500 points and I am looking
> for subset of several tens of points whose distance is above a given
> threshold.
> Unfortunately, the approach above does not scale, so I wonder if
> anybody is aware of an alternative approach.


Find the center of the distribution, eliminate all the points within some 
reasonable radius perhaps sqrt( sd(x)^2 +sd(y)^2 ) and then work on the reduced 
set. If you needed to reduce it even further I could imagine sampling in 
sectors defined by tan(x/y).

-- 

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sampling the Distance Matrix

Reply via email to