On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:

> Hi,
> And thanks for your reply.
> Essentially, your script gets the job done.
> For instance, if I run
>
> mm <- cbind(5/(1:5), -2*sqrt(1:5))
> dst <- dist(mm)
> dst2 <- as.matrix(dst)
> diag(dst2) <- NA
> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>
> then it correctly detects the first two rows, where all the values are
> larger than 0.9.
> In other words, it detects the points that are at least 0.9 units away
> from *all* the other points.
> My other question (I did not realize this until I got your answer) is
> the following: I have the distance matrix of a set of N points.
> You gave me an algorithm to find all the points that are at least 0.9
> units away from every other point.
> However, in some cases a weaker condition is enough for me: find
> a subset of k points (with k tunable) whose distance *from each other*
> is greater than 0.9 units (even if their distance from some other
> points may be smaller than 0.9).

If I understand ..... Make a matrix of unique combinations (one combination per column), then apply over the columns to flag the combinations that satisfy the distance criterion:

mtxcomb <- combn(1:20, 5)
goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx], y[idx]) ) > 0.9))
mtxcomb [ , goodcls]

In my sample it was around 9% of the total 5 item combinations.

snipped a lot of output:
.....
     [,1440] [,1441]
[1,]      12      13
[2,]      13      16
[3,]      16      17
[4,]      19      19
[5,]      20      20
> dim( mtxcomb)
[1]     5 15504

-- 
David

> Any idea about how to tackle that? Is it simply a matter of detecting
> the row and column numbers of all the entries of the distance matrix
> larger than 0.9?
> Many thanks
>
> Lorenzo
>
>
>
> On Wed, Sep 23, 2015 at 09:23:04PM +0000, David L Carlson wrote:
>> I think the OP wanted rows where all values were greater than .9.
>> If so, this works:
>>
>>> set.seed(42)
>>> dst <- dist(cbind(rnorm(20), rnorm(20)))
>>> dst2 <- as.matrix(dst)
>>> diag(dst2) <- NA
>>> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>>> idx
>> 13 18 19
>> 13 18 19
>>> dst2[idx, idx]
>>          13       18       19
>> 13       NA 2.272407 3.606054
>> 18 2.272407       NA 1.578150
>> 19 3.606054 1.578150       NA
>>
>> -------------------------------------
>> David L Carlson
>> Department of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>>
>>
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of William
>> Dunlap
>> Sent: Wednesday, September 23, 2015 3:23 PM
>> To: Lorenzo Isella
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Sampling the Distance Matrix
>>
>>> mm <- cbind(1/(1:5), sqrt(1:5))
>>> d <- dist(mm)
>>> d
>>           1         2         3         4
>> 2 0.6492864
>> 3 0.9901226 0.3588848
>> 4 1.2500000 0.6369033 0.2806086
>> 5 1.4723668 0.8748970 0.5213550 0.2413050
>>> which(as.matrix(d)>0.9, arr.ind=TRUE)
>>   row col
>> 3   3   1
>> 4   4   1
>> 5   5   1
>> 1   1   3
>> 1   1   4
>> 1   1   5
>> I.e., the distances between mm's rows 3 & 1, 4 & 1, and 5 & 1 are more than 0.9
>>
>> The as.matrix(d) is needed because dist returns the lower triangle of
>> the distance matrix as an object of class "dist", and as.matrix.dist
>> converts that into a matrix.
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>>
>> On Wed, Sep 23, 2015 at 12:15 PM, Lorenzo Isella
>> <lorenzo.ise...@gmail.com> wrote:
>>> Dear All,
>>> Suppose you have a distance matrix stored as a dist object, for
>>> instance
>>>
>>> x<-rnorm(20)
>>> y<-rnorm(20)
>>>
>>> mm<-as.matrix(cbind(x,y))
>>>
>>> dst<-(dist(mm))
>>>
>>> Now, my problem is the following: I would like to get the rows of mm
>>> corresponding to points whose distance is always larger than, let's say,
>>> 0.9.
>>> In other words, if I were to compute the distance matrix on those
>>> selected rows of mm, apart from the diagonal, I would get all entries
>>> larger than 0.9.
>>> Any idea about how I can efficiently code that?
>>> Regards
>>>
>>> Lorenzo
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA
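
For anyone rerunning the combination approach from a clean session, here is a minimal self-contained sketch. It assumes x and y are 20 random coordinates as in the original post. The seed, the helper name far_apart_subsets, and its parameters k and min_dist are illustrative choices of mine and do not come from the thread. Unlike the snippet in the reply, it computes the full distance matrix once and subsets it, instead of calling dist() for every combination.

```r
# Illustrative sketch only: the seed and the function name are assumptions.
set.seed(42)
x <- rnorm(20)
y <- rnorm(20)

# Return all k-point subsets (one per column) whose pairwise
# distances all exceed min_dist.
far_apart_subsets <- function(x, y, k, min_dist) {
  d <- as.matrix(dist(cbind(x, y)))        # full symmetric distance matrix
  combs <- combn(length(x), k)             # candidate subsets, one per column
  ok <- apply(combs, 2, function(idx) {
    sub <- d[idx, idx, drop = FALSE]       # distances within this subset
    all(sub[upper.tri(sub)] > min_dist)    # check each pair once, skip diagonal
  })
  combs[, ok, drop = FALSE]
}

good <- far_apart_subsets(x, y, k = 5, min_dist = 0.9)
ncol(good) / choose(20, 5)   # fraction of 5-point subsets that qualify
```

Exhaustive enumeration is only practical for small problems: choose(20, 5) is already 15504 combinations, and the count grows quickly with N and k.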