Hi,
And thanks for your reply.
Essentially, your script gets the job done.
For instance, if I run

mm <- cbind(5/(1:5), -2*sqrt(1:5))
dst <- dist(mm)
dst2 <- as.matrix(dst)
diag(dst2) <- NA
idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))

then it correctly detects the first two rows, where all the values are
larger than 0.9.
In other words, it detects the points that are at least 0.9 units away
from *all* the other points.
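
The same check can also be written in terms of each point's nearest-neighbour distance. This is only a sketch of an equivalent formulation, assuming the same `mm` as above; the names `nn` and `idx2` are mine and are not part of your script:

```r
mm <- cbind(5/(1:5), -2*sqrt(1:5))
dst2 <- as.matrix(dist(mm))
diag(dst2) <- NA

# Nearest-neighbour distance of each point, ignoring the NA diagonal
nn <- apply(dst2, 1, min, na.rm = TRUE)

# A point is farther than 0.9 from all others exactly when its
# nearest neighbour is farther than 0.9
idx2 <- which(nn > 0.9)
idx2
```
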
My other question (I did not realize this until I got your answer) is
the following: I have the distance matrix of a set of N points.
You gave me an algorithm to find all the points that are more than 0.9
units away from every other point.
However, in some cases a weaker condition is enough for me: find
a subset of k points (with k tunable) whose distances *from each other*
are all greater than 0.9 units (even if their distances from some other
points may be smaller than 0.9).
Any idea about how to tackle that? Is it simply a matter of detecting
the row and column numbers of all the entries of the distance matrix
larger than 0.9?
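
To make the question concrete, here is a rough sketch of what I have in mind. I suspect that listing the entries larger than 0.9 only gives me candidate pairs, while I need every pair inside the subset to satisfy the condition. The helper name `pick_separated` and its arguments are hypothetical (not a base R function), and the greedy scan is only a heuristic: it may return fewer than k points even when a valid subset of size k exists, since in general this looks like a clique search on the graph of pairs farther apart than the threshold.

```r
# Greedy sketch: scan the points in order and keep a point only if it is
# farther than thr from every point kept so far. Stops once k are found.
pick_separated <- function(dst, k, thr = 0.9) {
  m <- as.matrix(dst)
  chosen <- integer(0)
  for (i in seq_len(nrow(m))) {
    if (length(chosen) >= k) break
    # all() on an empty comparison is TRUE, so the first point is always kept
    if (all(m[i, chosen] > thr)) chosen <- c(chosen, i)
  }
  chosen
}

mm <- cbind(5/(1:5), -2*sqrt(1:5))
sel <- pick_separated(dist(mm), k = 3)
sel
# pairwise distances within the selected subset
as.matrix(dist(mm))[sel, sel]
```

With this `mm` the scan keeps rows 1, 2 and 3: row 3 is closer than 0.9 to row 4, but that does not matter because row 4 is not in the subset.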
Many thanks

Lorenzo



On Wed, Sep 23, 2015 at 09:23:04PM +0000, David L Carlson wrote:
I think the OP wanted rows where all values were greater than .9.
If so, this works:

set.seed(42)
dst <- dist(cbind(rnorm(20), rnorm(20)))
dst2 <- as.matrix(dst)
diag(dst2) <- NA
idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
idx
13 18 19
13 18 19
dst2[idx, idx]
        13       18       19
13       NA 2.272407 3.606054
18 2.272407       NA 1.578150
19 3.606054 1.578150       NA

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of William Dunlap
Sent: Wednesday, September 23, 2015 3:23 PM
To: Lorenzo Isella
Cc: r-help@r-project.org
Subject: Re: [R] Sampling the Distance Matrix

mm <- cbind(1/(1:5), sqrt(1:5))
d <- dist(mm)
d
         1         2         3         4
2 0.6492864
3 0.9901226 0.3588848
4 1.2500000 0.6369033 0.2806086
5 1.4723668 0.8748970 0.5213550 0.2413050
which(as.matrix(d)>0.9, arr.ind=TRUE)
 row col
3   3   1
4   4   1
5   5   1
1   1   3
1   1   4
1   1   5
I.e., the distances between mm's rows 3 & 1, 4 & 1, and 5 & 1 are more than 0.9.

The as.matrix(d) is needed because dist returns only the lower triangle
of the distance matrix, as an object of class "dist", and as.matrix.dist
converts that into a full symmetric matrix.
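
Because the full matrix is symmetric, each pair shows up twice in the output above. If each pair should be listed only once, one option (a sketch, assuming the same mm; the names `dm` and `pairs` are just for illustration) is to mask everything except the lower triangle with lower.tri():

```r
mm <- cbind(1/(1:5), sqrt(1:5))
dm <- as.matrix(dist(mm))

# Keep only entries below the diagonal so each pair is reported once
pairs <- which(dm > 0.9 & lower.tri(dm), arr.ind = TRUE)
pairs
```
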

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Sep 23, 2015 at 12:15 PM, Lorenzo Isella
<lorenzo.ise...@gmail.com> wrote:
Dear All,
Suppose you have a distance matrix stored like a dist object, for
instance

x<-rnorm(20)
y<-rnorm(20)

mm<-as.matrix(cbind(x,y))

dst<-(dist(mm))

Now, my problem is the following: I would like to get the rows of mm
corresponding to points whose mutual distances are always larger than,
let's say, 0.9.
In other words, if I were to compute the distance matrix on those
selected rows of mm, apart from the diagonal, I would get all entries
larger than 0.9.
Any idea about how I can efficiently code that?
Regards

Lorenzo

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

