Re: [R] Sampling the Distance Matrix

2015-09-25 Thread David Winsemius

On Sep 25, 2015, at 12:54 PM, Lorenzo Isella wrote:

> Apologies for not letting this thread rest in peace.
> The small script
> 
> #
> set.seed(1234)
> 
> x <- rnorm(20)
> y <- rnorm(20)
> 
> 
> mtxcomb <- combn(1:20, 5)  # all 5-point index combinations, one per column
> goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx], y[idx])) > 0.9))
> 
> mycomb <- mtxcomb [ , goodcls]
> #
> 
> 
> is perfect for detecting groups of 5 points whose distances to each other
> are always above 0.9.
> However, in my practical case I have about 500 points and I am looking
> for a subset of several tens of points whose pairwise distances are all above
> a given threshold.
> Unfortunately, the approach above does not scale, so I wonder if
> anybody is aware of an alternative approach.

Find the center of the distribution, eliminate all the points within some
reasonable radius of it, perhaps sqrt( sd(x)^2 + sd(y)^2 ), and then work on
the reduced set. If you need to reduce it even further, I could imagine
sampling in sectors defined by the angle atan2(y, x).
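
The suggestion above can be sketched as follows. This is only an illustration under stated assumptions: the 500 simulated points, the choice of 8 sectors, and the rule of taking the outermost point in each sector are not from the thread, and the names `outer_idx`, `sector` and `candidates` are hypothetical.

```r
set.seed(1234)
x <- rnorm(500)   # simulated stand-in for the real 500 points
y <- rnorm(500)

# Centre of the cloud and the suggested cut-off radius
cx <- mean(x)
cy <- mean(y)
radius <- sqrt(sd(x)^2 + sd(y)^2)

# Keep only the points farther than 'radius' from the centre
r <- sqrt((x - cx)^2 + (y - cy)^2)
outer_idx <- which(r > radius)

# Angle of each remaining point around the centre, cut into 8 equal sectors
# (8 is an arbitrary choice made for this sketch)
theta <- atan2(y[outer_idx] - cy, x[outer_idx] - cx)
sector <- cut(theta, breaks = seq(-pi, pi, length.out = 9),
              include.lowest = TRUE, labels = FALSE)

# One candidate per occupied sector: the point farthest from the centre
candidates <- vapply(split(outer_idx, sector),
                     function(ix) ix[which.max(r[ix])],
                     integer(1))
candidates
```

The candidates are not guaranteed to be more than the threshold apart, so they would still need a final check, for example `all(dist(cbind(x[candidates], y[candidates])) > 0.9)`.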

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sampling the Distance Matrix

2015-09-25 Thread Lorenzo Isella

Absolutely right!
Thanks to both Davids for their help.
Cheers

Lorenzo

On Fri, Sep 25, 2015 at 01:54:54PM +, David L Carlson wrote:

You defined x and y in your original email as:


x<-rnorm(20)
y<-rnorm(20)

mm<-as.matrix(cbind(x,y))

dst<-(dist(mm))


-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352





Re: [R] Sampling the Distance Matrix

2015-09-25 Thread Lorenzo Isella

Apologies for not letting this thread rest in peace.
The small script

#
set.seed(1234)

x <- rnorm(20)
y <- rnorm(20)


mtxcomb <- combn(1:20, 5)  # all 5-point index combinations, one per column
goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx], y[idx])) > 0.9))

mycomb <- mtxcomb [ , goodcls]
#


is perfect for detecting groups of 5 points whose distances to each other
are always above 0.9.
However, in my practical case I have about 500 points and I am looking
for a subset of several tens of points whose pairwise distances are all above
a given threshold.
Unfortunately, the approach above does not scale, so I wonder if
anybody is aware of an alternative approach.
Many thanks

Lorenzo
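
To make the scaling problem concrete, the number of subsets can be counted with `choose()`, and a greedy pass can stand in for the exhaustive search. The greedy selection below is an assumption of this sketch, not a method proposed in the thread: it returns one valid subset (possibly smaller than `k`), not all of them, and the function name `greedy_far_apart` is hypothetical.

```r
# Number of 5-point subsets of 20 points, i.e. the columns of combn(1:20, 5)
choose(20, 5)      # 15504

# Number of 30-point subsets of 500 points: far too many to enumerate
choose(500, 30)

# Greedy alternative: visit the points in random order and keep a point only
# if it is more than 'threshold' away from every point kept so far.
greedy_far_apart <- function(x, y, threshold, k) {
  d <- as.matrix(dist(cbind(x, y)))
  chosen <- integer(0)
  for (i in sample(length(x))) {
    # all() on an empty comparison is TRUE, so the first point is always kept
    if (all(d[i, chosen] > threshold)) {
      chosen <- c(chosen, i)
      if (length(chosen) == k) break
    }
  }
  chosen
}

set.seed(1234)
x <- rnorm(500)   # simulated stand-in for the real data
y <- rnorm(500)
sel <- greedy_far_apart(x, y, threshold = 0.9, k = 10)
sel
```

Running the function several times with different seeds gives different subsets; if fewer than `k` indices come back, no larger subset was found along that particular random ordering.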



Re: [R] Sampling the Distance Matrix

2015-09-25 Thread David L Carlson
You defined x and y in your original email as:

> x<-rnorm(20)
> y<-rnorm(20)
>
> mm<-as.matrix(cbind(x,y))
>
> dst<-(dist(mm))
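
Putting that setup together with the combination code quoted elsewhere in the thread gives a script that runs from a fresh session, which avoids the "object 'x' not found" error. A minimal sketch, assuming the seed 1234 used in the later script; `drop = FALSE` is an addition so that the result stays a matrix even if only one column qualifies.

```r
set.seed(1234)
x <- rnorm(20)
y <- rnorm(20)

# All 5-point index combinations, one per column
mtxcomb <- combn(1:20, 5)

# TRUE for each column whose 5 points are all more than 0.9 apart
goodcls <- apply(mtxcomb, 2, function(idx)
  all(dist(cbind(x[idx], y[idx])) > 0.9))

mycomb <- mtxcomb[, goodcls, drop = FALSE]
dim(mtxcomb)
ncol(mycomb)
```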

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352





Re: [R] Sampling the Distance Matrix

2015-09-24 Thread Lorenzo Isella

Hi,
And thanks for your reply.
Essentially, your script gets the job done.
For instance, if I run

mm <- cbind(5/(1:5), -2*sqrt(1:5))
dst <- dist(mm)
dst2 <- as.matrix(dst)
diag(dst2) <- NA
idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))

then it correctly detects the first two rows, where all the values are
larger than 0.9.
In other words, it detects the points that are at least 0.9 units away
from *all* the other points.
My other question (I did not realize this until I got your answer) is
the following: I have the distance matrix of a set of N points.
You gave me an algorithm to find all the points that are at least 0.9
units away from every other point.
However, in some cases a weaker condition is enough for me: find
a subset of k points (with k tunable) whose distances *from each other*
are greater than 0.9 units (even if their distance from some other
points may be smaller than 0.9).
Any idea about how to tackle that? Is it simply a matter of detecting
the row and column numbers of all the entries of the distance matrix
larger than 0.9?
Many thanks

Lorenzo
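
Listing the entries larger than 0.9 is a start, but a subset only qualifies when every pair inside it passes. A small sketch of that check on the 5-point example above; the helper name `mutually_far` is hypothetical.

```r
mm <- cbind(5/(1:5), -2*sqrt(1:5))
d <- as.matrix(dist(mm))

# Logical matrix: TRUE where two points are more than 0.9 apart.
# The diagonal is set to TRUE so that a point never disqualifies itself.
far <- d > 0.9
diag(far) <- TRUE

# A subset qualifies only if every pair within it is far apart
mutually_far <- function(idx, far) all(far[idx, idx])

mutually_far(c(1, 2, 3), far)   # TRUE
mutually_far(c(3, 4, 5), far)   # FALSE: rows 3 and 4 are about 0.68 apart
```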






Re: [R] Sampling the Distance Matrix

2015-09-24 Thread David Winsemius

On Sep 24, 2015, at 1:54 PM, Lorenzo Isella wrote:

> On Thu, Sep 24, 2015 at 01:30:02PM -0700, David Winsemius wrote:
>> 
>> On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:
>> 
>>> Hi,
>>> And thanks for your reply.
>>> Essentially, your script gets the job done.
>>> For instance, if I run
>>> 
>>> mm <- cbind(5/(1:5), -2*sqrt(1:5))
>>> dst <- dist(mm)
>>> dst2 <- as.matrix(dst)
>>> diag(dst2) <- NA
>>> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>>> 
>>> then it correctly detects the first two rows, where all the values are
>>> larger than 0.9.
>>> In other words, it detects the points that are at least 0.9 units away
>>> from *all* the other points.
>>> My other question (I did not realize this until I got your answer) is
>>> the following: I have the distance matrix of a set of N points.
>>> You gave me an algorithm to find all the points that are at least 0.9
>>> units away from every other point.
>>> However, in some cases a weaker condition is enough for me: find
>>> a subset of k points (with k tunable) whose distances *from each other*
>>> are greater than 0.9 units (even if their distance from some other
>>> points may be smaller than 0.9).
>> 
>> If I understand correctly: make a matrix of unique combinations, then apply
>> over its columns to keep the combinations that satisfy the distance criterion:
>> 
>> mtxcomb <- combn(1:20, 5)
>> goodcls <- apply(mtxcomb , 2, function(idx) all( dist( cbind( x[idx], 
>> y[idx]) ) > 0.9))
>> mtxcomb [ , goodcls]
>> 
>> In my sample it was around 9% of the total 5 item combinations.
>> 
>> snipped a lot of output:
>> .
>>   [,1440] [,1441]
>> [1,]  12  13
>> [2,]  13  16
>> [3,]  16  17
>> [4,]  19  19
>> [5,]  20  20
>>> dim( mtxcomb)
>> [1] 5 15504
>> 
> 
> Hi,
> Thanks for your reply.
> I think I am getting there, but when I run your commands, I get this
> error message
> 
> Error in cbind(x[idx], y[idx]) : object 'x' not found
> 
> Any idea why? Should I combine those 3 lines with something else?

No idea. I was running the setup that you asked for in your original message 
which you have now omitted from the mail chain.



> Cheers
> 
> Lorenzo

David Winsemius
Alameda, CA, USA



Re: [R] Sampling the Distance Matrix

2015-09-24 Thread David Winsemius

On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:

> Hi,
> And thanks for your reply.
> Essentially, your script gets the job done.
> For instance, if I run
> 
> mm <- cbind(5/(1:5), -2*sqrt(1:5))
> dst <- dist(mm)
> dst2 <- as.matrix(dst)
> diag(dst2) <- NA
> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
> 
> then it correctly detects the first two rows, where all the values are
> larger than 0.9.
> In other words, it detects the points that are at least 0.9 units away
> from *all* the other points.
> My other question (I did not realize this until I got your answer) is
> the following: I have the distance matrix of a set of N points.
> You gave me an algorithm to find all the points that are at least 0.9
> units away from every other point.
> However, in some cases a weaker condition is enough for me: find
> a subset of k points (with k tunable) whose distances *from each other*
> are greater than 0.9 units (even if their distance from some other
> points may be smaller than 0.9).

If I understand correctly: make a matrix of unique combinations, then apply
over its columns to keep the combinations that satisfy the distance criterion:

mtxcomb <- combn(1:20, 5)
goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx], y[idx])) > 0.9))
mtxcomb[, goodcls]

In my sample it was around 9% of the total 5 item combinations.

snipped a lot of output:
.
[,1440] [,1441]
[1,]  12  13
[2,]  13  16
[3,]  16  17
[4,]  19  19
[5,]  20  20
> dim( mtxcomb)
[1] 5 15504
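
The code above recomputes `dist()` for every one of the 15504 combinations. One possible variant, sketched here as an assumption and not as part of the original reply, computes the distance matrix once and only subsets it inside the loop; the seed and the name `far` are choices made for the example.

```r
set.seed(1234)
x <- rnorm(20)
y <- rnorm(20)

# Compute all pairwise distances once and threshold them.
# The diagonal is set to TRUE so that self-distances are ignored.
far <- as.matrix(dist(cbind(x, y))) > 0.9
diag(far) <- TRUE

mtxcomb <- combn(1:20, 5)

# A column qualifies when every pair of its indices is far apart
goodcls <- apply(mtxcomb, 2, function(idx) all(far[idx, idx]))
sum(goodcls)
```

This does not change the number of combinations that must be visited, so it speeds up each check without fixing the combinatorial growth.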


-- 
David

Re: [R] Sampling the Distance Matrix

2015-09-24 Thread Lorenzo Isella

On Thu, Sep 24, 2015 at 01:30:02PM -0700, David Winsemius wrote:
>
> On Sep 24, 2015, at 12:36 PM, Lorenzo Isella wrote:
>
>> Hi,
>> And thanks for your reply.
>> Essentially, your script gets the job done.
>> For instance, if I run
>>
>> mm <- cbind(5/(1:5), -2*sqrt(1:5))
>> dst <- dist(mm)
>> dst2 <- as.matrix(dst)
>> diag(dst2) <- NA
>> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
>>
>> then it correctly detects the first two rows, where all the values are
>> larger than 0.9.
>> In other words, it detects the points that are at least 0.9 units away
>> from *all* the other points.
>> My other question (I did not realize this until I got your answer) is
>> the following: I have the distance matrix of a set of N points.
>> You gave me an algorithm to find all the points that are at least 0.9
>> units away from every other point.
>> However, in some cases a weaker condition is enough for me: find
>> a subset of k points (with k tunable) whose distances *from each other*
>> are greater than 0.9 units (even if their distance from some other
>> points may be smaller than 0.9).
>
> If I understand correctly: make a matrix of unique combinations, then apply
> over its columns to keep the combinations that satisfy the distance criterion:
>
> mtxcomb <- combn(1:20, 5)
> goodcls <- apply(mtxcomb, 2, function(idx) all(dist(cbind(x[idx], y[idx])) > 0.9))
> mtxcomb[, goodcls]
>
> In my sample it was around 9% of the total 5 item combinations.
>
> snipped a lot of output:
> .
>      [,1440] [,1441]
> [1,]      12      13
> [2,]      13      16
> [3,]      16      17
> [4,]      19      19
> [5,]      20      20
> > dim( mtxcomb)
> [1] 5 15504


Hi,
Thanks for your reply.
I think I am getting there, but when I run your commands, I get this
error message

Error in cbind(x[idx], y[idx]) : object 'x' not found

Any idea why? Should I combine those 3 lines with something else?
Cheers

Lorenzo



Re: [R] Sampling the Distance Matrix

2015-09-23 Thread David L Carlson
I think the OP wanted rows where all values were greater than .9.
If so, this works:

> set.seed(42)
> dst <- dist(cbind(rnorm(20), rnorm(20)))
> dst2 <- as.matrix(dst)
> diag(dst2) <- NA
> idx <- which(apply(dst2, 1, function(x) all(na.omit(x)>.9)))
> idx
13 18 19 
13 18 19 
> dst2[idx, idx]
         13       18       19
13       NA 2.272407 3.606054
18 2.272407       NA 1.578150
19 3.606054 1.578150       NA
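
An equivalent way to express the same test is to compare each row's minimum distance with the threshold. A small sketch, assuming the same seed as above; the names `idx_all` and `idx_min` are only for the comparison.

```r
set.seed(42)
dst2 <- as.matrix(dist(cbind(rnorm(20), rnorm(20))))
diag(dst2) <- NA   # ignore each point's zero distance to itself

# Original formulation: every off-diagonal entry in the row exceeds 0.9
idx_all <- which(apply(dst2, 1, function(x) all(na.omit(x) > .9)))

# Same condition stated through the nearest neighbour of each point
idx_min <- which(apply(dst2, 1, min, na.rm = TRUE) > .9)

identical(idx_all, idx_min)
```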

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352





Re: [R] Sampling the Distance Matrix

2015-09-23 Thread William Dunlap
> mm <- cbind(1/(1:5), sqrt(1:5))
> d <- dist(mm)
> d
          1         2         3         4
2 0.6492864
3 0.9901226 0.3588848
4 1.2500000 0.6369033 0.2806086
5 1.4723668 0.8748970 0.5213550 0.2413050
> which(as.matrix(d)>0.9, arr.ind=TRUE)
  row col
3   3   1
4   4   1
5   5   1
1   1   3
1   1   4
1   1   5
I.e., the distances between mm's rows 3 & 1, 4 & 1, and 5 & 1 are more than 0.9.
Each pair is listed twice because the full matrix is symmetric.

The as.matrix(d) is needed because dist returns only the lower triangle of
the distance matrix, as an object of class "dist", and as.matrix.dist
converts that into a full square matrix.
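
If each pair should be reported only once, the comparison can be restricted to the lower triangle. A short sketch using the same 5-point example; the names `dm` and `pairs` are hypothetical.

```r
mm <- cbind(1/(1:5), sqrt(1:5))
dm <- as.matrix(dist(mm))

# lower.tri() is TRUE strictly below the diagonal, so each unordered pair
# of rows is tested exactly once
pairs <- which(dm > 0.9 & lower.tri(dm), arr.ind = TRUE)
pairs
```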

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Sep 23, 2015 at 12:15 PM, Lorenzo Isella wrote:
> Dear All,
> Suppose you have a distance matrix stored like a dist object, for
> instance
>
> x<-rnorm(20)
> y<-rnorm(20)
>
> mm<-as.matrix(cbind(x,y))
>
> dst<-(dist(mm))
>
> Now, my problem is the following: I would like to get the rows of mm
> corresponding to points whose distance is always larger than, let's say,
> 0.9.
> In other words, if I were to compute the distance matrix on those
> selected rows of mm, apart from the diagonal, I would get all entries
> larger than 0.9.
> Any idea about how I can efficiently code that?
> Regards
>
> Lorenzo
>