
Re: [R] any way to make it work faster (deleting rows that contain certain values)

2009-09-23 Thread Dimitri Liakhovitski

Chuck, thank you, but I am not sure I understood what you meant.
There are a lot of rows in "index" where at least 2 columns have equal
values and a lot of rows where column 1 has 2 and some other column
has 5 - same for 3 in column 1 and 6 in some other column, etc.
Thanks a lot for clarifying!
Dimitri

On Tue, Sep 22, 2009 at 5:36 PM, Charles C. Berry wrote:
> On Tue, 22 Sep 2009, Dimitri Liakhovitski wrote:
>
>> Hello, dear R'ers,
>>
>> index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
>>
>> In this case, dim(index) is 7,340,032 (!)  and 11.
>> I realize it's huge.
>> Then, I am trying to get rid of the undesired combinations of columns.
>> They should not contain identical values in any 2 columns.
>
>
> Right, but you have only four values in each of columns 2:11.
>
> And none of them can be identical.
>
> There are exactly
>
>        choose(4,10)
>
> rows that satisfy that constraint for columns 2:11.
>
> The rows of your result are easily enumerated by hand. ;-)
>
> HTH,
>
> Chuck
>
>> Also if column 1 has a value of 5, there should be no 2 in any other column,
>> if column 1 has a value of 6, there should be no 3 in any other column, and
>> if column 1 has a value of 7, there should be no 4 in any other column.
>> I wrote a generic script to achieve that (below).
>> However, I was wondering if it's possible to make it any faster - it
>> looks like with that huge index it's going to take me days...
>>
>> Thanks a lot for any suggestion!
>> Dimitri
>>
>> index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
>> bad.pairs<-matrix(c(1,1,2,2,3,3,4,4,5,2,6,3,7,4),nrow=7,ncol=2,byrow=T)
>> for(i in 1:ncol(index)){              # looping through columns of the "index"
>>  for(pair in 1:nrow(bad.pairs)){      # looping through rows of "bad.pairs"
>>   keep<-sapply(1:nrow(index), function(x){
>>     temp<-(index[[x,i]]==bad.pairs[pair,1]) &
>>       (any(index[x,-i]==bad.pairs[pair,2]))
>>     return(temp)
>>   })
>>   index<-index[!keep,]
>>  }
>> }
>>
>> --
>> Dimitri Liakhovitski
>> Ninah.com
>> dimitri.liakhovit...@ninah.com
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cbe...@tajo.ucsd.edu               UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>
>
>



-- 
Dimitri Liakhovitski
Ninah.com
dimitri.liakhovit...@ninah.com


Re: [R] any way to make it work faster (deleting rows that contain certain values)

2009-09-22 Thread Charles C. Berry


On Tue, 22 Sep 2009, Dimitri Liakhovitski wrote:


> Hello, dear R'ers,
>
> index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
>
> In this case, dim(index) is 7,340,032 (!)  and 11.
> I realize it's huge.
> Then, I am trying to get rid of the undesired combinations of columns.
> They should not contain identical values in any 2 columns.



Right, but you have only four values in each of columns 2:11.

And none of them can be identical.

There are exactly

choose(4,10)

rows that satisfy that constraint for columns 2:11.

The rows of your result are easily enumerated by hand. ;-)

HTH,

Chuck
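
The counting argument above can be checked directly in R. This is only a sketch: the names n_values, n_columns, small and has_repeat are made up for illustration, and the small grid (three columns over the values 1:2) is an assumed stand-in for the full ten columns over 1:4.

```r
# Columns 2:11 are ten columns that can only take the four values 1:4.
n_values  <- 4
n_columns <- 10

# Number of ways to choose 10 distinct values out of 4.
n_valid <- choose(n_values, n_columns)
print(n_valid)

# Same effect on a grid small enough to inspect: 3 columns over 1:2.
small <- expand.grid(1:2, 1:2, 1:2)

# A row "has a repeat" when any value occurs in more than one column.
has_repeat <- apply(small, 1, function(r) anyDuplicated(r) > 0)

# TRUE when every row repeats a value, i.e. no row would survive the rule.
print(all(has_repeat))
```

With more columns than available values, every row must repeat some value, so no filtering loop is needed to know what remains for columns 2:11.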


> Also if column 1 has a value of 5, there should be no 2 in any other column,
> if column 1 has a value of 6, there should be no 3 in any other column, and
> if column 1 has a value of 7, there should be no 4 in any other column.
> I wrote a generic script to achieve that (below).
> However, I was wondering if it's possible to make it any faster - it
> looks like with that huge index it's going to take me days...
>
> Thanks a lot for any suggestion!
> Dimitri
>
> index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
> bad.pairs<-matrix(c(1,1,2,2,3,3,4,4,5,2,6,3,7,4),nrow=7,ncol=2,byrow=T)
> for(i in 1:ncol(index)){              # looping through columns of the "index"
>  for(pair in 1:nrow(bad.pairs)){      # looping through rows of "bad.pairs"
>   keep<-sapply(1:nrow(index), function(x){
>     temp<-(index[[x,i]]==bad.pairs[pair,1]) &
>       (any(index[x,-i]==bad.pairs[pair,2]))
>     return(temp)
>   })
>   index<-index[!keep,]
>  }
> }
>
> --
> Dimitri Liakhovitski
> Ninah.com
> dimitri.liakhovit...@ninah.com




Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


[R] any way to make it work faster (deleting rows that contain certain values)

2009-09-22 Thread Dimitri Liakhovitski

Hello, dear R'ers,

index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)

In this case, dim(index) is 7,340,032 (!)  and 11.
I realize it's huge.
Then, I am trying to get rid of the undesired combinations of columns.
They should not contain identical values in any 2 columns.
Also if column 1 has a value of 5, there should be no 2 in any other column,
if column 1 has a value of 6, there should be no 3 in any other column, and
if column 1 has a value of 7, there should be no 4 in any other column.
I wrote a generic script to achieve that (below).
However, I was wondering if it's possible to make it any faster - it
looks like with that huge index it's going to take me days...

Thanks a lot for any suggestion!
Dimitri

index<-expand.grid(1:7,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4,1:4)
bad.pairs<-matrix(c(1,1,2,2,3,3,4,4,5,2,6,3,7,4),nrow=7,ncol=2,byrow=T)
for(i in 1:ncol(index)){              # looping through columns of the "index"
  for(pair in 1:nrow(bad.pairs)){     # looping through rows of "bad.pairs"
    keep<-sapply(1:nrow(index), function(x){
      temp<-(index[[x,i]]==bad.pairs[pair,1]) &
        (any(index[x,-i]==bad.pairs[pair,2]))
      return(temp)
    })
    index<-index[!keep,]
  }
}
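
For comparison, here is a minimal vectorized sketch of the same row filter. It is not from the thread: filter_index, drop_row, hit_first, hit_second and the three-column demo grid are hypothetical names and sizes chosen for illustration. It assumes the rule is exactly the one coded above (a row goes when column i equals the first value of a bad pair and any other column equals the second), and it tests whole columns at once instead of calling a function per row.

```r
# Hypothetical helper: drop every row that matches any bad pair in any column.
filter_index <- function(index, bad.pairs) {
  m <- as.matrix(index)               # matrix comparisons are much cheaper
  drop_row <- logical(nrow(m))        # starts as all FALSE
  for (i in seq_len(ncol(m))) {
    for (pair in seq_len(nrow(bad.pairs))) {
      # column i holds the first value of the pair ...
      hit_first <- m[, i] == bad.pairs[pair, 1]
      # ... and at least one of the other columns holds the second value
      hit_second <- rowSums(m[, -i, drop = FALSE] == bad.pairs[pair, 2]) > 0
      drop_row <- drop_row | (hit_first & hit_second)
    }
  }
  index[!drop_row, , drop = FALSE]
}

bad.pairs <- matrix(c(1,1,2,2,3,3,4,4,5,2,6,3,7,4), nrow = 7, ncol = 2, byrow = TRUE)

# Assumed small demo grid: column 1 over 1:7, two more columns over 1:4.
demo <- expand.grid(1:7, 1:4, 1:4)
result <- filter_index(demo, bad.pairs)
print(nrow(result))
```

Because each row is tested independently of the others, collecting all the flags first and subsetting once should select the same rows as deleting them pass by pass. On the full 7,340,032 x 11 grid the intermediate logical matrices are large, so memory use is worth checking before running it there.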

-- 
Dimitri Liakhovitski
Ninah.com
dimitri.liakhovit...@ninah.com

