Re: [R] Using apply for logical conditions

2010-08-02 Thread Alastair

Wow,

Thanks for all the excellent (and fast) responses. That's really helped.
Sorry I didn't supply a cut and paste-able example (noted for future
reference) but your examples caught the essence of my problem.  

I ended up opting for the apply any solution. But I'll bear the Reduce
function in mind. 

Thanks,
Alastair
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-apply-for-logical-conditions-tp2310929p2311079.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply for logical conditions

2010-08-02 Thread Bert Gunter
Just for fun, here are another couple of versions that work for data frames.

For Reduce with "|"

do.call(pmax,c(mydata,na.rm=TRUE)) >0

and for "&"

do.call(pmin,c(mydata,na.rm=TRUE)) >0
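
A small check of the two idioms above, as a sketch on a made-up data frame (the name `toy` and its values are invented for illustration). TRUE counts as 1 and FALSE as 0, so the row-wise maximum is positive exactly when some column is TRUE, and the row-wise minimum is positive exactly when every column is TRUE.

```r
# Hypothetical toy data; not from the thread.
toy <- data.frame(X = c(TRUE, FALSE, TRUE),
                  Y = c(TRUE, FALSE, FALSE),
                  Z = c(TRUE, FALSE, TRUE))

# c(toy, na.rm = TRUE) is a plain list: the three columns plus the na.rm argument.
any_true <- do.call(pmax, c(toy, na.rm = TRUE)) > 0
all_true <- do.call(pmin, c(toy, na.rm = TRUE)) > 0

# Compare against the Reduce() versions.
same_or  <- identical(any_true, Reduce(`|`, toy))
same_and <- identical(all_true, Reduce(`&`, toy))
```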


 Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics


On Mon, Aug 2, 2010 at 2:28 PM, Joshua Wiley  wrote:
> On Mon, Aug 2, 2010 at 2:08 PM, Michael Lachmann  wrote:
>>
>> Reduce() is much nicer, but I usually use
>>
>> rowSums(A) > 0 for 'or', and
>> rowSums(A) == ncol(A) for 'and'.
>>
>> Which works slightly faster.
>
> For the sake of my own curiosity, I compared several of these options;
> the results are below in case others are interested.
>
>> boolean <- c(TRUE, FALSE, FALSE)
>>
>> set.seed(1)
>> mydata <- data.frame(X = sample(boolean, 10^7, replace = TRUE),
> +                      Y = sample(boolean, 10^7, replace = TRUE),
> +                      Z = sample(boolean, 10^7, replace = TRUE))
>>
>> system.time(opt1 <- apply(mydata, 1, any))
>   user  system elapsed
>  147.26    0.42  148.56
>> system.time(opt2 <- Reduce('|', mydata))
>   user  system elapsed
>   0.33    0.00    0.35
>> system.time(opt3 <- as.logical(rowSums(mydata, na.rm = TRUE)))
>   user  system elapsed
>   0.25    0.00    0.27
>> system.time(opt4 <- rowSums(mydata, na.rm = TRUE) > 0)
>   user  system elapsed
>   0.25    0.00    0.25
>>
>> identical(opt1, opt2)
> [1] TRUE
>> identical(opt1, opt3)
> [1] TRUE
>> identical(opt1, opt4)
> [1] TRUE
>>
>> rm(boolean, mydata, opt1, opt2, opt3, opt4)
>
>
>
>>
>> I noticed, though, that Reduce() doesn't work on matrices. Is there an
>> alternative for matrices, or do you have to convert the matrix first to a
>> data.frame, and then use Reduce?
>>
>>
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
>
>



Re: [R] Using apply for logical conditions

2010-08-02 Thread Michael Lachmann

Reduce() is really amazingly fast!

Even with a much larger number of columns, it is still in the same ballpark
(and much more readable):

> boolean <- c(TRUE, rep(FALSE,10^3))
> a<-matrix(sample(boolean, 10^7, replace = TRUE),10^4,10^3)
> b<-data.frame(a)
> system.time({opt4 <- rowSums(a, na.rm = TRUE) > 0})
   user  system elapsed 
  0.129   0.001   0.131 
> system.time({opt2 <- Reduce('|',b)})
   user  system elapsed 
  0.190   0.109   0.303 

and:
> boolean <- c(TRUE, rep(FALSE,10^4))
> a<-matrix(sample(boolean, 10^7, replace = TRUE),10^3,10^4)
> b<-data.frame(a)
> system.time({opt4 <- rowSums(a, na.rm = TRUE) > 0})
   user  system elapsed 
  0.082   0.001   0.083 
> system.time({opt2 <- Reduce('|',b)})
   user  system elapsed 
  0.205   0.001   0.209 

Reduce('+') seems to pretty much make rowSums obsolete, except that Reduce
works on lists rather than matrices, and converting a matrix to a data.frame
takes ages.
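
One possible way to sidestep the data.frame conversion, sketched under the assumption that a plain list of column vectors is good enough for Reduce(): build that list with lapply(). The small matrix `m` and the name `cols` are made up for illustration, and no timing claim is intended.

```r
# Hypothetical small logical matrix (3 rows, 4 columns); not from the thread.
m <- matrix(c(TRUE,  FALSE, FALSE,
              FALSE, FALSE, FALSE,
              FALSE, FALSE, TRUE,
              FALSE, FALSE, FALSE),
            nrow = 3, ncol = 4)

# Split the matrix into a list of column vectors without building a data.frame.
cols <- lapply(seq_len(ncol(m)), function(j) m[, j])

row_or_reduce  <- Reduce(`|`, cols)   # element-wise OR across the columns
row_or_rowsums <- rowSums(m) > 0      # works on the matrix directly

same <- identical(row_or_reduce, row_or_rowsums)
```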

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-apply-for-logical-conditions-tp2310929p2311042.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using apply for logical conditions

2010-08-02 Thread Joshua Wiley
On Mon, Aug 2, 2010 at 2:08 PM, Michael Lachmann  wrote:
>
> Reduce() is much nicer, but I usually use
>
> rowSums(A) > 0 for 'or', and
> rowSums(A) == ncol(A) for 'and'.
>
> Which works slightly faster.

For the sake of my own curiosity, I compared several of these options;
the results are below in case others are interested.

> boolean <- c(TRUE, FALSE, FALSE)
>
> set.seed(1)
> mydata <- data.frame(X = sample(boolean, 10^7, replace = TRUE),
+                      Y = sample(boolean, 10^7, replace = TRUE),
+                      Z = sample(boolean, 10^7, replace = TRUE))
>
> system.time(opt1 <- apply(mydata, 1, any))
   user  system elapsed
 147.26    0.42  148.56
> system.time(opt2 <- Reduce('|', mydata))
   user  system elapsed
   0.33    0.00    0.35
> system.time(opt3 <- as.logical(rowSums(mydata, na.rm = TRUE)))
   user  system elapsed
   0.25    0.00    0.27
> system.time(opt4 <- rowSums(mydata, na.rm = TRUE) > 0)
   user  system elapsed
   0.25    0.00    0.25
>
> identical(opt1, opt2)
[1] TRUE
> identical(opt1, opt3)
[1] TRUE
> identical(opt1, opt4)
[1] TRUE
>
> rm(boolean, mydata, opt1, opt2, opt3, opt4)



>
> I noticed, though, that Reduce() doesn't work on matrices. Is there an
> alternative for matrices, or do you have to convert the matrix first to a
> data.frame, and then use Reduce?
>
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Using apply for logical conditions

2010-08-02 Thread Bert Gunter
Yes, you must do the conversion. The reason is that Reduce requires
its argument x to be a vector, and a matrix is seen as a vector obtained
by columnwise concatenation, e.g.

> Reduce("+",matrix(1:6,nr=3))
[1] 21
> Reduce("+",1:6)
[1] 21

The data frame is seen as a list with elements the columns of the
frame. Whence one concludes that the f argument must be vectorized for
the Reduce to work on the columns of the data frame as you expect.
e.g.

> Reduce(min,data.frame(a=1:3,b=4:6))
[1] 1

but

> Reduce(pmin,data.frame(a=1:3,b=4:6))
[1] 1 2 3
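
The same point for the logical case, as a small sketch (the matrix `m` is made up for illustration): on a matrix, Reduce() folds over the individual elements, while on the converted data frame it folds over the columns, and the vectorized `|` then works row by row.

```r
# Hypothetical 3 x 2 logical matrix; not from the thread.
m <- matrix(c(TRUE, FALSE, FALSE,
              FALSE, TRUE, FALSE),
            nrow = 3)

# On the matrix, Reduce() walks over all six elements and returns one value.
on_matrix <- Reduce(`|`, m)

# After conversion, Reduce() walks over the two columns,
# and `|` combines them element-wise, giving one value per row.
by_row <- Reduce(`|`, as.data.frame(m))
```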


Cheers,

Bert Gunter
Genentech Nonclinical Biostatistics


On Mon, Aug 2, 2010 at 2:08 PM, Michael Lachmann  wrote:
>
> Reduce() is much nicer, but I usually use
>
> rowSums(A) > 0 for 'or', and
> rowSums(A) == ncol(A) for 'and'.
>
> Which works slightly faster.
>
> I noticed, though, that Reduce() doesn't work on matrices. Is there an
> alternative for matrices, or do you have to convert the matrix first to a
> data.frame, and then use Reduce?
>
>
>



Re: [R] Using apply for logical conditions

2010-08-02 Thread Michael Lachmann

Reduce() is much nicer, but I usually use

rowSums(A) > 0 for 'or', and
rowSums(A) == ncol(A) for 'and'.

Which works slightly faster.

I noticed, though, that Reduce() doesn't work on matrices. Is there an
alternative for matrices, or do you have to convert the matrix first to a
data.frame, and then use Reduce?
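
A minimal sketch of the rowSums() idiom mentioned above, on a made-up matrix `A` (the values are invented for illustration). The 'and' test compares the row total with the number of columns, ncol(A).

```r
# Hypothetical 3 x 2 logical matrix; not from the thread.
A <- matrix(c(TRUE, FALSE, TRUE,
              TRUE, FALSE, FALSE),
            nrow = 3)

row_or  <- rowSums(A) > 0          # TRUE where at least one column is TRUE
row_and <- rowSums(A) == ncol(A)   # TRUE where every column is TRUE
```

If the data can contain NA, rowSums() returns NA for those rows unless na.rm = TRUE is passed, in which case an NA is effectively counted as FALSE.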


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-apply-for-logical-conditions-tp2310929p2310991.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using apply for logical conditions

2010-08-02 Thread Joshua Wiley
In addition to Reduce(), you can take a look at ?any for '|' and ?all for '&'.

Josh

On Mon, Aug 2, 2010 at 1:43 PM, Allan Engelhardt  wrote:
> `|` is a binary operator which is why the apply will not work.  See
>
> help("Reduce")
>
> For example,
>
> set.seed(1)
> data <- data.frame(A = runif(10) > 0.5, B = runif(10) > 0.5, C = runif(10) >
> 0.5)
> Reduce(`|`, data)
> #  [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
>
> Hope this helps.
>
> Allan
>
> On 02/08/10 21:35, Alastair wrote:
>>
>> Hi,
>>
>> I've got some boolean data in a data.frame in the form:
>>       X    Y    Z    A   B   C
>> [1]  T     T    F    T   F   F
>> [2]  F     T    T    F   F   F
>> .
>> .
>> .
>>
>>
>> What I want to do is create a new column which is the logical disjunction
>> of
>> several of the columns.
>> Just like:
>>
>> new.column<- data$X | data$Y | data$Z
>>
>> However I don't want to hard code the particular columns into the
>> expression
>> like that. I've tried using apply row wise with `|` as the function:
>>
>> columns<- c(X,Y,Z)
>> apply(data[,columns], 1,`|`)
>>
>> This doesn't seem to do what I would have expected. Does anyone have any
>> advice on how to use apply or a similar function to perform a boolean
>> operation on each row (and a specific subset of the columns) in a data
>> frame?
>>
>> Thanks,
>> Alastair
>>
>>
>>
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Using apply for logical conditions

2010-08-02 Thread Erik Iverson



Alastair wrote:

Hi,

I've got some boolean data in a data.frame in the form:
      X    Y    Z    A   B   C
[1]  T     T    F    T   F   F
[2]  F     T    T    F   F   F
.
.
.


What I want to do is create a new column which is the logical disjunction of
several of the columns.
Just like:

new.column <- data$X | data$Y | data$Z

However I don't want to hard code the particular columns into the expression
like that. I've tried using apply row wise with `|` as the function:

columns <- c(X,Y,Z)
apply(data[,columns], 1,`|`)



Please provide *reproducible* examples.  I cannot run any of your code 
since you don't give us the objects X, Y, or Z.  An easy way to do this 
is to use ?dput on the objects we need to run your code, e.g., your 
data.frame.


Does this do what you want?

df1 <- data.frame(x = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  y = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  z = sample(c(TRUE, FALSE), 10, replace = TRUE))

columns <- c("x", "y", "z")

apply(df1[columns], 1, any)
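
To attach the result as a new column for only some of the columns, something along these lines should work. This is a sketch: the seed, the choice of "x" and "y", and the column name `new.column` (borrowed from the original question) are arbitrary.

```r
set.seed(42)  # arbitrary seed, only so the example is reproducible
df1 <- data.frame(x = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  y = sample(c(TRUE, FALSE), 10, replace = TRUE),
                  z = sample(c(TRUE, FALSE), 10, replace = TRUE))

# Column names are given as character strings, so the subset is not hard-coded
# into the logical expression itself.
columns <- c("x", "y")

# Row-wise 'or' over the chosen columns, stored as a new column.
df1$new.column <- apply(df1[columns], 1, any)
```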



Re: [R] Using apply for logical conditions

2010-08-02 Thread Allan Engelhardt

`|` is a binary operator which is why the apply will not work.  See

help("Reduce")

For example,

set.seed(1)
data <- data.frame(A = runif(10) > 0.5, B = runif(10) > 0.5, C = 
runif(10) > 0.5)

Reduce(`|`, data)
#  [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
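
The same call can be restricted to a subset of columns chosen by name, which is what the original question asks for. A sketch, assuming the columns of interest are "A" and "C" (an arbitrary choice for illustration):

```r
set.seed(1)
data <- data.frame(A = runif(10) > 0.5,
                   B = runif(10) > 0.5,
                   C = runif(10) > 0.5)

# Names of the columns to combine, as character strings.
columns <- c("A", "C")

# data[columns] is still a data frame, i.e. a list of columns,
# so Reduce() applies `|` column by column.
data$new.column <- Reduce(`|`, data[columns])
```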

Hope this helps.

Allan

On 02/08/10 21:35, Alastair wrote:

Hi,

I've got some boolean data in a data.frame in the form:
      X    Y    Z    A   B   C
[1]  T     T    F    T   F   F
[2]  F     T    T    F   F   F
.
.
.


What I want to do is create a new column which is the logical disjunction of
several of the columns.
Just like:

new.column<- data$X | data$Y | data$Z

However I don't want to hard code the particular columns into the expression
like that. I've tried using apply row wise with `|` as the function:

columns<- c(X,Y,Z)
apply(data[,columns], 1,`|`)

This doesn't seem to do what I would have expected. Does anyone have any
advice on how to use apply or a similar function to perform a boolean
operation on each row (and a specific subset of the columns) in a data
frame?

Thanks,
Alastair




