[R] use rowSums or colSums instead of apply!

Tim Hesterberg Tue, 19 Feb 2008 15:51:48 -0800

There were two recent queries about removing rows or columns
that consist entirely of NAs.

Three respondents suggested combinations of apply() with
any() or all().

I cringe when I see apply() used unnecessarily.
Using rowSums() or colSums() is much faster, and gives more readable
code.  (Two respondents did suggest colSums for the second query.)

# original small data frame
df <- data.frame(col1=c(1:3,NA,NA,4),col2=c(7:9,NA,NA,NA),col3=c(2:4,NA,NA,4))
system.time( for(i in 1:10^4) temp <- rowSums(is.na(df)) < 3)
# .078
system.time( for(i in 1:10^4) temp <- apply(df,1,function(x)any(!is.na(x))))
# 3.33

# larger data frame
x <- matrix(runif(10^5), 10^3)
x[ runif(10^5) < .99 ] <-  NA
df2 <- data.frame(x)
system.time( for(i in 1:100) temp <- rowSums(is.na(df2)) < 100)
# .34
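system.time( for(i in 1:100) temp <- apply(df2,1,function(x)any(!is.na(x))))
# no timing recorded for df2; the 3.34 originally shown here came from
# rerunning the small-df apply() benchmark (df, 10^4 iterations)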
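
For completeness, here is a minimal sketch of how the rowSums()/colSums()
test might be used to actually drop the all-NA rows and columns, which is
what the original queries asked for. The names na_mask, keep_rows,
keep_cols and cleaned are illustrative only and do not come from the
earlier threads.

```r
# Sketch: drop rows and columns that are entirely NA.
# Object names below are assumptions made for illustration.
df <- data.frame(col1 = c(1:3, NA, NA, 4),
                 col2 = c(7:9, NA, NA, NA),
                 col3 = c(2:4, NA, NA, 4))

na_mask   <- is.na(df)                     # logical matrix, TRUE where missing
keep_rows <- rowSums(na_mask) < ncol(df)   # row has at least one non-NA value
keep_cols <- colSums(na_mask) < nrow(df)   # column has at least one non-NA value

# drop = FALSE keeps the result a data frame even if one column survives
cleaned <- df[keep_rows, keep_cols, drop = FALSE]
```

Comparing against ncol(df) and nrow(df) rather than a hard-coded 3 or 100
means the same lines should work unchanged on df2.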

Tim Hesterberg

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
