Can someone please tell me what is up with na.action in aggregate?

My (somewhat) reproducible example follows.
(I say "somewhat" because some lines wouldn't run in a separate session;
more on that below.)

set.seed(100)
dat=data.frame(
        x1=sample(c(NA,'m','f'), 100, replace=TRUE),
        x2=sample(c(NA, 1:10), 100, replace=TRUE),
        x3=sample(c(NA,letters[1:5]), 100, replace=TRUE),
        x4=sample(c(NA,T,F), 100, replace=TRUE),
        y=sample(c(rep(NA,5), rnorm(95))))
dat
## The total from dat:
sum(dat$y, na.rm=T)
## The total from aggregate:
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)  ## <--- This line gave an error in a separate R instance
## The aggregate formula is excluding NA
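
A minimal sketch of one way to check which rows are being left out, assuming (as ?aggregate states) that rows with a missing value in any of the grouping variables are omitted from the result. The names keep, total_complete and total_agg are introduced here purely for illustration.

```r
# Self-contained copy of the example data (same construction as above).
set.seed(100)
dat <- data.frame(
  x1 = sample(c(NA, "m", "f"), 100, replace = TRUE),
  x2 = sample(c(NA, 1:10), 100, replace = TRUE),
  x3 = sample(c(NA, letters[1:5]), 100, replace = TRUE),
  x4 = sample(c(NA, TRUE, FALSE), 100, replace = TRUE),
  y  = sample(c(rep(NA, 5), rnorm(95))))

# Rows with no NA in any of the four grouping columns.
keep <- complete.cases(dat[, 1:4])

# Total of y over those rows only, ignoring NA values of y itself.
total_complete <- sum(dat$y[keep], na.rm = TRUE)

# Total reported by aggregate() with the same grouping columns.
total_agg <- sum(aggregate(dat$y, dat[, 1:4], sum, na.rm = TRUE)$x)

# If the assumption holds, the two totals agree up to rounding.
all.equal(total_complete, total_agg)
```

If the two totals agree, the shortfall relative to sum(dat$y, na.rm=TRUE) comes from rows where a grouping column is NA, not from NA values in y.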

## So, let's try to include NAs
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action='na.pass')$y)
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action=na.pass)$y)
## The aggregate formula is STILL excluding NA
## In fact, the formula doesn't seem to notice the na.action
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action='foo man chew')$y)
## Hmmmm... that error surprised me (since the previous two things ran)
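
A small sketch of how one might test whether na.action is being noticed at all, assuming the formula method builds its data through model.frame() before any grouping happens. The names f, n_omit and n_pass are hypothetical helpers for this illustration.

```r
# Self-contained copy of the example data (same construction as above).
set.seed(100)
dat <- data.frame(
  x1 = sample(c(NA, "m", "f"), 100, replace = TRUE),
  x2 = sample(c(NA, 1:10), 100, replace = TRUE),
  x3 = sample(c(NA, letters[1:5]), 100, replace = TRUE),
  x4 = sample(c(NA, TRUE, FALSE), 100, replace = TRUE),
  y  = sample(c(rep(NA, 5), rnorm(95))))

f <- y ~ x1 + x2 + x3 + x4

# Number of rows that survive each na.action in the model frame.
n_omit <- nrow(model.frame(f, data = dat, na.action = na.omit))
n_pass <- nrow(model.frame(f, data = dat, na.action = na.pass))

c(omit = n_omit, pass = n_pass)   # na.pass keeps all 100 rows
```

If the row counts differ here while the aggregate totals do not, then na.action is being applied and the NA groups are being lost at a later step.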

## So, let's try to change the global options
## (not mentioned in the help, but after reading the help
##  100 times, I thought I would go above and beyond to avoid
##  any r list flames from people complaining
##  that I didn't read the help... but that's a separate topic)
options(na.action ="na.pass")
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action='na.pass')$y)
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action=na.pass)$y)
## (NAs are still omitted)

## Even more frustrating...
## Why don't any of these work???
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action='na.pass')$x)
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action=na.pass)$x)
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action='na.omit')$x)
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action=na.omit)$x)


## This does work, but in my real data set, I want NA to really be NA
for(j in 1:4)
    dat[is.na(dat[,j]),j] = 'NA'
sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
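
A possible alternative to recoding NA as the string 'NA', sketched under the assumption that aggregate() only drops rows whose grouping values are NA and does not drop rows that belong to an explicit NA factor level. It uses addNA() from base R; the names by_na and agg are illustrative only.

```r
# Self-contained copy of the example data (before any recoding of NA).
set.seed(100)
dat <- data.frame(
  x1 = sample(c(NA, "m", "f"), 100, replace = TRUE),
  x2 = sample(c(NA, 1:10), 100, replace = TRUE),
  x3 = sample(c(NA, letters[1:5]), 100, replace = TRUE),
  x4 = sample(c(NA, TRUE, FALSE), 100, replace = TRUE),
  y  = sample(c(rep(NA, 5), rnorm(95))))

# Turn each grouping column into a factor that has NA as an explicit level.
by_na <- lapply(dat[, 1:4], addNA)

# Aggregate with the NA-aware grouping factors.
agg <- aggregate(dat$y, by_na, sum, na.rm = TRUE)

# Compare the aggregated total with the total taken directly from dat.
all.equal(sum(agg$x), sum(dat$y, na.rm = TRUE))
```

The original columns of dat are left untouched, so NA is still NA in the data; only the grouping copies in by_na carry the extra level.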


## My first session info
#
#> sessionInfo()
#R version 2.12.0 (2010-10-15)
#Platform: i386-pc-mingw32/i386 (32-bit)
#
#locale:
#[1] LC_COLLATE=English_United States.1252
#[2] LC_CTYPE=English_United States.1252
#[3] LC_MONETARY=English_United States.1252
#[4] LC_NUMERIC=C
#[5] LC_TIME=English_United States.1252
#
#attached base packages:
#[1] stats     graphics  grDevices utils     datasets  methods   base
#
#other attached packages:
#[1] plyr_1.2.1  zoo_1.6-4   gdata_2.8.1 rj_0.5.0-5
#
#loaded via a namespace (and not attached):
#[1] grid_2.12.0     gtools_2.6.2    lattice_0.19-13 rJava_0.8-8
#[5] tools_2.12.0



I tried running that example in a different version of R, and I got
completely different results.

The other version of R wouldn't recognize the formula at all.

My other version of R:

#  My second session info
#> sessionInfo()
#R version 2.10.1 (2009-12-14)
#i386-pc-mingw32
#
#locale:
#[1] LC_COLLATE=English_United States.1252
#[2] LC_CTYPE=English_United States.1252
#[3] LC_MONETARY=English_United States.1252
#[4] LC_NUMERIC=C
#[5] LC_TIME=English_United States.1252
#
#attached base packages:
#[1] stats     graphics  grDevices utils     datasets  methods   base

PS: Also, I have read the help on aggregate, factor, as.factor, and several
other topics.  If I missed something, please let me know.
Some people like to reply to questions by telling the sender that R has
documentation.  Please don't.  The R help archives are littered with
reminders, friendly and otherwise, of R's documentation.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.