I often use the following function

   is.true <- function(x) !is.na(x) & x

and, less often,

   is.false <- function(x) !is.na(x) & !x

to report whether each element of a logical vector is TRUE (not FALSE or NA) or FALSE (not TRUE or NA), respectively.
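As a quick check of how these two helpers treat NA, here is a minimal sketch. The vectors v and age are made up for illustration and do not come from the thread.

```r
# The two helpers, repeated so the snippet runs on its own.
is.true  <- function(x) !is.na(x) & x
is.false <- function(x) !is.na(x) & !x

# Hypothetical logical vector with one missing value.
v <- c(TRUE, FALSE, NA)

is.true(v)    # TRUE FALSE FALSE
is.false(v)   # FALSE TRUE FALSE

# Hypothetical numeric vector, to show the effect when indexing with "[".
age <- c(5, 20, NA, 9)

age[age < 12]            # 5 NA 9  (the NA comparison selects an NA element)
age[is.true(age < 12)]   # 5 9     (the NA comparison counts as FALSE)
```

Both helpers always return a logical vector with no NAs, so the result is safe to use as an index for subsetting or for assignment.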
Evaluate your complicated logical expression and apply is.true() to the result before passing it to "[", and you will get what subset() gives you without subset()'s nonstandard argument evaluation. (The latter is handy for interactive use and often painful in general-purpose functions.)

E.g., change your

   (mydataframeName$myvariableName > 2 & !is.na(mydataframeName$myvariableName)) &
   (mydataframeName$myotherVariableName == "male" & !is.na(mydataframeName$myotherVariableName))

to

   is.true(mydataframeName$myvariableName > 2 & mydataframeName$myotherVariableName == "male")

Don't confuse this is.true() with the entirely different base::isTRUE(), which reports whether its argument is identical to TRUE (length 1, no names or other attributes).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Timothy Bates
> Sent: Tuesday, September 13, 2011 2:18 PM
> To: Hadley Wickham; Duncan Murdoch
> Cc: R list
> Subject: Re: [R] x %>% y as an alternative to which( x > y)
>
> Dear Duncan and Hadley,
>
> I stumbled across the NA behavior of subset a little while ago and thought it might do the trick. But
> my common usage case is not getting a subsetting sans NAs, but setting values in the whole dataframe.
>
> So I need T/F at each row, not just the list of rows that match the subset of matching cases...
>
> How would you do this with subset?
>
> data[data$YOB < 1908 & !is.na(data$YOB), "Age"]=NA
>
> My %<% idea extends the vocabulary established by %in%, and works in the same grammatical situation.
>
> here's a real example
>
> # Fix missing T2 sex for same sex pairs...
>
> twinData[twinData$Age %<% 12, "flynnEffect"] = FALSE # only set flynn F for people under 12, not inc NAs
>
> Addressing Duncan's point about returning a logical array... the %<% function should be:
>
> "%<%" <- function(table, x){
>   lessThan = table < x
>   lessThan[is.na(lessThan)] = FALSE
>   return(lessThan)
> }
>
> This also works for matrices as it should
>
> > x = matrix(c(1:10,NA,12:20),nrow=2)
> > x %<% 6
>      [,1] [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10]
> [1,] TRUE TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> [2,] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>
>
> On Sep 13, 2011, at 8:40 PM, Hadley Wickham wrote:
>
> >> Because in coding, I often end up with big chunks looking like this:
> >>
> >> ((mydataframeName$myvariableName > 2 & !is.na(mydataframeName$myvariableName)) &
> >> (mydataframeName$myotherVariableName == "male" & !is.na(mydataframeName$myotherVariableName)))
> >>
> >> Which is much less readable/maintainable/editable than
> >>
> >> mydataframeName$myvariableName > 2 & mydataframeName$myotherVariableName == "male"
> >
> > Use subset:
> >
> > subset(mydataframeName, myvariableName > 2 & myotherVariableName == "male")
> >
> > (subset automatically treats NAs as false)
> >
> > Hadley
> >
> > --
> > Assistant Professor / Dobelman Family Junior Chair
> > Department of Statistics / Rice University
> > http://had.co.nz/
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.