On Mon, 18 Jul 2005, Peter Dalgaard wrote:

> Prof Brian Ripley <[EMAIL PROTECTED]> writes:
>
>> On Mon, 18 Jul 2005, Peter Dalgaard wrote:
>>
>>> Chuck Cleland <[EMAIL PROTECTED]> writes:
>>>
>>>>> data <- as.data.frame(cbind(X1,X2,X3,X4,X5))
>>>>>
>>>>> So only X1, X3 and X5 are vars without any NAs and there are some vars
>>>>> (X2 and X4 stacked in between that have NAs). Now, how can I extract those
>>>>> former vars in a new dataset or remove all those latter vars in between
>>>>> that have NAs (without missing a single row)?
>>>>> ...
>>>>
>>>> Someone else will probably suggest something more elegant, but how
>>>> about this:
>>>>
>>>> newdata <- data[,-which(apply(data, 2, function(x){all(is.na(x))}))]
>>>
>>> (I think that's supposed to be any(), not all(), and which() is
>>> crossing the creek to fetch water.)
>>>
>>> This should do it:
>>>
>>> data[,apply(!is.na(data),2,all)]
>>
>> If `data' is a data frame, apply will coerce it to a matrix.
>
> So will is.na()...

Not quite. is.na on a data frame will create a matrix by cbind-ing
columns. I was mainly commenting on Chuck Cleland's version, which coerces
a data frame to a matrix then pulls out each column of the matrix,
something that is quite wasteful of space. Forming the logical matrix
is.na(data) is also I think wasteful.

>> I would do
>> something like
>>
>> keep <- sapply(data, function(x) all(!is.na(x)))
>> data[keep]
>>
>> to use the list-like structure of a data frame and make the fewest
>> possible copies.
>
> I think the amount of copying is the same, but your version doesn't
> need to store the entire is.na(data) at once.
>
> Nitpick: !any(is.na(x)) should be marginally faster than all(!is.na(x)).

I doubt it is measurably so.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
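
For anyone reading this thread later, here is a minimal sketch of the column-wise approach discussed in this message. The example data frame is made up purely for illustration (the original poster's X1..X5 were not shown), and the variable names `keep` and `newdata` are only conventions taken from the thread.

```r
# Hypothetical example data: X2 and X4 contain NAs, the other columns do not.
data <- data.frame(
  X1 = 1:4,
  X2 = c(1, NA, 3, 4),
  X3 = letters[1:4],
  X4 = c(NA, NA, 2.5, 1),
  X5 = c(TRUE, FALSE, TRUE, TRUE)
)

# Treat the data frame as a list of columns: test each column on its own,
# so the full logical matrix is.na(data) is never formed.
keep <- sapply(data, function(x) !any(is.na(x)))

# List-style indexing keeps the selected columns and every row.
newdata <- data[keep]

names(newdata)
```

The matrix-based one-liner `data[, apply(!is.na(data), 2, all)]` should select the same columns here; the difference under discussion is only how much intermediate storage each version needs.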