Hi all,

I have a data frame such as:

  1 blue  0.3
  1 NA    0.4
  1 red   NA
  2 blue  NA
  2 green NA
  2 blue  NA
  3 red   0.5
  3 blue  NA
  3 NA    1.1

I wish to find the last non-missing value of each column within every triple of rows (rows sharing the same value in the first column), i.e. I want a 3 by 3 data.frame such as:

  1 red  0.4
  2 blue NA
  3 blue 1.1

I have written a little script:

  data = structure(list(V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
      V2 = structure(c(1L, NA, 3L, 1L, 2L, 1L, 3L, 1L, NA),
          .Label = c("blue", "green", "red"), class = "factor"),
      V3 = c(0.3, 0.4, NA, NA, NA, NA, 0.5, NA, 1.1)),
      .Names = c("V1", "V2", "V3"), class = "data.frame",
      row.names = c(NA, -9L))

  # last non-missing element of a vector
  cl = function(x) x[max(which(!is.na(x)))]
  # apply cl within each group defined by the first column of data
  choose.last = function(x) tapply(x, data[,1], cl)

  # now function choose.last works properly on numeric vectors:
  > choose.last(data[,3])
    1   2   3
  0.4  NA 1.1

  # but not on factors (I lose the factor labels):
  > choose.last(data[,2])
  1 2 3
  3 1 1

  # moreover, if I apply this function to the whole data.frame
  # the output is a character matrix
  > apply(data,2,choose.last)
    V1  V2     V3
  1 "1" "red"  "0.4"
  2 "2" "blue" NA
  3 "3" "blue" "1.1"

  # and if I sapply, I lose the factor labels
  > sapply(data,choose.last)
    V1 V2  V3
  1  1  3 0.4
  2  2  1  NA
  3  3  1 1.1

Any hint?

Thanks in advance,

Patrizio

+-------------------------------------------------
| Patrizio Frederic, PhD
| Research associate in Statistics,
| Department of Economics,
| University of Modena and Reggio Emilia,
| Via Berengario 51,
| 41100 Modena, Italy
|
| tel: +39 059 205 6727
| fax: +39 059 205 6947
| mail: [EMAIL PROTECTED]
+-------------------------------------------------
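
One direction that might keep the factor labels, given here only as a minimal sketch and not as part of the script above: instead of letting tapply combine the per-group values, work out which row holds the last non-missing value in each group and subset every column by those row positions, so a factor is indexed as a factor. The helper names last_idx and choose_last_col are hypothetical, and the data frame is simply rebuilt from the example in this post.

```r
# Rebuild the example data from the post.
data <- data.frame(
  V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  V2 = factor(c("blue", NA, "red", "blue", "green", "blue", "red", "blue", NA)),
  V3 = c(0.3, 0.4, NA, NA, NA, NA, 0.5, NA, 1.1)
)

# Position (within x) of the last non-missing element,
# or NA if every element is missing. (Hypothetical helper.)
last_idx <- function(x) {
  ok <- which(!is.na(x))
  if (length(ok) == 0L) NA_integer_ else ok[length(ok)]
}

# For one column, find the row of the last non-missing value in each
# group, then subset the original column by those rows. Subsetting a
# factor returns a factor, so the labels survive. (Hypothetical helper.)
choose_last_col <- function(col, group) {
  rows <- split(seq_along(col), group)
  pick <- vapply(rows, function(i) i[last_idx(col[i])], integer(1))
  col[pick]
}

# Apply column by column and reassemble into a data frame.
result <- as.data.frame(lapply(data, choose_last_col, group = data$V1))
print(result)
```

With the example data this should give one row per group, with V2 still a factor; a group whose values are all missing (V3 in group 2) comes back as NA. Each column is handled independently, so the values in one result row may come from different original rows, which is what the desired output above asks for.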