Hi,

Thanks for the function. My mistake: after looking at the CSV file, it seems that the NA values have to be filled not only from the previous non-NA value but, in some rows, from the next non-NA value. Example:

| NCQ05 | 11.395 |
| NCQ05 | 11.395 |
| NCQ05 |        |
| NCQ06 |        |
| NCQ06 | 13     |
| NCQ06 | 13     |

If I use the function, both blank rows would be filled with 11.395, instead of 11.395 for NCQ05 and 13 for NCQ06. Does that mean the function can be modified like this?

  locf2 <- function(x, initial=NA, IS_BAD = is.na) {
      # Replace 'bad' values in 'x' with last previous non-bad value.
      # If no previous non-bad value, replace with 'initial'.
      stopifnot(is.function(IS_BAD))
      good <- !IS_BAD(x)
      stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
      i <- cumsum(good)
      x <- x[c(1,which(good))][i+1]
      x <- x[c(1,which(good))][i+2]
      x[i==0] <- initial
      x
  }

On Thursday, 16 March 2017, 1:17, William Dunlap <wdun...@tibco.com> wrote:

You could use the following function

  locf2 <- function(x, initial=NA, IS_BAD = is.na) {
      # Replace 'bad' values in 'x' with last previous non-bad value.
      # If no previous non-bad value, replace with 'initial'.
      stopifnot(is.function(IS_BAD))
      good <- !IS_BAD(x)
      stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
      i <- cumsum(good)
      x <- x[c(1,which(good))][i+1]
      x[i==0] <- initial
      x
  }

as in

  > locf2(c("", "A", "B", "", "", "C", ""), IS_BAD=function(x)x=="", initial="---")
  [1] "---" "A"   "B"   "B"   "B"   "C"   "C"
  > locf2(factor(c(NA,"Small","Medium",NA,"Large",NA,NA,NA,"Small")))
  [1] <NA>   Small  Medium Medium Large  Large  Large  Large  Small
  Levels: Large Medium Small
  > locf2(c(12, NA, 10, 11, NA, NA))
  [1] 12 12 10 11 11 11

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 15, 2017 at 4:08 AM, Allan Tanaka <allantanak...@yahoo.com> wrote:
> The following is an example:
>
> | Item_Identifier | Item_Weight |
> | FDP10           | 19          |
> | FDP10           |             |
> | DRI11           | 8.26        |
> | DRI11           |             |
> | FDW12           | 8.315       |
> | FDW12           |             |
>
>
> The following is what I want, that is, NA values filled from the
> previous non-NA values:
>
> | Item_Identifier | Item_Weight |
> | FDP10           | 19          |
> | FDP10           | 19          |
> | DRI11           | 8.26        |
> | DRI11           | 8.26        |
> | FDW12           | 8.315       |
> | FDW12           | 8.315       |
>
>
> My current code to read the data frame:
>
>   train <- read.csv("Train.csv", header=T,sep = ",",na.strings = c(""," ",NA))
>
>
> Some people suggest using the na.locf function, but in my case the values
> in my Item_Identifier column are not unique numeric values; they are
> character strings. I am not sure how to solve this problem.
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
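
A minimal sketch of one way to get the fill in both directions asked about at the top of this thread, not a definitive answer. It assumes that a blank Item_Weight should take a value from the same Item_Identifier, from the previous row if there is one and otherwise from the next row. It reuses locf2() exactly as Bill Dunlap posted it; the helper name fill_both() and the small data frame d are hypothetical, made up for illustration from the NCQ05/NCQ06 example. Note that adding a second indexing line with i+2 would not do this: it would move every element, including the good ones, to the following non-NA value (for c(12, NA, 10) it would return 10 10 NA).

```r
# locf2() as posted by Bill Dunlap in this thread (unchanged apart from spacing).
locf2 <- function(x, initial = NA, IS_BAD = is.na) {
  # Replace 'bad' values in 'x' with last previous non-bad value.
  # If no previous non-bad value, replace with 'initial'.
  stopifnot(is.function(IS_BAD))
  good <- !IS_BAD(x)
  stopifnot(is.logical(good), length(good) == length(x), !anyNA(good))
  i <- cumsum(good)
  x <- x[c(1, which(good))][i + 1]
  x[i == 0] <- initial
  x
}

# Hypothetical helper (not part of the original posts):
# fill forward first, then fill whatever is still missing backward
# by running locf2() on the reversed vector.
fill_both <- function(x) {
  x <- locf2(x)
  rev(locf2(rev(x)))
}

# Illustrative data built from the NCQ05/NCQ06 example in this thread.
d <- data.frame(
  Item_Identifier = c("NCQ05", "NCQ05", "NCQ05", "NCQ06", "NCQ06", "NCQ06"),
  Item_Weight     = c(11.395, 11.395, NA, NA, 13, 13)
)

# ave() applies fill_both() separately within each Item_Identifier,
# so a value is never carried across from one identifier to another.
d$Item_Weight <- ave(d$Item_Weight, d$Item_Identifier, FUN = fill_both)
print(d)
```

With this, the blank NCQ05 row becomes 11.395 and the blank NCQ06 row becomes 13. Grouping with ave() is a design choice under the assumption above; if an identifier has no weight at all, its rows stay NA.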