Sorry, I left toto$Date in the function inside lapply() instead of x$Date! Now it works:

> toto
  Num         Date Place  X  Y
1   1  1/1/04 0:48    x1  1  1
2   2  1/1/04 8:02    x1 NA NA
3   4  1/1/04 1:55    x4  3  7
4   3  1/1/04 2:14    x3  2  9
5   4  1/1/04 1:19    x4  3  7
6   4  1/1/04 1:02    x4  3  7
7   5 1/1/04 11:15    x5  6  8
8   5  1/1/04 9:06    x5  6  8
9   5 1/1/04 10:32    x5  6  8
> toto <- do.call("rbind", lapply(split(toto, toto$Num),
+     function(x) x[which.min(as.POSIXct(strptime(x$Date, "%d/%m/%y %H:%M"))),]))
> toto
  Num        Date Place  X  Y
1   1 1/1/04 0:48    x1  1  1
2   2 1/1/04 8:02    x1 NA NA
3   3 1/1/04 2:14    x3  2  9
4   4 1/1/04 1:02    x4  3  7
5   5 1/1/04 9:06    x5  6  8
> toto <- merge(toto[1:3], unique(na.omit(toto[3:5])),by="Place",all.x=T)
> toto
  Place Num        Date X Y
1    x1   1 1/1/04 0:48 1 1
2    x1   2 1/1/04 8:02 1 1
3    x3   3 1/1/04 2:14 2 9
4    x4   4 1/1/04 1:02 3 7
5    x5   5 1/1/04 9:06 6 8

Florent Bonneu wrote:
> Thank you very much for your help, but I think there is an error in
> the answer to the first problem. I spent time searching for the
> solution but failed to find it. I tried putting "which.max" instead
> of "which.min", but it doesn't work. I did my best, but I didn't
> have any idea how to solve this problem.
>
> An example:
>
> Num <- c(1,2,4,3,4,4,5,5,5)
> Date <- c("1/1/04 0:48","1/1/04 8:02", "1/1/04 1:55", "1/1/04 2:14",
> "1/1/04 1:19", "1/1/04 1:02", "1/1/04 11:15", "1/1/04 9:06",
> "1/1/04 10:32")
> Place <- c("x1","x1","x4","x3","x4","x4","x5","x5","x5")
> X <- c(1,NA,3,2,3,3,6,6,6)
> Y <- c(1,NA,7,9,7,7,8,8,8)
> toto <- data.frame(Num,Date,Place,X,Y)
> toto[order(toto$Num,as.numeric(as.POSIXct(strptime(toto$Date,
> "%d/%m/%y %H:%M")))),]
> toto <- merge(toto[1:3], unique(na.omit(toto[3:5])),by="Place",all.x=T)
> help <- do.call("rbind", lapply(split(toto, toto$Num),
> function(x) x[which.min(as.numeric(as.POSIXct(strptime(toto$Date,
> "%d/%m/%y %H:%M")))),]))
> help
>
> The solution must be
>
> Num <- c(1,2,3,4,5)
> Date <- c("1/1/04 0:48","1/1/04 8:02", "1/1/04 2:14", "1/1/04 1:02",
> "1/1/04 9:06")
> Place <- c("x1","x1","x3","x4","x5")
> X <- c(1,1,2,3,6)
> Y <- c(1,1,9,7,8)
> toto <- data.frame(Num,Date,Place,X,Y)
>
> Any suggestion is welcome.
>
> Florent Bonneu.
>
> Jacques VESLOT wrote:
>
>> OK! So try this:
>> merge(toto[1:3], unique(na.omit(toto[3:5])),by="Place",all.x=T)
>>
>> Florent Bonneu wrote:
>>
>>> Indeed,
>>> X <- c(1,NA,2,3,3,3,6,6)
>>> Y <- c(1,NA,9,7,7,7,8,8)
>>>
>>> I want to obtain one line for each Num. It's not a problem if there
>>> are several lines for the same place, because my identifier is Num.
>>> I just want to fill in X and Y from another line for the same
>>> place. For example, "Num=2" is at the place "x1", like "Num=1",
>>> but we don't have the coordinates X and Y for "Num=2". However,
>>> those coordinates are known for "Num=1", so I want to copy them
>>> into the X and Y columns of my "Num=2" line.
>>>
>>> Jacques VESLOT wrote:
>>>
>>>> something wrong in X and Y definitions...
>>>> but this could work:
>>>>
>>>> do.call("rbind", lapply(split(toto, toto$Num),
>>>> function(x) x[which.min(as.POSIXct(strptime(toto$Date,
>>>> "%d/%m/%y %H:%M"))),]))
>>>>
>>>> I don't understand the second query; do you want to keep the first
>>>> line when there are several lines for the same place?
>>>>
>>>> Florent Bonneu wrote:
>>>>
>>>>> I have two problems with the data processing of my large database
>>>>> (50000 rows). For example, a sample is as follows:
>>>>>
>>>>> Num <- c(1,2,3,4,4,4,5,5)
>>>>> Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55",
>>>>> "1/1/04 2:14", "1/1/04 3:09", "1/1/04 8:02", "1/1/04 9:05",
>>>>> "1/1/04 9:06")
>>>>> Place <- c("x1","x1","x3","x4","x4","x4","x5","x5")
>>>>> X <- c(1,"",2,3,3,3,6,6)
>>>>> Y <- c(1,"",9,7,7,7,8,8)
>>>>>
>>>>> toto <- data.frame(Num,Date,Place,X,Y)
>>>>>
>>>>> The first problem is to keep one line for each Num, the one with
>>>>> the “minimum” date. I managed to do it with loops, but I would
>>>>> like a solution without loops. It would be better for my large
>>>>> database.
>>>>>
>>>>> The other one is to fill in the missing coordinates. For
>>>>> example, for the same place “x1”, Num=2 doesn't have X and Y, but
>>>>> we have this information for Num=1.
>>>>>
>>>>> The example database must end up like this:
>>>>>
>>>>> Num <- c(1,2,3,4,5)
>>>>> Date <- c("1/1/04 0:48","1/1/04 1:52", "1/1/04 1:55",
>>>>> "1/1/04 2:14", "1/1/04 9:05")
>>>>> Place <- c("x1","x1","x3","x4","x5")
>>>>> X <- c(1,1,2,3,6)
>>>>> Y <- c(1,1,9,7,8)
>>>>>
>>>>> toto <- data.frame(Num,Date,Place,X,Y)
>>>>>
>>>>> Does somebody know how to do this?
>>>>> Thanks.
>>>>>
>>>>> Florent Bonneu
>>>>> Laboratoire de Statistique et Probabilités
>>>>> bureau 148 bât. 1R2
>>>>> Université Toulouse 3
>>>>> 118 route de Narbonne - 31062 Toulouse cedex 9
>>>>> [EMAIL PROTECTED]
>>>>>
>>>>> ______________________________________________
>>>>> R-help@stat.math.ethz.ch mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide!
>>>>> http://www.R-project.org/posting-guide.html
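
For a table of about 50000 rows, both steps discussed in this thread can also be written without calling a function per group. The following is only a sketch under a few assumptions, not the method used above: it sorts by Num and parsed date and keeps the first row of each Num with duplicated(), then fills X and Y by looking each Place up in a table of known coordinates with match(). The names when, sorted, earliest and known are made up for illustration, tz = "UTC" is an assumption (the thread gives no time zone), and each Place is assumed to have a single set of coordinates.

```r
# Sample data taken from the second example in the thread.
toto <- data.frame(
  Num   = c(1, 2, 4, 3, 4, 4, 5, 5, 5),
  Date  = c("1/1/04 0:48", "1/1/04 8:02", "1/1/04 1:55", "1/1/04 2:14",
            "1/1/04 1:19", "1/1/04 1:02", "1/1/04 11:15", "1/1/04 9:06",
            "1/1/04 10:32"),
  Place = c("x1", "x1", "x4", "x3", "x4", "x4", "x5", "x5", "x5"),
  X     = c(1, NA, 3, 2, 3, 3, 6, 6, 6),
  Y     = c(1, NA, 7, 9, 7, 7, 8, 8, 8),
  stringsAsFactors = FALSE
)

# Parse the dates once for the whole table.
# tz = "UTC" is an assumption; the thread does not state a time zone.
when <- as.POSIXct(strptime(toto$Date, "%d/%m/%y %H:%M", tz = "UTC"))

# Step 1: sort by Num, then by date, and keep the first row of each Num,
# i.e. the row with the earliest date.
sorted   <- toto[order(toto$Num, when), ]
earliest <- sorted[!duplicated(sorted$Num), ]

# Step 2: build a lookup table of known coordinates per Place from the
# full data, then copy X and Y into every kept row by Place.
# match() uses the first entry per Place (assumes one coordinate pair each).
known <- unique(na.omit(toto[c("Place", "X", "Y")]))
idx   <- match(earliest$Place, known$Place)
earliest$X <- known$X[idx]
earliest$Y <- known$Y[idx]

rownames(earliest) <- NULL
earliest
```

Building the lookup table from the full data rather than from the reduced table means a Place still gets its coordinates when they appear only in a row that step 1 drops. Unlike merge(), match() never adds rows, so the result keeps exactly one line per Num and stays in Num order.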