Hi all, I'm having trouble selecting rows to delete, and I can't seem to get past it.
Below is some sample data. I am trying to dedup the data by user and, at the same time, by timestamp (at the side I have marked the rows I expect to be removed). I've looked at the lag function but can't seem to make it work. My logic was to flag the duplicates with an ifelse statement and then remove the flagged rows, but it doesn't seem to work. Any help appreciated.

Let's call the data test:

    test$lag <- ifelse(test$user_id==lag(test$user_id) & test$timestamp==lag(test$timestamp),1,0)

Can anyone help on this?

Mike

      Source_type           timestamp      user_id
75381           0 07-07-2008-21:03:55 848307909687
75379           1 07-07-2008-19:52:55 848307838407
75380           2 07-07-2008-19:54:14 848307838407
75378           1 07-07-2008-15:24:01 848285633277
75374           1 07-07-2008-13:39:17 848273633667
75377           2 07-07-2008-13:39:55 848273633667
75376           2 07-07-2008-13:39:55 848273633667   Remove
75375           2 07-07-2008-13:56:05 848273633667
75373           1 07-07-2008-17:11:00 848272661427
75371           1 07-07-2008-13:19:00 848270431847
75372           2 07-07-2008-13:19:14 848270431847
75369           1 07-07-2008-12:49:16 848269676907   Remove
75370           2 07-07-2008-12:49:16 848269676907
75366           1 07-07-2008-13:29:15 848263484847
75368           2 07-07-2008-13:29:44 848263484847

Thanks in advance

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
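
One possible approach, offered as a sketch rather than a definitive answer: as far as I know, lag() in base R (package stats) only shifts the time base of a series and leaves the values of a plain vector where they are, so comparing a column with its own lag() matches on every row. The sketch below uses duplicated() on the user_id and timestamp columns instead. It rebuilds only six rows of the sample data, and it assumes the leading numbers in the post are row names and that user_id can be held as character; both are assumptions, as are the object names dup, deduped and same_as_prev.

```r
# Six rows copied from the sample data; the leading numbers are
# assumed to be row names, and user_id is stored as character.
test <- data.frame(
  Source_type = c(1, 2, 2, 2, 1, 2),
  timestamp   = c("07-07-2008-13:39:17", "07-07-2008-13:39:55",
                  "07-07-2008-13:39:55", "07-07-2008-13:56:05",
                  "07-07-2008-12:49:16", "07-07-2008-12:49:16"),
  user_id     = c("848273633667", "848273633667", "848273633667",
                  "848273633667", "848269676907", "848269676907"),
  row.names   = c("75374", "75377", "75376", "75375", "75369", "75370"),
  stringsAsFactors = FALSE
)

# duplicated() flags a row when its (user_id, timestamp) pair has
# already appeared further up, so the first row of each pair is kept.
dup <- duplicated(test[, c("user_id", "timestamp")])
deduped <- test[!dup, ]

# Scanning from the bottom keeps the last row of each pair instead.
dup_last <- duplicated(test[, c("user_id", "timestamp")], fromLast = TRUE)
deduped_last <- test[!dup_last, ]

# Closer to the original idea: compare each row with the row directly
# above it by shifting the columns by hand rather than with lag().
n <- nrow(test)
same_as_prev <- c(FALSE,
                  test$user_id[-1] == test$user_id[-n] &
                  test$timestamp[-1] == test$timestamp[-n])

print(deduped)
```

Which row of each duplicated pair is dropped depends on fromLast, so it is worth checking the result against the rows marked Remove in the data. The hand-shifted comparison only catches duplicates that sit on adjacent rows, whereas duplicated() finds them anywhere in the data frame.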