I am sure there are many good ways to do the task including taking the 
data.frame out into a list of data.frames and making the change to each by 
taking the nth row that matches nrow(it) and changing it and then recombining.

What follows are several attempts leading up to one at the end I find is 
probably the best choice.

I did the following sample demo using the dplyr package in the tidyverse but 
want to explain. My data was three small groups of 1 then 2 then 3. The second 
column in each had the same number as the group and it was unique for that 
group. If the last item can be a duplicate of another item, this method changes 
too much:

library(dplyr)

mydf <-
  tribble(
    ~grouper, ~val,
    1, 1,
    2, 1,
    2, 2,
    3, 1,
    3, 2,
    3, 3,
  )

mydf %>% group_by(grouper) %>% mutate(val2 = last(val), 
val=ifelse(val==val2,0,val))

The result is this:

> mydf %>% group_by(grouper) %>% mutate(val2 = last(val), 
> val=ifelse(val==val2,0,val))
# A tibble: 6 x 3
# Groups:   grouper [3]
grouper   val  val2
<dbl> <dbl> <dbl>
  1       1     0     1
2       2     1     2
3       2     0     2
4       3     1     3
5       3     2     3
6       3     0     3

Now obviously this introduced an extra temporary row called val2, which is 
easily removed by many methods like piping to select(-val2) ...

But that is not needed as a shorter and more direct method is this:

mydf %>% 
  group_by(grouper) %>% 
  mutate(val = ifelse(val==last(val), 
                      0, 
                      val))

But some more research shows the helper functions that make this trivial.

Recall you wanted the last row in each group altered, I think to have an NA in 
column. I used 0 above but can use NA just as easily or any constant. The 
functions are:

n() gives the number of rows in the group.
row_number() gives the number of the current row as the functionality is being 
applied, within that group. The condition being offered is that n() == 
row_number() so this version surgically changes just the last rows no matter 
what other rows contain.

mydf %>% 
  group_by(grouper) %>% 
  mutate(val = ifelse(row_number() == n(), 
                      0, 
                      val))

If you have no interest in using a package like this, someone else will likely 
point you to a way. I suspect using something like split() to make a list of 
data.frames then applying some functionality to each smaller data.frame to get 
the result then recombining it back.


-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Kai Yang via R-help
Sent: Thursday, September 9, 2021 3:00 PM
To: R-help Mailing List <r-help@r-project.org>
Subject: [R] how to find "first" or "last" record after sort in R

Hello List,
Please look at the sample data frame below:

ID         date1              date2             date3
1    2015-10-08    2015-12-17    2015-07-23

2    2016-01-16    NA                 2015-10-08
3    2016-08-01    NA                 2017-01-10
3    2017-01-10    NA                 2016-01-16
4    2016-01-19    2016-02-24   2016-08-01
5    2016-03-01    2016-03-10   2016-01-19 This data frame was sorted by ID and 
date1. I need to set the column date3 as missing for the "last" record for each 
ID. In the sample data set, the ID 1, 2, 4 and 5 has one row only, so they can 
be consider as first and last records. the data3 can be set as missing. But the 
ID 3 has 2 rows. Since I sorted the data by ID and date1, the ID=3 and 
date1=2017-01-10 should be the last record only. I need to set date3=NA for 
this row only.

the question is, how can I identify the "last" record and set it as NA in date3 
column.
Thank you,
Kai
        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to