Very helpful and thank you so much!
On Wed, Feb 17, 2021 at 12:50 PM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>
> On 17/02/2021 9:50 a.m., Val wrote:
> > HI All,
> >
> > I am reading a data file which has different date formats. I wanted to
> > standardize to one format and used a library anytime but got
> > undesired results as shown below. It gave me year 2093 instead of 1993
> >
> >
> > library(anytime)
> > DFX<-read.table(text="name ddate
> > A 19-10-02
> > D 11/19/2006
> > F 9/9/2011
> > G1 12/29/2010
> > AA 10/18/93 ",header=TRUE)
> > getFormats()
> > addFormats(c("%d-%m-%y"))
> > addFormats(c("%m-%d-%y"))
> > addFormats(c("%Y/%d/%m"))
> > addFormats(c("%m/%d/%y"))
> >
> > DFX$anew=anydate(DFX$ddate)
> >
> > Output
> >   name      ddate       anew
> > 1    A   19-10-02 2002-10-19
> > 2    D 11/19/2006 2020-11-19
> > 3    F   9/9/2011 2011-09-09
> > 4   G1 12/29/2010 2020-12-29
> > 5   AA   10/18/93 2093-10-18
> >
> > The problem is in the last row. It should be 1993-10-18 instead of
> > 2093-10-18
> >
> > How do I correct this?
>
> This looks a little tricky. The basic idea is that the %y format has to
> guess at the century, but the guess depends on things specific to your
> system. So what would be nice is to say "two digit dates should be
> assumed to fall between 1922 and 2021", but there's no way to do that
> directly.
>
> What you could do is recognize when you have a two digit year, and then
> force the result into the range you want. Here's a function that does
> that, but it's not really tested much at all, so be careful if you use
> it. (One thing: I recommend the 'useR = TRUE' option to anydate(); it
> worked better in my tests than the default.)
>
> adjustCentury <- function(inputString,
>                           outputDate = anydate(inputString, useR = TRUE),
>                           start = "1922-01-01") {
>
>    start <- as.Date(start)
>
>    twodigityear <- !grepl("[[:digit:]]{4}", inputString)
>
>    while (length(bad <- which(twodigityear & outputDate < start))) {
>      for (i in bad) {
>        longdate <- as.POSIXlt(outputDate[i])
>        longdate$year <- longdate$year + 100
>        outputDate[i] <- as.Date(longdate)
>      }
>    }
>    longdate <- as.POSIXlt(start)
>    longdate$year <- longdate$year + 100
>    finish <- as.Date(longdate)
>
>    while (length(bad <- which(twodigityear & outputDate >= finish))) {
>      for (i in bad) {
>        longdate <- as.POSIXlt(outputDate[i])
>        longdate$year <- longdate$year - 100
>        outputDate[i] <- as.Date(longdate)
>      }
>    }
>    outputDate
> }
>
> library(anytime)
> DFX<-read.table(text="name ddate
> A 19-10-02
> D 11/19/2006
> F 9/9/2011
> G1 12/29/2010
> AA 10/18/93
> BB 10/18/1893
> CC 10/18/2093",header=TRUE)
>
> addFormats(c("%d-%m-%y"))
> addFormats(c("%m-%d-%y"))
> addFormats(c("%Y/%d/%m"))
> addFormats(c("%m/%d/%y"))
>
> DFX$anew=adjustCentury(DFX$ddate, start = "1921-01-01")
> DFX
> #>   name      ddate       anew
> #> 1    A   19-10-02 2019-10-02
> #> 2    D 11/19/2006 2006-11-19
> #> 3    F   9/9/2011 2011-09-09
> #> 4   G1 12/29/2010 2010-12-29
> #> 5   AA   10/18/93 1993-10-18
> #> 6   BB 10/18/1893 1893-10-18
> #> 7   CC 10/18/2093 2093-10-18
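
For anyone who finds this thread later: the "force the result into the range you want" idea can also be sketched in base R alone, without the loops. This is only an illustration under some assumptions, not the function posted above. The name window_two_digit_year and its first_year argument are made up here, it handles a single known format rather than anytime's format guessing, and it windows on whole calendar years rather than on an exact start date. It should only be applied to strings already known to carry a two-digit year.

```r
# Hypothetical helper, not part of anytime or of the function quoted above.
# Parses strings with a two-digit year and moves each result into the
# 100-year window [first_year, first_year + 99].
window_two_digit_year <- function(x, format = "%m/%d/%y", first_year = 1922) {
  d  <- as.Date(x, format = format)   # %y makes some guess at the century
  lt <- as.POSIXlt(d)
  yr <- lt$year + 1900                # POSIXlt stores years since 1900
  # Keep the last two digits, but place the year inside the chosen window.
  yr <- first_year + (yr - first_year) %% 100
  lt$year <- yr - 1900
  as.Date(lt)
}

window_two_digit_year("10/18/93")
#> [1] "1993-10-18"
window_two_digit_year(c("10/18/21", "10/18/22"))
#> [1] "2021-10-18" "1922-10-18"
```

Because the adjustment is done modulo 100, the result does not depend on which century the parser guessed in the first place. Only the chosen window matters.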