[ https://issues.apache.org/jira/browse/ARROW-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539387#comment-17539387 ]
Antoine Pitrou commented on ARROW-16596:
----------------------------------------

Also, by reading the code, the vendored implementation we use on Windows (which was copied from the musl libc) probably shows the same lenient behaviour.

> [C++] Add a strptime option to control the cutoff between 1900 and 2000 when %y
> --------------------------------------------------------------------------------
>
>                 Key: ARROW-16596
>                 URL: https://issues.apache.org/jira/browse/ARROW-16596
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, R
>    Affects Versions: 8.0.0
>            Reporter: Dragoș Moldovan-Grünfeld
>            Priority: Major
>
> When parsing a string with a two-digit year ({{%y}}) to a datetime,
> it would be great if we could control the cutoff point between 1900
> and 2000. Currently it is implicitly set to 68:
> {code:r}
> library(arrow, warn.conflicts = FALSE)
> a <- Array$create(c("68-05-17", "69-05-17"))
> call_function("strptime", a, options = list(format = "%y-%m-%d", unit = 0L))
> #> Array
> #> <timestamp[s]>
> #> [
> #>   2068-05-17 00:00:00,
> #>   1969-05-17 00:00:00
> #> ]
> {code}
> For example, lubridate names this argument {{cutoff_2000}} (e.g. for
> {{fast_strptime()}}). It works as follows:
> {code:r}
> library(lubridate, warn.conflicts = FALSE)
> dates_vector <- c("68-05-17", "69-05-17", "55-05-17")
> fast_strptime(dates_vector, format = "%y-%m-%d")
> #> [1] "2068-05-17 UTC" "1969-05-17 UTC" "2055-05-17 UTC"
> fast_strptime(dates_vector, format = "%y-%m-%d", cutoff_2000 = 50)
> #> [1] "1968-05-17 UTC" "1969-05-17 UTC" "1955-05-17 UTC"
> fast_strptime(dates_vector, format = "%y-%m-%d", cutoff_2000 = 70)
> #> [1] "2068-05-17 UTC" "2069-05-17 UTC" "2055-05-17 UTC"
> {code}
> In the {{lubridate::fast_strptime()}} documentation it is described as
> follows:
> {quote}cutoff_2000
> integer. For y format, two-digit numbers smaller or equal to cutoff_2000 are
> parsed as though starting with 20, otherwise parsed as though starting with
> 19. {-}Available only for functions relying on lubridates internal parser{-}.
> {quote}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
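
For illustration, the cutoff rule quoted from the lubridate documentation can be sketched in C++ as below. This is only a minimal sketch: {{ExpandTwoDigitYear}} and its {{cutoff_2000}} parameter are hypothetical names chosen here, not part of Arrow's API or of any existing strptime option. The default of 68 mirrors the implicit cutoff shown in the issue.

```cpp
#include <stdexcept>

// Hypothetical helper (not an Arrow function): expand a two-digit %y value
// into a full year using a lubridate-style cutoff. Values less than or equal
// to cutoff_2000 map to 20yy, all other values map to 19yy.
int ExpandTwoDigitYear(int yy, int cutoff_2000 = 68) {
  if (yy < 0 || yy > 99) {
    throw std::invalid_argument("two-digit year must be in [0, 99]");
  }
  const int century = (yy <= cutoff_2000) ? 2000 : 1900;
  return century + yy;
}
```

With the default cutoff, 68 maps to 2068 and 69 to 1969, which matches the Arrow output in the issue. With a cutoff of 50, the values 68, 69 and 55 all fall in the 1900s, as in the second lubridate call.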