[ 
https://issues.apache.org/jira/browse/ARROW-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539387#comment-17539387
 ] 

Antoine Pitrou commented on ARROW-16596:
----------------------------------------

Also, from reading the code, the vendored strptime implementation we use on 
Windows (copied from musl libc) probably shows the same lenient behaviour.

> [C++] Add a strptime option to control the cutoff between 1900 and 2000 when 
> %y 
> --------------------------------------------------------------------------------
>
>                 Key: ARROW-16596
>                 URL: https://issues.apache.org/jira/browse/ARROW-16596
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, R
>    Affects Versions: 8.0.0
>            Reporter: Dragoș Moldovan-Grünfeld
>            Priority: Major
>
> When parsing a string with a two-digit year ({{%y}}) to a datetime, it would 
> be great to have control over the cutoff point between 1900 and 2000. 
> Currently it is implicitly set to 68:
> {code:r}
> library(arrow, warn.conflicts = FALSE)
> a <- Array$create(c("68-05-17", "69-05-17"))
> call_function("strptime", a, options = list(format = "%y-%m-%d", unit = 0L))
> #> Array
> #> <timestamp[s]>
> #> [
> #>   2068-05-17 00:00:00,
> #>   1969-05-17 00:00:00
> #> ]
> {code}
> For example, lubridate names this argument {{cutoff_2000}} (e.g. in 
> {{fast_strptime()}}). It works as follows:
> {code:r}
> library(lubridate, warn.conflicts = FALSE)
> dates_vector <- c("68-05-17", "69-05-17", "55-05-17")
> fast_strptime(dates_vector, format = "%y-%m-%d")
> #> [1] "2068-05-17 UTC" "1969-05-17 UTC" "2055-05-17 UTC"
> fast_strptime(dates_vector, format = "%y-%m-%d", cutoff_2000 = 50)
> #> [1] "1968-05-17 UTC" "1969-05-17 UTC" "1955-05-17 UTC"
> fast_strptime(dates_vector, format = "%y-%m-%d", cutoff_2000 = 70)
> #> [1] "2068-05-17 UTC" "2069-05-17 UTC" "2055-05-17 UTC"
> {code}
> In the {{lubridate::fast_strptime()}} documentation it is described as 
> follows:
> {quote}cutoff_2000 
> integer. For y format, two-digit numbers smaller or equal to cutoff_2000 are 
> parsed as though starting with 20, otherwise parsed as though starting with 
> 19. {-}Available only for functions relying on lubridates internal parser{-}.
> {quote}
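
The cutoff rule quoted above can be sketched in a few lines. This is only an
illustration of the described behaviour, not Arrow's or lubridate's actual
implementation; the function name expand_two_digit_year is hypothetical, and
the default of 68 mirrors the implicit cutoff shown in the issue.

```python
def expand_two_digit_year(yy: int, cutoff_2000: int = 68) -> int:
    """Expand a two-digit year using a lubridate-style cutoff (hypothetical helper).

    Values smaller than or equal to cutoff_2000 are read as 20yy,
    all other values as 19yy.
    """
    if not 0 <= yy <= 99:
        raise ValueError(f"expected a two-digit year, got {yy}")
    return 2000 + yy if yy <= cutoff_2000 else 1900 + yy


# Same inputs and cutoffs as the fast_strptime() calls in the issue.
two_digit_years = (68, 69, 55)
for cutoff in (68, 50, 70):
    expanded = [expand_two_digit_year(yy, cutoff) for yy in two_digit_years]
    print(f"cutoff_2000={cutoff}: {expanded}")
```

With a cutoff of 50 all three inputs land in the 1900s, and with a cutoff of
70 all three land in the 2000s, which matches the lubridate output in the
issue description.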



--
This message was sent by Atlassian Jira
(v8.20.7#820007)