[ https://issues.apache.org/jira/browse/ARROW-18242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629101#comment-17629101 ]

Lucas Mation commented on ARROW-18242:
--------------------------------------

ok, same error as before

arrow_table(x = '00001976') %>% mutate(y = dmy(x)) %>% collect()
# A tibble: 1 x 2
  x        y
  <chr>    <date>
1 00001976 1975-11-30

> [R] arrow implementation of lubridate::dmy parses invalid date "00001976" as date
> ---------------------------------------------------------------------------------
>
>                 Key: ARROW-18242
>                 URL: https://issues.apache.org/jira/browse/ARROW-18242
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Lucas Mation
>            Priority: Critical
>
> Sorry for so many issues, but I think this is another bug.
> Wrong behavior of the arrow implementation of the `lubridate::dmy`.
> An invalid date such as '00001976' is being parsed as a valid (and completely
> unrelated) date.
>
> #in R
> '00001976' %>% dmy
> [1] NA
> Warning message:
> All formats failed to parse. No formats found.
>
> #In arrow
> q <- data.table(x=c('00001976','30111976','01011976'))
> q %>% write_dataset('q')
> q2 <- 'q' %>% open_dataset %>% mutate(x2=dmy) %>% collect
> q2
>             x
> 1: 1975-11-30
> 2: 1976-11-30
> 3: 1976-01-01
>
> #notice '00001976' is an invalid date. First row of x2 should be NA!!!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)