jonkeane commented on a change in pull request #12353:
URL: https://github.com/apache/arrow/pull/12353#discussion_r802006466
##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -148,4 +148,56 @@ register_bindings_datetime <- function() {
!call_binding("am", x)
})
+ register_binding("ymd", function(x) {
+ format_map <-
+ list(
+ ymd_hyphen1 = "%Y-%m-%d",
+ ymd_hyphen2 = "%y-%m-%d",
+ ymd_hyphen3 = "%Y-%B-%d",
+ ymd_hyphen4 = "%y-%B-%d",
+ ymd_hyphen5 = "%Y-%b-%d",
+ ymd_hyphen6 = "%y-%b-%d",
Review comment:
Could we use just one set of six formats like this, and pre-process the
strings with a regex first, as was suggested in
https://issues.apache.org/jira/browse/ARROW-14471?focusedCommentId=17446011&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17446011
?
Something like:
```
format_map <- list(
  ymd_hyphen1 = "%Y-%m-%d",
  ymd_hyphen2 = "%y-%m-%d",
  ymd_hyphen3 = "%Y-%B-%d",
  ymd_hyphen4 = "%y-%B-%d",
  ymd_hyphen5 = "%Y-%b-%d",
  ymd_hyphen6 = "%y-%b-%d"
)

x <- gsub("[^A-Za-z0-9.]", "-", x)

call_binding(
  "coalesce",
  call_binding("strptime", x, format = format_map[[1]], unit = "s"),
  call_binding("strptime", x, format = format_map[[2]], unit = "s"),
  call_binding("strptime", x, format = format_map[[3]], unit = "s"),
  call_binding("strptime", x, format = format_map[[4]], unit = "s"),
  call_binding("strptime", x, format = format_map[[5]], unit = "s"),
  call_binding("strptime", x, format = format_map[[6]], unit = "s")
)
```
You might need to use `call_binding("gsub", ...)` rather than calling
`gsub()` directly.
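To make the idea concrete outside of Arrow, here is a minimal base-R sketch
of the same "normalize separators, then take the first format that parses"
approach. It is only an illustration: `parse_ymd_sketch` is a hypothetical
name, it uses `as.Date()` rather than the Arrow `strptime` binding, and the
loop stands in for `coalesce`, so behavior may differ from the real kernel.

```r
# Hypothetical helper (base R only); not the arrow binding itself.
parse_ymd_sketch <- function(x) {
  formats <- c(
    "%Y-%m-%d", "%y-%m-%d",
    "%Y-%B-%d", "%y-%B-%d",
    "%Y-%b-%d", "%y-%b-%d"
  )

  # Replace anything that is not a letter, digit, or period with "-",
  # so only the hyphenated variants of each format need to be tried.
  x <- gsub("[^A-Za-z0-9.]", "-", x)

  # Start with all-NA dates and fill each element with the first format
  # that parses it (a stand-in for coalesce over the strptime results).
  result <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    missing <- is.na(result)
    if (!any(missing)) break
    result[missing] <- as.Date(x[missing], format = fmt)
  }
  result
}

parse_ymd_sketch(c("2021/01/05", "2021 01 05", "not a date"))
```

Note that the order of the formats matters. In base R, for example, `%Y`
can accept a two-digit year on input, so an earlier format may claim a
string that a later one was meant for.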
##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -148,4 +148,56 @@ register_bindings_datetime <- function() {
!call_binding("am", x)
})
+ register_binding("ymd", function(x) {
+ format_map <-
+ list(
+ ymd_hyphen1 = "%Y-%m-%d",
+ ymd_hyphen2 = "%y-%m-%d",
+ ymd_hyphen3 = "%Y-%B-%d",
+ ymd_hyphen4 = "%y-%B-%d",
+ ymd_hyphen5 = "%Y-%b-%d",
+ ymd_hyphen6 = "%y-%b-%d",
Review comment:
This would at least cut down on the number of formats we need to try, and
it would _mostly_ give the right answers.