jonkeane commented on a change in pull request #12506:
URL: https://github.com/apache/arrow/pull/12506#discussion_r833487664



##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -189,6 +189,65 @@ register_bindings_datetime <- function() {
   })
 }
 
+register_bindings_duration <- function() {
+  register_binding("difftime", function(time1,
+                                        time2,
+                                        tz,
+                                        units = c("auto", "secs", "mins",
+                                                  "hours", "days", "weeks")) {
+    units <- match.arg(units)
+    if (units != "secs") {
+      abort("`difftime()` with units other than seconds not supported in 
Arrow")
+    }
+
+    if (!missing(tz)) {
+      warn("`tz` is an optional argument to `difftime()` in R and will not be 
used in Arrow")
+    }
+
+    time1 <- build_expr("cast", time1, options = cast_options(to_type = 
timestamp()))
+    time2 <- build_expr("cast", time2, options = cast_options(to_type = 
timestamp()))
+
+    build_expr("cast", time1 - time2, options = cast_options(to_type = 
duration("s")))
+  })
+
+  register_binding("as.difftime", function(x,
+                                           format = "%X",
+                                           units = "auto") {
+    # windows doesn't seem to like "%X"
+    if (format == "%X" & tolower(Sys.info()[["sysname"]]) == "windows") {
+      format <- "%H:%M:%S"
+    }
+
+    if (units != "secs") {
+      abort("`as.difftime()` with units other than seconds not supported in 
Arrow")
+    }
+
+    if (call_binding("is.character", x)) {
+      x <- build_expr("strptime", x, options = list(format = format, unit = 
0L))
+      y <- build_expr("strptime", "0:0:0", options = list(format = "%H:%M:%S", 
unit = 0L))
+      diff_x_y <- call_binding("difftime", x, y, units = "secs")
+      return(diff_x_y)
+    }

Review comment:
       Have you seen 
https://issues.apache.org/jira/browse/ARROW-15996?focusedCommentId=17511259&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17511259
 ? 
   
   Would the following work?
   ```
   # strptime, convert to time64
   x <- build_expr("strptime", x, options = list(format = format, unit = 
0L))$cast(time64("us"))
   # Complex casting only due to cast type restrictions: time64 -> int64 -> 
duration(us) -> duration(s)
   x <- x$cast(int64())$cast(duration("us"))$cast(duration("s"))
   ```
   
   Yes, it does involve a series of casts (though most of those should be 
pretty cheap since it's mostly integers being moved around. 
   
   The nice thing about this is that it's conceptually isolated: going from a 
"%H:%M:%S" to a time64 to a duration. Where as (3) requires a few assumptions 
about how `strptime()` fills in date details (if you check this it's actually 
totally different in Arrow and R!) and then discards them. And agree that (1) 
is not ideal: `difftime` in R is definitely more like Arrow's `duration` than 
`time32|64()` objects. 
   
   And there are a number of ways this sequence of casts could be simplified 
(all the examples below use the array: `arr <- Array$create("01:00:00")`) 
either in ARROW-15996 or elsewhere: 
   * being able to go from `time64` to `duration` directly: 
       `call_function("strptime", arr, options = list(format = "%H:%M:%S", unit 
= 0L))$cast(time64("us"))$cast(duration("s"))`
   * being able to go from `int32` to duration, then we could do something 
like: 
       `call_function("strptime", arr, options = list(format = "%H:%M:%S", unit 
= 0L))$cast(time32("s"))$cast(int32())$cast(duration("s"))`
   * Probably a few more...
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to