jonkeane commented on a change in pull request #12357:
URL: https://github.com/apache/arrow/pull/12357#discussion_r808311898
##########
File path: r/tests/testthat/_snaps/dplyr-funcs-datetime.md
##########
@@ -0,0 +1,46 @@
+# extract tz
+
+ Code
+ compare_dplyr_binding(.input %>% mutate(timezone_posixct_date =
tz(posixct_date),
+ timezone_char_date = tz(char_date)) %>% collect(), df)
+ Error <expectation_failure>
+ `via_batch <- rlang::eval_tidy(expr,
rlang::new_data_mask(rlang::env(.input = record_batch(tbl))))` threw an
unexpected warning.
+ Message: In tz(char_date), timezone extraction for objects of class
`string` not supported in Arrow; pulling data into R
+ Class: simpleWarning/warning/condition
+
+---
+
+ Code
+ compare_dplyr_binding(.input %>% mutate(timezone_posixct_date =
tz(posixct_date),
+ timezone_date_date = tz(date_date)) %>% collect(), df)
+ Error <expectation_failure>
+ `via_batch <- rlang::eval_tidy(expr,
rlang::new_data_mask(rlang::env(.input = record_batch(tbl))))` threw an
unexpected warning.
+ Message: In tz(date_date), timezone extraction for objects of class
`date32[day]` not supported in Arrow; pulling data into R
+ Class: simpleWarning/warning/condition
+
+---
+
+ Code
+ compare_dplyr_binding(.input %>% mutate(timezone_posixct_date =
tz(posixct_date),
+ timezone_integer_date = tz(integer_date)) %>% collect(), df)
+ Warning <simpleWarning>
+ tz(): Don't know how to compute timezone for object of class integer;
returning "UTC". This warning will become an error in the next major version of
lubridate.
+ tz(): Don't know how to compute timezone for object of class integer;
returning "UTC". This warning will become an error in the next major version of
lubridate.
+ Error <expectation_failure>
+ `via_batch <- rlang::eval_tidy(expr,
rlang::new_data_mask(rlang::env(.input = record_batch(tbl))))` threw an
unexpected warning.
+ Message: In tz(integer_date), timezone extraction for objects of class
`int32` not supported in Arrow; pulling data into R
+ Class: simpleWarning/warning/condition
+
+---
+
+ Code
+ compare_dplyr_binding(.input %>% mutate(timezone_posixct_date =
tz(posixct_date),
+ timezone_double_date = tz(double_date)) %>% collect(), df)
+ Warning <simpleWarning>
+ tz(): Don't know how to compute timezone for object of class numeric;
returning "UTC". This warning will become an error in the next major version of
lubridate.
+ tz(): Don't know how to compute timezone for object of class numeric;
returning "UTC". This warning will become an error in the next major version of
lubridate.
+ Error <expectation_failure>
+ `via_batch <- rlang::eval_tidy(expr,
rlang::new_data_mask(rlang::env(.input = record_batch(tbl))))` threw an
unexpected warning.
+ Message: In tz(double_date), timezone extraction for objects of class
`double` not supported in Arrow; pulling data into R
+ Class: simpleWarning/warning/condition
+
Review comment:
This file isn't needed using `expect_error()`. I'll let you decide if
you prefer `expect_error()` like I sent (mostly out of convenience—it's what I
had written out to test myself) or switch those `expect_error()`s into one
`expect_snapshot()`
##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -711,3 +711,34 @@ test_that("am/pm mirror lubridate", {
)
})
+
+test_that("extract tz", {
+ df <- tibble(
+ posixct_date = as.POSIXct(c("2022-02-07", "2022-02-10"), tz =
"Pacific/Marquesas"),
+ )
+
+ compare_dplyr_binding(
+ .input %>%
+ mutate(timezone_posixct_date = tz(posixct_date)) %>%
+ collect(),
+ df
+ )
+
+ expect_error(
+ call_binding("tz", Expression$scalar("2020-10-01")),
+ "timezone extraction for objects of class `string` not supported in Arrow"
+ )
Review comment:
The bit that I think you were missing is `Expressoin$scalar(...)`. I
found this by looking around to find similar calls in the tests. Though there
aren't any that are exactly the same, I found
https://github.com/apache/arrow/blob/3a8e409385c8455e6c80b867c5730965a501d113/r/tests/testthat/test-dplyr-funcs-datetime.R#L113-L120
which lead me to look at `Expression`:
https://github.com/apache/arrow/blob/3a8e409385c8455e6c80b867c5730965a501d113/r/R/expression.R#L188-L192
##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -147,5 +147,16 @@ register_bindings_datetime <- function() {
register_binding("pm", function(x) {
!call_binding("am", x)
})
+ register_binding("tz", function(x) {
+ if (!call_binding("is.POSIXct", x)) {
Review comment:
This line isn't quite right, actually. We want to use something like
`!is.POSIXct(x)` that will be compatible with either R objects or Arrow types.
I know there is at least one place that we test if a thing is a timestamp like
that. Could you see if you can find it + see what that uses to get this working?
That _should_ allow one to then `arrow:::call_binding("tz", "2020-10-01")`
and get an error directly.
##########
File path: r/R/dplyr-funcs-datetime.R
##########
@@ -147,5 +147,16 @@ register_bindings_datetime <- function() {
register_binding("pm", function(x) {
!call_binding("am", x)
})
+ register_binding("tz", function(x) {
+ if (!call_binding("is.POSIXct", x)) {
+ if (inherits(x, "Expression")) {
+ class <- x$type()$ToString()
+ } else {
+ class <- type(x)
+ }
+ abort(paste0("timezone extraction for objects of class `", class, "` not
supported in Arrow"))
+ }
Review comment:
I had to make a few changes here to make sure that this works with both
R objects and Arrow objects. I only discovered this in my own testing. We
probably won't see many pure R objects here, but it's possible!
We don't (AFAIK) have a unified way to get types that would work from R
objects and Arrow objects, but it might be worth looking for one + making a
jira if we don't have one already.
##########
File path: r/tests/testthat/test-dplyr-funcs-datetime.R
##########
@@ -711,3 +711,34 @@ test_that("am/pm mirror lubridate", {
)
})
+
+test_that("extract tz", {
+ df <- tibble(
+ posixct_date = as.POSIXct(c("2022-02-07", "2022-02-10"), tz =
"Pacific/Marquesas"),
Review comment:
I cleaned up the types that we aren't running through the dplyr pipeline
here — we can add them in if/when we want to test that mechanism.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]