dragosmg commented on a change in pull request #12240:
URL: https://github.com/apache/arrow/pull/12240#discussion_r791548170
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -985,3 +985,14 @@ test_that("Array to C-interface", {
delete_arrow_schema(schema_ptr)
delete_arrow_array(array_ptr)
})
+
+test_that("Array coverts timestamps with missing timezone /assumed local tz
correctly", {
+ withr::with_envvar(c(TZ = "America/Chicago"), {
+ a <- as.POSIXct("1970-01-01 00:00:00")
+ attr(a, "tzone") <- Sys.getenv("TZ")
Review comment:
I've added that. But wouldn't those arrays be equal in their absolute
value (without the `"tzone"` medatadata), but equally wrong (because they are
both shifting the timestamp by the same amount) at the same time?
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -985,3 +985,14 @@ test_that("Array to C-interface", {
delete_arrow_schema(schema_ptr)
delete_arrow_array(array_ptr)
})
+
+test_that("Array coverts timestamps with missing timezone /assumed local tz
correctly", {
+ withr::with_envvar(c(TZ = "America/Chicago"), {
+ a <- as.POSIXct("1970-01-01 00:00:00")
+ attr(a, "tzone") <- Sys.getenv("TZ")
Review comment:
In my mind there is some inconsistency in the Jira ticket:
> Set the timezone to local time without changing the integer value fo the
timestamp. We store whatever integer R passes to us (21600), with CST as the
timezone set. Display is then "1970-01-01 00:00:00 CST"
If we store the integer that R passes to us (i.e. the _absolute value_ = the
number of seconds from the epoch) then the display will never be `"1970-01-01
00:00:00 CST"` in arrow, but rather the UTC date-time corresponding to the
_absolute value_ with the desired timezone label, which I think it would be
terribly confusing.
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -985,3 +985,14 @@ test_that("Array to C-interface", {
delete_arrow_schema(schema_ptr)
delete_arrow_array(array_ptr)
})
+
+test_that("Array coverts timestamps with missing timezone /assumed local tz
correctly", {
+ withr::with_envvar(c(TZ = "America/Chicago"), {
+ a <- as.POSIXct("1970-01-01 00:00:00")
+ attr(a, "tzone") <- Sys.getenv("TZ")
Review comment:
No, I think if we were to offset the display in Arrow (which I think is
the solution), then the timestamp would no longer be confusing.
##########
File path: r/tests/testthat/test-Array.R
##########
@@ -985,3 +985,14 @@ test_that("Array to C-interface", {
delete_arrow_schema(schema_ptr)
delete_arrow_array(array_ptr)
})
+
+test_that("Array coverts timestamps with missing timezone /assumed local tz
correctly", {
+ withr::with_envvar(c(TZ = "America/Chicago"), {
+ a <- as.POSIXct("1970-01-01 00:00:00")
+ attr(a, "tzone") <- Sys.getenv("TZ")
Review comment:
👍🏻 I think we are all on the same page. In this situation what is the
solution? I think we can attach the local / system timezone when it isn't
passed explicitly (and this would theoretically solve this Jira issue). Should
we then open a follow-up issue regarding the printing of arrow timestamps in R
(which falls outside the scope of this ticket)? Which would be an alignment
with [ARROW-14567](https://issues.apache.org/jira/browse/ARROW-14567)?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]