[ 
https://issues.apache.org/jira/browse/ARROW-17132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569030#comment-17569030
 ] 

Neal Richardson commented on ARROW-17132:
-----------------------------------------

Right, {{transmute()}} drops the input columns, so that would work around this. 
Note that this isn't about {{mutate()}} but rather the R <-> Arrow conversion, 
and/or how R deals with time zones, timezone-naive data, localized data, or 
something along those lines. 

{code}
> expect_identical(as.data.frame(arrow_table(df)), df)
Error: as.data.frame(arrow_table(df)) (`actual`) not identical to `df` 
(`expected`).

`attr(actual$time, 'tzone')` is a character vector ('America/Los_Angeles')
`attr(expected$time, 'tzone')` is absent
{code}

I'm sure if we keep pulling on this, we'll end up back at some issue we've 
worked on before: R treats timestamps with no time zone as local time, but 
Arrow reads them as UTC, so we have to incorporate time zone information 
when converting from R to Arrow.

{code}
> attributes(as.data.frame(arrow_table(df))$time)
$class
[1] "POSIXct" "POSIXt" 

$tzone
[1] "America/Los_Angeles"

> attributes(df$time)
$class
[1] "POSIXct" "POSIXt" 
{code}
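
To isolate what the comparison is tripping on, here is a minimal base-R sketch that does not involve arrow at all. The variable names ({{naive}}, {{tagged}}) are made up for illustration, and the epoch value is just 1999-12-31 00:00:00 UTC written as seconds. It only shows that a {{tzone}} attribute by itself is enough to make two otherwise equal POSIXct values non-identical, which appears to be the difference reported above.

```r
# A POSIXct value built directly from seconds since the epoch, with no
# tzone attribute (hypothetical example value: 1999-12-31 00:00:00 UTC).
naive <- structure(946598400, class = c("POSIXct", "POSIXt"))

# The same instant, but carrying an explicit tzone attribute, similar to
# what came back from the Arrow round trip in the output above.
tagged <- naive
attr(tagged, "tzone") <- "America/Los_Angeles"

# The underlying numeric values are equal ...
same_instant <- as.numeric(naive) == as.numeric(tagged)

# ... but identical() also compares attributes, so the objects differ.
same_object <- identical(naive, tagged)

print(same_instant)
print(same_object)
```

If that is all that differs, a test could presumably compare after setting or dropping {{tzone}} on both sides, though whether that is the right fix depends on how the conversion is meant to behave.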


> [R] Mutate in compare_dplyr_binding returns wrong type
> ------------------------------------------------------
>
>                 Key: ARROW-17132
>                 URL: https://issues.apache.org/jira/browse/ARROW-17132
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>            Reporter: Rok Mihevc
>            Priority: Minor
>              Labels: test
>
> The following:
> {code:r}
> df <- tibble::tibble(
>   time = as.POSIXct(seq(as.Date("1999-12-31", tz = "UTC"),
>     as.Date("2001-01-01", tz = "UTC"), by = "day"))
> )
> compare_dplyr_binding(
>   .input %>%
>     mutate(x = yday(time)) %>%
>     collect(),
>   df
> )
> {code}
> Fails with:
> {code:bash}
> Failure (test-dplyr-funcs-datetime.R:574:3): extract wday from timestamp
> `object` (`actual`) not equal to `expected` (`expected`).
> `attr(actual$time, 'tzone')` is a character vector ('UTC')
> `attr(expected$time, 'tzone')` is absent
> Backtrace:
>  1. arrow:::compare_dplyr_binding(...)
>       at test-dplyr-funcs-datetime.R:574:2
>  2. arrow:::expect_equal(via_batch, expected, ...)
>       at tests/testthat/helper-expectation.R:115:4
>  3. testthat::expect_equal(...)
>       at tests/testthat/helper-expectation.R:42:4
> {code}
> This also happens for {{qday}} and probably other functions where the input 
> is temporal and the output is numeric.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
