[ https://issues.apache.org/jira/browse/ARROW-13588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charlie Gao updated ARROW-13588:
--------------------------------
    Summary: [R] Empty character attributes not stored  (was: R empty character attributes not stored)

> [R] Empty character attributes not stored
> -----------------------------------------
>
>                 Key: ARROW-13588
>                 URL: https://issues.apache.org/jira/browse/ARROW-13588
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 5.0.0
>         Environment: Ubuntu 20.04 R 4.1 release
>            Reporter: Charlie Gao
>            Priority: Minor
>
> I have come across an issue while incorporating arrow into a package I
> develop.
> Date-times in the POSIXct format have a 'tzone' attribute that is set by
> default to "", an empty string (a length-one character vector, not NULL),
> when created.
> This attribute is not stored in the Arrow feather file. When the file is
> read back, the original and restored data frames are not identical, as shown
> in the reprex below.
> I assume this is not the intended behaviour? My current workaround is to
> check, when reading back, whether the tzone attribute exists and to set it
> to the empty string if it does not.
> Just to confirm, this is not an issue when the attribute is non-empty: it
> gets stored correctly.
> Thanks.
> ``` r
> dates <- as.POSIXct(c("2020-01-01", "2020-01-02", "2020-01-02"))
> attributes(dates)
> #> $class
> #> [1] "POSIXct" "POSIXt"
> #>
> #> $tzone
> #> [1] ""
> values <- c(1:3)
> original <- data.frame(dates, values)
> original
> #>        dates values
> #> 1 2020-01-01      1
> #> 2 2020-01-02      2
> #> 3 2020-01-02      3
> tempfile <- tempfile()
> arrow::write_feather(original, tempfile)
> restored <- arrow::read_feather(tempfile)
> identical(original, restored)
> #> [1] FALSE
> waldo::compare(original, restored)
> #> `attr(old$dates, 'tzone')` is a character vector ('')
> #> `attr(new$dates, 'tzone')` is absent
> unlink(tempfile)
> ```
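
The workaround mentioned in the description (re-adding the empty 'tzone' attribute after reading) could look roughly like the sketch below. It is only an illustration under stated assumptions: `restore_empty_tzone()` is a hypothetical helper name, not part of arrow, and the example simulates the dropped attribute on a plain data frame instead of round-tripping through a feather file.

``` r
# Hypothetical helper (not an arrow function): for every POSIXct column
# whose 'tzone' attribute is absent, set it back to the empty string.
restore_empty_tzone <- function(df) {
  for (name in names(df)) {
    column <- df[[name]]
    if (inherits(column, "POSIXct") && is.null(attr(column, "tzone"))) {
      attr(column, "tzone") <- ""
      df[[name]] <- column
    }
  }
  df
}

# Simulate the state after reading back: the 'tzone' attribute is missing.
dates <- as.POSIXct(c("2020-01-01", "2020-01-02"))
restored <- data.frame(dates)
attr(restored$dates, "tzone") <- NULL

fixed <- restore_empty_tzone(restored)
attr(fixed$dates, "tzone")
#> [1] ""
```

In practice the helper would presumably be applied to the result of `arrow::read_feather()`. It leaves columns that already carry a 'tzone' attribute untouched, so non-empty time zones are not overwritten.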