paleolimbot commented on code in PR #12817:
URL: https://github.com/apache/arrow/pull/12817#discussion_r854354212


##########
r/tests/testthat/helper-data.R:
##########
@@ -25,7 +25,7 @@ example_data <- tibble::tibble(
   fct = factor(letters[c(1:4, NA, NA, 7:10)])
 )
 
-example_with_metadata <- tibble::tibble(
+old_example_with_metadata <- tibble::tibble(

Review Comment:
   A previous version of this couldn't handle classed vectors. Somewhere in the 
middle, I made sure that if `type` was an `ExtensionType`, we went through S3 
dispatch. Thanks to this comment, I moved things around so that we can now 
handle anything where `vctrs::vec_is()` is `TRUE`, e.g., `structure("one", 
class = "special_string")`. That case was the reason for the rename: the new 
conversion scheme originally couldn't handle something with an S3 class that 
didn't have an `infer_type()`/`as_arrow_array()` method.
   
   There are limitations: for example, we can't put columns with S3 classes 
through the compute engine and do things like `nchar()`. But it does mean that 
we can roundtrip more things through the compute engine, because we are no 
longer relying on the table-level metadata. For example, we no longer need 
any special handling for `POSIXlt`.
   
   ``` r
   # remotes::install_github("apache/arrow/r#12817")
   library(arrow, warn.conflicts = FALSE)
   library(dplyr, warn.conflicts = FALSE)
   
   example_with_metadata <- tibble::tibble(
     a = structure("one", class = "special_string"),
     b = 2,
     c = tibble::tibble(
       c1 = structure("inner", extra_attr = "something"),
       c2 = 4,
       c3 = 50
     ),
     d = "four"
   )
   
   example_with_metadata %>% 
     as_arrow_table() %>% 
     dplyr::mutate(nchar(a))
   #> Error: NotImplemented: Function 'utf8_length' has no kernel matching input types (array[character(0)
   #> attr(,"class")
   #> [1] "special_string"])
   #> /Users/deweydunnington/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/expression.cc:340  call.function->DispatchBest(&descrs)
   
   # ...but roundtripping just works
   example_with_metadata %>% 
     as_arrow_table() %>% 
     dplyr::mutate(b = b * 2) %>% 
     dplyr::collect()
   #> # A tibble: 1 × 4
   #>   a              b c$c1    $c2   $c3 d    
   #> * <spcl_str> <dbl> <chr> <dbl> <dbl> <chr>
   #> 1 one            4 inner     4    50 four
   ```
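
   To make the distinction between the `a` and `c$c1` columns in the reprex 
above concrete, here is a minimal base R sketch (no arrow involved). It only 
shows which of the two values is subject to S3 dispatch. The `unclass()` step 
at the end is an assumption for illustration, one possible way to get a plain 
character vector that functions like `nchar()` can work on, and not something 
this PR does.

   ```r
   # Base R only; `a` and `c1` mirror the columns in the reprex above.
   a  <- structure("one", class = "special_string")
   c1 <- structure("inner", extra_attr = "something")

   # `a` carries a class attribute, so S3 dispatch applies to it.
   is.object(a)
   inherits(a, "special_string")

   # `c1` has an extra attribute but no class: still a plain character vector.
   is.object(c1)

   # Hypothetical workaround (an assumption, not part of the PR):
   # drop the class to get back a bare character vector.
   a_plain <- unclass(a)
   nchar(a_plain)
   ```

   Dropping the class this way also discards the information needed to 
restore `special_string` on the way back, so it trades the roundtrip for 
compute support.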


