jonkeane commented on pull request #10445:
URL: https://github.com/apache/arrow/pull/10445#issuecomment-858983224


   Ok, I've run some benchmarks on this branch and I'm seeing a huge speed up 
for floats + integers with `as.vector(array)`. 🎉 
   
   It might be out of scope for this PR, but chunked arrays don't see a similar 
speed up (which makes sense given they call `ArrayVector__as_vector` directly 
rather than routing through `Array__as_vector`, so they aren't using ALTREP). 
I can't quite tell from the C++ whether `Table__to_dataframe` would _just 
work_ with ALTREP once ChunkedArrays used it, or whether we would need to do 
more to facilitate that.
   
   
   ``` r
   library(arrow, warn.conflicts = FALSE)
   
   x <- 1:1e3 + 1L
   v <- Array$create(x)
   x1 <- v$as_vector()  
   .Internal(inspect(x1))
   #> @7f9077f5a1a8 13 INTSXP g0c0 [REF(65535)] std::shared_ptr<arrow::Array, int32, NONULL> (len=1000, ptr=0x7f90975a9a08)
   
   
   v_chunked <- ChunkedArray$create(x)
   x2 <- v_chunked$as_vector()  
   .Internal(inspect(x2))
   #> @7f908312c000 13 INTSXP g0c7 [REF(2)] (len=1000, tl=0) 2,3,4,5,6,...
   ```
   
   <sup>Created on 2021-06-10 by the [reprex 
package](https://reprex.tidyverse.org) (v2.0.0)</sup>
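   
   For a quick local check of the difference between the two paths, a minimal 
sketch along these lines should do. It assumes only base R's `system.time()` 
and the same `Array$create()` / `ChunkedArray$create()` calls as above; the 
variable names are illustrative, the vector size is arbitrary, and a single 
`system.time()` run is much coarser than the arrowbench results.
   
   ```r
   library(arrow, warn.conflicts = FALSE)
   
   # Illustrative input: one million doubles with no missing values
   x_dbl <- runif(1e6)
   
   arr <- Array$create(x_dbl)
   chunked <- ChunkedArray$create(x_dbl)
   
   # Time one conversion per path; results vary by machine and by run
   t_arr <- system.time(v_arr <- as.vector(arr))
   t_chunked <- system.time(v_chunked <- as.vector(chunked))
   
   print(t_arr)
   print(t_chunked)
   
   # Both paths should return the same values regardless of how they are backed
   stopifnot(identical(v_arr, v_chunked))
   ```
   
   Running `.Internal(inspect())` on `v_arr` and `v_chunked` shows which of 
the two is backed by the Arrow buffer, as in the reprex above.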
   
   arrowbench results (using the new benchmarks in 
https://github.com/ursacomputing/arrowbench/pull/28): 
   
[zero-copy-data-conversion.html.zip](https://github.com/apache/arrow/files/6633992/zero-copy-data-conversion.html.zip)



