[GitHub] [arrow-datafusion] Dandandan commented on pull request #68: Experimenting with arrow2

GitBox Fri, 17 Sep 2021 09:21:54 -0700


Dandandan commented on pull request #68:
URL: https://github.com/apache/arrow-datafusion/pull/68#issuecomment-921923054



   I got one query running on a patched version of DataFusion and the benchmark, 
using SF=10 and a Parquet data source (split evenly into 16 partitions):
   
   | Query   |      Arrow      |  Arrow2 |
   |----------|:-------------:|------:|
   | 6 |  883.02 | 1061.64 |
   
   As you can see, this query runs about 20% slower than the arrow-based version. 
I haven't done any profiling or checking yet, but it could be that some primitives, 
such as hashing, are faster in the master branch right now.
   
   If https://github.com/jorgecarleitao/arrow2/issues/418 is implemented,
   I can probably get some more numbers.
   


