[GitHub] [arrow] jorgecarleitao commented on pull request #9271: ARROW-11300: [Rust][DataFusion] Further performance improvements on hash aggregation with small groups

2021-01-26 Thread GitBox
jorgecarleitao commented on pull request #9271: URL: https://github.com/apache/arrow/pull/9271#issuecomment-768014037 Thanks @Dandandan. fmt missing :)

[GitHub] [arrow] jorgecarleitao commented on pull request #9271: ARROW-11300: [Rust][DataFusion] Further performance improvements on hash aggregation with small groups

2021-01-22 Thread GitBox
jorgecarleitao commented on pull request #9271: URL: https://github.com/apache/arrow/pull/9271#issuecomment-765865323 Thanks a lot for your points. I am learning a lot! :) Note that for small arrays, we are basically in the metadata problem on which the "payload size" of transmitting

[GitHub] [arrow] jorgecarleitao commented on pull request #9271: ARROW-11300: [Rust][DataFusion] Further performance improvements on hash aggregation with small groups

2021-01-21 Thread GitBox
jorgecarleitao commented on pull request #9271: URL: https://github.com/apache/arrow/pull/9271#issuecomment-765211936 Isn't the data contained on a buffer `Arc`ed? I.e. `Vec::clone()` should be cheap, no?
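
A minimal sketch of the point raised in this comment, using only the Rust standard library. `BufferRef` and `clone_shares_buffers` are hypothetical names chosen for illustration; they are not Arrow or DataFusion types. Assuming the columns are held behind `Arc`, cloning the `Vec` copies only the handles and bumps reference counts, while the underlying data stays shared:

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a reference-counted column buffer
/// (an assumption for this sketch, not the Arrow type).
type BufferRef = Arc<Vec<u8>>;

/// Clones the slice of handles and checks that every clone still
/// points at the same allocation as its original.
fn clone_shares_buffers(columns: &[BufferRef]) -> bool {
    let cloned: Vec<BufferRef> = columns.to_vec();
    columns
        .iter()
        .zip(cloned.iter())
        .all(|(a, b)| Arc::ptr_eq(a, b))
}

fn main() {
    let columns: Vec<BufferRef> = vec![Arc::new(vec![1, 2, 3]), Arc::new(vec![4, 5])];

    // Cloning the Vec allocates a new Vec of handles, not new data buffers.
    let cloned = columns.clone();
    assert!(Arc::ptr_eq(&columns[0], &cloned[0]));
    assert_eq!(Arc::strong_count(&columns[0]), 2);

    println!("buffers shared after clone: {}", clone_shares_buffers(&columns));
}
```

The clone is still not free: it allocates a new `Vec` and performs one atomic increment per element, which may matter when the arrays themselves are very small.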

[GitHub] [arrow] jorgecarleitao commented on pull request #9271: ARROW-11300: [Rust][DataFusion] Further performance improvements on hash aggregation with small groups [WIP]

2021-01-20 Thread GitBox
jorgecarleitao commented on pull request #9271: URL: https://github.com/apache/arrow/pull/9271#issuecomment-763529613 Yes, slicing is suboptimal atm. Also, IMO it should not be the `Array` to implement that method, but each implementation individually. I haven't touched that part yet, though