Dandandan commented on issue #6892:
URL: 
https://github.com/apache/arrow-datafusion/issues/6892#issuecomment-1634956646

   Yes, that is the idea.
   * I found that the existing `Single` aggregation mode already does what we 
want (the whole aggregation in one go), so there is no need to create a new one.
   
   * I ran some experiments that skip the `Partial` stage based on a heuristic 
(e.g. for tables up to a certain number of columns), but the results are mixed:
   ```
   ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   ┃ Query        ┃ fast_gby_hash ┃ aggregate_partition_mode ┃        Change ┃
   ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   │ Query 1      │      194.12ms │                 194.38ms │     no change │
   │ Query 2      │       36.59ms │                  30.54ms │ +1.20x faster │
   │ Query 3      │       44.99ms │                  45.69ms │     no change │
   │ Query 4      │       37.34ms │                  37.88ms │     no change │
   │ Query 5      │       91.69ms │                  90.49ms │     no change │
   │ Query 6      │       10.27ms │                  10.25ms │     no change │
   │ Query 7      │      193.05ms │                 190.58ms │     no change │
   │ Query 8      │       69.37ms │                  69.39ms │     no change │
   │ Query 9      │      132.29ms │                 132.95ms │     no change │
   │ Query 10     │       91.51ms │                  90.86ms │     no change │
   │ Query 11     │       40.53ms │                  39.73ms │     no change │
   │ Query 12     │       67.70ms │                  66.71ms │     no change │
   │ Query 13     │      130.96ms │                 132.62ms │     no change │
   │ Query 14     │       11.87ms │                  12.10ms │     no change │
   │ Query 15     │       14.80ms │                  19.92ms │  1.35x slower │
   │ Query 16     │       37.79ms │                  37.07ms │     no change │
   │ Query 17     │      210.67ms │                 209.54ms │     no change │
   │ Query 18     │      315.60ms │                 381.18ms │  1.21x slower │
   │ Query 19     │       57.40ms │                  57.59ms │     no change │
   │ Query 20     │       70.88ms │                  58.71ms │ +1.21x faster │
   │ Query 21     │      248.35ms │                 252.92ms │     no change │
   │ Query 22     │       28.11ms │                  27.87ms │     no change │
   └──────────────┴───────────────┴──────────────────────────┴───────────────┘
   ```
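
For readers who want the shape of that heuristic, here is a rough, self-contained sketch. It is not the code from the experiment: `AggMode` is a local stand-in rather than DataFusion's own type, and `plan_aggregation` and its `max_columns_for_single` threshold are hypothetical names chosen for illustration.

```rust
/// Local stand-in for the aggregation modes mentioned above
/// (assumption: this is not DataFusion's actual enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AggMode {
    /// First stage: pre-aggregate each input partition.
    Partial,
    /// Second stage: merge the partial states.
    Final,
    /// Do the whole aggregation in one go.
    Single,
}

/// Hypothetical heuristic: for inputs with at most `max_columns_for_single`
/// columns, skip the `Partial` stage and aggregate in `Single` mode;
/// otherwise keep the usual two-stage plan.
pub fn plan_aggregation(num_columns: usize, max_columns_for_single: usize) -> Vec<AggMode> {
    if num_columns <= max_columns_for_single {
        vec![AggMode::Single]
    } else {
        vec![AggMode::Partial, AggMode::Final]
    }
}
```

A static rule like this cannot see how much the `Partial` stage actually reduces the data, which would be consistent with the mixed results in the table.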
   
   I have higher hopes for 
https://github.com/apache/arrow-datafusion/issues/6937, which I think might be 
similar to the "adaptive partial aggregation" of Snowflake / Teradata.
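
One possible reading of "adaptive partial aggregation" is sketched below: aggregate partially for a probe window, then fall back to passing rows straight through if grouping is barely reducing the row count. This is an assumption about the general idea, not a description of the linked issue or of any vendor's implementation; `AdaptivePartialSum`, `probe_rows`, and `max_group_ratio` are hypothetical names.

```rust
use std::collections::HashMap;

/// Partial SUM per key that can switch itself off (hypothetical sketch).
pub struct AdaptivePartialSum {
    groups: HashMap<u64, i64>,
    rows_seen: usize,
    /// Number of rows to observe before deciding.
    probe_rows: usize,
    /// Switch to pass-through if groups / rows exceeds this ratio.
    max_group_ratio: f64,
    passthrough: bool,
}

impl AdaptivePartialSum {
    pub fn new(probe_rows: usize, max_group_ratio: f64) -> Self {
        Self {
            groups: HashMap::new(),
            rows_seen: 0,
            probe_rows,
            max_group_ratio,
            passthrough: false,
        }
    }

    /// Feed one row. Returns `Some(row)` when the row is forwarded
    /// unaggregated, or `None` when it was folded into the partial state.
    pub fn push(&mut self, key: u64, value: i64) -> Option<(u64, i64)> {
        if self.passthrough {
            return Some((key, value));
        }
        *self.groups.entry(key).or_insert(0) += value;
        self.rows_seen += 1;
        // Decide once, at the end of the probe window.
        if self.rows_seen == self.probe_rows {
            let ratio = self.groups.len() as f64 / self.rows_seen as f64;
            if ratio > self.max_group_ratio {
                self.passthrough = true;
            }
        }
        None
    }

    pub fn is_passthrough(&self) -> bool {
        self.passthrough
    }

    /// Emit the partial sums accumulated so far, for the final stage to merge.
    pub fn finish(self) -> Vec<(u64, i64)> {
        self.groups.into_iter().collect()
    }
}
```

Because sums can be merged, the final stage can combine the partial states from `finish` with any forwarded rows and still produce the same result.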


