Dandandan commented on issue #6892: URL: https://github.com/apache/arrow-datafusion/issues/6892#issuecomment-1634956646
Yes, that is the idea.

* I found out that the `Single` aggregation mode already does what we want (do the aggregation in one go), so there is no need to create a new one.
* I did some experiments skipping the `Partial` aggregation based on a heuristic (e.g. for tables up to a number of columns), but this gets mixed results:

```
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ fast_gby_hash ┃ aggregate_partition_mode ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │      194.12ms │                 194.38ms │     no change │
│ QQuery 2     │       36.59ms │                  30.54ms │ +1.20x faster │
│ QQuery 3     │       44.99ms │                  45.69ms │     no change │
│ QQuery 4     │       37.34ms │                  37.88ms │     no change │
│ QQuery 5     │       91.69ms │                  90.49ms │     no change │
│ QQuery 6     │       10.27ms │                  10.25ms │     no change │
│ QQuery 7     │      193.05ms │                 190.58ms │     no change │
│ QQuery 8     │       69.37ms │                  69.39ms │     no change │
│ QQuery 9     │      132.29ms │                 132.95ms │     no change │
│ QQuery 10    │       91.51ms │                  90.86ms │     no change │
│ QQuery 11    │       40.53ms │                  39.73ms │     no change │
│ QQuery 12    │       67.70ms │                  66.71ms │     no change │
│ QQuery 13    │      130.96ms │                 132.62ms │     no change │
│ QQuery 14    │       11.87ms │                  12.10ms │     no change │
│ QQuery 15    │       14.80ms │                  19.92ms │  1.35x slower │
│ QQuery 16    │       37.79ms │                  37.07ms │     no change │
│ QQuery 17    │      210.67ms │                 209.54ms │     no change │
│ QQuery 18    │      315.60ms │                 381.18ms │  1.21x slower │
│ QQuery 19    │       57.40ms │                  57.59ms │     no change │
│ QQuery 20    │       70.88ms │                  58.71ms │ +1.21x faster │
│ QQuery 21    │      248.35ms │                 252.92ms │     no change │
│ QQuery 22    │       28.11ms │                  27.87ms │     no change │
└──────────────┴───────────────┴──────────────────────────┴───────────────┘
```

I have higher hopes for https://github.com/apache/arrow-datafusion/issues/6937, which I think might be similar to the "adaptive partial aggregation" of Snowflake / Teradata?
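For readers less familiar with the two strategies being compared, the following is a toy sketch in plain Python, not DataFusion code. It contrasts a two-phase aggregation (a partial pass per partition, then a final merge) with a single-pass aggregation. The function names and sample data are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import chain


def partial_then_final(partitions):
    """Two-phase SUM: aggregate each partition, then merge the partial states."""
    partial_states = []
    for rows in partitions:
        state = defaultdict(int)
        for key, value in rows:
            state[key] += value
        partial_states.append(state)

    # Final phase: combine the per-partition subtotals for each group key.
    final = defaultdict(int)
    for state in partial_states:
        for key, subtotal in state.items():
            final[key] += subtotal
    return dict(final)


def single_pass(partitions):
    """Single-phase SUM: one hash table over all rows, no intermediate states."""
    totals = defaultdict(int)
    for key, value in chain.from_iterable(partitions):
        totals[key] += value
    return dict(totals)


# Hypothetical input: two partitions of (group_key, value) rows.
partitions = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
two_phase_result = partial_then_final(partitions)
single_result = single_pass(partitions)
```

Both functions return the same totals. In general, a partial pass tends to pay off when it collapses many rows into few groups before the merge, and tends to be wasted work when almost every key is distinct, which is one plausible reason a static heuristic gives mixed results.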