Rachelint opened a new issue, #11717:
URL: https://github.com/apache/datafusion/issues/11717

   ### Is your feature request related to a problem or challenge?
   
   DataFusion currently uses a two-part aggregate hash table, and the `hashes` 
of the `groups` are already saved in the `hash table` part.
   However, the saved `hashes` are not used when probing a bucket. Instead, we 
fetch the `group values` directly and compare them, which leads to many random 
memory accesses, and the comparisons are not cheap for some types.
   
   
   
   ### Describe the solution you'd like
   
   Maybe we should check the saved `hashes` first, and only compare the `group 
values` when the `hashes` are equal, to rule out a collision.
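
The idea can be sketched roughly as follows. This is a minimal, standalone illustration and not DataFusion's actual code: `GroupTable`, `intern`, `intern_with_hash`, and the `value_compares` counter are hypothetical names, the table uses plain linear probing from the standard library only, and `String` stands in for arbitrary group values.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified stand-in for a two-part aggregate hash table:
/// `slots` plays the role of the "hash table" part, `group_values` is the
/// separate store that is expensive to touch.
struct GroupTable {
    /// Each occupied slot holds (saved hash, index into `group_values`).
    slots: Vec<Option<(u64, usize)>>,
    group_values: Vec<String>,
    /// Number of full group-value comparisons, kept only for illustration.
    value_compares: usize,
}

fn hash_of(value: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

impl GroupTable {
    fn with_capacity(capacity: usize) -> Self {
        // Keep the slot count a power of two so `hash & mask` picks a bucket.
        let cap = capacity.next_power_of_two().max(8);
        Self { slots: vec![None; cap], group_values: Vec::new(), value_compares: 0 }
    }

    /// Returns the group index for `value`, inserting it if absent.
    fn intern(&mut self, value: &str) -> usize {
        self.intern_with_hash(hash_of(value), value)
    }

    /// Same as `intern`, but with a caller-supplied hash so that
    /// collisions can be forced in a demonstration.
    fn intern_with_hash(&mut self, hash: u64, value: &str) -> usize {
        // Grow before the load factor would exceed 1/2.
        if (self.group_values.len() + 1) * 2 > self.slots.len() {
            self.grow();
        }
        let mask = self.slots.len() - 1;
        let mut pos = (hash as usize) & mask;
        loop {
            let slot = self.slots[pos];
            match slot {
                None => {
                    // Empty slot: this is a new group.
                    let idx = self.group_values.len();
                    self.group_values.push(value.to_string());
                    self.slots[pos] = Some((hash, idx));
                    return idx;
                }
                // Check the saved hash first; touch the group values
                // only when the hashes are equal.
                Some((saved_hash, idx)) if saved_hash == hash => {
                    self.value_compares += 1;
                    if self.group_values[idx] == value {
                        return idx;
                    }
                    // Genuine hash collision: keep probing.
                }
                // Different hash: skip without reading `group_values`.
                Some(_) => {}
            }
            pos = (pos + 1) & mask;
        }
    }

    fn grow(&mut self) {
        let new_cap = self.slots.len() * 2;
        let mask = new_cap - 1;
        let mut new_slots = vec![None; new_cap];
        // Rehashing reuses the saved hashes, so values are not rehashed.
        for (hash, idx) in self.slots.iter().flatten().copied() {
            let mut pos = (hash as usize) & mask;
            while new_slots[pos].is_some() {
                pos = (pos + 1) & mask;
            }
            new_slots[pos] = Some((hash, idx));
        }
        self.slots = new_slots;
    }
}
```

In this sketch, two entries that land in the same bucket but have different hashes are told apart by a single `u64` comparison on data already stored in the slot, so the separate `group_values` store is read only when the hashes match.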
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   I ran ClickBench locally, and it seems to help in some cases.
   ```
   ┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
   ┃ Query        ┃       main ┃ check-hash-first ┃        Change ┃
   ┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
   │ QQuery 0     │     0.81ms │           0.77ms │ +1.06x faster │
   │ QQuery 1     │    69.11ms │          69.79ms │     no change │
   │ QQuery 2     │   169.47ms │         160.89ms │ +1.05x faster │
   │ QQuery 3     │   182.60ms │         180.05ms │     no change │
   │ QQuery 4     │  1589.41ms │        1595.87ms │     no change │
   │ QQuery 5     │  1597.02ms │        1563.24ms │     no change │
   │ QQuery 6     │    57.92ms │          59.34ms │     no change │
   │ QQuery 7     │    72.33ms │          71.02ms │     no change │
   │ QQuery 8     │  2415.02ms │        2293.14ms │ +1.05x faster │
   │ QQuery 9     │  1928.42ms │        1912.94ms │     no change │
   │ QQuery 10    │   544.39ms │         539.45ms │     no change │
   │ QQuery 11    │   605.79ms │         606.26ms │     no change │
   │ QQuery 12    │  1767.57ms │        1748.26ms │     no change │
   │ QQuery 13    │  4073.33ms │        3979.57ms │     no change │
   │ QQuery 14    │  2583.14ms │        2518.41ms │     no change │
   │ QQuery 15    │  1784.13ms │        1777.43ms │     no change │
   │ QQuery 16    │  5028.55ms │        4898.04ms │     no change │
   │ QQuery 17    │  4956.14ms │        4796.22ms │     no change │
   │ QQuery 18    │ 10436.51ms │       10168.34ms │     no change │
   │ QQuery 19    │   144.11ms │         147.18ms │     no change │
   │ QQuery 20    │  3310.77ms │        3286.34ms │     no change │
   │ QQuery 21    │  3887.09ms │        3867.43ms │     no change │
   │ QQuery 22    │  9398.96ms │        9008.04ms │     no change │
   │ QQuery 23    │ 23087.26ms │       22804.51ms │     no change │
   │ QQuery 24    │  1168.15ms │        1139.59ms │     no change │
   │ QQuery 25    │  1046.92ms │        1010.22ms │     no change │
   │ QQuery 26    │  1352.80ms │        1317.86ms │     no change │
   │ QQuery 27    │  4711.92ms │        4698.67ms │     no change │
   │ QQuery 28    │ 21891.92ms │       22870.99ms │     no change │
   │ QQuery 29    │   920.19ms │         901.89ms │     no change │
   │ QQuery 30    │  2075.81ms │        2036.71ms │     no change │
   │ QQuery 31    │  2961.03ms │        2844.67ms │     no change │
   │ QQuery 32    │ 16167.05ms │       15106.28ms │ +1.07x faster │
   │ QQuery 33    │  9418.20ms │        9429.24ms │     no change │
   │ QQuery 34    │  9388.74ms │        9431.36ms │     no change │
   │ QQuery 35    │  3108.34ms │        3021.89ms │     no change │
   │ QQuery 36    │   270.02ms │         269.25ms │     no change │
   │ QQuery 37    │   166.63ms │         156.78ms │ +1.06x faster │
   │ QQuery 38    │   158.33ms │         157.94ms │     no change │
   │ QQuery 39    │   834.47ms │         844.51ms │     no change │
   │ QQuery 40    │    63.22ms │          62.05ms │     no change │
   │ QQuery 41    │    59.97ms │          58.34ms │     no change │
   │ QQuery 42    │    70.34ms │          72.41ms │     no change │
   └──────────────┴────────────┴──────────────────┴───────────────┘
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

