Re: [PR] [opt](function) Eliminate redundant hash computation in AggregateFunctionUniq [doris]

via GitHub Mon, 30 Mar 2026 00:37:33 -0700


github-actions[bot] commented on PR #61730:
URL: https://github.com/apache/doris/pull/61730#issuecomment-4152902696


   ## Code Review Summary
   
   Reviewed the hash precomputation optimization in `AggregateFunctionUniq` and 
`AggregateFunctionUniqDistributeKey`. **No issues found.**
   
   ### Critical Checkpoint Conclusions
   
   - **Goal / Correctness**: The goal is to eliminate redundant hash 
computation (2x -> 1x per key) in the batch insert paths. The implementation is 
correct: `hash()` returns the mixed hash value, and both `prefetch_hash()` and 
`emplace_with_hash()` expect exactly that value (they apply H1/H2 internally). 
The phmap APIs (`hash()`, `prefetch_hash()`, `emplace_with_hash()`) all exist 
and are used correctly.
   - **Modification scope**: Minimal and focused. Two files with identical 
symmetrical changes (34 additions, 10 deletions). No unrelated changes.
   - **Concurrency**: Not applicable. These functions operate on per-thread 
aggregation state, no shared data.
   - **Lifecycle**: Local `std::vector` with automatic cleanup. No concerns.
   - **Configuration / Incompatible changes**: None.
   - **Parallel code paths**: The `merge()` and `deserialize_and_merge()` 
functions also call `set.insert()`, but they iterate over elements from another 
set/buffer without the prefetch pattern, so the optimization is not applicable 
there. Correct as-is.
   - **Test coverage**: No new tests. This is a pure performance optimization 
with no behavioral change (hash function is deterministic and stateless, 
`emplace_with_hash` produces identical results to `insert`). Existing 
`count(distinct ...)` tests cover correctness. Acceptable.
   - **Performance analysis**: 
     - Hash computation reduced from 2x to 1x per key - clear benefit.
     - The precomputation loop is a sequential scan over `keys[]` which is 
cache-friendly.
     - The extra `std::vector<size_t>` allocation per call is consistent with 
existing allocation patterns already present in the same functions 
(`keys_container`, `array_of_data_set`).
     - This pattern is already established in `hash_map_context.h` 
(`init_hash_values()` + `prefetch()` + `lazy_emplace()`), so the approach is 
consistent with the codebase.
   - **Other issues**: None identified.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [opt](function) Eliminate redundant hash computation in AggregateFunctionUniq [doris]

Reply via email to