[
https://issues.apache.org/jira/browse/IMPALA-14253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18008150#comment-18008150
]
Joe McDonnell commented on IMPALA-14253:
----------------------------------------
Looks like that changed in IMPALA-7635:
https://github.com/apache/impala/commit/2040b2621f73de4dfaba80dd938221758952183f
> HashTable's travel_length_ statistic is incorrect
> -------------------------------------------------
>
> Key: IMPALA-14253
> URL: https://issues.apache.org/jira/browse/IMPALA-14253
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Priority: Major
>
> The profile has some statistics about the hash table:
> {noformat}
> Hash Table [120 instances]:
> - HashBuckets: 33.55M (33554432)
> - HashCollisions: 45.42K (45423)
> - Probes: 38.56M (38562545)
> - Resizes: 176 (176)
> - Travel: 0 (0){noformat}
> The "Travel" statistic comes from HashTable's travel_length_ counter, which is
> not being counted correctly. If HashCollisions is non-zero, Travel
> should also be non-zero. The problem is that the code does not update
> travel_length_ when it returns early (which is almost always):
> {noformat}
> int64_t step = 0;
> do {
> Bucket* bucket = &buckets[bucket_idx];
> if (LIKELY(!bucket->IsFilled())) return bucket_idx; <--- Doesn't update travel_length_
> if (hash == hash_array[bucket_idx]) {
> if (COMPARE_ROW
> && ht_ctx->Equals<INCLUSIVE_EQUALITY>(
> GetRow<TYPE>(bucket, ht_ctx->scratch_row_, bd))) {
> *found = true;
> return bucket_idx; <--------- Doesn't update travel_length_
> }
> // Row equality failed, or not performed. This is a hash collision. Continue
> // searching.
> ++ht_ctx->num_hash_collisions_;
> }
> // Move to the next bucket.
> ++step;
> ... logic to pick next bucket ...
> } while (LIKELY(step < num_buckets));
> ht_ctx->travel_length_ += step;{noformat}
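
One possible way to avoid the undercount is to route every exit from the probe loop through a single statement that adds step to the counter. The sketch below illustrates that idea on a simplified linear-probing table. It is not Impala's code and not necessarily how this issue will be fixed: Bucket, ProbeStats, and Probe are hypothetical stand-ins, and the next-bucket logic is assumed to be plain linear probing.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for illustration; not Impala's real types.
struct Bucket {
  bool filled;
  uint32_t hash;
  int64_t key;
};

struct ProbeStats {
  int64_t travel_length = 0;
  int64_t num_hash_collisions = 0;
};

// Linear probing (an assumption; the real next-bucket logic is elided in the
// snippet above). Returns the index of the matching bucket or of the first
// empty bucket, or -1 if every bucket was visited without either.
int64_t Probe(const std::vector<Bucket>& buckets, uint32_t hash, int64_t key,
              ProbeStats* stats, bool* found) {
  assert(!buckets.empty());
  const int64_t num_buckets = static_cast<int64_t>(buckets.size());
  int64_t bucket_idx = hash % num_buckets;
  int64_t result = -1;
  int64_t step = 0;
  *found = false;
  do {
    const Bucket& bucket = buckets[bucket_idx];
    if (!bucket.filled) {
      result = bucket_idx;  // Empty bucket: break instead of returning early.
      break;
    }
    if (bucket.hash == hash) {
      if (bucket.key == key) {
        *found = true;
        result = bucket_idx;  // Match: break instead of returning early.
        break;
      }
      // Same hash, different key: a hash collision. Continue searching.
      ++stats->num_hash_collisions;
    }
    // Move to the next bucket.
    ++step;
    bucket_idx = (bucket_idx + 1) % num_buckets;
  } while (step < num_buckets);
  // Single exit point: the counter is updated on every path.
  stats->travel_length += step;
  return result;
}
```

With this shape, a probe that walks past one colliding bucket before finding its match adds 1 to travel_length, so a non-zero collision count always comes with a non-zero travel count.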