Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20396 )
Change subject: IMPALA-12393: Fix inconsistent hash for TimestampValue in DictEncoder ...................................................................... IMPALA-12393: Fix inconsistent hash for TimestampValue in DictEncoder Currently, DictEncoder uses the default hash function for TimestampValue, which means it is hashing the entire TimestampValue struct. This can be inconsistent, because TimestampValue contains some padding that may not be zero in some cases. For TimestampValues that are part of a Tuple, the padding is zero, so this is mainly present in test cases. This was discovered when fixing a Clang Tidy performance-for-range-copy warning by iterating with a const reference rather than making a copy of the value. DictTest.TestTimestamps became flaky with that change, because the hash was no longer consistent. The copy must have had consistent content for the padding through the iteration, but the const reference did not. This adds a template specialization of the Hash function for TimestampValue. The specialization uses TimestampValue::Hash(), which hashes only the non-padding pieces of the struct. This also includes the change to dict-test.cc that uncovered the issue. This fix is mostly to unblock IMPALA-12390. Testing: - Ran dict-test in a loop for a few hundred iterations - Hand tested inserting many timestamps into a Parquet table with dictionary encoding and verified that the performance didn't change. Change-Id: Iad86e9b0f645311c3389cf2804dcc1a346ff10a9 Reviewed-on: http://gerrit.cloudera.org:8080/20396 Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Reviewed-by: Daniel Becker <daniel.bec...@cloudera.com> Reviewed-by: Michael Smith <michael.sm...@cloudera.com> --- M be/src/util/dict-encoding.h M be/src/util/dict-test.cc 2 files changed, 8 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Daniel Becker: Looks good to me, but someone else must approve Michael Smith: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/20396 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iad86e9b0f645311c3389cf2804dcc1a346ff10a9 Gerrit-Change-Number: 20396 Gerrit-PatchSet: 3 Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>