alamb commented on code in PR #19910:
URL: https://github.com/apache/datafusion/pull/19910#discussion_r2722561289
##########
datafusion/physical-expr-common/src/binary_map.rs:
##########
@@ -389,7 +389,7 @@ where
// is value is already present in the set?
let entry = self.map.find_mut(hash, |header| {
// compare value if hashes match
- if header.len != value_len {
+ if header.hash != hash {
Review Comment:
I am double checking the reasoning, if the hash values aren't the same we
know the values can not be the same either due to the requirements of the Hash
trait
However (as codex points out to me) there is some small subtle chance that
if two different values collide on hash, **and their inline encodings match**,
they will be treated as equal even though the byte sequences differ which is
what the old len check avoided
So I think we still need the old length checks too 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]