[ https://issues.apache.org/jira/browse/AVRO-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ryan Skraba updated AVRO-2905: ------------------------------ Status: Patch Available (was: Open) > org.apache.avro.util.Utf8 bug in `hashCode` > ------------------------------------------- > > Key: AVRO-2905 > URL: https://issues.apache.org/jira/browse/AVRO-2905 > Project: Apache Avro > Issue Type: Bug > Components: java > Affects Versions: 1.10.0 > Reporter: Kyoungha Min > Assignee: Kyoungha Min > Priority: Major > Fix For: 1.10.1 > > > Bug : utf8 hash reset is not correctly done, as a result, returns different > hash when the values are logically equal. > > cause commit : > > [https://github.com/apache/avro/commit/d5d1cd70e920f45468bf696d3668b496a9adbab3#diff-7001731cdafe4fc24f86e71c7e8a0b61] > [https://github.com/apache/avro/commit/8e345c48e8f9708c27af63754965777314dac85c#diff-7001731cdafe4fc24f86e71c7e8a0b61] > > reproduce : > {code:java} > org.apache.avro.util.Utf8 utf8 = new org.apache.avro.util.Utf8(); > utf8.set("hello"); > System.out.println(utf8.hashCode()); // 99162322 > utf8.set("hello"); > System.out.println(utf8.hashCode()); // -1538739392 > {code} > > Suggested fix. > # get rid of `hasHash` field. In my opinion, it leads to more bug-prone by > adding additional variable to take care of. `hash != 0` should be enough to > check the presence of hash. Redundant computation for `0 hash` elements > should be negligible. > # reset `hash = 0` on every mutation call. > # not needed, but I want to see some minor optimization like avoiding memory > access by creating local variable (int hash = 0; .... this.hash = hash). > -- This message was sent by Atlassian Jira (v8.3.4#803005)