Viktor Somogyi-Vass created KAFKA-10650:
-------------------------------------------

             Summary: Use Murmur3 hashing instead of MD5 in SkimpyOffsetMap
                 Key: KAFKA-10650
                 URL: https://issues.apache.org/jira/browse/KAFKA-10650
             Project: Kafka
          Issue Type: Improvement
          Components: core
            Reporter: Viktor Somogyi-Vass
            Assignee: Viktor Somogyi-Vass


The use of MD5 was uncovered while testing Kafka for FIPS (Federal 
Information Processing Standards) verification.

While MD5 isn't a FIPS incompatibility here, since it isn't used for 
cryptographic purposes, I spent some time on this because it isn't ideal 
either. MD5 is a relatively fast cryptographic hash algorithm, but there are 
much better performing algorithms for hash tables, which is how it's used in 
SkimpyOffsetMap.

By applying Murmur3 (which is already implemented in Streams), I could achieve 
a 3x faster {{put}} operation, and overall segment cleaning sped up by 30% 
while preserving the same collision rate (both algorithms stayed within 
0.0015 - 0.007, mostly with a median of 0.004).
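
To illustrate the idea, here is a minimal, self-contained sketch of mapping a 
record key to a hash table slot with MD5 and with a 32-bit Murmur3 variant. It 
is not the SkimpyOffsetMap code and not the Murmur3 class from Streams: the 
class name HashSlotSketch, the helpers md5Slot and murmur3Slot, the seed of 0 
and the slot count are all assumptions made for the example, and the sketch 
says nothing about the measured speedup.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashSlotSketch {

    // Hypothetical helper: derive a slot from the first 4 bytes of the MD5 digest.
    static int md5Slot(byte[] key, int slots) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(key);
            int h = ((d[0] & 0xff) << 24) | ((d[1] & 0xff) << 16)
                  | ((d[2] & 0xff) << 8) | (d[3] & 0xff);
            return Math.floorMod(h, slots);
        } catch (NoSuchAlgorithmException e) {
            // A restricted security provider may not offer MD5 at all.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // 32-bit Murmur3 (x86_32 variant), written out here so the example has no dependencies.
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int len = data.length;
        int blocks = len / 4;
        // Body: mix in 4 bytes at a time, read as little-endian ints.
        for (int i = 0; i < blocks; i++) {
            int o = i * 4;
            int k = (data[o] & 0xff) | ((data[o + 1] & 0xff) << 8)
                  | ((data[o + 2] & 0xff) << 16) | ((data[o + 3] & 0xff) << 24);
            k *= c1;
            k = Integer.rotateLeft(k, 15);
            k *= c2;
            h ^= k;
            h = Integer.rotateLeft(h, 13);
            h = h * 5 + 0xe6546b64;
        }
        // Tail: mix in the remaining 1 to 3 bytes, if any.
        int tail = blocks * 4;
        int t = 0;
        switch (len & 3) {
            case 3: t ^= (data[tail + 2] & 0xff) << 16; // fall through
            case 2: t ^= (data[tail + 1] & 0xff) << 8;  // fall through
            case 1: t ^= (data[tail] & 0xff);
                    t *= c1;
                    t = Integer.rotateLeft(t, 15);
                    t *= c2;
                    h ^= t;
        }
        // Finalization: fold in the length and avalanche the bits.
        h ^= len;
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Hypothetical helper: derive a slot from the Murmur3 hash (seed 0 is an assumption).
    static int murmur3Slot(byte[] key, int slots) {
        return Math.floorMod(murmur3_32(key, 0), slots);
    }

    public static void main(String[] args) {
        byte[] key = "some-record-key".getBytes(StandardCharsets.UTF_8);
        int slots = 1024; // illustrative table size
        System.out.println("MD5 slot:     " + md5Slot(key, slots));
        System.out.println("Murmur3 slot: " + murmur3Slot(key, slots));
    }
}
```

Both helpers only need a well-distributed integer per key, which is why a 
non-cryptographic hash is a reasonable candidate for this use.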



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
