josh gruenberg created KAFKA-3499:
-------------------------------------
Summary: byte[] should not be used as Map key nor Set member
Key: KAFKA-3499
URL: https://issues.apache.org/jira/browse/KAFKA-3499
Project: Kafka
Issue Type: Bug
Components: kafka streams
Reporter: josh gruenberg
On the JVM, arrays do not override equals or hashCode; they inherit the
identity-based Object.equals/hashCode. As a result, collections that rely
upon equals/hashCode (e.g., HashMap/HashSet and variants) treat two arrays with
equal contents as distinct elements.
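A minimal sketch of the behavior described above, using only java.util
classes. The class name ArrayKeyDemo and the helper foundByContent are
hypothetical names chosen for illustration; they do not appear in Kafka.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of Kafka.
public class ArrayKeyDemo {

    // Stores one array in a HashSet, then asks whether the probe array is found.
    static boolean foundByContent(byte[] stored, byte[] probe) {
        Set<byte[]> set = new HashSet<>();
        set.add(stored);
        return set.contains(probe);
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};

        System.out.println(a.equals(b));           // false: identity comparison
        System.out.println(Arrays.equals(a, b));   // true: content comparison
        System.out.println(foundByContent(a, b));  // false: equal contents, different instance
        System.out.println(foundByContent(a, a));  // true: same instance
    }
}
```

The set only finds an array when it is given the very same instance, which
is rarely what a cache keyed by serialized bytes has on hand.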
Many of the Kafka Streams internal classes currently use generic HashMaps and
Sets to manage caches and invalidation status. For example,
RocksDBStore.cacheDirtyKeys is a HashSet<K>, and in RocksDBWindowStore the
elements are constructed as RocksDBStore<byte[], byte[]>, so K is byte[].
Similarly, the MemoryLRUCache<K, RocksDBCacheEntry> internally holds a
LinkedHashMap<K,V> map and a HashSet<K> keys, and these end up holding byte[]
keys. Finally, user code may attempt to use any of these provided types with
byte[], in which case lookups by an equal-content key will miss.
Keys that are byte arrays should be wrapped in a type that incorporates the
content in its computation of equals/hashCode. java.nio.ByteBuffer is one
such type that could be used, but a purpose-built immutable class would likely
be a better solution.
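One way such a purpose-built wrapper might look is sketched below. The
class name BytesKey and its methods are hypothetical and are not an
existing Kafka API; the sketch assumes that copying the array on
construction is acceptable, which may not hold on hot paths.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable wrapper, not part of Kafka.
public final class BytesKey {
    private final byte[] data;
    private final int hash;

    public BytesKey(byte[] data) {
        // Defensive copy so later changes to the caller's array cannot alter the key.
        this.data = Arrays.copyOf(data, data.length);
        // Contents never change, so the content-based hash can be computed once.
        this.hash = Arrays.hashCode(this.data);
    }

    // Returns a copy so callers cannot mutate the wrapped bytes.
    public byte[] get() {
        return Arrays.copyOf(data, data.length);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof BytesKey)) return false;
        return Arrays.equals(data, ((BytesKey) o).data);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    public static void main(String[] args) {
        Map<BytesKey, String> cache = new HashMap<>();
        cache.put(new BytesKey(new byte[] {1, 2, 3}), "value");
        // A distinct array with the same contents now finds the entry.
        System.out.println(cache.get(new BytesKey(new byte[] {1, 2, 3}))); // value
    }
}
```

ByteBuffer.wrap(bytes) would also give content-based equality, but a
ByteBuffer's equals and hashCode depend on the bytes remaining between its
position and limit, and the buffer is mutable, so a key could change
identity after insertion. An immutable wrapper avoids that hazard.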