Chris Riccomini created SAMZA-505:
-------------------------------------
Summary: CachedStore doesn't support Array keys well
Key: SAMZA-505
URL: https://issues.apache.org/jira/browse/SAMZA-505
Project: Samza
Issue Type: Bug
Components: kv
Affects Versions: 0.8.0
Reporter: Chris Riccomini
Fix For: 0.9.0
Several people have hit an issue when using the Key/Value store with byte[]
keys. CachedStore uses a HashMap, and JVM arrays inherit equals/hashCode from
Object, which compare by identity rather than by contents. Two arrays holding
the same bytes are therefore treated as different keys, and lookups miss. This
isn't really a bug, just a common misunderstanding of how array equality works.
It's compounded by the fact that we default caches to "on". This yields the
behavior:
{code}
store.put("a".getBytes, 1)
store.get("a".getBytes) // returns null
{code}
See [this
discussion|http://stackoverflow.com/questions/1058149/using-a-byte-array-as-hashmap-key-java]
for details.
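To make the failure concrete outside of Samza, here is a minimal sketch in plain Java. It uses a bare java.util.HashMap as a stand-in for the cache (an assumption for illustration; it is not the CachedStore code), and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; a plain HashMap stands in for the cache.
class ByteArrayKeyDemo {
    // Stores a value under one byte[] and looks it up with a different
    // byte[] instance that has identical contents.
    static Integer lookupWithRawArray() {
        Map<byte[], Integer> cache = new HashMap<>();
        cache.put("a".getBytes(StandardCharsets.UTF_8), 1);
        // Arrays use Object.equals, so this distinct instance never matches.
        return cache.get("a".getBytes(StandardCharsets.UTF_8));
    }

    // Same experiment, but with each key wrapped in a ByteBuffer, whose
    // equals/hashCode are computed from the remaining bytes.
    static Integer lookupWithByteBuffer() {
        Map<ByteBuffer, Integer> cache = new HashMap<>();
        cache.put(ByteBuffer.wrap("a".getBytes(StandardCharsets.UTF_8)), 1);
        return cache.get(ByteBuffer.wrap("a".getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithRawArray());   // lookup misses
        System.out.println(lookupWithByteBuffer()); // lookup hits
    }
}
```

The first lookup misses only because the two arrays are different objects, which is also why a test that re-uses the same instances would not notice.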
Our TestKeyValueStore uses byte[] keys, but it keeps them in a list and
re-uses the exact same instances, so it never hits this problem.
I think we should wrap array keys in ByteBuffer, or use our own wrapper. We'll
have to make sure to unwrap before calling the put/get/delete operations on the
underlying store.
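A rough sketch of what the wrap/unwrap idea could look like, assuming a simplified write-behind cache. ByteStore, InMemoryStore, CachingStore and flush are hypothetical names invented for this example; they are not Samza's KeyValueStore or CachedStore APIs, and the real cache has eviction and batching that are omitted here.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WrappingCacheSketch {
    // Hypothetical minimal interface for this sketch; not Samza's KeyValueStore API.
    interface ByteStore {
        void put(byte[] key, Integer value);
        Integer get(byte[] key);
        void delete(byte[] key);
    }

    // Stand-in for the underlying store. It indexes entries by the array's contents.
    static class InMemoryStore implements ByteStore {
        private final Map<String, Integer> data = new HashMap<>();
        public void put(byte[] key, Integer value) { data.put(Arrays.toString(key), value); }
        public Integer get(byte[] key) { return data.get(Arrays.toString(key)); }
        public void delete(byte[] key) { data.remove(Arrays.toString(key)); }
    }

    // Write-behind cache keyed by ByteBuffer, which compares by content.
    static class CachingStore implements ByteStore {
        private final ByteStore underlying;
        private final Map<ByteBuffer, Integer> cache = new HashMap<>();
        private final Set<ByteBuffer> dirty = new HashSet<>();

        CachingStore(ByteStore underlying) { this.underlying = underlying; }

        // Copy the array so a caller mutating it later cannot corrupt the cached key.
        private static ByteBuffer wrap(byte[] key) { return ByteBuffer.wrap(key.clone()); }

        // The buffer is array-backed and never repositioned, so array() is the full key.
        private static byte[] unwrap(ByteBuffer key) { return key.array(); }

        public void put(byte[] key, Integer value) {
            ByteBuffer wrapped = wrap(key);
            cache.put(wrapped, value);
            dirty.add(wrapped);
        }

        public Integer get(byte[] key) {
            ByteBuffer wrapped = wrap(key);
            Integer cached = cache.get(wrapped);
            if (cached != null) return cached;
            Integer stored = underlying.get(key);
            if (stored != null) cache.put(wrapped, stored);
            return stored;
        }

        public void delete(byte[] key) {
            ByteBuffer wrapped = wrap(key);
            cache.remove(wrapped);
            dirty.remove(wrapped);
            underlying.delete(key);
        }

        // Unwrap each dirty key back to a raw byte[] before touching the underlying store.
        void flush() {
            for (ByteBuffer wrapped : dirty) {
                underlying.put(unwrap(wrapped), cache.get(wrapped));
            }
            dirty.clear();
        }
    }
}
```

The underlying store only ever sees raw byte[] keys, so the wrapper stays an internal detail of the cache. Copying the key on wrap costs an allocation per call; whether that is acceptable is a judgment call this sketch does not settle.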
Initially, I was thinking that the safest thing to do would be to have
CachedStore check all keys, and throw an exception if a key is an array. This
would allow individuals to choose the best course of action (ByteBuffer.wrap,
use an alternative key, write a custom wrapper class, etc). But I think this
approach doesn't work in some cases. If there's a cache with a JSON serde, and
the user's key is an Array[Int], that key is perfectly valid. A JSON serde
would just serialize it as [1,2,3], and everything should work in this case.
Since this problem is basically an implementation detail introduced by
CachedStore, I think it should be fixed internally by wrapping/unwrapping array
keys.
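For keys that are not byte[] (the Array[Int] case above), ByteBuffer is not enough, so one option is a small custom wrapper. The following is only a sketch of that option: ArrayKey, wrap and unwrap are hypothetical names, and it relies on java.util.Arrays.deepEquals/deepHashCode to compare primitive and object arrays by content.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper that gives any array content-based equals/hashCode.
final class ArrayKey {
    private final Object array;

    private ArrayKey(Object array) { this.array = array; }

    // Wraps array keys; any other key is returned unchanged.
    static Object wrap(Object key) {
        return (key != null && key.getClass().isArray()) ? new ArrayKey(key) : key;
    }

    // Returns the original key, for passing to the underlying store.
    static Object unwrap(Object key) {
        return (key instanceof ArrayKey) ? ((ArrayKey) key).array : key;
    }

    @Override
    public boolean equals(Object other) {
        // Boxing each array in a one-element Object[] lets deepEquals
        // dispatch on the runtime array type (byte[], int[], Object[], ...).
        return other instanceof ArrayKey
            && Arrays.deepEquals(new Object[] {array}, new Object[] {((ArrayKey) other).array});
    }

    @Override
    public int hashCode() {
        return Arrays.deepHashCode(new Object[] {array});
    }

    public static void main(String[] args) {
        Map<Object, Integer> cache = new HashMap<>();
        cache.put(wrap(new int[] {1, 2, 3}), 7);
        // A different int[] instance with the same contents finds the entry.
        System.out.println(cache.get(wrap(new int[] {1, 2, 3})));
    }
}
```

Because non-array keys pass through wrap and unwrap untouched, the cache could apply both unconditionally. Note that the wrapper holds a reference rather than a copy, so a caller that mutates an array after using it as a key would still break the cache.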