Beam doesn't ship a deterministic coder for maps, so even if your
data structure iterates in a deterministic order, Beam will assume the
serialized bytes aren't deterministic.

You could write one using MapCoder as a guide:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java
Remove the exception thrown in verifyDeterministic, and have decode
instantiate a TreeMap (or similar) instead of a HashMap. This is only
safe if every map you encode iterates its entries in a consistent order.
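
To illustrate what such a coder has to guarantee, here is a minimal sketch
in plain Java (standard library only, so it is not an actual Beam Coder).
The class name SortedMapBytes and the fixed String/Long types are
assumptions made up for this example; in a real coder the key and value
would be written by their own component coders. It sorts the entries by
key before writing, so two equal maps give the same bytes no matter which
Map implementation they came from.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Beam: shows the encode/decode logic only.
public class SortedMapBytes {

    // Write the entry count, then each entry in ascending key order.
    // Null keys or values are not supported in this sketch.
    public static byte[] encode(Map<String, Long> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            TreeMap<String, Long> sorted = new TreeMap<>(map);
            out.writeInt(sorted.size());
            for (Map.Entry<String, Long> e : sorted.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Read the entries back into a TreeMap rather than a HashMap.
    public static TreeMap<String, Long> decode(byte[] data) {
        TreeMap<String, Long> map = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                String key = in.readUTF();
                long value = in.readLong();
                map.put(key, value);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return map;
    }
}
```

Sorting inside encode costs an extra copy, but it means the result does
not depend on the caller remembering to pass a TreeMap.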

Alternatively, you could just represent your key as a sorted list of KV
pairs. Lookups could be done using binary search if necessary.
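
A rough sketch of the sorted-list idea, again in plain Java with only the
standard library. The class name SortedPairs and the String/Long types are
made up for illustration; in a pipeline you would presumably hold Beam KV
objects in the list instead of Map.Entry, and you would still need to
check that the coder chosen for that list is accepted as deterministic.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a map flattened into a key-sorted list of pairs.
public class SortedPairs {

    private static final Comparator<Map.Entry<String, Long>> BY_KEY =
        Map.Entry.comparingByKey();

    // Copy the entries into immutable pairs and sort them by key.
    public static List<Map.Entry<String, Long>> fromMap(Map<String, Long> map) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<>();
        for (Map.Entry<String, Long> e : map.entrySet()) {
            pairs.add(new SimpleImmutableEntry<String, Long>(e.getKey(), e.getValue()));
        }
        pairs.sort(BY_KEY);
        return pairs;
    }

    // Binary search by key; returns null when the key is absent.
    // The probe's value (0L) is ignored because BY_KEY compares keys only.
    public static Long lookup(List<Map.Entry<String, Long>> pairs, String key) {
        Map.Entry<String, Long> probe = new SimpleImmutableEntry<String, Long>(key, 0L);
        int index = Collections.binarySearch(pairs, probe, BY_KEY);
        return index >= 0 ? pairs.get(index).getValue() : null;
    }
}
```

Because the list is built from a map, keys are unique, so the binary
search result is unambiguous.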

Mike

On Thu, Jul 11, 2019 at 22:41, Shannon Duncan <
joseph.dun...@liveramp.com> wrote:

> So I'm working on essentially doing a word-count on a complex data
> structure.
>
> I tried just using a HashMap as the structure, but that didn't work
> because it is non-deterministic.
>
> However, when given a LinkedHashMap or TreeMap, which is deterministic,
> the SDK complains that it's non-deterministic when I try to use it as a
> key for GroupByKey.
>
> What would be an appropriate Map style data structure that would be
> deterministic enough for Apache Beam to accept it as a key?
>
> Thanks,
> Shannon
>
