Github user kevpeek commented on the issue:
https://github.com/apache/storm/pull/2379
I went ahead and replaced the original hash-function-based logic with
something faster. The new code runs in about 60% of the time the original
took.
As for changing the hash logic, I ran tests to measure the resulting tuple
distribution across downstream tasks. The difference is not huge, but the
new code produces a slightly more even distribution. This is primarily because
it avoids the case where both hash functions choose the same task, which
would eliminate the benefit of the power of two choices.
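
To illustrate the collision point, here is a minimal sketch of one way to guarantee two distinct candidate tasks. It is not the code from this pull request: the class `TwoChoiceSketch`, the methods `candidates` and `choose`, and the bit-mixing step are all hypothetical names and choices made up for the example.

```java
// Hypothetical illustration only; not the implementation in this PR.
final class TwoChoiceSketch {

    // Returns two candidate task indices in [0, numTasks).
    // When numTasks > 1 the two indices are always different.
    static int[] candidates(Object key, int numTasks) {
        int h = key.hashCode();
        int first = Math.floorMod(h, numTasks);
        if (numTasks == 1) {
            return new int[] {first, first};
        }
        // Derive an offset in [1, numTasks - 1] from a remixed hash
        // (assumed mixing step), so the second index can never equal the first.
        int mixed = Integer.rotateLeft(h, 16) ^ 0x9E3779B9;
        int offset = 1 + Math.floorMod(mixed, numTasks - 1);
        int second = (first + offset) % numTasks;
        return new int[] {first, second};
    }

    // Sends the tuple to whichever candidate has received fewer tuples so far
    // and records the choice in the load array.
    static int choose(Object key, long[] load) {
        int[] c = candidates(key, load.length);
        int pick = load[c[0]] <= load[c[1]] ? c[0] : c[1];
        load[pick]++;
        return pick;
    }
}
```

With two independent hash functions, both can land on the same task for a given key, and that key then has only one possible destination. Deriving the second candidate as a nonzero offset from the first means a hot key can always be split across two tasks.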
@revans2
---