I am going to load cache data in a set of worker bolts that I will be 
sending data to with a fields grouping on one field, a string. I need to 
be able to duplicate Storm's fields grouping mod/hash so I can preload the 
worker bolts with data. I tried taking the hashCode() of the string and 
then taking that modulo the number of workers I have, but the result does 
not match Storm's calculation. 

I searched and found the Storm 8 Clojure code that performs the mod hash, 
and I have a note from Nathan, but I would like the actual algorithm in 
Java if possible. 

Any help would be appreciated.


Clojure mod/hash code:

        (defn- mk-fields-grouper [^Fields out-fields ^Fields group-fields num-tasks]
          (fn [^List values]
            (mod (tuple/list-hash-code (.select out-fields group-fields values))
                 num-tasks)))


Note from Nathan:
It calls "hashCode" on the list of selected values and mods it by the 
number of consumer tasks. You can play around with that function to see if 
something about your data is causing something degenerative to happen and 
cause skew. 
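
Going by the note above, a rough Java equivalent might look like the sketch 
below. It assumes that tuple/list-hash-code simply calls hashCode() on the 
java.util.List of selected values, as the note describes; I have not 
verified that against the Storm source. The class and method names are 
made up for illustration and are not part of Storm. Two details would 
explain why hashing the string directly gives a different answer: the hash 
of a one-element List is 31 + element.hashCode() rather than the element's 
own hash, and Clojure's mod never returns a negative result for a positive 
divisor, while Java's % operator can.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Storm's API.
final class FieldsGroupingSketch {

    // Assumption: Storm hashes the List of selected field values with
    // List.hashCode() and reduces it with Clojure's mod.
    static int taskIndex(List<?> selectedValues, int numTasks) {
        // Math.floorMod matches Clojure's mod: the result takes the sign
        // of the divisor, so it stays in [0, numTasks) for negative hashes.
        return Math.floorMod(selectedValues.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 5;
        String key = "a"; // "a".hashCode() is 97

        // Hash the one-element list of grouping values: 31 + 97 = 128.
        int viaList = taskIndex(Arrays.asList(key), numTasks);

        // The approach from the question: hash the bare string.
        int naive = key.hashCode() % numTasks;

        System.out.println("list-based index: " + viaList); // 3
        System.out.println("string-only index: " + naive);  // 2
    }
}
```

The value returned is presumably an index into the consumer bolt's list of 
task ids rather than a task id itself, so the preloading code would still 
need to map the index to the actual task.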
 
 

Rick Rankin 

