Hi all,

My Reducers need to load a huge HashMap from data stored in HDFS. This data was partitioned by a previous map/reduce job. The complete data set does not fit into the main memory of a single Reducer machine, but loading only the matching partition would suffice. The problem is that the "correct" partition is determined by the Partitioner that feeds the current Reducers. I don't see how a Reducer can find out, in its configure() method, which partition it will receive from the Partitioner, i.e. which partition to load from HDFS into the HashMap.
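For concreteness, here is a sketch of what I'd like to do. I've seen the property name "mapred.task.partition" mentioned as carrying the reduce task's partition index in the old mapred API, but I'm not sure that's guaranteed, so treat it as an assumption; a plain Properties object stands in for the JobConf here, and the base path is made up:

```java
import java.util.Properties;

public class PartitionLookup {
    // In a real Reducer.configure(JobConf conf), the partition index would
    // hopefully come from conf.getInt("mapred.task.partition", -1)
    // (assumption, not verified); a Properties object stands in here.
    static int partitionOf(Properties conf) {
        return Integer.parseInt(conf.getProperty("mapred.task.partition", "-1"));
    }

    // Build the HDFS path of the side-data file written by the previous job
    // for this partition (part-00000, part-00001, ...). Base path is
    // illustrative only.
    static String sideDataPath(String base, int partition) {
        return String.format("%s/part-%05d", base, partition);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.task.partition", "3");
        int p = partitionOf(conf);
        // Load only this partition's file into the HashMap in configure().
        System.out.println(sideDataPath("/user/juergen/lookup", p));
    }
}
```

If the partition index really is exposed this way, configure() could open just that one part file and fill the HashMap from it.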

Maybe someone has a good idea.

Regards,
Jürgen
