On Aug 19, 2008, at 12:17 PM, Stuart Sierra wrote:
Hello list, Thought I would share this tidbit that frustrated me for a couple of hours. Beware! Hadoop reuses the Writable objects given to the reducer. For example:
public void reduce(K key, Iterator<V> values,
                   OutputCollector<K, V> output,
                   Reporter reporter) throws IOException {
    List<V> valueList = new ArrayList<V>();
    while (values.hasNext()) {
        valueList.add(values.next());
    }
    // Say there were 10 values. valueList now contains 10
    // pointers to the same object.
}

I assume this is done for efficiency, but a warning in the Reducer documentation would be nice.

-Stuart

Yes. http://issues.apache.org/jira/browse/HADOOP-2399 - fixed in 0.17.0.

Arun
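The aliasing pitfall is easy to reproduce outside Hadoop. The sketch below is plain Java with no Hadoop dependency: `IntHolder` is a hypothetical stand-in for a mutable `Writable`, and `ReusingIterator` mimics the reducer's value iterator by handing back the same object on every call to `next()` and mutating it in place. Collecting the iterator's objects directly yields a list of identical entries; copying each value into a fresh object before storing it avoids the problem.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a mutable Hadoop Writable.
class IntHolder {
    int value;
    IntHolder(int value) { this.value = value; }
}

// Mimics the reducer's value iterator: returns the SAME holder
// object every time, overwriting its contents between calls.
class ReusingIterator implements Iterator<IntHolder> {
    private final int[] data;
    private int pos = 0;
    private final IntHolder reused = new IntHolder(0);
    ReusingIterator(int[] data) { this.data = data; }
    public boolean hasNext() { return pos < data.length; }
    public IntHolder next() { reused.value = data[pos++]; return reused; }
}

public class ReuseDemo {
    public static void main(String[] args) {
        int[] data = {1, 2, 3};

        // Pitfall: the list ends up holding three pointers to the
        // same holder, so every entry shows the last value seen.
        List<IntHolder> aliased = new ArrayList<IntHolder>();
        Iterator<IntHolder> it = new ReusingIterator(data);
        while (it.hasNext()) aliased.add(it.next());
        System.out.println(aliased.get(0).value + " " + aliased.get(1).value + " " + aliased.get(2).value);

        // Fix: copy each value into a fresh object before storing it.
        List<IntHolder> copied = new ArrayList<IntHolder>();
        it = new ReusingIterator(data);
        while (it.hasNext()) copied.add(new IntHolder(it.next().value));
        System.out.println(copied.get(0).value + " " + copied.get(1).value + " " + copied.get(2).value);
    }
}
```

The first line printed is "3 3 3" (all aliases of one holder), the second is "1 2 3" (independent copies). In real Hadoop code the equivalent of the copy step is cloning the `Writable` before adding it to a collection.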