On Sep 12, 2008, at 3:01 PM, Chris Douglas wrote:

Oh, I see what you mean. Yes, you need to reuse the objects that you're given in your deserializer.


This isn't true in the general case. The Java serializer, for instance, always returns a new instance. The SequenceFile reader has a pair of methods:

public Object next(Object key) throws IOException;
public Object nextValue(Object value) throws IOException;

so that you can read Java-serialized objects from a sequence file. Such objects also work as map outputs and reduce outputs. The only place where you are hosed is the RecordReader interface. HADOOP-1230's changes to the RecordReader were designed to fix the problem.

-- Owen
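
A minimal sketch of the point above, not Hadoop code: it assumes a hypothetical one-method interface, SketchDeserializer, in which the argument is only a reuse hint and the returned object is the one holding the record. The class names, the serialize helper, and the sample records are all made up for illustration; only the java.io classes are real. It shows why a caller has to keep the return value when the deserializer is backed by Java serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class ReuseSketch {
    // Hypothetical contract (an assumption, not the Hadoop interface):
    // the argument is a reuse hint, the return value is the actual record.
    interface SketchDeserializer<T> {
        T deserialize(T reuse) throws IOException;
    }

    // Java serialization cannot fill in an existing object, so this
    // implementation ignores the hint and always returns a new instance.
    static final class JavaSketchDeserializer implements SketchDeserializer<Object> {
        private final ObjectInputStream in;

        JavaSketchDeserializer(byte[] data) throws IOException {
            this.in = new ObjectInputStream(new ByteArrayInputStream(data));
        }

        @Override
        public Object deserialize(Object reuse) throws IOException {
            try {
                return in.readObject();
            } catch (ClassNotFoundException e) {
                throw new IOException(e);
            }
        }
    }

    // Helper for the example: writes the records with Java serialization.
    static byte[] serialize(Object... records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object record : records) {
                out.writeObject(record);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        SketchDeserializer<Object> d =
            new JavaSketchDeserializer(serialize("first", "second"));
        Object key = new StringBuilder("placeholder");
        // Correct usage: keep the returned object, not the one passed in.
        Object result = d.deserialize(key);
        System.out.println(result);         // prints "first"
        System.out.println(result == key);  // prints "false"
    }
}
```

A caller that ignores the return value and keeps reading from the object it passed in would still see "placeholder" here. An interface that returns only a boolean and expects the passed-in key and value to be filled in leaves no way to hand back a new instance.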
