We do this with some of our Thrift-serialized types. We account for this behavior explicitly in the ThriftWritable class: the serialized form is prefixed with its size, so we can read the complete serialized record off the wire as raw bytes. We then hang on to those bytes and deserialize them later as we see fit. I would expect that leaving the bytes unread on the DataInput would break things in a very impressive way.
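For illustration, here is a minimal sketch of the size-prefix approach described above. It is not the actual ThriftWritable code: LazyRecord, get(), and isDecoded() are hypothetical names, and a UTF-8 string stands in for the expensive Thrift payload. The sketch uses only java.io so it runs without Hadoop on the classpath; in a real job the class would presumably declare "implements Writable", whose write and readFields methods have the same signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical record type, not part of Hadoop or Thrift.
// In a Hadoop job it would declare "implements Writable".
final class LazyRecord {
    private byte[] raw = new byte[0]; // serialized payload, copied off the stream
    private String decoded;           // cached result, filled in on first access

    LazyRecord() {}

    LazyRecord(String value) {
        this.raw = value.getBytes(StandardCharsets.UTF_8);
        this.decoded = value;
    }

    // Same signature as Writable.write: a 4-byte length prefix, then the payload.
    public void write(DataOutput out) throws IOException {
        out.writeInt(raw.length);
        out.write(raw);
    }

    // Same signature as Writable.readFields: consume exactly this record's
    // bytes so the stream is positioned at the start of the next record.
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        byte[] buffer = new byte[length];
        in.readFully(buffer);
        raw = buffer;
        decoded = null; // discard any stale cache if this instance is reused
    }

    // Deserialize on demand. The string decode is a stand-in for the
    // expensive step (for example, Thrift deserialization).
    public String get() {
        if (decoded == null) {
            decoded = new String(raw, StandardCharsets.UTF_8);
        }
        return decoded;
    }

    // True once the payload has actually been decoded.
    boolean isDecoded() {
        return decoded != null;
    }
}
```

The point of the length prefix is that readFields never has to parse the payload to know where it ends: it copies the bytes into its own buffer and keeps no reference to the DataInput, so the record stays valid even if the caller goes on to read other records from the same stream.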

-Bryan

On Oct 2, 2008, at 2:48 PM, Jimmy Lin wrote:

Hi everyone,

I'm wondering if it's possible to lazily deserialize a Writable. That is,
when my custom Writable is handed a DataInput from readFields, can I
simply hang on to the reference and read from it later? This would be
useful if the Writable is a complex data structure that may be expensive
to deserialize, so I'd only want to do it on-demand. Or does the runtime
mutate the underlying stream, leaving the Writable with a reference to
something completely different later?

I'm wondering about both present behavior, and the implicit contract
provided by the Hadoop API.

Thanks!

-Jimmy


