We do this with some of our Thrift-serialized types. We account for
this behavior explicitly in the ThriftWritable class and make it so
that we can read the serialized version off the wire completely by
prepending the size. Then, we can read in the raw bytes and hang on
to them for later as we see fit. I would think that leaving the bytes
on the DataInput would break things in a very impressive way.
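
A minimal sketch of the size-prefix approach described above, not the actual ThriftWritable code. LazyRecord and its helper methods are hypothetical names, and a UTF-8 string stands in for the expensive-to-decode structure. The class uses only java.io so it runs on its own; in Hadoop it would implement org.apache.hadoop.io.Writable, which declares the same write and readFields methods.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; in Hadoop it would implement
// org.apache.hadoop.io.Writable.
class LazyRecord {
    private byte[] raw;      // serialized payload copied off the stream
    private String decoded;  // stand-in for the expensive structure; null until needed

    LazyRecord() {
        this.raw = new byte[0];
    }

    LazyRecord(String value) {
        this.decoded = value;
    }

    public void write(DataOutput out) throws IOException {
        byte[] payload = (raw != null) ? raw : encode(decoded);
        out.writeInt(payload.length);  // size prefix
        out.write(payload);
    }

    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        raw = new byte[length];
        in.readFully(raw);  // consume exactly this record's bytes, nothing more
        decoded = null;     // drop stale state in case the instance is reused
    }

    // Decodes on first access and caches the result.
    public String get() {
        if (decoded == null) {
            decoded = decode(raw);
        }
        return decoded;
    }

    boolean isDecoded() {
        return decoded != null;
    }

    private static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    private static String decode(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }
}
```

The point of the design is that readFields never keeps a reference to the DataInput. It copies its own bytes out and leaves the stream positioned at the start of the next record, so the deferred work is only the decoding step.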
-Bryan
On Oct 2, 2008, at 2:48 PM, Jimmy Lin wrote:
Hi everyone,
I'm wondering if it's possible to lazily deserialize a Writable. That
is, when my custom Writable is handed a DataInput from readFields, can
I simply hang on to the reference and read from it later? This would
be useful if the Writable is a complex data structure that may be
expensive to deserialize, so I'd only want to do it on-demand. Or does
the runtime mutate the underlying stream, leaving the Writable with a
reference to something completely different later?
I'm wondering about both present behavior, and the implicit contract
provided by the Hadoop API.
Thanks!
-Jimmy