Hello, I'm still getting my head around how Hadoop works.  A survey
question: what kind of serialization do you use to output structured
data from your map/reduce jobs?

When both key and value are primitive types, either TextOutputFormat
or SequenceFileOutputFormat is easy to use.  But what if you want to
store a more complex data structure as the value?

By "complex data structure," I mean some combination of lists/arrays,
dictionaries/hashes, and primitives (int, float, string, boolean).

I've tried using JSON to store structured data in TextOutputFormat,
which works but is not very efficient.  Any better suggestions?
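
For concreteness, here is roughly what the JSON approach looks like.
This is only a sketch in Python (as one might write for a streaming-style
job), not my actual job code: the helper names and the record fields are
made up for illustration.  It relies on TextOutputFormat writing each
record as key, tab, value on one line.

```python
import json


def format_record(key, value):
    """Build one TextOutputFormat-style line: key, a tab, then the value as JSON."""
    # Compact separators drop optional whitespace; json.dumps escapes any
    # newlines inside strings, so the record always stays on a single line.
    return key + "\t" + json.dumps(value, separators=(",", ":"), sort_keys=True)


def parse_record(line):
    """Split a line back into the key and the decoded nested structure."""
    key, _, payload = line.rstrip("\n").partition("\t")
    return key, json.loads(payload)


# Hypothetical record mixing lists, dictionaries, and primitives.
record = {"tags": ["a", "b"], "score": 0.5, "active": True, "counts": {"x": 1}}
line = format_record("doc-1", record)
print(line)
```

Every record repeats its field names as text and has to be re-parsed by
whatever reads it next, which is the overhead I would like to avoid.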

Thanks, all,
