Hello, I'm still getting my head around how Hadoop works. A survey question: what kind of serialization do you use to output structured data from your map/reduce jobs?
When both key and value are primitive types, either TextOutputFormat or SequenceFileOutputFormat is easy. But what if you want to store a more complex data structure as the value? By "complex data structure," I mean some combination of lists/arrays, dictionaries/hashes, and primitives (int, float, string, boolean, null). I've tried using JSON to store structured data in TextOutputFormat, which works but is not very efficient. Any better suggestions?

Thanks, all,
-Stuart
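
For concreteness, the JSON-in-TextOutputFormat approach mentioned above can be sketched in Python roughly as follows. This is only an illustration, assuming the default key<TAB>value line layout that TextOutputFormat writes. The record, its field names, and the helpers to_line and from_line are made up for the example and are not taken from a real job.

```python
import json

# Hypothetical record mixing lists, dictionaries, and primitives.
# All field names and values are invented for illustration.
record = {
    "user": "alice",
    "visits": 3,
    "score": 0.75,
    "active": True,
    "referrer": None,
    "tags": ["a", "b"],
    "counts": {"mon": 1, "tue": 2},
}


def to_line(key, value):
    """Encode one key/value pair as a single 'key<TAB>json' text line.

    Assumes the key itself contains no tab or newline. json.dumps escapes
    tabs and newlines inside strings, so the JSON payload stays on one line.
    """
    payload = json.dumps(value, separators=(",", ":"), sort_keys=True)
    return key + "\t" + payload


def from_line(line):
    """Decode a 'key<TAB>json' line back into (key, value)."""
    key, _, payload = line.rstrip("\n").partition("\t")
    return key, json.loads(payload)


line = to_line("alice", record)
key, decoded = from_line(line)
```

Note that each line repeats every field name and stores numbers as text, which is presumably part of why this approach is not very efficient.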