What is it that makes you think JSON has to be inefficient?

- Repeated value parsing?
- Redundant data labels?
- Generic "parsing must be slow" prejudice?

On 5/22/08 1:54 PM, "Stuart Sierra" <[EMAIL PROTECTED]> wrote:

> Hello, I'm still getting my head around how Hadoop works. A survey
> question: what kind of serialization do you use to output structured
> data from your map/reduce jobs?
>
> When both key and value are primitive types, either TextOutputFormat
> or SequenceFileOutputFormat is easy. But what if you want to store a
> more complex data structure as the value?
>
> By "complex data structure," I mean some combination of lists/arrays,
> dictionaries/hashes, and primitives (int, float, string, boolean,
> null).
>
> I've tried using JSON to store structured data in TextOutputFormat,
> which works but is not very efficient. Any better suggestions?
>
> Thanks, all,
> -Stuart
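For what it's worth, here is a minimal, self-contained sketch of the alternative to JSON-in-Text: a compact binary encoding following the write(DataOutput)/readFields(DataInput) pattern that a custom Hadoop Writable uses. It deliberately uses only java.io (no Hadoop on the classpath) so it runs standalone; the RecordValue class and its fields are made up for illustration, and a real version would implement org.apache.hadoop.io.Writable and be stored as the value in a SequenceFile.

```java
import java.io.*;
import java.util.*;

// Hypothetical "complex" value: a string, a list of ints, and a
// string-to-double map -- the kind of structure the question describes.
public class RecordValue {
    String name;
    List<Integer> counts;
    Map<String, Double> scores;

    // Mirrors Writable.write(DataOutput): collections are length-prefixed.
    void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(counts.size());
        for (int c : counts) out.writeInt(c);
        out.writeInt(scores.size());
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeDouble(e.getValue());
        }
    }

    // Mirrors Writable.readFields(DataInput): read fields back in the same order.
    void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        int n = in.readInt();
        counts = new ArrayList<>();
        for (int i = 0; i < n; i++) counts.add(in.readInt());
        int m = in.readInt();
        scores = new HashMap<>();
        for (int i = 0; i < m; i++) scores.put(in.readUTF(), in.readDouble());
    }

    public static void main(String[] args) throws IOException {
        RecordValue v = new RecordValue();
        v.name = "example";
        v.counts = Arrays.asList(1, 2, 3);
        v.scores = new HashMap<>();
        v.scores.put("a", 0.5);

        // Round-trip through an in-memory buffer, the way Hadoop would
        // round-trip the value through a SequenceFile.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        v.write(new DataOutputStream(buf));

        RecordValue w = new RecordValue();
        w.readFields(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(w.name + " " + w.counts + " " + w.scores);
    }
}
```

Note there are no field names or delimiters on the wire, which is where the size win over repeated JSON labels comes from; the cost is that reader and writer must agree on the field order.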