All,

I am having trouble serializing to JSON from Pig scripts (storage). Here is what I've tried and failed with:

1. Pig 0.10+ PigStorage. Maps are assumed to be String to String, so heavily nested structures are not handled.
2. Hortonworks toJson UDF. Maps are not supported.
3. Twitter ElephantBird LzoJsonStorage. Arrays/Bags are not handled.

I wondered if anyone is using something to store output from Pig scripts as JSON, and whether they use maps. If so, how are you writing out JSON and what issues have you seen? If not, what structured format are you using and why? Avro? Thrift?

Historically, all our Pig jobs have produced more tabular results, so this has not been an issue. The input data is in JSON and we've used ElephantBird (from Twitter) to load it as a map.

Given the above experience, our only option is to use Pig's JsonLoader to load the JSON using a specified schema, but this will pin us to a single schema, and the data is not consistent (schemas evolve). Previously we could deal with this inside the script, but not if we define a single schema for the loaded data.

So I'm honestly reconsidering our use of JSON (which is a historical conversation in itself).

Cheers,
Simon

--
Simon Reavely
simon.reav...@gmail.com