[ https://issues.apache.org/jira/browse/PIG-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065961#comment-13065961 ]
Dmitriy V. Ryaboy commented on PIG-1914:
----------------------------------------

Very cool. Some quick code review notes:

Tiny typo here: "e = foreach d generate flatten(men#'value') as val;" -- that should read menu#'value'

{code}
boolean notDone = in.nextKeyValue();
if (!notDone) {
    return null;
}
{code}

Better:

{code}
if (!in.nextKeyValue()) {
    return null;
}
{code}

Parse exceptions: it's better to increment a counter and move on than to break on a bad input string. Throwing an exception kills the whole job. So maybe something like

{code}
t = null;
while (t == null && in.nextKeyValue()) {
    ...
}
return t;
{code}

In flatten_array, if the value is an array, you allocate a new bag, populate it recursively, and add the contents of the new bag to the old bag. Why not skip the object allocation and copy, and simply pass the original bag into the recursive call?

Also: are null values for keys just plain unsupported? You skip them.

setLocation: not that it really matters, but for consistency, you should use PigTextInputFormat instead of PigFileInputFormat here.

schema: probably makes sense to implement getSchema?

> Support load/store JSON data in Pig
> -----------------------------------
>
>                 Key: PIG-1914
>                 URL: https://issues.apache.org/jira/browse/PIG-1914
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Chao Tian
>         Attachments: PIG-1914.patch
>
>
> The JSON is a commonly used data storage format. It is popular for storing
> structured data, especially for JavaScript data exchange.
> Pig should have the ability to load/store JSON format data. I plan to write
> one for the piggy bank.
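
For illustration, the "increment a counter and move on" suggestion from the review notes could look roughly like the sketch below. It is self-contained and does not use the Hadoop or Pig APIs: SkippingReader, its Iterator<String> input, the Function parser, and the badRecords field are hypothetical stand-ins for the loader's RecordReader, its JSON parsing step, and a job counter. Integer::parseInt only plays the role of a parser that throws on bad input.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

/** Sketch only: the names and types here are assumptions, not the patch's code. */
final class SkippingReader<T> {
    private final Iterator<String> in;         // stand-in for the RecordReader ("in")
    private final Function<String, T> parser;  // stand-in for the JSON parsing step
    private long badRecords = 0;               // stand-in for a job counter

    public SkippingReader(Iterator<String> in, Function<String, T> parser) {
        this.in = in;
        this.parser = parser;
    }

    /** Returns the next parseable record, or null once the input is exhausted. */
    public T getNext() {
        T t = null;
        // Keep reading until one record parses or there is no more input.
        while (t == null && in.hasNext()) {
            String line = in.next();
            try {
                t = parser.apply(line);
            } catch (RuntimeException e) {
                badRecords++;  // count the bad record and move on
            }
        }
        return t;
    }

    public long getBadRecords() {
        return badRecords;
    }

    public static void main(String[] args) {
        SkippingReader<Integer> reader = new SkippingReader<>(
                Arrays.asList("1", "oops", "3").iterator(), Integer::parseInt);
        for (Integer v = reader.getNext(); v != null; v = reader.getNext()) {
            System.out.println(v);
        }
        System.out.println("bad records: " + reader.getBadRecords());
    }
}
```

In a real loader the count would presumably be reported through the framework's counter facilities rather than kept in a field; that wiring is left out here.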