Github user tzolov commented on the pull request:

    https://github.com/apache/incubator-hawq/pull/302#issuecomment-178608613
  
    @hornn, @adamjshook  
    As an experiment I've reimplemented/replaced the 
`JsonRecordReader` & `JsonStreamReader` with ideas and code borrowed from the 
[json-mapreduce](https://github.com/alexholmes/json-mapreduce) project.  I 
really like the result. It comes with a (sort of) JsonLexer/Parser that solves 
some of the shortcomings in the current implementation. For example, having the 
`identifier` appear in the values of nested object(s) is handled correctly. 
    Furthermore, the way the identifier is used is slightly different 
and more powerful. The identifier refers to a member name, which is used to 
determine the encapsulating object to return.  This is superior to 
what we have at the moment, as you can point to multiline JSON objects 
that don't have a parent identifier. 
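
    To make the member-name lookup concrete, here is a minimal sketch of the 
idea. It is not the json-mapreduce code and not what is in this PR: the names 
`EnclosingObjectSketch`, `findEnclosingObject`, `skipString` and 
`isFollowedByColon` are hypothetical, and it assumes the whole input is 
available as one `String` and that "encapsulating object" means the object 
that directly owns the first matching member. 

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the json-mapreduce or HAWQ implementation.
class EnclosingObjectSketch {

    /** Returns the text of the object that directly owns the first member
     *  named memberName, or null if there is none or the input is malformed. */
    static String findEnclosingObject(String json, String memberName) {
        Deque<Integer> openBraces = new ArrayDeque<>(); // offsets of unclosed '{'
        int targetDepth = -1; // nesting depth of the owning object, -1 until found
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '"') {
                int end = skipString(json, i);
                if (end < 0) return null; // unterminated string
                // Only a string followed by ':' is a member name; the same
                // text inside a value is skipped without matching.
                if (targetDepth < 0 && !openBraces.isEmpty()
                        && json.substring(i + 1, end).equals(memberName)
                        && isFollowedByColon(json, end + 1)) {
                    targetDepth = openBraces.size();
                }
                i = end + 1;
                continue;
            }
            if (c == '{') {
                openBraces.push(i);
            } else if (c == '}') {
                if (openBraces.isEmpty()) return null; // unbalanced braces
                int depthOfClosing = openBraces.size();
                int start = openBraces.pop();
                if (depthOfClosing == targetDepth) {
                    return json.substring(start, i + 1);
                }
            }
            i++;
        }
        return null;
    }

    /** Returns the offset of the quote that closes the string opened at openQuote, or -1. */
    static int skipString(String s, int openQuote) {
        for (int j = openQuote + 1; j < s.length(); j++) {
            char c = s.charAt(j);
            if (c == '\\') {
                j++; // skip the escaped character
            } else if (c == '"') {
                return j;
            }
        }
        return -1;
    }

    /** True if the first non-whitespace character at or after from is ':'. */
    static boolean isFollowedByColon(String s, int from) {
        int j = from;
        while (j < s.length() && Character.isWhitespace(s.charAt(j))) j++;
        return j < s.length() && s.charAt(j) == ':';
    }

    public static void main(String[] args) {
        String json = "{\"root\":[{\"record\":{\"created_at\":\"Fri\",\n"
                + "  \"text\":\"created_at appears in this value too\"}}]}";
        System.out.println(findEnclosingObject(json, "created_at"));
    }
}
```

    In this sketch the object holding `created_at` is returned even though it 
spans two lines and the same text also occurs inside a string value. 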
    
    Shall I add this change to this PR or to a separate one after this has been 
merged? 
    Personally I think it is better to add it now, as it changes the 
`identifier` semantics and it would be inconvenient for potential users to 
learn and then unlearn the old behavior. Alternatively we can drop the current 
`JsonRecordReader` from this PR and introduce the new code as an extension in 
another one. What do you think?



