Github user ssriniva123 commented on the issue:

    https://github.com/apache/drill/pull/518
  
    Apologies for getting back to this thread late; I got tied up with some 
issues at work.
    
    Paul,
    The JSON parser is not just a tokenizer. It keeps track of the JSON 
structure and understands aspects of it such as the root, array, and object 
contexts, and all parsing is done within that context.
    
    - We cannot keep track of {} accurately. For example, the counting JSON 
processor calls parser.skipChildren(), which tries to skip to the end of the 
JSON structure, but this can roll over to the next line when the JSON is 
malformed in the innermost sub-object; see the example below (missing " in 
the last JSON structure). JsonReader behaves similarly.
    
    {"balance": 1000.0,"num": 100,"is_vip": true,"name": 
"foo3","curr":{"denom":"pound","test":{"value  :false}}}
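
    To illustrate why the damage does not stay on the malformed line, here is 
a toy scanner. It is not Jackson and not the Drill code; the class and method 
names are made up for this sketch. It only tracks string state and brace 
depth, which is enough to show how an unterminated quote swallows the closing 
braces and spills into the following record.

```java
import java.util.ArrayList;
import java.util.List;

// Toy record splitter (hypothetical, NOT Jackson): tracks only
// "inside a string" state and brace depth.
class BraceScanDemo {

    // Returns every top-level {...} span found by counting braces,
    // ignoring braces that appear inside double-quoted strings.
    static List<String> splitRecords(String input) {
        List<String> records = new ArrayList<>();
        int depth = 0;
        int start = -1;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;          // character after a backslash
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;         // closing quote
                }
                continue;                     // braces inside strings are ignored
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}') {
                if (depth > 0) {
                    depth--;
                }
                if (depth == 0 && start >= 0) {
                    records.add(input.substring(start, i + 1));
                    start = -1;
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // The malformed record from the comment above (missing " after value).
        String bad = "{\"balance\": 1000.0,\"num\": 100,\"is_vip\": true,\"name\": "
                + "\"foo3\",\"curr\":{\"denom\":\"pound\",\"test\":{\"value  :false}}}";
        // A well-formed record on the next line (made up for this sketch).
        String good = "{\"balance\": 2000.0,\"name\": \"foo4\"}";
        System.out.println(splitRecords(good).size());
        System.out.println(splitRecords(bad + "\n" + good).size());
    }
}
```

    In this toy, the well-formed record alone is found, but once it follows 
the malformed line it is never emitted: the open string consumes the three 
closing braces and the newline, and the quote state stays inverted for the 
rest of the input.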
    
    - One possible solution is to rewind the input source to reset the stream, 
which is not recommended, and there is no guarantee that all streams support 
mark/reset semantics.
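
    A minimal sketch of the mark/reset concern, using only java.io. The 
helper name peekAndRewind is hypothetical and is not part of Drill. It shows 
that rewinding has to be guarded by markSupported(), and that a plain 
InputStream may simply refuse.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MarkResetDemo {

    // Hypothetical helper: reads up to n bytes, then rewinds the stream.
    // Returns null when the stream does not support mark/reset at all.
    static String peekAndRewind(InputStream in, int n) {
        if (!in.markSupported()) {
            return null;                      // cannot rewind this source
        }
        try {
            in.mark(n);                       // remember the current position
            byte[] buf = new byte[n];
            int read = in.read(buf, 0, n);
            in.reset();                       // go back to the marked position
            return read <= 0 ? "" : new String(buf, 0, read, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "{\"a\":1}\n{\"b\":2}".getBytes(StandardCharsets.UTF_8);

        // ByteArrayInputStream supports mark/reset.
        System.out.println(peekAndRewind(new ByteArrayInputStream(data), 7));

        // The InputStream base class reports markSupported() == false.
        InputStream plain = new InputStream() {
            @Override
            public int read() {
                return -1;
            }
        };
        System.out.println(peekAndRewind(plain, 7));
    }
}
```

    Wrapping a source in java.io.BufferedInputStream adds mark/reset, but only 
within the read limit passed to mark(), so it bounds how far back a reader 
could rewind.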
    
    Given where we are, I think the proposed solution works well for almost 
all malformed JSON.
    
    



