[ https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15525244#comment-15525244 ]

ASF GitHub Bot commented on DRILL-4653:
---------------------------------------

Github user ssriniva123 commented on the issue:

    https://github.com/apache/drill/pull/518
  
    Paul,
    The code you have listed is semantically equivalent to what I have
    already submitted in the pull request, and it will not handle all
    malformed JSON records. Also, the code for reporting the error records
    works correctly as long as the error is reported correctly by the parser.
    
    As I explained earlier, the JSON parser is not just a simple tokenizer;
    it keeps track of internal state, hence the issue. SerDes in Hive and
    similar systems work because they are record oriented, with clean record
    demarcation using a newline.
    
    One solution is to submit a patch to the Jackson parser to expose a
    method that skips to the next newline in the event of a parsing
    exception. This can be parameterized so that the behavior can be
    customized.
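
    To illustrate why newline-delimited records make recovery easy, here is a
    minimal sketch in Python. It is not Drill or Jackson code; the function
    name `read_json_lines` and the `skip_malformed` flag are hypothetical and
    only stand in for the kind of parameterized behavior described above. It
    assumes one JSON record per line, so a parse failure can be contained to
    that line and the reader resumes at the next newline.

```python
import json
from typing import Any, Iterable, Iterator, List, Optional, Tuple


def read_json_lines(
    lines: Iterable[str],
    skip_malformed: bool = True,
    errors: Optional[List[Tuple[int, str]]] = None,
) -> Iterator[Any]:
    """Yield one parsed value per non-empty line (hypothetical helper).

    Each line is parsed independently, so no parser state survives a
    failure: a bad record is dropped and reading resumes on the next line.
    """
    for line_no, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines between records
        try:
            yield json.loads(text)
        except json.JSONDecodeError as exc:
            if not skip_malformed:
                raise  # strict mode: first bad record stops the read
            if errors is not None:
                # remember where the bad record was and why it failed
                errors.append((line_no, exc.msg))


if __name__ == "__main__":
    sample = ['{"a": 1}\n', '{"a": 2,\n', '{"a": 3}\n']
    bad: List[Tuple[int, str]] = []
    for record in read_json_lines(sample, errors=bad):
        print(record)
    print("skipped lines:", [line_no for line_no, _ in bad])
```

    This only works because the record boundary is known independently of the
    JSON grammar. A streaming parser reading records that span lines has no
    such boundary to fall back on, which is why it would need a dedicated
    "skip to newline" hook.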



> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
>
>
> Currently a Drill query terminates when it first encounters an invalid JSON line.
> Drill has to continue progressing after ignoring the bad records. Something 
> similar to a setting such as (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
