[ https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15525244#comment-15525244 ]
ASF GitHub Bot commented on DRILL-4653:
---------------------------------------

Github user ssriniva123 commented on the issue:

    https://github.com/apache/drill/pull/518

    Paul,

    The code you have listed is semantically equivalent to what I have already submitted in the pull request, and it will not handle all malformed JSON records.

    Also, the code for reporting error records works correctly as long as the parser reports the error correctly. As I explained earlier, the JSON parser is not just a simple tokenizer; it keeps track of internal state, hence the issue.

    SerDes in Hive and similar systems work because they are record-oriented, with records cleanly demarcated by newlines.

    One solution is to submit a patch to the Jackson parser that exposes a method to skip to the next newline in the event of a parsing exception. This could be parameterized so that the behavior can be customized.

> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
>
>
> Currently, a Drill query terminates upon first encountering an invalid JSON line.
> Drill has to continue progressing after ignoring the bad records. Something
> similar to a setting of (ignore.malformed.json) would help.
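For illustration only, here is a minimal sketch of the record-oriented approach the comment above attributes to Hive SerDes: each line is treated as an independent record and handed to a fresh parse, so a malformed line cannot leave stale parser state behind. It uses Python's standard `json` module, not Jackson and not Drill's JSON reader. The function name `parse_json_lines` and the sample data are hypothetical, and the sketch assumes one complete JSON record per line.

```python
import json
from typing import Iterable, List, Tuple


def parse_json_lines(lines: Iterable[str]) -> Tuple[List[object], List[Tuple[int, str]]]:
    """Parse newline-delimited JSON, skipping malformed records.

    Hypothetical helper: assumes each record occupies exactly one line.
    Returns the parsed records and a list of (line_number, message) pairs
    for the lines that could not be parsed.
    """
    records: List[object] = []
    errors: List[Tuple[int, str]] = []
    for line_no, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines carry no record
        try:
            # Every line gets its own parse call, so no state from a
            # failed record carries over to the next one.
            records.append(json.loads(text))
        except json.JSONDecodeError as exc:
            # Record the bad line and keep going instead of aborting.
            errors.append((line_no, exc.msg))
    return records, errors


# Made-up sample input: the second line is truncated and therefore invalid.
sample = [
    '{"id": 1, "name": "a"}',
    '{"id": 2, "name": ',
    '{"id": 3, "name": "c"}',
]
records, errors = parse_json_lines(sample)
```

This only works because the newline serves as an unambiguous record boundary. A streaming parser reading records that span multiple lines has no such boundary to resynchronize on, which is the difficulty the comment describes.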