[ https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523997#comment-15523997 ]
ASF GitHub Bot commented on DRILL-4653:
---------------------------------------

GitHub user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/518

    As it turns out, the sample code shown was actually tested with a stock Jackson JSON parser: it does work. No parser changes are needed.

    The issue is not whether we can make the parser do what is needed: the code posted in the comment above demonstrated a solution. The issue is how we incorporate that code into the JSON parser to clean up partial records and prevent schema changes. When I have time, I'll investigate that question in greater depth.

    IMHO, without a proper fix, we should simply state that Drill does not support malformed JSON. If an input file might be incorrect, run it through a clean-up step before allowing Drill to scan it. Otherwise, we are opening the door to many hard-to-resolve bugs when people ask Drill to scan corrupt JSON: the result, without a proper fix, would be undefined -- which is worse than the current behavior, which simply fails the scan with an error.

    Let's follow up again after I (or someone else) have had a chance to figure out whether we can undo a partially built record. If we can do that, then we've got a path to a clean solution: recover the parser (as shown earlier) and discard the in-flight record (which we still need to research).

> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
>
>
> Currently, a Drill query terminates on the first invalid JSON line it encounters.
> Drill should ignore the bad records and continue processing. A setting
> along the lines of (ignore.malformed.json) would help.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
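For illustration only, the clean-up step suggested in the comment could look roughly like the Python sketch below. It is not part of Drill or of the pull request. It assumes the input holds one JSON record per line, as the issue description implies, and the function names `valid_json_lines` and `clean_file` are hypothetical.

```python
import json
from typing import Iterable, Iterator, TextIO


def valid_json_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the non-blank lines that parse as complete JSON values.

    Assumption: each record occupies exactly one line of the input.
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            json.loads(text)
        except json.JSONDecodeError:
            continue  # drop the malformed record and keep going
        yield text


def clean_file(src: TextIO, dst: TextIO) -> int:
    """Copy well-formed records from src to dst; return how many were kept."""
    kept = 0
    for text in valid_json_lines(src):
        dst.write(text + "\n")
        kept += 1
    return kept
```

Run over a file before the scan, this writes a copy containing only the parseable lines, so Drill never sees a partial record. It does nothing for records that span several lines, which is the harder case the comment leaves open.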