[ 
https://issues.apache.org/jira/browse/DRILL-4653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15523997#comment-15523997
 ] 

ASF GitHub Bot commented on DRILL-4653:
---------------------------------------

Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/518
  
    As it turns out, the sample code shown was actually tested with a stock 
Jackson JSON parser: it does work. No parser changes are needed.
    
    The issue is not whether we can make the parser do what is needed: the code 
posted in the comment above demonstrated a solution.
    
    The issue is how we incorporate that code into the JSON parser to clean up 
partial records and prevent schema changes. When I have time, I'll investigate 
that question in greater depth.
    
    IMHO, without a proper fix, we should simply state that Drill does not 
support malformed JSON. If an input file might be incorrect, run it through a 
clean-up step before allowing Drill to scan it. Otherwise, we are opening the 
door to many hard-to-resolve bugs when people ask Drill to scan corrupt JSON: 
the result, without a proper fix, would be undefined -- which is worse than the 
current behavior that simply fails the scan with an error.
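
A clean-up step of the kind suggested above could look like the following minimal sketch. It assumes newline-delimited JSON (one record per line), which the comment does not specify, and the function name `clean_json_lines` is hypothetical rather than anything shipped with Drill. It drops every line that does not parse as a complete JSON value.

```python
import json
from typing import Iterable, Iterator


def clean_json_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that parse as a complete JSON value.

    Assumption: the input is newline-delimited JSON, one record per line.
    """
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            json.loads(text)  # validate only; the parsed value is discarded
        except json.JSONDecodeError:
            continue  # drop the malformed record
        yield text
```

The surviving lines can then be written to a new file for Drill to scan, so the scan itself never sees corrupt input.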
    
    Let's follow up again after I (or someone) has had a chance to figure out 
if we can undo a partially built record. If we can do that, then we've got a 
path to a clean solution: recover the parser (as shown earlier) and discard the 
in-flight record (which still needs research).
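
To make the "discard the in-flight record" idea concrete, here is a toy sketch. It is not Drill's reader or its value vectors: `ColumnBatch` and `write_record` are hypothetical names, columns are plain Python lists, and a parse failure is simulated by a `ValueError` raised while fields are still being produced. On failure, the partial values are truncated and any column first created by the bad record is removed, so the bad record causes no schema change.

```python
from typing import Any, Dict, Iterable, List, Tuple


class ColumnBatch:
    """Toy columnar batch: one list per field, all of equal length."""

    def __init__(self) -> None:
        self.columns: Dict[str, List[Any]] = {}
        self.row_count = 0

    def write_record(self, fields: Iterable[Tuple[str, Any]]) -> bool:
        """Append one record; return False and undo it if `fields` raises."""
        added: List[str] = []  # columns first created by this record
        try:
            for name, value in fields:
                if name not in self.columns:
                    # Back-fill earlier rows with None for a new column.
                    self.columns[name] = [None] * self.row_count
                    added.append(name)
                col = self.columns[name]
                if len(col) > self.row_count:
                    col[self.row_count] = value  # repeated field: last wins
                else:
                    col.append(value)
        except ValueError:
            # Simulated parse error: discard the in-flight record.
            for name in added:
                del self.columns[name]  # undo the schema change
            for col in self.columns.values():
                del col[self.row_count:]  # truncate partial values
            return False
        # Commit: pad fields this record did not mention.
        for col in self.columns.values():
            if len(col) == self.row_count:
                col.append(None)
        self.row_count += 1
        return True
```

Whether Drill's real record writers can be rolled back this cheaply is exactly the open question; the sketch only shows what the desired behavior would be.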


> Malformed JSON should not stop the entire query from progressing
> ----------------------------------------------------------------
>
>                 Key: DRILL-4653
>                 URL: https://issues.apache.org/jira/browse/DRILL-4653
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - JSON
>    Affects Versions: 1.6.0
>            Reporter: subbu srinivasan
>             Fix For: Future
>
>
> Currently, a Drill query terminates upon first encountering an invalid JSON line.
> Drill has to continue progressing after ignoring the bad records. Something 
> similar to a setting of (ignore.malformed.json) would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
