[ 
https://issues.apache.org/jira/browse/NIFI-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523664#comment-14523664
 ] 

Ricky Saltzer commented on NIFI-551:
------------------------------------

Hey [~markap14] -

Unfortunately the API that this processor leverages does not have a way to 
obtain the actual record it failed to parse. To enable this, we'd have to make 
a change to the underlying Kite SDK library. The use case here is to at the 
very least give the user something to go off of if a record fails to parse, 
other than a counter that says "Hey we failed to parse _n_ records". I wouldn't 
be opposed to adding a third relationship called something along the lines of 
"errors" or "conversion errors", and using that to send the parse errors. 

The reason I didn't just go with sending the parse errors to the bulletin board 
is because if a file has a ton (thousands to millions) of invalid records, 
there would be an overwhelming number of error messages. One alternative 
approach could be to de-dupe the conversion failures by field name, so I only 
alert for one conversion failure per field. I could hold off till the end of 
the processing to alert, and say something along the lines of "<conversion 
failure> <_n_ other failures like this one>", or something similar...

For example:
{code}
Cannot convert field id [Cannot convert to long: "120V"] (322 similar failures)
Cannot convert field color [Cannot convert to string: 15.23] (933 similar failures)
{code}
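
A rough sketch of how that de-duping could work, in plain Java. The `FailureSummary` class and its `record`/`summarize` methods are hypothetical names made up for illustration; they are not part of NiFi or the Kite SDK. The sketch assumes the processor can hand over a field name and an error message for each failed conversion. It keeps only the first message seen per field plus a count, and builds the summary lines once processing is done:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: collapses per-record conversion failures into one line per field.
final class FailureSummary {

    // Holds the first message seen for a field and the total failures for that field.
    private static final class Entry {
        final String firstMessage;
        long count = 1;

        Entry(String firstMessage) {
            this.firstMessage = firstMessage;
        }
    }

    // LinkedHashMap keeps fields in the order their first failure was seen.
    private final Map<String, Entry> byField = new LinkedHashMap<>();

    // Call once per failed conversion while the file is being processed.
    void record(String field, String message) {
        Entry entry = byField.get(field);
        if (entry == null) {
            byField.put(field, new Entry(message));
        } else {
            entry.count++;
        }
    }

    // Call once at the end of processing; returns one summary line per field.
    List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Entry> e : byField.entrySet()) {
            long similar = e.getValue().count - 1;
            StringBuilder line = new StringBuilder()
                    .append("Cannot convert field ").append(e.getKey())
                    .append(" [").append(e.getValue().firstMessage).append("]");
            if (similar > 0) {
                line.append(" (").append(similar)
                    .append(similar == 1 ? " similar failure)" : " similar failures)");
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

Memory stays bounded by the number of distinct fields rather than the number of bad records, which is the point of alerting once per field instead of once per failure.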

Thoughts?

Ricky 

> Improve error handling for ConvertJSONToAvro processor
> ------------------------------------------------------
>
>                 Key: NIFI-551
>                 URL: https://issues.apache.org/jira/browse/NIFI-551
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Ricky Saltzer
>              Labels: patch, review
>             Fix For: 0.1.0
>
>         Attachments: NIFI-551.1.patch, NIFI-551.2.patch
>
>
> Currently, if the ConvertJSONToAvro processor fails to process an individual 
> record, a counter is incremented, but no alerts are produced. It would be 
> better to notify the bulletin board that we've failed to process some records 
> for a flowfile. Further, we should stream the records we fail to process down 
> the failure relationship for further inspection. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
