[ 
https://issues.apache.org/jira/browse/NIFI-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604480#comment-14604480
 ] 

ASF GitHub Bot commented on NIFI-738:
-------------------------------------

Github user joewitt commented on the pull request:

    https://github.com/apache/incubator-nifi/pull/69#issuecomment-116186741
  
    Builds cleanly including the contribution check for licenses and check 
style.
    
    ConvertCSVToAvro
    - Line 258 should use parameters passed in via object[] rather than string 
concatenation.
    
    - What happens if you have processed an entire input and produced no 
successful results due to a DatasetRecordExceptions?  I think it would forward 
an empty flow file since nothing would have been written to it.  Is that what 
you want?  Such a case seems like a FAILURE to me.
    
    - Consider removing the incompatible relationship and simply adding an 
attribute to the success flow file which lists the reasons or add an attribute 
per reason for difficulty converting the original.  If you will keep the 
incompatible relationship then I’d recommend sending it a clone of the original 
item and then adding the relationship.  That would make troubleshooting or 
analysis of that object much easier.  In any case be very careful about how 
large these attributes could become.  If you’re storing a series of stack 
traces this could become unwieldy and memory inefficient really fast.
    
    ConvertJSONToAvro
    - Line 160 should use parameters passed in via object[] rather than string 
concatenation.
    
    - Same comments as Convert CSV to Avro.
    
    FailureTracker
    - This class is a cool idea.  My concern with it is that it appears to 
store all the raw reasons which could be quite significant in large files with 
*many* thousands of lines.  This could create issues with the heap, cause 
excessive GC, etc.  Consider refactoring this class to both immediately store 
the summarized representation and to bound the total number of reasons it will 
store.  It can just ignore additional errors as they arrive.  This will allow 
it to behave well in terms of memory consumption even in the presence of 
massive failures as everything will be bounded.
    
    Thanks
    Joe


> Do not write conversion error messages to flow file content
> -----------------------------------------------------------
>
>                 Key: NIFI-738
>                 URL: https://issues.apache.org/jira/browse/NIFI-738
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 0.1.0
>            Reporter: Ryan Blue
>            Assignee: Ryan Blue
>             Fix For: 0.2.0
>
>
> NIFI-551 extended the error handling provided by the ConvertJSONToAvro 
> processor, but wrote error messages as the content of a file sent on the 
> failure relationship. I think the right thing to do is to output the bad 
> records as the file content and put the error messages in the outgoing 
> attributes.
> NIFI-551 wasn't included in 0.1.0, so changing this behavior is safe. 
> Consequently, I'd like to get this fix into 0.2.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to