[ https://issues.apache.org/jira/browse/NIFI-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876494#comment-16876494 ]

Evan Reynolds commented on NIFI-6367:
-------------------------------------

[~kefevs] - that did help! Thank you!

It didn't throw one of the handled exceptions in your case; it threw an exception 
type that tells NiFi to reprocess the flowfile. 

I added two extra error checks: a null check (as I could see that happen when 
testing) and a check on that exception to see whether we should really retry 
or not -
[https://github.com/apache/nifi/pull/3562]

I think that will fix it up.
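
For reference, here is a minimal sketch of the two checks described above. It is 
not the actual PR #3562 diff; the class and method names are hypothetical, and it 
assumes the AWS SDK v1 types that FetchS3Object works with:

{code:java}
import com.amazonaws.AmazonClientException;
import com.amazonaws.services.s3.model.S3Object;

// Hypothetical helper, not the actual code in PR #3562. It illustrates the two
// guards described above: bail out on a null S3 object, and only allow a retry
// when the AWS SDK reports the error as transient.
public final class FetchGuards {

    private FetchGuards() {
    }

    // getObject(...) can return null (e.g. when request constraints are not
    // met); dereferencing it would throw an NPE that rolls the session back
    // and re-queues the flowfile, so treat null as a plain failure instead.
    public static boolean isUsable(final S3Object s3Object) {
        return s3Object != null;
    }

    // Ask the SDK whether the failure is worth retrying. Non-retryable errors
    // (such as a download that keeps failing the same way) should be routed to
    // the failure relationship rather than rolled back into the input queue.
    public static boolean shouldRetry(final AmazonClientException e) {
        return e.isRetryable();
    }
}
{code}

With checks along those lines, a null object or a non-retryable exception gets 
routed to failure instead of the framework rolling the session back and putting 
the flowfile back at the head of the queue.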

> FetchS3Processor responds to md5 error on download by doing download again, 
> again, and again
> --------------------------------------------------------------------------------------------
>
>                 Key: NIFI-6367
>                 URL: https://issues.apache.org/jira/browse/NIFI-6367
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.7.1
>         Environment: NiFi (CentOS 7.2) with FetchS3Object running against an S3 
> environment (non-public). The environment / S3 had errors that introduced md5 
> errors on under 0.5% of downloads. Downloads with md5 errors accumulated in the 
> input queue of the processor.
>            Reporter: Kefevs Pirkibo
>            Assignee: Evan Reynolds
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> (This is 6 months old, but I don't see changes in the relevant parts of the 
> code, though I might be mistaken. This might be hard to replicate, so I suggest 
> a code wizard check whether this is still a problem.)
> Case: NiFi running with FetchS3Object processor(s) against an S3 environment 
> (non-public). The environment and S3 together had hardware errors that resulted 
> in sporadic md5 errors on the same files over and over again. Each md5 error 
> resulted in an unhandled AmazonClientException, and the file was downloaded yet 
> again (reverted to the input queue, first in line). In our case this was 
> identified after a number of days, with substantial bandwidth usage. It did not 
> help that the FetchS3Object processors were running with multiple instances, 
> and after days the files with bad md5 checksums had accumulated for continuous 
> download.
> Suggest: Someone code-savvy check what happens to files that are downloaded 
> with a bad md5; if they are reverted to the queue due to an uncaught exception 
> or other means, then this is still a potential problem.


