[ https://issues.apache.org/jira/browse/NIFI-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16876494#comment-16876494 ]
Evan Reynolds commented on NIFI-6367:
-------------------------------------

[~kefevs] - that did help! Thank you!

It didn't throw the handled exceptions in your case; it threw an exception type that tells NiFi to reprocess the flowfile. I added two extra error checks: a null check (as I could see that happen when testing), and a check of that exception to see whether we should really retry or not - [https://github.com/apache/nifi/pull/3562]

I think that will fix it up.

> FetchS3Processor responds to md5 error on download by doing download again,
> again, and again
> --------------------------------------------------------------------------------------------
>
>                 Key: NIFI-6367
>                 URL: https://issues.apache.org/jira/browse/NIFI-6367
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.7.1
>         Environment: NIFI (CentOS 7.2) with FetchS3Object running towards S3
> environment (non public). Environment / S3 had errors that introduced md5
> errors on under 0.5% of downloads. Downloads with md5 errors accumulated in the
> input queue of the processor.
>            Reporter: Kefevs Pirkibo
>            Assignee: Evan Reynolds
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> (6 months old, but I don't see changes in the relevant parts of the code, though
> I might be mistaken. This might be hard to replicate, so I suggest a code
> wizard check whether this is still a problem.)
> Case: NIFI running with FetchS3Object processor(s) towards S3 environment (non
> public). The environment and S3 had, in combination, hardware errors that
> resulted in sporadic md5 errors on the same files over and over again. Md5
> errors resulted in an unhandled AmazonClientException, and the file was
> downloaded yet again. (Reverted to the input queue, first in line.) In our case
> this was identified after a number of days, with substantial bandwidth usage.
> It did not help that the FetchS3Objects were running with multiple
> instances, and after days had accumulated the bad md5 checksum files for
> continuous download.
> Suggest: Someone code savvy should check what happens to files that are
> downloaded with a bad md5. If they are reverted to the queue due to an uncaught
> exception or other means, then this is still a potential problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)