[ 
https://issues.apache.org/jira/browse/NIFI-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15450709#comment-15450709
 ] 

Joseph Gresock commented on NIFI-2631:
--------------------------------------

I'm totally fine with committing after every batch -- I tend to leave the 
option there when modifying someone else's code, since I'm not sure if they had 
a use case I hadn't thought of.  But if Adam agrees, I'd say let's just make it 
part of the behavior.

> ListS3 improvements: "Use versions" and "Commit mode"
> -----------------------------------------------------
>
>                 Key: NIFI-2631
>                 URL: https://issues.apache.org/jira/browse/NIFI-2631
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Joseph Gresock
>            Assignee: Joseph Gresock
>            Priority: Minor
>             Fix For: 1.1.0, 0.8.0
>
>
> Our team needs to be able to list individual versions in S3.  We also ran 
> into a use case where a bucket with many objects (over 1 million in our case) 
> seemed to cause ListS3 to run forever.  The S3 list command finished in a few 
> minutes, but we believe it was taking a very long time for NiFi to commit all 
> the flow files at once.
> To handle this use case, we added a Commit Mode property to ListS3 that 
> allows you to specify whether to commit "Per page" or "Once".  This has 
> proven to correctly emit the flow files as the S3 paging progresses.
> We also implemented support for S3 List Versions, which adds the 
> "s3.version" and "s3.isLatest" attributes where applicable.  The "s3.version" 
> attribute can in turn be used in the FetchS3Object processor.
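
For illustration, the difference between the two commit modes described above can be sketched in plain Java. This is not the ListS3 code and uses neither the NiFi API nor the AWS SDK: `CommitModeSketch`, `CommitMode`, `FakeSession`, and `list` are hypothetical names, and each "page" is just an in-memory list of keys standing in for one page of S3 listing results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommitModeSketch {

    /** Hypothetical stand-in for the two Commit Mode values named in the issue. */
    enum CommitMode { PER_PAGE, ONCE }

    /** Hypothetical stand-in for a session: buffers keys until commit() is called. */
    static class FakeSession {
        private final List<String> pending = new ArrayList<>();
        /** Number of keys released by each commit, in order. */
        final List<Integer> commitSizes = new ArrayList<>();

        void transfer(String key) {
            pending.add(key);
        }

        void commit() {
            commitSizes.add(pending.size());
            pending.clear();
        }
    }

    /** Walks the pages in order and commits either after each page or once at the end. */
    static FakeSession list(List<List<String>> pages, CommitMode mode) {
        FakeSession session = new FakeSession();
        for (List<String> page : pages) {
            for (String key : page) {
                session.transfer(key);
            }
            if (mode == CommitMode.PER_PAGE) {
                session.commit(); // keys from this page become visible downstream now
            }
        }
        if (mode == CommitMode.ONCE) {
            session.commit(); // nothing is visible until the whole listing is done
        }
        return session;
    }

    public static void main(String[] args) {
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d"),
                Arrays.asList("e"));
        System.out.println(list(pages, CommitMode.PER_PAGE).commitSizes); // prints [2, 2, 1]
        System.out.println(list(pages, CommitMode.ONCE).commitSizes);     // prints [5]
    }
}
```

With "Per page", work is released in page-sized chunks as the listing progresses; with "Once", a single commit has to handle every listed object, which is the situation the issue associates with the very long run on a bucket of over 1 million objects.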



