[ 
https://issues.apache.org/jira/browse/NIFI-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341760#comment-14341760
 ] 

Mark Payne edited comment on NIFI-25 at 3/14/15 9:11 PM:
---------------------------------------------------------

I'm not sure about the ListS3. I can definitely see the value of it. However, 
it requires that the processor maintain a significant amount of state about 
what it has seen. This is not cluster friendly at all. It also requires 
continually pulling a potentially huge listing to see if anything has changed.

I think we should instead push users to configure S3 to add a notification to 
SQS when a new object is placed in an S3 bucket. We can then have a GetSQS 
processor to detect that an item was added and then fetch the contents via 
GetS3/FetchS3/RetrieveS3. This is a much more scalable approach and handles 
backpressure well.
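
As a rough illustration of the second half of that flow, here is a minimal 
sketch in Python of turning the body of one SQS message into the list of 
objects to fetch. It assumes the queue receives the JSON event notifications 
that S3 publishes (a top-level "Records" array with the bucket name and a 
URL-encoded object key). The function name extract_s3_objects, the sample 
bucket, and the sample keys are hypothetical, and the actual receive from SQS 
and fetch from S3 are left out.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(message_body):
    """Return (bucket, key) pairs for each object-created record.

    message_body is assumed to be the JSON text of an S3 event
    notification as delivered in an SQS message body.
    """
    event = json.loads(message_body)
    objects = []
    # A body without "Records" simply yields no work.
    for record in event.get("Records", []):
        # Only newly created objects are of interest here.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Keys are URL-encoded in the notification ('+' stands for a space).
        key = unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical message body for illustration only.
SAMPLE_BODY = json.dumps({
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/report+2015.csv", "size": 1024},
            },
        },
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "incoming/old.csv"},
            },
        },
    ]
})

if __name__ == "__main__":
    for bucket, key in extract_s3_objects(SAMPLE_BODY):
        print(bucket, key)
```

Each pair returned is one unit of work for the fetch step, so the processor 
holds no listing state of its own: the queue is the only record of what is 
still outstanding.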


was (Author: markap14):
I'm not sure about the ListS3. I can definitely see the value of it. However, 
it requires that the processor maintain a significant amount of state about 
what it has seen. This is not cluster friendly at all. It also requires 
continually pulling a potentially huge listing to see if anything has changed.

I think we should instead push users to configure S3 to add a notification to 
SNS when a new object is placed in an S3 bucket. We can then have a GetSNS 
processor to detect that an item was added and then fetch the contents via 
GetS3/FetchS3/RetrieveS3. This is a much more scalable approach and handles 
backpressure well.

> Create processor set to support interaction with Amazon S3
> ----------------------------------------------------------
>
>                 Key: NIFI-25
>                 URL: https://issues.apache.org/jira/browse/NIFI-25
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>            Reporter: Joseph Witt
>            Priority: Minor
>
> Determine the appropriate set of processors to interact fully with Amazon's 
> S3.  
> Might need:
> - ListS3
> - RetrieveS3 or GetS3
> - PutS3
> May be able to replace all of these simply with combinations of use of 
> InvokeHTTP.  But need to ensure a high quality user experience as well so it 
> isn't only a power user feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
