David,

Can you clarify which part of the FetchS3Object code looks problematic to
you?  From a quick look, I found one use of S3Object in FetchS3Object.java,
line ~106:

        try (final S3Object s3Object = client.getObject(request)) {
            flowFile = session.importFrom(s3Object.getObjectContent(), flowFile);
            attributes.put("s3.bucket", s3Object.getBucketName());

I believe declaring the variable in the try-with-resources header guarantees
that it is closed when the block exits, whether normally or via an exception,
but I'm not 100% on all the fine print with that.  Is this the code you are
referring to, and does it not work as I hope?
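
To illustrate what I mean, here is a minimal sketch of the try-with-resources
behavior I'm relying on. It does not use the AWS SDK at all: TrackedResource
is a hypothetical stand-in for S3Object that just records when close() is
called, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class TryWithResourcesDemo {

    // Hypothetical stand-in for S3Object: it records when close() runs.
    static class TrackedResource implements AutoCloseable {
        private final List<String> events;

        TrackedResource(List<String> events) {
            this.events = events;
            events.add("open");
        }

        // Simulates reading the content, optionally failing part-way.
        String read(boolean fail) {
            if (fail) {
                throw new IllegalStateException("simulated read failure");
            }
            events.add("read");
            return "content";
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    // Declares the resource in the try header, as the FetchS3Object snippet does.
    static List<String> run(boolean fail) {
        final List<String> events = new ArrayList<>();
        try (final TrackedResource resource = new TrackedResource(events)) {
            resource.read(fail);
        } catch (IllegalStateException e) {
            // The resource has already been closed by the time this runs.
            events.add("caught");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [open, read, close]
        System.out.println(run(true));  // [open, close, caught]
    }
}
```

In both runs close() is called, and in the failing run it is called before the
catch block executes. Assuming S3Object behaves like any other Closeable, the
same should hold for the processor code above.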

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L106

Thanks,

James


On Wed, Mar 22, 2017 at 12:41 PM, David Hesson <dh.lo...@gmail.com> wrote:

> Greetings,
>
> In investigating a connection pool issue we were having during development,
> I was checking the FetchS3Object code to see how it reads content from S3.
> I don't see a close()
> <http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/S3Object.html#close-->
> invocation
> on the S3Object in the FetchS3Object processor. I believe this can lead to
> leaks on that object.
>
> We were seeing logs like the following after trying to process some 90k
> objects from S3:
> INFO [Timer-Driven Process Thread-55] com.amazonaws.http.AmazonHttpClient
> Unable to execute HTTP request: Timeout waiting for connection from pool
>
> Is the S3Object not closed because the stream content is lazily loaded
> later in the flow (when accessed)? I didn't check the processSession
> implementation which reads the input stream. Just figured I'd ask and see
> if you all were aware, or that this is for some reason by design.
>
> Thanks,
> dh
>
