[ 
https://issues.apache.org/jira/browse/HADOOP-15460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476991#comment-16476991
 ] 

Stephan Ewen commented on HADOOP-15460:
---------------------------------------

Thanks Steve. Yes, another big part was the latency and cost that the 
additional HEAD requests add to each PUT request. This becomes an issue when 
frequently creating small (often short-lived) files, as we do when writing 
checkpoints in Flink.

> S3A FS to add "s3a:no-existence-checks" to the builder file creation option 
> set
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-15460
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15460
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> As promised to [~StephanEwen]: add an s3a-specific option to the builder API 
> for file creation that causes all existence checks to be skipped.
> This
> # eliminates a few hundred milliseconds of latency
> # avoids any caching of negative HEAD/GET responses in the S3 load balancers.
> Callers will be expected to know what they are doing.
> FWIW, we are doing some PUT calls in the committer which bypass this stuff, 
> for the same reason. If you've just created a directory, you know there's 
> nothing underneath, so no need to check.
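
To make the request savings concrete, here is a minimal sketch of the idea using a toy in-memory store that counts simulated HEAD and PUT requests. It is not the S3A implementation and does not use the Hadoop API: CountingStore, CreateFileBuilder and their methods are hypothetical names invented for this illustration, the option key is simply the one proposed in the issue title, and the two HEAD probes per create are a simplifying assumption about what the existence checks cost.

```java
import java.util.HashMap;
import java.util.Map;

public final class NoExistenceChecksDemo {

    /** Toy object store (hypothetical) that counts simulated requests. */
    static final class CountingStore {
        private final Map<String, byte[]> objects = new HashMap<>();
        int headRequests = 0;
        int putRequests = 0;

        /** Simulates a HEAD request: one round trip asking whether a key exists. */
        boolean head(String key) {
            headRequests++;
            return objects.containsKey(key);
        }

        /** Simulates a PUT request: one round trip writing or replacing the object. */
        void put(String key, byte[] data) {
            putRequests++;
            objects.put(key, data.clone());
        }
    }

    /** Hypothetical create-file builder with an opt-in "skip checks" option. */
    static final class CreateFileBuilder {
        // Option key as proposed in the issue title; assumed here, not a confirmed API.
        static final String NO_EXISTENCE_CHECKS = "s3a:no-existence-checks";

        private final CountingStore store;
        private final String key;
        private boolean skipChecks = false;

        CreateFileBuilder(CountingStore store, String key) {
            this.store = store;
            this.key = key;
        }

        /** Sets an optional flag; unrecognised option names are ignored. */
        CreateFileBuilder opt(String name, boolean value) {
            if (NO_EXISTENCE_CHECKS.equals(name)) {
                skipChecks = value;
            }
            return this;
        }

        /** Writes the object, probing the destination first unless checks are skipped. */
        void write(byte[] data) {
            if (!skipChecks) {
                // Assumed default path: probe for a file and for a directory marker.
                boolean fileExists = store.head(key);
                boolean dirExists = store.head(key + "/");
                if (fileExists || dirExists) {
                    throw new IllegalStateException("already exists: " + key);
                }
            }
            // With checks skipped, the caller vouches that the path is free.
            store.put(key, data);
        }
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();

        // Default create: two HEAD probes, then one PUT.
        new CreateFileBuilder(store, "checkpoints/chk-1/state").write(new byte[] {1});

        // Opt-in create: the PUT only.
        new CreateFileBuilder(store, "checkpoints/chk-2/state")
                .opt(CreateFileBuilder.NO_EXISTENCE_CHECKS, true)
                .write(new byte[] {2});

        System.out.println("HEAD=" + store.headRequests + " PUT=" + store.putRequests);
    }
}
```

In this toy model the run prints HEAD=2 PUT=2: the default create pays for two probes, the opt-in create pays for none. The trade-off is that a create with checks skipped silently overwrites whatever is already at the key, which is why callers are expected to know the path is free.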


