Steve Loughran created HADOOP-19295:
---------------------------------------
Summary: S3A: fs.s3a.connection.request.timeout too low for large
uploads over slow links
Key: HADOOP-19295
URL: https://issues.apache.org/jira/browse/HADOOP-19295
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.4.0, 3.4.1
Reporter: Steve Loughran
Assignee: Steve Loughran
The value of {{fs.s3a.connection.request.timeout}} (default = 60s) is too low
for large uploads over slow connections.
I suspect something changed between the v1 and v2 SDK versions so that PUT
requests were exempt from the normal timeouts. They are not, and this now
surfaces as failures to upload 1+ GB files over slower network connections.
Smaller files (for example 128 MB) work.
The parallel queuing of writes in S3ABlockOutputStream helps create this
problem: it queues multiple blocks at the same time, so the per-block bandwidth
becomes the available bandwidth divided by the number of active blocks; four
blocks cuts the per-block capacity to a quarter.
The fix is straightforward: use a much bigger timeout. I'm going to propose 15
minutes. We need to strike a balance between giving uploads enough time and
still letting other requests fail promptly when they time out.
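Until the default changes, the timeout can be raised per job. A minimal sketch
of doing that programmatically; the bucket URI is a placeholder, "15m" matches
the value proposed above and assumes a release that parses duration suffixes
(otherwise a plain millisecond value such as "900000" should work):

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class S3ATimeoutWorkaround {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Raise the per-request timeout well above the 60s default.
    conf.set("fs.s3a.connection.request.timeout", "15m");  // or "900000" (ms)
    // Large uploads through this filesystem instance get the longer timeout.
    FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf);
  }
}
{code}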
I do worry about other consequences; we've found that the timeout exception is
happy to hide the underlying causes of retry failures, so in fact a bigger
timeout may be better for everything except a server which hangs after the
HTTP request has been initiated.
Too bad we can't alter the timeout for different requests.
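For what it's worth, the v2 SDK itself does expose per-request overrides, so a
long timeout could in principle be applied only to the big PUT/multipart
requests if the connector ever wired it through. A purely illustrative sketch,
not something S3A does today; bucket and key names are placeholders:

{code:java}
import java.time.Duration;
import software.amazon.awssdk.awscore.AwsRequestOverrideConfiguration;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class PerRequestTimeoutSketch {
  // Build a PUT request carrying its own, longer timeout settings.
  static PutObjectRequest longTimeoutPut() {
    AwsRequestOverrideConfiguration longTimeout =
        AwsRequestOverrideConfiguration.builder()
            .apiCallTimeout(Duration.ofMinutes(15))
            .apiCallAttemptTimeout(Duration.ofMinutes(15))
            .build();
    return PutObjectRequest.builder()
        .bucket("example-bucket")        // hypothetical bucket
        .key("big/file.bin")             // hypothetical key
        .overrideConfiguration(longTimeout)
        .build();
  }
}
{code}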