[ https://issues.apache.org/jira/browse/HADOOP-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939600#comment-14939600 ]

Thomas Demoor commented on HADOOP-11684:
----------------------------------------

S3a has two modes for uploading:
* fs.s3a.fast.upload=false (default): S3AOutputStream.java
** files are buffered to local disk first; the upload to S3 is initiated on 
fs.close()
** similar behaviour to s3n and other third-party filesystems
** downsides: limited by local disk throughput and remaining local disk space, 
delayed start of the upload
* fs.s3a.fast.upload=true: S3AFastOutputStream.java
** Hadoop writes are buffered in memory; once the written data exceeds the 
threshold, a multipart upload is initiated, uploading multiple parts in 
parallel in different *threads* (as soon as the data is in memory)
** EMR probably does something similar
** in this mode, fs.s3a.multipart.size should be set to something like 64 or 
128 MB, similar to the HDFS block size
** downsides: buffers data in memory inside the JVM (~ fs.s3a.multipart.size * 
(fs.s3a.threads.max + fs.s3a.max.total.tasks + 1)); HADOOP-12387 will improve 
memory management
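
The worst-case buffer estimate above can be sketched as a quick calculation; a minimal illustration, where the values for fs.s3a.threads.max and fs.s3a.max.total.tasks are examples, not the S3a defaults:

```java
// Hypothetical helper illustrating the worst-case in-memory buffer bound:
// multipart.size * (threads.max + max.total.tasks + 1)
public class FastUploadMemory {
    static long estimateBufferBytes(long partSize, int threadsMax, int maxTotalTasks) {
        // one part buffer per active upload thread, one per queued task,
        // plus the part currently being filled by the writer
        return partSize * (threadsMax + maxTotalTasks + 1);
    }

    public static void main(String[] args) {
        long partSize = 64L * 1024 * 1024;   // fs.s3a.multipart.size = 64 MB
        int threadsMax = 10;                 // example fs.s3a.threads.max
        int maxTotalTasks = 5;               // example fs.s3a.max.total.tasks
        long bytes = estimateBufferBytes(partSize, threadsMax, maxTotalTasks);
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints "1024 MB"
    }
}
```

Even these modest example values already pin up to 1 GB of heap, which is why larger thread or task counts get risky.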

In fast mode, more threads / queued parts improve parallelism but require 
additional memory for buffer space. Setting max.total.tasks=1000 reliably 
drives the JVM out of memory here, as do applications that write files from 
separate threads (with CallerRuns, but not with the blocking threadpool). In 
default mode, the threadpool is used by the AWS SDK TransferManager.
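
The CallerRuns behaviour mentioned above can be demonstrated with a plain ThreadPoolExecutor; a minimal sketch, not S3a code:

```java
import java.util.concurrent.*;

// Illustrates CallerRunsPolicy: once the pool and its queue are full, the
// submitting (client) thread runs the rejected task itself, so the writer
// keeps producing buffered parts instead of being blocked.
public class CallerRunsDemo {
    static String threadRunningThirdTask() throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),             // queue holds one task
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        String[] thirdThread = new String[1];
        pool.execute(() -> {                             // occupies the only worker
            try { release.await(); } catch (InterruptedException ignored) { }
        });
        pool.execute(() -> { });                         // fills the queue
        pool.execute(() ->                               // rejected: runs on the caller
                thirdThread[0] = Thread.currentThread().getName());
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return thirdThread[0];
    }

    public static void main(String[] args) throws Exception {
        // the third task runs on the submitting thread itself
        System.out.println(threadRunningThirdTask()
                .equals(Thread.currentThread().getName())); // prints "true"
    }
}
```

Note that with CallerRuns the part still has to be buffered before submission, so it relieves queue pressure without bounding memory.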

Indeed, the blocking threadpool is non-trivial (semaphores, etc.) and thus 
higher-risk. Is there similar code in HDFS we could inspect or reuse?
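
A minimal sketch of the semaphore-based approach, in the spirit of the linked S4 BlockingThreadPoolExecutorService; names and structure here are illustrative, not the actual patch:

```java
import java.util.concurrent.*;

// Bounded executor whose submit() blocks the caller once all permits are
// taken, instead of throwing RejectedExecutionException.
public class BoundedExecutor {
    private final ExecutorService delegate;
    private final Semaphore permits;

    public BoundedExecutor(int threads, int queuedTasks) {
        this.delegate = Executors.newFixedThreadPool(threads);
        // one permit per worker thread plus each allowed queued task
        this.permits = new Semaphore(threads + queuedTasks);
    }

    public Future<?> submit(Runnable task) throws InterruptedException {
        permits.acquire();                  // blocks the client when full
        try {
            return delegate.submit(() -> {
                try {
                    task.run();
                } finally {
                    permits.release();      // free a slot when the part is done
                }
            });
        } catch (RejectedExecutionException e) {
            permits.release();              // never leak a permit
            throw e;
        }
    }

    public void shutdownAndWait() throws InterruptedException {
        delegate.shutdown();
        delegate.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

The tricky parts are exactly the permit bookkeeping shown in the try/finally blocks; a missed release deadlocks all future writers, which is why this is higher-risk than a plain rejecting pool.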


> S3a to use thread pool that blocks clients
> ------------------------------------------
>
>                 Key: HADOOP-11684
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11684
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Thomas Demoor
>            Assignee: Thomas Demoor
>         Attachments: HADOOP-11684-001.patch, HADOOP-11684-002.patch, 
> HADOOP-11684-003.patch
>
>
> Currently, if fs.s3a.max.total.tasks are queued and another (part) upload 
> wants to start, a RejectedExecutionException is thrown. 
> We should use a threadpool that blocks clients, nicely throttling them, 
> rather than throwing an exception, e.g. something similar to 
> https://github.com/apache/incubator-s4/blob/master/subprojects/s4-comm/src/main/java/org/apache/s4/comm/staging/BlockingThreadPoolExecutorService.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)