[ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840542#comment-15840542 ]
ASF GitHub Bot commented on HADOOP-13560:
-----------------------------------------

Github user steveloughran commented on the issue:

    https://github.com/apache/hadoop/pull/130

    Can you comment on that in a JIRA, not a PR? Thanks

> S3ABlockOutputStream to support huge (many GB) file writes
> ----------------------------------------------------------
>
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>             Fix For: 2.8.0, 3.0.0-alpha2
>
>         Attachments: HADOOP-13560-branch-2-001.patch, HADOOP-13560-branch-2-002.patch, HADOOP-13560-branch-2-003.patch, HADOOP-13560-branch-2-004.patch
>
>
> There are two output stream mechanisms in Hadoop 2.7.x, neither of which handles massive multi-GB files well.
>
> "Classic": buffer everything to HDD until the close() operation; time to close becomes O(data), as does the disk space required. It fails to exploit idle bandwidth, and on EC2 VMs without much HDD capacity (especially when competing with HDFS storage), it can fill up the disk.
>
> {{S3AFastOutputStream}} uploads data in partition-sized blocks, buffering via byte arrays. This avoids the disk problems, and as it writes as soon as the first partition is ready, close() time is O(outstanding-data). However, it needs tuning to limit the amount of data buffered. Get it wrong, and the first clue you get may be that the process goes OOM or is killed by YARN. Which is a shame, as get it right and operations which generate lots of data, including distcp, complete much faster.
>
> This patch proposes a new output stream, a successor to both: {{S3ABlockOutputStream}}.
> # Uses the block upload model of {{S3AFastOutputStream}}.
> # Supports buffering via HDD, heap byte arrays, and (recycled) byte buffers, offering a choice between memory and HDD use. With HDD buffering there are no OOM problems on small JVMs and no need to tune.
> # Uses the fast output stream's mechanism of limiting the queue size for data awaiting upload. Even when buffering via HDD, you may need to limit that use.
> # Lots of instrumentation to see what's being written.
> # Good defaults out of the box (e.g. buffer to HDD, with a partition size that strikes a good balance between early upload and scalability).
> # Robust against transient failures. The AWS SDK retries a PUT on failure; the entire block may need to be replayed, so HDD input cannot be buffered via {{java.io.BufferedInputStream}}. It has also surfaced in testing that if the final commit of a multipart upload fails, it isn't retried, at least in the SDK currently in use. We do that ourselves.
> # Uses round-robin directory allocation, for the most effective disk use.
> # Takes an AWS SDK {{com.amazonaws.event.ProgressListener}} for progress callbacks, giving more detail on the operation. (It actually takes an {{org.apache.hadoop.util.Progressable}}, but if that also implements the AWS interface, that is used instead.)
>
> All of this comes with scale tests:
> * Generate large files using all buffer mechanisms.
> * Do a large copy/rename and verify that the copy really works, including metadata.
> * Be configurable with sizes up to multi-GB, which also means the test timeouts need to be configurable to match the time the tests can take.
> * As they are slow, make them optional, using the {{-Dscale}} switch to enable them.
>
> Verifying large file rename is important on its own, as it is needed for very large commit operations by committers using rename.
>
> The goal here is to implement a single output stream which can be used for all outputs, tuneable as to whether to use disk or memory and as to queue sizes, but otherwise be all that's needed. We can then do future development on this and remove its predecessor, {{S3AFastOutputStream}}, keeping docs and testing down, while leaving the original {{S3AOutputStream}} alone for regression testing/fallback.
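The three buffering choices the description lists (HDD file, heap byte array, recycled direct buffer) can be sketched in plain Java. This is an illustrative sketch only; the class and method names are hypothetical and these are not the real S3A classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;

// Illustrative sketch, not the actual S3A implementation: the three
// per-block buffering mechanisms the issue describes.
public class BlockBuffers {

    // Heap buffering: simplest, but every active block counts against
    // JVM heap, so mis-tuning can end in OOM or a YARN kill.
    static ByteArrayOutputStream heapBuffer() {
        return new ByteArrayOutputStream();
    }

    // Off-heap buffering: a direct ByteBuffer sized to one partition;
    // recycling it between blocks avoids repeated large allocations.
    static ByteBuffer directBuffer(int partitionSize) {
        return ByteBuffer.allocateDirect(partitionSize);
    }

    // Disk buffering: spill the block to a temporary file, so per-block
    // memory use is tiny and small JVMs need no tuning.
    static File diskBuffer() throws IOException {
        File block = File.createTempFile("s3ablock", ".tmp");
        block.deleteOnExit();
        return block;
    }
}
```

The trade-off in the text falls straight out of the sketch: the first two options spend memory (heap or off-heap) for speed of access, while the third spends disk, which is why disk is the safe default.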
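The queue-size limit mentioned above can be sketched with a semaphore: a writer may have at most a fixed number of blocks queued or in flight, and further writes block until a slot frees, which is what caps buffer consumption whether the blocks live in memory or on disk. Names and structure here are illustrative, not the real S3A code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

// Sketch of bounding the number of queued/active block uploads so that
// buffered data stays within a fixed limit (hypothetical class names).
public class BoundedBlockUploader {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final Semaphore slots;

    public BoundedBlockUploader(int maxActiveBlocks) {
        this.slots = new Semaphore(maxActiveBlocks);
    }

    // Queue one block for upload; blocks the caller when maxActiveBlocks
    // uploads are already outstanding.
    public Future<Integer> upload(byte[] block) throws InterruptedException {
        slots.acquire();                   // wait for a free slot
        return pool.submit(() -> {
            try {
                return block.length;       // stand-in for the PUT of one part
            } finally {
                slots.release();           // free the slot for the next block
            }
        });
    }

    public void close() {
        pool.shutdown();
    }
}
```

Because {{upload()}} blocks the writer rather than growing an unbounded queue, back-pressure replaces the "tune it or OOM" failure mode of the byte-array-only stream.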
--
This message was sent by Atlassian JIRA (v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org