[ https://issues.apache.org/jira/browse/HADOOP-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15470729#comment-15470729 ]
Steve Loughran commented on HADOOP-13560:
-----------------------------------------

Looks related to this [stack overflow topic|http://stackoverflow.com/questions/30121218/aws-s3-uploading-large-file-fails-with-resetexception-failed-to-reset-the-requ] and AWS issue [427|https://github.com/aws/aws-sdk-java/issues/427].

Passing the file directly to the transfer manager apparently helps, though that will complicate the process in two ways:
1. The buffering mechanism is now visible to the S3aBlockOutputStream.
2. There's the problem of deleting the file after the async transfer operation completes. Currently the stream deletes it in close(); without that, a progress callback would need to react to the completed event and delete the file. Viable.

Before then: experiment without the buffering (performance impact?) and with smaller partition sizes.

Also an unrelated idea: what about an option for always making the first block a memory block? That way, small files will be written without going near the local FS, but larger files will be uploaded.

> S3A to support huge file writes and operations -with tests
> ----------------------------------------------------------
>
>                 Key: HADOOP-13560
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13560
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13560-branch-2-001.patch,
> HADOOP-13560-branch-2-002.patch
>
>
> An AWS SDK [issue|https://github.com/aws/aws-sdk-java/issues/367] highlights
> that metadata isn't copied on large copies.
> 1. Add a test to do that large copy/rename and verify that the copy really
> works
> 2. Verify that metadata makes it over.
> Verifying large file rename is important on its own, as it is needed for very
> large commit operations for committers using rename.
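
For illustration only, here is a minimal sketch of two ideas from the comment above: keeping the first block in memory so that small files never go near the local FS, and deleting the on-disk buffer from a completion callback rather than in close(). It uses only the Java standard library. The class name FirstBlockInMemoryBuffer, the onUploadCompleted() hook and the temp file prefix are hypothetical; this is not the S3A implementation, and no real upload takes place.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: first block buffered in memory, the rest on local disk. */
final class FirstBlockInMemoryBuffer extends OutputStream {
    private final int blockSize;
    private final ByteArrayOutputStream firstBlock = new ByteArrayOutputStream();
    private Path spillFile;      // created lazily, only once the first block is full
    private OutputStream spill;

    FirstBlockInMemoryBuffer(int blockSize) {
        this.blockSize = blockSize;
    }

    @Override
    public void write(int b) throws IOException {
        // Fill the in-memory block first; small files stop here.
        if (firstBlock.size() < blockSize) {
            firstBlock.write(b);
            return;
        }
        // Anything beyond the first block is buffered in a temp file.
        if (spill == null) {
            spillFile = Files.createTempFile("block-", ".tmp");
            spill = Files.newOutputStream(spillFile);
        }
        spill.write(b);
    }

    @Override
    public void close() throws IOException {
        // The file is deliberately NOT deleted here: an async upload
        // (simulated by the caller) may still be reading it.
        if (spill != null) {
            spill.close();
        }
    }

    /** Hypothetical hook, to be called when the async upload reports completion. */
    void onUploadCompleted() throws IOException {
        if (spillFile != null) {
            Files.deleteIfExists(spillFile);
        }
    }

    int bytesInMemory() { return firstBlock.size(); }

    boolean touchedLocalDisk() { return spillFile != null; }

    Path spillFile() { return spillFile; }
}
```

In a real stream the completion hook would presumably be driven by the transfer's progress listener; here the caller invokes it by hand, which is enough to show that the file outlives close() and is removed only once the transfer is known to be finished.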