[ https://issues.apache.org/jira/browse/HADOOP-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854968#comment-15854968 ]
Aaron Fabbri commented on HADOOP-14028:
---------------------------------------

Quick review of the v4 patch:

{noformat}
@@ -392,8 +391,15 @@ private void putObject() throws IOException {
     executorService.submit(new Callable<PutObjectResult>() {
       @Override
       public PutObjectResult call() throws Exception {
-        PutObjectResult result = fs.putObjectDirect(putObjectRequest);
-        block.close();
+        PutObjectResult result;
+        try {
+          // the put object call automatically closes the input
+          // stream afterwards.
+          result = writeOperationHelper.putObject(putObjectRequest);
+          block.close();
+        } finally {
+          closeCloseables(LOG, block);
+        }
         return
{noformat}

Do you still need the block.close() above the finally block?

{noformat}
-  /**
-   * Return the buffer to the pool after the stream is closed.
-   */
-  @Override
-  public synchronized void close() {
-    if (byteBuffer != null) {
-      LOG.debug("releasing buffer");
-      releaseBuffer(byteBuffer);
+    /**
+     * Return the buffer to the pool after the stream is closed.
+     */
+    @Override
+    public synchronized void close() {
+      LOG.debug("ByteBufferInputStream.close() for {}",
+          ByteBufferBlock.super.toString());
       byteBuffer = null;
     }
-  }
-  /**
-   * Verify that the stream is open.
-   * @throws IOException if the stream is closed
-   */
-  private void verifyOpen() throws IOException {
-    if (byteBuffer == null) {
-      throw new IOException(FSExceptionMessages.STREAM_IS_CLOSED);
{noformat}

This part of the diff was hard to read due to the indentation change, but did you eliminate the call to releaseBuffer(byteBuffer)? If so, can you explain that a bit?

> S3A block output streams don't delete temporary files in multipart uploads
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-14028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14028
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.8.0
>        Environment: JDK 8 + ORC 1.3.0 + hadoop-aws 3.0.0-alpha2
>            Reporter: Seth Fitzsimmons
>            Assignee: Steve Loughran
>            Priority: Critical
>        Attachments: HADOOP-14028-branch-2-001.patch,
> HADOOP-14028-branch-2.8-002.patch, HADOOP-14028-branch-2.8-003.patch,
> HADOOP-14028-branch-2.8-004.patch
>
> I have `fs.s3a.fast.upload` enabled with 3.0.0-alpha2 (it's exactly what I
> was looking for after running into the same OOM problems) and don't see it
> cleaning up the disk-cached blocks.
> I'm generating a ~50GB file on an instance with ~6GB free when the process
> starts. My expectation is that local copies of the blocks would be deleted
> after those parts finish uploading, but I'm seeing more than 15 blocks in
> /tmp (and none of them have been deleted thus far).
> I see that DiskBlock deletes temporary files when closed, but is it closed
> after individual blocks have finished uploading or when the entire file has
> been fully written to the FS (full upload completed, including all parts)?
> As a temporary workaround to avoid running out of space, I'm listing files,
> sorting by atime, and deleting anything older than the first 20:
> `ls -ut | tail -n +21 | xargs rm`
> Steve Loughran says:
> > They should be deleted as soon as the upload completes; the close() call
> > that the AWS httpclient makes on the input stream triggers the deletion.
> > Though there aren't tests for it, as I recall.
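
A note on the first question above: the explicit block.close() inside the try clause is redundant rather than harmful only if close() is idempotent, because on the success path the finally clause closes the block a second time. A minimal, self-contained sketch of that pattern; the Block class and closeCloseables() helper here are stand-ins, not the actual S3A code:

{noformat}
import java.io.Closeable;
import java.io.IOException;

/**
 * Sketch only: Block and closeCloseables() stand in for the real
 * S3A data block and cleanup helper.
 */
public class CloseTwiceSketch {

  /** Stand-in for an S3A data block whose close() is idempotent. */
  static class Block implements Closeable {
    private boolean closed;

    @Override
    public synchronized void close() {
      if (closed) {
        return;                       // second close is a no-op
      }
      closed = true;
      System.out.println("block closed, temporary data released");
    }
  }

  /** Stand-in cleanup helper: close everything, swallow failures. */
  static void closeCloseables(Closeable... closeables) {
    for (Closeable c : closeables) {
      if (c == null) {
        continue;
      }
      try {
        c.close();
      } catch (IOException e) {
        // log and continue: cleanup must not mask the upload's exception
      }
    }
  }

  static void putObject(Block block) throws IOException {
    try {
      // the upload would happen here; on success the explicit close()
      // runs, and then the finally clause closes the block again
      block.close();
    } finally {
      closeCloseables(block);         // guarantees cleanup on any path
    }
  }

  public static void main(String[] args) throws IOException {
    putObject(new Block());           // prints the close message once
  }
}
{noformat}

The design point is that the cleanup in finally swallows its own failures, so it never masks an exception thrown by the upload itself.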
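
On the second question: if close() only nulls the byteBuffer reference, a pooled buffer is never returned to the pool and can only be reclaimed by garbage collection. A minimal sketch of the pool-return pattern, with hypothetical names standing in for the real ByteBufferInputStream and its buffer pool:

{noformat}
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/**
 * Sketch only: all names are hypothetical stand-ins for the real
 * S3A ByteBufferInputStream and its buffer pool.
 */
public class PooledBufferSketch {

  /** Toy pool: reuse returned buffers, allocate when empty. */
  static class BufferPool {
    private final BlockingQueue<ByteBuffer> free = new LinkedBlockingQueue<>();

    ByteBuffer acquire(int size) {
      ByteBuffer b = free.poll();
      return b != null ? b : ByteBuffer.allocateDirect(size);
    }

    void release(ByteBuffer buffer) {
      buffer.clear();
      free.offer(buffer);             // buffer becomes reusable
    }
  }

  static class ByteBufferInputStream extends InputStream {
    private final BufferPool pool;
    private ByteBuffer byteBuffer;

    ByteBufferInputStream(BufferPool pool, ByteBuffer buffer) {
      this.pool = pool;
      this.byteBuffer = buffer;
    }

    @Override
    public int read() throws IOException {
      if (byteBuffer == null) {
        throw new IOException("stream is closed");
      }
      return byteBuffer.hasRemaining() ? byteBuffer.get() & 0xff : -1;
    }

    @Override
    public synchronized void close() {
      if (byteBuffer != null) {
        pool.release(byteBuffer);     // without this the buffer leaks;
        byteBuffer = null;            // nulling the reference alone only
      }                               // hands the buffer to the GC
    }
  }

  public static void main(String[] args) throws IOException {
    BufferPool pool = new BufferPool();
    ByteBuffer buf = pool.acquire(8);
    buf.put((byte) 42);
    buf.flip();
    try (InputStream in = new ByteBufferInputStream(pool, buf)) {
      System.out.println(in.read()); // 42; close() then recycles the buffer
    }
  }
}
{noformat}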
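
For the DiskBlock question in the quoted description: the pattern under discussion ties deletion of the temporary block file to the stream's close(), so disk space is reclaimed only if the uploader actually closes the per-part stream. A minimal sketch of that pattern with a hypothetical class, not the actual DiskBlock:

{noformat}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Sketch only: a hypothetical class, not the actual S3A DiskBlock.
 */
public class SelfDeletingStreamSketch extends InputStream {
  private final File file;
  private final FileInputStream in;

  public SelfDeletingStreamSketch(File file) throws IOException {
    this.file = file;
    this.in = new FileInputStream(file);
  }

  @Override
  public int read() throws IOException {
    return in.read();
  }

  @Override
  public void close() throws IOException {
    try {
      in.close();
    } finally {
      // if the uploader never calls close(), this delete never runs
      // and block files accumulate under the buffer directory
      file.delete();
    }
  }

  public static void main(String[] args) throws IOException {
    File tmp = File.createTempFile("block", ".bin");
    try (InputStream in = new SelfDeletingStreamSketch(tmp)) {
      in.read();                      // consume the (empty) block
    }
    System.out.println("file still exists? " + tmp.exists()); // false
  }
}
{noformat}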