[ 
https://issues.apache.org/jira/browse/HADOOP-19576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17954577#comment-17954577
 ] 

Steve Loughran commented on HADOOP-19576:
-----------------------------------------

people trying to do INSERT OVERWRITE? hmm. 

ok, set to default, mark as incompatible, and update the docs with
* a section on this
* what the error means

people should always have auto cleanup enabled anyway, shouldn't they? always 
always always. in fact, the aws cost analyzer should warn if you don't (does it 
do this?). and given that, it shouldn't really matter much. we also have the 
"hadoop s3guard uploads" CLI command to enumerate/purge pending uploads too. 
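
A minimal sketch of what "auto cleanup" could look like as an S3 lifecycle 
rule that aborts incomplete multipart uploads. The helper names, the rule ID 
and the 7-day figure in the usage note are illustrative assumptions, not 
anything from Hadoop. The boto3 call needs credentials and replaces the 
bucket's existing lifecycle configuration, and whether an S3 Express directory 
bucket accepts the same rule is not verified here.

```python
def abort_incomplete_uploads_rule(days, prefix=""):
    """Build a lifecycle configuration that aborts stale multipart uploads.

    The dict follows the shape S3 expects for a lifecycle configuration;
    the rule ID is a made-up label for illustration.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"abort-incomplete-mpu-after-{days}d",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_rule(bucket, days):
    """Apply the rule to a bucket (hypothetical helper, needs AWS credentials).

    Note: this overwrites any lifecycle configuration already on the bucket.
    """
    import boto3  # third-party; imported lazily so the builder has no dependency

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=abort_incomplete_uploads_rule(days),
    )
```

Calling abort_incomplete_uploads_rule(7) only builds the dict, so it can be 
inspected or merged with existing rules before anything is sent to S3.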

FWIW one of our QE buckets didn't do the cleanup, so I used it for a scale test 
of MPU cleanup. We could actually have an ILoadTest for this now that 
createFile() lets you create a zero byte MPU.

> Insert Overwrite Jobs With MagicCommitter Fails On S3 Express Storage
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-19576
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19576
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Syed Shameerur Rahman
>            Priority: Major
>
> Query engines which use the Magic Committer to overwrite a directory would 
> ideally upload the MPUs (not complete) and then delete the contents of the 
> directory before committing the MPUs.
>  
> For S3 Express storage, the directory purge operation is enabled by default. 
> Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L688]
>  for code pointers.
>  
> Due to this, the pending MPU uploads are purged and the query fails with 
> {{NoSuchUpload: The specified multipart upload does not exist. The upload ID 
> might be invalid, or the multipart upload might have been aborted or 
> completed. }}
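
The failure sequence in the quoted description can be sketched with a toy 
in-memory store. Everything here (ToyStore, insert_overwrite, the 
purge_uploads_on_delete flag) is a hypothetical stand-in for illustration. It 
is not the S3A or Magic Committer code, only a model of "start uploads, delete 
the directory, then complete the uploads".

```python
class NoSuchUpload(Exception):
    """Raised when completing an upload ID the store no longer knows."""


class ToyStore:
    """In-memory stand-in for an object store with multipart uploads."""

    def __init__(self, purge_uploads_on_delete):
        self.purge_uploads_on_delete = purge_uploads_on_delete
        self.objects = {}   # key -> bytes, the visible objects
        self.pending = {}   # upload_id -> (destination key, data)
        self._next_id = 0

    def start_upload(self, key, data):
        """Stage data for key without making it visible yet."""
        self._next_id += 1
        upload_id = f"upload-{self._next_id}"
        self.pending[upload_id] = (key, data)
        return upload_id

    def delete_directory(self, prefix):
        """Delete objects under prefix, and optionally pending uploads too."""
        for key in [k for k in self.objects if k.startswith(prefix)]:
            del self.objects[key]
        if self.purge_uploads_on_delete:
            stale = [u for u, (k, _) in self.pending.items() if k.startswith(prefix)]
            for upload_id in stale:
                del self.pending[upload_id]

    def complete_upload(self, upload_id):
        """Make a staged upload visible, or fail if it was purged."""
        if upload_id not in self.pending:
            raise NoSuchUpload(upload_id)
        key, data = self.pending.pop(upload_id)
        self.objects[key] = data


def insert_overwrite(store, directory, new_files):
    """Stage new files, clear the directory, then commit the staged uploads."""
    upload_ids = [
        store.start_upload(directory + name, data)
        for name, data in new_files.items()
    ]
    store.delete_directory(directory)
    for upload_id in upload_ids:
        store.complete_upload(upload_id)
```

With purge_uploads_on_delete=False the overwrite leaves only the new files in 
the directory. With it set to True the delete step removes the staged uploads 
and the commit step raises NoSuchUpload, which mirrors the reported error.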



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

