[jira] [Commented] (HADOOP-19189) ITestS3ACommitterFactory failing

2024-09-09 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17880507#comment-17880507
 ] 

Syed Shameerur Rahman commented on HADOOP-19189:


[~ste...@apache.org] - I notice that the PR is merged, yet the Jira is still 
unresolved.

> ITestS3ACommitterFactory failing
> 
>
> Key: HADOOP-19189
> URL: https://issues.apache.org/jira/browse/HADOOP-19189
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
>
> We've had ITestS3ACommitterFactory failing for a while; it looks like 
> changed committer settings aren't being picked up.
> {code}
> ERROR] 
> ITestS3ACommitterFactory.testEverything:115->testInvalidFileBinding:165 
> Expected a org.apache.hadoop.fs.s3a.commit.PathCommitException to be thrown, 
> but got the result: : 
> FileOutputCommitter{PathOutputCommitter{context=TaskAttemptContextImpl{JobContextImpl
> {code}
> I've spent some time looking at it, and it is happening because the test sets 
> the filesystem ref of the local test FS, not that of the filesystem 
> created by the committer, which is where the option is picked up.
> I've tried to parameterize it, but things are still playing up and I'm not 
> sure how hard to try to fix it.
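> To illustrate the distinction, a hedged sketch (the constant is the real 
> {{CommitConstants.FS_S3A_COMMITTER_NAME}}; the helper methods are mine and 
> the actual test code differs):
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import static org.apache.hadoop.fs.s3a.commit.CommitConstants.FS_S3A_COMMITTER_NAME;
>
> // what the test does: set the option on the conf of the already-created
> // local test filesystem, which the committer factory never re-reads
> static void bindToTestFs(FileSystem testFs, String committerName) {
>   testFs.getConf().set(FS_S3A_COMMITTER_NAME, committerName);
> }
>
> // where it must go: the job configuration from which the factory creates
> // the destination filesystem, as that is where the option is picked up
> static Configuration bindToJobConf(String committerName) {
>   Configuration jobConf = new Configuration();
>   jobConf.set(FS_S3A_COMMITTER_NAME, committerName);
>   return jobConf;
> }
> {code}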






[jira] [Commented] (HADOOP-19221) S3A: Unable to recover from failure of multipart block upload attempt "Status Code: 400; Error Code: RequestTimeout"

2024-07-24 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868533#comment-17868533
 ] 

Syed Shameerur Rahman commented on HADOOP-19221:


[~ste...@apache.org] - It was a great analysis and a good catch. Sure, I will 
review the PR.

> S3A: Unable to recover from failure of multipart block upload attempt "Status 
> Code: 400; Error Code: RequestTimeout"
> 
>
> Key: HADOOP-19221
> URL: https://issues.apache.org/jira/browse/HADOOP-19221
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> If a multipart PUT request fails for some reason (e.g. a network error), then 
> all subsequent retry attempts fail with a 400 response and error code 
> RequestTimeout:
> {code}
> Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout; Request ID:; S3 Extended 
> Request ID:
> {code}
> The list of suppressed exceptions contains the root cause (the initial 
> failure was a 500); all retries failed to upload properly from the source 
> input stream {{RequestBody.fromInputStream(fileStream, size)}}.
> Hypothesis: the mark/reset mechanism doesn't work for input streams. On the 
> v1 SDK we would build a multipart block upload request by passing in (file, 
> offset, length); the way we now do this doesn't recover.
> This is probably fixable by providing our own {{ContentStreamProvider}} 
> implementations for:
> # file + offset + length
> # bytebuffer
> # byte array
> The SDK does have explicit support for the memory ones, but they copy the 
> data blocks first. We don't want that, as it would double the memory 
> requirements of active blocks.
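> A rough sketch of the file-backed variant (class and field names here are 
> mine, and commons-io is assumed for the bounded stream; a sketch, not the 
> final design):
> {code:java}
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.UncheckedIOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import org.apache.commons.io.IOUtils;
> import org.apache.commons.io.input.BoundedInputStream;
> import software.amazon.awssdk.http.ContentStreamProvider;
>
> /** Hypothetical: reopens the file region afresh on every retry. */
> final class FileRegionContentStreamProvider implements ContentStreamProvider {
>   private final Path file;
>   private final long offset;
>   private final long length;
>
>   FileRegionContentStreamProvider(Path file, long offset, long length) {
>     this.file = file;
>     this.offset = offset;
>     this.length = length;
>   }
>
>   @Override
>   public InputStream newStream() {
>     try {
>       InputStream in = Files.newInputStream(file);
>       IOUtils.skipFully(in, offset);              // seek to the block start
>       return new BoundedInputStream(in, length);  // cap at the block length
>     } catch (IOException e) {
>       throw new UncheckedIOException(e);
>     }
>   }
> }
> {code}
> The request could then be built with
> {{RequestBody.fromContentProvider(provider, length, "application/octet-stream")}},
> so each retry gets a fresh stream instead of a half-consumed one.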






[jira] [Commented] (HADOOP-18708) AWS SDK V2 - Implement CSE

2024-06-12 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854368#comment-17854368
 ] 

Syed Shameerur Rahman commented on HADOOP-18708:


[~ste...@apache.org] - I have created a first-cut PR and would like your 
review: https://github.com/apache/hadoop/pull/6884

> AWS SDK V2 - Implement CSE
> --
>
> Key: HADOOP-18708
> URL: https://issues.apache.org/jira/browse/HADOOP-18708
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The S3 Encryption Client for SDK V2 is now available, so add client-side 
> encryption back in.
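> As a sketch of the wiring this might need (assuming the separate 
> amazon-s3-encryption-client-java artifact; the key ARN is a placeholder):
> {code:java}
> import software.amazon.awssdk.services.s3.S3Client;
> import software.amazon.encryption.s3.S3EncryptionClient;
>
> // S3EncryptionClient implements S3Client, so it can stand in for the
> // regular client; objects are encrypted client-side before upload
> S3Client s3 = S3EncryptionClient.builder()
>     .kmsKeyId("arn:aws:kms:us-east-1:111122223333:key/placeholder")
>     .build();
> {code}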






[jira] [Assigned] (HADOOP-18708) AWS SDK V2 - Implement CSE

2024-05-27 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman reassigned HADOOP-18708:
--

Assignee: Syed Shameerur Rahman  (was: Ahmar Suhail)

> AWS SDK V2 - Implement CSE
> --
>
> Key: HADOOP-18708
> URL: https://issues.apache.org/jira/browse/HADOOP-18708
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The S3 Encryption Client for SDK V2 is now available, so add client-side 
> encryption back in.






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-16 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17827768#comment-17827768
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


OK, [~vnarayanan7], please feel free to raise a PR for this.

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-05 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823486#comment-17823486
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - Thanks for the logs and the example.
Please feel free to raise the PR with the required changes; I can help with 
the review.

Is it possible to scope the changes down to Tez alone by setting 
`fs.s3a.committer.uuid` appropriately?
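A minimal sketch of what I mean (the helper and the DAG-id parameter are 
hypothetical; `fs.s3a.committer.uuid` is the real S3A property):
{code:java}
import org.apache.hadoop.conf.Configuration;

// hypothetical helper: stamp one shared committer UUID per Tez DAG, so
// that its tasks and the AM resolve the same job attempt path under __magic
static void bindCommitterUuid(Configuration conf, String dagId) {
  conf.set("fs.s3a.committer.uuid", dagId);
}
{code}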

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-03 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823030#comment-17823030
 ] 

Syed Shameerur Rahman edited comment on HADOOP-19091 at 3/4/24 4:31 AM:


[~vnarayanan7] - Could you please share the complete error stacktrace?

As I can see from the code, during the commitJob operation the 
[listPendingUploadsToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124]
 method is invoked, which lists all the files under the jobAttemptPath with 
the suffix `.pendingset`.

So, as per that logic, my understanding is that the individual file names 
under the jobAttemptPath should not be a concern here.
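A rough sketch of that listing logic as I read it (simplified, not the actual 
method body):
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// collect every .pendingset file under the job attempt path; individual
// file names are never parsed, only the suffix matters
static List<LocatedFileStatus> listPendingsets(FileSystem fs, Path jobAttemptPath)
    throws IOException {
  List<LocatedFileStatus> result = new ArrayList<>();
  RemoteIterator<LocatedFileStatus> it = fs.listFiles(jobAttemptPath, false);
  while (it.hasNext()) {
    LocatedFileStatus st = it.next();
    if (st.getPath().getName().endsWith(".pendingset")) {
      result.add(st);
    }
  }
  return result;
}
{code}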


was (Author: srahman):
[~vnarayanan7] - Could you please share the complete error stacktrace?

As I can see from the code, during the commitJob operation the 
[listPendingUploadsToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124]
 method is invoked, which lists all the files under the jobAttemptPath with 
the suffix `.pendingset`. If so, what is the value returned by 
(getJobAttemptPath)?

What I understand from your comment is that `getJobAttemptPath` is not 
returning the correct value (for Hive/Pig with Tez), and hence commitJob is 
not able to read the commit metadata. Is my understanding correct?

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-03 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823030#comment-17823030
 ] 

Syed Shameerur Rahman edited comment on HADOOP-19091 at 3/4/24 4:30 AM:


[~vnarayanan7] - Could you please share the complete error stacktrace?

As I can see from the code, during the commitJob operation the 
[listPendingUploadsToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124]
 method is invoked, which lists all the files under the jobAttemptPath with 
the suffix `.pendingset`. If so, what is the value returned by 
(getJobAttemptPath)?

What I understand from your comment is that `getJobAttemptPath` is not 
returning the correct value (for Hive/Pig with Tez), and hence commitJob is 
not able to read the commit metadata. Is my understanding correct?


was (Author: srahman):
[~vnarayanan7] - Could you please share the complete error stacktrace?

As I can see from the code, during the commitJob operation the 
[listPendingUploadsToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124]
 method is invoked, which lists all the files under the jobAttemptPath with 
the suffix `.pendingset`.

What I understand from your comment is that `getJobAttemptPath` is not 
returning the correct value (for Hive/Pig with Tez), and hence commitJob is 
not able to read the commit metadata. Is my understanding correct?

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-03 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823030#comment-17823030
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - Could you please share the complete error stacktrace?

As I can see from the code, during the commitJob operation the 
[listPendingUploadsToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124]
 method is invoked, which lists all the files under the jobAttemptPath with 
the suffix `.pendingset`.

What I understand from your comment is that `getJobAttemptPath` is not 
returning the correct value (for Hive/Pig with Tez), and hence commitJob is 
not able to read the commit metadata. Is my understanding correct?

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17821935#comment-17821935
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - Can you share the relevant logs (at DEBUG level) if possible? 
That would give some more clarity.
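For example, a single line in the client-side log4j.properties should cover 
the committer classes (they live under the org.apache.hadoop.fs.s3a.commit 
package):
{code}
log4j.logger.org.apache.hadoop.fs.s3a.commit=DEBUG
{code}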

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-26 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820967#comment-17820967
 ] 

Syed Shameerur Rahman edited comment on HADOOP-19091 at 2/27/24 5:35 AM:
-

[~vnarayanan7] - I am not sure why MagicS3GuardCommitter won't work with Tez. 
In the past I vaguely remember running MagicS3GuardCommitter with Hive 3.1.3 + 
Tez 0.9.2 (by incorporating the changes mentioned in 
https://issues.apache.org/jira/browse/HIVE-16295).
It would be really helpful if you could share the steps to reproduce this.



was (Author: srahman):
[~vnarayanan7] - I am not sure why MagicS3GuardCommitter won't work with Tez. 
In the past I vaguely remember running MagicS3GuardCommitter with Hive 3.1.3 
and Tez 0.9.2 (by incorporating the changes mentioned in 
https://issues.apache.org/jira/browse/HIVE-16295).
It would be really helpful if you could share the steps to reproduce this.


> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-26 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820967#comment-17820967
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - I am not sure why MagicS3GuardCommitter won't work with Tez. 
In the past I vaguely remember running MagicS3GuardCommitter with Hive 3.1.3 
and Tez 0.9.2 (by incorporating the changes mentioned in 
https://issues.apache.org/jira/browse/HIVE-16295).
It would be really helpful if you could share the steps to reproduce this.


> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out exactly how to go about this.






[jira] [Commented] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-01-31 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812770#comment-17812770
 ] 

Syed Shameerur Rahman commented on HADOOP-19047:


[~ste...@apache.org] - Gentle reminder:
Could you please review the changes?


> Support InMemory Tracking Of S3A Magic Commits
> --
>
> Key: HADOOP-19047
> URL: https://issues.apache.org/jira/browse/HADOOP-19047
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The following are the operations which happen within a Task when it uses the 
> S3A Magic Committer. 
> *During the closing of the stream*
> 1. A 0-byte file with the same name as the original file is uploaded to S3 
> using a PUT operation. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
>  for more information. This is done so that downstream applications like 
> Spark can get the size of the file being written.
> 2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
>  for more information.
> *During TaskCommit*
> 1. All the MPU metadata which the task wrote to S3 (there will be 'x' 
> metadata files in S3 if a single task writes 'x' files) is read and 
> rewritten to S3 as a single metadata file. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
>  for more information.
> Since these operations happen within the Task JVM, we can optimize as well 
> as save cost by storing this information in memory when Task memory usage 
> is not a constraint. Hence the proposal here is to introduce a new 
> MagicCommit tracker called "InMemoryMagicCommitTracker" which will store:
> 1. The MPU metadata in memory until the Task is committed
> 2. The size of the file, which can be used by downstream applications to 
> get the file size before it is committed/visible at the output path.
> This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET 
> call when a Task writes only 1 file.






[jira] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-01-31 Thread Syed Shameerur Rahman (Jira)


[ https://issues.apache.org/jira/browse/HADOOP-19047 ]


Syed Shameerur Rahman deleted comment on HADOOP-19047:


was (Author: srahman):
[~ste...@apache.org] - I have converted the draft PR to the final version. 
Could you please review it?

> Support InMemory Tracking Of S3A Magic Commits
> --
>
> Key: HADOOP-19047
> URL: https://issues.apache.org/jira/browse/HADOOP-19047
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The following are the operations which happen within a Task when it uses the 
> S3A Magic Committer. 
> *During the closing of the stream*
> 1. A 0-byte file with the same name as the original file is uploaded to S3 
> using a PUT operation. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
>  for more information. This is done so that downstream applications like 
> Spark can get the size of the file being written.
> 2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
>  for more information.
> *During TaskCommit*
> 1. All the MPU metadata which the task wrote to S3 (there will be 'x' 
> metadata files in S3 if a single task writes 'x' files) is read and 
> rewritten to S3 as a single metadata file. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
>  for more information.
> Since these operations happen within the Task JVM, we can optimize as well 
> as save cost by storing this information in memory when Task memory usage 
> is not a constraint. Hence the proposal here is to introduce a new 
> MagicCommit tracker called "InMemoryMagicCommitTracker" which will store:
> 1. The MPU metadata in memory until the Task is committed
> 2. The size of the file, which can be used by downstream applications to 
> get the file size before it is committed/visible at the output path.
> This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET 
> call when a Task writes only 1 file.






[jira] [Commented] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-01-29 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812104#comment-17812104
 ] 

Syed Shameerur Rahman commented on HADOOP-19047:


[~ste...@apache.org] - I have converted the draft PR to the final version. 
Could you please review it?

> Support InMemory Tracking Of S3A Magic Commits
> --
>
> Key: HADOOP-19047
> URL: https://issues.apache.org/jira/browse/HADOOP-19047
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The following are the operations which happen within a Task when it uses the 
> S3A Magic Committer. 
> *During the closing of the stream*
> 1. A 0-byte file with the same name as the original file is uploaded to S3 
> using a PUT operation. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
>  for more information. This is done so that downstream applications like 
> Spark can get the size of the file being written.
> 2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
>  for more information.
> *During TaskCommit*
> 1. All the MPU metadata which the task wrote to S3 (there will be 'x' 
> metadata files in S3 if a single task writes 'x' files) is read and 
> rewritten to S3 as a single metadata file. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
>  for more information.
> Since these operations happen within the Task JVM, we can optimize as well 
> as save cost by storing this information in memory when Task memory usage 
> is not a constraint. Hence the proposal here is to introduce a new 
> MagicCommit tracker called "InMemoryMagicCommitTracker" which will store:
> 1. The MPU metadata in memory until the Task is committed
> 2. The size of the file, which can be used by downstream applications to 
> get the file size before it is committed/visible at the output path.
> This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET 
> call when a Task writes only 1 file.






[jira] [Updated] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-01-19 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-19047:
---
Description: 
The following are the operations which happen within a Task when it uses the 
S3A Magic Committer. 

*During the closing of the stream*

1. A 0-byte file with the same name as the original file is uploaded to S3 
using a PUT operation. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
 for more information. This is done so that downstream applications like 
Spark can get the size of the file being written.

2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
 for more information.

*During TaskCommit*

1. All the MPU metadata which the task wrote to S3 (there will be 'x' metadata 
files in S3 if a single task writes 'x' files) is read and rewritten to S3 as 
a single metadata file. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
 for more information.


Since these operations happen within the Task JVM, we can optimize as well as 
save cost by storing this information in memory when Task memory usage is not 
a constraint. Hence the proposal here is to introduce a new MagicCommit 
tracker called "InMemoryMagicCommitTracker" which will store:

1. The MPU metadata in memory until the Task is committed
2. The size of the file, which can be used by downstream applications to get 
the file size before it is committed/visible at the output path.

This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET call 
when a Task writes only 1 file.



  was:
The following are the operations which happen within a Task when it uses the 
S3A Magic Committer. 

*During the closing of the stream*

1. A 0-byte file with the same name as the original file is uploaded to S3 
using a PUT operation. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
 for more information. This is done so that downstream applications like 
Spark can get the size of the file being written.

2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
 for more information.

*During TaskCommit*

1. All the MPU metadata which the task wrote to S3 (there will be 'x' metadata 
files in S3 if a single task writes 'x' files) is read and rewritten to S3 as 
a single metadata file. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
 for more information.


Since these operations happen within the Task JVM, we can optimize as well as 
save cost by storing this information in memory when Task memory usage is not 
a constraint. Hence the proposal here is to introduce a new MagicCommit 
tracker called "InMemoryMagicCommitTracker" which will store:

1. The MPU metadata in memory until the Task is committed
2. The size of the file, which can be used by downstream applications to get 
the file size before it is committed/visible at the output path.

This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET call 
when a Task writes only 1 file.




> Support InMemory Tracking Of S3A Magic Commits
> --
>
> Key: HADOOP-19047
> URL: https://issues.apache.org/jira/browse/HADOOP-19047
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The following are the operations which happen within a Task when it uses the 
> S3A Magic Committer. 
> *During the closing of the stream*
> 1. A 0-byte file with the same name as the original file is uploaded to S3 
> using a PUT operation. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
>  for more information. This is done so that downstream applications like 
> Spark can get the size of the file being written.
> 2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/o

[jira] [Commented] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-01-19 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808588#comment-17808588
 ] 

Syed Shameerur Rahman commented on HADOOP-19047:


I have created a draft PR [https://github.com/apache/hadoop/pull/6468/files] 
for the approach.
[~ste...@apache.org] Could you please review the approach?

> Support InMemory Tracking Of S3A Magic Commits
> --
>
> Key: HADOOP-19047
> URL: https://issues.apache.org/jira/browse/HADOOP-19047
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The following are the operations which happen within a Task when it uses the 
> S3A Magic Committer. 
> *During the closing of the stream*
> 1. A 0-byte file with the same name as the original file is uploaded to S3 
> using a PUT operation. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
>  for more information. This is done so that downstream applications like 
> Spark can get the size of the file being written.
> 2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
>  for more information.
> *During TaskCommit*
> 1. All the MPU metadata which the task wrote to S3 (there will be 'x' 
> metadata files in S3 if a single task writes 'x' files) is read and 
> rewritten to S3 as a single metadata file. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
>  for more information.
> Since these operations happen within the Task JVM, we can optimize as well 
> as save cost by storing this information in memory when Task memory usage 
> is not a constraint. Hence the proposal here is to introduce a new 
> MagicCommit tracker called "InMemoryMagicCommitTracker" which will store:
> 1. The MPU metadata in memory until the Task is committed
> 2. The size of the file, which can be used by downstream applications to 
> get the file size before it is committed/visible at the output path.
> This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET 
> call when a Task writes only 1 file.






[jira] [Created] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-01-19 Thread Syed Shameerur Rahman (Jira)
Syed Shameerur Rahman created HADOOP-19047:
--

 Summary: Support InMemory Tracking Of S3A Magic Commits
 Key: HADOOP-19047
 URL: https://issues.apache.org/jira/browse/HADOOP-19047
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Reporter: Syed Shameerur Rahman
Assignee: Syed Shameerur Rahman


The following are the operations which happen within a Task when it uses the 
S3A Magic Committer. 

*During the closing of the stream*

1. A 0-byte file with the same name as the original file is uploaded to S3 
using a PUT operation. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
 for more information. This is done so that downstream applications like 
Spark can get the size of the file being written.

2. MultiPartUpload (MPU) metadata is uploaded to S3. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
 for more information.

*During TaskCommit*

1. All the MPU metadata which the task wrote to S3 (there will be 'x' metadata 
files in S3 if a single task writes 'x' files) is read and rewritten to S3 as 
a single metadata file. Refer 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
 for more information.


Since these operations happen within the Task JVM, we can optimize as well as 
save cost by storing this information in memory when Task memory usage is not 
a constraint. Hence the proposal here is to introduce a new MagicCommit 
tracker called "InMemoryMagicCommitTracker" which will store:

1. The MPU metadata in memory until the Task is committed
2. The size of the file, which can be used by downstream applications to get 
the file size before it is committed/visible at the output path.

This optimization will save 2 S3 PUT calls, 1 S3 LIST call, and 1 S3 GET call 
when a Task writes only 1 file.
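A very rough sketch of the shape such a tracker could take (the class, field, 
and method names here are mine, not the final design; {{SinglePendingCommit}} 
is the existing commit metadata class):
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.s3a.commit.files.SinglePendingCommit;

/** Hypothetical JVM-wide store replacing the per-task S3 writes. */
final class InMemoryMagicCommitStore {

  // task attempt id -> pending commits written by that task; read back
  // at TaskCommit instead of LIST + GET of .pending files from S3
  static final Map<String, List<SinglePendingCommit>> COMMITS =
      new ConcurrentHashMap<>();

  // destination path -> bytes written, replacing the 0-byte marker file
  static final Map<Path, Long> LENGTHS = new ConcurrentHashMap<>();

  static void track(String taskAttemptId, SinglePendingCommit commit,
      Path dest, long len) {
    COMMITS.computeIfAbsent(taskAttemptId,
        k -> new CopyOnWriteArrayList<>()).add(commit);
    LENGTHS.put(dest, len);
  }
}
{code}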








[jira] [Commented] (HADOOP-18797) Support Concurrent Writes With S3A Magic Committer

2023-09-27 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769643#comment-17769643
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


PR for branch-3.3: https://github.com/apache/hadoop/pull/6122

> Support Concurrent Writes With S3A Magic Committer
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There is a failure in the commit process when multiple jobs are writing to 
> an S3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related to HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see this line in the logs: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]
>  and 
> [MagicS3GuardCommitter.java#L137|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137])
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory before completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.






[jira] [Updated] (HADOOP-18797) Support Concurrent Writes With S3A Magic Committer

2023-09-20 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18797:
---
Issue Type: Improvement  (was: Bug)

> Support Concurrent Writes With S3A Magic Committer
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There is a failure in the commit process when multiple jobs are writing to 
> an S3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related to HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see this line in the logs: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]
>  and 
> [MagicS3GuardCommitter.java#L137|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137])
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory before completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.






[jira] [Updated] (HADOOP-18797) Support Concurrent Writes With S3A Magic Committer

2023-09-20 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18797:
---
Fix Version/s: 3.4.0

> Support Concurrent Writes With S3A Magic Committer
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There is a failure in the commit process when multiple jobs are writing to 
> an S3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related to HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see this line in the logs: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]
>  and 
> [MagicS3GuardCommitter.java#L137|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137])
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory before completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.






[jira] [Resolved] (HADOOP-18797) Support Concurrent Writes With S3A Magic Committer

2023-09-20 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman resolved HADOOP-18797.

Resolution: Fixed

PR merged to trunk branch

> Support Concurrent Writes With S3A Magic Committer
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18797) Support Concurrent Writes With S3A Magic Committer

2023-09-17 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18797:
---
Summary: Support Concurrent Writes With S3A Magic Committer  (was: S3A 
committer fix lost data on concurrent jobs)

> Support Concurrent Writes With S3A Magic Committer
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-09-05 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17762288#comment-17762288
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


[~ste...@apache.org] Could you please review the changes?

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-30 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760719#comment-17760719
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


[~ste...@apache.org], Please review the PR: 
https://github.com/apache/hadoop/pull/6006

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-30 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18797:
---
Affects Version/s: (was: 3.3.6)

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-30 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman reassigned HADOOP-18797:
--

Assignee: Syed Shameerur Rahman

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Assignee: Syed Shameerur Rahman
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-29 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759922#comment-17759922
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


Yes, I noticed. MagicCommitPaths#isMagicPath and 
MagicCommitPaths#magicElementIndex need to be changed as well.
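
A minimal, hypothetical sketch of the kind of change implied here: once magic directories carry per-job names, path-inspection logic must match that pattern rather than only a literal "__magic" element. This does not reproduce the real MagicCommitPaths API.
{code:java}
import java.util.Arrays;
import java.util.List;

public class MagicPathCheckSketch {

  // Hypothetical check: treat any path element starting with "__magic"
  // as magic, so per-job directories such as "__magic_job-<uuid>"
  // are detected as well as plain "__magic".
  static boolean containsMagicElement(List<String> pathElements) {
    return pathElements.stream().anyMatch(e -> e.startsWith("__magic"));
  }

  public static void main(String[] args) {
    System.out.println(containsMagicElement(
        Arrays.asList("my-table", "__magic_job-0001", "part-0000"))); // true
    System.out.println(containsMagicElement(
        Arrays.asList("my-table", "data"))); // false
  }
}
{code}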

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759626#comment-17759626
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


[~ste...@apache.org] - I am more inclined towards Approach 3. BTW, 
FileOutputCommitter also faces the same issue of not cleaning up failed jobs: the 
temporary files created in the Spark or Hive staging directory will be left 
untouched if the job/driver crashes.

Let me know your thoughts; I am happy to contribute by creating a PR 
and running all the tests.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ https://issues.apache.org/jira/browse/HADOOP-18797 ]


Syed Shameerur Rahman deleted comment on HADOOP-18797:


was (Author: srahman):
[~ste...@apache.org] - I am more inclined towards Approach 3. BTW, 
FileOutputCommitter also faces the same issue of not cleaning up failed jobs: the 
temporary files created in the Spark or Hive staging directory will be left 
untouched if the job/driver crashes.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759575#comment-17759575
 ] 

Syed Shameerur Rahman edited comment on HADOOP-18797 at 8/28/23 3:58 PM:
-

[~ste...@apache.org] - I am more inclined towards Approach 3. BTW, 
FileOutputCommitter also faces the same issue of not cleaning up failed jobs: the 
temporary files created in the Spark or Hive staging directory will be left 
untouched if the job/driver crashes.


was (Author: srahman):
[~ste...@apache.org] - I am more inclined towards Approach 1 (as mentioned by 
Emanuel Velzi). BTW, FileOutputCommitter also faces the same issue of not 
cleaning up failed jobs: the temporary files created in the Spark or Hive staging 
directory will be left untouched if the job/driver crashes.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759527#comment-17759527
 ] 

Syed Shameerur Rahman edited comment on HADOOP-18797 at 8/28/23 3:57 PM:
-

This looks like a valid use case: when multiple jobs write to the same table but 
different partitions, the MPU metadata (pendingset) of slower-running jobs 
might be deleted by the jobs which complete first.

I could think of three approaches here

Approach 1: Do job-level magic directory deletion, i.e. (__magic/job_/) 
(as mentioned by [~emanuelvelzi])

1. After the job is completed, delete the path __magic/job_/

Pros
1. Concurrent writes will be supported

Cons
1. The __magic directory will be visible in the table path even though it won't 
be considered
2. The remains of failed jobs will stay forever unless manually deleted or 
removed via some S3 policies

In order to solve 
[HADOOP-18568|https://issues.apache.org/jira/browse/HADOOP-18568], we can put 
this behind a config similar to fs.s3a.cleanup.magic.enabled


Approach 2: Optional delete of the __magic directory, as mentioned in 
[HADOOP-18568|https://issues.apache.org/jira/browse/HADOOP-18568]

1. Based on the config, we can choose to delete or not delete the magic directory

Pros
1. Solves both concurrent and scaling issues.

Cons
1. Say we have two Spark clusters, one with the config enabled to delete 
__magic and another with it disabled. If they simultaneously hit the same 
table but different partitions, we will again hit the same concurrency issue as 
described in this Jira.



Approach 3: Have a unique magic directory for each job, i.e. __magic_job 
(similar to the staging directory in FileOutputCommitter); a minimal sketch 
follows below.

1. Each job will write its pendingset to its own __magic_job
2. The directory will be deleted after a successful commit of the job.

Pros
1. Concurrent writes will be supported
2. If all the jobs are successful, no __magic_* directory will be visible

Cons
1. The remains of failed jobs will stay forever unless manually deleted or 
removed via some S3 policies, which is similar to FileOutputCommitter
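
A minimal sketch of the per-job naming in Approach 3, assuming the per-job directory is simply "__magic_" plus the job UUID; the helper and names are invented for illustration.
{code:java}
import org.apache.hadoop.fs.Path;

public class PerJobMagicDirSketch {

  // Approach 3 (illustrative): one magic directory per job at the table
  // root, e.g. "__magic_<jobUUID>", deleted wholesale once the job
  // commits successfully, mirroring FileOutputCommitter's staging dirs.
  static Path perJobMagicDir(Path outputPath, String jobUUID) {
    return new Path(outputPath, "__magic_" + jobUUID);
  }

  public static void main(String[] args) {
    Path out = new Path("s3a://my-bucket/my-table");
    System.out.println(perJobMagicDir(out, "job2-uuid"));
    // -> s3a://my-bucket/my-table/__magic_job2-uuid
  }
}
{code}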




was (Author: srahman):
This looks like a valid use case: when multiple jobs write to the same table but 
different partitions, the MPU metadata (pendingset) of slower-running jobs 
might be deleted by the jobs which complete first.

I could think of two approaches here

Approach 1: Do job-level magic directory deletion, i.e. (__magic/job_/) 
(as mentioned by [~emanuelvelzi])

1. After the job is completed, delete the path __magic/job_/

Pros
1. Concurrent writes will be supported

Cons
1. The __magic directory will be visible in the table path even though it won't 
be considered
2. The remains of failed jobs will stay forever unless manually deleted or 
removed via some S3 policies

In order to solve 
[HADOOP-18568|https://issues.apache.org/jira/browse/HADOOP-18568], we can put 
this behind a config similar to fs.s3a.cleanup.magic.enabled


Approach 2: Optional delete of the __magic directory, as mentioned in 
[HADOOP-18568|https://issues.apache.org/jira/browse/HADOOP-18568]

1. Based on the config, we can choose to delete or not delete the magic directory

Pros
1. Solves both concurrent and scaling issues.

Cons
1. Say we have two Spark clusters, one with the config enabled to delete 
__magic and another with it disabled. If they simultaneously hit the same 
table but different partitions, we will again hit the same concurrency issue as 
described in this Jira.







> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major

[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759575#comment-17759575
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


[~ste...@apache.org] - I am more inclined towards Approach 1 (as mentioned by 
Emanuel Velzi). BTW, FileOutputCommitter also faces the same issue: the 
temporary files created in the Spark or Hive staging directory will be left 
untouched if the job/driver crashes.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759575#comment-17759575
 ] 

Syed Shameerur Rahman edited comment on HADOOP-18797 at 8/28/23 12:38 PM:
--

[~ste...@apache.org] - I am more inclined towards Approach 1 (as mentioned by 
Emanuel Velzi). BTW, FileOutputCommitter also faces the same issue of not 
cleaning up failed jobs: the temporary files created in the Spark or Hive staging 
directory will be left untouched if the job/driver crashes.


was (Author: srahman):
[~ste...@apache.org] - I am more inclined towards Approach 1 (as mentioned by 
Emanuel Velzi). BTW, FileOutputCommitter also faces the same issue: the 
temporary files created in the Spark or Hive staging directory will be left 
untouched if the job/driver crashes.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759527#comment-17759527
 ] 

Syed Shameerur Rahman commented on HADOOP-18797:


This looks like a valid use case: when multiple jobs write to the same table but 
different partitions, the MPU metadata (pendingset) of slower-running jobs 
might be deleted by the jobs which complete first.

I could think of two approaches here

Approach 1: Do job-level magic directory deletion, i.e. (__magic/job_/) 
(as mentioned by [~emanuelvelzi])

1. After the job is completed, delete the path __magic/job_/

Pros
1. Concurrent writes will be supported

Cons
1. The __magic directory will be visible in the table path even though it won't 
be considered
2. The remains of failed jobs will stay forever unless manually deleted or 
removed via some S3 policies

In order to solve 
[HADOOP-18568|https://issues.apache.org/jira/browse/HADOOP-18568], we can put 
this behind a config similar to fs.s3a.cleanup.magic.enabled (a config sketch 
follows below)


Approach 2: Optional delete of the __magic directory, as mentioned in 
[HADOOP-18568|https://issues.apache.org/jira/browse/HADOOP-18568]

1. Based on the config, we can choose to delete or not delete the magic directory

Pros
1. Solves both concurrent and scaling issues.

Cons
1. Say we have two Spark clusters, one with the config enabled to delete 
__magic and another with it disabled. If they simultaneously hit the same 
table but different partitions, we will again hit the same concurrency issue as 
mentioned in this Jira.
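
For concreteness, a tiny sketch of the config gate floated above; "fs.s3a.cleanup.magic.enabled" is the name suggested in this comment, not an existing Hadoop property.
{code:java}
import org.apache.hadoop.conf.Configuration;

public class MagicCleanupConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Suggested (not yet existing) property gating __magic cleanup;
    // defaulting to true preserves today's delete-on-cleanup behaviour.
    boolean cleanupMagic =
        conf.getBoolean("fs.s3a.cleanup.magic.enabled", true);
    System.out.println("delete __magic on job cleanup? " + cleanupMagic);
  }
}
{code}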







> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major
>
> There is a failure in the commit process when multiple jobs are writing to a 
> s3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some preoblems related to concunrrency a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 
> |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and
>  
> [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory previous to complete the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>   // previously: Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try (DurationInfo ignored = new DurationInfo(LOG, true,
>       "Deleting magic directory %s", path)) {
>     Invoker.ignoreIOExceptions(LOG, "cleanup magic directory",
>         path.toString(),
>         () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up.

[jira] [Commented] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759497#comment-17759497
 ] 

Syed Shameerur Rahman commented on HADOOP-18842:


> The decision to use disk is made by a config option, and would only need 
> enabling if scale problems were encountered. Use of the same marshalled 
> format in both forms of storage ensures consistent code coverage, gives us 
> efficient storage.

Yes this makes sense!

> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before execution, and 
> hence it becomes a challenge to clear off the destination directories.
>  
> One approach to handle this is for the underlying engines/clients to clear 
> off all the destination directories before calling the commitJob operation, 
> but the issue with this approach is that, in case of failures while 
> committing the files, we might end up with the whole of the previous data 
> being deleted, making the recovery process difficult or time-consuming.
>  
> *Solution*
> Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*: 
> during commitJob operations, the committer will map each destination 
> directory to the commits which need to be added to that directory, and if 
> the mode is *OVERWRITE*, the committer will delete the directory recursively 
> and then commit each of the files in the directory. So in case of failures 
> (worst case) the number of destination directories which will be deleted 
> will be equal to the number of threads (if we do it in a multi-threaded 
> way), as compared to the whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-10 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752759#comment-17752759
 ] 

Syed Shameerur Rahman commented on HADOOP-18842:


[~ste...@apache.org]  Thanks a lot for the pointers.

The following are some of my observations w.r.t. your comments:
 # Yes, this is similar to the staging committer's partitioned overwrite. But 
what I could see is that the staging committers, during precommit in the 
commitJob operation, clear all the directories/partitions if the conflict 
resolution is "REPLACE". The issue with this approach is that, in the 
worst-case scenario where the job fails after precommit, the whole data will 
be lost, which might not be desirable.
 # I agree that storing all the SinglePendingCommit objects in memory puts 
extra memory pressure on the driver. For instance, in my setup, storing ~1400 
pendingset files in memory took an extra 7MB, i.e. roughly 5KB per pendingset 
(this number will differ based on your S3 bucket or destination name length). 
So I guess it is not that much.
 # For high write-intensive jobs which commit tens of thousands of files, the 
memory pressure will be higher, but for such cases it is recommended to have 
a larger driver memory size anyway.
 # Streaming the SinglePendingCommit to the local filesystem is a great idea, 
but it causes extra delay for serialization/deserialization and extra 
overhead to read and write files, which may not be desirable in all cases.

 

*Proposal*
 # Stream the SinglePendingCommit to the local filesystem only if there is a 
large number of pending files. We can use an approximation: one pendingset 
entry takes about 'x' bytes, and the user is willing to hold 'y' such 
'x'-byte entries in memory.
 # Otherwise, store it in memory.

 

For (1), i.e. streaming the commits to the local filesystem (a sketch follows 
below):
 1. Read the pendingset files in a multi-threaded way.
 2. For each pendingset, extract the single commits and the corresponding 
destination directory, and store the destination in memory.
 3. Stream the single commits to a file with a unique path for each 
destination directory.
 4. For each destination directory:
     4.1 delete the destination directory
     4.2 read the commits from the unique path and call commit

** A unique path for each destination directory will help us limit the number 
of directories/partitions which can be lost in case of failures.
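A rough, self-contained sketch of this spill scheme. It uses plain java.nio 
instead of the Hadoop local filesystem API and treats each marshalled 
SinglePendingCommit as an opaque byte array; CommitSpillBuffer and all of its 
names are hypothetical, not proposed Hadoop APIs:
{code:java}
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.*;
import java.util.function.Consumer;

public class CommitSpillBuffer {
  private final long thresholdBytes;                  // the 'y' * 'x' budget
  private final Map<String, List<byte[]>> inMemory = new HashMap<>();
  private final Map<String, Path> spilled = new HashMap<>();
  private long buffered;

  CommitSpillBuffer(long thresholdBytes) {
    this.thresholdBytes = thresholdBytes;
  }

  /** Steps 2/3: buffer one marshalled commit for a destination directory. */
  void add(String destDir, byte[] commit) throws IOException {
    buffered += commit.length;
    if (buffered <= thresholdBytes) {                 // still under budget
      inMemory.computeIfAbsent(destDir, d -> new ArrayList<>()).add(commit);
      return;
    }
    Path file = spilled.get(destDir);                 // unique file per dir
    if (file == null) {
      file = Files.createTempFile("pending-", ".bin");
      spilled.put(destDir, file);
    }
    try (DataOutputStream out = new DataOutputStream(
        Files.newOutputStream(file, StandardOpenOption.APPEND))) {
      out.writeInt(commit.length);                    // length-prefixed record
      out.write(commit);
    }
  }

  /** Step 4: per directory, delete the destination, then commit its records. */
  void commitAll(Consumer<String> deleteDir, Consumer<byte[]> commitOne)
      throws IOException {
    Set<String> dirs = new HashSet<>(inMemory.keySet());
    dirs.addAll(spilled.keySet());
    for (String dir : dirs) {
      deleteDir.accept(dir);                          // 4.1
      for (byte[] c : inMemory.getOrDefault(dir, List.of())) {
        commitOne.accept(c);                          // 4.2, in-memory part
      }
      Path file = spilled.get(dir);
      if (file != null) {
        try (DataInputStream in =
            new DataInputStream(Files.newInputStream(file))) {
          while (in.available() > 0) {                // 4.2, spilled part
            byte[] c = new byte[in.readInt()];
            in.readFully(c);
            commitOne.accept(c);
          }
        }
      }
    }
  }
}
{code}
The per-directory spill file is what bounds the blast radius: a failure 
mid-commit loses at most the directories already deleted in step 4.1.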

 

[~ste...@apache.org] Any thoughts on this?

Thanks

> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before execution, and 
> hence it becomes a challenge to clear off the destination directories.
>  
> One approach to handle this is for the underlying engines/clients to clear 
> off all the destination directories before calling the commitJob operation, 
> but the issue with this approach is that, in case of failures while 
> committing the files, we might end up with the whole of the previous data 
> being deleted, making the recovery process difficult or time-consuming.
>  
> *Solution*
> Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*: 
> during commitJob operations, the committer will map each destination 
> directory to the commits which need to be added to that directory, and if 
> the mode is *OVERWRITE*, the committer will delete the directory recursively 
> and then commit each of the files in the directory. So in case of failures 
> (worst case) the number of destination directories which will be deleted 
> will be equal to the number of threads (if we do it in a multi-threaded 
> way), as compared to the whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-07 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751656#comment-17751656
 ] 

Syed Shameerur Rahman commented on HADOOP-18842:


[~ste...@apache.org]  It would be great if you could review the above PR or 
the proposed changes.

Note: It is a WIP PR (need to add unit tests and integration tests). I would 
like to get your thoughts on this before taking it forward.

Thanks

> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before execution, and 
> hence it becomes a challenge to clear off the destination directories.
>  
> One approach to handle this is for the underlying engines/clients to clear 
> off all the destination directories before calling the commitJob operation, 
> but the issue with this approach is that, in case of failures while 
> committing the files, we might end up with the whole of the previous data 
> being deleted, making the recovery process difficult or time-consuming.
>  
> *Solution*
> Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*: 
> during commitJob operations, the committer will map each destination 
> directory to the commits which need to be added to that directory, and if 
> the mode is *OVERWRITE*, the committer will delete the directory recursively 
> and then commit each of the files in the directory. So in case of failures 
> (worst case) the number of destination directories which will be deleted 
> will be equal to the number of threads (if we do it in a multi-threaded 
> way), as compared to the whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-07 Thread Syed Shameerur Rahman (Jira)
Syed Shameerur Rahman created HADOOP-18842:
--

 Summary: Support Overwrite Directory On Commit For S3A Committers
 Key: HADOOP-18842
 URL: https://issues.apache.org/jira/browse/HADOOP-18842
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Syed Shameerur Rahman


The goal is to add a new kind of commit mechanism in which the destination 
directory is cleared off before committing the file.

*Use Case*

In case of dynamicPartition insert overwrite queries, the destination 
directories which need to be overwritten are not known before execution, and 
hence it becomes a challenge to clear off the destination directories.

 

One approach to handle this is for the underlying engines/clients to clear 
off all the destination directories before calling the commitJob operation, 
but the issue with this approach is that, in case of failures while 
committing the files, we might end up with the whole of the previous data 
being deleted, making the recovery process difficult or time-consuming.

 

*Solution*

Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*: 
during commitJob operations, the committer will map each destination 
directory to the commits which need to be added to that directory, and if the 
mode is *OVERWRITE*, the committer will delete the directory recursively and 
then commit each of the files in the directory. So in case of failures (worst 
case) the number of destination directories which will be deleted will be 
equal to the number of threads (if we do it in a multi-threaded way), as 
compared to the whole data if it was done on the engine side. A sketch of 
this flow follows below.
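A hedged sketch of that flow; PendingCommit, deleteRecursively and commit 
below are stand-ins for the real committer types, not existing APIs:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OverwriteCommitSketch {
  interface PendingCommit { String destDir(); }       // stand-in

  static void commitJob(List<PendingCommit> commits, boolean overwrite) {
    // map each destination directory to the commits that belong in it
    Map<String, List<PendingCommit>> byDir =
        commits.stream().collect(Collectors.groupingBy(PendingCommit::destDir));
    byDir.forEach((dir, pending) -> {
      if (overwrite) {
        deleteRecursively(dir);   // only this partition is cleared, right
      }                           // before its own files are committed
      pending.forEach(OverwriteCommitSketch::commit);
    });
  }

  static void deleteRecursively(String dir) { /* stand-in */ }
  static void commit(PendingCommit c) { /* stand-in */ }
}
{code}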



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-07-14 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743161#comment-17743161
 ] 

Syed Shameerur Rahman commented on HADOOP-18776:


[~ste...@apache.org] - If I understood your comment correctly, you are 
proposing something like: even if this committer (which completes the MPU in 
commitTask) is enabled, we are okay as long as the task attempt retry count 
is 1; if not, there should be some mechanism to fail the job when this 
committer is used, the task attempt retry count is > 1, and the task which 
failed had already called the commitTask operation.

Am I correct?
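If that is the intent, the guard could look roughly like this sketch; only 
mapreduce.map.maxattempts is a real option (default 4), while the class name 
and the choice to throw during job setup are assumptions:
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;

public class SingleAttemptGuardSketch {
  /** Fail fast, e.g. in setupJob(), when task retries are permitted. */
  static void checkTaskRetries(Configuration conf) throws IOException {
    // the committer cannot recover from a failed commitTask, so any
    // retry allowance above 1 is unsafe with it
    int maxAttempts = conf.getInt("mapreduce.map.maxattempts", 4);
    if (maxAttempts > 1) {
      throw new IOException("This committer requires "
          + "mapreduce.map.maxattempts=1, but found " + maxAttempts);
    }
  }
}
{code}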

> Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints
> --
>
> Key: HADOOP-18776
> URL: https://issues.apache.org/jira/browse/HADOOP-18776
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
> which is another type of S3 magic committer but with better performance, 
> achieved by accepting a few tradeoffs.
> The following are the differences in MagicCommitter vs OptimizedMagicCommitter
>  
> ||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
> |commitTask|1. Lists all {{.pending}} files in its attempt directory.
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
> {{.pending}} files in its attempt directory
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. For each pending upload, commit operation is called (complete 
> multiPartUpload)|
> |commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
>  
> 2. Then every pending commit in the job will be committed.
>  
> 3. "SUCCESS" marker is created (if config is enabled)
>  
> 4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if 
> config is enabled)
>  
> 2.  "__magic" directory is cleaned up.|
>  
> *Performance Benefits :-*
>  # The primary performance boost comes from the complete multiPartUpload 
> calls being made in a distributed way by the task attempts (task 
> containers/executors) rather than by a single job driver. In the case of 
> the MagicCommitter it is O(files/threads).
>  # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
> files and READ call to read them in the Job Driver.
>  
> *TradeOffs :-*
> The tradeoffs are similar to the ones in the FileOutputCommitter V2 
> version. Users migrating from FileOutputCommitter V2 to 
> OptimizedS3AMagicCommitter will see no behavioral change as such:
>  # During execution, intermediate data becomes visible after commitTask 
> operation
>  # On a failure, all output must be deleted and the job needs to be restarted.
>  
> *Performance Benchmark :-*
> Cluster : c4.8x large (ec2-instance)
> Instance : 1 (primary) + 5 (core)
> Data Size : 3TB Partitioned(TPC-DS store_sales data)
> Engine : Apache Spark 3.3.1 / Hadoop 3.3.3
> Query: The following query inserts around 3000+ files into the table 
> directory (ran for 3 iterations)
> {code:java}
> insert into  select ss_quantity from store_sales; {code}
> ||Committer||Iteration 1||Iteration 2||Iteration 3||
> |Magic|126|127|122|
> |OptimizedMagic|50|51|58|
> So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
> MagicCommitter.
>  
> _*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
> the cases wherein the user requires the guarantee that files are not 
> visible in failure scenarios. Given the performance benefit, users may 
> choose to use this if they don't require any such guarantees or have some 
> mechanism to clean up the data before retrying.*_
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-07-12 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman resolved HADOOP-18776.

Target Version/s:   (was: 3.4.0)
  Resolution: Won't Fix

Thanks Steve for your pointers. Sure, that will help.

I am closing this Jira as "won't fix".

> Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints
> --
>
> Key: HADOOP-18776
> URL: https://issues.apache.org/jira/browse/HADOOP-18776
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
> which is another type of S3 magic committer but with better performance, 
> achieved by accepting a few tradeoffs.
> The following are the differences in MagicCommitter vs OptimizedMagicCommitter
>  
> ||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
> |commitTask|1. Lists all {{.pending}} files in its attempt directory.
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
> {{.pending}} files in its attempt directory
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. For each pending upload, commit operation is called (complete 
> multiPartUpload)|
> |commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
>  
> 2. Then every pending commit in the job will be committed.
>  
> 3. "SUCCESS" marker is created (if config is enabled)
>  
> 4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if 
> config is enabled)
>  
> 2.  "__magic" directory is cleaned up.|
>  
> *Performance Benefits :-*
>  # The primary performance boost comes from the complete multiPartUpload 
> calls being made in a distributed way by the task attempts (task 
> containers/executors) rather than by a single job driver. In the case of 
> the MagicCommitter it is O(files/threads).
>  # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
> files and READ call to read them in the Job Driver.
>  
> *TradeOffs :-*
> The tradeoffs are similar to the ones in the FileOutputCommitter V2 
> version. Users migrating from FileOutputCommitter V2 to 
> OptimizedS3AMagicCommitter will see no behavioral change as such:
>  # During execution, intermediate data becomes visible after commitTask 
> operation
>  # On a failure, all output must be deleted and the job needs to be restarted.
>  
> *Performance Benchmark :-*
> Cluster : c4.8x large (ec2-instance)
> Instance : 1 (primary) + 5 (core)
> Data Size : 3TB Partitioned(TPC-DS store_sales data)
> Engine : Apache Spark 3.3.1 / Hadoop 3.3.3
> Query: The following query inserts around 3000+ files into the table 
> directory (ran for 3 iterations)
> {code:java}
> insert into  select ss_quantity from store_sales; {code}
> ||Committer||Iteration 1||Iteration 2||Iteration 3||
> |Magic|126|127|122|
> |OptimizedMagic|50|51|58|
> So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
> MagicCommitter.
>  
> _*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
> the cases wherein the user requires the guarantee that files are not 
> visible in failure scenarios. Given the performance benefit, users may 
> choose to use this if they don't require any such guarantees or have some 
> mechanism to clean up the data before retrying.*_
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-21 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18776:
---
Description: 
The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask|1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB Partitioned(TPC-DS store_sales data)
Engine : Apache Spark 3.3.1 / Hadoop 3.3.3

Query: The following query inserts around 3000+ files into the table directory 
(ran for 3 iterations)
{code:java}
insert into  select ss_quantity from store_sales; {code}
||Committer||Iteration 1||Iteration 2||Iteration 3||
|Magic|126|127|122|
|OptimizedMagic|50|51|58|

So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
MagicCommitter.

 

_*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
the cases wherein the user requires the guarantee that files are not visible 
in failure scenarios. Given the performance benefit, users may choose to use 
this if they don't require any such guarantees or have some mechanism to 
clean up the data before retrying.*_

 

  was:
The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask|1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB Partitioned(TPC-DS store_sales data)
Engine : Apache S

[jira] [Comment Edited] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-21 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17735980#comment-17735980
 ] 

Syed Shameerur Rahman edited comment on HADOOP-18776 at 6/22/23 5:57 AM:
-

[~ste...@apache.org] - Thanks a lot for taking a look at this.

I fully understand your concerns. I am also aware of the same.

> "it lacks the ability to recover from task failure"

Yes, this is true. When a task fails or the task JVM crashes in the 
commitTask operation, some files get committed (visible) in the final path 
and some may not. If task re-attempts are enabled, a new task will come up 
and will write the files, leading to some duplicate data in the final path. 
This issue can be solved by using this type of committer only for the use 
case where there are no task re-attempts, so that if any of the task attempts 
fails the job will also fail.

This can still leave files written by the failed task attempts in the final 
path, but then, since the job has failed, the user can clear off the data 
manually and re-run the same job. I guess the same issue is still possible 
with MagicS3ACommitter as well: since commitJob is not atomic, if the Job 
Driver JVM crashes in the commitJob operation it can also lead to some files 
being visible in the final path.

 

> Finally, I'd love to know size of jobs where you hit problems, use etc. If 
> there's anything you can say publicly, that'd be great

My use case was that I had to write a large number of files in a single 
query, and since commitJob is a single process (multi-threaded, as opposed to 
distributed in the proposed use-case) which needs to call complete MPU for 
all these files, it can become a bottleneck; hence I explored other options 
({*}~2.3x{*} faster as compared to MagicCommitter).

 

So my understanding is that when there is at most 1 task attempt, this 
committer tends to behave similarly (with the same guarantees) to 
MagicCommitter and hence can be used for specific use-cases.


was (Author: srahman):
[~ste...@apache.org] - Thanks a lot for taking a look at this.

I fully understand your concerns. I am also aware of the same.

> "it lacks the ability to recover from task failure"

Yes, this is true. When a task fails or the task JVM crashes in the 
commitTask operation, some files get committed (visible) in the final path 
and some may not. If task re-attempts are enabled, a new task will come up 
and will write the files, leading to some duplicate data in the final path. 
This issue can be solved by using this type of committer only for the use 
case where there are no task re-attempts, so that if any of the task attempts 
fails the job will also fail.

This can still leave files written by the failed task attempts in the final 
path, but then, since the job has failed, the user can clear off the data 
manually and re-run the same job. I guess the same issue is still possible 
with MagicS3ACommitter as well: since commitJob is not atomic, if the Job 
Driver JVM crashes in the commitJob operation it can also lead to some files 
being visible in the final path.

 

> Finally, I'd love to know size of jobs where you hit problems, use etc. If 
> there's anything you can say publicly, that'd be great

My use case was that I had to write a large number of files in a single 
query, and since commitJob is a single process which needs to call complete 
MPU for all these files, it can become a bottleneck; hence I explored other 
options ({*}~2.3x{*} faster as compared to MagicCommitter).

 

So my understanding is that when there is at most 1 task attempt, this 
committer tends to behave similarly (with the same guarantees) to 
MagicCommitter and hence can be used for specific use-cases.

> Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints
> --
>
> Key: HADOOP-18776
> URL: https://issues.apache.org/jira/browse/HADOOP-18776
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
> which is another type of S3 magic committer but with better performance, 
> achieved by accepting a few tradeoffs.
> The following are the differences in MagicCommitter vs OptimizedMagicCommitter
>  
> ||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
> |commitTask|1. Lists all {{.pending}} files in its attempt directory.
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
> {{.pending}} files in its attempt directory
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. For each pending upload, commit operation is called (complete 
> mult

[jira] [Commented] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-21 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17735980#comment-17735980
 ] 

Syed Shameerur Rahman commented on HADOOP-18776:


[~ste...@apache.org] - Thanks a lot for taking a look at this.

I fully understand your concerns. I am also aware of the same.

> "it lacks the ability to recover from task failure"

Yes, this is true. When a task fails or the task JVM crashes in the 
commitTask operation, some files get committed (visible) in the final path 
and some may not. If task re-attempts are enabled, a new task will come up 
and will write the files, leading to some duplicate data in the final path. 
This issue can be solved by using this type of committer only for the use 
case where there are no task re-attempts, so that if any of the task attempts 
fails the job will also fail.

This can still leave files written by the failed task attempts in the final 
path, but then, since the job has failed, the user can clear off the data 
manually and re-run the same job. I guess the same issue is still possible 
with MagicS3ACommitter as well: since commitJob is not atomic, if the Job 
Driver JVM crashes in the commitJob operation it can also lead to some files 
being visible in the final path.

 

> Finally, I'd love to know size of jobs where you hit problems, use etc. If 
> there's anything you can say publicly, that'd be great

My use case was that I had to write a large number of files in a single 
query, and since commitJob is a single process which needs to call complete 
MPU for all these files, it can become a bottleneck; hence I explored other 
options ({*}~2.3x{*} faster as compared to MagicCommitter).

 

So my understanding is that when there is at most 1 task attempt, this 
committer tends to behave similarly (with the same guarantees) to 
MagicCommitter and hence can be used for specific use-cases.

> Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints
> --
>
> Key: HADOOP-18776
> URL: https://issues.apache.org/jira/browse/HADOOP-18776
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
> which is another type of S3 magic committer but with better performance, 
> achieved by accepting a few tradeoffs.
> The following are the differences in MagicCommitter vs OptimizedMagicCommitter
>  
> ||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
> |commitTask|1. Lists all {{.pending}} files in its attempt directory.
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
> {{.pending}} files in its attempt directory
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. For each pending upload, commit operation is called (complete 
> multiPartUpload)|
> |commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
>  
> 2. Then every pending commit in the job will be committed.
>  
> 3. "SUCCESS" marker is created (if config is enabled)
>  
> 4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if 
> config is enabled)
>  
> 2.  "__magic" directory is cleaned up.|
>  
> *Performance Benefits :-*
>  # The primary performance boost comes from the complete multiPartUpload 
> calls being made in a distributed way by the task attempts (task 
> containers/executors) rather than by a single job driver. In the case of 
> the MagicCommitter it is O(files/threads).
>  # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
> files and READ call to read them in the Job Driver.
>  
> *TradeOffs :-*
> The tradeoffs are similar to the ones in the FileOutputCommitter V2 
> version. Users migrating from FileOutputCommitter V2 to 
> OptimizedS3AMagicCommitter will see no behavioral change as such:
>  # During execution, intermediate data becomes visible after commitTask 
> operation
>  # On a failure, all output must be deleted and the job needs to be restarted.
>  
> *Performance Benchmark :-*
> Cluster : c4.8x large (ec2-instance)
> Instance : 1 (primary) + 5 (core)
> Data Size : 3TB Partitioned(TPC-DS store_sales data)
> Engine : Apache Spark 3.3.1
> Query: The following query inserts around 3000+ files into the table 
> directory (ran for 3 iterations)
> {code:java}
> insert into  select ss_quantity from store_sales; {code}
> ||Committer||Iteration 1||Iteration 2||Iteration 3||
> |Magic|126|127|122|
> |OptimizedMagic|50|51|58|
> So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
> MagicCommitter.
>  
> _*Note: Unlike MagicCommitter , OptimizedMagicCommitter is not suitable for 
> all

[jira] [Updated] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-19 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18776:
---
Description: 
The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask|1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB Partitioned(TPC-DS store_sales data)
Engine : Apache Spark 3.3.1

Query: The following query inserts around 3000+ files into the table directory 
(ran for 3 iterations)
{code:java}
insert into  select ss_quantity from store_sales; {code}
||Committer||Iteration 1||Iteration 2||Iteration 3||
|Magic|126|127|122|
|OptimizedMagic|50|51|58|

So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
MagicCommitter.

 

_*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
the cases wherein the user requires the guarantee that files are not visible 
in failure scenarios. Given the performance benefit, users may choose to use 
this if they don't require any such guarantees or have some mechanism to 
clean up the data before retrying.*_

 

  was:
The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask |1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB (TPC-DS store_sales data)
Engine : Apache Spark 3.3.1

Query: The fo

[jira] [Commented] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-19 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17734071#comment-17734071
 ] 

Syed Shameerur Rahman commented on HADOOP-18776:


[~ste...@apache.org] , It would be great if you could review the above PR.

Note: It is a WIP PR (need to add unit tests and integration tests). I would 
like to get the community's thoughts on this before taking it forward.

Thanks

> Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints
> --
>
> Key: HADOOP-18776
> URL: https://issues.apache.org/jira/browse/HADOOP-18776
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
> which is another type of S3 magic committer but with better performance, 
> achieved by accepting a few tradeoffs.
> The following are the differences in MagicCommitter vs OptimizedMagicCommitter
>  
> ||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
> |commitTask |1. Lists all {{.pending}} files in its attempt directory.
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
> {{.pending}} files in its attempt directory
>  
> 2. The contents are loaded into a list of single pending uploads.
>  
> 3. For each pending upload, commit operation is called (complete 
> multiPartUpload)|
> |commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
>  
> 2. Then every pending commit in the job will be committed.
>  
> 3. "SUCCESS" marker is created (if config is enabled)
>  
> 4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if 
> config is enabled)
>  
> 2.  "__magic" directory is cleaned up.|
>  
> *Performance Benefits :-*
>  # The primary performance boost comes from the complete multiPartUpload 
> calls being made in a distributed way by the task attempts (task 
> containers/executors) rather than by a single job driver. In the case of 
> the MagicCommitter it is O(files/threads).
>  # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
> files and READ call to read them in the Job Driver.
>  
> *TradeOffs :-*
> The tradeoffs are similar to the ones in the FileOutputCommitter V2 
> version. Users migrating from FileOutputCommitter V2 to 
> OptimizedS3AMagicCommitter will see no behavioral change as such:
>  # During execution, intermediate data becomes visible after commitTask 
> operation
>  # On a failure, all output must be deleted and the job needs to be restarted.
>  
> *Performance Benchmark :-*
> Cluster : c4.8x large (ec2-instance)
> Instance : 1 (primary) + 5 (core)
> Data Size : 3TB (TPC-DS store_sales data)
> Engine : Apache Spark 3.3.1
> Query: The following query inserts around 3000+ files into the table 
> directory (ran for 3 iterations)
> {code:java}
> insert into  select ss_quantity from store_sales; {code}
> ||Committer||Iteration 1||Iteration 2||Iteration 3||
> |Magic|126|127|122|
> |OptimizedMagic|50|51|58|
> So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
> MagicCommitter.
>  
> _*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
> the cases wherein the user requires the guarantee that files are not 
> visible in failure scenarios. Given the performance benefit, users may 
> choose to use this if they don't require any such guarantees or have some 
> mechanism to clean up the data before retrying.*_
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-19 Thread Syed Shameerur Rahman (Jira)
Syed Shameerur Rahman created HADOOP-18776:
--

 Summary: Add OptimizedS3AMagicCommitter For Zero Rename Commits to 
S3 Endpoints
 Key: HADOOP-18776
 URL: https://issues.apache.org/jira/browse/HADOOP-18776
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/s3
Reporter: Syed Shameerur Rahman


The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask|1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB (TPC-DS store_sales data)
Engine : Apache Spark 3.3.1

Query: The following query inserts around 3000+ files into the table directory 
(ran for 3 iterations)
{code:java}
insert into  select ss_quantity from store_sales; {code}
||Committer||Iteration 1||Iteration 2||Iteration 3||
|Magic|126|127|122|
|OptimizedMagic|50|51|58|

So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
MagicCommitter.

 

_*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
the cases wherein the user requires the guarantee that files are not visible 
in failure scenarios. Given the performance benefit, users may choose to use 
this if they don't require any such guarantees or have some mechanism to 
clean up the data before retrying.*_
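To make the distribution point concrete, here is an illustrative (non-Hadoop) 
sketch of where the completeMultipartUpload work runs in each committer; 
PendingUpload is a stand-in interface, not the real API:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CommitPlacementSketch {
  interface PendingUpload { void completeMultipartUpload(); }  // stand-in

  /** MagicCommitter shape: the single job driver completes every upload,
   *  so wall-clock cost is O(files / threads) inside that one JVM. */
  static void commitJobInDriver(List<PendingUpload> allUploads, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (PendingUpload u : allUploads) {
        futures.add(pool.submit(u::completeMultipartUpload));
      }
      for (Future<?> f : futures) {
        f.get();                 // surface any failed completion
      }
    } finally {
      pool.shutdown();
    }
  }

  /** OptimizedS3AMagicCommitter shape: each task attempt completes only its
   *  own uploads in commitTask, spreading the work across the executors. */
  static void commitTask(List<PendingUpload> ownUploads) {
    ownUploads.forEach(PendingUpload::completeMultipartUpload);
  }
}
{code}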

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18776) Add OptimizedS3AMagicCommitter For Zero Rename Commits to S3 Endpoints

2023-06-19 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HADOOP-18776:
---
Description: 
The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask |1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB (TPC-DS store_sales data)
Engine : Apache Spark 3.3.1

Query: The following query inserts around 3000+ files into the table directory 
(ran for 3 iterations)
{code:java}
insert into  select ss_quantity from store_sales; {code}
||Committer||Iteration 1||Iteration 2||Iteration 3||
|Magic|126|127|122|
|OptimizedMagic|50|51|58|

So on average, OptimizedMagicCommitter was *~2.3x* faster as compared to 
MagicCommitter.

 

_*Note: Unlike MagicCommitter, OptimizedMagicCommitter is not suitable for 
the cases wherein the user requires the guarantee that files are not visible 
in failure scenarios. Given the performance benefit, users may choose to use 
this if they don't require any such guarantees or have some mechanism to 
clean up the data before retrying.*_

 

  was:
The goal is to add a new S3A committer named *OptimizedS3AMagicCommitter*, 
which is another type of S3 magic committer but with better performance, 
achieved by accepting a few tradeoffs.

The following are the differences in MagicCommitter vs OptimizedMagicCommitter

 
||Operation||Magic Committer||*OptimizedS3AMagicCommitter*||
|commitTask|1. Lists all {{.pending}} files in its attempt directory.
 
2. The contents are loaded into a list of single pending uploads.
 
3. Saved to a {{.pendingset}} file in the job attempt directory.|1. Lists all 
{{.pending}} files in its attempt directory
 
2. The contents are loaded into a list of single pending uploads.
 
3. For each pending upload, commit operation is called (complete 
multiPartUpload)|
|commitJob|1. Loads all {{.pendingset}} files in its job attempt directory
 
2. Then every pending commit in the job will be committed.
 
3. "SUCCESS" marker is created (if config is enabled)
 
4. "__magic" directory is cleaned up.|1. "SUCCESS" marker is created (if config 
is enabled)
 
2.  "__magic" directory is cleaned up.|

 

*Performance Benefits :-*
 # The primary performance boost comes from the complete multiPartUpload 
calls being made in a distributed way by the task attempts (task 
containers/executors) rather than by a single job driver. In the case of the 
MagicCommitter it is O(files/threads).
 # It also saves a couple of S3 calls needed to PUT the "{{{}.pendingset{}}}" 
files and READ call to read them in the Job Driver.

 

*TradeOffs :-*

The tradeoffs are similar to the ones in the FileOutputCommitter V2 version. 
Users migrating from FileOutputCommitter V2 to OptimizedS3AMagicCommitter 
will see no behavioral change as such:
 # During execution, intermediate data becomes visible after commitTask 
operation
 # On a failure, all output must be deleted and the job needs to be restarted.

 

*Performance Benchmark :-*

Cluster : c4.8x large (ec2-instance)
Instance : 1 (primary) + 5 (core)
Data Size : 3TB (TPC-DS store_sales data)
Engine : Apache Spark 3.3.1

Query: The following query i

[jira] [Comment Edited] (HADOOP-16963) HADOOP-16582 changed mkdirs() behavior

2020-06-22 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142033#comment-17142033
 ] 

Syed Shameerur Rahman edited comment on HADOOP-16963 at 6/22/20, 1:23 PM:
--

[~ste...@apache.org] Yes, Hive uses *ProxyFileSystem* for running the qtests. 
As you said, we need to override FilterFS.mkdirs(path) in Hive to avoid the 
qtests failing. Verified locally by overriding mkdirs(path) in 
ProxyFileSystem (a sketch follows below). I will raise a corresponding Jira 
in Hive.

Sample failures:

{code:java}
Caused by: java.lang.IllegalArgumentException: Wrong FS: 
pfile:/media/ebs1/workspace/hive-3.1-qtest/group/5/label/HiveQTest/hive-1.2.0/itests/qtest/target/warehouse/dest1,
 expected: file:///
{code}



cc: [~kgyrtkirk]
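For reference, the shape of that override (illustrative only; the real 
ProxyFileSystem also translates the pfile:// scheme back to file://, which 
this sketch omits):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class ProxyFileSystemFixSketch extends FilterFileSystem {
  @Override
  public boolean mkdirs(Path f) throws IOException {
    // Route the one-argument overload through the overridden two-argument
    // method instead of letting FilterFileSystem forward it straight to the
    // wrapped fs, which would bypass the proxy's own path handling.
    return mkdirs(f, FsPermission.getDirDefault());
  }
}
{code}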


was (Author: srahman):
[~ste...@apache.org] Yes, Hive uses *ProxyFileSystem* for running the qtests. 
As you said, we need to override FilterFS.mkdirs(path) in Hive to avoid the 
qtests failing. Verified locally by overriding mkdirs(path) in 
ProxyFileSystem. I will raise a corresponding Jira in Hive.
cc: [~kgyrtkirk]

> HADOOP-16582 changed mkdirs() behavior
> --
>
> Key: HADOOP-16963
> URL: https://issues.apache.org/jira/browse/HADOOP-16963
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.10.0, 3.3.0, 2.8.6, 2.9.3, 3.1.3, 3.2.2
>Reporter: Wei-Chiu Chuang
>Priority: Critical
>
> HADOOP-16582 changed behavior of {{mkdirs()}}
> Some Hive tests depend on the old behavior and they fail miserably.
> {quote}
> earlier:
> all plain mkdirs(somePath) calls were fast-tracked to FileSystem.mkdirs, 
> which rerouted them to the mkdirs(somePath, somePerm) method with some 
> defaults (which were static)
> an implementation of FileSystem only needed to implement "mkdirs(somePath, 
> somePerm)" - because the other was not necessarily called if it was always 
> in a FilterFileSystem or something like that
> now:
> especially FilterFileSystem forwards the call of mkdirs(p) to the actual fs 
> implementation...which may skip overridden mkdirs(somePath, somePerm) methods
> ...and could cause issues for existing FileSystem implementations
> {quote}
> File this jira to address this problem.
> [~kgyrtkirk] [~ste...@apache.org] [~kihwal]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


