[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-17 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827768#comment-17827768
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


Ok, [~vnarayanan7] Please feel free to raise PR for the same.

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-08 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824851#comment-17824851
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~srahman] Yes, by setting fs.s3a.committer.uuid and having the magic committer 
pick that up, I was able to run my Hive test case without needing to modify 
Hadoop (3.3.3). However, I still intend to put up a Hadoop PR with my MRv1 
wrapper of MagicS3GuardCommitter (it's implemented similarly to the MRv1 
FileOutputCommitter where it just delegates the calls to the existing MRv2 
version), and I will need to make one minor change to the existing MRv2 
MagicS3GuardCommitter - I need to add a constructor that takes a JobContext 
since MRv1 types require it.

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-05 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823486#comment-17823486
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - Thanks for the logs with example. 
Please feel free to raise the PR with the required changes. I can help with the 
review.

Is it possible to scope down the changes only to Tez by setting 
`fs.s3a.committer.uuid` appropriately ?

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-04 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823282#comment-17823282
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~srahman] While that's true, the problem is that the jobAttemptPath itself is 
different between task and AM. If you look at my previous message, you'll see 
the difference is in the "directory" name of the jobAttemptPath:

 

Task: 
s3a://hive-east-1-bucket/emblembasic/__{_}magic/{_}__magic/job-job_17089738741890_0073/

vs

AM: 
s3a://hive-east-1-bucket/emblembasic/__{_}magic/__{_}{_}magic/job-job_1708973874189_0073/{_}

The extra 0 in the JobID causes them to look at different "directories", and 
hence it doesn't find it.

 

There isn't a stacktrace per se - the commitJob op just doesn't find any 
pending data to commit, so it just goes to the cleanup() code(where since my 
test are with Hadoop 3.3.3, it just looks under __magic and finds everything to 
be deleted).

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-03 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823030#comment-17823030
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - Could you please share the complete error stacktrace ?

As i could see from the code implementation, During commitJob operation, 
[listPendingUploadToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124]
 method is invoked which list all the files under the jobAttemptPath with a 
suffix `.pendingset`.

What i understand from your comment is that, The `getJobAttemptPath` is not 
returning correct value (for Hive,Pig with Tez) and hence the commitJob is not 
able to read the commit metadata. Is my understanding correct ?

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-01 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822709#comment-17822709
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~ste...@apache.org] Tez is where the different jobID is generated, but it 
doesn't seem to be the vertex index even though Tez does append the vertex 
index to the generated jobID in MROutput. I'll look into the history of that 
code to see if I'm missing something.

Yes, the magic committer is picking up the jobID from the config.

I'll keep the parallel job change in mind.

The patches I uploaded in the JIRA are patches to Hive, not Hadoop(since they 
were just to replicate the behavior). I'll run the tests and add those details 
when I put up the Hadoop PR. Unless you were referring to a Hive PR and I'm 
misunderstanding?

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-01 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822555#comment-17822555
 ] 

Steve Loughran commented on HADOOP-19091:
-

1. can you find out why a different job id is being passed down. What is the 
hive jira where they made that decision? because the spark design "just use the 
current time" was simply a choice of the easiest unique value -which ended up 
breaking when two tasks were launched in the same second.
2. magic committer will pick up the job id as passed in through the config. 

bq. The post-commitJob cleanup does delete the files since it deletes 
everything under the __magic directory instead of looking under the job dir,

not on trunk, it now supports parallel jobs with their own __magic-$jobID 
paths. so no cleanup

[~vnarayanan7] stick the patch up as a Github pr. do run the hadoop-aws 
integration tests and tell us which endpoint/region you tested against. 

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-29 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822274#comment-17822274
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~ste...@apache.org] Will keep all that in mind when I work on the Hadoop 
patches, thanks.

 

Hive uses MRv1 and migrating it to MRv2 would be a lot of effort to switch it 
over to use PathOutputCommitter etc. internally, however, I will see if 
something similar to the factory design can be done with the committer class 
since that is configured explicitly for this design.

 

[~srahman] From the AM logs:

Job UUID job_1708973874189_0073 source JobID

Starting: Task committer attempt_1708973874189_0073_r_00_1: 
commitJob(job_1708973874189_0073) 2024-02-29 18:05:05,766 [DEBUG] [App Shared 
Pool - #2] |impl.IOStatisticsStoreImpl|: Incrementing counter op_list_files by 
1 with final value 1 2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] 
|s3a.S3AFileSystem|: 
listFiles(s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073,
 false) 2024-02-29 18:05:05,767 [DEBUG] [App Shared Pool - #2] 
|s3a.S3AFileSystem|: Requesting all entries under 
emblembasic/__magic/__magic/job-job_1708973874189_0073/ with delimiter '/'

>From the task logs:

Job UUID job_17089738741890_0073 source JobID

Saving work of attempt_17089738741890_0073_r_00_0 to 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/task_17089738741890_0073_r_00.pendingset

 

It's a very subtle difference(there's an extra 0 in the ID/path used by the 
task). The post-commitJob cleanup does delete the files since it deletes 
everything under the __magic directory instead of looking under the job dir, 
commitJob itself just fails to find the pending set when it lists the files so 
it doesn't commit the results..

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-28 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821935#comment-17821935
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - Can you share the required logs (with DEBUG) if possible it 
will give some more clarity.

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-28 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821619#comment-17821619
 ] 

Steve Loughran commented on HADOOP-19091:
-

[~vnarayanan7] 
* everything has to be done via github pRs right now; ;look at the hadoop-aws 
testing doc to see policies there.
* all work to be on hadoop trunk with backports to 3.4.x: don't expect anything 
to go any earlier.
* anything going near committers has to show why it is *correct* rather than 
just passing some tests. Correct ::= atomic and recoverable task commit, 
support for speculative tasks. job commit does not need atomicity though it's 
always nice. 
* and changes had better not break spark. 

I have put a lot of effort into not knowing anything about hive; I hope I don't 
have to change that policy. Having had a quick look at the patch:  it would be 
better if the hive design used our committer factory design as spark does. That 
allows for different committees for different schemas, e.g. the manifest 
committer for abfs/gcs. it should also not contained hard coded checks for 
class types, with PathOutputCommitter the committer API to get working 
directories.




> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821417#comment-17821417
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


Actually, a follow-up thing I remembered: The MagicS3GuardCommitter also does a 
correctness check on the jobID of the task commit vs the job commit which I've 
had to patch as well. Even if we do get Tez to create fs.s3a.uuid for use, 
we'll need to patch the committer to use that for its correctness check.

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821372#comment-17821372
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~srahman] I've uploaded my WIP Hive patch (there are a couple of other open 
sourced patches which need to be backported to Hive 3.1 that I've uploaded as 
well). I still need to clean up a couple of things (hence why the patch 
hardcodes an expectation that tables are on. S3), but the basic idea is to add 
an MRv1 wrapper of the MagicS3GuardCommitter similar to how the 
FileOutputCommitter for MRv1 is implemented, and since Hive uses MRv1 it only 
requires incidental changes to treat paths the way the magic committer expects.

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-26 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820967#comment-17820967
 ] 

Syed Shameerur Rahman commented on HADOOP-19091:


[~vnarayanan7] - I am not sure why MagicS3GuardCommitter won't work with Tez. 
In the past I vaguely remember running MagicS3GuardCommitter with Hive 3.1.3 
and Tez  0.9.2 (by incorporating the changes mentioned in 
https://issues.apache.org/jira/browse/HIVE-16295)
It would be really helpful, If you can share the replication steps for the same.


> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-26 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820885#comment-17820885
 ] 

Steve Loughran commented on HADOOP-19091:
-

incidentally, look at {{buildJobUUID}} for different ways to pass down unique 
job IDs, as well as the jira for that work (HADOOP-17318) and the spark jiras 
off it. If Tez can generate the right config option to pass down then it may 
work today

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-26 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820882#comment-17820882
 ] 

Steve Loughran commented on HADOOP-19091:
-

best place to start. be aware that the use of job id to isolate jobs has 
expanded in HADOOP-18797; you need a plan to not conflict with this. 

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop, wanted to 
> start a discussion here so we can figure out how exactly we go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org