[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-08 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824851#comment-17824851
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~srahman] Yes, by setting fs.s3a.committer.uuid and having the magic committer 
pick that up, I was able to run my Hive test case without needing to modify 
Hadoop (3.3.3). However, I still intend to put up a Hadoop PR with my MRv1 
wrapper of MagicS3GuardCommitter (it's implemented similarly to the MRv1 
FileOutputCommitter where it just delegates the calls to the existing MRv2 
version), and I will need to make one minor change to the existing MRv2 
MagicS3GuardCommitter - I need to add a constructor that takes a JobContext 
since MRv1 types require it.
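
As a rough illustration of the delegation idea described above (a sketch only: every class name below is a hypothetical stand-in, not a Hadoop type and not the actual patch), an old-API wrapper can hold a new-API committer, build it from a job-level context, and forward each call:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a job-level context; not a Hadoop class. */
final class SketchJobContext {
    final String jobId;

    SketchJobContext(String jobId) {
        this.jobId = jobId;
    }
}

/** Hypothetical stand-in for the existing new-API (MRv2) committer. */
final class SketchNewApiCommitter {
    final List<String> calls = new ArrayList<>();

    // Constructor that needs only a job-level context, as the wrapper requires.
    SketchNewApiCommitter(SketchJobContext context) {
        calls.add("created for " + context.jobId);
    }

    void setupJob(SketchJobContext context) {
        calls.add("setupJob " + context.jobId);
    }

    void commitJob(SketchJobContext context) {
        calls.add("commitJob " + context.jobId);
    }
}

/** Old-API (MRv1) style wrapper: holds no commit logic, it only forwards. */
final class SketchOldApiWrapper {
    private final SketchNewApiCommitter delegate;

    SketchOldApiWrapper(SketchJobContext context) {
        this.delegate = new SketchNewApiCommitter(context);
    }

    void setupJob(SketchJobContext context) {
        delegate.setupJob(context);
    }

    void commitJob(SketchJobContext context) {
        delegate.commitJob(context);
    }

    List<String> delegateCalls() {
        return delegate.calls;
    }
}

final class WrapperDemo {
    public static void main(String[] args) {
        SketchJobContext ctx = new SketchJobContext("job_1708973874189_0073");
        SketchOldApiWrapper wrapper = new SketchOldApiWrapper(ctx);
        wrapper.setupJob(ctx);
        wrapper.commitJob(ctx);
        System.out.println(wrapper.delegateCalls());
    }
}
```

Keeping the wrapper free of commit logic means any fix to the underlying committer applies to both API flavours.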

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Assignee: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch
>
>
> The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
> that of the job's application master when writing/reading the .pendingset 
> file. This assumption is not valid when running with Tez, which creates 
> slightly different JobIDs for tasks and the application master.
>  
> While the MagicS3GuardCommitter is intended only for MRv2, it mostly works 
> fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run 
> in MR mode. This issue only crops up when running queries with the Tez 
> execution engine. I can upload a patch to Hive 3.1 to reproduce this error on 
> EMR if needed.
>  
> Fixing this will probably require work from both Tez and Hadoop; I wanted to 
> start a discussion here so we can figure out how exactly to go about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-04 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823282#comment-17823282
 ] 

Venkatasubrahmanian Narayanan edited comment on HADOOP-19091 at 3/4/24 6:23 PM:


[~srahman] While that's true, the problem is that the jobAttemptPath itself is 
different between task and AM. If you look at my previous message, you'll see 
the difference is in the "directory" name of the jobAttemptPath:

Task: s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/

vs

AM: s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073/

The extra 0 in the task's JobID makes the two sides look at different 
"directories", so the AM never finds the pending set the task wrote.

There isn't a stack trace per se: the commitJob op just doesn't find any 
pending data to commit, so it goes straight to the cleanup() code (which, since 
my tests are with Hadoop 3.3.3, just looks under __magic and deletes everything 
it finds there).
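
To make the mismatch concrete, here is a small sketch that builds the two job attempt "directories" from the two JobIDs quoted above. The helper name jobAttemptDir is hypothetical; it only mirrors the path layout visible in this thread and is not the committer's actual code:

```java
final class JobAttemptPathSketch {
    // Hypothetical helper: mirrors the "<magic root>/job-<id>/" layout seen in the paths above.
    static String jobAttemptDir(String magicRoot, String jobId) {
        return magicRoot + "/job-" + jobId + "/";
    }

    public static void main(String[] args) {
        String root = "s3a://hive-east-1-bucket/emblembasic/__magic/__magic";
        String taskDir = jobAttemptDir(root, "job_17089738741890_0073"); // ID seen by the task
        String amDir = jobAttemptDir(root, "job_1708973874189_0073");    // ID seen by the AM
        System.out.println("task writes under: " + taskDir);
        System.out.println("AM lists under:    " + amDir);
        System.out.println("same directory?    " + taskDir.equals(amDir));
    }
}
```

Any path derived purely from the engine-assigned JobID will diverge in this way whenever the two sides disagree on that ID.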






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-04 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823282#comment-17823282
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~srahman] While that's true, the problem is that the jobAttemptPath itself is 
different between task and AM. If you look at my previous message, you'll see 
the difference is in the "directory" name of the jobAttemptPath:

 

Task: 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/

vs

AM: 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073/

The extra 0 in the task's JobID makes the two sides look at different 
"directories", so the AM never finds the pending set the task wrote.

There isn't a stack trace per se: the commitJob op just doesn't find any 
pending data to commit, so it goes straight to the cleanup() code (which, since 
my tests are with Hadoop 3.3.3, just looks under __magic and deletes everything 
it finds there).




[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-04 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823282#comment-17823282
 ] 

Venkatasubrahmanian Narayanan edited comment on HADOOP-19091 at 3/4/24 6:22 PM:


[~srahman] While that's true, the problem is that the jobAttemptPath itself is 
different between task and AM. If you look at my previous message, you'll see 
the difference is in the "directory" name of the jobAttemptPath:

 

Task: 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/

vs

AM: 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073/

The extra 0 in the task's JobID makes the two sides look at different 
"directories", so the AM never finds the pending set the task wrote.

There isn't a stack trace per se: the commitJob op just doesn't find any 
pending data to commit, so it goes straight to the cleanup() code (which, since 
my tests are with Hadoop 3.3.3, just looks under __magic and deletes everything 
it finds there).






[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-03-01 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822709#comment-17822709
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~ste...@apache.org] Tez is where the different jobID is generated, but it 
doesn't seem to be the vertex index even though Tez does append the vertex 
index to the generated jobID in MROutput. I'll look into the history of that 
code to see if I'm missing something.

Yes, the magic committer is picking up the jobID from the config.

I'll keep the parallel job change in mind.

The patches I uploaded in the JIRA are patches to Hive, not Hadoop (they were 
just to replicate the behavior). I'll run the tests and add those details when 
I put up the Hadoop PR. Unless you were referring to a Hive PR and I'm 
misunderstanding?




[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-29 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822274#comment-17822274
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~ste...@apache.org] Will keep all that in mind when I work on the Hadoop 
patches, thanks.

 

Hive uses MRv1, and migrating it to MRv2 so that it uses PathOutputCommitter 
etc. internally would be a lot of effort. However, I will see if something 
similar to the factory design can be done with the committer class, since that 
is configured explicitly for this design.

 

[~srahman] From the AM logs:

Job UUID job_1708973874189_0073 source JobID

Starting: Task committer attempt_1708973874189_0073_r_00_1: commitJob(job_1708973874189_0073)
2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] |impl.IOStatisticsStoreImpl|: Incrementing counter op_list_files by 1 with final value 1
2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] |s3a.S3AFileSystem|: listFiles(s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073, false)
2024-02-29 18:05:05,767 [DEBUG] [App Shared Pool - #2] |s3a.S3AFileSystem|: Requesting all entries under emblembasic/__magic/__magic/job-job_1708973874189_0073/ with delimiter '/'

From the task logs:

Job UUID job_17089738741890_0073 source JobID

Saving work of attempt_17089738741890_0073_r_00_0 to 
s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/task_17089738741890_0073_r_00.pendingset

 

It's a very subtle difference (there's an extra 0 in the ID/path used by the 
task). commitJob itself just fails to find the pending set when it lists the 
files, so it doesn't commit the results. The post-commitJob cleanup then still 
deletes the files, since it deletes everything under the __magic directory 
instead of looking only under the job dir.
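
The sequence in the logs above can be mimicked with an in-memory stand-in for the bucket. This is only a sketch under assumptions: the class and its methods are hypothetical and model nothing more than prefix listing and prefix deletion, not S3A itself:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class PendingSetListingSketch {
    // In-memory stand-in for the object store: just a sorted set of keys.
    private final TreeSet<String> keys = new TreeSet<>();

    void put(String key) {
        keys.add(key);
    }

    // Returns every key that starts with the given prefix.
    List<String> listUnder(String prefix) {
        List<String> found = new ArrayList<>();
        for (String key : keys) {
            if (key.startsWith(prefix)) {
                found.add(key);
            }
        }
        return found;
    }

    // Deletes every key under the prefix and reports how many were removed.
    int deleteUnder(String prefix) {
        List<String> doomed = listUnder(prefix);
        keys.removeAll(doomed);
        return doomed.size();
    }

    public static void main(String[] args) {
        PendingSetListingSketch store = new PendingSetListingSketch();
        String magic = "emblembasic/__magic/__magic/";
        // Task side: writes its pending set under the ID it was given.
        store.put(magic + "job-job_17089738741890_0073/task_17089738741890_0073_r_00.pendingset");
        // AM side: lists under its own ID and finds nothing to commit.
        System.out.println("AM sees: " + store.listUnder(magic + "job-job_1708973874189_0073/"));
        // Cleanup works on the whole __magic tree, so the task's file is removed anyway.
        System.out.println("cleanup deleted: " + store.deleteUnder("emblembasic/__magic/"));
    }
}
```

Because the listing comes back empty rather than failing, nothing in this flow raises an error; the output is simply never committed.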




[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821417#comment-17821417
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


Actually, a follow-up thing I remembered: the MagicS3GuardCommitter also does a 
correctness check on the jobID of the task commit vs the job commit, which I've 
had to patch as well. Even if we do get Tez to set fs.s3a.committer.uuid for 
use, we'll need to patch the committer to use that for its correctness check.
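
A sketch of why the check matters, assuming a hypothetical record of what the task wrote (the class, its fields and the value "shared-uuid-0073" are all invented for illustration; the thread does not show the committer's actual check):

```java
final class JobIdCheckSketch {
    /** Hypothetical record of the identifiers a task stored with its pending set. */
    static final class PendingSetInfo {
        final String jobId;   // engine-assigned ID as seen by the task
        final String jobUuid; // shared UUID, if one was configured for the job

        PendingSetInfo(String jobId, String jobUuid) {
            this.jobId = jobId;
            this.jobUuid = jobUuid;
        }
    }

    // Check keyed on the engine job ID: fails when task and AM IDs diverge.
    static boolean validByJobId(PendingSetInfo info, String amJobId) {
        return info.jobId.equals(amJobId);
    }

    // Check keyed on the shared UUID: independent of how the engine numbers jobs.
    static boolean validByUuid(PendingSetInfo info, String amJobUuid) {
        return info.jobUuid.equals(amJobUuid);
    }

    public static void main(String[] args) {
        PendingSetInfo fromTask = new PendingSetInfo("job_17089738741890_0073", "shared-uuid-0073");
        System.out.println("by job ID: " + validByJobId(fromTask, "job_1708973874189_0073"));
        System.out.println("by UUID:   " + validByUuid(fromTask, "shared-uuid-0073"));
    }
}
```

The point is only that a shared identifier has to be used consistently, both for the path and for any validation done at job commit.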

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: 0001-AWS-Hive-Changes.patch, 
> 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, 
> HADOOP-19091-HIVE-WIP.patch



[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821372#comment-17821372
 ] 

Venkatasubrahmanian Narayanan edited comment on HADOOP-19091 at 2/27/24 7:38 PM:
-

[~srahman] I've uploaded my WIP Hive patch (there are a couple of other open 
sourced patches which need to be backported to Hive 3.1 that I've uploaded as 
well). I still need to clean up a couple of things (hence why the patch 
hardcodes an expectation that tables are on S3), but the basic idea is to add 
an MRv1 wrapper of the MagicS3GuardCommitter similar to how the 
FileOutputCommitter for MRv1 is implemented, and since Hive uses MRv1 it only 
requires incidental changes to treat paths the way the magic committer expects.

 

I was able to reproduce the behavior with a simple Pig script (load from CSV, 
then store into a table with HCatStorer) on EMR 6-12.0. In the task and AM logs 
you can see the behavior I described, where the path the task container writes 
the pending set to is subtly different from the path the AM tries to read it 
from (in my tests it differed by a single 0 appended after the first part of 
the jtIdentifier). The path is derived from the UUID, which in the default case 
is derived from the jobId. When I patch hadoop-aws to manually drop that extra 
digit from the jtIdentifier string, the data is successfully committed (showing 
that no other factor is at play), but obviously that approach would not work in 
a real solution.
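
The "UUID falls back to the jobId" behaviour can be sketched as follows. This is an assumption-laden illustration: a plain Map stands in for the job configuration, resolveJobUuid is a hypothetical helper, and the value "shared-uuid-0073" is invented; only the property name fs.s3a.committer.uuid comes from this thread:

```java
import java.util.HashMap;
import java.util.Map;

final class JobUuidSketch {
    // Property name as used elsewhere in this thread.
    static final String UUID_KEY = "fs.s3a.committer.uuid";

    // Hypothetical resolution order: explicit UUID from the config first, else the job ID.
    static String resolveJobUuid(Map<String, String> conf, String jobId) {
        String configured = conf.get(UUID_KEY);
        if (configured != null && !configured.isEmpty()) {
            return configured;
        }
        return jobId;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Default case: each side falls back to its own job ID, so the values differ.
        System.out.println("task: " + resolveJobUuid(conf, "job_17089738741890_0073"));
        System.out.println("AM:   " + resolveJobUuid(conf, "job_1708973874189_0073"));
        // With a UUID set once for the whole job, both sides resolve the same value.
        conf.put(UUID_KEY, "shared-uuid-0073");
        System.out.println("task: " + resolveJobUuid(conf, "job_17089738741890_0073"));
        System.out.println("AM:   " + resolveJobUuid(conf, "job_1708973874189_0073"));
    }
}
```

Setting the value once, before the tasks and the AM read their configuration, avoids depending on how the execution engine formats its IDs.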






[jira] [Comment Edited] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821372#comment-17821372
 ] 

Venkatasubrahmanian Narayanan edited comment on HADOOP-19091 at 2/27/24 7:34 PM:
-

[~srahman] I've uploaded my WIP Hive patch (there are a couple of other open 
sourced patches which need to be backported to Hive 3.1 that I've uploaded as 
well). I still need to clean up a couple of things (hence why the patch 
hardcodes an expectation that tables are on S3), but the basic idea is to add 
an MRv1 wrapper of the MagicS3GuardCommitter similar to how the 
FileOutputCommitter for MRv1 is implemented, and since Hive uses MRv1 it only 
requires incidental changes to treat paths the way the magic committer expects.


was (Author: vnarayanan7):
[~srahman] I've uploaded my WIP Hive patch (there are a couple of other open 
sourced patches which need to be backported to Hive 3.1 that I've uploaded as 
well). I still need to clean up a couple of things (hence why the patch 
hardcodes an expectation that tables are on. S3), but the basic idea is to add 
an MRv1 wrapper of the MagicS3GuardCommitter similar to how the 
FileOutputCommitter for MRv1 is implemented, and since Hive uses MRv1 it only 
requires incidental changes to treat paths the way the magic committer expects.




[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821372#comment-17821372
 ] 

Venkatasubrahmanian Narayanan commented on HADOOP-19091:


[~srahman] I've uploaded my WIP Hive patch (there are a couple of other open 
sourced patches which need to be backported to Hive 3.1 that I've uploaded as 
well). I still need to clean up a couple of things (hence why the patch 
hardcodes an expectation that tables are on S3), but the basic idea is to add 
an MRv1 wrapper of the MagicS3GuardCommitter similar to how the 
FileOutputCommitter for MRv1 is implemented, and since Hive uses MRv1 it only 
requires incidental changes to treat paths the way the magic committer expects.




[jira] [Updated] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatasubrahmanian Narayanan updated HADOOP-19091:
---
Attachment: 0001-AWS-Hive-Changes.patch
0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch




[jira] [Updated] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-27 Thread Venkatasubrahmanian Narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatasubrahmanian Narayanan updated HADOOP-19091:
---
Attachment: HADOOP-19091-HIVE-WIP.patch

> Add support for Tez to MagicS3GuardCommitter
> 
>
> Key: HADOOP-19091
> URL: https://issues.apache.org/jira/browse/HADOOP-19091
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
>Reporter: Venkatasubrahmanian Narayanan
>Priority: Major
> Attachments: HADOOP-19091-HIVE-WIP.patch



[jira] [Created] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter

2024-02-26 Thread Venkatasubrahmanian Narayanan (Jira)
Venkatasubrahmanian Narayanan created HADOOP-19091:
--

 Summary: Add support for Tez to MagicS3GuardCommitter
 Key: HADOOP-19091
 URL: https://issues.apache.org/jira/browse/HADOOP-19091
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools
Affects Versions: 3.3.3
 Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0
Reporter: Venkatasubrahmanian Narayanan


The MagicS3GuardCommitter assumes that the JobID of the task is the same as 
that of the job's application master when writing/reading the .pendingset file. 
This assumption is not valid when running with Tez, which creates slightly 
different JobIDs for tasks and the application master.

 

While the MagicS3GuardCommitter is intended only for MRv2, it mostly works fine 
with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run in MR 
mode. This issue only crops up when running queries with the Tez execution 
engine. I can upload a patch to Hive 3.1 to reproduce this error on EMR if 
needed.

 

Fixing this will probably require work from both Tez and Hadoop; I wanted to 
start a discussion here so we can figure out how exactly to go about this.


