[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17827768#comment-17827768 ] Syed Shameerur Rahman commented on HADOOP-19091: Ok, [~vnarayanan7] Please feel free to raise PR for the same. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17824851#comment-17824851 ] Venkatasubrahmanian Narayanan commented on HADOOP-19091: [~srahman] Yes, by setting fs.s3a.committer.uuid and having the magic committer pick that up, I was able to run my Hive test case without needing to modify Hadoop (3.3.3). However, I still intend to put up a Hadoop PR with my MRv1 wrapper of MagicS3GuardCommitter (it's implemented similarly to the MRv1 FileOutputCommitter where it just delegates the calls to the existing MRv2 version), and I will need to make one minor change to the existing MRv2 MagicS3GuardCommitter - I need to add a constructor that takes a JobContext since MRv1 types require it. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823486#comment-17823486 ] Syed Shameerur Rahman commented on HADOOP-19091: [~vnarayanan7] - Thanks for the logs with example. Please feel free to raise the PR with the required changes. I can help with the review. Is it possible to scope down the changes only to Tez by setting `fs.s3a.committer.uuid` appropriately ? > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823282#comment-17823282 ] Venkatasubrahmanian Narayanan commented on HADOOP-19091: [~srahman] While that's true, the problem is that the jobAttemptPath itself is different between task and AM. If you look at my previous message, you'll see the difference is in the "directory" name of the jobAttemptPath: Task: s3a://hive-east-1-bucket/emblembasic/__{_}magic/{_}__magic/job-job_17089738741890_0073/ vs AM: s3a://hive-east-1-bucket/emblembasic/__{_}magic/__{_}{_}magic/job-job_1708973874189_0073/{_} The extra 0 in the JobID causes them to look at different "directories", and hence it doesn't find it. There isn't a stacktrace per se - the commitJob op just doesn't find any pending data to commit, so it just goes to the cleanup() code(where since my test are with Hadoop 3.3.3, it just looks under __magic and finds everything to be deleted). > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17823030#comment-17823030 ] Syed Shameerur Rahman commented on HADOOP-19091: [~vnarayanan7] - Could you please share the complete error stacktrace ? As i could see from the code implementation, During commitJob operation, [listPendingUploadToCommit|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L124] method is invoked which list all the files under the jobAttemptPath with a suffix `.pendingset`. What i understand from your comment is that, The `getJobAttemptPath` is not returning correct value (for Hive,Pig with Tez) and hence the commitJob is not able to read the commit metadata. Is my understanding correct ? > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822709#comment-17822709 ] Venkatasubrahmanian Narayanan commented on HADOOP-19091: [~ste...@apache.org] Tez is where the different jobID is generated, but it doesn't seem to be the vertex index even though Tez does append the vertex index to the generated jobID in MROutput. I'll look into the history of that code to see if I'm missing something. Yes, the magic committer is picking up the jobID from the config. I'll keep the parallel job change in mind. The patches I uploaded in the JIRA are patches to Hive, not Hadoop(since they were just to replicate the behavior). I'll run the tests and add those details when I put up the Hadoop PR. Unless you were referring to a Hive PR and I'm misunderstanding? > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822555#comment-17822555 ] Steve Loughran commented on HADOOP-19091: - 1. can you find out why a different job id is being passed down. What is the hive jira where they made that decision? because the spark design "just use the current time" was simply a choice of the easiest unique value -which ended up breaking when two tasks were launched in the same second. 2. magic committer will pick up the job id as passed in through the config. bq. The post-commitJob cleanup does delete the files since it deletes everything under the __magic directory instead of looking under the job dir, not on trunk, it now supports parallel jobs with their own __magic-$jobID paths. so no cleanup [~vnarayanan7] stick the patch up as a Github pr. do run the hadoop-aws integration tests and tell us which endpoint/region you tested against. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822274#comment-17822274 ] Venkatasubrahmanian Narayanan commented on HADOOP-19091: [~ste...@apache.org] Will keep all that in mind when I work on the Hadoop patches, thanks. Hive uses MRv1 and migrating it to MRv2 would be a lot of effort to switch it over to use PathOutputCommitter etc. internally, however, I will see if something similar to the factory design can be done with the committer class since that is configured explicitly for this design. [~srahman] From the AM logs: Job UUID job_1708973874189_0073 source JobID Starting: Task committer attempt_1708973874189_0073_r_00_1: commitJob(job_1708973874189_0073) 2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] |impl.IOStatisticsStoreImpl|: Incrementing counter op_list_files by 1 with final value 1 2024-02-29 18:05:05,766 [DEBUG] [App Shared Pool - #2] |s3a.S3AFileSystem|: listFiles(s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_1708973874189_0073, false) 2024-02-29 18:05:05,767 [DEBUG] [App Shared Pool - #2] |s3a.S3AFileSystem|: Requesting all entries under emblembasic/__magic/__magic/job-job_1708973874189_0073/ with delimiter '/' >From the task logs: Job UUID job_17089738741890_0073 source JobID Saving work of attempt_17089738741890_0073_r_00_0 to s3a://hive-east-1-bucket/emblembasic/__magic/__magic/job-job_17089738741890_0073/task_17089738741890_0073_r_00.pendingset It's a very subtle difference(there's an extra 0 in the ID/path used by the task). The post-commitJob cleanup does delete the files since it deletes everything under the __magic directory instead of looking under the job dir, commitJob itself just fails to find the pending set when it lists the files so it doesn't commit the results.. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821935#comment-17821935 ] Syed Shameerur Rahman commented on HADOOP-19091: [~vnarayanan7] - Can you share the required logs (with DEBUG) if possible it will give some more clarity. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Assignee: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821619#comment-17821619 ] Steve Loughran commented on HADOOP-19091: - [~vnarayanan7] * everything has to be done via github pRs right now; ;look at the hadoop-aws testing doc to see policies there. * all work to be on hadoop trunk with backports to 3.4.x: don't expect anything to go any earlier. * anything going near committers has to show why it is *correct* rather than just passing some tests. Correct ::= atomic and recoverable task commit, support for speculative tasks. job commit does not need atomicity though it's always nice. * and changes had better not break spark. I have put a lot of effort into not knowing anything about hive; I hope I don't have to change that policy. Having had a quick look at the patch: it would be better if the hive design used our committer factory design as spark does. That allows for different committees for different schemas, e.g. the manifest committer for abfs/gcs. it should also not contained hard coded checks for class types, with PathOutputCommitter the committer API to get working directories. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0, 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821417#comment-17821417 ] Venkatasubrahmanian Narayanan commented on HADOOP-19091: Actually, a follow-up thing I remembered: The MagicS3GuardCommitter also does a correctness check on the jobID of the task commit vs the job commit which I've had to patch as well. Even if we do get Tez to create fs.s3a.uuid for use, we'll need to patch the committer to use that for its correctness check. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0, 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821372#comment-17821372 ] Venkatasubrahmanian Narayanan commented on HADOOP-19091: [~srahman] I've uploaded my WIP Hive patch (there are a couple of other open sourced patches which need to be backported to Hive 3.1 that I've uploaded as well). I still need to clean up a couple of things (hence why the patch hardcodes an expectation that tables are on. S3), but the basic idea is to add an MRv1 wrapper of the MagicS3GuardCommitter similar to how the FileOutputCommitter for MRv1 is implemented, and since Hive uses MRv1 it only requires incidental changes to treat paths the way the magic committer expects. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0, 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Priority: Major > Attachments: 0001-AWS-Hive-Changes.patch, > 0002-HIVE-27698-Backport-of-HIVE-22398-Remove-legacy-code.patch, > HADOOP-19091-HIVE-WIP.patch > > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820967#comment-17820967 ] Syed Shameerur Rahman commented on HADOOP-19091: [~vnarayanan7] - I am not sure why MagicS3GuardCommitter won't work with Tez. In the past I vaguely remember running MagicS3GuardCommitter with Hive 3.1.3 and Tez 0.9.2 (by incorporating the changes mentioned in https://issues.apache.org/jira/browse/HIVE-16295) It would be really helpful, If you can share the replication steps for the same. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0, 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Priority: Major > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820885#comment-17820885 ] Steve Loughran commented on HADOOP-19091: - incidentally, look at {{buildJobUUID}} for different ways to pass down unique job IDs, as well as the jira for that work (HADOOP-17318) and the spark jiras off it. If Tez can generate the right config option to pass down then it may work today > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0, 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Priority: Major > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19091) Add support for Tez to MagicS3GuardCommitter
[ https://issues.apache.org/jira/browse/HADOOP-19091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820882#comment-17820882 ] Steve Loughran commented on HADOOP-19091: - best place to start. be aware that the use of job id to isolate jobs has expanded in HADOOP-18797; you need a plan to not conflict with this. > Add support for Tez to MagicS3GuardCommitter > > > Key: HADOOP-19091 > URL: https://issues.apache.org/jira/browse/HADOOP-19091 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0, 3.3.6 > Environment: Pig 17/Hive 3.1.3 with Hadoop 3.3.3 on AWS EMR 6-12.0 >Reporter: Venkatasubrahmanian Narayanan >Priority: Major > > The MagicS3GuardCommitter assumes that the JobID of the task is the same as > that of the job's application master when writing/reading the .pendingset > file. This assumption is not valid when running with Tez, which creates > slightly different JobIDs for tasks and the application master. > > While the MagicS3GuardCommitter is intended only for MRv2, it mostly works > fine with an MRv1 wrapper with Hive/Pig (with some minor changes to Hive) run > in MR mode. This issue only crops up when running queries with the Tez > execution engine. I can upload a patch to Hive 3.1 to reproduce this error on > EMR if needed. > > Fixing this will probably require work from both Tez and Hadoop, wanted to > start a discussion here so we can figure out how exactly we go about this. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org