[jira] [Updated] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gergő Pásztor updated MAPREDUCE-6839:
Attachment: MAPREDUCE-6839_v4.patch
> TestRecovery.testCrashed failed
> ---
>
> Key: MAPREDUCE-6839
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6839
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: test
> Reporter: Gergő Pásztor
> Assignee: Gergő Pásztor
> Attachments: MAPREDUCE-6839_v1.patch, MAPREDUCE-6839_v2.patch, MAPREDUCE-6839_v3.patch, MAPREDUCE-6839_v4.patch
>
> TestRecovery#testCrashed is a flaky test.
> Error Message:
> Reduce Task state not correct expected: but was:
> Stack Trace:
> java.lang.AssertionError: Reduce Task state not correct expected: but was:
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.apache.hadoop.mapreduce.v2.app.TestRecovery.testCrashed(TestRecovery.java:164)
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899501#comment-15899501 ]
Gergő Pásztor commented on MAPREDUCE-6839:
I uploaded a new patch. It resolves 3 problems:
- The TestRecovery.testCrashed failure
- The TestRecovery.testSpeculative failure, which is caused by a conflict with the preceding TestRecovery.testCrashed
- The intermittent TestRecovery.testRecoveryWithoutShuffleSecret failure, which has the same cause as the testCrashed failure
[jira] [Commented] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899504#comment-15899504 ]
Peter Bacsko commented on MAPREDUCE-6839:
[~haibochen] I don't know exactly, but somehow the unstopped MRAppMaster either deletes the {{.jhist}} file or doesn't make it possible for the next appmaster to create it. It's not obvious what's happening, but {{testSpeculative}} never fails on its own. If the appmaster is properly stopped, the failure doesn't cascade.
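The cascading failure described in the comment above can be illustrated with a small, self-contained sketch. It does not use the real MRAppMaster or TestRecovery code: `StubAppMaster`, `OPEN_HISTORY_FILES`, and `runIsolated` are hypothetical names, and the shared set only stands in for whatever state an unstopped app master actually leaves behind (the comment itself says the exact mechanism is unclear). The point is only that stopping the app master in a `finally` block keeps one failed test from poisoning the next.

```java
import java.util.HashSet;
import java.util.Set;

final class AppMasterCleanupSketch {
    // Stand-in for shared state (for example a history file) that outlives one test.
    static final Set<String> OPEN_HISTORY_FILES = new HashSet<>();

    // Hypothetical stub; this is NOT the real MRAppMaster API.
    static final class StubAppMaster {
        private final String historyFile;

        StubAppMaster(String historyFile) {
            // Assumption: a second app master cannot create a file an earlier one still holds.
            if (!OPEN_HISTORY_FILES.add(historyFile)) {
                throw new IllegalStateException(historyFile + " is still held by an earlier app master");
            }
            this.historyFile = historyFile;
        }

        void stop() {
            OPEN_HISTORY_FILES.remove(historyFile);
        }
    }

    // Runs a test body and always stops the app master, even when the body fails.
    static boolean runIsolated(String historyFile, Runnable body) {
        StubAppMaster app = new StubAppMaster(historyFile);
        try {
            body.run();
            return true;
        } catch (AssertionError e) {
            return false; // the test itself failed
        } finally {
            app.stop(); // without this, the next test could not start cleanly
        }
    }

    public static void main(String[] args) {
        boolean first = runIsolated("job_1.jhist", () -> {
            throw new AssertionError("simulated flaky failure");
        });
        boolean second = runIsolated("job_1.jhist", () -> { });
        System.out.println("first=" + first + ", second=" + second);
    }
}
```

If the `finally` block is removed, the second call throws `IllegalStateException`, which mirrors a failure that only appears when an earlier test misbehaved.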
[jira] [Commented] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899505#comment-15899505 ]
Peter Bacsko commented on MAPREDUCE-6839:
+1 (non-binding) LGTM
[jira] [Comment Edited] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899501#comment-15899501 ]
Gergő Pásztor edited comment on MAPREDUCE-6839 at 3/7/17 2:18 PM:
I uploaded a new patch. It resolves 3 problems:
- The TestRecovery.testCrashed failure
- The TestRecovery.testSpeculative failure, which is caused by a conflict with the preceding TestRecovery.testCrashed
- The intermittent TestRecovery.testRecoveryWithoutShuffleSecret failure, which has the same cause as the testCrashed failure
The patched code ran for 3 days without issues.
was (Author: pairg):
I uploaded a new patch. It resolves 3 problems:
- The TestRecovery.testCrashed failure
- The TestRecovery.testSpeculative failure, which is caused by a conflict with the preceding TestRecovery.testCrashed
- The intermittent TestRecovery.testRecoveryWithoutShuffleSecret failure, which has the same cause as the testCrashed failure
[jira] [Commented] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15899547#comment-15899547 ]
Hadoop QA commented on MAPREDUCE-6839:
| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 13m 1s | trunk passed |
| +1 | compile | 0m 23s | trunk passed |
| +1 | checkstyle | 0m 16s | trunk passed |
| +1 | mvnsite | 0m 24s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 0m 35s | trunk passed |
| +1 | javadoc | 0m 16s | trunk passed |
| +1 | mvninstall | 0m 21s | the patch passed |
| +1 | compile | 0m 20s | the patch passed |
| +1 | javac | 0m 20s | the patch passed |
| +1 | checkstyle | 0m 14s | the patch passed |
| +1 | mvnsite | 0m 23s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 41s | the patch passed |
| +1 | javadoc | 0m 13s | the patch passed |
| +1 | unit | 9m 6s | hadoop-mapreduce-client-app in the patch passed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 27m 56s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856605/MAPREDUCE-6839_v4.patch |
| JIRA Issue | MAPREDUCE-6839 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 2fae10ddb697 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f597f4c |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6913/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6913/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yufei Gu updated MAPREDUCE-6858:
Description:
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, I wouldn't be surprised if jobs end up piling up, going unprocessed for quite some time, and producing lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
was:
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, I wouldn't be surprised if jobs end up piling up, going unprocessed for quite some time, and producing lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
[jira] [Created] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs
Yufei Gu created MAPREDUCE-6858:
Summary: HistoryFileManager thrashing due to high volume jobs
Key: MAPREDUCE-6858
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6858
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver
Reporter: Yufei Gu

- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, I wouldn't be surprised if jobs end up piling up, going unprocessed for quite some time, and producing lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yufei Gu updated MAPREDUCE-6858:
Description:
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, jobs would pile up, go unprocessed for quite some time, and produce lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
was:
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, I wouldn't be surprised if jobs end up piling up, going unprocessed for quite some time, and producing lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yufei Gu updated MAPREDUCE-6858:
Description:
The log of JHS shows that it tried to move the same *.jhist file twice, and the second move causes FileNotFoundExceptions.
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, jobs would pile up, go unprocessed for quite some time, and produce lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
was:
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, jobs would pile up, go unprocessed for quite some time, and produce lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
[jira] [Updated] (MAPREDUCE-6858) HistoryFileManager thrashing due to high volume jobs
[ https://issues.apache.org/jira/browse/MAPREDUCE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yufei Gu updated MAPREDUCE-6858:
Description:
JHS log shows that it tried to move the same *.jhist file twice, and the second move causes FileNotFoundExceptions.
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, jobs would pile up, go unprocessed for quite some time, and produce lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
was:
The log of JHS shows that it tried to move the same *.jhist file twice, and the second move causes FileNotFoundExceptions.
- JHS scans "done_intermediate" dir for files to process and adds them to a thread pool
- Thread pool starts processing these files to move them to "done" dir
- JHS scans "done_intermediate" again for files to process and adds them to a thread pool
-- If there are enough jobs that the thread pool can't keep up with the scanning interval, they'll get added twice (or more). If this keeps compounding, jobs would pile up, go unprocessed for quite some time, and produce lots of FileNotFoundExceptions.
By default, it looks like the thread pool only has 3 threads in it (mapreduce.jobhistory.move.thread-count). And the scan interval is 3 minutes (mapreduce.jobhistory.move.interval-ms). Perhaps we should increase these?
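The double-queueing described in MAPREDUCE-6858 can be sketched outside Hadoop. The class below is not HistoryFileManager code and is not a fix proposed in the issue (the reporter only suggests raising the thread count and scan interval); it is a minimal illustration, under the assumption that each scan submits one move task per file, of how tracking in-flight files would keep a second scan from queueing the same file again. `MoveDedupSketch`, `scheduleMove`, and `shutdownAndWait` are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class MoveDedupSketch {
    // Files that are queued or currently being moved.
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();
    private final ExecutorService pool;
    final AtomicInteger moves = new AtomicInteger();

    MoveDedupSketch(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Queues a move unless an earlier scan already queued the same file.
    boolean scheduleMove(String file, Runnable move) {
        if (!inFlight.add(file)) {
            return false; // duplicate submission, skip it
        }
        pool.execute(() -> {
            try {
                move.run();
                moves.incrementAndGet();
            } finally {
                inFlight.remove(file); // allow a later retry if needed
            }
        });
        return true;
    }

    void shutdownAndWait() {
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        MoveDedupSketch mgr = new MoveDedupSketch(3); // 3 mirrors the default thread count cited above
        CountDownLatch gate = new CountDownLatch(1);  // keeps the move "slow" until released
        Runnable slowMove = () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        boolean firstScan = mgr.scheduleMove("job_1.jhist", slowMove);
        boolean secondScan = mgr.scheduleMove("job_1.jhist", slowMove);
        gate.countDown();
        mgr.shutdownAndWait();
        System.out.println(firstScan + " " + secondScan + " moves=" + mgr.moves.get());
    }
}
```

In this sketch the second scan is rejected while the first move is still pending, so the file is moved once and no second move can hit a missing source file.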
[jira] [Commented] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900198#comment-15900198 ]
Robert Kanter commented on MAPREDUCE-6839:
+1 will commit shortly
[jira] [Updated] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Kanter updated MAPREDUCE-6839:
Resolution: Fixed
Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha3, 2.9.0
Status: Resolved (was: Patch Available)
Thanks [~pairg] and others for the reviews. Committed to trunk and branch-2!
[jira] [Moved] (MAPREDUCE-6859) hadoop-mapreduce-client-jobclient.jar sets a main class that isn't in the JAR
[ https://issues.apache.org/jira/browse/MAPREDUCE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe moved YARN-6303 to MAPREDUCE-6859:
Affects Version/s: (was: 3.0.0-alpha2) 3.0.0-alpha2
Component/s: (was: client) client
Key: MAPREDUCE-6859 (was: YARN-6303)
Project: Hadoop Map/Reduce (was: Hadoop YARN)
> hadoop-mapreduce-client-jobclient.jar sets a main class that isn't in the JAR
> -
>
> Key: MAPREDUCE-6859
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6859
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 3.0.0-alpha2
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Priority: Minor
> Attachments: YARN-6303.001.patch
>
> The manifest for hadoop-mapreduce-client-jobclient.jar points to {{org.apache.hadoop.test.MapredTestDriver}}, which is in the test JAR. Without the test JAR in the class path, running the jobclient JAR will fail with a class not found exception.
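The condition reported in MAPREDUCE-6859 can be checked mechanically. The following is a minimal sketch, not part of the attached patch: it reads a JAR's manifest with the standard `java.util.jar` API and reports whether the declared `Main-Class` is actually packaged in that JAR. The class name `MainClassCheck` and its methods are illustrative, and the JAR paths are supplied by the caller.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

final class MainClassCheck {
    // Converts a class name such as a.b.C to the JAR entry path a/b/C.class.
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true only if the JAR declares a Main-Class and contains that class.
    static boolean mainClassIsPackaged(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return false; // no manifest at all
            }
            String mainClass = manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
            if (mainClass == null) {
                return false; // manifest has no Main-Class attribute
            }
            return jar.getEntry(entryNameFor(mainClass.trim())) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String jarPath : args) {
            String verdict = mainClassIsPackaged(jarPath) ? "ok" : "Main-Class missing from JAR";
            System.out.println(jarPath + ": " + verdict);
        }
    }
}
```

Run against the jobclient JAR described above, a check like this would be expected to report the main class as missing, since the issue states that `org.apache.hadoop.test.MapredTestDriver` lives in the test JAR.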
[jira] [Commented] (MAPREDUCE-6839) TestRecovery.testCrashed failed
[ https://issues.apache.org/jira/browse/MAPREDUCE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900287#comment-15900287 ]
Hudson commented on MAPREDUCE-6839:
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11367 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11367/])
MAPREDUCE-6839. TestRecovery.testCrashed failed (pairg via rkanter) (rkanter: rev 38d75dfd3a643f8a1acd52e025a466d65065b60e)
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java
[jira] [Commented] (MAPREDUCE-6859) hadoop-mapreduce-client-jobclient.jar sets a main class that isn't in the JAR
[ https://issues.apache.org/jira/browse/MAPREDUCE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900372#comment-15900372 ]

Hadoop QA commented on MAPREDUCE-6859:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 12m 42s | trunk passed |
| +1 | compile | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 24s | trunk passed |
| +1 | mvneclipse | 0m 15s | trunk passed |
| +1 | javadoc | 0m 13s | trunk passed |
| +1 | mvninstall | 0m 21s | the patch passed |
| +1 | compile | 0m 20s | the patch passed |
| +1 | javac | 0m 20s | the patch passed |
| +1 | mvnsite | 0m 22s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| +1 | javadoc | 0m 10s | the patch passed |
| +1 | unit | 104m 4s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 120m 50s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6303 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856683/YARN-6303.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml |
| uname | Linux e38dc0e38e65 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e0c239c |
| Default Java | 1.8.0_121 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/15196/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/15196/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> hadoop-mapreduce-client-jobclient.jar sets a main class that isn't in the JAR
> ---
>
> Key: MAPREDUCE-6859
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6859
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 3.0.0-alpha2
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Priority: Minor
> Attachments: YARN-6303.001.patch
>
> The manifest for hadoop-mapreduce-client-jobclient.jar points to {{org.apache.hadoop.test.MapredTestDriver}}, which is in the test JAR. Without the test JAR in the class path, running the jobclient JAR will fail
[jira] [Commented] (MAPREDUCE-6526) Remove usage of metrics v1 from hadoop-mapreduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900394#comment-15900394 ]

Zhiyuan Yang commented on MAPREDUCE-6526:

Why were the metrics tags removed? [~ajisakaa]

> Remove usage of metrics v1 from hadoop-mapreduce
> ---
>
> Key: MAPREDUCE-6526
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6526
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Akira Ajisaka
> Assignee: Akira Ajisaka
> Priority: Blocker
> Fix For: 3.0.0-alpha1
> Attachments: MAPREDUCE-6526.00.patch, MAPREDUCE-6526.01.patch, MAPREDUCE-6526.02.patch, MAPREDUCE-6526.03.patch
>
> LocalJobRunnerMetrics and ShuffleClientMetrics are still using metrics v1. We should remove these metrics or rewrite them to use metrics v2.
[jira] [Commented] (MAPREDUCE-6859) hadoop-mapreduce-client-jobclient.jar sets a main class that isn't in the JAR
[ https://issues.apache.org/jira/browse/MAPREDUCE-6859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900445#comment-15900445 ]

Hadoop QA commented on MAPREDUCE-6859:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 15m 33s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | mvnsite | 0m 37s | trunk passed |
| +1 | mvneclipse | 0m 21s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
| +1 | mvninstall | 0m 27s | the patch passed |
| +1 | compile | 0m 28s | the patch passed |
| +1 | javac | 0m 28s | the patch passed |
| +1 | mvnsite | 0m 29s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | javadoc | 0m 13s | the patch passed |
| +1 | unit | 112m 55s | hadoop-mapreduce-client-jobclient in the patch passed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 133m 37s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:a9ad5d6 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12856683/YARN-6303.001.patch |
| JIRA Issue | MAPREDUCE-6859 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml |
| uname | Linux 93edc9fcd621 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1598fd3 |
| Default Java | 1.8.0_121 |
| Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6914/testReport/ |
| modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6914/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.

> hadoop-mapreduce-client-jobclient.jar sets a main class that isn't in the JAR
> ---
>
> Key: MAPREDUCE-6859
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6859
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 3.0.0-alpha2
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Priority: Minor
> Attachments: YARN-6303.001.patch
>
> The manifest for hadoop-mapreduce-client-jobclient.jar points to {{org.apache.hadoop.test.MapredTestDriver}}, which is in the test JAR. Without the test JAR in the class path, running the jobclient JAR will
[jira] [Created] (MAPREDUCE-6860) User intermediate-done-dir permissions should use history file permissions configuration
Jonathan Hung created MAPREDUCE-6860:

    Summary: User intermediate-done-dir permissions should use history file permissions configuration
    Key: MAPREDUCE-6860
    URL: https://issues.apache.org/jira/browse/MAPREDUCE-6860
    Project: Hadoop Map/Reduce
    Issue Type: Bug
    Reporter: Jonathan Hung

Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir directory here:
{noformat}
doneDirPrefixPath =
    FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
    JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
{noformat}
The directory permission is hardcoded to 770, but the permissions of the summary, history, and conf files under this user dir are configurable via {{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So even if the configured file permissions grant read/write/execute access to "other" users, those users still cannot access the files because of the 770 permission on the user dir.

I see two options here:
# Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the permissions for the user dir
# Create a new config for the user dir permissions, using 770 as the default
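The interaction between the directory mode and the file mode can be illustrated with a small sketch. It assumes POSIX-style semantics, under which a user needs the execute bit on a directory to reach anything inside it; it does not call any Hadoop API, and `DirPermissionCheck` and its methods are hypothetical names used only for illustration.

```java
// Hypothetical illustration of POSIX-style permission checks for "other" users.
final class DirPermissionCheck {

    /** True if the mode grants "other" the execute (traverse) bit, e.g. 0775. */
    static boolean otherCanTraverse(int dirMode) {
        return (dirMode & 0001) != 0;
    }

    /** True if the mode grants "other" the read bit, e.g. 0644. */
    static boolean otherCanRead(int fileMode) {
        return (fileMode & 0004) != 0;
    }

    /**
     * "Other" can read a file only when the parent directory can be traversed
     * AND the file itself is readable; the file mode alone is not enough.
     */
    static boolean otherCanReadFileIn(int dirMode, int fileMode) {
        return otherCanTraverse(dirMode) && otherCanRead(fileMode);
    }

    public static void main(String[] args) {
        // User dir hardcoded to 770, files configured as 777: still no access.
        System.out.println(otherCanReadFileIn(0770, 0777)); // false
        // User dir opened to 775: the configured file permission now matters.
        System.out.println(otherCanReadFileIn(0775, 0777)); // true
    }
}
```

The first call mirrors the situation in the report: however permissive the file setting is, the 770 directory mode decides the outcome for "other" users.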
[jira] [Updated] (MAPREDUCE-6860) User intermediate-done-dir permissions should use history file permissions configuration
[ https://issues.apache.org/jira/browse/MAPREDUCE-6860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hung updated MAPREDUCE-6860:

Description:
Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir directory here:
{noformat}
doneDirPrefixPath =
    FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
    JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
{noformat}
The directory permission is hardcoded to 770, but the permissions of the summary, history, and conf files under this user dir are configurable via {{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So even if the configured file permissions grant read/write/execute access to "other" users, those users still cannot access the files because of the 770 permission on the user dir.

I see two options here:
# Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the permissions for the user dir
# Create a new config for the user dir permissions, using 770 as the default

The latter makes more sense to me.

was:
The same description, without the closing sentence "The latter makes more sense to me."

> User intermediate-done-dir permissions should use history file permissions configuration
> ---
>
> Key: MAPREDUCE-6860
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6860
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Jonathan Hung
>
> Currently {{JobHistoryEventHandler}} creates the user intermediate-done-dir directory here:
> {noformat}
> doneDirPrefixPath =
>     FileContext.getFileContext(conf).makeQualified(new Path(userDoneDirStr));
> mkdir(doneDirFS, doneDirPrefixPath, new FsPermission(
>     JobHistoryUtils.HISTORY_INTERMEDIATE_USER_DIR_PERMISSIONS));
> {noformat}
> The directory permission is hardcoded to 770, but the permissions of the summary, history, and conf files under this user dir are configurable via {{mapreduce.jobhistory.intermediate-done-dir.file.permission}}. So even if the configured file permissions grant read/write/execute access to "other" users, those users still cannot access the files because of the 770 permission on the user dir.
> I see two options here:
> # Reuse {{mapreduce.jobhistory.intermediate-done-dir.file.permission}} as the permissions for the user dir
> # Create a new config for the user dir permissions, using 770 as the default
> The latter makes more sense to me.
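The second option, a separate setting that falls back to 770, might look roughly like the sketch below. This is an assumption-laden illustration rather than the eventual patch: it uses `java.util.Properties` in place of Hadoop's configuration classes, and the key name `example.jobhistory.user-dir.permissions` as well as the class `UserDirPermissionConfig` are hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch of option 2; key and class names are illustrative only.
final class UserDirPermissionConfig {

    /** Hypothetical configuration key; not a real Hadoop property. */
    static final String USER_DIR_PERMISSION_KEY =
        "example.jobhistory.user-dir.permissions";

    /** Default keeps today's behavior: rwx for owner and group, nothing for other. */
    static final String DEFAULT_USER_DIR_PERMISSION = "770";

    /** Reads the octal permission string from the config, defaulting to 770. */
    static int userDirMode(Properties conf) {
        String value = conf.getProperty(
            USER_DIR_PERMISSION_KEY, DEFAULT_USER_DIR_PERMISSION).trim();
        int mode = Integer.parseInt(value, 8); // interpret digits as octal
        if (mode < 0 || mode > 0777) {
            throw new IllegalArgumentException("Invalid permission: " + value);
        }
        return mode;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(Integer.toOctalString(userDirMode(conf))); // default
        conf.setProperty(USER_DIR_PERMISSION_KEY, "775");
        System.out.println(Integer.toOctalString(userDirMode(conf))); // override
    }
}
```

Keeping 770 as the default means existing deployments see no change unless an administrator opts in, which is one plausible reason to prefer this option over reusing the file permission setting.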