[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287686#comment-15287686 ]

Hudson commented on MAPREDUCE-6657:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #9807 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9807/])
MAPREDUCE-6657. Job history server can fail on startup when NameNode is (junping_du: rev f6ef876fe158a5334cad7075f1966573a1c4dec9)
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/pom.xml
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/TestHistoryFileManagerInitWithNonRunningDFS.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/main/java/org/apache/hadoop/mapreduce/v2/hs/HistoryFileManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Reporter: Haibo Chen
> Assignee: Haibo Chen
> Fix For: 2.9.0
>
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch,
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch,
> mapreduce6657.006.patch, mapreduce6657.007.patch
>
>
> The job history server tries to create a history directory in HDFS on startup.
> When the NameNode is in safe mode, it keeps retrying for a configurable time
> period. However, it should also keep retrying if the NameNode is still in its
> startup phase. Safe mode does not begin until the NN is out of the startup phase.
> A RetriableException with the text "NameNode still not started" is thrown
> when the NN is in its internal service startup phase. We should add a check
> for this specific exception alongside isBecauseSafeMode() to account for that.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
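The retry behaviour described in the issue can be sketched roughly as follows. This is a minimal illustration, not the committed patch: `RetriableException` here is a local stand-in for the Hadoop class of the same name, and `DirCreator`, `StartupRetrySketch`, `tryCreateWithRetry` and the body of `isBecauseSafeMode()` are assumptions made so the example compiles on its own. Only the exception text and the two check names come from this thread.

```java
import java.io.IOException;

// Stand-in for org.apache.hadoop.ipc.RetriableException (assumption: declared
// locally so the sketch compiles without Hadoop on the classpath).
class RetriableException extends IOException {
  RetriableException(String msg) {
    super(msg);
  }
}

// Hypothetical callback representing "create the history directory in HDFS".
interface DirCreator {
  void create() throws IOException;
}

class StartupRetrySketch {
  // Exception text quoted in the issue description.
  static final String NN_NOT_STARTED = "NameNode still not started";

  // Assumed safe-mode test: looks for the exception class name in toString().
  static boolean isBecauseSafeMode(Throwable ex) {
    return ex.toString().contains("SafeModeException");
  }

  // The extra check the issue asks for: a retriable exception carrying the
  // "still not started" text.
  static boolean isNameNodeStillNotStarted(Throwable ex) {
    return ex instanceof RetriableException
        && ex.getMessage() != null
        && ex.getMessage().contains(NN_NOT_STARTED);
  }

  // Retries while the failure looks transient. Returns true on success and
  // false once maxWaitMs has elapsed; any other IOException is rethrown.
  static boolean tryCreateWithRetry(DirCreator creator, long maxWaitMs, long sleepMs)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + maxWaitMs;
    while (true) {
      try {
        creator.create();
        return true;
      } catch (IOException e) {
        if (!isBecauseSafeMode(e) && !isNameNodeStillNotStarted(e)) {
          throw e; // not a startup-related condition, so do not retry
        }
        if (System.currentTimeMillis() >= deadline) {
          return false; // configured wait exhausted
        }
        Thread.sleep(sleepMs);
      }
    }
  }
}
```

The point of the sketch is that both transient conditions share one retry path, so a NameNode that is still starting is treated the same way as one in safe mode.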
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285726#comment-15285726 ]

Junping Du commented on MAPREDUCE-6657:
---

The test failure is not related. 007 patch LGTM. +1. Will commit it shortly if no further comments from others.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285495#comment-15285495 ]

Haibo Chen commented on MAPREDUCE-6657:
---

Tests timed out, don't think it is related to this patch.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285376#comment-15285376 ]

Hadoop QA commented on MAPREDUCE-6657:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 16s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 56s | trunk passed |
| +1 | compile | 7m 3s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 7m 38s | trunk passed with JDK v1.7.0_101 |
| +1 | checkstyle | 1m 35s | trunk passed |
| +1 | mvnsite | 1m 21s | trunk passed |
| +1 | mvneclipse | 0m 29s | trunk passed |
| +1 | findbugs | 2m 42s | trunk passed |
| +1 | javadoc | 1m 23s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 2m 21s | trunk passed with JDK v1.7.0_101 |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 19s | the patch passed |
| +1 | compile | 6m 55s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 6m 55s | the patch passed |
| +1 | compile | 6m 40s | the patch passed with JDK v1.7.0_101 |
| +1 | javac | 6m 40s | the patch passed |
| +1 | checkstyle | 1m 25s | the patch passed |
| +1 | mvnsite | 1m 13s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | findbugs | 2m 52s | the patch passed |
| +1 | javadoc | 1m 17s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 2m 2s | the patch passed with JDK v1.7.0_101 |
| -1 | unit | 62m 23s | hadoop-hdfs in the patch failed with JDK v1.8.0_91. |
| +1 | unit | 6m 12s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. |
| -1 | unit | 57m 22s | hadoop-hdfs in the patch failed with JDK v1.7.0_101. |
| +1 | unit | 6m 24s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_101. |
| +1 | asflicense | 0m 28s | Patch does not generate ASF License
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284987#comment-15284987 ]

Haibo Chen commented on MAPREDUCE-6657:
---

Updated the patch with Junping's comments on adding a static method for this.nn.getRole() + " still not started".
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283178#comment-15283178 ]

Junping Du commented on MAPREDUCE-6657:
---

bq. I think the HDFS change is trivial and it would be silly for us to fix this with the 006 patch, but then have to do it again when the HDFS JIRA is done. May as well do it now.

Agree. The change on the HDFS side is relatively trivial and low-risk. There is no need for a separate JIRA.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283023#comment-15283023 ]

Hadoop QA commented on MAPREDUCE-6657:
--

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 23s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 32s | trunk passed |
| +1 | compile | 0m 25s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 0m 21s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 18s | trunk passed |
| +1 | mvnsite | 0m 27s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 0m 44s | trunk passed |
| +1 | javadoc | 0m 21s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 17s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 23s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 20s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 20s | the patch passed |
| +1 | checkstyle | 0m 16s | the patch passed |
| +1 | mvnsite | 0m 25s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 56s | the patch passed |
| +1 | javadoc | 0m 10s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 6m 5s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 6m 19s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
| | | 29m 9s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12803913/mapreduce6657.006.patch |
| JIRA Issue | MAPREDUCE-6657 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 29293b042b44 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1f2794b |
| Default Java | 1.7.0_95 |
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283020#comment-15283020 ]

Robert Kanter commented on MAPREDUCE-6657:
--

I chatted with [~haibochen]. I think the HDFS change is trivial and it would be silly for us to fix this with the 006 patch, but then have to do it again when the HDFS JIRA is done. May as well do it now.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283005#comment-15283005 ]

Daniel Templeton commented on MAPREDUCE-6657:
-

[~haibochen], thanks. As long as you file the follow-up JIRA, I'm fine with that. I've encountered the same issue with HDFS exception handling before and dealt with it the same way. Fixing HDFS is out of scope for this change.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15282996#comment-15282996 ]

Haibo Chen commented on MAPREDUCE-6657:
---

Thanks a lot for your remarks, [~djp]. For this JIRA, I will keep using string comparison, given that the issue you pointed out is more of an HDFS issue. I will file a follow-up HDFS JIRA to fix it. Uploaded a patch that is the same as mapreduce6657.005.patch but with the checkstyle fix.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281517#comment-15281517 ]

Junping Du commented on MAPREDUCE-6657:
---

bq. Do you think we should create a subclass of RetriableException for this instead?

It is up to you. IMO, it is not necessary to do so just for one special case; otherwise we could end up with too many sub-exceptions.

bq. The message is derived from an instance method this.nn.getRole(), and doing string matching is probably not the cleanest way.

You can make a static method that builds {{this.nn.getRole() + " still not started"}} from the daemon's name ("NameNode" here) and is accessible from both HDFS and MAPREDUCE (JHS). In JHS, just pass "NameNode" (or move NamenodeRole from HdfsServerConstants to HdfsConstants and share it with JHS) to get the same string as HDFS. That would be much cleaner.
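A rough sketch of the shared static method suggested above, under stated assumptions: the class and method names (`StartupMessages`, `composeNotStartedMessage`, `RpcServerSketch`, `HistoryClientSketch`) are hypothetical and are not taken from the committed patch, and a plain `IOException` stands in for `RetriableException` to keep the example self-contained.

```java
import java.io.IOException;

// Hypothetical holder for the message format, placed somewhere visible to
// both the server (HDFS) and the client (JHS).
final class StartupMessages {
  private StartupMessages() {
  }

  static final String STILL_NOT_STARTED_SUFFIX = " still not started";

  // Builds the message from the daemon name so that the code that throws the
  // exception and the code that inspects it cannot drift apart.
  static String composeNotStartedMessage(String daemonName) {
    return daemonName + STILL_NOT_STARTED_SUFFIX;
  }
}

// Server side, loosely modelled on checkNNStartup() as quoted in this thread.
class RpcServerSketch {
  static void checkStartup(boolean started, String role) throws IOException {
    if (!started) {
      // The real code throws RetriableException; IOException is a stand-in.
      throw new IOException(StartupMessages.composeNotStartedMessage(role));
    }
  }
}

// Client side: compares against the same composed string instead of a literal.
class HistoryClientSketch {
  static boolean isNameNodeStillNotStarted(Exception ex) {
    String msg = ex.getMessage();
    return msg != null
        && msg.contains(StartupMessages.composeNotStartedMessage("NameNode"));
  }
}
```

With this arrangement a later change to the message text only has to be made in one place, which is the concern raised in the comment.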
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280928#comment-15280928 ]

Hadoop QA commented on MAPREDUCE-6657:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 50s | trunk passed |
| +1 | compile | 0m 15s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 0m 22s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 33s | trunk passed |
| +1 | javadoc | 0m 12s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 16s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 18s | the patch passed |
| +1 | compile | 0m 13s | the patch passed with JDK v1.8.0_91 |
| +1 | javac | 0m 13s | the patch passed |
| +1 | compile | 0m 15s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 15s | the patch passed |
| -1 | checkstyle | 0m 13s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: patch generated 2 new + 16 unchanged - 0 fixed = 18 total (was 16) |
| +1 | mvnsite | 0m 20s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 41s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 6m 3s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_91. |
| +1 | unit | 6m 27s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 25m 46s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800866/mapreduce6657.005.patch |
| JIRA Issue | MAPREDUCE-6657 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 95c33ef8963a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/h
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280712#comment-15280712 ]

Haibo Chen commented on MAPREDUCE-6657:
---

Sorry for misunderstanding your previous comments. Do you think we should create a subclass of RetriableException for this instead, [~djp]? The message is derived from an instance method, this.nn.getRole(), and doing string matching is probably not the cleanest way. If so, I can file a follow-up JIRA in HDFS and update isNameNodeStillNotStarted() once we have the new 'NameNodeNotStartedException' that extends RetriableException.
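For comparison, the subclass alternative raised in this comment might look like the sketch below. It was only a proposal in this thread, so treat every declaration as hypothetical: `RetriableException` is a local stand-in for the Hadoop class, `NameNodeNotStartedException` is the name floated in the comment, and `TypeBasedCheck` is invented for illustration.

```java
import java.io.IOException;

// Stand-in for org.apache.hadoop.ipc.RetriableException (assumption: declared
// locally so the sketch compiles on its own).
class RetriableException extends IOException {
  RetriableException(String msg) {
    super(msg);
  }
}

// The dedicated exception type proposed in the comment; hypothetical here.
class NameNodeNotStartedException extends RetriableException {
  NameNodeNotStartedException(String role) {
    super(role + " still not started");
  }
}

class TypeBasedCheck {
  // With a dedicated type the client tests the class, not the message text.
  static boolean isNameNodeStillNotStarted(Throwable ex) {
    return ex instanceof NameNodeNotStartedException;
  }
}
```

The check no longer depends on the wording of the message, at the cost of one more exception class. In a real deployment the exception would cross an RPC boundary, so the client would probably have to unwrap it before an `instanceof` test can work; that step is not shown here.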
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280564#comment-15280564 ] Junping Du commented on MAPREDUCE-6657: ---
Thanks for updating the patch, [~haibochen]. What my comment above was trying to say is that we should define the static string where the exception is thrown. In this case, we should also change NameNodeRpcServer.java:
{noformat}
private void checkNNStartup() throws IOException {
  if (!this.nn.isStarted()) {
    throw new RetriableException(this.nn.getRole() + " still not started");
  }
}
{noformat}
If we define a static string in HDFS and use it on both sides (NameNodeRpcServer and HistoryFileManager), we can be sure we won't hit this issue again if the exception string is updated in the future.
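A minimal sketch of the shared-constant suggestion above, assuming the constant lives in some class visible to both modules. Every name here except `checkNNStartup()` and `isNameNodeStillNotStarted()` is hypothetical: `SketchRetriableException` stands in for Hadoop's `RetriableException`, and `NameNodeMessages`, `RpcServerSketch` and `HistoryCheckSketch` are invented for the example; the committed patch may place the constant elsewhere.

```java
import java.io.IOException;

// Stand-in for org.apache.hadoop.ipc.RetriableException (assumption, for self-containment).
class SketchRetriableException extends IOException {
  SketchRetriableException(String msg) {
    super(msg);
  }
}

// Hypothetical single home for the message text shared by thrower and checker.
final class NameNodeMessages {
  static final String NOT_STARTED_SUFFIX = " still not started";

  private NameNodeMessages() {
  }
}

// Thrower side, modelled on the checkNNStartup() snippet quoted in the comment.
final class RpcServerSketch {
  private RpcServerSketch() {
  }

  static void checkNNStartup(boolean started, String role) throws IOException {
    if (!started) {
      throw new SketchRetriableException(role + NameNodeMessages.NOT_STARTED_SUFFIX);
    }
  }
}

// Checker side: refers to the same constant instead of repeating the literal.
final class HistoryCheckSketch {
  private HistoryCheckSketch() {
  }

  static boolean isNameNodeStillNotStarted(Exception ex) {
    String msg = ex.getMessage();
    return ex instanceof SketchRetriableException
        && msg != null
        && msg.contains(NameNodeMessages.NOT_STARTED_SUFFIX);
  }
}
```

Because both sides read `NOT_STARTED_SUFFIX`, rewording the message is a one-line change that cannot leave the checker matching a stale string.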
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280516#comment-15280516 ] Haibo Chen commented on MAPREDUCE-6657: ---
Thanks very much for your review, [~djp]. I have updated the patch according to your comments.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280426#comment-15280426 ] Junping Du commented on MAPREDUCE-6657: ---
Thanks [~haibochen] for the patch. Hard-coding the check against the message string is very fragile:
{noformat}
+return ex.toString().contains("SafeModeException") ||
+(ex instanceof RetriableException && ex.getMessage().contains(
+"NameNode still not started"));
{noformat}
If HDFS changes the exception message in the future to something else, e.g. "Namenode not start yet.", the issue will come up again. Instead, we should define the message as a static string. Everything else looks fine.
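For context, the retry behaviour this issue is about (keep retrying for a configurable period while the failure is one of the "NameNode not ready" kinds, fail fast otherwise) can be sketched generically as below. This is not the HistoryFileManager code: `RetrySketch`, `Attempt` and `retryWhile` are hypothetical names, and returning `false` on timeout is an assumption made to keep the example small.

```java
import java.io.IOException;
import java.util.function.Predicate;

// Hypothetical helper illustrating "retry while the failure is retriable, up to a deadline".
final class RetrySketch {
  private RetrySketch() {
  }

  // One attempt of the operation, e.g. creating the history directories.
  interface Attempt {
    void run() throws IOException;
  }

  /**
   * Runs the attempt until it succeeds or maxWaitMs has elapsed.
   * Returns true on success and false on timeout; failures that the
   * predicate does not classify as retriable are rethrown immediately.
   */
  static boolean retryWhile(Attempt attempt, Predicate<IOException> retriable,
      long maxWaitMs, long sleepMs) throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + maxWaitMs;
    while (true) {
      try {
        attempt.run();
        return true;
      } catch (IOException ex) {
        if (!retriable.test(ex)) {
          throw ex; // not a "NameNode not ready" failure: do not mask it
        }
        if (System.currentTimeMillis() >= deadline) {
          return false; // still not ready after the configured wait
        }
        Thread.sleep(sleepMs);
      }
    }
  }
}
```

The predicate is where a check such as the one quoted above would plug in, which is why a silently stale message string matters: the loop would stop retrying and the server would fail on startup again.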
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280098#comment-15280098 ] Daniel Templeton commented on MAPREDUCE-6657: -
OK. Latest patch looks good to me. [~rkanter]?
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258990#comment-15258990 ] Hadoop QA commented on MAPREDUCE-6657: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} trunk passed with JDK v1.8.0_92 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.8.0_92 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.8.0_92 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.8.0_92 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 14s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_92. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 8s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 32m 0s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12800866/mapreduce6657.005.patch | | JIRA Issue | MAPREDUCE-6657 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux fb4684770a0c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6be22dd | | Default Java | 1.7.0_95 | |
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258981#comment-15258981 ] Haibo Chen commented on MAPREDUCE-6657: ---
isNameNodeUnavailable() might be a little too broad, as it also covers cases where the name node starts up and then becomes unavailable; in those cases the current code has JobHistoryServer simply throw an Error during initialization. I guess the question then becomes whether we want JHS to keep retrying even when the name node has already started but is unavailable.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15258894#comment-15258894 ] Daniel Templeton commented on MAPREDUCE-6657: -
Can we make the method name something more like {{isNameNodeUnavailable()}} or {{isNameNodeNotReady()}} so that it's more inclusive of both safe mode *and* slow starts? Not a deal breaker...
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256711#comment-15256711 ] Ray Chiang commented on MAPREDUCE-6657: ---
Maybe rename it to "checkNameNodeNotStartedYet" or just "checkNameNodeNotStarted" ?
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256699#comment-15256699 ] Daniel Templeton commented on MAPREDUCE-6657: -
Looks good, [~haibochen]. One thing I just noticed, though, is that you need to update the comment on {{HistoryFileManager.isBecauseSafeMode()}} to reflect the new behavior. You might consider renaming the method as well, since the name is also no longer exactly accurate.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251175#comment-15251175 ] Hadoop QA commented on MAPREDUCE-6657: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 49s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_77. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 9s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 59s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799883/mapreduce6657.004.patch | | JIRA Issue | MAPREDUCE-6657 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux e11038ba9f14 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1e48eef | | Default Java | 1.7.0_95 |
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251038#comment-15251038 ] Haibo Chen commented on MAPREDUCE-6657: ---
Thanks [~templedf] and @Ray Chiang for your reviews. I have updated my patch accordingly.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250774#comment-15250774 ] Ray Chiang commented on MAPREDUCE-6657: ---
Reviewed the latest patch. Looks good. +1 (nonbinding).
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250546#comment-15250546 ] Hadoop QA commented on MAPREDUCE-6657: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.8.0_77 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 57s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_77. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 56s {color} | {color:green} hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 21s {color} | {color:red} Patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 30m 43s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:fbe3e86 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12799789/mapreduce6657.003.patch | | JIRA Issue | MAPREDUCE-6657 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux ce566e720948 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / ad36fa6 | | Default Java | 1.7.0_95 | | Multi-JDK vers
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250484#comment-15250484 ] Daniel Templeton commented on MAPREDUCE-6657: -
Thanks, [~haibochen]. Some comments:

{code}
 * HDFS is not running normally (either in start phrase or
{code}
should be
{code}
 * HDFS is not running normally (either in start phase or
{code}

{code}
private static final String CLUSTER_BASE_DIR = MiniDFSCluster.getBaseDirectory();
...
conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR,
    CLUSTER_BASE_DIR.substring(0, CLUSTER_BASE_DIR.length() - 1) + "_safemode");
{code}
Why not just set the base dir to what you want initially?

{code}
final long maxJHSWaitTime = 500;
{code}
Tiny quibble: the name should probably be {{maxJhsWaitTime}}. We have conflicting styles in the code, but IIRC the style guide says to only capitalize the first letter of acronyms in names. (I could be wrong, so feel free to call my bluff.)

{code}
dfsCluster.getFileSystem().setSafeMode(
    HdfsConstants.SafeModeAction.SAFEMODE_ENTER);
Assert.assertTrue(dfsCluster.getFileSystem().isInSafeMode());
{code}
To be completely safe these lines should be inside the try.

{code}
Assert.assertEquals("Job History Server is expected to be " + expectedExceptionMsg,
    expectedExceptionMsg, yex.getMessage());
{code}
should probably be more like
{code}
Assert.assertEquals("Unexpected reconnect timeout exception message",
    expectedExceptionMsg, yex.getMessage());
{code}
The assert will include the expected value in the output.
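The remark about moving the safe-mode setup inside the try can be illustrated with a minimal sketch. It assumes the test's try block has a finally clause that cleans up the cluster; the patch itself is not shown in this thread, so that is an assumption. The {{Cluster}} interface and {{runInSafeMode}} helper are hypothetical stand-ins, not Hadoop APIs.

```java
final class SafeModeGuard {
  /** Hypothetical stand-in for the mini cluster used by the test. */
  interface Cluster {
    void enterSafeMode();
    boolean isInSafeMode();
    void shutdown();
  }

  /**
   * Runs the body with the cluster in safe mode. The setup calls sit inside
   * the try, so the finally clause cleans up even when setup itself throws.
   */
  static void runInSafeMode(Cluster cluster, Runnable body) {
    try {
      cluster.enterSafeMode();
      if (!cluster.isInSafeMode()) {
        throw new AssertionError("cluster did not enter safe mode");
      }
      body.run();
    } finally {
      // Always reached, whether setup, the check, or the body failed.
      cluster.shutdown();
    }
  }
}
```

If the setup lines stayed above the try, a failure in either of them would skip the cleanup and could leave the cluster running for later tests.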
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250435#comment-15250435 ] Daniel Templeton commented on MAPREDUCE-6657: - I think you uploaded the wrong patch. :) > job history server can fail on startup when NameNode is in start phase > -- > > Key: MAPREDUCE-6657 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Reporter: Haibo Chen >Assignee: Haibo Chen > Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, > mapreduce6677.003.patch > > > Job history server will try to create a history directory in HDFS on startup. > When NameNode is in safe mode, it will keep retrying for a configurable time > period. However, it should also keeps retrying if the name node is in start > state. Safe mode does not happen until the NN is out of the startup phase. > A RetriableException with the text "NameNode still not started" is thrown > when the NN is in its internal service startup phase. We should add the check > for this specific exception in isBecauseSafeMode() to account for that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250356#comment-15250356 ]

Haibo Chen commented on MAPREDUCE-6657:

Unit test failures are unrelated to this patch.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249127#comment-15249127 ]

Hadoop QA commented on MAPREDUCE-6657:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 31s | trunk passed |
| +1 | compile | 2m 9s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 2m 14s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 29s | trunk passed |
| +1 | mvnsite | 1m 6s | trunk passed |
| +1 | mvneclipse | 0m 32s | trunk passed |
| +1 | findbugs | 1m 26s | trunk passed |
| +1 | javadoc | 0m 35s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 0m 41s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 55s | the patch passed |
| +1 | compile | 1m 40s | the patch passed with JDK v1.8.0_77 |
| +1 | javac | 2m 49s | hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.8.0_77 with JDK v1.8.0_77 generated 0 new + 356 unchanged - 6 fixed = 356 total (was 362) |
| +1 | javac | 1m 40s | hadoop-mapreduce-client in the patch passed with JDK v1.8.0_77. |
| +1 | compile | 1m 53s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 4m 42s | hadoop-mapreduce-project_hadoop-mapreduce-client-jdk1.7.0_95 with JDK v1.7.0_95 generated 0 new + 361 unchanged - 6 fixed = 361 total (was 367) |
| +1 | javac | 1m 53s | hadoop-mapreduce-client in the patch passed with JDK v1.7.0_95. |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 0m 59s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 42s | the patch passed |
| +1 | javadoc | 0m 28s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 0m 31s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 9m 42s | hadoop-mapreduce-client-app in the patch passed with JDK v1.8.0_77. |
| -1 | unit | 108m 25s | hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_77.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248625#comment-15248625 ]

Haibo Chen commented on MAPREDUCE-6657:

Updated the test method according to [~templedf]'s comments, and moved it to a new test class because it cannot share clusters with the other test methods in TestHistoryFileManager.
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247698#comment-15247698 ]

Daniel Templeton commented on MAPREDUCE-6657:

The message says that the server should have timed out, but the assert is testing whether the exception message is correct when it does time out. If it doesn't time out, it looks to me like the test will pass. You should probably also have an {{Assert.fail()}} after the {{serviceInit()}} call.
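The point about {{Assert.fail()}} can be shown with a small self-contained sketch. It uses plain Java rather than JUnit: {{throw new AssertionError(...)}} plays the role of {{Assert.fail()}}, {{IllegalStateException}} stands in for the exception the real test catches, and the {{Service}} interface is a hypothetical placeholder for the class whose {{serviceInit()}} is called.

```java
final class ExpectedTimeoutCheck {
  /** Hypothetical stand-in for the service whose init is expected to time out. */
  interface Service {
    void init();
  }

  /**
   * Passes only if init() throws IllegalStateException (a stand-in for the
   * real exception type) carrying exactly the expected message.
   */
  static void assertInitTimesOut(Service service, String expectedMsg) {
    try {
      service.init();
      // Reached only when init() returned normally. Without this line the
      // check would silently pass. AssertionError is an Error, so the catch
      // clause below does not swallow it.
      throw new AssertionError("init() returned without timing out");
    } catch (IllegalStateException e) {
      if (!expectedMsg.equals(e.getMessage())) {
        throw new AssertionError(
            "Unexpected timeout exception message: " + e.getMessage());
      }
    }
  }

  public static void main(String[] args) {
    assertInitTimesOut(
        () -> { throw new IllegalStateException("timed out"); }, "timed out");
    System.out.println("timeout case verified");
  }
}
```

The message check alone only covers the path where the exception is thrown; the explicit failure after the call covers the path where it is not.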
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246860#comment-15246860 ]

Haibo Chen commented on MAPREDUCE-6657:

Thanks a lot for your comments, [~templedf].

I have added a brief javadoc and set the timeout to 500. Let me know if 500 looks reasonable to you. Also, the test method now uses the existing dfs cluster instead of a new local one. The only method in TestHistoryFileManager that uses both dfs clusters is testCreateDirsWithAdditionalFileSystem(), so maybe it makes more sense to move that method out?

When the name node is in safe mode, the JHS throws a YarnRuntimeException with a timeout message. I think the assert message is actually in line with the expected behavior.
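The behavior described here, retrying for a configurable period and then failing with a timeout message, can be sketched as a generic retry loop. This is an illustration only, not the HistoryFileManager code: the class name, the method name, the retry interval parameter, and the message text are all hypothetical, and {{IllegalStateException}} stands in for YarnRuntimeException.

```java
final class RetryUntilDeadline {
  /**
   * Calls attempt until it returns true or maxWaitMs has elapsed.
   * Throws IllegalStateException (stand-in for the real exception type)
   * with a timeout message once the deadline passes.
   */
  static void waitUntilReady(java.util.function.BooleanSupplier attempt,
      long maxWaitMs, long retryIntervalMs) {
    final long deadline = System.nanoTime() + maxWaitMs * 1_000_000L;
    while (!attempt.getAsBoolean()) {
      // Subtraction keeps the comparison correct if nanoTime wraps around.
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException(
            "Timed out after " + maxWaitMs + " ms");
      }
      try {
        Thread.sleep(retryIntervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
  }
}
```

A test against such a loop can keep its wait short, which is the motivation for the small timeout value discussed in this thread.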
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246282#comment-15246282 ]

Daniel Templeton commented on MAPREDUCE-6657:

Thanks for the patch, [~haibochen].

I hate that HDFS expects you to parse the text of their exceptions to figure out what's going on. Wanna look into whether the API would allow you to throw a properly typed exception? Maybe just file a followup JIRA?

In your test code, it would be nice to add a javadoc header that explains what you're testing.

I don't love that you're running two mini-clusters and ignoring one of them. Is there any way to do the test with the existing mini-cluster without disrupting the other tests? If not, I'd consider creating a new test class so that you don't have two mini-clusters running.

Is 2000ms the shortest reasonable duration for the timeout? Seems long to me...

{code}
Assert.assertEquals("Job History Server is expected to time out.",
{code}
Your assert message is misleading. It should instead say that it didn't get the expected error message.
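The text-based check that the issue description proposes, and that this comment objects to, might look roughly like the following. This is a sketch under stated assumptions, not the actual isBecauseSafeMode() implementation: the class and method names are hypothetical, and only the message text "NameNode still not started" comes from the issue description.

```java
final class StartupErrorCheck {
  /** Message text quoted in the issue description. */
  static final String NOT_STARTED_TEXT = "NameNode still not started";

  /**
   * Returns true if the throwable or any of its causes carries the
   * "still not started" text. Matching on message text is brittle: it
   * breaks if the wording changes, which is why a typed exception would
   * be preferable.
   */
  static boolean isStillStarting(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      String msg = cur.getMessage();
      if (msg != null && msg.contains(NOT_STARTED_TEXT)) {
        return true;
      }
    }
    return false;
  }
}
```

With a properly typed exception the loop would reduce to an {{instanceof}} test on each cause, with no dependence on the wording of the message.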
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212135#comment-15212135 ]

Hadoop QA commented on MAPREDUCE-6657:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 41s | trunk passed |
| +1 | compile | 0m 15s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 18s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 0m 23s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 32s | trunk passed |
| +1 | javadoc | 0m 13s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 15s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 20s | the patch passed |
| +1 | compile | 0m 13s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 13s | the patch passed |
| +1 | compile | 0m 16s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 16s | the patch passed |
| +1 | checkstyle | 0m 12s | the patch passed |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 11s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 42s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 13s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 5m 49s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 6m 7s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 25m 6s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795430/mapreduce6657.002.patch |
| JIRA Issue | MAPREDUCE-6657 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 636e1ce9949f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2c268cc |
| Default Java | 1.7.0_95 |
[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase
[ https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15211473#comment-15211473 ]

Hadoop QA commented on MAPREDUCE-6657:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 7m 36s | trunk passed |
| +1 | compile | 0m 21s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 19s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 13s | trunk passed |
| +1 | mvnsite | 0m 25s | trunk passed |
| +1 | mvneclipse | 0m 13s | trunk passed |
| +1 | findbugs | 0m 35s | trunk passed |
| +1 | javadoc | 0m 17s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 17s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 20s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 18s | the patch passed |
| +1 | compile | 0m 17s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 17s | the patch passed |
| -1 | checkstyle | 0m 11s | hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs: patch generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) |
| +1 | mvnsite | 0m 23s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 0m 47s | the patch passed |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 6m 28s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.8.0_74. |
| +1 | unit | 6m 19s | hadoop-mapreduce-client-hs in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 19s | Patch does not generate ASF License warnings. |
| | | 27m 32s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795338/mapreduce6657.001.patch |
| JIRA Issue | MAPREDUCE-6657 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux a0caa41ec4e4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/