[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-17 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6657:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

I have commit the patch to trunk and branch-2. Thanks [~haibochen] for 
contributing the patch and [~templedf] and [~rkanter] for review and comments!

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Fix For: 2.9.0
>
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch, mapreduce6657.007.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-16 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Patch Available  (was: Open)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch, mapreduce6657.007.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-16 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Open  (was: Patch Available)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch, mapreduce6657.007.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-16 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.007.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch, mapreduce6657.007.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Patch Available  (was: Open)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Open  (was: Patch Available)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-13 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.006.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-11 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: (was: mapreduce6657.006.patch)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-11 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.006.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch, 
> mapreduce6657.006.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Open  (was: Patch Available)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Patch Available  (was: Open)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-26 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.005.patch

updated method name per Ray and Daniel's comment.

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-20 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.004.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-20 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: (was: mapreduce6677.003.patch)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-20 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.003.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-04-19 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6677.003.patch

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6677.003.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-03-25 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.002.patch

Updated per checkstyle issue

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-03-24 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Status: Patch Available  (was: Open)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-03-24 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated MAPREDUCE-6657:
--
Attachment: mapreduce6657.001.patch

Added a unit test for SafeModeException in Job History File Manager, and 
manually test the case when Name Node is in start phase because it happens when 
multiple threads run in certain order in NameNode (namely, name node 
initialization thread, and RPC server thread)

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keeps retrying if the name node is in start 
> state. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)