[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=315882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315882
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 20/Sep/19 18:52
Start Date: 20/Sep/19 18:52
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 315882)
Time Spent: 7h  (was: 6h 50m)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node state with two 
> different sets of values. The first is effectively the liveness of the 
> node, with the following states. This is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the overall total number of states for a node is the cross-product 
> of the two above lists. However, it probably makes sense to keep the two 
> states separate internally.
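
The two-dimensional model in the description can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name `NodeStatusSketch` and the nested enum names `Health` and `OpState` are hypothetical and are not taken from the patch; only the enum constants come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical value class pairing the two independent dimensions. */
final class NodeStatusSketch {
  /** Liveness of a node (first list in the description). */
  enum Health { HEALTHY, STALE, DEAD }

  /** Operational state of a node (second list in the description). */
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  private final OpState opState;
  private final Health health;

  NodeStatusSketch(OpState opState, Health health) {
    this.opState = Objects.requireNonNull(opState);
    this.health = Objects.requireNonNull(health);
  }

  OpState getOpState() { return opState; }

  Health getHealth() { return health; }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof NodeStatusSketch)) {
      return false;
    }
    NodeStatusSketch other = (NodeStatusSketch) o;
    return opState == other.opState && health == other.health;
  }

  @Override
  public int hashCode() { return Objects.hash(opState, health); }

  @Override
  public String toString() { return opState + "/" + health; }

  /** Enumerates the cross-product of the two lists. */
  static List<NodeStatusSketch> allCombinations() {
    List<NodeStatusSketch> all = new ArrayList<>();
    for (OpState op : OpState.values()) {
      for (Health h : Health.values()) {
        all.add(new NodeStatusSketch(op, h));
      }
    }
    return all;
  }

  public static void main(String[] args) {
    // 5 operational states x 3 health states = 15 combined states.
    System.out.println(allCombinations().size());
  }
}
```

Keeping the two enums separate means a node that is DECOMMISSIONING can move between HEALTHY, STALE and DEAD without a dedicated constant for each of the 15 combinations.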



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=315881&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315881
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 20/Sep/19 18:52
Start Date: 20/Sep/19 18:52
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-533670612
 
 
   @nandakumar131 @elek @swagle  Thank you for all the comments and discussion. 
@sodonnel  Thank you for the contribution. I have committed this patch to the 
HDDS-1880-Decom branch.
   
 



Issue Time Tracking
---

Worklog Id: (was: 315881)
Time Spent: 6h 50m  (was: 6h 40m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=315150&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315150
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 19/Sep/19 16:18
Start Date: 19/Sep/19 16:18
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-533205315
 
 
   > @anuengineer @nandakumar131 please let us know if you have any further 
comments.
   > 
   > I am planning to commit it tomorrow if no more objections.
   
   Let us commit this into a branch, not into Trunk. Thanks
   
 



Issue Time Tracking
---

Worklog Id: (was: 315150)
Time Spent: 6h 40m  (was: 6.5h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=315097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-315097
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 19/Sep/19 14:39
Start Date: 19/Sep/19 14:39
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-533162016
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 35 | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 15 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for branch |
   | -1 | mvninstall | 29 | hadoop-ozone in trunk failed. |
   | -1 | compile | 19 | hadoop-ozone in trunk failed. |
   | +1 | checkstyle | 50 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 941 | branch has no errors when building and testing our client artifacts. |
   | -1 | javadoc | 59 | hadoop-ozone in trunk failed. |
   | 0 | spotbugs | 233 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | -1 | findbugs | 30 | hadoop-ozone in trunk failed. |
   |||| _ Patch Compile Tests _ |
   | 0 | mvndep | 33 | Maven dependency ordering for patch |
   | -1 | mvninstall | 50 | hadoop-ozone in the patch failed. |
   | -1 | compile | 28 | hadoop-ozone in the patch failed. |
   | -1 | cc | 28 | hadoop-ozone in the patch failed. |
   | -1 | javac | 28 | hadoop-ozone in the patch failed. |
   | +1 | checkstyle | 65 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 859 | patch has no errors when building and testing our client artifacts. |
   | -1 | javadoc | 89 | hadoop-hdds generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) |
   | -1 | javadoc | 58 | hadoop-ozone in the patch failed. |
   | -1 | findbugs | 26 | hadoop-ozone in the patch failed. |
   |||| _ Other Tests _ |
   | +1 | unit | 255 | hadoop-hdds in the patch passed. |
   | -1 | unit | 33 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 35 | The patch does not generate ASF License warnings. |
   | | | 3762 | |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1344 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc mvninstall shadedclient findbugs checkstyle |
   | uname | Linux ff3109c18cd4 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / d4205dc |
   | Default Java | 1.8.0_222 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/branch-mvninstall-hadoop-ozone.txt |
   | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/branch-compile-hadoop-ozone.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/branch-javadoc-hadoop-ozone.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/branch-findbugs-hadoop-ozone.txt |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-mvninstall-hadoop-ozone.txt |
   | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-compile-hadoop-ozone.txt |
   | cc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-compile-hadoop-ozone.txt |
   | javac | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-compile-hadoop-ozone.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/diff-javadoc-javadoc-hadoop-hdds.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-javadoc-hadoop-ozone.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-findbugs-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/7/te

[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=314995&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-314995
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 19/Sep/19 12:06
Start Date: 19/Sep/19 12:06
Worklog Time Spent: 10m 
  Work Description: elek commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-533099643
 
 
   @sodonnel Can you please rebase and push? Some of the integration tests are 
fixed on trunk, so we can double-check the test results with a new, updated push.
 



Issue Time Tracking
---

Worklog Id: (was: 314995)
Time Spent: 6h 20m  (was: 6h 10m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=314988&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-314988
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 19/Sep/19 12:05
Start Date: 19/Sep/19 12:05
Worklog Time Spent: 10m 
  Work Description: elek commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-533099312
 
 
   LGTM
   
   If I understood correctly, everybody agreed with this approach, and AFAIK all of 
the comments are addressed.
   
   @anuengineer @nandakumar131 please let us know if you have any further 
comments.
   
   I am planning to commit it tomorrow if there are no more objections.
   
   I think we can commit it to trunk; I am not sure we need a separate 
branch (let me know if you prefer a feature branch).
   
* It's smaller than or the same size as the OM HA
* Complexity is lower (at least for the existing code base); most of the 
code will be new and independent.
 



Issue Time Tracking
---

Worklog Id: (was: 314988)
Time Spent: 6h 10m  (was: 6h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=312504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-312504
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 14/Sep/19 05:09
Start Date: 14/Sep/19 05:09
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r324412721
 
 

 ##
 File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -309,4 +381,61 @@ private void checkIfNodeExist(UUID uuid) throws NodeNotFoundException {
       throw new NodeNotFoundException("Node UUID: " + uuid);
     }
   }
+
+  /**
+   * Create a list of datanodeInfo for all nodes matching the passed states.
+   * Passing null for one of the states acts like a wildcard for that state.
+   *
+   * @param opState
+   * @param health
+   * @return List of DatanodeInfo objects matching the passed state
+   */
+  private List<DatanodeInfo> filterNodes(
+      NodeOperationalState opState, NodeState health) {
+    if (opState != null && health != null) {
 
 Review comment:
   Please be aware that stream.filter-style patterns have a huge overhead 
compared with a plain for loop. If this code is going to be on any sort of 
critical path, it is better to keep it as a plain for loop.
   
   Please see some fixes made by Todd Lipcon in HDFS because of this issue.
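
To illustrate the two styles the review comment contrasts, here is a minimal sketch of the same null-as-wildcard filter written once as a plain for loop and once with `stream().filter(...)`. All names (`FilterLoopSketch`, `Node`, `Health`, `OpState`) are hypothetical and not taken from the patch, and the sketch only shows that both forms return the same result; it does not measure the overhead mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FilterLoopSketch {
  enum Health { HEALTHY, STALE, DEAD }

  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  /** Hypothetical stand-in for a datanode record. */
  static final class Node {
    final String id;
    final OpState opState;
    final Health health;

    Node(String id, OpState opState, Health health) {
      this.id = id;
      this.opState = opState;
      this.health = health;
    }
  }

  /** Plain for loop; a null argument matches every value of that state. */
  static List<Node> filterWithLoop(
      List<Node> nodes, OpState opState, Health health) {
    List<Node> result = new ArrayList<>();
    for (Node n : nodes) {
      boolean opMatches = opState == null || n.opState == opState;
      boolean healthMatches = health == null || n.health == health;
      if (opMatches && healthMatches) {
        result.add(n);
      }
    }
    return result;
  }

  /** Same predicate expressed with the Streams API. */
  static List<Node> filterWithStream(
      List<Node> nodes, OpState opState, Health health) {
    return nodes.stream()
        .filter(n -> opState == null || n.opState == opState)
        .filter(n -> health == null || n.health == health)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Node> nodes = Arrays.asList(
        new Node("n1", OpState.IN_SERVICE, Health.HEALTHY),
        new Node("n2", OpState.DECOMMISSIONING, Health.HEALTHY),
        new Node("n3", OpState.DECOMMISSIONING, Health.STALE));
    // Both calls select n2 and n3.
    System.out.println(
        filterWithLoop(nodes, OpState.DECOMMISSIONING, null).size());
    System.out.println(
        filterWithStream(nodes, OpState.DECOMMISSIONING, null).size());
  }
}
```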
 



Issue Time Tracking
---

Worklog Id: (was: 312504)
Time Spent: 6h  (was: 5h 50m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310984&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310984
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 11/Sep/19 20:38
Start Date: 11/Sep/19 20:38
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-530555681
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 82 | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 15 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | 0 | mvndep | 80 | Maven dependency ordering for branch |
   | +1 | mvninstall | 639 | trunk passed |
   | +1 | compile | 397 | trunk passed |
   | +1 | checkstyle | 76 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 981 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 234 | trunk passed |
   | 0 | spotbugs | 531 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 786 | trunk passed |
   |||| _ Patch Compile Tests _ |
   | 0 | mvndep | 44 | Maven dependency ordering for patch |
   | +1 | mvninstall | 723 | the patch passed |
   | +1 | compile | 467 | the patch passed |
   | +1 | cc | 467 | the patch passed |
   | +1 | javac | 467 | the patch passed |
   | +1 | checkstyle | 99 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 876 | patch has no errors when building and testing our client artifacts. |
   | -1 | javadoc | 95 | hadoop-hdds generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) |
   | +1 | findbugs | 788 | the patch passed |
   |||| _ Other Tests _ |
   | +1 | unit | 418 | hadoop-hdds in the patch passed. |
   | -1 | unit | 334 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 60 | The patch does not generate ASF License warnings. |
   | | | 7525 | |


   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.2 Server=19.03.2 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1344 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc mvninstall shadedclient findbugs checkstyle |
   | uname | Linux 425bc211517e 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 9221704 |
   | Default Java | 1.8.0_222 |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/6/artifact/out/diff-javadoc-javadoc-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/6/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/6/testReport/ |
   | Max. process+thread count | 426 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm hadoop-hdds/tools hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/6/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 310984)
Time Spent: 5h 50m  (was: 5h 40m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310884
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 11/Sep/19 17:38
Start Date: 11/Sep/19 17:38
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r323370846
 
 

 ##
 File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
 ##
 @@ -417,9 +451,12 @@ private SCMNodeStat getNodeStatInternal(DatanodeDetails datanodeDetails) {
 
   @Override
   public Map getNodeCount() {
+    // TODO - This does not consider decom, maint etc.
     Map nodeCountMap = new HashMap();
 
 Review comment:
   The existing code had Map, but I agree it would be better 
with  or . I plan to leave this as is 
for now, as this method is used only for JMX right now, and I plan to split 
that out into a separate change via HDDS-2113 as there are some open questions 
there.
 



Issue Time Tracking
---

Worklog Id: (was: 310884)
Time Spent: 5h 40m  (was: 5.5h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310882
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 11/Sep/19 17:36
Start Date: 11/Sep/19 17:36
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r323369810
 
 

 ##
 File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -309,4 +381,61 @@ private void checkIfNodeExist(UUID uuid) throws NodeNotFoundException {
       throw new NodeNotFoundException("Node UUID: " + uuid);
     }
   }
+
+  /**
+   * Create a list of datanodeInfo for all nodes matching the passed states.
+   * Passing null for one of the states acts like a wildcard for that state.
+   *
+   * @param opState
+   * @param health
+   * @return List of DatanodeInfo objects matching the passed state
+   */
+  private List<DatanodeInfo> filterNodes(
+      NodeOperationalState opState, NodeState health) {
+    if (opState != null && health != null) {
 
 Review comment:
   I had not really looked into the Streams API before, but I tried it here and 
it does make the code easier to follow, so I have made this change. I still 
kept the if statements at the start of the method: if both params are null, we 
can just return the entire list with no searching, and if both are non-null, 
we can search using the NodeStatus, which should be slightly more efficient.
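
The approach described in this comment (two early exits, then a stream for the wildcard cases) might look roughly like the sketch below. It is not the code from the pull request: FilterNodesSketch, Node, OpState, Health and the byStatus index are hypothetical stand-ins for NodeStateMap, DatanodeInfo, NodeOperationalState, NodeState and the NodeStatus lookup.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilterNodesSketch {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Health { HEALTHY, STALE, DEAD }

  /** Hypothetical stand-in for DatanodeInfo. */
  static final class Node {
    final String name;
    final OpState opState;
    final Health health;

    Node(String name, OpState opState, Health health) {
      this.name = name;
      this.opState = opState;
      this.health = health;
    }
  }

  private final List<Node> allNodes = new ArrayList<>();
  // Index from (opState, health) to nodes, used when both states are given.
  private final Map<OpState, Map<Health, List<Node>>> byStatus =
      new EnumMap<>(OpState.class);

  void add(Node node) {
    allNodes.add(node);
    byStatus
        .computeIfAbsent(node.opState,
            k -> new EnumMap<Health, List<Node>>(Health.class))
        .computeIfAbsent(node.health, k -> new ArrayList<Node>())
        .add(node);
  }

  /** A null argument acts as a wildcard for that state. */
  List<Node> filterNodes(OpState opState, Health health) {
    if (opState == null && health == null) {
      // Both wildcards: return everything without searching.
      return new ArrayList<Node>(allNodes);
    }
    if (opState != null && health != null) {
      // Both given: direct lookup on the combined status.
      Map<Health, List<Node>> forOp = byStatus.get(opState);
      List<Node> exact = forOp == null ? null : forOp.get(health);
      return exact == null ? new ArrayList<Node>() : new ArrayList<Node>(exact);
    }
    // Exactly one wildcard: scan with a stream.
    return allNodes.stream()
        .filter(n -> opState == null || n.opState == opState)
        .filter(n -> health == null || n.health == health)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    FilterNodesSketch map = new FilterNodesSketch();
    map.add(new Node("dn1", OpState.IN_SERVICE, Health.HEALTHY));
    map.add(new Node("dn2", OpState.DECOMMISSIONING, Health.HEALTHY));
    map.add(new Node("dn3", OpState.IN_SERVICE, Health.STALE));
    for (Node n : map.filterNodes(OpState.IN_SERVICE, null)) {
      System.out.println(n.name);
    }
  }
}
```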
 



Issue Time Tracking
---

Worklog Id: (was: 310882)
Time Spent: 5.5h  (was: 5h 20m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310511&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310511
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 11/Sep/19 11:52
Start Date: 11/Sep/19 11:52
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-530346256
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 39 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 15 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 65 | Maven dependency ordering for branch |
   | +1 | mvninstall | 609 | trunk passed |
   | +1 | compile | 409 | trunk passed |
   | +1 | checkstyle | 77 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 969 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 169 | trunk passed |
   | 0 | spotbugs | 428 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 627 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 32 | Maven dependency ordering for patch |
   | +1 | mvninstall | 541 | the patch passed |
   | +1 | compile | 375 | the patch passed |
   | +1 | cc | 374 | the patch passed |
   | +1 | javac | 374 | the patch passed |
   | +1 | checkstyle | 78 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 729 | patch has no errors when building and testing our client artifacts. |
   | -1 | javadoc | 81 | hadoop-hdds generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) |
   | +1 | findbugs | 720 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 271 | hadoop-hdds in the patch passed. |
   | -1 | unit | 2091 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
   | | | 8200 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.container.TestContainerReplication |
   |   | hadoop.ozone.scm.TestContainerSmallFile |
   |   | hadoop.ozone.TestSecureOzoneCluster |
   |   | hadoop.ozone.om.TestOzoneManagerRestart |
   |   | hadoop.ozone.om.TestOMRatisSnapshots |
   |   | hadoop.ozone.client.rpc.TestWatchForCommit |
   |   | hadoop.ozone.client.rpc.TestBlockOutputStreamWithFailures |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1344 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc mvninstall shadedclient findbugs checkstyle |
   | uname | Linux f8e32d81502e 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / c255333 |
   | Default Java | 1.8.0_222 |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/5/artifact/out/diff-javadoc-javadoc-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/5/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/5/testReport/ |
   | Max. process+thread count | 5328 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm hadoop-hdds/tools hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/5/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 310511)
Time Spent: 5h 20m  (was: 5h 10m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310436&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310436
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 11/Sep/19 09:29
Start Date: 11/Sep/19 09:29
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-530300616
 
 
   The failing unit test passes locally, and I think the integration tests that 
failed are flaky. I will push the change to fix the style issue and see how 
the re-test goes.
 



Issue Time Tracking
---

Worklog Id: (was: 310436)
Time Spent: 5h 10m  (was: 5h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310317
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 11/Sep/19 05:46
Start Date: 11/Sep/19 05:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-530229574
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 92 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 1 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 15 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 24 | Maven dependency ordering for branch |
   | +1 | mvninstall | 649 | trunk passed |
   | +1 | compile | 405 | trunk passed |
   | +1 | checkstyle | 76 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 932 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 182 | trunk passed |
   | 0 | spotbugs | 499 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 750 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 32 | Maven dependency ordering for patch |
   | -1 | mvninstall | 299 | hadoop-ozone in the patch failed. |
   | -1 | compile | 247 | hadoop-ozone in the patch failed. |
   | -1 | cc | 247 | hadoop-ozone in the patch failed. |
   | -1 | javac | 247 | hadoop-ozone in the patch failed. |
   | -0 | checkstyle | 43 | hadoop-ozone: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 739 | patch has no errors when building and testing our client artifacts. |
   | -1 | javadoc | 75 | hadoop-hdds generated 20 new + 16 unchanged - 0 fixed = 36 total (was 16) |
   | -1 | findbugs | 411 | hadoop-ozone in the patch failed. |
   ||| _ Other Tests _ |
   | +1 | unit | 339 | hadoop-hdds in the patch passed. |
   | -1 | unit | 466 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 40 | The patch does not generate ASF License warnings. |
   | | | 6574 | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.2 Server=19.03.2 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1344 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc mvninstall shadedclient findbugs checkstyle |
   | uname | Linux 2edae08c6f80 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / dacc448 |
   | Default Java | 1.8.0_212 |
   | mvninstall | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/patch-mvninstall-hadoop-ozone.txt |
   | compile | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/patch-compile-hadoop-ozone.txt |
   | cc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/patch-compile-hadoop-ozone.txt |
   | javac | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/patch-compile-hadoop-ozone.txt |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/diff-checkstyle-hadoop-ozone.txt |
   | javadoc | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/diff-javadoc-javadoc-hadoop-hdds.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/patch-findbugs-hadoop-ozone.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/testReport/ |
   | Max. process+thread count | 1247 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm hadoop-hdds/tools hadoop-ozone/integration-test U: . |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/4/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310050
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 10/Sep/19 19:20
Start Date: 10/Sep/19 19:20
Worklog Time Spent: 10m 
  Work Description: swagle commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r322918560
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -309,4 +381,61 @@ private void checkIfNodeExist(UUID uuid) throws NodeNotFoundException {
        throw new NodeNotFoundException("Node UUID: " + uuid);
      }
    }
+
+  /**
+   * Create a list of datanodeInfo for all nodes matching the passed states.
+   * Passing null for one of the states acts like a wildcard for that state.
+   *
+   * @param opState
+   * @param health
+   * @return List of DatanodeInfo objects matching the passed state
+   */
+  private List<DatanodeInfo> filterNodes(
+      NodeOperationalState opState, NodeState health) {
+    if (opState != null && health != null) {
 
 Review comment:
   Can we write lines 395-440 with one simple stream().filter? Nothing wrong 
with the code itself, just a thought.
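
For illustration, the single stream().filter suggested here could look something like the sketch below. It is an assumption about the shape of such a filter, not the reviewed code: SingleFilterSketch, Node, OpState and Health are hypothetical stand-ins for the real classes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SingleFilterSketch {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Health { HEALTHY, STALE, DEAD }

  /** Hypothetical stand-in for DatanodeInfo. */
  static final class Node {
    final String name;
    final OpState opState;
    final Health health;

    Node(String name, OpState opState, Health health) {
      this.name = name;
      this.opState = opState;
      this.health = health;
    }
  }

  /** One filter covers all four null / non-null combinations. */
  static List<Node> filterNodes(
      List<Node> nodes, OpState opState, Health health) {
    return nodes.stream()
        .filter(n -> (opState == null || n.opState == opState)
            && (health == null || n.health == health))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Node> nodes = Arrays.asList(
        new Node("dn1", OpState.IN_SERVICE, Health.HEALTHY),
        new Node("dn2", OpState.IN_MAINTENANCE, Health.DEAD));
    for (Node n : filterNodes(nodes, null, Health.DEAD)) {
      System.out.println(n.name);
    }
  }
}
```

The trade-off is that every call scans the whole list, even when both arguments are null or both are set.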
 



Issue Time Tracking
---

Worklog Id: (was: 310050)
Time Spent: 4h 50m  (was: 4h 40m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=310038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-310038
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 10/Sep/19 19:02
Start Date: 10/Sep/19 19:02
Worklog Time Spent: 10m 
  Work Description: swagle commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r322911485
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
 ##
 @@ -417,9 +451,12 @@ private SCMNodeStat getNodeStatInternal(DatanodeDetails 
datanodeDetails) {
 
   @Override
   public Map getNodeCount() {
+// TODO - This does not consider decom, maint etc.
 Map nodeCountMap = new HashMap();
 
 Review comment:
   Why not ? It makes it easier to consume for the caller 
in my opinion.
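
As a rough illustration of what the TODO in the hunk above is pointing at, node counts could be keyed by the combined operational state and health rather than by health alone. The sketch below is only one possible shape and is not taken from the pull request: NodeCountSketch, Node, OpState, Health and the "OP/HEALTH" string key are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class NodeCountSketch {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Health { HEALTHY, STALE, DEAD }

  /** Hypothetical stand-in for a registered datanode. */
  static final class Node {
    final OpState opState;
    final Health health;

    Node(OpState opState, Health health) {
      this.opState = opState;
      this.health = health;
    }
  }

  /** Counts nodes per combined state, e.g. "IN_SERVICE/HEALTHY". */
  static Map<String, Long> countByStatus(List<Node> nodes) {
    return nodes.stream().collect(Collectors.groupingBy(
        n -> n.opState + "/" + n.health,
        TreeMap::new,
        Collectors.counting()));
  }

  public static void main(String[] args) {
    List<Node> nodes = Arrays.asList(
        new Node(OpState.IN_SERVICE, Health.HEALTHY),
        new Node(OpState.IN_SERVICE, Health.HEALTHY),
        new Node(OpState.DECOMMISSIONING, Health.STALE));
    countByStatus(nodes).forEach(
        (status, count) -> System.out.println(status + " = " + count));
  }
}
```

Combinations with no nodes are simply absent from the map, so a caller would need to treat a missing key as zero.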
 



Issue Time Tracking
---

Worklog Id: (was: 310038)
Time Spent: 4h 40m  (was: 4.5h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=309954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-309954
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 10/Sep/19 16:56
Start Date: 10/Sep/19 16:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-530027446
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 82 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 3 new or modified test files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 67 | Maven dependency ordering for branch |
   | +1 | mvninstall | 628 | trunk passed |
   | +1 | compile | 390 | trunk passed |
   | +1 | checkstyle | 75 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 948 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 175 | trunk passed |
   | 0 | spotbugs | 459 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 682 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 31 | Maven dependency ordering for patch |
   | +1 | mvninstall | 576 | the patch passed |
   | +1 | compile | 388 | the patch passed |
   | +1 | cc | 388 | the patch passed |
   | +1 | javac | 388 | the patch passed |
   | -0 | checkstyle | 37 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 1 | The patch has no whitespace issues. |
   | +1 | shadedclient | 736 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 179 | the patch passed |
   | -1 | findbugs | 215 | hadoop-hdds generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
   ||| _ Other Tests _ |
   | -1 | unit | 298 | hadoop-hdds in the patch failed. |
   | -1 | unit | 3499 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 61 | The patch does not generate ASF License warnings. |
   | | | 9701 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:hadoop-hdds |
   |  |  Dead store to nodes in org.apache.hadoop.hdds.scm.node.NodeStateManager.getAllNodes()  At NodeStateManager.java:org.apache.hadoop.hdds.scm.node.NodeStateManager.getAllNodes()  At NodeStateManager.java:[line 396] |
   |  |  org.apache.hadoop.hdds.scm.node.states.NodeStateMap.getNodes(NodeStatus) does not release lock on all exception paths  At NodeStateMap.java:on all exception paths  At NodeStateMap.java:[line 156] |
   | Failed junit tests | hadoop.hdds.scm.block.TestBlockManager |
   |   | hadoop.ozone.container.TestContainerReplication |
   |   | hadoop.ozone.TestSecureOzoneCluster |
   |   | hadoop.ozone.scm.node.TestQueryNode |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClient |
   |   | hadoop.ozone.client.rpc.TestOzoneRpcClientWithRatis |
   |   | hadoop.ozone.client.rpc.TestMultiBlockWritesWithDnFailures |
   |   | hadoop.ozone.scm.TestContainerSmallFile |
   |   | hadoop.ozone.client.rpc.Test2WayCommitInRatis |
   |   | hadoop.ozone.TestMiniChaosOzoneCluster |
   |   | hadoop.ozone.container.common.statemachine.commandhandler.TestBlockDeletion |
   |   | hadoop.ozone.scm.TestGetCommittedBlockLengthAndPutKey |
   |   | hadoop.ozone.client.rpc.TestDeleteWithSlowFollower |
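
   The second FindBugs entry above reports a lock that is not released on every exception path. The usual Java remedy is to take the lock immediately before a try block and release it in finally. A minimal sketch, assuming the map is guarded by a ReentrantReadWriteLock; the class, field and method names below are hypothetical and are not taken from NodeStateMap:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedLookupSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, List<String>> nodesByStatus = new HashMap<>();

  void add(String status, String node) {
    lock.writeLock().lock();
    try {
      nodesByStatus.computeIfAbsent(status, k -> new ArrayList<>()).add(node);
    } finally {
      // Runs on normal return and on any exception.
      lock.writeLock().unlock();
    }
  }

  List<String> getNodes(String status) {
    lock.readLock().lock();
    try {
      List<String> found = nodesByStatus.get(status);
      if (found == null) {
        // Throwing here still passes through the finally block below.
        throw new IllegalArgumentException("Unknown status: " + status);
      }
      return new ArrayList<>(found);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Number of read locks currently held; 0 means none leaked. */
  int readLockCount() {
    return lock.getReadLockCount();
  }

  public static void main(String[] args) {
    LockedLookupSketch map = new LockedLookupSketch();
    map.add("IN_SERVICE/HEALTHY", "dn1");
    System.out.println(map.getNodes("IN_SERVICE/HEALTHY"));
    try {
      map.getNodes("missing");
    } catch (IllegalArgumentException e) {
      System.out.println("read locks held after failure: " + map.readLockCount());
    }
  }
}
```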
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1344 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc mvninstall shadedclient findbugs checkstyle |
   | uname | Linux 7a7a07260082 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / dc9abd2 |
   | Default Java | 1.8.0_222 |
   | checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/3/artifact/out/diff-checkstyle-hadoop-hdds.txt |
   | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/3/artifact/out/new-findbugs-hadoop-hdds.html |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/3/artifact/out/patch-unit-hadoop-hdds.txt |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344

[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=308007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308007
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 16:44
Start Date: 06/Sep/19 16:44
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-528927862
 
 
   Just a note: originally, DatanodeInfo was based on the HDFS code. Then I 
think we copied it and created our own structure. At this point, I think 
diverging should not be a big deal.
 



Issue Time Tracking
---

Worklog Id: (was: 308007)
Time Spent: 4h 20m  (was: 4h 10m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=308006&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308006
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 16:43
Start Date: 06/Sep/19 16:43
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321819701
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   Even if you have 15x states, the number of nodes is small. If you have 100 
nodes, there are only 1500 states, and if you have 1000 nodes, it is 15000 
states. It is still trivial to keep these in memory. Here is the real kicker: 
just like we decided not to write all cross products for the NodeState static 
functions, we will end up needing lists for only the frequently accessed 
patterns (in my mind that would be (in_service, healthy)). All other node 
queries can be served by iterating the lists as needed.
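
The 15x figure above is the 3 health values times the 5 operational states. One way to read the suggestion is to keep a dedicated list only for the frequently queried combination and to scan for everything else. The sketch below illustrates that reading under stated assumptions: HotPathIndexSketch, the string status keys and the choice of IN_SERVICE/HEALTHY as the only indexed combination are hypothetical and are not the reviewed code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class HotPathIndexSketch {
  /** The one combination assumed to be queried often enough to index. */
  static final String HOT = "IN_SERVICE/HEALTHY";

  private final Map<String, String> statusByNode = new LinkedHashMap<>();
  private final Set<String> inServiceHealthy = new LinkedHashSet<>();

  void setStatus(String node, String status) {
    statusByNode.put(node, status);
    // Keep the dedicated set in step with every status change.
    if (HOT.equals(status)) {
      inServiceHealthy.add(node);
    } else {
      inServiceHealthy.remove(node);
    }
  }

  List<String> getNodes(String status) {
    if (HOT.equals(status)) {
      // Frequent query: answered from the dedicated set.
      return new ArrayList<>(inServiceHealthy);
    }
    // Rare query: iterate over all nodes.
    return statusByNode.entrySet().stream()
        .filter(e -> status.equals(e.getValue()))
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    HotPathIndexSketch index = new HotPathIndexSketch();
    index.setStatus("dn1", HOT);
    index.setStatus("dn2", "DECOMMISSIONING/HEALTHY");
    System.out.println(index.getNodes(HOT));
    System.out.println(index.getNodes("DECOMMISSIONING/HEALTHY"));
  }
}
```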
 



Issue Time Tracking
---

Worklog Id: (was: 308006)
Time Spent: 4h 10m  (was: 4h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307698&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307698
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 08:24
Start Date: 06/Sep/19 08:24
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1344: 
HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321626983
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   It is better to have `NodeStatus` inside `DatanodeInfo` rather than having 
two separate fields.
   
Yes, `stateMap` helped us to easily get the list/count of nodes in a 
specific state, but with the current changes it is not straightforward to 
maintain `state -> list of nodes`. In any case, we will be iterating over all 
the available nodes to get the list of nodes in a given state. 
   The number of nodes in a cluster should not go beyond 3-4 orders of 
magnitude. We can revisit and optimize this if we run into any performance 
issue.
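
As an illustration of the trade-off described here, the following is a minimal sketch that assumes the status pair is stored on each node object and that every query is a linear scan. `EmbeddedStatusSketch` and its nested `Node` class are hypothetical stand-ins for `DatanodeInfo`, and the enums are simplified copies of the states listed in the Jira description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not code from the PR.
class EmbeddedStatusSketch {
  enum Health { HEALTHY, STALE, DEAD }
  enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
      ENTERING_MAINTENANCE, IN_MAINTENANCE }

  // Stand-in for DatanodeInfo with the status pair stored on the node itself.
  static final class Node {
    final String hostname;
    final OpState opState;
    final Health health;
    Node(String hostname, OpState opState, Health health) {
      this.hostname = hostname;
      this.opState = opState;
      this.health = health;
    }
  }

  private final List<Node> nodes = new ArrayList<>();

  void register(Node node) {
    nodes.add(node);
  }

  // O(n) scan over all registered nodes; no state -> nodes map is maintained.
  List<String> getNodes(OpState op, Health health) {
    return nodes.stream()
        .filter(n -> n.opState == op && n.health == health)
        .map(n -> n.hostname)
        .collect(Collectors.toList());
  }

  int getNodeCount(OpState op, Health health) {
    return getNodes(op, health).size();
  }
}
```

For clusters in the range of thousands of nodes, a scan like this touches a few thousand objects per query, which is the basis for deferring any index until a performance problem is actually observed.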
 



Issue Time Tracking
---

Worklog Id: (was: 307698)
Time Spent: 4h  (was: 3h 50m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307697
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 08:23
Start Date: 06/Sep/19 08:23
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1344: 
HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321626983
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   It is better to have `NodeStatus` inside `DatanodeInfo` rather than having 
two separate fields.
   
Yes, `stateMap` helped us to easily get the list/count of nodes in a 
specific state, but with the current changes it is not straightforward to 
maintain `state -> list of nodes`. In any case, we will be iterating over all 
the available nodes to get the list of nodes in a given state. 
   The number of nodes in a cluster should not go beyond 3-4 orders of 
magnitude. We can revisit and optimize this if we run into any performance 
issue.
 



Issue Time Tracking
---

Worklog Id: (was: 307697)
Time Spent: 3h 50m  (was: 3h 40m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307688&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307688
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 07:55
Start Date: 06/Sep/19 07:55
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321616777
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   Do you think it makes sense to have a field inside DatanodeInfo of type 
NodeStatus, so we can always pass the states around as a pair, or should we add 
two individual fields to DatanodeInfo - nodeHealth and nodeOperationalState?
   
   Also, one other thing to consider is that nodeStateMap originally kept 
lists of healthy, stale and dead nodes, so it was possible to quickly return 
all nodes in a given state. Now, however, we need to iterate over the whole 
list to find those nodes. One reason for this is that we have 15 different 
states now instead of 3. If we move nodeStatus into datanodeInfo, it would be 
more difficult to optimise this later if needed. However, it would simplify 
things if we simply removed this stateMap.
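
To make the "pass the states around as a pair" option concrete, here is a small sketch of an immutable pair type with value equality. `NodeStatusPairSketch` and its nested enums are hypothetical stand-ins, not the PR's `NodeStatus` class. Because the pair implements `equals` and `hashCode`, it can also serve as a map key, so a `status -> nodes` grouping could still be derived later if the linear scan ever needed optimising.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch, not code from the PR.
class NodeStatusPairSketch {
  enum Health { HEALTHY, STALE, DEAD }
  enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
      ENTERING_MAINTENANCE, IN_MAINTENANCE }

  private final OpState operationalState;
  private final Health health;

  NodeStatusPairSketch(OpState operationalState, Health health) {
    this.operationalState = operationalState;
    this.health = health;
  }

  OpState getOperationalState() {
    return operationalState;
  }

  Health getHealth() {
    return health;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof NodeStatusPairSketch)) {
      return false;
    }
    NodeStatusPairSketch other = (NodeStatusPairSketch) o;
    return operationalState == other.operationalState
        && health == other.health;
  }

  @Override
  public int hashCode() {
    return Objects.hash(operationalState, health);
  }

  // Derives a status -> hostnames grouping from a per-node status map.
  static Map<NodeStatusPairSketch, List<String>> groupByStatus(
      Map<String, NodeStatusPairSketch> statusByHost) {
    Map<NodeStatusPairSketch, List<String>> grouped = new HashMap<>();
    for (Map.Entry<String, NodeStatusPairSketch> e : statusByHost.entrySet()) {
      grouped.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
          .add(e.getKey());
    }
    return grouped;
  }
}
```

Two separate fields on the node would carry the same information, but each caller would then have to read and compare both values, whereas the pair keeps the comparison in one place.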
 



Issue Time Tracking
---

Worklog Id: (was: 307688)
Time Spent: 3h 40m  (was: 3.5h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307658
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 06/Sep/19 06:22
Start Date: 06/Sep/19 06:22
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #1344: 
HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321576076
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
 ##
 @@ -43,7 +45,7 @@
   /**
* Represents the current state of node.
*/
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;
 
 Review comment:
   `stateMap` is no longer required; we can move `NodeStatus` inside 
`DatanodeInfo`.
 



Issue Time Tracking
---

Worklog Id: (was: 307658)
Time Spent: 3.5h  (was: 3h 20m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307285&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307285
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 16:14
Start Date: 05/Sep/19 16:14
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321356653
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStatus.java
 ##
 @@ -0,0 +1,88 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.node;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+
+import java.util.Objects;
+
+/**
+ * This class is used to capture the current status of a datanode. This
+ * includes its health (healthy, stale or dead) and its operational status
+ * (in_service, decommissioned or maintenance mode).
+ */
+public class NodeStatus {
+
+  private HddsProtos.NodeOperationalState operationalState;
+  private HddsProtos.NodeState health;
+
+  public NodeStatus(HddsProtos.NodeOperationalState operationalState,
+ HddsProtos.NodeState health) {
+this.operationalState = operationalState;
+this.health = health;
+  }
+
+  public static NodeStatus inServiceHealthy() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.HEALTHY);
 
 Review comment:
   Cool.
 



Issue Time Tracking
---

Worklog Id: (was: 307285)
Time Spent: 3h 20m  (was: 3h 10m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307283
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 16:13
Start Date: 05/Sep/19 16:13
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321356299
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE,
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationStateEvent.START_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_MAINTENANCE,
+NodeOperationStateEvent.ENTER_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_MAINTENANCE, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
 
 Review comment:
   Along with your consideration, do we need an edge called Timeout that leads 
from IN_MAINTENANCE to IN_SERVICE? Or do you plan to send a RETURN_TO_SERVICE 
event when there is a timeout? Either works; I was wondering whether we should 
capture the timeout edge in the state machine at all.
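
The two options raised in this comment can be compared with a small, self-contained sketch. `OpStateMachineSketch` is a hypothetical minimal state machine, not the one used by `NodeStateManager`, and the `MAINTENANCE_TIMEOUT` event is an assumed name that does not appear in the diff. The other events and transitions mirror the ones added in the diff above.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch, not code from the PR.
class OpStateMachineSketch {
  enum OpState { IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
      ENTERING_MAINTENANCE, IN_MAINTENANCE }
  // MAINTENANCE_TIMEOUT is an assumed event; the rest come from the diff.
  enum Event { START_DECOMMISSION, COMPLETE_DECOMMISSION, START_MAINTENANCE,
      ENTER_MAINTENANCE, RETURN_TO_SERVICE, MAINTENANCE_TIMEOUT }

  private final Map<OpState, Map<Event, OpState>> transitions =
      new EnumMap<>(OpState.class);

  void addTransition(OpState from, OpState to, Event event) {
    transitions
        .computeIfAbsent(from, k -> new EnumMap<Event, OpState>(Event.class))
        .put(event, to);
  }

  OpState next(OpState from, Event event) {
    Map<Event, OpState> outgoing = transitions.get(from);
    if (outgoing == null || !outgoing.containsKey(event)) {
      throw new IllegalStateException(
          "No transition from " + from + " on " + event);
    }
    return outgoing.get(event);
  }

  // Builds the operational state machine, optionally with a timeout edge.
  static OpStateMachineSketch build(boolean explicitTimeoutEdge) {
    OpStateMachineSketch sm = new OpStateMachineSketch();
    sm.addTransition(OpState.IN_SERVICE, OpState.DECOMMISSIONING,
        Event.START_DECOMMISSION);
    sm.addTransition(OpState.DECOMMISSIONING, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    sm.addTransition(OpState.DECOMMISSIONING, OpState.DECOMMISSIONED,
        Event.COMPLETE_DECOMMISSION);
    sm.addTransition(OpState.DECOMMISSIONED, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    sm.addTransition(OpState.IN_SERVICE, OpState.ENTERING_MAINTENANCE,
        Event.START_MAINTENANCE);
    sm.addTransition(OpState.ENTERING_MAINTENANCE, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    sm.addTransition(OpState.ENTERING_MAINTENANCE, OpState.IN_MAINTENANCE,
        Event.ENTER_MAINTENANCE);
    sm.addTransition(OpState.IN_MAINTENANCE, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    if (explicitTimeoutEdge) {
      // Option 1: model the timeout as its own edge.
      sm.addTransition(OpState.IN_MAINTENANCE, OpState.IN_SERVICE,
          Event.MAINTENANCE_TIMEOUT);
    }
    // Option 2 (no extra edge): the caller fires RETURN_TO_SERVICE on timeout.
    return sm;
  }
}
```

Both variants end in IN_SERVICE. The explicit edge records in the state machine why the node returned, while reusing RETURN_TO_SERVICE keeps the event set smaller and leaves the timeout handling to the caller.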
 


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307284
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 16:13
Start Date: 05/Sep/19 16:13
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321356555
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -426,18 +432,20 @@ public int getStaleNodeCount() {
* @return dead node count
*/
   public int getDeadNodeCount() {
-return getNodeCount(NodeState.DEAD);
+// TODO - hard coded IN_SERVICE
+return getNodeCount(
+new NodeStatus(NodeOperationalState.IN_SERVICE, NodeState.DEAD));
 
 Review comment:
   Perfect, works well. I saw that later in the code. It is fine for now. 
 



Issue Time Tracking
---

Worklog Id: (was: 307284)
Time Spent: 3h 10m  (was: 3h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307281
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 16:11
Start Date: 05/Sep/19 16:11
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321355379
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
 
 Review comment:
   makes sense, I also do this quite often; let us see what sticks.
 



Issue Time Tracking
---

Worklog Id: (was: 307281)
Time Spent: 2h 50m  (was: 2h 40m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307079
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 11:18
Start Date: 05/Sep/19 11:18
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321204004
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
 ##
 @@ -185,7 +190,7 @@ public int getNodeCount(NodeState nodestate) {
   @Override
   public NodeState getNodeState(DatanodeDetails datanodeDetails) {
 
 Review comment:
   Yes, the 'external interface' of SCMNodeManager will need to change but I 
want to get these changes to be good internally before we push them up the 
stack.
   
   Thanks for taking the time to review this WIP. Glad to hear this is going in 
the correct direction so I will look to tidy things up and then we can consider 
the next step.
 



Issue Time Tracking
---

Worklog Id: (was: 307079)
Time Spent: 2h 40m  (was: 2.5h)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node state with two 
> different sets of values. The first is effectively the liveliness of the 
> node, with the following states. This is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the overall total number of states for a node is the cross-product 
> of the two above lists; however, it probably makes sense to keep the two 
> states separate internally.
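
A minimal sketch of the two-dimensional model proposed in the description above. Only the enum constants come from the issue text; the type names Health, OpState and StatusModelSketch are assumptions made for illustration and are not the names used in HDDS. It simply enumerates the cross-product of the two lists.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; only the enum constants come from the issue text.
class StatusModelSketch {
  enum Health { HEALTHY, STALE, DEAD }

  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  // Enumerates every (operational state, health) pair, i.e. the cross-product.
  static List<String> allCombinations() {
    List<String> result = new ArrayList<>();
    for (OpState op : OpState.values()) {
      for (Health h : Health.values()) {
        result.add(op + "/" + h);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> all = allCombinations();
    System.out.println(all.size() + " combined states");
    for (String s : all) {
      System.out.println(s);
    }
  }
}
```

Keeping the two enums separate means a new operational state adds one constant rather than one constant per health value.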






[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307076&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307076
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 11:16
Start Date: 05/Sep/19 11:16
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321203027
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
 ##
 @@ -151,7 +152,9 @@ private void unregisterMXBean() {
*/
   @Override
   public List getNodes(NodeState nodestate) {
-return nodeStateManager.getNodes(nodestate).stream()
+return nodeStateManager.getNodes(
+new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE, nodestate))
+.stream()
 .map(node -> (DatanodeDetails)node).collect(Collectors.toList());
   }
 
 Review comment:
   Yea, I need to fix the query function. I can imagine we will need things like 
all IN_MAINT nodes (ignoring healthy, dead, etc.) or all dead nodes (ignoring op 
state). Right now such queries are not possible; I first need to figure out how 
to enhance the interface.
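
One possible shape for such a query, as a rough sketch only: each dimension is an optional filter, and passing null means "ignore this dimension". The class NodeQuerySketch, its nested types and the string node ids are all hypothetical and are not part of the PR or of the SCMNodeManager interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a query that can ignore either dimension.
class NodeQuerySketch {
  enum Health { HEALTHY, STALE, DEAD }

  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  static final class Status {
    final OpState opState;
    final Health health;

    Status(OpState opState, Health health) {
      this.opState = opState;
      this.health = health;
    }
  }

  // Insertion-ordered so results are deterministic in this sketch.
  private final Map<String, Status> nodes = new LinkedHashMap<>();

  void put(String nodeId, OpState opState, Health health) {
    nodes.put(nodeId, new Status(opState, health));
  }

  // A null argument means "do not filter on this dimension".
  List<String> getNodes(OpState opState, Health health) {
    List<String> matches = new ArrayList<>();
    for (Map.Entry<String, Status> e : nodes.entrySet()) {
      Status s = e.getValue();
      boolean opMatches = opState == null || s.opState == opState;
      boolean healthMatches = health == null || s.health == health;
      if (opMatches && healthMatches) {
        matches.add(e.getKey());
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    NodeQuerySketch q = new NodeQuerySketch();
    q.put("dn1", OpState.IN_SERVICE, Health.HEALTHY);
    q.put("dn2", OpState.IN_MAINTENANCE, Health.DEAD);
    q.put("dn3", OpState.IN_SERVICE, Health.DEAD);
    System.out.println(q.getNodes(OpState.IN_MAINTENANCE, null)); // all in maintenance
    System.out.println(q.getNodes(null, Health.DEAD));            // all dead
  }
}
```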
 



Issue Time Tracking
---

Worklog Id: (was: 307076)
Time Spent: 2.5h  (was: 2h 20m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307075
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 11:12
Start Date: 05/Sep/19 11:12
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321201465
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStatus.java
 ##
 @@ -0,0 +1,88 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.node;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+
+import java.util.Objects;
+
+/**
+ * This class is used to capture the current status of a datanode. This
+ * includes its health (healthy, stale or dead) and its operation status (
+ * in_service, decommissioned and maintenance mode.
+ */
+public class NodeStatus {
+
+  private HddsProtos.NodeOperationalState operationalState;
+  private HddsProtos.NodeState health;
+
+  public NodeStatus(HddsProtos.NodeOperationalState operationalState,
+ HddsProtos.NodeState health) {
+this.operationalState = operationalState;
+this.health = health;
+  }
+
+  public static NodeStatus inServiceHealthy() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.HEALTHY);
+  }
+
+  public static NodeStatus inServiceStale() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.STALE);
+  }
+
+  public static NodeStatus inServiceDead() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.DEAD);
+  }
+
 
 Review comment:
   Yes. I got tired of typing the whole new NodeStatus(...) and decided to try 
adding the static methods. It definitely makes the code cleaner, but the cross 
product worries me. At the moment it's only 5 * 3 = 15 states, but what if we 
add a 3rd status or a couple more states? The number of helper methods will get 
out of control. We can see how it develops I guess.
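
One way to keep the helper count from growing with the cross product, sketched under assumptions: a single lookup method backed by a table built once from the two enums. The names NodeStatusTableSketch, Status and of(...) are hypothetical and do not appear in the PR; this is an alternative to per-combination helpers, not the approach the patch takes.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration: one factory method instead of one helper per pair.
class NodeStatusTableSketch {
  enum Health { HEALTHY, STALE, DEAD }

  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  static final class Status {
    final OpState opState;
    final Health health;

    private Status(OpState opState, Health health) {
      this.opState = opState;
      this.health = health;
    }
  }

  // Built once; grows automatically when either enum gains a constant.
  private static final Map<OpState, Map<Health, Status>> TABLE =
      new EnumMap<>(OpState.class);

  static {
    for (OpState op : OpState.values()) {
      Map<Health, Status> row = new EnumMap<>(Health.class);
      for (Health h : Health.values()) {
        row.put(h, new Status(op, h));
      }
      TABLE.put(op, row);
    }
  }

  // Returns the shared instance for the given pair.
  static Status of(OpState opState, Health health) {
    return TABLE.get(opState).get(health);
  }

  public static void main(String[] args) {
    Status s = of(OpState.IN_MAINTENANCE, Health.DEAD);
    System.out.println(s.opState + "/" + s.health);
  }
}
```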
 



Issue Time Tracking
---

Worklog Id: (was: 307075)
Time Spent: 2h 20m  (was: 2h 10m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307070
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 11:06
Start Date: 05/Sep/19 11:06
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321199279
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStatus.java
 ##
 @@ -0,0 +1,88 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.node;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+
+import java.util.Objects;
+
+/**
+ * This class is used to capture the current status of a datanode. This
+ * includes its health (healthy, stale or dead) and its operation status (
+ * in_service, decommissioned and maintenance mode.
+ */
+public class NodeStatus {
+
+  private HddsProtos.NodeOperationalState operationalState;
+  private HddsProtos.NodeState health;
+
+  public NodeStatus(HddsProtos.NodeOperationalState operationalState,
+ HddsProtos.NodeState health) {
+this.operationalState = operationalState;
+this.health = health;
+  }
+
+  public static NodeStatus inServiceHealthy() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.HEALTHY);
 
 Review comment:
   Yea, we could optimize this and always return the same object.
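
A small sketch of what returning the same object could look like. It assumes the status object is immutable and has value-based equals and hashCode; the diff imports java.util.Objects, which hints at that, but the quoted hunk does not show those methods. All names here (SharedStatusSketch, Status) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration: the helper hands out one cached, immutable instance.
class SharedStatusSketch {
  enum Health { HEALTHY, STALE, DEAD }

  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  static final class Status {
    private final OpState opState;
    private final Health health;

    Status(OpState opState, Health health) {
      this.opState = opState;
      this.health = health;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Status)) {
        return false;
      }
      Status other = (Status) o;
      return opState == other.opState && health == other.health;
    }

    @Override
    public int hashCode() {
      return Objects.hash(opState, health);
    }
  }

  // Allocated once; safe to share because Status has no mutable fields.
  private static final Status IN_SERVICE_HEALTHY =
      new Status(OpState.IN_SERVICE, Health.HEALTHY);

  static Status inServiceHealthy() {
    return IN_SERVICE_HEALTHY;
  }

  public static void main(String[] args) {
    Map<Status, String> byStatus = new HashMap<>();
    byStatus.put(inServiceHealthy(), "dn1");
    // A freshly built equal value finds the same entry.
    System.out.println(
        byStatus.get(new Status(OpState.IN_SERVICE, Health.HEALTHY)));
  }
}
```

With value-based equality, callers cannot tell a cached instance from a fresh one, so the caching only saves allocations.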
 



Issue Time Tracking
---

Worklog Id: (was: 307070)
Time Spent: 2h 10m  (was: 2h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307068
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 11:05
Start Date: 05/Sep/19 11:05
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321198662
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -578,39 +587,33 @@ private void checkNodesHealth() {
 Predicate deadNodeCondition =
 (lastHbTime) -> lastHbTime < staleNodeDeadline;
 try {
-  for (NodeState state : NodeState.values()) {
-List nodes = nodeStateMap.getNodes(state);
-for (UUID id : nodes) {
-  DatanodeInfo node = nodeStateMap.getNodeInfo(id);
-  switch (state) {
-  case HEALTHY:
-// Move the node to STALE if the last heartbeat time is less than
-// configured stale-node interval.
-updateNodeState(node, staleNodeCondition, state,
-  NodeLifeCycleEvent.TIMEOUT);
-break;
-  case STALE:
-// Move the node to DEAD if the last heartbeat time is less than
-// configured dead-node interval.
-updateNodeState(node, deadNodeCondition, state,
-NodeLifeCycleEvent.TIMEOUT);
-// Restore the node if we have received heartbeat before configured
-// stale-node interval.
-updateNodeState(node, healthyNodeCondition, state,
-NodeLifeCycleEvent.RESTORE);
-break;
-  case DEAD:
-// Resurrect the node if we have received heartbeat before
-// configured stale-node interval.
-updateNodeState(node, healthyNodeCondition, state,
-NodeLifeCycleEvent.RESURRECT);
-break;
-// We don't do anything for DECOMMISSIONING and DECOMMISSIONED in
-// heartbeat processing.
-  case DECOMMISSIONING:
-  case DECOMMISSIONED:
-  default:
-  }
+  for(DatanodeInfo node : nodeStateMap.getAllDatanodeInfos()) {
+NodeState state =
+nodeStateMap.getNodeStatus(node.getUuid()).getHealth();
+switch (state) {
+case HEALTHY:
+  // Move the node to STALE if the last heartbeat time is less than
+  // configured stale-node interval.
+  updateNodeState(node, staleNodeCondition, state,
+  NodeLifeCycleEvent.TIMEOUT);
+  break;
+case STALE:
+  // Move the node to DEAD if the last heartbeat time is less than
+  // configured dead-node interval.
+  updateNodeState(node, deadNodeCondition, state,
+  NodeLifeCycleEvent.TIMEOUT);
+  // Restore the node if we have received heartbeat before configured
+  // stale-node interval.
+  updateNodeState(node, healthyNodeCondition, state,
+  NodeLifeCycleEvent.RESTORE);
+  break;
+case DEAD:
+  // Resurrect the node if we have received heartbeat before
+  // configured stale-node interval.
+  updateNodeState(node, healthyNodeCondition, state,
+  NodeLifeCycleEvent.RESURRECT);
+  break;
+default:
 }
 
 Review comment:
   This loop didn't need to change for this change, but it seemed to be a 
double loop when it didn't really need to be, and was doing extra lookups from 
the NodeStateMap, so this makes it cleaner to read and slightly more efficient 
too.
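
The single-pass structure described above can be sketched as follows. This is a simplified stand-in, not the NodeStateManager code: the Node class, the direct field updates and the two deadline parameters are assumptions, and the real code fires lifecycle events through updateNodeState instead of assigning the health directly.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration of one loop over all nodes, switching on health.
class HealthCheckSketch {
  enum Health { HEALTHY, STALE, DEAD }

  static final class Node {
    final String id;
    final long lastHeartbeat;
    Health health;

    Node(String id, long lastHeartbeat, Health health) {
      this.id = id;
      this.lastHeartbeat = lastHeartbeat;
      this.health = health;
    }
  }

  // healthyDeadline: heartbeats older than this make a healthy node stale.
  // staleDeadline: heartbeats older than this make a stale node dead.
  static void checkNodesHealth(Collection<Node> nodes,
      long healthyDeadline, long staleDeadline) {
    for (Node node : nodes) {
      switch (node.health) {
      case HEALTHY:
        if (node.lastHeartbeat < healthyDeadline) {
          node.health = Health.STALE;
        }
        break;
      case STALE:
        if (node.lastHeartbeat < staleDeadline) {
          node.health = Health.DEAD;
        } else if (node.lastHeartbeat >= healthyDeadline) {
          node.health = Health.HEALTHY;
        }
        break;
      case DEAD:
        if (node.lastHeartbeat >= healthyDeadline) {
          node.health = Health.HEALTHY;
        }
        break;
      default:
        break;
      }
    }
  }

  public static void main(String[] args) {
    List<Node> nodes = new ArrayList<>();
    nodes.add(new Node("dn1", 950L, Health.HEALTHY));
    nodes.add(new Node("dn2", 800L, Health.HEALTHY));
    nodes.add(new Node("dn3", 600L, Health.STALE));
    checkNodesHealth(nodes, 900L, 700L);
    for (Node n : nodes) {
      System.out.println(n.id + " " + n.health);
    }
  }
}
```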
 



Issue Time Tracking
---

Worklog Id: (was: 307068)
Time Spent: 2h  (was: 1h 50m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307067
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 11:01
Start Date: 05/Sep/19 11:01
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321197327
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -426,18 +432,20 @@ public int getStaleNodeCount() {
* @return dead node count
*/
   public int getDeadNodeCount() {
-return getNodeCount(NodeState.DEAD);
+// TODO - hard coded IN_SERVICE
+return getNodeCount(
+new NodeStatus(NodeOperationalState.IN_SERVICE, NodeState.DEAD));
 
 Review comment:
   There are a bunch of places where I have hardcoded IN_SERVICE, so once we 
get this working we will need different events for DECOM / IN_MAINT + DEAD, as 
that is an expected state rather than an error condition as it would be now.
 



Issue Time Tracking
---

Worklog Id: (was: 307067)
Time Spent: 1h 50m  (was: 1h 40m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307061
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 10:49
Start Date: 05/Sep/19 10:49
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321192741
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE,
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationStateEvent.START_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_MAINTENANCE,
+NodeOperationStateEvent.ENTER_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_MAINTENANCE, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
 
 Review comment:
   I hadn't considered where to store that yet. It will probably live outside 
of the state machine, but I need to consider where it fits in. Perhaps in 
NodeStatus, but that would change that object from being immutable to carrying 
a time. 
   
   We will need some sort of decommission / maintenance mode monitor, probably 
separate from the heartbeat monitor. The decomm monitor will need to check when 
all blocks are replicated etc, so it could also keep track of the node 
maintenance timeout and hence switch the node to 'IN_SER

[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=307060&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307060
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 05/Sep/19 10:45
Start Date: 05/Sep/19 10:45
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321191360
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
 
 Review comment:
   I have not yet considered what should happen. First stage is to get the 
states in and make sure nothing breaks, then figure out how to use them :)
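
To make the operational-state transitions easier to follow, here is a self-contained sketch of them as a lookup table. The state and event names are taken from the hunks quoted in this thread, including the maintenance transitions from the fuller hunk; the OpStateMachineSketch class and its methods are hypothetical and stand in for the state machine class that NodeStateManager actually uses.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration; state and event names follow the quoted diff.
class OpStateMachineSketch {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Event {
    START_DECOMMISSION, COMPLETE_DECOMMISSION,
    START_MAINTENANCE, ENTER_MAINTENANCE, RETURN_TO_SERVICE
  }

  private final Map<OpState, Map<Event, OpState>> transitions =
      new EnumMap<>(OpState.class);

  void addTransition(OpState from, OpState to, Event on) {
    transitions.computeIfAbsent(from, k -> new EnumMap<>(Event.class))
        .put(on, to);
  }

  // Returns the target state, or throws if the event is not valid here.
  OpState next(OpState from, Event on) {
    Map<Event, OpState> row = transitions.get(from);
    OpState to = (row == null) ? null : row.get(on);
    if (to == null) {
      throw new IllegalStateException(
          "No transition from " + from + " on " + on);
    }
    return to;
  }

  static OpStateMachineSketch build() {
    OpStateMachineSketch sm = new OpStateMachineSketch();
    sm.addTransition(OpState.IN_SERVICE, OpState.DECOMMISSIONING,
        Event.START_DECOMMISSION);
    sm.addTransition(OpState.DECOMMISSIONING, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    sm.addTransition(OpState.DECOMMISSIONING, OpState.DECOMMISSIONED,
        Event.COMPLETE_DECOMMISSION);
    sm.addTransition(OpState.DECOMMISSIONED, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    sm.addTransition(OpState.IN_SERVICE, OpState.ENTERING_MAINTENANCE,
        Event.START_MAINTENANCE);
    sm.addTransition(OpState.ENTERING_MAINTENANCE, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    sm.addTransition(OpState.ENTERING_MAINTENANCE, OpState.IN_MAINTENANCE,
        Event.ENTER_MAINTENANCE);
    sm.addTransition(OpState.IN_MAINTENANCE, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    return sm;
  }

  public static void main(String[] args) {
    OpStateMachineSketch sm = build();
    OpState s = sm.next(OpState.IN_SERVICE, Event.START_DECOMMISSION);
    s = sm.next(s, Event.COMPLETE_DECOMMISSION);
    s = sm.next(s, Event.RETURN_TO_SERVICE);
    System.out.println(s);
  }
}
```

Note that the table only says which moves are legal. What a return to service should mean for containers already on the node is the open question raised in the review and is not modelled here.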
 



Issue Time Tracking
---

Worklog Id: (was: 307060)
Time Spent: 1.5h  (was: 1h 20m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306751&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306751
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320994828
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
 
 Review comment:
   What happens when we do this? Is this a new node, or do we pick up from 
where we left off? For example, if there are containers on this machine, are 
they treated as part of the system?
 



Issue Time Tracking
---

Worklog Id: (was: 306751)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306753
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320999602
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
 ##
 @@ -151,7 +152,9 @@ private void unregisterMXBean() {
*/
   @Override
   public List getNodes(NodeState nodestate) {
-return nodeStateManager.getNodes(nodestate).stream()
+return nodeStateManager.getNodes(
+new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE, nodestate))
+.stream()
 .map(node -> (DatanodeDetails)node).collect(Collectors.toList());
   }
 
 Review comment:
   In the final patch, should we change the node query function so that we can 
say "get me all the nodes that are in service and healthy" or "all nodes in 
maintenance mode but dead"? Let us add that feature when we need it. I am OK 
with all operations mapping to IN_SERVICE for now.
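
A query of the kind suggested above could look roughly like the sketch below. 
It is only an illustration: NodeQuerySketch, its nested OpState and Health 
enums, and the register/getNodes methods are hypothetical stand-ins, not the 
real HddsProtos types or the NodeStateManager API. A null filter is assumed to 
mean "match anything".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative registry only; not the real NodeStateManager.
class NodeQuerySketch {
  // Hypothetical stand-ins for NodeOperationalState and NodeState.
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Health { HEALTHY, STALE, DEAD }

  private final Map<String, OpState> opStates = new LinkedHashMap<>();
  private final Map<String, Health> healths = new LinkedHashMap<>();

  void register(String nodeId, OpState op, Health health) {
    opStates.put(nodeId, op);
    healths.put(nodeId, health);
  }

  // Returns the ids of nodes matching both filters, in registration order.
  // A null filter matches every value of that dimension.
  List<String> getNodes(OpState op, Health health) {
    List<String> result = new ArrayList<>();
    for (String id : opStates.keySet()) {
      boolean opMatches = op == null || opStates.get(id) == op;
      boolean healthMatches = health == null || healths.get(id) == health;
      if (opMatches && healthMatches) {
        result.add(id);
      }
    }
    return result;
  }
}
```

With this shape, the existing callers would pass IN_SERVICE explicitly, while 
a later caller could ask for IN_MAINTENANCE plus DEAD without a new method.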
 



Issue Time Tracking
---

Worklog Id: (was: 306753)

> Extend SCMNodeManager to support decommission and maintenance states
> 
>
> Key: HDDS-1982
> URL: https://issues.apache.org/jira/browse/HDDS-1982
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, within SCM a node can have the following states:
> HEALTHY
> STALE
> DEAD
> DECOMMISSIONING
> DECOMMISSIONED
> The last 2 are not currently used.
> In order to support decommissioning and maintenance mode, we need to extend 
> the set of states a node can have to include decommission and maintenance 
> states.
> It is also important to note that a node decommissioning or entering 
> maintenance can also be HEALTHY, STALE or go DEAD.
> Therefore in this Jira I propose we should model a node state with two 
> different sets of values. The first, is effectively the liveliness of the 
> node, with the following states. This is largely what is in place now:
> HEALTHY
> STALE
> DEAD
> The second is the node operational state:
> IN_SERVICE
> DECOMMISSIONING
> DECOMMISSIONED
> ENTERING_MAINTENANCE
> IN_MAINTENANCE
> That means the overall total number of states for a node is the cross-product 
> of the two above lists; however, it probably makes sense to keep the two 
> states separate internally.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306756&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306756
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320998604
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStatus.java
 ##
 @@ -0,0 +1,88 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.node;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+
+import java.util.Objects;
+
+/**
+ * This class is used to capture the current status of a datanode. This
+ * includes its health (healthy, stale or dead) and its operation status
+ * (in_service, decommissioned and maintenance mode).
+ */
+public class NodeStatus {
+
+  private HddsProtos.NodeOperationalState operationalState;
+  private HddsProtos.NodeState health;
+
+  public NodeStatus(HddsProtos.NodeOperationalState operationalState,
+ HddsProtos.NodeState health) {
+this.operationalState = operationalState;
+this.health = health;
+  }
+
+  public static NodeStatus inServiceHealthy() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.HEALTHY);
+  }
+
+  public static NodeStatus inServiceStale() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.STALE);
+  }
+
+  public static NodeStatus inServiceDead() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.DEAD);
+  }
+
 
 Review comment:
   I am presuming that you have to define the whole cross product at some 
point, but right now this is all we need?
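
For reference, the full cross product mentioned above is small enough to 
enumerate mechanically rather than by hand-writing one factory method per 
pair. The sketch below assumes simplified stand-in enums (OpState, Health) in 
a hypothetical NodeStatusCrossProduct class; it is not code from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; the enum names mirror the states listed in the Jira.
class NodeStatusCrossProduct {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Health { HEALTHY, STALE, DEAD }

  // Builds every (operational state, health) pair: 5 x 3 = 15 combinations.
  static List<String> allCombinations() {
    List<String> all = new ArrayList<>();
    for (OpState op : OpState.values()) {
      for (Health health : Health.values()) {
        all.add(op + "/" + health);
      }
    }
    return all;
  }
}
```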
 



Issue Time Tracking
---

Worklog Id: (was: 306756)
Time Spent: 1h 20m  (was: 1h 10m)





[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306755
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320998407
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStatus.java
 ##
 @@ -0,0 +1,88 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hdds.scm.node;
+
+import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
+
+import java.util.Objects;
+
+/**
+ * This class is used to capture the current status of a datanode. This
+ * includes its health (healthy, stale or dead) and its operation status
+ * (in_service, decommissioned and maintenance mode).
+ */
+public class NodeStatus {
+
+  private HddsProtos.NodeOperationalState operationalState;
+  private HddsProtos.NodeState health;
+
+  public NodeStatus(HddsProtos.NodeOperationalState operationalState,
+ HddsProtos.NodeState health) {
+this.operationalState = operationalState;
+this.health = health;
+  }
+
+  public static NodeStatus inServiceHealthy() {
+return new NodeStatus(HddsProtos.NodeOperationalState.IN_SERVICE,
+HddsProtos.NodeState.HEALTHY);
 
 Review comment:
   Is there a reason to allocate this each time? Maybe just create a static 
instance and return a reference to that? Not important at all, just wondering.
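
A minimal sketch of the "static one" idea follows, assuming the status object 
can be made immutable (final fields, no setters); sharing one instance is only 
safe under that assumption. CachedNodeStatus and its nested enums are 
hypothetical names used for illustration, not the NodeStatus class from the 
patch.

```java
import java.util.Objects;

// Illustrative immutable value class with a cached common instance.
final class CachedNodeStatus {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Health { HEALTHY, STALE, DEAD }

  // Created once at class initialization and shared by all callers.
  private static final CachedNodeStatus IN_SERVICE_HEALTHY =
      new CachedNodeStatus(OpState.IN_SERVICE, Health.HEALTHY);

  private final OpState operationalState;
  private final Health health;

  CachedNodeStatus(OpState operationalState, Health health) {
    this.operationalState = operationalState;
    this.health = health;
  }

  // Returns the shared instance instead of allocating on every call.
  static CachedNodeStatus inServiceHealthy() {
    return IN_SERVICE_HEALTHY;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) {
      return true;
    }
    if (!(obj instanceof CachedNodeStatus)) {
      return false;
    }
    CachedNodeStatus other = (CachedNodeStatus) obj;
    return operationalState == other.operationalState
        && health == other.health;
  }

  @Override
  public int hashCode() {
    return Objects.hash(operationalState, health);
  }
}
```

Because equals and hashCode compare by value, callers that still construct 
their own instances remain interchangeable with the cached one.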
 



Issue Time Tracking
---

Worklog Id: (was: 306755)
Time Spent: 1h 20m  (was: 1h 10m)







[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306752&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306752
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320995341
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE,
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationStateEvent.START_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_MAINTENANCE,
+NodeOperationStateEvent.ENTER_MAINTENANCE);
 
 Review comment:
   From an English point of view, this is slightly confusing. But I see why :)
 



Issue Time Tracking
---

Worklog Id: (was: 306752)
Time Spent: 1h 10m  (was: 1h)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306750&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306750
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320996724
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -426,18 +432,20 @@ public int getStaleNodeCount() {
* @return dead node count
*/
   public int getDeadNodeCount() {
-return getNodeCount(NodeState.DEAD);
+// TODO - hard coded IN_SERVICE
+return getNodeCount(
+new NodeStatus(NodeOperationalState.IN_SERVICE, NodeState.DEAD));
 
 Review comment:
   Interesting. What happens to a node that is in maintenance mode but switched 
off or dead? Does it become a dead node? I think I agree with your conclusion 
that it is not a dead node, but I am flagging this for others to consider.
 



Issue Time Tracking
---

Worklog Id: (was: 306750)







[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306754
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320997400
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -578,39 +587,33 @@ private void checkNodesHealth() {
 Predicate deadNodeCondition =
 (lastHbTime) -> lastHbTime < staleNodeDeadline;
 try {
-  for (NodeState state : NodeState.values()) {
-List nodes = nodeStateMap.getNodes(state);
-for (UUID id : nodes) {
-  DatanodeInfo node = nodeStateMap.getNodeInfo(id);
-  switch (state) {
-  case HEALTHY:
-// Move the node to STALE if the last heartbeat time is less than
-// configured stale-node interval.
-updateNodeState(node, staleNodeCondition, state,
-  NodeLifeCycleEvent.TIMEOUT);
-break;
-  case STALE:
-// Move the node to DEAD if the last heartbeat time is less than
-// configured dead-node interval.
-updateNodeState(node, deadNodeCondition, state,
-NodeLifeCycleEvent.TIMEOUT);
-// Restore the node if we have received heartbeat before configured
-// stale-node interval.
-updateNodeState(node, healthyNodeCondition, state,
-NodeLifeCycleEvent.RESTORE);
-break;
-  case DEAD:
-// Resurrect the node if we have received heartbeat before
-// configured stale-node interval.
-updateNodeState(node, healthyNodeCondition, state,
-NodeLifeCycleEvent.RESURRECT);
-break;
-// We don't do anything for DECOMMISSIONING and DECOMMISSIONED in
-// heartbeat processing.
-  case DECOMMISSIONING:
-  case DECOMMISSIONED:
-  default:
-  }
+  for(DatanodeInfo node : nodeStateMap.getAllDatanodeInfos()) {
+NodeState state =
+nodeStateMap.getNodeStatus(node.getUuid()).getHealth();
+switch (state) {
+case HEALTHY:
+  // Move the node to STALE if the last heartbeat time is less than
+  // configured stale-node interval.
+  updateNodeState(node, staleNodeCondition, state,
+  NodeLifeCycleEvent.TIMEOUT);
+  break;
+case STALE:
+  // Move the node to DEAD if the last heartbeat time is less than
+  // configured dead-node interval.
+  updateNodeState(node, deadNodeCondition, state,
+  NodeLifeCycleEvent.TIMEOUT);
+  // Restore the node if we have received heartbeat before configured
+  // stale-node interval.
+  updateNodeState(node, healthyNodeCondition, state,
+  NodeLifeCycleEvent.RESTORE);
+  break;
+case DEAD:
+  // Resurrect the node if we have received heartbeat before
+  // configured stale-node interval.
+  updateNodeState(node, healthyNodeCondition, state,
+  NodeLifeCycleEvent.RESURRECT);
+  break;
+default:
 }
 
 Review comment:
   Not sure why we need this loop change, but it does make the code simpler to 
read.
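
The reshaped loop in the diff above visits every datanode once and switches on 
its health only, independent of its operational state. A standalone sketch of 
that shape is given below. HealthCheckSketch, nextHealth and checkAll are 
hypothetical names, and the two deadlines are assumed to be computed as "now 
minus the stale interval" and "now minus the dead interval"; the quoted hunk 
only shows the dead-node predicate, so the other conditions are assumptions.

```java
import java.util.Map;

// Illustrative single-pass health check; not the real NodeStateManager code.
class HealthCheckSketch {
  enum Health { HEALTHY, STALE, DEAD }

  // Computes the next health of one node from its last heartbeat time.
  static Health nextHealth(Health current, long lastHeartbeat,
      long healthyDeadline, long staleDeadline) {
    switch (current) {
    case HEALTHY:
      // TIMEOUT: no heartbeat since the healthy deadline.
      return lastHeartbeat < healthyDeadline ? Health.STALE : Health.HEALTHY;
    case STALE:
      if (lastHeartbeat < staleDeadline) {
        return Health.DEAD;      // TIMEOUT
      }
      if (lastHeartbeat >= healthyDeadline) {
        return Health.HEALTHY;   // RESTORE
      }
      return Health.STALE;
    case DEAD:
      // RESURRECT: a recent heartbeat arrived.
      return lastHeartbeat >= healthyDeadline ? Health.HEALTHY : Health.DEAD;
    default:
      return current;
    }
  }

  // One pass over all nodes; every id in healthByNode is assumed to have
  // an entry in lastHeartbeatByNode.
  static void checkAll(Map<String, Health> healthByNode,
      Map<String, Long> lastHeartbeatByNode,
      long now, long staleIntervalMs, long deadIntervalMs) {
    long healthyDeadline = now - staleIntervalMs;
    long staleDeadline = now - deadIntervalMs;
    for (Map.Entry<String, Health> entry : healthByNode.entrySet()) {
      long lastHeartbeat = lastHeartbeatByNode.get(entry.getKey());
      entry.setValue(nextHealth(entry.getValue(), lastHeartbeat,
          healthyDeadline, staleDeadline));
    }
  }
}
```

As in the state machine, a HEALTHY node that has been silent for longer than 
the dead interval moves to STALE in one pass and to DEAD in the next.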
 



Issue Time Tracking
---

Worklog Id: (was: 306754)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306748
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r320995703
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/NodeStateManager.java
 ##
 @@ -219,47 +221,51 @@ private void initialiseState2EventMap() {
*  |   |  | |
*  V   V  | |
* [HEALTHY]--->[STALE]--->[DEAD]
-   *| (TIMEOUT)  | (TIMEOUT)   |
-   *|| |
-   *|| |
-   *|| |
-   *|| |
-   *| (DECOMMISSION) | (DECOMMISSION)  | (DECOMMISSION)
-   *|V |
-   *+--->[DECOMMISSIONING]<+
-   * |
-   * | (DECOMMISSIONED)
-   * |
-   * V
-   *  [DECOMMISSIONED]
*
*/
 
   /**
* Initializes the lifecycle of node state machine.
*/
-  private void initializeStateMachine() {
-stateMachine.addTransition(
+  private void initializeStateMachines() {
+nodeHealthSM.addTransition(
 NodeState.HEALTHY, NodeState.STALE, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.DEAD, NodeLifeCycleEvent.TIMEOUT);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.STALE, NodeState.HEALTHY, NodeLifeCycleEvent.RESTORE);
-stateMachine.addTransition(
+nodeHealthSM.addTransition(
 NodeState.DEAD, NodeState.HEALTHY, NodeLifeCycleEvent.RESURRECT);
-stateMachine.addTransition(
-NodeState.HEALTHY, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.STALE, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DEAD, NodeState.DECOMMISSIONING,
-NodeLifeCycleEvent.DECOMMISSION);
-stateMachine.addTransition(
-NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONED,
-NodeLifeCycleEvent.DECOMMISSIONED);
 
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE, NodeOperationalState.DECOMMISSIONING,
+NodeOperationStateEvent.START_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONING,
+NodeOperationalState.DECOMMISSIONED,
+NodeOperationStateEvent.COMPLETE_DECOMMISSION);
+nodeOpStateSM.addTransition(
+NodeOperationalState.DECOMMISSIONED, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_SERVICE,
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationStateEvent.START_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.ENTERING_MAINTENANCE,
+NodeOperationalState.IN_MAINTENANCE,
+NodeOperationStateEvent.ENTER_MAINTENANCE);
+nodeOpStateSM.addTransition(
+NodeOperationalState.IN_MAINTENANCE, NodeOperationalState.IN_SERVICE,
+NodeOperationStateEvent.RETURN_TO_SERVICE);
 
 Review comment:
   How do we handle the timeout edge? Maintenance might have a timeout: for 
example, I put a node into maintenance for one day and forget about it. Or is 
that handled outside the state machine?
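
The operational-state transitions registered in the diff above can be 
summarised as a small lookup table. The sketch below is only an illustration 
of those eight edges: OpStateMachineSketch and its next method are 
hypothetical and do not reproduce the StateMachine class used by the patch. 
It contains no timeout edge, since none appears in the diff.

```java
import java.util.EnumMap;
import java.util.Map;

// Illustrative transition table for the operational state machine.
class OpStateMachineSketch {
  enum OpState {
    IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
    ENTERING_MAINTENANCE, IN_MAINTENANCE
  }

  enum Event {
    START_DECOMMISSION, COMPLETE_DECOMMISSION,
    START_MAINTENANCE, ENTER_MAINTENANCE, RETURN_TO_SERVICE
  }

  private static final Map<OpState, Map<Event, OpState>> TRANSITIONS =
      new EnumMap<>(OpState.class);

  private static void add(OpState from, OpState to, Event on) {
    TRANSITIONS
        .computeIfAbsent(from, k -> new EnumMap<Event, OpState>(Event.class))
        .put(on, to);
  }

  static {
    // Same edges as the addTransition calls in the diff.
    add(OpState.IN_SERVICE, OpState.DECOMMISSIONING, Event.START_DECOMMISSION);
    add(OpState.DECOMMISSIONING, OpState.IN_SERVICE, Event.RETURN_TO_SERVICE);
    add(OpState.DECOMMISSIONING, OpState.DECOMMISSIONED,
        Event.COMPLETE_DECOMMISSION);
    add(OpState.DECOMMISSIONED, OpState.IN_SERVICE, Event.RETURN_TO_SERVICE);
    add(OpState.IN_SERVICE, OpState.ENTERING_MAINTENANCE,
        Event.START_MAINTENANCE);
    add(OpState.ENTERING_MAINTENANCE, OpState.IN_SERVICE,
        Event.RETURN_TO_SERVICE);
    add(OpState.ENTERING_MAINTENANCE, OpState.IN_MAINTENANCE,
        Event.ENTER_MAINTENANCE);
    add(OpState.IN_MAINTENANCE, OpState.IN_SERVICE, Event.RETURN_TO_SERVICE);
  }

  // Returns the target state, or throws if the edge is not registered.
  static OpState next(OpState current, Event event) {
    Map<Event, OpState> byEvent = TRANSITIONS.get(current);
    if (byEvent == null || !byEvent.containsKey(event)) {
      throw new IllegalStateException(
          "No transition from " + current + " on " + event);
    }
    return byEvent.get(event);
  }
}
```

Laid out this way, it is easy to see that every state can return to 
IN_SERVICE and that nothing in the table itself expires a maintenance window.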
 



Issue Time Tracking

[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-09-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=306749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-306749
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 04/Sep/19 22:24
Start Date: 04/Sep/19 22:24
Worklog Time Spent: 10m 
  Work Description: anuengineer commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r32176
 
 

 ##
 File path: 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/SCMNodeManager.java
 ##
 @@ -185,7 +190,7 @@ public int getNodeCount(NodeState nodestate) {
   @Override
   public NodeState getNodeState(DatanodeDetails datanodeDetails) {
 
 Review comment:
   We might want to write an alternate version that takes the operational 
status too, since these calls are internal. Again, this is not something that 
needs to be done in this patch. I am just writing down things as I see them. 
Please don't treat any of my suggestions as code review feedback; they are 
more like ideas that might be useful in the long run.
 



Issue Time Tracking
---

Worklog Id: (was: 306749)
Time Spent: 1h  (was: 50m)







[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=301848&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301848
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 27/Aug/19 09:55
Start Date: 27/Aug/19 09:55
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-525230582
 
 
   A few tests failed in this run. TestSCMNodeMetrics was a legitimate failure, 
which I have fixed.
   
   TestSecureContainerServer.testClientServerRatisGrpc() was failing on trunk, 
but has now been fixed.
   
   TestBlockOutputStreamWithFailures.testWatchForCommitDatanodeFailure() seems 
flaky: it has both passed and failed across a few local runs.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 301848)
Time Spent: 40m  (was: 0.5h)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-08-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=301754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301754
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 27/Aug/19 07:37
Start Date: 27/Aug/19 07:37
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-525179725
 
 
   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | 0 | reexec | 41 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | dupname | 0 | No case conflicting files found. |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests.  Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 75 | Maven dependency ordering for branch |
   | +1 | mvninstall | 668 | trunk passed |
   | +1 | compile | 381 | trunk passed |
   | +1 | checkstyle | 80 | trunk passed |
   | +1 | mvnsite | 0 | trunk passed |
   | +1 | shadedclient | 862 | branch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 179 | trunk passed |
   | 0 | spotbugs | 448 | Used deprecated FindBugs config; considering switching to SpotBugs. |
   | +1 | findbugs | 661 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 25 | Maven dependency ordering for patch |
   | +1 | mvninstall | 559 | the patch passed |
   | +1 | compile | 393 | the patch passed |
   | +1 | cc | 393 | the patch passed |
   | +1 | javac | 393 | the patch passed |
   | +1 | checkstyle | 88 | the patch passed |
   | +1 | mvnsite | 0 | the patch passed |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 716 | patch has no errors when building and testing our client artifacts. |
   | +1 | javadoc | 201 | the patch passed |
   | +1 | findbugs | 783 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 371 | hadoop-hdds in the patch passed. |
   | -1 | unit | 263 | hadoop-ozone in the patch failed. |
   | +1 | asflicense | 58 | The patch does not generate ASF License warnings. |
   | | | 6570 | |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ozone.security.TestOzoneDelegationTokenSecretManager |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/1344 |
   | Optional Tests | dupname asflicense compile cc mvnsite javac unit javadoc mvninstall shadedclient findbugs checkstyle |
   | uname | Linux dd5f8f252930 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / b69ac57 |
   | Default Java | 1.8.0_222 |
   | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/1/artifact/out/patch-unit-hadoop-ozone.txt |
   | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/1/testReport/ |
   | Max. process+thread count | 1138 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-hdds/server-scm hadoop-hdds/tools U: hadoop-hdds |
   | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1344/1/console |
   | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
   | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
   
   
   This message was automatically generated.
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 301754)
Time Spent: 0.5h  (was: 20m)


[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=300359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300359
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 23/Aug/19 16:19
Start Date: 23/Aug/19 16:19
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on issue #1344: HDDS-1982 Extend 
SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-524376220
 
 
   /label ozone
 



Issue Time Tracking
---

Worklog Id: (was: 300359)
Time Spent: 20m  (was: 10m)




[jira] [Work logged] (HDDS-1982) Extend SCMNodeManager to support decommission and maintenance states

2019-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1982?focusedWorklogId=300358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300358
 ]

ASF GitHub Bot logged work on HDDS-1982:


Author: ASF GitHub Bot
Created on: 23/Aug/19 16:18
Start Date: 23/Aug/19 16:18
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #1344: HDDS-1982 
Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344
 
 
   Remove the existing decommission states from the protobuf definition.
   
   At this stage, this PR is really a test to see if the build passes with 
these states removed.
 



Issue Time Tracking
---

Worklog Id: (was: 300358)
Remaining Estimate: 0h
Time Spent: 10m
