[
https://issues.apache.org/jira/browse/AMBARI-10029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495437#comment-14495437
]
Hudson commented on AMBARI-10029:
---------------------------------
FAILURE: Integrated in Ambari-trunk-Commit #2312 (See
[https://builds.apache.org/job/Ambari-trunk-Commit/2312/])
AMBARI-10029. Node auto-recovery (phase-I) (smohanty:
http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=dcbf12ef92b3922d13e5fb00bc5551fd927cbf08)
* ambari-server/src/main/java/org/apache/ambari/server/agent/ComponentRecoveryReport.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/RegistrationResponse.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ExecutionCommandWrapper.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/StatusCommand.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/AgentRequests.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryReport.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java
* ambari-agent/src/test/python/ambari_agent/TestActionQueue.py
* ambari-agent/src/main/python/ambari_agent/RecoveryManager.py
* ambari-server/src/main/resources/properties.json
* ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementController.java
* ambari-agent/src/test/python/ambari_agent/TestController.py
* ambari-agent/src/main/python/ambari_agent/DataCleaner.py
* ambari-agent/src/main/python/ambari_agent/Controller.py
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/URLStreamProvider.java
* ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java
* ambari-agent/src/main/python/ambari_agent/ActionQueue.py
* ambari-server/src/main/java/org/apache/ambari/server/agent/ComponentStatus.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/HostResponse.java
* ambari-agent/src/main/python/ambari_agent/Heartbeat.py
* ambari-server/src/main/java/org/apache/ambari/server/agent/HeartbeatMonitor.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/HostResourceProvider.java
* ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java
* ambari-agent/src/main/python/ambari_agent/CustomServiceOrchestrator.py
* ambari-server/src/main/java/org/apache/ambari/server/state/Host.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/internal/HostResourceProviderTest.java
* ambari-server/src/main/java/org/apache/ambari/server/state/host/HostImpl.java
* ambari-agent/src/test/python/ambari_agent/TestRecoveryManager.py
> Node auto-recovery
> ------------------
>
> Key: AMBARI-10029
> URL: https://issues.apache.org/jira/browse/AMBARI-10029
> Project: Ambari
> Issue Type: New Feature
> Components: ambari-agent, ambari-server
> Affects Versions: 2.0.0
> Reporter: Sumit Mohanty
> Assignee: Sumit Mohanty
> Fix For: 2.1.0
>
> Attachments: AMBARI-10029.patch, NodeRecovery.pdf
>
>
> Using blueprints, it is possible to perform a zero-touch install of Hadoop
> clusters using Ambari. This is especially useful in cloud environments.
> However, a cloud environment can also be dynamic, in the sense that nodes
> may get rebooted or reset to the original image.
> A reset means that the node (usually a VM) gets reverted to the original
> state in which it joined the cluster. It is assumed that a reset node has
> ambari-agent installed and configured to communicate with the server. The
> node may also have all packages pre-installed.
> Node recovery is the feature that brings a rebooted/reset node back online
> by starting, or installing and then starting, the host components that are
> already on the host.
> In general, temporarily losing a node and then performing node recovery on a
> slave host should not affect the whole cluster. If it is a master node, then
> there can be some disruption, depending on what is deployed on the master
> host and whether HA is enabled for the master services.
> Node recovery, discussed in this JIRA, only addresses the ability to
> automatically INSTALL/CONFIGURE/START host components on the node so that the
> desired state of the host component matches the actual state.
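
The state-matching idea in the description above can be sketched roughly as
follows. This is only an illustration, not the code in RecoveryManager.py:
the function name, the state names (INIT, INSTALLED, STARTED) and the
state-to-command mapping are assumptions made for the example, and the
CONFIGURE step is left out for brevity.

```python
# Hypothetical sketch: names and state values are assumptions for
# illustration, not the actual ambari-agent RecoveryManager API.

# Assumed linear lifecycle of a host component.
STATE_ORDER = ["INIT", "INSTALLED", "STARTED"]

# Assumed command that moves a component INTO the given state.
COMMAND_TO_REACH = {"INSTALLED": "INSTALL", "STARTED": "START"}


def recovery_commands(actual, desired):
    """Return the commands needed to bring `actual` up to `desired`.

    Only moves a component forward; if it is already at or past the
    desired state, nothing is scheduled.
    """
    actual_idx = STATE_ORDER.index(actual)
    desired_idx = STATE_ORDER.index(desired)
    if actual_idx >= desired_idx:
        return []
    # One command per state that still has to be reached, in order.
    return [COMMAND_TO_REACH[s] for s in STATE_ORDER[actual_idx + 1:desired_idx + 1]]
```

Under these assumptions, a component on a reset node (actual state INIT,
desired state STARTED) needs an INSTALL followed by a START, while a
component on a merely rebooted node (actual state INSTALLED) needs only a
START.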
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)