[ https://issues.apache.org/jira/browse/AMBARI-10029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495437#comment-14495437 ]

Hudson commented on AMBARI-10029:
---------------------------------

FAILURE: Integrated in Ambari-trunk-Commit #2312 (See [https://builds.apache.org/job/Ambari-trunk-Commit/2312/])
AMBARI-10029. Node auto-recovery (phase-I) (smohanty: http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=dcbf12ef92b3922d13e5fb00bc5551fd927cbf08)
* ambari-server/src/main/java/org/apache/ambari/server/agent/ComponentRecoveryReport.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/RegistrationResponse.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeatHandler.java
* ambari-server/src/main/java/org/apache/ambari/server/actionmanager/ExecutionCommandWrapper.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/StatusCommand.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/AmbariManagementControllerTest.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/AgentRequests.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryReport.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/RecoveryConfig.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java
* ambari-server/src/main/java/org/apache/ambari/server/agent/HeartBeat.java
* ambari-agent/src/test/python/ambari_agent/TestActionQueue.py
* ambari-agent/src/main/python/ambari_agent/RecoveryManager.py
* ambari-server/src/main/resources/properties.json
* ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementController.java
* ambari-agent/src/test/python/ambari_agent/TestController.py
* ambari-agent/src/main/python/ambari_agent/DataCleaner.py
* ambari-agent/src/main/python/ambari_agent/Controller.py
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/URLStreamProvider.java
* ambari-server/src/test/java/org/apache/ambari/server/agent/TestHeartbeatHandler.java
* ambari-agent/src/main/python/ambari_agent/ActionQueue.py
* ambari-server/src/main/java/org/apache/ambari/server/agent/ComponentStatus.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/HostResponse.java
* ambari-agent/src/main/python/ambari_agent/Heartbeat.py
* ambari-server/src/main/java/org/apache/ambari/server/agent/HeartbeatMonitor.java
* ambari-server/src/main/java/org/apache/ambari/server/controller/internal/HostResourceProvider.java
* ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java
* ambari-agent/src/main/python/ambari_agent/CustomServiceOrchestrator.py
* ambari-server/src/main/java/org/apache/ambari/server/state/Host.java
* ambari-server/src/test/java/org/apache/ambari/server/controller/internal/HostResourceProviderTest.java
* ambari-server/src/main/java/org/apache/ambari/server/state/host/HostImpl.java
* ambari-agent/src/test/python/ambari_agent/TestRecoveryManager.py


> Node auto-recovery
> ------------------
>
>                 Key: AMBARI-10029
>                 URL: https://issues.apache.org/jira/browse/AMBARI-10029
>             Project: Ambari
>          Issue Type: New Feature
>          Components: ambari-agent, ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Sumit Mohanty
>            Assignee: Sumit Mohanty
>             Fix For: 2.1.0
>
>         Attachments: AMBARI-10029.patch, NodeRecovery.pdf
>
>
> Using blueprints, it is possible to perform a zero-touch install of Hadoop 
> clusters using Ambari. This is especially useful in cloud environments. 
> However, a cloud environment can also be dynamic in the sense that nodes will 
> get rebooted or reset to the original image.
> A reset means that the node (usually a VM) is reverted to the original state 
> in which it joined the cluster. It is assumed that a reset node has 
> ambari-agent installed and configured to communicate with the server. The 
> node may also have all packages pre-installed.
> Node recovery is the feature that brings a rebooted/reset node back online by 
> starting, or installing and then starting, the host components that are 
> already on the host.
> In general, temporarily losing a node and then performing node recovery on a 
> slave host should not affect the whole cluster. If it is a master node, then 
> there can be some disruption, depending on what is deployed on the master 
> host and whether HA is enabled for the master services.
> Node recovery, as discussed in this JIRA, only addresses the ability to 
> automatically INSTALL/CONFIGURE/START host components on the node so that the 
> actual state of each host component matches its desired state.
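
The desired-state versus actual-state reconciliation described above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the code in RecoveryManager.py: the function name
recovery_commands, the three-state lifecycle (INIT, INSTALLED, STARTED), and
the mapping from state to command are all hypothetical, and the CONFIGURE
step is left out for brevity.

```python
# Hypothetical sketch; not the actual ambari-agent RecoveryManager API.
# Assumed lifecycle order for a host component: INIT -> INSTALLED -> STARTED.
LIFECYCLE = ["INIT", "INSTALLED", "STARTED"]

# Assumed command that moves a component from each state to the next one.
NEXT_COMMAND = {"INIT": "INSTALL", "INSTALLED": "START"}


def recovery_commands(desired, actual):
    """Return {component: [commands]} needed to bring actual up to desired."""
    plan = {}
    for component, want in desired.items():
        # A reset node reports nothing, so treat a missing entry as INIT.
        have = actual.get(component, "INIT")
        start, end = LIFECYCLE.index(have), LIFECYCLE.index(want)
        # The slice is empty when the component is already at or past the
        # desired state, so this sketch only ever moves components forward.
        steps = [NEXT_COMMAND[state] for state in LIFECYCLE[start:end]]
        if steps:
            plan[component] = steps
    return plan


desired = {"DATANODE": "STARTED", "NODEMANAGER": "STARTED"}
after_reboot = {"DATANODE": "INSTALLED", "NODEMANAGER": "INSTALLED"}
after_reset = {}

print(recovery_commands(desired, after_reboot))
print(recovery_commands(desired, after_reset))
```

In this sketch a rebooted node (components still installed) only needs START,
while a reset node (nothing reported) needs INSTALL followed by START, which
mirrors the two cases the description distinguishes.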



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
