[ https://issues.apache.org/jira/browse/HADOOP-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054787#comment-13054787 ]
Eric Yang commented on HADOOP-7417:
-----------------------------------

A brief description of the Hadoop Management System design:

!http://people.apache.org/~eyang/docs/HMS.svg!

h4. Setup

HMS Agent is a set of RPM packages that can be deployed as part of the OS image through PXE boot. HMS Beacon is a daemon that runs on each ZooKeeper node and broadcasts the location of ZooKeeper. HMS agents and controllers are standalone daemons that resolve the ZooKeeper location through HMS Beacon (zeroconf).

h4. Operation

An operator issues a command through the HMS client, which passes it to the HMS controller REST API. The command is serialized into a JSON message and queued in ZooKeeper. Multiple HMS controllers watch the command queue. When a command is ready to execute, the controllers compete to create a lock for the command and a lock for the corresponding cluster. If both locks are created successfully, the controller begins translating the command into a list of actions to perform on the managed nodes. The HMS controller then watches the status queues and coordinates the actions that the HMS agents perform. HMS-managed agents download software from a yum repository or over BitTorrent through peer exchange. Each HMS agent reports its installation and configuration status back to the agent status queue so that the HMS controller can orchestrate the cluster deployment. Once all actions are finalized, the HMS controller stores the deployment command history in the cluster node.

In the event of a node failure (to be implemented), the operator can re-image the defective node. When the agent rejoins, it can send its status to the controller so that the installation and configuration history is replayed to recover the node.

h4. Monitoring Proposal

For large cluster deployments, monitoring setup can be complex.
HMS can simplify this by orchestrating the deployment of Hadoop 0.20.2+1 (append branch), HBase 0.90.3, Pig 0.8.1, and Chukwa 0.5 using the RPM packages proposed in HADOOP-6255, ZOOKEEPER-999, HBASE-3606, PIG-1857, and CHUKWA (HADOOP-5030).

> Hadoop Management System (Umbrella)
> -----------------------------------
>
>                 Key: HADOOP-7417
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7417
>             Project: Hadoop Common
>          Issue Type: New Feature
>         Environment: Java 6, Linux
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>
> The primary goal of Hadoop Management System is to build a component around
> management and deployment of Hadoop related projects. This includes software
> installation, configuration, application orchestration, deployment automation
> and monitoring Hadoop.
> Prototype demo source code can be obtained from:
> http://github.com/macroadster/hms

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
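
As an illustration of the lock competition described under Operation, here is a minimal sketch of several controllers racing for one queued command. It is not taken from the HMS prototype. An in-memory class stands in for ZooKeeper, and the lock paths, the JSON field names ("id", "cluster", "action"), and every function name are assumptions made for this example only.

```python
import json
import threading


class NodeExistsError(Exception):
    """Raised when a path already exists (create-if-absent semantics)."""


class FakeCoordinator:
    """In-memory stand-in for ZooKeeper; create() succeeds only if the path is absent."""

    def __init__(self):
        self._nodes = {}
        self._mutex = threading.Lock()

    def create(self, path, owner):
        with self._mutex:
            if path in self._nodes:
                raise NodeExistsError(path)
            self._nodes[path] = owner

    def delete(self, path):
        with self._mutex:
            self._nodes.pop(path, None)

    def owner(self, path):
        with self._mutex:
            return self._nodes.get(path)


def try_claim(coord, controller_id, message):
    """Try to take the command lock and the cluster lock; return True on success."""
    command = json.loads(message)
    # Hypothetical path layout and field names, chosen for this sketch only.
    command_lock = "/locks/command/" + command["id"]
    cluster_lock = "/locks/cluster/" + command["cluster"]
    try:
        coord.create(command_lock, controller_id)
    except NodeExistsError:
        return False
    try:
        coord.create(cluster_lock, controller_id)
    except NodeExistsError:
        # The cluster is busy: release the command lock so it can be retried later.
        coord.delete(command_lock)
        return False
    return True


def run_competition(controller_ids, message):
    """Let several controllers race for one queued command; return the winners."""
    coord = FakeCoordinator()
    winners = []
    winners_mutex = threading.Lock()

    def worker(controller_id):
        if try_claim(coord, controller_id, message):
            with winners_mutex:
                winners.append(controller_id)

    threads = [threading.Thread(target=worker, args=(cid,)) for cid in controller_ids]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return winners


if __name__ == "__main__":
    msg = json.dumps({"id": "cmd-1", "cluster": "c1", "action": "install"})
    print(run_competition(["ctl-a", "ctl-b", "ctl-c"], msg))
```

Only one controller ends up holding both locks. Which one wins depends on thread scheduling, so the printed name can differ between runs. A real deployment would presumably use ephemeral znodes so that the locks disappear if the winning controller dies, but that is outside what the comment above specifies.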