liuxun created ZEPPELIN-3626:
--------------------------------

             Summary: Cluster management and client module design
                 Key: ZEPPELIN-3626
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3626
             Project: Zeppelin
          Issue Type: Sub-task
          Components: zeppelin-server
    Affects Versions: 0.9.0
            Reporter: liuxun
            Assignee: liuxun


h4. Cluster management service

The cluster management service uses copycatServer, from the Raft algorithm 
library, to form a service cluster inside the Zeppelin cluster whose service 
state is kept consistent across all nodes.
 # The cluster management service runs in each Zeppelin-Server;

 # The cluster management service establishes a cluster by using the 
copycatServer class of the Raft algorithm library, maintains the 
ClusterStateMachine, and manages the service state metadata of each 
Zeppelin-Server through the PutCommand, GetQuery, and DeleteCommand operation 
commands.

 # The cluster management service launches a Thrift service so that cluster 
interpreter processes can be created on any Zeppelin-Server through remote 
calls.
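
The three operations above can be pictured with a minimal sketch, assuming the state is a simple name-to-metadata map. The class name ClusterMetaStoreSketch and its method names are hypothetical and are not the Copycat API; in the actual design the Raft library would replicate each put and delete through its log before applying it on every node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the replicated ClusterStateMachine.
// It only models the local effect of each operation, not Raft replication.
class ClusterMetaStoreSketch {
    // Server or process name -> metadata (host, port, resource usage, ...)
    private final Map<String, Map<String, Object>> meta = new ConcurrentHashMap<>();

    // Models PutCommand: store (or overwrite) the metadata for a name.
    void put(String name, Map<String, Object> properties) {
        meta.put(name, new HashMap<>(properties));
    }

    // Models GetQuery: read-only lookup, returns null when the name is unknown.
    Map<String, Object> get(String name) {
        Map<String, Object> found = meta.get(name);
        return found == null ? null : new HashMap<>(found);
    }

    // Models DeleteCommand: returns true when an entry was actually removed.
    boolean delete(String name) {
        return meta.remove(name) != null;
    }
}
```

Separating the mutating operations (put, delete) from the read-only one (get) mirrors the command/query split named in the design; whether reads must also go through the leader is not stated in this issue.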

h4. Cluster management client

The cluster management client uses copycatClient, from the Raft algorithm 
library, to connect to the cluster management service and perform metadata 
operations for services and processes.
 # The cluster management client runs in each Zeppelin-Server and Zeppelin 
Interpreter process;

 # The cluster management client manages the Zeppelin-Server and Zeppelin 
Interpreter process state (metadata information) in the ClusterStateMachine by 
using the copycatClient class of the Raft library to connect to the 
copycatServer. Each Zeppelin-Server and Zeppelin Interpreter process is added 
to the ClusterStateMachine when it starts and removed from the 
ClusterStateMachine when it is closed.

 # In a distributed environment, network anomalies, network delays, or service 
exceptions may occur. After copycatClient submits metadata to the cluster, it 
checks whether the submission succeeded. If the submission failed, the 
metadata is saved in a local message queue, and copycatClient retries it from 
a separate commit thread.
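
The submit-check-retry behaviour can be sketched as follows. This is an illustration under assumptions, not the actual client: the class name MetaCommitRetrySketch, the Predicate used to stand in for "submit to the cluster and report success", and the retry delay are all hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical sketch: failed submissions are parked in a local queue
// and retried by a dedicated commit thread.
class MetaCommitRetrySketch<T> {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
    private final Predicate<T> submit;      // assumed: true when the cluster accepted the metadata
    private final long retryDelayMillis;    // assumed pause between retry attempts
    private final Thread commitThread;
    private volatile boolean running = true;

    MetaCommitRetrySketch(Predicate<T> submit, long retryDelayMillis) {
        this.submit = submit;
        this.retryDelayMillis = retryDelayMillis;
        this.commitThread = new Thread(this::retryLoop, "meta-commit-retry");
        this.commitThread.setDaemon(true);
        this.commitThread.start();
    }

    // Try once on the caller's thread; queue the metadata if that attempt fails.
    boolean commit(T metadata) {
        if (trySubmit(metadata)) {
            return true;
        }
        pending.add(metadata);
        return false;
    }

    void stop() {
        running = false;
        commitThread.interrupt();
    }

    // A thrown exception (for example a network error) counts as a failed submission.
    private boolean trySubmit(T metadata) {
        try {
            return submit.test(metadata);
        } catch (RuntimeException e) {
            return false;
        }
    }

    // Commit thread: take queued metadata, resubmit, and requeue on failure.
    private void retryLoop() {
        while (running) {
            try {
                T next = pending.poll(retryDelayMillis, TimeUnit.MILLISECONDS);
                if (next != null && !trySubmit(next)) {
                    pending.add(next);
                    Thread.sleep(retryDelayMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

One consequence of this design is that a retried entry may reach the cluster after a newer entry for the same service or process, so the real implementation may need to drop or merge stale entries; this issue does not say how that case is handled.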

h4. Cluster monitoring module

The cluster monitoring module checks whether each Zeppelin-Server and Zeppelin 
Interpreter process in the cluster is active.
 # The cluster monitoring module runs in each Zeppelin-Server and Zeppelin 
Interpreter process, periodically sending heartbeat data of the service or 
process to the cluster;

 # When the cluster monitoring module runs in Zeppelin-Server, it collects the 
CPU and memory usage of the server and sends the resource usage rate to the 
cluster's ClusterStateMachine. When a cluster interpreter process needs to be 
created, it is created on the server whose resources are most idle;

 # Resource usage statistics strategy: to avoid being misled by momentary 
peaks and troughs in server load, the cluster monitoring module reports the 
average resource usage over the most recent period, so that server resources 
are allocated as reasonably and effectively as possible;

 # When the cluster monitoring module runs in Zeppelin-Server, it checks the 
heartbeat data of each Zeppelin-Server and Zeppelin Interpreter process. If a 
heartbeat times out, the module treats that service or process as unavailable 
and removes it from the cluster.
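
The monitoring rules above (windowed average usage, choosing the most idle server, and heartbeat timeout) can be sketched together. Everything here is an assumption made for illustration: the class name ClusterMonitorSketch, the fixed-size sample window, usage expressed as a fraction between 0.0 and 1.0, and timestamps passed in explicitly as milliseconds.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the monitoring bookkeeping for a set of servers.
class ClusterMonitorSketch {
    private final int windowSize;               // assumed: number of recent samples to average
    private final long heartbeatTimeoutMillis;  // assumed: silence longer than this means "unavailable"
    private final Map<String, Deque<Double>> usageSamples = new HashMap<>();
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    ClusterMonitorSketch(int windowSize, long heartbeatTimeoutMillis) {
        this.windowSize = windowSize;
        this.heartbeatTimeoutMillis = heartbeatTimeoutMillis;
    }

    // Record a heartbeat carrying one usage sample; keep only the newest windowSize samples.
    synchronized void heartbeat(String node, double usage, long nowMillis) {
        lastHeartbeat.put(node, nowMillis);
        Deque<Double> window = usageSamples.computeIfAbsent(node, k -> new ArrayDeque<>());
        window.addLast(usage);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
    }

    // Average over the recent window, which smooths momentary peaks and troughs.
    synchronized double averageUsage(String node) {
        Deque<Double> window = usageSamples.get(node);
        if (window == null || window.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double sample : window) {
            sum += sample;
        }
        return sum / window.size();
    }

    // Remove every node whose last heartbeat is older than the timeout; returns how many were removed.
    synchronized int removeTimedOut(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > heartbeatTimeoutMillis) {
                String node = entry.getKey();
                it.remove();
                usageSamples.remove(node);
                removed++;
            }
        }
        return removed;
    }

    // Pick the live node with the lowest average usage as the place for a new interpreter process.
    synchronized Optional<String> idlestNode() {
        String best = null;
        double bestAverage = Double.MAX_VALUE;
        for (String node : lastHeartbeat.keySet()) {
            double average = averageUsage(node);
            if (average < bestAverage) {
                best = node;
                bestAverage = average;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

For example, with a window of three samples, a server that reports 0.9 once and then 0.1, 0.2, and 0.3 is scored by the last three samples only, so the earlier spike no longer affects where the next interpreter process is placed.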



