[ https://issues.apache.org/jira/browse/LIVY-698 ]
Jeffrey(Xilang) Yan closed LIVY-698.
------------------------------------
    Resolution: Duplicate

Cluster support for Livy
------------------------

                 Key: LIVY-698
                 URL: https://issues.apache.org/jira/browse/LIVY-698
             Project: Livy
          Issue Type: New Feature
          Components: Server
    Affects Versions: 0.6.0
            Reporter: Jeffrey(Xilang) Yan
            Priority: Major

This is a proposal for Livy cluster support. It is designed to be a lightweight solution for a Livy cluster that stays compatible with non-cluster Livy and supports different HA levels.

Server ID: an integer configured by livy.server.id, with default value 0, which means standalone mode. The largest server ID is 213, for the reason described below.

Session ID: a session ID is generated as a 3-digit server ID followed by a 7-digit auto-increment integer. As the largest 32-bit signed integer is 2,147,483,647, the largest server ID is 213 and each server can have 9,999,999 sessions. The limitation is that each cluster can have at most 213 instances, which I think is enough. In standalone mode the server ID is 0, so session IDs work the same as today. (A code sketch of this encoding appears after this description.)

Zookeeper: ZooKeeper is required; the quorum is configured by livy.server.zookeeper.quorum.

Server Registration: each server registers itself under the ZooKeeper path /livy/server/.

Leader: one of the Livy servers is elected as the cluster leader, on the ZooKeeper path /livy/leader/.

Coordination between servers: servers don't talk to each other directly. A server just detects which Livy server a session lives on and sends an HTTP 307 redirect to the correct server. For example, if server A receives a request for http://serverA:8999/sessions/1010000001 and knows the session is on server B, it sends a 307 redirect to http://serverB:8999/sessions/1010000001.

Session HA: considering the server-failure case, users should be able to decide whether they want to keep the sessions of a failed server. This leads to two different modes:

Non-session-HA mode:
In this mode, sessions are lost when a server fails (the server can still work with the session store and recover its own sessions when it comes back).
Request redirect: a server determines a session's correct server just by taking the first 3 digits of the session ID.

Session HA mode:
In this mode, session information is persisted to the ZooKeeper store (reusing the current session-store code) and recovered on another server when a server fails.
Session registration: all sessions are registered under the ZooKeeper path /livy/session/type/.
Request redirect: each server determines the correct server for a session by reading ZooKeeper and then sends a 307 redirect; a server may cache the session-server pair for a while.
Server failure: the cluster leader detects server failure and reassigns the sessions of the failed server to other servers; those servers recover the sessions by reading the session information in ZooKeeper.
Server notification: servers need to send messages to each other, for example the leader sends a command asking a server to recover a session. All such messages are sent through ZooKeeper, under the path /livy/notification/.

Thrift Server: a Thrift session has a Thrift session ID, and a Thrift session can be restored with the same session ID; however, the current hive-jdbc doesn't support restoring a session. So for the Thrift server, just register the servers in ZooKeeper to implement the same server discovery that Hive has. Later, when hive-jdbc supports restoring sessions, the Thrift server can use the existing Livy cluster solution to adapt hive-jdbc.
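
To make the session-ID layout above concrete, here is a minimal sketch in Scala. SessionIdCodec and its members are illustrative names, not Livy's actual API; only the 3-digit/7-digit layout and the 213 limit come from the proposal.

{code:scala}
import java.util.concurrent.atomic.AtomicInteger

// Sketch of the proposed session-ID scheme: a 3-digit server ID
// (0..213) followed by a 7-digit auto-increment counter, packed into
// one positive Int. 213_9999999 = 2,139,999,999 < Int.MaxValue.
object SessionIdCodec {
  private val MaxServerId = 213
  private val CounterLimit = 10000000 // 7 decimal digits
  private val counter = new AtomicInteger(0)

  def next(serverId: Int): Int = {
    require(serverId >= 0 && serverId <= MaxServerId, s"server id out of range: $serverId")
    val n = counter.incrementAndGet()
    require(n < CounterLimit, "per-server session counter exhausted")
    serverId * CounterLimit + n
  }

  // Recover the owning server from a session ID (non-HA redirect path).
  // E.g. serverIdOf(1010000001) == 101; standalone IDs (server 0) are
  // plain 7-digit integers, so they decode to server 0 as expected.
  def serverIdOf(sessionId: Int): Int = sessionId / CounterLimit
}
{code}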
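The 307 redirect from the coordination section could look roughly like the sketch below. RedirectFilter, localServerId, and serverUrlOf are assumptions for illustration; the method-and-body-preserving behaviour of 307 is standard HTTP, which is what makes it safe for POST and DELETE session calls.

{code:scala}
import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

// Sketch of the 307 redirect. In non-HA mode the owner is the leading
// 3 digits of the session ID; `serverUrlOf` is a hypothetical lookup
// from server ID to base URL, e.g. 102 -> "http://serverB:8999".
class RedirectFilter(localServerId: Int, serverUrlOf: Int => String) {

  def handle(sessionId: Int, req: HttpServletRequest, resp: HttpServletResponse): Boolean = {
    val owner = sessionId / 10000000 // strip the 7-digit counter
    if (owner == localServerId) {
      false // session is local; let the normal handler serve it
    } else {
      // 307 preserves the request method and body across the redirect.
      resp.setStatus(HttpServletResponse.SC_TEMPORARY_REDIRECT)
      resp.setHeader("Location", serverUrlOf(owner) + req.getRequestURI)
      true // request was redirected to the owning server
    }
  }
}
{code}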
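Server registration and leader election map naturally onto Apache Curator recipes, and Livy already depends on Curator for its ZooKeeper state store. The paths below are the ones from the proposal; everything else is a sketch under that assumption, not the proposed implementation.

{code:scala}
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.framework.recipes.leader.LeaderLatch
import org.apache.curator.retry.ExponentialBackoffRetry
import org.apache.zookeeper.CreateMode

// Sketch of server registration under /livy/server/ and leader
// election under /livy/leader/. Error handling and watches omitted.
object ClusterMembership {

  def start(quorum: String, serverId: Int, address: String): LeaderLatch = {
    val client = CuratorFrameworkFactory.newClient(quorum, new ExponentialBackoffRetry(1000, 3))
    client.start()

    // Ephemeral node: it disappears when the server dies, so the
    // leader can detect the failure and reassign that server's
    // sessions in session-HA mode.
    client.create()
      .creatingParentsIfNeeded()
      .withMode(CreateMode.EPHEMERAL)
      .forPath(s"/livy/server/$serverId", address.getBytes("UTF-8"))

    // Exactly one server holds the latch and acts as cluster leader.
    val latch = new LeaderLatch(client, "/livy/leader", serverId.toString)
    latch.start()
    latch
  }
}
{code}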
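The /livy/notification/ channel could likewise be built on a Curator PathChildrenCache, with each message a child znode that is deleted once handled. The per-server subpath and the message format are assumptions I am adding for illustration; the proposal only fixes the parent path and the use case (e.g. the leader asking a server to recover a session).

{code:scala}
import org.apache.curator.framework.CuratorFramework
import org.apache.curator.framework.recipes.cache.{PathChildrenCache, PathChildrenCacheEvent, PathChildrenCacheListener}

// Sketch of server notification through ZooKeeper: the sender writes a
// message znode under /livy/notification/<serverId> (subpath assumed),
// and the target server consumes and deletes it.
object Notifications {

  def watch(client: CuratorFramework, serverId: Int)(onMsg: String => Unit): PathChildrenCache = {
    val cache = new PathChildrenCache(client, s"/livy/notification/$serverId", true)
    cache.getListenable.addListener(new PathChildrenCacheListener {
      override def childEvent(c: CuratorFramework, event: PathChildrenCacheEvent): Unit = {
        if (event.getType == PathChildrenCacheEvent.Type.CHILD_ADDED) {
          onMsg(new String(event.getData.getData, "UTF-8"))
          // Delete the znode so each message is handled exactly once.
          c.delete().forPath(event.getData.getPath)
        }
      }
    })
    cache.start()
    cache
  }
}
{code}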