[ 
https://issues.apache.org/jira/browse/LIVY-698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey(Xilang) Yan updated LIVY-698:
-------------------------------------
    Description: 
This is a proposal for Livy cluster support. It is designed to be a lightweight 
solution for a Livy cluster that stays compatible with non-cluster Livy and 
supports different HA levels.

Server ID: an integer configured by livy.server.id, with default value 0, which 
means standalone mode. The largest server ID is 213; the reason is described below.

Session ID: the session ID is generated as a 3-digit server ID followed by a 
7-digit auto-increment integer. Since the largest 32-bit integer is 
2,147,483,647, the largest server ID is 213 and each server can have 9,999,999 
sessions. The limitation is that each cluster can have at most 213 instances, 
which should be enough. In standalone mode the server ID is 0, so session IDs 
work the same as today.
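
To make the encoding concrete, here is a minimal sketch of how such an ID could 
be composed and decomposed (object and method names are illustrative only, not 
an API proposal):

{code:scala}
// Sketch only: composes a cluster-wide session ID from a 3-digit server ID
// and a 7-digit per-server auto-increment counter, and extracts them back.
object SessionIdCodec {
  private val Factor = 10000000  // 7 decimal digits reserved for the counter

  // serverId in [0, 213], counter in [0, 9999999];
  // 213 * 10^7 + 9,999,999 = 2,139,999,999, which still fits in a signed Int.
  def compose(serverId: Int, counter: Int): Int = {
    require(serverId >= 0 && serverId <= 213, "server id must be in [0, 213]")
    require(counter >= 0 && counter <= 9999999, "counter must fit in 7 digits")
    serverId * Factor + counter
  }

  // The owning server is simply the leading digits of the session ID.
  def serverIdOf(sessionId: Int): Int = sessionId / Factor
}
{code}

For example, compose(101, 1) yields 1010000001 (the session ID used in the 
redirect example below) and serverIdOf(1010000001) returns 101; in standalone 
mode the server ID is 0, so IDs stay in the familiar 0..9999999 range.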

ZooKeeper: ZooKeeper is required; its quorum is configured by livy.server.zookeeper.quorum.

Server Registration: each server registers itself under the ZooKeeper path 
/livy/server/
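
A possible registration sketch, assuming Apache Curator (already used by Livy's 
ZooKeeper session store); the znode name and payload format are assumptions, 
not a final design:

{code:scala}
import org.apache.curator.framework.{CuratorFramework, CuratorFrameworkFactory}
import org.apache.curator.retry.ExponentialBackoffRetry
import org.apache.zookeeper.CreateMode

// Sketch: each server creates an ephemeral znode under /livy/server/ so that
// the entry disappears automatically when the server dies.
object ServerRegistration {
  // zkQuorum is the value of livy.server.zookeeper.quorum.
  // Returns the client; the caller must keep it open for the ephemeral node
  // to stay alive.
  def register(zkQuorum: String, serverId: Int, host: String, port: Int): CuratorFramework = {
    val client = CuratorFrameworkFactory.newClient(
      zkQuorum, new ExponentialBackoffRetry(1000, 3))
    client.start()
    client.create()
      .creatingParentsIfNeeded()
      .withMode(CreateMode.EPHEMERAL)
      .forPath(s"/livy/server/$serverId", s"$host:$port".getBytes("UTF-8"))
    client
  }
}
{code}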

Leader: one of the Livy servers is elected as the cluster leader via the 
ZooKeeper path /livy/leader/
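
One way to implement the election is Curator's LeaderLatch recipe; this is only 
a sketch under that assumption, the proposal does not mandate a particular recipe:

{code:scala}
import org.apache.curator.framework.CuratorFramework
import org.apache.curator.framework.recipes.leader.LeaderLatch

// Sketch: every server joins the latch on /livy/leader; exactly one of them
// holds leadership at any time, and a new leader is elected if it dies.
class LeaderElection(client: CuratorFramework, serverId: Int) {
  private val latch = new LeaderLatch(client, "/livy/leader", serverId.toString)

  def start(): Unit = latch.start()

  // Only the leader performs cluster-wide duties such as reassigning sessions.
  def isLeader: Boolean = latch.hasLeadership
}
{code}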

Coordination between servers: servers don't talk to each other directly; a 
server only detects which Livy server a session lives on and sends an HTTP 307 
redirect to the correct Livy server. For example, if server A receives a 
request for http://serverA:8999/sessions/1010000001 and the session lives on 
server B, server A sends a 307 redirect to 
http://serverB:8999/sessions/1010000001.
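
A servlet-level sketch of the redirect (helper and parameter names are 
hypothetical; Livy's actual HTTP layer may wire this differently). In 
non-session-HA mode the owner comes from the session ID prefix, in session-HA 
mode it would come from a ZK lookup as described below:

{code:scala}
import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

// Sketch: if the requested session does not live on this server, answer with
// an HTTP 307 pointing at the owning server instead of proxying the request.
object RedirectSupport {
  def redirectIfRemote(sessionId: Int,
                       localServerId: Int,
                       ownerAddress: Int => String,  // e.g. resolved via /livy/server/
                       req: HttpServletRequest,
                       resp: HttpServletResponse): Boolean = {
    val owner = sessionId / 10000000           // first 3 digits = owning server ID
    if (owner == localServerId) {
      false                                    // handle the request locally
    } else {
      val target = s"http://${ownerAddress(owner)}${req.getRequestURI}"
      resp.setStatus(HttpServletResponse.SC_TEMPORARY_REDIRECT)  // 307
      resp.setHeader("Location", target)
      true
    }
  }
}
{code}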
 

 

Session HA: in the case of a server failure, the user should be able to decide 
whether to keep the sessions of the failed server. This leads to two different modes:

Non session HA mode:

In this mode, sessions are lost when a server fails (though it can still work 
with the session store and recover sessions when the server comes back).

request redirect: a server determines a session's owning server simply by 
taking the first 3 digits of the session ID.

Session HA mode:

In this mode, session information is persisted to the ZK store (reusing the 
current session-store code) and recovered on another server when a server fails.

session registration: all sessions are registered under the ZK path 
/livy/session/type/

request redirect: each server determines a session's owning server by reading 
ZK and then sends a 307 redirect; a server may cache the session-server mapping 
for a while.
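
A sketch of the lookup with a simple time-bounded cache; the znode layout 
(/livy/session/<type>/<sessionId> storing "host:port") and the cache TTL are 
assumptions:

{code:scala}
import java.util.concurrent.ConcurrentHashMap
import org.apache.curator.framework.CuratorFramework

// Sketch: resolve a session's owning server from ZK and cache the answer
// briefly so that hot sessions do not hit ZooKeeper on every request.
class SessionLocator(client: CuratorFramework, ttlMs: Long = 30000) {
  private case class Entry(server: String, fetchedAt: Long)
  private val cache = new ConcurrentHashMap[Int, Entry]()

  def ownerOf(sessionId: Int, sessionType: String): String = {
    val cached = cache.get(sessionId)
    if (cached != null && System.currentTimeMillis() - cached.fetchedAt < ttlMs) {
      cached.server
    } else {
      // Assumed layout: /livy/session/<type>/<sessionId> stores "host:port".
      val data = client.getData.forPath(s"/livy/session/$sessionType/$sessionId")
      val server = new String(data, "UTF-8")
      cache.put(sessionId, Entry(server, System.currentTimeMillis()))
      server
    }
  }
}
{code}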

server failure: the cluster leader detects server failures and reassigns the 
failed server's sessions to other servers; those servers recover the sessions 
by reading the session information in ZK.
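
A leader-side sketch of failure detection using a watch on the ephemeral server 
znodes; the reassignment policy itself is left abstract and the recipe choice 
is an assumption:

{code:scala}
import org.apache.curator.framework.CuratorFramework
import org.apache.curator.framework.recipes.cache.{PathChildrenCache, PathChildrenCacheEvent, PathChildrenCacheListener}

// Sketch: the leader watches the ephemeral server znodes; when one disappears,
// it reassigns that server's sessions to the remaining servers.
class FailureDetector(client: CuratorFramework, reassign: Int => Unit) {
  private val watcher = new PathChildrenCache(client, "/livy/server", true)

  def start(): Unit = {
    watcher.getListenable.addListener(new PathChildrenCacheListener {
      override def childEvent(c: CuratorFramework, event: PathChildrenCacheEvent): Unit = {
        if (event.getType == PathChildrenCacheEvent.Type.CHILD_REMOVED) {
          val failedServerId = event.getData.getPath.stripPrefix("/livy/server/").toInt
          reassign(failedServerId)  // pick new owners and notify them (see below)
        }
      }
    })
    watcher.start()
  }
}
{code}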

server notification: servers need to send messages to other servers, for 
example the leader sends a command asking a server to recover a session. All 
such messages are sent through ZK, under the path /livy/notification/
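
A sketch of ZK-based notification; the per-server inbox layout and the message 
format are assumptions:

{code:scala}
import org.apache.curator.framework.CuratorFramework
import org.apache.zookeeper.CreateMode

// Sketch: messages are dropped as sequential znodes into the target server's
// "inbox"; the target server watches its inbox, processes and deletes them.
object Notifications {
  def send(client: CuratorFramework, targetServerId: Int, message: String): Unit = {
    client.create()
      .creatingParentsIfNeeded()
      .withMode(CreateMode.PERSISTENT_SEQUENTIAL)   // e.g. .../msg-0000000001
      .forPath(s"/livy/notification/$targetServerId/msg-", message.getBytes("UTF-8"))
  }
}

// Example: the leader asks server 3 to recover session 1010000005.
// Notifications.send(client, 3, "RECOVER_SESSION 1010000005")
{code}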

 

Thrift Server: a Thrift session has a Thrift session ID and can be restored 
with the same session ID; however, the current hive-jdbc doesn't support 
restoring a session. So for the Thrift server, we just register the server in 
ZK to provide the same server discovery that Hive has. Later, when hive-jdbc 
supports session restore, the Thrift server can use the existing Livy cluster 
solution to adapt hive-jdbc.

 


> Cluster support for Livy
> ------------------------
>
>                 Key: LIVY-698
>                 URL: https://issues.apache.org/jira/browse/LIVY-698
>             Project: Livy
>          Issue Type: New Feature
>          Components: Server
>    Affects Versions: 0.6.0
>            Reporter: Jeffrey(Xilang) Yan
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
