[ 
https://issues.apache.org/jira/browse/LENS-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642704#comment-14642704
 ] 

Bala Nathan commented on LENS-658:
----------------------------------

[~amareshwari]: Sure. Makes sense to isolate changes and merge them to master 
when it is stable. 

> Distributed Lens Server Deployment
> ----------------------------------
>
>                 Key: LENS-658
>                 URL: https://issues.apache.org/jira/browse/LENS-658
>             Project: Apache Lens
>          Issue Type: New Feature
>          Components: server
>            Reporter: Bala Nathan
>
> Currently, Lens can be deployed and run only on a single node. This JIRA 
> tracks the approach to make Lens work in a clustered environment. The key 
> aspects of clustered deployment are discussed below:
> 1) Session Management
> Creating and Persisting a Session:
> A Lens session needs to be created before any queries can be submitted to the 
> Lens server. A Lens session is associated with a unique session handle, the 
> current database in use, and any jars that have been added; the session 
> handle must be passed when running queries or performing metadata operations 
> in the same session. The session service is started as part of the Lens 
> server. Today the session object is persisted periodically to a local 
> filesystem location (configured via lens.server.persist.location) for 
> recovery purposes, i.e. the Lens server reads from this location when it is 
> restarted and recovery is enabled. In a multi-node deployment, this location 
> needs to be available to all nodes. Hence the session information can be 
> persisted in a shared datastore instead of the local filesystem, so that any 
> node in the cluster can reconstruct the session and execute queries. 
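> As a rough sketch, such a session store could look like the following (the 
> interface, class, and table/column names are purely illustrative assumptions, 
> not existing Lens APIs):
>
>   import java.sql.Connection;
>   import java.sql.PreparedStatement;
>   import java.sql.ResultSet;
>   import java.sql.SQLException;
>
>   // Illustrative only: a shared session store keyed by session handle,
>   // replacing the per-node persistence at lens.server.persist.location.
>   interface SessionStore {
>     void save(String sessionHandle, byte[] serializedSession) throws SQLException;
>     byte[] load(String sessionHandle) throws SQLException;
>   }
>
>   // One possible backing: a relational datastore reachable from every node.
>   class JdbcSessionStore implements SessionStore {
>     private final javax.sql.DataSource ds;   // shared DB, e.g. MySQL
>
>     JdbcSessionStore(javax.sql.DataSource ds) { this.ds = ds; }
>
>     public void save(String handle, byte[] session) throws SQLException {
>       try (Connection c = ds.getConnection();
>            PreparedStatement ps = c.prepareStatement(
>                "REPLACE INTO lens_sessions (handle, session_blob) VALUES (?, ?)")) {
>         ps.setString(1, handle);
>         ps.setBytes(2, session);
>         ps.executeUpdate();
>       }
>     }
>
>     public byte[] load(String handle) throws SQLException {
>       try (Connection c = ds.getConnection();
>            PreparedStatement ps = c.prepareStatement(
>                "SELECT session_blob FROM lens_sessions WHERE handle = ?")) {
>         ps.setString(1, handle);
>         try (ResultSet rs = ps.executeQuery()) {
>           return rs.next() ? rs.getBytes(1) : null;
>         }
>       }
>     }
>   }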
> Restoring a Session:
> Let's consider the following scenario:
> 1) Let's assume there are three nodes in the Lens cluster (L1, L2, L3). These 
> nodes are behind a load balancer (LB) and the load balancing policy is round 
> robin.
> 2) When a user sends a create-session request to the LB, the LB will send the 
> request to L1. L1 will create a session, persist it in the datastore and 
> return the session handle back to the caller.
> 3) The user then submits a query to the LB, and the LB sends the request to 
> L2. L2 will check that the session is valid before it can submit the query. 
> However, since L2 does not have the session info locally, it may not be able 
> to process the query. Hence L2 must rebuild the session information for 
> itself from the datastore and then submit the query (see the sketch below). 
> Based on the load balancing policy, every node in the cluster will eventually 
> build up all the session information.
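> A sketch of how a node could lazily rebuild a session it has not seen, 
> reusing the SessionStore sketch above (class and method names are 
> assumptions; deserializing the stored blob back into the server's session 
> object is elided):
>
>   import java.sql.SQLException;
>   import java.util.Map;
>   import java.util.concurrent.ConcurrentHashMap;
>
>   // Illustrative only: each node keeps a local cache of sessions and falls
>   // back to the shared store when a request arrives for an unknown session.
>   class ClusterSessionService {
>     private final SessionStore store;                      // shared datastore
>     private final Map<String, Object> localSessions = new ConcurrentHashMap<>();
>
>     ClusterSessionService(SessionStore store) { this.store = store; }
>
>     Object getOrRestore(String sessionHandle) throws SQLException {
>       Object session = localSessions.get(sessionHandle);
>       if (session != null) {
>         return session;                        // this node already has the session
>       }
>       byte[] blob = store.load(sessionHandle); // fall back to the datastore
>       if (blob == null) {
>         throw new IllegalStateException("Unknown session: " + sessionHandle);
>       }
>       session = deserialize(blob);             // rebuild database, added jars, etc.
>       localSessions.putIfAbsent(sessionHandle, session);
>       return session;
>     }
>
>     private Object deserialize(byte[] blob) {
>       // Deserialization of the persisted session state is elided in this sketch.
>       return new Object();
>     }
>   }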
> 2) Query Lifecycle Management
> Once a query is issued to the Lens server, it moves through various states:
>
> Queued -> Launched -> Running -> Failed / Successful -> Completed 
> A client can check the status of a query based on a query handle. Since this 
> request can be routed to any server in the cluster, the query status also 
> needs to be persisted in a datastore. 
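> For illustration, the record persisted per query could carry the status 
> together with the node that launched it, so any node in the cluster can 
> answer a status request (the field names are assumptions):
>
>   // Illustrative only: the minimal per-query state a shared datastore would
>   // need so that any node can answer a status request.
>   class QueryRecord {
>     String queryHandle;    // handle returned to the client
>     String status;         // Queued / Launched / Running / Failed / Successful / Completed
>     String owningNode;     // node currently responsible for driving the query
>     long   lastUpdatedMs;  // helps detect stale owners
>   }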
> Updating the status of a query in the case of node failure: When the status 
> of a query is requested, any node in the Lens cluster can receive the 
> request. If the node which initially issued the query is down for some 
> reason, another node needs to take over updating the status of the query. 
> This can be achieved by maintaining either a heartbeat or a health check for 
> each node in the cluster. For example:
> 1) L1 submitted the query and updated the datastore
> 2) L1 server went down
> 3) A status request for the query comes to L2. L2 checks if L1 is up via a 
> healthcheck URL. If the response is 200 OK, it routes the request to L1; 
> otherwise it updates the datastore (changing the owner from L1 to L2) and 
> takes over (see the sketch below).
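> A sketch of the takeover check described in step 3 (the healthcheck endpoint, 
> timeouts and the conditional-update SQL are illustrative assumptions):
>
>   import java.net.HttpURLConnection;
>   import java.net.URL;
>
>   // Illustrative only: before serving a status request for a query owned by
>   // another node, check whether that node is alive; if not, take over.
>   class NodeHealthCheck {
>     // Returns true if the owning node answers 200 OK on its healthcheck URL.
>     static boolean isNodeUp(String nodeBaseUrl) {
>       try {
>         HttpURLConnection conn =
>             (HttpURLConnection) new URL(nodeBaseUrl + "/healthcheck").openConnection();
>         conn.setConnectTimeout(2000);
>         conn.setReadTimeout(2000);
>         return conn.getResponseCode() == 200;  // alive: route the request to it
>       } catch (Exception e) {
>         return false;                          // unreachable: treat as down
>       }
>     }
>
>     // If the owner is down, take over with a conditional update so that only
>     // one node wins (SQL is an assumption):
>     //   UPDATE query_records SET owning_node = <this node>
>     //   WHERE query_handle = ? AND owning_node = <old owner>
>   }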
> In the case of a Lens cluster restart, there may be a high load on the Hive 
> server if every server in the cluster attempts to check the status of all 
> Hive query handles. Hence, some sort of stickiness would need to be 
> maintained in the query handle such that, upon a restart, each server 
> attempts to restore only those queries which originated from itself. 
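> One way to get that stickiness is to embed the originating node's id in the 
> query handle (the handle format below is purely an assumption):
>
>   import java.util.List;
>   import java.util.stream.Collectors;
>
>   // Illustrative only: on restart, a node restores only the queries it
>   // launched, so the Hive server is not polled by every node for every handle.
>   class QueryRecovery {
>     private final String localNode;   // e.g. "L1"
>
>     QueryRecovery(String localNode) { this.localNode = localNode; }
>
>     // Assumed handle format: "<node-id>:<uuid>", e.g. "L1:9f2c-...".
>     boolean originatedHere(String queryHandle) {
>       return queryHandle.startsWith(localNode + ":");
>     }
>
>     List<String> handlesToRestore(List<String> allPersistedHandles) {
>       return allPersistedHandles.stream()
>           .filter(this::originatedHere)
>           .collect(Collectors.toList());
>     }
>   }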
> 3) Result Set Management 
> Lens supports two consumption modes for a query: in-memory and persistent. A 
> user can ask for a query to persist its results in an HDFS location and can 
> retrieve them from a different node. However, this does not apply to 
> in-memory result sets (particularly for JDBC drivers). In the case of an 
> in-memory result set, a request to retrieve the result set must go to the 
> same node which executed the query. Hence, some query-handle stickiness is 
> unavoidable. 
> Let's take an example to illustrate this:
> 1) Again, let's assume there are three nodes in the Lens cluster (L1, L2, L3). 
> These nodes are behind a load balancer (LB) and the load balancing policy is 
> round robin.
> 2) The user created a session and submitted a query with an in-memory result 
> set to the LB. The LB routed the request to L1 in the cluster. L1 submitted 
> the query, persisted the state in the datastore and returned the query handle 
> to the user.
> 3) The user sent a request to the LB asking for the status of the query. The 
> LB routed it to L2. L2 checked the datastore and reported that the query had 
> completed.
> 4) The user then asked the LB for the result set of the query. The LB sent 
> the request to L3. However, L3 does not have the result set, so it cannot 
> serve the request. 
> However, if we persist the node information along with the query handle in 
> the datastore (in step 2 above), then when the user asks for the result set, 
> L3 can redirect the request to L1, the node which executed the query and 
> holds the in-memory result set (see the sketch below). 
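> A sketch of that redirection for in-memory result sets, reusing the 
> owningNode field from the QueryRecord sketch above (the host/port mapping and 
> the REST path are assumptions):
>
>   import java.net.URI;
>
>   // Illustrative only: a node that does not hold the in-memory result set
>   // points the request at the node that executed the query.
>   class ResultSetRouter {
>     private final String localNode;
>
>     ResultSetRouter(String localNode) { this.localNode = localNode; }
>
>     // Returns null if this node can serve the result itself, otherwise the
>     // URI on the executing node that the request should be redirected to.
>     URI redirectFor(QueryRecord record, String queryHandle) {
>       if (localNode.equals(record.owningNode)) {
>         return null;                                       // serve locally
>       }
>       // Assumed node-id-to-host mapping and an assumed REST path.
>       String base = "http://" + record.owningNode + ":9999/lensapi";
>       return URI.create(base + "/queryapi/queries/" + queryHandle + "/resultset");
>     }
>   }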



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
