[
https://issues.apache.org/jira/browse/LENS-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642704#comment-14642704
]
Bala Nathan commented on LENS-658:
----------------------------------
[~amareshwari]: Sure. Makes sense to isolate the changes and merge them to
master once they are stable.
> Distributed Lens Server Deployment
> ----------------------------------
>
> Key: LENS-658
> URL: https://issues.apache.org/jira/browse/LENS-658
> Project: Apache Lens
> Issue Type: New Feature
> Components: server
> Reporter: Bala Nathan
>
> Currently, Lens can be deployed and can function only on a single node. This
> JIRA tracks the approach to making Lens work in a clustered environment. The
> following key aspects of clustered deployment are discussed:
> 1) Session Management
> Creating and Persisting a Session:
> A Lens session needs to be created before any queries can be submitted to the
> Lens server. A Lens session is associated with a unique session handle, the
> current database in use, and any jars that have been added; these must be
> passed along when making queries or performing metadata operations in the
> same session. The session service is started as part of the Lens server.
> Today the session object is periodically persisted to the local filesystem
> (configured via lens.server.persist.location) for recovery purposes, i.e. the
> Lens server reads from this location when it is restarted and recovery is
> enabled. In a multi-node deployment, this destination needs to be available
> to all nodes. Hence the session information can be persisted in a shared
> datastore instead of the local filesystem, so that any node in the cluster
> can re-construct the session and execute queries.
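> As a minimal sketch (not existing Lens code), datastore-backed session
> persistence could look like the following. The JdbcSessionStore class, the
> lens_sessions table, and the MySQL-style REPLACE INTO upsert are all
> assumptions for illustration:
> {code:java}
> // Hypothetical sketch: persist serialized session state to a shared JDBC
> // datastore instead of the local filesystem, so any node can read it back.
> import java.sql.Connection;
> import java.sql.PreparedStatement;
> import java.sql.ResultSet;
>
> public class JdbcSessionStore {
>   private final Connection conn; // connection to the shared datastore
>
>   public JdbcSessionStore(Connection conn) { this.conn = conn; }
>
>   /** Upsert the serialized session keyed by its handle. */
>   public void save(String sessionHandle, byte[] serialized) throws Exception {
>     // REPLACE INTO is MySQL syntax; other stores need their own upsert.
>     try (PreparedStatement ps = conn.prepareStatement(
>         "REPLACE INTO lens_sessions (handle, payload) VALUES (?, ?)")) {
>       ps.setString(1, sessionHandle);
>       ps.setBytes(2, serialized);
>       ps.executeUpdate();
>     }
>   }
>
>   /** Load the serialized session; returns null if the handle is unknown. */
>   public byte[] load(String sessionHandle) throws Exception {
>     try (PreparedStatement ps = conn.prepareStatement(
>         "SELECT payload FROM lens_sessions WHERE handle = ?")) {
>       ps.setString(1, sessionHandle);
>       try (ResultSet rs = ps.executeQuery()) {
>         return rs.next() ? rs.getBytes(1) : null;
>       }
>     }
>   }
> }
> {code}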
> Restoring a Session:
> Let's consider the following scenario:
> 1) Assume there are three nodes in the Lens cluster (L1, L2, L3). These
> nodes are behind a load balancer (LB) and the load-balancing policy is
> round robin.
> 2) When a user sends a create-session request to the LB, the LB will send the
> request to L1. L1 will create a session, persist it in the datastore and
> return the session handle back to the caller.
> 3) The user sends a query submission to the LB, and the LB will send the
> request to L2. L2 must check that the session is valid before it can submit
> the query. However, since L2 does not have the session info locally, it must
> rebuild the session information on itself from the datastore and then submit
> the query, as sketched below. With a round-robin policy, every node in the
> cluster will eventually build up all session information.
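> The lazy restore in step 3 could look roughly like this, assuming the
> hypothetical JdbcSessionStore above and an illustrative deserializeSession
> helper (neither is existing Lens code):
> {code:java}
> // Hypothetical sketch of step 3: before submitting a query, a node rebuilds
> // a session it does not hold locally from the shared datastore.
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
>
> public class SessionResolver {
>   private final Map<String, Object> localSessions = new ConcurrentHashMap<>();
>   private final JdbcSessionStore store; // shared datastore (see sketch above)
>
>   public SessionResolver(JdbcSessionStore store) { this.store = store; }
>
>   /** Return the local session, rebuilding it from the datastore if missing. */
>   public Object resolve(String handle) throws Exception {
>     Object session = localSessions.get(handle);
>     if (session == null) {
>       byte[] payload = store.load(handle);
>       if (payload == null) {
>         throw new IllegalStateException("Unknown session handle: " + handle);
>       }
>       session = deserializeSession(payload);
>       localSessions.putIfAbsent(handle, session);
>     }
>     return session;
>   }
>
>   private Object deserializeSession(byte[] payload) {
>     // Placeholder: real code would restore the current database, added jars
>     // and session configuration from the serialized payload.
>     return new Object();
>   }
> }
> {code}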
> 2) Query Lifecycle Management
> Once a query is issued to the Lens server, it moves through various states:
>
> Queued -> Launched -> Running -> Failed / Successful -> Completed
> A client can check the status of a query using its query handle. Since this
> request can be routed to any server in the cluster, the query status also
> needs to be persisted in the datastore.
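> As an illustration only, the lifecycle above could be modeled as an enum
> whose latest value is written to the shared datastore on every transition:
> {code:java}
> // Illustrative model of the query lifecycle described above.
> public enum QueryState {
>   QUEUED, LAUNCHED, RUNNING, FAILED, SUCCESSFUL, COMPLETED;
>
>   /** Valid transitions, matching Queued -> Launched -> Running
>       -> Failed / Successful -> Completed. */
>   public boolean canTransitionTo(QueryState next) {
>     switch (this) {
>       case QUEUED:     return next == LAUNCHED;
>       case LAUNCHED:   return next == RUNNING;
>       case RUNNING:    return next == FAILED || next == SUCCESSFUL;
>       case FAILED:
>       case SUCCESSFUL: return next == COMPLETED;
>       default:         return false; // COMPLETED is terminal
>     }
>   }
> }
> {code}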
> Updating Status of a Query in the case of node failure: When the status of a
> query is requested, any node in the Lens cluster can receive the request. If
> the node which originally issued the query is down for some reason, another
> node needs to take over updating the status of the query. This can be
> achieved by maintaining a heartbeat or health check for each node in the
> cluster. For example:
> 1) L1 submitted the query and updated the datastore.
> 2) The L1 server went down.
> 3) A request for the query's status comes to L2. L2 checks whether L1 is up
> via a health-check URL. If the response is 200 OK, it routes the request to
> L1; otherwise it updates the datastore (changing the owner from L1 to L2) and
> takes over, as sketched below.
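> A rough sketch of the check in step 3 follows. The /healthcheck path, the
> QueryStore interface, and the owner column it updates are assumptions for
> illustration, not an existing Lens API:
> {code:java}
> // Hypothetical sketch: route to the owning node if it is alive, otherwise
> // take over ownership of the query in the shared datastore.
> import java.net.HttpURLConnection;
> import java.net.URL;
>
> public class QueryOwnership {
>
>   /** Minimal datastore interface assumed by this sketch. */
>   interface QueryStore {
>     void updateOwner(String queryHandle, String newOwner);
>   }
>
>   /** True if the node answers its health-check URL with 200 OK. */
>   static boolean isAlive(String nodeBaseUrl) {
>     try {
>       HttpURLConnection conn = (HttpURLConnection)
>           new URL(nodeBaseUrl + "/healthcheck").openConnection();
>       conn.setConnectTimeout(2000);
>       conn.setReadTimeout(2000);
>       return conn.getResponseCode() == 200;
>     } catch (Exception e) {
>       return false; // unreachable nodes are treated as down
>     }
>   }
>
>   /** Step 3: L2 either routes to L1 or takes over updating the query. */
>   static void handleStatusRequest(String queryHandle, String ownerUrl,
>                                   String selfUrl, QueryStore store) {
>     if (isAlive(ownerUrl)) {
>       // forward the status request to the owner (proxying not shown)
>     } else {
>       store.updateOwner(queryHandle, selfUrl); // datastore changes L1 -> L2
>       // this node now polls the driver and updates the query status itself
>     }
>   }
> }
> {code}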
> In the case of a Lens cluster restart, there may be a high load on the Hive
> server if every server in the cluster attempts to check the status of all
> Hive query handles. Hence, this would require some stickiness to be encoded
> in the query handle such that upon a restart, each server attempts to restore
> only those queries which originated from itself.
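> One illustrative way to get that stickiness is to embed the originating
> server's id in the handle itself (the handle format below is an assumption,
> not the actual Lens query handle format):
> {code:java}
> // Hypothetical sketch: prefix query handles with the issuing server's id so
> // that, after a restart, each server restores only its own queries.
> import java.util.List;
> import java.util.UUID;
> import java.util.stream.Collectors;
>
> public class StickyQueryHandles {
>
>   /** e.g. "L1:550e8400-e29b-41d4-a716-446655440000" */
>   static String newHandle(String serverId) {
>     return serverId + ":" + UUID.randomUUID();
>   }
>
>   static String originOf(String handle) {
>     return handle.substring(0, handle.indexOf(':'));
>   }
>
>   /** On restart, keep only the handles this server originally issued. */
>   static List<String> handlesToRestore(String self, List<String> persisted) {
>     return persisted.stream()
>         .filter(h -> originOf(h).equals(self))
>         .collect(Collectors.toList());
>   }
> }
> {code}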
> 3) Result Set Management
> Lens supports two consumption modes for a query: in-memory and persistent. A
> user can ask for a query's results to be persisted to an HDFS location and
> can retrieve them from a different node. However, this does not apply to
> in-memory result sets (particularly for JDBC drivers). In the case of an
> in-memory result set, a request to retrieve the result set must go to the
> same node which executed the query. Hence, some query-handle stickiness is
> unavoidable.
> Let's take an example to illustrate this:
> 1) Again, assume there are three nodes in the Lens cluster (L1, L2, L3).
> These nodes are behind a load balancer (LB) and the load-balancing policy is
> round robin.
> 2) The user creates a session and submits a query with an in-memory result
> set to the LB. The LB routes the request to L1. L1 submits the query,
> persists the state in the datastore and returns the query handle to the user.
> 3) The user sends a request to the LB asking for the status of the query. The
> LB routes it to L2. L2 checks the datastore and reports the query as
> completed.
> 4) The user asks the LB for the result set of the query. The LB now sends the
> request to L3. However, L3 does not have the result set, so it cannot serve
> the request.
> However, if we persist the executing node's identity along with the query
> handle in the datastore (step 2), then when the user asks for the result set,
> L3 can redirect the request to L1, the node that executed the query and holds
> the in-memory result set. A sketch of this redirect follows.
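> A minimal sketch of that redirect, assuming a JAX-RS style endpoint and an
> illustrative executorOf lookup on the datastore (the resultset path below is
> an assumption as well):
> {code:java}
> // Hypothetical sketch of step 4: for in-memory result sets, the receiving
> // node looks up which node executed the query and redirects the client there.
> import java.net.URI;
> import javax.ws.rs.core.Response;
>
> public class ResultSetRouter {
>
>   /** Minimal datastore interface assumed by this sketch. */
>   interface QueryStore {
>     String executorOf(String queryHandle); // node persisted in step 2
>   }
>
>   private final String selfUrl;  // e.g. "http://L3:9999"
>   private final QueryStore store;
>
>   public ResultSetRouter(String selfUrl, QueryStore store) {
>     this.selfUrl = selfUrl;
>     this.store = store;
>   }
>
>   public Response fetchResultSet(String queryHandle) {
>     String executorUrl = store.executorOf(queryHandle);
>     if (executorUrl.equals(selfUrl)) {
>       return Response.ok(/* stream the local in-memory result set */).build();
>     }
>     // 307 keeps the HTTP method, so the client re-issues the request to L1.
>     return Response.temporaryRedirect(
>         URI.create(executorUrl + "/resultset/" + queryHandle)).build();
>   }
> }
> {code}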