[ https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135609#comment-13135609 ]
Vishal Kathuria commented on ZOOKEEPER-1147:
--------------------------------------------
I am just starting to work on this for 3.5, so I am seeking comments.
I am defining the proposed changes in terms of the C library, but they can
easily be translated into corresponding Java library changes. The only change
to the API is that you have to request this new optimization while creating
your client (since I did not want to change the behavior for existing clients).
The optimization will delay the creation of the persistent (or global) session
until this client creates an ephemeral node for the first time.
1. zoo_init will take a flag indicating delayed persistent session creation.
2. Server will look at this flag and create a session that is local to the
server and not send a request to the leader.
3. Server will expose a new operation - upgradeToPersistent - that will upgrade
a local session to a persistent session. This is the first time that the leader
will become aware of this session (assuming the client is connected to a
follower).
4. If there is a zoo_create for an ephemeral node, the client will send an
upgradeToPersistent request to the server before sending the create ephemeral
node request. This request would be async, so I don't expect it to delay the
creation of the ephemeral node much.
We can extend the behavior to be similar to Chubby by having the client or the
server execute upgradeToPersistent after a random time interval.
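
To make steps 1-4 concrete, here is a rough client-side sketch in C. It is only
a model of the request ordering, not the ZooKeeper C client: every name in it
(sketch_client, sketch_init, sketch_create, REQ_UPGRADE_TO_PERSISTENT) is
hypothetical, and it assumes the client marks the session as upgraded as soon
as the async upgrade request is queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none exist in the ZooKeeper C client. */
typedef enum { REQ_UPGRADE_TO_PERSISTENT, REQ_CREATE } sketch_req;

#define SKETCH_MAX_REQS 16

typedef struct {
    bool local;                        /* true until an upgrade has been requested */
    sketch_req sent[SKETCH_MAX_REQS];  /* requests in the order they were queued */
    size_t nsent;
} sketch_client;

/* Queue a request without waiting for a reply (models an async send). */
static void sketch_send(sketch_client *c, sketch_req r) {
    assert(c != NULL && c->nsent < SKETCH_MAX_REQS);
    c->sent[c->nsent++] = r;
}

/* Step 1: the flag passed at init time selects delayed persistent session
 * creation. Without it the session is persistent from the start, as today. */
static void sketch_init(sketch_client *c, bool delay_persistent) {
    assert(c != NULL);
    c->local = delay_persistent;
    c->nsent = 0;
}

/* Step 4: the first ephemeral create on a local session queues
 * upgradeToPersistent ahead of the create itself. */
static void sketch_create(sketch_client *c, bool ephemeral) {
    assert(c != NULL);
    if (ephemeral && c->local) {
        sketch_send(c, REQ_UPGRADE_TO_PERSISTENT);
        c->local = false;  /* assumption: flipped optimistically, before the reply */
    }
    sketch_send(c, REQ_CREATE);
}
```

With the flag set, a non-ephemeral create queues only the create, the first
ephemeral create queues an upgrade followed by the create, and later ephemeral
creates queue only the create.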
> Add support for local sessions
> ------------------------------
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.3.3
> Reporter: Vishal Kathuria
> Labels: api-change, scaling
> Fix For: 3.5.0
>
> Original Estimate: 840h
> Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale.
> We are planning on having about 1 million clients connect to a ZooKeeper
> ensemble through a set of 50-100 observers. The majority of these clients are
> read only, i.e. they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is
> handled like any other update. In the above use case, the session create/drop
> workload can easily overwhelm an ensemble. The following is a proposal for a
> "local session", to support a larger number of connections.
> 1. The idea is to introduce a new type of session - "local" session. A
> "local" session doesn't have the full functionality of a normal session.
> 2. Local sessions cannot create ephemeral nodes.
> 3. Once a local session is lost, you cannot re-establish it using the
> session-id/password. The session and its watches are gone for good.
> 4. When a local session connects, the session info is only maintained
> on the zookeeper server (in this case, an observer) that it is connected to.
> The leader is not aware of the creation of such a session and there is no
> state written to disk.
> 5. The pings and expiration are handled by the server that the session
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options being considered:
> 1. Let the client specify at connect time which kind of session they
> want.
> 2. All sessions connect as local sessions and automatically get promoted to
> global sessions when they do an operation that requires a global session
> (e.g. creating an ephemeral node).
> Chubby took the approach of lazily promoting all sessions to global, but I
> don't think that would work in our case, where we want to keep sessions which
> never create ephemeral nodes as always local. Option 2 would make it more
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a
> client flag, IsLocalSession (much like the current readOnly flag) that would
> be used to determine whether to create a local session or a global session.