Session/clustering API and the web tier

Greg Wilkins Tue, 11 Jul 2006 16:06:38 -0700

All,


Here are my comments on the Session API that were promised after ApacheCon
Dublin. This is also CC'd to the WADI list, as some of the points are relevant
to them as well.

My own reason for focusing on the Session API when I think about clustering
is that I like the idea of pluggable clustering implementations.  Clustering
is not one size fits all, and solutions can range from non-replicated nodes
configured in a flat file to auto-discovered, self-healing, redundant
hierarchies.

I think the previous discussions we had on this were making good progress, but
we ran out of steam before the API was improved.  So I think it is
worthwhile to re-read the original threads.

But I will repeat my main unresolved concerns here:
While I appreciate the keep-it-simple-stupid approach adopted by the proposed
session API, I remain concerned that it may oversimplify and may also mix
concerns.

However, I do think that the API is pitched at about the right level - namely
that it is below the specific concerns of things such as HTTP.  As the
implementor of the web container, I would prefer not to delegate HttpSession
management or request proxying to a pluggable session implementation (I doubt
that a cluster impl wants to deal with non-blocking proxying of requests etc.)



I see that the webcontainer needs to interact with the cluster implementation
in 4 areas:


1) Policy
---------

When a container receives a request, it needs to make a policy decision along 
the lines of:

    1) The request can be handled locally.
    2) The request can be handled locally, but only after some other actions
       (e.g. the session is moved to the local node).
    3) The request cannot be handled locally, but can be redirected to
       another node.
    4) The request cannot be handled locally, but can be proxied to another
       node.

This kind of corresponds to the Locator and SessionLocation APIs.  However
these APIs give the power to enact a policy decision, but give no support to
make a policy decision.

To implement a policy, you might want to use: the size of the cluster, the
total number of sessions, the number of sessions on the local node, the number
of sessions collocated with a remote session, how many requests for the
session have recently arrived on what nodes, etc. etc.

The API does not give me this information and I think it would be 
difficult to provide all that might be used.  Potentially
we could get by with a mechanism to store/access cluster wide meta-data
attributes? 

However, it is very unlikely that one policy will fit all, so each consumer
of this Location API will have to implement a pluggable policy framework of
some sort.

But as the session API is already a pluggable framework, why don't we
just delegate the policy decision to the API?  The web container
remains the place where the decision is applied, but it should call the
session API to make the decision.  Something like:

  SessionLocation executeAt =  locator.getSessionExecutionLocation(clientId);
  if (executeAt.isLocal())
    // handle request
  else
    // proxy or redirect to executeAt location.

(Note the need for something like this has been discussed before and 
generally agreed.  I have seen the proposed RemoteSessionStrategy, but I am not 
sure how you obtain a handle to one - nor do I think the policy should
decide between redirect and proxy - which is HTTP business).


A final concern about location and policy is that the API does
not support non-homogeneous clusters.  For a given client ID, the
EJBSession beans may be on one node and the HttpSession beans on
another.  So perhaps the type of the session needs to be passed
to the policy, and the policy may decide to move only part of the
session around.
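
To make the delegation idea concrete, here is a minimal sketch in which the
container asks a pluggable policy where a request should execute and passes the
session type along. Every name in it (SessionType, Location, ExecutionPolicy,
StayPutPolicy) is a hypothetical stand-in for illustration, not a type from the
proposed API, and the "stay where the state lives" rule is just one possible
policy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the proposed API's real types.
enum SessionType { HTTP, EJB }

interface Location {
    boolean isLocal();
    String nodeName();
}

/** Pluggable policy: decides where a request for a client ID should execute. */
interface ExecutionPolicy {
    Location executionLocation(String clientId, SessionType type);
}

/** Trivial policy: execute wherever that part of the session currently lives. */
final class StayPutPolicy implements ExecutionPolicy {
    private final String localNode;
    // key: clientId + "/" + type, value: node holding that part of the session
    private final Map<String, String> owners = new HashMap<>();

    StayPutPolicy(String localNode) {
        this.localNode = localNode;
    }

    void recordOwner(String clientId, SessionType type, String node) {
        owners.put(clientId + "/" + type, node);
    }

    @Override
    public Location executionLocation(String clientId, SessionType type) {
        // Sessions with no recorded owner are treated as local.
        final String node = owners.getOrDefault(clientId + "/" + type, localNode);
        final boolean local = node.equals(localNode);
        return new Location() {
            @Override public boolean isLocal() { return local; }
            @Override public String nodeName() { return node; }
        };
    }
}
```

With this shape the policy only answers "where"; whether a non-local answer
turns into a redirect or a proxy stays in the web container, since that is
HTTP business.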



2) State
--------

The proposed Session interface is where we put the state for
the session.  The first question about this is why does it not
implement the Map interface?  The primitives are much the same:

   addState(String key, Object value);
   getState(String key);
   removeState(String key);

but without any support for size, iteration, bulk operations etc.
These operations will be needed to implement getAttributeNames at least.

Also note that there is not a 1 to 1 mapping between this session object
and an HttpSession, as for a given client ID there may be an HttpSession for
each webapp context visited, plus EJB sessions etc.  To handle this,
the name space of the session must be structured, so when the user calls

 setAttribute("foo","value");

the actual call is something like

 addState("web:"+contextPath+"foo","value");

Which kind of sux on a number of levels.  First of all, contextPath is not
sufficiently unique, so virtual hosts and ports must be brought into play.
There is also going to be a cost in creating lots of strings just to look up
values.

I don't see why this scoping could not be done by the session instance
itself, with the web container being passed an instance that is pre-scoped to
the specific context (even if it just added name prefixes behind the scenes).
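
A minimal sketch of such a pre-scoped instance follows. The three primitives
are the ones quoted above; their return types, the stateKeys() method (added so
that something like getAttributeNames can be implemented), and the class names
MapSessionState and ScopedSessionState are all assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** The three primitives quoted above, plus a hypothetical key listing. */
interface SessionState {
    void addState(String key, Object value);
    Object getState(String key);
    Object removeState(String key);
    Iterable<String> stateKeys(); // assumption: not in the proposed API
}

/** Plain in-memory state, standing in for the cluster-backed session. */
final class MapSessionState implements SessionState {
    private final Map<String, Object> state = new HashMap<>();

    public void addState(String key, Object value) { state.put(key, value); }
    public Object getState(String key) { return state.get(key); }
    public Object removeState(String key) { return state.remove(key); }
    public Iterable<String> stateKeys() { return new ArrayList<>(state.keySet()); }
}

/** View of a shared session restricted to one scope, e.g. one web context. */
final class ScopedSessionState implements SessionState {
    private final SessionState shared;
    private final String prefix;

    ScopedSessionState(SessionState shared, String scope) {
        this.shared = shared;
        this.prefix = scope + "|"; // separator choice is arbitrary
    }

    public void addState(String key, Object value) { shared.addState(prefix + key, value); }
    public Object getState(String key) { return shared.getState(prefix + key); }
    public Object removeState(String key) { return shared.removeState(prefix + key); }

    public Iterable<String> stateKeys() {
        // Only keys in this scope are visible, with the prefix stripped.
        List<String> keys = new ArrayList<>();
        for (String k : shared.stateKeys()) {
            if (k.startsWith(prefix)) {
                keys.add(k.substring(prefix.length()));
            }
        }
        return keys;
    }
}
```

The container would be handed something like
new ScopedSessionState(shared, "web:host1:8080/appA") and could then call
addState("foo", "value") directly. This naive version still concatenates a
string per call; an implementation that owns the scoping is free to keep one
map per scope instead.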


Finally, something needs to be done about the events that servlet
containers need to implement for passivation, binding, invalidation etc.
This is difficult without making the API servlet specific.  However, I
think that some generic binding/passivation event mechanism together with
an iterator (to discover the values that implement the servlet event listener
APIs) would be sufficient.  We had a rough agreement on that before; however,
I was concerned that it was a bit too complex and weblike.
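
One way the generic event plus iterator combination could fit together is
sketched below. All names are hypothetical: SessionLifecycleListener and
ObservableSession stand for the generic side, and ActivationAware stands in for
a servlet listener API such as HttpSessionActivationListener so that the
example has no servlet dependency.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Generic, non-servlet callback the session implementation would fire. */
interface SessionLifecycleListener {
    void willPassivate(ObservableSession session);
}

final class ObservableSession {
    private final Map<String, Object> state = new LinkedHashMap<>();
    private final List<SessionLifecycleListener> listeners = new ArrayList<>();

    void addState(String key, Object value) { state.put(key, value); }
    void addListener(SessionLifecycleListener l) { listeners.add(l); }

    /** The iteration the container needs to discover listener-implementing values. */
    Iterable<Object> values() { return new ArrayList<>(state.values()); }

    /** Would be called by the cluster implementation before moving the session. */
    void passivate() {
        for (SessionLifecycleListener l : listeners) {
            l.willPassivate(this);
        }
    }
}

/** Stand-in for a servlet API such as HttpSessionActivationListener. */
interface ActivationAware {
    void sessionWillPassivate();
}

/** Container-side adapter: turns the generic event into servlet-style callbacks. */
final class WebContainerAdapter implements SessionLifecycleListener {
    @Override
    public void willPassivate(ObservableSession session) {
        for (Object value : session.values()) {
            if (value instanceof ActivationAware) {
                ((ActivationAware) value).sessionWillPassivate();
            }
        }
    }
}
```

The session side knows nothing about servlets here; only the adapter, which
lives in the web container, knows which listener interfaces to look for.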



3) Life cycle
-------------

Unfortunately the life and death of a session is not simple - especially
when cross-context dispatch is considered.  Session IDs may or may not
be reused, their uniqueness might need to be guaranteed, and the decision
may depend on the existence of the same session ID in other contexts.

I think this can be modeled with a structured name space - so perhaps
this is not an issue anymore?


4) Configuration and Management
-------------------------------

It would generally be good to be able to know how many nodes are in
the cluster (or to set what the nodes are), to monitor node status, and
to give commands to gracefully or brutally shut down a node, move
sessions etc.

Clustering-aware clients (JNDI stubs, EJB proxies or potentially fancy
Ajax web clients) might need to be passed a list of known nodes - but it
is not possible to obtain/set that from the API - thus every impl will need
to implement its own cluster config/discovery even if that information is
available in other implementations.
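
As a rough illustration of the kind of surface this implies, here is a small
sketch of a cluster view that can list known nodes, report status, and accept
a graceful or brutal shutdown command. ClusterView, NodeStatus and
StaticClusterView are hypothetical names, and the static in-memory
implementation only models the flat-file end of the spectrum.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

enum NodeStatus { UP, DRAINING, DOWN }

/** Hypothetical read/management view of the cluster; not part of the proposed API. */
interface ClusterView {
    List<String> knownNodes();
    NodeStatus status(String node);
    void shutdown(String node, boolean graceful);
}

/** Statically configured membership, as if read from a flat file. */
final class StaticClusterView implements ClusterView {
    private final Map<String, NodeStatus> nodes = new LinkedHashMap<>();

    StaticClusterView(List<String> configured) {
        for (String n : configured) {
            nodes.put(n, NodeStatus.UP);
        }
    }

    public List<String> knownNodes() {
        // Copy so callers (e.g. code building a client stub) cannot mutate membership.
        return Collections.unmodifiableList(new ArrayList<>(nodes.keySet()));
    }

    public NodeStatus status(String node) {
        // Nodes that were never configured are reported as DOWN.
        return nodes.getOrDefault(node, NodeStatus.DOWN);
    }

    public void shutdown(String node, boolean graceful) {
        if (nodes.containsKey(node)) {
            // Graceful: stop taking new work first; brutal: mark down immediately.
            nodes.put(node, graceful ? NodeStatus.DRAINING : NodeStatus.DOWN);
        }
    }
}
```

A container could hand knownNodes() to a cluster-aware client without caring
whether the list came from a file or from auto-discovery.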


That is more than enough to ponder for now....

So I would like to see an API to which I could delegate this policy decision,
but I am not sure that the current API, with just isLocal(), is rich enough
to support this.
