All,
Here are my comments on the Session API that I promised after ApacheCon Dublin. This is also CC'd to the WADI list, as some of the points are relevant to them as well.

My own reason for focusing on the Session API when I think about clustering is that I like the idea of pluggable clustering implementations. Clustering is not one size fits all, and solutions can range from non-replicated nodes configured in a flat file to auto-discovered, self-healing, redundant hierarchies.

I think the previous discussions we had on this were making good progress, but we ran out of steam before the API was improved. So I think it is worthwhile to re-read the original threads. But I will repeat my main unresolved concerns here:

While I appreciate the keep-it-simple-stupid approach adopted by the proposed session API, I remain concerned that it may oversimplify and may also mix concerns. However, I do think that the API is pitched at about the right level - namely that it is below the specific concerns of things such as HTTP.

As the implementor of the web container, I would prefer not to delegate HttpSession management or request proxying to a pluggable session implementation (I doubt that a cluster impl wants to deal with non-blocking proxying of requests etc.).

I see that the web container needs to interact with the cluster implementation in 4 areas:

1) Policy
---------

When a container receives a request, it needs to make a policy decision along the lines of:

  1) The request can be handled locally.
  2) The request can be handled locally, but only after some other actions (eg session is moved to local).
  3) The request cannot be handled locally, but can be redirected to another node.
  4) The request cannot be handled locally, but can be proxied to another node.

This roughly corresponds to the Locator and SessionLocation APIs. However, these APIs give the power to enact a policy decision, but give no support for making one.
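To make the gap concrete, the four outcomes above could be sketched roughly as follows. This is only an illustration: the Disposition enum, the ContainerPolicy class and the two extra boolean inputs are names I have made up, not part of the proposed API. The point is that only the first input is something the current API can answer; the other two have to come from somewhere else.

```java
/** The four outcomes listed above. Names are illustrative, not from the proposed API. */
enum Disposition {
    HANDLE_LOCAL,
    MOVE_THEN_HANDLE_LOCAL,
    REDIRECT,
    PROXY
}

/** Hypothetical container-side decision logic. */
final class ContainerPolicy {

    private ContainerPolicy() {
    }

    /**
     * @param sessionIsLocal        what an isLocal() style call can tell us today
     * @param worthMovingHere       assumption: needs cluster-wide information the API does not expose
     * @param clientCanBeRedirected assumption: HTTP-level knowledge held by the web container
     */
    static Disposition decide(boolean sessionIsLocal,
                              boolean worthMovingHere,
                              boolean clientCanBeRedirected) {
        if (sessionIsLocal) {
            return Disposition.HANDLE_LOCAL;
        }
        if (worthMovingHere) {
            return Disposition.MOVE_THEN_HANDLE_LOCAL;
        }
        // The session stays remote: choosing redirect or proxy is an HTTP concern.
        return clientCanBeRedirected ? Disposition.REDIRECT : Disposition.PROXY;
    }
}
```

For example, ContainerPolicy.decide(false, false, true) yields REDIRECT, while decide(false, true, true) yields MOVE_THEN_HANDLE_LOCAL.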
To implement a policy, you might want to use: the size of the cluster, the total number of sessions, the number of sessions on the local node, the number of sessions collocated with a remote session, how many requests for the session have recently arrived on which nodes, etc. The API does not give me this information, and I think it would be difficult to provide everything that might be used. Potentially we could get by with a mechanism to store/access cluster-wide meta-data attributes?

However, it is very unlikely that one policy will fit all, so each consumer of this Location API will have to implement a pluggable policy framework of some sort. But as the session API is already a pluggable framework, why don't we just delegate the policy decision to the API? The web container should remain responsible for the policy decision, but should call the session API to actually make it. Something like:

    SessionLocation executeAt = locator.getSessionExecutionLocation(clientId);
    if (executeAt.isLocal()) {
        // handle request
    } else {
        // proxy or redirect to executeAt location
    }

(Note the need for something like this has been discussed before and generally agreed. I have seen the proposed RemoteSessionStrategy, but I am not sure how you obtain a handle to one - nor do I think the policy should decide between redirect and proxy - which is HTTP business.)

A final concern about location and policy is that the API does not support non-homogeneous clusters. For a given client ID, the EJBSession beans may be on one node and the HttpSession beans on another. So perhaps the type of the session needs to be passed to the policy, and the policy may decide to only move part of the session around.

2) State
--------

The proposed Session interface is where we put the state for the session. The first question about this is why does it not implement the Map interface?
The primitives are much the same:

    addState(String key, Object value);
    getState(String key);
    removeState(String key);

but without any support for size, iteration, bulk operations etc. These operations will be needed to implement getAttributeNames at least.

Also note that there is not a 1 to 1 mapping between this session object and a HttpSession, as for a given client ID there may be a HttpSession for each webapp context visited, plus EJB sessions etc. To handle this, the name space of the session must be structured, so when the user calls

    setAttribute("foo","value");

the actual call is something like

    addState("web:"+contextPath+"foo","value");

Which kind of sucks on a number of levels. First of all, contextPath is not sufficiently unique, and virtual hosts and ports must be brought into play. There is also going to be a cost in creating lots of strings just to look up values. I don't see why this scoping could not be done by the session instance itself, so that the web container is passed an instance that is pre-scoped to the specific context (even if it just added name prefixes behind the scenes).

Finally, something needs to be done about the events that servlet containers need to implement for passivation, binding, invalidation etc. This is difficult without making the API servlet specific. However, I think that some generic binding/passivation event mechanism together with an iterator (to discover the values that implement the servlet event listener APIs) would be sufficient - we had a rough agreement on that before, however I was concerned that it was a bit too complex and weblike.

3) Life cycle
-------------

Unfortunately the life and death of a session is not simple - especially when cross-context dispatch is considered. Session IDs may or may not be reused, their uniqueness might need to be guaranteed, and the decision may depend on the existence of the same session ID in other contexts.
I think this can be modeled with a structured name space - so perhaps this is not an issue anymore?

4) Configuration and Management
-------------------------------

It would generally be good to be able to know how many nodes are in the cluster (or to set what the nodes are), and to be able to monitor node status and give commands to gracefully or brutally shut down a node, move sessions etc.

Clustering-aware clients (JNDI stubs, EJB proxies or potentially fancy Ajax web clients) might need to be passed a list of known nodes - but it is not possible to obtain/set that from the API - thus every impl will need to implement its own cluster config/discovery, even if that information is available in other implementations.

That is more than enough to ponder for now....

So, coming back to policy: I would like to see an API to which I could delegate the policy decision, but I am not sure that the current API, with just isLocal(), is rich enough to support this.
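PS: to illustrate the pre-scoped session idea from the State section, here is a rough sketch under some loud assumptions. ClusterSession is only a stand-in for the proposed Session interface, stateKeys() is a hypothetical iteration method that the current API lacks, and ScopedSession and its "web:host:port/path|" prefix format are invented for the example. It just adds name prefixes behind the scenes, so it does not address the string creation cost.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Stand-in for the proposed Session state primitives. */
final class ClusterSession {
    private final Map<String, Object> state = new HashMap<>();

    void addState(String key, Object value) { state.put(key, value); }

    Object getState(String key) { return state.get(key); }

    Object removeState(String key) { return state.remove(key); }

    /** Hypothetical addition: the proposed API offers no way to iterate over keys. */
    Set<String> stateKeys() { return new TreeSet<>(state.keySet()); }
}

/** Hypothetical view of a ClusterSession, pre-scoped to one web context. */
final class ScopedSession {
    private final ClusterSession delegate;
    private final String prefix;

    ScopedSession(ClusterSession delegate, String host, int port, String contextPath) {
        this.delegate = delegate;
        // Host and port are included because contextPath alone is not unique.
        this.prefix = "web:" + host + ":" + port + contextPath + "|";
    }

    void setAttribute(String name, Object value) { delegate.addState(prefix + name, value); }

    Object getAttribute(String name) { return delegate.getState(prefix + name); }

    Object removeAttribute(String name) { return delegate.removeState(prefix + name); }

    /** Names visible in this scope only, with the prefix stripped. */
    Set<String> getAttributeNames() {
        Set<String> names = new TreeSet<>();
        for (String key : delegate.stateKeys()) {
            if (key.startsWith(prefix)) {
                names.add(key.substring(prefix.length()));
            }
        }
        return names;
    }
}
```

With something like this, two contexts sharing one client ID can each call setAttribute("foo", ...) on their own ScopedSession without seeing each other's value, and the web container never builds the structured key itself.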