James Strachan wrote:
On 7/12/06, Jules Gosnell <[EMAIL PROTECTED]> wrote:

Greg Wilkins wrote:
> All,
>
>
> Here are my comments on the Session API that were promised after apachecon dublin.
> This is also CC'd to the wadi list and some of the points are relevant to them
> as well.
>
> My own reason for focusing on the Session API when I think about clustering,
> is that I like the idea of pluggable clustering implementations. Clustering
> is not one size fits all and solutions can go from non-replicated nodes configured
> in a flat file to auto discovered, self healing, redundant hierarchies.


Agreed. We should be focussed purely on what is the contract between a
container and the session API and making that contract as simple and
abstract as is possible while minimising leaky abstractions.


> I think the previous discussions we had on this were making good progress, but
> I think we ran out of steam before the API was improved etc. So I think it
> worthwhile to re-read the original threads.
>
> But I will repeat my main unresolved concerns here:
> While I appreciate the keep-it-simple-stupid approach adopted by the proposed
> session API, I remain concerned that it may oversimplify and may also mix concerns.
>
> However, I do think that the API is pitched at about the right level - namely
> that it is below the specific concerns of things such as HTTP. As the implementor
> of the web container, I would prefer to not delegate HttpSession management
> or request proxying to a pluggable session implementation (I doubt that
> a cluster impl wants to deal with non-blocking proxying of requests etc.)

I think that our discussions about this have suffered from an ambiguity
around the word 'delegate'...

In one sense of the word, given WADI's current implementation, Jetty
does delegate Session management and HTTP handling to WADI, in that WADI
passes the WebApp/Jetty an object on which it calls a method and the
work in question is done.

However, in another sense, Jetty need not delegate this task, since the
object returned in these cases is managed by WADI, but created by a
Factory that is injected at startup time. This factory might be
generating instances of a class that has very Jetty-specific knowledge
or is even a part of the Jetty distro...


That's certainly one approach. Another is for the container to just ask
the policy API what to do (i.e. is the request going to be serviced
locally or not) so that the container can take care of the rest.

This leaks clustering concerns into the container's space.


I understand the cleanliness from the session API implementor's
perspective of using a factory and calling back the container when you
see fit - however I also understand the container developer's
requirement to understand at all times what each thread is doing, to
tune things aggressively with full knowledge of threading models and
to generally be master of its own domain, so I can understand why a
container developer might prefer a non-callback related solution
(which could introduce all kinds of nasty thread related bugs into the
container).

Any clustering solution will use threads underneath its API. If this is a concern, you should simply make explicit where they may be used.

I don't see why both options can't be offered.


I would wholeheartedly agree that the code for Http request relocation
should be written by someone with expertise in that area - namely the
container writer. I would just rather see it injected into the clustered
manager, so that it can be called when required, without having to
burden Jetty with the added task of making this decision itself.


I don't see that as mutually exclusive. Just have a way for Jetty to
ask the clustering solution if a request can be satisfied locally; if
not, Jetty does the proxy/redirect thing.


> I see that the webcontainer needs to interact with the cluster implementation
> in 4 areas:
>
>
> 1) Policy
> ---------
>
> When a container receives a request, it needs to make a policy decision along
> the lines of:
>
>     1) The request can be handled locally.
>     2) The request can be handled locally, but only after some other actions
>        (eg session is moved to local)
>     3) The request cannot be handled locally, but can be redirected to another node.
>     4) The request cannot be handled locally, but can be proxied to another node.
>
> This kind of corresponds to the Locator and SessionLocation APIs. However
> these APIs give the power to enact a policy decision, but give no support to make
> a policy decision.
>
> To implement a policy, you might want to use: the size of the cluster, the total
> number of sessions, the number of sessions on the local node, the number of sessions
> collocated with a remote session, how many requests for the session have recently
> arrived on what nodes, etc. etc.
>
> The API does not give me this information and I think it would be
> difficult to provide all that might be used.  Potentially
> we could get by with a mechanism to store/access cluster wide meta-data
> attributes?
>
> However, it is very unlikely that one policy will fit all, so each consumer
> of this Location API will have to implement a pluggable policy framework of
> some sorts.
>
> But as the session API is already a pluggable framework, why don't we
> just delegate the policy decision to the API.  The web container
> should make the policy decision, but should call the session API to
> make the decision.  Something like:
>
> SessionLocation executeAt = locator.getSessionExecutionLocation(clientId);
> if (executeAt.isLocal())
>     // handle request
> else
>     // proxy or redirect to executeAt location.
>
> (Note the need for something like this has been discussed before and
> generally agreed. I have seen the proposed RemoteSessionStrategy, but I am not
> sure how you obtain a handle to one - nor do I think the policy should
> decide between redirect and proxy - which is HTTP business).


Agreed. Just some way to ask the Session API if a request can be
processed locally might do the trick, then if not Jetty can do its
proxy/redirect thing. The trickier thing is what to pass into the
strategy to help it decide...
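For illustration only, here is one shape that contract could take - the names (SessionPolicy, Location) are hypothetical sketches, not the proposed API. The container passes in whatever it knows, and the clustering layer answers the single question the container cares about:

```java
// Hypothetical sketch only - SessionPolicy and Location are illustrative
// names, not the proposed Session API.
import java.util.Map;

interface Location {
    boolean isLocal();
    String nodeName(); // where to redirect/proxy to if not local
}

interface SessionPolicy {
    // The container passes in what it knows (e.g. recent request stats);
    // the clustering layer combines that with what it knows (session
    // location, cluster size) to decide where the request should run.
    Location getExecutionLocation(String sessionId, Map<String, Object> containerHints);
}

// Trivial single-node implementation: every session is local.
class LocalOnlyPolicy implements SessionPolicy {
    public Location getExecutionLocation(String id, Map<String, Object> hints) {
        return new Location() {
            public boolean isLocal() { return true; }
            public String nodeName() { return "localhost"; }
        };
    }
}
```

The container side then stays a two-liner - handle if local, else do its own proxy/redirect thing - and the redirect-vs-proxy choice stays HTTP business.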



By having Jetty make the decision:

- you leak clustering concerns into the web tier
- you have to duplicate similar code in every clustered tier

By exposing the 'policy' API to the container and putting it in charge
of when it is used, you are exposing clustering details to it.


Also, container details may be required by this policy - e.g.
details about the previous HTTP requests received at the current node,
their type and various metadata statistics and so forth, which only the
container is aware of.



Sophisticated policies require access to both container and clustered session manager details on which to make an informed decision.

There are two places where you can abstract:

a) an api over the necessary details in the container
b) an api over the necessary details in the clustered session manager

If you go with (a), then each session manager can provide a single policy which will run on any container.

If you go with (b), then each container will have to implement its own policy code that will run on any session manager.

Since the number of containers in the equation is always likely to outnumber the number of clustered session manager implementations, (a) will allow for the most code reuse. WADI takes this approach, to get something going with minimum code, whilst not shutting the door on plugging in policies which use native APIs on both sides, thus allowing maximum sophistication.
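As a sketch of what (a) might look like - all names here are my own, hypothetical ones - the container implements one small details interface, and each session manager ships a policy written against it:

```java
// Hypothetical sketch of option (a): the container exposes its details
// through one small interface; the session manager supplies the policy.
interface ContainerDetails {
    // e.g. request history that only the container can know about
    int recentRequestsFor(String sessionId);
}

interface Policy {
    // Decide whether to pull the session here or send the request away.
    boolean shouldExecuteLocally(String sessionId, ContainerDetails container);
}

// One session manager's policy: localise the session once traffic looks sticky.
class AffinityPolicy implements Policy {
    private final int threshold;
    AffinityPolicy(int threshold) { this.threshold = threshold; }
    public boolean shouldExecuteLocally(String id, ContainerDetails c) {
        return c.recentRequestsFor(id) >= threshold;
    }
}
```

The same AffinityPolicy then runs unchanged on any container that can answer recentRequestsFor() - which is the code-reuse argument for (a) in a nutshell.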

Taking the (b) route will allow different tiers to use different logic to decide where to locate their session. This is a bad idea because:

1) Tier owners are not clustering architects - once again we have the leakage of concerns.

2) This opens us up to the possibility of different tiers making contradictory decisions and session groups (e.g. a web and an EJB session) being ping-ponged back and forth within the cluster because (e.g. the web and EJB) containers are using different logic to decide the best place to keep their session.


Ultimately you could abstract on both sides of the coin - but I think you would over-constrain the policy's input, and I don't see much value in a policy that would port between different clustered session managers, as their implementations are likely to be very different.

WADI's approach is to completely shield the container from having to
know anything about clustering, whilst maintaining contracts with the
container encapsulating various pieces of tier/domain-specific
functionality that may be injected into the clustered session manager.


The issue is though, how invisible can clustering ever be? Information
from the container and from the clustering implementation will
typically be required for the policy decision.


> 3) Life cycle
>
> Unfortunately the life and death of a session is not simple - especially
> when cross-context dispatch is considered.  Session IDs may or may not
> be reused, their uniqueness might need to be guaranteed and the decision
> may depend on the existence of the same session ID in other contexts.
>
> I think this can be modeled with a structured name space - so perhaps
> this is not an issue anymore?
>
>
> 4) Configuration and Management
> It would generally be good to be able to know how many nodes are in
> the cluster (or to set what the nodes are). To be able to monitor node
> status and give commands to gracefully or brutally shutdown a node, move
> sessions etc.
>
> Clustering-aware clients (JNDI stubs, EJB proxies or potentially fancy
> Ajax web clients) might need to be passed a list of known nodes - but it
> is not possible to obtain/set that from the API - thus every impl will need
> to implement its own cluster config/discovery even if that information is
> available in other implementations.
>
>

This is the clustering API (in my mind) that was mooted in the meeting.
A number of clustering substrates (JGroups, ActiveCluster, Tribes,
etc...) have homesteaded this area (WADI maintains an abstraction layer
that can map on to any of these three). All provide an API which
provides membership notification/querying, 1->1 and 1->all messaging
functionality. These are the basic building blocks of clustering and
they will be required in every clustered service that is built for
Geronimo. This is a natural candidate for encapsulation and sharing.
Failing to do this will result in each different service having to build
its own concepts about clustering from the ground up, which would be a
disaster.


Agreed.

Things in the Java world have changed greatly since the introduction
of JGroups, JCluster, ActiveCluster, Tribes et al. Nowadays there is
no reason why we can't have a really simple POJO based model to
represent Nodes in a cluster with listeners to be notified when nodes
come and go. (It's really the main point of ActiveCluster - but we
could maybe refactor that API to be just a POJO model of a cluster
with no dependencies on external APIs or technologies and with the
ability maybe to cast a Node to some service interface to communicate
with the nodes).

Then using things like Spring Remoting we can add the remoting
technology as a deployment issue (rather than having lots of different
middleware specific APIs). e.g. see how Lingo allows you to invisibly
add JMS remoting to any POJO. (http://lingo.codehaus.org/)
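A minimal sketch of such a POJO model - the names are mine for illustration, not ActiveCluster's - might be nothing more than:

```java
// Minimal POJO cluster model: Nodes plus listeners notified as nodes
// come and go. No dependency on any middleware API.
import java.util.ArrayList;
import java.util.List;

class Node {
    final String name;
    Node(String name) { this.name = name; }
}

interface ClusterListener {
    void nodeAdded(Node node);
    void nodeRemoved(Node node);
}

class Cluster {
    private final List<Node> nodes = new ArrayList<>();
    private final List<ClusterListener> listeners = new ArrayList<>();

    void addListener(ClusterListener l) { listeners.add(l); }

    // Called by whatever membership substrate is plugged in underneath
    // (JGroups, Tribes, a flat file of known nodes...).
    void nodeJoined(Node n) {
        nodes.add(n);
        for (ClusterListener l : listeners) l.nodeAdded(n);
    }

    void nodeLeft(Node n) {
        nodes.remove(n);
        for (ClusterListener l : listeners) l.nodeRemoved(n);
    }

    List<Node> getNodes() { return new ArrayList<>(nodes); }
}
```

With the model this plain, the actual transport (JMS via Lingo, plain sockets, whatever) really does become a deployment issue rather than something baked into the API.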


This is an interesting thought. I'll let it soak in for a while.


Jules


--
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 *    www.coredevelopers.net
 *
 * Open Source Training & Support.
 **********************************/
