Re: Dublin - Clustering get-together.... - Report

2006-07-05 Thread Jules Gosnell

Thanks, Greg,

The more views on what was actually discussed and decided the better.

I hope to have some constructive thoughts on the session API and a 
proposed clustering API shortly - time allowing.


Jules



--
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 *www.coredevelopers.net
 *
 * Open Source Training & Support.
 **/


Re: Dublin - Clustering get-together.... - Report

2006-07-05 Thread Greg Wilkins

All,

I am going to give my own short report on the meeting.  I'm not 
intending to dissent from Jules' report - simply to give a short version 
that highlights the important issues (for me).


We covered lots of clustering requirements and issues - made us all 
realize how big and challenging a complete clustering solution is.  
There are few easy wins here.

Views on what we are aiming for ranged from "we need a world-class solution 
for all aspects of clustering" to "we need something that works for the
tick-box aspects ASAP".  Many thought both. 

A pluggable API approach to allow multiple implementations was widely 
accepted as the best way to go.

The session API is the current focus for putting clustering in G.  
Some do not think it should be... but for better or worse it is.

Views on the current session API ranged from: "it is pretty good for 
the job." through "it is the best we can expect to do" and "it is 
adequate for what it is, but we need more"  to "it is totally 
unsatisfactory".

It was apparent that it was difficult to discuss other aspects
of clustering (e.g. management/configuration) without the conversation 
returning to the suitability or otherwise of the session API.

Matt floated the idea that in order to move on, we have a period of review on
the session API (which we time box).   Critics of the API have the next few
weeks to make the case to either extend, refactor or replace this API, 
after which we should try to push through to working implementations (with 
the normal amount of agile refactoring etc.).

This was accepted as the key outcome of the meeting.



Some secondary points:

We frequently blurred and then clarified the somewhat conflicting requirements
for clustering for availability and clustering for scalability.  It is very
easy to miscommunicate when talking about clustering!

It was pointed out that even with a G clustering API, we will have to work
within the limitations of the implementations that we plug into it and we
cannot dictate G implementations of things like heartbeats and cluster
discovery.  The counterpoint to this was that it would be good if we could
use a standard API to extract cluster metadata from an implementation that
did do heartbeats and discovery, for reuse by one that did not (e.g. a JNDI
impl could get its list of known nodes from the HTTP session impl).
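
As a rough illustration of that counterpoint, here is a minimal Java sketch. All of the names (ClusterMetadata, HeartbeatingSessionImpl, JndiImpl) and the timeout-based liveness rule are assumptions made up for this example; none of them is an existing Geronimo, WADI or Tomcat type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical standard API for reading cluster metadata.
interface ClusterMetadata {
    Set<String> knownNodes(long nowMillis);
}

// Stand-in for an implementation that does its own heartbeats
// (e.g. an HTTP session impl). Not thread-safe; illustration only.
class HeartbeatingSessionImpl implements ClusterMetadata {
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final long timeoutMillis;

    HeartbeatingSessionImpl(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record that a heartbeat from 'node' arrived at 'atMillis'.
    void heartbeat(String node, long atMillis) {
        lastSeen.put(node, atMillis);
    }

    // A node counts as known if its last heartbeat is within the timeout.
    public Set<String> knownNodes(long nowMillis) {
        Set<String> live = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() <= timeoutMillis) {
                live.add(e.getKey());
            }
        }
        return live;
    }
}

// Stand-in for an implementation with no discovery of its own
// (e.g. a JNDI impl): it reuses whatever metadata source it is given.
class JndiImpl {
    private final ClusterMetadata metadata;

    JndiImpl(ClusterMetadata metadata) {
        this.metadata = metadata;
    }

    Set<String> replicationTargets(long nowMillis) {
        return metadata.knownNodes(nowMillis);
    }
}
```

The JNDI stand-in never sees heartbeats itself; it only depends on the small read-only interface, which is the part a standard API would have to pin down.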


cheers



Dublin - Clustering get-together.... - Report

2006-07-03 Thread Jules Gosnell

Here is the promised report on the Geronimo Clustering get-together
held on Thursday (6:00pm-8:00pm) in Dublin:

Attendees:

In the room :

Aaron Mulder
Alan Cabrera
Filip Hanik
Greg Wilkins
Jan Bartel
Jeremy Boynes
Jules Gosnell
Mark Brewer
Matt Hogstrom
Paul Buck
Phil Robinson
Rainer Jung
Winston Damarillo

On the phone:

Bill Dudney
Dain Sundstrom
Jeff Genender
Rajith Attapattu

We started by enumerating the areas that were felt to require some
form of clustering support (web, ejb, jndi, jms, deployment, management, 
monitoring, pojo, db). The idea was to prioritise the outstanding
issues and thus scope work to be done. We used the first cut of this
document as a basis for this walkthrough :

http://cwiki.apache.org/GMOxDOC10/clustering.html

I shall be bringing this up to date with the meeting's findings as soon
as I have arranged write access with the wiki's owners.

All of these areas were covered and priorities were assigned thus :

1 - Web
2 - EJB/JNDI
3 - Management/Monitoring/Provisioning
4 - .. remaining issues

We then looked at the software components available to us (ActiveMQ, 
ActiveCluster, ActiveSpace, WADI, Tribes, Kache, Tomcat Clustering) - 
also listed in the architecture document (to be updated with new 
software components ASAP).


A number of interesting and useful points came up during the
discussion :

The 'fastest to market' clustering solution is not necessarily the
same as the 'optimal' clustering solution.

Multiple clustering solutions must be accommodated - APIs help
abstract away from implementations - facilitating this.

Re: Web - although there seem to be two independent routes to a
session clustering API (the containers' own native session management
API vs a common Geronimo API), both must be made accessible since both
native (e.g. Tomcat clustering and WADI) and Geronimo (Kache)
solutions must be pluggable. Furthermore, an adaptor from native API
to Geronimo API requires the native API to be exposed so that the
adaptor can be plugged in in the first place.
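
To make the adaptor point concrete, here is a minimal Java sketch. NativeSessionManager stands in for a container's own session API and GeronimoSessionStore for a common Geronimo-level API; both interfaces and their methods are hypothetical and are not the real Tomcat, WADI or Geronimo types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a container's native session management API.
interface NativeSessionManager {
    void put(String sessionId, String key, Object value);
    Object get(String sessionId, String key);
}

// Hypothetical stand-in for a common Geronimo-level session API.
interface GeronimoSessionStore {
    void setAttribute(String sessionId, String name, Object value);
    Object getAttribute(String sessionId, String name);
}

// Trivial in-memory native implementation, for illustration only.
class InMemoryNativeManager implements NativeSessionManager {
    private final Map<String, Map<String, Object>> sessions = new HashMap<>();

    public void put(String sessionId, String key, Object value) {
        sessions.computeIfAbsent(sessionId, id -> new HashMap<>()).put(key, value);
    }

    public Object get(String sessionId, String key) {
        Map<String, Object> attrs = sessions.get(sessionId);
        return attrs == null ? null : attrs.get(key);
    }
}

// The adaptor: it can only be written if the native API is exposed to wrap.
class NativeToGeronimoAdaptor implements GeronimoSessionStore {
    private final NativeSessionManager delegate;

    NativeToGeronimoAdaptor(NativeSessionManager delegate) {
        this.delegate = delegate;
    }

    public void setAttribute(String sessionId, String name, Object value) {
        delegate.put(sessionId, name, value);
    }

    public Object getAttribute(String sessionId, String name) {
        return delegate.get(sessionId, name);
    }
}
```

A caller written against the common interface sees the same data as one written against the native interface, so both routes stay usable side by side.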

Re: EJB - stateless session beans, whilst stateless, may still be
facading more 'stateful' components, such as large trees of Entity
beans that may be expensive to throw away/reload - so it may be useful
to treat SLSBs as SFSBs as far as the session management API is
concerned - also providing some form of session affinity rather than
round robin load-balancing in the client stub.
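
The difference between round robin and session affinity in a client stub could be sketched roughly as below. AffinitySelector is a hypothetical class invented for this example (it is not part of any EJB client), and it assumes nodes are identified by plain strings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-stub node selector. Not thread-safe; illustration only.
class AffinitySelector {
    private final List<String> nodes;
    private final Map<String, String> pinned = new HashMap<>();
    private int next = 0;

    AffinitySelector(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("no nodes");
        }
        this.nodes = nodes;
    }

    // Plain round robin: every call moves on to the next node.
    String roundRobin() {
        String node = nodes.get(next);
        next = (next + 1) % nodes.size();
        return node;
    }

    // Affinity: the first call for a session picks a node round robin,
    // later calls for the same session return that same node.
    String forSession(String sessionId) {
        return pinned.computeIfAbsent(sessionId, id -> roundRobin());
    }
}
```

With affinity, repeated calls from one client keep landing on the node that already holds whatever expensive state sits behind the SLSB, at the cost of less even load distribution. The sketch ignores node failure, where a pinned session would have to be re-pinned.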

Re: EJB - it may not be a requirement to add further clustered
support for Entity beans (i.e. clustered caches with distributed
invalidations etc).

Re: Management/Monitoring - tooling will hook in via standard and
possibly extended APIs made available via JMX. GBeans managing
e.g. native clustering solutions might adapt and make available
equivalent functionality.

Re: farmed deployment - It was noted that some form of pull/push
deployment support was available. Perhaps Aaron's plugin architecture ?

Re: JMS - AMQ clustering is not yet integrated with Geronimo - it is
awaiting decisions on what the integration should look like.

Re: POJO clustering and JCache - whilst interesting, it is probably
better if users bring their own clustered caches to Geronimo and plug
them in, rather than Geronimo mandating the use of a single solution.

Re: Database clustering - This was also marked as probably out of
scope - although there was some discussion about the Sequoia project.

The support of 3rd party clustered components means that a number of
different implementations of a cluster may be in operation within any
given node at any given time. These different implementations may each
be from the ground up and share no code. Questions around this were e.g. :

- can we expose different components' ideas about cluster membership
through a common API (i.e. compare which nodes e.g. ProductA thinks
are running against e.g. what Geronimo thinks are running) ?

- should this preclude the usage of a common clustering layer and
therefore API in areas where we are building clustered Geronimo
services from the ground up - Sessions, Monitoring/Management, JNDI...
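
As a sketch of the first question, comparing membership views through a common API might look something like this in Java. MembershipView, FixedView and MembershipDiff are hypothetical names made up for the example, and nodes are assumed to be identified by simple string names.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical common view of cluster membership.
interface MembershipView {
    String source();          // which component reports this view
    Set<String> liveNodes();  // node names that component believes are running
}

// A fixed snapshot of one component's view, for illustration only.
class FixedView implements MembershipView {
    private final String source;
    private final Set<String> nodes;

    FixedView(String source, Set<String> nodes) {
        this.source = source;
        this.nodes = new TreeSet<>(nodes);
    }

    public String source() {
        return source;
    }

    public Set<String> liveNodes() {
        return new TreeSet<>(nodes); // defensive copy
    }
}

class MembershipDiff {
    // Nodes that view 'a' reports as running but view 'b' does not.
    static Set<String> onlyIn(MembershipView a, MembershipView b) {
        Set<String> diff = new TreeSet<>(a.liveNodes());
        diff.removeAll(b.liveNodes());
        return diff;
    }
}
```

Calling onlyIn in both directions gives the nodes on which, say, ProductA and Geronimo disagree, without either implementation needing to share code with the other.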

The Geronimo Session API came in for particular discussion. This
seemed to hinge around the fact that its aim might be to hide
clustering issues from client containers (although it actually exposes
a session's location and thus clustering issues at this level), but
that it would be useful if a lower level clustering API, on which
other clustered Geronimo services might be constructed, were also
exposed. There were a number of differing and strongly held positions
- including the view that trying to define what a cluster actually
was, was much too hard a task in the first place !

It was finally agreed that there should be a 2 week period during
which modifications and alternate session management APIs might be
floated on the dev list. The need for a lower-level clustering API and
potential candidates for this role might also be debated during this
time. Immediately after this period, a vote would decide the way based
on

Re: Dublin - Clustering get-together.... - Report...

2006-06-30 Thread Jules Gosnell

Guys,

They are taking down the network here in the next 5 mins or so and I am 
only half way through the write up - so I am going to ask you all to 
bear with me until I get home tomorrow.


Sorry to keep you all waiting,


Jules




Jules Gosnell wrote:




It looks like I am going to make it to Dublin after all.

I think this will be a good chance for anyone interested in Geronimo 
clustering to get together and talk about scope, architecture, 
resourcing, roadmap etc - in short, anything that we need to get the 
beast rolled out.


Here is a link to a clustering architecture overview that I put 
together some time ago - it's a little out of date. It doesn't mention 
Lingo, which should be very useful for cluster management, the 
proposed Session API or recent developments with geronimo-cache etc... 
but it is a good starting point to demonstrate the direction that I am 
coming from.


http://opensource.atlassian.com/confluence/oss/display/GERONIMO/Clustering 



Covalent have kindly offered to host the meeting - it's just up to us 
to provide some content and some people :-)


Everyone with an interest in clustering is welcome. It will be very 
informal, structure depending largely on who turns up.


If you would like to come, we need to know ASAP so that we can find a 
window that will fit with the majority of your time constraints, so, 
please join the thread and let us know. We are thinking of some time 
Thursday afternoon or evening. If you cannot make a slot during this 
window, please mention it.



looking forward to seeing you all there,



Jules