Re: Dublin - Clustering get-together.... - Report
Thanks, Greg,

The more views on what was actually discussed and decided, the better. I hope to have some constructive thoughts on the session API and a proposed clustering API shortly, time allowing.

Jules

--
"Open Source is a self-assembling organism. You dangle a piece of string into a super-saturated solution and a whole operating-system crystallises out around it."

/**
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 * www.coredevelopers.net
 *
 * Open Source Training & Support.
 **/
Re: Dublin - Clustering get-together.... - Report
All,

I am going to give my own short report on the meeting. I'm not intending to dissent from Jules' report, simply to give a short version that highlights the important issues (for me).

We covered lots of clustering requirements and issues, which made us all realize how big and challenging a complete clustering solution is. There are few easy wins here.

Views on what we are aiming for ranged from "we need a world-class solution for all aspects of clustering" to "we need something that works for the tick-box aspects ASAP". Many thought both.

A pluggable API approach to allow multiple implementations was widely accepted as the best way to go.

The session API is the current focus for putting clustering in G. Some do not think it should be... but for better or worse it is.

Views on the current session API ranged from "it is pretty good for the job" through "it is the best we can expect to do" and "it is adequate for what it is, but we need more" to "it is totally unsatisfactory".

It was apparent that it was difficult to discuss other aspects of clustering (e.g. management/configuration) without the conversation returning to the suitability or otherwise of the session API.

Matt floated the idea that, in order to move on, we have a time-boxed period of review on the session API. Critics of the API have the next few weeks to make the case to extend, refactor or replace this API, after which we should try to push through to working implementations (with the normal amount of agile refactoring etc.). This was accepted as the key outcome of the meeting.

Some secondary points:

We frequently blurred and then clarified the somewhat conflicting requirements of clustering for availability and clustering for scalability. It is very easy to miscommunicate when talking clustering!
It was pointed out that even with a G clustering API, we will have to work within the limitations of the implementations that we plug into it, and we cannot dictate G implementations of things like heartbeats and cluster discovery. The counterpoint to this was that it would be good if we could use a standard API to extract cluster metadata from an implementation that did do heartbeats and discovery, for reuse by one that did not (e.g. a JNDI impl could get its list of known nodes from the HTTP session impl).

cheers
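To make the counterpoint above a little more concrete, here is a minimal Java sketch of one implementation borrowing another's membership view. It assumes a hypothetical read-only interface; none of the names below (ClusterMembership, HeartbeatSessionCluster, JndiService) come from Geronimo or from any of the components discussed, and a real version would need thread safety and change notifications.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo only: these types are illustrative, not Geronimo APIs.
class MembershipDemo {
    public static void main(String[] args) {
        HeartbeatSessionCluster sessions = new HeartbeatSessionCluster();
        sessions.heartbeatReceived("node-a");
        sessions.heartbeatReceived("node-b");

        // The JNDI service does no discovery of its own; it borrows the view.
        JndiService jndi = new JndiService(sessions);
        System.out.println(jndi.replicationTargets());

        sessions.nodeTimedOut("node-a");
        System.out.println(jndi.replicationTargets());
    }
}

/** Hypothetical read-only view of which nodes an implementation believes are alive. */
interface ClusterMembership {
    Set<String> knownNodes();
}

/** Stand-in for an HTTP session implementation that tracks peers via heartbeats. */
class HeartbeatSessionCluster implements ClusterMembership {
    private final Set<String> live = new LinkedHashSet<>();

    void heartbeatReceived(String node) { live.add(node); }

    void nodeTimedOut(String node) { live.remove(node); }

    @Override
    public Set<String> knownNodes() {
        // Hand out a snapshot so callers cannot mutate the internal state.
        return Collections.unmodifiableSet(new LinkedHashSet<>(live));
    }
}

/** Stand-in for a service with no heartbeat or discovery mechanism of its own. */
class JndiService {
    private final ClusterMembership membership;

    JndiService(ClusterMembership membership) { this.membership = membership; }

    Set<String> replicationTargets() { return membership.knownNodes(); }
}
```

The only design choice worth noting is that the consumer depends on the small interface rather than on the session implementation itself, so a different membership source could be plugged in without touching it.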
Dublin - Clustering get-together.... - Report
Here is the promised report on the Geronimo Clustering get-together held on Thursday (6:00pm-8:00pm) in Dublin:

Attendees:

In the room: Aaron Mulder, Alan Cabrera, Filip Hanik, Greg Wilkins, Jan Bartel, Jeremy Boynes, Jules Gosnell, Mark Brewer, Matt Hogstrom, Paul Buck, Phil Robinson, Rainer Jung, Winston Damarillo

On the phone: Bill Dudney, Dain Sundstrom, Jeff Genender, Rajith Attapattu

We started by enumerating the areas that were felt to require some form of clustering support (web, ejb, jndi, jms, deployment, management, monitoring, pojo, db). The idea was to prioritise the outstanding issues and thus scope the work to be done. We used the first cut of this document as a basis for this walkthrough:

http://cwiki.apache.org/GMOxDOC10/clustering.html

I shall be bringing this up to date with the meeting's findings as soon as I have arranged write access with the wiki's owners.

All of these areas were covered and priorities were assigned thus:

1 - Web
2 - EJB/JNDI
3 - Management/Monitoring/Provisioning
4 - .. remaining issues

We then looked at the software components available to us (ActiveMQ, ActiveCluster, ActiveSpace, WADI, Tribes, Kache, Tomcat Clustering) - also listed in the architecture document (to be updated with new software components ASAP).

A number of interesting and useful points came up during the discussion:

The 'fastest to market' clustering solution is not necessarily the same as the 'optimal' clustering solution.

Multiple clustering solutions must be accommodated - APIs help abstract away from implementations, facilitating this.

Re: Web - although there seem to be two independent routes to a session clustering API (the containers' own native session management API vs a common Geronimo API), both must be made accessible since both native (e.g. Tomcat clustering and WADI) and Geronimo (Kache) solutions must be pluggable. Furthermore, the adaptor from native API to Geronimo API requires the native API to be exposed in order to be plugged in in the first place.
Re: EJB - stateless session beans, whilst stateless, may still be facading more 'stateful' components, such as large trees of Entity beans that may be expensive to throw away/reload - so it may be useful to treat SLSBs as SFSBs as far as the session management API is concerned - also providing some form of session affinity rather than round-robin load-balancing in the client stub.

Re: EJB - it may not be a requirement to add further clustered support for Entity beans (i.e. clustered caches with distributed invalidations etc).

Re: Management/Monitoring - tooling will hook in via standard and possibly extended APIs made available via JMX. GBeans managing e.g. native clustering solutions might adapt and make available equivalent functionality.

Re: farmed deployment - It was noted that some form of pull/push deployment support was available. Perhaps Aaron's plugin architecture?

Re: JMS - AMQ clustering is not yet integrated with Geronimo - it is awaiting decisions on what the integration should look like.

Re: POJO clustering and JCache - whilst interesting, it is probably better if users bring their own clustered caches to Geronimo and plug them in, rather than Geronimo mandating the use of a single solution.

Re: Database clustering - This was also marked as probably out of scope - although there was some discussion about the Sequoia project.

The support of 3rd party clustered components means that a number of different implementations of a cluster may be in operation within any given node at any given time. These different implementations may each be built from the ground up and share no code. Questions around this were e.g.:

- can we expose different components' ideas about cluster membership through a common API (i.e. compare which nodes e.g. ProductA thinks are running against e.g. what Geronimo thinks are running)?
- should this preclude the usage of a common clustering layer and therefore API in areas where we are building clustered Geronimo services from the ground up - Sessions, Monitoring/Management, JNDI...?

The Geronimo Session API came in for particular discussion. This seemed to hinge around the fact that its aim might be to hide clustering issues from client containers (although it actually exposes a session's location and thus clustering issues at this level), but that it would be useful if a lower-level clustering API, on which other clustered Geronimo services might be constructed, were also exposed. There were a number of differing and strongly held positions - including the view that trying to define what a cluster actually was, was much too hard a task in the first place!

It was finally agreed that there should be a 2 week period during which modifications and alternate session management APIs might be floated on the dev list. The need for a lower-level clustering API and potential candidates for this role might also be debated during this time. Immediately after this period, a vote would decide the way based on
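
As an illustration of the session-affinity point raised under EJB above, here is a rough Java sketch of a client-side selector that pins each session to one node and only falls back to round-robin for sessions it has not seen. StickySelector and its methods are hypothetical names invented for this example, not part of any Geronimo, OpenEJB or WADI API, and failover when a pinned node dies is deliberately left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo only: shows affinity versus plain round-robin selection.
class AffinityDemo {
    public static void main(String[] args) {
        StickySelector selector =
                new StickySelector(List.of("node-a", "node-b", "node-c"));
        System.out.println(selector.select("session-1"));
        System.out.println(selector.select("session-2"));
        // A repeat call for session-1 returns the node it was first given.
        System.out.println(selector.select("session-1"));
    }
}

/** Hypothetical client-stub helper: round-robin for new sessions, sticky afterwards. */
class StickySelector {
    private final List<String> nodes;
    private final Map<String, String> pinned = new HashMap<>();
    private int next = 0;

    StickySelector(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = List.copyOf(nodes);
    }

    String select(String sessionId) {
        String node = pinned.get(sessionId);
        if (node == null) {
            // First request for this session: take the next node in rotation.
            node = nodes.get(next);
            next = (next + 1) % nodes.size();
            pinned.put(sessionId, node);
        }
        return node;
    }
}
```

Treating an SLSB like an SFSB in this sense just means giving its client stub a stable key to pass as the session id, so that repeated calls land on the node that already holds the expensive state.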
Re: Dublin - Clustering get-together.... - Report...
Guys,

They are taking down the network here in the next 5 mins or so and I am only halfway through the write-up, so I am going to ask you all to bear with me until I get home tomorrow.

Sorry to keep you all waiting,

Jules

Jules Gosnell wrote:

It looks like I am going to make it to Dublin after all.

I think this will be a good chance for anyone interested in Geronimo clustering to get together and talk about scope, architecture, resourcing, roadmap etc - in short, anything that we need to get the beast rolled out.

Here is a link to a clustering architecture overview that I put together some time ago - it's a little out of date. It doesn't mention Lingo, which should be very useful for cluster management, the proposed Session API or recent developments with geronimo-cache etc... but it is a good starting point to demonstrate the direction that I am coming from.

http://opensource.atlassian.com/confluence/oss/display/GERONIMO/Clustering

Covalent have kindly offered to host the meeting - it's just up to us to provide some content and some people :-)

Everyone with an interest in clustering is welcome. It will be very informal, structure depending largely on who turns up.

If you would like to come, we need to know ASAP so that we can find a window that will fit with the majority of your time constraints, so please join the thread and let us know. We are thinking of some time Thursday afternoon or evening. If you cannot make a slot during this window, please mention it.

looking forward to seeing you all there,

Jules