Hi, I have this write-up regarding a Tomcat cluster I have to set up:

<snip>
Clustering Information: Two or more servers, each with a Tomcat instance installed. The cluster or proxy server will need to do load balancing.
Session replication between all the Tomcat instances on all the servers in the cluster. Each Tomcat instance should be able to handle the session of any user. Most likely we will need to use the DeltaManager. We need to make sure fail-over is transparent to the user and that sessions are replicated between the Tomcat instances.

The key thing is removing a server from the cluster and adding it back to the cluster without affecting any users. This feature is needed for our application deployment strategy: deploy a web application, test it with some users, then add the server back to the cluster.

Scenario: Tomcat 1, Tomcat 2 and Tomcat 3 are running in a cluster.
- Remove Tomcat 3 from the cluster while people are using the application.
- Access it directly with http://addressOfTomcat3:port.
- Access the management console and stop/remove our application.
- Deploy a new version of the application.
- Ask users to use an alternative link to that app. Let it run for some time.
- Add this instance back to the cluster in such a way that the new application is now deployed, or take the other two machines out of the cluster, deploy the new app on them while users seamlessly switch to Tomcat 3, then add Tomcat 1 and Tomcat 2 back to the cluster. Now all apps are in sync.
</snip>

The first three paragraphs are done. I have two machines with OS X 10.6.5, httpd 2.2.17, mod_jk 1.2.31 and Tomcat 7.0.10. One of the httpds is set up as a reverse proxy, so the httpds are load balanced. Each httpd has the mod_jk module, and that does the load balancing for the Tomcats. This part is working beautifully: sessions are replicated and fail-over works if I stop one Tomcat or the other, or if I stop the httpd that is not the reverse proxy.

My problems start with the requirement of removing a server from the cluster and adding it back without affecting the users. I know of just three ways to take a server out of the cluster (I assume "server" here means a Tomcat instance):
- shut that Tomcat down;
- comment out the <Cluster .../> element and let Tomcat re-read the config change, although I do not know how to do that; I do not see any "graceful" options;
- change its multicast address and, again, re-read the config change somehow.

Is there any command-line or web-based interface that can interact with the running cluster and take a cluster member out, then put it back later?

Then my problems are magnified by the described scenario. What is the right method to remove a Tomcat instance from a cluster? Obviously it cannot be shut down, because it still has to be accessible by other means. If I just stop the web application on one cluster member, it renders all the replicated sessions of that webapp on the other Tomcats unreachable via the reverse proxy; I have already tested that. For a successful fail-over, it looks like the heartbeat of the chosen Tomcat instance has to be stopped for the given cluster.

I can imagine that changing the multicast address and re-reading the newly modified server.xml would allow the webapp to continue on the other Tomcat instances of the cluster while, in the meantime, I refurbish the "taken out" instance with a new version of the webapp. I can also imagine that the old webapp would still run nicely on the remaining members of the cluster while the new webapp is accessed on the now standalone Tomcat instance. After the test is over, the instance with the new webapp can be added back to the cluster by changing its multicast address back to the cluster's, and the instances still serving the old webapp can be "taken out" by changing their multicast addresses in server.xml and re-reading the config change somehow.

Are my assumptions correct? Of course I will start testing them this afternoon, but all comments are welcome, even those telling me to read the FM, if they are chapter or paragraph specific.

My thinking is this:
- First I have to look at jkmanager to see if it can provide the demanded functionality.
- I will set up another reverse proxy for the httpds on the other httpd. Let's call it tctest.
- In both httpds, configure an additional JkMount for the test version of the webapp, but with a differently named load balancer, say lbtest. This point is fuzzy in my mind at the moment, because I would like the test deployment to have the same context path as production, but I guess that is impossible. So for production it will be something like "JkMount /examples/* lb" and for test "JkMount /tctest/examples/* lbtest". I have to figure out how to deploy into a subdirectory of $CATALINA_BASE/webapps/, in this case into $CATALINA_BASE/webapps/tctest/. This looks very convoluted to me, and I am afraid the developer would have to package the example webapp as tctest/example when creating the WAR file, so I am all ears for a better solution.
- Set up two additional Tomcat instances as a test cluster with different ports and a different multicast address. Modify the workers.properties file on both machines to incorporate the two additional Tomcat instances and the additional load-balancer worker.
- Instead of "taking out" a Tomcat instance from the cluster by mucking with its server.xml on every occasion, simply deploy the new version of the webapp on the test Tomcat cluster, and redirect the users to the other reverse proxy for access. The only change for them will be the host and the reverse-proxy indicator.
- When testing is done, simply redeploy the new app on the production Tomcat cluster.

What do you think?

Thanks ahead,
János
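P.S. In case it helps the discussion, here is roughly what I have in mind for the mod_jk side. This is only a sketch: the worker names, hosts and AJP ports are placeholders, not my real config, and I still have to verify that the jkstatus activation states behave the way I hope.

```properties
# --- httpd.conf (both httpds) ---
# JkMount /examples/*         lb
# JkMount /tctest/examples/*  lbtest
# JkMount /jkstatus           jkstatus

# --- workers.properties (sketch; names/hosts/ports are placeholders) ---
worker.list=lb,lbtest,jkstatus

# Status worker: serves a web page at /jkstatus where each balanced
# worker can be switched between Active / Disabled / Stopped at runtime.
# If that works as advertised, it may be the "take a member out without
# touching server.xml" interface I am looking for, at least on the
# mod_jk routing side.
worker.jkstatus.type=status

# Production balancer over the existing cluster members
worker.lb.type=lb
worker.lb.balance_workers=tomcat1,tomcat2

# Test balancer over the two extra instances
# (different AJP ports, different cluster multicast address)
worker.lbtest.type=lb
worker.lbtest.balance_workers=tomcat3,tomcat4

worker.tomcat1.type=ajp13
worker.tomcat1.host=macA
worker.tomcat1.port=8009

worker.tomcat2.type=ajp13
worker.tomcat2.host=macB
worker.tomcat2.port=8009

worker.tomcat3.type=ajp13
worker.tomcat3.host=macA
worker.tomcat3.port=8109

worker.tomcat4.type=ajp13
worker.tomcat4.host=macB
worker.tomcat4.port=8109
```

On the context-path question: if I understand the Tomcat 7 docs correctly, a WAR named tctest#examples.war deploys under the multi-level context path /tctest/examples, which might spare the developer from repackaging the app under a subdirectory, but I have to verify that too.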