Here's my problem: a Tomcat server participating in a farm-deploy scheme goes off-line...
For this particular sticky situation, say the connection to the rest of the cluster was interrupted when a network cable was knocked loose. While the Tomcat server is off-line, a parallel deployment takes place, and the off-line server doesn't get the new artifact. When it comes back on-line, the front-end load balancer (in my case, HAProxy) doesn't know it's running an older webapp and happily begins passing traffic to it.

In a farm deployment scenario, the master node announces to the cluster that a new artifact is available, and the clustered Tomcats then retrieve and deploy it. The problem is that the server that went off-line never heard the announcement. There doesn't seem to be a mechanism to re-announce, or to announce at regular intervals.

This seems like a real weakness in the scheme, which makes me think I'm missing something obvious. Apache's docs on the subject amount to barely a paragraph, and the other howtos out there don't address what happens when a server goes off-line temporarily.

My question is for other Tomcat admins who are handling this scenario gracefully: what are you doing about it?
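To make the failure concrete, here is a toy model of the scenario as I understand it. It is plain Python and uses no Tomcat or HAProxy APIs; the `Node` and `Cluster` classes, the node names, and the version strings are all made up for illustration. The only behavior it assumes is the one described above: an announcement is delivered once, and only to members that are reachable at that moment.

```python
class Node:
    """Hypothetical stand-in for one clustered Tomcat instance."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.version = None  # version of the webapp currently deployed


class Cluster:
    """Hypothetical stand-in for the farm: announces once, never repeats."""

    def __init__(self, nodes):
        self.nodes = nodes

    def announce(self, version):
        # One-shot announcement: only nodes reachable right now hear it.
        for node in self.nodes:
            if node.online:
                node.version = version


def stale_nodes(cluster, expected_version):
    """Names of on-line nodes that are serving something other than expected."""
    return [
        node.name
        for node in cluster.nodes
        if node.online and node.version != expected_version
    ]


def run_scenario():
    a, b = Node("tomcat-a"), Node("tomcat-b")
    cluster = Cluster([a, b])

    cluster.announce("v1")   # initial deployment reaches both nodes
    b.online = False         # network cable knocked loose
    cluster.announce("v2")   # new deployment while tomcat-b is away
    b.online = True          # cable plugged back in, traffic resumes

    return stale_nodes(cluster, "v2")


if __name__ == "__main__":
    print(run_scenario())
```

In this model `tomcat-b` rejoins still on `v1`, and nothing in the announce path ever corrects it. That is the gap I'm asking about.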