Michal wrote:
> On 28/09/10 14:05, Miles Fidelman wrote:
>> lrhorer wrote:
>>> If there is a better forum for this, let me know and I will post my
>>> questions there.
>>>
>>> I am building an application which needs to have high reliability.
>>> I have two essentially identical Linux servers which can host the
>>> application. Right now, I have the programs - a bash script and a C
>>> binary - running on one machine every minute in a cron job. I also
>>> have an rsync cron job running to synchronize the files on the
>>> standby machine so the data (and binaries, of course) will be
>>> identical. What I need to do is have the standby machine take over
>>> operations if the applications on the primary machine quit working,
>>> for whatever reason. Of course I can easily ping the primary to make
>>> sure the machine is up, but what is going to be my best bet for
>>> having the standby machine wake up and start running the apps every
>>> minute until such time as the primary comes back online? I'm wide
>>> open on how to implement. An external application would be great, or
>>> I could write either or both C or shell apps to have the two
>>> machines talk to one another.
>> It may be overkill, but take a look at Pacemaker and the Linux-HA
>> Project - http://www.linux-ha.org - it's specifically intended for
>> such applications.
>>
>> Also look at DRBD - www.drbd.org - which mirrors a disk (or
>> partition) in real time across two machines.
>>
>> The combination gives you automated fail-over capability.
>>
>> Now, if you want to get really fancy, you can run your application in
>> a virtual machine, and use Pacemaker and DRBD to fail over the entire
>> VM.
>>
>> Be warned, it takes a while to get all of these working properly -
>> both individually and in combination. You could also take a look at
>> ganeti - http://code.google.com/p/ganeti/ - which pulls a bunch of
>> the pieces together.
>>
> DRBD and so forth might be overkill, or it might not. If you think it
> is and are happy with the rsync, you can stick to that and use
> heartbeat to monitor the application. Easy to set up as well.

Thanks. I skimmed the intro, and I'm not sure heartbeat is really what
I need. These aren't server applications that run full-time on the
machine. They are relatively simple programs that only take a few
seconds to run, and then terminate (hopefully with a return value of
0), to be run again in 60 seconds. I don't need the inter-machine
service to cause something to happen on the standby system if the apps
are no longer running, or at least not immediately. Rather, I need the
standby machine to take over if:

1. The primary machine is no longer on the network (clearly the
   heartbeat app can do this).
2. The applications being run by cron fail to run, say, 5 or 6 times
   in a row.
3. One or more of the applications fails to terminate (hangs).
4. One or more of the applications terminates with other than a 0
   status.

#4 may be the trickiest, because one of the apps may be spawned as a
detached process from the other. Can the heartbeat app handle this? It
sounded to me like it will pass the notification to the standby system
a few milliseconds after one of the monitored processes ends. What I
need it to do is monitor the termination status of the programs.
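
For illustration only, here is a minimal bash sketch of how conditions 2
to 4 might be checked without any HA framework. It is not how heartbeat
works. Everything named in it is an assumption made for the example: the
function names run_and_record and should_take_over, the status file path,
and the thresholds are hypothetical. It also assumes GNU coreutils
timeout(1) is installed, that the status file reaches the standby through
the existing rsync job, and that both machines keep their clocks roughly
in sync.

```shell
#!/usr/bin/env bash
# Sketch only: all names, paths and thresholds here are assumptions.

STATUS_FILE="${STATUS_FILE:-/tmp/job.status}"  # hypothetical path, synced to the standby
JOB_TIMEOUT="${JOB_TIMEOUT:-45}"   # seconds before a run counts as hung
MAX_FAILS="${MAX_FAILS:-5}"        # consecutive bad runs before takeover
MAX_AGE="${MAX_AGE:-300}"          # seconds without any recorded run before takeover

# Primary side: run the given command under timeout(1) and record
# "<epoch seconds> <exit status> <consecutive failures>" in STATUS_FILE.
run_and_record() {
    local rc=0 fails=0
    timeout "$JOB_TIMEOUT" "$@" || rc=$?   # timeout(1) exits with 124 on a hang
    if [ -r "$STATUS_FILE" ]; then
        read -r _ _ fails < "$STATUS_FILE" || fails=0
    fi
    if [ "$rc" -eq 0 ]; then
        fails=0
    else
        fails=$((fails + 1))
    fi
    # Write to a temporary file and rename, so a reader never sees a partial line.
    printf '%s %s %s\n' "$(date +%s)" "$rc" "$fails" > "$STATUS_FILE.tmp"
    mv "$STATUS_FILE.tmp" "$STATUS_FILE"
    return "$rc"
}

# Standby side: return 0 if the standby should take over, 1 otherwise.
should_take_over() {
    local stamp="" rc="" fails="" now
    [ -r "$STATUS_FILE" ] || return 0              # nothing recorded at all
    read -r stamp rc fails < "$STATUS_FILE" || return 0
    [ -n "$stamp" ] || return 0                    # unreadable record
    now=$(date +%s)
    if [ $((now - stamp)) -gt "$MAX_AGE" ]; then   # no recent run recorded
        return 0
    fi
    if [ "${fails:-0}" -ge "$MAX_FAILS" ]; then    # repeated non-zero exits or hangs
        return 0
    fi
    return 1
}
```

On the primary, the cron job would call run_and_record around the
existing script; on the standby, a cron job would call should_take_over
each minute and run the apps itself only when it succeeds. A stale status
file also covers the case where the primary has dropped off the network,
since the rsync job stops delivering updates. A process that detaches
from its parent is not covered: the wrapper only sees the exit status of
the command it started, so the detached program would have to record its
own status in a similar way.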