On Mon, Jun 16, 2008 at 10:41, Chun Tian (binghe) <[EMAIL PROTECTED]> wrote:
> Hi,
>
>>
>> On Jun 15, 2008, at 8:25 PM, Chun Tian (binghe) wrote:
>>
>>> Hi, Joe
>>>
>>> GNU cfengine [1] may be helpful here.
>>
>>
>> no no no
>>
>> let the CIB synchronizes the files itself.
>> the most common reason for the files to get out-of-sync, is if some other
>> process (human or automated) modified the on-disk copy (which is what you're
>> proposing Joe does).
>
> Let me say clearly. When I start a new Heartbeat cluster, I must do
> following steps:
>
> 1) Make all nodes of my cluster have the same /etc/ha.d directory
> 2) Copy a "bootstrap" cib.xml into every nodes' /var/lib/heartbeat/crm
> directory

Not 100% required. Just make sure the first node you power up has it
and the rest will get it automatically.

>
> I do above two things by using GNU cfengine. When cluster is running up, I
> never let cfengine to modify the cib.xml (but can read them, save them)
>
> When cluster is running, I can add/remove resources by using cibadmin or
> hb_gui, and let Heartbeat to synchronize the cib.xml.
>
> But sometimes HA cluster may goes wrong (rarely, with any/unknown reason),

Then you _really_ should report this... that's the only way it's going
to get fixed ;-)
Can I ask what version you're running?

> and I have to do a "cold" retart, using following steps:
>
> 1) stop all heartbeat process on all nodes
> 2) using cfengine, copy one node's cib.xml back to my cfengine main server,
> to save these resources which I modified after HA's first run.
> 3) using cfengine, clean cib.xml (clean <cib>, <nodes>), and copy it back to
> all nodes again,
> 4) start "heartbeat" processes on every nodes.
>
>>
>>
>>
>>> I used it to distribute initial cib.xml and other HA config files, and
>>> collect cib.xml back to configuration server after I modified HA resources.
>>>
>>> By using cfengine's "editfiles" facility I can clean it's <nodes> and
>>> <cib> labels when collect back:
>>>
>>> editfiles:
>>>   pull_upload_cib.afs_1::
>>>     { $(cfinput)/files/heartbeat/cib.xml
>>>       ReplaceFirst "<cib .*>" With "<cib>"
>>>       LocateLineMatching "^ +<nodes>$"
>>>       DeleteToLineMatching "^ +</nodes>$"
>>>       ReplaceFirst "</nodes>" With "<nodes/>"
>>>     }
>>
>> btw. if you're going to do this - you need to at least remove the .sig*
>> files in the same directory.
>>
>> you could also just as easily remove the configuration from every node
>> except the one you start first.
>
> Yes, I do remove the .sig* files, and clean/backup heartbeat log files. What
> I showed is just a small piece of my big cfengine config.

ok - just making sure

> I think a good feature of Heartbeat 2.x is that one can modify HA config
> after it runs up. And a bad feature for Unix system administrator is that I
> must do something to save the live config (cib.xml, mainly), when something
> goes wrong and I have to start the cluster again.

Well it's always saved to disk - and if there is at least one
surviving node, then any node joining the cluster will receive a copy
of the current configuration.

So I'm not completely sure what you're talking about here - unless you
mean when you have to populate the configuration when you create a
cluster from scratch.
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
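
For anyone following along without cfengine: the cleanup described in the quoted "cold restart" steps (strip the attributes from <cib>, empty <nodes>, remove the .sig* files) could be sketched in Python roughly as below. This is only an illustration under assumptions, not what either poster actually runs: the function names clean_cib and remove_sig_files are made up here, the "*.sig*" glob is a guess based on the ".sig* files" wording above, and it assumes heartbeat is already stopped on every node, since the on-disk copy should not be edited while the cluster is running.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def clean_cib(xml_text: str) -> str:
    """Return cib.xml text with the <cib> attributes dropped and <nodes> emptied.

    Hypothetical helper: mirrors the quoted editfiles rules, but works on the
    parsed XML tree instead of matching lines.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "cib":
        raise ValueError("expected <cib> as the root element")
    # Equivalent of: ReplaceFirst "<cib .*>" With "<cib>"
    root.attrib.clear()
    for nodes in root.iter("nodes"):
        # Drop every node entry but keep the (now empty) <nodes/> element.
        for child in list(nodes):
            nodes.remove(child)
        nodes.text = None
    return ET.tostring(root, encoding="unicode")


def remove_sig_files(crm_dir: Path) -> list:
    """Delete the .sig* files in crm_dir and return the names that were removed.

    Assumption: the signature files match "*.sig*" and sit next to cib.xml.
    """
    removed = []
    for sig in sorted(crm_dir.glob("*.sig*")):
        sig.unlink()
        removed.append(sig.name)
    return removed
```

Usage would be something like reading /var/lib/heartbeat/crm/cib.xml, writing clean_cib(text) back, and then calling remove_sig_files on that directory. As said above, it is usually simpler to leave the configuration on only the first node you start and let the others receive it.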