Hi,
On Jun 15, 2008, at 8:25 PM, Chun Tian (binghe) wrote:
Hi, Joe
GNU cfengine [1] may be helpful here.
no no no
let the CIB synchronize the files itself.
the most common reason for the files to get out of sync is that some
other process (human or automated) modified the on-disk copy (which
is what you're proposing Joe does).
Let me say it clearly. When I start a new Heartbeat cluster, I must
do the following steps:
1) Make sure all nodes of my cluster have the same /etc/ha.d directory
2) Copy a "bootstrap" cib.xml into every node's /var/lib/heartbeat/crm
directory
I do the above two things with GNU cfengine. Once the cluster is up
and running, I never let cfengine modify cib.xml (though it can read
and save it).
While the cluster is running, I can add/remove resources using cibadmin
or hb_gui, and let Heartbeat synchronize cib.xml.
But sometimes the HA cluster may go wrong (rarely, for some unknown
reason), and I have to do a "cold" restart, using the following steps:
1) stop all heartbeat processes on all nodes
2) using cfengine, copy one node's cib.xml back to my cfengine main
server, to save the resources which I modified after HA's first run
3) using cfengine, clean cib.xml (clean <cib>, <nodes>), and copy it
back to all nodes again
4) start the "heartbeat" processes on every node
I used it to distribute the initial cib.xml and other HA config files,
and to collect cib.xml back to the configuration server after I
modified HA resources.
By using cfengine's "editfiles" facility I can clean its <nodes>
and <cib> tags when collecting it back:
editfiles:
  pull_upload_cib.afs_1::
    { $(cfinput)/files/heartbeat/cib.xml
      ReplaceFirst "<cib .*>" With "<cib>"
      LocateLineMatching "^ +<nodes>$"
      DeleteToLineMatching "^ +</nodes>$"
      ReplaceFirst "</nodes>" With "<nodes/>"
    }
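For readers who do not use cfengine, the same cleanup can be sketched in Python. This is only an illustration of what the editfiles stanza above is meant to achieve, not part of my cfengine config: the function name clean_cib is made up, and the regular expressions are my approximation of the cfengine patterns (they assume the <cib ...> opening tag has attributes and that <nodes> ... </nodes> appears once).

```python
import re


def clean_cib(xml_text: str) -> str:
    """Return cib.xml text with a bare <cib> tag and an empty <nodes/> section.

    Hypothetical helper: approximates the cfengine editfiles stanza,
    it is not the cfengine implementation itself.
    """
    # Like ReplaceFirst "<cib .*>" With "<cib>": drop the attributes
    # from the first opening <cib ...> tag only.
    cleaned = re.sub(r"<cib\s[^>]*>", "<cib>", xml_text, count=1)
    # Like the Locate/DeleteTo/ReplaceFirst trio: collapse the whole
    # <nodes>...</nodes> section into a single <nodes/> element.
    cleaned = re.sub(
        r"<nodes>.*?</nodes>", "<nodes/>", cleaned, count=1, flags=re.DOTALL
    )
    return cleaned
```

A file that is already clean (plain <cib>, empty <nodes/>) passes through unchanged, so running it twice is harmless.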
btw. if you're going to do this - you need to at least remove
the .sig* files in the same directory.
you could also just as easily remove the configuration from every
node except the one you start first.
Yes, I do remove the .sig* files, and I clean up and back up the
heartbeat log files. What I showed is just a small piece of my big
cfengine config.
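As a rough illustration of the .sig* removal step (again outside cfengine), here is a minimal Python sketch. The function name remove_sig_files is hypothetical, the directory is whatever the caller passes in, and the glob pattern "*.sig*" is simply my reading of "the .sig* files"; it should only be run while heartbeat is stopped.

```python
from pathlib import Path
from typing import List


def remove_sig_files(crm_dir: str) -> List[str]:
    """Delete files matching *.sig* directly inside crm_dir.

    Hypothetical helper: returns the names of the files it removed,
    sorted, so the caller can log them.
    """
    removed = []
    for path in sorted(Path(crm_dir).glob("*.sig*")):
        # Only touch regular files; leave anything else alone.
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

It would be called with the CRM directory of each node, for example remove_sig_files("/var/lib/heartbeat/crm"), before the cleaned cib.xml is copied back.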
I think a good feature of Heartbeat 2.x is that one can modify the HA
config while the cluster is running. A bad feature, for a Unix system
administrator, is that I must do something to save the live config
(mainly cib.xml) for when something goes wrong and I have to start
the cluster again.
Regards,
Chun Tian (binghe)
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems