OK, forget that: these 2 databases were using InnoDB, which does not use
.MYD and .MYI files. Sorry for the mix-up...
Please accept my apologies...
Rob Morin
Dido Internet Inc.
Montreal,Canada
http://www.dido.ca
514-990-4444
Rob Morin wrote:
Hehehe... so I changed the config to bcast eth0 eth1
and did a /etc/init.d/heartbeat reload,
not realizing this initiates a restart of heartbeat. I
thought it would just reload the configs, and think, well hey, I have
to go to another eth now... but it brought down the primary and the
secondary took over flawlessly without my knowing. Then an hour
later I simply brought heartbeat down and then up on the primary and
everything came back. Wow, I was impressed that I got no phone
calls at all from clients! :)
However, I noticed one thing that can be dangerous: mysql seemed
to have lost a few files in its /var/www/mysql?? When the secondary
took over, the freshly mounted filesystem was missing some .MYD files for
some tables. Only 2 sites were affected, in a minor way,
but what happened to these files? When I brought back the primary,
these files were still not there (this is normal, I guess); however, the
errors for the sites disappeared even though the files were still not there...???
Any ideas or suggestions?
Thanks again for all your help... it was exciting to see it work
unexpectedly... :)
P.S. My haresources file, in case you were wondering. I thought it
might be due to startup, where the database files are not there (not
mounted) but mysql starts...
joe IPaddr::xx.xx.xx.150 drbddisk::mail drbddisk::web \
Filesystem::/dev/drbd0::/var/mail/virtual::ext3::defaults \
Filesystem::/dev/drbd1::/var/www::ext3::defaults \
postfix courier-authdaemon courier-pop courier-imap mysql apache2 proftpd
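
For reference, heartbeat starts the resources on a haresources line from left to right (and stops them right to left), so with the line above both Filesystem mounts should come up before mysql is started. A minimal Python sketch to sanity-check that ordering follows; the helper names parse_resources and starts_before are made up for illustration and are not part of heartbeat.

```python
# Sketch only: checks resource order on a heartbeat v1 haresources line.
# Assumes resources start left to right; helper names are hypothetical.

HARESOURCES = r"""joe IPaddr::xx.xx.xx.150 drbddisk::mail drbddisk::web \
Filesystem::/dev/drbd0::/var/mail/virtual::ext3::defaults \
Filesystem::/dev/drbd1::/var/www::ext3::defaults \
postfix courier-authdaemon courier-pop courier-imap mysql apache2 proftpd"""


def parse_resources(line):
    """Join backslash continuations and return (node, [resources])."""
    joined = line.replace("\\\n", " ")
    fields = joined.split()
    return fields[0], fields[1:]


def starts_before(resources, first_prefix, second_prefix):
    """True if the first match of first_prefix precedes that of second_prefix."""
    index = {}
    for pos, res in enumerate(resources):
        for prefix in (first_prefix, second_prefix):
            if res.startswith(prefix) and prefix not in index:
                index[prefix] = pos
    return (first_prefix in index and second_prefix in index
            and index[first_prefix] < index[second_prefix])


node, resources = parse_resources(HARESOURCES)
print(node, starts_before(resources, "Filesystem::/dev/drbd1", "mysql"))
```

If the check holds, a missing mount at mysql startup is unlikely to come from the ordering in haresources itself.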
Rob Morin
Madd Sauer wrote:
Hello,
On Thu, May 08, 2008 at 08:53:39AM -0400, Rob Morin wrote:
Actually another question....
I would simply add eth1 to the heartbeat ha.cf then? And what's the
difference between using mcast vs bcast? I am not sure I understand this.
mcast = multicast
if your router supports multicast routing, these packets are routed.
bcast = broadcast
broadcasts will NEVER be routed.
ucast = unicast
that's my choice; broadcasts are mostly trash on the net (imho) and I
always use non-bcast if I can. unicast is also routed traffic.
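
To make the three terms concrete, here is a small Python sketch (not heartbeat code) that classifies an IPv4 address as multicast, broadcast or unicast using the standard ipaddress module. The 192.168.5.0/24 network and the sample addresses are made-up examples, and classify is a hypothetical helper name.

```python
# Sketch only: classify an IPv4 address as multicast, broadcast or unicast,
# matching the three kinds of traffic behind mcast, bcast and ucast.
import ipaddress


def classify(addr, network):
    """Return 'multicast', 'broadcast' or 'unicast' for addr on network."""
    ip = ipaddress.ip_address(addr)
    net = ipaddress.ip_network(network)
    if ip.is_multicast:  # 224.0.0.0/4 for IPv4
        return "multicast"
    if ip == net.broadcast_address or addr == "255.255.255.255":
        return "broadcast"
    return "unicast"


# The network and addresses below are assumptions for illustration.
for addr in ("225.0.0.1", "192.168.5.255", "192.168.5.2"):
    print(addr, classify(addr, "192.168.5.0/24"))
```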
Madd
Thanks a bunch
:)
Rob Morin
Dominik Klein wrote:
Rob Morin wrote:
I have not seen my original email get to the list yet... but after
looking through the logs I see this on each node...
See below for log excerpts...
My test involved bringing down eth0 only (heartbeat & replication).
Should I have also brought down eth1, the public side of joe (primary)?
My conf file is...
logfacility daemon # This is deprecated
keepalive 2 # Interval between heartbeat (HB) packets.
deadtime 60 # How quickly HB determines a dead node.
warntime 5 # Time HB will issue a late HB.
initdead 120 # Time delay needed by HB to report a dead node.
udpport 694 # UDP port HB uses to communicate between nodes.
#ping 192.168.5.1 # Ping VMware Server host to simulate network resource.
bcast eth0
You only use one connection for heartbeat communication. That is a
configuration error.
As you unplugged that interface for testing, you forced a
split-brain situation. Read http://www.linux-ha.org/SplitBrain
A dual split brain, so to speak: your drbd replication also runs
over this link, so not only does heartbeat lose its connection,
drbd does too. In a standard setup, a disconnected secondary drbd
device can be promoted regardless of the peer's drbd state.
You might want to read about dopd, too:
http://www.drbd.org/users-guide/s-heartbeat-dopd.html
It can prevent drbd split brain, but you need more than one network
connection anyway.
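
As a rough illustration of the point about redundant links, the following Python sketch counts the distinct devices that an ha.cf text declares for heartbeat traffic. It assumes that a bcast line may list several interfaces and that each mcast, ucast or serial line names its device as the first argument; comm_devices is a hypothetical helper, not a heartbeat tool.

```python
# Sketch only: list the distinct devices declared for heartbeat traffic.
# Assumes "bcast dev [dev ...]" and "mcast|ucast|serial dev ..." line layouts.

MEDIA = ("bcast", "mcast", "ucast", "serial")


def comm_devices(ha_cf_text):
    """Return the sorted distinct devices used by media directives."""
    devices = set()
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) < 2 or fields[0] not in MEDIA:
            continue
        if fields[0] == "bcast":
            devices.update(fields[1:])  # bcast may name several interfaces
        else:
            devices.add(fields[1])
    return sorted(devices)


original = "keepalive 2\nbcast eth0\n#serial /dev/ttyS0\n"
fixed = "keepalive 2\nbcast eth0 eth1\n"

for name, text in (("original", original), ("fixed", fixed)):
    devices = comm_devices(text)
    status = "ok" if len(devices) >= 2 else "single point of failure"
    print(name, devices, status)
```

Counting devices rather than lines is a deliberate simplification: two ucast lines on the same interface still share one physical link.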
#baud 115200
#serial /dev/ttyS0 # Which interface to use for HB packets.
coredumps true
auto_failback on # Auto promotion of primary node upon return to cluster.
Your comment answers your later question on what will happen when a
rebooted (stonith'd) node rejoins the cluster.
Regards
Dominik
node joe # Node name must be same as uname -n.
node stewie # Node name must be same as uname -n.
###
###
respawn hacluster /usr/lib/heartbeat/ipfail
# Specifies which programs to run at startup
# Do not use the below unless you use the /var/lib/heartbeat/crm/cib/xml config file instead
#crm on
use_logd yes # Use system logging.
logfile /var/log/hb.log # Heartbeat logfile.
debugfile /var/log/heartbeat-debug.log # Debugging logfile.
Primary
--------
May 6 23:04:44 joe heartbeat: [4342]: WARN: node stewie: is dead
May 6 23:04:44 joe heartbeat: [4342]: WARN: No STONITH device configured.
May 6 23:04:44 joe heartbeat: [4342]: WARN: Shared disks are not protected.
May 6 23:04:44 joe heartbeat: [4342]: info: Resources being acquired from stewie.
May 6 23:04:44 joe heartbeat: [4342]: info: Link stewie:eth0 dead.
May 6 23:04:44 joe heartbeat: [4249]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May 6 23:04:44 joe mach_down[4283]: [4328]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
May 6 23:04:44 joe heartbeat: [4342]: info: mach_down takeover complete.
May 6 23:04:44 joe heartbeat: [4342]: debug: StartNextRemoteRscReq(): child count 1
May 6 23:04:44 joe heartbeat: [4250]: info: Local Resource acquisition completed.
Secondary
-----------
May 6 23:04:46 stewie heartbeat: [21820]: info: Resources being acquired from joe.
May 6 23:04:46 stewie heartbeat: [21820]: info: Link joe:eth0 dead.
May 6 23:04:46 stewie heartbeat: [4946]: info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys stewie] to acquire.
May 6 23:04:46 stewie heartbeat: [21825]: ERROR: MSG[4] : [info=req_our_resources()]
May 6 23:05:10 stewie mach_down[4953]: [6063]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
May 6 23:05:10 stewie heartbeat: [21820]: info: mach_down takeover complete.
May 6 23:05:10 stewie heartbeat: [21825]: ERROR: MSG[2] : [info=mach_down]
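
The two excerpts show each node acquiring resources from the other within a couple of seconds, which is the split-brain signature discussed in this thread. A minimal Python sketch that flags this pattern in syslog lines is given below; the regular expression is written against the lines quoted here and is an assumption about the format, so other heartbeat versions may need a different pattern. The name mutual_takeovers is hypothetical.

```python
# Sketch only: flag a likely split brain when two nodes each log that they
# are acquiring resources from the other. The pattern is an assumption
# based on the syslog lines quoted in this thread.
import re

TAKEOVER = re.compile(
    r"^\w+\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+heartbeat: \[\d+\]: info: "
    r"Resources being acquired from (?P<peer>[^.\s]+)\."
)


def mutual_takeovers(lines):
    """Return sorted node pairs where each node took over from the other."""
    seen = set()
    for line in lines:
        match = TAKEOVER.match(line)
        if match:
            seen.add((match.group("host"), match.group("peer")))
    return sorted({tuple(sorted(pair)) for pair in seen
                   if (pair[1], pair[0]) in seen})


LOG = [
    "May 6 23:04:44 joe heartbeat: [4342]: info: Resources being acquired from stewie.",
    "May 6 23:04:44 joe heartbeat: [4342]: info: Link stewie:eth0 dead.",
    "May 6 23:04:46 stewie heartbeat: [21820]: info: Resources being acquired from joe.",
]

print(mutual_takeovers(LOG))
```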
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems