Am 22.11.11 20:35, schrieb Florian Haas:
> On 11/22/11 20:18, Willi Fehler wrote:
>> Hi,
>>
>> I'm trying to setup a database cluster with MySQL/Redis. My problem is,
>> the failover is working if I shutdown/reboot one node.
> I take it that _that_ part isn't really a problem. :)
>
>> If I shutdown the network on one node (ifdown eth0 or ifdown eth1), the
>> failover isn't working.
> No failover would be expected there. So what's "not working" here?
>
>> If I shutdown eth0 and eth1 the failover is working
> If you shut down both your cluster communications links and you failed
> to configure fencing of any kind, then you don't get any "working"
> failover. Instead, you'll have your service running on both nodes.
>
>> but if I reboot the node without network access, I get a split-brain.
> No, you get split brain straight away, it's just that it's not detected
> until you reboot (and DRBD reconnects).
>
>> I hope you can help me.
> You ignored this part of the DRBD User's Guide, and you really shouldn't
> have:
>
> http://www.drbd.org/users-guide-8.3/s-pacemaker-fencing.html
>
> A few other issues:
>
>> My current setup:
>> 2 nodes with CentOS-6.0
>> Pacemaker
> Suggest to go to Pacemaker 1.1.5 instead of using the stock 1.1.2 that
> ships with 6.0.
>
>> OpenAIS
>> Corosync
> Strongly recommend to go with at least Corosync 1.4.1 if you're using
> RRP (which you are).
>
>> DRBD
> I'll assume that that's DRBD 8.3.x as opposed to 8.4.0.
>
>> MySQL
>> Redis
>>
>> crm(live)configure# primitive mysqld lsb:mysql \
>>     op monitor interval="15s"
> Strongly suggest to use ocf:heartbeat:mysql here instead.
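A resource definition using that agent could look roughly like the sketch below; the binary, config, datadir, and pid values are assumptions for a stock CentOS MySQL install and need to match the actual system:

    crm(live)configure# primitive mysqld ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
            datadir="/var/lib/mysql" pid="/var/run/mysqld/mysqld.pid" \
        op start timeout="120s" \
        op stop timeout="120s" \
        op monitor interval="15s" timeout="30s"

Unlike the LSB init script, the OCF agent verifies during each monitor that mysqld actually answers, rather than only reporting the init script's status.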
>
>> crm(live)configure# primitive redisd lsb:redis \
>>     op monitor interval="15s"
>> crm(live)configure# group mysql_redis fs_mysql ip_mysql_redis mysqld \
>>     fs_redis redisd
>> crm(live)configure# location cli-prefer-mysql_redis mysql_redis \
>>     rule $id="cli-prefer-rule-mysql_redis" inf: #uname
> That looks like a leftover constraint set by "crm resource move";
> consider doing "crm resource unmove".
>
>> eq ESCPDB-HA-01v.escapio.local
> .local is a really poor choice for a domain name, unless you're running
> a DNS-free environment and everything resolves via mDNS.
>
>> # Please read the corosync.conf.5 manual page
>> compatibility: whitetank
>>
>> totem {
>>     version: 2
>>     secauth: off
>>     threads: 0
>>     rrp_mode: passive
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 10.246.214.0
>>         mcastaddr: 225.94.1.1
>>         mcastport: 5404
>>     }
>>     interface {
>>         ringnumber: 1
>>         bindnetaddr: 10.10.10.0
>>         mcastaddr: 225.94.2.1
>>         mcastport: 5406
>>     }
>> }
>>
>> logging {
>>     fileline: off
>>     to_stderr: no
>>     to_logfile: yes
>>     to_syslog: yes
>>     logfile: /var/log/corosync.log
>>     debug: off
>>     timestamp: on
>>     logger_subsys {
>>         subsys: AMF
>>         debug: off
>>     }
>> }
>>
>> amf {
>>     mode: disabled
>> }
>>
>> service {
>>     ver: 0
>>     name: pacemaker
>>     use_mgmtd: yes
>> }
> Strongly suggest to use ver:1 and pacemakerd, and to disable mgmtd.
>
> Hope this is useful.
>
> Cheers,
> Florian

Hi Florian,
thank you so much for your feedback. My goal is: if cluster communication
over eth0 fails on the active node, a failover should be triggered by
Pacemaker, because if eth0 is down the application can no longer "talk"
to the cluster (service IP).

Could you please let me know where I can find newer versions of
Pacemaker/Corosync? I don't know of any repository that carries them;
maybe I have to compile them myself.

Regards - Willi
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
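The failover-on-lost-connectivity goal described above is commonly implemented in Pacemaker with a cloned ocf:pacemaker:ping resource plus a location constraint, rather than by reacting to the interface itself. A sketch, where the gateway address 10.246.214.1 and the resource names are placeholders to be adapted:

    crm(live)configure# primitive p_ping ocf:pacemaker:ping \
        params host_list="10.246.214.1" multiplier="1000" \
        op monitor interval="10s"
    crm(live)configure# clone cl_ping p_ping
    crm(live)configure# location l_mysql_redis_connected mysql_redis \
        rule -inf: not_defined pingd or pingd lte 0

A node that cannot reach the listed host gets a -INFINITY score for the group, so Pacemaker moves the resources to the surviving node; note that this complements, and does not replace, the fencing configuration recommended earlier.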