Am 22.11.11 20:35, schrieb Florian Haas:
> On 11/22/11 20:18, Willi Fehler wrote:
>> Hi,
>>
>> I'm trying to setup a database cluster with MySQL/Redis. My problem is,
>> the failover is working if I shutdown/reboot one node.
> I take it that _that_ part isn't really a problem. :)
>
>> If I shutdown the network on one node (ifdown eth0 or ifdown eth1), the
>> failover isn't working.
> No failover would be expected there. So what's "not working" here?
>
>> If I shutdown eth0 and eth1 the failover is working
> If you shut down both your cluster communications links and you failed
> to configure fencing of any kind, then you don't get any "working"
> failover. Instead, you'll have your service running on both nodes.
>
>> but if I reboot the node without network access, I get a split-brain.
> No, you get split brain straight away, it's just that it's not detected
> until you reboot (and DRBD reconnects).
>
>> I hope you can help me.
> You ignored this part of the DRBD User's Guide, and you really shouldn't
> have:
>
> http://www.drbd.org/users-guide-8.3/s-pacemaker-fencing.html
>
> A few other issues:
>
>> My current setup:
>> 2 nodes with CentOS-6.0
>> Pacemaker
> Suggest to go to Pacemaker 1.1.5 instead of using the stock 1.1.2 that
> ships with 6.0.
>
>> OpenAIS
>> Corosync
> Strongly recommend to go with at least Corosync 1.4.1 if you're using
> RRP (which you are).
>
>> DRBD
> I'll assume that that's DRBD 8.3.x as opposed to 8.4.0.
>
>> MySQL
>> Redis
>>
>> crm(live)configure# primitive mysqld lsb:mysql \
>>     op monitor interval="15s"
> Strongly suggest to use ocf:heartbeat:mysql here instead.
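A resource definition using that agent could look roughly like the sketch below; the binary, config, datadir, and pid values are assumptions for a stock CentOS MySQL install and need to match the actual system:

    crm(live)configure# primitive mysqld ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
            datadir="/var/lib/mysql" pid="/var/run/mysqld/mysqld.pid" \
        op start timeout="120s" \
        op stop timeout="120s" \
        op monitor interval="15s" timeout="30s"

Unlike the LSB init script, the OCF agent verifies during each monitor that mysqld actually answers, rather than only reporting the init script's status.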
>
>> crm(live)configure# primitive redisd lsb:redis \
>>     op monitor interval="15s"
>> crm(live)configure# group mysql_redis fs_mysql ip_mysql_redis mysqld \
>>     fs_redis redisd
>> crm(live)configure# location cli-prefer-mysql_redis mysql_redis \
>>     rule $id="cli-prefer-rule-mysql_redis" inf: #uname
> That looks like a leftover constraint set by "crm resource move";
> consider doing "crm resource unmove".
>
>> eq ESCPDB-HA-01v.escapio.local
> .local is a really poor choice for a domain name, unless you're running
> a DNS-free environment and everything resolves via mDNS.
>
>> # Please read the corosync.conf.5 manual page
>> compatibility: whitetank
>>
>> totem {
>>     version: 2
>>     secauth: off
>>     threads: 0
>>     rrp_mode: passive
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 10.246.214.0
>>         mcastaddr: 225.94.1.1
>>         mcastport: 5404
>>     }
>>     interface {
>>         ringnumber: 1
>>         bindnetaddr: 10.10.10.0
>>         mcastaddr: 225.94.2.1
>>         mcastport: 5406
>>     }
>> }
>>
>> logging {
>>     fileline: off
>>     to_stderr: no
>>     to_logfile: yes
>>     to_syslog: yes
>>     logfile: /var/log/corosync.log
>>     debug: off
>>     timestamp: on
>>     logger_subsys {
>>         subsys: AMF
>>         debug: off
>>     }
>> }
>>
>> amf {
>>     mode: disabled
>> }
>>
>> service {
>>     ver: 0
>>     name: pacemaker
>>     use_mgmtd: yes
>> }
> Strongly suggest to use ver:1 and pacemakerd, and to disable mgmtd.
>
> Hope this is useful.
>
> Cheers,
> Florian

Hi Florian,
thank you so much for your feedback. My goal is: if cluster communication
over eth0 fails on the active node, a failover should be triggered by
Pacemaker, because if eth0 is down the application can no longer "talk"
to the cluster (service IP).

Could you please let me know where I can find newer versions of
Pacemaker/Corosync? I don't know of any repository that carries them;
maybe I have to compile them myself.

Regards - Willi
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
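The failover-on-lost-connectivity goal described above is commonly implemented in Pacemaker with a cloned ocf:pacemaker:ping resource plus a location constraint, rather than by reacting to the interface itself. A sketch, where the gateway address 10.246.214.1 and the resource names are placeholders to be adapted:

    crm(live)configure# primitive p_ping ocf:pacemaker:ping \
        params host_list="10.246.214.1" multiplier="1000" \
        op monitor interval="10s"
    crm(live)configure# clone cl_ping p_ping
    crm(live)configure# location l_mysql_redis_connected mysql_redis \
        rule -inf: not_defined pingd or pingd lte 0

A node that cannot reach the listed host gets a -INFINITY score for the group, so Pacemaker moves the resources to the surviving node; note that this complements, and does not replace, the fencing configuration recommended earlier.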