Pavlos,

Thanks for helping out on this. We are running RHEL 5.5 on bare metal, not in a VM. SELinux is not turned on and the firewall is disabled. Here is what is in the /etc/modprobe.conf file.

alias eth0 bnx2
alias eth1 bnx2
alias scsi_hostadapter cciss
alias scsi_hostadapter1 qla2xxx
alias scsi_hostadapter2 usb-storage
alias bond0 bonding
options bond0 mode=1 miimon=100
options lpfc lpfc_lun_queue_depth=16 lpfc_nodev_tmo=30 lpfc_discovery_threads=32
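
For anyone reading along, the `options bond0` line above can be decoded mechanically. What follows is only a minimal sketch in Python, not part of the original setup. It assumes the mode numbering documented for the Linux bonding driver, and the helper names `parse_bond_options` and `describe_mode` are made up for illustration.

```python
# Mode numbers as documented for the Linux bonding driver.
BONDING_MODES = {
    0: "balance-rr",      # round-robin
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",         # LACP
    5: "balance-tlb",
    6: "balance-alb",
}

def parse_bond_options(line):
    """Parse an 'options <alias> key=value ...' line from modprobe.conf.

    Returns (alias, options_dict), or None if it is not an options line.
    (Hypothetical helper, for illustration only.)
    """
    fields = line.split()
    if len(fields) < 2 or fields[0] != "options":
        return None
    alias = fields[1]
    opts = dict(f.split("=", 1) for f in fields[2:] if "=" in f)
    return alias, opts

def describe_mode(opts):
    """Return the bonding mode name for a parsed options dict."""
    mode = opts.get("mode", "0")  # the driver defaults to balance-rr (mode 0)
    if mode.isdigit():
        return BONDING_MODES.get(int(mode), "unknown")
    return mode  # the mode may also be given by name

alias, opts = parse_bond_options("options bond0 mode=1 miimon=100")
print(alias, describe_mode(opts), "miimon(ms)=" + opts["miimon"])
```

Read this way, the configuration shown asks for active-backup mode with MII link monitoring every 100 ms.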


As a test, we removed bond0 and now have our IP address assigned only to eth0, but we still have the same problem when starting corosync. The messages we are finding in the /var/log/cluster/corosync.log file are below.

Sep 30 07:58:57 e-magdb1.buysub.com crmd: [28406]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped!
Sep 30 07:58:57 e-magdb1.buysub.com crmd: [28406]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Sep 30 07:58:57 e-magdb1.buysub.com crmd: [28406]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]

What could this 'just popped' message mean?

Mike


From: Pavlos Parissis <pavlos.paris...@gmail.com>
To: The Pacemaker cluster resource manager <pacemaker@oss.clusterlabs.org>
Date: 09/29/2010 04:01 PM
Subject: Re: [Pacemaker] Does bond0 network interface work with corosync/pacemaker







On 29 September 2010 21:01, Andreas Hofmeister <a...@collax.com> wrote:
On 29.09.2010 19:59, Mike A Meyer wrote:
We have two nodes whose IP addresses are assigned to a bond0 network interface instead of the usual eth0 network interface. We are wondering whether there are issues with configuring corosync/pacemaker with an IP assigned to a bond0 network interface. We are seeing that corosync/pacemaker will start on both nodes, but it doesn't detect other nodes in the cluster. We do have SELinux and the firewall shut off on both nodes. Any information would be helpful.
 
We run the cluster stuff on bonding devices (actually on a VLAN on top of a bond) and it works well. We use it in a two-node setup in round-robin mode; the nodes are connected back-to-back (i.e. no switch in between).

If you use bonding over a switch, check your bonding mode - round-robin just won't work. Try LACP if you have connected each node to a single switch or if your switches support link aggregation over multiple devices (the cheaper ones won't). Try "active-backup" with multiple switches.

To check your configuration, use "ping" and check the "icmp_seq" in the replies. If some sequence number is missing, your setup is probably broken.
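
The icmp_seq check described above can be automated. Here is a minimal sketch in Python, assuming the usual Linux ping reply format containing `icmp_seq=N`. The function name `missing_icmp_seq` and the sample output (with a documentation-range address) are invented for illustration, not taken from a real run.

```python
import re

def missing_icmp_seq(ping_output):
    """Return the icmp_seq numbers absent between the first and last reply seen.

    Hypothetical helper: it only finds gaps inside the observed range, so
    replies lost at the very start or end of a run are not reported.
    """
    seen = sorted({int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output)})
    if not seen:
        return []
    expected = set(range(seen[0], seen[-1] + 1))
    return sorted(expected - set(seen))

# Made-up sample ping output in which reply 3 never arrived.
sample = """\
64 bytes from 192.0.2.10: icmp_seq=1 ttl=64 time=0.21 ms
64 bytes from 192.0.2.10: icmp_seq=2 ttl=64 time=0.19 ms
64 bytes from 192.0.2.10: icmp_seq=4 ttl=64 time=0.20 ms
64 bytes from 192.0.2.10: icmp_seq=5 ttl=64 time=0.22 ms
"""
print(missing_icmp_seq(sample))  # [3]
```

A non-empty result over a long run would be the sign of a broken bonding setup mentioned above; the packet loss figure in ping's own summary line is still worth reading for losses at the edges of the run.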


It is quite common to connect both interfaces of a bond to the same switch and then run into issues.
Mike, you need to tell us a bit more about the layer 2 connectivity and what it looks like.

We also use active-backup mode on our bond interfaces, but we use 2 switches, and it works without any problems.

Cheers,
Pavlos

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home:
http://www.clusterlabs.org
Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs:
http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker

