Howdy,

> On Mon, 7 Feb 2011 16:36:46 +0100, Dejan Muhamedagic wrote:
> Hi,
>
> On Mon, Feb 07, 2011 at 02:01:11PM +0100, Stephan-Frank Henry wrote:
> > Hello again,
> >
> > I am having some possible problems with Corosync and IPaddr.
> > To be more specific, when I do a /etc/init.d/corosync stop, while everything shuts down more or less gracefully, the virtual ip never is released (still visible with ifconfig).
> >
> > If I do a 'sudo ifdown --force eth0:0' it works. So there should be no direct reason for this.
> >
> > This might not by itself be a problem, but I fear it could also be related to a 'split-brain' corosync handling due to network cable disconnect.
> > Though that might be something else, I'd rather remove all other problems and then see if it fixes itself.
> >
> > I have checked syslog, but nothing really jumps out.
> > Are there any other logs or places where I can look?
> >
> > thanks everyone!
> >
> > Frank
> >
> > (pls scream if more or other info is needed)
> >
> > -------------------------------------------------------------
> >
> > OS: Debian Lenny 64bit, kernel version: 2.6.33.3
> > Corosync: 1.2.1-1~bpo50+1
> > cluster-glue: 1.0.6-1~bpo50+1
> > libheartbeat2: 1:3.0.3-2~bpo50+1
> >
> > relevant cib.xml entry:
> > <primitive id="ip_resource" class="ocf" type="IPaddr" provider="heartbeat">
> >   <instance_attributes id="virtual-ip-attribs">
> >     <attributes>
> >       <nvpair id="virtual-ip-addr" name="ip" value="150.158.183.30"/>
> >       <nvpair id="virtual-ip-addr-nic" name="nic" value="eth0"/>
> >       <nvpair id="virtual-ip-addr-netmask" name="cidr_netmask" value="22"/>
> >     </attributes>
> >   </instance_attributes>
> >   <operations>
> >     <op id="virtual-ip-monitor-10s" interval="10s" name="monitor"/>
> >   </operations>
> > </primitive>
> >
> > here is a reduced log (only the ip stuff):
> > Feb 7 13:39:40 serverA pengine: [8695]: notice: unpack_rsc_op: Operation ip_resource_monitor_0 found resource ip_resource active on serverA
> > Feb 7 13:39:40 serverA pengine: [8695]: notice: native_print: ip_resource#011(ocf::heartbeat:IPaddr):#011Started serverA
> > Feb 7 13:39:40 serverA pengine: [8695]: info: native_merge_weights: ms_drbd0: Rolling back scores from ip_resource
> > Feb 7 13:39:40 serverA pengine: [8695]: info: native_merge_weights: ms_drbd0: Rolling back scores from ip_resource
> > Feb 7 13:39:40 serverA pengine: [8695]: info: native_merge_weights: ip_resource: Rolling back scores from fs0
> > Feb 7 13:39:40 serverA pengine: [8695]: info: native_color: Resource ip_resource cannot run anywhere
> > Feb 7 13:39:40 serverA pengine: [8695]: notice: LogActions: Stop resource ip_resource#011(serverA)
> > Feb 7 13:39:40 serverA crmd: [8696]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
> > Feb 7 13:39:42 serverA crmd: [8696]: info: te_rsc_command: Initiating action 33: stop ip_resource_stop_0 on serverA (local)
> > Feb 7 13:39:42 serverA lrmd: [8693]: info: cancel_op: operation monitor[7] on ocf::IPaddr::ip_resource for client 8696, its parameters: CRM_meta_interval=[10000] ip=[150.158.183.30]
> > Feb 7 13:39:42 serverA crmd: [8696]: info: do_lrm_rsc_op: Performing key=33:13:0:0dff3321-22f5-411c-a50a-e95fcfa4dd6f op=ip_resource_stop_0 )
> > Feb 7 13:39:42 serverA lrmd: [8693]: info: rsc:ip_resource:14: stop
> > Feb 7 13:39:42 serverA crmd: [8696]: info: process_lrm_event: LRM operation ip_resource_monitor_10000 (call=7, status=1, cib-update=0, confirmed=true) Cancelled
> > Feb 7 13:40:02 serverA lrmd: [8693]: WARN: ip_resource:stop process (PID 10541) timed out (try 1). Killing with signal SIGTERM (15).
>
> The stop action times out. You should check why. Note that
> ifdown ... is not what IPaddr uses, but ifconfig down. You can
> also test the resource using ocf-tester outside of cluster.
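
One way to look into a stop action that times out is to run the agent by hand with shell tracing, outside the cluster. The following is only a sketch under assumptions: the agent path under /usr/lib/ocf is the usual default but not confirmed for this install, and RUN_STOP is a hypothetical guard variable added here so the script does nothing destructive unless asked. The parameter values are copied from the cib.xml entry quoted above.

```shell
#!/bin/sh
# Sketch: trace the IPaddr agent's stop action outside the cluster.
# Assumption: resource agents live under /usr/lib/ocf (the usual default).
OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}"
AGENT="$OCF_ROOT/resource.d/heartbeat/IPaddr"

# Resource parameters, copied from the quoted cib.xml entry.
# OCF agents read their parameters from OCF_RESKEY_<name> variables.
OCF_RESKEY_ip="150.158.183.30"
OCF_RESKEY_nic="eth0"
OCF_RESKEY_cidr_netmask="22"
export OCF_ROOT OCF_RESKEY_ip OCF_RESKEY_nic OCF_RESKEY_cidr_netmask

# RUN_STOP is a hypothetical safety switch, not part of any tool.
if [ "${RUN_STOP:-0}" = "1" ] && [ -x "$AGENT" ]; then
    # -x prints each command as the agent runs it, so the step that
    # hangs is the last one shown. This really removes the address.
    sh -x "$AGENT" stop
    echo "stop exit code: $?"
else
    # Dry run: only show what would be executed.
    echo "would run: sh -x $AGENT stop"
fi
```

Run it with RUN_STOP=1 on a node where losing the address is acceptable. The ocf-tester route mentioned above would look something like `ocf-tester -n ip_resource -o ip=150.158.183.30 -o nic=eth0 -o cidr_netmask=22 "$AGENT"`, which exercises the agent's actions with the same parameters.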

Yeah, I had seen that, but I was at a loss as to what the cause was.
Would there have been any way to find out what the reason was?

For now I followed Shravan's suggestion and switched to IPaddr2. I had used it in my first versions, but the interface did not show up in ifconfig. After some googling I also added the iflabel. Yay.

Now everything is working well. I just have some 'issues' with how Corosync & DRBD work, or rather with how my expectations might differ from my config. :D
I'll write a new post for this.

Thanks again!

Frank

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
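
For readers landing here later, the IPaddr2 plus iflabel change described above might look roughly like the following in crm shell syntax. This is a sketch, not the configuration actually used in this thread: the iflabel value "0" is a hypothetical choice (it would give the alias eth0:0), and the helper function ipaddr2_cmd is made up for illustration. It only prints the command and does not touch the cluster.

```shell
#!/bin/sh
# Sketch: build (but do not run) a crm shell command that defines an
# IPaddr2 primitive with an interface label.
# ipaddr2_cmd is a hypothetical helper, not part of Pacemaker.
ipaddr2_cmd() {
    # $1 = ip, $2 = nic, $3 = cidr_netmask, $4 = iflabel
    printf 'crm configure primitive ip_resource ocf:heartbeat:IPaddr2 params ip=%s nic=%s cidr_netmask=%s iflabel=%s op monitor interval=10s\n' \
        "$1" "$2" "$3" "$4"
}

# ip, nic and netmask come from the original cib.xml entry;
# the label "0" is an assumed value.
ipaddr2_cmd 150.158.183.30 eth0 22 0
```

IPaddr2 adds the address through iproute2, and ifconfig only lists additional addresses that carry a label, which is why the address seemed to be missing before iflabel was set. Without a label it can still be seen with `ip addr show dev eth0`.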