Hi,

Regarding the discussion on the pacemaker ML below, I would like to suggest the attached patch.
The patch includes:
1) Fix IPaddr to return the correct OCF value
   (it returned 255 when delete_interface failed).
2) Add a description of the assumption to the IPaddr / IPaddr2 meta-data.

Regards,
Keisuke MORI

2010/4/14 Lars Ellenberg <lars.ellenb...@linbit.com>:
> On Tue, Apr 13, 2010 at 08:28:09PM +0200, Lars Ellenberg wrote:
>> On Tue, Apr 13, 2010 at 12:10:18PM +0200, Dejan Muhamedagic wrote:
>> > Hi,
>> >
>> > On Mon, Apr 12, 2010 at 05:26:19PM +0200, Markus M. wrote:
>> > > Markus M. wrote:
>> > > >is there a known problem with IPaddr(2) when defining many (in my
>> > > >case: 11) ip resources which are started/stopped concurrently?
>> >
>> > Don't remember any problems.
>> >
>> > > Well... some further investigation revealed that it seems to be a
>> > > problem with the way how the ip addresses are assigned.
>> > >
>> > > When looking at the output of "ip addr", the first ip address added
>> > > to the interface gets the scope "global", all further aliases gets
>> > > the scope "global secondary".
>> > >
>> > > If afterwards the first ip address is removed before the secondaries
>> > > (due to concurrently run of the scripts), ALL secondaries are
>> > > removed at the same time by the "ip" command, leading to an error
>> > > for all subsequent trials to remove the other ip addresses because
>> > > they are already gone.
>> > >
>> > > I am not sure how "ip" decides for the "secondary" scope, maybe
>> > > because the other ip addresses are in the same subnet as the first
>> > > one.
>> >
>> > That sounds bad. Instances should be independent of each other.
>> > Can you please open a bugzilla and attach a hb_report.
>>
>> Oh, that is perfectly expected the way he describes it.
>> The assumption has always been that there is at least one
>> "normal", not managed by crm, address on the interface,
>> so no one will have noticed before.
>>
>> I suggest the following patch,
>> basically doing one retry.
>>
>> For the described scenario,
>> the second try will find the IP already "non existent",
>> and exit $OCF_SUCCESS.
>
> Though that obviously won't make instances independent.
>
> The typical way to achieve that is to have them all as "secondary" IPs.
> Which implies that for successful use of independent IPaddr2 resources
> on the same device, you need at least one "system" IP (as opposed to
> "managed by cluster") on that device.
>
> The first IP assigned will get "primary" status.
> Usually, if you delete a "primary" IP, the kernel will also
> delete all secondary IP addresses.
>
> If using a "system" IP is not an option, here is the alternative:
> "Recent" kernels (a quick check revealed that this setting is around
> since at least 2.6.12) can do "alias promotion", which can be enabled
> using
>   sysctl -w net.ipv4.conf.all.promote_secondaries=1
> (or per device)
>
> In both cases the previously "retry on ip_stop" patch is unnecessary.
> But won't do any harm, either. Most likely ;-)
>
> Glad that helped ;-)
>
> Somebody please add that to the man page respectively agent meta data...
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>
> _______________________________________________
> Pacemaker mailing list: pacema...@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>

--
Keisuke MORI
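
P.S. For anyone who wants to check which of the two situations described above applies on a node, here is a rough shell sketch (not part of the patch, and not code from IPaddr / IPaddr2). It reports whether alias promotion is enabled, assuming that either the "all" setting or the per-device setting is sufficient, as the quoted mail suggests. The function name and the optional proc-root argument are made up for this example; the latter only exists so the logic can be tried without touching the real /proc.

```shell
#!/bin/sh
# Hypothetical helper, not part of any resource agent.
# Prints "enabled" if promote_secondaries is set to 1 for "all" or for the
# given device, otherwise prints "disabled".
#   $1 = device name (e.g. eth0)
#   $2 = optional proc root, defaults to /proc (assumption: for testing only)
promote_secondaries_status() {
    root=${2:-/proc}
    for scope in all "$1"; do
        f="$root/sys/net/ipv4/conf/$scope/promote_secondaries"
        # A missing or unreadable file is treated as "not enabled".
        if [ -r "$f" ] && [ "$(cat "$f")" = "1" ]; then
            echo enabled
            return 0
        fi
    done
    echo disabled
    return 0
}
```

Calling `promote_secondaries_status eth0` on a node prints one word. If it prints "disabled" and there is no "system" IP on the device, the sysctl command quoted above is the setting to look at.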
agents-ipaddr-retval.patch
Description: Binary data
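
Since the attachment is not shown inline, here is a minimal sketch of the general idea discussed in the thread. It is NOT the attached patch and not the agents' actual code: the function names are hypothetical, and it uses the "ip" command for brevity. It shows a stop routine that re-checks once after a failed deletion and that maps a real failure to an OCF return code rather than passing through some other exit status (such as the 255 mentioned above).

```shell
#!/bin/sh
# Return codes as defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Hypothetical helper: true if the address is configured on the device.
#   $1 = address with prefix (e.g. 192.168.1.10/24), $2 = device
ip_is_present() {
    ip -o -f inet addr show dev "$2" 2>/dev/null | grep -qF "inet $1 "
}

# Hypothetical stop routine: delete the address, re-checking once on failure.
ip_stop_sketch() {
    # Already gone: stopping a stopped resource counts as success.
    ip_is_present "$1" "$2" || return "$OCF_SUCCESS"

    # Normal case: the deletion itself succeeds.
    if ip addr del "$1" dev "$2" 2>/dev/null; then
        return "$OCF_SUCCESS"
    fi

    # The deletion failed. If the address vanished in the meantime (for
    # example, removed by the kernel together with its primary address),
    # the resource is stopped, so report success.
    ip_is_present "$1" "$2" || return "$OCF_SUCCESS"

    # A real failure: return a defined OCF error code.
    return "$OCF_ERR_GENERIC"
}
```

With a "system" IP on the device or with promote_secondaries enabled, the re-check branch should never be needed, which matches the remark in the thread that the retry is then unnecessary but harmless.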
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/