Hi Dejan,

-----Original Message-----
From: linux-ha-dev-boun...@lists.linux-ha.org [mailto:linux-ha-dev-boun...@lists.linux-ha.org] On Behalf Of Dejan Muhamedagic
Sent: Monday, December 24, 2012 11:07 AM
To: linux-ha-dev@lists.linux-ha.org
Subject: Re: [Linux-ha-dev] IPsrcaddr bug, and fix recommendation

Hi,

On Thu, Dec 20, 2012 at 08:03:32PM +0100, Attila Megyeri wrote:
> hi,
>
> I have a cluster configuration with two IPsrcaddr resources (e.g. IP
> address "A" and "B"). They are configured with two different addresses and are
> never supposed to run on the same nodes. So "A" can run on nodes N1 and N2,
> "B" can run on N3 and N4.
>
> My problem is that in some cases crm_mon shows an IPsrcaddr resource
> running on a node where it shouldn't be, and of course it is in unmanaged
> state and cannot be stopped.
> For instance:
> IP address "A" is started, unmanaged, on node N3.
>
> I am using pacemaker 1.1.6 on a Debian system, with the latest RA from github.
>
> I checked the RA, and here are my findings.
>
> - When status is called, it calls the srca_read() function.
>
> - srca_read() returns 2 if a srcip is running on the given node,
>   but with a different IP address.
>
> - srca_status(), when it gets "2" from srca_read(), returns
>   "$OCF_ERR_GENERIC".
>
> As a result, in my case IP "B" is running on N3, which is OK, but
> crm_mon reports that IP "A" is also running on N3 (unmanaged). [For some
> reason this is how OCF_ERR_GENERIC is interpreted.] This is definitely a
> bug; the question is whether it is in pacemaker or in the RA.
> If I change the script to return "$OCF_NOT_RUNNING" instead of
> "$OCF_ERR_GENERIC", it works properly.
>
> What is the proper behavior in this case?
> My recommendation is to fix the RA so that srca_read() returns 1 if there is
> a srcip on the node, but it is not the queried one.

The comment in the agent says:

# NOTES:
#
# 1) There must be one and not more than 1 default route! Mainly because
# I can't see why you should have more than one. And if there is more
# than one, we would have to box clever to find out which one is to be
# modified, or we would have to pass its identity as an argument.
#

This should actually be in the meta-data, as it is obviously intended for users.
It looks like your use case doesn't fit this description, right?
Perhaps we could add a parameter like "allow_multiple_default_routes".

Thanks,

Dejan

On the host where the resource is running I have only one default gateway. The peer of this host (the other node) uses a different default gateway, but I do not think this should be a limitation (that host has a single default gateway as well).
The srca_read() function does not fail in the steps that check the default gateway. It runs to the last line, where 2 is returned, although this is not a generic error; rather, the source IP is simply not running on the node.

Thanks,
Attila

> In this case the RA would return "$OCF_NOT_RUNNING".
>
> Cheers,
> Attila
>
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
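
For reference, the return-code mapping proposed in this thread could be sketched roughly as below. This is not the IPsrcaddr code: classify_src_sketch and status_sketch are hypothetical names, the addresses are passed in as plain arguments instead of being read from the routing table, and the meanings of return values 0 and 1 are assumed (the thread only states what 2 means). Whether "a different source address is set" should really count as "not running" is exactly the open question above.

```sh
#!/bin/sh
# Sketch only; function names are hypothetical, not taken from the agent.

# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# classify_src_sketch WANTED CURRENT
# Stand-in for the three outcomes attributed to srca_read():
#   0 - the wanted source address is set       (assumed meaning)
#   1 - no source address is set               (assumed meaning)
#   2 - a different source address is set      (as described in the thread)
classify_src_sketch() {
    wanted=$1
    current=$2
    if [ -z "$current" ]; then
        return 1
    elif [ "$current" = "$wanted" ]; then
        return 0
    else
        return 2
    fi
}

# status_sketch WANTED CURRENT
# Proposed mapping: a foreign source address means this resource is
# simply not running here, so report OCF_NOT_RUNNING, not a generic error.
status_sketch() {
    classify_src_sketch "$1" "$2"
    case $? in
        0)   return $OCF_SUCCESS ;;
        1|2) return $OCF_NOT_RUNNING ;;
        *)   return $OCF_ERR_GENERIC ;;
    esac
}
```

With this mapping, calling status_sketch with a wanted address of 10.0.0.1 and a current address of 10.0.0.2 returns 7 (OCF_NOT_RUNNING), so a monitor on a node that carries the other resource's address would report "not running" and would not fail.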