On Mon, Nov 19, 2018 at 2:56 PM Daniel Alvarez Sanchez <dalva...@redhat.com> wrote:
> Having thought this again, I'd rather merge the patch I proposed in my > previous email (I'd need tests and propose a formal patch after your > feedback) but in the long term I think it'd make sense to also implement > some sort of aging to the MAC_Binding entries so that they eventually > expire, especially for entries that come from external networks. > > On Fri, Nov 16, 2018 at 6:41 PM Daniel Alvarez Sanchez < > dalva...@redhat.com> wrote: > >> >> On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff <b...@ovn.org> wrote: >> > >> > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote: >> > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez < >> dalva...@redhat.com> >> > > wrote: >> > > >> > > > Hi, >> > > > >> > > > After digging further. The problem seems to be reduced to reusing an >> > > > old gateway IP address for a dnat_and_snat entry. >> > > > When a gateway port is bound to a chassis, its entry will show up in >> > > > the MAC_Binding table (at least when that Logical Switch is >> connected >> > > > to more than one Logical Router). After deleting the Logical Router >> > > > and all its ports, this entry will remain there. If a new Logical >> > > > Router is created and a Floating IP (dnat_and_snat) is assigned to a >> > > > VM with the old gw IP address, it will become unreachable. >> > > > >> > > > A workaround now from networking-ovn (OpenStack integration) is to >> > > > delete MAC_Binding entries for that IP address upon a FIP creation. >> I >> > > > think that this however should be done from OVN, what do you folks >> > > > think? >> > > > >> > > > >> > > Agree. Since the MAC_Binding table row is created by ovn-controller, >> it >> > > should >> > > be handled properly within OVN. >> > >> > I see that this has been sitting here for a while. The solution seems >> > reasonable to me. Are either of you working on it? >> >> I started working on it. I came up with a solution (see patch below) >> which works but I wanted to give you a bit more of context and get your >> feedback: >> >> >> ^ localnet >> | >> +---+---+ >> | | >> +------+ pub +------+ >> | | | | >> | +-------+ | >> | 172.24.4.0/24 | >> | | >> 172.24.4.220 | | 172.24.4.221 >> +---+---+ +---+---+ >> | | | | >> | LR0 | | LR1 | >> | | | | >> +---+---+ +---+---+ >> 10.0.0.254 | | 20.0.0.254 >> | | >> +---+---+ +---+---+ >> | | | | >> 10.0.0.0/24 | SW0 | | SW1 | 20.0.0.0/24 >> | | | | >> +---+---+ +---+---+ >> | | >> | | >> +---+---+ +---+---+ >> | | | | >> | VM0 | | VM1 | >> | | | | >> +-------+ +-------+ >> 10.0.0.10 20.0.0.10 >> 172.24.4.100 172.24.4.200 >> >> >> When I ping VM1 floating IP from the external network, a new entry for >> 172.24.4.221 in the LR0 datapath appears in the MAC_Binding table: >> >> _uuid : 85e30e87-3c59-423e-8681-ec4cfd9205f9 >> datapath : ac5984b9-0fea-485f-84d4-031bdeced29b >> ip : "172.24.4.221" >> logical_port : "lrp02" >> mac : "00:00:02:01:02:04" >> >> >> Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused >> for VM2 FIP with different MAC and new gateway IP is created (for example >> 172.24.4.222 00:00:02:01:02:99), VM2 FIP becomes unreachable from VM1 >> until the old MAC_Binding entry gets deleted as pinging 172.24.4.221 will >> use the wrong address ("00:00:02:01:02:04"). >> >> With the patch below, removing LR1 results in deleting all MAC_Binding >> entries for every datapath where '172.24.4.221' appears in the 'ip' column >> so the problem goes away. >> >> Another solution would be implementing some kind of 'aging' for >> MAC_Binding entries but perhaps it's more complex. >> Looking forward for your comments :) >> >> As discussed with you offline, ageing itself might not solve this issue. We might still hit the issue until the mac _binding entry ages out and flushed out. Your proposed solution seems fine to me. Thanks Numan >> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c >> index 58bef7d..a86733e 100644 >> --- a/ovn/northd/ovn-northd.c >> +++ b/ovn/northd/ovn-northd.c >> @@ -2324,6 +2324,18 @@ cleanup_mac_bindings(struct northd_context *ctx, >> struct hmap *ports) >> } >> } >> >> +static void >> +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip) >> +{ >> + const struct sbrec_mac_binding *b, *n; >> + SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) { >> + if (strstr(ip, b->ip)) { >> + sbrec_mac_binding_delete(b); >> + } >> + } >> +} >> + >> + >> /* Updates the southbound Port_Binding table so that it contains the >> logical >> * switch ports specified by the northbound database. >> * >> @@ -2383,6 +2395,15 @@ build_ports(struct northd_context *ctx, >> /* Delete southbound records without northbound matches. */ >> LIST_FOR_EACH_SAFE(op, next, list, &sb_only) { >> ovs_list_remove(&op->list); >> + >> + /* Delete all MAC_Binding entries which match the IP addresses >> of the >> + * deleted logical router port (ie. port with a peer). */ >> + const char *peer = smap_get(&op->sb->options, "peer"); >> + if (peer) { >> + for (int i = 0; i < op->sb->n_mac; i++) { >> + delete_mac_binding_by_ip(ctx, op->sb->mac[i]); >> + } >> + } >> sbrec_port_binding_delete(op->sb); >> ovn_port_destroy(ports, op); >> } >> >
_______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss