In logical router ingress table 22 (ARP resolve), we install two
conflicting types of flows.

First, in build_arp_resolve_flows_for_lsp(), if a logical switch port
is peered with a router port, we install a series of priority 100
flows: for each logical switch port on that logical switch, we install
a flow on the peered router port. The flow matches on the logical
outport and the next hop IP address, and it rewrites the destination
MAC to the MAC of the logical switch port where that IP address is
bound.

match: outport == router_port, next_hop == lsp IP
actions: dst_mac = lsp MAC
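
As an illustration only (all names and addresses here are
hypothetical): if the switch has a port vm1 with IP 172.16.0.10 and
MAC 00:00:00:00:00:05, and the switch is peered with router port
lr0-public, the installed flow would look roughly like:

match: outport == lr0-public, next_hop == 172.16.0.10
actions: dst_mac = 00:00:00:00:00:05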

Next, in build_arp_resolve_flows_for_lrp(), if the logical router port
is a distributed gateway port (DGP) and the port has
options:redirect-type=bridged set, then we install a priority 50 flow.
This flow matches on the logical outport and checks that the DGP is
not chassis-resident, and it rewrites the destination MAC to the MAC
of the DGP.

match: outport == DGP, !is_chassis_resident(DGP)
actions: dst_mac = DGP MAC
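
Continuing the hypothetical example above, if lr0-public is the DGP
and its MAC is 00:00:00:00:01:00, this flow would look roughly like:

match: outport == lr0-public, !is_chassis_resident(lr0-public)
actions: dst_mac = 00:00:00:00:01:00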

On a hypervisor where the outport is the DGP, the DGP is not
chassis-resident, and the next hop IP address is one of the IP
addresses bound on the peered logical switch, the priority 100 flow
will match and, since it outranks the priority 50 flow, we will set
the destination MAC to the logical switch port's MAC.

This logic would be correct so long as we are using an encapsulation
method such as Geneve or VXLAN to send the packet to the hypervisor
where the DGP is bound. In that case, the logical router pipeline can
run partially on the first hypervisor, then the packet can be tunneled
to the hypervisor where the DGP is bound, and the logical router
pipeline continues there. This allows the router features that need to
run on the gateway hypervisor to run, after which the packet can be
directed to the appropriate logical switch.

However, if the DGP has options:redirect-type=bridged set, then
instead of tunneling the packet within the logical router pipeline, we
need to use the attached switch's localnet port to redirect the packet
to the gateway hypervisor. Commit 8ba15c3d1084c7 established that the
way to do this is to set the destination MAC address to the DGP's MAC
and send the packet over the attached switch's localnet port to the
gateway hypervisor. Once the packet arrives there, since its
destination MAC is the DGP's MAC, it gets sent to the logical router
to be processed a second time on the gateway hypervisor. This is what
the priority 50 flow is intended to do.
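
For reference, redirect-type is an option on the DGP itself; with the
hypothetical DGP lr0-public from the examples above, it could be set
with something like:

ovn-nbctl set logical_router_port lr0-public options:redirect-type=bridged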

Since we are not hitting the priority 50 flow, packets are being
redirected with the wrong destination MAC address. In many cases this
might be transparent, since the packet might still end up getting sent
out the correct port in the end. But if the destination logical switch
port is not bound to the same chassis as the DGP (this is rare, but it
apparently happens), then the packet gets dropped: the logical switch
receives the packet on the localnet port and determines that it needs
to be sent back out the localnet port, and since the loopback flag
never gets set, the packet is dropped because the inport and outport
are the same. What's worse, though, is that since the packet never
enters the logical router pipeline on the gateway chassis, router
features that are intended to be invoked on the gateway chassis (NATs,
load balancers, etc.) are skipped.

The fix presented here is to not install the priority 100 flows when
the router port is a DGP with options:redirect-type=bridged set. This
way, we will only ever hit the priority 50 flow, thus avoiding the
issues described above.
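
In the hypothetical example above, on a chassis where the DGP is not
resident, only the priority 50 flow is then left to match:

match: outport == lr0-public, !is_chassis_resident(lr0-public)
actions: dst_mac = 00:00:00:00:01:00

so the packet is bridged to the gateway chassis with the DGP's MAC and
re-enters the logical router pipeline there, as described above.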

Reported-at: https://issues.redhat.com/browse/FDP-1454
Signed-off-by: Mark Michelson <[email protected]>
---
 northd/northd.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index 8b5413ef3..2c496d58d 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -14674,6 +14674,14 @@ build_arp_resolve_flows_for_lsp(
                         continue;
                     }
 
+                    if (lrp_is_l3dgw(peer)) {
+                        const char *redirect_type = smap_get(&peer->nbrp->options,
+                                                              "redirect-type");
+                        if (redirect_type && !strcasecmp(redirect_type, "bridged")) {
+                            continue;
+                        }
+                    }
+
                     if (!lrp_find_member_ip(peer, ip_s)) {
                         continue;
                     }
-- 
2.50.1
