Thanks Michael for reporting this and Dumitru for fixing it!
@Michael, I guess that this bug is only relevant for the non-DVR case,
right? I.e., if the NAT entry for the FIP has both the external_mac and
logical_port fields set, then the ARP reply is generated on the compute
node hosting the target VM, which is fine because it doesn't need to
traverse the gw node. Am I right?
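
To make the question concrete, here is a minimal Python sketch of the check
I have in mind. It is not OVN code: the function name and the plain-dict
representation of a NAT row are hypothetical, the keys just mirror the NAT
columns mentioned above, and the external_mac and logical_port values are
made up for illustration.

```python
def is_distributed_fip(nat):
    """Return True if the NAT row looks like a distributed (DVR) floating IP.

    Assumption: `nat` is a plain dict whose keys mirror the NAT columns;
    unset optional columns are either missing, None, or empty.
    """
    return (
        nat.get("type") == "dnat_and_snat"
        and bool(nat.get("external_mac"))
        and bool(nat.get("logical_port"))
    )


# FIP from the report, handled on the gateway chassis (non-DVR case).
centralized = {
    "type": "dnat_and_snat",
    "external_ip": "10.176.2.19",
    "logical_ip": "192.168.0.201",
}

# Same FIP with both optional fields set (values are hypothetical).
distributed = dict(
    centralized,
    external_mac="fa:16:3e:00:00:01",
    logical_port="vm2-port",
)
```

If my understanding is right, only entries for which this returns False
would have been affected by the bug.
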
On Tue, Mar 24, 2020 at 12:17 PM Dumitru Ceara wrote:
> On 3/24/20 8:50 AM, Plato, Michael wrote:
> > Hi Dumitru,
> >
> > thank you very much for the patch. I tried it and it works. VM1 can now
> > reach VM2.
> >
> > Best regards!
> >
> > Michael
>
> Hi Michael,
>
> The fix is now merged in OVN master and branch 20.03:
>
>
> https://github.com/ovn-org/ovn/commit/d2ab98463f299e67a9f9a31e8b7c42680b8645cf
>
> Regards,
> Dumitru
>
> >
> >
> > -Original Message-
> > From: Dumitru Ceara
> > Sent: Monday, March 23, 2020 13:28
> > To: Plato, Michael; ovs-discuss@openvswitch.org
> > Subject: Re: [ovs-discuss] No connectivity due to missing ARP reply
> >
> > On 3/21/20 7:04 PM, Plato, Michael wrote:
> >>
> >> Hi all,
> >>
> >> we use OVN with OpenStack and have a problem with the following setup:
> >>
> >>
> >>               |                                 |
> >>  -----        | 10.176.0.156                    |      -----
> >> | VM1 |-------|                     192.168.0.1 |-----| VM2 |
> >>  -----        |                                 |      -----
> >> 10.176.3.123  |-----------|   R1   |------------| 192.168.0.201 / GW: 192.168.0.1
> >> GW:10.176.0.1 |           | (test) |            | FIP: 10.176.2.19
> >>               |                                 |
> >>            Outside                            test
> >>        (10.176.0.0/16)                  (192.168.0.0/24)
> >>            (VLAN)                           (GENEVE)
> >>
> >>
> >> Versions:
> >> - OVN (20.03)
> >> - OVS (2.13)
> >> - networking-ovn (7.1.0)
> >>
> >> Problem:
> >> - no connectivity from VM1 due to a missing ARP reply for FIP
> >> 10.176.2.19 (if VM1 is not on the GW chassis for R1 ->
> >> is_chassis_resident rules not applied)
> >> - after moving VM1 to the chassis hosting R1, the ARP reply appears (due
> >> to local "is_chassis_resident" ARP responder rules)
> >> - temporarily removing the priority 75 rules (inserted by commit [0])
> >> restores functionality (even on non-gateway chassis), because ARP
> >> requests are then flooded to the complete L2 domain (but this creates
> >> a scaling issue)
> >>
> >>
> >> Analysis:
> >> - according to ovn-detrace, the ARP requests were dropped instead of
> >> being forwarded to the remote chassis hosting R1 (as intended by [0])
> >>
> >>
> >> Flow:
> >> arp,in_port=61,vlan_tci=0x,dl_src=fa:16:3e:5e:79:d9,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.176.3.123,arp_tpa=10.176.2.19,arp_op=1,arp_sha=fa:16:3e:5e:79:d9,arp_tha=00:00:00:00:00:00
> >>
> >>
> >> bridge("br-int")
> >>
> >> 0. in_port=61, priority 100, cookie 0x862b95fc
> >> set_field:0x1->reg13
> >> set_field:0x7->reg11
> >> set_field:0x5->reg12
> >> set_field:0x1a->metadata
> >> set_field:0x4->reg14
> >> resubmit(,8)
> >> * Logical datapath: "neutron-c2a82a31-632b-4d24-8f35-8a79e2a207a7"
> >> (d516056b-19a6-4613-9838-8c62452fe31d)
> >> * Port Binding: logical_port "b19ceab1-c7fe-4c3b-8733-d88cabaa0a23",
> >> tunnel_key 4, chassis-name "383eb44a-de85-485a-9606-2fc649a9cbb9",
> >> chassis-str "os-compute-01"
> >> 8. reg14=0x4,metadata=0x1a,dl_src=fa:16:3e:5e:79:d9, priority 50,
> >> cookie 0x9a357820
> >> resubmit(,9)
> >> * Logical datapath: "neutron-c2a82a31-632b-4d24-8f35-8a79e2a207a7"
> >> (d516056b-19a6-4613-9838-8c62452fe31d) [ingress]
> >> * Logical flow: table=0 (ls_in_port_sec_l2), priority=50,
> >> match=(inport == "b19ceab1-c7fe-4c3b-8733-d88cabaa0a23" && eth.src ==
> >> {fa:16:3e:5e:79:d9}), actions=(next;)
> >> * Logical Switch Port: b19ceab1-c7fe-4c3b-8733-d88cabaa0a23 type
> >> (addresses ['fa:16:3e:5e:79:d9 10.176.3.123'], dynamic addresses [],
> >> security ['fa:16:3e:5e:79:d9 10.176.3.123']
> >> 9. metadata=0x1a, priority 0, cookie 0x1a478ee1
> >> resubmit(,10)
> >> * Logical datapath: "neutron-c2a82a31-632b-4d24-8f35-8a79e2a207a7"
> >> (d516056b-19a6-4613-9838-8c62452fe31d) [ingress]
> >> * Logical flow: table=1 (ls_in_port_sec_ip), priority=0, match=(1),
> >> actions=(next;)
> >> 10. arp,reg14=0x4,metadata=0x1a,dl_src=fa:16:3e:5e:79:d9,arp_spa=10.176.3.123,arp_sha=fa:16:3e:5e:79:d9,
> >> priority 90, cookie 0x8c5af8ff
> >> resubmit(,11)
> >> * Logical datapath: "neutron-c2a82a31-632b-4d24-8f35-8a79e2a207a7"
> >> (d516056b-19a6-4613-9838-8c62452fe31d) [ingress]
> >> * Logical flow: table=2 (ls_in_port_sec_nd), priority=90,
> >> match=(inport == "b19ceab1-c7fe-4c3b-8733-d88cabaa0a23" && eth.src ==
> >> fa:16:3e:5e:79:d9 && arp.sha == fa:16:3e:5e:79:d9 && arp.spa ==
> >> {10.176.3.123}), actions=(next;