I've pushed a new revision, please review. The thread is `[PATCH v2 ovn] Do not reply on unicast arps for IPv4 targets`
On Thu, May 23, 2024 at 1:07 PM Han Zhou <hz...@ovn.org> wrote: > > > On Thu, May 23, 2024 at 1:51 PM Vasyl Saienko <vsaie...@mirantis.com> > wrote: > > > > Thanks for feedback, > > > > I'm not an ipv6 expert, I did some research. There is no ARP in ipv6, > it's replaced with Neighbor Discovery. The ARP resolution is done via > Neighbor Solicitation and Neighbour Advertisement messages. > > Neighbor Solicitation is an analog of ARP request and is sent to either > > > > solicited-node multicast address 33:33:ff:XX:XX:XX where XX:XX:XX is the > last 3 bytes of the target ipv6 address > > or target node address > > > > So in our case we can update the flows to match only the solicited-node > multicast address 33:33:ff:XX:XX:XX as follows. From > > > > table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns && ip6.dst > == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == > "S1-vm1"), action=(next;) > > table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns && ip6.dst > == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { > eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; > nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; > };) > > > > to > > > > table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns && eth.dst > == {33:33:ff:00:00:10, 33:33:ff:ff:00:10} && ip6.dst == {fd00::10, > ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"), > action=(next;) > > table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns && eth.dst > == {33:33:ff:00:00:10, 33:33:ff:ff:00:10} && ip6.dst == {fd00::10, > ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src = > 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = > 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };) > > > > Will this be okay? Maybe we can do this in a separate patch? > > > I am not an IPv6 expert either. I even forgot that IPv6 NS uses multicast > instead of broadcast MAC address, and was assuming we could simply append > the eth.dst == ff:ff:ff:ff:ff:ff to the IPv6 NS responder flows, which is > obviously wrong. Please ignore my feedback for the current patch and we can > address IPv6 in a separate patch if it is required. > > Thanks, > Han > > > On Wed, May 22, 2024 at 12:16 AM Han Zhou <hz...@ovn.org> wrote: > >> > >> > >> > >> On Tue, May 21, 2024 at 9:56 PM Numan Siddique <num...@ovn.org> wrote: > >> > > >> > On Mon, May 20, 2024 at 1:47 AM Vasyl Saienko <vsaie...@mirantis.com> > wrote: > >> > > > >> > > Thanks for feedback Numan. > >> > > > >> > > On Wed, May 15, 2024 at 12:04 AM Numan Siddique <num...@ovn.org> > wrote: > >> > > > >> > > > On Mon, May 13, 2024 at 9:00 AM Dumitru Ceara <dce...@redhat.com> > wrote: > >> > > > > > >> > > > > On 5/4/24 11:38, Vasyl Saienko wrote: > >> > > > > > Hello Numan > >> > > > > > > >> > > > > > Thanks for review and feedback. > >> > > > > > > >> > > > > > On Fri, May 3, 2024 at 7:14 PM Numan Siddique <num...@ovn.org> > wrote: > >> > > > > > > >> > > > > >> On Mon, Apr 22, 2024 at 2:54 AM Vasyl Saienko < > vsaie...@mirantis.com> > >> > > > > >> wrote: > >> > > > > >>> > >> > > > > >>> Reply only if target ethernet address is broadcast, if > >> > > > > >>> address is specified explicitly do noting to let target > >> > > > > >>> reply by itself. This technique allows to monitor target > >> > > > > >>> aliveness with arping. > >> > > > > >> > >> > > > > >> Thanks for the patch. > >> > > > > >> > >> > > > > >> This patch does change the behavior of OVN. Having ARP > responder > >> > > > > >> flows avoids tunnelling the request packet if the > destination is on a > >> > > > > >> different compute node. > >> > > > > >> But I don't think its a big deal. > >> > > > > >> > >> > > > > >> You are totally correct that the ARP responder allows > limiting ARP > >> > > > > > broadcast traffic between nodes. Normally ARP requests are > sent to > >> > > > > > broadcast ff:ff:ff:ff:ff:ff ethernet, as the protocol is > designed to > >> > > > > > identify ethernet addresses for known IP addresses. In this > case since > >> > > > > > traffic is broadcast it will be flooded among all nodes if ARP > >> > > > responder is > >> > > > > > not applied. At the same time the client may specify the > target > >> > > > > > ethernet address as unicast (in this case a broadcast storm > will not > >> > > > > > happen, as the destination ethernet address is unicast). > >> > > > > > > >> > > > > > In OpenStack we use unicast ARP requests as a way to monitor > VM > >> > > > aliveness > >> > > > > > from remote locations to make sure there are no issues with > >> > > > > > flows/underlying networking infra between nodes. We use ARPs > due to > >> > > > several > >> > > > > > reasons: > >> > > > > > 1. This protocol is mandatory and can't be disabled on the > target > >> > > > machine > >> > > > > > (which guarantees that the target VM will always reply to > ARPs if it is > >> > > > > > alive) > >> > > > > > 2. This protocol is not filtered by firewalls/security groups > as it is > >> > > > > > mandatory for network workability > >> > > > > > 3. We can't use upper layers protocols such as ICMP as they > will > >> > > > require > >> > > > > > specific firewall rules inside VMs, and we do not control VMs > in cloud > >> > > > > > cases, but still need to monitor infrastructure aliveness. > >> > > > > > > >> > > > > > If ARP responder replies to requests with broadcast and > unicast target > >> > > > > > ethernet address, we can't use this technique for monitoring, > as even > >> > > > if > >> > > > > > target VM is down (not necessary that ovn port is down, the > VM may > >> > > > stuck or > >> > > > > > be in kernel panic for example, or datapath between > monitoring tool > >> > > > and VM > >> > > > > > is broken) the ARP responder will do reply to our request so > we can't > >> > > > > > identify if VM is really accessible or not as we will always > got > >> > > > replies > >> > > > > > from local ARP responder. At the same time there is no need > to set up > >> > > > ARP > >> > > > > > responder for requests with a unicast ethernet address, as > they will > >> > > > not > >> > > > > > generate broadcast storms by design (they are unicasts). > >> > > > > > > >> > > > > > >> > > > > Right, this sounds acceptable to me too. > >> > > > > > >> > > > > > > >> > > > > >> 1. Accept your patch > >> > > > > >> 2. Use the NB Global option - "ignore_lsp_down" and set it > to false, > >> > > > > >> so that ovn-northd will install the ARP responder flows only > if the > >> > > > > >> logical port is up. > >> > > > > >> > >> > > > > > This option will not allow catching a situation when there is > an issue > >> > > > in > >> > > > > > datapath between monitoring software and VM located on > another node, > >> > > > for > >> > > > > > example issues with flows or some underlying networking > problems. > >> > > > > > > >> > > > > > Have you considered this approach for your use case ? > >> > > > > >> 3. Add another global option - "disable_arp_responder" > which when set > >> > > > > >> to true, ovn-northd will not install ARP responder flows. > >> > > > > >> > >> > > > > >> Disabling ARP responder for ports completely will allow > broadcast > >> > > > storms > >> > > > > > for ARP requests. Which we would like to avoid, but enable the > >> > > > possibility > >> > > > > > to monitor infrastructure aliveness. > >> > > > > > > >> > > > > > > >> > > > > >> Given that OpenStack neutron ml2ovs uses ARP responder flows > only for > >> > > > > >> broad cast my preference would be (1). > >> > > > > >> > >> > > > > >> I'd like to know other's opinion on this. > >> > > > > >> > >> > > > > > >> > > > > I'd accept the patch too so I'd go for (1). > >> > > > > > >> > > > > Regards, > >> > > > > Dumitru > >> > > > > > >> > > > > >> Thanks > >> > > > > >> Numan > >> > > > > >> > >> > > > > >>> > >> > > > > >>> Closes #239 > >> > > > > >>> > >> > > > > >>> Signed-off-by: Vasyl Saienko <vsaie...@mirantis.com> > >> > > > > >>> --- > >> > > > > >>> northd/northd.c | 11 +++++++++-- > >> > > > > >>> tests/ovn-northd.at | 4 ++-- > >> > > > > >>> 2 files changed, 11 insertions(+), 4 deletions(-) > >> > > > > >>> > >> > > > > >>> diff --git a/northd/northd.c b/northd/northd.c > >> > > > > >>> index 37f443e70..e80e1885d 100644 > >> > > > > >>> --- a/northd/northd.c > >> > > > > >>> +++ b/northd/northd.c > >> > > > > >>> @@ -8832,8 +8832,15 @@ > >> > > > build_lswitch_arp_nd_responder_known_ips(struct > >> > > > > >> ovn_port *op, > >> > > > > >>> for (size_t i = 0; i < op->n_lsp_addrs; i++) { > >> > > > > >>> for (size_t j = 0; j < > op->lsp_addrs[i].n_ipv4_addrs; > >> > > > j++) { > >> > > > > >>> ds_clear(match); > >> > > > > >>> - ds_put_format(match, "arp.tpa == %s && > arp.op == 1", > >> > > > > >>> - > op->lsp_addrs[i].ipv4_addrs[j].addr_s); > >> > > > > >>> + /* NOTE(vsaienko): Do not reply on unicast > ARPs, > >> > > > forward > >> > > > > >>> + * them to the target to have ability to > monitor > >> > > > target > >> > > > > >>> + * aliveness via ARPs. > >> > > > > >>> + */ > >> > > > > >>> + ds_put_format(match, > >> > > > > >>> + "arp.tpa == %s && " > >> > > > > >>> + "arp.op == 1 && " > >> > > > > >>> + "eth.dst == ff:ff:ff:ff:ff:ff", > >> > > > > >>> + op->lsp_addrs[i].ipv4_addrs[j].addr_s); > >> > > > > >>> ds_clear(actions); > >> > > > > >>> ds_put_format(actions, > >> > > > > >>> "eth.dst = eth.src; " > >> > > > > >>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > >> > > > > >>> index be006fb32..4162196f4 100644 > >> > > > > >>> --- a/tests/ovn-northd.at > >> > > > > >>> +++ b/tests/ovn-northd.at > >> > > > > >>> @@ -9283,9 +9283,9 @@ AT_CAPTURE_FILE([S1flows]) > >> > > > > >>> > >> > > > > >>> AT_CHECK([grep -e "ls_in_arp_rsp" S1flows | > ovn_strip_lflows], [0], > >> > > > [dnl > >> > > > > >>> table=??(ls_in_arp_rsp ), priority=0 , match=(1), > >> > > > > >> action=(next;) > >> > > > > >>> - table=??(ls_in_arp_rsp ), priority=100 , > match=(arp.tpa == > >> > > > > >> 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"), > action=(next;) > >> > > > > >>> + table=??(ls_in_arp_rsp ), priority=100 , > match=(arp.tpa == > >> > > > > >> 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff > && inport > >> > > > == > >> > > > > >> "S1-vm1"), action=(next;) > >> > > > > >>> table=??(ls_in_arp_rsp ), priority=100 , > match=(nd_ns && > >> > > > > >> ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == > fd00::10 && > >> > > > inport > >> > > > > >> == "S1-vm1"), action=(next;) > >> > > > > >>> - table=??(ls_in_arp_rsp ), priority=50 , > match=(arp.tpa == > >> > > > > >> 192.168.0.10 && arp.op == 1), action=(eth.dst = eth.src; > eth.src = > >> > > > > >> 50:54:00:00:00:10; arp.op = 2; /* ARP reply */ arp.tha = > arp.sha; > >> > > > arp.sha = > >> > > > > >> 50:54:00:00:00:10; arp.tpa = arp.spa; arp.spa = > 192.168.0.10; outport > >> > > > = > >> > > > > >> inport; flags.loopback = 1; output;) > >> > > > > >>> + table=??(ls_in_arp_rsp ), priority=50 , > match=(arp.tpa == > >> > > > > >> 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff), > >> > > > > >> action=(eth.dst = eth.src; eth.src = 50:54:00:00:00:10; > arp.op = 2; > >> > > > /* ARP > >> > > > > >> reply */ arp.tha = arp.sha; arp.sha = 50:54:00:00:00:10; > arp.tpa = > >> > > > arp.spa; > >> > > > > >> arp.spa = 192.168.0.10; outport = inport; flags.loopback = > 1; output;) > >> > > > > >>> table=??(ls_in_arp_rsp ), priority=50 , > match=(nd_ns && > >> > > > > >> ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == > fd00::10), > >> > > > > >> action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = > fd00::10; > >> > > > nd.target > >> > > > > >> = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; > >> > > > flags.loopback = > >> > > > > >> 1; output; };) > >> > > > > >> > > > I took a look at this patch. There's one problem with this > patch. If > >> > > > you see the test case changes, this patch now > >> > > > replaces the below logical flow > >> > > > > >> > > > table=??(ls_in_arp_rsp ), priority=100 , match=(arp.tpa == > >> > > > 192.168.0.10 && arp.op == 1 && inport == "S1-vm1"), action=(next;) > >> > > > > >> > > > with > >> > > > > >> > > > table=??(ls_in_arp_rsp ), priority=100 , match=(arp.tpa == > >> > > > 192.168.0.10 && arp.op == 1 && eth.dst == ff:ff:ff:ff:ff:ff && > inport > >> > > > == "S1-vm1"), action=(next;) > >> > > > > >> > > > This is wrong. If there is an ARP request packet entering from a > >> > > > router port to the switch, we were advancing this packet to the > next > >> > > > stage. > >> > > > But now, we only advance if it's a broadcast pkt. Please correct > this. > >> > > > > >> > > > This does not look wrong, the router is yet another client that > uses ARP > >> > > to map IP address to MAC address. > >> > > If the MAC address is known, the router will not send ARP to the > target at > >> > > all, in other cases it will send it to broadcast where > >> > > we will reply from the ARP responder. > >> > > >> > Yes. You're right. > >> > >> The patch looks correct to me, but shall we add the same for IPv6 (just > a few lines below this in the same function)? > >> > >> Thanks, > >> Han > >> > >> > > >> > Thanks > >> > Numan > >> > > >> > > >> > > >> > > > >> > > > >> > > > Also please update the logical flow documentation in > ovn-northd.8.xml > >> > > > for the "ls_in_arp_rsp" stage. I think its also worth mentioning > this > >> > > > change in the > >> > > > NEWS file. > >> > > > > >> > > > Will update in next revision. > >> > > > >> > > > >> > > > Thanks > >> > > > Numan > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >>> ]) > >> > > > > >>> > >> > > > > >>> -- > >> > > > > >>> 2.40.1 > >> > > > > >>> > >> > > > > >>> _______________________________________________ > >> > > > > >>> dev mailing list > >> > > > > >>> d...@openvswitch.org > >> > > > > >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > > > >>> > >> > > > > >> > >> > > > > > > >> > > > > > > >> > > > > > >> > > > > _______________________________________________ > >> > > > > dev mailing list > >> > > > > d...@openvswitch.org > >> > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > > > >> > > > >> > > > >> > > -- > >> > > <https://www.mirantis.com/> > >> > > Vasyl Saienko > >> > > > >> > > Principal DevOps Engineer > >> > > vsaie...@mirantis.com <dstoltenb...@mirantis.com> > >> > > +(380) 66 072 07 17 <++1+(650)+564+7038> > >> > > _______________________________________________ > >> > > dev mailing list > >> > > d...@openvswitch.org > >> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > _______________________________________________ > >> > dev mailing list > >> > d...@openvswitch.org > >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > > -- > > > > Vasyl Saienko > > > > Principal DevOps Engineer > > > > vsaie...@mirantis.com > > +(380) 66 072 07 17 > >> -- <https://www.mirantis.com/> Vasyl Saienko Principal DevOps Engineer vsaie...@mirantis.com <dstoltenb...@mirantis.com> +(380) 66 072 07 17 <++1+(650)+564+7038> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev