Gaëtan Rivet <gr...@u256.net> writes:

> On Tue, Jan 25, 2022, at 17:10, Mike Pattrick wrote:
>> On Mon, Dec 27, 2021 at 8:27 AM Paolo Valerio <pvale...@redhat.com> wrote:
>>>
>>> Some sporadic false positive may be visible for the following tests:
>>>
>>> - conntrack - IPv6 HTTP
>>> - conntrack - FTP over IPv6
>>>
>>> The failures show up randomly.
>>> The reason appears to be source address used when performing the
>>> request using wget:
>>> -tcp,orig=(src=fc00::1,dst=fc00::2,sport=<cleared>,dport=<cleared>),reply=(src=fc00::2,dst=fc00::1,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)
>>> +tcp,orig=(src=fe80::f0eb:f8ff:fef0:138f,dst=fc00::2,sport=<cleared>,dport=<cleared>),reply=(src=fc00::2,dst=fe80::f0eb:f8ff:fef0:138f,sport=<cleared>,dport=<cleared>),protoinfo=(state=<cleared>)
>>>
>>> It seems that the problem can be addressed in multiple ways, but using
>>> "nodad" seems to be safe enough to fix the issue that now, after
>>> hundreds of attempts, is no longer present.
>>>
>>> Signed-off-by: Paolo Valerio <pvale...@redhat.com>
>>> ---
>>
>> I wasn't able to reproduce the sporadic failures after 100 iterations
>> of "conntrack - FTP over IPv6", but I still think this fix makes
>> sense, and shouldn't break anything.
>>
>> Acked-by: Mike Pattrick <m...@redhat.com>
>>
>
> Hi Paolo,
>
> I'm sorry, I don't understand the root cause of the failure.
>
> Seeing the src IP switched from fc00:: to fe80::, and how the nodad flag
> solves it, I am guessing that assigning fc00::1 on the veth failed due
> to this address already being in use in the LAN?
>
> How does it happen if all involved links are veths? Could it be that either
> other tests are running in parallel with veths having the same IP? Or maybe
> something else unrelated to OVS?
>
> Nodad might make the link creation work, but I'm worried that it could make
> the result unpredictable, i.e. making it less likely, and less easy to
> understand and debug in the rare remaining times it triggers.
>

I see your concern, and it's a fair point.
I don't expect DAD to fail, as there are no duplicates. I probed the error
path that is supposed to handle the DAD failure, and the event never gets
generated; that is, DAD didn't fail.

What seems to happen is:

- DAD starts (the ifaddr tentative flag is set)
- The process connect()s (skipping any ifaddr with the tentative flag set)
- DAD successfully terminates (the tentative flag gets cleared)

Probing during the failure and dumping the first byte of the address (for
convenience) and the ifaddr flags seems to confirm that:

ping6 651231 [010] 15603.586147: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc
ping6 651231 [010] 15603.586148: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfe
ping6 651233 [009] 15603.692852: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc
ping6 651233 [009] 15603.692856: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfe
ping6 651266 [005] 15604.807881: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc
ping6 651266 [005] 15604.807885: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0x80 u6_addr8=0xfe
ping6 651266 [005] 15604.808005: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc
ping6 651266 [005] 15604.808007: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0x80 u6_addr8=0xfe
wget 651293 [009] 15605.041008: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc
wget 651293 [009] 15605.041010: probe:__ipv6_dev_get_saddr_L9: (ffffffffb5b5919a) flags=0x80 u6_addr8=0xfe

#define IFA_F_TENTATIVE 0x40
#define IFA_F_PERMANENT 0x80

The ping initially fails to assign the IP, but it retries until it succeeds
(IFA_F_TENTATIVE cleared for "fe80::"). The subsequent connect() call
performed by wget returns "fe80::" as it is the only one without the
tentative flag set.
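
To make the flag values easier to read, here is a rough sketch (not kernel
code) that decodes the probe lines and applies the filter described in the
steps above: an address whose tentative bit is set is skipped. The two
constants are the ones quoted above; parse_probe_line() and
usable_first_bytes() are hypothetical helper names, and the model is
deliberately simplified, since it only looks at the tentative bit and ignores
every other source address selection rule.

```python
import re

# Flag values as quoted from the kernel headers in the message above.
IFA_F_TENTATIVE = 0x40
IFA_F_PERMANENT = 0x80

# Matches the two fields dumped by the probe: ifaddr flags and the first
# byte of the candidate address.
PROBE_RE = re.compile(r"flags=(0x[0-9a-fA-F]+)\s+u6_addr8=(0x[0-9a-fA-F]+)")


def parse_probe_line(line):
    """Return (flags, first_byte) for a probe line, or None if no match."""
    m = PROBE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


def is_tentative(flags):
    """True while DAD is still in progress for the address."""
    return bool(flags & IFA_F_TENTATIVE)


def usable_first_bytes(lines):
    """Simplified model (assumption): keep only non-tentative candidates."""
    usable = []
    for line in lines:
        parsed = parse_probe_line(line)
        if parsed is None:
            continue
        flags, first_byte = parsed
        if not is_tentative(flags):
            usable.append(first_byte)
    return usable


# The first ping6 attempt: both candidates are still tentative.
EARLY_PING_LINES = [
    "ping6 651231 [010] 15603.586147: probe:__ipv6_dev_get_saddr_L9: "
    "(ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc",
    "ping6 651231 [010] 15603.586148: probe:__ipv6_dev_get_saddr_L9: "
    "(ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfe",
]

# The wget attempt: fc00:: is still tentative, fe80:: has completed DAD.
WGET_LINES = [
    "wget 651293 [009] 15605.041008: probe:__ipv6_dev_get_saddr_L9: "
    "(ffffffffb5b5919a) flags=0xc0 u6_addr8=0xfc",
    "wget 651293 [009] 15605.041010: probe:__ipv6_dev_get_saddr_L9: "
    "(ffffffffb5b5919a) flags=0x80 u6_addr8=0xfe",
]

if __name__ == "__main__":
    print("early ping6:", [hex(b) for b in usable_first_bytes(EARLY_PING_LINES)])
    print("wget:", [hex(b) for b in usable_first_bytes(WGET_LINES)])
```

With the trace above, the early ping6 attempt has no usable candidate, while
the wget attempt is left with only the address starting with 0xfe, which
matches the fe80:: source seen in the failing conntrack entry.
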

I didn't test it, but I suppose that "optimistic" would also solve the
problem. AFAICS, though, disabling DAD in a test involving namespaces is fine
(furthermore, we know there are no duplicates).
This is something we already do in "datapath - ping over vxlan6 tunnel"
and it seems ok to me.

> Best,
> --
> Gaetan Rivet

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev