The context has gotten messy here, so I’m going to break down and top-post.

I started review https://reviews.freebsd.org/D10059 with a fix for the reference-count leak. It changes the semantics so only routes within an in_pcb automatically do L2 caching.

I’ll put the tcp_output change for V6 in a separate review when this one is done.

Andrey, could you try your iperf test again? Thanks,

                Mike

On 14 Mar 2017, at 21:39, Mike Karels wrote:

On 14 Mar 2017, at 3:50, Andrey V. Elsukov wrote:

On 14.03.2017 11:40, Mike Karels wrote:
Hi All,

Eugene has reported about the following assertion in the ARP code:
        http://www.grosbein.net/freebsd/crash/arp-kassert.txt

After some investigation I found that L2 cache has reference leak, that
can lead to integer overflow and this assertion.
The one of the ways to reproduce this overflow can be demonstrated with simple IP forwarding, when ip_forward() is used (not ip_tryforward).

I asked olivier@ to reproduce this leak and he got this result:
        http://slexy.org/view/s21ql7nA0q

After further investigation I found similar leak in the IPv6 TCP path.
Simple iperf test shows these results:

# dtrace -n 'fbt::in6_lltable_dump_entry:entry {printf("%d",
args[1]->lle_refcnt);}'
dtrace: description 'fbt::in6_lltable_dump_entry:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
 51  18589     in6_lltable_dump_entry:entry 55721
 51  18589     in6_lltable_dump_entry:entry 1
 51  18589     in6_lltable_dump_entry:entry 1
 51  18589     in6_lltable_dump_entry:entry 2
 38  18589     in6_lltable_dump_entry:entry 111417
 38  18589     in6_lltable_dump_entry:entry 1
 38  18589     in6_lltable_dump_entry:entry 1

--
WBR, Andrey V. Elsukov

Thanks!  Could you try the following patch (compiles, but untested):

Index: netinet/ip_input.c
===================================================================
--- netinet/ip_input.c  (revision 315160)
+++ netinet/ip_input.c  (working copy)
@@ -60,6 +60,7 @@
 #include <net/if_types.h>
 #include <net/if_var.h>
 #include <net/if_dl.h>
+#include <net/if_llatbl.h>
 #include <net/route.h>
 #include <net/netisr.h>
 #include <net/rss_config.h>
@@ -1066,6 +1067,8 @@
        if (error == EMSGSIZE && ro.ro_rt)
                mtu = ro.ro_rt->rt_mtu;
        RO_RTFREE(&ro);
+       if (ro.ro_lle)
+               LLE_FREE(ro.ro_lle);

        if (error)
                IPSTAT_INC(ips_cantforward);

I think it would be better to set RT_LLE_CACHE flag only for protocols that expect presence of L2 cache. I.e. only for the TCP and UDP and do it in the corresponding protocol output routine, not in the ip[6]_output.

Hmm, let me think about that. TCP and UDP know nothing about L2 cache, they just use the in_pcb cache which handles it. L3 caching was removed earlier by someone who thought of it as a layering violation, which is why I tried to keep in the IP layer for the most part. I can probably find a
way to encapsulate it.


Index: netinet6/ip6_forward.c
===================================================================
--- netinet6/ip6_forward.c      (revision 315160)
+++ netinet6/ip6_forward.c      (working copy)
@@ -52,6 +52,7 @@
 #include <net/if.h>
 #include <net/if_var.h>
 #include <net/netisr.h>
+#include <net/if_llatbl.h>
 #include <net/route.h>
 #include <net/pfil.h>

@@ -431,4 +432,6 @@
 out:
        if (rt != NULL)
                RTFREE(rt);
+       if (rin6.ro_lle)
+               LLE_FREE(rin6.ro_lle);
 }

I don't think this chunk will help. ip6_forward() doesn't use
ip6_output(). And IPv6 forwarding is not affected by this problem. Look
at the tcp_output(), it uses local route variable for IPv6 output.

Ah, right, I obviously didn’t read closely enough earlier. I’ve attached a patch that should fix this, as well as adding route caching for TCP/IPv6.

I'm not sure, but probably SCTP also can be affected by this problem.

Probably true. SCTP could probably benefit from L2 caching, but this also
argues for making this more transparent.

                Mike


_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"

Reply via email to