On Tue, 28 Apr 2015 17:06:02 -0400 (EDT), Rick Macklem <rmack...@uoguelph.ca> said:
> There have been email list threads discussing how allocating 9K jumbo
> mbufs will fragment the KVM (kernel virtual memory) used for mbuf
> cluster allocation and cause grief.

The problem is not KVA fragmentation -- the clusters come from a separate map, which should prevent that -- it's that the clusters have to be physically contiguous, and an active machine is going to have trouble with that. The fact that 9k is a goofy size (two pages plus a little bit) doesn't help matters.

The other side, as Neel and others have pointed out, is that it's beneficial for the hardware to have a big chunk of physically contiguous memory to dump packets into, especially with various kinds of receive-side offloading.

I see two solutions to this, but don't have the time or resources (or, frankly, the need) to implement them (and both are probably required for different situations):

1) Reserve a big chunk of physical memory early on for big clusters. How much this needs to be will depend on the application and the particular network interface hardware, but you should be thinking in terms of megabytes or (on a big server) gigabytes -- big enough to be mapped as superpages on hardware where that's beneficial. If you have aggressive LRO, "big clusters" might be 64k or larger in size.

2) Use the IOMMU -- if it's available, which it won't be when running under a hypervisor that's already using it for passthrough -- to obviate the need for physically contiguous pages; then the problem reduces to KVA fragmentation, which is easier to avoid in the allocator.

> As far as I know (just from email discussion, never used them myself),
> you can either stop using jumbo packets or switch to a different net
> interface that doesn't allocate 9K jumbo mbufs (doing the receives of
> jumbo packets into a list of smaller mbuf clusters).

Or just hack the driver to not use them.
For the Intel drivers this is easy, and at least for the hardware I have there's no benefit to using 9k clusters over 4k; for Chelsio it's quite a bit harder.

-GAWollman