Hi all,
During my master's thesis, supervised by Prof. Luigi Rizzo, I worked
on a project to add GSO (Generic Segmentation Offload) support to
FreeBSD. I will present this project at EuroBSDcon 2014 in Sofia
(Bulgaria) on September 28, 2014.

Following is a brief description of our project:

The use of large frames makes network communication much less
demanding for the CPU. Yet backward compatibility and slow links
require the use of 1500-byte or smaller frames. Modern NICs with
hardware TCP segmentation offload (TSO) address this problem.
However, a generic software version (GSO) provided by the OS is
still useful on paths with no suitable hardware, such as between
virtual machines or with older or buggy NICs.

Much of the advantage of TSO comes from crossing the network stack only
once per (large) segment instead of once per 1500-byte frame.
GSO does the same both for segmentation (TCP) and fragmentation (UDP)
by doing these operations as late as possible. Ideally, this could be done
within the device driver, but that would require modifications to all
drivers.
A more convenient, similarly effective approach is to segment
just before the packet is passed to the driver (in ether_output()).

Our preliminary implementation supports TCP and UDP on IPv4/IPv6;
it only intercepts packets larger than the MTU (others are left unchanged),
and only when GSO is marked as enabled for the interface.
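
The interception rule above can be written as a small predicate. This
is only a user-space sketch: struct sk_ifstate and sk_defer_to_gso()
are hypothetical names made up for illustration, not identifiers from
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-interface state, for illustration only. */
struct sk_ifstate {
	size_t mtu;         /* interface MTU in bytes */
	bool   gso_enabled; /* GSO switched on for this interface */
};

/*
 * Return true when a packet should be left unsplit by the upper
 * layers and handed to late segmentation: it must be larger than
 * the MTU, and GSO must be enabled on the outgoing interface.
 */
static bool
sk_defer_to_gso(const struct sk_ifstate *ifs, size_t pkt_len)
{
	assert(ifs != NULL);
	return (ifs->gso_enabled && pkt_len > ifs->mtu);
}
```

With an MTU of 1500 and GSO enabled, a 9000-byte packet is deferred
while a 1500-byte packet follows the usual path unchanged.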

Segments larger than the MTU are not split in tcp_output(),
udp_output(), or ip_output(), but marked with a flag (contained in
m_pkthdr.csum_flags), which is processed by ether_output() just
before calling the device driver.

ether_output(), through gso_dispatch(), splits the large frame as needed,
creating the headers for each piece and computing checksums in software
if the hardware does not support checksum offloading.
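
To make the splitting step concrete, here is a minimal user-space
sketch of the payload bookkeeping for the TCP case. It is not the code
from gso.c: struct sk_segment and sk_tcp_split() are hypothetical
names, and a real implementation would also have to copy and adjust
the Ethernet/IP/TCP headers and handle checksums for every segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One planned output segment (hypothetical layout, illustration only). */
struct sk_segment {
	uint32_t seq;  /* TCP sequence number of the first payload byte */
	size_t   off;  /* offset of this piece inside the large payload */
	size_t   len;  /* payload bytes carried by this segment */
	int      last; /* non-zero for the final segment */
};

/*
 * Plan the split of payload_len bytes into pieces of at most mss
 * bytes. Up to max_out entries are written to out[]; the return
 * value is the total number of segments needed.
 */
static size_t
sk_tcp_split(uint32_t first_seq, size_t payload_len, size_t mss,
    struct sk_segment *out, size_t max_out)
{
	size_t n = 0, off = 0;

	assert(mss > 0);
	while (off < payload_len) {
		size_t len = payload_len - off;

		if (len > mss)
			len = mss;
		if (n < max_out) {
			out[n].seq = first_seq + (uint32_t)off;
			out[n].off = off;
			out[n].len = len;
			out[n].last = (off + len == payload_len);
		}
		n++;
		off += len;
	}
	return (n);
}
```

For example, a 64000-byte payload with an MSS of 1460 (an assumed
value for a 1500-byte MTU without TCP options) yields 44 segments, the
last one carrying 1220 bytes. The stack above ether_output() is
crossed once for all of them.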

In experiments against an LRO-enabled receiver (otherwise TSO/GSO
are ineffective) we measured the following throughput at different
CPU clock speeds (because at top speed the 10G link becomes the
bottleneck):


    Testing environment (all with Intel 10Gbit NIC)
    Sender: FreeBSD 11-CURRENT - CPU i7-870 at 2.93 GHz + Turboboost
    Receiver: Linux 3.12.8 - CPU i7-3770K at 3.50GHz + Turboboost
    Benchmark tool: netperf 2.6.0

    --- TCP/IPv4 packets (checksum offloading enabled) ---
    Freq.     TSO      GSO      none     Speedup
    [GHz]     [Mbps]   [Mbps]   [Mbps]   GSO vs. none
    2.93      9347     9298     8308      12 %
    2.53      9266     9401     6771      39 %
    2.00      9408     9294     5499      69 %
    1.46      9408     8087     4075      98 %
    1.05      9408     5673     2884      97 %
    0.45      6760     2206     1244      77 %


    --- TCP/IPv6 packets (checksum offloading enabled) ---
    Freq.     TSO      GSO      none     Speedup
    [GHz]     [Mbps]   [Mbps]   [Mbps]   GSO vs. none
    2.93      7530     6939     4966      40 %
    2.53      5133     7145     4008      78 %
    2.00      5965     6331     3152     101 %
    1.46      5565     5180     2348     121 %
    1.05      8501     3607     1732     108 %
    0.45      3665     1505      651     131 %


    --- UDP/IPv4 packets (9K) ---
    Freq.     GSO      none     Speedup
    [GHz]     [Mbps]   [Mbps]   GSO vs. none
    2.93      9440     8084      17 %
    2.53      7772     6649      17 %
    2.00      6336     5338      19 %
    1.46      4748     4014      18 %
    1.05      3359     2831      19 %
    0.45      1312     1120      17 %


    --- UDP/IPv6 packets (9K) ---
    Freq.     GSO      none     Speedup
    [GHz]     [Mbps]   [Mbps]   GSO vs. none
    2.93      7281     6197      18 %
    2.53      5953     5020      19 %
    2.00      4804     4048      19 %
    1.46      3582     3004      19 %
    1.05      2512     2092      20 %
    0.45       998      826      21 %
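
The Speedup column in the tables above is consistent with the relative
gain of GSO over the "none" case, (GSO - none) / none, rounded to the
nearest percent. The helper below is only an illustration of that
arithmetic; the function names are made up here.

```c
#include <assert.h>

/* Relative gain of GSO over no offload, in percent. */
static double
sk_speedup_percent(double gso, double none)
{
	assert(none > 0.0);
	return ((gso - none) / none * 100.0);
}

/* Same value rounded to the nearest integer, as printed in the tables. */
static int
sk_speedup_rounded(double gso, double none)
{
	return ((int)(sk_speedup_percent(gso, none) + 0.5));
}
```

For the first TCP/IPv4 row, sk_speedup_rounded(9298, 8308) gives 12,
matching the table.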

We tried to change the network stack as little as possible to add
GSO support. To avoid changing the API/ABI, we temporarily used spare
fields in struct tcpcb (TCP Control Block) and struct ifnet to store
some information related to GSO (enabled, max burst size, etc.).
The code that performs the segmentation/fragmentation is contained
in the files gso.[h|c] in sys/net. We used 4 bits in m_pkthdr.csum_flags
(CSUM_GSO_MASK) to encode the packet type (TCP/IPv4, TCP/IPv6, etc.),
so that the type can be determined without accessing the
TCP/IP/Ethernet headers of each packet.
In ether_output_frame(), if the packet requires GSO
((m->m_pkthdr.csum_flags & CSUM_GSO_MASK) != 0), it is segmented
or fragmented, and the resulting packets are then sent to the device
driver.
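
As an illustration of the 4-bit encoding, the sketch below packs a
packet type into a flags word and reads it back. The bit position, the
SK_* names and the numeric type values are assumptions chosen for this
example; the real CSUM_GSO_MASK and type constants are the ones
defined by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placement of the 4 type bits (illustration only). */
#define SK_GSO_SHIFT	28
#define SK_GSO_MASK	(0xFu << SK_GSO_SHIFT)

/* Hypothetical type codes; 0 means "no GSO requested". */
enum sk_gso_type {
	SK_GSO_NONE = 0,
	SK_GSO_TCP4 = 1,
	SK_GSO_TCP6 = 2,
	SK_GSO_UDP4 = 3,
	SK_GSO_UDP6 = 4
};

/* Store the type in the flags word, preserving all other bits. */
static uint32_t
sk_gso_mark(uint32_t csum_flags, enum sk_gso_type type)
{
	assert((unsigned)type <= 0xFu);
	return ((csum_flags & ~SK_GSO_MASK) |
	    ((uint32_t)type << SK_GSO_SHIFT));
}

/* Recover the type without looking at any packet header. */
static enum sk_gso_type
sk_gso_type_of(uint32_t csum_flags)
{
	return ((enum sk_gso_type)((csum_flags & SK_GSO_MASK) >>
	    SK_GSO_SHIFT));
}

/* Same shape as the test quoted above for ether_output_frame(). */
static int
sk_needs_gso(uint32_t csum_flags)
{
	return ((csum_flags & SK_GSO_MASK) != 0);
}
```

Because the type travels with the packet metadata, the code that
dispatches on it does not need to parse the Ethernet, IP, or TCP
headers to decide how the packet must be split.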

At https://github.com/stefano-garzarella/freebsd-gso
you can find the kernel patches for FreeBSD-current, FreeBSD
10-stable, and FreeBSD 9-stable, a simple application (gso-stats.c)
that prints the GSO statistics, and picobsd images with GSO support.

At https://github.com/stefano-garzarella/freebsd-gso-src
you can get the FreeBSD source with the GSO patch (various branches
for FreeBSD current, 10-stable, 9-stable).

Any feedback, comments, or questions are welcome.

Thank you very much,
Stefano Garzarella

------------------------------------------------------------------------------------------------------------
How to use GSO:

- Apply the right kernel patch.

- To compile the GSO support, add 'options GSO' to your kernel config
  file and rebuild the kernel.

- To manage the GSO parameters there are some sysctls:
     - net.inet.tcp.gso - enables GSO on TCP communications (!=0)
     - net.inet.udp.gso - enables GSO on UDP communications (!=0)

     - for each interface:
          - net.gso.dev."ifname".max_burst - GSO burst length limit
                [default: IP_MAXPACKET=65535]
          - net.gso.dev."ifname".enable_gso - enables GSO on the
                "ifname" interface (!=0)

- To show statistics:
     - make sure that the GSO_STATS macro is defined in sys/net/gso.h
     - use the simple gso-stats.c application to access the sysctl
       net.gso.stats, which contains the address of the gsostats
       structure (defined in gso.h) that records the statistics
       (compile with -I/path/to/kernel/src/patched/)
------------------------------------------------------------------------------------------------------------
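
The per-interface sysctl names listed above can also be built and set
from a program. This is a hedged sketch: the helper names are
hypothetical, "ix0" in the usage note is just an example interface
name, and treating the values as plain integers is an assumption, not
something confirmed by the patch. sysctlbyname(3) is only called when
compiled on FreeBSD; elsewhere the setter is a stub.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Build "net.gso.dev.<ifname>.<leaf>" into buf.
 * Returns 0 on success, -1 on an empty name or truncation.
 */
static int
sk_gso_sysctl_name(char *buf, size_t buflen, const char *ifname,
    const char *leaf)
{
	int n;

	assert(buf != NULL && ifname != NULL && leaf != NULL);
	if (strlen(ifname) == 0 || strlen(leaf) == 0)
		return (-1);
	n = snprintf(buf, buflen, "net.gso.dev.%s.%s", ifname, leaf);
	return ((n < 0 || (size_t)n >= buflen) ? -1 : 0);
}

/* Set an integer sysctl (assumed type); stub returning -1 off FreeBSD. */
static int
sk_set_int_sysctl(const char *name, int value)
{
#ifdef __FreeBSD__
	return (sysctlbyname(name, NULL, NULL, &value, sizeof(value)));
#else
	(void)name;
	(void)value;
	return (-1);
#endif
}
```

For instance, building the name with ifname "ix0" and leaf
"enable_gso" produces "net.gso.dev.ix0.enable_gso", which can then be
passed to sk_set_int_sysctl() with a non-zero value on a patched
kernel.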

-- 
*Stefano Garzarella*
stefano.garzare...@gmail.com