On Thu, Dec 1, 2016 at 2:47 PM, Hannes Frederic Sowa
<han...@stressinduktion.org> wrote:
> Side note:
>
> On 01.12.2016 20:51, Tom Herbert wrote:
>>> > E.g. "mini-skb": Even if we assume that this provides a speedup
>>> > (where does that come from? should make no difference if a 32 or
>>> > 320 byte buffer gets allocated).
>>> >
>> It's the zero'ing of three cache lines. I believe we talked about that
>> as netdev.
>
> Jesper and me played with that again very recently:
>
> https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_memset.c#L590
>
> In micro-benchmarks we saw a pretty good speed up not using the rep
> stosb generated by gcc builtin but plain movq's. Probably the cost model
> for __builtin_memset in gcc is wrong?
>
> When Jesper is free we wanted to benchmark this and maybe come up with a
> arch specific way of cleaning if it turns out to really improve throughput.
>
> SIMD instructions seem even faster but the kernel_fpu_begin/end() kill
> all the benefits.
>

One nice direction of XDP is that it forces drivers to defer allocating (and hence zero'ing) skbuffs. In the receive path I think we can exploit this property deeper into the stack. The only time we _really_ need to allocate an skbuff is when we need to put the packet onto a queue. All the other use cases are really just to pass a structure containing a packet from function to function. For that purpose we should be able to just pass a much smaller structure in a stack argument and only allocate an skbuff when we need to enqueue. In cases where we never queue a packet we might never need to allocate an skbuff at all; this includes pure ACKs and packets that end up being dropped.
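To make the deferred-allocation idea concrete, here is a minimal userspace sketch, not kernel code. Everything in it is a hypothetical stand-in: `struct big_buf` plays the role of an skbuff (large, zeroed on allocation), `struct pkt_ref` is the small structure passed by value, and `classify()` is a toy placeholder for real protocol processing.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skbuff: large, and zeroed when allocated. */
struct big_buf {
    unsigned char meta[192];      /* about three 64-byte cache lines */
    const unsigned char *data;
    size_t len;
    struct big_buf *next;
};

/* Hypothetical lightweight descriptor, small enough to pass by value. */
struct pkt_ref {
    const unsigned char *data;
    size_t len;
};

enum verdict { PKT_DROP, PKT_CONSUMED, PKT_QUEUE };

/* Counts how many times the expensive allocation actually happened. */
static unsigned long big_buf_allocs;

/* The costly step: allocate and zero the full structure. */
static struct big_buf *big_buf_from_ref(struct pkt_ref p)
{
    struct big_buf *b = calloc(1, sizeof(*b));

    if (!b)
        return NULL;
    big_buf_allocs++;
    b->data = p.data;
    b->len = p.len;
    return b;
}

/*
 * Toy classifier (an assumption for illustration only): empty packets are
 * dropped, a leading zero byte means "handled inline" (think pure ACK),
 * anything else has to be queued.
 */
static enum verdict classify(struct pkt_ref p)
{
    if (p.len == 0)
        return PKT_DROP;
    if (p.data[0] == 0)
        return PKT_CONSUMED;
    return PKT_QUEUE;
}

/*
 * Receive one packet. The big buffer is allocated only on the queueing
 * path; drops and inline-consumed packets never pay for it.
 * Returns the queued buffer, or NULL if nothing was queued.
 */
static struct big_buf *rx_one(struct pkt_ref p, struct big_buf **queue)
{
    struct big_buf *b;

    if (classify(p) != PKT_QUEUE)
        return NULL;

    b = big_buf_from_ref(p);
    if (!b)
        return NULL;
    b->next = *queue;
    *queue = b;
    return b;
}
```

The point is only the shape of the call chain: the descriptor travels as a stack argument, and `calloc()` is reached on exactly one of the three paths.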

But even more than that, if a received packet generates a TX packet (as a SYN causes a SYN-ACK) then we might even be able to just recycle the received packet and avoid needing any skbuff allocation on transmit (XDP_TX already does this in a limited context); this could be a win for handling SYN attacks, for instance. Also, since we don't queue on the socket buffer for UDP, it's conceivable we could avoid skbuffs in an expedited UDP TX path.
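
As a rough illustration of recycling a received buffer for transmit, the sketch below turns a frame around in place by swapping the Ethernet source and destination addresses, which is the kind of rewrite an XDP_TX program typically does. It is a userspace toy: `reflect_frame()` and the `SK_` constants are hypothetical names, and a real SYN-ACK would also need the IP and TCP headers, sequence numbers and checksums rewritten.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ETH_ALEN 6    /* bytes in a MAC address */
#define SK_ETH_HLEN 14   /* dst MAC + src MAC + 2-byte EtherType */

/*
 * Reuse the receive buffer as the transmit buffer: swap destination and
 * source MAC in place, leaving the rest of the frame untouched.
 * Returns 0 on success, -1 if the frame is too short to hold a header.
 */
static int reflect_frame(uint8_t *frame, size_t len)
{
    uint8_t tmp[SK_ETH_ALEN];

    if (len < SK_ETH_HLEN)
        return -1;

    memcpy(tmp, frame, SK_ETH_ALEN);                  /* save dst */
    memcpy(frame, frame + SK_ETH_ALEN, SK_ETH_ALEN);  /* dst = src */
    memcpy(frame + SK_ETH_ALEN, tmp, SK_ETH_ALEN);    /* src = old dst */
    return 0;
}
```

No second buffer is ever allocated; the same memory that held the request goes back out as the reply.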

Currently, nearly the whole stack depends on packets always being passed in skbuffs; however, __skb_flow_dissect is an interesting exception, as it can handle packets passed either in an skbuff or as just a void *, so we know that this "dual mode" is at least possible. Trying to retrofit the whole stack to handle both skbuffs and raw pages is probably untenable at this point, but selectively augmenting some performance-critical functions for dual mode (the ip_rcv, tcp_rcv, udp_rcv functions, for instance) might work.

Thanks,
Tom

> Bye,
> Hannes
>
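
For reference, a minimal userspace sketch of what a "dual mode" function could look like. It loosely follows the __skb_flow_dissect convention of taking both a buffer pointer and a raw data pointer, with the raw pointer taking precedence when it is non-NULL. `struct toy_buf` and `ip_version()` are hypothetical names, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an skbuff: just a data pointer and a length. */
struct toy_buf {
    const uint8_t *data;
    size_t len;
};

/*
 * Dual-mode accessor: the caller passes either a raw pointer and length,
 * or a buffer (with raw == NULL), and the same parsing code serves both.
 * Returns the IP version nibble of the first byte, or -1 if no data.
 */
static int ip_version(const struct toy_buf *buf,
                      const uint8_t *raw, size_t raw_len)
{
    if (!raw) {
        if (!buf)
            return -1;
        /* Fall back to the buffer's own data, as the skbuff path would. */
        raw = buf->data;
        raw_len = buf->len;
    }
    if (!raw || raw_len < 1)
        return -1;

    return raw[0] >> 4;
}
```

A driver-level caller would use `ip_version(NULL, page_ptr, len)` before any buffer exists, while existing callers keep passing their buffer with a NULL raw pointer.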