On Mon, Oct 14, 2013 at 02:07:48PM -0700, Eric Dumazet wrote:
> On Mon, 2013-10-14 at 09:49 +0200, Ingo Molnar wrote:
> > * Andi Kleen <a...@firstfloor.org> wrote:
> > 
> > > Neil Horman <nhor...@tuxdriver.com> writes:
> > > 
> > > > Sébastien Dugué reported to me that devices implementing ipoib (which 
> > > > don't have checksum offload hardware) were spending a significant 
> > > > amount of time computing
> > > 
> > > Must be an odd workload, most TCP/UDP workloads do copy-checksum 
> > > anyways. I would rather investigate why that doesn't work.
> > 
> > There's a fair amount of csum_partial()-only workloads, a packet does not 
> > need to hit user-space to be a significant portion of the system's 
> > workload.
> > 
> > That said, it would indeed be nice to hear which particular code path was 
> > hit in this case, if nothing else then for education purposes.
> 
> Many NICs do not provide CHECKSUM_COMPLETE information for encapsulated
> frames, meaning we have to fall back to software csum to validate
> TCP frames once the tunnel header is pulled.
> 
> So to reproduce the issue, all you need is to set up a GRE tunnel between
> two hosts, and use any tcp stream workload.
> 
> Then the receiver profile looks like:
> 
> 11.45%        [kernel]         [k] csum_partial
>  3.08%        [kernel]         [k] _raw_spin_lock
>  3.04%        [kernel]         [k] intel_idle
>  2.73%        [kernel]         [k] ipt_do_table
>  2.57%        [kernel]         [k] __netif_receive_skb_core
>  2.15%        [kernel]         [k] copy_user_generic_string
>  2.05%        [kernel]         [k] __hrtimer_start_range_ns
>  1.42%        [kernel]         [k] ip_rcv
>  1.39%        [kernel]         [k] kmem_cache_free
>  1.36%        [kernel]         [k] _raw_spin_unlock_irqrestore
>  1.24%        [kernel]         [k] __schedule
>  1.13%        [bnx2x]          [k] bnx2x_rx_int
>  1.12%        [bnx2x]          [k] bnx2x_start_xmit
>  1.11%        [kernel]         [k] fib_table_lookup
>  0.99%        [ip_tunnel]      [k] ip_tunnel_lookup
>  0.91%        [ip_tunnel]      [k] ip_tunnel_rcv
>  0.90%        [kernel]         [k] check_leaf.isra.7
>  0.89%        [kernel]         [k] nf_iterate
> 
As I noted previously, the workload that this got reported on was ipoib, which
has a similar profile, since infiniband cards tend not to be able to do
checksum offload for ip frames.
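
For anyone following along who hasn't looked at what the software fallback
actually computes: the arithmetic behind csum_partial() is the 16-bit ones'
complement sum described in RFC 1071. Below is a minimal user-space sketch of
that sum, including the final fold and complement. It is not the kernel's
implementation (which is arch-specific and heavily optimized, and returns an
unfolded 32-bit partial sum); the name inet_checksum() is made up purely for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: RFC 1071 Internet checksum
 * over a byte buffer, treating the data as big-endian 16-bit words.
 */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;	/* wide accumulator so carries cannot overflow */

	/* Add up the buffer two bytes at a time. */
	while (len > 1) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}

	/* An odd trailing byte is padded with a zero low byte. */
	if (len == 1)
		sum += (uint32_t)buf[0] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	/* The checksum is the ones' complement of the folded sum. */
	return (uint16_t)~sum;
}
```

Every byte of the payload has to pass through a loop like this when the NIC
gives us nothing to work with, which is why the cost scales with throughput
and shows up at the top of the profile. Running the same function over data
that already carries a valid checksum yields 0, which is how a receiver
validates a frame.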

Neil
