Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-12 Thread Alexei Starovoitov
On Tue, Jul 12, 2016 at 09:52:52PM +0200, Jesper Dangaard Brouer wrote:
> >
> > >> Also unconditionally doing batch of 8 may also hurt depending on what
> > >> is happening either with the stack, bpf afterwards or even cpu version.
> > >
> > > See this as software DDIO, if the unlikely case

Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-12 Thread Jesper Dangaard Brouer
On Tue, 12 Jul 2016 09:46:26 -0700 Alexander Duyck wrote:
> On Tue, Jul 12, 2016 at 5:45 AM, Jesper Dangaard Brouer
> wrote:
> > On Mon, 11 Jul 2016 16:05:11 -0700
> > Alexei Starovoitov wrote:
> >
> >> On Mon, Jul

Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-12 Thread Alexander Duyck
On Tue, Jul 12, 2016 at 5:45 AM, Jesper Dangaard Brouer wrote:
> On Mon, 11 Jul 2016 16:05:11 -0700
> Alexei Starovoitov wrote:
>
>> On Mon, Jul 11, 2016 at 01:09:22PM +0200, Jesper Dangaard Brouer wrote:
>> > > - /* Process all completed CQEs */

Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-12 Thread Jesper Dangaard Brouer
On Mon, 11 Jul 2016 16:05:11 -0700 Alexei Starovoitov wrote:
> On Mon, Jul 11, 2016 at 01:09:22PM +0200, Jesper Dangaard Brouer wrote:
> > > - /* Process all completed CQEs */
> > > + /* Extract and prefetch completed CQEs */
> > > while (XNOR(cqe->owner_sr_opcode

Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-11 Thread Alexei Starovoitov
On Mon, Jul 11, 2016 at 01:09:22PM +0200, Jesper Dangaard Brouer wrote:
> > - /* Process all completed CQEs */
> > + /* Extract and prefetch completed CQEs */
> > while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
> > cq->mcq.cons_index & cq->size)) {
> > +

Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-11 Thread Brenden Blanco
On Mon, Jul 11, 2016 at 01:09:22PM +0200, Jesper Dangaard Brouer wrote:
[...]
> This patch is based on top of Brenden's patch 11/12, and is meant to
> replace patch 12/12.
>
> Prefetching is very important for XDP, especially when using a CPU
> without DDIO (here i7-4790K CPU @ 4.00GHz).
>

Re: [net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-11 Thread Jesper Dangaard Brouer
On Fri, 08 Jul 2016 18:02:20 +0200 Jesper Dangaard Brouer wrote:
> This patch is about prefetching without being opportunistic.
> The idea is only to start prefetching on packets that are marked as
> ready/completed in the RX ring.
>
> This is achieved by splitting the

[net-next PATCH RFC] mlx4: RX prefetch loop

2016-07-08 Thread Jesper Dangaard Brouer
This patch is about prefetching without being opportunistic. The idea is only to start prefetching on packets that are marked as ready/completed in the RX ring. This is achieved by splitting the napi_poll call mlx4_en_process_rx_cq() loop into two. The first loop extracts completed CQEs and starts
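
The two-loop idea described above can be illustrated with a small userspace sketch. This is not the mlx4 driver code: the names demo_ring, demo_cqe, demo_poll and the DEMO_* constants are hypothetical, the ring is a plain array, and __builtin_prefetch (a GCC/Clang builtin) stands in for the kernel's prefetch helper. The completion test imitates the XNOR owner-bit check quoted in the thread, and the batch size of 8 follows the figure mentioned in the discussion; both are assumptions about how the real patch behaves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RING_SIZE  16u    /* hypothetical ring size, must be a power of two */
#define DEMO_OWNER_MASK 0x80u  /* hypothetical owner bit inside owner_sr_opcode */
#define DEMO_BATCH      8u     /* batch size discussed in the thread */
#define DEMO_XNOR(x, y) (!(x) == !(y))

struct demo_cqe {
    uint8_t owner_sr_opcode;
};

struct demo_ring {
    struct demo_cqe cqe[DEMO_RING_SIZE];
    uint8_t pkt[DEMO_RING_SIZE][64];   /* stand-in for packet buffers */
    uint32_t cons_index;               /* free-running consumer index */
};

/* Start with every CQE owned by the (simulated) hardware. */
static void demo_ring_init(struct demo_ring *r)
{
    memset(r, 0, sizeof(*r));
    for (uint32_t i = 0; i < DEMO_RING_SIZE; i++)
        r->cqe[i].owner_sr_opcode = DEMO_OWNER_MASK;
}

/* Simulate the hardware completing the entry at free-running index idx. */
static void demo_ring_complete(struct demo_ring *r, uint32_t idx, uint8_t first_byte)
{
    uint32_t slot = idx & (DEMO_RING_SIZE - 1);

    r->pkt[slot][0] = first_byte;
    /* The owner bit flips on every pass around the ring. */
    r->cqe[slot].owner_sr_opcode = (idx & DEMO_RING_SIZE) ? DEMO_OWNER_MASK : 0;
}

static int demo_cqe_completed(const struct demo_ring *r, uint32_t idx)
{
    const struct demo_cqe *cqe = &r->cqe[idx & (DEMO_RING_SIZE - 1)];

    return DEMO_XNOR(cqe->owner_sr_opcode & DEMO_OWNER_MASK, idx & DEMO_RING_SIZE);
}

/* Returns the number of packets handled; adds each packet's first byte to *sum. */
static unsigned demo_poll(struct demo_ring *r, unsigned budget, unsigned *sum)
{
    unsigned done = 0;

    assert(sum != NULL);
    while (done < budget) {
        uint32_t batch[DEMO_BATCH];
        unsigned n = 0;

        /* Loop 1: extract completed CQEs and start prefetching their data. */
        while (n < DEMO_BATCH && done + n < budget &&
               demo_cqe_completed(r, r->cons_index + n)) {
            uint32_t slot = (r->cons_index + n) & (DEMO_RING_SIZE - 1);

            __builtin_prefetch(r->pkt[slot]);
            batch[n++] = slot;
        }
        if (n == 0)
            break;   /* nothing is marked ready, so nothing was prefetched */

        /* Loop 2: process the packets whose prefetch is already in flight. */
        for (unsigned i = 0; i < n; i++)
            *sum += r->pkt[batch[i]][0];

        r->cons_index += n;
        done += n;
    }
    return done;
}
```

In this sketch only entries that pass the completion check are ever prefetched, which is the "not opportunistic" property the patch description emphasizes. Whether a fixed batch of 8 helps or hurts depends on the CPU and on what runs afterwards, as the replies in the thread point out.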