Re: [RFC PATCH v2 net-next 06/12] net: core: propagate SKB lists through packet_type lookup

2018-06-27 Thread Edward Cree
On 27/06/18 17:00, Willem de Bruijn wrote:
> On Wed, Jun 27, 2018 at 10:49 AM Edward Cree  wrote:
>> On 27/06/18 15:36, Willem de Bruijn wrote:
>>> Also, this function does more than just process network taps.
>> This is true, but naming things is hard, and I couldn't think of either a
>>  better new name for this function or a name that could fit in between
>>  __netif_receive_skb() and __netif_receive_skb_core() for the new function
>>  in my patch named __netif_receive_skb_core().  Any suggestions?
> ____netif_receive_skb_core? Not that four underscores is particularly
> readable. Perhaps __netif_receive_skb_core_inner. It's indeed tricky (and
> not the most important, I didn't mean to bikeshed).
I've gone with __netif_receive_skb_one_core() (by contrast to ..._list_core())
 for the outer function.  And I don't mind when people shed bikes :)

> Come to think of it, from your fast path assumptions, we could perhaps wrap
> ptype_all and rx_handler logic in a static_branch similar to tc and netfilter
> (and sk_memalloc_socks). Remaining branches like skip_classify, pfmemalloc
> and deliver_exact can also not be reached if all these are off, so this entire
> section can be skipped. Then it could become __netif_receive_skb_slow,
> taken only on the static branch or for vlan packets.  I do not suggest it as
> part of this patchset. It would be a pretty complex change on its own.

That is an interesting idea, but agreed that it'd be quite complex.



Re: [RFC PATCH v2 net-next 06/12] net: core: propagate SKB lists through packet_type lookup

2018-06-27 Thread Willem de Bruijn
On Wed, Jun 27, 2018 at 10:49 AM Edward Cree  wrote:
>
> On 27/06/18 15:36, Willem de Bruijn wrote:
> > On Tue, Jun 26, 2018 at 8:19 PM Edward Cree  wrote:
> >> __netif_receive_skb_taps() does a depressingly large amount of per-packet
> >>  work that can't easily be listified, because the another_round looping
> >>  makes it nontrivial to slice up into smaller functions.
> >> Fortunately, most of that work disappears in the fast path:
> >>  * Hardware devices generally don't have an rx_handler
> >>  * Unless you're tcpdumping or something, there is usually only one ptype
> >>  * VLAN processing comes before the protocol ptype lookup, so doesn't force
> >>    a pt_prev deliver
> >>  so normally, __netif_receive_skb_taps() will run straight through and return
> >>  the one ptype found in ptype_base[hash of skb->protocol].
> >>
> >> Signed-off-by: Edward Cree 
> >> ---
> >> -static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
> >> +static int __netif_receive_skb_taps(struct sk_buff *skb, bool pfmemalloc,
> >> +   struct packet_type **pt_prev)
> > A lot of code churn can be avoided by keeping local variable pt_prev and
> > calling this ppt_prev or so, then assigning just before returning on success.
> Good idea, I'll try that.
>
> > Also, this function does more than just process network taps.
> This is true, but naming things is hard, and I couldn't think of either a
>  better new name for this function or a name that could fit in between
>  __netif_receive_skb() and __netif_receive_skb_core() for the new function
>  in my patch named __netif_receive_skb_core().  Any suggestions?

____netif_receive_skb_core? Not that four underscores is particularly
readable. Perhaps __netif_receive_skb_core_inner. It's indeed tricky (and
not the most important, I didn't mean to bikeshed).

Come to think of it, from your fast path assumptions, we could perhaps wrap
ptype_all and rx_handler logic in a static_branch similar to tc and netfilter
(and sk_memalloc_socks). Remaining branches like skip_classify, pfmemalloc
and deliver_exact can also not be reached if all these are off, so this entire
section can be skipped. Then it could become __netif_receive_skb_slow,
taken only on the static branch or for vlan packets.  I do not suggest it as
part of this patchset. It would be a pretty complex change on its own.
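
To make the shape of that idea concrete, here is a rough userspace sketch of the dispatch decision only. It is not kernel code: struct pkt, slow_path_users, slow_path_inc(), slow_path_dec() and classify() are all hypothetical names made up for illustration, and a plain counter stands in for the static key. In the kernel the check would presumably be a static_branch_unlikely() on a key declared with DEFINE_STATIC_KEY_FALSE(), flipped when a tap or rx_handler is registered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical packet: only the fields this sketch needs. */
struct pkt {
	bool vlan_tagged;
	int protocol;
};

/*
 * Hypothetical count of registered ptype_all taps and rx_handlers.
 * A plain int is an assumption made for userspace; a real static key
 * would turn the test below into a patched jump rather than a load.
 */
static int slow_path_users;

static void slow_path_inc(void)
{
	slow_path_users++;
}

static void slow_path_dec(void)
{
	assert(slow_path_users > 0);
	slow_path_users--;
}

enum path { PATH_FAST, PATH_SLOW };

/*
 * Decide which receive path a packet takes: the slow path only when
 * some tap/handler is registered or the packet carries a VLAN tag.
 */
static enum path classify(const struct pkt *p)
{
	if (slow_path_users > 0 || p->vlan_tagged)
		return PATH_SLOW;
	return PATH_FAST;
}
```

The registration side (who calls slow_path_inc() and when) is where most of the complexity mentioned above would live; the sketch deliberately leaves it out.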


Re: [RFC PATCH v2 net-next 06/12] net: core: propagate SKB lists through packet_type lookup

2018-06-27 Thread Edward Cree
On 27/06/18 15:36, Willem de Bruijn wrote:
> On Tue, Jun 26, 2018 at 8:19 PM Edward Cree  wrote:
>> __netif_receive_skb_taps() does a depressingly large amount of per-packet
>>  work that can't easily be listified, because the another_round looping
>>  makes it nontrivial to slice up into smaller functions.
>> Fortunately, most of that work disappears in the fast path:
>>  * Hardware devices generally don't have an rx_handler
>>  * Unless you're tcpdumping or something, there is usually only one ptype
>>  * VLAN processing comes before the protocol ptype lookup, so doesn't force
>>    a pt_prev deliver
>>  so normally, __netif_receive_skb_taps() will run straight through and return
>>  the one ptype found in ptype_base[hash of skb->protocol].
>>
>> Signed-off-by: Edward Cree 
>> ---
>> -static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
>> +static int __netif_receive_skb_taps(struct sk_buff *skb, bool pfmemalloc,
>> +   struct packet_type **pt_prev)
> A lot of code churn can be avoided by keeping local variable pt_prev and
> calling this ppt_prev or so, then assigning just before returning on success.
Good idea, I'll try that.

> Also, this function does more than just process network taps.
This is true, but naming things is hard, and I couldn't think of either a
 better new name for this function or a name that could fit in between
 __netif_receive_skb() and __netif_receive_skb_core() for the new function
 in my patch named __netif_receive_skb_core().  Any suggestions?


Re: [RFC PATCH v2 net-next 06/12] net: core: propagate SKB lists through packet_type lookup

2018-06-27 Thread Willem de Bruijn
On Tue, Jun 26, 2018 at 8:19 PM Edward Cree  wrote:
>
> __netif_receive_skb_taps() does a depressingly large amount of per-packet
>  work that can't easily be listified, because the another_round looping
>  makes it nontrivial to slice up into smaller functions.
> Fortunately, most of that work disappears in the fast path:
>  * Hardware devices generally don't have an rx_handler
>  * Unless you're tcpdumping or something, there is usually only one ptype
>  * VLAN processing comes before the protocol ptype lookup, so doesn't force
>    a pt_prev deliver
>  so normally, __netif_receive_skb_taps() will run straight through and return
>  the one ptype found in ptype_base[hash of skb->protocol].
>
> Signed-off-by: Edward Cree 
> ---

> -static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
> +static int __netif_receive_skb_taps(struct sk_buff *skb, bool pfmemalloc,
> +   struct packet_type **pt_prev)

A lot of code churn can be avoided by keeping local variable pt_prev and
calling this ppt_prev or so, then assigning just before returning on success.
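
Roughly what I have in mind, as a standalone sketch rather than the actual kernel function. The struct, the RX_* constants and lookup_ptype() are hypothetical stand-ins; only the pt_prev/ppt_prev naming comes from the suggestion above. The body keeps using a local pt_prev exactly as before, so none of its lines need to change, and the out-parameter is written once on the success path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct packet_type. */
struct packet_type {
	int id;
};

#define RX_SUCCESS 0
#define RX_DROP    1

/*
 * Hypothetical lookup. The loop works on the local pt_prev; the
 * caller's *ppt_prev is assigned only just before a successful return
 * and is left untouched on the drop path.
 */
static int lookup_ptype(struct packet_type *table, size_t n, int wanted,
			struct packet_type **ppt_prev)
{
	struct packet_type *pt_prev = NULL;
	size_t i;

	assert(ppt_prev != NULL);

	for (i = 0; i < n; i++) {
		if (table[i].id == wanted)
			pt_prev = &table[i];
	}

	if (!pt_prev)
		return RX_DROP;

	*ppt_prev = pt_prev;
	return RX_SUCCESS;
}
```

With this shape the diff would only touch the prototype and the return path, instead of every pt_prev use in the function.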

Also, this function does more than just process network taps.

>  {
> -	struct packet_type *ptype, *pt_prev;
>  	rx_handler_func_t *rx_handler;
>  	struct net_device *orig_dev;
>  	bool deliver_exact = false;
> +	struct packet_type *ptype;
>  	int ret = NET_RX_DROP;
>  	__be16 type;
>
> @@ -4514,7 +4515,7 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
>  	skb_reset_transport_header(skb);
>  	skb_reset_mac_len(skb);
>
> -	pt_prev = NULL;
> +	*pt_prev = NULL;
>
>  another_round:
>  	skb->skb_iif = skb->dev->ifindex;
> @@ -4535,25 +4536,25 @@ static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc)
>  		goto skip_taps;
>
>  	list_for_each_entry_rcu(ptype, &ptype_all, list) {
> -		if (pt_prev)
> -			ret = deliver_skb(skb, pt_prev, orig_dev);
> -		pt_prev = ptype;
> +		if (*pt_prev)
> +			ret = deliver_skb(skb, *pt_prev, orig_dev);
> +		*pt_prev = ptype;
>  	}
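
For readers unfamiliar with the pt_prev idiom in the loop above, the following standalone sketch shows the deferred-delivery pattern in isolation. It is an illustration under assumptions, not the kernel code: struct handler, deliver() and run_taps() are hypothetical, and a counter replaces the real delivery. Each handler is delivered to only once the next one has been found, so the last match is left pending and handed back to the caller, which is what lets the patch return it through the pt_prev out-parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler; the counter records how often it was called. */
struct handler {
	int delivered;
};

/* Hypothetical stand-in for deliver_skb(): just counts the delivery. */
static int deliver(struct handler *h)
{
	assert(h != NULL);
	h->delivered++;
	return 0;
}

/*
 * Walk the taps, delivering to the previous match each time a new one
 * is found. The final match is not delivered here; it is returned so
 * the caller decides how (and whether) to deliver it.
 */
static struct handler *run_taps(struct handler *taps, size_t n,
				struct handler *pt_prev)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (pt_prev)
			deliver(pt_prev);
		pt_prev = &taps[i];
	}
	return pt_prev;
}
```

With three taps, the first two are delivered inside the loop and the third comes back undelivered; with no taps, whatever pt_prev was passed in is returned unchanged.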