On Thu, Mar 31, 2016 at 2:30 PM, Pravin B Shelar <pshe...@ovn.org> wrote:
> In the STT implementation we saw performance improvements from
> linearizing the skb in the SLUB case. So the following patch skips
> the zero-copy operation for that case.
>
> Tested-By: Vasmi Abidi <vab...@vmware.com>
> Signed-off-by: Pravin B Shelar <pshe...@ovn.org>

Can you add some performance numbers to the commit message?

> diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
> index eb397e8..ae33d5e 100644
> --- a/datapath/linux/compat/stt.c
> +++ b/datapath/linux/compat/stt.c
>  static int coalesce_skb(struct sk_buff **headp)
>  {
> -       struct sk_buff *frag, *head, *prev;
> +       struct sk_buff *frag, *head;
> +#ifndef SKIP_ZERO_COPY
> +       struct sk_buff *prev;
> +#endif
>         int err;
>
>         err = straighten_frag_list(headp);

I don't think that we need to straighten the frag list in the
SKIP_ZERO_COPY case. It's basically just undoing what the for loop
that forms the frag list will do. __skb_linearize() will be able to
handle it in any case.
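
Roughly what I'm picturing, i.e. something like this near the top of
coalesce_skb(), ahead of the straighten_frag_list() call (sketch only,
not compile-tested):

#ifdef SKIP_ZERO_COPY
	/* frag list geometry doesn't matter here: __skb_linearize()
	 * goes through __pskb_pull_tail()/skb_copy_bits(), which walk
	 * the frag list whether or not it was straightened first */
	if (skb_shinfo(*headp)->frag_list)
		return __skb_linearize(*headp);
	return 0;
#endif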

> +#ifdef SKIP_ZERO_COPY
> +       if (skb_shinfo(head)->frag_list) {
> +               err = __skb_linearize(head);
> +               return err;
> +       }
> +#endif

Maybe move this to try_to_segment()? That seems a little more
consistent.
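
I.e. something like this at the top of try_to_segment() (rough sketch,
not compile-tested, assuming the skb is in hand there as 'skb'):

#ifdef SKIP_ZERO_COPY
	/* under SKIP_ZERO_COPY a frag-listed skb just gets linearized
	 * instead of being segmented, same as the patch does in
	 * coalesce_skb() today */
	if (skb_shinfo(skb)->frag_list)
		return __skb_linearize(skb);
#endif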

>  static int segment_skb(struct sk_buff **headp, bool csum_partial,
>                        bool ipv4, bool tcp, int l4_offset)
>  {
> +#ifndef SKIP_ZERO_COPY
>         int err;
>
>         err = coalesce_skb(headp);
> @@ -543,6 +575,7 @@ static int segment_skb(struct sk_buff **headp, bool csum_partial,
>         if (skb_shinfo(*headp)->frag_list)
>                 return __try_to_segment(*headp, csum_partial,
>                                         ipv4, tcp, l4_offset);
> +#endif
>         return 0;
>  }

Is this OK? It will retain an skb with a frag_list on the transmit path
where this wasn't possible before. This used to cause kernel panics
since the skb geometry wasn't even when the packet went to be
segmented. I don't remember whether that is still the case, but if it
isn't then it seems like we should be able to simplify the code
regardless of this patch.

> @@ -1093,6 +1141,15 @@ static struct sk_buff *reassemble(struct sk_buff *skb)
>
>         frag = lookup_frag(dev_net(skb->dev), stt_percpu, &key, hash);
>         if (!frag->skbs) {
> +               int err;
> +
> +               err = pskb_expand_head(skb, skb_headroom(skb),
> +                                      skb->data_len + tot_len, GFP_ATOMIC);
> +               if (likely(!err)) {
> +                       if (unlikely(!__pskb_pull_tail(skb, skb->data_len)))
> +                               BUG();
> +               }

This linearizes the packet even in non-SKIP_ZERO_COPY cases. I guess
we probably don't want to do that. It's also possible that the first
skb received isn't necessarily the first packet in the reassembled
packet.

This is effectively optimizing for the case where all packets are
large and will generate a frag list after reassembly. However, in many
situations packets can be reassembled with zero copy and never need to
be split afterwards (say the total packet is around 20k). Do you know
how the performance compares vs. just linearizing at the end in that
case? Is there a benefit to doing an early copy?
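
For the first point, at least, gating the copy would keep the existing
zero-copy behaviour for non-SKIP_ZERO_COPY builds (untested sketch of
the same hunk):

	if (!frag->skbs) {
#ifdef SKIP_ZERO_COPY
		int err;

		err = pskb_expand_head(skb, skb_headroom(skb),
				       skb->data_len + tot_len, GFP_ATOMIC);
		if (likely(!err)) {
			if (unlikely(!__pskb_pull_tail(skb, skb->data_len)))
				BUG();
		}
#endif
		/* rest of the !frag->skbs handling as in your patch */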

> @@ -1154,8 +1216,10 @@ static struct sk_buff *reassemble(struct sk_buff *skb)
>
>         FRAG_CB(frag->skbs)->first.set_ecn_ce |= INET_ECN_is_ce(iph->tos);
>         FRAG_CB(frag->skbs)->first.rcvd_len += skb->len;
> -       FRAG_CB(frag->skbs)->first.mem_used += skb->truesize;
> -       stt_percpu->frag_mem_used += skb->truesize;
> +       if (!copied_skb) {
> +               FRAG_CB(frag->skbs)->first.mem_used += skb->truesize;
> +               stt_percpu->frag_mem_used += skb->truesize;
> +       }

pskb_expand_head() doesn't actually change the truesize so our count
will be more inaccurate than before. However, we don't have to worry
about the attack case of very small packets consuming a large amount
of memory due to having many copies of struct sk_buff.
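
If we do want the count to stay closer to reality, one option might be
to charge the growth explicitly at the copy site. Rough sketch,
untested; it assumes pskb_expand_head() really does leave truesize
alone on the kernels we care about and that skb_end_offset() is
available in the compat layer:

	unsigned int osize = skb_end_offset(skb);

	err = pskb_expand_head(skb, skb_headroom(skb),
			       skb->data_len + tot_len, GFP_ATOMIC);
	if (likely(!err)) {
		/* rx path, the skb isn't socket-owned yet, so adjusting
		 * truesize here should be safe */
		skb->truesize += skb_end_offset(skb) - osize;
		if (unlikely(!__pskb_pull_tail(skb, skb->data_len)))
			BUG();
	}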