On Tue, Apr 13, 2021 at 3:27 PM Eric Dumazet <eduma...@google.com> wrote:
>
> On Tue, Apr 13, 2021 at 2:57 PM Michael S. Tsirkin <m...@redhat.com> wrote:
> >
> > On Mon, Apr 12, 2021 at 06:47:07PM +0200, Eric Dumazet wrote:
> > > On Mon, Apr 12, 2021 at 6:31 PM Eric Dumazet <eduma...@google.com> wrote:
> > > >
> > > > On Mon, Apr 12, 2021 at 6:28 PM Linus Torvalds <torva...@linux-foundation.org> wrote:
> > > > >
> > > > > On Sun, Apr 11, 2021 at 10:14 PM Guenter Roeck <li...@roeck-us.net> wrote:
> > > > > >
> > > > > > Qemu test results:
> > > > > >         total: 460 pass: 459 fail: 1
> > > > > > Failed tests:
> > > > > >         sh:rts7751r2dplus_defconfig:ata:net,virtio-net:rootfs
> > > > > >
> > > > > > The failure bisects to commit 0f6925b3e8da ("virtio_net: Do not
> > > > > > pull payload in skb->head"). It is a spurious problem - the test
> > > > > > passes roughly every other time. When the failure is seen, udhcpc
> > > > > > fails to get an IP address and aborts with SIGTERM. So far I have
> > > > > > only seen this with the "sh" architecture.
> > > > >
> > > > > Hmm. Let's add in some more of the people involved in that commit, and
> > > > > also netdev.
> > > > >
> > > > > Nothing in there looks like it should have any interaction with
> > > > > architecture, so that "it happens on sh" sounds odd, but maybe it's
> > > > > some particular interaction with the qemu environment.
> > > >
> > > > Yes, maybe.
> > > >
> > > > I spent a few hours on this, and suspect a buggy memcpy() implementation
> > > > on SH, but this was not conclusive.
> > > >
> > > > By pulling one extra byte, the problem goes away.
> > > >
> > > > Strange thing is that the udhcpc process does not go past sendto().
> > >
> > > This is the patch working around the issue. Unfortunately I was not
> > > able to root-cause it (I really suspect something on SH)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 0824e6999e49957f7aaf7c990f6259792d42f32b..fd890a951beea03bdf24406809042666eb972655 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -408,11 +408,17 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
> > >
> > >         /* Copy all frame if it fits skb->head, otherwise
> > >          * we let virtio_net_hdr_to_skb() and GRO pull headers as needed.
> > > +        *
> > > +        * Apparently, pulling only the Ethernet header triggers a bug on qemu-system-sh4.
> > > +        * Since GRO aggregation really cares about IPv4/IPv6, pull 20 bytes
> > > +        * more to work around this bug: these 20 bytes cannot belong
> > > +        * to UDP/TCP payload.
> > > +        * As a bonus, this makes GRO slightly faster for IPv4 (one less copy).
> > >          */
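[The hunk is quoted here only up to the comment, so the lines below are just a sketch of the copy-size logic that comment describes, reusing the page_to_skb() names from 0f6925b3e8da (copy, metasize, len, p, skb_tailroom(), skb_put_data()); the exact expression is an assumption, not the elided patch lines.]

	/* Sketch only: copy the Ethernet header plus 20 more bytes (the size
	 * of an IPv4 header) into skb->head, so GRO does not need to pull
	 * network headers out of the page fragment for IPv4.
	 */
	if (len <= skb_tailroom(skb))
		copy = len;		/* small frame: copy it entirely */
	else
		copy = ETH_HLEN + sizeof(struct iphdr) + metasize;
	skb_put_data(skb, p, copy);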
> >
> > Question: do we still want to do this for performance reasons?
> > We also have the hdr_len coming from the device which is
> > just skb_headlen on the host.
>
> Well, putting 20 bytes in skb->head will disable frag0 optimization.
>
> The change would only benefit the sh architecture :)
>
> About hdr_len, I suppose we could try it, with appropriate safety checks.

I have added traces; hdr_len seems to be 0 with the qemu-system-sh4 I am using.

Have I understood you correctly?

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 0824e6999e49957f7aaf7c990f6259792d42f32b..f024860f7dc260d4efbc35a3b8ffd358bd0da894 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -399,9 +399,10 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
                hdr_padded_len = sizeof(struct padded_vnet_hdr);

        /* hdr_valid means no XDP, so we can copy the vnet header */
-       if (hdr_valid)
+       if (hdr_valid) {
                memcpy(hdr, p, hdr_len);
-
+               pr_err("hdr->hdr_len=%u\n", hdr->hdr.hdr_len);
+       }
        len -= hdr_len;
        offset += hdr_padded_len;
        p += hdr_padded_len;
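[If the device-supplied hdr_len were used for the copy instead, as suggested above, it would need to be validated first: it comes from the device, and the trace above shows it can simply be 0. A rough sketch under those assumptions, again borrowing the page_to_skb() names (vi, hdr, hdr_valid, len, p, copy, metasize); the clamping bounds are my own guess at "appropriate safety checks", not a proposal from this thread.]

	/* Sketch only: trust hdr->hdr.hdr_len only after clamping it, and
	 * fall back to the fixed Ethernet + IPv4 header pull when the device
	 * reports 0 (as qemu-system-sh4 does here).
	 */
	unsigned int dev_hdr_len = hdr_valid ?
		virtio16_to_cpu(vi->vdev, hdr->hdr.hdr_len) : 0;

	if (dev_hdr_len >= ETH_HLEN && dev_hdr_len <= len &&
	    dev_hdr_len <= skb_tailroom(skb))
		copy = dev_hdr_len;
	else
		copy = min_t(unsigned int, len,
			     ETH_HLEN + sizeof(struct iphdr) + metasize);
	skb_put_data(skb, p, copy);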
