On Sun, Dec 10, 2017 at 9:02 AM, Alexander Duyck
<alexander.du...@gmail.com> wrote:
> On Sat, Dec 9, 2017 at 10:40 PM, Michael Chan <michael.c...@broadcom.com> 
> wrote:
>> It is possible that if you have incoming packets 1, 2, 3, 4, 5 for a
>> TCP connection, HW_GRO can aggregate packets 1, 2, 3, but cannot
>> aggregate packets 4 and 5 due to hardware resource limitation.
>> Software GRO aggregates 4 and 5.  So it works well together.
>
> Right. But in the case where 1, 2, 3, 4, and 5 were not aggregated by
> hardware GRO because the frames could not be aggregated and then GRO
> burns cycles coming to the same conclusion you have waste. Same thing
> goes for if hardware GRO aggregates 1 through 5 and then SW GRO tries
> to see if it can do more.
>
> They are both doing the same thing, but what I see it as is two
> passes, not something where they are working together. The hardware
> GRO can rely on software GRO for a second pass, but it doesn't need
> to. The fact that it doesn't need to tells me that it isn't a hard
> requirement to have GRO in order to make use of software GRO.

I guess I look at this as feature propagation from net device to
hardware.  To me, it makes a lot of sense.

We've been doing hardware GRO for a while and I never think of it as
replacement for software GRO.  It's hardware accelerated GRO for a
subset of the connections that hardware can handle.

As for the additional GRO pass in software, I think it is quite
efficient.  When hardware has aggregated a GRO frame, software GRO
will effectively "flush" it and never hold it for more aggregation.
After this patchset is done, I can look at the code and see if we can
further optimize the "2nd pass" code path when hardware has already
aggregated the packet.

Anyway, I will move the GRO_HW/GRO dependency to the
ndo_fix_features() of the 3 drivers, so we can move on and get these
patches accepted.

>> May be you meant put the RXCSUM check in the outer if statement so
>> that someone could add more inner checks?  OK, I think that's what you
>> meant.
>
> Yes that is what I meant.

OK.  Will change in v4.

>> We need patch #2 otherwise generic GRO won't work on these 3 drivers.
>> I don't think I fully understand your concerns about propagation.  To
>> me propagation is just a usage model where an upper device will
>> control the common features of lower devices.  It is more convenient
>> to have propagation, but requires upper devices to be aware of all
>> features that propagate (GRO, RXCSUM).  Without propagation, it is
>> still fine.
>
> I'm not sure if it is. It depends on how much XDP depends on the
> frames being non-linear. As-is I am pretty sure this doesn't work for
> the stacked case anyway since GRO was still enabled for lower devices
> anyway. So you might look at just modifying patch 2 to not worry about
> the stacked devices case since I think that was already broken with
> GRO anyway.

OK.  I don't think anyone will run generic XDP on an upper device anyway.

Reply via email to