qede: Tunnel hardware GRO support

Yuval Mintz Thu, 23 Jun 2016 14:13:23 -0700

>>> Then again, if you're basically saying every HW-assisted offload on
>>> receive should be done under LRO flag, what would be the use case
>>> where a GRO-assisted offload would help?
>>> I.e., afaik LRO is superior to GRO in `brute force' -
>>> it creates better packed packets and utilizes memory better
>>> [with all the obvious cons such as inability for defragmentation].
>>> So if you'd have the choice of having an adpater perform 'classic'
>>> LRO aggregation or something that resembles a GRO packet,
>>> what would be the gain from doing the latter?


> LRO and GRO shouldn't really differ in packing or anything like that.
> The big difference between the two is that LRO is destructive while
> GRO is not.  Specifically in the case of GRO you should be able to
> take the resultant frame, feed it through GSO, and get the original
> stream of frames back out.  So you can pack the frames however you
> want the only key is that you must capture all the correct offsets and
> set the gso_size correct for the flow.

While the implementation might lack in things [such as issues with
future implementation], following your logic it is GRO - I.e., forwarding
scenarios work fine with HW assisted GRO.

>> Just to relate to bnx2x/qede differences in current implementation -
>> when this GRO hw-offload was added to bnx2x, it has already
>> supported classical LRO, and due to above statement whenever LRO
>> was set driver aggregated incoming traffic as classic LRO.
>> I agree that in hindsight the lack of distinction between sw/hw GRO
>> was hurting us.

> In the case of bnx2x it sounds like you have issues that are
> significantly hurting the performance versus classic software GRO.  If
> that is the case you might want to simply flip the logic for the
> module parameter that Rick mentioned and just disable the hardware
> assisted GRO unless it is specifically requested.

A bit hard to flip; The module parameter also disables LRO support.
And given that module parameters is mostly a thing of the past, I
don't think we should strive fixing things through additional changes
in that area.

> > qede isn't implementing LRO, so we could easily mark this feature
> > under LRO there - but question is, given that the adapter can support
> > LRO, if we're going to suffer from all the shotrages that arise from
> > putting this feature under LRO, why should we bother?

> The idea is to address feature isolation.  The fact is the hardware
> exists outside of kernel control.  If you end up linking an internal
> kernel feature to your device like this you are essentially stripping
> the option of using the kernel feature.

> I would prefer to see us extend LRO to support "close enough GRO"
> instead of have us extend GRO to also include LRO. 

Again - why? What's the benefit of HW doing LRO and trying to
control imitate GRO, if it's still carrying all the LRO baggage
[specifically, disabling it on forwarding] as opposed to simply
doing classic LRO?

> > You can argue that we might need a new feature bit for control
> > over such a feature; If we don't do that, is there any gain in all of this?

> I would argue that yes there are many cases where we will be able to
> show gain.  The fact is there is a strong likelihood of the GRO on
> your parts having some differences either now, or at some point in the
> future as the code evolves.  As I mentioned there was already some
> talk about possibly needing to push the UDP tunnel aggregation out of
> GRO and perhaps handling it sometime after IP look up had verified
> that the destination was in fact a local address in the namespace.  In
> addition it makes the changes to include the tunnel encapsulation much
> more acceptable as LRO is already naturally dropped in the routing and
> bridging cases if I recall correctly.

I think it all boils down to the question of "do we actually want to have
HW-assisted GRO?". If we do [and not necessarily for the UDP-tunnel
scenario] then we need to have it distinct from LRO, otherwise there's
very little gain. If we believe tGRO should remain SW-only, then
I think the discussion is mott; We need to stop trying this, and offload
only LRO - in which case we can aggregate it in whichever 'destructive'
[correct] format we like, without trying to have it resemble GRO.

Re: [PATCH net-next 0/5] qed/qede: Tunnel hardware GRO support

Reply via email to