Em 01-02-2016 15:03, Alexander Duyck escreveu:
> On Mon, Feb 1, 2016 at 8:22 AM, Marcelo Ricardo Leitner
> <marcelo.leit...@gmail.com> wrote:
>> Em 30-01-2016 02:07, Alexander Duyck escreveu:
>>>
>>> On Fri, Jan 29, 2016 at 11:42 AM, Marcelo Ricardo Leitner
>>> <marcelo.leit...@gmail.com> wrote:
>>>>
>>>> On Fri, Jan 29, 2016 at 11:15:54AM -0800, Alexander Duyck wrote:
>>>>>
>>>>> On Wed, Jan 27, 2016 at 9:06 AM, Marcelo Ricardo Leitner
>>
>> ...
>>
>>>>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>>>>> index
>>>>>> 8cba3d852f251c503b193823b71b27aaef3fb3ae..9583284086967c0746de5f553535e25e125714a5
>>>>>> 100644
>>>>>> --- a/net/core/dev.c
>>>>>> +++ b/net/core/dev.c
>>>>>> @@ -2680,7 +2680,11 @@ EXPORT_SYMBOL(skb_mac_gso_segment);
>>>>>> static inline bool skb_needs_check(struct sk_buff *skb, bool tx_path)
>>>>>>    {
>>>>>>           if (tx_path)
>>>>>> -               return skb->ip_summed != CHECKSUM_PARTIAL;
>>>>>> + /* FIXME: Why only packets with checksum offloading are
>>>>>> +                * supported for GSO?
>>>>>> +                */
>>>>>> +               return skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>> +                      skb->ip_summed != CHECKSUM_UNNECESSARY;
>>>>>>           else
>>>>>>                   return skb->ip_summed == CHECKSUM_NONE;
>>>>>>    }
>>>>>
>>>>>
>>>>> Tom Herbert just got rid of the use of CHECKSUM_UNNECESSARY in the
>>>>> transmit path a little while ago.  Please don't reintroduce it.
>>>>
>>>>
>>>> Can you give me some pointers on that? I cannot find such change.
>>>> skb_needs_check() seems to be like that since beginning.
>>>
>>>
>>> Maybe you need to update your kernel.  All this stuff was changed in
>>> December and has been this way for a little while now.
>>>
>>> Commits:
>>> 7a6ae71b24905 "net: Elaborate on checksum offload interface description"
>>> 253aab0597d9e "fcoe: Use CHECKSUM_PARTIAL to indicate CRC offload"
>>> 53692b1de419c "sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC"
>>>
>>> The main reason I even noticed it is because of some of the work I did
>>> on the Intel NIC offloads.
>>
>>
>> Ok I have those here, but my need here is different. I want to do GSO with >> packets that won't have CRC offloaded, so I shouldn't use CHECKSUM_PARTIAL
>> but something else.
>
> CHECKSUM_NONE if you don't want to have any of the CRC or checksums
> offloaded.  However as I mentioned before you will want to fake it
> then since skb_segment assumes it is doing a 1's compliment checksum
> so you will want to pass NET_F_HW_CSUM as a feature flag and then set
> CHECKSUM_NONE after the frame has been segmented.

Ok

>> ...
>>
>>>>>> diff --git a/net/sctp/offload.c b/net/sctp/offload.c
>>>>>> index
>>>>>> 7080a6318da7110c1688dd0c5bb240356dbd0cd3..3b96035fa180a4e7195f7b6e7a8be7b97c8f8b26
>>>>>> 100644
>>>>>> --- a/net/sctp/offload.c
>>>>>> +++ b/net/sctp/offload.c
>>>>>> @@ -36,8 +36,61 @@
>>>>>>    #include <net/sctp/checksum.h>
>>>>>>    #include <net/protocol.h>
>>>>>>
>>>>>> +static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
>>>>>> +{
>>>>>> +       skb->ip_summed = CHECKSUM_NONE;
>>>>>> +       return sctp_compute_cksum(skb, skb_transport_offset(skb));
>>>>>> +}
>>>>>> +
>>>>>
>>>>>
>>>>> I really despise the naming of this bit here.  SCTP does not use a
>>>>> checksum.  It uses a CRC.  Please don't call this a checksum as it
>>>>> will just make the code really confusing. I think the name should be
>>>>> something like gso_make_crc32c.
>>>>
>>>>
>>>> Agreed. SCTP code still references it as 'cksum'. I'll change that in
>>>> another patch.
>>>>
>>>>> I think we need to address the CRC issues before we can really get
>>>>> into segmentation.  Specifically we need to be able to offload SCTP
>>>>> and FCoE in software since they both use the CHECKSUM_PARTIAL value
>>>>> and then we can start cleaning up more of this mess and move onto
>>>>> segmentation.
>>>>
>>>>
>>>> Hm? The mess on CRC issues here is caused by this patch alone. It's good >>>> as it is today. And a good part of this mess is caused by trying to GSO
>>>> without offloading CRC too.
>>>>
>>>> Or you mean that SCTP and FCoE should stop using CHECKSUM_* at all?
>>>
>>>
>>> Well after Tom's change both SCTP and FCoE use CHECKSUM_PARTIAL.
>>> CHECKSUM_PARTIAL is what is used to indicate to the hardware that a
>>> checksum offload has been requested so that is what is looked for at
>>> the driver level.
>>
>>
>> SCTP was actually already using CHECKSUM_PARTIAL. That patch was just a
>> rename in an attempt to make this crc difference more evident. Yet I'll
>> continue the rename within sctp code.
>
> Yeah it was FCoE that was doing something different.
>
>>> My concern with all this is that we should probably be looking at
>>> coming up with a means of offloading this in software when
>>> skb_checksum_help is called.  Right now validate_xmit_skb doesn't have
>>> any understanding of what to do with SCTP or FCoE and will try to just
>>> compute a checksum for them.
>>
>>
>> My worry is placed a bit earlier than that, I think. Currently I just cannot
>> do GSO with packets that doesn't have checksum/crc offloaded too because
>> validate_xmit_skb() will complain.
>
> That is probably because you are passing CHECKSUM_PARTIAL instead of
> CHECKSUM_NONE.

Other way around, but it's cool. We are pretty much on the same page now, I think.

>> As NICs hardly have sctp crc offloading capabilities, I'm thinking it makes
>> sense to do GSO even without crc offloaded. After all, it doesn't matter
>> much in which stage we are computing the crc as we are computing it anyway.
>
> Agreed.  You will need to support CHECKSUM_PARTIAL being passed to a
> device that doesn't support SCTP first.  That way you can start
> looking at just always setting CHECKSUM_PARTIAL in the transport layer
> which is really needed if you want to do SCO (SCTP Segmentation
> Offload) in the first place.  Once you have that you could then start
> looking at doing the SCO since from that point on you should already
> be in good shape to address those type of issues.  You should probably
> use the csum_offset value in the skb in order to flag if this is
> possibly SCTP.  As far as I know for now there shouldn't be any other
> protocols that are using the same offset, and if needed you can
> actually parse the headers to verify if the frame is actually SCTP.

Cool, yes.
Just cannot set CHECKSUM_PARTIAL always because if frame is not a GSO, we will not have another chance to fill in SCTP CRC if it's not offloaded. A check still have to consider for that, but np.

>>>>>> +static struct sk_buff *sctp_gso_segment(struct sk_buff *skb,
>>>>>> +                                       netdev_features_t features)
>>>>>> +{
>>>>>> +       struct sk_buff *segs = ERR_PTR(-EINVAL);
>>>>>> +       struct sctphdr *sh;
>>>>>> +
>>>>>> +       sh = sctp_hdr(skb);
>>>>>> +       if (!pskb_may_pull(skb, sizeof(*sh)))
>>>>>> +               goto out;
>>>>>> +
>>>>>> +       __skb_pull(skb, sizeof(*sh));
>>>>>> +
>>>>>> +       if (skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) {
>>>>>> + /* Packet is from an untrusted source, reset gso_segs.
>>>>>> */
>>>>>> +               int type = skb_shinfo(skb)->gso_type;
>>>>>> +
>>>>>> +               if (unlikely(type &
>>>>>> +                            ~(SKB_GSO_SCTP | SKB_GSO_DODGY |
>>>>>> +                              0) ||
>>>>>> +                            !(type & (SKB_GSO_SCTP))))
>>>>>> +                       goto out;
>>>>>> +
>>>>>> +               /* This should not happen as no NIC has SCTP GSO
>>>>>> +                * offloading, it's always via software and thus we
>>>>>> +                * won't send a large packet down the stack.
>>>>>> +                */
>>>>>> + WARN_ONCE(1, "SCTP segmentation offloading to NICs is
>>>>>> not supported.");
>>>>>> +               goto out;
>>>>>> +       }
>>>>>> +
>>>>>
>>>>>
>>>>> So what you are going to end up needing here is some way to tell the
>>>>> hardware that you are doing the checksum no matter what.  There is no
>>>>> value in you computing a 1's compliment checksum for the payload if
>>>>> you aren't going to use it.  What you can probably do is just clear
>>>>> the standard checksum flags and then OR in NETIF_F_HW_CSUM if
>>>>> NETIF_F_SCTP_CRC is set and that should get skb_segment to skip
>>>>> offloading the checksum.
>>>>
>>>>
>>>> Interesting, ok
>>>>
>>>>> One other bit that will make this more complicated is if we ever get
>>>>> around to supporting SCTP in tunnels.  Then we will need to sort out
>>>>> how things like remote checksum offload should impact SCTP, and how to
>>>>> deal with needing to compute both a CRC and 1's compliment checksum.
>>>>> What we would probably need to do is check for encap_hdr_csum and if
>>>>> it is set and we are doing SCTP then we would need to clear the
>>>>> NETIF_F_HW_CSUM, NETIF_F_IP_CSUM, and NETIF_F_IPV6_CSUM flags.
>>>>
>>>>
>>>> Yup. And that includes on storing pointers to where to store each of it.
>>>
>>>
>>> Actually the pointers bit is easy.  The csum_start and csum_offset
>>> values should be set up after you have segmented the skb and should be
>>> updated after the skb has been segmented.  If nothing else you can
>>> probably take a look at the TCP code tcp_gso_segment and
>>> tcp4_gso_segment for inspiration.  Basically you need to make sure
>>> that you set the ip_summed, csum_start, and csum_offset values for
>>> your first frame before you start segmenting it into multiple frames.
>>
>>
>> Ah yes, ok, that's for now, when not doing crc offloading with some chksum
>> offloading (tunnel) too.
>
> Actually that would be regardless of tunnel offloading.  We don't
> store the outer checksum offsets.  If we need outer checksum we
> restore them after the fact since the inner checksum offsets are
> needed as part of the inner header TCP checksum computation.

Hm okay

>>>>>> +       segs = skb_segment(skb, features);
>>>>>> +       if (IS_ERR(segs))
>>>>>> +               goto out;
>>>>>> +
>>>>>> +       /* All that is left is update SCTP CRC if necessary */
>>>>>> +       for (skb = segs; skb; skb = skb->next) {
>>>>>> +               if (skb->ip_summed != CHECKSUM_PARTIAL) {
>>>>>> +                       sh = sctp_hdr(skb);
>>>>>> +                       sh->checksum = sctp_gso_make_checksum(skb);
>>>>>> +               }
>>>>>> +       }
>>>>>> +
>>>>>
>>>>>
>>>>> Okay, so it looks like you are doing the right thing here and leaving
>>>>> this as CHECKSUM_PARTIAL.
>>>>
>>>>
>>>> Actually no then. sctp_gso_make_checksum() replaces it:
>>>> +static __le32 sctp_gso_make_checksum(struct sk_buff *skb)
>>>> +{
>>>> +       skb->ip_summed = CHECKSUM_NONE;
>>>> +       return sctp_compute_cksum(skb, skb_transport_offset(skb));
>>>>
>>>> Why again would have to leave it as CHECKSUM_PARTIAL? IP header?
>>>
>>>
>>> My earlier comment is actually incorrect.  This section is pretty much
>>> broken since CHECKSUM_PARTIAL only reflects a 1's compliment checksum
>>> in the case of skb_segment so whatever the value it is worthless.
>>> CHECKSUM_PARTIAL is used to indicate if a given frame needs to be
>>> offloaded.  It is meant to let the device know that it still needs to
>>> compute a checksum or CRC beginning at csum_start and then storing the
>>> new value at csum_offset.  However for skb_segment it is actually
>>> referring to a 1's compliment checksum and if it returns CHECKSUM_NONE
>>> it means it is stored in skb->csum which would really wreck things for
>>> you since that was your skb->csum_start and skb->csum_offset values.
>>> I have a patch to change this so that we update a checksum in the
>>> SKB_GSO_CB, but I wasn't planning on submitting that until net-next
>>> opens.
>>
>>
>> sctp currently ignores skb->csum. It doesn't mess with the crc but computing
>> it is at least not optimal, yes.
>
> Actually sctp sets csum_start and csum_offset if it sets
> CHECKSUM_PARTIAL.  So it does mess with skb->csum since it is
> contained in a union with those two fields.

Well, yes, but point was that messed value is not used for anything useful later on..
I'll implement the NETIF_F_HW_CSUM trick.

>>> In the case of SCTP you probably don't even need to bother checking
>>> the value since it is meaningless as skb_segment doesn't know how to
>>> do an SCTP checksum anyway.  To that end for now what you could do is
>>> just set NETIF_F_HW_CSUM.  This way skb_segment won't go and try to
>>> compute a 1's compliment checksum on the payload since there is no
>>> actual need for it.
>>
>>
>> Nice, ok.
>>
>>> One other bit you will need to do is to check the value of SCTP_CRC
>>> outside of skb_segment.  You might look at how
>>> __skb_udp_tunnel_segment does this to populate its own offload_csum
>>> boolean value, though you would want to use features, not
>>> skb->dev->features as that is a bit of a workaround since features is
>>> stripped by hw_enc_features in some paths if I recall correctly.
>>
>>>
>>>
>>> Once the frames are segmented and if you don't support the offload you
>>> could then call gso_make_crc32c() or whatever you want to name it to
>>> perform the CRC calculation and populate the field.  One question by
>>
>>
>> Hmmm.. does it mean that we can use CHECKSUM_PARTIAL then even if CRC
>> offloading is not possible then? Because the packet will not be offloaded in >> the end, yes, but this solves my questions above. Then while doing GSO, it
>> re-evaluates if it can offload crc or not?
>
> If you compute the CRC you set CHECKSUM_NONE, if you want the device
> to do it on transmit you should set CHECKSUM_PARTIAL.

Okay

>>> the way.  Don't you need to initialize the checksum value to 0 before
>>> you compute it?  I think you might have missed that step when you were
>>> setting this up.
>>
>>
>> It's fine :) sctp_compute_cksum will replace it with zeroes, calculate, and
>> put back the old value, which then we overwrite with the new one at
>> sctp_gso_segment.
>
> Right but there are scenarios where this will be offloaded isn't
> there?  You would probably be better off setting the CRC to 0 before
> you start segmentation and then that way you can either just set
> csum_offset, csum_start and ip_summed if the lower device supports
> SCTP CRC offload, otherwise you can just compute it without the need
> to write the 0 into the header.

Ahh, it's also zeroed when the header is constructed. There is 'sh->checksum = 0;' in sctp_packet_transmit for this.

I'll look into moving this decision on CRC offloading or not into the segmentation moment. I think it will have to be done twice, actually, for sctp-reasons. Like, if packet will be fragmented by IP, it currently doesn't allow offloading CRC computing. I'll check, then post a v2. I think at that least the crc offloading is now clarified. Thanks Alex.

Marcelo

Reply via email to