Bill Fischofer(Bill-Fischofer-Linaro) replied on github web page: include/odp/api/spec/packet_io.h line 21 @@ -256,11 +256,27 @@ typedef struct odp_pktio_param_t { * belong to time synchronization protocol (PTP). * * Packet input checksum checking may be enabled or disabled. When it is - * enabled, implementation will verify checksum correctness on incoming packets - * and depending on drop configuration either deliver erroneous packets with - * appropriate flags set (e.g. odp_packet_has_l3_error()) or drop those. - * When packet dropping is enabled, application will never receive a packet - * with the specified error and may avoid to check the error flag. + * enabled, implementation will attempt to verify checksum correctness on + * incoming packets and depending on drop configuration either deliver erroneous + * packets with appropriate flags set (e.g. odp_packet_has_l3_error(), + * odp_packet_l3_chksum_status()) or drop those. When packet dropping is + * enabled, application will never receive a packet with the specified error + * and may avoid to check the error flag. + * + * If checksum checking is enabled, IPv4 header checksum checking is always + * done for packets that do not have IP options and L4 checksum checking + * is done for unfragmented packets that do not have IPv4 options or IPv6 + * extension headers. In other cases checksum checking may or may not + * be done. For example, L4 checksum of fragmented packets is typically + * not checked.
Comment: @JannePeltonen I thought that's why we were providing the per-packet override capability here--so that applications can say "While I normally want checksums added, for this packet that's not needed". Or "Please don't add checksums by default, however for this packet please do so". The point is, if the application determines that checksum processing is needed for a particular packet (or interface) then that work needs to be done. The ODP implementation is in a better position than the application to determine how best to do that work on this platform since it has direct access to the platform-specific HW, specialized instructions, etc., and if it needs to fall back to SW can do so using instruction sequences optimized to its microarchitecture. On the RX side, again we either don't care about checksum validation or else we do. If we do then that work has to be done. So I don't see what advantage is gained by requiring every application to have a "backup plan" in this area. Also, since ODP requires the application to have correctly set the L3 and L4 offset values, that's all the information required to perform these calculations as specified by the relevant RFCs independent of the rest of the packet structure. So again, it's cleaner to have ODP do that once rather than requiring every application to duplicate that effort. > Bill Fischofer(Bill-Fischofer-Linaro) wrote: > Why you'd prefer two calls instead of one? With ABI compat, the number of > function calls per packet matters. Also, "has checksum been checked" and > "checksum check result" are likely reading the same bits in packet header > anyway. So, it makes sense to do both checks in one go. >> Bill-Fischofer-Linaro wrote >> Instead of this I would prefer to have in >> ./include/odp/api/spec/packet_flags.h / >> ./platform/linux-generic/odp_packet_flags.c something like: >> >> int odp_packet_has_l3_csum(odp_packet_t pkt); >> int odp_packet_has_l4_csum(odp_packet_t pkt); >> >> * @retval non-zero operation was performed >> * @retval 0 operation was not performed >> >> For completeness, we may add explicit csum operation result call but likely >> odp_packet_has_error() or odp_packet_has_l3_error() or >> odp_packet_has_l4_error() will be preferred. >> >> int odp_packet_has_l3_csum_error(odp_packet_t pkt); >> int odp_packet_has_l4_csum_error(odp_packet_t pkt); >>> Bill-Fischofer-Linaro wrote >>> It would not be good to require ODP to always check the L4 checksum if >>> checksum checking is enabled in the config, because then some >>> implementations might have to fall back to SW checking for some packets and >>> doing the check in SW would be a waste of resources in case the application >>> is not interested in the L4 checksum of all packets (e.g. an application >>> that both terminates and forwards IP packets needs to check the L4 checksum >>> of the locally received packets but should ignore the L4 checksum of >>> forwarded packets. >>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>> This talks **application** passing fragments to ODP for L4 checksum >>>> insertion. When ODP IPsec inline creates fragments, application does not >>>> have control on those (no need for API spec) and obviously implementation >>>> must not trick itself (it's an implementation internal bug to create a >>>> fragment and ask output HW to insert L4 checksum for those). >>>> >>>> Application passes full packet to IPSEC with fragmentation and L4 checksum >>>> offload requests - implementation inserts L4 checksum, does fragmentation >>>> if needed, passes new packets back to user or directly to output (in case >>>> of inline). >>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>> The intention is best effort on top of the minimum requirement that all >>>>> basic IP packets must be checked. The wording says that quite clearly >>>>> already (what needs to change ?). Again it's a fact that not all HW will >>>>> check non-basic IP packets, or if they do (e.g. with firmware / SW) >>>>> packet rate may be much lower for _all_ packets. >>>>> >>>>> Typically, these non-basic packets are a small fraction of all packets >>>>> and belong to the slow path anyway. >>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>> That's RFC for IP stack not for NIC HW. ODP just pass information >>>>>> between app and HW. IP stack is an app. So, the app (IP stack) must do >>>>>> checksum check if HW didn't do it. We leave option for the HW to not do >>>>>> it, since not all HW do it for packets with options/extensions. >>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>> About the values, we can change those later if there's evidence from >>>>>>> various implementations that OK == 0 would be better than UNKNOWN == 0. >>>>>>> If there's no measurable performance difference, UNKNOWN == 0 is more >>>>>>> robust as init value. >>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>> These three values are needed. For example, we cannot dictate that all >>>>>>>> HW must be able to calculate checksums when there are extension >>>>>>>> headers. Still some HW can do that. So, even if application enabled >>>>>>>> checksumming, it depends on the packet and on the implementation if >>>>>>>> check was done or not. We just dictate that basic packets (no options >>>>>>>> / extensions) must always be checked when checksumming is enabled. >>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>> OK >>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>> Then perhaps the 0 enum should be `ODP_PACKET_CHECKSUM_NORMAL` where >>>>>>>>>> "normal" is interpreted as OK if checksum offload was requested and >>>>>>>>>> "unknown" if it was not requested. The enum is then simply either >>>>>>>>>> NORMAL or BAD. The point is the only thing the application really >>>>>>>>>> wants to know is whether the checksum is bad or not. If it's >>>>>>>>>> disabled checksumming for whatever reason then presumably it doesn't >>>>>>>>>> care. >>>>>>>>>> >>>>>>>>>> If we reduce the enum to only two values, then it can simply >>>>>>>>>> collapse into a "bad" bit that's set if we know the checksum is >>>>>>>>>> incorrect and left as 0 otherwise. The 0 means it's OK if we are >>>>>>>>>> checking checksums and unknown if we don't care about them. >>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>> Agreed, however whether IPsec fragments the output for a given >>>>>>>>>>> packet may not be known in advance by the application. As worded, >>>>>>>>>>> this would seem to imply that I cannot request L4 checksum >>>>>>>>>>> processing for inline IPsec output on the off chance it may need to >>>>>>>>>>> be fragmented. I doubt if that is your intent here. >>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>> Again, TCP checksums are not optional and their presence is not >>>>>>>>>>>> conditional upon such things. If the intent here is to request >>>>>>>>>>>> best-effort HW offload (as opposed to requiring the ODP >>>>>>>>>>>> implementation provide this service however it chooses) then this >>>>>>>>>>>> wording needs to change to reflect that distinction. I'd take the >>>>>>>>>>>> position that applications should never have to worry about >>>>>>>>>>>> checksums unless they are being overridden for specific diagnostic >>>>>>>>>>>> or other purposes. In normal operation applications should be able >>>>>>>>>>>> to rely on the ODP implementation taking care of them (preferably >>>>>>>>>>>> in HW, but otherwise in SW if needed). >>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>> Not according to [RFC 791](https://tools.ietf.org/html/rfc791): >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> The Header Checksum provides a verification that the >>>>>>>>>>>>> information used >>>>>>>>>>>>> in processing internet datagram has been transmitted correctly. >>>>>>>>>>>>> The >>>>>>>>>>>>> data may contain errors. If the header checksum fails, the >>>>>>>>>>>>> internet >>>>>>>>>>>>> datagram is discarded at once by the entity which detects the >>>>>>>>>>>>> error. >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> This requirement is independent of whether or not IP options >>>>>>>>>>>>> exist. >>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>> Not easily for IPv6, the fragmentation extension header may be >>>>>>>>>>>>>> behind a list of other extension headers. Also it does not make >>>>>>>>>>>>>> sense to ask L4 checksum for a packet that it cannot be done. >>>>>>>>>>>>>> Usually even HW assisted checksum insertion is ordered by SW on >>>>>>>>>>>>>> per packet basis. So, if application does not order correctly, >>>>>>>>>>>>>> ODP would need to parse packet and find out if it can be ordered >>>>>>>>>>>>>> or not. >>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>> L4 check includes fields from IP header. If the IP header is >>>>>>>>>>>>>>> not in basic form (has options/extensions) HW may not be able >>>>>>>>>>>>>>> to do the L4 checksum check. >>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>> Basic IP packets checksum checking is a MUST, support for >>>>>>>>>>>>>>>> options is a MAY. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> We don't want to force every implementation to check also >>>>>>>>>>>>>>>> packets with IP options, since not all HW can do it. User >>>>>>>>>>>>>>>> finds out if the check was done from the packet. >>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>> Not in this patch, since statistics are defined to work as >>>>>>>>>>>>>>>>> RFCs specify. We don't want to redefine how packet_io_stats_t >>>>>>>>>>>>>>>>> works in here. >>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>>> OK and UNKNOWN are the most common cases: OK when checking >>>>>>>>>>>>>>>>>> is enabled, UNKNOWN when checking disabled. E.g. a router >>>>>>>>>>>>>>>>>> would not need check L4 checksum, so IPv4 check would be on, >>>>>>>>>>>>>>>>>> but all L4 checksum checks off. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Initialization is more robust with UNKNOWN == 0. Odp-linux >>>>>>>>>>>>>>>>>> would save these along the other packet flags (init would be >>>>>>>>>>>>>>>>>> flags.all_bits = 0). On the other hand, HW specific >>>>>>>>>>>>>>>>>> implementation may not store this enum anywhere, but >>>>>>>>>>>>>>>>>> directly mask HW specific packet descriptor flags to produce >>>>>>>>>>>>>>>>>> the status. In that case, the values do not matter too much. >>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>>>> Can't the interface detect the IPfrag itself? How does this >>>>>>>>>>>>>>>>>>> effect IPsec inline output fragmentation support? >>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>>>>> L4 checksums being skipped for IP fragments is normal, but >>>>>>>>>>>>>>>>>>>> why the caveat about IP options or IPv6 extension headers? >>>>>>>>>>>>>>>>>>>> A TCP or UCP checksum is well defined independent of these >>>>>>>>>>>>>>>>>>>> L3 considerations. >>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>>>>>> What is the relevance of IP options to checksum >>>>>>>>>>>>>>>>>>>>> processing? These two are orthogonal. >>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>>>>>>> Worth mentioning anything about statistics here? >>>>>>>>>>>>>>>>>>>>>> Presumably such drops are accumulated there. >>>>>>>>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote: >>>>>>>>>>>>>>>>>>>>>>> Since ODP_PACKET_CHKSUM_OK is the most likely case, >>>>>>>>>>>>>>>>>>>>>>> wouldn't it be better to use 0 for that enum? In almost >>>>>>>>>>>>>>>>>>>>>>> all cases HW is going to "do the right thing" so SW >>>>>>>>>>>>>>>>>>>>>>> should only need to initialize this to something else >>>>>>>>>>>>>>>>>>>>>>> if that's not the case. https://github.com/Linaro/odp/pull/167#discussion_r137800145 updated_at 2017-09-08 14:43:42