Hi, On 01/20/2015 02:12 AM, Ananyev, Konstantin wrote: >>>> I think a good definition would >>>> be: >>>> >>>> Packet is IPv4. This flag must be set when using any offload >>>> feature (TSO, L3 or L4 checksum) to tell the NIC that the packet >>>> is an IPv4 packet. >>>> >>>> That's why I added PKT_TX_IPV4 in the examples. >>> >>> I suppose we discussed it several times: both ways are possible. >>> From PMD perspective - treating PKT_TX_IPV4 and PKT_TX_IP_CSUM >>> As mutually exclusive seems a bit more plausible. >>> From the upper layer - my understanding, that it is doesn't really matter. >>> I thought we had an agreement about it in 1.8, no? >> >> Indeed, this was already discussed, but there was a lot of pressure >> for 1.8.0 to push something, even not perfect. The fog around comments >> shows that the API was not very clearly defined for 1.8.0. If you read >> the comments of the API, it is impossible to understand when the >> PKT_TX_IPV4 or PKT_TX_IP_CSUM flags must be set. I would even say >> more: the only place where the comments bring a valuable information >> (L4 checksum and TSO) describe the case where PKT_TX_IPV4 and >> PKT_TX_IP_CSUM are not exclusive... >> >> So I will fix that in my coming patch series. Just for information, >> I'm pretty sure that having PKT_TX_IPV4 and PKT_TX_IP_CSUM as not >> exclusive flag would not require any change anywhere in the PMDs (even >> in i40e). > > Right now - no. > Though as I said from PMD perspective having them exclusive is a bit > preferable. > Again, I don't see any big difference from upper layer code.
Sure, it does not make a big difference in terms of code. But in terms of API, the naming of the flag is coherent to what it is used for. And it's easier to find a simple definition, like: * Packet is IPv4. This flag must be set when using any offload feature * (TSO, L3 or L4 checksum) to tell the NIC that the packet is an IPv4 * packet. >> On the contrary, making them exclusive would require to >> change the ixgbe TSO code because we check. > > Hmm, so you are saying there is a bug somewhere in ixbe_rxtx.c? > What particular place you are talking about? Sorry, I spoke too fast. In TSO code, we check PKT_TX_IP_CKSUM (and not PKT_TX_IPV4 as I thought), so it would work for both methods without patching the code. In this case, it means that both approach would not require to modify the code. >>>> *Problem 3*: without using the word "fortville", it is difficult >>>> to understand the goal of the flag PKT_TX_UDP_TUNNEL_PKT. Indeed, >>>> once PKT_TX_OUTER_IPV4/6 is set, it looks obvious that it's a >>>> tunnel packet. I suggest to remove the PKT_TX_UDP_TUNNEL_PKT >>>> flag. In linux, the driver doesn't care about the tunnel type, >>>> it always set I40E_TXD_CTX_UDP_TUNNELING for all encapsulations [6]. >>> >>> It might be obvious that it is a tunnel packet from PKT_TX_OUTER_* is set, >>> but it is not obvious what type of tunnelling it would be. >>> FVL HW supports HW TX offloads for different type of tunnelling and >>> requires that SW provide information about tunnelling type. >>> From i40e datasheet: >>> L4TUNT L4 Tunneling Type (Teredo / GRE header / VXLAN header) indication: >>> 00b - No UDP / GRE tunneling (field must be set to zero if EIPT equals to >>> zero) >>> 01b - UDP tunneling header (any UDP tunneling, VXLAN and Geneve). >>> 10b - GRE tunneling header >>> As we do plan to support other than UDP tunnelling types, I suppose we'll >>> need to keep >>> PKT_TX_UDP_TUNNEL_PKT flag. >> >> As I've said: in linux, the driver doesn't care about the tunnel type, >> it always set I40E_TXD_CTX_UDP_TUNNELING for all encapsulations. > > Ok, and why it should be our problem? > We have a lot of things done in a different manner then linux/freebsd kernel > drivers, > Why now it became a problem? If linux doesn't need an equivalent flag for doing the same thing, it probably means we don't need it either. In a performance-oriented software like dpdk, having a flag that we don't know what the hardware does with, that is not needed in other drivers of the same harware, that makes the API harder to understand could be a problem. Another argument: if we can remove this flag, it would make the testpmd commands reworkd proposed by Jijiang much more easy to understand: only a new "csum parse-tunnel on|off" would be required, and it can be explained in a few words. I'll try to do some tests on a fortville NIC if I can find one. I'm curious to see if we can transmit any encapsulation packet (ip in ip, ip in gre, eth in gre, eth in vxlan, or even a proprietary tunnel). We should avoid the need to specify the tunnel type in the OUTER checksum API if we can, else it would limit us to specific supported protocols. >>>> I think the following cases should be *forbidden by the API*: >>>> >>>> case 9) calculate checksum of in_ip and in_tcp (was case B.1 in [1]) >>>> >>>> mb->outer_l2_len = len(out_eth) >>>> mb->outer_l3_len = len(out_ip) >>>> mb->l2_len = len(out_udp + vxlan + in_eth) >>>> mb->l3_len = len(out_ip) >>>> mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_UDP_TUNNEL_PKT | \ >>>> PKT_TX_IP_CSUM | PKT_TX_UDP_CKSUM; >>>> set out_ip checksum to 0 in the packet >>>> set out_udp checksum to pseudo header using rte_ipv4_phdr_cksum() >>>> >>>> If we remove the flag PKT_TX_UDP_TUNNEL_PKT, this cannot be >>>> supported, but there is no reason to support it as there is >>>> already one way to do the same. >>>> >>>> I think the driver should not even look at mb->outer_l2_len >>>> and mb->outer_l3_len if no flag PKT_TX_OUTER_* is set. >>> >>> Why it should be forbidden? >>> I admit it might be a bit slower than case 4), >>> but I think absolutely legal way to setup HW offloads for inner L3/L4. >>> As I said we need a PKT_TX_UDP_TUNNEL_PKT anyway, so I suppose >>> PKT_TX_*_TUNNEL_PKT should be an indication is it a tunnel packet or not. >>> PKT_TX_OUTER_* flags indicate does outer cksum offload is required or not. >> >> I don't understand. The result in terms of hardware is exactly the >> same than case 4). Why should we have 2 different ways for doing the >> same thing? > > If HW supports that capability, why should we forbid it? > Let user to choose himself what way to use. > FVL spec lists it as a valid approach. It is not a hardware feature. Case 4) and case 9) would fill the hardware registers exactly the same. To me, it's just an API question. > As one of possible use-cases: HW VLAN tags insertion for both inner and > outer packets. > FVL can do that, though as I know our PMD doesn't implement it yet. > For that, we'll need to specify at least: > outer_l2_len, outer_l3_len, l2_len. > While PKT_TX_OUTER_* might stay cleared. If a VLAN flag has to be inserted in outer header, a new flag PKT_TX_OUTER_INSERT_VLAN would be added. So my specification would still be correct: The driver should look at mb->outer_lX_len only if a PKT_TX_OUTER_* flag is present. >> This is really confusing for an API. Moreover, you said >> it: it is slower that case 4). > > I don't know would be slower then 4) or not for sure. > That's my guess, based on the fact that for 9) we need to fill 2 descriptors, > while fro 4) - only 1. > Though I didn't measure the difference. > That's actually one more reason why to allow and support it - > so people can make sure that on FVL both ways work as expected and measure > the difference. > > Konstantin Regards, Olivier