On Mon, Nov 28, 2016 at 12:03:06PM +0100, Thomas Monjalon wrote: > We need attention of every PMD developers on this thread.
I've been following this thread from the beginning while working on rte_flow and wanted to see where it was headed before replying. (I know, v11 was submitted about 1 month ago but still.) > Reminder of what Konstantin suggested: > " > - if the PMD supports TX offloads AND > - if to be able use any of these offloads the upper layer SW would have to: > * modify the contents of the packet OR > * obey HW specific restrictions > then it is a PMD developer responsibility to provide tx_prep() that would > implement > expected modifications of the packet contents and restriction checks. > Otherwise, tx_prep() implementation is not required and can be safely set to > NULL. > " > > I copy/paste also my previous conclusion: > > Before txprep, there is only one API: the application must prepare the > packets checksum itself (get_psd_sum in testpmd). > With txprep, the application have 2 choices: keep doing the job itself > or call txprep which calls a PMD-specific function. Something is definitely needed here, and only PMDs can provide it. I think applications should not have to clear checksum fields or initialize them to some magic value, same goes for any other offload or hardware limitation that needs to be worked around. tx_prep() is one possible answer to this issue, however as mentioned in the original patch it can be very expensive if exposed by the PMD. Another issue I'm more concerned about is the way limitations are managed (struct rte_eth_desc_lim). While not officially tied to tx_prep(), this structure contains new fields that are only relevant to a few devices, and I fear it will keep growing with each new hardware quirk to manage, breaking ABIs in the process. What are applications supposed to do, check each of them regardless before attempting to send a burst? I understand tx_prep() automates this process, however I'm wondering why isn't the TX burst function doing that itself. Using nb_mtu_seg_max as an example, tx_prep() has an extra check in case of TSO that the TX burst function does not perform. This ends up being much more expensive to applications due to the additional loop doing redundant testing on each mbuf. If, say as a performance improvement, we decided to leave the validation part to the TX burst function; what remains in tx_prep() is basically heavy "preparation" requiring mbuf changes (i.e. erasing checksums, for now). Following the same logic, why can't such a thing be made part of the TX burst function as well (through a direct call to rte_phdr_cksum_fix() whenever necessary). From an application standpoint, what are the advantages of having to: if (tx_prep()) // iterate and update mbufs as needed tx_burst(); // iterate and send Compared to: tx_burst(); // iterate, update as needed and send Note that PMDs could still provide different TX callbacks depending on the set of enabled offloads so performance is not unnecessarily impacted. In my opinion the second approach is both faster to applications and more friendly from a usability perspective, am I missing something obvious? > The question is: does non-Intel drivers need a checksum preparation for TSO? > Will it behave well if txprep does nothing in these drivers? > > When looking at the code, most of drivers handle the TSO flags. > But it is hard to know whether they rely on the pseudo checksum or not. > > git grep -l 'PKT_TX_UDP_CKSUM\|PKT_TX_TCP_CKSUM\|PKT_TX_TCP_SEG' drivers/net/ > > drivers/net/bnxt/bnxt_txr.c > drivers/net/cxgbe/sge.c > drivers/net/e1000/em_rxtx.c > drivers/net/e1000/igb_rxtx.c > drivers/net/ena/ena_ethdev.c > drivers/net/enic/enic_rxtx.c > drivers/net/fm10k/fm10k_rxtx.c > drivers/net/i40e/i40e_rxtx.c > drivers/net/ixgbe/ixgbe_rxtx.c > drivers/net/mlx4/mlx4.c > drivers/net/mlx5/mlx5_rxtx.c > drivers/net/nfp/nfp_net.c > drivers/net/qede/qede_rxtx.c > drivers/net/thunderx/nicvf_rxtx.c > drivers/net/virtio/virtio_rxtx.c > drivers/net/vmxnet3/vmxnet3_rxtx.c > > Please, we need a comment for each driver saying > "it is OK, we do not need any checksum preparation for TSO" > or > "yes we have to implement tx_prepare or TSO will not work in this mode" For both mlx4 and mlx5 then, "it is OK, we do not need any checksum preparation for TSO". Actually I do not think we'll ever need tx_prep() unless we add our own quirks to struct rte_eth_desc_lim (and friends) which are currently quietly handled by TX burst functions. -- Adrien Mazarguil 6WIND