> > From: David Marchand [mailto:david.march...@redhat.com]
> > Sent: Friday, 5 April 2024 16.46
> >
> > Mandate use of rte_eth_tx_prepare() in the mbuf Tx checksum offload
> > examples.
> 
> I strongly disagree with this change!
> 
> It will cause a huge performance degradation for shaping applications:
> 
> A packet will be processed and finalized at an output or forwarding pipeline 
> stage, where some other fields might also be written, so
> zeroing e.g. the out_ip checksum at this stage has low cost (no new cache 
> misses).
> 
> Then, the packet might be queued for QoS or similar.
> 
> If rte_eth_tx_prepare() must be called at the egress pipeline stage, it has 
> to write to the packet and cause a cache miss per packet,
> instead of simply passing on the packet to the NIC hardware.
> 
> It must be possible to finalize the packet at the output/forwarding pipeline 
> stage!

If you can finalize your packet at the output/forwarding stage, then why can't you 
invoke tx_prepare() at the same stage?
There seems to be some misunderstanding about what tx_prepare() does - 
in fact it doesn't communicate with the HW queue (it doesn't update the TXD ring, 
etc.); it only makes changes in the mbuf itself.
Yes, it reads some fields in the SW TX queue struct (max number of TXDs per packet, 
etc.), but AFAIK it is safe
to call tx_prepare() and tx_burst() from different threads.
At least on the implementations I am aware of.
I just checked the docs - it seems this is not stated explicitly anywhere, which 
might be why it causes such misunderstanding.
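To illustrate the split I mean - this is only a sketch, where port_id, queue_id, 
the qos_ring and BURST are hypothetical names, not anything mandated by the API:

```c
/* Output/forwarding stage: the packet is finalized here, so run
 * tx_prepare() now, while the mbuf is still hot in cache.
 * tx_prepare() only modifies the mbuf (e.g. checksum field setup);
 * it does not touch the TXD ring. */
uint16_t nb_prep = rte_eth_tx_prepare(port_id, queue_id, pkts, nb_pkts);
/* handle/drop pkts[nb_prep..nb_pkts-1] that failed preparation */

/* Queue the already-prepared packets for QoS shaping. */
rte_ring_enqueue_burst(qos_ring, (void **)pkts, nb_prep, NULL);

/* Egress stage (possibly a different thread): no further writes to
 * the packet are needed - just hand it to the NIC. */
uint16_t nb_deq = rte_ring_dequeue_burst(qos_ring, (void **)pkts,
                                         BURST, NULL);
uint16_t nb_tx = rte_eth_tx_burst(port_id, queue_id, pkts, nb_deq);
```

That way the egress stage doesn't write to the packet at all, so no extra cache 
miss per packet there.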
 
> 
> Also, how is rte_eth_tx_prepare() supposed to work for cloned packets 
> egressing on different NIC hardware?

If you create a clone of the full packet (including L2/L3 headers), then obviously 
such a construction might not
work properly with tx_prepare() over two different NICs.
Though in the majority of cases you clone the data segments, while at least the L2 
headers are put into a separate segment.
One simple approach would be to keep the L3 header in that separate segment too.
But yes, there is a problem when you need to send exactly the same packet 
over different NICs.
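The per-clone header segment approach could look something like the sketch below - 
just an assumption of how one might structure it; clone_with_private_hdr, hdr_pool, 
clone_pool and hdr_len are hypothetical names:

```c
/* Build a per-port copy of a packet: a private L2/L3 header segment
 * that tx_prepare() may modify independently, chained to an indirect
 * clone of the shared payload. */
static struct rte_mbuf *
clone_with_private_hdr(struct rte_mbuf *pkt, struct rte_mempool *hdr_pool,
                       struct rte_mempool *clone_pool, uint16_t hdr_len)
{
        struct rte_mbuf *hdr = rte_pktmbuf_alloc(hdr_pool);
        struct rte_mbuf *clone = rte_pktmbuf_clone(pkt, clone_pool);

        if (hdr == NULL || clone == NULL)
                goto fail;

        /* Copy the L2/L3 headers into the private segment. */
        rte_memcpy(rte_pktmbuf_append(hdr, hdr_len),
                   rte_pktmbuf_mtod(pkt, void *), hdr_len);

        /* Strip the headers from the indirect clone (this only moves
         * the clone's own data_off, not the shared data) and chain. */
        rte_pktmbuf_adj(clone, hdr_len);
        hdr->next = clone;
        hdr->nb_segs = clone->nb_segs + 1;
        hdr->pkt_len = hdr_len + clone->pkt_len;
        hdr->ol_flags = pkt->ol_flags;
        hdr->l2_len = pkt->l2_len;
        hdr->l3_len = pkt->l3_len;
        return hdr;
fail:
        rte_pktmbuf_free(hdr);
        rte_pktmbuf_free(clone);
        return NULL;
}
```

Then tx_prepare() on each per-port copy only touches that copy's private header 
segment, while the payload stays shared.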
As I remember, for the bonding PMD things don't work quite well here - you might 
have a bond over 2 NICs with
different tx_prepare() implementations, and which one to call might not be clear 
until the actual PMD tx_burst() is invoked.

> 
> In theory, it might get even worse if we make this opaque instead of 
> transparent and standardized:
> One PMD might reset out_ip checksum to 0x0000, and another PMD might reset it 
> to 0xFFFF.
 
> 
> I can only see one solution:
> We need to standardize on common minimum requirements for how to prepare 
> packets for each TX offload.

If we can get each and every vendor to agree here - that would definitely help 
simplify things quite a bit.
Then we could probably have one common tx_prepare() for all vendors ;)
