> From: Scott Mitchell [mailto:[email protected]]
> Sent: Friday, 9 January 2026 16.27
> 
> On Fri, Jan 9, 2026 at 4:26 AM Morten Brørup <[email protected]>
> wrote:
> >
> > > Changes in v8:
> > > - __rte_raw_cksum: use native pointer arithmetic instead of
> RTE_PTR_ADD
> > >   to avoid incorrect results with -O3 for UDP checksums. Also
> improves
> > >   performance due to less assembly generated with Clang.
> >
> > Personally, I also have observed GCC's optimizer behave as if it
> loses some contextual information when using RTE_PTR_ADD, and thus
> emitting less optimal code.
> > I didn't look further into it, and thus have no data or examples to
> back up the claim. Which is why I haven't started a discussion about
> discouraging the use of RTE_PTR_ADD.
> > In other words: I support this change.
> 
> Sounds good! I observed ~600 (dpdk ptr macros) vs ~500 (native c ptr
> operations) TSC cycles/block in cksum_perf_autotest.

That is a significant performance degradation caused by the RTE_PTR_ADD() 
macros. We really should look into that - some day. ;-)
Our application code base has RTE_CONST_PTR_ADD/SUB() for type consistency 
reasons (not for performance reasons). But I haven't gotten around to 
submitting them to the DPDK project yet.
I wonder if the implicit stripping of "const" when using the RTE_PTR_ADD() 
macros makes the difference, or if the difference stems from other optimizer 
context getting lost.

> 
> >
> > >       /* if length is odd, keeping it byte order independent */
> > > -     if (unlikely(len % 2)) {
> > > +     if (len & 1) {
> > >               uint16_t left = 0;
> > > -
> > >               memcpy(&left, end, 1);
> > >               sum += left;
> > >       }
> >
> > Changing "len % 2" to "len & 1" made sense for consistency in
> previous versions handling 32/16/8/4/2-byte chunks before this 1-byte
> chunk; now it makes no difference, so consider not changing this part
> at all.
> > Under all circumstances, don't remove the unlikely() for handling odd
> length in __rte_raw_cksum(). The vast majority of packets (and partial
> packets, e.g. headers) being checksummed are even length.
> >
> 
> Sounds good. I will restore the original.
> 
> The use case that motivated these changes was software interfaces
> (veth)
> with encapsulation requiring software checksum on inner IPv4 payloads,
> where lengths may be odd/even.

You might want to mention the use case in the cover letter or patch description.
Having a real use case often helps getting a patch accepted, especially for 
optimization patches.

> However, I agree that header checksums
> with even lengths are the more common case and unlikely() is
> appropriate.

In my experience (and based on statistics from our appliances deployed), 
internet traffic is dominated by "max size" (1500 byte or QUIC "safe max" 
somewhat below 1500 byte) packets and empty TCP ACK packets, which are even 
size.
So I also consider the unlikely()applicable to "real life" traffic on the 
internet.
Although it's not in the order of magnitude many people advocate should be a 
requirement for unlikely().

Reply via email to