On Mon, May 04, 2026 at 05:10:23PM -0700, Jacob Keller wrote:
> 
> Hi,
> 
> Based on your patch description, I assume that you've tested this on
> real hardware.
> 
> I dug a little through some of our internal change history and saw
> that it looks like the hardware has a register setting in its
> GL_RDPU_CNTRL register which determines whether the checksum value
> reported is inverted or not. In E830 hardware, it is supposed to be off
> (i.e. the checksum value reported already matches the expected setting).
> 
> Perhaps your device somehow got the GL_RDPU_CNTRL register set to the
> wrong mode and that results in the swap being necessary. Hmm.
> 
> I'll ask the team to see if they can confirm this behavior.

Hi Jake,

Thanks for digging into this.

I read GL_RDPU_CNTRL on our affected E830 and the value is the same on
both ports of the NIC:

  0000:c1:00.0: GL_RDPU_CNTRL = 0x0020a275
  0000:c1:00.1: GL_RDPU_CNTRL = 0x0020a275

Decoding bit 22 (E830_GL_RDPU_CNTRL_CHECKSUM_COMPLETE_INV) gives 0,
i.e. the hardware is supposedly in "not inverted" mode, which matches
the default you described.
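
For reference, the bit test above can be sketched as follows. This is a
minimal illustration, not the driver's code: the helper name and the local
macro are hypothetical, and the bit position is taken only from the
"bit 22" statement above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CHECKSUM_COMPLETE_INV is bit 22, as stated in this mail.
 * The real definition lives in the ice driver headers. */
#define RDPU_CSUM_COMPLETE_INV_BIT 22u

/* Hypothetical helper: true if the register reports "inverted" mode. */
static bool rdpu_csum_inverted(uint32_t gl_rdpu_cntrl)
{
	return (gl_rdpu_cntrl >> RDPU_CSUM_COMPLETE_INV_BIT) & 1u;
}
```

With the value read from both ports, rdpu_csum_inverted(0x0020a275) is
false: bit 21 is set in that value, bit 22 is not.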

However, looking at the data on the wire I see:

  - netdev_rx_csum_fault fires ~65 000 times/sec on this host.
  - bpftrace at fexit:ice_process_skb_fields shows skb->csum =
    swab16(raw_csum) directly (no negation), e.g. raw_csum=0xfb4f
    -> skb->csum=0x4ffb.
  - At fentry:__skb_checksum_complete the upper 16 bits of skb->csum
    are 0xFFFF on every TCP/UDP packet -- the signature of nf_ip_checksum
    adding the pseudo-header to a value that was the un-negated raw_csum.
  - fold2(skb->csum_at_fentry + skb_checksum(skb,0,len,0)) ≈ 0xFFFF
    for every packet, which means the two values are ones-complement
    complements of each other, i.e. the driver stored S where the
    stack expects ~S.
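
The arithmetic behind the last two points can be sketched in plain C. This
is only an illustration of the ones-complement reasoning, assuming 16-bit
folding; the helper names are hypothetical and none of this is taken from
the ice driver or the network stack.

```c
#include <stdbool.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value, e.g. 0xfb4f -> 0x4ffb. */
static uint16_t swab16_u(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Fold a 32-bit ones-complement accumulator down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);	/* absorb the carry */
	return (uint16_t)sum;
}

/* Two sums are ones-complement complements if they add up to 0xffff. */
static bool is_complement_pair(uint32_t a, uint32_t b)
{
	return fold16((uint32_t)fold16(a) + fold16(b)) == 0xffffu;
}
```

For the example above, swab16_u(0xfb4f) gives 0x4ffb, and 0x4ffb paired
with its negation 0xb004 satisfies is_complement_pair(), which is the
relationship observed between the stored value and the stack's own sum.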

Negating the checksum makes the failures go away.

Thanks,
Matt
