On 12/1/2025 3:38 PM, Jesse Brandeburg wrote:
> From: Jesse Brandeburg <[email protected]>
>
> Since the beginning, the Intel ice driver has counted receive checksum
> offload mismatches into the rx_errors member of the rtnl_link_stats64
> struct. In ethtool -S these show up as rx_csum_bad.nic.
>
> I believe counting these in rx_errors is fundamentally wrong, as it's
> pretty clear from the comments in if_link.h and from every other statistic
> the driver is summing into rx_errors, that all of them would cause a
> "hardware drop" except for the UDP checksum mismatch, as well as the fact
> that all the other causes for rx_errors are L2 reasons, and this L4 UDP
> "mismatch" is an outlier.
>
> A last nail in the coffin is that rx_errors is monitored in production and
> can indicate a bad NIC/cable/Switch port, but instead some random series of
> UDP packets with bad checksums will now trigger this alert. This false
> positive makes the alert useless and affects us as well as other companies.
>
> This packet with presumably a bad UDP checksum is *already* passed to the
> stack, just not marked as offloaded by the hardware/driver. If it is
> dropped by the stack it will show up as UDP_MIB_CSUMERRORS.
>
> And one more thing, none of the other Intel drivers, and at least bnxt_en
> and mlx5 both don't appear to count UDP offload mismatches as rx_errors.
>
> Here is a related customer complaint:
> https://community.intel.com/t5/Ethernet-Products/ice-rx-errros-is-too-sensitive-to-IP-TCP-attack-packets-Intel/td-p/1662125
>
> Fixes: 4f1fe43c920b ("ice: Add more Rx errors to netdev's rx_error counter")
> Cc: Tony Nguyen <[email protected]>
> Cc: Jake Keller <[email protected]>
> Cc: IWL <[email protected]>
> Signed-off-by: Jesse Brandeburg <[email protected]>
> --
> I am sending this to net as I consider it a bug, and it will backport
> cleanly.
> ---

It's fine with me. I can't find anything explaining why we originally chose
to put these in rx_errors, and I think it's better to align with other
drivers and vendors. I suspect the reasoning was simply "this is an error, it
obviously goes in rx_errors", even though it's of a completely different kind.

Acked-by: Jacob Keller <[email protected]>

>  drivers/net/ethernet/intel/ice/ice_main.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> b/drivers/net/ethernet/intel/ice/ice_main.c
> index 86f5859e88ef..d004acfa0f36 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -6995,7 +6995,6 @@ void ice_update_vsi_stats(struct ice_vsi *vsi)
>  	cur_ns->rx_errors = pf->stats.crc_errors +
>  			    pf->stats.illegal_bytes +
>  			    pf->stats.rx_undersize +
> -			    pf->hw_csum_rx_error +
>  			    pf->stats.rx_jabber +
>  			    pf->stats.rx_fragments +
>  			    pf->stats.rx_oversize;
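
For illustration only, the effect of the one-line deletion can be sketched
as a small standalone C model. The type and function names below
(hw_port_stats, pf_model, rx_errors_before_patch, rx_errors_after_patch) are
hypothetical and are not the driver's actual definitions; only the counter
field names are taken from the diff.

```c
#include <stdint.h>

/* Hypothetical stand-in for the port counters referenced in the diff. */
struct hw_port_stats {
	uint64_t crc_errors;
	uint64_t illegal_bytes;
	uint64_t rx_undersize;
	uint64_t rx_jabber;
	uint64_t rx_fragments;
	uint64_t rx_oversize;
};

/* Hypothetical stand-in for the PF structure; not the real struct ice_pf. */
struct pf_model {
	struct hw_port_stats stats;
	uint64_t hw_csum_rx_error;	/* Rx checksum offload mismatches */
};

/* Sum of the L2-level causes only, as in the patched expression. */
static uint64_t rx_errors_after_patch(const struct pf_model *pf)
{
	return pf->stats.crc_errors +
	       pf->stats.illegal_bytes +
	       pf->stats.rx_undersize +
	       pf->stats.rx_jabber +
	       pf->stats.rx_fragments +
	       pf->stats.rx_oversize;
}

/* The pre-patch expression, which also folded in checksum mismatches. */
static uint64_t rx_errors_before_patch(const struct pf_model *pf)
{
	return rx_errors_after_patch(pf) + pf->hw_csum_rx_error;
}
```

In this model, a burst of packets with bad checksums raises only
hw_csum_rx_error, so the post-patch sum stays flat while the pre-patch sum
grows by the same amount.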
