> -----Original Message-----
> From: Keller, Jacob E <[email protected]>
> Sent: Wednesday, November 5, 2025 10:07 PM
> To: Loktionov, Aleksandr <[email protected]>; Lobakin,
> Aleksander <[email protected]>; Nguyen, Anthony L
> <[email protected]>; Kitszel, Przemyslaw
> <[email protected]>
> Cc: [email protected]; [email protected]; Keller,
> Jacob E <[email protected]>
> Subject: [PATCH iwl-next v2 3/9] ice: use cacheline groups for
> ice_tx_ring structure
> 
> The ice ring structure was reorganized by commit 65124bbf980c ("ice:
> Reorganize tx_buf and ring structs"), and later split into a separate
> ice_tx_ring structure by commit e72bba21355d ("ice: split ice_ring
> onto Tx/Rx separate structs").
> 
> The ice_tx_ring structure has comments left over from this
> reorganization and split, indicating which fields are supposed to
> belong to which cachelines. Unfortunately, these comments are almost
> completely incorrect:
> 
>  * Cacheline 1 spans from the start of the structure to the vsi
>    pointer. This cacheline is correct, and appears to be the only
>    one that is.
> 
>  * Cacheline 2 spans from the DMA address down to the xps_state
>    field. The comments indicate it should end at the rcu head field.
> 
>  * Cacheline 3 spans from the ice_channel pointer to the end of the
>    struct, and completely contains what is marked as a separate 4th
>    cacheline.
> 
> The use of such comments to indicate cachelines is error-prone. It is
> extremely likely that the comments will become out of date with
> future refactors. Instead, use __cacheline_group_(begin|end)_aligned(),
> which is more explicit. It guarantees that members of different
> cacheline groups land in distinct cachelines through the use of
> padding. It additionally enables compile-time assertions to help
> prevent new fields from drastically re-arranging the cachelines.
> 
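> For reference, a minimal sketch of how the markers from
> <linux/cache.h> are used (the struct and its fields below are purely
> illustrative, not the actual ice layout):
> 
>     struct example_ring {
>             /* open a 64-byte-aligned group named "hot" */
>             __cacheline_group_begin_aligned(hot);
>             u16 next_to_use;
>             u16 next_to_clean;
>             /* close the group; padding is inserted so that the
>              * next member starts on a fresh cacheline
>              */
>             __cacheline_group_end_aligned(hot);
> 
>             __cacheline_group_begin_aligned(cold);
>             struct rcu_head rcu;
>             __cacheline_group_end_aligned(cold);
>     };
> 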
> There are two main issues if we just replace the existing comments
> with cacheline group markers. First, the spinlock_t tx_lock field is
> 24 bytes on most kernels, but is 64 bytes on CONFIG_DEBUG_LOCK_ALLOC
> kernels. Ideally we want to place this field at the start of a
> cacheline so that the other fields in its group don't get split
> across cachelines on such a debug kernel. While optimizing for a
> debug kernel is not a priority, doing this makes the assertions
> around the cacheline a bit easier to understand. Second, the existing
> layout leaves a 4-byte gap next to the txq_teid field.
> 
> Remove the out-of-date cacheline comments, and add __cacheline_group
> annotations. These are set to match the existing layout instead of
> matching the original comments. The only changes to the layout are to
> re-order tx_lock to the start of cacheline 3 and to move txq_teid to
> close the 4-byte gap, as sketched below.
> 
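> As a rough sketch, the annotated structure then looks something like
> this (most fields elided; the group names here are illustrative, not
> necessarily the ones the patch uses):
> 
>     struct ice_tx_ring {
>             __cacheline_group_begin_aligned(cacheline1);
>             /* ... hot fields, ending at the vsi pointer ... */
>             struct ice_vsi *vsi;
>             __cacheline_group_end_aligned(cacheline1);
> 
>             __cacheline_group_begin_aligned(cacheline2);
>             dma_addr_t dma;
>             /* ... fields down to xps_state ... */
>             __cacheline_group_end_aligned(cacheline2);
> 
>             /* tx_lock leads cacheline 3 so that a 64-byte lock on
>              * CONFIG_DEBUG_LOCK_ALLOC kernels does not push the
>              * remaining fields across a cacheline boundary
>              */
>             __cacheline_group_begin_aligned(cacheline3);
>             spinlock_t tx_lock;
>             u32 txq_teid;
>             /* ... remaining fields ... */
>             __cacheline_group_end_aligned(cacheline3);
>     };
> 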
> Ideally, we should profile the Tx hot path to figure out which fields
> are used together, and re-arrange the cacheline groups accordingly,
> possibly along "read_mostly", "readwrite" and "cold" groupings
> similar to those in the idpf driver. This has been left for a later
> improvement.
> 
> Finally, add annotations which check the cacheline sizes. For
> cacheline 3, we enforce that tx_lock is in this cacheline group, and
> check the size based on whether or not CONFIG_DEBUG_LOCK_ALLOC is
> enabled. Similar to the Rx annotations, these check that the size of
> each cacheline group (excluding padding) is no greater than 64 bytes.
> This is primarily intended to produce compile-time failures if
> developers add or re-arrange fields such that a cacheline group
> exceeds the expected 64-byte size on x86_64 systems. Because the
> assertions check the size excluding any padding, they should behave
> the same even on systems with larger L1 cacheline sizes.
> 
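> A sketch of what these assertions can look like, using the generic
> helpers from <linux/cache.h> (the group name and the debug-kernel
> size below are illustrative, not taken from the patch):
> 
>     /* tx_lock must live inside the cacheline3 group */
>     CACHELINE_ASSERT_GROUP_MEMBER(struct ice_tx_ring, cacheline3,
>                                   tx_lock);
> 
>     #ifdef CONFIG_DEBUG_LOCK_ALLOC
>     /* spinlock_t grows from 24 to 64 bytes, enlarging the group;
>      * the extra 40 bytes here are illustrative
>      */
>     CACHELINE_ASSERT_GROUP_SIZE(struct ice_tx_ring, cacheline3,
>                                 64 + 40);
>     #else
>     CACHELINE_ASSERT_GROUP_SIZE(struct ice_tx_ring, cacheline3, 64);
>     #endif
> 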
> Signed-off-by: Jacob Keller <[email protected]>
> ---

...
 
> --
> 2.51.0.rc1.197.g6d975e95c9d7

Reviewed-by: Aleksandr Loktionov <[email protected]>
