> -----Original Message-----
> From: Keller, Jacob E <[email protected]>
> Sent: Wednesday, November 5, 2025 10:07 PM
> To: Loktionov, Aleksandr <[email protected]>; Lobakin, Aleksander <[email protected]>; Nguyen, Anthony L <[email protected]>; Kitszel, Przemyslaw <[email protected]>
> Cc: [email protected]; [email protected]; Keller, Jacob E <[email protected]>
> Subject: [PATCH iwl-next v2 3/9] ice: use cacheline groups for ice_tx_ring structure
>
> The ice ring structure was reorganized by commit 65124bbf980c ("ice:
> Reorganize tx_buf and ring structs"), and later split into a separate
> ice_tx_ring structure by commit e72bba21355d ("ice: split ice_ring
> onto Tx/Rx separate structs").
>
> The ice_tx_ring structure has comments left over from this
> reorganization and split, indicating which fields are supposed to
> belong to which cachelines. Unfortunately, these comments are almost
> completely incorrect:
>
> * Cacheline 1 spans from the start of the structure to the vsi
>   pointer. This cacheline is correct, and appears to be the only one
>   that is.
>
> * Cacheline 2 spans from the DMA address down to the xps_state field.
>   The comments indicate it should end at the rcu head field.
>
> * Cacheline 3 spans from the ice_channel pointer to the end of the
>   struct, and completely contains what is marked as a separate 4th
>   cacheline.
>
> Using such comments to indicate cache lines is error-prone: it is
> extremely likely that the comments will become out of date with
> future refactors. Instead, use __cacheline_group_(begin|end)_aligned(),
> which is more explicit. It guarantees that members between the
> cacheline groups will be placed in distinct cache lines through the
> use of padding, and it additionally enables compile-time assertions
> to help prevent new fields from drastically re-arranging the cache
> lines.
>
> There are two main issues if we simply replace the existing comments
> with cacheline group markers.
> First, the spinlock_t tx_lock field is 24 bytes on most kernels, but
> is 64 bytes on kernels with CONFIG_DEBUG_LOCK_ALLOC enabled. Ideally
> we want to place this field at the start of a cacheline so that the
> other fields in its group don't get split across cachelines on such a
> debug kernel. While optimizing a debug kernel is not a priority,
> doing this makes the assertions around the cacheline easier to
> understand. Second, the existing layout contains a 4-byte gap, which
> can be avoided by moving txq_teid.
>
> Remove the out-of-date cacheline comments, and add __cacheline_group
> annotations. These are set to match the existing layout rather than
> the original comments. The only changes to the layout are to re-order
> tx_lock to the start of cacheline 3, and to move txq_teid to avoid a
> 4-byte gap in the layout.
>
> Ideally, we would profile the Tx hot path, figure out which fields
> belong together, and re-arrange the cacheline groups accordingly,
> possibly along "read_mostly", "readwrite" and "cold" groupings
> similar to the idpf driver. This is left as an exercise for a later
> improvement.
>
> Finally, add annotations which check the cacheline sizes. For
> cacheline 3, we enforce that tx_lock is in this cacheline group, and
> check the size based on whether CONFIG_DEBUG_LOCK_ALLOC is enabled.
> Similar to the Rx annotations, these check that the size of each
> cacheline group (excluding padding) is no greater than 64 bytes. This
> is primarily intended to produce compile failures if developers add
> or re-arrange fields such that a cacheline group exceeds the expected
> 64-byte size on x86_64 systems. Because the assertions check the size
> excluding any padding, they should behave the same even on systems
> with larger L1 cacheline sizes.
>
> Signed-off-by: Jacob Keller <[email protected]>
> ---
... > -- > 2.51.0.rc1.197.g6d975e95c9d7 Reviewed-by: Aleksandr Loktionov <[email protected]>
