On Wed, Nov 05, 2025 at 01:06:34PM -0800, Jacob Keller wrote:
> The ice ring structure was reorganized back by commit 65124bbf980c ("ice:
> Reorganize tx_buf and ring structs"), and later split into a separate
> ice_rx_ring structure by commit e72bba21355d ("ice: split ice_ring onto
> Tx/Rx separate structs")
> 
> The ice_rx_ring structure has comments left over from this prior
> reorganization indicating which fields belong to which cachelines.
> Unfortunately, these comments are not all accurate. The intended layout is
> for x86_64 systems with a 64-byte cache.
> 
>  * Cacheline 1 spans from the start of the struct to the end of the rx_fqes
>    and xdp_buf union. The comments correctly match this.
> 
>  * Cacheline 2 spans from hdr_fqes to the end of hdr_truesize, but the
>    comment indicates it should end xdp and xsk union.
> 
>  * Cacheline 3 spans from the truesize field to the xsk_pool, but the
>    comment wants this to be from the pkt_ctx down to the rcu head field.
> 
>  * Cacheline 4 spans from the rx_hdr_len down to the flags field, but the
>    comment indicates that it starts back at the ice_channel structure
>    pointer.
> 
>  * Cacheline 5 is indicated to cover the xdp_rxq. Because this field is
>    aligned to 64 bytes, this is actually true. However, there is a large 45
>    byte gap at the end of cacheline 4.
> 
> The use of comments to indicate cachelines is poor design. In practice,
> comments like this quickly become outdated as developers add or remove
> fields, or as other sub-structures change layout and size unexpectedly.
> 
> The ice_rx_ring structure *is* 5 cachelines (320 bytes), but ends up having
> quite a lot of empty space at the end just before xdp_rxq.
> 
> Replace the comments with __cacheline_group_(begin|end)_aligned()
> annotations. These macros enforce alignment to the start of cachelines, and
> enforce padding between groups, thus guaranteeing that a following group
> cannot be part of the same cacheline.
> 
> Doing this changes the layout by effectively spreading the padding at the
> tail of cacheline 4 between groups to ensure that the relevant fields are
> kept as separate cachelines on x86_64 systems. For systems with the
> expected cache line size of 64 bytes, the structure size does not change.
> For systems with a larger SMB_CACHE_BYTES size, the structure *will*
> increase in size a fair bit, however we'll now guarantee that each group is
> in a separate cacheline. This has an advantage that updates to fields in
> one group won't trigger cacheline eviction of the other groups. This comes
> at the expense of extra memory footprint for the rings.
> 
> If fields get re-arranged, added, or removed, the alignment and padding
> ensure the relevant fields are kept on separate cache lines. This could
> result in unexpected changes in the structure size due to padding to keep
> cachelines separate.
> 
> To catch such changes during early development, add build time compiler
> assertions that check the size of each group to ensure it doesn't exceed 64
> bytes, the expected cache line size. The assertion checks that the size of
> the group excluding any padding at the end is less than the provided size
> of 64 bytes. This type of check will behave the same even on architectures
> with larger cache sizes. The primary aim is to produce a warning if
> developers add fields into a cacheline group which exceeds the size of the
> expected 64 byte groupings.
> 
> Reviewed-by: Aleksandr Loktionov <[email protected]>
> Signed-off-by: Jacob Keller <[email protected]>
> ---
>  drivers/net/ethernet/intel/ice/ice_txrx.h | 26 +++++++++++++++++++++-----
>  drivers/net/ethernet/intel/ice/ice_main.c |  2 ++
>  2 files changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h 
> b/drivers/net/ethernet/intel/ice/ice_txrx.h

...

> @@ -298,10 +302,22 @@ struct ice_rx_ring {
>  #define ICE_RX_FLAGS_MULTIDEV                BIT(3)
>  #define ICE_RX_FLAGS_RING_GCS                BIT(4)
>       u8 flags;
> -     /* CL5 - 5th cacheline starts here */
> +     __cacheline_group_end_aligned(cl4);
> +
> +     __cacheline_group_begin_aligned(cl5);
>       struct xdp_rxq_info xdp_rxq;
> +     __cacheline_group_end_aligned(cl5);
>  } ____cacheline_internodealigned_in_smp;
>  
> +static inline void ice_rx_ring_struct_check(void)
> +{
> +     CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl1, 64);
> +     CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl2, 64);
> +     CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl3, 64);
> +     CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl4, 64);
> +     CACHELINE_ASSERT_GROUP_SIZE(struct ice_rx_ring, cl5, 64);

Hi Jacob,

Unfortunately the last line results in a build failure on ARM (32-bit)
with allmodconfig. It seems that in that case the size of the group is
128 bytes.

> +}
> +
>  struct ice_tx_ring {
>       /* CL1 - 1st cacheline starts here */
>       struct ice_tx_ring *next;       /* pointer to next ring in q_vector */

...

Reply via email to