This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
Date: Thu Jun 11 19:02:22 2026 +0000
net: mana: Add Interrupt Moderation support
This commit adds Static and Dynamic Interrupt Moderation (DIM) support for
Rx and Tx. It updates the queue creation procedure, adds functions to
collect stats for DIM, and provides ethtool handler support to get/set
the moderation settings.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 26aef21c6c2c8..d92e022d8533c 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1579,6 +1579,9 @@ int mana_create_wq_obj(struct mana_port_context *apc,
>
> mana_gd_init_req_hdr(&req.hdr, MANA_CREATE_WQ_OBJ,
> sizeof(req), sizeof(resp));
> +
> + req.hdr.req.msg_version = GDMA_MESSAGE_V3;
> + req.hdr.resp.msg_version = GDMA_MESSAGE_V2;
> req.vport = vport;
> req.wq_type = wq_type;
> req.wq_gdma_region = wq_spec->gdma_region;
> @@ -1587,6 +1590,9 @@ int mana_create_wq_obj(struct mana_port_context *apc,
> req.cq_size = cq_spec->queue_size;
> req.cq_moderation_ctx_id = cq_spec->modr_ctx_id;
> req.cq_parent_qid = cq_spec->attached_eq;
> + req.req_cq_moderation = cq_spec->req_cq_moderation;
> + req.cq_moderation_comp = cq_spec->cq_moderation_comp;
> + req.cq_moderation_usec = cq_spec->cq_moderation_usec;
[Severity: High]
Does this break queue creation on older Azure host firmwares?
The code unconditionally bumps the WQ object creation request to use
GDMA_MESSAGE_V3 and passes the req_cq_moderation flag. If the firmware does
not support GDMA_PF_CAP_FLAG_1_DYN_INTERRUPT_MODERATION, could the firmware
reject the queue creation request and break device initialization entirely?
[ ... ]
> +static void mana_update_tx_dim(struct mana_cq *cq)
> +{
> + struct mana_port_context *apc = netdev_priv(cq->txq->ndev);
> + struct dim_sample dim_sample = {};
> + struct mana_txq *txq = cq->txq;
> +
> + if (!apc->tx_dim_enabled)
> + return;
> +
> + dim_update_sample(READ_ONCE(cq->dim_event_ctr), txq->stats.packets,
> + txq->stats.bytes, &dim_sample);
> + net_dim(&cq->dim, &dim_sample);
> +}
[Severity: High]
Does this measure the CPU enqueue rate rather than the hardware completion
rate?
It appears that txq->stats.packets is incremented in mana_start_xmit()
upon enqueue, not during completion in mana_poll_tx_cq(). Dynamic interrupt
moderation relies on measuring the actual hardware completion rate to
function correctly.
Additionally, is it safe to read txq->stats.packets here without the
u64_stats_fetch_begin() retry loop? Reading a 64-bit statistic locklessly
without the synchronization primitives could result in torn reads on 32-bit
systems.
[ ... ]
> @@ -2324,7 +2434,13 @@ static int mana_cq_handler(void *context, struct
> gdma_queue *gdma_queue)
> if (w < cq->budget) {
> mana_gd_ring_cq(gdma_queue, SET_ARM_BIT);
> cq->work_done_since_doorbell = 0;
> - napi_complete_done(&cq->napi, w);
> +
> + if (napi_complete_done(&cq->napi, w)) {
> + if (cq->type == MANA_CQ_TYPE_RX)
> + mana_update_rx_dim(cq);
> + else
> + mana_update_tx_dim(cq);
> + }
[Severity: High]
Is it safe to call mana_update_rx_dim() and mana_update_tx_dim() after
napi_complete_done()?
Since napi_complete_done() ends the NAPI polling cycle and relinquishes
mutual exclusion, and the hardware interrupt was already re-armed by
mana_gd_ring_cq() right before this, couldn't a new packet trigger an
interrupt immediately? This could allow NAPI to be scheduled on another CPU
concurrently while this CPU is still running the DIM update, potentially
corrupting the non-thread-safe net_dim() state machine.
[ ... ]
> @@ -2562,6 +2685,9 @@ static int mana_create_txq(struct mana_port_context
> *apc,
> napi_enable_locked(&cq->napi);
> txq->napi_initialized = true;
>
> + INIT_WORK(&cq->dim.work, mana_tx_dim_work);
> + cq->dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
[Severity: High]
Could this allow NAPI to schedule an uninitialized work_struct?
napi_enable_locked() is called right before INIT_WORK(). Once NAPI is
enabled, it can be scheduled to run immediately (for example, via netpoll).
If NAPI executes before INIT_WORK completes, mana_update_tx_dim() would
pass the uninitialized cq->dim to net_dim(), which could cause a kernel
panic when it attempts to call schedule_work().
This pattern also occurs in mana_create_rxq().
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> index 94e658d07a27e..67a2b282ff4da 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c
[ ... ]
> @@ -440,18 +474,123 @@ static int mana_set_coalesce(struct net_device *ndev,
> + if (enable_rx_dim)
> + apc->rx_dim_enabled = true;
> + if (enable_tx_dim)
> + apc->tx_dim_enabled = true;
[Severity: High]
Could the lack of memory barriers here expose uninitialized DIM structures
to NAPI?
The driver initializes DIM via mana_dim_change() and then sets
apc->rx_dim_enabled to true. Without an smp_store_release() here and a
corresponding smp_load_acquire() in mana_update_rx_dim(), weakly-ordered
CPUs like ARM64 might reorder the stores. Concurrently, NAPI polling might
observe the flag as true before the initialization is fully visible in memory,
potentially invoking net_dim() on garbage memory.