From: AI Reviewer <[email protected]>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
---
net: ena: Add error bound to PHC communication layer

This commit extends the ENA PHC communication layer to retrieve an
error_bound value from the device alongside the timestamp, and adds a
phc_err_eb statistic to count error_bound retrieval failures.

> diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
> index 257910460bd8..832c4c1d0e5e 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_com.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_com.c

[ ... ]

> @@ -1882,14 +1888,15 @@ int ena_com_phc_get_timestamp(struct ena_com_dev *ena_dev, u64 *timestamp)
>               }
>  
>               /* req_id was updated by the device which indicates that
> -              * PHC timestamp and error_flags are updated too,
> -              * checking errors before retrieving timestamp
> +              * PHC timestamp, error_bound and error_flags are updated too,
> +              * checking error flags before retrieving timestamp and
> +              * error_bound values
>                */
>               if (unlikely(resp->error_flags & ENA_PHC_ERROR_FLAGS)) {

[High]

Once req_id is observed equal to phc->req_id, the function reads
resp->error_flags, resp->timestamp and (now) resp->error_bound from the
device-shared DMA buffer. The updated comment explicitly states that once
req_id is observed updated, "PHC timestamp, error_bound and error_flags
are updated too". That is a producer/consumer pattern, which normally
requires a read barrier between the index check and the subsequent data
loads.

The buffer is declared volatile and the req_id check uses READ_ONCE(),
but neither orders the later data loads after the req_id load. Those
loads are only control-dependent on the check, and a control dependency
does not order load-to-load accesses. On weakly-ordered architectures
such as arm64 (which includes AWS Graviton, a primary ENA deployment
target) the CPU may therefore observe the new req_id while still
returning stale or torn values for the 64-bit timestamp, the error_bound,
or the error_flags unless a dma_rmb() separates the two.

Could a dma_rmb() be added between the req_id check and the subsequent
reads of resp->error_flags / resp->timestamp / resp->error_bound, similar
to the pattern already used elsewhere in the same driver?

For example, ena_com_handle_admin_completion() in the same file does:

        /* Do not read the rest of the completion entry before the
         * phase bit was validated
         */
        dma_rmb();

and ena_com_aenq_intr_handler() has:

        /* Make sure the phase bit (ownership) is as expected before
         * reading the rest of the descriptor.
         */
        dma_rmb();

The same shape applies here: device writes data fields, then writes
req_id last; driver checks req_id, then reads data fields.
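
To make the required ordering concrete, here is a minimal userspace model
of that shape. It is a sketch, not driver code: struct phc_resp_model,
model_device_complete() and model_driver_read() are hypothetical names,
the field layout is assumed, and the C11 acquire fence only stands in for
the spot where the kernel code would call dma_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device-shared PHC response buffer.
 * Field names follow the review text; the layout is an assumption.
 */
struct phc_resp_model {
	_Atomic uint16_t req_id;	/* published last by the producer */
	uint64_t timestamp;
	uint32_t error_bound;
	uint32_t error_flags;
};

/* Producer (models the device): write the data fields first, then
 * publish req_id last with release semantics.
 */
void model_device_complete(struct phc_resp_model *resp, uint16_t req_id,
			   uint64_t timestamp, uint32_t error_bound)
{
	resp->timestamp = timestamp;
	resp->error_bound = error_bound;
	resp->error_flags = 0;
	atomic_store_explicit(&resp->req_id, req_id, memory_order_release);
}

/* Consumer (models the driver): check req_id, fence, then read data.
 * Returns true only if the response matches and carries no error flags.
 */
bool model_driver_read(struct phc_resp_model *resp, uint16_t expected,
		       uint64_t *timestamp, uint32_t *error_bound)
{
	assert(timestamp != NULL);

	if (atomic_load_explicit(&resp->req_id, memory_order_relaxed) !=
	    expected)
		return false;

	/* Userspace analogue of dma_rmb(): do not read the data fields
	 * before the req_id check above has completed.
	 */
	atomic_thread_fence(memory_order_acquire);

	if (resp->error_flags)
		return false;

	*timestamp = resp->timestamp;
	if (error_bound)
		*error_bound = resp->error_bound;

	return true;
}
```

Without the fence, nothing in the model prevents the data loads from being
satisfied before the req_id load, which is the same hazard described above
for the driver.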

This ordering issue is not introduced by this patch: the timestamp and
error_flags reads after the req_id check existed before. The patch does,
however, extend the same unsynchronized read pattern to a third field
(error_bound), and the new comment now documents an ordering that is
assumed but not actually enforced. Would it make sense to address the
missing dma_rmb() while touching this code?

>                       /* Retrieved invalid PHC timestamp, PHC enters into

[ ... ]

> @@ -1897,12 +1904,15 @@ int ena_com_phc_get_timestamp(struct ena_com_dev *ena_dev, u64 *timestamp)
>  
>               /* PHC timestamp value is returned to the caller */
>               *timestamp = resp->timestamp;
> +             if (error_bound)
> +                     *error_bound = resp->error_bound;
>  
>               /* Update statistic on valid PHC timestamp retrieval */
>               phc->stats.phc_cnt++;

[ ... ]
-- 
This is an AI-generated review.

