On Wed, Jun 03, 2026 at 09:28:11AM +0800, yangjiale133 wrote:
> From the device's perspective, during a single read of the descriptor list
> specifically when that list spans across cache lines.
> the retrieved data will show `desc[head].flags` as valid, 
> and `desc[i].flags` as valid as well; however, 
> the `desc[i].addr` and `len` fields may be invalid.
> 
> I apologize that I currently lack the necessary environment to verify 
> whether this modification definitively resolves the issue or 
> merely reduces the probability of its occurrence; 
> therefore, this patch can be discarded.
> 
> yangjiale


it could be that your device does a single read of the descriptor.
the pci spec is explicit that after getting valid flags,
device must read the descriptor again.


> 
> At 2026-06-02 14:04:13, "Eugenio Perez Martin" <[email protected]> wrote:
> >On Tue, Jun 2, 2026 at 6:34 AM yangjiale <[email protected]> wrote:
> >>
> >> When a descriptor list spans across cache lines,
> >> updating the flag first can lead to a scenario where the device side
> >> perceives the flag as valid, yet the corresponding address and length
> >> fields remain unupdated—resulting in invalid values.
> >> Therefore, the flag field must be updated last.
> >>
> >> Signed-off-by: yangjiale <[email protected]>
> >> ---
> >>  drivers/virtio/virtio_ring.c | 8 ++++----
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> >> index fbca7ce1c6bf..036b4f90d30f 100644
> >> --- a/drivers/virtio/virtio_ring.c
> >> +++ b/drivers/virtio/virtio_ring.c
> >> @@ -1688,6 +1688,10 @@ static inline int virtqueue_add_packed(struct 
> >> vring_virtqueue *vq,
> >>                                              &addr, &len, premapped, attr))
> >>                                 goto unmap_release;
> >>
> >> +                       desc[i].addr = cpu_to_le64(addr);
> >> +                       desc[i].len = cpu_to_le32(len);
> >> +                       desc[i].id = cpu_to_le16(id);
> >> +
> >>                         flags = cpu_to_le16(vq->packed.avail_used_flags |
> >>                                     (++c == total_sg ? 0 : 
> >> VRING_DESC_F_NEXT) |
> >>                                     (n < out_sgs ? 0 : 
> >> VRING_DESC_F_WRITE));
> >> @@ -1696,10 +1700,6 @@ static inline int virtqueue_add_packed(struct 
> >> vring_virtqueue *vq,
> >>                         else
> >>                                 desc[i].flags = flags;
> >>
> >> -                       desc[i].addr = cpu_to_le64(addr);
> >> -                       desc[i].len = cpu_to_le32(len);
> >> -                       desc[i].id = cpu_to_le16(id);
> >> -
> >>                         if (unlikely(vq->use_map_api)) {
> >>                                 vq->packed.desc_extra[curr].addr = 
> >> premapped ?
> >>                                         DMA_MAPPING_ERROR : addr;
> >
> >These flags are updated before the flags of the head descriptor at the
> >end of the function, at "vq->packed.vring.desc[head].flags =
> >head_flags", so the device should not see these. Because of that, the
> >relative order between the rest of the fields of the same descriptor
> >or other descriptors' fields, except for the head descriptor's flags,
> >should not matter. There is a write memory barrier just before
> >updating the head's flags.
> >
> >Also, I don't get why the cache line matters here. Can you expand? Am
> >I missing something?
> 


Reply via email to