Re: [PATCH v2 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
On Mon, 5 Sep 2022, Julien Grall wrote: > On 05/09/2022 11:23, Bertrand Marquis wrote: > > > On 5 Sep 2022, at 10:31, Julien Grall wrote: > > > On 05/09/2022 10:18, Rahul Singh wrote: > > > > > On 3 Sep 2022, at 12:21 am, Stefano Stabellini > > > > > wrote: > > > > > > > > > > On Fri, 2 Sep 2022, Rahul Singh wrote: > > > > > > From: Zhou Wang > > > > > > > > > > > > Backport Linux commit a76a3f2c. This is the clean backport > > > > > > without > > > > > > any changes. > > > > > > > > > > > > Reading the 'prod' MMIO register in order to determine whether or > > > > > > not there is valid data beyond 'cons' for a given queue does not > > > > > > provide sufficient dependency ordering, as the resulting access is > > > > > > address dependent only on 'cons' and can therefore be speculated > > > > > > ahead of time, potentially allowing stale data to be read by the > > > > > > CPU. > > > > > > > > > > > > Use readl() instead of readl_relaxed() when updating the shadow copy > > > > > > of the 'prod' pointer, so that all speculated memory reads from the > > > > > > corresponding queue can occur only from valid slots. > > > > > > > > > > > > Signed-off-by: Zhou Wang > > > > > > Link: > > > > > > https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com > > > > > > [will: Use readl() instead of explicit barrier. Update 'cons' side > > > > > > to match.] > > > > > > Signed-off-by: Will Deacon > > > > > > Origin: > > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > > > > > > a76a3f2c > > > > > > Signed-off-by: Rahul Singh > > > > > > --- > > > > > > Changes in v2: > > > > > > - fix commit msg > > > > > > - add _iomb changes also from the origin patch > > > > > > --- > > > > > > xen/arch/arm/include/asm/system.h | 1 + > > > > > > xen/drivers/passthrough/arm/smmu-v3.c | 11 +-- > > > > > > 2 files changed, 10 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/xen/arch/arm/include/asm/system.h > > > > > > b/xen/arch/arm/include/asm/system.h > > > > > > index 65d5c8e423..fe27cf8c5e 100644 > > > > > > --- a/xen/arch/arm/include/asm/system.h > > > > > > +++ b/xen/arch/arm/include/asm/system.h > > > > > > @@ -29,6 +29,7 @@ > > > > > > #endif > > > > > > > > > > > > #define smp_wmb() dmb(ishst) > > > > > > +#define __iomb()dmb(osh) > > > > > > > > > > We don't have any other #define starting with __ in system.h. > > > > > I wonder if we should call this macro differently or simply iomb(). > > > > I think either iomb() or dma_mb() will be the right name. > > > > Please let me know your view on this. > > > > > > It is not 100% clear why Linux went with __iomb() rather than iomb(). But > > > I would prefer to keep the __* version to match Linux. > > > > > > If the others really want to drop the __. Then I think it should be name > > > iomb(). The rationale is while __iomb() is an alias to dma_mb(), the > > > __iormb() behaves differently compare to dma_mb() (I haven't into details > > > why). > > > > > > So if it was a read barrier, we would likely want to use the iormb() > > > semantic. This will keep the terminology consistent with Linux (even if we > > > remove the __). > > > > We need the __iomb as “linux compatibility” in fact so I would suggest for > > now to only introduce it at the beginning of smmu-v3.c with other linux > > compatibility stuff to prevent adding this to Xen overall. > > I would be fine with that. +1
Re: [PATCH v2 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
Hi Bertrand, On 05/09/2022 11:23, Bertrand Marquis wrote: Hi, On 5 Sep 2022, at 10:31, Julien Grall wrote: On 05/09/2022 10:18, Rahul Singh wrote: On 3 Sep 2022, at 12:21 am, Stefano Stabellini wrote: On Fri, 2 Sep 2022, Rahul Singh wrote: From: Zhou Wang Backport Linux commit a76a3f2c. This is the clean backport without any changes. Reading the 'prod' MMIO register in order to determine whether or not there is valid data beyond 'cons' for a given queue does not provide sufficient dependency ordering, as the resulting access is address dependent only on 'cons' and can therefore be speculated ahead of time, potentially allowing stale data to be read by the CPU. Use readl() instead of readl_relaxed() when updating the shadow copy of the 'prod' pointer, so that all speculated memory reads from the corresponding queue can occur only from valid slots. Signed-off-by: Zhou Wang Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] Signed-off-by: Will Deacon Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a3f2c Signed-off-by: Rahul Singh --- Changes in v2: - fix commit msg - add _iomb changes also from the origin patch --- xen/arch/arm/include/asm/system.h | 1 + xen/drivers/passthrough/arm/smmu-v3.c | 11 +-- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h index 65d5c8e423..fe27cf8c5e 100644 --- a/xen/arch/arm/include/asm/system.h +++ b/xen/arch/arm/include/asm/system.h @@ -29,6 +29,7 @@ #endif #define smp_wmb() dmb(ishst) +#define __iomb()dmb(osh) We don't have any other #define starting with __ in system.h. I wonder if we should call this macro differently or simply iomb(). I think either iomb() or dma_mb() will be the right name. Please let me know your view on this. It is not 100% clear why Linux went with __iomb() rather than iomb(). But I would prefer to keep the __* version to match Linux. If the others really want to drop the __. Then I think it should be name iomb(). The rationale is while __iomb() is an alias to dma_mb(), the __iormb() behaves differently compare to dma_mb() (I haven't into details why). So if it was a read barrier, we would likely want to use the iormb() semantic. This will keep the terminology consistent with Linux (even if we remove the __). We need the __iomb as “linux compatibility” in fact so I would suggest for now to only introduce it at the beginning of smmu-v3.c with other linux compatibility stuff to prevent adding this to Xen overall. I would be fine with that. Cheers, -- Julien Grall
Re: [PATCH v2 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
Hi, > On 5 Sep 2022, at 10:31, Julien Grall wrote: > > > > On 05/09/2022 10:18, Rahul Singh wrote: >>> On 3 Sep 2022, at 12:21 am, Stefano Stabellini >>> wrote: >>> >>> On Fri, 2 Sep 2022, Rahul Singh wrote: From: Zhou Wang Backport Linux commit a76a3f2c. This is the clean backport without any changes. Reading the 'prod' MMIO register in order to determine whether or not there is valid data beyond 'cons' for a given queue does not provide sufficient dependency ordering, as the resulting access is address dependent only on 'cons' and can therefore be speculated ahead of time, potentially allowing stale data to be read by the CPU. Use readl() instead of readl_relaxed() when updating the shadow copy of the 'prod' pointer, so that all speculated memory reads from the corresponding queue can occur only from valid slots. Signed-off-by: Zhou Wang Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] Signed-off-by: Will Deacon Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a3f2c Signed-off-by: Rahul Singh --- Changes in v2: - fix commit msg - add _iomb changes also from the origin patch --- xen/arch/arm/include/asm/system.h | 1 + xen/drivers/passthrough/arm/smmu-v3.c | 11 +-- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h index 65d5c8e423..fe27cf8c5e 100644 --- a/xen/arch/arm/include/asm/system.h +++ b/xen/arch/arm/include/asm/system.h @@ -29,6 +29,7 @@ #endif #define smp_wmb() dmb(ishst) +#define __iomb()dmb(osh) >>> >>> We don't have any other #define starting with __ in system.h. >>> I wonder if we should call this macro differently or simply iomb(). >> I think either iomb() or dma_mb() will be the right name. >> Please let me know your view on this. > > It is not 100% clear why Linux went with __iomb() rather than iomb(). But I > would prefer to keep the __* version to match Linux. > > If the others really want to drop the __. Then I think it should be name > iomb(). The rationale is while __iomb() is an alias to dma_mb(), the > __iormb() behaves differently compare to dma_mb() (I haven't into details > why). > > So if it was a read barrier, we would likely want to use the iormb() > semantic. This will keep the terminology consistent with Linux (even if we > remove the __). We need the __iomb as “linux compatibility” in fact so I would suggest for now to only introduce it at the beginning of smmu-v3.c with other linux compatibility stuff to prevent adding this to Xen overall. Cheers Bertrand > > Cheers, > > -- > Julien Grall
Re: [PATCH v2 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
On 05/09/2022 10:18, Rahul Singh wrote: On 3 Sep 2022, at 12:21 am, Stefano Stabellini wrote: On Fri, 2 Sep 2022, Rahul Singh wrote: From: Zhou Wang Backport Linux commit a76a3f2c. This is the clean backport without any changes. Reading the 'prod' MMIO register in order to determine whether or not there is valid data beyond 'cons' for a given queue does not provide sufficient dependency ordering, as the resulting access is address dependent only on 'cons' and can therefore be speculated ahead of time, potentially allowing stale data to be read by the CPU. Use readl() instead of readl_relaxed() when updating the shadow copy of the 'prod' pointer, so that all speculated memory reads from the corresponding queue can occur only from valid slots. Signed-off-by: Zhou Wang Link: https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] Signed-off-by: Will Deacon Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git a76a3f2c Signed-off-by: Rahul Singh --- Changes in v2: - fix commit msg - add _iomb changes also from the origin patch --- xen/arch/arm/include/asm/system.h | 1 + xen/drivers/passthrough/arm/smmu-v3.c | 11 +-- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/include/asm/system.h b/xen/arch/arm/include/asm/system.h index 65d5c8e423..fe27cf8c5e 100644 --- a/xen/arch/arm/include/asm/system.h +++ b/xen/arch/arm/include/asm/system.h @@ -29,6 +29,7 @@ #endif #define smp_wmb() dmb(ishst) +#define __iomb()dmb(osh) We don't have any other #define starting with __ in system.h. I wonder if we should call this macro differently or simply iomb(). I think either iomb() or dma_mb() will be the right name. Please let me know your view on this. It is not 100% clear why Linux went with __iomb() rather than iomb(). But I would prefer to keep the __* version to match Linux. If the others really want to drop the __. Then I think it should be name iomb(). The rationale is while __iomb() is an alias to dma_mb(), the __iormb() behaves differently compare to dma_mb() (I haven't into details why). So if it was a read barrier, we would likely want to use the iormb() semantic. This will keep the terminology consistent with Linux (even if we remove the __). Cheers, -- Julien Grall
Re: [PATCH v2 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
Hi Stefano, > On 3 Sep 2022, at 12:21 am, Stefano Stabellini wrote: > > On Fri, 2 Sep 2022, Rahul Singh wrote: >> From: Zhou Wang >> >> Backport Linux commit a76a3f2c. This is the clean backport without >> any changes. >> >> Reading the 'prod' MMIO register in order to determine whether or >> not there is valid data beyond 'cons' for a given queue does not >> provide sufficient dependency ordering, as the resulting access is >> address dependent only on 'cons' and can therefore be speculated >> ahead of time, potentially allowing stale data to be read by the >> CPU. >> >> Use readl() instead of readl_relaxed() when updating the shadow copy >> of the 'prod' pointer, so that all speculated memory reads from the >> corresponding queue can occur only from valid slots. >> >> Signed-off-by: Zhou Wang >> Link: >> https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com >> [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] >> Signed-off-by: Will Deacon >> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git >> a76a3f2c >> Signed-off-by: Rahul Singh >> --- >> Changes in v2: >> - fix commit msg >> - add _iomb changes also from the origin patch >> --- >> xen/arch/arm/include/asm/system.h | 1 + >> xen/drivers/passthrough/arm/smmu-v3.c | 11 +-- >> 2 files changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/xen/arch/arm/include/asm/system.h >> b/xen/arch/arm/include/asm/system.h >> index 65d5c8e423..fe27cf8c5e 100644 >> --- a/xen/arch/arm/include/asm/system.h >> +++ b/xen/arch/arm/include/asm/system.h >> @@ -29,6 +29,7 @@ >> #endif >> >> #define smp_wmb() dmb(ishst) >> +#define __iomb()dmb(osh) > > We don't have any other #define starting with __ in system.h. > I wonder if we should call this macro differently or simply iomb(). I think either iomb() or dma_mb() will be the right name. Please let me know your view on this. Regards, Rahul
Re: [PATCH v2 03/10] xen/arm: smmuv3: Ensure queue is read after updating prod pointer
On Fri, 2 Sep 2022, Rahul Singh wrote: > From: Zhou Wang > > Backport Linux commit a76a3f2c. This is the clean backport without > any changes. > > Reading the 'prod' MMIO register in order to determine whether or > not there is valid data beyond 'cons' for a given queue does not > provide sufficient dependency ordering, as the resulting access is > address dependent only on 'cons' and can therefore be speculated > ahead of time, potentially allowing stale data to be read by the > CPU. > > Use readl() instead of readl_relaxed() when updating the shadow copy > of the 'prod' pointer, so that all speculated memory reads from the > corresponding queue can occur only from valid slots. > > Signed-off-by: Zhou Wang > Link: > https://lore.kernel.org/r/1601281922-117296-1-git-send-email-wangzh...@hisilicon.com > [will: Use readl() instead of explicit barrier. Update 'cons' side to match.] > Signed-off-by: Will Deacon > Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > a76a3f2c > Signed-off-by: Rahul Singh > --- > Changes in v2: > - fix commit msg > - add _iomb changes also from the origin patch > --- > xen/arch/arm/include/asm/system.h | 1 + > xen/drivers/passthrough/arm/smmu-v3.c | 11 +-- > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/xen/arch/arm/include/asm/system.h > b/xen/arch/arm/include/asm/system.h > index 65d5c8e423..fe27cf8c5e 100644 > --- a/xen/arch/arm/include/asm/system.h > +++ b/xen/arch/arm/include/asm/system.h > @@ -29,6 +29,7 @@ > #endif > > #define smp_wmb() dmb(ishst) > +#define __iomb()dmb(osh) We don't have any other #define starting with __ in system.h. I wonder if we should call this macro differently or simply iomb(). > #define smp_mb__before_atomic()smp_mb() > #define smp_mb__after_atomic() smp_mb() > diff --git a/xen/drivers/passthrough/arm/smmu-v3.c > b/xen/drivers/passthrough/arm/smmu-v3.c > index 64d39bb4d3..cee13d1fc7 100644 > --- a/xen/drivers/passthrough/arm/smmu-v3.c > +++ b/xen/drivers/passthrough/arm/smmu-v3.c > @@ -951,7 +951,7 @@ static void queue_sync_cons_out(struct arm_smmu_queue *q) >* Ensure that all CPU accesses (reads and writes) to the queue >* are complete before we update the cons pointer. >*/ > - mb(); > + __iomb(); > writel_relaxed(q->llq.cons, q->cons_reg); > } > > @@ -963,8 +963,15 @@ static void queue_inc_cons(struct arm_smmu_ll_queue *q) > > static int queue_sync_prod_in(struct arm_smmu_queue *q) > { > + u32 prod; > int ret = 0; > - u32 prod = readl_relaxed(q->prod_reg); > + > + /* > + * We can't use the _relaxed() variant here, as we must prevent > + * speculative reads of the queue before we have determined that > + * prod has indeed moved. > + */ > + prod = readl(q->prod_reg); > > if (Q_OVF(prod) != Q_OVF(q->llq.prod)) > ret = -EOVERFLOW; > -- > 2.25.1 >