On 8/6/19 11:47 PM, Dmitry Safonov wrote: > Hi Pavel, > > On 8/3/19 10:34 PM, Pavel Machek wrote: >> Hi! >> >>> --- a/drivers/iommu/intel-iommu.c >>> +++ b/drivers/iommu/intel-iommu.c >>> @@ -3721,7 +3721,7 @@ static void intel_unmap(struct device *d >>> >>> freelist = domain_unmap(domain, start_pfn, last_pfn); >>> >>> - if (intel_iommu_strict) { >>> + if (intel_iommu_strict || !has_iova_flush_queue(&domain->iovad)) { >>> iommu_flush_iotlb_psi(iommu, domain, start_pfn, >>> nrpages, !freelist, 0); >>> /* free iova */ >>> --- a/drivers/iommu/iova.c >>> +++ b/drivers/iommu/iova.c >>> @@ -65,9 +65,14 @@ init_iova_domain(struct iova_domain *iov >>> } >>> EXPORT_SYMBOL_GPL(init_iova_domain); >>> >>> +bool has_iova_flush_queue(struct iova_domain *iovad) >>> +{ >>> + return !!iovad->fq; >> >> Should this be READ_ONCE()? > > Why? Compiler can't anyhow assume that it's always true/false and there > is a clear data dependency between this and: > : queue_iova(&domain->iovad, iova_pfn, nrpages, > : (unsigned long)freelist); > >> >>> @@ -100,13 +106,17 @@ int init_iova_flush_queue(struct iova_do >>> for_each_possible_cpu(cpu) { >>> struct iova_fq *fq; >>> >>> - fq = per_cpu_ptr(iovad->fq, cpu); >>> + fq = per_cpu_ptr(queue, cpu); >>> fq->head = 0; >>> fq->tail = 0; >>> >>> spin_lock_init(&fq->lock); >>> } >>> >>> + smp_wmb(); >>> + >>> + iovad->fq = queue; >>> + >> >> Could we have a comment why the barrier is needed, > > I'm up for the comment if you feel like it - in my POV it's quite > obvious that we want finish initializing the queue's internals before > assigning the queue. I didn't put the comment exactly because I felt > like it would state something evident [in my POV]. > >> and perhaps there >> should be oposing smp_rmb() somewhere? Does this need to be >> WRITE_ONCE() as it is racing against reader? > > I feel confused. I might have forgotten everything about barriers, but > again if I'm not mistaken, one doesn't need a barrier in: > : if (A->a != NULL) > : use(A); /* dereferences A->a */ > : else > : /* don't use `a' */
And in this simplified example I mean that use() will either see A->a initialized (IOW, CPU can't load A->a->field1 before checking the condition) or use() will not be called. Thanks, Dmitry