On 22/07/2019 16:01, Jan Beulich wrote: > On 22.07.2019 15:36, Andrew Cooper wrote: >> On 22/07/2019 09:34, Jan Beulich wrote: >>> On 19.07.2019 19:27, Andrew Cooper wrote: >>>> On 16/07/2019 17:38, Jan Beulich wrote: >>>>> @@ -142,7 +178,15 @@ static void free_intremap_entry(const st >>>>> { >>>>> union irte_ptr entry = get_intremap_entry(iommu, bdf, index); >>>>> >>>>> - ACCESS_ONCE(entry.ptr32->raw[0]) = 0; >>>>> + if ( iommu->ctrl.ga_en ) >>>>> + { >>>>> + ACCESS_ONCE(entry.ptr128->raw[0]) = 0; >>>>> + /* Low half (containing RemapEn) needs to be cleared first. */ >>>>> + barrier(); >>>> While this will function on x86, I still consider this buggy. From a >>>> conceptual point of view, barrier() is not the correct construction to >>>> use, whereas smp_wmb() is. >>> I think it's the 3rd time now that I respond saying that barrier() is >>> as good or as bad as smp_wmb(), just for different reasons. >> barrier() and smp_wmb() are different constructs, with specific, >> *different* meanings. From a programmers point of view, they should be >> considered black boxes of functionality. >> >> barrier() is for forcing the compiler to not reorder things. >> >> smp_wmb() is about the external visibility of writes, as observed by a >> different entity on a coherent fabric. > I'm afraid I disagree here: The "smp" in its name means "CPU", not > "entity" in your sentence.
Citation definitely needed. The term SMP means Symmetric MultiProcessing, but no computer these days matches any of the traditional definitions. You can thank the fact we are one of the fastest evolving industries in the world, and that the term you're using is more than 20 years old. In particular, it predates cache-coherent uncore devices. Cache-coherent devices by definition have the same ordering properties and constraints as cpus, because they are part of one shared (or dare I say, symmetric), cache-coherent domain. How would your argument change if the IOMMU was a real CPU running real x86 code? Its interface to the rest of the system would be identical, and in that case, it would obviously need an smp_{r,w}mb() pair for correctness reasons. This is why smp_wmb() is the only appropriate construct to use. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel