On 22.07.2019 17:43, Andrew Cooper wrote:
> On 22/07/2019 16:01, Jan Beulich wrote:
>> On 22.07.2019 15:36, Andrew Cooper wrote:
>>> On 22/07/2019 09:34, Jan Beulich wrote:
>>>> On 19.07.2019 19:27, Andrew Cooper wrote:
>>>>> On 16/07/2019 17:38, Jan Beulich wrote:
>>>>>> @@ -142,7 +178,15 @@ static void free_intremap_entry(const st
>>>>>>      {
>>>>>>          union irte_ptr entry = get_intremap_entry(iommu, bdf, index);
>>>>>>      
>>>>>> -    ACCESS_ONCE(entry.ptr32->raw[0]) = 0;
>>>>>> +    if ( iommu->ctrl.ga_en )
>>>>>> +    {
>>>>>> +        ACCESS_ONCE(entry.ptr128->raw[0]) = 0;
>>>>>> +        /* Low half (containing RemapEn) needs to be cleared first. */
>>>>>> +        barrier();
>>>>> While this will function on x86, I still consider this buggy.  From a
>>>>> conceptual point of view, barrier() is not the correct construction to
>>>>> use, whereas smp_wmb() is.
>>>> I think it's the 3rd time now that I respond saying that barrier() is
>>>> as good or as bad as smp_wmb(), just for different reasons.
>>> barrier() and smp_wmb() are different constructs, with specific,
>>> *different* meanings.  From a programmers point of view, they should be
>>> considered black boxes of functionality.
>>>
>>> barrier() is for forcing the compiler to not reorder things.
>>>
>>> smp_wmb() is about the external visibility of writes, as observed by a
>>> different entity on a coherent fabric.
>> I'm afraid I disagree here: The "smp" in its name means "CPU", not
>> "entity" in your sentence.
> 
> Citation definitely needed.

Which I did provide in the earlier reply: If what you say was
intended to be that way, the !CONFIG_SMP definitions in Linux were
wrong, and ...

> The term SMP means Symmetric MultiProcessing, but no computer these days
> matches any of the traditional definitions.  You can thank the fact we
> are one of the fastest evolving industries in the world, and that the
> term you're using is more than 20 years old.

... would have been for a long time.

> In particular, it predates cache-coherent uncore devices.
> Cache-coherent devices by definition have the same ordering properties
> and constraints as cpus, because they are part of one shared (or dare I
> say, symmetric), cache-coherent domain.
> 
> How would your argument change if the IOMMU was a real CPU running real
> x86 code?  Its interface to the rest of the system would be identical,
> and in that case, it would obviously need an smp_{r,w}mb() pair for
> correctness reasons.  This is why smp_wmb() is the only appropriate
> construct to use.

It wouldn't change at all. What matters (as per above) is the
understanding the OS has, i.e. what is being surfaced to it as CPU.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to