Vaibhav Jain <vaib...@linux.vnet.ibm.com> writes:

> Hi Mpe,
>
> Thanks for reviewing the patch
>
> Michael Ellerman <m...@ellerman.id.au> writes:
>
>>> +   ctx->elem->software_state = cpu_to_be32(CXL_PE_SOFTWARE_STATE_V);
>>> +   /* Make sure the changes to the PE are visible to the card */
>>
>> A barrier orders something vs something else. So what's the something
>> else in this case? Is it the afu_reset() below, what does that actually do?
>>
>
> The issue is with call to afu_enable() after the call to afu_reset that
> would start the AFU. If this load gets reordered and PSL doesnt see the
> valid bit set for this structure then it will result in PSL entering a
> freeze-state.

OK, so it's ordering the store above to ctx->elem->software_state vs the
store to the AFU in afu_enable().

> Though on second thoughts afu_enable() is grabbing a spin-lock before
> doing an mmio to start the AFU that would be forcing a barrier
> anyways. But since that spans the function boundary hence to be safe
> have added a write barrier after populating the process element.

The spin lock doesn't help you, stores are allowed to leak into the
locked region.

But the MMIO is preceeded by a sync.

> Lastly function is not performance critical as it will be usually called
> in the life time of a process only once. So the impact smp_wmb() is
> having would be minimal.

Sure. Performance is not the issue, barriers are subtle so it's
important that they're well documented.


So I don't think you need the barrier, because the out_be64() will do it
for you. But if you really want to add one, I don't mind.

But, you should use wmb(), not smp_wmb(), because the ordering is still
required on non-SMP systems. And please update the comment to capture
all of the above discussion.

cheers

Reply via email to