On 23/07/18 12:01, Andrew Dinn wrote:
> Hi Florian,
> 
> Thank you for the feedback. Comments inline below (extensive comments in
> the case of the memory model question - it is complex).
Having written up what I assumed was needed, of course I then went back
to have another look at the Intel code and as a result I think I have

  i) corrected my mistaken understanding of what it does
  ii) come to the conclusion that libpmem is doing the right thing

No surprise ;-]

Firstly, I was mistaken in thinking that the libpmem code was executing
a memory barrier before the writeback operations (I was confused by talk
of pre-drain, hw-drain and drain at various points in the code). The
definition of the critical function pmem_persist is as follows:

void pmem_persist(const void *addr, size_t len)
{
        pmem_flush(addr, len);
        pmem_drain();
}

where

  pmem_flush is implemented via clflush/clflushopt/clwb or dc(CVAC)
  pmem_drain is implemented as an sfence or dmb(ISH)
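
For reference, here is a minimal sketch of the CLWB variant on x86_64
(my own illustration, not the libpmem source; it assumes GCC inline asm
and hardcodes a 64-byte line size, where the real library detects the
instruction and the line size at runtime):

#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed; libpmem detects the line size */

static void flush_clwb(const void *addr, size_t len)
{
        uintptr_t p = (uintptr_t)addr & ~((uintptr_t)CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)addr + len;

        /* write back every cache line touched by [addr, addr + len) */
        for (; p < end; p += CACHE_LINE)
                asm volatile("clwb %0" : "+m" (*(volatile char *)p));
}

static void drain_sfence(void)
{
        /* order completion of the CLWBs before subsequent stores */
        asm volatile("sfence" ::: "memory");
}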

So, libpmem is emitting a memory barrier /after/ writeback and this
barrier serves to ensure that writeback is complete before any
subsequent writes can proceed to completion and become visible. That
deals with the data + control update issue I was concerned about.
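
To make that concrete, here is a hypothetical data + commit-flag update
(the struct and function are my own illustration, not libpmem API)
showing what the post-writeback fence buys us:

#include <string.h>
#include <libpmem.h>

struct record {
        char payload[64];
        int  valid;     /* control word: non-zero => payload is good */
};

void commit(struct record *r, const char *src, size_t n)
{
        memcpy(r->payload, src, n);
        /* flush + fence: payload is persistent before we continue */
        pmem_persist(r->payload, n);
        r->valid = 1;
        pmem_persist(&r->valid, sizeof(r->valid));
}

Without the fence inside the first pmem_persist, the store to valid
could reach persistent memory before the payload writeback completed,
leaving a record that looks committed but holds stale data after a
crash.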

On the other hand, libpmem does /not/ execute any memory barriers prior
to executing the writeback instructions. I had assumed that a pre-sync
was also necessary to ensure that prior writes completed before the
writeback was initiated. However, looking at the details provided in
the documentation, this seems to be a spurious requirement of my own
invention.

On x86_64, writeback operations performed via CLFLUSHOPT or CLWB on a
specific cache line are ordered with respect to writes to that same
cache line.
The relevant text is:

"Executions of the CLFLUSHOPT instruction are [...]  ordered with
respect to the following accesses to the cache line being invalidated:
writes, executions of CLFLUSH, and executions of CLFLUSHOPT."

"CLWB is implicitly ordered with older stores executed by the logical
processor to the same address"

So, there is no need to execute a memory barrier to avoid the
possibility of writing back a cache line before it has been updated by
an in-progress write to that same cache line. Of course, neither is
there a need to be concerned about in-progress writes to cache lines
that are not going to be flushed. They won't have any effect on the
correctness of the writeback. So, a pre-barrier is unnecessary.
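
In other words, reusing the sketch functions above, a sequence like
this is safe on x86_64 with no fence between the store and the
writeback:

static void update_and_persist(int *p, int v)
{
        *p = v;                    /* store to the cache line          */
        flush_clwb(p, sizeof *p);  /* CLWB is ordered after that store */
        drain_sfence();            /* the only barrier required        */
}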

The same considerations seem to apply on AArch64.

"All data cache instructions, other than DC ZVA, that specify an address:

- Execute in program order relative to loads or stores that access an
address in Normal memory with either Inner Write Through or Inner Write
Back attributes within the same cache line of minimum size, as indicated
by CTR_EL0.DMinLine"

So, once again a DMB is required after but not before writeback is
initiated.

I'm also now unsure that a DMB(ISHST) is actually adequate. The
relevant text from the ARM ARM is at the end of the text I previously
cited.

"In all cases, where the text in this section refers to a DMB or a DSB ,
this means a DMB or DSB whose required access type is both loads and
stores."

I think that means that the pmem_drain barrier has to be a DMB(ISH),
not a DMB(ISHST). I'm not really sure why, but it does seem to be the
implication.
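
So the AArch64 analogue of the sketch above would be (again my own
illustration; the line size is hardcoded where a real implementation
would read DMinLine from CTR_EL0):

static void flush_dc_cvac(const void *addr, size_t len)
{
        uintptr_t p = (uintptr_t)addr & ~((uintptr_t)CACHE_LINE - 1);
        uintptr_t end = (uintptr_t)addr + len;

        /* DC CVAC is ordered after stores to the same line */
        for (; p < end; p += CACHE_LINE)
                asm volatile("dc cvac, %0" : : "r" (p) : "memory");
}

static void drain_dmb(void)
{
        /* full DMB (loads and stores), per the ARM ARM text above */
        asm volatile("dmb ish" ::: "memory");
}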

So, to sum up:

Unsafe.writebackPreSync0 should be implemented as
  a no-op on x86_64
  a no-op on AArch64

Unsafe.writebackPostSync0 should be implemented as
  a no-op on x86_64 when using clflush
  an sfence on x86_64 when using clflushopt or clwb
  a dmb(ISH) on AArch64
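
Putting the pieces together, the Unsafe-level operation would have this
shape (illustrative names mirroring Unsafe.writebackPreSync0, the
per-line writeback, and Unsafe.writebackPostSync0, shown here with the
x86_64 CLWB choices; this is not actual JDK code):

static void writeback_pre_sync(void)  { /* no-op */ }

static void writeback_line(void *p)
{
        asm volatile("clwb %0" : "+m" (*(volatile char *)p));
}

static void writeback_post_sync(void)
{
        asm volatile("sfence" ::: "memory");
}

void writeback_memory(const void *addr, size_t len)
{
        uintptr_t line = (uintptr_t)addr & ~((uintptr_t)CACHE_LINE - 1);
        uintptr_t end  = (uintptr_t)addr + len;

        writeback_pre_sync();
        for (; line < end; line += CACHE_LINE)
                writeback_line((void *)line);
        writeback_post_sync();
}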

I think it may still be worth retaining Unsafe.writebackPreSync0 (even
though it currently translates to a no-op) in case new hardware requires
some extra preparatory operation before writeback commences.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
