Hello Robert,

01.09.2023 23:21, Robert Haas wrote:
> On Fri, Sep 1, 2023 at 6:13 AM Alexander Lakhin <exclus...@gmail.com> wrote:
>> (Placing "pg_compiler_barrier();" just after "waiting = true;" fixed the
>> issue for us.)
> Maybe it'd be worth trying something stronger, like
> pg_memory_barrier(). A compiler barrier doesn't prevent the CPU from
> reordering loads and stores as it goes, and ARM64 has weak memory
> ordering.

Indeed, thank you for the tip!
So maybe what we are dealing with here is not a compiler optimization, but a CPU one.
The wider code fragment is:
  805c48: 52800028      mov     w8, #1 // true
  805c4c: 52800319      mov     w25, #24
  805c50: 5280073a      mov     w26, #57
  805c54: fd446128      ldr     d8, [x9, #2240]
  805c58: 90000d7b      adrp    x27, 0x9b1000 <ModifyWaitEvent+0xb0>
  805c5c: fd415949      ldr     d9, [x10, #688]
  805c60: f9071d68      str     x8, [x11, #3640] // waiting = true (x8 = w8)
  805c64: f90003f3      str     x19, [sp]
  805c68: 14000010      b       0x805ca8 <WaitEventSetWait+0x108>

  805ca8: f9400a88      ldr     x8, [x20, #16] // if (set->latch && set->latch->is_set)
  805cac: b4000068      cbz     x8, 0x805cb8 <WaitEventSetWait+0x118>
  805cb0: f9400108      ldr     x8, [x8]
  805cb4: b5001248      cbnz    x8, 0x805efc <WaitEventSetWait+0x35c>
  805cb8: f9401280      ldr     x0, [x20, #32]

If that CPU can delay the write to the variable waiting
(str x8, [x11, #3640]), holding it internally as something like
"store 1 to [address]", until 805cb0 or a later instruction, then we can get
the behavior discussed. Something like that is shown in the ARM documentation:
https://developer.arm.com/documentation/102336/0100/Memory-ordering?lang=en
I'll try to test this guess on the target machine...
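
To make the guess concrete, here is a minimal C11 sketch of the
store-then-load pattern in question. It is not the PostgreSQL code: the
function names and the two flags are hypothetical stand-ins for "waiting" and
latch->is_set, and atomic_thread_fence(memory_order_seq_cst) is assumed to
play roughly the role that pg_memory_barrier() would play. Without the fence,
a weakly ordered CPU may perform the load before the preceding store becomes
visible to the other side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for "waiting" and latch->is_set. */
static atomic_bool waiting;
static atomic_bool latch_is_set;

/*
 * Waiter side: publish "waiting", then check the latch.
 * Returns true if the latch is not set yet, i.e. the caller would go to sleep.
 */
bool
waiter_should_sleep(void)
{
    atomic_store_explicit(&waiting, true, memory_order_relaxed);
    /* Full barrier: the store above may not be reordered after the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return !atomic_load_explicit(&latch_is_set, memory_order_relaxed);
}

/*
 * Setter side: set the latch, then check whether a wakeup is needed.
 * Returns true if the waiter has already announced itself.
 */
bool
setter_must_wake(void)
{
    atomic_store_explicit(&latch_is_set, true, memory_order_relaxed);
    /* Matching full barrier on the setter side. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&waiting, memory_order_relaxed);
}

/* Single-threaded walk through the protocol, for illustration only. */
void
demo_sequence(void)
{
    assert(waiter_should_sleep());   /* latch not set yet: waiter would sleep */
    assert(setter_must_wake());      /* waiter announced itself: wake it */
    assert(!waiter_should_sleep());  /* latch now set: no sleep needed */
}
```

With both fences in place, at least one side must observe the other's store,
so "waiter sleeps" and "setter skips the wakeup" cannot both happen. A
compiler barrier alone gives no such guarantee on ARM64.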

Best regards,
Alexander

