Hello Robert,
01.09.2023 23:21, Robert Haas wrote:
> On Fri, Sep 1, 2023 at 6:13 AM Alexander Lakhin <exclus...@gmail.com> wrote:
>> (Placing "pg_compiler_barrier();" just after "waiting = true;" fixed the
>> issue for us.)
> Maybe it'd be worth trying something stronger, like
> pg_memory_barrier(). A compiler barrier doesn't prevent the CPU from
> reordering loads and stores as it goes, and ARM64 has weak memory
> ordering.
Indeed, thank you for the tip!
So maybe what we are dealing with here is not a compiler optimization, but a CPU one.
The wider code fragment is:
805c48: 52800028 mov w8, #1 // true
805c4c: 52800319 mov w25, #24
805c50: 5280073a mov w26, #57
805c54: fd446128 ldr d8, [x9, #2240]
805c58: 90000d7b adrp x27, 0x9b1000 <ModifyWaitEvent+0xb0>
805c5c: fd415949 ldr d9, [x10, #688]
805c60: f9071d68 str x8, [x11, #3640] // waiting = true (x8 = w8)
805c64: f90003f3 str x19, [sp]
805c68: 14000010 b 0x805ca8 <WaitEventSetWait+0x108>
805ca8: f9400a88 ldr x8, [x20, #16] // if (set->latch && set->latch->is_set)
805cac: b4000068 cbz x8, 0x805cb8 <WaitEventSetWait+0x118>
805cb0: f9400108 ldr x8, [x8]
805cb4: b5001248 cbnz x8, 0x805efc <WaitEventSetWait+0x35c>
805cb8: f9401280 ldr x0, [x20, #32]
If that CPU can hold the write to the variable waiting
(str x8, [x11, #3640]) in its internal form, something like
"store 1 to [address]", and delay it until 805cb0 or a later instruction,
then we can get the behavior discussed. Something like that is described
in the ARM documentation:
https://developer.arm.com/documentation/102336/0100/Memory-ordering?lang=en
I'll try to test this guess on the target machine...
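For illustration, here is a minimal sketch of the store-then-load pattern in
question, written with C11 atomics rather than PostgreSQL's own primitives.
The function names are hypothetical, and atomic_thread_fence(memory_order_seq_cst)
is only assumed to be a stand-in for what pg_memory_barrier() would provide:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two flags discussed above. */
static atomic_bool waiting;   /* set by the waiter before it checks the latch */
static atomic_bool is_set;    /* set by whoever wants to wake the waiter */

/*
 * Waiter side: publish "waiting", then look at the latch.
 * Returns true if the latch is already set, so no sleep is needed.
 */
static bool
waiter_announce_and_check(void)
{
    atomic_store_explicit(&waiting, true, memory_order_relaxed);
    /* Full barrier: keep the load below from passing the store above. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&is_set, memory_order_relaxed);
}

/*
 * Setter side: set the latch, then look at "waiting".
 * Returns true if the waiter has announced itself and must be woken.
 */
static bool
setter_set_and_check(void)
{
    atomic_store_explicit(&is_set, true, memory_order_relaxed);
    /* Same full barrier, mirrored on the other side. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&waiting, memory_order_relaxed);
}
```

With only a compiler barrier in place of the fences, a weakly ordered CPU may
still perform each side's load before its own store becomes visible to the
other side, so both sides can miss the other's flag.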
Best regards,
Alexander