Re: [PATCH] target/hppa: Optimize ldcw/ldcd instruction translation

2023-09-14 Thread Helge Deller
Hi Richard, On 9/13/23 22:30, Richard Henderson wrote: On 9/13/23 10:19, Helge Deller wrote: On 9/13/23 18:55, Richard Henderson wrote: On 9/13/23 07:47, Helge Deller wrote: +    haddr = (uint32_t *)((uintptr_t)vaddr); +    old = *haddr; This is horribly incorrect, both for user-onl

Re: [PATCH] target/hppa: Optimize ldcw/ldcd instruction translation

2023-09-13 Thread Richard Henderson
On 9/13/23 10:19, Helge Deller wrote: On 9/13/23 18:55, Richard Henderson wrote: On 9/13/23 07:47, Helge Deller wrote: +    haddr = (uint32_t *)((uintptr_t)vaddr); +    old = *haddr; This is horribly incorrect, both for user-only and system mode. Richard, thank you for the review! B

Re: [PATCH] target/hppa: Optimize ldcw/ldcd instruction translation

2023-09-13 Thread Helge Deller
On 9/13/23 18:55, Richard Henderson wrote: On 9/13/23 07:47, Helge Deller wrote: +    haddr = (uint32_t *)((uintptr_t)vaddr); +    old = *haddr; This is horribly incorrect, both for user-only and system mode. Richard, thank you for the review! But would you mind explaining why this i

Re: [PATCH] target/hppa: Optimize ldcw/ldcd instruction translation

2023-09-13 Thread Richard Henderson
On 9/13/23 07:47, Helge Deller wrote: +haddr = (uint32_t *)((uintptr_t)vaddr); +old = *haddr; This is horribly incorrect, both for user-only and system mode. +/* if already zero, do not write 0 again to reduce memory pressure */ +if (old == 0) { +r

[PATCH] target/hppa: Optimize ldcw/ldcd instruction translation

2023-09-13 Thread Helge Deller
ldcw (load and clear word) is the only atomic memory instruction of the hppa architecture and thus is heavily used by the Linux and HP-UX kernels to implement locks. Since ldcw always writes a zero, optimize it to not write zero again if the memory already contained zero (as the lock was already lo
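
The idea in the patch description can be illustrated with a small host-side sketch in C11 atomics. This is not the QEMU translation or helper code from the patch: the function names `ldcw_plain` and `ldcw_skip_zero` are hypothetical, and the sketch works on an ordinary host pointer, so it does not model guest address translation (the part the review replies object to).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical name: plain load-and-clear semantics.
 * Atomically fetch the old word and store zero in its place. */
static uint32_t ldcw_plain(_Atomic uint32_t *lock)
{
    return atomic_exchange(lock, 0);
}

/* Hypothetical name: variant that skips the store when the word
 * is already zero, so a held lock is only read, not rewritten. */
static uint32_t ldcw_skip_zero(_Atomic uint32_t *lock)
{
    uint32_t old = atomic_load(lock);
    if (old == 0) {
        return 0;               /* already zero: nothing to clear */
    }
    /* Non-zero when read: clear atomically and return whatever was
     * actually there at the time of the exchange. */
    return atomic_exchange(lock, 0);
}
```

Both functions return the same values to the caller; the second only avoids the write when the word was already zero at the time of the initial load.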