> -----Original Message----- > From: Richard Henderson <richard.hender...@linaro.org> > Sent: Wednesday, January 24, 2024 6:22 PM > To: Brian Cain <bc...@quicinc.com>; Philippe Mathieu-Daudé > <phi...@linaro.org> > Cc: qemu-devel@nongnu.org; Sid Manning <sidn...@quicinc.com>; Marco > Liebel <mlie...@qti.qualcomm.com>; Matheus Bernardino > <mathb...@qti.qualcomm.com> > Subject: Re: hexagon: modeling a shared lock state > > WARNING: This email originated from outside of Qualcomm. Please be wary of > any links or attachments, and do not enable macros. > > On 1/25/24 01:07, Brian Cain wrote: > > Philippe, > > > > For hexagon sysemu, while internally reviewing patches to be upstreamed we > noticed that > > our design for a lock instruction is not quite suitable. The k0lock > > instruction > will halt > > if some other hexagon hardware CPU has already claimed the kernel lock, > only to continue > > executing once some CPU executes the unlock instruction. We modeled this > using a lock > > state enumeration member { OWNER, WAITING, UNLOCKED } in **each** > vCPU and atomically > > transitioning the lock required us to have vCPU[n] write the updated lock > state to vCPU[m] > > when unlocking. > > > > In context: > > > https://github.com/quic/qemu/blob/hexagon_sysemu_20_dec_2023/target/he > xagon/op_helper.c#L1790-L1821 > <https://github.com/quic/qemu/blob/hexagon_sysemu_20_dec_2023/target/h > exagon/op_helper.c#L1790-L1821> > > > > So instead, maybe it makes more sense to have a system device hold a single > representation > > of the lock’s state and each vCPU can do some kind of access via load/store > and/or > > interrupts to/from the device? I was thinking of raising an interrupt on > > the > lock device > > at the vCPU’s k0lock instruction to indicate demand for the lock, and then > > the > device > > could raise an interrupt to one of the vCPUs when it’s granted the lock. > > One > drawback for > > this is that this might add significant latency to the uncontested lock > > case. So > I also > > considered doing a load of some part of the lock device’s memory that could > indicate > > whether the lock was available, and if available it would claim it with a > > store > (both > > ld/st while holding BQL). Only if unavailable it would halt and raise the > interrupt. Is > > it possible to create an address space that’s independent of the true system > memory map > > for this use case or would we need to carve out some memory from the > existing memory map > > for this synthetic device? Or – do you have a suggestion for a better > approach overall? > > I think you are over-thinking this. A system device would just get in the > way. I
Ok, great - our existing design is ~roughly like this. But we can explore this -- thanks for writing this example! > think > you want something like > > struct CPUHexagonState { > ... > bool hwlock_pending; > } > > hexagon_cpu_has_work() { > if (cpu->hwlock_pending) { > return false; > } > } > > static void do_hwlock(CPUHexagonState *env, bool *lock) > { > bql_lock(); > > if (*lock) { > env->hwlock_pending = true; > cs->halted = true; > cs->exception_index = EXCP_HALTED; > bql_unlock(); > cpu_loop_exit(cs); In place of the above - we have cpu_interrupt(cs, CPU_INTERRUPT_HALT) -- but is that equivalent? Is there one idiom that's preferred over another? Somehow it seems simpler if we don't need to longjmp and IIRC some of these patterns do. > } else { > *lock = true; > bql_unlock(); > } > } > > static void do_hwunlock(CPUHexagonState *env, bool *lock) > { > CPUState *cs; > BQL_LOCK_GUARD(); > > *lock = false; > > CPU_FOREACH(cs) { > if (cs->hwlock_pending) { > cs->hwlock_pending = false; > qemu_cpu_kick(cs); > } > } > } > > static bool k0lock, tlblock; > > void HELPER(k0lock)(CPUHexagonState *env) > void HELPER(tlblock)(CPUHexagonState *env) > void HELPER(k0unlock)(CPUHexagonState *env) > void HELPER(tlbunlock)(CPUHexagonState *env) > > Open questions are: > > (1) Do interrupts cancel lock wait? Either way, placement in > hexagon_cpu_has_work is key. Yeah - they do, we will double check the interaction w has_work. > (2) I assume that the pc will not be advanced, so that after the kick we will > re- > execute > the hwlock instruction. There will be a thundering herd racing to grab the > lock > again, > but it saves queuing logic, which might be complicated if interrupts are > involved. Yes that's right, too. Thanks!