On Tue, Apr 26, 2011 at 8:26 PM, Artyom Tarasenko <atar4q...@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 5:34 AM, Igor Kovalenko
> <igor.v.kovale...@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 12:29 AM, Aurelien Jarno <aurel...@aurel32.net>
>> wrote:
>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>> <laurent.desnog...@gmail.com> wrote:
>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>> > <igor.v.kovale...@gmail.com> wrote:
>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>> >> <laurent.desnog...@gmail.com> wrote:
>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko
>>>> >>> <atar4q...@gmail.com> wrote:
>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>> >>>> <igor.v.kovale...@gmail.com> wrote:
>>>> >>>>>>> Do you have public test case?
>>>> >>>>>>> It is possible to code this delay slot write test but real
>>>> >>>>>>> issue may be corruption elsewhere.
>>>> >>>>
>>>> >>>> The test case is trivial: it's just the two instructions, branch
>>>> >>>> and wrpr.
>>>> >>>>
>>>> >>>>> In theory there could be multiple issues including compiler
>>>> >>>>> induced ones.
>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>> >>>>
>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used
>>>> >>>> and needed only because the bios entry point is 0x20).
>>>> >>>>
>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios test-wrpr.bin -nographic
>>>> >>>> Already up-to-date.
>>>> >>>> make[1]: Nothing to be done for `all'.
>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>> >>>> Aborted
>>>> >>>
>>>> >>> The problem seems to be that wrpr is using a non-local
>>>> >>> TCG tmp (cpu_tmp0).
>>>> >>
>>>> >> Just tried the test case with write to %pil - seems like write
>>>> >> itself is OK.
>>>> >> The issue appears to be with save_state() call since adding
>>>> >> save_state to %pil case provokes the same tcg abort.
>>>> >
>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>> > need to be saved across helper calls. This results in the
>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>> > Look at the log and you'll see what I mean :-)
>>>>
>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>> optimizer working with delay slots? Or do I look in the wrong place.
>>>>
>>>
>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>> side:
>>>
>>> | case 0x32: /* wrwim, V9 wrpr */
>>> |     {
>>> |         if (!supervisor(dc))
>>> |             goto priv_insn;
>>> |         tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>> | #ifdef TARGET_SPARC64
>>>
>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>> saved across TCG branches.
>>>
>>> [...]
>>>
>>> | case 6: // pstate
>>> |     save_state(dc, cpu_cond);
>>> |     gen_helper_wrpstate(cpu_tmp0);
>>> |     dc->npc = DYNAMIC_PC;
>>> |     break;
>>>
>>> save_state() calls save_npc(), which in turn might call
>>> gen_generic_branch():
>>>
>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>> |                                       TCGv r_cond)
>>> | {
>>> |     int l1, l2;
>>> |
>>> |     l1 = gen_new_label();
>>> |     l2 = gen_new_label();
>>> |
>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>> |
>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>> |     tcg_gen_br(l2);
>>> |
>>> |     gen_set_label(l1);
>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>> |     gen_set_label(l2);
>>> | }
>>>
>>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>>
>>> The solution is either to rewrite gen_generic_branch() without TCG
>>> branches, or to use a TCG temp local instead of a TCG temp.
>>
>> Thanks!
>>
>> I think the issue is more clear now, and loading to local temporary
>> works in this case.
>> Does not explain why unmodified qemu works with wrpr pstate not in delay
>> slot.
>
> Because the TCG branch is not generated in save_npc()?
>
>> I looked at my linux kernel builds and do not see any wrpr pstate in delay
>> slot.
>
> Meaning you are not going to fix the bug? ;-)

More like I need to know where the bug is because there is no issue
running without wrpr in delay slot.

--
Kind regards,
Igor V. Kovalenko
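
For readers following the thread, the behaviour described above can be
modelled in a few lines of plain C. This is only a toy sketch and not QEMU
code: ToyTemp, toy_end_basic_block(), toy_wrpr() and toy_select_npc() are
hypothetical names invented for illustration. The only things taken from the
thread are that a plain temp does not survive a TCG branch, that a temp local
does, and that gen_generic_branch() picks npc2 when the condition is zero and
npc1 otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in QEMU. */
typedef struct {
    uint64_t value;
    bool is_local; /* models a "temp local" rather than a plain temp */
    bool live;     /* false once the value has been discarded */
} ToyTemp;

/* Models a branch or label: plain temps are dropped, temp locals survive. */
static void toy_end_basic_block(ToyTemp *temps, int n)
{
    for (int i = 0; i < n; i++) {
        if (!temps[i].is_local) {
            temps[i].live = false;
        }
    }
}

static void toy_write(ToyTemp *t, uint64_t v)
{
    t->value = v;
    t->live = true;
}

/* Returns false when the temp is dead; this stands in for the tcg abort. */
static bool toy_read(const ToyTemp *t, uint64_t *out)
{
    if (!t->live) {
        return false;
    }
    *out = t->value;
    return true;
}

/*
 * Models the wrpr sequence: xor into a temp, then save_state(), which may or
 * may not emit a branch, then a helper call that reads the temp.
 */
static bool toy_wrpr(bool use_local, bool save_state_branches,
                     uint64_t src1, uint64_t src2, uint64_t *out)
{
    ToyTemp tmp = { 0, use_local, false };

    toy_write(&tmp, src1 ^ src2);
    if (save_state_branches) {
        toy_end_basic_block(&tmp, 1);
    }
    return toy_read(&tmp, out);
}

/*
 * Models the other suggested fix: choose the next PC without a branch.
 * Same result as gen_generic_branch(): npc2 if cond == 0, npc1 otherwise.
 */
static uint64_t toy_select_npc(uint64_t cond, uint64_t npc1, uint64_t npc2)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0); /* all ones if cond != 0 */

    return (npc1 & mask) | (npc2 & ~mask);
}
```

In this model toy_wrpr() fails only for a plain temp combined with a
branching save_state(), which matches the three observations in the thread:
the abort with wrpr in a delay slot, no abort when save_npc() emits no
branch, and no abort once the value is held in a local temporary.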