On Tue, Apr 26, 2011 at 8:26 PM, Artyom Tarasenko <atar4q...@gmail.com> wrote:
> On Tue, Apr 26, 2011 at 5:34 AM, Igor Kovalenko
> <igor.v.kovale...@gmail.com> wrote:
>> On Tue, Apr 26, 2011 at 12:29 AM, Aurelien Jarno
>> <aurel...@aurel32.net> wrote:
>>> On Fri, Apr 22, 2011 at 06:14:06PM +0400, Igor Kovalenko wrote:
>>>> On Fri, Apr 22, 2011 at 2:39 AM, Laurent Desnogues
>>>> <laurent.desnog...@gmail.com> wrote:
>>>> > On Thu, Apr 21, 2011 at 9:45 PM, Igor Kovalenko
>>>> > <igor.v.kovale...@gmail.com> wrote:
>>>> >> On Thu, Apr 21, 2011 at 7:44 PM, Laurent Desnogues
>>>> >> <laurent.desnog...@gmail.com> wrote:
>>>> >>> On Thu, Apr 21, 2011 at 4:57 PM, Artyom Tarasenko 
>>>> >>> <atar4q...@gmail.com> wrote:
>>>> >>>> On Tue, Apr 12, 2011 at 4:14 AM, Igor Kovalenko
>>>> >>>> <igor.v.kovale...@gmail.com> wrote:
>>>> >>>>>>> Do you have a public test case?
>>>> >>>>>>> It is possible to code this delay slot write test, but the real
>>>> >>>>>>> issue may be corruption elsewhere.
>>>> >>>>
>>>> >>>> The test case is trivial: it's just the two instructions, branch
>>>> >>>> and wrpr.
>>>> >>>>
>>>> >>>>> In theory there could be multiple issues including compiler
>>>> >>>>> induced ones.
>>>> >>>>> I'd prefer to see some kind of reproducible testcase.
>>>> >>>>
>>>> >>>> Ok, attached a 40 byte long test (the first 32 bytes are not used and
>>>> >>>> needed only because the bios entry point is 0x20).
>>>> >>>>
>>>> >>>> $ git pull && make && sparc64-softmmu/qemu-system-sparc64 -bios test-wrpr.bin -nographic
>>>> >>>> Already up-to-date.
>>>> >>>> make[1]: Nothing to be done for `all'.
>>>> >>>> /mnt/terra/projects/vanilla/qemu/tcg/tcg.c:1892: tcg fatal error
>>>> >>>> Aborted
>>>> >>>
>>>> >>> The problem seems to be that wrpr is using a non-local
>>>> >>> TCG tmp (cpu_tmp0).
>>>> >>
>>>> >> Just tried the test case with a write to %pil - seems like the write
>>>> >> itself is OK.
>>>> >> The issue appears to be with the save_state() call, since adding
>>>> >> save_state to the %pil case provokes the same tcg abort.
>>>> >
>>>> > The problem is that cpu_tmp0, not being a local tmp, doesn't
>>>> > need to be saved across helper calls.  This results in the
>>>> > TCG "optimizer" getting rid of it even though it's later used.
>>>> > Look at the log and you'll see what I mean :-)
>>>>
>>>> I'm not very comfortable with tcg yet. Would it be possible to teach
>>>> the optimizer to work with delay slots? Or am I looking in the wrong
>>>> place?
>>>>
>>>
>>> The problem is not on the TCG side, but on the target-sparc/translate.c
>>> side:
>>>
>>> |                    case 0x32: /* wrwim, V9 wrpr */
>>> |                         {
>>> |                             if (!supervisor(dc))
>>> |                                 goto priv_insn;
>>> |                             tcg_gen_xor_tl(cpu_tmp0, cpu_src1, cpu_src2);
>>> | #ifdef TARGET_SPARC64
>>>
>>> Here cpu_tmp0 is loaded. cpu_tmp0 is a TCG temp, which means it is not
>>> saved across TCG branches.
>>>
>>> [...]
>>>
>>> |                             case 6: // pstate
>>> |                                 save_state(dc, cpu_cond);
>>> |                                 gen_helper_wrpstate(cpu_tmp0);
>>> |                                 dc->npc = DYNAMIC_PC;
>>> |                                 break;
>>>
>>> save_state() calls save_npc(), which in turn might call
>>> gen_generic_branch():
>>>
>>> | static inline void gen_generic_branch(target_ulong npc1, target_ulong npc2,
>>> |                                       TCGv r_cond)
>>> | {
>>> |     int l1, l2;
>>> |
>>> |     l1 = gen_new_label();
>>> |     l2 = gen_new_label();
>>> |
>>> |     tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
>>> |
>>> |     tcg_gen_movi_tl(cpu_npc, npc1);
>>> |     tcg_gen_br(l2);
>>> |
>>> |     gen_set_label(l1);
>>> |     tcg_gen_movi_tl(cpu_npc, npc2);
>>> |     gen_set_label(l2);
>>> | }
>>>
>>> And here is the TCG branch, which drops the TCG temp cpu_tmp0.
>>>
>>> The solution is either to rewrite gen_generic_branch() without TCG
>>> branches, or to use a TCG temp local instead of a TCG temp.
>>
>> Thanks!
>>
>> I think the issue is clearer now, and loading into a local temporary
>> works in this case.
>> That does not explain why unmodified qemu works with wrpr pstate when
>> it is not in a delay slot.
>
> Because the TCG branch is not generated in save_npc()?
>
>> I looked at my linux kernel builds and do not see any wrpr pstate in a
>> delay slot.
>
> Meaning you are not going to fix the bug? ;-)

More like I need to know where the bug is,
because there is no issue running without wrpr in a delay slot.
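
To make the temp vs. local temp distinction from the quoted explanation
concrete, here is a toy model in plain C. It is not QEMU code: Temp,
end_basic_block() and wrpr_sequence() are hypothetical names invented
for illustration, and the model only assumes what was said above, i.e.
that a plain TCG temp does not survive a TCG branch while a local temp
does. The save_state_branches flag stands in for the guess that
save_npc() only emits the branch in the delay slot case.

```c
#include <stdbool.h>

/* Toy model only; these are NOT QEMU types or functions. */
typedef enum { TEMP_NORMAL, TEMP_LOCAL } TempKind;

typedef struct {
    TempKind kind;
    bool live;   /* true while the temp holds a defined value */
    long value;
} Temp;

/* Writing a temp gives it a defined value. */
static void temp_write(Temp *t, long v)
{
    t->value = v;
    t->live = true;
}

/* A branch or label ends the basic block: normal temps are dropped,
 * local temps keep their value. */
static void end_basic_block(Temp *temps, int n)
{
    for (int i = 0; i < n; i++) {
        if (temps[i].kind == TEMP_NORMAL) {
            temps[i].live = false;
        }
    }
}

/* Reading a dropped temp fails; this stands in for the tcg abort. */
static bool temp_read(const Temp *t, long *out)
{
    if (!t->live) {
        return false;
    }
    *out = t->value;
    return true;
}

/* Models the wrpr path: compute src1 ^ src2 into a temp, run a
 * save_state step that may or may not emit a branch, then use the temp.
 * Returns false if the temp was dropped before use. */
static bool wrpr_sequence(TempKind kind, bool save_state_branches,
                          long src1, long src2, long *out)
{
    Temp tmp = { kind, false, 0 };

    temp_write(&tmp, src1 ^ src2);
    if (save_state_branches) {
        end_basic_block(&tmp, 1);
    }
    return temp_read(&tmp, out);
}
```

In this model wrpr_sequence(TEMP_NORMAL, true, ...) fails, while the
same call with TEMP_LOCAL, or with save_state_branches set to false,
succeeds. That would be consistent with both observations in this
thread, but it is only a sketch under the assumptions stated above.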

-- 
Kind regards,
Igor V. Kovalenko
