On Thu, Sep 15, 2011 at 11:07:43AM -0700, Andrew Pinski wrote:
> On Thu, Sep 15, 2011 at 10:52 AM, Michael Meissner
> <meiss...@linux.vnet.ibm.com> wrote:
> > The recent changes for the new code models tickled a latent bug in the GCC 
> > 4.6
> > and 4.7 compilers with the AIX calling sequence (PR 50341).  When you call a
> > function indirectly, the compiler has to save the current function's TOC
> > pointer that is in r2 on the stack frame, and then load the new function's 
> > TOC
> > pointer before doing the call.  If the indirect function that you are 
> > calling
> > has the same TOC value as the current function, the code will run, but if 
> > the
> > indirection function has a different TOC (for example it is in the main
> > program, and the indirect call is in a shared library), the wrong address 
> > will
> > be used.
> >
> > The existing code did this by doing a split after reload has finished.
> > Scheduling after reload then moves the load of the new TOC.  These patches 
> > for
> > GCC 4.6 and GCC 4.7 disable the splits so that the load of the new TOC is
> > always next to the bctrl instruction, preventing the compiler from moving 
> > the
> > load.  Ideally, we can eventually tackle the underlying problem that we did 
> > not
> > have the right dependencies to prevent the scheduler from moving the new TOC
> > load before the use.
> 
> ENOPATCH.

Yes, I realized that after the fact and put it out later.
 
> It was originally splitting before reload and I had changed it to be
> splitting after reload assuming the dependencies would be correct.
> What is the instruction which is being described as not dependent on
> the load of r2?

>From the example in the bug, it is the 3rd cpu_fprintf, where it decides to
move loading up an address into register 27, and move the load of the address
before the call, since register 27 is preserved across calls.

void cpu_dump_state (struct CPUPPCState *env, FILE *f, fprintf_function 
cpu_fprintf,
                     int flags)
{



    int i;

    cpu_fprintf(f, "NIP " "%016" "l" "x" "   LR " "%016" "l" "x" " CTR "
                "%016" "l" "x" " XER " "%016" "l" "x" "\n",
                env->nip, env->lr, env->ctr, env->xer);
    cpu_fprintf(f, "MSR " "%016" "l" "x" " HID0 " "%016" "l" "x" "  HF "
                "%016" "l" "x" " idx %d\n", env->msr, env->spr[(0x3F0)],
                env->hflags, env->mmu_idx);

    cpu_fprintf(f, "TB %08" "u" " %08" "l" "u"

                " DECR %08" "u"

                "\n",
                cpu_ppc_load_tbu(env), cpu_ppc_load_tbl(env)

                , cpu_ppc_load_decr(env)

                );

These are the insns that are moved before the call:

(insn 1105 153 203 2 (set (reg/f:DI 27 27 [603])
        (const:DI (plus:DI (reg:DI 2 2)
                (high:DI (const:DI (unspec:DI [
                                (symbol_ref/f:DI ("*.LC5") [flags 0x82]  
<var_decl 0xfff931e6360 *.LC5>)
                            ] UNSPEC_TOCREL)))))) 513 {largetoc_high}
     (expr_list:REG_EQUIV (const:DI (plus:DI (reg:DI 2 2)
                (high:DI (const:DI (unspec:DI [
                                (symbol_ref/f:DI ("*.LC5") [flags 0x82]  
<var_decl 0xfff931e6360 *.LC5>)
                            ] UNSPEC_TOCREL)))))
        (nil)))

(insn 1107 161 204 2 (set (reg/f:DI 27 27 [601])
        (lo_sum:DI (reg/f:DI 27 27 [603])
            (const:DI (unspec:DI [
                        (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 
0xfff931e6360 *.LC5>)
                    ] UNSPEC_TOCREL)))) 514 {largetoc_low}
     (expr_list:REG_EQUIV (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 
0xfff931e6360 *.LC5>)
        (nil)))

The RTL logic is not looking into the const.

This is triggered by Alan's patch in June:

2011-06-20  Alan Modra  <amo...@gmail.com>

        * config/rs6000/rs6000.c (create_TOC_reference): Wrap high part
        of toc-relative address in CONST.
        (rs6000_delegitimize_address): Recognize changed address.
        (rs6000_legitimize_reload_address): Likewise.
        (rs6000_emit_move): Don't force these constants to memory.
        * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Wrap high part of
        toc-relative address in CONST.
        (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise.
        (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise.

Now, I anticipate that the patch in 4.7 will be temporary until we are
confident that we have the right solution, but it is better to put a bandaid on
the patch to limit the damage.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com     fax +1 (978) 399-6899

Reply via email to