On Wed, Oct 07, 2015 at 07:22:59AM +0900, Oleg Endo wrote:
> On Tue, 2015-10-06 at 12:52 -0400, Rich Felker wrote:
> > > > + if (TARGET_FDPIC)
> > > > + {
> > > > + rtx a = force_reg (Pmode, plus_constant (Pmode, XEXP (tramp_mem,
> > > > 0), 8));
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 0), a);
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 4),
> > > > OUR_FDPIC_REG);
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 8),
> > > > + gen_int_mode (TARGET_LITTLE_ENDIAN ? 0xd203d302 :
> > > > 0xd302d203,
> > > > + SImode));
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 12),
> > > > + gen_int_mode (TARGET_LITTLE_ENDIAN ? 0x5c216122 :
> > > > 0x61225c21,
> > > > + SImode));
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 16),
> > > > + gen_int_mode (TARGET_LITTLE_ENDIAN ? 0x0009412b :
> > > > 0x412b0009,
> > > > + SImode));
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 20), cxt);
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 24), fnaddr);
> > > > + }
> > > > + else
> > > > + {
> > > > + emit_move_insn (change_address (tramp_mem, SImode, NULL_RTX),
> > > > + gen_int_mode (TARGET_LITTLE_ENDIAN ? 0xd301d202 :
> > > > 0xd202d301,
> > > > + SImode));
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 4),
> > > > + gen_int_mode (TARGET_LITTLE_ENDIAN ? 0x0009422b :
> > > > 0x422b0009,
> > > > + SImode));
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 8), cxt);
> > > > + emit_move_insn (adjust_address (tramp_mem, SImode, 12), fnaddr);
> > > > + }
> > >
> > > I think this hunk really needs a comment. It copies machine code from
> > > somewhere to somewhere via constant loads... but what exactly are the
> > > instructions ...
> >
> > This is generating trampolines for nested functions. This portion of
> > the patch applied without modification from the old patch, so I didn't
> > read into it in any more detail; it seems to be the following, which
> > makes sense:
> >
> > 0: .long 1f
> > .long gotval
> > 1: mov.l 3f,r3
> > mov.l 2f,r2
> > mov.l @r2,r1
> > mov.l @(4,r2),r12
> > jmp @r1
> > nop
> > 3: .long cxt
> > 2: .long fnaddr
> >
> > The corresponding non-FDPIC version is:
> >
> > mov.l 3f,r3
> > mov.l 2f,r2
> > jmp @r2
> > nop
> > 3: .long cxt
> > 2: .long fnaddr
> >
> > Should these go into the source as comments?
>
> Yes, please. And of course some of the descriptive text as above.
OK.
> > I would think it does, but I've found in the RTL files sometimes extra
> > escaping is silently accepted, and I'm not sure if omitting it would
> > visibly break. Can I rely on it producing a visible error right away
> > if removing it is wrong, or do I need to search the gccint
> > documentation to figure out what the right way is?
>
> Just compile some code and look at the generated asm.
OK, I'll try this.
> > It can't generate the same code either way because, with the patch as
> > submitted, there's an extra load inside the asm. I would prefer
> > switching to an approach that avoids that (mainly to avoid the ugly
> > near-duplication of the asm block, but also to save a couple
> > instructions) but short of feedback on acceptable ways to do the
> > punning in the C++ I'll just leave it in the asm for now.
>
> Do you have some alternatives to what's currently in the patch? It's
> difficult to judge without seeing them...
Perhaps something like the following:
#ifdef __SH_FDPIC__
typedef __attribute__((__may_alias__)) uintptr_t sh_aliased_uintptr_t;
#define SH_CODE_ADDR(x) (*(sh_aliased_uintptr_t *)(x))
#else
#define SH_CODE_ADDR(x) x
#endif
And then just passing SH_CODE_ADDR(__udiv_qrnnd_16) rather than just
__udiv_qrnnd_16 as the input to the asm.
Rich