Re: [PATCH] SH FDPIC backend support

2015-10-05 Thread Oleg Endo
On Sun, 2015-10-04 at 22:16 -0400, Rich Felker wrote:
> This is FDPIC-specific. Because there is fundamentally no way for a
> function to find its own GOT (it has one GOT for each process using
> the code containing the function), its GOT address has to be a
> (hidden) argument to the function which arrives in r12.
> 
> For calls via the PLT, r12 contains the PLT entry's (i.e. the calling
> module's) GOT pointer at the time of the call, and the PLT thunk
> replaces it with the callee's GOT pointer (loaded from the function
> descriptor) before jumping to the callee code. There is fundamentally
> nowhere the PLT thunk could store the old value of r12 and arrange for
> it to be restored at return time, so using a PLT forces r12 to be
> call-clobbered.
> 
> (Note that in the special case where the PLT is bypassed because the
> callee is defined in the same module and bound at link-time, the GOT
> value loaded by the caller is the right GOT value for the callee
> automatically.)
> 
> If we didn't care about being able to do PLT calls, there's no
> fundamental reason r12 has to be call-clobbered, but it still makes a
> lot more sense. Getting back the value of r12 you passed when making a
> function call is rarely useful except in the case where the caller
> knows the function is defined in the same module (so it can keep using
> r12 as its own GOT pointer after the call).
> 
> BTW the reason I'm spending time explaining this now is that it's
> something we should optimize after the FDPIC patch goes in: I think
> the r12-related spills/reload could be made a lot more efficient.

This will be a separate point then, after the initial FDPIC stuff is in.
Maybe also related:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12306
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54019

Cheers,
Oleg



Re: [PATCH] SH FDPIC backend support

2015-10-05 Thread Kaz Kojima
Oleg Endo  wrote:
>> So apparently the strange behavior I observed is intended. Presumably
>> there is some mechanism to ensure that these functions are always
>> static-linked? But I don't see it. The libgcc spec I see is:
>> 
>> *libgcc:
>> %{static|static-libgcc:-lgcc
>> -lgcc_eh}%{!static:%{!static-libgcc:%{!shared-libgcc:-lgcc --as-needed
>> -lgcc_s --no-as-needed}%{shared-libgcc:-lgcc_s%{!shared: -lgcc
>> 
>> This explicitly omits -lgcc when -shared-libgcc is used with -shared.
>> Thankfully __ashlsi3_r0 is not exported from libgcc.so.1 (as far as I
>> can tell), so this will just be a link error rather than horribly
>> wrong behavior, but it still seems like there's a bug here unless I'm
>> misunderstanding something. I think the final %{!shared: -lgcc} in the
>> spec is an error and should be replaced by simply -lgcc if there are
>> targets where libgcc.a contains necessary symbols that are not/cannot
>> be defined in libgcc_s.so.1.
> 
> Hm, maybe, but I don't know enough about this, sorry.  Kaz, maybe you
> have a comment on that?

Sorry for my late reply.  I was traveling.
I think that almost linux targets uses linker script libgcc_s.so
which includes -lgcc.  See trunk/libgcc/config/t-slibgcc-libgcc.
The target micro functions are statically linked with it.

Regards,
kaz


Re: [PATCH] SH FDPIC backend support

2015-10-04 Thread Rich Felker
On Sun, Oct 04, 2015 at 02:10:42PM +0900, Oleg Endo wrote:
> On Sat, 2015-10-03 at 18:34 -0400, Rich Felker wrote:
> > > 
> > > I found and fixed the problem, but I have a new concern: calls to the
> > > new shift instructions are using the following address forms:
> > > 
> > > -mno-fdpic -fPIC:
> > >   .long   __ashlsi3_r0@GOTOFF
> > > 
> > > -mfdpic:
> > >   .long   __ashlsi3_r0-(.LPCS1+2)
> > > 
> > > Neither of these seems valid. Both assume __ashlsi3_r0 will be defined
> > > in the same DSO, which is not true in general; shared libgcc_s.so
> > > might be in use. In this case the call would need to go through the
> > > PLT, which (for PIC or FDPIC) requires r12 to be loaded with the GOT
> > > address. In the non-FDPIC case, r12 _happens_ to contain the GOT
> > > address just because it was used as an addend to get the function
> > > address from the @GOTOFF address, but this does not seem
> > > safe/reliable. In the FDPIC case there's nothing to cause r12 to
> > > contain the GOT address, and in fact if the function has already made
> > > another function call (which uses and clobbers r12), no code is
> > > generated to save and restore r12 for the libgcc call.
> 
> I might be missing something, but usually R12 is preserved across
> function calls.

This is FDPIC-specific. Because there is fundamentally no way for a
function to find its own GOT (it has one GOT for each process using
the code containing the function), its GOT address has to be a
(hidden) argument to the function which arrives in r12.

For calls via the PLT, r12 contains the PLT entry's (i.e. the calling
module's) GOT pointer at the time of the call, and the PLT thunk
replaces it with the callee's GOT pointer (loaded from the function
descriptor) before jumping to the callee code. There is fundamentally
nowhere the PLT thunk could store the old value of r12 and arrange for
it to be restored at return time, so using a PLT forces r12 to be
call-clobbered.

(Note that in the special case where the PLT is bypassed because the
callee is defined in the same module and bound at link-time, the GOT
value loaded by the caller is the right GOT value for the callee
automatically.)

If we didn't care about being able to do PLT calls, there's no
fundamental reason r12 has to be call-clobbered, but it still makes a
lot more sense. Getting back the value of r12 you passed when making a
function call is rarely useful except in the case where the caller
knows the function is defined in the same module (so it can keep using
r12 as its own GOT pointer after the call).

BTW the reason I'm spending time explaining this now is that it's
something we should optimize after the FDPIC patch goes in: I think
the r12-related spills/reload could be made a lot more efficient.

> The special functions in libgcc tell the compiler
> exactly which things they clobber and which not.  R12 is not clobbered
> by the shift functions.

For FDPIC, that implies an assumption that the definition is local to
the calling module (i.e. static-linked) but I think that assumption
already existed for non-FDPIC since r12 was not explicitly set for the
call.

> > > Calls to other functions lib libgcc (e.g. division) seem to work fine
> > > and either go through the PLT or bypass it and load from the GOT
> > > directly. It's only these new special-calling-convention ones that are
> > > broken, and I can't figure out why...
> 
> Sorry, I wasn't paying attention to dynamic linking or *PIC when
> changing the shift patterns back then, so maybe I've screwed up
> something there.
> To me it looks like they do the same thing as expanders for division or
> the SH1 multiplication ("mulsi3" pattern).  Each of the libgcc support
> functions have a different "ABI", so "__ashlsi3_r0" or "__lshrsi3_r0"
> doesn't introduce a new special ABI, it already is as per definition.
> These function calls are not expanded like regular function calls, via
> e.g. (define_expand "call" ... ).  The function call is hidden from the
> regular function call machinery and everything thinks it's a regular
> instruction that just has some special register constraints and
> clobbers.
> 
> I've just tried compiling the following with -m2 -ml -fPIC
> 
> unsigned int test_2 (unsigned int x, unsigned int y)
> {
>   return x << y;
> }
> 
> unsigned int test_3 (unsigned int x, unsigned int y)
> {
>   return x / y;
> }
> 
> And the compiled code is basically identically for both.  For the labels
> I get:
> 
> ..L4: .long   _GLOBAL_OFFSET_TABLE_
> ..L5: .long   ___ashlsi3_r0@GOTOFF
> 
> and
> 
> ..L10:.long   _GLOBAL_OFFSET_TABLE_
> ..L11:.long   ___udivsi3@GOTOFF
> 
> So the shifts do not work, but the divisions do work that way?

It's not that one works and the other doesn't. I was just concerned
about the behavior and how it seems to be unsafe for shared libgcc;
it's equally unsafe for either. But as I found later:

> > Hmm, according to sh-protos.h:
> > 
> >   /* A special function that should be linked statical

Re: [PATCH] SH FDPIC backend support

2015-10-03 Thread Oleg Endo
On Sat, 2015-10-03 at 18:34 -0400, Rich Felker wrote:
> > 
> > I found and fixed the problem, but I have a new concern: calls to the
> > new shift instructions are using the following address forms:
> > 
> > -mno-fdpic -fPIC:
> > .long   __ashlsi3_r0@GOTOFF
> > 
> > -mfdpic:
> > .long   __ashlsi3_r0-(.LPCS1+2)
> > 
> > Neither of these seems valid. Both assume __ashlsi3_r0 will be defined
> > in the same DSO, which is not true in general; shared libgcc_s.so
> > might be in use. In this case the call would need to go through the
> > PLT, which (for PIC or FDPIC) requires r12 to be loaded with the GOT
> > address. In the non-FDPIC case, r12 _happens_ to contain the GOT
> > address just because it was used as an addend to get the function
> > address from the @GOTOFF address, but this does not seem
> > safe/reliable. In the FDPIC case there's nothing to cause r12 to
> > contain the GOT address, and in fact if the function has already made
> > another function call (which uses and clobbers r12), no code is
> > generated to save and restore r12 for the libgcc call.

I might be missing something, but usually R12 is preserved across
function calls.  The special functions in libgcc tell the compiler
exactly which things they clobber and which not.  R12 is not clobbered
by the shift functions.

> > Calls to other functions lib libgcc (e.g. division) seem to work fine
> > and either go through the PLT or bypass it and load from the GOT
> > directly. It's only these new special-calling-convention ones that are
> > broken, and I can't figure out why...

Sorry, I wasn't paying attention to dynamic linking or *PIC when
changing the shift patterns back then, so maybe I've screwed up
something there.
To me it looks like they do the same thing as expanders for division or
the SH1 multiplication ("mulsi3" pattern).  Each of the libgcc support
functions have a different "ABI", so "__ashlsi3_r0" or "__lshrsi3_r0"
doesn't introduce a new special ABI, it already is as per definition.
These function calls are not expanded like regular function calls, via
e.g. (define_expand "call" ... ).  The function call is hidden from the
regular function call machinery and everything thinks it's a regular
instruction that just has some special register constraints and
clobbers.

I've just tried compiling the following with -m2 -ml -fPIC

unsigned int test_2 (unsigned int x, unsigned int y)
{
  return x << y;
}

unsigned int test_3 (unsigned int x, unsigned int y)
{
  return x / y;
}

And the compiled code is basically identically for both.  For the labels
I get:

.L4:.long   _GLOBAL_OFFSET_TABLE_
.L5:.long   ___ashlsi3_r0@GOTOFF

and

.L10:   .long   _GLOBAL_OFFSET_TABLE_
.L11:   .long   ___udivsi3@GOTOFF

So the shifts do not work, but the divisions do work that way?


> Hmm, according to sh-protos.h:
> 
>   /* A special function that should be linked statically.  These are typically
>  smaller or not much larger than a PLT entry.
>  Some also have a non-standard ABI which precludes dynamic linking.  */
>   SFUNC_STATIC
> 
> So apparently the strange behavior I observed is intended. Presumably
> there is some mechanism to ensure that these functions are always
> static-linked? But I don't see it. The libgcc spec I see is:
> 
> *libgcc:
> %{static|static-libgcc:-lgcc
> -lgcc_eh}%{!static:%{!static-libgcc:%{!shared-libgcc:-lgcc --as-needed
> -lgcc_s --no-as-needed}%{shared-libgcc:-lgcc_s%{!shared: -lgcc
> 
> This explicitly omits -lgcc when -shared-libgcc is used with -shared.
> Thankfully __ashlsi3_r0 is not exported from libgcc.so.1 (as far as I
> can tell), so this will just be a link error rather than horribly
> wrong behavior, but it still seems like there's a bug here unless I'm
> misunderstanding something. I think the final %{!shared: -lgcc} in the
> spec is an error and should be replaced by simply -lgcc if there are
> targets where libgcc.a contains necessary symbols that are not/cannot
> be defined in libgcc_s.so.1.

Hm, maybe, but I don't know enough about this, sorry.  Kaz, maybe you
have a comment on that?

Cheers,
Oleg




Re: [PATCH] SH FDPIC backend support

2015-10-03 Thread Rich Felker
On Fri, Oct 02, 2015 at 11:18:32AM -0400, Rich Felker wrote:
> > > +#ifdef __FDPIC__
> > > +#define udiv_qrnnd(q, r, n1, n0, d) \
> > > +  do {   
> > > \
> > > +extern UWtype __udiv_qrnnd_16 (UWtype, UWtype)   
> > > \
> > 
> > It's really difficult to spot the subtle difference of the FDPIC version
> > and the non-FDPIC version.  At least there should be a comment.
> 
> OK, I can add a comment; this is appropriate anyway since the way it's
> making the FDPIC call is unconventional.

Before I add comments, can we discuss whether the approach I took is
appropriate? The udiv_qrnnd asm block takes as an operand a function
pointer for __udiv_qrnnd_16 which it calls from asm. The
__udiv_qrnnd_16 function is itself written in asm has a special
contract for register clobbers, and it doesn't need a GOT register.
The non-FDPIC asm calls it via jsr @%5 (%5 is the function pointer)
but on FDPIC the function pointer points to a function descriptor, not
code, so an extra level of indirection is needed. This is actually
inefficient to do in asm because we have to repeat it twice. Normally
an FDPIC call would also require loading the GOT pointer from the
function descriptor, but since this call is local, that can be
skipped.

Another option would be to pass (essentially) *(void**)__udiv_qrnnd_16
instead of __udiv_qrnnd_16 to the asm block; then the existing inline
asm can be used as-is. This could be done via passing
SH_CODE_ADDRESS(__udiv_qrnnd_16) instead of __udiv_qrnnd_16, where
SH_CODE_ADDRESS would be a macro defined to pass through its argument
for non-FDPIC and to extract the code address from the function
descriptor for FDPIC. However I'm not convinced it's clean/safe to do
the above punning. At the very least a may_alias attribute probably
belongs in there somewhere. But an approach like this would reduce
code duplication and slightly improve the size/performance of the
resulting code.

Opinions?

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-03 Thread Rich Felker
On Sat, Oct 03, 2015 at 03:12:16PM -0400, Rich Felker wrote:
> On Thu, Oct 01, 2015 at 09:30:17PM -0400, Rich Felker wrote:
> > But trying the patch on vanilla GCC trunk without my usual J2 target
> > setup revealed some additional issues I need to address. I'm getting
> > ICE in the code that generates the libgcc bitshift calls, which
> > weren't used on J2. This is my fault for failing to extend the changes
> > made to other parts of sh.md to the patterns for the new shifts (the
> > same ones that broke the kernel) and perhaps also some other things.
> > I'm going to go back and review that code and get it done right before
> > resubmitting the patch against trunk.
> 
> I found and fixed the problem, but I have a new concern: calls to the
> new shift instructions are using the following address forms:
> 
> -mno-fdpic -fPIC:
>   .long   __ashlsi3_r0@GOTOFF
> 
> -mfdpic:
>   .long   __ashlsi3_r0-(.LPCS1+2)
> 
> Neither of these seems valid. Both assume __ashlsi3_r0 will be defined
> in the same DSO, which is not true in general; shared libgcc_s.so
> might be in use. In this case the call would need to go through the
> PLT, which (for PIC or FDPIC) requires r12 to be loaded with the GOT
> address. In the non-FDPIC case, r12 _happens_ to contain the GOT
> address just because it was used as an addend to get the function
> address from the @GOTOFF address, but this does not seem
> safe/reliable. In the FDPIC case there's nothing to cause r12 to
> contain the GOT address, and in fact if the function has already made
> another function call (which uses and clobbers r12), no code is
> generated to save and restore r12 for the libgcc call.
> 
> Calls to other functions lib libgcc (e.g. division) seem to work fine
> and either go through the PLT or bypass it and load from the GOT
> directly. It's only these new special-calling-convention ones that are
> broken, and I can't figure out why...

Hmm, according to sh-protos.h:

  /* A special function that should be linked statically.  These are typically
 smaller or not much larger than a PLT entry.
 Some also have a non-standard ABI which precludes dynamic linking.  */
  SFUNC_STATIC

So apparently the strange behavior I observed is intended. Presumably
there is some mechanism to ensure that these functions are always
static-linked? But I don't see it. The libgcc spec I see is:

*libgcc:
%{static|static-libgcc:-lgcc
-lgcc_eh}%{!static:%{!static-libgcc:%{!shared-libgcc:-lgcc --as-needed
-lgcc_s --no-as-needed}%{shared-libgcc:-lgcc_s%{!shared: -lgcc

This explicitly omits -lgcc when -shared-libgcc is used with -shared.
Thankfully __ashlsi3_r0 is not exported from libgcc.so.1 (as far as I
can tell), so this will just be a link error rather than horribly
wrong behavior, but it still seems like there's a bug here unless I'm
misunderstanding something. I think the final %{!shared: -lgcc} in the
spec is an error and should be replaced by simply -lgcc if there are
targets where libgcc.a contains necessary symbols that are not/cannot
be defined in libgcc_s.so.1.

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-03 Thread Rich Felker
On Thu, Oct 01, 2015 at 09:30:17PM -0400, Rich Felker wrote:
> But trying the patch on vanilla GCC trunk without my usual J2 target
> setup revealed some additional issues I need to address. I'm getting
> ICE in the code that generates the libgcc bitshift calls, which
> weren't used on J2. This is my fault for failing to extend the changes
> made to other parts of sh.md to the patterns for the new shifts (the
> same ones that broke the kernel) and perhaps also some other things.
> I'm going to go back and review that code and get it done right before
> resubmitting the patch against trunk.

I found and fixed the problem, but I have a new concern: calls to the
new shift instructions are using the following address forms:

-mno-fdpic -fPIC:
.long   __ashlsi3_r0@GOTOFF

-mfdpic:
.long   __ashlsi3_r0-(.LPCS1+2)

Neither of these seems valid. Both assume __ashlsi3_r0 will be defined
in the same DSO, which is not true in general; shared libgcc_s.so
might be in use. In this case the call would need to go through the
PLT, which (for PIC or FDPIC) requires r12 to be loaded with the GOT
address. In the non-FDPIC case, r12 _happens_ to contain the GOT
address just because it was used as an addend to get the function
address from the @GOTOFF address, but this does not seem
safe/reliable. In the FDPIC case there's nothing to cause r12 to
contain the GOT address, and in fact if the function has already made
another function call (which uses and clobbers r12), no code is
generated to save and restore r12 for the libgcc call.

Calls to other functions lib libgcc (e.g. division) seem to work fine
and either go through the PLT or bypass it and load from the GOT
directly. It's only these new special-calling-convention ones that are
broken, and I can't figure out why...

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-03 Thread Rich Felker
On Sat, Oct 03, 2015 at 05:17:53PM +0900, Oleg Endo wrote:
> On Sat, 2015-10-03 at 00:50 -0400, Rich Felker wrote:
> 
> > I have -mfdpic in the self-specs when FDPIC_DEFAULT is defined, so I
> > think only the positive form is needed. 
> 
> Having positive and negative forms for options makes sense.  It usually
> costs nothing because anyway the compiler internally supports both and
> it allows special-casing if one of them is the default, which can be
> useful for testing.

What I'm saying is that the self-specs approach to FDPIC_DEFAULT has
the compiler driver adding -mfdpic to its own command line (via
%{!mno-fdpic:-mfdpic}) when FDPIC_DEFAULT is defined. This allows
other specs simply to test %{mfdpic:...} rather than having complex
separate forms depending on whether FDPIC_DEFAULT is defined. The
negative form is of course supported too (and suppresses the self-spec
addition of -mfdpic).

I'm not sure if this approach is acceptable upstream in gcc. I like it
a lot better because it isolates this kind of logic as described above
rather than having it spread out all over the place in error-prone
ways. I had a several-line patch for default-pie support that worked
via self-specs too; the one that was actually committed to gcc was
hugely invasive across many files including most targets...

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-03 Thread Oleg Endo
On Sat, 2015-10-03 at 00:50 -0400, Rich Felker wrote:

> I have -mfdpic in the self-specs when FDPIC_DEFAULT is defined, so I
> think only the positive form is needed. 

Having positive and negative forms for options makes sense.  It usually
costs nothing because anyway the compiler internally supports both and
it allows special-casing if one of them is the default, which can be
useful for testing.

Cheers,
Oleg



Re: [PATCH] SH FDPIC backend support

2015-10-02 Thread Rich Felker
On Sat, Oct 03, 2015 at 06:57:56AM +0900, Kaz Kojima wrote:
> Rich Felker  wrote:
> > I worked around it and opened an issue for it:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67812
> > 
> > But trying the patch on vanilla GCC trunk without my usual J2 target
> > setup revealed some additional issues I need to address. I'm getting
> > ICE in the code that generates the libgcc bitshift calls, which
> > weren't used on J2. This is my fault for failing to extend the changes
> > made to other parts of sh.md to the patterns for the new shifts (the
> > same ones that broke the kernel) and perhaps also some other things.
> > I'm going to go back and review that code and get it done right before
> > resubmitting the patch against trunk.
> > 
> > If you have any other general comments on the patch in the mean time
> > I'd be happy to hear them.
> 
> FYI, the patch can be applied to trunk almost as is.  I've tried
> to build/make -k check for cross sh4-unknown-linux-gnu.

Yes, there's just one trivial conflict.

> >  #ifndef SUBTARGET_ASM_SPEC
> > -#define SUBTARGET_ASM_SPEC ""
> > +#define SUBTARGET_ASM_SPEC "%{!mno-fdpic:--fdpic}"
> >  #endif
> 
> With it, plain sh4-unknown-linux-gnu compiler adds --fdpic
> to the AS command unless -mno-fdpic is specified and the build
> fails during linking the target libgcc.so.  I've changed it into

Oops.

>  #ifndef SUBTARGET_ASM_SPEC
> -#define SUBTARGET_ASM_SPEC ""
> +#ifdef FDPIC_DEFAULT
> +#define SUBTARGET_ASM_SPEC "%{!mno-fdpic:--fdpic}"
> +#else
> +#define SUBTARGET_ASM_SPEC "%{mfdpic:--fdpic}"
> +#endif
>  #endif

I have -mfdpic in the self-specs when FDPIC_DEFAULT is defined, so I
think only the positive form is needed. If having self specs is not
acceptable, several places need changing: at least the linker
emulation and (in the musl support patch; this is not yet upstream)
changing the logic for the dynamic linker name to have separate cases
for FDPIC_DEFAULT defined/undefined.

> There are no new failures with the top level "make -k check"
> on sh4-unknown-linux-gnu.

Thanks for checking this!

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-02 Thread Kaz Kojima
Rich Felker  wrote:
> I worked around it and opened an issue for it:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67812
> 
> But trying the patch on vanilla GCC trunk without my usual J2 target
> setup revealed some additional issues I need to address. I'm getting
> ICE in the code that generates the libgcc bitshift calls, which
> weren't used on J2. This is my fault for failing to extend the changes
> made to other parts of sh.md to the patterns for the new shifts (the
> same ones that broke the kernel) and perhaps also some other things.
> I'm going to go back and review that code and get it done right before
> resubmitting the patch against trunk.
> 
> If you have any other general comments on the patch in the mean time
> I'd be happy to hear them.

FYI, the patch can be applied to trunk almost as is.  I've tried
to build/make -k check for cross sh4-unknown-linux-gnu.

>  #ifndef SUBTARGET_ASM_SPEC
> -#define SUBTARGET_ASM_SPEC ""
> +#define SUBTARGET_ASM_SPEC "%{!mno-fdpic:--fdpic}"
>  #endif

With it, plain sh4-unknown-linux-gnu compiler adds --fdpic
to the AS command unless -mno-fdpic is specified and the build
fails during linking the target libgcc.so.  I've changed it into

 #ifndef SUBTARGET_ASM_SPEC
-#define SUBTARGET_ASM_SPEC ""
+#ifdef FDPIC_DEFAULT
+#define SUBTARGET_ASM_SPEC "%{!mno-fdpic:--fdpic}"
+#else
+#define SUBTARGET_ASM_SPEC "%{mfdpic:--fdpic}"
+#endif
 #endif

There are no new failures with the top level "make -k check"
on sh4-unknown-linux-gnu.

Regards,
kaz


Re: [PATCH] SH FDPIC backend support

2015-10-02 Thread Oleg Endo
On Fri, 2015-10-02 at 11:18 -0400, Rich Felker wrote:

> Thanks! This is very helpful. gcc style has changed a lot since the
> old patch was submitted so I think it makes sense to update it to
> match current practices rather than just making it work. I'll try to
> focus on any functional problems first though so as to keep a working
> patch against 5.2 as well and ease backporting to earlier versions (if
> anyone wants to do that on their own; certainly I don't expect it to
> happen in upstream gcc).

Let's see what the final patch will look like.

> There are a few call sites where the symbol returned is actually used.
> Would you want me to just do something like:
> 
> struct function_symbol_result funcsym = function_symbol(...);
> 
> then use funcsym.symbol and funcsym.label?

If you need both return values, then yes.  But without the "struct".  If
"function_symbol_result" is too long feel free to come up with a shorter
name.

> 
> Would you object to shorter member names .sym and .lab?

No, that's OK, too.

> Uhg, not sure how I missed that. (Well, yes I am -- it's C++'s
> fault;-) I'll try to figure out what's going on.

I think the overloaded function from your patch is simply not invoked by
anything.  You'd probably have to merge it into the already existing
one.

> > > +(define_insn "sibcalli_fdpic"
> > > +  [(call (mem:SI (match_operand:SI 0 "register_operand" "k"))
> > > +  (match_operand 1 "" ""))
> > > +   (use (reg:SI FPSCR_MODES_REG))
> > > +   (use (reg:SI PIC_REG))
> > > +   (return)]
> > > +  "TARGET_SH1 && TARGET_FDPIC"
> > >^^^
> > 
> > This is maybe slightly impossible, because of ..
> 
> Because SH5 is deprecated?

No, because ...

> > > +  if (TARGET_FDPIC
> > > +  && (TARGET_SHMEDIA || TARGET_SHCOMPACT || !TARGET_SH2))
> > > +sorry ("non-SH2 FDPIC");
> > > +

... this refuses operation if FDPIC is used with anything "less than"
SH2, i.e. SH1.  I think the condition above should be "TARGET_SH2 &&
TARGET_FDPIC".


> OK, but I'm not really familiar with this part of the code; I just
> adapted the patch by pattern. There are a lot of places with
> (match_operand N "" ""); should the empty strings be dropped for all
> of them?

Yes, there are several places with empty predicate/constraint strings.
They could be removed with a big patch, but for the moment, just don't
add new ones.

> > Maybe it could be useful to replace all "gen_rtx_REG (Pmode, PIC_REG)"
> > in the patch with something like 'get_t_reg_rtx'.  Depends on how many
> > times this gen_rtx_REG is invoked.
> 
> I'm fairly indifferent to this. Neither is significantly shorter or
> more readable.

It's about the amount of rtx objects generated.  But that can be checked
out later.

> I'm a bit confused by the fact that basically ALL of the traffic on
> the linux-sh list is "shmedia" stuff. Is that unrelated to the actual
> SH5/SHMEDIA and just a brand name that got co-opted for an ARM-based
> SoC?

Yes, looks like.

>  If so, is there anything that can be done to get it off the
> linux-sh list so that it doesn't bury mail about the actual SH
> ISA/platform?

Ask there.  It doesn't show up here :)

Cheers,
Oleg



Re: [PATCH] SH FDPIC backend support

2015-10-02 Thread Rich Felker
On Fri, Oct 02, 2015 at 10:51:03PM +0900, Oleg Endo wrote:
> On Thu, 2015-10-01 at 21:30 -0400, Rich Felker wrote:
> 
> > If you have any other general comments on the patch in the mean time
> > I'd be happy to hear them.
> 
> Below are some comments.  Might be a bit unstructured, I was hopping
> through the patch file.  Sorry about that.

Thanks! This is very helpful. gcc style has changed a lot since the
old patch was submitted so I think it makes sense to update it to
match current practices rather than just making it work. I'll try to
focus on any functional problems first though so as to keep a working
patch against 5.2 as well and ease backporting to earlier versions (if
anyone wants to do that on their own; certainly I don't expect it to
happen in upstream gcc).

> > +function_symbol (rtx target, const char *name, enum sh_function_kind kind, 
> > rtx *lab)
>  ^
> 
> Please do not add unnecessary 'enum', 'struct', 'typedef' etc.  In this
> case it was already here, but since this is is touching the line, please
> remove it.

OK.

> I'd rather make the function 'function_symbol' returning a
> std::pair or something like
> 
> struct function_symbol_result
> {
>   function_symbol_result (void) : symbol (NULL), label (NULL) { }
>   function_symbol_result (rtx s, rtx l) : symbol (s), label (l) { }
> 
>   rtx symbol;
>   rtx label;
> };
> 
> instead of doing return values by pointer-args.  On the caller sites,
> you can then do something like
> 
> rtx lab = function_symbol (func_addr_rtx, "...", SFUNC_STATIC).label;
> 
> This will make the the patch also a few hunks shorter.

There are a few call sites where the symbol returned is actually used.
Would you want me to just do something like:

struct function_symbol_result funcsym = function_symbol(...);

then use funcsym.symbol and funcsym.label?

Would you object to shorter member names .sym and .lab?

> > +extern bool sh_legitimate_constant_p (rtx);
> 
> There is already a target hook/callback function:
> 
> static bool
> sh_legitimate_constant_p (machine_mode mode, rtx x)
> 
> You newly added function is an overload it and I'm not sure who invokes it.

Uhg, not sure how I missed that. (Well, yes I am -- it's C++'s
fault;-) I'll try to figure out what's going on.

> > +extern rtx sh_our_fdpic_reg (void);
> 
> Please rename this to 'sh_get_fdpic_reg_initial_val'.  There's a similar
> function 'sh_get_pr_initial_val' which also uses
> 'get_hard_reg_initial_val'.

OK.

> > +/* An rtx holding the initial value of the FDPIC register (the FDPIC
> > +   pointer passed in from the caller).  */
> > +#define OUR_FDPIC_REG  sh_our_fdpic_reg ()
> > +
> 
> Please remove this macro and add 'sh_get_fdpic_reg_initial_val' to
> sh-protos.h and use that function instead.

OK.

> >  void
> >  prepare_move_operands (rtx operands[], machine_mode mode)
> >  {
> > +  rtx tmp, base, offset;
> > +
> 
> Please declare variables where they are used.

OK.

> > +  if (TARGET_FDPIC)
> > +{
> > +  rtx pic_reg = gen_rtx_REG (Pmode, PIC_REG);
> > +  emit_move_insn (pic_reg, OUR_FDPIC_REG);
> > +}
> > +
> 
> Make this a one-liner
> 
>   emit_move_insn (gen_rtx_REG (Pmode, PIC_REG),
> sh_get_fdpic_reg_initial_val ());

OK.

> > +(define_insn "sibcalli_fdpic"
> > +  [(call (mem:SI (match_operand:SI 0 "register_operand" "k"))
> > +(match_operand 1 "" ""))
> > +   (use (reg:SI FPSCR_MODES_REG))
> > +   (use (reg:SI PIC_REG))
> > +   (return)]
> > +  "TARGET_SH1 && TARGET_FDPIC"
> >^^^
> 
> This is maybe slightly impossible, because of ..

Because SH5 is deprecated?

> > +  if (TARGET_FDPIC
> > +  && (TARGET_SHMEDIA || TARGET_SHCOMPACT || !TARGET_SH2))
> > +sorry ("non-SH2 FDPIC");
> > +
> 
> 
> > +  [(match_operand 0 "" "") (match_operand 1 "" "")]
> > 
> 
> Please don't add empty predicate/constraint strings if not necessary.  In 
> this case
>   [(match_operand 0) (match_operand 1)]
> 
> will suffice.

OK, but I'm not really familiar with this part of the code; I just
adapted the patch by pattern. There are a lot of places with
(match_operand N "" ""); should the empty strings be dropped for all
of them?

> >   if (TARGET_FDPIC)
> > +picreg = OUR_FDPIC_REG;
> > +  else
> > +picreg = gen_rtx_REG (Pmode, PIC_REG);
> > +
> 
> rtx picreg = TARGET_FDPIC ? ...
> : ... ;

OK.

> Maybe it could be useful to replace all "gen_rtx_REG (Pmode, PIC_REG)"
> in the patch with something like 'get_t_reg_rtx'.  Depends on how many
> times this gen_rtx_REG is invoked.

I'm fairly indifferent to this. Neither is significantly shorter or
more readable.

> > +// FIXME: what happens if someone tries fdpic on SH5?
> > 
> 
> Nothing.  See also
> https://gcc.gnu.org/ml/gcc/2015-08/msg00101.html
> 
> Please omit all SH5/SHMEDIA checks and related code.

I'm a bit confused by the fact that basically ALL of the traffic on
the linux-sh list is "shmedia" stuff. Is 

Re: [PATCH] SH FDPIC backend support

2015-10-02 Thread Oleg Endo
On Thu, 2015-10-01 at 21:30 -0400, Rich Felker wrote:

> If you have any other general comments on the patch in the mean time
> I'd be happy to hear them.

Below are some comments.  Might be a bit unstructured, I was hopping
through the patch file.  Sorry about that.

> +function_symbol (rtx target, const char *name, enum sh_function_kind kind, 
> rtx *lab)
 ^

Please do not add unnecessary 'enum', 'struct', 'typedef' etc.  In this
case it was already here, but since this is is touching the line, please
remove it.

I'd rather make the function 'function_symbol' returning a
std::pair or something like

struct function_symbol_result
{
  function_symbol_result (void) : symbol (NULL), label (NULL) { }
  function_symbol_result (rtx s, rtx l) : symbol (s), label (l) { }

  rtx symbol;
  rtx label;
};

instead of doing return values by pointer-args.  On the caller sites,
you can then do something like

rtx lab = function_symbol (func_addr_rtx, "...", SFUNC_STATIC).label;

This will make the the patch also a few hunks shorter.

> +extern bool sh_legitimate_constant_p (rtx);

There is already a target hook/callback function:

static bool
sh_legitimate_constant_p (machine_mode mode, rtx x)

You newly added function is an overload it and I'm not sure who invokes it.


> +extern rtx sh_our_fdpic_reg (void);

Please rename this to 'sh_get_fdpic_reg_initial_val'.  There's a similar
function 'sh_get_pr_initial_val' which also uses
'get_hard_reg_initial_val'.

> +/* An rtx holding the initial value of the FDPIC register (the FDPIC
> +   pointer passed in from the caller).  */
> +#define OUR_FDPIC_REGsh_our_fdpic_reg ()
> +

Please remove this macro and add 'sh_get_fdpic_reg_initial_val' to
sh-protos.h and use that function instead.

>  void
>  prepare_move_operands (rtx operands[], machine_mode mode)
>  {
> +  rtx tmp, base, offset;
> +

Please declare variables where they are used.


> +  if (TARGET_FDPIC)
> +{
> +  rtx pic_reg = gen_rtx_REG (Pmode, PIC_REG);
> +  emit_move_insn (pic_reg, OUR_FDPIC_REG);
> +}
> +

Make this a one-liner

  emit_move_insn (gen_rtx_REG (Pmode, PIC_REG),
  sh_get_fdpic_reg_initial_val ());



> +(define_insn "sibcalli_fdpic"
> +  [(call (mem:SI (match_operand:SI 0 "register_operand" "k"))
> +  (match_operand 1 "" ""))
> +   (use (reg:SI FPSCR_MODES_REG))
> +   (use (reg:SI PIC_REG))
> +   (return)]
> +  "TARGET_SH1 && TARGET_FDPIC"
>^^^

This is maybe slightly impossible, because of ..

> +  if (TARGET_FDPIC
> +  && (TARGET_SHMEDIA || TARGET_SHCOMPACT || !TARGET_SH2))
> +sorry ("non-SH2 FDPIC");
> +


> +  [(match_operand 0 "" "") (match_operand 1 "" "")]
> 

Please don't add empty predicate/constraint strings if not necessary.  In this 
case
[(match_operand 0) (match_operand 1)]

will suffice.


>   if (TARGET_FDPIC)
> +picreg = OUR_FDPIC_REG;
> +  else
> +picreg = gen_rtx_REG (Pmode, PIC_REG);
> +

rtx picreg = TARGET_FDPIC ? ...
  : ... ;

Maybe it could be useful to replace all "gen_rtx_REG (Pmode, PIC_REG)"
in the patch with something like 'get_t_reg_rtx'.  Depends on how many
times this gen_rtx_REG is invoked.


> +// FIXME: what happens if someone tries fdpic on SH5?
> 

Nothing.  See also
https://gcc.gnu.org/ml/gcc/2015-08/msg00101.html

Please omit all SH5/SHMEDIA checks and related code.


> +#ifdef __FDPIC__
> +#define udiv_qrnnd(q, r, n1, n0, d) \
> +  do {   
> \
> +extern UWtype __udiv_qrnnd_16 (UWtype, UWtype)   \

It's really difficult to spot the subtle difference of the FDPIC version
and the non-FDPIC version.  At least there should be a comment.

Cheers,
Oleg




Re: [PATCH] SH FDPIC backend support

2015-10-01 Thread Rich Felker
On Thu, Oct 01, 2015 at 07:39:10PM -0400, Rich Felker wrote:
> On Fri, Oct 02, 2015 at 07:36:27AM +0900, Oleg Endo wrote:
> > On Thu, 2015-10-01 at 17:35 -0400, Rich Felker wrote:
> > > This is a forward-port of the abandoned SH FDPIC patch from 2010:
> > > 
> > > https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01536.html
> > > 
> > > I'm submitting it at this point for initial review, not to be applied
> > > right away; I would not be surprised if some changes are needed. It
> > > applies on top of gcc 5.2.0 with the patch for pr 66609 applied. With
> > > one trivial change it also applies to the current development version
> > > of gcc, but I have not tested that setup.
> > 
> > Thanks for working on this.  Please submit a patch against trunk.
> 
> I'm going to go ahead and submit the patch adjusted for trunk, but I
> have not yet tested it yet -- I can't get trunk to build because of a
> regression. Apparently someone added -fno-PIE to the build process,
> which breaks when the host toolchain you're building with uses -pie by
> default. Is this a known issue?

I worked around it and opened an issue for it:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67812

But trying the patch on vanilla GCC trunk without my usual J2 target
setup revealed some additional issues I need to address. I'm getting
ICE in the code that generates the libgcc bitshift calls, which
weren't used on J2. This is my fault for failing to extend the changes
made to other parts of sh.md to the patterns for the new shifts (the
same ones that broke the kernel) and perhaps also some other things.
I'm going to go back and review that code and get it done right before
resubmitting the patch against trunk.

If you have any other general comments on the patch in the mean time
I'd be happy to hear them.

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-01 Thread Rich Felker
On Fri, Oct 02, 2015 at 07:36:27AM +0900, Oleg Endo wrote:
> On Thu, 2015-10-01 at 17:35 -0400, Rich Felker wrote:
> > This is a forward-port of the abandoned SH FDPIC patch from 2010:
> > 
> > https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01536.html
> > 
> > I'm submitting it at this point for initial review, not to be applied
> > right away; I would not be surprised if some changes are needed. It
> > applies on top of gcc 5.2.0 with the patch for pr 66609 applied. With
> > one trivial change it also applies to the current development version
> > of gcc, but I have not tested that setup.
> 
> Thanks for working on this.  Please submit a patch against trunk.

I'm going to go ahead and submit the patch adjusted for trunk, but I
have not yet tested it yet -- I can't get trunk to build because of a
regression. Apparently someone added -fno-PIE to the build process,
which breaks when the host toolchain you're building with uses -pie by
default. Is this a known issue?

Rich


Re: [PATCH] SH FDPIC backend support

2015-10-01 Thread Oleg Endo
On Thu, 2015-10-01 at 17:35 -0400, Rich Felker wrote:
> This is a forward-port of the abandoned SH FDPIC patch from 2010:
> 
> https://gcc.gnu.org/ml/gcc-patches/2010-08/msg01536.html
> 
> I'm submitting it at this point for initial review, not to be applied
> right away; I would not be surprised if some changes are needed. It
> applies on top of gcc 5.2.0 with the patch for pr 66609 applied. With
> one trivial change it also applies to the current development version
> of gcc, but I have not tested that setup.

Thanks for working on this.  Please submit a patch against trunk.

Cheers,
Oleg