Re: [2/5] C-SKY port: Backend implementation

2018-08-03 Thread Sandra Loosemore

On 08/02/2018 04:27 PM, Jeff Law wrote:

I think the cse_cc pass is really just working around one or more bugs
in CSE and/or a backend bug.  The RTL above clearly shows a common
subexpression that is not eliminated by CSE.

What CSE should be trying to do is changing the second and third
occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would
create nop-sets which would be subsequently deleted.  I suspect you do
not have an insn which matches that nop set of the CC register.  If you
fix that I suspect CSE will work better and eliminate the need for your
cse_cc pass.


Thanks for the tip about that!  I hacked things up to do as you suggest 
and it appears to work.  I'll post a new patch set once I've had time 
for more testing, probably Monday-ish.


-Sandra


Re: [2/5] C-SKY port: Backend implementation

2018-08-03 Thread Yunhai



> 在 2018年8月3日,06:27,Jeff Law  写道:
> 
> On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
>> 
>>> 在 2018年7月25日,上午5:24,Jeff Law  写道:
>>> 
>>> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
 On 07/24/2018 09:45 AM, Jeff Law wrote:
> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
> I'm not a big fan of more awk code, but I'm not going to object to it :-)
> 
> Why does the port have its own little pass for condition code
> optimization (cse_cc)?  What is it doing that can't be done with our
> generic optimizers?
 
 This pass was included in the initial patch set we got from C-SKY, and
 as it didn't seem to break anything I left it in.  Perhaps C-SKY can
 provide a testcase that demonstrates why it's still useful in the
 current version of GCC; otherwise we can remove this from the initial
 port submission and restore it later if some performance analysis shows
 it is still worthwhile.
>>> FWIW it looks like we model CC setting on just a few insns, (add,
>>> subtract) so I'd be surprised if this little mini pass found much.  I'd
>>> definitely like to hear from the csky authors here.
>>> 
>>> Alternately, you could do add some instrumentation to flag when it
>>> triggers, take a test or two that does, reduce it and we can then look
>>> at the key RTL sequences and see what the pass is really doing.
>>> 
>> 
>> I wrote a case to reproduce this problem on C-SKY. C code is as follows:
>> ---
>> int e1, e2;
>> 
>> void func (int a, int b, int c, int d, int f, int g)
>> {
>>  e1 = a > b ? f : g;
>>  e2 = a > b ? c : d;
>> 
>>  return;
>> }
>> ---
>> 
>> compile to assembler with option “-O3 -S” :
>> ---
>> func:
>>  cmplt a1, a0
>>  ld.w  t1, (sp, 0)
>>  ld.w  t0, (sp, 4)
>>  movt  t0, t1
>>  cmplt a1, a0
>>  movt  a3, a2
>>  lrw a1, e2
>>  lrw a2, e1
>>  st.w  a3, (a1, 0)
>>  st.w  t0, (a2, 0)
>>  rts
>> ---
>> There is an extra “cmplt a1, a0" in the above code without cse_cc. This 
>> situation mainly occurs when a relatively short branch jump is converted 
>> into a conditional execution instruction. And the CSE pass can not reduce 
>> the same conditional comparison instruction . Here is the rtx sequence after 
>> “cse2” pass.
>> 
>> ---
>> (insn 28 13 29 2 (set (reg:CC 33 c)
>>(gt:CC (reg/v:SI 77 [ a ])
>>(reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>> (nil))
>> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
>>(if_then_else:SI (eq (reg:CC 33 c)
>>(const_int 0 [0]))
>>(reg/v:SI 82 [ g ])
>>(reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
>> (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
>>(expr_list:REG_DEAD (reg:CC 33 c)
>>(nil
>> (insn 30 29 31 2 (set (reg:CC 33 c)
>>(gt:CC (reg/v:SI 77 [ a ])
>>(reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>> (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
>>(expr_list:REG_DEAD (reg/v:SI 77 [ a ])
>>(nil
>> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
>>(if_then_else:SI (eq (reg:CC 33 c)
>>(const_int 0 [0]))
>>(reg/v:SI 80 [ d ])
>>(reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
>> (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
>>(expr_list:REG_DEAD (reg:CC 33 c)
>>(nil
>> ---
>> 
>> It doesn't seem to check the same conditional comparison instruction .So I 
>> wrote this to solve this problem, but I am not sure if this is the best way 
>> : )
>> 
>> PS, the same conditional comparison instruction cannot be reduced with the 
>> latest version gcc with C-SKY because I just insert the “cse_cc” after 
>> “cse1”, when I insert after “cse2”, this problem can be solved very well.
> I think the cse_cc pass is really just working around one or more bugs
> in CSE and/or a backend bug.  The RTL above clearly shows a common
> subexpression that is not eliminated by CSE.
> 
> What CSE should be trying to do is changing the second and third
> occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would
> create nop-sets which would be subsequently deleted.  I suspect you do
> not have an insn which matches that nop set of the CC register.  If you
> fix that I suspect CSE will work better and eliminate the need for your
> cse_cc pass.

Thanks you for your suggestions, we will try this method.

> 
> jeff



Re: [2/5] C-SKY port: Backend implementation

2018-08-02 Thread Jeff Law
On 07/27/2018 07:49 PM, Sandra Loosemore wrote:
> On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
>>
>> I wrote a case to reproduce this problem on C-SKY. C code is as follows:
>> ---
>> int e1, e2;
>>
>> void func (int a, int b, int c, int d, int f, int g)
>> {
>>    e1 = a > b ? f : g;
>>    e2 = a > b ? c : d;
>>
>>    return;
>> }
>> ---
>>
>> compile to assembler with option “-O3 -S” :
>> ---
>> func:
>>    cmplt a1, a0
>>    ld.w  t1, (sp, 0)
>>    ld.w  t0, (sp, 4)
>>    movt  t0, t1
>>    cmplt a1, a0
>>    movt  a3, a2
>>    lrw a1, e2
>>    lrw a2, e1
>>    st.w  a3, (a1, 0)
>>    st.w  t0, (a2, 0)
>>    rts
>> ---
>> There is an extra “cmplt a1, a0" in the above code without cse_cc.
>> This situation mainly occurs when a relatively short branch jump is
>> converted into a conditional execution instruction. And the CSE pass
>> can not reduce the same conditional comparison instruction . Here is
>> the rtx sequence after “cse2” pass.
>>
>> ---
>> (insn 28 13 29 2 (set (reg:CC 33 c)
>>  (gt:CC (reg/v:SI 77 [ a ])
>>  (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>>   (nil))
>> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
>>  (if_then_else:SI (eq (reg:CC 33 c)
>>  (const_int 0 [0]))
>>  (reg/v:SI 82 [ g ])
>>  (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
>>   (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
>>  (expr_list:REG_DEAD (reg:CC 33 c)
>>  (nil
>> (insn 30 29 31 2 (set (reg:CC 33 c)
>>  (gt:CC (reg/v:SI 77 [ a ])
>>  (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>>   (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
>>  (expr_list:REG_DEAD (reg/v:SI 77 [ a ])
>>  (nil
>> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
>>  (if_then_else:SI (eq (reg:CC 33 c)
>>  (const_int 0 [0]))
>>  (reg/v:SI 80 [ d ])
>>  (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
>>   (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
>>  (expr_list:REG_DEAD (reg:CC 33 c)
>>  (nil
>> ---
>>
>> It doesn't seem to check the same conditional comparison instruction
>> .So I wrote this to solve this problem, but I am not sure if this is
>> the best way : )
>>
>> PS, the same conditional comparison instruction cannot be reduced with
>> the latest version gcc with C-SKY because I just insert the “cse_cc”
>> after “cse1”, when I insert after “cse2”, this problem can be solved
>> very well.
> 
> Thanks, this is very helpful.  I've verified this and I'm moving the
> pass as you suggest, adding a more detailed comment to the source to
> explain what the pass is for, and adding your test case to the
> testsuite.  This will be included when I resubmit the patches to address
> other review comments too.
> 
> Jeff, does that adequately address your concerns about whether the pass
> is useful?
I think the pass is papering over problems elsewhere (see my most other
reply on this thread).  I do think it would be useful to take that code
and create a test based on it.  I suspect you'll want to verify multiple
GT expressions prior to CSE2 and that after CSE2 you have a single GT
expression.  Presumably it'd be in the csky specific test.

jeff


Re: [2/5] C-SKY port: Backend implementation

2018-08-02 Thread Jeff Law
On 07/26/2018 12:06 AM, 瞿仙淼 wrote:
> 
>> 在 2018年7月25日,上午5:24,Jeff Law  写道:
>>
>> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
>>> On 07/24/2018 09:45 AM, Jeff Law wrote:
 On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
 I'm not a big fan of more awk code, but I'm not going to object to it :-)

 Why does the port have its own little pass for condition code
 optimization (cse_cc)?  What is it doing that can't be done with our
 generic optimizers?
>>>
>>> This pass was included in the initial patch set we got from C-SKY, and
>>> as it didn't seem to break anything I left it in.  Perhaps C-SKY can
>>> provide a testcase that demonstrates why it's still useful in the
>>> current version of GCC; otherwise we can remove this from the initial
>>> port submission and restore it later if some performance analysis shows
>>> it is still worthwhile.
>> FWIW it looks like we model CC setting on just a few insns, (add,
>> subtract) so I'd be surprised if this little mini pass found much.  I'd
>> definitely like to hear from the csky authors here.
>>
>> Alternately, you could do add some instrumentation to flag when it
>> triggers, take a test or two that does, reduce it and we can then look
>> at the key RTL sequences and see what the pass is really doing.
>>
> 
> I wrote a case to reproduce this problem on C-SKY. C code is as follows:
> ---
> int e1, e2;
> 
> void func (int a, int b, int c, int d, int f, int g)
> {
>   e1 = a > b ? f : g;
>   e2 = a > b ? c : d;
> 
>   return;
> }
> ---
> 
> compile to assembler with option “-O3 -S” :
> ---
> func:
>   cmplt a1, a0
>   ld.w  t1, (sp, 0)
>   ld.w  t0, (sp, 4)
>   movt  t0, t1
>   cmplt a1, a0
>   movt  a3, a2
>   lrw a1, e2
>   lrw a2, e1
>   st.w  a3, (a1, 0)
>   st.w  t0, (a2, 0)
>   rts
> ---
> There is an extra “cmplt a1, a0" in the above code without cse_cc. This 
> situation mainly occurs when a relatively short branch jump is converted into 
> a conditional execution instruction. And the CSE pass can not reduce the same 
> conditional comparison instruction . Here is the rtx sequence after “cse2” 
> pass.
> 
> ---
> (insn 28 13 29 2 (set (reg:CC 33 c)
> (gt:CC (reg/v:SI 77 [ a ])
> (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>  (nil))
> (insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
> (if_then_else:SI (eq (reg:CC 33 c)
> (const_int 0 [0]))
> (reg/v:SI 82 [ g ])
> (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
>  (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
> (expr_list:REG_DEAD (reg:CC 33 c)
> (nil
> (insn 30 29 31 2 (set (reg:CC 33 c)
> (gt:CC (reg/v:SI 77 [ a ])
> (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
>  (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
> (expr_list:REG_DEAD (reg/v:SI 77 [ a ])
> (nil
> (insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
> (if_then_else:SI (eq (reg:CC 33 c)
> (const_int 0 [0]))
> (reg/v:SI 80 [ d ])
> (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
>  (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
> (expr_list:REG_DEAD (reg:CC 33 c)
> (nil
> ---
> 
> It doesn't seem to check the same conditional comparison instruction .So I 
> wrote this to solve this problem, but I am not sure if this is the best way : 
> )
> 
> PS, the same conditional comparison instruction cannot be reduced with the 
> latest version gcc with C-SKY because I just insert the “cse_cc” after 
> “cse1”, when I insert after “cse2”, this problem can be solved very well.
I think the cse_cc pass is really just working around one or more bugs
in CSE and/or a backend bug.  The RTL above clearly shows a common
subexpression that is not eliminated by CSE.

What CSE should be trying to do is changing the second and third
occurrences of (gt:CC (reg 77) (reg 78)) with (reg 33) which would
create nop-sets which would be subsequently deleted.  I suspect you do
not have an insn which matches that nop set of the CC register.  If you
fix that I suspect CSE will work better and eliminate the need for your
cse_cc pass.

jeff



Re: [2/5] C-SKY port: Backend implementation

2018-07-27 Thread Sandra Loosemore

On 07/26/2018 12:06 AM, 瞿仙淼 wrote:


I wrote a case to reproduce this problem on C-SKY. C code is as follows:
---
int e1, e2;

void func (int a, int b, int c, int d, int f, int g)
{
   e1 = a > b ? f : g;
   e2 = a > b ? c : d;

   return;
}
---

compile to assembler with option “-O3 -S” :
---
func:
   cmplt a1, a0
   ld.w  t1, (sp, 0)
   ld.w  t0, (sp, 4)
   movt  t0, t1
   cmplt a1, a0
   movt  a3, a2
   lrw a1, e2
   lrw a2, e1
   st.w  a3, (a1, 0)
   st.w  t0, (a2, 0)
   rts
---
There is an extra “cmplt a1, a0" in the above code without cse_cc. This 
situation mainly occurs when a relatively short branch jump is converted into a 
conditional execution instruction. And the CSE pass can not reduce the same 
conditional comparison instruction . Here is the rtx sequence after “cse2” pass.

---
(insn 28 13 29 2 (set (reg:CC 33 c)
 (gt:CC (reg/v:SI 77 [ a ])
 (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
  (nil))
(insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
 (if_then_else:SI (eq (reg:CC 33 c)
 (const_int 0 [0]))
 (reg/v:SI 82 [ g ])
 (reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
  (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
 (expr_list:REG_DEAD (reg:CC 33 c)
 (nil
(insn 30 29 31 2 (set (reg:CC 33 c)
 (gt:CC (reg/v:SI 77 [ a ])
 (reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
  (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
 (expr_list:REG_DEAD (reg/v:SI 77 [ a ])
 (nil
(insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
 (if_then_else:SI (eq (reg:CC 33 c)
 (const_int 0 [0]))
 (reg/v:SI 80 [ d ])
 (reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
  (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
 (expr_list:REG_DEAD (reg:CC 33 c)
 (nil
---

It doesn't seem to check the same conditional comparison instruction .So I 
wrote this to solve this problem, but I am not sure if this is the best way : )

PS, the same conditional comparison instruction cannot be reduced with the 
latest version gcc with C-SKY because I just insert the “cse_cc” after “cse1”, 
when I insert after “cse2”, this problem can be solved very well.


Thanks, this is very helpful.  I've verified this and I'm moving the 
pass as you suggest, adding a more detailed comment to the source to 
explain what the pass is for, and adding your test case to the 
testsuite.  This will be included when I resubmit the patches to address 
other review comments too.


Jeff, does that adequately address your concerns about whether the pass 
is useful?


-Sandra


Re: [2/5] C-SKY port: Backend implementation

2018-07-26 Thread 瞿仙淼


> 在 2018年7月25日,上午5:24,Jeff Law  写道:
> 
> On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
>> On 07/24/2018 09:45 AM, Jeff Law wrote:
>>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
>>> I'm not a big fan of more awk code, but I'm not going to object to it :-)
>>> 
>>> Why does the port have its own little pass for condition code
>>> optimization (cse_cc)?  What is it doing that can't be done with our
>>> generic optimizers?
>> 
>> This pass was included in the initial patch set we got from C-SKY, and
>> as it didn't seem to break anything I left it in.  Perhaps C-SKY can
>> provide a testcase that demonstrates why it's still useful in the
>> current version of GCC; otherwise we can remove this from the initial
>> port submission and restore it later if some performance analysis shows
>> it is still worthwhile.
> FWIW it looks like we model CC setting on just a few insns, (add,
> subtract) so I'd be surprised if this little mini pass found much.  I'd
> definitely like to hear from the csky authors here.
> 
> Alternately, you could do add some instrumentation to flag when it
> triggers, take a test or two that does, reduce it and we can then look
> at the key RTL sequences and see what the pass is really doing.
> 

I wrote a case to reproduce this problem on C-SKY. C code is as follows:
---
int e1, e2;

void func (int a, int b, int c, int d, int f, int g)
{
  e1 = a > b ? f : g;
  e2 = a > b ? c : d;

  return;
}
---

compile to assembler with option “-O3 -S” :
---
func:
  cmplt a1, a0
  ld.w  t1, (sp, 0)
  ld.w  t0, (sp, 4)
  movt  t0, t1
  cmplt a1, a0
  movt  a3, a2
  lrw a1, e2
  lrw a2, e1
  st.w  a3, (a1, 0)
  st.w  t0, (a2, 0)
  rts
---
There is an extra “cmplt a1, a0" in the above code without cse_cc. This 
situation mainly occurs when a relatively short branch jump is converted into a 
conditional execution instruction. And the CSE pass can not reduce the same 
conditional comparison instruction . Here is the rtx sequence after “cse2” pass.

---
(insn 28 13 29 2 (set (reg:CC 33 c)
(gt:CC (reg/v:SI 77 [ a ])
(reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
 (nil))
(insn 29 28 30 2 (set (reg/v:SI 82 [ g ])
(if_then_else:SI (eq (reg:CC 33 c)
(const_int 0 [0]))
(reg/v:SI 82 [ g ])
(reg/v:SI 81 [ f ]))) func.c:5 983 {movf}
 (expr_list:REG_DEAD (reg/v:SI 81 [ f ])
(expr_list:REG_DEAD (reg:CC 33 c)
(nil
(insn 30 29 31 2 (set (reg:CC 33 c)
(gt:CC (reg/v:SI 77 [ a ])
(reg/v:SI 78 [ b ]))) func.c:5 1099 {*cmpgtsi}
 (expr_list:REG_DEAD (reg/v:SI 78 [ b ])
(expr_list:REG_DEAD (reg/v:SI 77 [ a ])
(nil
(insn 31 30 18 2 (set (reg/v:SI 80 [ d ])
(if_then_else:SI (eq (reg:CC 33 c)
(const_int 0 [0]))
(reg/v:SI 80 [ d ])
(reg/v:SI 79 [ c ]))) func.c:5 983 {movf}
 (expr_list:REG_DEAD (reg/v:SI 79 [ c ])
(expr_list:REG_DEAD (reg:CC 33 c)
(nil
---

It doesn't seem to check the same conditional comparison instruction .So I 
wrote this to solve this problem, but I am not sure if this is the best way : )

PS, the same conditional comparison instruction cannot be reduced with the 
latest version gcc with C-SKY because I just insert the “cse_cc” after “cse1”, 
when I insert after “cse2”, this problem can be solved very well.


Thanks,
Xianmiao



Re: [2/5] C-SKY port: Backend implementation

2018-07-25 Thread Sandra Loosemore

On 07/25/2018 07:16 AM, Paul Koning wrote:


Non-executable stacks are a very good thing.

That said, I also looked at the target hook documentation and was
left without any clue whatsoever.  It sure isn't clear what powers of
two have to do with descriptors, or what descriptors have to do with
support for nested functions.

Can you suggest places to look to get an understanding of this
feature?  It sounds like the only other option is "Use the source,
Luke".  Any specific targets that make a good reference
implementation for this?


FYI, so far I have found PR ada/67205 and the original patch posting 
here, but it looks like "Use the source" is indeed where we are on this. 
 :-(


https://gcc.gnu.org/ml/gcc-patches/2016-06/msg02016.html

-Sandra


Re: [2/5] C-SKY port: Backend implementation

2018-07-25 Thread Paul Koning



> On Jul 25, 2018, at 12:50 AM, Jeff Law  wrote:
> 
 ...
>>> It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)
>>> ports that define it.
>> 
>> Hmmm, I completely failed to make that connection from the docs -- the
>> whole description of that hook is pretty gibberishy and I thought it was
>> something for targets where the ABI already specifies some "standard
>> calling sequence" using descriptors (C-SKY doesn't), rather than a
>> generic alternative to executable trampolines.  Putting on my doc
>> maintainer hat briefly, I can see this needs a lot of work.  :-(
> Most likely :-)  So many things to do, so little time.
> 
> 
>> 
>> Anyway, is this required for new ports nowadays?  If so, I at least know
>> what to search for now.  At this point I couldn't say whether this would
>> do anything to fix the situation on ck801 targets where there simply
>> aren't enough spare registers available to the trampoline to both hold
>> the static link and do an indirect jump.
> It's not required, but preferred, particularly if the part has an MMU
> that can provide no-execute protections on pages in memory.  If the
> target doesn't have an mmu, then it's of marginal value.
> 
> The key advantage it has over the old trampoline implementation is that
> stacks can remain non-executable, even for Ada and nested functions.
> That's a big win from a security standpoint.

Non-executable stacks are a very good thing.

That said, I also looked at the target hook documentation and was left without 
any clue whatsoever.  It sure isn't clear what powers of two have to do with 
descriptors, or what descriptors have to do with support for nested functions.

Can you suggest places to look to get an understanding of this feature?  It 
sounds like the only other option is "Use the source, Luke".  Any specific 
targets that make a good reference implementation for this?

paul




Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Jeff Law
On 07/24/2018 06:17 PM, Sandra Loosemore wrote:
> On 07/24/2018 03:24 PM, Jeff Law wrote:
>>>
 Any thoughts on using the newer function descriptor bits rather than
 old
 style stack trampolines?
>>>
>>> Has that been committed?  I vaguely remembered discussion of a new way
>>> to handle nested functions without using the trampoline interface, but I
>>> couldn't find any documentation in the internals manual.
>> It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)
>> ports that define it.
> 
> Hmmm, I completely failed to make that connection from the docs -- the
> whole description of that hook is pretty gibberishy and I thought it was
> something for targets where the ABI already specifies some "standard
> calling sequence" using descriptors (C-SKY doesn't), rather than a
> generic alternative to executable trampolines.  Putting on my doc
> maintainer hat briefly, I can see this needs a lot of work.  :-(
Most likely :-)  So many things to do, so little time.


> 
> Anyway, is this required for new ports nowadays?  If so, I at least know
> what to search for now.  At this point I couldn't say whether this would
> do anything to fix the situation on ck801 targets where there simply
> aren't enough spare registers available to the trampoline to both hold
> the static link and do an indirect jump.
It's not required, but preferred, particularly if the part has an MMU
that can provide no-execute protections on pages in memory.  If the
target doesn't have an mmu, then it's of marginal value.

The key advantage it has over the old trampoline implementation is that
stacks can remain non-executable, even for Ada and nested functions.
That's a big win from a security standpoint.

Jeff


Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Sandra Loosemore

On 07/24/2018 03:24 PM, Jeff Law wrote:



Any thoughts on using the newer function descriptor bits rather than old
style stack trampolines?


Has that been committed?  I vaguely remembered discussion of a new way
to handle nested functions without using the trampoline interface, but I
couldn't find any documentation in the internals manual.

It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)
ports that define it.


Hmmm, I completely failed to make that connection from the docs -- the 
whole description of that hook is pretty gibberishy and I thought it was 
something for targets where the ABI already specifies some "standard 
calling sequence" using descriptors (C-SKY doesn't), rather than a 
generic alternative to executable trampolines.  Putting on my doc 
maintainer hat briefly, I can see this needs a lot of work.  :-(


Anyway, is this required for new ports nowadays?  If so, I at least know 
what to search for now.  At this point I couldn't say whether this would 
do anything to fix the situation on ck801 targets where there simply 
aren't enough spare registers available to the trampoline to both hold 
the static link and do an indirect jump.


-Sandra


Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Jeff Law
On 07/24/2018 12:18 PM, Sandra Loosemore wrote:
> On 07/24/2018 09:45 AM, Jeff Law wrote:
>> On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
>>> 2018-07-23  Jojo  
>>>  Huibin Wang  
>>>  Sandra Loosemore  
>>>  Chung-Lin Tang  
>>>
>>>  C-SKY port: Backend implementation
>>>
>>>  gcc/
>>>  * config/csky/*: New.
>>>  * common/config/csky/*: New.
>>
>> Let's avoid gratutious whitespace that attempts to line up conditionals.
>>    As an example, look at the predicate csky_load_multiple_operation.  I
>> think just doing a quick pass over the .c, .h and main .md files should
>> be sufficient here.
> 
> OK, will do.
> 
>> I'm not a big fan of more awk code, but I'm not going to object to it :-)
>>
>> Why does the port have its own little pass for condition code
>> optimization (cse_cc)?  What is it doing that can't be done with our
>> generic optimizers?
> 
> This pass was included in the initial patch set we got from C-SKY, and
> as it didn't seem to break anything I left it in.  Perhaps C-SKY can
> provide a testcase that demonstrates why it's still useful in the
> current version of GCC; otherwise we can remove this from the initial
> port submission and restore it later if some performance analysis shows
> it is still worthwhile.
FWIW it looks like we model CC setting on just a few insns, (add,
subtract) so I'd be surprised if this little mini pass found much.  I'd
definitely like to hear from the csky authors here.

Alternately, you could do add some instrumentation to flag when it
triggers, take a test or two that does, reduce it and we can then look
at the key RTL sequences and see what the pass is really doing.

> 
>> Any thoughts on using the newer function descriptor bits rather than old
>> style stack trampolines?
> 
> Has that been committed?  I vaguely remembered discussion of a new way
> to handle nested functions without using the trampoline interface, but I
> couldn't find any documentation in the internals manual.
It did.  See TARGET_CUSTOM_FUNCTION_DESCRIPTORS and the (relatively few)
ports that define it.



> 
>> I don't see anything terribly concerning in the core of the port.  The
>> amount of support code for minipool is huge and I wonder if some sharing
>> across the various ports would be possible, but I don't think that
>> should be a blocking issue for this port.
> 
> Yes, that code was clearly copied almost verbatim from the ARM backend.
> I left it alone as much as possible to simplify any future attempts at
> genericizing it.
Understood -- I'd assumed it was largely copied from ARM, but hadn't
gone back to the ARM bits to verify.

> 
>> Can you update the backends.html web page here appropriately for the
>> c-sky target?
> 
> Sure, I can take care of updating that when the port is committed.  I
> believe the right entry is
> 
> "csky  b   ia"
Yea, that seems right to me.

Jeff


Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Sandra Loosemore

On 07/24/2018 09:45 AM, Jeff Law wrote:

On 07/23/2018 10:21 PM, Sandra Loosemore wrote:

2018-07-23  Jojo  
     Huibin Wang  
     Sandra Loosemore  
     Chung-Lin Tang  

     C-SKY port: Backend implementation

     gcc/
     * config/csky/*: New.
     * common/config/csky/*: New.


Let's avoid gratutious whitespace that attempts to line up conditionals.
   As an example, look at the predicate csky_load_multiple_operation.  I
think just doing a quick pass over the .c, .h and main .md files should
be sufficient here.


OK, will do.


I'm not a big fan of more awk code, but I'm not going to object to it :-)

Why does the port have its own little pass for condition code
optimization (cse_cc)?  What is it doing that can't be done with our
generic optimizers?


This pass was included in the initial patch set we got from C-SKY, and 
as it didn't seem to break anything I left it in.  Perhaps C-SKY can 
provide a testcase that demonstrates why it's still useful in the 
current version of GCC; otherwise we can remove this from the initial 
port submission and restore it later if some performance analysis shows 
it is still worthwhile.



Any thoughts on using the newer function descriptor bits rather than old
style stack trampolines?


Has that been committed?  I vaguely remembered discussion of a new way 
to handle nested functions without using the trampoline interface, but I 
couldn't find any documentation in the internals manual.



I don't see anything terribly concerning in the core of the port.  The
amount of support code for minipool is huge and I wonder if some sharing
across the various ports would be possible, but I don't think that
should be a blocking issue for this port.


Yes, that code was clearly copied almost verbatim from the ARM backend. 
I left it alone as much as possible to simplify any future attempts at 
genericizing it.



Can you update the backends.html web page here appropriately for the
c-sky target?


Sure, I can take care of updating that when the port is committed.  I 
believe the right entry is


"csky  b   ia"


I'd like to take a closer look, but those are the high level comment's
I've got this morning :-)


Thanks.  I'll wait a bit for more comments to come in before preparing a 
revised patch.


-Sandra


Re: [2/5] C-SKY port: Backend implementation

2018-07-24 Thread Jeff Law
On 07/23/2018 10:21 PM, Sandra Loosemore wrote:
> 2018-07-23  Jojo  
>     Huibin Wang  
>     Sandra Loosemore  
>     Chung-Lin Tang  
> 
>     C-SKY port: Backend implementation
> 
>     gcc/
>     * config/csky/*: New.
>     * common/config/csky/*: New.

Let's avoid gratutious whitespace that attempts to line up conditionals.
  As an example, look at the predicate csky_load_multiple_operation.  I
think just doing a quick pass over the .c, .h and main .md files should
be sufficient here.

I'm not a big fan of more awk code, but I'm not going to object to it :-)

Why does the port have its own little pass for condition code
optimization (cse_cc)?  What is it doing that can't be done with our
generic optimizers?

Any thoughts on using the newer function descriptor bits rather than old
style stack trampolines?

I don't see anything terribly concerning in the core of the port.  The
amount of support code for minipool is huge and I wonder if some sharing
across the various ports would be possible, but I don't think that
should be a blocking issue for this port.


Can you update the backends.html web page here appropriately for the
c-sky target?

I'd like to take a closer look, but those are the high level comment's
I've got this morning :-)


Jeff


[2/5] C-SKY port: Backend implementation

2018-07-23 Thread Sandra Loosemore

2018-07-23  Jojo  
Huibin Wang  
Sandra Loosemore  
Chung-Lin Tang  

C-SKY port: Backend implementation

gcc/
* config/csky/*: New.
* common/config/csky/*: New.


csky-gcc-2.patch.gz
Description: application/gzip