Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-09-30 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
> Hi, Richard,
>
> At the same time testing aarch64, I also tested the default implementation on 
> rs6000 target. 
>
> The default implementation now is:
>
> +/* The default hook for TARGET_ZERO_CALL_USED_REGS.  */
> +
> +HARD_REG_SET
> +default_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
> +{
> +  gcc_assert (!hard_reg_set_empty_p (need_zeroed_hardregs));
> +
> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
> +  {
> +   machine_mode mode = reg_raw_mode[regno];
> +   rtx reg = gen_rtx_REG (mode, regno);
> +   emit_move_insn (reg, const0_rtx);

This should just be:

rtx zero = CONST0_RTX (reg_raw_mode[regno]);
emit_move_insn (regno_reg_rtx[regno], zero);

> +  }
> +  return need_zeroed_hardregs;
> +}
> +
>
> With the small testing case:
> int
> test ()
> {
>   return 1;
> }
>
> If I compiled it with 
>
> /home/qinzhao/Install/latest/bin/gcc -O2 -fzero-call-used-regs=all-arg t.c
>
> It will failed as:
>
> t.c: In function ‘test’:
> t.c:6:1: error: insn does not satisfy its constraints:
> 6 | }
>   | ^
> (insn 28 27 29 (set (reg:DI 33 1)
> (const_int 0 [0])) "t.c":6:1 647 {*movdi_internal64}
>  (nil))
> during RTL pass: shorten
> dump file: t.c.319r.shorten
> t.c:6:1: internal compiler error: in extract_constrain_insn_cached, at 
> recog.c:2207
> 0x1018d693 _fatal_insn(char const*, rtx_def const*, char const*, int, char 
> const*)
>   ../../latest-gcc-x86/gcc/rtl-error.c:108
> 0x1018d6e7 _fatal_insn_not_found(rtx_def const*, char const*, int, char 
> const*)
>   ../../latest-gcc-x86/gcc/rtl-error.c:118
> 0x1099a82b extract_constrain_insn_cached(rtx_insn*)
>   ../../latest-gcc-x86/gcc/recog.c:2207
> 0x11393917 insn_min_length(rtx_insn*)
>   ../../latest-gcc-x86/gcc/config/rs6000/rs6000.md:721
> 0x105bece3 shorten_branches(rtx_insn*)
>   ../../latest-gcc-x86/gcc/final.c:1118
>
>
> As I checked, when the FP registers are zeroed, the above failure happened.
>
> I suspect that the issue still relate to the following statement:
>
> machine_mode mode = reg_raw_mode[regno];
>
> As I checked, the reg_raw_mode always return the integer mode that can be 
> hold by the hard registers, even though it’s FP register.

Well, more precisely: it's the largest mode that the target allows the
registers to hold.  If there are multiple candidate modes of the same
size, the integer one wins, like you say.  But the point is that DI only
wins over DF because the target allows both DI and DF to be stored in
the register, and therefore supports both DI and DF moves for that
register.

So I don't think the mode is the issue.  Integer zero and floating-point
zero have the same bit representation after all.

AIUI, without VSX, Power needs to load the zero from the constant pool.

> So, I still wondering:
>
> 1. Is there another available utility routine that returns the proper MODE 
> for the hard registers that can be readily used to zero the hard register?
> 2. If not, should I add one more target hook for this purpose? i.e 
>
> /* Return the proper machine mode that can be used to zero this hard register 
> specified by REGNO.  */
> machine_mode zero-call-used-regs-mode (unsigned int REGNO)
>
> 3. Or should I just delete the default implemeantion, and let the target to 
> implement it.

IMO no.  This goes back to what we discussed earlier.  It isn't the
case that a default target hook has to be correct for all targets,
with targets only overriding them as an optimisation.  The default
versions of many hooks and macros are not conservatively correct.
They are just reaonable default assumptions.  And IMO that's true
of the hook above too.

The way to flush out whether a target needs to override the hook
is to add tests that run on all targets.

That said, one way of erring on the side of caution from an ICE
perspective would be to do something like:

rtx_insn *last_insn = get_last_insn ();
rtx zero = CONST0_RTX (reg_raw_mode[regno]);
rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
if (!valid_insn_p (insn))
  {
delete_insns_since (last_insn);
...remove regno from the set of cleared registers...;
  }

where valid_insn_p abstracts out this code from ira.c:

  recog_memoized (move_insn);
  if (INSN_CODE (move_insn) < 0)
continue;
  extract_insn (move_insn);
  /* We don't know whether the move will be in code that is optimized
 for size or speed, so consider all enabled alternatives.  */
  if (! constrain_operands (1, get_enabled_alternatives (move_insn)))
continue;

(but keeping the comment where it is).  The default behaviour would then
be to drop any register that can't be zeroed easily.

Doing this would make the default hook usable for more targets.
The question is whether dropping registers that can't be zeroed

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-09-30 Thread Qing Zhao via Gcc-patches



> On Sep 30, 2020, at 4:21 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao mailto:qing.z...@oracle.com>> writes:
>> Hi, Richard,
>> 
>> At the same time testing aarch64, I also tested the default implementation 
>> on rs6000 target. 
>> 
>> The default implementation now is:
>> 
>> +/* The default hook for TARGET_ZERO_CALL_USED_REGS.  */
>> +
>> +HARD_REG_SET
>> +default_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
>> +{
>> +  gcc_assert (!hard_reg_set_empty_p (need_zeroed_hardregs));
>> +
>> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
>> +if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
>> +  {
>> +   machine_mode mode = reg_raw_mode[regno];
>> +   rtx reg = gen_rtx_REG (mode, regno);
>> +   emit_move_insn (reg, const0_rtx);
> 
> This should just be:
> 
>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>   emit_move_insn (regno_reg_rtx[regno], zero);

Okay. Will update the code.

> 
>> +  }
>> +  return need_zeroed_hardregs;
>> +}
>> +
>> 
>> With the small testing case:
>> int
>> test ()
>> {
>>  return 1;
>> }
>> 
>> If I compiled it with 
>> 
>> /home/qinzhao/Install/latest/bin/gcc -O2 -fzero-call-used-regs=all-arg t.c
>> 
>> It will failed as:
>> 
>> t.c: In function ‘test’:
>> t.c:6:1: error: insn does not satisfy its constraints:
>>6 | }
>>  | ^
>> (insn 28 27 29 (set (reg:DI 33 1)
>>(const_int 0 [0])) "t.c":6:1 647 {*movdi_internal64}
>> (nil))
>> during RTL pass: shorten
>> dump file: t.c.319r.shorten
>> t.c:6:1: internal compiler error: in extract_constrain_insn_cached, at 
>> recog.c:2207
>> 0x1018d693 _fatal_insn(char const*, rtx_def const*, char const*, int, char 
>> const*)
>>  ../../latest-gcc-x86/gcc/rtl-error.c:108
>> 0x1018d6e7 _fatal_insn_not_found(rtx_def const*, char const*, int, char 
>> const*)
>>  ../../latest-gcc-x86/gcc/rtl-error.c:118
>> 0x1099a82b extract_constrain_insn_cached(rtx_insn*)
>>  ../../latest-gcc-x86/gcc/recog.c:2207
>> 0x11393917 insn_min_length(rtx_insn*)
>>  ../../latest-gcc-x86/gcc/config/rs6000/rs6000.md:721
>> 0x105bece3 shorten_branches(rtx_insn*)
>>  ../../latest-gcc-x86/gcc/final.c:1118
>> 
>> 
>> As I checked, when the FP registers are zeroed, the above failure happened.
>> 
>> I suspect that the issue still relate to the following statement:
>> 
>> machine_mode mode = reg_raw_mode[regno];
>> 
>> As I checked, the reg_raw_mode always return the integer mode that can be 
>> hold by the hard registers, even though it’s FP register.
> 
> Well, more precisely: it's the largest mode that the target allows the
> registers to hold.  If there are multiple candidate modes of the same
> size, the integer one wins, like you say.  But the point is that DI only
> wins over DF because the target allows both DI and DF to be stored in
> the register, and therefore supports both DI and DF moves for that
> register.
> 
> So I don't think the mode is the issue.  Integer zero and floating-point
> zero have the same bit representation after all.

theoritically  yes. 
However, as we have noticed in Aarch64, the integer TI move has not been 
supported before your fix today. As a result, the Ti move have to be splitted.
With your fix today on aarch64,  Yes, the default implementation works well for 
those vector registers. Thanks a lot.

Potentially there will be other targets that have the same issue. Then those 
targets need to fix those issues too in order to make the default 
implementation work.

> 
> AIUI, without VSX, Power needs to load the zero from the constant pool.
> 
>> So, I still wondering:
>> 
>> 1. Is there another available utility routine that returns the proper MODE 
>> for the hard registers that can be readily used to zero the hard register?
>> 2. If not, should I add one more target hook for this purpose? i.e 
>> 
>> /* Return the proper machine mode that can be used to zero this hard 
>> register specified by REGNO.  */
>> machine_mode zero-call-used-regs-mode (unsigned int REGNO)
>> 
>> 3. Or should I just delete the default implemeantion, and let the target to 
>> implement it.
> 
> IMO no.  This goes back to what we discussed earlier.  It isn't the
> case that a default target hook has to be correct for all targets,
> with targets only overriding them as an optimisation.  The default
> versions of many hooks and macros are not conservatively correct.
> They are just reaonable default assumptions.  And IMO that's true
> of the hook above too.
> 
> The way to flush out whether a target needs to override the hook
> is to add tests that run on all targets.
I planned to add these new test cases, so currently I have been testing the 
simple testing cases on aarch64 and rs6000 to see any issue 
With the default implementation. So far, I have found these issues with the 
very simple testing cases.

For me, at most I can test aarch64 and rs6000 targets for some small testing 
cases for checking correctness.
> 
> That said, one way of erring on the 

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-09-30 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
>>> +  }
>>> +  return need_zeroed_hardregs;
>>> +}
>>> +
>>> 
>>> With the small testing case:
>>> int
>>> test ()
>>> {
>>>  return 1;
>>> }
>>> 
>>> If I compiled it with 
>>> 
>>> /home/qinzhao/Install/latest/bin/gcc -O2 -fzero-call-used-regs=all-arg t.c
>>> 
>>> It will failed as:
>>> 
>>> t.c: In function ‘test’:
>>> t.c:6:1: error: insn does not satisfy its constraints:
>>>6 | }
>>>  | ^
>>> (insn 28 27 29 (set (reg:DI 33 1)
>>>(const_int 0 [0])) "t.c":6:1 647 {*movdi_internal64}
>>> (nil))
>>> during RTL pass: shorten
>>> dump file: t.c.319r.shorten
>>> t.c:6:1: internal compiler error: in extract_constrain_insn_cached, at 
>>> recog.c:2207
>>> 0x1018d693 _fatal_insn(char const*, rtx_def const*, char const*, int, char 
>>> const*)
>>> ../../latest-gcc-x86/gcc/rtl-error.c:108
>>> 0x1018d6e7 _fatal_insn_not_found(rtx_def const*, char const*, int, char 
>>> const*)
>>> ../../latest-gcc-x86/gcc/rtl-error.c:118
>>> 0x1099a82b extract_constrain_insn_cached(rtx_insn*)
>>> ../../latest-gcc-x86/gcc/recog.c:2207
>>> 0x11393917 insn_min_length(rtx_insn*)
>>> ../../latest-gcc-x86/gcc/config/rs6000/rs6000.md:721
>>> 0x105bece3 shorten_branches(rtx_insn*)
>>> ../../latest-gcc-x86/gcc/final.c:1118
>>> 
>>> 
>>> As I checked, when the FP registers are zeroed, the above failure happened.
>>> 
>>> I suspect that the issue still relate to the following statement:
>>> 
>>> machine_mode mode = reg_raw_mode[regno];
>>> 
>>> As I checked, the reg_raw_mode always return the integer mode that can be 
>>> hold by the hard registers, even though it’s FP register.
>> 
>> Well, more precisely: it's the largest mode that the target allows the
>> registers to hold.  If there are multiple candidate modes of the same
>> size, the integer one wins, like you say.  But the point is that DI only
>> wins over DF because the target allows both DI and DF to be stored in
>> the register, and therefore supports both DI and DF moves for that
>> register.
>> 
>> So I don't think the mode is the issue.  Integer zero and floating-point
>> zero have the same bit representation after all.
>
> theoritically  yes. 
> However, as we have noticed in Aarch64, the integer TI move has not been 
> supported before your fix today. As a result, the Ti move have to be splitted.
> With your fix today on aarch64,  Yes, the default implementation works well 
> for those vector registers. Thanks a lot.
>
> Potentially there will be other targets that have the same issue. Then those 
> targets need to fix those issues too in order to make the default 
> implementation work.

Right.  But that's not a bad thing.

My point above was that what you describe was not the issue for Power.
AIUI the issue there was…

>> AIUI, without VSX, Power needs to load the zero from the constant pool.

…this instead.

>>> So, I still wondering:
>>> 
>>> 1. Is there another available utility routine that returns the proper MODE 
>>> for the hard registers that can be readily used to zero the hard register?
>>> 2. If not, should I add one more target hook for this purpose? i.e 
>>> 
>>> /* Return the proper machine mode that can be used to zero this hard 
>>> register specified by REGNO.  */
>>> machine_mode zero-call-used-regs-mode (unsigned int REGNO)
>>> 
>>> 3. Or should I just delete the default implemeantion, and let the target to 
>>> implement it.
>> 
>> IMO no.  This goes back to what we discussed earlier.  It isn't the
>> case that a default target hook has to be correct for all targets,
>> with targets only overriding them as an optimisation.  The default
>> versions of many hooks and macros are not conservatively correct.
>> They are just reaonable default assumptions.  And IMO that's true
>> of the hook above too.
>> 
>> The way to flush out whether a target needs to override the hook
>> is to add tests that run on all targets.
> I planned to add these new test cases, so currently I have been testing the 
> simple testing cases on aarch64 and rs6000 to see any issue 
> With the default implementation. So far, I have found these issues with the 
> very simple testing cases.
>
> For me, at most I can test aarch64 and rs6000 targets for some small testing 
> cases for checking correctness.

Thanks for testing other targets.  But I don't think that invalidates
what I said above.  It might be that some of the targets you pick to
test are ones that can't use the default implementation (or at least,
not in all circumstances).  At that point, hopefully target maintainers
will step in to help.

>> That said, one way of erring on the side of caution from an ICE
>> perspective would be to do something like:
>> 
>>rtx_insn *last_insn = get_last_insn ();
>>rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>>rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>>if (!valid_insn_p (insn))
>>  {
>>delete_insns_since (last_insn);
>>...remove regno from the set of cleared regist

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-09-30 Thread Qing Zhao via Gcc-patches



> On Sep 30, 2020, at 11:25 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
 
 As I checked, when the FP registers are zeroed, the above failure happened.
 
 I suspect that the issue still relate to the following statement:
 
 machine_mode mode = reg_raw_mode[regno];
 
 As I checked, the reg_raw_mode always return the integer mode that can be 
 hold by the hard registers, even though it’s FP register.
>>> 
>>> Well, more precisely: it's the largest mode that the target allows the
>>> registers to hold.  If there are multiple candidate modes of the same
>>> size, the integer one wins, like you say.  But the point is that DI only
>>> wins over DF because the target allows both DI and DF to be stored in
>>> the register, and therefore supports both DI and DF moves for that
>>> register.
>>> 
>>> So I don't think the mode is the issue.  Integer zero and floating-point
>>> zero have the same bit representation after all.
>> 
>> theoritically  yes. 
>> However, as we have noticed in Aarch64, the integer TI move has not been 
>> supported before your fix today. As a result, the Ti move have to be 
>> splitted.
>> With your fix today on aarch64,  Yes, the default implementation works well 
>> for those vector registers. Thanks a lot.
>> 
>> Potentially there will be other targets that have the same issue. Then those 
>> targets need to fix those issues too in order to make the default 
>> implementation work.
> 
> Right.  But that's not a bad thing.
> 
> My point above was that what you describe was not the issue for Power.
> AIUI the issue there was…
> 
>>> AIUI, without VSX, Power needs to load the zero from the constant pool.
> 
> …this instead.
> 
 So, I still wondering:
 
 1. Is there another available utility routine that returns the proper MODE 
 for the hard registers that can be readily used to zero the hard register?
 2. If not, should I add one more target hook for this purpose? i.e 
 
 /* Return the proper machine mode that can be used to zero this hard 
 register specified by REGNO.  */
 machine_mode zero-call-used-regs-mode (unsigned int REGNO)
 
 3. Or should I just delete the default implemeantion, and let the target 
 to implement it.
>>> 
>>> IMO no.  This goes back to what we discussed earlier.  It isn't the
>>> case that a default target hook has to be correct for all targets,
>>> with targets only overriding them as an optimisation.  The default
>>> versions of many hooks and macros are not conservatively correct.
>>> They are just reaonable default assumptions.  And IMO that's true
>>> of the hook above too.
>>> 
>>> The way to flush out whether a target needs to override the hook
>>> is to add tests that run on all targets.
>> I planned to add these new test cases, so currently I have been testing the 
>> simple testing cases on aarch64 and rs6000 to see any issue 
>> With the default implementation. So far, I have found these issues with the 
>> very simple testing cases.
>> 
>> For me, at most I can test aarch64 and rs6000 targets for some small testing 
>> cases for checking correctness.
> 
> Thanks for testing other targets.  But I don't think that invalidates
> what I said above.  It might be that some of the targets you pick to
> test are ones that can't use the default implementation (or at least,
> not in all circumstances).  At that point, hopefully target maintainers
> will step in to help.
> 
>>> That said, one way of erring on the side of caution from an ICE
>>> perspective would be to do something like:
>>> 
>>>   rtx_insn *last_insn = get_last_insn ();
>>>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>>>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>>>   if (!valid_insn_p (insn))
>>> {
>>>   delete_insns_since (last_insn);
>>>   ...remove regno from the set of cleared registers...;
>>> }
>>> 
>>> where valid_insn_p abstracts out this code from ira.c:
>>> 
>>>   recog_memoized (move_insn);
>>>   if (INSN_CODE (move_insn) < 0)
>>> continue;
>>>   extract_insn (move_insn);
>>>   /* We don't know whether the move will be in code that is optimized
>>>  for size or speed, so consider all enabled alternatives.  */
>>>   if (! constrain_operands (1, get_enabled_alternatives (move_insn)))
>>> continue;
>>> 
>>> (but keeping the comment where it is).  The default behaviour would then
>>> be to drop any register that can't be zeroed easily.
>> 
>> I will check whether the above fix the ICE on rs6000.

Yes, it fixes the ICE on rs6000.
However, now with this check, ONLY integer registers on rs6000 can be zeroed.  
All other call_used registers are excluded from ALL. 

This doesn’t look like the correct fix to me. 
>>> 
>>> Doing this would make the default hook usable for more targets.
>>> The question is whether dropping registers that can't be zeroed
>>> easily is acceptable as a default policy for a secur

Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-01 Thread Qing Zhao via Gcc-patches
Hi, Richard,

To answer the question, which registers should be included in “ALL”. 
I studied X86 hard register set in more details. And also consulted with 
H.J.Lu, And found:

In the current x86 implementation, mask registers, MM0-MM7 registers, and 
ST0-ST7 registers are not zeroed.

The reasons actually are:
1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the following 
place, reginfo.c, init_reg_sets_1)

  /* If a register is too limited to be treated as a register operand,
 then it should never be allocated to a pseudo.  */
  if (!TEST_HARD_REG_BIT (operand_reg_set, i))
fixed_regs[i] = 1;

2. MM0-MM7 registers and ST0-ST7 registers are aliased with each other, (i.e, 
the same set of registers have two different
Names), so, zero individual mm or st register will be very impractical. 
However, we can zero them together with emms. 

So, my conclusion is, 

1. For “ALL”, we should include all call_used_regs that are not fixed_regs. 
2. For X86 implementation, I added more comments, and also add clearing all mm 
and st registers with emms.

In general, “ALL” should include all call_used_regs that are not fixed_regs. 

thanks.

Qing

> 
> If we have common, written-down ground rules for deciding which kinds
> of register should be included and which shouldn't, it will be easier
> to apply those rules to a given target.  And it will be easier to decide
> on the right behaviour for the default implementation of the hook.
> 
> It feels like the pushback against defining a default implementation
> of the hook is also a pushback against being specific about how targets
> are supposed to select which kinds of register need to be zeroed.
> 
> Thanks,
> Richard



Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-01 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
> Hi, Richard,
>
> To answer the question, which registers should be included in “ALL”. 
> I studied X86 hard register set in more details. And also consulted with 
> H.J.Lu, And found:
>
> In the current x86 implementation, mask registers, MM0-MM7 registers, and 
> ST0-ST7 registers are not zeroed.
>
> The reasons actually are:
> 1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the 
> following place, reginfo.c, init_reg_sets_1)
>
>   /* If a register is too limited to be treated as a register operand,
>  then it should never be allocated to a pseudo.  */
>   if (!TEST_HARD_REG_BIT (operand_reg_set, i))
> fixed_regs[i] = 1;

But isn't that only true when AVX512F is disabled?

The question is more why the registers shouldn't be zeroed when
they're available.

> 2. MM0-MM7 registers and ST0-ST7 registers are aliased with each other, (i.e, 
> the same set of registers have two different
> Names), so, zero individual mm or st register will be very impractical. 
> However, we can zero them together with emms. 

Ah, OK.

> So, my conclusion is, 
>
> 1. For “ALL”, we should include all call_used_regs that are not fixed_regs. 
> 2. For X86 implementation, I added more comments, and also add clearing all 
> mm and st registers with emms.
>
> In general, “ALL” should include all call_used_regs that are not fixed_regs. 

Right.  I thought the original implementation already excluded fixed
registers, but perhaps I'm misremembering.  I agree that that's the
sensible default behaviour.

Going back to the default hook, I guess one option is:

   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
   if (!valid_insn_p (insn))
 sorry (…);

but with some mechanism to avoid spewing the user with messages
for the same problem.

Thanks,
Richard


Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches



> On Oct 1, 2020, at 11:20 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> Hi, Richard,
>> 
>> To answer the question, which registers should be included in “ALL”. 
>> I studied X86 hard register set in more details. And also consulted with 
>> H.J.Lu, And found:
>> 
>> In the current x86 implementation, mask registers, MM0-MM7 registers, and 
>> ST0-ST7 registers are not zeroed.
>> 
>> The reasons actually are:
>> 1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the 
>> following place, reginfo.c, init_reg_sets_1)
>> 
>>  /* If a register is too limited to be treated as a register operand,
>> then it should never be allocated to a pseudo.  */
>>  if (!TEST_HARD_REG_BIT (operand_reg_set, i))
>>fixed_regs[i] = 1;
> 
> But isn't that only true when AVX512F is disabled?
> 
> The question is more why the registers shouldn't be zeroed when
> they're available.

If they are not treated as fixed, then we should zeroed them.

> 
>> 2. MM0-MM7 registers and ST0-ST7 registers are aliased with each other, 
>> (i.e, the same set of registers have two different
>> Names), so, zero individual mm or st register will be very impractical. 
>> However, we can zero them together with emms. 
> 
> Ah, OK.
> 
>> So, my conclusion is, 
>> 
>> 1. For “ALL”, we should include all call_used_regs that are not fixed_regs. 
>> 2. For X86 implementation, I added more comments, and also add clearing all 
>> mm and st registers with emms.
>> 
>> In general, “ALL” should include all call_used_regs that are not fixed_regs. 
> 
> Right.  I thought the original implementation already excluded fixed
> registers, but perhaps I'm misremembering.

Yes, my current implementation in middle end already excluded fixed registers.

>  I agree that that's the
> sensible default behaviour.
> 
> Going back to the default hook, I guess one option is:
> 
>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>   if (!valid_insn_p (insn))
> sorry (…);

The valid_insn_p approach will exclude all FP registers for rs6000 as I 
mentioned in the previous email, I don’t think that this is the right fix to 
the rs6000 ICE. 
I’d rather leave this problem there, and hopefully rs6000 developer can 
override this issue very soon?
> 
> but with some mechanism to avoid spewing the user with messages
> for the same problem.
> 
> Thanks,
> Richard



Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches



> 
> Going back to the default hook, I guess one option is:
> 
>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>   if (!valid_insn_p (insn))
> sorry (…);

“Sorry” here will tell the user that the implementation on this platform is not 
valid?

Qing
> 
> but with some mechanism to avoid spewing the user with messages
> for the same problem.
> 
> Thanks,
> Richard



Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Richard Sandiford via Gcc-patches
Qing Zhao  writes:
>> 
>> Going back to the default hook, I guess one option is:
>> 
>>   rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>>   rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>>   if (!valid_insn_p (insn))
>> sorry (…);
>
> “Sorry” here will tell the user that the implementation on this platform is 
> not valid?

Right.  If we didn't have a default implementation of the target hook,
we would presumably need to issue a sorry () if the user tried to use
the option on a target that didn't define the hook.  The above is a
compromise: we instead make a reasonable attempt to handle the option,
and issue the sorry only if that attempt fails.

Thanks,
Richard

>
> Qing
>> 
>> but with some mechanism to avoid spewing the user with messages
>> for the same problem.
>> 
>> Thanks,
>> Richard


Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches



> On Oct 2, 2020, at 10:15 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>>> 
>>> Going back to the default hook, I guess one option is:
>>> 
>>>  rtx zero = CONST0_RTX (reg_raw_mode[regno]);
>>>  rtx_insn *insn = emit_insn (gen_rtx_SET (regno_reg_rtx[regno], zero));
>>>  if (!valid_insn_p (insn))
>>>sorry (…);
>> 
>> “Sorry” here will tell the user that the implementation on this platform is 
>> not valid?
> 
> Right.  If we didn't have a default implementation of the target hook,
> we would presumably need to issue a sorry () if the user tried to use
> the option on a target that didn't define the hook.  The above is a
> compromise: we instead make a reasonable attempt to handle the option,
> and issue the sorry only if that attempt fails.

Sounds reasonable, I will do this.

thanks.

Qing
> 
> Thanks,
> Richard
> 
>> 
>> Qing
>>> 
>>> but with some mechanism to avoid spewing the user with messages
>>> for the same problem.
>>> 
>>> Thanks,
>>> Richard



Re: Another issue on RS6000 target. Re: One issue with default implementation of zero_call_used_regs

2020-10-02 Thread Qing Zhao via Gcc-patches



> On Oct 1, 2020, at 11:20 AM, Richard Sandiford  
> wrote:
> 
> Qing Zhao  writes:
>> Hi, Richard,
>> 
>> To answer the question, which registers should be included in “ALL”. 
>> I studied X86 hard register set in more details. And also consulted with 
>> H.J.Lu, And found:
>> 
>> In the current x86 implementation, mask registers, MM0-MM7 registers, and 
>> ST0-ST7 registers are not zeroed.
>> 
>> The reasons actually are:
>> 1. Mask registers are marked as “FIXED_REGS” by middle end,  (in the 
>> following place, reginfo.c, init_reg_sets_1)
>> 
>>  /* If a register is too limited to be treated as a register operand,
>> then it should never be allocated to a pseudo.  */
>>  if (!TEST_HARD_REG_BIT (operand_reg_set, i))
>>fixed_regs[i] = 1;
> 
> But isn't that only true when AVX512F is disabled?

You are right.

Yes, when AVX512F is present, mask registers are not fixed register anymore.

I just added the zeroing of mask registers into i386 implementation and also 
the testing case.

Thanks.

Qing
> 
> The question is more why the registers shouldn't be zeroed when
> they're available.
>