
Re: Reply: [PING] [PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32.

2022-11-28 Thread Palmer Dabbelt

On Mon, 28 Nov 2022 11:15:01 PST (-0800), gcc-patches@gcc.gnu.org wrote:
>
>
> On 11/24/22 00:43, Sinan wrote:
>>>> The motivation of this patch is to correct the wrong estimation of
>>>> the number of instructions needed for loading a 64bit constant in
>>>> rv32 in the current cost model(riscv_integer_cost). According to
>>>> the current implementation, if a constant requires more than 3
>>>> instructions(riscv_const_insn and riscv_legitimate_constant_p),
>>>> then the constant will be put into constant pool when expanding
>>>> gimple to rtl(legitimate_constant_p hook and emit_move_insn).
>>>> So the inaccurate cost model leads to the suboptimal codegen
>>>> in rv32 and the wrong estimation part could be corrected through
>>>> this fix.

>>>> e.g. the current codegen for loading 0x839290001 in rv32
>>>>
>>>>     lui     a5,%hi(.LC0)
>>>>     lw      a0,%lo(.LC0)(a5)
>>>>     lw      a1,%lo(.LC0+4)(a5)
>>>> .LC0:
>>>>     .word   958988289
>>>>     .word   8
>>>>
>>>> output after this patch
>>>>
>>>>     li a0,958988288
>>>>     addi a0,a0,1
>>>>     li a1,8

>>>> gcc/ChangeLog:
>>>>
>>>>         * config/riscv/riscv.cc (riscv_build_integer): Handle the case of
>>>>         loading 64bit constant in rv32.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>         * gcc.target/riscv/rv32-load-64bit-constant.c: New test.
>>>>
>>>> Signed-off-by: Lin Sinan
>>>> ---
>>>>   gcc/config/riscv/riscv.cc                     | 23 +++
>>>>   .../riscv/rv32-load-64bit-constant.c          | 38 +++
>>>>   2 files changed, 61 insertions(+)
>>>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c

>>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>>>> index 32f9ef9ade9..9dffabdc5e3 100644
>>>> --- a/gcc/config/riscv/riscv.cc
>>>> +++ b/gcc/config/riscv/riscv.cc
>>>> @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value,
>>>>         }
>>>>      }
>>>>
>>>> +  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
>>>
>>> Nit.  It's common practice to have the TARGET test first in a series of
>>> tests.  It may also be advisable to break this into two lines.
>>> Something like this:
>>>
>>>
>>>   if ((!TARGET_64BIT)
>>>       && (value > INT32_MAX || value < INT32_MIN))
>>>
>>>
>>> That's the style most GCC folks are more accustomed to reading.
>>
>> Thanks for the tips and I will change it then.
>>
>>>> +    {
>>>> +      unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
>>>> +      unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
>>>> +      struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
>>>> +                             hicode[RISCV_MAX_INTEGER_OPS];
>>>> +      int hi_cost, lo_cost;
>>>> +
>>>> +      hi_cost = riscv_build_integer_1 (hicode, hival, mode);
>>>> +      if (hi_cost < cost)
>>>> +        {
>>>> +          lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
>>>> +          if (lo_cost + hi_cost < cost)
>>>
>>> Just so I'm sure.  "cost" here refers strictly to other synthesized
>>> forms? If so, then ISTM that we'd want to generate the new style when
>>> lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading
>>> the constant from memory -- which is almost certainly more than "3"
>>> since the sequence from memory will be at least 3 instructions, two of
>>> which will hit memory.
>>>
>>>
>>> Jeff
>>>
>>
>> Yes, almost right. The basic idea of this patch is to improve the cost
>> calculation for loading 64bit constant in rv32, instead of adding a new
>> way to load constant.
>>
>> gcc now loads 0x739290001LL in rv32gc with three instructions,
>>         li      a0,958988288
>>         addi    a0,a0,1
>>         li      a1,7
>> However, when it loads 0x839290001LL, the output assembly becomes
>>         lui     a5,%hi(.LC0)
>>         lw      a0,%lo(.LC0)(a5)
>>         lw      a1,%lo(.LC0+4)(a5)
>>     .LC0:
>>         .word   958988289
>>         .word   8
>> The cost calculation is inaccurate in such cases, since loading these
>> two constants should have no difference in rv32 (just change `li a1,7`
>> to `li a1,8` to load the hi part). This patch will take these cases
>> into

Re: Reply: [PING] [PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32.

2022-11-28 Thread Jeff Law via Gcc-patches





On 11/24/22 00:43, Sinan wrote:

 The motivation of this patch is to correct the wrong estimation of
 the number of instructions needed for loading a 64bit constant in
 rv32 in the current cost model(riscv_integer_cost). According to
 the current implementation, if a constant requires more than 3
 instructions(riscv_const_insn and riscv_legitimate_constant_p),
 then the constant will be put into constant pool when expanding
 gimple to rtl(legitimate_constant_p hook and emit_move_insn).
 So the inaccurate cost model leads to the suboptimal codegen
 in rv32 and the wrong estimation part could be corrected through
 this fix.

 e.g. the current codegen for loading 0x839290001 in rv32

lui a5,%hi(.LC0)
lw  a0,%lo(.LC0)(a5)
lw  a1,%lo(.LC0+4)(a5)
 .LC0:
.word   958988289
.word   8

 output after this patch

li a0,958988288
addi a0,a0,1
li a1,8

 gcc/ChangeLog:

  * config/riscv/riscv.cc (riscv_build_integer): Handle the case of 
loading 64bit constant in rv32.

 gcc/testsuite/ChangeLog:

  * gcc.target/riscv/rv32-load-64bit-constant.c: New test.

 Signed-off-by: Lin Sinan 
 ---
   gcc/config/riscv/riscv.cc | 23 +++
   .../riscv/rv32-load-64bit-constant.c  | 38 +++
   2 files changed, 61 insertions(+)
   create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c

 diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
 index 32f9ef9ade9..9dffabdc5e3 100644
 --- a/gcc/config/riscv/riscv.cc
 +++ b/gcc/config/riscv/riscv.cc
 @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, 
HOST_WIDE_INT value,
}
   }
  
 +  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)


 Nit.   It's common practice to have the TARGET test first in a series of 
 tests.  It may also be advisable to break this into two lines.  
 Something like this:



   if ((!TARGET_64BIT)
       && (value > INT32_MAX || value < INT32_MIN))


 That's the style most GCC folks are more accustomed to reading.


Thanks for the tips and I will change it then.


 +{
 +  unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
 +  unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
 +  struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
 +   hicode[RISCV_MAX_INTEGER_OPS];
 +  int hi_cost, lo_cost;
 +
 +  hi_cost = riscv_build_integer_1 (hicode, hival, mode);
 +  if (hi_cost < cost)
 + {
 +   lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
 +   if (lo_cost + hi_cost < cost)


 Just so I'm sure.  "cost" here refers strictly to other synthesized 
 forms? If so, then ISTM that we'd want to generate the new style when 
 lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading 
 the constant from memory -- which is almost certainly more than "3" 
 since the sequence from memory will be at least 3 instructions, two of 
 which will hit memory.



 Jeff



Yes, almost right. The basic idea of this patch is to improve the cost
calculation for loading 64bit constant in rv32, instead of adding a new
way to load constant.

gcc now loads 0x739290001LL in rv32gc with three instructions,
 li  a0,958988288
 addi a0,a0,1
 li  a1,7
However, when it loads 0x839290001LL, the output assembly becomes
 lui a5,%hi(.LC0)
 lw  a0,%lo(.LC0)(a5)
 lw  a1,%lo(.LC0+4)(a5)
 .LC0:
 .word   958988289
 .word   8
The cost calculation is inaccurate in such cases, since loading these
two constants should have no difference in rv32 (just change `li a1,7`
to `li a1,8` to load the hi part). This patch will take these cases
into consideration.

I think I see better what's going on.  This really isn't about the 
constant pool costing.  It's about another way to break down the 
constant into components.


riscv_build_integer_1, for the cases we're looking at, breaks down the 
constant so that high + low will give the final result.  It costs the 
high and low parts separately, then sums their cost + 1 for the addition 
step.
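To make the "high + low, plus one for the addition" costing concrete, here is a minimal sketch in plain C. It is a simplified model of a lui/addi split for a single 32-bit value, not the code in riscv.cc; the names low12, high20 and cost32 are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: low 12 bits of v, sign-extended the way an
   addi immediate is interpreted.  */
static int32_t low12(int32_t v)
{
  uint32_t bits = ((uint32_t)v & 0xfffu) ^ 0x800u;
  return (int32_t)bits - 0x800;
}

/* Hypothetical helper: the part left for lui, chosen so that
   high20(v) + low12(v) == v modulo 2^32.  */
static uint32_t high20(int32_t v)
{
  uint32_t hi = (uint32_t)v - (uint32_t)low12(v);
  assert((hi & 0xfffu) == 0);   /* lui can only set bits 31..12 */
  return hi;
}

/* Toy instruction count for one 32-bit value (an assumption of this
   sketch, not GCC's real cost function).  */
static int cost32(int32_t v)
{
  if (v >= -2048 && v <= 2047)
    return 1;                   /* fits a single addi (li) */
  if (low12(v) == 0)
    return 1;                   /* a single lui */
  return 1 + 1;                 /* cost of the high part, plus 1 for the add */
}
```

Under this model, 958988289 splits into 958988288 plus 1 and costs two instructions, which matches the `li a0,958988288` / `addi a0,a0,1` pair quoted above.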


Your patch adds another method that is specific to rv32 and takes 
advantage of register pairs.   You break the constant down into 32bit 
high and low chunks, where each chunk will go into a different 32 bit 
register.  You just then need to sum the cost of loading each chunk.
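As a rough illustration of the register-pair view, the sketch below splits a 64-bit constant into the two 32-bit words that would land in separate registers on rv32. It uses a plain unsigned split rather than the patch's sext_hwi expressions, and split64 and struct word_pair are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the two 32-bit halves of a 64-bit constant.  */
struct word_pair
{
  uint32_t lo;   /* word for the low register (a0 in the examples) */
  uint32_t hi;   /* word for the high register (a1 in the examples) */
};

/* Split value into independent 32-bit chunks.  No carry links the
   halves, so each one can be loaded and costed on its own.  */
static struct word_pair split64(uint64_t value)
{
  struct word_pair p;
  p.lo = (uint32_t)(value & 0xffffffffu);
  p.hi = (uint32_t)(value >> 32);
  assert((((uint64_t)p.hi << 32) | p.lo) == value);
  return p;
}
```

For the two constants discussed in the thread, split64(0x739290001) and split64(0x839290001) give the same low word, 958988289, and high words 7 and 8, so the two loads differ only in the immediate of the final `li`.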


For the constants in question, your new method will result in a smaller 
cost than the current method.   That's really the point of 
riscv_build_integer -- find the sequence and cost of creation.  We later 
use that information to determine if we should use that sequence or a 
constant pool.
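The decision described here could be sketched roughly as follows. The threshold of 3 comes from the patch description in this thread; best_cost, choose and the enum names are hypothetical, and any input costs passed in are illustrative numbers rather than values reported by GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Threshold from the patch description: more than 3 instructions
   sends the constant to the constant pool.  */
enum { MAX_SYNTH_INSNS = 3 };

enum load_strategy { SYNTHESIZE, CONSTANT_POOL };

/* Pick the cheaper estimate.  The split estimate only applies when
   rv32_wide is true, i.e. on rv32 with a value outside the int32 range.  */
static int best_cost(int generic_cost, int lo_cost, int hi_cost,
                     bool rv32_wide)
{
  int cost = generic_cost;
  if (rv32_wide && lo_cost + hi_cost < cost)
    cost = lo_cost + hi_cost;
  return cost;
}

/* Map the final estimate to a strategy.  */
static enum load_strategy choose(int cost)
{
  assert(cost > 0);
  return cost <= MAX_SYNTH_INSNS ? SYNTHESIZE : CONSTANT_POOL;
}
```

With a hypothetical generic estimate of 5 and split costs of 2 and 1, choose(best_cost(5, 2, 1, true)) returns SYNTHESIZE, while the same estimate without the split falls back to CONSTANT_POOL.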


Palmer raised an issue on the tests with a request to not include the 
arch/abi specification.  But I think you addressed that in a later 
comment.  Specifically for rv64 we end up with another instruction, 
which would cause some constants to be considered cheaper as

Re: [PING] [PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32.

2022-11-24 Thread Sinan via Gcc-patches

> The motivation of this patch is to correct the wrong estimation of
>> the number of instructions needed for loading a 64bit constant in
>> rv32 in the current cost model(riscv_integer_cost). According to
>> the current implementation, if a constant requires more than 3
>> instructions(riscv_const_insn and riscv_legitimate_constant_p),
>> then the constant will be put into constant pool when expanding
>> gimple to rtl(legitimate_constant_p hook and emit_move_insn).
>> So the inaccurate cost model leads to the suboptimal codegen
>> in rv32 and the wrong estimation part could be corrected through
>> this fix.
>>
>> e.g. the current codegen for loading 0x839290001 in rv32
>>
>> lui a5,%hi(.LC0)
>> lw a0,%lo(.LC0)(a5)
>> lw a1,%lo(.LC0+4)(a5)
>> .LC0:
>> .word 958988289
>> .word 8
>>
>> output after this patch
>>
>> li a0,958988288
>> addi a0,a0,1
>> li a1,8
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/riscv.cc (riscv_build_integer): Handle the case of loading 
>> 64bit constant in rv32.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/rv32-load-64bit-constant.c: New test.
>>
>> Signed-off-by: Lin Sinan 
>> ---
>> gcc/config/riscv/riscv.cc | 23 +++
>> .../riscv/rv32-load-64bit-constant.c | 38 +++
>> 2 files changed, 61 insertions(+)
>> create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
>>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index 32f9ef9ade9..9dffabdc5e3 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, 
>> HOST_WIDE_INT value,
>> }
>> }
>> 
>> + if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
>
> Nit. It's common practice to have the TARGET test first in a series of 
> tests. It may also be advisable to break this into two lines. 
> Something like this:
>
>
> if ((!TARGET_64BIT)
> && (value > INT32_MAX || value < INT32_MIN))
>
>
> That's the style most GCC folks are more accustomed to reading.
Thanks for the tips and I will change it then.
>> + {
>> + unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
>> + unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
>> + struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
>> + hicode[RISCV_MAX_INTEGER_OPS];
>> + int hi_cost, lo_cost;
>> +
>> + hi_cost = riscv_build_integer_1 (hicode, hival, mode);
>> + if (hi_cost < cost)
>> + {
>> + lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
>> + if (lo_cost + hi_cost < cost)
>
> Just so I'm sure. "cost" here refers strictly to other synthesized 
> forms? If so, then ISTM that we'd want to generate the new style when 
> lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading 
> the constant from memory -- which is almost certainly more than "3" 
> since the sequence from memory will be at least 3 instructions, two of 
> which will hit memory.
>
>
> Jeff
>
Yes, almost right. The basic idea of this patch is to improve the cost
calculation for loading 64bit constant in rv32, instead of adding a new
way to load constant.
gcc now loads 0x739290001LL in rv32gc with three instructions,
 li a0,958988288
 addi a0,a0,1
 li a1,7
However, when it loads 0x839290001LL, the output assembly becomes
 lui a5,%hi(.LC0)
 lw a0,%lo(.LC0)(a5)
 lw a1,%lo(.LC0+4)(a5)
 .LC0:
 .word 958988289
 .word 8
The cost calculation is inaccurate in such cases, since loading these
two constants should have no difference in rv32 (just change `li a1,7`
to `li a1,8` to load the hi part). This patch will take these cases
into consideration.
BR,
Sinan

Re: [PING] [PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32.

2022-11-22 Thread Jeff Law via Gcc-patches




On 11/17/22 00:32, Lin Sinan via Gcc-patches wrote:

The motivation of this patch is to correct the wrong estimation of
the number of instructions needed for loading a 64bit constant in
rv32 in the current cost model(riscv_integer_cost). According to
the current implementation, if a constant requires more than 3
instructions(riscv_const_insn and riscv_legitimate_constant_p),
then the constant will be put into constant pool when expanding
gimple to rtl(legitimate_constant_p hook and emit_move_insn).
So the inaccurate cost model leads to the suboptimal codegen
in rv32 and the wrong estimation part could be corrected through
this fix.

e.g. the current codegen for loading 0x839290001 in rv32

   lui a5,%hi(.LC0)
   lw  a0,%lo(.LC0)(a5)
   lw  a1,%lo(.LC0+4)(a5)
.LC0:
   .word   958988289
   .word   8

output after this patch

   li a0,958988288
   addi a0,a0,1
   li a1,8

gcc/ChangeLog:

 * config/riscv/riscv.cc (riscv_build_integer): Handle the case of 
loading 64bit constant in rv32.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/rv32-load-64bit-constant.c: New test.

Signed-off-by: Lin Sinan 
---
  gcc/config/riscv/riscv.cc | 23 +++
  .../riscv/rv32-load-64bit-constant.c  | 38 +++
  2 files changed, 61 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 32f9ef9ade9..9dffabdc5e3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, 
HOST_WIDE_INT value,
}
  }
  
+  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)


Nit.   It's common practice to have the TARGET test first in a series of 
tests.  It may also be advisable to break this into two lines.  
Something like this:



  if ((!TARGET_64BIT)
      && (value > INT32_MAX || value < INT32_MIN))


That's the style most GCC folks are more accustomed to reading.




+{
+  unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
+  unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
+  struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
+   hicode[RISCV_MAX_INTEGER_OPS];
+  int hi_cost, lo_cost;
+
+  hi_cost = riscv_build_integer_1 (hicode, hival, mode);
+  if (hi_cost < cost)
+   {
+ lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
+ if (lo_cost + hi_cost < cost)


Just so I'm sure.  "cost" here refers strictly to other synthesized 
forms? If so, then ISTM that we'd want to generate the new style when 
lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading 
the constant from memory -- which is almost certainly more than "3" 
since the sequence from memory will be at least 3 instructions, two of 
which will hit memory.



Jeff
