RE: [PATCH] RISC-V: cost model for loading 64bit constant in rv32

2022-11-09 Thread Sinan via Gcc-patches
>> comparison with clang:
>> https://godbolt.org/z/v5nxTbKe9 
>
> IIUC the rules are generally no compiler explorer links (so we can
> preserve history) and no attachment patches (ie, inline them like
> git-send-email does). There's some documenation on sending patches at
> .
>
> Given those usually show up for first-time contributors there's also
> some copyright/DCO procedures to bo followed. I see some other
> linux.alibaba.com emails called out explicitly, but then also a generic
> "GCC Alibaba Group Holding Limited". I think that means we're OK for
> copyright assignment here? There's still the "you wrote the code" bits
> that are worth reading, though.
Thanks for your info. I will resend this patch according to your suggestion. As 
for the copyright, people and I with the linux.alibaba.com email have FSF 
copyright assignment and copyright won't be a problem.
> On Tue, 08 Nov 2022 19:26:01 PST (-0800), gcc-patches@gcc.gnu.org wrote:
>> loading constant 0x739290001LL in rv32 can be done with three instructions
>> output:
>> li a1, 7
>> lui a1, 234128
>> addi a1, a1, 1
>
> Probably more useful to load the constant into two different registers? 
> The point is the same, though, so just a commit message issue.
>
>> Similarly, loading 0x839290001LL in rv32 can be done within three 
>> instructions
>> expected output:
>> li a1, 8
>> lui a1, 234128
>> addi a1, a1, 1
> However, riscv_build_integer does not handle this case well and makes a wrong 
> prediction about the number of instructions > needed and then the constant is 
> forced to put in the memory via riscv_const_insns and emit_move_insn.
>> real output:
>> lui a4,%hi(.LC0)
>> lw a2,%lo(.LC0)(a4)
>> lw a3,%lo(.LC0+4)(a4)
>> .LC0:
>> .word958988289
>> .word8
>
> That's still 3 instructions, but having loads and a constant is worse so
> that's just another commit message issue.
>
>
> Looking at the attached patch:
>
>> + if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
>> + {
>> + unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
>> + unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
>> + struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
>> + hicode[RISCV_MAX_INTEGER_OPS];
>> + int hi_cost, lo_cost;
>> +
>> + hi_cost = riscv_build_integer_1 (hicode, hival, mode);
>> + if (hi_cost < cost)
>> + {
>> + lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
>> + if (lo_cost + hi_cost < cost)
>> + {
>> + memcpy (codes, alt_codes,
>> + lo_cost * sizeof (struct riscv_integer_op));
>> + memcpy (codes + lo_cost, hicode,
>> + hi_cost * sizeof (struct riscv_integer_op));
>> + cost = lo_cost + hi_cost;
>> + }
>> + }
>> + }
>
> This kind of stuff always seems like it should be possible to handle
> with generic middle-end optimizations: it's essentially just splitting
> the 64-bit constant into two 32-bit constants, which is free because
> it's going into two registers anyway. That's not a RISC-V specific
> problem, it's just the case any time a constant is going to be split
> between two registers -- it'd even apply for 128-bit constants on rv64,
> though those are probably rare enough they don't matter all that much.
>
> I'm not opposed to doing this in the backend, but maybe it's a sign
> we've just screwed something else up and can avoid adding the code?
Sorry for not making it clear. The motivation of this patch is to correct the 
wrong estimation of the number of instructions needed for loading a 64bit 
constant in rv32 in the current cost model(riscv_interger_cost). According to 
the current implementation, if a constant requires more than 3 
instructions(riscv_const_insn and riscv_legitimate_constant_p), then the 
constant will be put into constant pool when expanding gimple to 
rtl(legitimate_constant_p hook and emit_move_insn). So at least the wrong 
estimation part in rv32 could be improved.
Strategies of loading a big constant vary from backend ot backend, and I think 
it makes sense that let the backend make the decision. As for RV64 and int128, 
I did not investigate it and it seems a bit different from rv32 case. For 
example, loading 64bit constant 0x839290001LL in rv64 actually requires one 
more instruction than rv32.
```
lui a1, 132
addiw a1,a1,-1751
slli a1,a1,16
addi a1,a1,1
```
>> +++ b/gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
>> @@ -0,0 +1,35 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv32gc -mabi=ilp32 -Os" } */
>
> This has the same library/abi problems we've had before, but in this
> case I think it's fine to just leave the -march/-mabi out: the test
> cases won't be that exciting on rv64 (unless we add the 128-bit splits
> too?), but they should still stay away from the constant pool.
>
> IIUC this should work on more than Os, at least O2/above? Not that
> exciting for the test case, though.
>
>> +/* { dg-final { scan-assembler-not "\.LC\[0-9\]" } } */
>
> That's a bit fragile, maybe just ma

Re: [PATCH] RISC-V: cost model for loading 64bit constant in rv32

2022-11-08 Thread Palmer Dabbelt

On Tue, 08 Nov 2022 19:26:01 PST (-0800), gcc-patches@gcc.gnu.org wrote:

loading constant 0x739290001LL in rv32 can be done with three instructions
output:
li a1, 7
lui a1, 234128
addi a1, a1, 1


Probably more useful to load the constant into two different registers?  
The point is the same, though, so just a commit message issue.



Similarly, loading 0x839290001LL in rv32 can be done within three instructions
expected output:
li a1, 8
lui a1, 234128
addi a1, a1, 1
However, riscv_build_integer does not handle this case well and makes a wrong 
prediction about the number of instructions needed and then the constant is 
forced to put in the memory via riscv_const_insns and emit_move_insn.
real output:
lui a4,%hi(.LC0)
lw a2,%lo(.LC0)(a4)
lw a3,%lo(.LC0+4)(a4)
.LC0:
 .word958988289
 .word8


That's still 3 instructions, but having loads and a constant is worse so 
that's just another commit message issue.



comparison with clang:
https://godbolt.org/z/v5nxTbKe9 


IIUC the rules are generally no compiler explorer links (so we can 
preserve history) and no attachment patches (ie, inline them like 
git-send-email does).  There's some documenation on sending patches at 
.


Given those usually show up for first-time contributors there's also 
some copyright/DCO procedures to bo followed.  I see some other 
linux.alibaba.com emails called out explicitly, but then also a generic 
"GCC Alibaba Group Holding Limited".  I think that means we're OK for 
copyright assignment here?  There's still the "you wrote the code" bits 
that are worth reading, though.


Looking at the attached patch:


+  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
+{
+  unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
+  unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
+  struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
+  hicode[RISCV_MAX_INTEGER_OPS];
+  int hi_cost, lo_cost;
+
+  hi_cost = riscv_build_integer_1 (hicode, hival, mode);
+  if (hi_cost < cost)
+   {
+lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
+if (lo_cost + hi_cost < cost)
+  {
+memcpy (codes, alt_codes,
+lo_cost * sizeof (struct riscv_integer_op));
+memcpy (codes + lo_cost, hicode,
+hi_cost * sizeof (struct riscv_integer_op));
+cost = lo_cost + hi_cost;
+  }
+   }
+}


This kind of stuff always seems like it should be possible to handle 
with generic middle-end optimizations: it's essentially just splitting 
the 64-bit constant into two 32-bit constants, which is free because 
it's going into two registers anyway.  That's not a RISC-V specific 
problem, it's just the case any time a constant is going to be split 
between two registers -- it'd even apply for 128-bit constants on rv64, 
though those are probably rare enough they don't matter all that much.


I'm not opposed to doing this in the backend, but maybe it's a sign 
we've just screwed something else up and can avoid adding the code?



+++ b/gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc -mabi=ilp32 -Os" } */


This has the same library/abi problems we've had before, but in this 
case I think it's fine to just leave the -march/-mabi out: the test 
cases won't be that exciting on rv64 (unless we add the 128-bit splits 
too?), but they should still stay away from the constant pool.


IIUC this should work on more than Os, at least O2/above?  Not that 
exciting for the test case, though.



+/* { dg-final { scan-assembler-not "\.LC\[0-9\]" } } */


That's a bit fragile, maybe just match on load-word?