On 2023-10-31 00:36  Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
>On 10/30/23 01:25, Fei Gao wrote:
>> Conditional add, if zero
>> rd = (rc == 0) ? (rs1 + rs2) : rs1
>> -->
>> czero.nez rd, rs2, rc
>> add rd, rs1, rd
>>
>> Conditional add, if non-zero
>> rd = (rc != 0) ? (rs1 + rs2) : rs1
>> -->
>> czero.eqz rd, rs2, rc
>> add rd, rs1, rd
>>
>> Co-authored-by: Xiao Zeng<zengx...@eswincomputing.com>
>>
>> gcc/ChangeLog:
>>
>>          * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith
>>          (noce_try_cond_zero_arith): handler for conditional zero op
>>          (noce_process_if_block): add noce_try_cond_zero_arith with hook control
>>
>> gcc/testsuite/ChangeLog:
>>
>>          * gcc.target/riscv/zicond_ifcvt_opt.c: New test.
>So the idea here is to improve upon the current code we generate for
>conditional arithmetic.  Right now we support conditional arithmetic
>using zicond, but the sequence is poor.
>
>Basically the if-converter knows how to generate a conditional add, but
>it does so in a way that isn't as efficient as it could be.
>
>In effect ifcvt wants to generate
>
>t = a + b
>res = cond ? t : a
>
>
>We want to change it to
>
>t = cond ? b : 0;
>res = a + t;
>
>The latter sequence expands to more efficient code trivially for risc-v. 
Exactly. The add case below needs 2 fewer insns:
long test_ADD_ceqz(long x, long y, long z, long c)
{
  if (c)
    x = y + z;
  else
    x = y;
  return x;
}
  
test_ADD_ceqz(before this patch):
  add a2,a1,a2
  czero.eqz a0,a2,a3
  czero.nez a3,a1,a3
  or a0,a3,a0
  ret

test_ADD_ceqz(after this patch):
  czero.eqz a3,a2,a3
  add a0,a1,a3
  ret
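
To double-check that the two sequences are equivalent, here is a small C model of them. This is only a sketch: czero_eqz/czero_nez are illustrative helper names modelling the Zicond instructions (result is 0 when the condition register is zero / non-zero, otherwise rs1), and none of it is code from the patch.

```c
#include <assert.h>

/* Models of the Zicond instructions; helper names are illustrative only. */
static long czero_eqz(long rs1, long rs2) { return rs2 == 0 ? 0 : rs1; }
static long czero_nez(long rs1, long rs2) { return rs2 != 0 ? 0 : rs1; }

/* Sequence before the patch: add, czero.eqz, czero.nez, or. */
static long add_ceqz_before(long y, long z, long c)
{
  long sum = y + z;              /* add       a2,a1,a2 */
  long a0 = czero_eqz(sum, c);   /* czero.eqz a0,a2,a3 */
  long a3 = czero_nez(y, c);     /* czero.nez a3,a1,a3 */
  return a3 | a0;                /* or        a0,a3,a0 */
}

/* Sequence after the patch: czero.eqz, add. */
static long add_ceqz_after(long y, long z, long c)
{
  long a3 = czero_eqz(z, c);     /* czero.eqz a3,a2,a3 */
  return y + a3;                 /* add       a0,a1,a3 */
}

/* Source-level meaning of test_ADD_ceqz. */
static long add_ceqz_ref(long y, long z, long c) { return c ? y + z : y; }

/* Returns 1 if both sequences match the reference on a few sample inputs. */
static int sequences_agree(void)
{
  static const long vals[] = { -7, 0, 1, 42 };
  const int n = (int)(sizeof vals / sizeof vals[0]);
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      for (int k = 0; k < n; k++)
        {
          long y = vals[i], z = vals[j], c = vals[k];
          long ref = add_ceqz_ref(y, z, c);
          if (add_ceqz_before(y, z, c) != ref || add_ceqz_after(y, z, c) != ref)
            return 0;
        }
  return 1;
}
```

Calling sequences_agree() from a test harness should return 1. The "if zero" variant is the same with czero_nez in place of czero_eqz in the "after" sequence.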
>
>I wandered a bit through the combine dumps to see if it would be easy to
>capture this class of cases.  We never get anything useful, and while I
>can imagine "bridge" patterns that would potentially expose enough RTL
>to allow us to rewrite without changing ifcvt, it'd just be a hack IMHO.
>
>So going back to ifcvt...
>
>In the first sequence the addition must wait for both "a" and "b" to be
>available and the conditional move can fire on the next cycle.
>
>In the second sequence the conditional move can fire when just "b" is
>available.  So that gives "a" another cycle to become ready (say if it's
>coming from memory or a multi-cycle operation like multiply).
>
>On the other hand the second sequence does keep "a" live longer.
>
>In the end I strongly suspect neither sequence is significantly better
>than the other.  Meaning I don't think we need to conditionalize using
>condzero arith at all. 
As the case above shows, using condzero arith saves 2 insns.

>
>
>I'll note that subsequent patches add MINUS, IOR, XOR and AND.  It's
>also possible (and important) to handle shifts.  There's a conditional
>shift-by-6 in leela's hot path. 
This series is an initial framework for simple condzero arith. Shifts may come
later, as they involve subreg handling.
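
For reference, the same shape should carry over to any op where 0 is the identity for the second operand. A rough C sketch of that idea follows; the function names are made up for illustration, and it is not code from this series or from the later patches.

```c
#include <assert.h>

/* t = cond ? b : 0, i.e. what a czero.eqz computes (illustrative helper). */
static unsigned long sel_or_zero(unsigned long b, unsigned long cond)
{
  return cond ? b : 0;
}

/* res = cond ? (a OP b) : a, rewritten as a OP (cond ? b : 0).
   This works because x - 0, x | 0, x ^ 0 and x << 0 all equal x.  */
static unsigned long cond_sub(unsigned long a, unsigned long b, unsigned long c)
{
  return a - sel_or_zero(b, c);
}

static unsigned long cond_ior(unsigned long a, unsigned long b, unsigned long c)
{
  return a | sel_or_zero(b, c);
}

static unsigned long cond_xor(unsigned long a, unsigned long b, unsigned long c)
{
  return a ^ sel_or_zero(b, c);
}

/* Conditional shift; n must be less than the width of unsigned long. */
static unsigned long cond_shl(unsigned long a, unsigned long n, unsigned long c)
{
  return a << sel_or_zero(n, c);
}

/* AND does not fit this shape directly: a & 0 is 0, not a. */
```

For example, cond_shl(1, 6, 1) gives 64 and cond_shl(1, 6, 0) gives 1, which matches a conditional shift-by-6.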

>
>Overall this looks a lot like the VRULL code, but just less complete.
>My inclination is to do a cleanup pass on the VRULL code verify it
>handles all the cases in your tests and commit the VRULL implementation
>with your tests. 
I searched but couldn't find the VRULL code; could you please provide a link at
your convenience? My colleague Zeng Xiao posted this months ago:
https://patchwork.sourceware.org/project/gcc/patch/20230719101156.21771-6-zengx...@eswincomputing.com/
After fixing several bugs, we realized that the previous implementation was quite
complex, so we came up with this patch series.

>
>I'll do some further poking at this today.  Thanks for re-submitting
>these bits.  Getting this target independent work cleaned up has been on
>my TODO for a while now. 
Thanks for your patience.

BR, 
Fei

>
>jeff
