> Here is a patch which fixes redundant zero extensions problem. Issue
> is resolved by expanding implicit_zee pass functionality to cover zero
> and sign extends of different modes. Could please someone review it?

Could you explain the underlying idea?  The current strategy of implicit-zee.c 
is explained at length at the beginning of the file, but here's a summary:

 1. On some architectures (typically x86-64), implicit zero-extensions are 
applied when instructions operate in selected sub-word modes (SImode here):

  addl %edi, %eax

has an implicit zero-extension for %rax.

 2. Because of 1, the second instruction in sequences like:

  (set (reg:SI x) (plus:SI (reg:SI z1) (reg:SI z2)))
  (set (reg:DI x) (zero_extend:DI (reg:SI x)))

is redundant.

 3. The pass recognizes this and transforms the above sequence into:

  (set (reg:DI x) (zero_extend:DI (plus:SI (reg:SI z1) (reg:SI z2))))

and the machine description knows how to translate this into an 'addl'.
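
For illustration, here is a minimal C sketch of source code for which one 
would expect a sequence like the one in point 2 on x86-64.  The function name 
is hypothetical and the exact RTL depends on the compiler version and options:

```c
#include <stdint.h>

/* Hypothetical example: a 32-bit addition whose result is widened to
   64 bits.  The addition is done in a 32-bit mode (SImode) and the
   widening is a zero-extension to a 64-bit mode (DImode).  */
uint64_t add_and_widen(uint32_t z1, uint32_t z2)
{
    uint32_t x = z1 + z2;   /* wraps modulo 2^32 */
    return (uint64_t) x;    /* zero-extension, upper 32 bits are 0 */
}
```

If the extension is folded into the addition as in point 3, a single 'addl' 
is enough to produce the 64-bit result.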


You're proposing extending this to other modes and other architectures, for 
example QImode on x86.  But does

  addb %dl, %al

modify the entire %eax register on x86?  In other words, are you really after 
implicit (zero-)extensions or after something else, like global elimination of 
redundant extensions?
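
To make the distinction concrete, here is a toy C model of how x86-64 updates 
a 64-bit register on a partial write.  All function names are hypothetical; 
the sketch only models the register contents, not the flags:

```c
#include <stdint.h>

/* A 32-bit write clears the upper 32 bits of the 64-bit register.  */
uint64_t write_32(uint64_t reg, uint32_t val)
{
    (void) reg;             /* the old contents are entirely discarded */
    return (uint64_t) val;
}

/* An 8-bit write to the low byte leaves the other 56 bits unchanged.  */
uint64_t write_8(uint64_t reg, uint8_t val)
{
    return (reg & ~(uint64_t) 0xff) | val;
}

/* Model of 'addl %edi, %eax': the result is implicitly zero-extended.  */
uint64_t model_addl(uint64_t rax, uint64_t rdi)
{
    return write_32(rax, (uint32_t) rax + (uint32_t) rdi);
}

/* Model of 'addb %dl, %al': only the low byte of the register changes.  */
uint64_t model_addb(uint64_t rax, uint64_t rdx)
{
    return write_8(rax, (uint8_t) ((uint8_t) rax + (uint8_t) rdx));
}
```

In this model the result of model_addb still depends on the previous upper 
bits of the register, so no extension is implied by the QImode operation.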

What's the effect of the patch on the testcase in the PR in terms of insns at 
the RTL level?  Why doesn't the combiner already optimize it?

Enhancing implicit-zee.c to address missed optimizations like the one reported 
in target/50038 might well be the best approach, but the strategy shift must be 
clearly explained and discussed.  The reported numbers are certainly impressive.

-- 
Eric Botcazou
