[Abe wrote:]
After finishing fixing the known regressions, I intend/plan to reg-test for 
AArch64;
after that, I think I`m going to need some community help to reg-test for other 
ISAs.

[Alan wrote:]
OK, I'm confused. When you write "making the new if-converter not mangle IR"...
> does "the new if-converter" mean your scratchpad fix to PR46029

Basically, yes.  The code I currently have is, IMO, 
sufficiently-{changed/rewritten/replaced}
relative to {the code currently in the file at that path in trunk} that I am 
referring to
the pre-patch code in "tree-if-conv.c" as "the old if converter" and referring 
to the code
in the version of the file at that path in my Git GCC checkout as "the new if 
converter".


[Alan wrote:]
or is there some other new if-conversion phase that you are still working on 
and haven't posted yet?

No; as of now, Sebastian and I are not working on adding any more passes or 
phases: we are just trying to
replace the old, potentially-unsafe, gives-up-more-easily converter with the 
new one that uses the scratchpad
idea both to produce safer conversions and to be able to convert a greater 
percentage of the time.

The only significant difficulty with the preceding at this time, AFAIK, is that 
the work-in-progress of the
new converter produces {too many levels deep} of indirections, so the 
vectorizer gives up trying to vectorize
the result[s] of the conversion, which makes the whole thing a big "fail" [not 
in the DejaGNU sense,
although pretty close].  That`s why I would love to have some help in fixing 
the regressions.


[Alan wrote:]
> I haven't yet understood what you mean about "vectorizer-friendly" IR being 
mangled; is the problem that your
> new […] transforms IR that can currently be if-converted by the existing 
phase, into IR that can't? (Example?)

Not exactly, but close.

"TLDR" for the following [from "vvv" to "^^^"]: "too much unnecessary indirection is 
being added".

--- vvv --- TLDR`d above --- vvv ---

Let`s pretend the programmer of the code being compiled knows all about 
if-conversion and vectorization and does the
conversion _manually_.  In other words, the C/C++/etc. code passed in to the 
compiler _already_ looks something like:

  temp0 = compute_condition();
  temp1 = foo();
  temp2 = bar();
  result = temp0 ? temp1 : temp2;


IOW, essentially IR/assembler written in C.  Since the C code is lowered to
GIMPLE and the '?' operator is lowered [too early IMO *] to blocks and "goto"s,
the if-conversion pass no longer "sees" "result = temp0 ? temp1 : temp2;" --
instead, it sees something like:

  if temp0
    goto <bb 4>
  else
    goto <bb 5>

  /* --- bb 4 --- */
  result = temp1;
  goto <bb 6>

  /* --- bb 4 --- */
  result = temp2;

  /* --- bb 6 --- */


… which the old if converter handles in such a way as to
keep the vectorizer happy, but the new one does not yet.

'*': another issue to discuss and something that IMO should be
     fixed/improved in GCC but outside of "tree-if-conv.c"


The new if converter is doing something at least similar to "ooh!  Ooh!
The programmer wrote a valid-conversion-candidate ''if'' that I know how to 
convert!"
and messes it up; IIRC the way in which this gets messed up is something like:

  _ifcvt_temp_0 = temp0;
  …
  _ifcvt_temp_1 = temp1;
  …
  _ifcvt_temp_2 = temp2;
  …
  result = _ifcvt_temp_0 ? _ifcvt_temp_1 : _ifcvt_temp_2;


… at which the vectorizer turns up its nose, says "too much indirection" and/or "too 
much gather",
and doesn`t vectorize --  for a {source code, target architecture} pair for 
which the code can
and should be vectorized by GCC.  In particular, the old vectorizer handles it 
well IIRC,
probably b/c the old vectorizer has less indirection overhead, which it can
easily "afford" since it never adds an indirection through a scratchpad.

I think what is needed here is basically to reduce the indirection overhead in 
the new converter
by making the new converter realize "oh, <foo> is already pure and 
thread-local, so I don`t need
to copy <foo> into a temporary before using it".  At least, that seems to me 
like _one_ way to
fix this category of regressions in the new converter.  Another way is to make 
GCC overall not
convert e.g. "x ? y : z" into basic blocks etc. so early all of the time; my 
impressions is
that GCC is not doing both of {inspecting the purity, analyzing the cost} of 
the expressions
and only converting into basic blocks etc. when e.g. 'y'/'z'/both is/are impure 
and/or
at least one of {'y', 'z'} is an "expensive" thing to compute.  For low-cost 
pure expressions,
e.g. "x ? y : z" should be retained as-as IMO -- i.e. not lowered into separate 
BBs --
for as long as possible.  Ideally, it is encoded as a "COND_EXPR" and stays 
that way for
as long as possible when there is no purity/high-computational-cost problem 
with either the
second or the third param.  In the worst case, this could involve 
fixing/improving several
front ends, so I want to push this off into a separate subproject from the if 
conversion itself.

--- ^^^ --- TLDR`d above --- ^^^ ---

I hope the above is helpful, but since it`s from memory I don`t
guarantee that it`s both 100% accurate and 100% complete.  ;-)


[Alan wrote:]
Then I might (only "might", sorry!) be able to help...

Great!  Thanks.  :-)

Ideally, I/we fix the above problem -- and the rest of the regressions in the 
new if converter --
without any significant changes to core GCC files outside of "tree-if-conv.c".  
IOW, I`d like
to minimize the invasiveness of this patch and get rid of {as many regressions 
as possible}
the fixes for which lie entirely inside "tree-if-conv.c" before proceeding to 
fix/improve the
"lowered too early" problem that I perceive current GCC trunk has in its lowering of 
"x ? y : z".
I`m almost 100% certain that fixing/improving the lowering will require 
significant alterations
to code in files other than "tree-if-conv.c", so even though that 
fix/improvement would likely
fix at least one regression in the new if converter, I`d rather do it 
separately/later/both.

In particular, I remember that "result = condition ? array1[index] : array2[maybe 
the same index, maybe not]"
is being converted too early IMO.  IOW, somewhere in GCC an array dereference 
is being considered
as either impure, too-expensive, or both.  "array[index]" in C [not in C++!: 
operator overloading]
AFAIK is always pure and is always low-cost whenever the index expression is 
pure and low-cost.

Regards,

Abe

Reply via email to