On 11/29/18 10:28, Michael Eager wrote:
On 11/28/18 14:37, Andrew Pinski wrote:
On Wed, Nov 28, 2018 at 9:47 AM Michael Eager <ea...@eagerm.com> wrote:

On 11/28/18 09:10, Jeff Law wrote:
On 11/28/18 10:00 AM, Michael Eager wrote:
I have a small test case which generates poor quality code on my target.
Here is the original:

    if (cond1 == 2048 || cond2 == 8)
      {
        x = x + y;
      }
    return x;

This ends up generating a series of instructions to compute a flag with
the result of the condition followed by a single compare with zero and
a jump.  Better code would be two compares and two jumps.

The gimple is

    _1 = cond1 == 2048;
    _2 = cond2 == 8;
    _3 = _1 | _2;
    if (_3 != 0) goto <D.1464>; else goto <D.1465>;
...

so this poor code sequence is essentially a correct translation.

On MIPS, for the same test case, the gimple is different:

    if (cond1 == 2048) goto <D.1491>; else goto <D.1493>;
    <D.1493>:
    if (cond2 == 8) goto <D.1491>; else goto <D.1492>;
    <D.1491>:
...

which generates the better code sequence.

Can someone give me a pointer where to find where the lowering
pass decides to break up a condition into multiple tests?  Is
there a target configuration option which I have overlooked or
maybe set incorrectly?
BRANCH_COST, which comes into play during generation of the initial
trees as well in passes which try to optimize branchy code into
straighter code.

Thanks.  I did look at BRANCH_COST, and played with the values, but it
didn't seem to have any affect.  I'll take a closer look.

Look at LOGICAL_OP_NON_SHORT_CIRCUIT .  By defualt, it is defined as:
   (BRANCH_COST (optimize_function_for_speed_p (cfun), \
                 false) >= 2)
But MIPS defines it as 0.

Changing  LOGICAL_OP_NON_SHORT_CIRCUIT to 0 will disable this optimization.

LOGICAL_OP_NON_SHORT_CIRCUIT controls both places where (cond1 == 2048
|| cond2 == 8) would remove the branch.

NOTE I think MIPS defines LOGICAL_OP_NON_SHORT_CIRCUIT incorrectly but
that is a different story.

Thanks,
Andrew Pinski

I set BRANCH_COST to 1 for both predicted and non-predicted branches.
This generates the shorter sequence with a compare, but I'm concerned
that it will adversely affect code generation elsewhere.  Branches are
higher cost than simple instructions.

Looking at the code generated with BRANCH_COST > 1, it doesn't do what
I indicated above.  This did not reduce the number of branches.
In both cases there are two branches in this short test case.  When
BRANCH_COST > 1, there are three instructions to do a compare vs one
when BRANCH_COST = 1.  I'm at a loss to see where there is any benefit.

I'll take a look at the LOGICAL_OP_NON_SHORT_CIRCUIT settings and see
if this makes a difference.


Setting LOGICAL_OP_NON_SHORT_CIRCUIT to zero generates the better code
sequence.  Thanks for the pointer.

I stepped through the code at the two places where
LOGICAL_OP_NON_SHORT_CIRCUIT is tested, with it set both to 0 and 1, and
it is not clear what the code is trying to do.  Or what this parameter
is really supposed to mean.  I'll have to go back and look again.

Setting LOGICAL_OP_NON_SHORT_CIRCUIT to 0 or 1 does not change the
number of branches.  Why this would depend on BRANCH_COST is not clear,
and especially why it would depend on the costs of branches which are
predicted to fail makes no sense to me.

Is there a use case which would help this make some sense?

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306

Reply via email to