On Tue, Jul 14, 2015 at 9:13 AM, Richard Earnshaw
<richard.earns...@foss.arm.com> wrote:
> We went through this a couple of weeks back.  The backend documentation
> for PROMOTE_MODE says:

I disagree that this is a fully resolved issue.  I see clear problems
with how the ARM port uses PROMOTE_MODE.

> " For most machines, the macro definition does not change @var{unsignedp}.
> However, some machines, have instructions that preferentially handle
> either signed or unsigned quantities of certain modes.  For example, on
> the DEC Alpha, 32-bit loads from memory and 32-bit add instructions
> sign-extend the result to 64 bits.  On such machines, set
> @var{unsignedp} according to which kind of extension is more efficient."

The Alpha case is different than the ARM case.  The Alpha only changes
sign for 32-bit values in 64-bit registers.  The alpha happens to have
a nice set of instructions that operates on 32-bit values, that
accepts these wrong-signed values and handle them correctly.  Thus on
the alpha, it appears that there are no extra zero/sign extends
required, and everything is the same speed or faster with the wrong
sign.  The MIPS port works the same way.  This is not true on the arm
though.  The arm port is changing sign on 8 and 16 bit value, but does
not have instructions that directly operate on 8 and 16 bit values.
This requires the compiler to emit extra zero/sign extend instructions
that affect the performance of the code.  Other than the thumb1 8-bit
load, and the pre-armv6 use of andi for 8-bit zero-extend, I haven't
seen anything that is faster in the wrong sign, and there are a number
of operations that are actually slower because of the extra zero/sign
extends required by the arm PROMOTE_MODE definition.  I've given some
examples.

I have since noticed that the relatively new pass to optimize away
unnecessary zero/sign extensions is actually fixing some of the damage
caused by the ARM PROMOTE_MODE definition.  But there are still some
cases that won't get fixed, as per my earlier examples.  It is better
to emit the fast code at the beginning than to rely on an optimization
pass to convert the slow code into fast code.  So I think the ARM
PROMOTE_MODE definition still needs to be fixed.

> So clearly it anticipates that all permitted extensions should work, and
> in particular it makes no mention of this having to match some
> abi-mandated promotions.  That makes this a MI bug not a target one.

If you look at older gcc releases, like gcc-2.95.3, you will see that
there is only PROMOTE_MODE and a boolean PROMOTE_FUNCTION_ARGS which
says whether PROMOTE_MODE is applied to function arguments or not.
You can't have different zero/sign extensions for parameters and
locals in gcc-2.95.3.  The fact that you can have it today is more an
accident than a deliberate choice.  When PROMOTE_FUNCTION_ARGS was
hookized, it was made into a separate function, and for the ARM port,
the code for PROMOTE_MODE was not correctly copied into the new hook,
resulting in the accidental difference for parameter and local
zero/sign promotion for ARM.  The PROMOTE_MODE docs were written back
when different sign/zero extensions for parms/locals weren't possible,
and hence did not consider the case.

Now that we do have the problem, we can't fix it without an ARM port
ABI change, which is undesirable, so we may have to fix it with a MI
change.

There were two MI changes suggested, one was fixing the out-of-ssa
pass to handle SUBREG_PROMOTED_P promotions.  The other was to
disallow creating PHI nodes between parms and locals.  I haven't had a
chance to try implementing the second one yet; I hope to work on that
today.

Jim

Reply via email to