On 11/09/15 16:31, James Greenhalgh wrote:
On Tue, Sep 01, 2015 at 11:08:10AM +0100, Kyrill Tkachov wrote:
Hi all,
The ARMv8-A reference manual says:
"CNEG <Wd>, <Wn>, <cond>
is equivalent to
CSNEG <Wd>, <Wn>, <Wn>, invert(<cond>)
and is the preferred disassembly when Rn == Rm && cond != '111x'."
That is, when the two input registers are the same we can use the shorter CNEG
mnemonic with the inverse condition instead of the longer CSNEG instruction.
Similarly, the CSINV and CSINC instructions have shorter CINV and CINC forms.
This patch adjusts the output templates to emit the preferred shorter sequences
when possible.
The new mnemonics are just aliases; they map down to the same instruction in
the end, so there are no performance or behaviour implications. But it does
make the assembly a bit more readable IMO, since:
"cneg w27, w9, le"
can be simply read as "if the condition is less or equal, negate w9" instead of
the previous:
"csneg w27, w9, w9, gt"
where you have to remember which of the input registers is negated.
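To make the equivalence concrete, here is a small C model of the two forms. It is only a sketch: the helper names csneg and cneg are made up for illustration, and the condition is modelled as a plain boolean rather than the real NZCV flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of CSNEG Wd, Wn, Wm, cond:
   Wd = cond ? Wn : -Wm.  The negation is done on unsigned values so
   that it wraps the way a 32-bit register would.  */
static uint32_t
csneg (uint32_t wn, uint32_t wm, bool cond)
{
  return cond ? wn : (uint32_t) (0u - wm);
}

/* Hypothetical model of CNEG Wd, Wn, cond, written directly from the
   alias rule quoted above: CSNEG Wd, Wn, Wn, invert(cond).  */
static uint32_t
cneg (uint32_t wn, bool cond)
{
  return csneg (wn, wn, !cond);
}
```

In this model cneg (w9, le) and csneg (w9, w9, gt) compute the same value, which is why the assembler can treat one as an alias of the other.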
Bootstrapped and tested on aarch64-linux-gnu.
Ok for trunk?
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 77bc7cd..2e4b26c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3090,7 +3090,12 @@ (define_insn "csinc3<mode>_insn"
                     (const_int 1))
           (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")))]
   ""
-  "csinc\\t%<w>0, %<w>3, %<w>2, %M1"
+  {
+    if (rtx_equal_p (operands[2], operands[3]))
+      return "cinc\\t%<w>0, %<w>2, %m1";
+    else
+      return "csinc\\t%<w>0, %<w>3, %<w>2, %M1";
+  }
   [(set_attr "type" "csel")]
 )
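For readers less familiar with output templates, the selection logic in this hunk can be sketched outside GCC roughly as follows. This is an illustration only: output_csinc is a hypothetical function, registers are plain integers and conditions are strings, none of which matches GCC's real rtx representation. It assumes %m1 prints the condition and %M1 its inverse.

```c
#include <stdio.h>

/* Hypothetical stand-in for the output template.  COND is the
   condition under which the increment happens; INV_COND is its
   inverse.  R2 and R3 play the role of operands[2] and operands[3].  */
static void
output_csinc (char *buf, size_t len, int rd, int r2, int r3,
              const char *cond, const char *inv_cond)
{
  if (r2 == r3)
    /* Same register on both arms: use the alias with the plain
       condition (the %m1 case in the patch).  */
    snprintf (buf, len, "cinc\tw%d, w%d, %s", rd, r2, cond);
  else
    /* General form: operand 3 first, then operand 2, with the
       inverse condition (the %M1 case).  */
    snprintf (buf, len, "csinc\tw%d, w%d, w%d, %s", rd, r3, r2, inv_cond);
}
```

The decision is made only when the assembly text is printed, after register allocation, so it cannot influence which registers are chosen.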
I guess you do it this way rather than just adding a new alternative in
the pattern to avoid any chance of constraining the register allocator, but
would this not be more natural to read as an {r, r, r, 2} alternative, or
similar?
I had not considered this approach and I'm a bit sceptical about how feasible
it is.
If we put the {r,r,r,2} as a second alternative then it will be a strictly more
restrictive version of the first alternative and so will never match.
If, however, we put it as the first alternative we'll be expressing some
preference for allocating the same register for operands 2 and 3, which is not
something we want to do.
If you've given that some thought and decided it doesn't work for you,
then this is OK for trunk.
Given the above, I'll commit this version next week if there are no objections.
Thanks,
Kyrill
Thanks,
James