RE: [PATCH][ARM] Improve code generation for anddi3

2013-04-15 Thread Kyrylo Tkachov
Ping?

Thanks,
Kyrill

 -----Original Message-----
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Kyrylo Tkachov
 Sent: 08 April 2013 13:47
 To: gcc-patches@gcc.gnu.org
 Cc: Ramana Radhakrishnan; Richard Earnshaw
 Subject: [PATCH][ARM] Improve code generation for anddi3
 
 Hi all,
 
 When compiling:
 
 unsigned long long
 muld (unsigned long long X, unsigned long long Y)
 {
   unsigned long long mask = 0xffffffffull;
   return (X & mask) * (Y & mask);
 }
 
 we get a suboptimal sequence:
 stmfd   sp!, {r4, r5}
 mvn r4, #0
 mov r5, #0
 and r0, r0, r4
 and r3, r3, r5
 and r1, r1, r5
 and r2, r2, r4
 mul r3, r0, r3
 mla r3, r2, r1, r3
 umull   r0, r1, r0, r2
 ldmfd   sp!, {r4, r5}
 add r1, r3, r1
 bx  lr
 
 This patch improves that situation by changing the anddi3 insn into an
 insn_and_split and
 simplifying the SImode ands. Also, the NEON version is merged with the
 non-NEON one.
 This allows us to generate just:
 umull   r0, r1, r2, r0
 bx  lr
 for the above code.
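
 The following is a minimal sketch, not part of the patch, of why the shorter
 sequence is valid. It assumes the mask is 0xffffffff, which matches the
 mvn/mov pair in the sequence above. The helper names and_di_split and
 muld_widening are hypothetical and used only for illustration. A 64-bit AND
 with a constant splits into two independent 32-bit ANDs; with this mask the
 low-half AND is a no-op and the high-half AND yields zero, so the product
 only needs the low words, which is what a single umull computes.

```c
#include <stdint.h>

/* The function from the report, with the mask assumed to be 0xffffffff. */
uint64_t muld(uint64_t x, uint64_t y)
{
    uint64_t mask = 0xffffffffull;
    return (x & mask) * (y & mask);
}

/* Hypothetical helper: a 64-bit AND with a constant, done as two
   independent 32-bit ANDs on the low and high words. */
uint64_t and_di_split(uint64_t x, uint64_t c)
{
    uint32_t lo = (uint32_t)x & (uint32_t)c;
    uint32_t hi = (uint32_t)(x >> 32) & (uint32_t)(c >> 32);
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: a 32x32->64 unsigned multiply of the low words,
   which is the operation umull performs. */
uint64_t muld_widening(uint64_t x, uint64_t y)
{
    return (uint64_t)(uint32_t)x * (uint32_t)y;
}
```

 For any inputs, muld and muld_widening return the same value, and
 and_di_split (x, 0xffffffffull) simply returns the low word of x.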
 
 Regtested arm-none-eabi on qemu.
 Ok for trunk?
 
 Thanks,
 Kyrill
 
 
 gcc/ChangeLog
 2013-04-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
 * config/arm/arm.c (const_ok_for_dimode_op): Handle AND case.
 * config/arm/arm.md (*anddi3_insn): Change to insn_and_split.
 * config/arm/constraints.md (De): New constraint.
 * config/arm/neon.md (anddi3_neon): Delete.
 (neon_vand<mode>): Expand to standard anddi3 pattern.
 * config/arm/predicates.md (imm_for_neon_inv_logic_operand):
 Move earlier in the file.
 (neon_inv_logic_op2): Likewise.
 (arm_anddi_operand_neon): New predicate.
 
 gcc/testsuite/ChangeLog
 2013-04-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
 * gcc.target/arm/anddi3-opt.c: New test.
 * gcc.target/arm/anddi3-opt2.c: Likewise.





Re: [PATCH][ARM] Improve code generation for anddi3

2013-04-15 Thread Richard Earnshaw

On 08/04/13 13:47, Kyrylo Tkachov wrote:

Hi all,

When compiling:

unsigned long long
muld (unsigned long long X, unsigned long long Y)
{
   unsigned long long mask = 0xffffffffull;
   return (X & mask) * (Y & mask);
}

we get a suboptimal sequence:
 stmfd   sp!, {r4, r5}
 mvn r4, #0
 mov r5, #0
 and r0, r0, r4
 and r3, r3, r5
 and r1, r1, r5
 and r2, r2, r4
 mul r3, r0, r3
 mla r3, r2, r1, r3
 umull   r0, r1, r0, r2
 ldmfd   sp!, {r4, r5}
 add r1, r3, r1
 bx  lr

This patch improves that situation by changing the anddi3 insn into an
insn_and_split and
simplifying the SImode ands. Also, the NEON version is merged with the
non-NEON one.
This allows us to generate just:
 umull   r0, r1, r2, r0
 bx  lr
for the above code.

Regtested arm-none-eabi on qemu.
Ok for trunk?

Thanks,
Kyrill


gcc/ChangeLog
2013-04-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm.c (const_ok_for_dimode_op): Handle AND case.
 * config/arm/arm.md (*anddi3_insn): Change to insn_and_split.
 * config/arm/constraints.md (De): New constraint.
 * config/arm/neon.md (anddi3_neon): Delete.
 (neon_vand<mode>): Expand to standard anddi3 pattern.
 * config/arm/predicates.md (imm_for_neon_inv_logic_operand):
 Move earlier in the file.
 (neon_inv_logic_op2): Likewise.
 (arm_anddi_operand_neon): New predicate.

gcc/testsuite/ChangeLog
2013-04-08  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * gcc.target/arm/anddi3-opt.c: New test.
 * gcc.target/arm/anddi3-opt2.c: Likewise.



OK.

R.