Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-06-06 Thread Ramana Radhakrishnan
On Tue, Jun 6, 2017 at 1:56 PM, James Greenhalgh
 wrote:
> On Tue, May 02, 2017 at 04:37:21PM +0100, Tamar Christina wrote:
>> Hi All,
>>
>> This patch adjusts the cost model for Cortex-A53 to increase the costs of
>> an integer division. The reason for this is that we want to always expand
>> the division to a multiply when doing a division by constant.
>>
>> On the Cortex-A53 shifts are modeled to cost 1 instruction,
>> when doing the expansion we have to perform two shifts and an addition.
>> However because the cost model can't model things such as fusing of shifts,
>> we have to fully cost both shifts.
>>
>> This leads to the cost model telling us that for the Cortex-A53 we can never
>> do the expansion. By increasing the costs of the division by two instructions
>> we recover the room required in the cost calculation to do the expansions.
>>
>> The reason for all of this is that currently the code does not produce what
>> you'd expect, which is that division by constants are always expanded. Also
>> it's inconsistent because unsigned division does get expanded.
>>
>> This all reduces the ability to do CSE when using signed modulo since that
>> one is also expanded.
>>
>> Given:
>>
>> void f5(void)
>> {
>>   int x = 0;
>>   while (x > -1000)
>>   {
>> g(x % 300);
>> x--;
>>   }
>> }
>>
>>
>> we now generate
>>
>>   smull   x0, w19, w21
>>   asr x0, x0, 37
>>   sub w0, w0, w19, asr 31
>>   msubw0, w0, w20, w19
>>   sub w19, w19, #1
>>   bl  g
>>
>> as opposed to
>>
>>   sdivw0, w19, w20
>>   msubw0, w0, w20, w19
>>   sub w19, w19, #1
>>   bl  g
>>
>>
>> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
>>
>> OK for trunk?
>
> OK for AArch64, but you'll need an ARM OK too.

OK.

Ramana
>
> Thanks,
> James
>
>> gcc/
>> 2017-05-02  Tamar Christina  
>>
>>   * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv 
>> cost.
>


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-06-06 Thread James Greenhalgh
On Tue, May 02, 2017 at 04:37:21PM +0100, Tamar Christina wrote:
> Hi All, 
> 
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
> 
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
> 
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
> 
> The reason for all of this is that currently the code does not produce what
> you'd expect, which is that division by constants are always expanded. Also
> it's inconsistent because unsigned division does get expanded.
> 
> This all reduces the ability to do CSE when using signed modulo since that
> one is also expanded.
> 
> Given:
> 
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
> g(x % 300);
> x--;
>   }
> }
> 
> 
> we now generate
> 
>   smull   x0, w19, w21
>   asr x0, x0, 37
>   sub w0, w0, w19, asr 31
>   msubw0, w0, w20, w19
>   sub w19, w19, #1
>   bl  g
> 
> as opposed to
> 
>   sdivw0, w19, w20
>   msubw0, w0, w20, w19
>   sub w19, w19, #1
>   bl  g
> 
> 
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
> 
> OK for trunk?

OK for AArch64, but you'll need an ARM OK too.

Thanks,
James

> gcc/
> 2017-05-02  Tamar Christina  
> 
>   * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv 
> cost.



Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-15 Thread Tamar Christina
Hi,

Reg-tested now on arm-none-linux-gnueabihf as well and no regressions.

Ok for trunk?
Tamar

From: Ramana Radhakrishnan <ramana@googlemail.com>
Sent: Monday, May 15, 2017 9:40:40 AM
To: Tamar Christina
Cc: GCC Patches; nd; Kyrylo Tkachov; Richard Earnshaw; Marcus Shawcroft; James 
Greenhalgh; ni...@redhat.com; Ramana Radhakrishnan
Subject: Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

On Tue, May 2, 2017 at 4:37 PM, Tamar Christina <tamar.christ...@arm.com> wrote:
> Hi All,
>
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
>
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
>
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
>
> The reason for all of this is that currently the code does not produce what 
> you'd expect,
> which is that division by constants are always expanded. Also it's 
> inconsistent because
> unsigned division does get expanded.
>
> This all reduces the ability to do CSE when using signed modulo since that 
> one is also expanded.
>
> Given:
>
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
> g(x % 300);
> x--;
>   }
> }
>
>
> we now generate
>
> smull   x0, w19, w21
> asr x0, x0, 37
> sub w0, w0, w19, asr 31
> msubw0, w0, w20, w19
> sub w19, w19, #1
> bl  g
>
> as opposed to
>
> sdivw0, w19, w20
> msubw0, w0, w20, w19
> sub w19, w19, #1
> bl  g
>
>
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

Since this affects the arm port as well, it needs to be regression
tested on arm as well.

Thanks,
Ramana

>
> OK for trunk?
>
> Thanks,
> Tamar
>
>
> gcc/
> 2017-05-02  Tamar Christina  <tamar.christ...@arm.com>
>
> * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase 
> idiv cost.


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-15 Thread Ramana Radhakrishnan
On Tue, May 2, 2017 at 4:37 PM, Tamar Christina  wrote:
> Hi All,
>
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
>
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
>
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
>
> The reason for all of this is that currently the code does not produce what 
> you'd expect,
> which is that division by constants are always expanded. Also it's 
> inconsistent because
> unsigned division does get expanded.
>
> This all reduces the ability to do CSE when using signed modulo since that 
> one is also expanded.
>
> Given:
>
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
> g(x % 300);
> x--;
>   }
> }
>
>
> we now generate
>
> smull   x0, w19, w21
> asr x0, x0, 37
> sub w0, w0, w19, asr 31
> msubw0, w0, w20, w19
> sub w19, w19, #1
> bl  g
>
> as opposed to
>
> sdivw0, w19, w20
> msubw0, w0, w20, w19
> sub w19, w19, #1
> bl  g
>
>
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

Since this affects the arm port as well, it needs to be regression
tested on arm as well.

Thanks,
Ramana

>
> OK for trunk?
>
> Thanks,
> Tamar
>
>
> gcc/
> 2017-05-02  Tamar Christina  
>
> * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase 
> idiv cost.


Re: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

2017-05-15 Thread Tamar Christina
Ping

From: gcc-patches-ow...@gcc.gnu.org  on behalf 
of Tamar Christina 
Sent: Tuesday, May 2, 2017 4:37:21 PM
To: GCC Patches
Cc: nd; Kyrylo Tkachov; Richard Earnshaw; Marcus Shawcroft; James Greenhalgh; 
ni...@redhat.com; Ramana Radhakrishnan
Subject: [PATCH][GCC][AArch64][ARM] Modify idiv costs for Cortex-A53

Hi All,

This patch adjusts the cost model for Cortex-A53 to increase the costs of
an integer division. The reason for this is that we want to always expand
the division to a multiply when doing a division by constant.

On the Cortex-A53 shifts are modeled to cost 1 instruction,
when doing the expansion we have to perform two shifts and an addition.
However because the cost model can't model things such as fusing of shifts,
we have to fully cost both shifts.

This leads to the cost model telling us that for the Cortex-A53 we can never
do the expansion. By increasing the costs of the division by two instructions
we recover the room required in the cost calculation to do the expansions.

The reason for all of this is that currently the code does not produce what 
you'd expect,
which is that division by constants are always expanded. Also it's inconsistent 
because
unsigned division does get expanded.

This all reduces the ability to do CSE when using signed modulo since that one 
is also expanded.

Given:

void f5(void)
{
  int x = 0;
  while (x > -1000)
  {
g(x % 300);
x--;
  }
}


we now generate

smull   x0, w19, w21
asr x0, x0, 37
sub w0, w0, w19, asr 31
msubw0, w0, w20, w19
sub w19, w19, #1
bl  g

as opposed to

sdivw0, w19, w20
msubw0, w0, w20, w19
sub w19, w19, #1
bl  g


Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  

* config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv 
cost.