On Tue, Mar 22, 2016 at 11:22 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
> On Wed, Mar 16, 2016 at 10:06 AM, Richard Biener
> <richard.guent...@gmail.com> wrote:
>>
>> On Wed, Mar 16, 2016 at 10:48 AM, Bin Cheng <bin.ch...@arm.com> wrote:
>> > Hi,
>> > When I tried to decrease the number of IV candidates, I removed the code
>> > that adds IV candidates for a use with the constant offset stripped from
>> > use->base.  That was too aggressive and triggers PR69042.  So here is a
>> > patch adding back the missing candidates.  Honestly, this patch doesn't
>> > truly fix the issue; it just brings back the original behavior in the
>> > IVOPT part (which I think is still the right thing to do).  The issue
>> > still depends on the PIC_OFFSET register used on the x86 target, as
>> > discussed in https://gcc.gnu.org/ml/gcc/2016-02/msg00040.html.
>> > Furthermore, the real problem could be in the register pressure modeling
>> > of the PIC_OFFSET symbol in IVOPT.
>> >
>> > On AArch64, the overall spec2k number is unchanged, though 173.applu
>> > regresses by ~4% because the number of candidates in a couple of loops
>> > now hits "--param iv-consider-all-candidates-bound=30".  For spec2k6 on
>> > AArch64, INT is not affected; FP overall is unchanged, but for specific
>> > cases, 459.GemsFDTD regresses by 2% and 433.milc improves by 2%.  To
>> > address the regression, I will send another patch increasing the
>> > parameter bound.
>> >
>> > Bootstrapped and tested on x86_64 and AArch64; is it OK?  In the
>> > meantime, I will collect spec2k6 data on x86_64.
>>
>> Ok.
> Hi Richard,
> Hmm, I got spec2k6 data on my x86_64; it (along with the patch increasing
> param iv-consider-all-candidates-bound) causes a 1% regression for
> 436.cactusADM in my run.  I looked into the code: for function
> bench_staggeredleapfrog2_ (takes 99% of the running time after patching),
> IVOPT chooses one fewer candidate for the outer loop, but it does result
> in a couple more instructions there.

You mean IVOPTs chooses one fewer IV for the outer loop?

>  For this case, register
> pressure is a more interesting issue (36 candidates chosen in the outer
> loop, many stack accesses); I'm not sure whether this 1% regression
> blocks the patch at this stage or not?

Is this with or without the increase of the param?  What compiler options and
on what sub-architecture was this?

I think if the IVO choice looks optimal before the patch and not optimal after,
then it's worth blocking, but it sounds like the IVO choice is a mess anyway?
[Can you maybe check the IV choice made by ICC?]

Thanks,
Richard.

> Thanks,
> bin
