Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler

Maxim Kuvyrkov Wed, 19 Nov 2014 01:43:00 -0800

On Nov 19, 2014, at 12:27 PM, Ramana Radhakrishnan 
<ramana.radhakrish...@arm.com> wrote:


> 
> 
> On 14/11/14 15:12, Maxim Kuvyrkov wrote:
>> On Nov 14, 2014, at 8:38 AM, Jeff Law <l...@redhat.com> wrote:
>> 
>>> On 10/20/14 22:06, Maxim Kuvyrkov wrote:
>>>> Hi,
>>>> Ramana, this change requires benchmarking, which I can't easily do
>>>> at
>>> the moment. I would appreciate any benchmarking results that you can
>>> share. In particular, the value of PARAM_SCHED_AUTOPREF_QUEUE_DEPTH
>>> needs to be tuned/confirmed for Cortex-A15.
>>> What were the results of that benchmarking?  IIRC I tabled reviewing this 
>>> work waiting for those results (and I probably should have let you know 
>>> that.  Sorry,  my bad there).
>> 
>> I don't have the benchmarking results yet, and I was hoping for ARM to help 
>> with getting the numbers.  The arm maintainers still need to OK the 
>> arm-specific portion of the patch, which, I imagine, will happen only of 
>> benchmark scores improve.
> 
> 
> I tried benchmarking 78f367cfcfdc9f0a422a362cd85ecc122834d96f from the trees 
> you gave me links to against the equivalent version on trunk.
> 
> The results with SPEC2k on A15 were in the noise with the default value for 
> PARAM_SCHED_AUTOPREF_QUEUE_DEPTH which is 2 in the backend. I'm still waiting 
> on results for values 0, 1 and 3 and hopefully something will come back soon 
> for SPEC2k.

I think there is a good chance that this optimization will not improve SPEC 
benchmarks, my main concern is to make sure SPECs don't regress.  If SPECs are 
neutral and the patch gives good improvement (2.5 times faster) on another 
benchmark (STREAM), is this a good enough argument for commit?

> 
> 
>> 
>> @@ -29903,6 +29915,20 @@ arm_first_cycle_multipass_dfa_lookahead (void)
>>   return issue_rate > 1 ? issue_rate : 0;
>> }
>> 
>> +/* Enable modeling of Cortex-A15 L2 auto-prefetcher.  */
>> +static int
>> +arm_first_cycle_multipass_dfa_lookahead_guard (rtx insn, int ready_index)
>> +{
>> +  switch (arm_tune)
>> +    {
>> +    case cortexa15:
>> +      return autopref_multipass_dfa_lookahead_guard (insn, ready_index);
>> +
>> +    default:
>> +      return 0;
>> +    }
>> +}
>> +
> 
> 
> It would be better to have this as a flag in the tuning tables rather than 
> hardcoding for a core here. The backend has been moving in that direction for 
> all core centric information and it is preferable that be continued.
> 
> So this logic here should just be
> 
> if (current_tune->multipass_lookahead)
>  return autopref_multipass_lookahead_guard (insn, ready_index);
> else
>  return 0;
> 

OK, I'll convert this to use tuning tables.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org

Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler

Reply via email to