On Mon, May 7, 2018 at 7:09 AM, Luis Machado <luis.mach...@linaro.org> wrote:
>
>
> On 05/01/2018 03:30 PM, Jeff Law wrote:
>>
>> On 01/22/2018 06:46 AM, Luis Machado wrote:
>>>
>>> This patch adds a new option to control the minimum stride of a memory
>>> reference at or above which the loop prefetch pass may issue software
>>> prefetch hints. There are two motivations:
>>>
>>> * Make the pass less aggressive, only issuing prefetch hints for bigger
>>> strides that are more likely to benefit from prefetching. I've noticed a
>>> case in cpu2017 where we were issuing thousands of hints, for example.
>>>
>>> * For processors that have a hardware prefetcher, like Falkor, it allows
>>> the loop prefetch pass to defer prefetching of smaller (less than the
>>> threshold) strides to the hardware prefetcher instead. This prevents
>>> conflicts between the software prefetcher and the hardware prefetcher.
>>>
>>> I've noticed considerable reduction in the number of prefetch hints and
>>> slightly positive performance numbers. This aligns GCC and LLVM in terms
>>> of prefetch behavior for Falkor.
>>>
>>> The default settings should guarantee no changes for existing targets.
>>> Those are free to tweak the settings as necessary.
>>>
>>> No regressions in the testsuite and bootstrapped ok on aarch64-linux.
>>>
>>> Ok?
>>>
>>> 2018-01-22  Luis Machado  <luis.mach...@linaro.org>
>>>
>>>         Introduce option to limit software prefetching to known constant
>>>         strides above a specific threshold with the goal of preventing
>>>         conflicts with a hardware prefetcher.
>>>
>>>         gcc/
>>>         * config/aarch64/aarch64-protos.h (cpu_prefetch_tune)
>>>         <minimum_stride>: New const int field.
>>>         * config/aarch64/aarch64.c (generic_prefetch_tune): Update to
>>>         include minimum_stride field.
>>>         (exynosm1_prefetch_tune): Likewise.
>>>         (thunderxt88_prefetch_tune): Likewise.
>>>         (thunderx_prefetch_tune): Likewise.
>>>         (thunderx2t99_prefetch_tune): Likewise.
>>>         (qdf24xx_prefetch_tune): Likewise. Set minimum_stride to 2048.
>>>         (aarch64_override_options_internal): Update to set
>>>         PARAM_PREFETCH_MINIMUM_STRIDE.
>>>         * doc/invoke.texi (prefetch-minimum-stride): Document new option.
>>>         * params.def (PARAM_PREFETCH_MINIMUM_STRIDE): New.
>>>         * params.h (PARAM_PREFETCH_MINIMUM_STRIDE): Define.
>>>         * tree-ssa-loop-prefetch.c (should_issue_prefetch_p): Return
>>>         false if stride is constant and is below the minimum stride
>>>         threshold.
>>
>> OK for the trunk.
>> jeff
>>
>
> Thanks. Committed as revision 259995 now.

This breaks bootstrap on x86:

../../src-trunk/gcc/tree-ssa-loop-prefetch.c: In function ‘bool should_issue_prefetch_p(mem_ref*)’:
../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1010:54: error: comparison of integer expressions of different signedness: ‘long long unsigned int’ and ‘int’ [-Werror=sign-compare]
       && absu_hwi (int_cst_value (ref->group->step)) < PREFETCH_MINIMUM_STRIDE)
../../src-trunk/gcc/tree-ssa-loop-prefetch.c:1014:4: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘long long int’ [-Werror=format=]
    "Step for reference %u:%u (%d) is less than the mininum "
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    "required stride of %d\n",
    ~~~~~~~~~~~~~~~~~~~~~~~~~
    ref->group->uid, ref->uid, int_cst_value (ref->group->step),
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
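
For illustration only, here is a minimal standalone sketch of one way both
diagnostics could be avoided. It is not the committed fix. The names
abs_unsigned, step_below_minimum and describe_step are hypothetical
stand-ins, and int64_t/uint64_t stand in for HOST_WIDE_INT and unsigned
HOST_WIDE_INT. The idea is to compare values of matching signedness and to
print the step with a format that matches its width (inside GCC that would
presumably mean HOST_WIDE_INT_PRINT_DEC rather than PRId64).

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for absu_hwi: absolute value returned as unsigned,
// so that INT64_MIN does not overflow.
static uint64_t
abs_unsigned (int64_t x)
{
  return x < 0 ? -(uint64_t) x : (uint64_t) x;
}

// True when a constant step is below the minimum stride.  MIN_STRIDE is
// assumed to be non-negative; converting it to the unsigned type makes both
// operands of '<' the same signedness, which is what -Wsign-compare wants.
static bool
step_below_minimum (int64_t step, int min_stride)
{
  return abs_unsigned (step) < (uint64_t) min_stride;
}

// Build the dump message.  PRId64 matches the 64-bit step argument, which
// is what -Wformat complained about when plain %d was used.
static std::string
describe_step (unsigned group_uid, unsigned ref_uid, int64_t step,
	       int min_stride)
{
  char buf[160];
  std::snprintf (buf, sizeof buf,
		 "Step for reference %u:%u (%" PRId64 ") is less than the "
		 "minimum required stride of %d\n",
		 group_uid, ref_uid, step, min_stride);
  return buf;
}
```

With a threshold of 2048 (the value the patch sets for qdf24xx), a step of
64 or -64 would be reported as below the minimum, while 2048 itself would
not.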

-- 
H.J.
