On 05/16/2018 08:18 AM, Luis Machado wrote:


On 05/16/2018 06:08 AM, Kyrill Tkachov wrote:

On 15/05/18 12:12, Luis Machado wrote:
Hi,

On 05/15/2018 06:37 AM, Kyrill Tkachov wrote:
Hi Luis,

On 14/05/18 22:18, Luis Machado wrote:
Hi,

Here's an updated version of the patch (now reverted) that addresses the previous bootstrap problem (signedness and long long/int conversion).

I've checked that it bootstraps properly on both aarch64-linux and x86_64-linux and that tests look sane.

James, would you please give this one a try to see if you can still reproduce PR85682? I couldn't reproduce it in multiple attempts.


The patch doesn't hit the regressions in PR85682 from what I can see.
I have a comment on the patch below.


Great. Thanks for checking Kyrill.

--- a/gcc/tree-ssa-loop-prefetch.c
+++ b/gcc/tree-ssa-loop-prefetch.c
@@ -992,6 +992,23 @@ prune_by_reuse (struct mem_ref_group *groups)
  static bool
  should_issue_prefetch_p (struct mem_ref *ref)
  {
+  /* Some processors may have a hardware prefetcher that may conflict with
+     prefetch hints for a range of strides.  Make sure we don't issue
+     prefetches for such cases if the stride is within this particular
+     range.  */
+  if (cst_and_fits_in_hwi (ref->group->step)
+      && abs_hwi (int_cst_value (ref->group->step)) <
+      (HOST_WIDE_INT) PREFETCH_MINIMUM_STRIDE)
+    {

The '<' should go on the line below together with PREFETCH_MINIMUM_STRIDE.

I've fixed this locally now.

Thanks. I haven't followed the patch in detail, are you looking for midend changes approval since the last version?
Or do you need aarch64 approval?

The changes are not substantial, but midend approval i what i was aiming at.

Also the confirmation that PR85682 is no longer happening.

James confirmed PR85682 is no longer reproducible with the updated patch and the bootstrap issue is fixed now. So i take it this should be OK to push to mainline?

Also, i'd like to discuss the possibility of having these couple options backported to GCC 8. As is, the changes don't alter code generation by default, but they allow better tuning of the software prefetcher for targets that benefit from it.

Maybe after letting the changes bake on mainline enough to be confirmed stable?

Thanks,
Luis

Reply via email to