pr58955-2.c is miscompiled by RTL scheduling after reload

crazylht at gmail dot com via Gcc-bugs Mon, 26 Jun 2023 01:44:06 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237


--- Comment #15 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to rguent...@suse.de from comment #10)
> On Sun, 25 Jun 2023, crazylht at gmail dot com wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237
> > 
> > --- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
> > 
> > > So we can simply clear only MEM_EXPR (and MEM_OFFSET), that cuts off the
> > > problematic part of alias analysis.  Together with UNSPEC this might be
> > > enough to fix things.
> > > 
> > Note maskstore won't optimized to vpblendd since it doesn't support memory
> > dest, so I guess no need to use UNSPEC for maskstore?
> 
> A maskstore now looks like
> 
> (insn 31 30 32 5 (set (mem:V8DF (plus:DI (reg/v/f:DI 108 [ a ])
>                 (reg:DI 90 [ ivtmp.28 ])) [1  S64 A64])
>         (vec_merge:V8DF (reg:V8DF 115 [ vect__9.14 ])
>             (mem:V8DF (plus:DI (reg/v/f:DI 108 [ a ])
>                     (reg:DI 90 [ ivtmp.28 ])) [1  S64 A64])
>             (reg:QI 100 [ loop_mask_57 ]))) "t.c":6:14 1957 
> {avx512f_storev8df_mask}
> 
> that appears as a full read and a full store which means earlier
> stores to the masked part could be elided rightfully by RTL DSE
> at least.  If there's any RTL analysis about whether a conditional
> load could trap then for example a full V8DF load from
> the same address that's currently conditional but after the above
> could be analyzed as safe to be scheduled speculatively and
> unconditional while it would not be safe as it could trap.
> 

In my understanding, using the whole size for trap analysis is suboptimal but
safe: if the whole-size access is safe, the mask_load/store must be safe too.
It is suboptimal when the whole-size access can trap but the mask_load/store
won't, but we should accept that since the mask is not always known.
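
To make the "suboptimal but safe" argument concrete, here is a minimal sketch in plain C. It is not GCC code: the region model and the names range_is_mapped, masked_access_is_safe_exact and masked_access_is_safe_conservative are hypothetical and exist only for illustration. The conservative check ignores the mask and requires the whole vector to be accessible, so it can only reject accesses that the exact check would accept, never the other way around.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of accessible memory: the bytes [base, base + len). */
struct region { uintptr_t base; size_t len; };

/* True if [addr, addr + size) lies entirely inside the region. */
bool range_is_mapped(const struct region *r, uintptr_t addr, size_t size)
{
    return addr >= r->base
           && size <= r->len
           && addr - r->base <= r->len - size;
}

/* Exact answer, only possible when the mask is known:
   check just the lanes whose mask bit is set. */
bool masked_access_is_safe_exact(const struct region *r, uintptr_t addr,
                                 size_t lane_size, unsigned nlanes,
                                 uint32_t mask)
{
    for (unsigned i = 0; i < nlanes; i++)
        if (((mask >> i) & 1u)
            && !range_is_mapped(r, addr + i * lane_size, lane_size))
            return false;
    return true;
}

/* Conservative answer when the mask is unknown:
   treat the access as touching the whole vector. */
bool masked_access_is_safe_conservative(const struct region *r,
                                        uintptr_t addr, size_t lane_size,
                                        unsigned nlanes)
{
    return range_is_mapped(r, addr, lane_size * nlanes);
}
```

With a 64-byte region and a V8DF-sized access (8 lanes of 8 bytes) starting 32 bytes into it, the conservative check reports "may trap" even though a mask selecting only the low four lanes would be fine. That is the suboptimal case described above; the reverse (conservative says safe, masked access traps) cannot happen in this model.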

I didn't find such a rule in rtx_addr_can_trap_p, only known_subrange_p.

  /* If the pointer based access is bigger than the variable they cannot
     alias.  This is similar to the check below where we use TBAA to
     increase the size of the pointer based access based on the dynamic
     type of a containing object we can infer from it.  */
  poly_int64 dsize2;
  if (known_size_p (size1)
      && poly_int_tree_p (DECL_SIZE (base2), &dsize2)
      && known_lt (dsize2, size1))
    return false;
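
The size rule quoted above can be modelled outside GCC as follows. This is a simplified sketch with plain integers standing in for poly_int64 and DECL_SIZE; may_alias_by_size and SIZE_UNKNOWN are hypothetical names, not GCC identifiers. It illustrates why a masked store recorded with the full vector size could, under this rule, be judged not to alias a smaller variable even if its active lanes touch that variable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker for "size not known at compile time". */
#define SIZE_UNKNOWN (-1)

/* Simplified model of the quoted check: a pointer-based access that is
   known to be bigger than the declared variable cannot alias it.
   Sizes are in bits. Returns false only when aliasing is ruled out. */
bool may_alias_by_size(int64_t access_bits, int64_t decl_bits)
{
    if (access_bits != SIZE_UNKNOWN
        && decl_bits != SIZE_UNKNOWN
        && decl_bits < access_bits)
        return false;   /* access is larger than the whole variable */
    return true;        /* cannot disambiguate on size alone */
}
```

For example, a 512-bit access against a 256-bit variable is disambiguated by this rule, while the same access with an unknown size is not. That matches the earlier suggestion in this thread that dropping MEM_EXPR (and MEM_OFFSET) cuts off this part of the alias analysis.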

[Bug rtl-optimization/110237] gcc.dg/torture/pr58955-2.c is miscompiled by RTL scheduling after reload
