[Bug tree-optimization/100253] [10/11/12 Regression] wrong code with -O2 -fno-tree-bit-ccp -ftree-slp-vectorize (unaligned movdqa)

2021-04-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253

--- Comment #7 from CVS Commits  ---
The master branch has been updated by Richard Biener :

https://gcc.gnu.org/g:af4ccaa7515b8e72449448c509916575831e6292

commit r12-284-gaf4ccaa7515b8e72449448c509916575831e6292
Author: Richard Biener 
Date:   Thu Apr 29 11:52:08 2021 +0200

tree-optimization/100253 - fix bogus aligned vectorized loads/stores

At some point DR_MISALIGNMENT was supposed to be -1 when the
access was not element aligned.  That's obviously not true at this
point so this adjusts both store and load vectorizing to no longer
assume this which in turn allows simplifying the code.

2021-04-29  Richard Biener  

PR tree-optimization/100253
* tree-vect-stmts.c (vectorizable_load): Do not assume
element alignment when DR_MISALIGNMENT is -1.
(vectorizable_store): Likewise.

* g++.dg/pr100253.C: New testcase.
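
For illustration only, and not the code from this commit: a minimal C sketch of how a conservative alignment could be derived when the misalignment is known in bytes relative to a power-of-two target alignment. The helper name `provable_alignment` is hypothetical and does not exist in GCC.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function.  Given an address known to
   satisfy addr % align == misalign (align a power of two, both in
   bytes), return the largest power of two guaranteed to divide addr.  */
unsigned
provable_alignment (unsigned align, unsigned misalign)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  assert (misalign < align);
  unsigned bits = align | misalign;
  /* Isolate the lowest set bit of align | misalign.  */
  return bits & (~bits + 1u);
}
```

With the values reported in comment #6 (align 16, misalign 9) this yields 1, so only byte alignment could safely be claimed for that access.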


2021-04-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253

--- Comment #6 from Richard Biener  ---
So the issue is we're getting a dataref pointer like

 <__int128 unsigned> [(char * {ref-all}) + 25B]

and the first access has DR_MISALIGNMENT of 9 and the target alignment is 16.
So we have

align == 16
misalign == 9

then we do

  data_ref = fold_build2 (MEM_REF, vectype,
                          dataref_ptr,
                          dataref_offset
                          ? dataref_offset
                          : build_int_cst (ref_type, 0));
  if (aligned_access_p (first_dr_info))
    ;
  else if (DR_MISALIGNMENT (first_dr_info) == -1)
    TREE_TYPE (data_ref)
      = build_aligned_type (TREE_TYPE (data_ref),
                            align * BITS_PER_UNIT);
  else
    TREE_TYPE (data_ref)
      = build_aligned_type (TREE_TYPE (data_ref),
                            TYPE_ALIGN (elem_type));

but since DR_MISALIGNMENT is not -1 we take the final branch and assume
element alignment (since DR_MISALIGNMENT is the misalignment in elements
and, at least at some point, wasn't arbitrary ... unless I misremember).
Since the vector type is vector(1) __int128 unsigned, the element
alignment equals the vector alignment, so we get an aligned access.
Note how we're using 'align' in the == -1 case, but that's the target
alignment ...

The load code has the same issue.  I'm testing a simplification.
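
To make the failure mode concrete, here is a toy C model of the three-way branch quoted above. It is a sketch under stated assumptions, not GCC code: the function names and parameters are hypothetical, misalignment is taken in bytes, and -1 stands for "unknown" as in the quoted condition.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_BYTE 8

/* Hypothetical model of the quoted branch: which alignment (in bits)
   ends up on the memory reference's type.  */
unsigned
claimed_align_bits (int misalign, unsigned target_align_bytes,
                    unsigned vector_align_bits, unsigned elem_align_bits)
{
  if (misalign == 0)
    return vector_align_bits;                   /* aligned access */
  else if (misalign == -1)
    return target_align_bytes * BITS_PER_BYTE;  /* the '== -1' case */
  else
    return elem_align_bits;                     /* assumes element alignment */
}

/* Hypothetical check: does an address with the given known byte
   misalignment (relative to target_align_bytes) really have the
   claimed alignment?  */
bool
claim_holds (unsigned claimed_bits, unsigned misalign,
             unsigned target_align_bytes)
{
  unsigned claimed_bytes = claimed_bits / BITS_PER_BYTE;
  assert (claimed_bytes != 0);
  return claimed_bytes <= target_align_bytes
         && misalign % claimed_bytes == 0;
}
```

For the values in this report (misalign 9, target alignment 16 bytes, a one-element vector whose element and vector alignment are both 128 bits), the model claims 128-bit alignment and `claim_holds` rejects it, which matches the aligned access described above.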


2021-04-26 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253

Richard Biener  changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
           Priority|P3                           |P2
   Target Milestone|---                          |10.4

--- Comment #5 from Richard Biener  ---
I will have a look.


2021-04-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253

--- Comment #4 from Andrew Pinski  ---
(In reply to Hongtao.liu from comment #3)
> > I think SLP did not mark the load as unaligned even though it knows it is
> > one:
> But gimple tree is marked as aligned.

Right, and we are saying the same thing, just differently. SLP is what needs
to mark the load as unaligned, as it creates the (gimple) load in the first
place.


2021-04-25 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253

--- Comment #3 from Hongtao.liu  ---
(In reply to Andrew Pinski from comment #2)
> The problem is right away in expand:
> ;; vect__36.383_12 = MEM  [(char *
> {ref-all})_10 + 16B];
> 
> (insn 23 22 0 (set (reg:V1TI 88 [ vect__36.383 ])
> (mem:V1TI (plus:DI (reg/f:DI 86 [ _10 ])
> (const_int 16 [0x10])) [0 MEM 
> [(char * {ref-all})_10 + 16B]+0 S16 A128])) -1
>  (nil))
> 
> 
> I think SLP did not mark the load as unaligned even though it knows it is
> one:
But gimple tree is marked as aligned.

 
unit-size 
align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7fffea300a80 precision:128 min  max

pointer_to_this >
unsigned V1TI size  unit-size

align:128 warn_if_not_align:0 symtab:0 alias-set 31 canonical-type
0x7fffe9a59150 nunits:1
pointer_to_this >

arg:0 
sizes-gimplified public unsigned type_6 DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x7fffea30c498>
visited
def_stmt _11 =  + _214;
version:11
ptr-info 0x7fffe9487330>
arg:1 
constant 16>>


2021-04-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
          Component|rtl-optimization            |tree-optimization
   Last reconfirmed|                            |2021-04-25
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski  ---
The problem is right away in expand:
;; vect__36.383_12 = MEM  [(char * {ref-all})_10 +
16B];

(insn 23 22 0 (set (reg:V1TI 88 [ vect__36.383 ])
(mem:V1TI (plus:DI (reg/f:DI 86 [ _10 ])
(const_int 16 [0x10])) [0 MEM 
[(char * {ref-all})_10 + 16B]+0 S16 A128])) -1
 (nil))


I think SLP did not mark the load as unaligned even though it knows it is one:
t.cc:7:8: note:   Vectorizing an unaligned access.
t.cc:7:8: note:   vect_model_load_cost: unaligned supported by hardware.
t.cc:7:8: note:   vect_model_load_cost: inside_cost = 24, prologue_cost = 0 .
t.cc:7:8: note:   ==> examining statement: MEM <__int128 unsigned> [(char *
{ref-all}) + 25B] = _36;
t.cc:7:8: note:   vect_is_simple_use: operand # VUSE <.MEM_30>
MEM <__int128 unsignedD.19> [(charD.10 * {ref-all})_10], type of def: internal
t.cc:7:8: note:   vect_is_simple_use: operand # VUSE <.MEM_35>
MEM <__int128 unsignedD.19> [(charD.10 * {ref-all})_19], type of def: internal
t.cc:7:8: note:   Vectorizing an unaligned access.
t.cc:7:8: note:   vect_model_store_cost: unaligned supported by hardware.

Confirmed.

When tree-bit-ccp is turned off (-fno-tree-bit-ccp), the propagation of the
unalignedness does not happen.