[Bug rtl-optimization/53652] *andn* isn't used for vectorization

2021-12-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

Andrew Pinski  changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
   Last reconfirmed|            |2021-12-27
           Severity|normal      |enhancement
             Status|UNCONFIRMED |NEW
     Ever confirmed|0           |1

--- Comment #7 from Andrew Pinski  ---
Simplified testcase from PR 56876:

typedef unsigned long long vec __attribute__((vector_size(16)));
vec g;
vec f1(vec a, vec b){
  return ~a & b;
}
vec f2(vec a, vec b){
  return ~g & b;
}

f2 is similar to the testcase referenced in comment #0.
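
For comparison, here is a minimal sketch of what the desired code looks like when
written by hand on x86.  It assumes GCC or Clang vector extensions; the helper
names andnot_generic, andnot_explicit, vec_same and check_andnot are made up for
illustration and are not part of the PR.  _mm_andnot_si128 is the SSE2 intrinsic
for pandn; on other targets the sketch falls back to the generic expression.

```c
#include <assert.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

typedef unsigned long long vec __attribute__((vector_size(16)));

/* Generic form, as in the testcase: the compiler has to find the andnot. */
vec andnot_generic(vec a, vec b) {
  return ~a & b;
}

/* Hand-written form (hypothetical helper). */
vec andnot_explicit(vec a, vec b) {
#if defined(__SSE2__)
  /* _mm_andnot_si128(x, y) computes (~x) & y, i.e. a single pandn. */
  return (vec)_mm_andnot_si128((__m128i)a, (__m128i)b);
#else
  return ~a & b;  /* fallback so the sketch still builds off x86 */
#endif
}

/* Returns 1 when both vectors hold the same bits. */
int vec_same(vec x, vec y) {
  return memcmp(&x, &y, sizeof x) == 0;
}

/* Both forms must agree bit for bit. */
void check_andnot(void) {
  vec a = {0xF0F0F0F0F0F0F0F0ULL, 0ULL};
  vec b = {~0ULL, 0x1234ULL};
  assert(vec_same(andnot_generic(a, b), andnot_explicit(a, b)));
}
```

Comparing the assembly of the two functions at -O2 is one way to see whether the
generic form is being combined into pandn on a given compiler version.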

2021-12-26 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

Andrew Pinski  changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
                 CC|            |glisse at gcc dot gnu.org

--- Comment #6 from Andrew Pinski  ---
*** Bug 56876 has been marked as a duplicate of this bug. ***

2019-07-19 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

--- Comment #5 from Segher Boessenkool  ---
It might work a lot better if it didn't have to load that all-ones vector
in a separate insn.  Because it does, you need to do a 3->3 combination
(which we do not currently support) if you need to do the memory load in
a separate insn, as well as the insn needed to keep the constant load
(it isn't dead yet; later insns use that same value again).

So that would mean having insns (that split) for doing a NOT.

2019-07-19 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

--- Comment #4 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #3)
> Ran into this again in the context of PR91204; there is another case that
> isn't matched, for a different reason:
> int a, b, c[64];
> 
> void
> foo (void)
> {
>   int i;
>   for (i = 0; i < 64; i++)
>     c[i] = ~c[i] & b;
> }
> In this case the loop has been unrolled and the combiner even tries to match
> (set (reg:V4SI 137 [ vect__4.8 ])
>     (and:V4SI (not:V4SI (mem/c:V4SI (symbol_ref:DI ("c") [flags 0x2]  ) [1 MEM  [(int *)]+0 S16 A128]))
>         (reg:V4SI 132)))
> but doesn't match that, as a memory operand is not allowed in the andnot
> patterns (perhaps it should be, and we should just wait for reload to cure
> it up).

It should also accept a memory operand; this is the way we trick the combiner
in several other places.

2019-07-19 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

Jakub Jelinek  changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
                 CC|            |segher at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Ran into this again in the context of PR91204; there is another case that isn't
matched, for a different reason:
int a, b, c[64];

void
foo (void)
{
  int i;
  for (i = 0; i < 64; i++)
    c[i] = ~c[i] & b;
}
In this case the loop has been unrolled and the combiner even tries to match
(set (reg:V4SI 137 [ vect__4.8 ])
    (and:V4SI (not:V4SI (mem/c:V4SI (symbol_ref:DI ("c") [flags 0x2]  ) [1 MEM  [(int *)]+0 S16 A128]))
        (reg:V4SI 132)))
but doesn't match that, as a memory operand is not allowed in the andnot patterns
(perhaps it should be, and we should just wait for reload to cure it up).
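
The vectorized loop can be written out at the source level to make the shape of
that failed match easier to see.  This is only a sketch using GCC/Clang vector
extensions: foo_scalar, foo_vector and check_foo are hypothetical names, and the
array and b are passed as parameters (instead of being globals as in the
testcase) so the two versions can be compared.

```c
#include <assert.h>
#include <string.h>

typedef int v4si __attribute__((vector_size(16)));

enum { N = 64 };

/* Scalar loop from the testcase. */
void foo_scalar(int *c, int b) {
  for (int i = 0; i < N; i++)
    c[i] = ~c[i] & b;
}

/* Roughly what the vectorizer produces: b is broadcast once, then each
   iteration is (and (not (mem c)) (reg b)) on four ints. */
void foo_vector(int *c, int b) {
  v4si vb = {b, b, b, b};
  for (int i = 0; i < N; i += 4) {
    v4si vc;
    memcpy(&vc, &c[i], sizeof vc);  /* the memory operand under the NOT */
    vc = ~vc & vb;                  /* candidate for a single andnot */
    memcpy(&c[i], &vc, sizeof vc);
  }
}

/* Run both versions on the same input and require identical results. */
void check_foo(int b) {
  int s[N], v[N];
  for (int i = 0; i < N; i++)
    s[i] = v[i] = i * 37 - 100;
  foo_scalar(s, b);
  foo_vector(v, b);
  assert(memcmp(s, v, sizeof s) == 0);
}
```

The load of c[i] is the operand being negated, which is why allowing a memory
operand in the andnot patterns matters for this case.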

2012-06-14 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-06-14 10:23:31 UTC ---
Such a define_insn_and_split isn't going to work well, because the hw supported
alternative (xor with all ones vector) needs the vector constant loaded from
memory, which is much preferable to do before the loop, and nothing post
combine is going to move it before the loop again.
The combiner can already look at the REG_EQUAL note:
(insn 25 21 27 3 (set (reg:V4DI 90 [ vect_var_.18 ])
        (xor:V4DI (mem:V4DI (plus:DI (reg:DI 78 [ ivtmp.28 ])
                    (symbol_ref:DI ("c") <var_decl 0x7f09fb364280 c>)) [2 MEM[symbol: c, index: ivtmp.28_16, offset: 0B]+0 S32 A256])
            (reg:V4DI 94))) v2.c:10 1587 {*xorv4di3}
     (expr_list:REG_EQUAL (not:V4DI (mem:V4DI (plus:DI (symbol_ref:DI ("c") <var_decl 0x7f09fb364280 c>)
                    (reg:DI 78 [ ivtmp.28 ])) [2 MEM[symbol: c, index: ivtmp.28_16, offset: 0B]+0 S32 A256]))
        (nil)))

(insn 27 25 28 3 (set (reg:V4DI 93 [ vect_var_.19 ])
        (and:V4DI (reg:V4DI 90 [ vect_var_.18 ])
            (mem:V4DI (plus:DI (reg:DI 78 [ ivtmp.28 ])
                    (symbol_ref:DI ("b") <var_decl 0x7f09fb3641e0 b>)) [2 MEM[symbol: b, index: ivtmp.28_16, offset: 0B]+0 S32 A256]))) v2.c:10 1585 {*andv4di3}
     (expr_list:REG_DEAD (reg:V4DI 90 [ vect_var_.18 ])
        (nil)))
but doesn't use that.  The additional complication here is that the XOR operand
(and hence the REG_EQUAL not note) and the other AND operand are both MEMs, while
andn on x86_64/i?86 only supports one of the operands as MEM.  The combiner would
then need to split that into a load followed by andn, in place of the 3 insns
(one load before the loop, xor, and and).
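
The two instruction shapes discussed above can be sketched at the source level.
This is an illustration only, using GCC/Clang vector extensions: the function
names, the array length N and the separate output array are assumptions, since
the original v2.c is not shown in this thread.

```c
#include <assert.h>
#include <string.h>

typedef long long v4di __attribute__((vector_size(32)));

enum { N = 16 };  /* assumed length, a multiple of 4 */

/* Current shape: all-ones constant loaded before the loop, then XOR, then AND. */
void xor_then_and(long long *out, const long long *c, const long long *b) {
  const v4di ones = {-1LL, -1LL, -1LL, -1LL};  /* plays the role of reg 94 */
  for (int i = 0; i < N; i += 4) {
    v4di vc, vb, t;
    memcpy(&vc, &c[i], sizeof vc);
    memcpy(&vb, &b[i], sizeof vb);
    t = vc ^ ones;  /* insn 25; REG_EQUAL says this is (not vc) */
    t = t & vb;     /* insn 27 */
    memcpy(&out[i], &t, sizeof t);
  }
}

/* Wanted shape: explicit load of c, then one and-not with b as the MEM operand. */
void load_then_andn(long long *out, const long long *c, const long long *b) {
  for (int i = 0; i < N; i += 4) {
    v4di vc, vb, t;
    memcpy(&vc, &c[i], sizeof vc);  /* the separate load */
    memcpy(&vb, &b[i], sizeof vb);
    t = ~vc & vb;                   /* andn */
    memcpy(&out[i], &t, sizeof t);
  }
}

/* Both shapes must match each other and the scalar definition. */
void check_shapes(void) {
  long long b[N], c[N], r1[N], r2[N];
  for (int i = 0; i < N; i++) {
    b[i] = 0x00FF00FF00FF00FFLL + i;
    c[i] = (long long)i * 0x0101010101LL;
  }
  xor_then_and(r1, c, b);
  load_then_andn(r2, c, b);
  assert(memcmp(r1, r2, sizeof r1) == 0);
  for (int i = 0; i < N; i++)
    assert(r1[i] == (~c[i] & b[i]));
}
```

The second form needs no all-ones constant at all, which is the point of getting
the combiner to use the REG_EQUAL note.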


2012-06-13 Thread hp at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652

Hans-Peter Nilsson hp at gcc dot gnu.org changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
                 CC|            |hp at gcc dot gnu.org

--- Comment #1 from Hans-Peter Nilsson hp at gcc dot gnu.org 2012-06-14 03:45:59 UTC ---
I'd humbly suggest adding a not-recognizer anonymous insn-and-split pattern,
with a clear comment that combine needs this as a stepping stone to combine
into the andnot.
BTW, this PR seems target-specific; I don't see it for my vector back-end
(using 4.7.1-ish, which has and, not, and andnot patterns for V4SI).