Re: [PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-13 Thread Richard Biener
On Wed, 13 Jan 2021, Jakub Jelinek wrote:
> On Wed, Jan 13, 2021 at 08:26:49AM +0100, Richard Biener wrote:
> > +  if (op1 && op0 != op1)
> > +    op1 = force_reg (vmode, op1);
> >
> > code (presumably to handle RTX sharing here)?
>
> That could be actually simplified, incrementally e.g. to:
>

Re: [PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-13 Thread Uros Bizjak via Gcc-patches
On Wed, Jan 13, 2021 at 8:13 AM Jakub Jelinek wrote:
>
> Hi!
>
> The following patch implements what I've talked about, i.e. to no longer
> force operands of vec_perm_const into registers in the generic code, but let
> each of the (currently 8) targets force it into registers individually,
> givin

Re: [PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-13 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 13, 2021 at 08:26:49AM +0100, Richard Biener wrote:
> +  if (op1 && op0 != op1)
> +    op1 = force_reg (vmode, op1);
>
> code (presumably to handle RTX sharing here)?

That could be actually simplified, incrementally e.g. to:

  if (op0)
    {
      rtx nop0 = force_reg (vmode, op0);

Re: [PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 13, 2021 at 08:26:49AM +0100, Richard Biener wrote:
> On Wed, 13 Jan 2021, Jakub Jelinek wrote:
>
> > Hi!
> >
> > The following patch implements what I've talked about, i.e. to no longer
> > force operands of vec_perm_const into registers in the generic code, but let
> > each of the (

Re: [PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-12 Thread Richard Biener
On Wed, 13 Jan 2021, Jakub Jelinek wrote:
> Hi!
>
> The following patch implements what I've talked about, i.e. to no longer
> force operands of vec_perm_const into registers in the generic code, but let
> each of the (currently 8) targets force it into registers individually,
> giving the target

[PATCH] i386, expand: Optimize also 256-bit and 512-bit permutatations as vpmovzx if possible [PR95905]

2021-01-12 Thread Jakub Jelinek via Gcc-patches
Hi! The following patch implements what I've talked about, i.e. to no longer force operands of vec_perm_const into registers in the generic code, but let each of the (currently 8) targets force it into registers individually, giving the targets better control on if it does that and when and allowi
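For readers unfamiliar with why a permutation can be expressed as a zero extension at all, here is a minimal sketch in plain C. It is not taken from the patch and does not use GCC internals; the names perm2, zext_lo and perm_matches_zext, and the 16-byte toy vector size, are assumptions made purely for illustration. It models a two-operand byte permutation whose second operand is an all-zero vector and whose selector interleaves the low half of the first operand with zeros, and checks that the result is byte-for-byte what a little-endian zero extension of those bytes to 16-bit elements would produce.

```c
#include <stddef.h>
#include <stdint.h>

#define N 16 /* bytes per vector in this toy model (an assumption) */

/* Hypothetical model of a two-operand permutation: a selector value
   below N picks op0[sel], a value in [N, 2N) picks op1[sel - N].  */
static void perm2(const uint8_t *op0, const uint8_t *op1,
                  const uint8_t *sel, uint8_t *out)
{
  for (size_t i = 0; i < N; i++)
    out[i] = sel[i] < N ? op0[sel[i]] : op1[sel[i] - N];
}

/* Zero-extend the low N/2 bytes of src to 16-bit elements.  */
static void zext_lo(const uint8_t *src, uint16_t *out)
{
  for (size_t i = 0; i < N / 2; i++)
    out[i] = src[i];
}

/* Returns 1 if interleaving the low half of v with a zero vector
   gives the same bytes as zero extension (little-endian layout).  */
int perm_matches_zext(void)
{
  uint8_t v[N], zero[N] = { 0 }, sel[N], permuted[N];
  uint16_t widened[N / 2];

  for (size_t i = 0; i < N; i++)
    v[i] = (uint8_t) (0xA0 + i);

  /* Selector { 0, N, 1, N+1, ... }: a byte of v, then a zero byte.  */
  for (size_t i = 0; i < N / 2; i++)
    {
      sel[2 * i] = (uint8_t) i;
      sel[2 * i + 1] = (uint8_t) (N + i);
    }

  perm2(v, zero, sel, permuted);
  zext_lo(v, widened);

  /* Compare low and high bytes explicitly, so the check does not
     depend on the host's endianness.  */
  for (size_t i = 0; i < N / 2; i++)
    if (permuted[2 * i] != (widened[i] & 0xff)
        || permuted[2 * i + 1] != (widened[i] >> 8))
      return 0;
  return 1;
}
```

Calling perm_matches_zext () exercises the comparison. The real patch works on RTL operands and vector modes rather than byte arrays, so this only illustrates the equivalence the subject line refers to.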