On Thu, Aug 14, 2014 at 1:08 AM, Evgeny Stupachenko <evstu...@gmail.com> wrote:
> Ping.
>
> On Thu, Jul 10, 2014 at 7:29 PM, Evgeny Stupachenko <evstu...@gmail.com> wrote:
>> On Mon, Jul 7, 2014 at 6:40 PM, Richard Henderson <r...@redhat.com> wrote:
>>> On 07/03/2014 02:53 AM, Evgeny Stupachenko wrote:
>>>> -expand_vec_perm_palignr (struct expand_vec_perm_d *d)
>>>> +expand_vec_perm_palignr (struct expand_vec_perm_d *d, int insn_num)
>>>
>>> insn_num might as well be "bool avx2", since it's only ever set to two values.
>>
>> Agreed. However:
>>  - after the alignment, the one-operand permutation could be just a
>>    move, so the scheme could take 2 instructions for AVX2 as well;
>>  - for AVX2 there could be other cases where the scheme takes 4 or 5
>>    instructions;
>>  - we can leave it for a potential AVX-512 extension.
>>
>>>
>>>> -  /* Even with AVX, palignr only operates on 128-bit vectors.  */
>>>> -  if (!TARGET_SSSE3 || GET_MODE_SIZE (d->vmode) != 16)
>>>> +  /* SSSE3 is required to apply PALIGNR on 16-byte operands.  */
>>>> +  if (GET_MODE_SIZE (d->vmode) == 16)
>>>> +    {
>>>> +      if (!TARGET_SSSE3)
>>>> +       return false;
>>>> +    }
>>>> +  /* AVX2 is required to apply PALIGNR on 32-byte operands.  */
>>>> +  else if (GET_MODE_SIZE (d->vmode) == 32)
>>>> +    {
>>>> +      if (!TARGET_AVX2)
>>>> +       return false;
>>>> +    }
>>>> +  /* Other sizes are not supported.  */
>>>> +  else
>>>>      return false;
>>>
>>> And you'd better check it up here because...
>>>
>>
>> Correct. The following should resolve the issue:
>>   /* For AVX2 we need more than 2 instructions when the alignment
>>      by itself does not produce the desired permutation.  */
>>   if (TARGET_AVX2 && insn_num <= 2)
>>     return false;
>>
>>>> +  /* For SSSE3 we need 1 instruction for palignr plus 1 for one
>>>> +     operand permutation.  */
>>>> +  if (insn_num == 2)
>>>> +    {
>>>> +      ok = expand_vec_perm_1 (&dcopy);
>>>> +      gcc_assert (ok);
>>>> +    }
>>>> +  /* For AVX2 we need 2 instructions for the shift: vpalignr and
>>>> +     vperm plus 4 instructions for one operand permutation.  */
>>>> +  else if (insn_num == 6)
>>>> +    {
>>>> +      ok = expand_vec_perm_vpshufb2_vpermq (&dcopy);
>>>> +      gcc_assert (ok);
>>>> +    }
>>>> +  else
>>>> +    ok = false;
>>>>    return ok;
>>>
>>> ... down here you'll simply ICE from the gcc_assert.
>>
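
To make the review point above concrete, here is a minimal standalone sketch of "reject unsupported combinations up front so the later assert cannot fire". It is a simplified model, not the patch's logic and not GCC code: struct perm_request, palignr_request_ok and try_palignr are hypothetical names, and the pairing of 16 bytes with a budget of 2 and 32 bytes with a budget of 6 is an assumption read off the quoted comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC state; none of these names exist in GCC.  */
struct perm_request
{
  int vec_bytes;   /* models GET_MODE_SIZE (d->vmode) */
  bool has_ssse3;  /* models TARGET_SSSE3 */
  bool has_avx2;   /* models TARGET_AVX2 */
  int insn_num;    /* instruction budget passed by the caller */
};

/* Return true only for (size, target, budget) combinations that the
   expansion step is assumed to handle.  Everything else is rejected
   here, before any instruction would be emitted.  */
static bool
palignr_request_ok (const struct perm_request *r)
{
  if (r->vec_bytes == 16)
    return r->has_ssse3 && r->insn_num == 2;
  if (r->vec_bytes == 32)
    return r->has_avx2 && r->insn_num == 6;
  return false;  /* Other sizes are not supported.  */
}

/* Models the tail of the function.  Because the check above runs first,
   the assert below documents an invariant instead of guarding against
   a reachable failure.  */
static bool
try_palignr (const struct perm_request *r)
{
  if (!palignr_request_ok (r))
    return false;
  assert (r->insn_num == 2 || r->insn_num == 6);
  return true;
}
```

With this shape, a 32-byte request with a budget of 2 returns false from the early check instead of reaching the assert. Whether the real function should pair sizes and budgets exactly this way is a design question for the patch author.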

Can you modify your patch to fix

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62128

with a testcase?


-- 
H.J.
