https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
It seems we have VECTOR_CST_NELTS_PER_PATTERN ({ 9.0e+0, 0.0, 0.0, 0.0 })
== 2 and VECTOR_CST_NPATTERNS == 1.  And the selector { 1, 0, 1, 2 } has
npatterns == 1 and nelts-per-pattern == 3.
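
To make those two encodings concrete, here is a minimal standalone sketch of
how such a compressed encoding expands into elements.  It is not GCC code:
encoded_vec, elt and expand are hypothetical names, plain long values stand in
for tree constants (9 stands for 9.0e+0), and the rule assumed is the usual
one: 1 element per pattern repeats it, 2 repeat the second, 3 continue with
the step between the second and third.

```cpp
#include <vector>

// Hypothetical stand-in for a compressed vector constant or selector.
// `encoded` holds npatterns * nelts_per_pattern values, interleaved
// by pattern.
struct encoded_vec {
  unsigned npatterns;
  unsigned nelts_per_pattern;
  std::vector<long> encoded;
};

// Return element I of the full vector described by V.
long elt(const encoded_vec &v, unsigned i) {
  unsigned pattern = i % v.npatterns;   // which pattern I belongs to
  unsigned count = i / v.npatterns;     // position within that pattern
  if (count < v.nelts_per_pattern)
    return v.encoded[count * v.npatterns + pattern];

  // Past the explicitly encoded elements: extend the pattern.
  unsigned last_idx = v.nelts_per_pattern - 1;
  long last = v.encoded[last_idx * v.npatterns + pattern];
  if (v.nelts_per_pattern < 3)
    return last;                        // duplicate the final element

  // Stepped pattern: keep adding the difference of the last two elements.
  long prev = v.encoded[(last_idx - 1) * v.npatterns + pattern];
  long step = last - prev;
  return last + static_cast<long>(count - last_idx) * step;
}

// Expand V to a full vector of NELTS elements.
std::vector<long> expand(const encoded_vec &v, unsigned nelts) {
  std::vector<long> out;
  for (unsigned i = 0; i < nelts; ++i)
    out.push_back(elt(v, i));
  return out;
}
```

Under these assumptions, expand ({1, 2, {9, 0}}, 4) gives { 9, 0, 0, 0 } and
expand ({1, 3, {1, 0, 1}}, 4) gives { 1, 0, 1, 2 }, matching the constant and
the selector above.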

  /* (1) If SEL is a suitable mask as determined by
     valid_mask_for_fold_vec_perm_cst_p, then:
     res_npatterns = max of npatterns between ARG0, ARG1, and SEL
     res_nelts_per_pattern = max of nelts_per_pattern between
                             ARG0, ARG1 and SEL.
     (2) If SEL is not a suitable mask, and TYPE is VLS then:
     res_npatterns = nelts in result vector.
     res_nelts_per_pattern = 1.
     This exception is made so that VLS ARG0, ARG1 and SEL work as before.  */
  if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason))
    {
      res_npatterns
        = std::max (VECTOR_CST_NPATTERNS (arg0),
                    std::max (VECTOR_CST_NPATTERNS (arg1),
                              sel.encoding ().npatterns ()));

      res_nelts_per_pattern
        = std::max (VECTOR_CST_NELTS_PER_PATTERN (arg0),
                    std::max (VECTOR_CST_NELTS_PER_PATTERN (arg1),
                              sel.encoding ().nelts_per_pattern ()));

      res_nelts = res_npatterns * res_nelts_per_pattern;

This seems to be a case that doesn't fit, so the fix needs to go into
valid_mask_for_fold_vec_perm_cst_p, which really looks a bit
unwieldy.

An assert that res_nelts is power-of-two would be nice to add.
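
As a rough illustration of what such an assert would catch, the standalone
sketch below redoes the max computation from the quoted code on plain
integers.  It is not a patch: is_pow2, result_encoding and result_shape are
hypothetical names, and ARG1's encoding is not stated in this comment, so the
usage note simply assumes npatterns == 1 and nelts-per-pattern == 1 for it.

```cpp
#include <algorithm>

// Hypothetical helper: true if X is a nonzero power of two.
constexpr bool is_pow2(unsigned x) {
  return x != 0 && (x & (x - 1)) == 0;
}

// Shape of the folded result as computed in case (1) of the quoted code.
struct result_encoding {
  unsigned npatterns;
  unsigned nelts_per_pattern;
  unsigned nelts;
};

// Take the maximum npatterns and nelts_per_pattern over ARG0, ARG1 and SEL,
// then multiply them, mirroring the quoted std::max expressions.
result_encoding result_shape(unsigned np0, unsigned nepp0,
                             unsigned np1, unsigned nepp1,
                             unsigned np_sel, unsigned nepp_sel) {
  unsigned np = std::max(np0, std::max(np1, np_sel));
  unsigned nepp = std::max(nepp0, std::max(nepp1, nepp_sel));
  return {np, nepp, np * nepp};
}
```

With ARG0 as (1, 2), the assumed ARG1 as (1, 1) and SEL as (1, 3),
result_shape (1, 2, 1, 1, 1, 3) yields npatterns 1, nelts-per-pattern 3 and
nelts 3, so a check like is_pow2 (res_nelts) would fail for a 4-element
result type.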
