https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101267
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we're having

(gdb) p debug (slp_node)
t.f90:1:21: note: node 0x39fbbc0 (max_nunits=2, refcnt=1)
t.f90:1:21: note: op template: _144 = .MASK_LOAD (_143, 32B, _142);
t.f90:1:21: note:       stmt 0 _144 = .MASK_LOAD (_143, 32B, _142);
t.f90:1:21: note:       stmt 1 _146 = .MASK_LOAD (_145, 32B, _142);
t.f90:1:21: note:       children 0x39fbc48

where vectorizable_load invokes

8500          if (!vect_check_scalar_mask (vinfo, stmt_info, mask, &mask_dt,
8501                                       &mask_vectype))
8502            return false;

but the SLP child is

(gdb) p debug ((slp_tree)0x39fbc48)
t.f90:1:21: note: node (external) 0x39fbc48 (max_nunits=1, refcnt=1)
t.f90:1:21: note:       { _142, _142 }

so it won't have a vector type set.  In fact vect_check_scalar_mask doesn't
seem to be prepared for SLP at all - we're lucky it "works" but then most
definitely it won't for externals.

You'll note that the SLP variant for vect_is_simple_use won't be applicable
here since we only have SLP representations for the mask operand which isn't
even the first one.

The SLP "support" for masked loads was added by Alejandro Martinez it seems,
CCing other ARM folks.

A possible fix is to simply give up for external SLP defs above, the internal
def case was probably working by chance.  I'm testing such a fix.
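
To illustrate the "give up for external SLP defs" idea, here is a minimal self-contained toy model in C++. It is only a sketch and not the actual patch: ToySlpNode, DefKind, and toy_check_mask are hypothetical stand-ins invented for this example and do not correspond to GCC's real slp_tree, vect_def_type, or vect_check_scalar_mask signatures. The "V2BI" string is likewise just a placeholder for a mask vector type.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT GCC's real types.
enum class DefKind { Internal, External };

struct ToySlpNode {
    DefKind kind;                        // how the node's operands are defined
    std::optional<std::string> vectype;  // unset for external nodes
    std::vector<std::string> scalar_ops; // e.g. { "_142", "_142" }
};

// Toy analogue of a mask check that is aware of SLP nodes.
// Returns false (meaning: do not vectorize this load) when the mask
// operand's node is external, since such a node carries no vector type.
bool toy_check_mask(const ToySlpNode &mask_node, std::string *mask_vectype_out) {
    if (mask_node.kind == DefKind::External)
        return false;                    // give up on external SLP defs
    if (!mask_node.vectype)
        return false;                    // no vector type available at all
    *mask_vectype_out = *mask_node.vectype;
    return true;
}
```

With this model, a node shaped like the external child shown in the dump ({ _142, _142 } with no vector type) is rejected up front instead of being read as if it had a vector type, while an internal node with a type set still passes.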