Richard Biener <rguent...@suse.de> writes:
> On Fri, 11 Aug 2023, juzhe.zh...@rivai.ai wrote:
>
>> Hi, Richi.
>>
>> > 1. Target is using loop MASK as the partial vector loop control.
>> >> I don't think it checks for this?
>>
>> I am not sure whether I understand EXTRACT_LAST correctly.
>> But if target doesn't use loop MASK for partial vector loop control, how
>> does target use EXTRACT_LAST?
>> Since EXTRACT_LAST is always extracting the last element of the vector
>> according to MASK operand.
>>
>> > But we don't really know this at this point?  The only thing we know
>> > is that nothing set LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P to false.
>>
>> Yes. So I am trying to use 'get_len_load_store' to check whether target
>> supports LEN loop control.
>>
>> Well, I admit it's not a good idea.
>>
>>
>> > I think it should work to change the direct_internal_fn_supported_p
>> > check for IFN_EXTRACT_LAST to a "positive" one guarding
>>
>> >       gcc_assert (ncopies == 1 && !slp_node);
>> >       vect_record_loop_mask (loop_vinfo,
>> >                              &LOOP_VINFO_MASKS (loop_vinfo),
>> >                              1, vectype, NULL);
>>
>> > and in the else branch check for VEC_EXTRACT support and if present
>> > record a loop len.  Just in this case this particular order would
>> > be important.
>>
>> Do you mean change the code as follows?
>>
>> -      if (!direct_internal_fn_supported_p (IFN_EXTRACT_LAST, vectype,
>> -                                           OPTIMIZE_FOR_SPEED))
>> -        {
>> -          if (dump_enabled_p ())
>> -            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> -                             "can't operate on partial vectors "
>> -                             "because the target doesn't support extract "
>> -                             "last reduction.\n");
>> -          LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>> -        }
>> -      else if (slp_node)
>>        if (slp_node)
>>          {
>>            if (dump_enabled_p ())
>>              dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>>                               "can't operate on partial vectors "
>>                               "because an SLP statement is live after "
>>                               "the loop.\n");
>>            LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>>          }
>>        else if (ncopies > 1)
>>          {
>>            if (dump_enabled_p ())
>>              dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>>                               "can't operate on partial vectors "
>>                               "because ncopies is greater than 1.\n");
>>            LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
>>          }
>>        else
>>          {
>>            gcc_assert (ncopies == 1 && !slp_node);
>>            if (direct_internal_fn_supported_p (IFN_EXTRACT_LAST, vectype,
>>                                                OPTIMIZE_FOR_SPEED))
>>              vect_record_loop_mask (loop_vinfo,
>>                                     &LOOP_VINFO_MASKS (loop_vinfo),
>>                                     1, vectype, NULL);
>>            else
>
> check here the target supports VEC_EXTRACT
>
>>              vect_record_loop_len (loop_vinfo,
>>                                    &LOOP_VINFO_LENS (loop_vinfo),
>>                                    1, vectype, 1);
>
> else set LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P to false with a
> diagnostic.

I agree with all this FWIW.  That is, the check should be based on
.VEC_EXTRACT alone, but .EXTRACT_LAST should take priority (not least
because SVE provides both .VEC_EXTRACT and .EXTRACT_LAST).

Thanks,
Richard
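
For illustration only, the ordering agreed on above can be modelled as a small standalone decision function. This is a sketch and not GCC code: `TargetCaps`, `LiveOpControl` and `choose_live_op_control` are hypothetical names, and the two booleans merely stand in for the real support queries for IFN_EXTRACT_LAST and .VEC_EXTRACT.

```cpp
#include <cassert>

// Hypothetical stand-in for target capabilities; not a GCC type.
struct TargetCaps {
  bool has_extract_last;  // models support for IFN_EXTRACT_LAST
  bool has_vec_extract;   // models support for .VEC_EXTRACT
};

// Which partial-vector control the live operation would record.
enum class LiveOpControl { Mask, Len, NoPartialVectors };

// Models the order of checks discussed in the thread (assumed, simplified):
// SLP and ncopies > 1 rule out partial vectors; otherwise EXTRACT_LAST
// takes priority over VEC_EXTRACT, and having neither disables partial
// vectors.
LiveOpControl choose_live_op_control(const TargetCaps &caps, bool slp_node,
                                     unsigned ncopies) {
  assert(ncopies >= 1);
  if (slp_node)
    return LiveOpControl::NoPartialVectors;
  if (ncopies > 1)
    return LiveOpControl::NoPartialVectors;
  if (caps.has_extract_last)
    return LiveOpControl::Mask;  // would record a loop mask
  if (caps.has_vec_extract)
    return LiveOpControl::Len;   // would record a loop len
  return LiveOpControl::NoPartialVectors;
}
```

In this model, a target that reports both capabilities (as the thread says SVE does) takes the mask path, which is why the order of the two checks matters.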