https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83660
acsawdey at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |acsawdey at gcc dot gnu.org

--- Comment #11 from acsawdey at gcc dot gnu.org ---
Looking at the dump of an analogous test case for vec_insert:

#include <altivec.h>

typedef __vector unsigned int uvec32_t __attribute__((__aligned__(16)));

uvec32_t get_word(uvec32_t v)
{
    return({const unsigned _B1 = 32; vec_insert(10, (uvec32_t)v, 2);});
}

It seems that we do get an additional cleanup_point like you are proposing to
add for vec_extract, which is maybe why that does not get into trouble:

;; Function __vector(4) unsigned int get_word(__vector(4) unsigned int) (null)
;; enabled by -tree-original

{
  <<cleanup_point return <retval> = {
    const unsigned int _B1 = 32;

    <<cleanup_point   const unsigned int _B1 = 32;>>;
    <<cleanup_point *((unsigned int *) &TARGET_EXPR <D.3231, NON_LVALUE_EXPR <v>> + 8) = 10;, D.3231>>;
  }>>;
}

I've gotten as far as seeing that something is calling
fold_build_cleanup_point_expr an additional time compared to the vec_extract
example.
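For anyone without a PowerPC toolchain at hand, here is a rough portable sketch of what the store in the dump appears to do. It is not the code GCC generates. It assumes the "+ 8" in the dump is a byte offset (element 2 of four 32-bit words), it swaps altivec.h for the GCC/Clang generic vector extension, and the names insert_word_sketch and self_check are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Assumption: a generic 16-byte vector stands in for the AltiVec
   "__vector unsigned int" type used in the original test case. */
typedef unsigned int uvec32_t __attribute__((vector_size(16)));

/* Hypothetical stand-in for get_word: same statement-expression shape,
   with the unused const local kept as in the test case. */
uvec32_t insert_word_sketch(uvec32_t v)
{
    return ({ const unsigned _B1 = 32;
              (void)_B1;
              uvec32_t tmp = v;       /* temporary copy, like TARGET_EXPR <D.3231, v> */
              unsigned int ten = 10;
              /* Store 10 at byte offset 8, i.e. element index 2. */
              memcpy((char *)&tmp + 8, &ten, sizeof ten);
              tmp; });                /* value of the statement expression */
}

/* Small usage check: only element 2 should change. */
void self_check(void)
{
    uvec32_t v = {1, 2, 3, 4};
    uvec32_t r = insert_word_sketch(v);
    assert(r[0] == 1 && r[1] == 2 && r[2] == 10 && r[3] == 4);
}
```

The memcpy is used instead of a pointer cast only to keep the sketch free of aliasing questions. Writing tmp[2] = 10 would express the same element store.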