https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866
--- Comment #10 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
Right, it would be a good optimization. We've stopped focusing much on P8 optimization work at this point, simply for lack of resources.

The needed transform is to recognize load-xxlnor-vperm as a group and combine it into invload-vperm. But this requires that the loaded constant not be used elsewhere (unlikely, but possible), or, if it is, that all such uses are also xxlnor-vperm, so dataflow analysis for reached uses is required. Not completely trivial. Because it's a P8-only optimization, it's a bit lower on the priority list.
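
For illustration only, here is a rough sketch in C of the legality check described above, written against a toy straight-line IR rather than GCC's RTL. Everything in it is an assumption made for the example: the `toy_*` names, the instruction layout, the single-definition-per-register rule, and the reading of the complement as `xxlnor t, r, r`. It takes "invload" to mean loading the already-inverted constant, so the check answers one question: does every reached use of the loaded register go through an xxlnor whose result is used only as a vperm control vector?

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes, hypothetical: this is not GCC's RTL or GIMPLE. */
enum toy_op { TOY_LOAD_CONST, TOY_XXLNOR, TOY_VPERM, TOY_OTHER };

/* One instruction: dest = op(src[0], src[1], src[2]); -1 marks an unused
   slot. For TOY_VPERM, src[2] is the permute control vector.
   Assumption: straight-line code, each register defined exactly once. */
struct toy_insn {
    enum toy_op op;
    int dest;
    int src[3];
};

/* True if `reg` has at least one use and every use is the control
   operand of a vperm. */
static bool toy_only_vperm_control(const struct toy_insn *insns, size_t n,
                                   int reg)
{
    bool seen = false;
    for (size_t i = 0; i < n; i++) {
        for (int k = 0; k < 3; k++) {
            if (insns[i].src[k] != reg)
                continue;
            if (insns[i].op != TOY_VPERM || k != 2)
                return false;   /* used somewhere other than a vperm mask */
            seen = true;
        }
    }
    return seen;
}

/* True if the constant load at `load_idx` could be replaced by a load of
   the inverted constant: all of its uses must be complements
   (xxlnor t, r, r) that in turn feed only vperm control operands. */
bool toy_can_invert_load(const struct toy_insn *insns, size_t n,
                         size_t load_idx)
{
    assert(load_idx < n && insns[load_idx].op == TOY_LOAD_CONST);
    int reg = insns[load_idx].dest;
    assert(reg >= 0);

    bool seen = false;
    for (size_t i = 0; i < n; i++) {
        const struct toy_insn *u = &insns[i];
        bool uses = u->src[0] == reg || u->src[1] == reg || u->src[2] == reg;
        if (!uses)
            continue;
        /* Any use that is not the complement idiom blocks the transform. */
        if (u->op != TOY_XXLNOR || u->src[0] != reg || u->src[1] != reg
            || u->src[2] != -1 || u->dest < 0)
            return false;
        if (!toy_only_vperm_control(insns, n, u->dest))
            return false;
        seen = true;
    }
    return seen;    /* a load with no uses is not a candidate */
}
```

A real pass would have to walk def-use chains across basic blocks instead of scanning one array, which is where the "not completely trivial" part comes in. The sketch also follows the stricter wording of the comment by requiring each xxlnor result to feed only vperm.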