https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811
--- Comment #18 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I made a typo:

Mainline with -O2 -flto -march=native run manually since build machinery patch
is needed
23.03
22.85
23.04

Should be

Mainline with -O3 -flto -march=native run manually since build machinery patch
is needed
23.03
22.85
23.04

So with -O2 we still get a slightly lower score than clang; with -O3 we are
slightly better. push_back inlining does not seem to be the problem (as tested
by increasing the limits), so perhaps clang has more aggressive
unrolling/vectorization settings at -O2.

I think upstream jpegxl should use -O3 or -Ofast instead of -O2. It is quite a
typical kind of task that benefits from higher optimization levels.

I filed https://github.com/libjxl/libjxl/issues/2970