https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811
Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-05-17

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Confirmed. LTO is not necessary to reproduce the difference. I got libjxl and
the test JPEG file from the Phoronix test suite and configured the clang build
with:

CC=clang CXX=clang++ CFLAGS="-O3 -g -march=native -fno-exceptions" CXXFLAGS="-O3 -g -march=native -fno-exceptions" cmake -DCMAKE_C_FLAGS_RELEASE="$CFLAGS -DNDEBUG" -DCMAKE_CXX_FLAGS_RELEASE="$CXXFLAGS -DNDEBUG" -DBUILD_TESTING=OFF ..

and the GCC build with:

CFLAGS="-O3 -g -march=native -fno-exceptions" CXXFLAGS="-O3 -g -march=native -fno-exceptions" cmake -DCMAKE_C_FLAGS_RELEASE="$CFLAGS -DNDEBUG" -DCMAKE_CXX_FLAGS_RELEASE="$CXXFLAGS -DNDEBUG" -DBUILD_TESTING=OFF ..

jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-gcc/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0
JPEG XL encoder v0.7.0 [AVX2]
No output file specified.
Encoding will be performed, but the result will be discarded.
Read 6000x4000 image, 7837694 bytes, 926.0 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288431 bytes including container (0.763 bpp).
6000 x 4000, 11.12 MP/s [11.12, 11.12], 1 reps, 16 threads.
jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-gcc/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 926.5 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288431 bytes including container (0.763 bpp).
6000 x 4000, 11.09 MP/s [11.09, 11.09], 1 reps, 16 threads.
jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-gcc/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 925.6 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288431 bytes including container (0.763 bpp).
6000 x 4000, 11.12 MP/s [11.12, 11.12], 1 reps, 16 threads.
jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-clang/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 924.6 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288430 bytes including container (0.763 bpp).
6000 x 4000, 15.17 MP/s [15.17, 15.17], 1 reps, 16 threads.
jh@ryzen3:~/.phoronix-test-suite/installed-tests/pts/jpegxl-1.5.0> ./libjxl-0.7.0/build-clang/tools/cjxl sample-photo-6000x4000.JPG --quality=90 --lossless_jpeg=0 test
JPEG XL encoder v0.7.0 [AVX2]
Read 6000x4000 image, 7837694 bytes, 922.4 MP/s
Encoding [Container | VarDCT, d1.000, effort: 7 | 29424-byte Exif],
Compressed to 2288430 bytes including container (0.763 bpp).
6000 x 4000, 15.18 MP/s [15.18, 15.18], 1 reps, 16 threads.

So the GCC build encodes at about 11 MP/s while the clang build reaches about
15 MP/s.
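
To put a number on the gap, here is a rough back-of-the-envelope sketch that uses only the encode throughputs printed by the runs above. The variable names are illustrative and not part of cjxl or the test suite, and the GCC list includes the first run, which had no output file specified. With so few repetitions the result should be read as an estimate, not a careful benchmark.

```python
# Encode throughput in MP/s, copied from the cjxl runs quoted above.
# Names are hypothetical; nothing here comes from cjxl itself.
gcc_mps = [11.12, 11.09, 11.12]   # build-gcc runs
clang_mps = [15.17, 15.18]        # build-clang runs


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


gcc_mean = mean(gcc_mps)
clang_mean = mean(clang_mps)
speedup = clang_mean / gcc_mean   # > 1 means the clang build is faster

print(f"GCC mean:   {gcc_mean:.2f} MP/s")
print(f"clang mean: {clang_mean:.2f} MP/s")
print(f"clang/GCC:  {speedup:.2f}x")
```

On these figures the clang build comes out roughly 1.37x faster than the GCC build.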