https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113005
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I can confirm that on x86_64-linux with 16 cores/32 threads even the -O0
rwlock_1 and rwlock_3 tests aren't that slow, but with OMP_NUM_THREADS=1024
and higher rwlock_1 STOPs:

$ OMP_NUM_THREADS=256 LD_LIBRARY_PATH=../.libs/ time ./rwlock_1.exe
2.21user 8.88system 0:00.88elapsed 1260%CPU (0avgtext+0avgdata 47324maxresident)k
0inputs+206848outputs (0major+29321minor)pagefaults 0swaps

$ OMP_NUM_THREADS=512 LD_LIBRARY_PATH=../.libs/ time ./rwlock_1.exe
4.59user 14.41system 0:02.29elapsed 829%CPU (0avgtext+0avgdata 89464maxresident)k
0inputs+413696outputs (0major+55232minor)pagefaults 0swaps

$ OMP_NUM_THREADS=1024 LD_LIBRARY_PATH=../.libs/ time ./rwlock_1.exe
STOP 2
STOP 2
Command exited with non-zero status 2
0.04user 0.78system 0:00.14elapsed 586%CPU (0avgtext+0avgdata 18976maxresident)k
0inputs+1672outputs (2major+4138minor)pagefaults 0swaps

$ OMP_NUM_THREADS=1024 LD_LIBRARY_PATH=../.libs/ time ./rwlock_3.exe
13.57user 49.00system 0:17.66elapsed 354%CPU (0avgtext+0avgdata 26588maxresident)k
0inputs+0outputs (1major+5987minor)pagefaults 0swaps

$ OMP_NUM_THREADS=2048 LD_LIBRARY_PATH=../.libs/ time ./rwlock_3.exe
38.15user 134.83system 0:51.26elapsed 337%CPU (0avgtext+0avgdata 45860maxresident)k
0inputs+0outputs (0major+10844minor)pagefaults 0swaps