Hi Thomas,

On 2023/12/21 19:42, Thomas Schwinge wrote:
Hi!

On 2023-12-13T21:52:29+0100, I wrote:
On 2023-12-12T02:05:26+0000, "Zhu, Lipeng" <lipeng....@intel.com> wrote:
On 2023/12/12 1:45, H.J. Lu wrote:
On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng <lipeng....@intel.com> wrote:
On 2023/12/9 23:23, Jakub Jelinek wrote:
On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote:
This patch try to introduce the rwlock and split the read/write to
unit_root tree and unit_cache with rwlock instead of the mutex to
increase CPU efficiency. In the get_gfc_unit function, the
percentage to step into the insert_unit function is around 30%, in
most instances, we can get the unit in the phase of reading the
unit_cache or unit_root tree. So split the read/write phase by
rwlock would be an approach to make it more parallel.

BTW, the IPC metrics can gain around 9x in our test server with
220 cores. The benchmark we used is
https://github.com/rwesson/NEAT

Ok for trunk, thanks.

Thanks! Looking forward to landing to trunk.

Pushed for you.

I've just filed <https://gcc.gnu.org/PR113005>
"'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test 
timeouts".
Would you be able to look into that?

See my update in there.


Grüße
  Thomas
-------------- > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße
201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955


Since I don't have gcc bugzilla account. Reply in this thread:
Limit themselves to some lower 'OMP_NUM_THREADS' should be an option or increase the execution timeout?

But I can't reproduce the execution timeout failure on both powerpc9 and powerpc10 arch machine. And I also tried to decrease the CPU frequency from 2.6G to 800M, these test cases still can run successfully.

> so only a little bit of an improvement of the new "rwlock" libgfortran vs. old "mutex" GCC 10 one, curiously. (But supposedly that depends on the hardware or other factors?)

The rwlock can increase the IPC a lot, maybe the wall time you listed is not obvious.

$ grep ^cpu < /proc/cpuinfo | uniq -c

192 cpu             : POWER10 (architected), altivec supported

Native configuration is powerpc64le-unknown-linux-gnu

Schedule of variations:
    unix

PASS: libgomp.fortran/rwlock_1.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) PASS: libgomp.fortran/rwlock_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: libgomp.fortran/rwlock_1.f90   -O3 -g  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_1.f90   -Os  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -Os  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) PASS: libgomp.fortran/rwlock_2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
PASS: libgomp.fortran/rwlock_2.f90   -O3 -g  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_2.f90   -Os  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -Os  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) PASS: libgomp.fortran/rwlock_3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test

Lipeng Zhu

Reply via email to