Re: Libitm issues porting to POWER8 HTM
On Wed, Jun 19, 2013 at 11:04:25AM -0500, Peter Bergner wrote:
> On Tue, 2013-06-18 at 21:48 +0200, Andi Kleen wrote:
> > > Given Torvald's comment, can you verify whether your hw txn succeeds
> > > (all the way to commit) or whether it is failing and somehow skips
> > > the fall through code that is hanging for us (Power and S390)?
> >
> > All the 3 transactions in reentrant.c abort.
>
> Can you please explain the above?  When you say abort, do you mean
> that libitm is calling htm_abort() or that your xbegin hardware
> instruction isn't succeeding?

XBEGIN aborts, according to the hardware counters.

> > That's not surprising, because there are usually lots of aborts in
> > the startup phase of programs, and the test doesn't use a loop.
>
> Is this a libitm statement or an Intel RTM statement, that the
> startup phase usually has lots of aborts?

This is an Intel RTM statement.

-Andi
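The distinction Peter asks about (an abort requested by software versus XBEGIN itself failing) is visible in the abort status that RTM leaves in EAX. The following is a minimal sketch, not libitm code: the toy_ constants mirror the bit positions of the _XABORT_* macros in <immintrin.h> and are redefined here so the sketch builds without -mrtm or TSX hardware, and classify_abort is a hypothetical helper introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit positions of the RTM abort status word. These mirror the _XABORT_*
// macros from <immintrin.h>; the toy_ names are local to this sketch.
constexpr std::uint32_t toy_xabort_explicit = 1u << 0; // XABORT was executed
constexpr std::uint32_t toy_xabort_retry    = 1u << 1; // a retry may succeed
constexpr std::uint32_t toy_xabort_conflict = 1u << 2; // memory conflict
constexpr std::uint32_t toy_xabort_capacity = 1u << 3; // buffer overflow

// Hypothetical helper: turn an abort status word into a readable label.
std::string classify_abort(std::uint32_t status) {
    if (status & toy_xabort_explicit) {
        // The 8-bit XABORT argument is reported in bits 31:24.
        unsigned code = (status >> 24) & 0xffu;
        return "explicit abort, code " + std::to_string(code);
    }
    if (status & toy_xabort_conflict) return "hardware abort: conflict";
    if (status & toy_xabort_capacity) return "hardware abort: capacity";
    if (status & toy_xabort_retry)    return "hardware abort: transient";
    return "hardware abort: no cause reported";
}
```

A status with only the conflict bit set is classified as a hardware abort, while a status produced by XABORT carries the explicit bit together with its 8-bit code.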
Re: Libitm issues porting to POWER8 HTM
On Tue, 2013-06-18 at 21:48 +0200, Andi Kleen wrote:
> > Given Torvald's comment, can you verify whether your hw txn succeeds
> > (all the way to commit) or whether it is failing and somehow skips
> > the fall through code that is hanging for us (Power and S390)?
>
> All the 3 transactions in reentrant.c abort.

Can you please explain the above?  When you say abort, do you mean
that libitm is calling htm_abort() or that your xbegin hardware
instruction isn't succeeding?

> That's not surprising, because there are usually lots of aborts in
> the startup phase of programs, and the test doesn't use a loop.

Is this a libitm statement or an Intel RTM statement, that the
startup phase usually has lots of aborts?

Peter
Re: Libitm issues porting to POWER8 HTM
> Given Torvald's comment, can you verify whether your hw txn succeeds
> (all the way to commit) or whether it is failing and somehow skips
> the fall through code that is hanging for us (Power and S390)?

All the 3 transactions in reentrant.c abort.

That's not surprising, because there are usually lots of aborts in
the startup phase of programs, and the test doesn't use a loop.

-Andi

--
a...@linux.intel.com -- Speaking for myself only.
Re: Libitm issues porting to POWER8 HTM
On Tue, 2013-06-18 at 18:41 +0200, Torvald Riegel wrote:
> On Fri, 2013-06-14 at 19:44 -0500, Peter Bergner wrote:
> > I'll note that if I hack the call to
> > htm_abort_should_retry(ret) so that we break out of the loop and fall
> > back to SW TM, then the test case executes correctly.
>
> That matches what I suppose the bug is.
>
> Please feel free to create a bug report.  I will work on a patch.

Done.  http://gcc.gnu.org/PR57643

Since this seems to pass on x86, let me know if you want me to test a
patch on our power8 system.

Peter
Re: Libitm issues porting to POWER8 HTM
On Tue, 2013-06-18 at 11:22 -0700, Andi Kleen wrote:
> Peter Bergner writes:
> >
> > I have yet to track down who has the write lock and why, but I am working
> > towards that.  Talking with Andreas, he said he is seeing the same failure
> > on S390, so I'm wondering whether this might be a generic libitm issue
> > and it might hit Intel too.  Does anyone know whether this executes
> > correctly on Intel hardware with RTM?  I'll note that if I hack the call to
>
> FWIW on a TSX system I get the following for libitm with current
> trunk.  So no hangs on reentrant at least.

Given Torvald's comment, can you verify whether your hw txn succeeds
(all the way to commit) or whether it is failing and somehow skips
the fall through code that is hanging for us (Power and S390)?

Thanks!

Peter
Re: Libitm issues porting to POWER8 HTM
Peter Bergner writes:
>
> I have yet to track down who has the write lock and why, but I am working
> towards that.  Talking with Andreas, he said he is seeing the same failure
> on S390, so I'm wondering whether this might be a generic libitm issue
> and it might hit Intel too.  Does anyone know whether this executes correctly
> on Intel hardware with RTM?  I'll note that if I hack the call to

FWIW on a TSX system I get the following for libitm with current
trunk.  So no hangs on reentrant at least.

Native configuration is x86_64-unknown-linux-gnu

                === libitm tests ===

Schedule of variations:
    unix

Running target unix
Running /home/ak/gcc/gcc/libitm/testsuite/libitm.c/c.exp ...
PASS: libitm.c/cancel.c (test for excess errors)
PASS: libitm.c/cancel.c execution test
PASS: libitm.c/clone-1.c (test for excess errors)
PASS: libitm.c/clone-1.c execution test
PASS: libitm.c/dropref-2.c (test for excess errors)
XFAIL: libitm.c/dropref-2.c execution test
PASS: libitm.c/dropref.c (test for excess errors)
XFAIL: libitm.c/dropref.c execution test
PASS: libitm.c/memcpy-1.c (test for excess errors)
PASS: libitm.c/memcpy-1.c execution test
PASS: libitm.c/memset-1.c (test for excess errors)
PASS: libitm.c/memset-1.c execution test
PASS: libitm.c/notx.c (test for excess errors)
PASS: libitm.c/notx.c execution test
PASS: libitm.c/reentrant.c (test for excess errors)
PASS: libitm.c/reentrant.c execution test
PASS: libitm.c/simple-1.c (test for excess errors)
PASS: libitm.c/simple-1.c execution test
PASS: libitm.c/simple-2.c (test for excess errors)
PASS: libitm.c/simple-2.c execution test
PASS: libitm.c/stackundo.c (test for excess errors)
PASS: libitm.c/stackundo.c execution test
PASS: libitm.c/txrelease.c (test for excess errors)
PASS: libitm.c/txrelease.c execution test
Running /home/ak/gcc/gcc/libitm/testsuite/libitm.c++/c++.exp ...
PASS: libitm.c++/dropref.C (test for excess errors)
XFAIL: libitm.c++/dropref.C execution test
PASS: libitm.c++/eh-1.C (test for excess errors)
PASS: libitm.c++/eh-1.C execution test
UNSUPPORTED: libitm.c++/static_ctor.C
PASS: libitm.c++/throwdown.C (test for excess errors)

                === libitm Summary ===

# of expected passes            26
# of expected failures          3
# of unsupported tests          1

-Andi

--
a...@linux.intel.com -- Speaking for myself only
Re: Libitm issues porting to POWER8 HTM
On Fri, 2013-06-14 at 19:44 -0500, Peter Bergner wrote:
> I'm currently implementing support for hardware transactional memory in
> the rs6000 backend for POWER8.  Things seem to be mostly working, but I
> have run into a few issues I'm wondering whether other people are seeing.
>
> For me, all of the libitm execution test cases in libitm/testsuite/libitm.c/
> compile and execute without error, except for reentrant.c, which hangs for me.
> My gdb hasn't been ported to support HTM on Power yet, so debugging has been
> slow, but what I've learned is that my tbegin. instruction succeeds, but I
> fail the test (meaning someone has the write lock) at beginend.cc:200:
>
>     if (unlikely(serial_lock.is_write_locked()))
>       htm_abort();
>
> ...so we abort the transaction.  The failure is not persistent, so we do
> not break out of the loop due to:
>
>     if (!htm_abort_should_retry(ret))
>       break;
>
> We then fall into the following code, where we hang trying to get the
> read lock:
>
>     serial_lock.read_lock(tx);
>
> I have yet to track down who has the write lock and why, but I am working
> towards that.  Talking with Andreas, he said he is seeing the same failure
> on S390, so I'm wondering whether this might be a generic libitm issue
> and it might hit Intel too.

I think that this is a bug in libitm's HTM fastpath.  What I suppose
happens is that we have a relaxed outermost transaction that executes
unsafe code (see reentrant.c), thus switches to serial-irrevocable mode,
and then tries to start a nested transaction.  The nested txn then
observes in the HTM fastpath that there is a serial-mode txn already,
but it never checks whether it is enclosed in an already serial
outermost transaction.

> Does anyone know whether this executes correctly
> on Intel hardware with RTM?

I don't know currently, but I suppose the bug should trigger there too
(unless, for some reason, the nested txn always aborts immediately with
RTM).

> I'll note that if I hack the call to
> htm_abort_should_retry(ret) so that we break out of the loop and fall
> back to SW TM, then the test case executes correctly.

That matches what I suppose the bug is.

Please feel free to create a bug report.  I will work on a patch.

Torvald
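The diagnosis above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the libitm source and not the eventual patch: ToyThread, ToySerialLock, begin_transaction and the check_own_serial_mode flag are all hypothetical names, and the blocking read_lock call is replaced by a return value so that the sketch terminates instead of hanging.

```cpp
#include <cassert>

// Toy stand-ins, NOT libitm's real types: just enough state to model the
// control flow discussed in this thread.
struct ToyThread {
    bool serial_mode = false;  // thread already runs a serial-irrevocable txn
};

struct ToySerialLock {
    const ToyThread* writer = nullptr;  // current write-lock holder, if any
    bool is_write_locked() const { return writer != nullptr; }
};

enum class Begin { HtmFastpath, SoftwareFallback, WaitThenRetry, SelfDeadlock };

// Models one pass through the HTM fastpath when a transaction begins.
// check_own_serial_mode is the hypothetical extra test; with it disabled,
// the model follows the path reported as hanging on Power and S390.
Begin begin_transaction(const ToyThread& tx, const ToySerialLock& lock,
                        bool check_own_serial_mode) {
    if (check_own_serial_mode && tx.serial_mode)
        return Begin::SoftwareFallback;  // nested inside our own serial txn
    if (!lock.is_write_locked())
        return Begin::HtmFastpath;       // the hardware txn may proceed
    // The hardware txn is aborted here. The abort counts as retryable, so
    // the caller waits on the read lock before trying again.
    if (lock.writer == &tx)
        return Begin::SelfDeadlock;      // waiting for ourselves never ends
    return Begin::WaitThenRetry;         // some other thread is the writer
}
```

With the check disabled, a nested begin on the thread that already holds the write lock yields SelfDeadlock; with it enabled, the same call goes straight to the software path, which matches the behaviour Peter saw when he forced the fallback.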
Re: Libitm issues porting to POWER8 HTM
Hi Peter,

On Sat, Jun 15, 2013 at 2:44 AM, Peter Bergner wrote:
> I'm currently implementing support for hardware transactional memory in
> the rs6000 backend for POWER8.  Things seem to be mostly working, but I
> have run into a few issues I'm wondering whether other people are seeing.

It sounds great! Is it already publicly available?

> Finally, when compiling (static or non-static) static_ctor.C, I'm seeing:
>
> /home/bergner/gcc/gcc-fsf-mainline-htm/libitm/testsuite/libitm.c++/static_ctor.C:12:18:
> error: unsafe function call 'void __cxa_guard_release(long long int*)'
> within 'transaction_safe' function
>   static int y = x;
>                  ^
> /home/bergner/gcc/gcc-fsf-mainline-htm/libitm/testsuite/libitm.c++/static_ctor.C:12:18:
> error: unsafe function call 'int __cxa_guard_acquire(long long int*)' within
> 'transaction_safe' function
>
> Does x86 not get calls to __cxa_guard_acquire and __cxa_guard_release for
> this access, so it doesn't see this error?  To be honest, I'm not sure
> what we're supposed to do with this error.

Sorry, I don't have answers to your previous questions (I may in the
future, once I get a CPU with HTM).
About the last one, this has been failing for a long time now (even on x86):
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51173

Indeed, static constructors are not transaction safe yet and we should
have a workaround for this...

--
Patrick
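For readers wondering why the guard calls appear at all: a function-local static with a non-constant initializer is initialized on first use, and under the Itanium C++ ABI compilers such as GCC normally wrap that initialization in __cxa_guard_acquire and __cxa_guard_release. The sketch below is illustrative only and is not the static_ctor.C test; read_x and init_count are hypothetical additions that make the run-once behaviour observable.

```cpp
#include <cassert>

int x = 1;           // non-constant global used as the initializer
int init_count = 0;  // hypothetical counter, only for this sketch

// Hypothetical helper that records how often the initializer runs.
int read_x() {
    ++init_count;
    return x;
}

int f() {
    // Dynamic initialization: runs once, on the first call to f(). With
    // thread-safe statics enabled (GCC's default), the compiler brackets
    // this with __cxa_guard_acquire / __cxa_guard_release, which are the
    // calls flagged as unsafe inside a transaction_safe function.
    static int y = read_x();
    return y;
}
```

Calling f() a second time after changing x still returns the first value, because the guarded initializer does not run again.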
Libitm issues porting to POWER8 HTM
I'm currently implementing support for hardware transactional memory in
the rs6000 backend for POWER8.  Things seem to be mostly working, but I
have run into a few issues I'm wondering whether other people are seeing.

For me, all of the libitm execution test cases in libitm/testsuite/libitm.c/
compile and execute without error, except for reentrant.c, which hangs for me.
My gdb hasn't been ported to support HTM on Power yet, so debugging has been
slow, but what I've learned is that my tbegin. instruction succeeds, but I
fail the test (meaning someone has the write lock) at beginend.cc:200:

    if (unlikely(serial_lock.is_write_locked()))
      htm_abort();

...so we abort the transaction.  The failure is not persistent, so we do
not break out of the loop due to:

    if (!htm_abort_should_retry(ret))
      break;

We then fall into the following code, where we hang trying to get the
read lock:

    serial_lock.read_lock(tx);

I have yet to track down who has the write lock and why, but I am working
towards that.  Talking with Andreas, he said he is seeing the same failure
on S390, so I'm wondering whether this might be a generic libitm issue
and it might hit Intel too.  Does anyone know whether this executes correctly
on Intel hardware with RTM?  I'll note that if I hack the call to
htm_abort_should_retry(ret) so that we break out of the loop and fall back
to SW TM, then the test case executes correctly.

Secondly, many of the test cases in libitm/testsuite/libitm.c++/ fail to
build for me when I use -static with the following error:

/home/bergner/gcc/install/gcc-fsf-mainline-htm/lib64/libitm.a(method-serial.o):(.opd+0x1098): multiple definition of `__cxa_pure_virtual'
/home/bergner/gcc/install/gcc-fsf-mainline-htm/lib64/libstdc++.a(pure.o):(.opd+0x0): first defined here
collect2: error: ld returned 1 exit status

The comment in method-serial.cc says it's trying to avoid a dependency on
libstdc++.  Is the __cxa_pure_virtual workaround in method-serial.cc
supposed to work with -static?

Finally, when compiling (static or non-static) static_ctor.C, I'm seeing:

/home/bergner/gcc/gcc-fsf-mainline-htm/libitm/testsuite/libitm.c++/static_ctor.C:12:18: error: unsafe function call 'void __cxa_guard_release(long long int*)' within 'transaction_safe' function
  static int y = x;
                 ^
/home/bergner/gcc/gcc-fsf-mainline-htm/libitm/testsuite/libitm.c++/static_ctor.C:12:18: error: unsafe function call 'int __cxa_guard_acquire(long long int*)' within 'transaction_safe' function

Does x86 not get calls to __cxa_guard_acquire and __cxa_guard_release for
this access, so it doesn't see this error?  To be honest, I'm not sure
what we're supposed to do with this error.

Peter