[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #29 from Dominik Vogt --- Thanks! > It's necessary to handle all cases correctly. But there is nothing wrong > with using an efficient mechanism when it works, as long as we can correctly > fall back to a more expensive mechanism when it fails. I believe that my > patch does that. I'm fine with that. But couldn't this also happen when copying multiple arguments to the stack? E.g. if you have an array A and call the function with something like defer(foo(A[0], A[1], ... A[N]) gcc could turn that into a loop for a sufficiently large N? This may be also an unlikely case, but a test for this wouldn't hurt either. > If you like, we can incorporate your patch too, as long as it is only an > addition to the existing code. Before calling runtime_callers, we can call > _Unwind_FindEnclosingFunction. Then we need only go on to the > runtime_callers code if _Unwind_FindEnclosingFunction returns NULL, or in the > cases where d->__makefunc_can_recover is true. Given that my patch fails to work with indirect function pointers (Power) that does not seem to be very helpful.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 Ian Lance Taylor changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #28 from Ian Lance Taylor --- Fixed on trunk.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #27 from ian at gcc dot gnu.org --- Author: ian Date: Wed Oct 8 14:03:13 2014 New Revision: 216003 URL: https://gcc.gnu.org/viewcvs?rev=216003&root=gcc&view=rev Log: PR go/60406 runtime: Check callers in can_recover if return addressdoesn't match. Also use __builtin_extract_return_address and tighten up the checks in FFI code. Fixes PR 60406. Modified: trunk/libgo/go/reflect/makefunc_ffi_c.c trunk/libgo/runtime/go-defer.c trunk/libgo/runtime/go-panic.h trunk/libgo/runtime/go-recover.c trunk/libgo/runtime/panic.c
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #26 from Ian Lance Taylor --- (In reply to Dominik Vogt from comment #23) > > 1) We need to call __builtin_extract_return_address(retaddr) in > __go_set_defer_retaddr() too. (On s390 (31 bit) this is necessary to remove > the most significant bit of the address which indicates the addressing mode.) > > I.e. > > --snip-- > -g->defer->__retaddr = retaddr; > +g->defer->__retaddr = __builtin_extract_return_addr (retaddr); > --snip-- Will do. > 2) The new patch does not compile for me: > > -- > In file included from ../../../libgo/runtime/go-check-interface.c:8:0: > ../../../libgo/runtime/go-panic.h:43:13: error: conflicting types for > ‘__go_makefunc_can_recover’ > extern void __go_makefunc_can_recover (void *retaddr); > ^ > In file included from ../../../libgo/runtime/go-check-interface.c:7:0: > ../../../libgo/runtime/runtime.h:845:9: note: previous declaration of > ‘__go_makefunc_can_recover’ was here > void__go_makefunc_can_recover (const void *); > ^ > -- > > In runtime.h, __go_makefunc_can_recover still has a "const" argument: > > --snip-- > -void__go_makefunc_can_recover (const void *); > +void__go_makefunc_can_recover (void *); > --snip-- I think that must be a local change in your tree. In my tree runtime.h does not declare __go_makefunc_can_recover. > 3) Couldn't this be written more efficiently by passing a flag? > > + /* If we are called from __go_makefunc_can_recover, then we need to > + look one level higher. */ > + if (locs[0].function.len > 0 > + && __builtin_strcmp ((const char *) locs[0].function.str, > +"__go_makefunc_can_recover") == 0) > > E.g. > > _Bool __go_can_recover (void *retaddr, _Bool > called_by_makefunc_can_recover) > { > ... > if (called_by_makefunc_can_recover) > { do something } > else > { do something else } > ... > } Yes, but I would rather do that later if it seems useful. It seems silly to change the compiler to always pass an extra argument that will always be false. Introducing a new function called by both __go_can_recover and __go_makefunc_can_recover changes the stack layout depending on the compiler's inlining choices, so must be treated with care. And it's worth remembering that this case never ever happens outside of tests. It only happens when somebody constructs a function using reflect.MakeFunc and then defers a call to that function. That is a bizarre way to code and I am confident that absolutely no real Go code does it. Making that case slightly less efficient is not important. > 4) With the suggested changes from 1 and 2 above: > > s390x (64 bit): > > * PASS: test/recover.go > * the test from #4 in my earlier posting works as expected > * my private defer/recover/panic testsuite works as expected > > s390 (31 bit): > > * PASS: test/recover.go > * the test from #4 in my earlier posting works as expected > * my private defer/recover/panic testsuite works as expected Thanks for testing. > 5) I find the assumption in the loop at the end of __go_can_recover() that > if a caller's name begins with "__go_" that means the panic can be > recovered, a bit hairy. Even if it is with the current libgo, an unrelated > change elsewhere could break this logic. I agree that it's imperfect. However, it's fairly difficult to construct a case that causes the wrong thing to happen. No Go function can ever start with __go. Very little code in libgo calls a function written in Go; it's easy to find such code because it must call __go_set_closure (the defer thunks are a special case that are known to have no closure). So it can only happen if somebody writes a Go function that calls recover, and then passes it to C code, and that C code names a function starting with "__go_" and then calls the Go function. And this has to happen from a deferred function called while there is a panic in progress. The result will be that the function called by the C code will incorrectly recover the panic.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #25 from Ian Lance Taylor --- (In reply to Dominik Vogt from comment #22) > >> Hm, so the patch penalises platforms that cannot deal with the > >> 16 byte window? > > > Yes, but, recall that on your system almost all tests pass using the > > code that is in the tree today, before your patch. > > But they really only succeed "by accident". The call for any platform might > introduce control flow into the thunk and trigger the problem in any of the > test cases. Yes, I understand that. You suggested that my patch penalized all cases where we have a control flow problem, because it will do a more costly unwind step. That is true. However, in practice, the case arises very rarely. We know that the code in the thunk is always very simple: if (__go_set_defer_retaddr (&&L)) goto L; real_function(args); L: The compiler is not going to start randomly throwing additional control flow statements into that function. The cases where it does do that are the cases where args is very large, so that it uses a block copy to copy them onto the stack. But that essentially never happens. The args here are the args in "defer real_function(args)". 90% of the time there are no arguments at all. 9.9% of the time it's just a couple of pointer/scalar arguments. In real code nobody ever calls defer with large arguments, because it's just a strange way to write; people write a function closure instead. The only code in the standard library that calls defer with a large argument is the recover.go test, to make sure that it works. It's necessary to handle all cases correctly. But there is nothing wrong with using an efficient mechanism when it works, as long as we can correctly fall back to a more expensive mechanism when it fails. I believe that my patch does that. If you like, we can incorporate your patch too, as long as it is only an addition to the existing code. Before calling runtime_callers, we can call _Unwind_FindEnclosingFunction. Then we need only go on to the runtime_callers code if _Unwind_FindEnclosingFunction returns NULL, or in the cases where d->__makefunc_can_recover is true.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #24 from Dominik Vogt --- > --snip-- > -g->defer->__retaddr = retaddr; > +g->defer->__retaddr = __builtin_extract_return_addr (retaddr); > --snip-- We need to double check whether this would break Sparc.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #23 from Dominik Vogt --- Regarding the new patch: 1) We need to call __builtin_extract_return_address(retaddr) in __go_set_defer_retaddr() too. (On s390 (31 bit) this is necessary to remove the most significant bit of the address which indicates the addressing mode.) I.e. --snip-- -g->defer->__retaddr = retaddr; +g->defer->__retaddr = __builtin_extract_return_addr (retaddr); --snip-- 2) The new patch does not compile for me: -- In file included from ../../../libgo/runtime/go-check-interface.c:8:0: ../../../libgo/runtime/go-panic.h:43:13: error: conflicting types for ‘__go_makefunc_can_recover’ extern void __go_makefunc_can_recover (void *retaddr); ^ In file included from ../../../libgo/runtime/go-check-interface.c:7:0: ../../../libgo/runtime/runtime.h:845:9: note: previous declaration of ‘__go_makefunc_can_recover’ was here void__go_makefunc_can_recover (const void *); ^ -- In runtime.h, __go_makefunc_can_recover still has a "const" argument: --snip-- -void__go_makefunc_can_recover (const void *); +void__go_makefunc_can_recover (void *); --snip-- 3) Couldn't this be written more efficiently by passing a flag? + /* If we are called from __go_makefunc_can_recover, then we need to + look one level higher. */ + if (locs[0].function.len > 0 + && __builtin_strcmp ((const char *) locs[0].function.str, + "__go_makefunc_can_recover") == 0) E.g. _Bool __go_can_recover (void *retaddr, _Bool called_by_makefunc_can_recover) { ... if (called_by_makefunc_can_recover) { do something } else { do something else } ... } 4) With the suggested changes from 1 and 2 above: s390x (64 bit): * PASS: test/recover.go * the test from #4 in my earlier posting works as expected * my private defer/recover/panic testsuite works as expected s390 (31 bit): * PASS: test/recover.go * the test from #4 in my earlier posting works as expected * my private defer/recover/panic testsuite works as expected 5) I find the assumption in the loop at the end of __go_can_recover() that if a caller's name begins with "__go_" that means the panic can be recovered, a bit hairy. Even if it is with the current libgo, an unrelated change elsewhere could break this logic.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #22 from Dominik Vogt --- >> Hm, so the patch penalises platforms that cannot deal with the >> 16 byte window? > Yes, but, recall that on your system almost all tests pass using the > code that is in the tree today, before your patch. But they really only succeed "by accident". The call for any platform might introduce control flow into the thunk and trigger the problem in any of the test cases. > The spec says "Suppose a function G defers a function D that calls > recover and a panic occurs in a function on the same goroutine in > which G is executing." The order is 1) G defers D; 2) a panic occurs. > ... I'd still not understand it that way, but from other documentation the intention is "clear". Maybe another bullet could be added to the list below: "The return value of recover is nil if ... ... * defer was called when the panic already existed"
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #21 from Uroš Bizjak --- (In reply to boger from comment #20) > The latest patch worked on ppc64 for LE & BE. That is, the testcase > recover.go now works and no new regressions were introduced. Also works on alpha [1] without new regressions. [1] https://gcc.gnu.org/ml/gcc-testresults/2014-10/msg00840.html
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #20 from boger at us dot ibm.com --- The latest patch worked on ppc64 for LE & BE. That is, the testcase recover.go now works and no new regressions were introduced.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 Ian Lance Taylor changed: What|Removed |Added Attachment #33644|0 |1 is obsolete|| --- Comment #19 from Ian Lance Taylor --- Created attachment 33661 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33661&action=edit possible patch 2 Here is a new possible patch. This time I tested it on a PPC GNU/Linux machine. The problem was a mishandling of the stack walk when using MakeFunc with FFI closures. This passes on both x86_64 and PPC. It also address Dominik's case in which there is a spurious recover when one MakeFunc function calls another. Please review this patch and let me know if this fails on your system. Thanks.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #18 from Ian Lance Taylor --- > Well, maybe it was only a problem with tail recursion, Note that because Go programs expect predictable results from runtime.Callers and other stack backtracing functions, the Go frontend disables tail recursion (go_langhook_post_options in gcc/go/go-lang.c). > Hm, so the patch penalises platforms that cannot deal with the 16 byte window? Yes, but, recall that on your system almost all tests pass using the code that is in the tree today, before your patch. The only tests that fail are the very challenging ones in recover.go, that stress test the panic/recover mechanism but are in no way representative of real code. The normal tests all works fine. So while there is a penalty, it is one that only occurs in rare cases. >>> func main() { defer foo(); panic("..."); } >>> func foo() { defer bar(); } >>> func bar() { recover(); } >> >> In this case, the call to recover in bar is supposed to return nil; >> it should not recover the panic. If you read the paragraph before >> the one you quote, you will see that recover only returns non-nil >> if it was called by a function that was deferred before the call to >> panic. > >I've read it but cannot see anything that would disallow recovery in this > situation. What exactly do you mean? The spec says "Suppose a function G defers a function D that calls recover and a panic occurs in a function on the same goroutine in which G is executing." The order is 1) G defers D; 2) a panic occurs. In your example above, this applies to main defers foo and then a panic occurs. It does not apply to foo defers bar, because there is no panic after foo defers bar (the panic has already occurred--that is why we are executing foo). Since there is no panic, the recover in bar returns nil. The recover.go file tests this pattern in, e.g., test1WithClosures.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #17 from Dominik Vogt --- >> * Wouldn't the new patch re-introduce the bug that >> >> func foo(n int) { >> if (n == 0) { recover(); } else { foo(0); } >> } >> func main() { >> defer foo(1) >> panic("...") >> } >> >> would recover although it should not? > > Hmmm, I hadn't fully internalized that issue. Your new > withoutRecoverRecursive test doesn't fail for me on x86_64. I think you have implicitly fixed this issue by splitting functions that call recover() into two parts (i.e. main.foo and main.foo$recover). So recursive calls originate from the ...$recover function and never match the return address check. Well, maybe it was only a problem with tail recursion, i.e. func foo(n int) int { if (n == 0) { recover(); return 0; } return foo(0) } Would be optimized to a loop, removing the function call, and then the return address check would trigger. That's not possible with two function approach. But if there's another criterion to allow recover that simply depends on the caller's name the problem might reappear. >> * The code is even more expensive than the approach I had chosen because >> now it needs to fetch a two level backtrace instead of just one level >> (and probably each level is more expensive than the one >> _Unwind_FindEnclosingFunc()). > > Yes, but the expensive case only happens in the rare cases where > either recover should not work or when the existing code has a > false negative. Hm, so the patch penalises platforms that cannot deal with the 16 byte window? >> func main() { defer foo(); panic("..."); } >> func foo() { defer bar(); } >> func bar() { recover(); } > > In this case, the call to recover in bar is supposed to return nil; > it should not recover the panic. If you read the paragraph before > the one you quote, you will see that recover only returns non-nil > if it was called by a function that was deferred before the call to > panic. I've read it but cannot see anything that would disallow recovery in this situation. What exactly do you mean? >> 4) __go_can_recover assumes that any call through libffi is allowed >> to recover. > > Thanks for the example. Does your patch fix this problem? No, I just became aware of the problem when reviewing the code last week.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #16 from Ian Lance Taylor --- >> I'm not really happy with Dominik's patch because 1) it doesn't work when >> configuring with --enable-sjlj-exceptions; > > Why is that important? It's not very important but it's still a point to consider. Some targets default to SJLJ exceptions, albeit not very important ones. >> 2) the current code almost always works, even on S/390, but the patch >> forces us to do a lookup in the FDE table every time we call recover. > > The current code works unreliably as s390 uses memcopy to copy call > arguments to the stack. The control flow introduced by the function > call triggers basic block reordering that may result in anything. Sure, I understand, and it can obviously cause a false negative: some cases that should recover will fail to do so. However, I don't see any way that it can ever cause a false positive: I don't see any way that it can cause recover to succeed when it should not. > * On systems that "use a leading underscore on symbol names", the test > for functions beginning with "__go_" or "_go_" would yield "true" from user > functions named "_go_..." (because the system adds one '_' and the patch > strips it). Yes. We are already going to have trouble on such systems. Really the library needs to learn which systems use a leading underscore and which do not. This is actually available as __USER_LABEL_PREFIX__, and we should use that. > * Wouldn't the new patch re-introduce the bug that > > func foo(n int) { > if (n == 0) { recover(); } else { foo(0); } > } > func main() { > defer foo(1) > panic("...") > } > > would recover although it should not? Hmmm, I hadn't fully internalized that issue. Your new withoutRecoverRecursive test doesn't fail for me on x86_64. I'll have to figure out why. > * The code is even more expensive than the approach I had chosen because > now it needs to fetch a two level backtrace instead of just one level > (and probably each level is more expensive than the one > _Unwind_FindEnclosingFunc()). Yes, but the expensive case only happens in the rare cases where either recover should not work or when the existing code has a false negative. In the normal case, where recover is permitted and the existing code works, we save the FDE lookup. > 2) The current checks for "return address + 16" may point into a > different function, allowing recover() in weird situations. It's a potential problem but I'm not too worried about it. > "The return value of recover is nil if any of the following conditions holds: > ... > *recover was not called directly by a deferred function." > > According to the spec, the following code should recover the panic but > does not: > > func main() { defer foo(); panic("..."); } > func foo() { defer bar(); } > func bar() { recover(); } > > Note that this is also also "broken" in Golang (well, at least in the old > version that comes with Ubuntu). This may be an effect of imprecise > wording of the spec. In this case, the call to recover in bar is supposed to return nil; it should not recover the panic. If you read the paragraph before the one you quote, you will see that recover only returns non-nil if it was called by a function that was deferred before the call to panic. In your example, the defer of bar happens after the call to panic. The reason Go works this way is to that the deferred function foo can itself call a function that panics and recovers without that function being confused by the earlier panic, one that it may not know anything about. > 4) __go_can_recover assumes that any call through libffi is allowed > to recover. Thanks for the example. Does your patch fix this problem?
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #15 from boger at us dot ibm.com --- The testcase recover.go continues to fail on both ppc64 LE & BE with the new patch https://codereview.appspot.com/153950043.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #14 from Dominik Vogt --- > I'm not really happy with Dominik's patch because 1) it doesn't work when > configuring with --enable-sjlj-exceptions; Why is that important? > 2) the current code almost always works, even on S/390, but the patch > forces us to do a lookup in the FDE table every time we call recover. The current code works unreliably as s390 uses memcopy to copy call arguments to the stack. The control flow introduced by the function call triggers basic block reordering that may result in anything. > I'd like to propose a different patch, that keeps the current code > that almost always works, does a quick exit when there is no defer in > the call stack, and does an unwind to check for a specific function > in other cases. I've not tested the patch yet, but see some potential problems: * On systems that "use a leading underscore on symbol names", the test for functions beginning with "__go_" or "_go_" would yield "true" from user functions named "_go_..." (because the system adds one '_' and the patch strips it). * Wouldn't the new patch re-introduce the bug that func foo(n int) { if (n == 0) { recover(); } else { foo(0); } } func main() { defer foo(1) panic("...") } would recover although it should not? * The code is even more expensive than the approach I had chosen because now it needs to fetch a two level backtrace instead of just one level (and probably each level is more expensive than the one _Unwind_FindEnclosingFunc()). Digging deeper at the issue that causes Lynn's problems on Power surfaces more problems with the current implementation of __go_can_recover() and the thunk, also with the patch I've posted. Here's a summary of all the problems I am aware of (some new, some known, skipping the bugs introduced by the patch): Problems with the current implementation only: -- 1) Calculation of the label address in the thunk does not work if the basic block reordering is done (that's the issue why this bug report was created) 2) The current checks for "return address + 16" may point into a different function, allowing recover() in weird situations. Problems with the current implementation and the proposed patch: 3) Quote from http://golang.org/ref/spec: "The return value of recover is nil if any of the following conditions holds: ... *recover was not called directly by a deferred function." According to the spec, the following code should recover the panic but does not: func main() { defer foo(); panic("..."); } func foo() { defer bar(); } func bar() { recover(); } Note that this is also also "broken" in Golang (well, at least in the old version that comes with Ubuntu). This may be an effect of imprecise wording of the spec. 4) __go_can_recover assumes that any call through libffi is allowed to recover. Quote from the sources: "/* If the function calling recover was created by reflect.MakeFunc, then RETADDR will be somewhere in libffi. Our caller is permitted to recover if it was called from libffi. */" This violates the specification. An example that recovers the panic in a nested function but should not: --snip-- package main import "reflect" func main() { // generate foo() and bar() fn := reflect.ValueOf(interface{}(&foo)).Elem() val := reflect.MakeFunc(fn.Type(), recover_reflect1) fn.Set(val) fn = reflect.ValueOf(interface{}(&bar)).Elem() val = reflect.MakeFunc(fn.Type(), recover_reflect2) fn.Set(val) defer foo() panic("...") } var foo func() func recover_reflect1(args []reflect.Value) (results []reflect.Value) { bar() return results } var bar func() func recover_reflect2(args []reflect.Value) (results []reflect.Value) { recover() return results } --snip-- Actually, I think this may also happen if bar is not a function created through reflection but any foreign language function * from a stripped object file (no name in the backtrace is guessed as "called by libffi"), * if the name begins with "[_]ffi_" (But perhaps it's impossible to construct such an example.)
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #13 from Uroš Bizjak --- (In reply to Ian Lance Taylor from comment #12) > Can people on systems for which the recover.go test currently fails please > try this patch and let me know whether it fixes the problem? Thanks. Unfortunately, the patched libgo still fails on alpha.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #12 from Ian Lance Taylor --- Created attachment 33644 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33644&action=edit possible patch I'm not really happy with Dominik's patch because 1) it doesn't work when configuring with --enable-sjlj-exceptions; 2) the current code almost always works, even on S/390, but the patch forces us to do a lookup in the FDE table every time we call recover. I'd like to propose a different patch, that keeps the current code that almost always works, does a quick exit when there is no defer in the call stack, and does an unwind to check for a specific function in other cases. Can people on systems for which the recover.go test currently fails please try this patch and let me know whether it fixes the problem? Thanks. This is https://codereview.appspot.com/153950043 .
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #11 from boger at us dot ibm.com --- On ppc64 BE, the call to make_code_func_reference returns the function pointer in the .opd and not the actual address of the function. That is what causes the failures with this patch https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00710.html The information in this reply fixes the regressions from this patch on ppc64 BE: https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02282.html Essentially the change is to dereference the function pointer in __go_set_defering_fn when building for ppc64 BE as follows: +#if defined(__powerpc64__) && _CALL_ELF != 2 +g->defer->__defering_fn = *(void **)defering_fn; +#else +g->defer->__defering_fn = defering_fn; +#endif
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #10 from Peter Bergner --- (In reply to boger from comment #9) > On ppc64le, this works as expected but on ppc64(be) the code that is > generated from this is not the address of the function but the address of > the .opd entry that is used to call the function. As a result the setting > of the deferring function is incorrect and it does not appear as if it can > recover because of the way __go_can_recover uses the deferring function's > address to see if it is in the correct range. On BE, a "function pointer" (unlike on LE or x64* or ...) always points to a function's function descriptor (ie, the .opd entry) and not the code address. The function descriptor contains 3 64-bit doublewords. The first doubleword is the address of the function's code. The second doubleword is the TOC value that needs to be in r2 while we execute the function and the third doubleword contains an environment pointer for languages such as Pascal and PL/1.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #9 from boger at us dot ibm.com --- The patch to fix the recover.go problem causes significant regression in gccgo when built for ppc64 (BE). There are 32 unexpected failures in the gcc go testsuite with the patch 32 unexpected failures in the gcc libgo testsuite. Without this patch on trunk there are 2 failures in the go testsuites and 1 failure in the libgo testsuite. I did some debugging on one of the regression failures: bug113.go. I found that the problem occurs in the new code in the patch in go/gofrontend/statements.cc when creating the expression tree to call __go_set_defering_fn. The argument that is being passed is generated through a call to make_func_code_reference: Expression* arg = Expression::make_func_code_reference(function, location); Expression* call = Runtime::make_call(Runtime::SET_DEFERING_FN, location, 1, arg); Statement* s = Statement::make_statement(call, true); s->determine_types(); gogo->add_statement(s); On ppc64le, this works as expected but on ppc64(be) the code that is generated from this is not the address of the function but the address of the .opd entry that is used to call the function. As a result the setting of the deferring function is incorrect and it does not appear as if it can recover because of the way __go_can_recover uses the deferring function's address to see if it is in the correct range. When I debug this using gdb: Breakpoint 1, __go_set_defering_fn (defering_fn=0x10020150 ) at /home/boger/gccgo.work/140922/src/libgo/runtime/go-defer.c:78 78 { Missing separate debuginfos, use: debuginfo-install glibc-2.17-55.el7.ppc64 (gdb) info reg $r3 r3 0x10020150 268566864 >From the objdump, I see this address is in the .opd: 10020150 : 10020150: 00 00 00 00 .long 0x0 10020154: 10 00 1c 4c vsrov0,v0,v3 10020158: 00 00 00 00 .long 0x0 1002015c: 10 02 81 c0 .long 0x100281c0 but the code is actually here, which is the address that should be passed to __go_set_defering_fn: 10001c4c <.main.$thunk0>: main.$thunk0(): 10001c4c: 7c 08 02 a6 mflrr0 10001c50: f8 01 00 10 std r0,16(r1) 10001c54: fb e1 ff f8 std r31,-8(r1) 10001c58: f8 21 ff 71 stdur1,-144(r1) Note that the actual function code address is in the first 8 bytes of the .opd entry for the function.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #8 from boger at us dot ibm.com --- Update on my previous work: 1) My previous update referred to a build that was done with the patches that were submitted to gcc and patches that Dominik provided me for libffi. I found that if I only build with the gcc patches and don't use the libffi patches then the recover.go testcase passes on ppc64le.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #7 from boger at us dot ibm.com --- I've built with Dominik's patches against trunk on ppc64le and have been trying to run the gcc testsuite for go and libgo. The recover.go testcase continues to fail in my build. I did some debugging on this and found that the problem occurs in ffi_callback when it calls __go_makefunc_can_recover. Based on the comments I think it should be called with the program address for the most recent caller on the stack which is not ffi, but instead it is called with the address for ffi_closer_helper_LINUX64. I suspect this is happening because runtime_callers is not returning the complete set of callers. This error leads to an incorrect setting of __makefunc_can_recover in the __go_defer_stack structure so the test prints the "missing recover" error.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #6 from Dominik Vogt --- I'll submit a fix for this problem as soon as I figure out what to do with the patch.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 Ian Lance Taylor changed: What|Removed |Added CC||boger at us dot ibm.com --- Comment #5 from Ian Lance Taylor --- *** Bug 63170 has been marked as a duplicate of this bug. ***
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #4 from Uroš Bizjak --- Alternatively, since __go_set_defer_retaddr always returns 0, we can prevent split of the label by enclosing it in asm volatiles: extern int foo (void *); extern void bar (void); int test_1 (void) { __label__ bla1; if (foo (&&bla1)) goto bla1; asm volatile (""); bar (); bla1: asm volatile (""); return 0; } test_1: subl$28, %esp movl$.L2, (%esp) callfoo testl %eax, %eax jne .L2 callbar .L2: xorl%eax, %eax addl$28, %esp ret 'goto label' can also be substituted with 'asm goto ("" label)', but I don't think this is necessary in the above case.
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 --- Comment #3 from Uroš Bizjak --- It looks that a hack, mentioned in Comment #0 should use asm goto instead of goto. The following test: --cut here-- extern void foo (void *); int test_bad (int p1, int p2) { __label__ bla1; foo (&&bla1); goto bla1; bla1: return 0; } int test_good (int p1, int p2) { __label__ bla2; foo (&&bla2); asm goto ("" bla2); bla2: return 0; } --cut here-- results in (-O2): test_good: subl$28, %esp movl$.L5, (%esp) callfoo .L6: .L5: xorl%eax, %eax addl$28, %esp ret which is much better than: test_bad: .L2: subl$28, %esp movl$.L2, (%esp) callfoo xorl%eax, %eax addl$28, %esp ret Also checked on alpha: test_good: .frame $30,16,$26,0 .mask 0x400,-16 $LFB1: .cfi_startproc ldah $29,0($27) !gpdisp!4 lda $29,0($29) !gpdisp!4 $test_good..ng: lda $30,-16($30) .cfi_def_cfa_offset 16 cpys $f31,$f31,$f31 ldah $16,$L4($29) !gprelhigh ldq $27,foo($29)!literal!5 stq $26,0($30) .cfi_offset 26, -16 .prologue 1 lda $16,$L4($16)!gprellow jsr $26,($27),foo !lituse_jsr!5 ldah $29,0($26) !gpdisp!6 lda $29,0($29) !gpdisp!6 .align 3 #realign $L5: $L4: mov $31,$0 ldq $26,0($30) lda $30,16($30) .cfi_restore 26 .cfi_def_cfa_offset 0 ret $31,($26),1 Please note, that the check will still need some tolerance due to various label alignment requirements, additional instructions, etc: 5c: 00 00 10 22 lda a0,0(a0) 5c: GPRELLOW.text+0x70 <- label loc 60: 00 40 5b 6b jsr ra,(t12),64 60: LITUSE .text+0x3 60: HINTfoo 64: 00 00 ba 27 ldahgp,0(ra)<- retaddr 64: GPDISP .text+0x4 68: 00 00 bd 23 lda gp,0(gp) 6c: 00 00 fe 2f unop 70: 00 04 ff 47 clr v0
[Bug go/60406] recover.go: test13reflect2 test failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60406 Uroš Bizjak changed: What|Removed |Added Summary|reflect.go:test13reflect2 |recover.go: test13reflect2 |test failure|test failure --- Comment #2 from Uroš Bizjak --- Corrected Summary.