[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-04-21 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #60 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
Works on s390x.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-04-17 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #56 from boger at us dot ibm.com ---
Here is a bit more detail.  Now that I think I understand all the
considerations, I'm proposing this simple fix for gcc 5.  Maybe longer term a
more thorough solution could be done but not sure it is worth changing multiple
functions to add an additional argument and changing external header files in
gcc 5.

The problems to be fixed are:
- the pc in the call stack returned by runtime_callers is wrong for some
platforms, such as Power and Z and possibly others
- the pc adjustment and line number that is looked up in go/runtime/pprof.go
sometimes provides the wrong line number if based on the bad pc

The proposed fix is to increment the pc in the callback routine that is passed
to backtrace_full when trying to find the runtime_callers all the time.  During
the unwind the pc was decremented for the normal call case.  If it was a
signal, the pc would not have been decremented so it is being incremented
unnecessarily for that situation but I think that is OK since the line number
will be right and the pc will still refer to the signalling instruction.

As far as the change to the pprof code, at least in gccgo for Power and Z the
pc and the line number should be correct so no adjustment is even needed, but
my understanding from previous posts is that in the gc compiled case an
adjustment is needed and this code is common for gc and gccgo?  With the pprof
code that is there now, the pprof code is OK for Power, i.e., the line number
in the pprof testcase matches the expected line number.  For Z, I think that
the amount to decrement should be the minimum size for a call instruction (not
sure what that value is) and then the tracepc value would refer to the previous
instruction.

Note that the pprof testcase does not yet work even with this fix because there
is a new problem that fails on all platforms with gccgo described here: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64683.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-04-17 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #55 from boger at us dot ibm.com ---
Created attachment 35344
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=35344action=edit
Increment the pc in the callback routine for backtrace_full

Always increment the pc in the callback, knowing that it was decremented during
the unwind function to find the line number number of the call instruction. 
This leaves the pc with the correct value for the call stack.  If a signal had
occurred, and it wasn't a call instruction, the pc will get incremented when it
hadn't been decremented during unwind but that is OK for this fix.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-04-17 Thread ian at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #57 from ian at gcc dot gnu.org ian at gcc dot gnu.org ---
Author: ian
Date: Fri Apr 17 19:29:28 2015
New Revision: 222196

URL: https://gcc.gnu.org/viewcvs?rev=222196root=gccview=rev
Log:
PR go/64999
PR go/65180
runtime: Adjust libbacktrace PC value to what runtime.Callers expects.

From Lynn Boger.

Modified:
trunk/libgo/runtime/go-callers.c


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-04-17 Thread ian at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #58 from ian at gcc dot gnu.org ian at gcc dot gnu.org ---
Author: ian
Date: Fri Apr 17 19:29:43 2015
New Revision: 222197

URL: https://gcc.gnu.org/viewcvs?rev=222197root=gccview=rev
Log:
PR go/64999
PR go/65180
runtime: Adjust libbacktrace PC value to what runtime.Callers expects.

From Lynn Boger.

Modified:
branches/gcc-5-branch/libgo/runtime/go-callers.c


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-04-17 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

Ian Lance Taylor ian at airs dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #59 from Ian Lance Taylor ian at airs dot com ---
As far as I know all issues here are now fixed.  Please reopen if not.  Thanks.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-12 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #46 from boger at us dot ibm.com ---
(In reply to Ian Lance Taylor from comment #45)
 If we change the PC returned by backtrace_full, and then use that changed PC
 to look up file/line information, we might get different results.  That
 seems clear.  My next question is: when does this matter?
 
 There are two ways that we use the result of runtime_callers: either we use
 the file/line information, or we use the PC.  If we use the file/line
 information, all is well, and there is nothing to discuss.  If we use the
 PC, as in runtime.Callers, then it's true that if we pass that PC back to
 libbacktrace we may get different file/line information.  But, so what? 
 We've already lost the original file/line information anyhow.
 
 The idea is that the changed PC will be the same as the PC returned by the
 gc library.  Therefore, we should get approximately the same file/line
 information as the gc library gets.  That is why runtime/pprof/pprof.go in
 the gc library backs up the PC to the call instruction: because it knows
 that it has the PC after the call instruction.

Let me just respond to this last paragraph for now.  If the pc values in gc are
always the instruction following the call (or the excepting instruction), and
runtime_callers expects it to be that way, and we want gc and gccgo to return
the same results, then I don't understand why backtrace_full and
backtrace_simple should ever decrement the pc for any platform.  There is a
brief mention of this in comment 9 -- that we can make use the fact that the pc
had been decremented by inspecting it and making use of the fact that is an
abnormal pc value; however I don't see anywhere in the libgo code where it does
this, and besides, for some platforms after the pc is decremented it is still a
normal looking pc so there is no way to know if it had been decremented. 
Getting the right pc and the right instruction seems more important than
preserving some information that is currently not being used and can't be used
on all platforms.

That means, if the decrement was removed completely from backtrace_full and
backtrace_simple then the pc values would be correct for each platform and the
line numbers would be determined based on the correct pc.  And then we would
still need the BackupPC function for those cases where the call instruction was
needed as is done in pprof.go and some extra stuff for s390 and s390x for their
various call instructions to get the right number of bytes to back up.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-12 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #51 from boger at us dot ibm.com ---
Here is the change I made to go-callers.c.  This patch along with my previous
changes to extern.go and pprof.go allows the test identified in this issue to
report the correct line number on ppc64le.  I wanted to post it here for
comments before I do the bootstrap build and test on ppc64 and ppc64le in case
there is a concern with using an #if defined.  This is in libgo/runtime and I
didn't see any other examples where it used the GOARCH value in C code so that
is why I did it this way.

===
--- libgo/runtime/go-callers.c  (revision 221039)
+++ libgo/runtime/go-callers.c  (working copy)
@@ -83,6 +83,10 @@ callback (void *data, uintptr_t pc, const char *fi
 }

   loc = arg-locbuf[arg-index];
+#if defined(__powerpc64__) || defined(__s390__) || defined(__s390x__)
+  if ((pc  1) == 1)
+pc++;
+#endif
   loc-pc = pc;

   /* The libbacktrace library says that these strings might disappear,


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-12 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #52 from Ian Lance Taylor ian at airs dot com ---
Why not just
pc++;
on all targets?  Why the #ifdef?  It seems to me that we should aim for
consistent results on all platforms.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-12 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #54 from Ian Lance Taylor ian at airs dot com ---
I assume that it works on x86 because subtracting 1 from PC in libbacktrace,
and then subtracting 1 again in runtime/pprof/pprof.go, still gives you a PC
within the call instruction.  On PPC, subtracting 1 in libbacktrace and then
subtracting 4 in pprof.go gives you a PC in the instruction just before the
call instruction.

I appreciate that you only want to fix what is broken, but in this case I think
we should turn that around and keep the code simple unless we find that it
breaks something.

Yes, you're right that if we were to change libbacktrace it would mean passing
the flag through a chain of functions.  Basically the flag would have to move
along with the PC value.  I'm not sure this is worth doing, ip_before_insn is
always false except when tracing back through a signal trampoline, where the
file/line information is generally not helpful anyhow.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-12 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #53 from boger at us dot ibm.com ---
I was taking the approach of only fixing what was known to be broken, and I was
not aware that this was broken on other platforms.  Minimizing risk.  I can
change it for all platforms but these tests are not broken on x86 today with
the pc decremented in both libbacktrace and in pprof.go and I don't understand
why not so I didn't want to mess with it.

I'm not sure if I understand how it would work to add an argument to callback. 
The call to callback happens much later after the ip_before_insn flag has been
set:  from inside unwind - backtrace_pcinfo - dwarf_fileline -
dwarf_lookup_pc - callback.  Do you mean pass this flag as an argument to each
of these functions so it can be passed to callback?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #42 from boger at us dot ibm.com ---



(In reply to Ian Lance Taylor from comment #41)
 I really don't want libbacktrace to be processor-dependent.  That makes all
 uses of it more complex for no significant gain.  Maybe we should change
 libbacktrace, but definitely not by making it processor-dependent.

I'm sorry but I have to disagree.  In my opinion what libbacktrace does now for
Power and s390/s390x is wrong.  It does not return valid PC values for those
platforms.  Everyone who calls it has to know that they must look through the
list of PC values and change them back.  In addition, libbacktrace has already
looked up the line number information with those invalid PCs.  So essentially
libbacktrace is returning invalid PCs and usually incorrect line numbers which
the caller has to know about and fix up and that seems significant to me.

Maybe the PCs should never be decremented for any platform?

 
 I would prefer that changes to files like runtime/pprof/pprof.go and
 log/log.go go through the master repo first.  If gccgo's runtime.Callers
 acts the same as gc's runtime.Callers, then there shouldn't be any
 gccgo-specific need to change those files.

Yes I agree that is the way it should be done.  I mainly wanted to show all the
changes in one place for review and didn't think through the process for
getting that code in place.

 
 Can somebody please try just editing the function callback in
 runtime/go-callers.c to add one to pc when storing it in loc-pc?  I'd like
 to understand why that simple change doesn't work before trying something
 more complex.

Yes I've already tried that.  But I ran into the issues I described in my first
comment above, especially seeing that all the stack information from
runtime_callers had incorrect line information and it is not obvious to the
caller that they should fix that information before using it.  In addition,
there is evidence that looking up the line number information is not cheap, so
doing it in libbacktrace and then having to look it up again if it is needed
seems like a waste especially in those cases when the line number information
is not even used.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #43 from Ian Lance Taylor ian at airs dot com ---
I'm getting confused.  I think I need to talk about one thing at a time.

You say that libbacktrace is returning incorrect line numbers.  That obviously
needs to be fixed.  When does that happen?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #45 from Ian Lance Taylor ian at airs dot com ---
If we change the PC returned by backtrace_full, and then use that changed PC to
look up file/line information, we might get different results.  That seems
clear.  My next question is: when does this matter?

There are two ways that we use the result of runtime_callers: either we use the
file/line information, or we use the PC.  If we use the file/line information,
all is well, and there is nothing to discuss.  If we use the PC, as in
runtime.Callers, then it's true that if we pass that PC back to libbacktrace we
may get different file/line information.  But, so what?  We've already lost the
original file/line information anyhow.

The idea is that the changed PC will be the same as the PC returned by the gc
library.  Therefore, we should get approximately the same file/line information
as the gc library gets.  That is why runtime/pprof/pprof.go in the gc library
backs up the PC to the call instruction: because it knows that it has the PC
after the call instruction.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #44 from boger at us dot ibm.com ---
If we do the increment of the pc to fix it in the callback, here is how that
happens:
- backtrace_full gets the pc and decrements by 1
- backtrace_full calls backtrace_pcinfo to look up the file, function, line
number information based on the decremented pc
- backtrace_pcinfo calls the callback function after the line number has been
determined.  If the pc is incremented here then it might not correspond to the
line number anymore.  A common example is if you have a statement that does a
function call with no return value, the line number will be wrong because the
call instruction and the instruction following the call correspond to different
line numbers.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #48 from boger at us dot ibm.com ---
(In reply to Ian Lance Taylor from comment #47)
 We have to separate backtrace_full and backtrace_simple, which are part of
 the libbacktrace library, from functions like runtime_callers which are part
 of libgo.  The libbacktrace library is used by several different projects. 
 It is not part of libgo.
 
 It may be correct that libbacktrace should never decrement pc.  But the
 argument for that can't be the behaviour of the gc library.
 

The backtraces I'm familiar with are those where the pc values represent the
instruction after the call.  Regardless of what the gc library does I don't see
why you'd want to decrement the pcs in backtrace_full.  Actually to me it seems
like decrementing it would be wrong for all platforms (based on what is done in
pprof.go -- doesn't that mean in some cases it would get decremented twice?)
but I just haven't had time to investigate that further.

 The bit in comment #9 about not decrementing the pc for a signal frame is
 specific to libbacktrace.  You can see that in unwind in
 libbacktrace/backtrace.c and in simple_unwind in libbacktrace/simple.c.
 
 Perhaps we need to change libbacktrace so that backtrace_full_callback and
 backtrace_simple_callback take an additional parameter, which is whether the
 PC is in a signal frame.  From libgo's perspective, though, that would be
 exactly the same as simply always adding to pc in callback in go-callers.c. 
 You said that did not work, and I'm still trying to understand why.

I'm not exactly sure what you are describing -- which functions would have an
additional argument and what would you do with that argument.  

Here is the problem with incrementing the pc in the callback.  In unwind the pc
is decremented, and the line number is looked up for that pc and saved away. 
Then callback is called, and the pc is incremented, but now the line number
that was saved away is stale.  That means when on return to runtime_callers,
the pcs are all correct but some line numbers might not be right for those pcs.
 I tried adding code to look up the line number again in the callback after the
pc was modified but that caused the program to hang.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #49 from Ian Lance Taylor ian at airs dot com ---
libbacktrace returns the line number that you actually care about: the line
number of the call instruction.  There is no question that that is correct.

You say that it is a problem if the PC does not match the line number, but to
me that sounds like a conceptual problem.  What is the actual problem?

The pprof.go code expects to get the PC after the call instruction, and tries
to turn it into the PC of the call instruction.  I think we can all agree that
pprof.go should in fact see the PC after the call instruction.  The simple way
to do that is for callback in libgo/runtime/go-callers.c to increment pc before
storing it in loc-pc.  Apparently there is some problem with that but I do not
know what that problem is.

The additional argument I mentioned would be for the function types
backtrace_fulL_callback and backtrace_simple_callback defined in
libbacktrace/backtrace.h.  The additional argument would be, essentially, the
value of ip_before_insn in libbacktrace/backtrace.c and libbacktrace/simple.c. 
The argument would tell you whether the PC follows a call instruction, or
whether it is the correct PC for a signal frame instruction.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #47 from Ian Lance Taylor ian at airs dot com ---
We have to separate backtrace_full and backtrace_simple, which are part of the
libbacktrace library, from functions like runtime_callers which are part of
libgo.  The libbacktrace library is used by several different projects.  It is
not part of libgo.

It may be correct that libbacktrace should never decrement pc.  But the
argument for that can't be the behaviour of the gc library.

The bit in comment #9 about not decrementing the pc for a signal frame is
specific to libbacktrace.  You can see that in unwind in
libbacktrace/backtrace.c and in simple_unwind in libbacktrace/simple.c.

Perhaps we need to change libbacktrace so that backtrace_full_callback and
backtrace_simple_callback take an additional parameter, which is whether the PC
is in a signal frame.  From libgo's perspective, though, that would be exactly
the same as simply always adding to pc in callback in go-callers.c.  You said
that did not work, and I'm still trying to understand why.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-10 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #50 from boger at us dot ibm.com ---
(In reply to Ian Lance Taylor from comment #49)
 libbacktrace returns the line number that you actually care about: the line
 number of the call instruction.  There is no question that that is correct.
 
 You say that it is a problem if the PC does not match the line number, but
 to me that sounds like a conceptual problem.  What is the actual problem?
 
 The pprof.go code expects to get the PC after the call instruction, and
 tries to turn it into the PC of the call instruction.  I think we can all
 agree that pprof.go should in fact see the PC after the call instruction. 
 The simple way to do that is for callback in libgo/runtime/go-callers.c to
 increment pc before storing it in loc-pc.  Apparently there is some problem
 with that but I do not know what that problem is.
 
 The additional argument I mentioned would be for the function types
 backtrace_fulL_callback and backtrace_simple_callback defined in
 libbacktrace/backtrace.h.  The additional argument would be, essentially,
 the value of ip_before_insn in libbacktrace/backtrace.c and
 libbacktrace/simple.c.  The argument would tell you whether the PC follows a
 call instruction, or whether it is the correct PC for a signal frame
 instruction

OK.

I didn't realize that backtrace_full is expected to return an array of Location
structures which contain the PC for the instruction following the call but the
line number for the call.  When I saw there was a single structure containing a
PC and lineno, I assumed that it was intended to hold information for a single
location, i.e., a PC and the lineno that corresponded to it.

I will go back and try this again as you suggested.  I had tried several
variations and hit different errors and when I saw how the line number and PC
were out of sync that steered me down the wrong path.  Now that I understand
what it really should be I think it should probably work.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-09 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #40 from boger at us dot ibm.com ---
Created attachment 34995
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=34995action=edit
Proposed fix

Here is my proposed fix to correct the problem on Power, and mostly fix it for
s390/s390x.  Since I don't know the details on the s390/s390x call instructions
that have different sizes, I didn't add the code to fix the problem where the
call instruction might not be 6 bytes.

Description:

- I made a change to avoid doing the decrement in libbacktrace for Power and
s390/s390x.  I think it is better to do it this way than having libbacktrace
decrement it and then on return or through the callback increment it.  If it is
fixed up in the callback or after the return then the line number information
has already been determined in libbacktrace and it could be incorrect.  My
assumption is that no other platform has the issue with the decrement of the PC
since there are no bugzillas for them so the change in libbacktrace is only
needed for Power and s390/s390x.
- I added a new function BackupPC to back up the PC by one instruction by
subtracting the correct instruction size for the platform.  As mentioned above,
some extra work is needed here to handle different call instructions with
different sizes.
- All callers of runtime_callers, runtime.Caller, and runtime.Callers have to
recognize that the PC values being returned represent the instruction following
the call instruction, so if the call instruction is what is needed, the code
will do the back up of the PC and determine the line number based on the new
PC.  I had to make a change to log.go to do this otherwise log_test.go failed. 
I changed the code in pprof.go to call the new function.

With my patch, the line number information is now correct but the mprof
testcase fails due to other differences in the mprof output.  My patch applied
to commit 220481 works and it fails on commit 221230, but have not yet narrowed
it down to the exact commit where it starts to fail.  I am investigating
further, but here is where the output differs:

testing.go:370: The entry did not match:
(0|1): (0|2097152) \[1: 2097152\] @ 0x[0-9,a-f x]+
#   0x[0-9,a-f]+   
pprof_test\.allocateTransient2M\+0x[0-9,a-f]+   .*/mprof_test.go:30
#   0x[0-9,a-f]+   
runtime_pprof_test\.TestMemoryProfiler\+0x[0-9,a-f]+.*/mprof_test.go:65


0: 0 [1: 32] @ 0x1002815c 0x1002815c 0x10011a48 0x1000b510 0x1000b7b0
0x100c5a00 0x1001fed4
#   0x1000b510  pprof_test.allocateTransient2M+0x50
   
/home/boger/gccgo.work/go1.4/150121/bld/powerpc64le-linux/libgo/gotest8260/test/mprof_test.go:30
#   0x1000b7b0 
runtime_pprof_test.TestMemoryProfiler+0x170
/home/boger/gccgo.work/go1.4/150121/bld/powerpc64le-linux/libgo/gotest8260/test/mprof_test.go:65

On the first line, the actual output contains [1: 32] but the expected output
is [1: 2097152].

I built the latest gccgo on an x86 with no extra patches and this same failure
occurs there.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-09 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #41 from Ian Lance Taylor ian at airs dot com ---
I really don't want libbacktrace to be processor-dependent.  That makes all
uses of it more complex for no significant gain.  Maybe we should change
libbacktrace, but definitely not by making it processor-dependent.

I would prefer that changes to files like runtime/pprof/pprof.go and log/log.go
go through the master repo first.  If gccgo's runtime.Callers acts the same as
gc's runtime.Callers, then there shouldn't be any gccgo-specific need to change
those files.

Can somebody please try just editing the function callback in
runtime/go-callers.c to add one to pc when storing it in loc-pc?  I'd like to
understand why that simple change doesn't work before trying something more
complex.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-06 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #36 from boger at us dot ibm.com ---
I'd like to take a step back and clarify what the functions in question,
runtime_callers, runtime.Caller, and runtime.Callers should be returning:  the
pc values for the call instruction, or the pc values from the instruction
following the call?

The golang documentation for runtime.Callers says it is the pc for the
instruction after the call so I will assume that is what we want for that one.

For runtime.Caller I believe it is supposed to return the call instruction
itself but I'm not 100% sure on that.

Since runtime.Callers calls runtime_callers, I assume runtime_callers should be
providing the instruction after the call.  That means all callers of
runtime_callers should be decrementing pc values to the previous instruction if
the call instruction is what they want.

Does this sound right?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-06 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #37 from Ian Lance Taylor ian at airs dot com ---
The PC returned by runtime.Caller is the same as that returned by
runtime.Callers: the instruction following the call.

Otherwise, yes, I agree.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-06 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #38 from boger at us dot ibm.com ---
I've spent some time thinking about a fix for this and made a few other
observations...

I've noticed that when runtime_callers calls backtrace_full, it locks the
thread using runtime_xadd and then looks up the line number and function name
information and returns it to runtime_callers.  However, in some cases, the
caller of runtime callers doesn't use the line and file information, as happens
with runtime.Callers and others.   And actually if the caller of
runtime_callers (or runtime.Caller) wants to back up the pc, then line number
information that was determined by backtrace_full might not be correct so has
to be looked up again.

That has me wondering if runtime_callers should be changed so that it calls
backtrace_simple instead of backtrace_full and returns just the pc values.  I
don't think backtrace_simple would require the thread to be locked since it
doesn't care about the backtrace_state; then the code in libbacktrace wouldn't
have to read in and save all the debug information as it does now.  And then it
would be up to the callers of runtime_callers to determine if the pc should be
backed up and determine the line information and file information after it has
the correct pc using FuncForPC.

And then to complete the fix for this, we could add a simple function BackupPC
which would be similar to what is in pprof.go to move the pc back to the
previous instruction based on the GOARCH value.  Unfortunately I am not
familiar with all that needs to be done for the s390 and s390x case but
whatever has to be done for that would be done in this function.  And then any
time the pc needs to be backed up it would be done by calling this function.

Summarizing the changes:
- runtime_callers would call backtrace_simple instead of backtrace_full,
returning an array of just the pc values for the stack
- backtrace_simple would not decrement the pc for Power and s390.  I'm assuming
it is OK to do the decrement for other platforms because they don't have an
issue with the way it is now?  This could be controlled by an #if defined
- a new function BackupPC would be added to take a pc value as input and return
the pc for the previous instruction, as described above
- callers of runtime_callers, runtime.Caller or runtime.Callers would call
BackupPC if that was needed, and then use FuncForPC to find the file and line
number information only if needed.

I have started to work on a patch for this but wanted to put this idea out
there.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-03-06 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #39 from Ian Lance Taylor ian at airs dot com ---
Unfortunately backtrace_simple does not handle inlined calls correctly, so it
won't work for a couple of very important cases: doing a stack backtrace on
panic, and skipping the right number of frames when called from places like the
log package.  I don't think that is a workable approach.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-27 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #35 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
I'd like to bring back to attention the fact that the code that deducts six
from the pc (s390x) in pprof.go is broken regardless of what patches are made
to the runtime code.  Determining the size of the call instruction is by no
means trivial.  Other platforms certainly have similar problems with the pc -=
4.  It would be good to solve this at the same time as the pc-- problem.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-26 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #32 from Ian Lance Taylor ian at airs dot com ---
 I don't think it is a good idea to add code in multiple places to fix up the 
 pc values after calling runtime.Callers.  That seems prone to error and 
 confusing for future updates to the code.

Right, which is why I never suggested that.  I suggested changing
runtime.Callers itself to adjust the values that it gets from libbacktrace.


 - Add a wrapper function to the libgo code to call backtrace_full and then 
 adjust the pc value based on the GOARCH value, understanding what 
 backtrace_full might have done and what the GO code expects.  Then there 
 should be no direct calls to backtrace_full in libgo, but only calls to this 
 wrapper function.

Yes, that is exactly what I have been trying to say, but we don't need to
introduce a new function.  We already only call backtrace_full from a single
place in libgo: runtime_callers (which, to be clear, is not the same as
runtime.Callers).


 I think the pc mapping for inlined functions is a separate issue.  Inlining 
 happens in gccgo and not gc and happens on all gcc compilers so the mapping 
 of pc values to line numbers for inlined code is not an issue specific to 
 gccgo and that is not the issue in this testcase.  Maybe it just needs to be 
 documented so users understand that can happen or maybe inlining should be 
 disabled by default for gccgo and then if users enable it they understand 
 what could happen.

To be clear, libbacktrace can handle inlined functions just fine, and libgo
does the right thing for things like the stack traces dumped when a program
crashes.  It also does the right thing when handling the skip parameter to
runtime.Caller and runtime.Callers.  The problem arises when converting the
libbacktrace values into the single PC value expected by Go functions like
runtime.Callers.

I am not going to disable inlining by default for gccgo.  That would be a
performance killer.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-26 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #33 from boger at us dot ibm.com ---
(In reply to Ian Lance Taylor from comment #32)
  I don't think it is a good idea to add code in multiple places to fix up 
  the pc values after calling runtime.Callers.  That seems prone to error and 
  confusing for future updates to the code.
 
 Right, which is why I never suggested that.  I suggested changing
 runtime.Callers itself to adjust the values that it gets from libbacktrace.
 
 
  - Add a wrapper function to the libgo code to call backtrace_full and then 
  adjust the pc value based on the GOARCH value, understanding what 
  backtrace_full might have done and what the GO code expects.  Then there 
  should be no direct calls to backtrace_full in libgo, but only calls to 
  this wrapper function.
 
 Yes, that is exactly what I have been trying to say, but we don't need to
 introduce a new function.  We already only call backtrace_full from a single
 place in libgo: runtime_callers (which, to be clear, is not the same as
 runtime.Callers).
 

In comments 21  23 it sounded there were plans to make changes in other
locations too.  I agree with what you just said here, that you could just make
a change in runtime_callers after it calls backtrace_full to adjust the pc and
then no other changes should be needed anywhere else.

Yes I realize runtime_callers and runtime.Callers are different and I was being
sloppy when I mentioned them.

 
  I think the pc mapping for inlined functions is a separate issue.  Inlining 
  happens in gccgo and not gc and happens on all gcc compilers so the mapping 
  of pc values to line numbers for inlined code is not an issue specific to 
  gccgo and that is not the issue in this testcase.  Maybe it just needs to 
  be documented so users understand that can happen or maybe inlining should 
  be disabled by default for gccgo and then if users enable it they 
  understand what could happen.
 
 To be clear, libbacktrace can handle inlined functions just fine, and libgo
 does the right thing for things like the stack traces dumped when a program
 crashes.  It also does the right thing when handling the skip parameter to
 runtime.Caller and runtime.Callers.  The problem arises when converting the
 libbacktrace values into the single PC value expected by Go functions like
 runtime.Callers.
 
 I am not going to disable inlining by default for gccgo.  That would be a
 performance killer.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-26 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #31 from Ian Lance Taylor ian at airs dot com ---
 Why is it important to be able to map a file:line to a single PC?

The problem is that the Go code in the runtime/pprof package assumes that it
can take a single PC value and map that to a meaningful file/line.  Which it
can, but with gccgo it loses inlining information.


 But there is no useful file/line information in a signal handler anyhow.

 I would imagine file:line information *is* useful for user written signal 
 handlers and in case of a panic or crash.

Sorry, I misspoke.  I meant signal trampoline--the code in the C library that
is invoked by the kernel--not signal handler.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-26 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #30 from boger at us dot ibm.com ---
I don't think it is a good idea to add code in multiple places to fix up the pc
values after calling runtime.Callers.  That seems prone to error and confusing
for future updates to the code.  

From what I can tell, the problem originates with the use of backtrace_full
from the libgo code to find the pc stack values for some platforms.  I think if
the pc was never decremented in the backtrace code then I think the values are
right.  However I understand you can't just change that behavior due to other
existing users of this code that might expect it to continue working as before.
 Here are some options:
- Add a new function to libbacktrace for this situation where the pc is never
decremented.
- Add a flag to one of the backtrace structures to indicate you don't want the
pc decremented in libbacktrace and then use that flag in libbacktrace to
control whether the pc is decremented.
- Add a wrapper function to the libgo code to call backtrace_full and then
adjust the pc value based on the GOARCH value, understanding what
backtrace_full might have done and what the GO code expects.  Then there should
be no direct calls to backtrace_full in libgo, but only calls to this wrapper
function.

Once the stack pc values are generated as expected then runtime.Callers has the
correct values according to what its callers would expect, how it is
documented, and matches gc behavior.

I think the pc mapping for inlined functions is a separate issue.  Inlining
happens in gccgo and not gc and happens on all gcc compilers so the mapping of
pc values to line numbers for inlined code is not an issue specific to gccgo
and that is not the issue in this testcase.  Maybe it just needs to be
documented so users understand that can happen or maybe inlining should be
disabled by default for gccgo and then if users enable it they understand what
could happen.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-26 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #34 from Ian Lance Taylor ian at airs dot com ---
You're right, we could just change runtime_callers.  Or, simpler still, just
change callback in libgo/runtime/go-callers.c.  Good point.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-26 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #29 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
(In reply to Ian Lance Taylor from comment #27)
 The problem is that we can't represent full file/line
 information using a single PC value, because a single PC value can't
 represent inlined functions.

Why is it important to be able to map a file:line to a single PC?

 But there is no useful file/line information in a signal handler anyhow.

I would imagine file:line information *is* useful for user written signal
handlers and in case of a panic or crash.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-24 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #26 from boger at us dot ibm.com ---
(In reply to Ian Lance Taylor from comment #25)
 To which code in libgcc are you referring?  I don't see it.
 

Here is the code I was referring to, but I was wrong when I thought it fixed
the problem.  Even though the testcase passed on further inspection I see that
the pc values are still wrong.

ip a.k.a. pc

unwind-c.c:

#ifdef HAVE_GETIPINFO
  ip = _Unwind_GetIPInfo (context, ip_before_insn);
#else
  ip = _Unwind_GetIP (context);
#endif
  if (! ip_before_insn)
--ip;

 Our goal has to be for runtime.Callers to return the same sort of values in
 gccgo as it does in gc.  That means that your change to
 go/runtime/pprof/pprof.go looks wrong to me.  It suggests that the ppc64
 values are different from those of other architectures.  What's the
 justification for that?

Sorry I was only experimenting with a solution for Power, just to understand
what could be changed to make it work so it was not meant to be a solution for
other platforms.  I could see that the pc was being decremented wrong for Power
so once I fixed that it didn't make sense to decrement it again in pprof.go. 
That was my thinking.

After rereading all you've wrote I think we are coming to same result
(somewhat).  If I read the documentation for runtime.Callers, it says that the
pc values are the return program counters, but in gccgo it is program counter
for the call instruction, so runtime.Callers does need to be changed.  But in
one place you are just incrementing the pc, so won't that be wrong for the
cases where libbacktrace didn't decrement it?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-24 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

boger at us dot ibm.com changed:

   What|Removed |Added

 CC||boger at us dot ibm.com

--- Comment #24 from boger at us dot ibm.com ---
In my opinion, the way the decrement of pc is done in libbacktrace is
incorrect.  The code in libbacktrace looks similar to code in libgcc, where the
pc type is _Unwind_Ptr.  Since the libgcc has been around for quite a while I
assume it is correct, and I see no reason why the code to do the pc decrement
in libbacktrace should be different from what's in libgcc.  

I made this change to libbacktrace/backtrace.c and simple.c:

 if (!ip_before_insn)
pc = (uintptr_t)((_Unwind_Ptr)pc-1);

And then in pprof.go, added this so that the pc value did not get decremented
again for ppc64  ppc64le:

  tracepc := pc
// Back up to call instruction.
if i  0  pc  f.Entry()  !wasPanic {
if runtime.GOARCH == 386 || runtime.GOARCH ==
amd64 {
tracepc--
} else if runtime.GOARCH == s390 ||
runtime.GOARCH == s390x {
// only works if function was called
// with the brasl instruction (or a
// different 6-byte instruction).
tracepc -= 6
} else if runtime.GOARCH != ppc64 
runtime.GOARCH != ppc64le {
tracepc -= 4 // arm, etc
}
}

After making this change the regression went away and no further failures
occurred on ppc64 and ppc64le.  I am not familiar with other platforms (i.e.,
other supported GOARCHes) to understand how they might be affected by this. 
However, since the code that appears in libgcc has existed and been used for
quite some time, that code should be right and decrementing the pc in
libbacktrace in the same way as is done in libgcc should be correct for all
platforms that use it.  Then that leads to the question of what is done in
pprof.go, which looks like a hack to undo what was done incorrectly in
libbacktrace at least for some platforms.

I will continue to do further testing, but would be curious to know if this
change to libbacktrace would fix s390, assuming the special check in pprof.go
would be removed?  And then I don't think you should have to make changes in
other locations, once libbacktrace has been corrected.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-24 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #25 from Ian Lance Taylor ian at airs dot com ---
To which code in libgcc are you referring?  I don't see it.

Our goal has to be for runtime.Callers to return the same sort of values in
gccgo as it does in gc.  That means that your change to
go/runtime/pprof/pprof.go looks wrong to me.  It suggests that the ppc64 values
are different from those of other architectures.  What's the justification for
that?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-24 Thread boger at us dot ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #28 from boger at us dot ibm.com ---
I'm not concerned about the inline case.  The user could build without inlining
if it was important. 

To me it seems like you don't want libbacktrace to decrement the pc when it is
being called by runtime.Callers.  Then runtime.Callers should be correct and no
changes needed for pprof.go.  But I suppose that messes up other users of
libbacktrace unless you somehow had a way to indicate not to decrement in this
case or to create a different unwind callback function.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-24 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #27 from Ian Lance Taylor ian at airs dot com ---
The runtime.Callers function (and friends) are somewhat broken for gccgo no
matter what we do.  The problem is that we can't represent full file/line
information using a single PC value, because a single PC value can't represent
inlined functions.

Fortunately, the case where libbacktrace does not decrement the PC is not very
interesting for Go, because we aren't going to get any useful file/line
information for that case anyhow.  It's the case of unwinding through a signal
handler: the PC value in the signal handler does not follow a call instruction,
and therefore should not be decremented.  But there is no useful file/line
information in a signal handler anyhow.

I'm still not fully certain as to what the best fix is.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-18 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #20 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
runtime_MProf_Malloc() calls runtime_callers() without going through
runtime.Callers():

#0  runtime_callers (skip=skip@entry=1, locbuf=locbuf@entry=0xc2094734b8, 
m=m@entry=32, keep_thunks=keep_thunks@entry=0 '\000')
at ../../../libgo/runtime/go-callers.c:186
#1  0x80023150 in runtime_MProf_Malloc (p=p@entry=0xc20957ac00, 
size=size@entry=1024) at ../../../libgo/runtime/mprof.goc:189
#2  0x80021722 in profilealloc (size=1024, v=0xc20957ac00)
at ../../../libgo/runtime/malloc.goc:315
#3  runtime_mallocgc (size=size@entry=1024, typ=typ@entry=2148440120, 
flag=flag@entry=0) at ../../../libgo/runtime/malloc.goc:253
#4  0x8000e30e in __go_new (
td=td@entry=0x800e9838 __go_td_S1_xAN5_uint81024ee, size=size@entry=1024)
at ../../../libgo/runtime/go-new.c:15
#5  0x8000c174 in pprof_test.allocateTransient1M () at mprof_test.go:21
#6  0x8000c3d2 in runtime_pprof_test.TestMemoryProfiler (
t=0xc2088502d0) at mprof_test.go:64


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-18 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #23 from Ian Lance Taylor ian at airs dot com ---
Yes, I do mean to change saveg in mprof.goc.

runtime_callers in general returns full file/line information, which is
required for full correctness when using gccgo.  When it devolves back to a
plain PC, you've correctly observed that it gets the wrong PC.  I'm suggesting
correcting the PC value wherever we devolve, which happens in runtime.Callers
and, as you've pointed out, in saveg in mprof.goc.

I looked for other calls to runtime_callers, and found one other case: in
runtime_sigprof in proc.c.  I think it's worth trying fixing those cases to see
if it fixes this bug.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-18 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #22 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
You mean the function saveg() code in mprof.goc?

I'm actually not sure how the above patch to runtime_callers() interacts with
all the other functions that call runtime_callers().  :-/


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-18 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #21 from Ian Lance Taylor ian at airs dot com ---
Ah, thanks.  So we should change that to
r-stk[i] = locstk[i].pc + 1;
too.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-17 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #19 from Ian Lance Taylor ian at airs dot com ---
I see a call to runtime.Callers in Profile.Add in
libgo/go/runtime/pprof/pprof.go.  If the PC values in question do not come from
there, where do they come from?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-17 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #18 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
Created attachment 34788
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=34788action=edit
Different fix

How about this patch to the callback() function in go-callers.c to modify the
incoming pc.  The algorithm to do this is moved from ppfor.go to go-callers.c.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-17 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #17 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
runtime.Callers is not called in this test case, so changing it won't help.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-16 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #16 from Ian Lance Taylor ian at airs dot com ---
Back in comment #9 I was trying to suggest that runtime.Callers should adjust
the PC that it returns to Go code.  That seems simpler and should have the same
effect as your patch.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-16 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #13 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
I see.

But what bug or bugs do we have here then?  Current symptoms are:

(1) A bogus call addres may be stored in the backtrace structure (i.e. next
instruction's address minus 1).

(2) The address from (1) in printed in the textual backtrace.  As I understand
it, the code should instead print the address calculated in printStackRecord().

(3) At least for s390[x], the caller's line number in the printed backtrace may
be wrong, but I guess that affects other targets as well.

--

So, what happens is this:

* The code in libbacktrace knows whether the pc points to the call instruction
or the instruction after that and handles the latter by decrementing the pc by
one before it's stored in some structure describing the stack trace.

* When the user prints a memory heap trace, writeHeap() is called, which prints
the address from the stack trace structure and then calls printStackRecord().

* printStackRecord() then deducts the guessed size of the call instruction from
the pc and uses this modified pc to determine the file and line number to print
in the stack trace.

--

I see at least two things that need to be fixed:

- printStackRecord() needs to know whether the pc has already been decremented
in libbacktrace and adapt the calculation accordingly.  On s390[x] this could
simply be done by looking at the pc, because odd pcs are invalid.  This is not
the case on other architectures though (e.g. ARM7).

- writeHeap should print the address calculated in printStackRecord() instead
of the address from the stack trace structure.

As an alternative, if the code in libbacktrace could properly determine the
call address, the address calculation code in printStackTrace() could be
removed and everything would be fine.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-16 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #14 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
Created attachment 34773
  -- https://gcc.gnu.org/bugzilla/attachment.cgi?id=34773action=edit
Experimental fix

The attached patch is a sketch of a possible fix (at the expense of duplicating
some code from printStackRecord() in a separate function).

This fixes the problems running the test runtime/pprof on s390 and should also
fix the inconsistencies in the printed stack trace.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-16 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #15 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
P.S.: The patch only addresses s390 and s390x.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-12 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #12 from Ian Lance Taylor ian at airs dot com ---
I should add that for purposes of Go, it's not all that important that
libbacktrace does not yet handle sibling calls, because the Go compiler turns
on -fno-optimize-sibling-calls by default
(https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html in gcc/go/go-lang.c).


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-12 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #11 from Ian Lance Taylor ian at airs dot com ---
libbacktrace is all about stack backtraces.  It is not about handling
exceptions.

libbacktrace handles inlined calls and hand written trampolines, assuming the
DWARF information is correct.  libbacktrace does not currently handle sibling
calls; it could, based on http://www.dwarfstd.org/ShowIssue.php?issue=100909.2,
but nobody has implemented it.

The glibc backtrace_symbols call is a simpler function that is not nearly as
powerful as libbacktrace.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-11 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #9 from Ian Lance Taylor ian at airs dot com ---
I think that libbacktrace is doing more or less the right thing.  If we don't
subtract one from pc there, we have no way to convey signal handler frames
correctly.  Also, the resulting PC value is in the instruction we want, not the
next instruction.

So probably runtime.Callers (in libgo/runtime/go-callers.c) ought to adjust the
value, since that is what the Go code expects.

There is a deeper problem with this code, which is that although the gc
compiler can reliably map a PC value to a location, the gccgo compiler can not,
because of inlining.  To get inlining right, you need to collect the file/line
information when unwinding; you can't recreate it afterward.  runtime_callers
get the right information, but runtime.Callers throws it away.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-11 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #10 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
As far as I understand, the code in libbacktrace was originally only intended
for handling exceptions, not for generating stack traces?  For the former, the
code is fine.  But given a function's return address, identifying the call
instruction is non-trivial.  Gdb has complex heuristics to get that right in
case of inlined functions, tail calls or the hand written trampolines in
libffi.  Glibc probably has the same complexity for backtrace_symbols() call. 
With libgo's backtrace functionality we need to duplicate the same logic in
libbacktrace, no?


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #2 from Ian Lance Taylor ian at airs dot com ---
From a cursory look the problem is that the regexp expects line 66 but the
actual output is line 65.  Looking at the code line 66 seems correct.  However
if the code is built with optimization it's easy to imagine this being not
quite correct for one reason or another.  Probably the test should be loosened.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #5 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
Just noticed that all function offsets in the stack traces are broken.  On
s390x, only even offsets are valid, but all the numbers are odd, e.g.

  pprof_test.allocatePersistent1K+0x23
 ^


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #1 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
These entries looks screwed to me:

-- snip --
32: 1024 [32: 1024] @ ...
#0x8000c44b ...allocatePersistent1K ... mprof_test.go:43
#0x8000c595 ...TestMemoryProfiler ... mprof_test.go:65
...

0: 0 [1: 32] @ ...
#0x8000c3b7 ...allocateTransient2M ... mprof_test.go:30
#0x8000c58f ...TestMemoryProfiler ... mprof_test.go:64
...
-- snip --

In line 64 there's a call to allocateTransient1M(), not to
allocateTransient2M() as the stack trace claims, and likewise line 65 calls
allocateTransient2M() and not allocatePersistent1K().  On top of that, the
output for line 66 is missing completely.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #4 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
The stack traces in gdb look good:

line 64:

#0  __go_new (td=td@entry=0x8010c8e8 __go_td_S1_xAN5_uint81024ee, 
size=size@entry=1024) at ../../../libgo/runtime/go-new.c:14
#1  0x8000c32c in pprof_test.allocateTransient1M () at mprof_test.go:21
#2  0x8000c58a in runtime_pprof_test.TestMemoryProfiler (
t=0xc2088502d0) at mprof_test.go:64

line 65:

#0  __go_make_slice1 (td=td@entry=0x80109848 __go_td_AN5_uint8e, 
len=len@entry=2097152) at ../../../libgo/runtime/go-make-slice.c:74
#1  0x8000c3a8 in pprof_test.allocateTransient2M () at mprof_test.go:30
#2  0x8000c590 in runtime_pprof_test.TestMemoryProfiler (
t=0xc2088502d0) at mprof_test.go:65

line 66:

#0  __go_new (td=td@entry=0x8010cc18 __go_tdn_runtime_pprof_test.Obj32, 
size=size@entry=32) at ../../../libgo/runtime/go-new.c:14
#1  0x8000c44c in pprof_test.allocatePersistent1K ()
at mprof_test.go:43
#2  0x8000c596 in runtime_pprof_test.TestMemoryProfiler (
t=0xc2088502d0) at mprof_test.go:66


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #6 from Ian Lance Taylor ian at airs dot com ---
The odd addresses are most likely a symptom of the libbacktrace library.  It
should probably be considered a bug.  I'm guessing that it's because of the
--pc in the static function unwind in libbacktrace/backtrace.c.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #7 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
The --pc is definitely the cause of the bogus addresses.  To me it seems that
these addresses are intended to identify the function and source line from the
dwarf info, not to be printed out as a real address (which they are not).

Anyway, the printed address and source line information do not match.

  # 0x8000c595 runtime_pprof_test.TestMemoryProfiler+0x11d mprof_test.go:65

From the disassembly (function names shortened):

   0x8000c584 +268:brasl%r14,0x8000c308 allocateTransient1M
   0x8000c58a +274:brasl%r14,0x8000c370 allocateTransient2M
   0x8000c590 +280:brasl%r14,0x8000c428
allocatePersistent1K
   0x8000c596 +286:lghi%r1,0

The address points to line 66 in the file, but the textual trace associates it
with line 65.  On the other hand gdb properly identifies the source line, so
the dwarf information must be correct:

  (gdb) $pc = 0x8000c595
  (gdb) p $pc
  $1 = (void (*)()) 0x8000c595 runtime_pprof_test.TestMemoryProfiler+285
  (gdb) bt
  #0  0x8000c595 in runtime_pprof_test.TestMemoryProfiler (
  t=0xf318aa8) at mprof_test.go:66
  


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread vogt at linux dot vnet.ibm.com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #8 from Dominik Vogt vogt at linux dot vnet.ibm.com ---
The cause for this bug is the interaction between the --pc and this code
snippet from printStackRecord() in pprof.go:

  ... if runtime.GOARCH == s390 || runtime.GOARCH == s390x {
// only works if function was called
// with the brasl instruction (or a
// different 6-byte instruction).
tracepc -= 6
  } ...

First, the pc is decremented by one, pointing to the last byte of the brasl
instruction.  Then we subtract another six bytes and end up pointing to the
last byte of the previous instruction, and that messes up the stack trace. 
printStackRecord() does not know that unwind() has already decremented the pc
and may mis-calculate the instruction address, possibly not just for s390[x]
but for other architectures as well.


[Bug go/64999] s390x libgo test failure in TestMemoryProfiler

2015-02-10 Thread ian at airs dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64999

--- Comment #3 from Ian Lance Taylor ian at airs dot com ---
Of course it's also possible that there is some bug in libbacktrace.  It's
worth checking what gdb reports.