Re: RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-08 Thread Stuart Marks

On 12/8/14 12:35 PM, Jaroslav Bachorik wrote:

Please, review the following test change

Issue : https://bugs.openjdk.java.net/browse/JDK-8066708
Webrev: http://cr.openjdk.java.net/~jbachorik/8066708/webrev.00

The test fails very intermittently when RMI registry is trying to bind to a port
previously used in the test (via ServerSocket).

This seems to be caused by the sockets created via `new ServerSocket(0)`
being in reusable mode. The fix attempts to prevent this by explicitly
disabling the reusable mode.


Hi Jaroslav,

I happened to see this fly by, and there are (I think) some similar issues going 
on in the RMI tests.


But first I'll note that I don't think setReuseAddress() will have the effect 
that you want. Typically it's set to true before binding a socket, so that a 
subsequent bind operation will succeed even if the address/port is already in 
use. ServerSockets created with new ServerSocket(0) are already bound, and I'm 
not sure what calling setReuseAddress(false) will do on such sockets. The spec 
says behavior is undefined, but my bet is that it does nothing.


I guess it doesn't hurt to try this out to see if it makes a difference, but I 
don't have much confidence it will help.
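
To make the ordering concrete, here is a minimal sketch of setting the option before the bind rather than after it. It assumes nothing about the actual test code; the class and method names (`ReuseAddressBeforeBind`, `openEphemeral`) are made up for illustration, and whether this changes the failure rate is untested.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, not part of JMXStartStopTest.
public class ReuseAddressBeforeBind {

    /** Opens a server socket on an ephemeral port, setting SO_REUSEADDR before binding. */
    public static ServerSocket openEphemeral(boolean reuse) throws IOException {
        ServerSocket ss = new ServerSocket();        // no-arg constructor: socket is still unbound
        try {
            ss.setReuseAddress(reuse);               // well-defined only while unbound
            ss.bind(new InetSocketAddress(0));       // port 0: let the OS pick a free port
        } catch (IOException e) {
            ss.close();                              // don't leak the socket on failure
            throw e;
        }
        return ss;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket ss = openEphemeral(false)) {
            System.out.println("bound=" + ss.isBound() + " port=" + ss.getLocalPort());
        }
    }
}
```

By contrast, `new ServerSocket(0)` binds inside the constructor, so any later `setReuseAddress()` call lands in the unspecified territory described above.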


The potential similarity to the RMI tests is exemplified by JDK-8049202 (sorry, 
this bug report isn't open). Briefly, that test exercises the RMI registry as follows:


1. Opens port 1099 using new ServerSocket(1099) [1099 is the default
   RMI registry port] in order to ensure that 1099 isn't in use by
   something else already;

2. If this succeeds, it immediately closes the ServerSocket.

3. Then it creates a new RMI registry on port 1099.

In principle, this should succeed, yet it fails around 10% of the time on some 
systems. The error is "port already in use". My best theory is that even though 
the socket has just been closed by a user program, the kernel has to run the 
socket through some of the socket states such as FIN_WAIT_1, FIN_WAIT_2, or 
CLOSING before the socket is actually closed and is available for reuse. If a 
program -- even the same one -- attempts to open a socket on the same port 
before the socket has reached its final state, it will get an "already in use" 
error.


If this is true I don't believe that setting SO_REUSEADDR will work if the 
socket is in one of these final states. (I remember reading this somewhere but 
I'm not sure where at the moment. I can try to dig it up if there is interest.)


I admit this is just a theory and I'm open to alternatives, and I'm also open to 
hearing about ways to deal with this problem.
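
One possible way to cope with it, sketched under the assumption that the theory above is right and the port frees up after a short wait: retry the bind a few times before giving up. The names `BindRetry` and `bindWithRetry`, and the attempt count and delay, are hypothetical and not taken from any existing test; a plain ServerSocket stands in for the RMI registry here.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical helper illustrating a bounded retry on "address already in use".
public class BindRetry {

    /**
     * Tries to open a ServerSocket on the given port, sleeping between attempts
     * when the bind fails with BindException. Rethrows the last BindException
     * if every attempt fails.
     */
    public static ServerSocket bindWithRetry(int port, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        BindException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                last = e;                    // remember the failure and wait before retrying
                Thread.sleep(delayMillis);
            }
        }
        throw (last != null)
                ? last
                : new BindException("no bind attempts made for port " + port);
    }
}
```

This only hides the race rather than explaining it, so it is more of a diagnostic aid than a fix: if a retry after a short delay succeeds reliably, that would support the socket-state theory.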


Could something similar be going on with this JMX test?

s'marks


Re: A hotspot patch for stack profiling (frame pointer)

2014-12-08 Thread Brendan Gregg
G'Day Mikael,

On Mon, Dec 8, 2014 at 9:15 AM, Mikael Gerdin  wrote:
> Maynard,
>
>
> On 2014-12-08 16:05, Maynard Johnson wrote:
>>
[...]
>>
>> And if the VM creates a /tmp/perf-.map file to save information about
>> JITed methods, the perf's post-profiling tool will find it and use it to
>> correlate sampled addresses it collected from the VM's executable anonymous
>> memory mappings to the method names.
>
>
> I seem to recall reading about perf having support for DWARF debug info.
>
> If the VM (or a JVM/TI agent) could create DWARF debug symbols, could that
> be used to convey information about inlined functions and stack unwinding
> without frame pointers?
> I realize that emitting DWARF debug symbols for generated code is not a
> trivial undertaking but since perf is running sampling in the kernel and we
> can't disable inlining that seems to be one of the few ways we can get
> complete stack traces.

It's a good idea, but I'm not sure the DWARF unwind approach is
suitable for dynamic JIT. I'm usually sampling at 99 Hertz. With
inlined symbols, just the perf.map file can become 10s of Mbytes, and
I assume the DWARF info would be similar. So the file would need to be
in a consistent state so that perf can begin reading it anytime, and
do stack walking based on what it reads, while at the same time
symbols may be compiled anytime and the map file would need to change.

With the frame pointer approach, perf always knows how to walk the
stack, at any time. If symbols move during the profile, it breaks
translation but not walking. And there are different ways to deal with
the translation issue (collect before and after maps and note
differences, or do timestamped maps).

I assume the reliable option is having kernel support for Java
unwinding (like the Solaris approach mentioned previously). Frame
pointer support can be an option for situations when the kernel
support isn't available, while noting its caveats.

>
> There would be several other advantages to having DWARF symbols for
> generated code, GDB can use them when debugging the JVM for example.
>
> An alternate approach could be to extend the information in perf-.map
> to have more detailed PC ranges with information about which functions are
> inlined. A lot of that information is available in the VM but not
> necessarily exposed via the tool APIs

Johannes has done some of this with the perf-map-agent "unfold" option
(https://github.com/jrudolph/perf-map-agent), which includes inlined
information. I've tried adding an extra filter step to resuscitate
frames that were inlined, which sort-of worked (needs more work).

However, having inlined stacks hasn't been that much of a problem.
I've shown my flame graphs to developers, noting that inlined frames
can't be seen, and so far they can still follow what's going on (the
use case here is performance profiling, to figure out where the bulk
of CPU time is spent). jstack(1) output can be used for clues, to see
how the inlined code maps to the full stacks. And there are JVM
tunables that can be used to reduce inlining.

Brendan


RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-08 Thread Jaroslav Bachorik

Please, review the following test change

Issue : https://bugs.openjdk.java.net/browse/JDK-8066708
Webrev: http://cr.openjdk.java.net/~jbachorik/8066708/webrev.00

The test fails very intermittently when RMI registry is trying to bind 
to a port previously used in the test (via ServerSocket).


This seems to be caused by the sockets created via `new ServerSocket(0)` 
being in reusable mode. The fix attempts to prevent this by 
explicitly disabling the reusable mode.


Thanks,

-JB-


Re: A hotspot patch for stack profiling (frame pointer)

2014-12-08 Thread Brendan Gregg
G'Day Staffan,

On Mon, Dec 8, 2014 at 11:17 AM, Staffan Larsen wrote:
>
>> On 8 dec 2014, at 16:05, Maynard Johnson  wrote:
>>
[...]
>>
>> And if the VM creates a /tmp/perf-.map file to save information about
>> JITed methods, the perf's post-profiling tool will find it and use it to
>> correlate sampled addresses it collected from the VM's executable anonymous
>> memory mappings to the method names.
>
> Is there a way in this .map file to express that different JITed methods are 
> located at the same address at different times? This typically happens a lot 
> when classes and their JITed methods are being unloaded from the VM. That 
> space will be reused by a different method. I’m guessing this would confuse 
> perf.

In the .map file, no, at least not currently.

However, consider the following perf sampled stack trace (this is from
my patched OpenJDK 8, with frame pointers):

# perf record -F 99 -a -g -- sleep 10
# perf script
[...]
java 10532 [008] 3444046.716431: cpu-clock:
7fe919301c30  (/tmp/perf-10490.map)
7fe91934d50c  (/tmp/perf-10490.map)
7fe9193a43d0  (/tmp/perf-10490.map)
7fe9194ffcf0  (/tmp/perf-10490.map)
7fe9195026d0  (/tmp/perf-10490.map)
7fe9194ffc4c  (/tmp/perf-10490.map)
7fe91b6c440c  (/tmp/perf-10490.map)
7fe91afa9c00  (/tmp/perf-10490.map)
7fe91ab739f4  (/tmp/perf-10490.map)
7fe91df23630  (/tmp/perf-10490.map)
7fe91ab739f4  (/tmp/perf-10490.map)
7fe91acc7ea8  (/tmp/perf-10490.map)
7fe91c4fa014  (/tmp/perf-10490.map)
7fe9190072e0  (/tmp/perf-10490.map)
7fe9190072e0  (/tmp/perf-10490.map)
7fe919007325  (/tmp/perf-10490.map)
7fe9190004e7  (/tmp/perf-10490.map)
7fe92f70670e JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) (/mnt/openjdk8/build/linux-x86_64-normal-server-release/jdk/lib/am
7fe92f707a3f JavaCalls::call_virtual(JavaValue*, KlassHandle, Symbol*, Symbol*, JavaCallArguments*, Thread*) (/mnt/openjdk8/build/linux-x86_64-normal-server-r
7fe92f707edf JavaCalls::call_virtual(JavaValue*, Handle, KlassHandle, Symbol*, Symbol*, Thread*) (/mnt/openjdk8/build/linux-x86_64-normal-server-release/jdk/l
7fe92f741668 thread_entry(JavaThread*, Thread*) (/mnt/openjdk8/build/linux-x86_64-normal-server-release/jdk/lib/amd64/server/libjvm.so)
7fe92fa555c8 JavaThread::thread_main_inner() (/mnt/openjdk8/build/linux-x86_64-normal-server-release/jdk/lib/amd64/server/libjvm.so)
7fe92fa5581c JavaThread::run() (/mnt/openjdk8/build/linux-x86_64-normal-server-release/jdk/lib/amd64/server/libjvm.so)
7fe92f916ba2 java_start(Thread*) (/mnt/openjdk8/build/linux-x86_64-normal-server-release/jdk/lib/amd64/server/libjvm.so)
7fe92ff95e9a start_thread (/lib/x86_64-linux-gnu/libpthread-2.15.so)

perf script emits every sampled stack trace, which I normally use for
flame graph generation. If I had Johannes's perf-map-agent loaded,
those hex addresses would become Java method symbols.

Note the timestamp on the cpu-clock line.

So... It should be easy to change perf-map-agent to write a
timestamped map file, to a different location (so perf doesn't find
it). Use the same timestamp type as perf. Then, a little Perl wrapper
can take "perf script" output and translate the addresses based on the
timestamped map file. (perf itself can be enhanced to do this,
although the time frame for a perf change propagating to Linux users
may be years, so the Perl wrapper could be used in the meantime.)
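
As a rough illustration of the translation step (written in Java here rather than Perl, purely as a sketch), the lookup could work as below. The four-field line format `timestamp start size name` is an assumption: it is the usual perf map layout of hex start address, hex size and symbol name, with a hypothetical load timestamp prepended. The class name `TimestampedPerfMap` and the sample symbols are made up, and unload events are not modeled; a later entry at the same address simply supersedes an earlier one.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of address translation against a hypothetical timestamped map file.
public class TimestampedPerfMap {

    private static final class Entry {
        final double loadedAt;   // same clock as the perf sample timestamps (assumed)
        final long start;        // start address of the JITed code
        final long size;         // code size in bytes
        final String name;       // symbol name

        Entry(double loadedAt, long start, long size, String name) {
            this.loadedAt = loadedAt;
            this.start = start;
            this.size = size;
            this.name = name;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Parses one line of the assumed form: "<timestamp> <hex start> <hex size> <name>". */
    public void addLine(String line) {
        String[] p = line.trim().split("\\s+", 4);
        if (p.length < 4) {
            return;              // skip malformed lines
        }
        entries.add(new Entry(Double.parseDouble(p[0]),
                              Long.parseUnsignedLong(p[1], 16),
                              Long.parseUnsignedLong(p[2], 16),
                              p[3]));
    }

    /** Returns the symbol covering the address that was loaded most recently before the sample, or null. */
    public String resolve(long address, double sampleTime) {
        Entry best = null;
        for (Entry e : entries) {
            boolean covers = address >= e.start && address - e.start < e.size;
            if (covers && e.loadedAt <= sampleTime
                    && (best == null || e.loadedAt > best.loadedAt)) {
                best = e;
            }
        }
        return best == null ? null : best.name;
    }
}
```

A wrapper would feed each frame address from "perf script" together with the timestamp on the cpu-clock line into `resolve()`. The linear scan keeps the sketch short; a real tool would want an indexed structure for maps with many entries.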

I haven't needed to do this yet. map file churn in production has been
small (depends on workload). The real pain for us is switching apps
from Oracle JDK to my patched OpenJDK, so we get the frame pointer.

Brendan


Re: RFR 8059949: com/sun/tools/attach/StartManagementAgent.java interrupted! (timed out?)

2014-12-08 Thread Staffan Larsen
Looks good!

Thanks,
/Staffan

> On 8 dec 2014, at 19:09, Jaroslav Bachorik wrote:
> 
> Please, review this small test adjustment
> 
> Issue : https://bugs.openjdk.java.net/browse/JDK-8059949
> Webrev: http://cr.openjdk.java.net/~jbachorik/8059949/webrev.01
> 
> The test is failing very intermittently when run on fastdebug builds. In the 
> diagnostic output there is no indication of any unexpected errors - the test 
> just times out while trying to start the local JMX agent. This change is an 
> attempt to give the test more time to finish by increasing the timeout.
> 
> Thanks,
> 
> -JB-



Re: A hotspot patch for stack profiling (frame pointer)

2014-12-08 Thread Staffan Larsen

> On 8 dec 2014, at 16:05, Maynard Johnson  wrote:
> 
> On 12/05/2014 05:09 PM, Brendan Gregg wrote:
>> G'Day Volker,
>> 
>> On Fri, Dec 5, 2014 at 11:22 AM, Volker Simonis wrote:
>>> Hi Brendan,
>>> 
>>> I'm still not understanding who is taking the actual stack traces (let
>>> alone the symbols) in your examples. Is this done by 'perf' itself
>>> based only on the frame pointer?
>> 
>> perf is walking the frame pointers.
> Volker, to be specific, the perf profiling tool has a user space part and a
> kernel space part. The collection of stack traces is done by the kernel.
> When a user-specified event (or series of events) occurs, the process
> being profiled is interrupted and the sampled information (which can
> optionally include a full stack trace) is made available to the user space
> perf tool to be saved to a file for future post-profiling processing.
> 
> During the profiling phase, the perf tool collects information about the
> profiled process's memory mappings, which allows for this address-to-symbol
> resolution. It's in the post-profiling phase where the sampled instruction,
> along with its associated stack trace, is resolved to the appropriate symbol
> (i.e., function/method) in a specific binary file (e.g., library, executable).
> 
> And if the VM creates a /tmp/perf-.map file to save information about
> JITed methods, the perf's post-profiling tool will find it and use it to
> correlate sampled addresses it collected from the VM's executable anonymous
> memory mappings to the method names.

Is there a way in this .map file to express that different JITed methods are 
located at the same address at different times? This typically happens a lot 
when classes and their JITed methods are being unloaded from the VM. That space 
will be reused by a different method. I’m guessing this would confuse perf.

/Staffan

> 
> -Maynard
>> 
>> A JVMTI agent, perf-map-agent, is providing a map file for symbol
>> translation under /tmp/perf-PID.map. Linux perf already hunts for such
>> a file when doing symbol translation.
>> 
>>> 
>>> As I wrote before, this is pretty hard to get right for a JVM, but
>>> there are good approximations. Have you looked at the 'jstack' tool
>>> which is part of the JDK? If you run it on a Java process, it will
>>> give you exact stack traces with full inlining information. However
>>> this only works at safepoints so it is probably not suitable for
>>> profiling with performance counters.
>> 
>> Right, jstack works, and I get full correct stacks. I do really want
>> to take stacks at any moment: not just CPU samples, but when tracing
>> kernel TCP events, or PMC cache miss profiling, etc. perf can already
>> do many advanced tracing and profiling activities. I just needed the
>> Java stacks for context.
>> 
>>> But you can also use 'jstack -F
>>> -m' which gives you a 'best effort' mixed Java/C++ stacktrace (most
>>> of the time even with inlined Java frames). This is probably the best
>>> you can get when interrupting a running JVM at an arbitrary point in
>>> time. As you mentioned in one of your blogs, the VM can be in the
>>> C-Library or even in the kernel at that time which don't preserve the
>>> frame pointer either. So it will be already hard to even walk up to
>>> the first Java frame.
>> 
>> Well, the JVMs I'm looking at are already built with
>> -fno-omit-frame-pointer (which is good). I edited hotspot to preserve
>> it as well.
>> 
>> Here's before I changed hotspot:
>> 
>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-nofp.svg
>> 
>> Yes, most stacks are clearly broken.
>> 
>> After changing hotspot:
>> 
>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg
>> 
>> It's looking pretty good. If you look carefully on the far left and
>> right, there are 0.8% stacks in read() and write() directly from java,
>> which may well be broken (unless a java thread is calling these
>> directly; there could also be some gcc inlining going on). Even if
>> they are broken, I can see 98% of my profile. Plus, I'd be interested
>> to know what exactly is reusing the frame pointer, so we could fix
>> that too.
>> 
>> The Java stacks themselves are also about a third as deep as they
>> should be, due to inlining.
>> 
>>> 
>>> But nevertheless, if the output of 'jstack -F -m' is "good enough" for
>>> your purpose, you can implement something similar in 'perf' or a
>>> helper library of 'perf' and be happy (I don't actually know how perf
>>> takes stack traces but I suppose there may be some kind of callback
>>> mechanism for walking unknown frames). This is actually not so hard.
>>> I've recently implemented a "print_native_stack()" function within
>>> hotspot itself (you can call it for example from gdb during debugging
>>> - see http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/86183a940db4).
>>> Maybe you could call this function directly from 'perf' if perf
>>> attaches with ptrace to the process (I assume it does or how else
>>> could it walk the stack)?
>

RFR 8059949: com/sun/tools/attach/StartManagementAgent.java interrupted! (timed out?)

2014-12-08 Thread Jaroslav Bachorik

Please, review this small test adjustment

Issue : https://bugs.openjdk.java.net/browse/JDK-8059949
Webrev: http://cr.openjdk.java.net/~jbachorik/8059949/webrev.01

The test is failing very intermittently when run on fastdebug builds. In 
the diagnostic output there is no indication of any unexpected errors - 
the test just times out while trying to start the local JMX agent. This 
change is an attempt to give the test more time to finish by increasing 
the timeout.


Thanks,

-JB-


Re: A hotspot patch for stack profiling (frame pointer)

2014-12-08 Thread Mikael Gerdin

Maynard,

On 2014-12-08 16:05, Maynard Johnson wrote:

On 12/05/2014 05:09 PM, Brendan Gregg wrote:

G'Day Volker,

On Fri, Dec 5, 2014 at 11:22 AM, Volker Simonis wrote:

Hi Brendan,

I'm still not understanding who is taking the actual stack traces (let
alone the symbols) in your examples. Is this done by 'perf' itself
based only on the frame pointer?


perf is walking the frame pointers.

Volker, to be specific, the perf profiling tool has a user space part and a
kernel space part. The collection of stack traces is done by the kernel.
When a user-specified event (or series of events) occurs, the process
being profiled is interrupted and the sampled information (which can
optionally include a full stack trace) is made available to the user space
perf tool to be saved to a file for future post-profiling processing.

During the profiling phase, the perf tool collects information about the
profiled process's memory mappings, which allows for this address-to-symbol
resolution. It's in the post-profiling phase where the sampled instruction,
along with its associated stack trace, is resolved to the appropriate symbol
(i.e., function/method) in a specific binary file (e.g., library, executable).

And if the VM creates a /tmp/perf-.map file to save information about
JITed methods, the perf's post-profiling tool will find it and use it to
correlate sampled addresses it collected from the VM's executable anonymous
memory mappings to the method names.


I seem to recall reading about perf having support for DWARF debug info.

If the VM (or a JVM/TI agent) could create DWARF debug symbols, could 
that be used to convey information about inlined functions and stack 
unwinding without frame pointers?
I realize that emitting DWARF debug symbols for generated code is not a 
trivial undertaking but since perf is running sampling in the kernel and 
we can't disable inlining that seems to be one of the few ways we can 
get complete stack traces.


There would be several other advantages to having DWARF symbols for 
generated code, GDB can use them when debugging the JVM for example.


An alternate approach could be to extend the information in 
perf-.map to have more detailed PC ranges with information about 
which functions are inlined. A lot of that information is available in 
the VM but not necessarily exposed via the tool APIs


/Mikael



-Maynard


A JVMTI agent, perf-map-agent, is providing a map file for symbol
translation under /tmp/perf-PID.map. Linux perf already hunts for such
a file when doing symbol translation.



As I wrote before, this is pretty hard to get right for a JVM, but
there are good approximations. Have you looked at the 'jstack' tool
which is part of the JDK? If you run it on a Java process, it will
give you exact stack traces with full inlining information. However
this only works at safepoints so it is probably not suitable for
profiling with performance counters.


Right, jstack works, and I get full correct stacks. I do really want
to take stacks at any moment: not just CPU samples, but when tracing
kernel TCP events, or PMC cache miss profiling, etc. perf can already
do many advanced tracing and profiling activities. I just needed the
Java stacks for context.


But you can also use 'jstack -F
-m' which gives you a 'best effort' mixed Java/C++ stacktrace (most
of the time even with inlined Java frames). This is probably the best
you can get when interrupting a running JVM at an arbitrary point in
time. As you mentioned in one of your blogs, the VM can be in the
C-Library or even in the kernel at that time which don't preserve the
frame pointer either. So it will be already hard to even walk up to
the first Java frame.


Well, the JVMs I'm looking at are already built with
-fno-omit-frame-pointer (which is good). I edited hotspot to preserve
it as well.

Here's before I changed hotspot:

http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-nofp.svg

Yes, most stacks are clearly broken.

After changing hotspot:

http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg

It's looking pretty good. If you look carefully on the far left and
right, there are 0.8% stacks in read() and write() directly from java,
which may well be broken (unless a java thread is calling these
directly; there could also be some gcc inlining going on). Even if
they are broken, I can see 98% of my profile. Plus, I'd be interested
to know what exactly is reusing the frame pointer, so we could fix
that too.

The Java stacks themselves are also about a third as deep as they
should be, due to inlining.



But nevertheless, if the output of 'jstack -F -m' is "good enough" for
your purpose, you can implement something similar in 'perf' or a
helper library of 'perf' and be happy (I don't actually know how perf
takes stack traces but I suppose there may be some kind of callback
mechanism for walking unknown frames). This is actually not so hard.
I've recently implemented a "print_native_stack()" function within
hotspot 

Re: A hotspot patch for stack profiling (frame pointer)

2014-12-08 Thread Volker Simonis
On Mon, Dec 8, 2014 at 4:05 PM, Maynard Johnson  wrote:
> On 12/05/2014 05:09 PM, Brendan Gregg wrote:
>> G'Day Volker,
>>
>> On Fri, Dec 5, 2014 at 11:22 AM, Volker Simonis wrote:
>>> Hi Brendan,
>>>
>>> I'm still not understanding who is taking the actual stack traces (let
>>> alone the symbols) in your examples. Is this done by 'perf' itself
>>> based only on the frame pointer?
>>
>> perf is walking the frame pointers.
> Volker, to be specific, the perf profiling tool has a user space part and a
> kernel space part. The collection of stack traces is done by the kernel.

Hi Maynard,

thanks for the explanation. So how does the kernel collect
stack traces - simply by looking at the frame pointers?

Is there any kind of extension mechanism to support the kernel with
this task (i.e. a callback mechanism where the kernel provides a
pc/sp/fp tuple and gets back the previous frame, or a kind of
plugin/module which will be called for the stack trace generation)? I
suppose such extensibility will be hard to achieve because it would
have to be executed in the kernel context but I just wanted to ask.

Regards,
Volker

> When a user-specified event (or series of events) occurs, the process
> being profiled is interrupted and the sampled information (which can
> optionally include a full stack trace) is made available to the user space
> perf tool to be saved to a file for future post-profiling processing.
>
> During the profiling phase, the perf tool collects information about the
> profiled process's memory mappings, which allows for this address-to-symbol
> resolution. It's in the post-profiling phase where the sampled instruction,
> along with its associated stack trace, is resolved to the appropriate symbol
> (i.e., function/method) in a specific binary file (e.g., library, executable).
>
> And if the VM creates a /tmp/perf-.map file to save information about
> JITed methods, the perf's post-profiling tool will find it and use it to
> correlate sampled addresses it collected from the VM's executable anonymous
> memory mappings to the method names.
>
> -Maynard
>>
>> A JVMTI agent, perf-map-agent, is providing a map file for symbol
>> translation under /tmp/perf-PID.map. Linux perf already hunts for such
>> a file when doing symbol translation.
>>
>>>
>>> As I wrote before, this is pretty hard to get right for a JVM, but
>>> there are good approximations. Have you looked at the 'jstack' tool
>>> which is part of the JDK? If you run it on a Java process, it will
>>> give you exact stack traces with full inlining information. However
>>> this only works at safepoints so it is probably not suitable for
>>> profiling with performance counters.
>>
>> Right, jstack works, and I get full correct stacks. I do really want
>> to take stacks at any moment: not just CPU samples, but when tracing
>> kernel TCP events, or PMC cache miss profiling, etc. perf can already
>> do many advanced tracing and profiling activities. I just needed the
>> Java stacks for context.
>>
>>> But you can also use 'jstack -F
>>> -m' which gives you a 'best effort' mixed Java/C++ stacktrace (most
>>> of the time even with inlined Java frames). This is probably the best
>>> you can get when interrupting a running JVM at an arbitrary point in
>>> time. As you mentioned in one of your blogs, the VM can be in the
>>> C-Library or even in the kernel at that time which don't preserve the
>>> frame pointer either. So it will be already hard to even walk up to
>>> the first Java frame.
>>
>> Well, the JVMs I'm looking at are already built with
>> -fno-omit-frame-pointer (which is good). I edited hotspot to preserve
>> it as well.
>>
>> Here's before I changed hotspot:
>>
>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-nofp.svg
>>
>> Yes, most stacks are clearly broken.
>>
>> After changing hotspot:
>>
>> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg
>>
>> It's looking pretty good. If you look carefully on the far left and
>> right, there are 0.8% stacks in read() and write() directly from java,
>> which may well be broken (unless a java thread is calling these
>> directly; there could also be some gcc inlining going on). Even if
>> they are broken, I can see 98% of my profile. Plus, I'd be interested
>> to know what exactly is reusing the frame pointer, so we could fix
>> that too.
>>
>> The Java stacks themselves are also about a third as deep as they
>> should be, due to inlining.
>>
>>>
>>> But nevertheless, if the output of 'jstack -F -m' is "good enough" for
>>> your purpose, you can implement something similar in 'perf' or a
>>> helper library of 'perf' and be happy (I don't actually know how perf
>>> takes stack traces but I suppose there may be some kind of callback
>>> mechanism for walking unknown frames). This is actually not so hard.
>>> I've recently implemented a "print_native_stack()" function within
>>> hotspot itself (you can call it for example from gdb during debugging
>>> - see http://hg.openjdk.java

Re: A hotspot patch for stack profiling (frame pointer)

2014-12-08 Thread Maynard Johnson
On 12/05/2014 05:09 PM, Brendan Gregg wrote:
> G'Day Volker,
> 
> On Fri, Dec 5, 2014 at 11:22 AM, Volker Simonis wrote:
>> Hi Brendan,
>>
>> I'm still not understanding who is taking the actual stack traces (let
>> alone the symbols) in your examples. Is this done by 'perf' itself
>> based only on the frame pointer?
> 
> perf is walking the frame pointers.
Volker, to be specific, the perf profiling tool has a user space part and a
kernel space part. The collection of stack traces is done by the kernel.
When a user-specified event (or series of events) occurs, the process
being profiled is interrupted and the sampled information (which can
optionally include a full stack trace) is made available to the user space
perf tool to be saved to a file for future post-profiling processing.

During the profiling phase, the perf tool collects information about the
profiled process's memory mappings, which allows for this address-to-symbol
resolution. It's in the post-profiling phase where the sampled instruction,
along with its associated stack trace, is resolved to the appropriate symbol
(i.e., function/method) in a specific binary file (e.g., library, executable).

And if the VM creates a /tmp/perf-.map file to save information about
JITed methods, the perf's post-profiling tool will find it and use it to
correlate sampled addresses it collected from the VM's executable anonymous
memory mappings to the method names.

-Maynard
> 
> A JVMTI agent, perf-map-agent, is providing a map file for symbol
> translation under /tmp/perf-PID.map. Linux perf already hunts for such
> a file when doing symbol translation.
> 
>>
>> As I wrote before, this is pretty hard to get right for a JVM, but
>> there are good approximations. Have you looked at the 'jstack' tool
>> which is part of the JDK? If you run it on a Java process, it will
>> give you exact stack traces with full inlining information. However
>> this only works at safepoints so it is probably not suitable for
>> profiling with performance counters.
> 
> Right, jstack works, and I get full correct stacks. I do really want
> to take stacks at any moment: not just CPU samples, but when tracing
> kernel TCP events, or PMC cache miss profiling, etc. perf can already
> do many advanced tracing and profiling activities. I just needed the
> Java stacks for context.
> 
>> But you can also use 'jstack -F
>> -m' which gives you a 'best effort' mixed Java/C++ stacktrace (most
>> of the time even with inlined Java frames). This is probably the best
>> you can get when interrupting a running JVM at an arbitrary point in
>> time. As you mentioned in one of your blogs, the VM can be in the
>> C-Library or even in the kernel at that time which don't preserve the
>> frame pointer either. So it will be already hard to even walk up to
>> the first Java frame.
> 
> Well, the JVMs I'm looking at are already built with
> -fno-omit-frame-pointer (which is good). I edited hotspot to preserve
> it as well.
> 
> Here's before I changed hotspot:
> 
> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-nofp.svg
> 
> Yes, most stacks are clearly broken.
> 
> After changing hotspot:
> 
> http://www.brendangregg.com/FlameGraphs/cpu-mixedmode-vertx.svg
> 
> It's looking pretty good. If you look carefully on the far left and
> right, there are 0.8% stacks in read() and write() directly from java,
> which may well be broken (unless a java thread is calling these
> directly; there could also be some gcc inlining going on). Even if
> they are broken, I can see 98% of my profile. Plus, I'd be interested
> to know what exactly is reusing the frame pointer, so we could fix
> that too.
> 
> The Java stacks themselves are also about a third as deep as they
> should be, due to inlining.
> 
>>
>> But nevertheless, if the output of 'jstack -F -m' is "good enough" for
>> your purpose, you can implement something similar in 'perf' or a
>> helper library of 'perf' and be happy (I don't actually know how perf
>> takes stack traces but I suppose there may be some kind of callback
>> mechanism for walking unknown frames). This is actually not so hard.
>> I've recently implemented a "print_native_stack()" function within
>> hotspot itself (you can call it for example from gdb during debugging
>> - see http://hg.openjdk.java.net/jdk9/dev/hotspot/rev/86183a940db4).
>> Maybe you could call this function directly from 'perf' if perf
>> attaches with ptrace to the process (I assume it does or how else
>> could it walk the stack)?
> 
> An OS-cooperative stack walker would be great, and I think the hotspot
> team is already doing this for Oracle Solaris. Thanks for the code
> too, this is pretty interesting.
> 
> jstack -F -m eats 0.5s of CPU for me, so it would need work to make
> this into a 99 Hertz-capable profiler. Plus I'd like to pick arbitrary
> kernel functions or tracepoints and get Java context from them, too.
> Eg, TCP functions, memory allocation, disk I/O, etc.
> 
>>
>> These were just some random thoughts wi

Re: RFR(S) 8028773: SA, warnings from b116 for hotspot.agent.src.share.native: JNI exception pending

2014-12-08 Thread serguei.spit...@oracle.com

Looks good.

Some minor comments.

Better to reformat with one variable per line:

 113   const char *error_message = NULL, *jrepath = NULL, *libname = NULL;

 283   jbyte *start, *end;


Uninitialized locals:

 115 #ifdef _WINDOWS
 116   HINSTANCE hsdis_handle;
 117 #else
 118   void* hsdis_handle;
 119 #endif
 ...
 201   jlong result;
 ...

 282   jboolean isCopy;
 283   jbyte *start, *end;
 284   jclass disclass;
 285   const char *options;


Unaligned parameters:

 209   result = (*env)->CallLongMethod(env, denv->dis, denv->handle_event, denv->visitor,
 210 event_string, (jlong) (uintptr_t)arg);


Thanks,
Serguei


On 12/7/14 6:17 AM, Dmitry Samersoff wrote:

Please, review a modified fix:

http://cr.openjdk.java.net/~dsamersoff/JDK-8028773/webrev.03/

The Windows compiler doesn't allow declarations in the middle of a function
in C code.

I also added my two cents to reduce build noise: a pragma that disables
Windows compiler warnings that have no value in this context.

-Dmitry


On 2014-12-03 16:01, Staffan Larsen wrote:

Changes look good. What testing have you done?

/Staffan


On 3 dec 2014, at 13:06, Dmitry Samersoff  wrote:

Serguei,

Updated webrev

http://cr.openjdk.java.net/~dsamersoff/JDK-8028773/webrev.02/

-Dmitry

On 2014-12-03 01:24, serguei.spit...@oracle.com wrote:

Dmitry,

It is good in general modulo Staffan's comments.

There are some inconsistencies:
- ExceptionOccurred(env) is compared to != NULL in some places (which makes
  the logic more complex) and checked implicitly, with no such comparison,
  in others
- two different forms of check/action are used:
  -  (*env)->ExceptionOccurred(env) and
  -  !(*env)->ExceptionOccurred(env)

I'd suggest doing it the same way everywhere and always returning the
"default" value if an exception occurred.
It will make the logic flatter and clearer.
Otherwise, we have to work out exactly which value is returned when an
exception occurred, as at lines 241 and 255.

The comment // OOM is used inconsistently too.


Thanks,
Serguei


On 12/2/14 11:14 AM, Dmitry Samersoff wrote:

Please review the small fix.

http://cr.openjdk.java.net/~dsamersoff/JDK-8028773/webrev.01/

Added more missing exception checks to sadis.c



--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.






Re: JDK 9 RFR of JDK-8066634: Suppress deprecation warnings in java.management module

2014-12-08 Thread Daniel Fuchs

Hi Joe,

Looks good!

best regards,

-- daniel

On 12/8/14 8:04 AM, joe darcy wrote:

Hello,

Please review the patch below which addresses

JDK-8066634: Suppress deprecation warnings in java.management module

Thanks,

-Joe

diff -r 913808eaf19a src/java.management/share/classes/com/sun/jmx/interceptor/DefaultMBeanServerInterceptor.java
--- a/src/java.management/share/classes/com/sun/jmx/interceptor/DefaultMBeanServerInterceptor.java Mon Nov 10 08:43:27 2014 -0800
+++ b/src/java.management/share/classes/com/sun/jmx/interceptor/DefaultMBeanServerInterceptor.java Sun Dec 07 23:02:51 2014 -0800
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -1707,16 +1707,19 @@
         throw new UnsupportedOperationException("Not supported yet.");
     }
 
+    @SuppressWarnings("deprecation")
     public ObjectInputStream deserialize(ObjectName name, byte[] data) throws InstanceNotFoundException,
             OperationsException {
         throw new UnsupportedOperationException("Not supported yet.");
     }
 
+    @SuppressWarnings("deprecation")
     public ObjectInputStream deserialize(String className, byte[] data) throws OperationsException,
             ReflectionException {
         throw new UnsupportedOperationException("Not supported yet.");
     }
 
+    @SuppressWarnings("deprecation")
     public ObjectInputStream deserialize(String className, ObjectName loaderName,
             byte[] data) throws InstanceNotFoundException, OperationsException,
             ReflectionException {
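
For readers unfamiliar with the annotation, here is a minimal sketch of what
@SuppressWarnings("deprecation") does when placed on a method that overrides
a deprecated one. The names LegacyServer and SuppressDeprecationDemo are
hypothetical and are not taken from the java.management sources; the sketch
only assumes that overriding a @Deprecated method is reported by javac under
-Xlint:deprecation unless the overriding declaration is annotated.

```java
// Hypothetical interface standing in for an API that has a deprecated method.
interface LegacyServer {
    @Deprecated
    String deserialize(byte[] data);
}

// Hypothetical implementation; not part of the actual patch.
class SuppressDeprecationDemo implements LegacyServer {

    // Overriding a deprecated method counts as a use of it. The annotation
    // silences the deprecation warning for this declaration only.
    @SuppressWarnings("deprecation")
    @Override
    public String deserialize(byte[] data) {
        throw new UnsupportedOperationException("Not supported yet.");
    }

    public static void main(String[] args) {
        try {
            // Called through the implementing type, whose method is not
            // itself marked @Deprecated.
            new SuppressDeprecationDemo().deserialize(new byte[0]);
        } catch (UnsupportedOperationException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Scoping the annotation to individual methods, as the patch does, keeps
deprecation warnings enabled for the rest of the class.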





Re: JDK 9 RFR of JDK-8066634: Suppress deprecation warnings in java.management module

2014-12-08 Thread Alan Bateman


I assume you meant to cc jmx-dev for this one. In any case it looks okay 
to me.


-Alan

On 08/12/2014 07:04, joe darcy wrote:

Hello,

Please review the patch below which addresses

JDK-8066634: Suppress deprecation warnings in java.management module

Thanks,

-Joe

diff -r 913808eaf19a src/java.management/share/classes/com/sun/jmx/interceptor/DefaultMBeanServerInterceptor.java
--- a/src/java.management/share/classes/com/sun/jmx/interceptor/DefaultMBeanServerInterceptor.java Mon Nov 10 08:43:27 2014 -0800
+++ b/src/java.management/share/classes/com/sun/jmx/interceptor/DefaultMBeanServerInterceptor.java Sun Dec 07 23:02:51 2014 -0800
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -1707,16 +1707,19 @@
         throw new UnsupportedOperationException("Not supported yet.");
     }
 
+    @SuppressWarnings("deprecation")
     public ObjectInputStream deserialize(ObjectName name, byte[] data) throws InstanceNotFoundException,
             OperationsException {
         throw new UnsupportedOperationException("Not supported yet.");
     }
 
+    @SuppressWarnings("deprecation")
     public ObjectInputStream deserialize(String className, byte[] data) throws OperationsException,
             ReflectionException {
         throw new UnsupportedOperationException("Not supported yet.");
     }
 
+    @SuppressWarnings("deprecation")
     public ObjectInputStream deserialize(String className, ObjectName loaderName,
             byte[] data) throws InstanceNotFoundException, OperationsException,
             ReflectionException {