Hi Jeremy,
Thanks for the feedback and the CallerFinder API you have.
On 7/7/2014 9:55 AM, Jeremy Manson wrote:
Hey folks,
I don't know if Mandy's draft JEP has gotten any love,
The JEP process is transitioning to version 2.0. I hope this JEP will
come out soon.
but this is something that has (in the past) been a major CPU cycle
consumer for us, and we've had to invent / reinvent many wheels to fix
it internally, so we'd love to see a principled solution.
A couple of notes:
- A large percentage of the time, you just want to find one of:
1) The direct caller of the method,
2) The first caller outside a given package.
The current thinking is to allow you to find the direct caller as well
as express the predicate for filtering that will cover these cases.
We added a CallerFinder API that basically looks like this:
// Finds the caller of the invoking method in the current stack that
// isn't in one of the excluded classes
public static StackTraceElement findCaller(Class<?>... excludedClasses);
// Finds the first caller of a given class
public static StackTraceElement findCallerOf(Class<?>... classesToFind);
This isn't the ideal API (it is more the one that happened to be
convenient when we threw together the class), but it gets the vast
majority of use cases.
Does the CallerFinder API use Thread.getStackTrace() in its implementation?
Thread.getStackTrace and Throwable.getStackTrace both eagerly capture the
entire stack trace, which is expensive. We want the VM to capture only
the stack frames that the client needs, and the implementation to be as
efficient as possible.
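For illustration only, here is a minimal sketch of how a findCaller-style
helper could be layered on Throwable.getStackTrace(). It is not the
CallerFinder implementation mentioned in this thread (that code is not
shown here); the names EagerCallerFinder, LoggingFacade and DemoApp are
hypothetical. The point is the cost: every frame is captured even though
only one is returned.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; assumes the eager Throwable.getStackTrace() approach.
final class EagerCallerFinder {
    private EagerCallerFinder() {}

    // Returns the first frame above the invoking method whose class is not
    // excluded, or null if every remaining frame is excluded.
    static StackTraceElement findCaller(Class<?>... excludedClasses) {
        Set<String> excluded = new HashSet<>();
        for (Class<?> c : excludedClasses) {
            excluded.add(c.getName());
        }
        // Eager capture: the VM fills in the whole stack before we inspect it.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is findCaller itself, frames[1] is the invoking method.
        for (int i = 2; i < frames.length; i++) {
            if (!excluded.contains(frames[i].getClassName())) {
                return frames[i];
            }
        }
        return null;
    }
}

// Hypothetical library class that wants to know who called it.
final class LoggingFacade {
    static StackTraceElement log() {
        return EagerCallerFinder.findCaller(LoggingFacade.class);
    }
}

// Hypothetical application code calling into the library.
final class DemoApp {
    static StackTraceElement run() {
        return LoggingFacade.log();
    }
}
```

Calling DemoApp.run() should yield the frame for DemoApp.run, since the
LoggingFacade frame is the invoker and is skipped.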
2) Even with a super-efficient stack walker, anyone who uses the
java.util.logging framework pervasively is going to see a lot of CPU
cycles consumed by determining the caller.
The current LogRecord implementation calls new Throwable(), which has to
pay the cost of capturing the entire stack.
We've had a lot of luck minimizing this by using a bytecode rewriter
to change callers of log(msg) to log(sourceClass, sourceMethod, msg).
This is almost certainly something that could be done (even in a
principled way!) by the VM; improvements to CPU usage in such apps
have been dramatic.
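As a rough illustration of what such a rewrite targets: java.util.logging
already has Logger.logp(Level, String, String, String), which takes the
source class and method explicitly so the LogRecord does not need to infer
them from the stack. The sketch below is an assumption about the shape of
the rewritten call, not the rewriter described above; the source names
"com.example.Service" and "handle" and the class ExplicitSourceLogging are
hypothetical.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class; only Logger.logp and LogRecord are real JDK API.
final class ExplicitSourceLogging {
    private ExplicitSourceLogging() {}

    // Keeps the last published record so its source fields can be inspected.
    static final class CapturingHandler extends Handler {
        volatile LogRecord last;
        @Override public void publish(LogRecord record) { last = record; }
        @Override public void flush() {}
        @Override public void close() {}
    }

    static LogRecord logWithExplicitSource(String msg) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false); // keep the demo quiet
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);
        // What a rewriter might emit in place of logger.info(msg):
        logger.logp(Level.INFO, "com.example.Service", "handle", msg);
        return handler.last;
    }
}
```

The returned record reports the source class and method that were passed
in, with no stack inspection needed to produce them.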
Thanks. I'll make sure to measure the performance of java.util.logging
on top of the new stack walk API. I may also ask for your help in
checking whether you observe a performance difference between the
rewritten bytecode and java.util.logging using the new API.
Mandy
Jeremy
On Sun, Mar 30, 2014 at 4:02 PM, Mandy Chung <mandy.ch...@oracle.com> wrote:
Below is a draft JEP we are considering submitting for JDK 9.
Mandy
----------------------------
Title: Efficient API for Stack Walking
Goal
----
Define a standard API for stack walking that will be efficient and
performant.
Non-goal
--------
It is not a goal for this API to be easy to use via reflection, for
example for use in code that is compiled for an older JDK.
Motivation
----------
There is no standard API to obtain information about the caller's class
and traverse the execution stack in a performant way. Existing libraries
and frameworks such as Log4j and Groovy have to resort to using the
JDK internal API `sun.reflect.Reflection.getCallerClass(int depth)`.
This JEP proposes to define a standard API for stack walking that will
be efficient and performant, and that will also enable the implementation
to move the stack walk machinery up from the VM to Java and replace
the current mechanism of `Throwable.fillInStackTrace`.
Description
-----------
There is no standard API to traverse certain frames on the execution
stack efficiently and access the Class instance of each frame.
There are APIs that provide access to stack trace information:
- `Throwable.getStackTrace()` and `Thread.getStackTrace()`, which return
an array of `StackTraceElement` containing the class name
and method name of each stack frame.
- `SecurityManager.getClassContext()`, which is a protected method,
so only a `SecurityManager` subclass can access the class
context.
These APIs require the VM to eagerly capture a snapshot of the entire
stack trace and return information representing the entire stack.
There is no way to avoid the cost of examining all frames even if
the caller is only interested in the top few frames on the stack.
Both `Throwable.getStackTrace()` and `Thread.getStackTrace()`
return an array of `StackTraceElement` that contains the class name and
method name of a stack frame but not the `Class` instance.
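To make the missing `Class` instance concrete, here is a small sketch
(the class NameOnlyFrames is hypothetical). A `StackTraceElement` only
yields the class name as a `String`, so code that needs the `Class` has
to look it up again and guess which class loader to use:

```java
// Hypothetical helper illustrating the gap described above.
final class NameOnlyFrames {
    private NameOnlyFrames() {}

    // Returns the Class of the top frame, i.e. of this method's own class.
    static Class<?> classOfTopFrame() {
        StackTraceElement top = new Throwable().getStackTrace()[0];
        String className = top.getClassName(); // a String, not a Class<?>
        try {
            // The loader here is a guess; a frame carrying the Class itself
            // would make this lookup unnecessary.
            return Class.forName(className, false,
                    NameOnlyFrames.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("cannot resolve " + className, e);
        }
    }
}
```

The lookup only works here because the right loader is known in advance;
for an arbitrary frame it generally is not.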
Moreover, even for applications interested in the entire stack, the
specification allows a VM implementation to omit some frames in the
stack for performance.
In other words, `Thread.getStackTrace()` may return a partial
stack trace.
These APIs do not satisfy the use cases that currently depend on
the `getCallerClass(int depth)` method, or their performance overhead
is intolerable. The use cases include:
- JDK caller-sensitive APIs look up their immediate caller's class,
which is used to determine the behavior of the API. For example,
`Class.forName(String classname)` and
`ResourceBundle.getBundle(String rbname)` use the immediate
caller's class loader to load a class and a resource bundle
respectively.
`Class.getMethod` etc. use the immediate caller's class loader
to determine the security checks to be performed.
- `java.util.logging`, Log4j and the Groovy runtime filter out the
intermediary stack frames (typically implementation-specific and
reflection frames) to find the caller's class to be used by the
runtime of such a library or framework.
- Traverse the entire stack trace or the stack trace of a `Throwable`
and obtain additional information about classes for enhanced
diagnosability in addition to the class and method name.
This JEP will define a stack walk API that allows laziness and frame
filtering, and supports short reaches that stop at a frame matching some
criteria as well as long reaches that traverse the entire stack trace.
This will require the JVM to provide a flexible mechanism to traverse and
materialize the specific stack frame information to be used and to allow
efficient lazy access to additional stack frames when required.
Native JVM transitions should be minimized.
The API will define how it works when running with a security manager
that allows access to a `Class` instance
of any frame, ensuring that security is not compromised.
An example API to walk the stack could look like:
Thread.walkStack(Consumer<StackFrameInfo> action, int depthLimit)
which takes a callback to be invoked for each frame traversed. A variant
of the walkStack method will take a predicate for stack frame filtering.
Thread.getCaller(Function<StackFrameInfo, R> function)
Thread.findCaller(Predicate<StackFrameInfo> predicate,
Function<StackFrameInfo, R> function)
which find the caller frame without or with filtering, respectively.
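To give a feel for how a predicate-plus-function call like findCaller
might be used, here is a rough emulation on top of the existing
Throwable.getStackTrace(). It is only a sketch under several assumptions:
StackTraceElement stands in for the proposed StackFrameInfo, the
Optional<R> return type is a guess (the draft does not specify one), the
names FindCallerEmulation, DemoFramework and DemoClient are hypothetical,
and the emulation has none of the laziness the JEP is after.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical emulation of the draft's findCaller shape; eager, not lazy.
final class FindCallerEmulation {
    private FindCallerEmulation() {}

    // Applies function to the first frame above the invoker matching predicate.
    static <R> Optional<R> findCaller(Predicate<StackTraceElement> predicate,
                                      Function<StackTraceElement, R> function) {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // Skip frames[0] (findCaller) and frames[1] (its invoker).
        for (int i = 2; i < frames.length; i++) {
            if (predicate.test(frames[i])) {
                return Optional.ofNullable(function.apply(frames[i]));
            }
        }
        return Optional.empty();
    }
}

// Hypothetical framework that wants the first caller outside itself.
final class DemoFramework {
    static Optional<String> callerOutsideFramework() {
        return FindCallerEmulation.findCaller(
                f -> !f.getClassName().equals(DemoFramework.class.getName()),
                StackTraceElement::getMethodName);
    }

    // An intermediary framework frame that the predicate should filter out.
    static Optional<String> viaHelper() {
        return callerOutsideFramework();
    }
}

// Hypothetical client code.
final class DemoClient {
    static Optional<String> doWork() {
        return DemoFramework.viaHelper();
    }
}
```

Here DemoClient.doWork() should get back its own method name, because the
intermediate DemoFramework frame is rejected by the predicate.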
Testing
-------
Unit tests and JCK tests for the new SE API will need to be developed.
In addition, the performance of the new API for different use cases
will be assessed.
Impact
------
- Performance/scalability: performance shall be measured using
micro-benchmarks as well as real-world usages of `getCallerClass`
replaced with the new API.
- TCK: New JCK test cases shall be developed.