Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
On 7/2/19 9:54 PM, Kasper Nielsen wrote:
> On Tue, 2 Jul 2019 at 18:50, Mandy Chung wrote:
>> I'm not getting how getCallerClass is used and related to access check.
>> Can you elaborate?
>
> Caller-sensitive methods are viral, in the sense that if you invoke a caller-sensitive method in the JDK as a library on behalf of a client (without having a Lookup object), you need to perform the same access checks as the JDK does. That is, you must check that the client that calls you - the library - has the right access rights. The caller-sensitive methods in the JDK cannot do this, because there is no way for them to see that the library is merely a proxy acting on behalf of a client.
>
> Consider a very simple serialization library with one method
>
>     String serializeToString(Object o) // prints out the value of every field
>
> with an implementation in a module called SER, and two modules M1 and M2 that use SER (for example, via a ServiceLoader). Both M1 and M2 are open to SER so that the serializer can serialize the objects in each of the two modules. However, M1 and M2 are not open towards each other. So it is not the intention that, for example, some code from M1 can call it with objects from M2 and have them serialized, or vice versa.
>
> However, this is entirely possible unless serializeToString() performs access checks on the caller. All M1 has to do is get hold of an object from M2 and then call serializeToString() with it. There is no way the JDK can check this; it just sees an object from M2, which is open to SER. It has no idea it is actually M1 trying to serialize it.
>
> So the only way for this to work as intended is for serializeToString() to check that the caller matches the object. And unless you pass around Lookup objects, the only way you can do it is similar to how the JDK does it: by looking at the calling class. Reflection::getCallerClass is not available outside of the JDK, so StackWalker is the only way to do this.
>
> I've put up an example at https://github.com/kaspernielsen/modulestest. Calling M2Use.main will serialize an object from M1 even though M1 is not open to M2.
>
> As noted in another thread, this gets further complicated because all the access control code is buried in internal JDK classes. In this example you more or less have to reimplement AccessibleObject.checkCanSetAccessible yourself. In the end I don't think it is realistic to expect library developers to get this right.

M1 and M2 are open to SER. SER can call `privateLookupIn` to get a lookup on M1's class, and then it can use M1's lookup to call accessClass.

An alternative idea is to have the caller pass its lookup to the serializeToString method, which can then check whether the caller's lookup object has access to the given object. Would that be a plausible solution?

Mandy
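A minimal sketch of the second idea above (the caller passes its own Lookup), for illustration only. The class name `LookupSerializer`, the sample `Point` type and the `demo()` helper are hypothetical; none of them come from the thread or from the modulestest repository. The sketch calls `accessClass` as suggested and, as an extra assumption about how strict the check should be, derives a private lookup from the caller's lookup with `MethodHandles.privateLookupIn(type, caller)`. That call only succeeds when the target's package is open to the caller's module, so in the M1/M2 setup a call from M1 with an M2 object would be expected to fail with IllegalAccessException.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup;
import java.lang.invoke.VarHandle;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

// Hypothetical serializer: the caller proves its access rights by passing a Lookup.
final class LookupSerializer {

    /** Sample type with private state, standing in for a class owned by the caller. */
    static final class Point {
        private final int x = 1;
        private final int y = 2;
    }

    static String serializeToString(Lookup caller, Object o) throws IllegalAccessException {
        Class<?> type = o.getClass();
        // Fails if the class itself is not accessible from the caller's lookup class.
        caller.accessClass(type);
        // Assumption of this sketch: also require deep-reflection rights for the caller,
        // i.e. the target package must be open to the caller's module.
        Lookup deep = MethodHandles.privateLookupIn(type, caller);

        Field[] fields = type.getDeclaredFields();
        // getDeclaredFields() guarantees no order, so sort for a stable output.
        Arrays.sort(fields, Comparator.comparing(Field::getName));

        StringJoiner out = new StringJoiner(", ", type.getSimpleName() + "{", "}");
        for (Field f : fields) {
            if (Modifier.isStatic(f.getModifiers())) {
                continue; // instance state only
            }
            VarHandle vh = deep.unreflectVarHandle(f);
            out.add(f.getName() + "=" + vh.get(o));
        }
        return out.toString();
    }

    /** Serializes a Point using this class's own full-privilege lookup. */
    static String demo() {
        try {
            return serializeToString(MethodHandles.lookup(), new Point());
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A client would call it as `LookupSerializer.serializeToString(MethodHandles.lookup(), obj)`. The price is the one already mentioned in the thread: every method signature on the path has to carry a Lookup parameter, which is exactly what is hard to retrofit onto an existing API.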
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
On 7/2/19 6:57 PM, Ralph Goers wrote:
> Thanks Mandy,
>
> It seems I commented on the thread mentioned in the issue you linked to. Unfortunately, it doesn't look like any work has been done on the issue in the last 18 months. I did start to explore some options but never had time to spend on this, as I have been working on other higher-priority projects.
>
> Yes, LogRecord doesn't get the StackTraceElement. We only get it for the one stack entry we are interested in, and we only do that because Log4j is still compatible with Java 7 and creating our own class would be too disruptive. Still, the cost seems to be in locating the correct frame, not creating the StackTraceElement.

Do you have data/information on where the overhead comes from? If you avoid using StackTraceElement, that would help in understanding it.

Mandy
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
On Tue, 2 Jul 2019 at 22:49, Ralph Goers wrote:
>
> The timing of this question is perfect. I have been doing some testing over
> the last week to address https://issues.apache.org/jira/browse/LOG4J2-2644
> and found some interesting things - although they are related to the walk()
> method, not getCallerClass().

Getting a single stack frame with the calling class + line number for logging is probably the most common use case for a StackWalker. I think optimized versions of these two methods would be really useful and would cover most of these use cases:

    Optional<StackFrame> findFirstWithClassName(Predicate<String> p) {...}
    Optional<StackFrame> findFirstWithDeclaringClass(Predicate<Class<?>> p) {...}

There is really no reason to create a StackFrame object for every frame, only for the final frame. Perhaps it could even skip returning a StackFrame and just return a string with ClassName:LineNumber, if there is significant overhead in creating the StackFrame.

I use StackWalker for logging as well, and I find the 2-3 microseconds it typically takes to get the calling class + line number a bit steep for my performance budget.

/Kasper
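For reference, a rough sketch of what the first proposed method can look like today on top of the public StackWalker.walk() API. `CallerLocation` is a hypothetical helper name, and the sketch returns the "ClassName:LineNumber" string form mentioned above. It only shows the API shape: the stream still hands a StackFrame to the predicate for every frame it visits, so it does not provide the optimization being asked for.

```java
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper approximating findFirstWithClassName with existing API.
final class CallerLocation {

    // A StackWalker is reusable and thread-safe, so keep it in a static final field.
    private static final StackWalker WALKER = StackWalker.getInstance();

    /**
     * Returns "ClassName:LineNumber" for the first frame whose class name
     * matches the predicate, starting at this method's own frame.
     */
    static Optional<String> findFirstWithClassName(Predicate<String> p) {
        return WALKER.walk(frames -> frames
                .filter(f -> p.test(f.getClassName()))
                .findFirst()
                .map(f -> f.getClassName() + ":" + f.getLineNumber()));
    }
}
```

A logger would typically pass a predicate that rejects its own classes, for example `name -> !name.startsWith("org.example.logging.")` (a made-up package name), so that the first match is the application frame that issued the log call.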
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
On Tue, 2 Jul 2019 at 18:50, Mandy Chung wrote:
>
> I'm not getting how getCallerClass is used and related to access check.
> Can you elaborate?

Caller-sensitive methods are viral, in the sense that if you invoke a caller-sensitive method in the JDK as a library on behalf of a client (without having a Lookup object), you need to perform the same access checks as the JDK does. That is, you must check that the client that calls you - the library - has the right access rights. The caller-sensitive methods in the JDK cannot do this, because there is no way for them to see that the library is merely a proxy acting on behalf of a client.

Consider a very simple serialization library with one method

    String serializeToString(Object o) // prints out the value of every field

with an implementation in a module called SER, and two modules M1 and M2 that use SER (for example, via a ServiceLoader). Both M1 and M2 are open to SER so that the serializer can serialize the objects in each of the two modules. However, M1 and M2 are not open towards each other. So it is not the intention that, for example, some code from M1 can call it with objects from M2 and have them serialized, or vice versa.

However, this is entirely possible unless serializeToString() performs access checks on the caller. All M1 has to do is get hold of an object from M2 and then call serializeToString() with it. There is no way the JDK can check this; it just sees an object from M2, which is open to SER. It has no idea it is actually M1 trying to serialize it.

So the only way for this to work as intended is for serializeToString() to check that the caller matches the object. And unless you pass around Lookup objects, the only way you can do it is similar to how the JDK does it: by looking at the calling class. Reflection::getCallerClass is not available outside of the JDK, so StackWalker is the only way to do this.

I've put up an example at https://github.com/kaspernielsen/modulestest. Calling M2Use.main will serialize an object from M1 even though M1 is not open to M2.

As noted in another thread, this gets further complicated because all the access control code is buried in internal JDK classes. In this example you more or less have to reimplement AccessibleObject.checkCanSetAccessible yourself. In the end I don't think it is realistic to expect library developers to get this right.

/Kasper
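To make the caller check concrete, here is a simplified sketch of a serializeToString() that looks at its calling class through StackWalker. The names `CallerCheckedSerializer`, `Sample` and `checkCallerMayAccess` are hypothetical, and the check is a deliberately reduced stand-in (same module, or target package open to the caller's module). It is an assumption of this sketch, not a reimplementation of AccessibleObject.checkCanSetAccessible, which covers more cases.

```java
// Hypothetical serializer that verifies its immediate caller before serializing.
final class CallerCheckedSerializer {

    /** Sample type standing in for a class owned by the calling module. */
    static final class Sample {
        private final int value = 42;
    }

    // RETAIN_CLASS_REFERENCE is required for getCallerClass().
    private static final StackWalker WALKER =
            StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    static String serializeToString(Object o) {
        // The class of the method that invoked serializeToString().
        Class<?> caller = WALKER.getCallerClass();
        checkCallerMayAccess(caller, o.getClass());
        // Printing the field values is omitted; the sketch is about the check.
        return o.getClass().getSimpleName();
    }

    /** Simplified policy: same module, or the target's package is open to the caller's module. */
    static void checkCallerMayAccess(Class<?> caller, Class<?> target) {
        Module from = caller.getModule();
        Module to = target.getModule();
        if (from == to) {
            return;
        }
        if (!to.isOpen(target.getPackageName(), from)) {
            throw new IllegalCallerException(
                    caller.getName() + " may not serialize " + target.getName());
        }
    }
}
```

With this in place, code in M1 passing an M2 object would be rejected, because M2's package is not open to M1, while M1 serializing its own objects passes the same-module test. The getCallerClass() call on every invocation is where the cost discussed in this thread comes in.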
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
Hi Ralph,

Thanks for this info. Quick comments:

LogRecord does not get the line number nor the StackTraceElement. There is a cost to construct the string-based StackTraceElement or to get the line number mapped from the BCI. That is orthogonal to StackWalker::getCallerClass, which is only interested in the Class object.

You may also be interested in https://bugs.openjdk.java.net/browse/JDK-8189752, which is about taking a snapshot of a stack trace (possibly the top N frames).

Mandy

On 7/2/19 2:49 PM, Ralph Goers wrote:
> The timing of this question is perfect. I have been doing some testing over the last week to address https://issues.apache.org/jira/browse/LOG4J2-2644 and found some interesting things - although they are related to the walk() method, not getCallerClass().
>
> 1. Using skip(n) to find a frame a couple of frames up is pretty fast, but isn't too much faster than finding the same element using new Throwable().getStackTrace() was in Java 8.
> 2. Walking the stack becomes much more costly as the number of elements needing to be walked increases.
> 3. The most shocking to me was that the fastest way to traverse a stack trace is to use a Function that immediately converts the Stream to an array and then use an old-style for loop to traverse it. However, doing this is incredibly awkward because StackWalker only supports streams, so there is no good way to pass the value being searched for into the Function. I had to resort to storing it in a ThreadLocal. Having a toArray() method on StackWalker would be a lot nicer, especially if I could limit the number of frames retrieved.
>
> I should note that java.util.logging.LogRecord uses a Filter to walk the stack, which is faster than the stream methods I was originally using, but is much slower than what I ended up with.
>
> As for the issue mentioned here, I believe I reported that getCallerClass was much slower than the Reflection class in Java 9 and opened a bug here. As I recall that was addressed, and I believe I verified that fix, but it probably wouldn't hurt for me to do it again.
>
> Ralph
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
The timing of this question is perfect. I have been doing some testing over the last week to address https://issues.apache.org/jira/browse/LOG4J2-2644 and found some interesting things - although they are related to the walk() method, not getCallerClass().

1. Using skip(n) to find a frame a couple of frames up is pretty fast, but isn't too much faster than finding the same element using new Throwable().getStackTrace() was in Java 8.
2. Walking the stack becomes much more costly as the number of elements needing to be walked increases.
3. The most shocking to me was that the fastest way to traverse a stack trace is to use a Function that immediately converts the Stream to an array and then use an old-style for loop to traverse it. However, doing this is incredibly awkward because StackWalker only supports streams, so there is no good way to pass the value being searched for into the Function. I had to resort to storing it in a ThreadLocal. Having a toArray() method on StackWalker would be a lot nicer, especially if I could limit the number of frames retrieved.

I should note that java.util.logging.LogRecord uses a Filter to walk the stack, which is faster than the stream methods I was originally using, but is much slower than what I ended up with.

As for the issue mentioned here, I believe I reported that getCallerClass was much slower than the Reflection class in Java 9 and opened a bug here. As I recall that was addressed, and I believe I verified that fix, but it probably wouldn't hurt for me to do it again.

Ralph

> On Jul 2, 2019, at 10:48 AM, Mandy Chung wrote:
>
> MethodHandles::lookup is optimized (@ForceInline) and so it may not
> represent an apples-to-apples comparison. StackWalker::getCallerClass
> does have overhead compared to Reflection::getCallerClass, and we
> need to get the microbenchmark into the jdk repo and rerun the numbers [1].
>
> I'm not getting how getCallerClass is used and related to access check.
> Can you elaborate?
>
> Mandy
> [1] https://bugs.openjdk.java.net/browse/JDK-8221623
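A sketch of the pattern described in point 3, as I read it: a shared, non-capturing Function that turns the stream into an array and scans it with a plain loop, with the class name being searched for handed over through a ThreadLocal. All names (`ArrayStackLocator`, `callerOf`, the nested `Log` class) are hypothetical and this is not the Log4j code. The sketch makes no performance claim of its own; the timings are Ralph's observation.

```java
import java.lang.StackWalker.StackFrame;
import java.util.function.Function;
import java.util.stream.Stream;

// Hypothetical locator: finds the first frame below the frames of a given "logger" class.
final class ArrayStackLocator {

    private static final StackWalker WALKER = StackWalker.getInstance();

    // The searched-for class name travels through a ThreadLocal because the
    // shared Function below takes only the stream as its argument.
    private static final ThreadLocal<String> FQCN = new ThreadLocal<>();

    private static final Function<Stream<StackFrame>, StackFrame> FIND = ArrayStackLocator::find;

    private static StackFrame find(Stream<StackFrame> stream) {
        String fqcn = FQCN.get();
        // Materialize the frames once, then use an index-based loop.
        StackFrame[] frames = stream.toArray(StackFrame[]::new);
        boolean seen = false;
        for (int i = 0; i < frames.length; i++) {
            boolean match = frames[i].getClassName().equals(fqcn);
            if (seen && !match) {
                return frames[i]; // first frame after the logger's own frames
            }
            seen |= match;
        }
        return null; // logger class not on the stack, or nothing below it
    }

    /** Returns the frame that called into the class named fqcn, or null. */
    static StackFrame callerOf(String fqcn) {
        FQCN.set(fqcn);
        try {
            return WALKER.walk(FIND);
        } finally {
            FQCN.remove();
        }
    }

    /** Tiny stand-in for a logging facade. */
    static final class Log {
        static StackFrame where() {
            return callerOf(Log.class.getName());
        }
    }
}
```

Calling `ArrayStackLocator.Log.where()` from application code yields the application frame. A capturing lambda such as `s -> find(s, fqcn)` would avoid the ThreadLocal at the price of allocating a lambda instance per call; which of the two is cheaper in practice would need to be measured.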
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
MethodHandles::lookup is optimized (@ForceInline) and so it may not represent an apples-to-apples comparison. StackWalker::getCallerClass does have overhead compared to Reflection::getCallerClass, and we need to get the microbenchmark into the jdk repo and rerun the numbers [1].

I'm not getting how getCallerClass is used and related to access check. Can you elaborate?

Mandy
[1] https://bugs.openjdk.java.net/browse/JDK-8221623

On 7/2/19 6:07 AM, Kasper Nielsen wrote:
> Hi Remi,
>
> Yes, setting up a StackWalker is more or less free. It is just
> wrapping a set of options.
>
> public class StackWalkerPerf {
>
>     static final StackWalker sw =
>         StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE);
>
>     @Benchmark
>     public StackWalker stackWalkerSetup() {
>         return StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE);
>     }
>
>     @Benchmark
>     public Class<?> stackWalkerCallerClass() {
>         return sw.getCallerClass();
>     }
>
>     @Benchmark
>     public Lookup reflectionCallerClass() {
>         return MethodHandles.lookup();
>     }
> }
>
> Benchmark                               Mode  Cnt     Score    Error  Units
> StackWalkerPerf.stackWalkerSetup        avgt   10    11.958 ±  0.353  ns/op
> StackWalkerPerf.reflectionCallerClass   avgt   10     8.511 ±  0.415  ns/op
> StackWalkerPerf.stackWalkerCallerClass  avgt   10  1269.825 ± 66.471  ns/op
>
> I'm using MethodHandles.lookup() in this test because it is the cheapest
> way to invoke Reflection.getCallerClass() without any tricks.
> So real performance is likely better.
>
> /Kasper
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
Hi Remi,

Yes, setting up a StackWalker is more or less free. It is just wrapping a set of options.

    import java.lang.StackWalker.Option;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.MethodHandles.Lookup;
    import org.openjdk.jmh.annotations.Benchmark;

    public class StackWalkerPerf {

        // Shared instance, created once.
        static final StackWalker sw =
            StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE);

        // Cost of creating a StackWalker.
        @Benchmark
        public StackWalker stackWalkerSetup() {
            return StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE);
        }

        // Cost of StackWalker.getCallerClass() on the shared instance.
        @Benchmark
        public Class<?> stackWalkerCallerClass() {
            return sw.getCallerClass();
        }

        // MethodHandles.lookup() calls Reflection.getCallerClass() internally.
        @Benchmark
        public Lookup reflectionCallerClass() {
            return MethodHandles.lookup();
        }
    }

    Benchmark                               Mode  Cnt     Score    Error  Units
    StackWalkerPerf.stackWalkerSetup        avgt   10    11.958 ±  0.353  ns/op
    StackWalkerPerf.reflectionCallerClass   avgt   10     8.511 ±  0.415  ns/op
    StackWalkerPerf.stackWalkerCallerClass  avgt   10  1269.825 ± 66.471  ns/op

I'm using MethodHandles.lookup() in this test because it is the cheapest way to invoke Reflection.getCallerClass() without any tricks. So real performance is likely better.

/Kasper

On Tue, 2 Jul 2019 at 13:53, Remi Forax wrote:
>
> Hi Kasper,
> did you store the StackWalker instance in a static final field?
>
> Rémi
Re: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
Hi Kasper,
did you store the StackWalker instance in a static final field?

Rémi

----- Original Message -----
> From: "Kasper Nielsen"
> To: "core-libs-dev"
> Sent: Tuesday, 2 July 2019 11:09:11
> Subject: Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()

> Hi,
>
> Are there any security reasons for why StackWalker.getCallerClass()
> cannot be made as performant as Reflection.getCallerClass()?
> StackWalker.getCallerClass() is at least 100 times slower than
> Reflection.getCallerClass() (~1000 ns/op vs ~10 ns/op).
>
> I'm trying to retrofit some existing APIs where I cannot take a Lookup
> object to do some access control checks.
> But the performance of StackWalker.getCallerClass() is making it impossible.
>
> Best
> Kasper
Slow performance of StackWalker.getCallerClass() vs Reflection.getCallerClass()
Hi,

Are there any security reasons why StackWalker.getCallerClass() cannot be made as performant as Reflection.getCallerClass()? StackWalker.getCallerClass() is at least 100 times slower than Reflection.getCallerClass() (~1000 ns/op vs ~10 ns/op).

I'm trying to retrofit some existing APIs where I cannot take a Lookup object to do some access control checks. But the performance of StackWalker.getCallerClass() is making it impossible.

Best
Kasper