Re: fast way to infer caller
On Fri, 8 Apr 2022 at 06:57, Michael Kuhlmann wrote: > Good morning! > > On 4/7/22 23:59, Kasper Nielsen wrote: > > Hi Michael, > > > > I don't really have an opinion on how you obtain a logger. But one > > particular > > use-case I've had was that I would like to perform some access checks > based > > on the module of the caller. Something similar to how Panama checks for > > native access: > > > > Reflection.ensureNativeAccess(Reflection.getCallerClass()); > > Hmh, but wouldn't it be sufficient to perform these checks at > initialization time? > There might not be an initialization time. Take, for example, this piece of code from jdk.incubator.foreign.CLinker: static void freeMemory(MemoryAddress addr) { Reflection.ensureNativeAccess(Reflection.getCallerClass()); SharedUtils.checkAddress(addr); SharedUtils.freeMemoryInternal(addr); } > That I totally agree! Having the exact line number in the log statement > would be a huge win. > > But to be honest, Reflection::getCallerClass wouldn't help either. > Getting the caller class is the fastest operation of StackWalker. Getting something like line numbers from the top stack frame is somewhat slower. Again for many use cases, the performance of StackWalker is fine. But if you like Ceki is looking for information about stack frames and only have 100 ns to spend then there aren't really any APIs you can use. /Kasper
Re: fast way to infer caller
Good morning! On 4/7/22 23:59, Kasper Nielsen wrote: Hi Michael, I don't really have an opinion on how you obtain a logger. But one particular use-case I've had was that I would like to perform some access checks based on the module of the caller. Something similar to how Panama checks for native access: Reflection.ensureNativeAccess(Reflection.getCallerClass()); Hmh, but wouldn't it be sufficient to perform these checks at initialization time? On 4/7/22 21:01, Ralph Goers wrote: > In Log4j’s case there are two places that accessing the stack comes into play: > 1. Obtaining a Logger . > 2. Logging an event and including the caller’s location information. > > To be honest, what sucks for Logging frameworks is that we really want > access to the location information at compile time such that the overhead > to include it would be zero. I’ve never understood why Java doesn’t include it. That I totally agree! Having the exact line number in the log statement would be a huge win. But to be honest, Reflection::getCallerClass wouldn't help either. I agree that such an inlined statement would be a huge win. But to also put this into relation: From my opinion, logging frameworks should be most performant in the cases where they are *not* logging. Or in those cases where you would log a lot even in production environment - like for financial calculations - the exact line number isn't that relevant any more. But that's just my opinion. -Michael
Re: fast way to infer caller
> > Hi Kasper, > > sorry to jump in here as an uninvolved onlooker, but I can't resist. > I really don't see why this should matter. Getting the caller class is a > rare edge case that you just do in exceptional situations; most often > it's more for debugging or something. > > What users really are interested in is high performance for standard > cases. Implementing a specific optimization into Hotspot to gain few > milliseconds is the least thing I expect from the JVM developers. > > I also don't understand why someone should instantiate a logger during > performance critical procedures. In more than 20 years of Java > development, I've never seen the need to create a logger on the fly. > They are *always* assigned to final static variables, or at least to > predefined pools. Everything else would be just wrong: To instantiate a > logger, you have to fetch at least the log level definition from some > configuration source, and this can never be fast. At least not that > we're talking about nanoseconds here. > > All logging implementations I know of (and all that make sense) are > highly optimized on log throughput; this can only be achieved by > preprocessing during initialization, why this is slow. But that doesn't > matter, because, as said, you should anyway create logger instances > beforehand. > > Sorry for the rant, but I really don't see the use case here. > Hi Michael, I don't really have an opinion on how you obtain a logger. But one particular use-case I've had was that I would like to perform some access checks based on the module of the caller. Something similar to how Panama checks for native access: Reflection.ensureNativeAccess(Reflection.getCallerClass()); You can implement something similar with StackWalker but it just adds a bit of non-negligible overhead to every invocation of the method. I'm pretty sure there must be other valid use cases out there where performance is important. After all, the StackWalker API was created because existing solutions such as Thread::getStackTrace weren't performant enough. /Kasper
Re: fast way to infer caller
C and C++ are harder to read as a side effect of pre-processor macros. IMHO, Java's lack of pre-processor macros simplifies the language and improves readability. On 4/7/2022 9:01 PM, Ralph Goers wrote: > I mentioned this same concern back in 2017 when the Sun Reflection > class was removed. > > In Log4j’s case there are two places that accessing the stack comes into play: > 1. Obtaining a Logger . > 2. Logging an event and including the caller’s location information. > > To be honest, what sucks for Logging frameworks is that we really want > access to the location information at compile time such that the overhead > to include it would be zero. I’ve never understood why Java doesn’t include > it. > I wrote a Logging Framework for C back in the 1980’s that could include the > Information so this isn’t a new concept. > > Even if I could do > > logger.debug(System.location(), ….) > > Where System.location is replaced by the compiler with a constant object > would be a huge win. > > Ralph > > > >> On Apr 7, 2022, at 11:02 AM, Bernd Eckenfels wrote: >> >> Some loggers do need to find the location of the log statement (class and >> line where the logger is used not where it is instantiated). >> >> for those (it makes loggers more useful) getting the call site is time >> critical even if they are not in tight performance critical loops. >> >> But it actually does matter if/how the JVM optimizes such introspection.. if >> the JVM can inline (and maybe even constant intrinsic) the stalkwalker it >> would benefit such use cases just as well. >> >> -- >> https://bernd.eckenfels.net >> ________ >> From: core-libs-dev on behalf of >> Michael Kuhlmann >> Sent: Thursday, April 7, 2022 7:55:16 PM >> To: core-libs-dev@openjdk.java.net >> Subject: Re: fast way to infer caller >> >> >> >> On 4/7/22 19:27, Kasper Nielsen wrote: >>>> >>>> nope, see my previous mail to Ceki, the VM is cheating here if it can >>>> inline the call to MethodHandles.lookup() >>>> >>> >>> Does how the VM cheats really matter? The fact is that the code in the JDK >>> can >>> get the calling class and implement something like MethodHandles.lookup() so >>> it takes ~3 ns. If you implemented something like a lookup class as a normal >>> user your best bet would be StackWalker.GetCallingClass() and you would end >>> up with something that is at least 2 magnitudes slower. That is probably not >>> an issue for most use cases. But for some, it might be a bit of a steep >>> cost. >>> >>> /Kasper >> >> Hi Kasper, >> >> sorry to jump in here as an uninvolved onlooker, but I can't resist. >> I really don't see why this should matter. Getting the caller class is a >> rare edge case that you just do in exceptional situations; most often >> it's more for debugging or something. >> >> What users really are interested in is high performance for standard >> cases. Implementing a specific optimization into Hotspot to gain few >> milliseconds is the least thing I expect from the JVM developers. >> >> I also don't understand why someone should instantiate a logger during >> performance critical procedures. In more than 20 years of Java >> development, I've never seen the need to create a logger on the fly. >> They are *always* assigned to final static variables, or at least to >> predefined pools. Everything else would be just wrong: To instantiate a >> logger, you have to fetch at least the log level definition from some >> configuration source, and this can never be fast. At least not that >> we're talking about nanoseconds here. >> >> All logging implementations I know of (and all that make sense) are >> highly optimized on log throughput; this can only be achieved by >> preprocessing during initialization, why this is slow. But that doesn't >> matter, because, as said, you should anyway create logger instances >> beforehand. >> >> Sorry for the rant, but I really don't see the use case here. > -- Ceki Gülcü Sponsoring SLF4J/logback/reload4j at https://github.com/sponsors/qos-ch
Re: fast way to infer caller
Perhaps https://bugs.openjdk.java.net/browse/JDK-4515935 for the MemoryHandler could be used to determine if StackWalker is fast enough for the lowest rung on the StackWalker performance ladder. Currently the MemoryHandler doesn't not infer the caller and the target handler sees the callsite of the thread that triggers the push. Most use cases of MemoryHandler are records that are loggable but, get discarded and never published to the target handler (never formatted nor sent to some data sink). So this is a real world use case of only openjdk classes. The Peabody fix I proposed in 2007 was to unconditionally force the caller to be computed prior to adding the LogRecord to the internal data structure. Therefore all loggable records would pay the cost of inferring the caller. Current code is fast and broken (assuming target formatter is showing callsite) and the correct code will be slower. If I were to redo that patch from Peabody I would think that PR review would bring to light a consistent view that StackWalker is fast enough at least the openjdk logging. Effectively defining the minimum performance standard. However, if it raises performance regression concerns perhaps there is some more work to be done improving StackWalker? :) Jason From: core-libs-dev on behalf of Bernd Eckenfels Sent: Thursday, April 7, 2022 1:02 PM To: core-libs-dev@openjdk.java.net Subject: Re: fast way to infer caller Some loggers do need to find the location of the log statement (class and line where the logger is used not where it is instantiated). for those (it makes loggers more useful) getting the call site is time critical even if they are not in tight performance critical loops. But it actually does matter if/how the JVM optimizes such introspection.. if the JVM can inline (and maybe even constant intrinsic) the stalkwalker it would benefit such use cases just as well. -- https://bernd.eckenfels.net From: core-libs-dev on behalf of Michael Kuhlmann Sent: Thursday, April 7, 2022 7:55:16 PM To: core-libs-dev@openjdk.java.net Subject: Re: fast way to infer caller On 4/7/22 19:27, Kasper Nielsen wrote: >> >> nope, see my previous mail to Ceki, the VM is cheating here if it can >> inline the call to MethodHandles.lookup() >> > > Does how the VM cheats really matter? The fact is that the code in the JDK > can > get the calling class and implement something like MethodHandles.lookup() so > it takes ~3 ns. If you implemented something like a lookup class as a normal > user your best bet would be StackWalker.GetCallingClass() and you would end > up with something that is at least 2 magnitudes slower. That is probably not > an issue for most use cases. But for some, it might be a bit of a steep > cost. > > /Kasper Hi Kasper, sorry to jump in here as an uninvolved onlooker, but I can't resist. I really don't see why this should matter. Getting the caller class is a rare edge case that you just do in exceptional situations; most often it's more for debugging or something. What users really are interested in is high performance for standard cases. Implementing a specific optimization into Hotspot to gain few milliseconds is the least thing I expect from the JVM developers. I also don't understand why someone should instantiate a logger during performance critical procedures. In more than 20 years of Java development, I've never seen the need to create a logger on the fly. They are *always* assigned to final static variables, or at least to predefined pools. Everything else would be just wrong: To instantiate a logger, you have to fetch at least the log level definition from some configuration source, and this can never be fast. At least not that we're talking about nanoseconds here. All logging implementations I know of (and all that make sense) are highly optimized on log throughput; this can only be achieved by preprocessing during initialization, why this is slow. But that doesn't matter, because, as said, you should anyway create logger instances beforehand. Sorry for the rant, but I really don't see the use case here.
Re: fast way to infer caller
I mentioned this same concern back in 2017 when the Sun Reflection class was removed. In Log4j’s case there are two places that accessing the stack comes into play: 1. Obtaining a Logger . 2. Logging an event and including the caller’s location information. To be honest, what sucks for Logging frameworks is that we really want access to the location information at compile time such that the overhead to include it would be zero. I’ve never understood why Java doesn’t include it. I wrote a Logging Framework for C back in the 1980’s that could include the Information so this isn’t a new concept. Even if I could do logger.debug(System.location(), ….) Where System.location is replaced by the compiler with a constant object would be a huge win. Ralph > On Apr 7, 2022, at 11:02 AM, Bernd Eckenfels wrote: > > Some loggers do need to find the location of the log statement (class and > line where the logger is used not where it is instantiated). > > for those (it makes loggers more useful) getting the call site is time > critical even if they are not in tight performance critical loops. > > But it actually does matter if/how the JVM optimizes such introspection.. if > the JVM can inline (and maybe even constant intrinsic) the stalkwalker it > would benefit such use cases just as well. > > -- > https://bernd.eckenfels.net > > From: core-libs-dev on behalf of > Michael Kuhlmann > Sent: Thursday, April 7, 2022 7:55:16 PM > To: core-libs-dev@openjdk.java.net > Subject: Re: fast way to infer caller > > > > On 4/7/22 19:27, Kasper Nielsen wrote: >>> >>> nope, see my previous mail to Ceki, the VM is cheating here if it can >>> inline the call to MethodHandles.lookup() >>> >> >> Does how the VM cheats really matter? The fact is that the code in the JDK >> can >> get the calling class and implement something like MethodHandles.lookup() so >> it takes ~3 ns. If you implemented something like a lookup class as a normal >> user your best bet would be StackWalker.GetCallingClass() and you would end >> up with something that is at least 2 magnitudes slower. That is probably not >> an issue for most use cases. But for some, it might be a bit of a steep >> cost. >> >> /Kasper > > Hi Kasper, > > sorry to jump in here as an uninvolved onlooker, but I can't resist. > I really don't see why this should matter. Getting the caller class is a > rare edge case that you just do in exceptional situations; most often > it's more for debugging or something. > > What users really are interested in is high performance for standard > cases. Implementing a specific optimization into Hotspot to gain few > milliseconds is the least thing I expect from the JVM developers. > > I also don't understand why someone should instantiate a logger during > performance critical procedures. In more than 20 years of Java > development, I've never seen the need to create a logger on the fly. > They are *always* assigned to final static variables, or at least to > predefined pools. Everything else would be just wrong: To instantiate a > logger, you have to fetch at least the log level definition from some > configuration source, and this can never be fast. At least not that > we're talking about nanoseconds here. > > All logging implementations I know of (and all that make sense) are > highly optimized on log throughput; this can only be achieved by > preprocessing during initialization, why this is slow. But that doesn't > matter, because, as said, you should anyway create logger instances > beforehand. > > Sorry for the rant, but I really don't see the use case here.
Re: fast way to infer caller
Some loggers do need to find the location of the log statement (class and line where the logger is used not where it is instantiated). for those (it makes loggers more useful) getting the call site is time critical even if they are not in tight performance critical loops. But it actually does matter if/how the JVM optimizes such introspection.. if the JVM can inline (and maybe even constant intrinsic) the stalkwalker it would benefit such use cases just as well. -- https://bernd.eckenfels.net From: core-libs-dev on behalf of Michael Kuhlmann Sent: Thursday, April 7, 2022 7:55:16 PM To: core-libs-dev@openjdk.java.net Subject: Re: fast way to infer caller On 4/7/22 19:27, Kasper Nielsen wrote: >> >> nope, see my previous mail to Ceki, the VM is cheating here if it can >> inline the call to MethodHandles.lookup() >> > > Does how the VM cheats really matter? The fact is that the code in the JDK > can > get the calling class and implement something like MethodHandles.lookup() so > it takes ~3 ns. If you implemented something like a lookup class as a normal > user your best bet would be StackWalker.GetCallingClass() and you would end > up with something that is at least 2 magnitudes slower. That is probably not > an issue for most use cases. But for some, it might be a bit of a steep > cost. > > /Kasper Hi Kasper, sorry to jump in here as an uninvolved onlooker, but I can't resist. I really don't see why this should matter. Getting the caller class is a rare edge case that you just do in exceptional situations; most often it's more for debugging or something. What users really are interested in is high performance for standard cases. Implementing a specific optimization into Hotspot to gain few milliseconds is the least thing I expect from the JVM developers. I also don't understand why someone should instantiate a logger during performance critical procedures. In more than 20 years of Java development, I've never seen the need to create a logger on the fly. They are *always* assigned to final static variables, or at least to predefined pools. Everything else would be just wrong: To instantiate a logger, you have to fetch at least the log level definition from some configuration source, and this can never be fast. At least not that we're talking about nanoseconds here. All logging implementations I know of (and all that make sense) are highly optimized on log throughput; this can only be achieved by preprocessing during initialization, why this is slow. But that doesn't matter, because, as said, you should anyway create logger instances beforehand. Sorry for the rant, but I really don't see the use case here.
Re: fast way to infer caller
On 4/7/22 19:27, Kasper Nielsen wrote: nope, see my previous mail to Ceki, the VM is cheating here if it can inline the call to MethodHandles.lookup() Does how the VM cheats really matter? The fact is that the code in the JDK can get the calling class and implement something like MethodHandles.lookup() so it takes ~3 ns. If you implemented something like a lookup class as a normal user your best bet would be StackWalker.GetCallingClass() and you would end up with something that is at least 2 magnitudes slower. That is probably not an issue for most use cases. But for some, it might be a bit of a steep cost. /Kasper Hi Kasper, sorry to jump in here as an uninvolved onlooker, but I can't resist. I really don't see why this should matter. Getting the caller class is a rare edge case that you just do in exceptional situations; most often it's more for debugging or something. What users really are interested in is high performance for standard cases. Implementing a specific optimization into Hotspot to gain few milliseconds is the least thing I expect from the JVM developers. I also don't understand why someone should instantiate a logger during performance critical procedures. In more than 20 years of Java development, I've never seen the need to create a logger on the fly. They are *always* assigned to final static variables, or at least to predefined pools. Everything else would be just wrong: To instantiate a logger, you have to fetch at least the log level definition from some configuration source, and this can never be fast. At least not that we're talking about nanoseconds here. All logging implementations I know of (and all that make sense) are highly optimized on log throughput; this can only be achieved by preprocessing during initialization, why this is slow. But that doesn't matter, because, as said, you should anyway create logger instances beforehand. Sorry for the rant, but I really don't see the use case here.
Re: fast way to infer caller
> > nope, see my previous mail to Ceki, the VM is cheating here if it can > inline the call to MethodHandles.lookup() > Does how the VM cheats really matter? The fact is that the code in the JDK can get the calling class and implement something like MethodHandles.lookup() so it takes ~3 ns. If you implemented something like a lookup class as a normal user your best bet would be StackWalker.GetCallingClass() and you would end up with something that is at least 2 magnitudes slower. That is probably not an issue for most use cases. But for some, it might be a bit of a steep cost. /Kasper
Re: fast way to infer caller
Rémi, Jason, Steven, Kasper, Thank you all for your thoughtful feedback. I forgot to mention that in addition to obtaining the appropriate logger by inferring caller class, the caller class may also be a data point written with each logging event, aka LogRecord. If the cost of writing a log event to file is 500 nanoseconds (without caller data), spending additional 1800 nanoseconds to obtain caller data via StackWalker reduces throughput by a factor of 4.8. Mind you, this is still much better than obtaining caller data via "new Throwable().getStackTrace();" which costs about 11'000 nanoseconds, reducing throughput by a factor of 24. While logging 2'000'000 messages per second is not a goal, reducing the cost of logging as much as possible, is a worthwhile goal. At present time, retrieving caller data still has a significant effect on logging performance. -- Ceki Gülcü Sponsoring SLF4J/logback/reload4j at https://github.com/sponsors/qos-ch
Re: fast way to infer caller
> From: "Kasper Nielsen" > To: "Remi Forax" > Cc: "Ceki Gülcü" , "core-libs-dev" > > Sent: Thursday, April 7, 2022 2:42:46 PM > Subject: Re: fast way to infer caller > On Thu, 7 Apr 2022 at 13:33, Remi Forax < [ mailto:fo...@univ-mlv.fr | > fo...@univ-mlv.fr ] > wrote: >> - Original Message - >> > From: "Kasper Nielsen" < [ mailto:kaspe...@gmail.com | kaspe...@gmail.com >> > ] > >> > To: "Ceki Gülcü" < [ mailto:c...@qos.ch | c...@qos.ch ] > >>> Cc: "core-libs-dev" < [ mailto:core-libs-dev@openjdk.java.net | >> > core-libs-dev@openjdk.java.net ] > >> > Sent: Thursday, April 7, 2022 1:53:33 PM >> > Subject: Re: fast way to infer caller >> >> MethodHandles.lookup().lookupClass() looks very promising except that >> >> there is no way to specify the depth. >> >> I am looking for a method to obtain the Nth caller at a cost of around >> >> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >> >> for this use case? >> > Hi Ceki, >> > I don't think you will find the numbers you are looking for with >> > StackWalker. >> > Compared to something like Reflection.getCallerClass() which >> > MethodHandles.lookup() uses. >> > There is still a very large (>100x) performance gap between the two. >> > public class StackWalkerPerf { >> > static final StackWalker sw = >> > StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE); >> > @Benchmark >> > public Class stackWalkerCallerClass() { >> > return sw.getCallerClass(); >> > } >> > @Benchmark >> > public Class reflectionCallerClass() { >> > return MethodHandles.lookup().lookupClass(); >> > } >> > } >> > Benchmark Mode Cnt Score Error Units >> > StackWalkerPerf.reflectionCallerClass avgt 10 2,927 ± 0,012 ns/op >> > StackWalkerPerf.stackWalkerCallerClass avgt 10 915,287 ± 9,565 ns/op >>> The resulting class is not the same, so comparing the performance of two >>> codes >> > that does not do the same thing is dubious, no ? > I think it is a fair test. MethodHandles.lookup() is basically just wrapping > Reflection.getCallerClass(). So if anything > calling Reflection.getCallerClass() directly would be even faster than calling > MethodHandles.lookup().lookupClass(); nope, see my previous mail to Ceki, the VM is cheating here if it can inline the call to MethodHandles.lookup(). > /Kasper Rémi
Re: fast way to infer caller
On Thu, 7 Apr 2022 at 13:33, Remi Forax wrote: > - Original Message - > > From: "Kasper Nielsen" > > To: "Ceki Gülcü" > > Cc: "core-libs-dev" > > Sent: Thursday, April 7, 2022 1:53:33 PM > > Subject: Re: fast way to infer caller > > >> > >> MethodHandles.lookup().lookupClass() looks very promising except that > >> there is no way to specify the depth. > >> > >> I am looking for a method to obtain the Nth caller at a cost of around > >> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater > >> for this use case? > >> > > > > Hi Ceki, > > > > I don't think you will find the numbers you are looking for with > > StackWalker. > > Compared to something like Reflection.getCallerClass() which > > MethodHandles.lookup() uses. > > There is still a very large (>100x) performance gap between the two. > > > > public class StackWalkerPerf { > > > >static final StackWalker sw = > > StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE); > > > >@Benchmark > >public Class stackWalkerCallerClass() { > >return sw.getCallerClass(); > >} > > > >@Benchmark > >public Class reflectionCallerClass() { > >return MethodHandles.lookup().lookupClass(); > >} > > } > > > > Benchmark Mode Cnt ScoreError > Units > > StackWalkerPerf.reflectionCallerClass avgt 102,927 ± 0,012 ns/op > > StackWalkerPerf.stackWalkerCallerClass avgt 10 915,287 ± 9,565 ns/op > > > The resulting class is not the same, so comparing the performance of two > codes that does not do the same thing is dubious, no ? I think it is a fair test. MethodHandles.lookup() is basically just wrapping Reflection.getCallerClass(). So if anything calling Reflection.getCallerClass() directly would be even faster than calling MethodHandles.lookup().lookupClass(); /Kasper
Re: fast way to infer caller
- Original Message - > From: "Kasper Nielsen" > To: "Ceki Gülcü" > Cc: "core-libs-dev" > Sent: Thursday, April 7, 2022 1:53:33 PM > Subject: Re: fast way to infer caller >> >> MethodHandles.lookup().lookupClass() looks very promising except that >> there is no way to specify the depth. >> >> I am looking for a method to obtain the Nth caller at a cost of around >> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >> for this use case? >> > > Hi Ceki, > > I don't think you will find the numbers you are looking for with > StackWalker. > Compared to something like Reflection.getCallerClass() which > MethodHandles.lookup() uses. > There is still a very large (>100x) performance gap between the two. > > public class StackWalkerPerf { > >static final StackWalker sw = > StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE); > >@Benchmark >public Class stackWalkerCallerClass() { >return sw.getCallerClass(); >} > >@Benchmark >public Class reflectionCallerClass() { >return MethodHandles.lookup().lookupClass(); >} > } > > Benchmark Mode Cnt ScoreError Units > StackWalkerPerf.reflectionCallerClass avgt 102,927 ± 0,012 ns/op > StackWalkerPerf.stackWalkerCallerClass avgt 10 915,287 ± 9,565 ns/op The resulting class is not the same, so comparing the performance of two codes that does not do the same thing is dubious, no ? > > /Kasper Rémi
Re: fast way to infer caller
> > MethodHandles.lookup().lookupClass() looks very promising except that > there is no way to specify the depth. > > I am looking for a method to obtain the Nth caller at a cost of around > 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater > for this use case? > Hi Ceki, I don't think you will find the numbers you are looking for with StackWalker. Compared to something like Reflection.getCallerClass() which MethodHandles.lookup() uses. There is still a very large (>100x) performance gap between the two. public class StackWalkerPerf { static final StackWalker sw = StackWalker.getInstance(Option.RETAIN_CLASS_REFERENCE); @Benchmark public Class stackWalkerCallerClass() { return sw.getCallerClass(); } @Benchmark public Class reflectionCallerClass() { return MethodHandles.lookup().lookupClass(); } } Benchmark Mode Cnt ScoreError Units StackWalkerPerf.reflectionCallerClass avgt 102,927 ± 0,012 ns/op StackWalkerPerf.stackWalkerCallerClass avgt 10 915,287 ± 9,565 ns/op /Kasper
Re: fast way to infer caller
- Original Message - > From: "Ceki Gülcü" > To: "core-libs-dev" > Sent: Thursday, April 7, 2022 1:04:11 AM > Subject: Re: fast way to infer caller > Hi Jason, > > Yes, the code was mentioned in SLF4J PR 271. > > The benchmark can be found at > > https://github.com/ceki/logback-perf/tree/master/src/main/java/ch/qos/logback/perf/caller > > Here is the gist of the StackWalker variant: > > public class CallerCompute { > >static private StackWalker WALKER = StackWalker.getInstance(); > >static public String getCallerClass(int depth) { > >List frames = WALKER.walk(s -> s.limit(4).toList()); > process the returned frames >} > } > > The above executes at around 1.8 microseconds per call whereas the code > in LogRecord.getSourceClassName() executes at 5 microseconds per call. > The cost of LogRecord constructor is negligible at around 25 nanoseconds. > > Anyway, thank you for the suggestion. You can declare WALKER "final", it should help. This is also better to call getCallerClass ASAP and then pass the class to the other stack frames instead of having to skip those stack frames when calling getCallerClass. You may also try able to improve the speed by doing a == before calling equals(String) if you call String.intern() on THIS_CLASS_NAME. > > -- > Ceki Rémi > > > On 4/7/2022 12:27 AM, Jason Mehrens wrote: >> Ceki, >> >> Looks like the benchmark code is from >> https://github.com/qos-ch/slf4j/pull/271 >> >> Looks like the benchmark code is doing a collection so perhaps that is some >> of >> the performance hit? >> Have you benchmarked java.util.LogRecord.getSourceClassName() to compare with >> your StackWalker benchmark? >> >> https://github.com/openjdk/jdk/blob/master/src/java.logging/share/classes/java/util/logging/LogRecord.java#L754 >> >> Jason >> >> >> From: core-libs-dev on behalf of Ceki >> Gülcü >> Sent: Wednesday, April 6, 2022 4:26 PM >> To: core-libs-dev >> Subject: Re: fast way to infer caller >> >> >> Hi Rémi, >> >> Thank you for your answer. >> >> According to some benchmarks on a i7-8565U Intel CPU (quite average >> CPU), here are the costs of computing the caller class via different >> methods: >> >> using new Throwable().getStackTrace: 11 microseconds per call >> using StackWalker API: 1.8 microseconds per call >> using SecurityManager: 0.9 microseconds per call >> >> While a six fold improvement (StackWalker compared to new Thorowable) is >> nothing to sneeze at, the performance of StackWalker is not as good as >> with a custom SecurityManager. >> >> I have not said so explicitly but the aim here is to allow the user to >> obtain a new logger by calling LoggerFactory.getLogger() with no >> arguments with the returned logger named after the caller class. >> >> Spending 1 or 2 microseconds for this call is OK if the logger is a >> static field. However, if the logger is an instance field, then spending >> 2 microseconds per host object instance just to obtain a logger might >> have a noticeable impact on performance. It follows that the performance >> of LoggerFactory.getLogger() must be exceptionally good, assuming we >> wish to avoid having the user accidentally shooting herself on the foot, >> ergo the 100 nanosecond performance per call requirement. >> >> Noting that invoking MethodHandles.lookup().lookupClass() seems very >> fast (about 2 nanoseconds), I would be very interested if new >> lookup(int index) method were added to java.lang.invoke.MethodHandles >> class, with index designating the desired index on the call stack. >> >> Does the above make sense? How difficult would it be to add such a method? >> >> -- >> Ceki Gülcü >> >> >> On 4/6/2022 5:52 PM, Remi Forax wrote: >>> - Original Message - >>>> From: "Ceki Gülcü" >>>> To: "core-libs-dev" >>>> Sent: Wednesday, April 6, 2022 5:30:51 PM >>>> Subject: fast way to infer caller >>> >>>> Hello, >>> >>> Hello, >>> >>>> >>>> As you are probably aware, one of the important primitives used in >>>> logging libraries is inferring the caller of a given logging statement. >>>> The current common practice is to create a throwable and process its >>>> stack trace. This is rather wasteful and rather slow. As an alterna
Re: fast way to infer caller
> On Apr 6, 2022, at 3:54 PM, Remi Forax wrote: > > - Original Message - >> From: "Ceki Gülcü" >> To: "core-libs-dev" >> Sent: Wednesday, April 6, 2022 11:26:39 PM >> Subject: Re: fast way to infer caller > >> Hi Rémi, >> > > Now, i don't think there is a real solution to you issue, worst i will try to > convince you that this is not a real problem :) > You are offering convenience using magic to your user, this has a cost. > > For me this is very similar to the trade off you have by offering to change > the logger configuration at runtime. > This is convenient but it requires a volatile read (even when the logger is > de-activated) which destroy performance in tight loop (you loose hoisting). > I believe that if your users are fine with that, they are also fine with a > call to LoggerFactory.getLogger() being a little slow. > We're only one user, but we use Logback and always assumed getLogger was reasonably expensive, and that the most effort went into optimizing the actual logging itself. So we always store it in a static (preferred) or at least instance field, and if we ever dynamically look it up we assume that's more expensive. To us, the convenience of calling getLogger() over getLogger(MyClass.class) is a nice win, and even 10x the execution speed of calling getLogger doesn't matter at all. If the performance cost was documented we would still use it without hesitation. Just my 2¢, Steven > > >> >> >> On 4/6/2022 5:52 PM, Remi Forax wrote: >>> ----- Original Message - >>>> From: "Ceki Gülcü" >>>> To: "core-libs-dev" >>>> Sent: Wednesday, April 6, 2022 5:30:51 PM >>>> Subject: fast way to infer caller >>> >>>> Hello, >>> >>> Hello, >>> >>>> >>>> As you are probably aware, one of the important primitives used in >>>> logging libraries is inferring the caller of a given logging statement. >>>> The current common practice is to create a throwable and process its >>>> stack trace. This is rather wasteful and rather slow. As an alternative, >>>> I have tried using the StackWalker API to infer the caller but was >>>> unsatisfied with the performance. >>>> >>>> MethodHandles.lookup().lookupClass() looks very promising except that >>>> there is no way to specify the depth. >>>> >>>> I am looking for a method to obtain the Nth caller at a cost of around >>>> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >>>> for this use case? >>> >>> We have designed the StackWalker with that in mind >>> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html >>> >>> see the discussion on StackWalker.getCallerClass() >>> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#getCallerClass()
Re: fast way to infer caller
Hi Jason, Yes, the code was mentioned in SLF4J PR 271. The benchmark can be found at https://github.com/ceki/logback-perf/tree/master/src/main/java/ch/qos/logback/perf/caller Here is the gist of the StackWalker variant: public class CallerCompute { static private StackWalker WALKER = StackWalker.getInstance(); static public String getCallerClass(int depth) { List frames = WALKER.walk(s -> s.limit(4).toList()); process the returned frames } } The above executes at around 1.8 microseconds per call whereas the code in LogRecord.getSourceClassName() executes at 5 microseconds per call. The cost of LogRecord constructor is negligible at around 25 nanoseconds. Anyway, thank you for the suggestion. -- Ceki On 4/7/2022 12:27 AM, Jason Mehrens wrote: > Ceki, > > Looks like the benchmark code is from https://github.com/qos-ch/slf4j/pull/271 > > Looks like the benchmark code is doing a collection so perhaps that is some > of the performance hit? > Have you benchmarked java.util.LogRecord.getSourceClassName() to compare with > your StackWalker benchmark? > > https://github.com/openjdk/jdk/blob/master/src/java.logging/share/classes/java/util/logging/LogRecord.java#L754 > > Jason > > > From: core-libs-dev on behalf of Ceki > Gülcü > Sent: Wednesday, April 6, 2022 4:26 PM > To: core-libs-dev > Subject: Re: fast way to infer caller > > > Hi Rémi, > > Thank you for your answer. > > According to some benchmarks on a i7-8565U Intel CPU (quite average > CPU), here are the costs of computing the caller class via different > methods: > > using new Throwable().getStackTrace: 11 microseconds per call > using StackWalker API: 1.8 microseconds per call > using SecurityManager: 0.9 microseconds per call > > While a six fold improvement (StackWalker compared to new Thorowable) is > nothing to sneeze at, the performance of StackWalker is not as good as > with a custom SecurityManager. > > I have not said so explicitly but the aim here is to allow the user to > obtain a new logger by calling LoggerFactory.getLogger() with no > arguments with the returned logger named after the caller class. > > Spending 1 or 2 microseconds for this call is OK if the logger is a > static field. However, if the logger is an instance field, then spending > 2 microseconds per host object instance just to obtain a logger might > have a noticeable impact on performance. It follows that the performance > of LoggerFactory.getLogger() must be exceptionally good, assuming we > wish to avoid having the user accidentally shooting herself on the foot, > ergo the 100 nanosecond performance per call requirement. > > Noting that invoking MethodHandles.lookup().lookupClass() seems very > fast (about 2 nanoseconds), I would be very interested if new > lookup(int index) method were added to java.lang.invoke.MethodHandles > class, with index designating the desired index on the call stack. > > Does the above make sense? How difficult would it be to add such a method? > > -- > Ceki Gülcü > > > On 4/6/2022 5:52 PM, Remi Forax wrote: >> - Original Message - >>> From: "Ceki Gülcü" >>> To: "core-libs-dev" >>> Sent: Wednesday, April 6, 2022 5:30:51 PM >>> Subject: fast way to infer caller >> >>> Hello, >> >> Hello, >> >>> >>> As you are probably aware, one of the important primitives used in >>> logging libraries is inferring the caller of a given logging statement. >>> The current common practice is to create a throwable and process its >>> stack trace. This is rather wasteful and rather slow. As an alternative, >>> I have tried using the StackWalker API to infer the caller but was >>> unsatisfied with the performance. >>> >>> MethodHandles.lookup().lookupClass() looks very promising except that >>> there is no way to specify the depth. >>> >>> I am looking for a method to obtain the Nth caller at a cost of around >>> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >>> for this use case? >> >> We have designed the StackWalker with that in mind >> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html >> >> see the discussion on StackWalker.getCallerClass() >> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#getCallerClass() >> >
Re: fast way to infer caller
- Original Message - > From: "Ceki Gülcü" > To: "core-libs-dev" > Sent: Wednesday, April 6, 2022 11:26:39 PM > Subject: Re: fast way to infer caller > Hi Rémi, > > Thank you for your answer. > > According to some benchmarks on a i7-8565U Intel CPU (quite average > CPU), here are the costs of computing the caller class via different > methods: > > using new Throwable().getStackTrace: 11 microseconds per call > using StackWalker API: 1.8 microseconds per call > using SecurityManager: 0.9 microseconds per call > > While a six fold improvement (StackWalker compared to new Throwable) is > nothing to sneeze at, the performance of StackWalker is not as good as > with a custom SecurityManager. > > I have not said so explicitly but the aim here is to allow the user to > obtain a new logger by calling LoggerFactory.getLogger() with no > arguments with the returned logger named after the caller class. > > Spending 1 or 2 microseconds for this call is OK if the logger is a > static field. However, if the logger is an instance field, then spending > 2 microseconds per host object instance just to obtain a logger might > have a noticeable impact on performance. It follows that the performance > of LoggerFactory.getLogger() must be exceptionally good, assuming we > wish to avoid having the user accidentally shooting herself on the foot, > ergo the 100 nanosecond performance per call requirement. > > Noting that invoking MethodHandles.lookup().lookupClass() seems very > fast (about 2 nanoseconds), I would be very interested if new > lookup(int index) method were added to java.lang.invoke.MethodHandles > class, with index designating the desired index on the call stack. > > Does the above make sense? How difficult would it be to add such a method ? I think that for MethodHandles.lookup() and all other caller sensitive methods, the VM uses a trick, it inlines the call into the caller method, once this is done, the caller class is know statically and can be replaced by a constant. In order to use the same trick, you need a way to force the inlining through getLogger(), the is something the JDK can do but that is not available to user code. And you do not want lookup(-1) because getLogger() can be called by reflection, java.lang.invoke or a lambda, in all these cases you have "hidden" stack frames in between the user code code and LoggerFactory.getLogger(), you need to skip those stack frames, this is what the StackWalker does. Now, i don't think there is a real solution to you issue, worst i will try to convince you that this is not a real problem :) You are offering convenience using magic to your user, this has a cost. For me this is very similar to the trade off you have by offering to change the logger configuration at runtime. This is convenient but it requires a volatile read (even when the logger is de-activated) which destroy performance in tight loop (you loose hoisting). I believe that if your users are fine with that, they are also fine with a call to LoggerFactory.getLogger() being a little slow. > > -- > Ceki Gülcü Rémi > > > On 4/6/2022 5:52 PM, Remi Forax wrote: >> - Original Message - >>> From: "Ceki Gülcü" >>> To: "core-libs-dev" >>> Sent: Wednesday, April 6, 2022 5:30:51 PM >>> Subject: fast way to infer caller >> >>> Hello, >> >> Hello, >> >>> >>> As you are probably aware, one of the important primitives used in >>> logging libraries is inferring the caller of a given logging statement. >>> The current common practice is to create a throwable and process its >>> stack trace. This is rather wasteful and rather slow. As an alternative, >>> I have tried using the StackWalker API to infer the caller but was >>> unsatisfied with the performance. >>> >>> MethodHandles.lookup().lookupClass() looks very promising except that >>> there is no way to specify the depth. >>> >>> I am looking for a method to obtain the Nth caller at a cost of around >>> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >>> for this use case? >> >> We have designed the StackWalker with that in mind >> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html >> >> see the discussion on StackWalker.getCallerClass() >> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#getCallerClass()
Re: fast way to infer caller
Ceki, Looks like the benchmark code is from https://github.com/qos-ch/slf4j/pull/271 Looks like the benchmark code is doing a collection so perhaps that is some of the performance hit? Have you benchmarked java.util.LogRecord.getSourceClassName() to compare with your StackWalker benchmark? https://github.com/openjdk/jdk/blob/master/src/java.logging/share/classes/java/util/logging/LogRecord.java#L754 Jason From: core-libs-dev on behalf of Ceki Gülcü Sent: Wednesday, April 6, 2022 4:26 PM To: core-libs-dev Subject: Re: fast way to infer caller Hi Rémi, Thank you for your answer. According to some benchmarks on a i7-8565U Intel CPU (quite average CPU), here are the costs of computing the caller class via different methods: using new Throwable().getStackTrace: 11 microseconds per call using StackWalker API: 1.8 microseconds per call using SecurityManager: 0.9 microseconds per call While a six fold improvement (StackWalker compared to new Thorowable) is nothing to sneeze at, the performance of StackWalker is not as good as with a custom SecurityManager. I have not said so explicitly but the aim here is to allow the user to obtain a new logger by calling LoggerFactory.getLogger() with no arguments with the returned logger named after the caller class. Spending 1 or 2 microseconds for this call is OK if the logger is a static field. However, if the logger is an instance field, then spending 2 microseconds per host object instance just to obtain a logger might have a noticeable impact on performance. It follows that the performance of LoggerFactory.getLogger() must be exceptionally good, assuming we wish to avoid having the user accidentally shooting herself on the foot, ergo the 100 nanosecond performance per call requirement. Noting that invoking MethodHandles.lookup().lookupClass() seems very fast (about 2 nanoseconds), I would be very interested if new lookup(int index) method were added to java.lang.invoke.MethodHandles class, with index designating the desired index on the call stack. Does the above make sense? How difficult would it be to add such a method? -- Ceki Gülcü On 4/6/2022 5:52 PM, Remi Forax wrote: > - Original Message - >> From: "Ceki Gülcü" >> To: "core-libs-dev" >> Sent: Wednesday, April 6, 2022 5:30:51 PM >> Subject: fast way to infer caller > >> Hello, > > Hello, > >> >> As you are probably aware, one of the important primitives used in >> logging libraries is inferring the caller of a given logging statement. >> The current common practice is to create a throwable and process its >> stack trace. This is rather wasteful and rather slow. As an alternative, >> I have tried using the StackWalker API to infer the caller but was >> unsatisfied with the performance. >> >> MethodHandles.lookup().lookupClass() looks very promising except that >> there is no way to specify the depth. >> >> I am looking for a method to obtain the Nth caller at a cost of around >> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >> for this use case? > > We have designed the StackWalker with that in mind > https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html > > see the discussion on StackWalker.getCallerClass() > https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#getCallerClass() >
Re: fast way to infer caller
Hi Rémi, Thank you for your answer. According to some benchmarks on a i7-8565U Intel CPU (quite average CPU), here are the costs of computing the caller class via different methods: using new Throwable().getStackTrace: 11 microseconds per call using StackWalker API: 1.8 microseconds per call using SecurityManager: 0.9 microseconds per call While a six fold improvement (StackWalker compared to new Thorowable) is nothing to sneeze at, the performance of StackWalker is not as good as with a custom SecurityManager. I have not said so explicitly but the aim here is to allow the user to obtain a new logger by calling LoggerFactory.getLogger() with no arguments with the returned logger named after the caller class. Spending 1 or 2 microseconds for this call is OK if the logger is a static field. However, if the logger is an instance field, then spending 2 microseconds per host object instance just to obtain a logger might have a noticeable impact on performance. It follows that the performance of LoggerFactory.getLogger() must be exceptionally good, assuming we wish to avoid having the user accidentally shooting herself on the foot, ergo the 100 nanosecond performance per call requirement. Noting that invoking MethodHandles.lookup().lookupClass() seems very fast (about 2 nanoseconds), I would be very interested if new lookup(int index) method were added to java.lang.invoke.MethodHandles class, with index designating the desired index on the call stack. Does the above make sense? How difficult would it be to add such a method? -- Ceki Gülcü On 4/6/2022 5:52 PM, Remi Forax wrote: > - Original Message - >> From: "Ceki Gülcü" >> To: "core-libs-dev" >> Sent: Wednesday, April 6, 2022 5:30:51 PM >> Subject: fast way to infer caller > >> Hello, > > Hello, > >> >> As you are probably aware, one of the important primitives used in >> logging libraries is inferring the caller of a given logging statement. >> The current common practice is to create a throwable and process its >> stack trace. This is rather wasteful and rather slow. As an alternative, >> I have tried using the StackWalker API to infer the caller but was >> unsatisfied with the performance. >> >> MethodHandles.lookup().lookupClass() looks very promising except that >> there is no way to specify the depth. >> >> I am looking for a method to obtain the Nth caller at a cost of around >> 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater >> for this use case? > > We have designed the StackWalker with that in mind > https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html > > see the discussion on StackWalker.getCallerClass() > https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#getCallerClass() >
Re: fast way to infer caller
- Original Message - > From: "Ceki Gülcü" > To: "core-libs-dev" > Sent: Wednesday, April 6, 2022 5:30:51 PM > Subject: fast way to infer caller > Hello, Hello, > > As you are probably aware, one of the important primitives used in > logging libraries is inferring the caller of a given logging statement. > The current common practice is to create a throwable and process its > stack trace. This is rather wasteful and rather slow. As an alternative, > I have tried using the StackWalker API to infer the caller but was > unsatisfied with the performance. > > MethodHandles.lookup().lookupClass() looks very promising except that > there is no way to specify the depth. > > I am looking for a method to obtain the Nth caller at a cost of around > 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater > for this use case? We have designed the StackWalker with that in mind https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html see the discussion on StackWalker.getCallerClass() https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#getCallerClass() > > -- > Ceki Gülcü > > Sponsoring SLF4J/logback/reload4j at https://github.com/sponsors/qos-ch regards, Rémi
fast way to infer caller
Hello, As you are probably aware, one of the important primitives used in logging libraries is inferring the caller of a given logging statement. The current common practice is to create a throwable and process its stack trace. This is rather wasteful and rather slow. As an alternative, I have tried using the StackWalker API to infer the caller but was unsatisfied with the performance. MethodHandles.lookup().lookupClass() looks very promising except that there is no way to specify the depth. I am looking for a method to obtain the Nth caller at a cost of around 100 to 200 nanoseconds of CPU time. Do you think the JDK could cater for this use case? -- Ceki Gülcü Sponsoring SLF4J/logback/reload4j at https://github.com/sponsors/qos-ch