Re: Cek's new log4j vs logback benchmark

Carter Kozak Wed, 01 Sep 2021 05:57:09 -0700

Thanks, Volkan!

I had not seen InstantFormatter, it does look helpful, however I think it may
have a bug. It appears to only compare milliseconds of Instant values
while FixedDateFormat has some patterns which support microsecond
and nanosecond precision. Currently I think this will batch together all
events within a millisecond and assign them the microsecond value
of the first cached event.
I think this is what we want: https://github.com/apache/logging-log4j2/pull/576


Good idea reaching out to Claes Redestad, it would be helpful to have
someone more familiar with modern jvm string internals than us take
a look if he's interested!

-ck

On Wed, Sep 1, 2021, at 03:44, Volkan Yazıcı wrote:
> Great work Carter!
> 
> Have you seen `o.a.l.l.layout.template.json.util.InstantFormatter`,
> particularly its `Formatter#isInstantMatching` methods used for
> invalidating the cache? I was thinking of making it even smarter, e.g., if
> the pattern only has seconds, compare `Instant`s by their seconds. I aspire
> to pull it to the core, replace access to all individual formatters with
> this one, and mark the rest deprecated. Another idea I was thinking about
> is enhancing these individual formatters to contain the precision they
> require and use that in `isInstantMatching` methods.
> 
> Regarding your unicode character problems, shall we try pinging Claes
> Redestad (@cl4es), who has recently enhanced String.format() in JDK 17
> <https://twitter.com/cl4es/status/1432361530268528642>, via Twitter?
> 
> On Mon, Aug 30, 2021 at 11:32 PM Carter Kozak <[email protected]> wrote:
> 
> > I've merged the fix for our FixedDateFormat caching bug which caused us to
> > recompute the same millisecond-precision formatted timestamp unnecessarily
> > each time our microsecond-precision clock incremented.
> > https://issues.apache.org/jira/browse/LOG4J2-3153
> >
> > I've also been unwrapping a few layers of complexity, wrapping several
> > layers of components with conditional logic makes it harder for the jit to
> > optimize code, so we can improve things by using separate types based on
> > the requested features:
> > https://github.com/apache/logging-log4j2/pull/573
> > TODO: I'm not happy with the way I unwrapped PatternFormatter objects in
> > this PR, I think it could work better as an optional wrapper around the
> > delegate LogEventPatternConverter (where the default FormattingInfo returns
> > the delegate instance directly)
> > TODO: simplify MessagePatternConverter  a bit, the body is giant for
> > something that's generally relatively simple. The method is too large for
> > me to read in a glance, so I imagine the jit will have a hard time making
> > it fast as well. I don't really like the message-format feature which
> > allows lookups in the formatted message text because it leaks details of
> > the framework implementation/configuration into consumers of logging APIs
> > (which may not even be log4j-core), however I'm not sure how reasonable it
> > would be to change the default to disallow lookups given I'm sure folks
> > depend on that behavior.
> >
> > I'm not sure what to do about the CharsetEncoder vs
> > string.getBytes(charset) issue. The CharsetEncoder does not require
> > allocation and outperforms getBytes when I add a unicode character to the
> > message. When the message contains only ascii characters, getBytes performs
> > better. Using CharBuffer.wrap(StringBuilder) produces substantially worse
> > performance -- it shouldn't be copying the buffer in memory, but I suppose
> > the heap buffer is more efficient to deal with. I need to do more research
> > in this area.
> >
> > Thoughts/ideas/concerns?
> > -ck
> >
>

Re: Cek's new log4j vs logback benchmark

Reply via email to