Re: Cek's new log4j vs logback benchmark

2021-08-29 Thread Carter Kozak
I've pushed some changes I've been using to validate performance here: 
https://github.com/carterkozak/logback-perf/pull/new/ckozak/sandbox

Using the Linux perf-norm profiler, the direct (garbage free) encoders on my 
machine use about 5000 instructions per operation while the byte-array method 
uses about 3500 instructions per operation.

The allocations appear to be small and short-lived enough to fit nicely into 
TLAB. If I disable TLAB with '-XX:-UseTLAB', the gc-free implementation isn't 
quite as heavily impacted as the byte-array encoder.

Interestingly, performance suffers dramatically when I introduce Unicode 
characters to the log line. It looks like the garbage-free implementation fares 
better than the byte-array implementation by about 5%. I need to dig in a bit 
more here and re-validate the benchmarking results.
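
For readers following along, here is a minimal sketch of the two encoding
styles being compared. It is an illustration only, not the Log4j or Logback
encoder code: the class and method names (EncoderSketch, encodeAllocating,
encodeReusing) and the 8 KiB buffer size are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names and buffer size are assumptions.
final class EncoderSketch {
    // Reused across calls, so an instance must not be shared between threads.
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    private final ByteBuffer buffer = ByteBuffer.allocate(8 * 1024);

    // Byte-array style: allocates a fresh byte[] for every log line.
    static byte[] encodeAllocating(String line) {
        return line.getBytes(StandardCharsets.UTF_8);
    }

    // Direct style: encodes into a reused ByteBuffer.
    // CharBuffer.wrap still allocates a small wrapper object here; a truly
    // garbage-free encoder would reuse its CharBuffer as well.
    ByteBuffer encodeReusing(CharSequence line) {
        buffer.clear();
        encoder.reset();
        CoderResult result = encoder.encode(CharBuffer.wrap(line), buffer, true);
        if (result.isUnderflow()) {
            result = encoder.flush(buffer);
        }
        if (!result.isUnderflow()) {
            // Overflow (line larger than the buffer) or malformed input.
            throw new IllegalStateException("encoding failed: " + result);
        }
        buffer.flip(); // make the encoded bytes readable by the caller
        return buffer;
    }
}
```

Both paths should produce identical bytes for ASCII and non-ASCII input; the
difference is only where the output lives and how much is allocated per call.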

On Sat, Aug 28, 2021, at 17:58, Remko Popma wrote:
> On Sat, Aug 28, 2021 at 3:16 PM Ron Grabowski wrote:
> 
> >  Follow-up to "Formatting the date is expensive. The process of actually
> > formatting a value is reasonable". Is this still an issue from LOG4J2-930:
> > %m %ex%n: 1,755,573 msg/sec
> > %d %m %ex%n: 1,194,184 msg/sec
> >
> 
> No, I believe that formatting the date is no longer the bottleneck.
> The analysis done in LOG4J2-930 led to
> https://issues.apache.org/jira/browse/LOG4J2-1097 which resulted in the
> FixedDateFormat.
> This format gives a good trade-off between speed and flexibility.
> The drawback is that it only works for some fixed formats, but those are
> the most widely used formats.
> I don't think that focusing on the date formatting and pregenerating
> formatted timestamps will be fruitful (but I could be wrong).
> 
> Avoiding the PreciseTime Instant and using System.currentTimeMillis when it
> is known that none of the downstream formatters require sub-millisecond
> precision may be a good optimization.
> 
> The message formatting (PatternLayout) as a whole is expensive somehow,
> there may be more to optimize there.
> 
> 
> > If so, isn't date rendering essentially a sequence we can generate ahead
> > of time similar to how a local ID generator asks for an allocation of keys
> > then uses that to quickly assign IDs to new objects? When it's time to
> > render %d we can just grab it via an index:
> >
> > 1) At startup calculate the next 32k formatted dates. If
> > Clock.currentTimeMillis() were configured down to the second, 32000 seconds
> > would pre-allocate %d for the next 8 hours.
> > 2) Apply math to Clock.currentTimeMillis() to get an index into the buffer.
> > Seconds precision:
> > [10] = "2021-08-28 09:44:31,000"
> > [11] = "2021-08-28 09:44:32,000"
> > [12] = "2021-08-28 09:44:33,000"
> > [13] = "2021-08-28 09:44:34,000"
> > [14] = "2021-08-28 09:44:35,000"
> > [15] = "2021-08-28 09:44:36,000"
> > ...
> > [31999] = "..."
> > 50ms precision:
> > [10] = "2021-08-28 09:44:31,050"
> > [11] = "2021-08-28 09:44:31,075"
> > [12] = "2021-08-28 09:44:31,100"
> > [13] = "2021-08-28 09:44:31,150"
> > [14] = "2021-08-28 09:44:31,175"
> > [15] = "2021-08-28 09:44:31,200"
> > ...
> > [31999] = "..."
> >
> > 3) Rendering %d{SEQ(DEFAULT,32000)} is just an index lookup into the
> > sequence of 32000 pre-calculated %d{DEFAULT} values without the cost of
> > formatting. I made up the "SEQ" notation, there's likely a better way to
> > express the feature. Everything can read from the buffer without locking.
> > 4) Have a background thread casually keep the sequence filled in a ring so
> > dates in the past are replaced with future dates so the structure consumes
> > a consistent amount of memory.
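
The steps quoted above could be sketched roughly as follows, at seconds
precision only. This is a hypothetical illustration, not anything in Log4j:
the class name PrecomputedDates, the UTC zone, and the fallback to on-demand
formatting are assumptions, and the background refill from step 4 is left
out to keep the example small.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of the pre-generated timestamp table (seconds precision).
final class PrecomputedDates {
    // UTC is an assumption for the example; a real layout would use its configured zone.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS").withZone(ZoneOffset.UTC);

    private final long startSecond;
    private final String[] formatted;

    // Step 1: format the next `size` seconds up front.
    PrecomputedDates(long startMillis, int size) {
        this.startSecond = Math.floorDiv(startMillis, 1000L);
        this.formatted = new String[size];
        for (int i = 0; i < size; i++) {
            formatted[i] = FORMAT.format(Instant.ofEpochSecond(startSecond + i));
        }
    }

    // Steps 2 and 3: turn the clock value into an index and look it up.
    String format(long epochMillis) {
        long second = Math.floorDiv(epochMillis, 1000L);
        long index = second - startSecond;
        if (index >= 0 && index < formatted.length) {
            return formatted[(int) index];
        }
        // Outside the pre-generated window: format on demand instead.
        return FORMAT.format(Instant.ofEpochSecond(second));
    }
}
```

Note that the table is immutable after construction here, which is what makes
lock-free reads trivially safe; a refilling ring would need more care.
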
> > On Friday, August 27, 2021, 10:07:59 PM EDT, Carter Kozak <
> > cko...@ckozak.net> wrote:
> >
> >  Thanks, Remko. The default '%d' uses FixedDateFormat with
> > FixedFormat.DEFAULT. The FastDateFormat alternative does not support
> > microseconds, so it doesn't suffer from the same problem. I think I can
> > substantially reduce the frequency we re-format dates by checking
> > FixedFormat.secondFractionDigits to determine if we need to compare
> > microseconds.
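
A rough sketch of that caching idea, under stated assumptions: the class
CachedTimestamp, its fields, and the formatCount counter are hypothetical
and exist only for illustration; the constructor argument stands in for
FixedFormat.secondFractionDigits, and this is not the Log4j implementation.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch: re-format only when the part of the timestamp that is
// actually rendered has changed. Not thread-safe.
final class CachedTimestamp {
    private final boolean needsSubMilli;
    private final DateTimeFormatter formatter;

    private long cachedMillis;
    private int cachedNanoOfMilli;
    private String cachedText;
    int formatCount; // counts real formatting work, for illustration only

    // secondFractionDigits is a stand-in for the pattern's fraction width.
    CachedTimestamp(int secondFractionDigits) {
        this.needsSubMilli = secondFractionDigits > 3;
        this.formatter = DateTimeFormatter
                .ofPattern(needsSubMilli ? "yyyy-MM-dd HH:mm:ss,SSSSSS" : "yyyy-MM-dd HH:mm:ss,SSS")
                .withZone(ZoneOffset.UTC); // UTC is an assumption for the example
    }

    String format(long epochMillis, int nanoOfMilli) {
        boolean sameMilli = epochMillis == cachedMillis;
        // Sub-millisecond digits only matter if the pattern renders them.
        boolean sameNanos = !needsSubMilli || nanoOfMilli == cachedNanoOfMilli;
        if (cachedText != null && sameMilli && sameNanos) {
            return cachedText;
        }
        cachedMillis = epochMillis;
        cachedNanoOfMilli = nanoOfMilli;
        cachedText = formatter.format(Instant.ofEpochMilli(epochMillis).plusNanos(nanoOfMilli));
        formatCount++;
        return cachedText;
    }
}
```

With a millisecond pattern, two events in the same millisecond share one
formatted string even if their nanosecond parts differ; with a microsecond
pattern they do not.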
> >
> > On Fri, Aug 27, 2021, at 16:10, Remko Popma wrote:
> > > I remember looking at PatternLayout performance, I reported my findings
> > > here, hopefully they’re still useful:
> > > https://issues.apache.org/jira/browse/LOG4J2-930
> > >
> > > If %d is used in the pattern, does the FixedDateFormat get used?
> > >
> > >
> > >
> > >
> > > > > On Aug 28, 2021, at 4:33, Ralph Goers wrote:
> > > >
> > > > All of that agrees with my observations as well.
> > > >
> > > > Ralph
> > > >
> > > >> On Aug 27, 2021, at 12:23 PM, Carter Kozak  wrote:
> > > >>
> > > >> I've identified a few things that seem impactful, but bear in mind
> > > >> that I haven't begun to drill down into them yet. I plan to file individual
> > > >> tickets and investigate in greater depth later on:
> > > >>
> > > >> 1. Formatting the date is expensive. The process of actually
> > > >> formatting a value is reasonable, however using a precise clock appears to
> > > >> cause cache misses even when the pattern results

Re: The diff between XSDs in master and release-2.x

2021-08-29 Thread Gary Gregory
I don't think there is a real reason...

Gary

On Wed, Aug 25, 2021, 11:03 Volkan Yazıcı  wrote:

> Log4j-config.xsd in master is substantially more complete compared to the
> one in release-2.x. Is there a reason why we did not backport these changes
> from master to release-2.x?
>