Ralph, could you point me toward the test that was failing for you? I'd fixed 
up a few and thought everything was passing. The branch should be ready to 
merge unless there are test flakes, then I'll fix them up. I'm also planning to 
turn off direct encoders by default on Java 9 and newer in a separate change.
I've been busy with work and life things, and haven't been as proactive with 
this as I'd like.

-ck

On Mon, Oct 4, 2021, at 03:35, Ralph Goers wrote:
> I don’t believe Carter’s latest changes have been merged. But when I tried to 
> build 
> Carter’s branch one unit test kept failing. I believe the problem is in the 
> test - some 
> sort of timing issue.
> 
> I have some documentation updates I need to make as well as my usual pass 
> through 
> open issues and PRs. I just finished a project at work that has been 
> consuming a 
> lot of my time so I should be able to work on that this week.
> 
> It would be really nice to update our performance page as it is seriously out 
> of date as 
> the tests are all based on Log4j 2 2.6. I am reluctant to do that though 
> since I cannot 
> replicate the results Remko published there. Although Remko documented info 
> on the 
> CPU that was used the page doesn’t mention what type of disks were used and 
> how 
> fast they are. I’ve run the tests on my Mac with the SSD that comes with it 
> and have 
> never been able to verify the results.
> 
> Ralph
> 
> > On Oct 4, 2021, at 12:09 AM, Volkan Yazıcı <vol...@yazi.ci> wrote:
> > 
> > Gentlemen, what is the status of the progress on this front? Is it in a
> > "good enough" state for the 2.15.0 release?
> > 
> > On Thu, Sep 23, 2021 at 6:29 PM Carter Kozak <cko...@ckozak.net> wrote:
> > 
> >> Thanks, Ralph!
> >> 
> >> You're correct that the direct encoderes are required for GC-free logging,
> >> some of our consumers rely on that
> >> for niche (low latency) workloads where it's important to understand
> >> precisely where a pause may occur.
> >> GC performance has become much better over the last several years, and the
> >> throughput of the allocation-
> >> heavy workflow has increased significantly, however there are a few things
> >> we should keep in mind:
> >> 
> >> 1. Java 8 performs better with direct encoders (although afaict the
> >> performance isn't any better than java 9+,
> >>    string.getBytes just has several new optimizations
> >> 2. If this[1] change is merged into openjdk, the direct encoders may once
> >> again be the best-performing option,
> >>    I'm planning to put together a local build for comparison later.
> >> 3. My testing has primarily targetted UTF-8 as it's the only charset we
> >> target at work, and the default charset
> >>    on most modern servers. The jdk has a great deal of optimization which
> >> special cases UTF-8 for this reason,
> >>    especially from string.getBytes. Other charsets are likely to perform
> >> differently.
> >> 
> >> I'm in favor of changing the default value of "direct.encoders" based on
> >> the java version, for example
> >> Java 8 should continue to use them, while versions after compact-strings
> >> was introduced are better
> >> off without direct encoders until the jdk can be improved using something
> >> like the linked PR.
> >> 
> >> 1. https://github.com/openjdk/jdk/pull/5621#issuecomment-925413767
> >> 
> >> -ck
> >> 
> >> On Thu, Sep 23, 2021, at 11:51, Ralph Goers wrote:
> >>> I ran Ceki’s benchmarks against 2.14.1, 2.15.0-SNAPSHOT and Carter’s
> >> branch. I ran them with 16 threads on my
> >>> MacBook Pro and ran each test twice since the tests show some
> >> variability from time to time.
> >>> 
> >>> I can draw 2 conclusions from this.
> >>> 1. The improvements Carter has made to 2.15.0 have already made a
> >> noticeable improvement.
> >>> 2. His new changes show a definite improvement on top of the 2.15.0
> >> improvements when enable direct encoders is false.
> >>> 
> >>> These results definitely make me wonder if we should just remove the
> >> direct encoders logic at least in master. From what
> >>> I can tell it makes things complicated and makes no noticeable impact on
> >> performance in 2.15.0 even before Carter’s
> >>> changes. Are they required for gc-free? Can we make direct.encoders =
> >> false the default if it isn’t now?
> >>> 
> >>> I apologize if the performance results don’t look good in your email.
> >> They are formatted with a fixed-width font but it is
> >>> likely they will be messed up once they reach your mail client.
> >>> 
> >>> Ralph
> >>> 
> >>> 2.14.1 - enable direct encoders = false
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1221.995 ±
> >> 150.876  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1495.072 ±
> >> 41.715  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1529.807 ±
> >> 55.876  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   681.690 ±
> >> 17.631  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1134.543 ±
> >> 35.065  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1546.811 ±
> >> 95.721  ops/ms
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1103.526 ±
> >> 78.201  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1402.669 ±
> >> 144.374  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1475.666 ±
> >> 87.885  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   582.841 ±
> >> 57.156  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10   860.587 ±
> >> 185.170  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1560.206 ±
> >> 323.229  ops/ms
> >>> 
> >>> 2.14.1 - enable direct encoders = true
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1250.808 ±
> >> 116.848  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1564.926 ±
> >> 146.195  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1703.731 ±
> >> 183.304  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10  736.189 ±
> >> 40.798  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1070.958 ±
> >> 97.173  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1350.507 ±
> >> 225.104  ops/ms
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1218.639 ±
> >> 141.343  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1525.081 ±
> >> 112.058  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1724.925 ±
> >> 169.680  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   680.453 ±
> >> 32.330  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1054.517 ±
> >> 52.147  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1353.180 ±
> >> 224.415  ops/ms
> >>> 
> >>> 2.15.0-SNAPSHOT - enable direct encoders = false
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1121.838 ±
> >> 112.026  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1606.602 ±
> >> 217.776  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1607.875 ±
> >> 226.656  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10  694.809 ±
> >> 33.380   ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1500.557 ±
> >> 402.779  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1387.306 ±
> >> 213.081  ops/ms
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1199.591 ±
> >> 149.157 ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1744.558 ±
> >> 190.732 ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1718.862 ±
> >> 142.328 ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   604.785 ±
> >> 78.884  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1604.844 ±
> >> 712.807 ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1578.756 ±
> >> 109.343 ops/ms
> >>> 
> >>> 2.15.0-SNAPSHOT - enable direct encoders = true
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1244.029 ±
> >> 61.842  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1564.244 ±
> >> 263.322  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1648.783 ±
> >> 217.936  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10  717.024 ±
> >> 43.631   ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1513.402 ±
> >> 438.311  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1425.745 ±
> >> 231.201  ops/ms
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1207.924 ±
> >> 136.763  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1514.119 ±
> >> 98.725  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1620.834 ±
> >> 238.498  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   716.294 ±
> >> 65.527  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1523.430 ±
> >> 387.178  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1402.748 ±
> >> 312.713  ops/ms
> >>> 
> >>> Carter 2.15.0-SNAPSHOT - enable direct encoders = false
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1351.481 ±
> >> 68.356  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  2048.458 ±
> >> 118.329  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1762.185 ±
> >> 155.733  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   693.222 ±
> >> 94.051  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1707.576 ±
> >> 299.891  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1439.624 ±
> >> 60.957  ops/ms
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1230.511 ±
> >> 111.351  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1950.005 ±
> >> 166.843  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1783.227 ±
> >> 53.359  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   702.814 ±
> >> 38.263  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1738.084 ±
> >> 344.135  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1463.187 ±
> >> 158.787  ops/ms
> >>> 
> >>> Carter 2.15.0-SNAPSHOT - enable direct encoders = true
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1212.045 ±
> >> 165.745  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1671.374 ±
> >> 190.449  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1637.298 ±
> >> 169.134  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   708.106 ±
> >> 30.035  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1523.321 ±
> >> 429.586  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1418.236 ±
> >> 233.527  ops/ms
> >>> 
> >>> Benchmark                                        Mode  Cnt    Score
> >> Error  Units
> >>> AsyncWithFileAppenderBenchmark.log4j1File       thrpt  10  1189.572 ±
> >> 117.298  ops/ms
> >>> AsyncWithFileAppenderBenchmark.log4j2AsyncFile  thrpt  10  1499.633 ±
> >> 186.132  ops/ms
> >>> AsyncWithFileAppenderBenchmark.logbackFile      thrpt  10  1631.038 ±
> >> 214.951  ops/ms
> >>> FileAppenderBenchmark.log4j1File                thrpt  10   697.329 ±
> >> 48.757  ops/ms
> >>> FileAppenderBenchmark.log4j2File                thrpt  10  1542.492 ±
> >> 395.759  ops/ms
> >>> FileAppenderBenchmark.logbackFile               thrpt  10  1321.256 ±
> >> 148.364  ops/ms
> >>> 
> >>> 
> >>>> On Sep 22, 2021, at 12:30 AM, Volkan Yazıcı <vol...@yazi.ci> wrote:
> >>>> 
> >>>> That's great news! Then I will be looking forward to your signal for
> >>>> putting a ribbon onto 2.15.0.
> >>>> 
> >>>> I have followed the conversation with Claes Redestad from Oracle on
> >> Twitter
> >>>> <https://twitter.com/carter_kozak/status/1433798391604162561>. My
> >>>> conclusion was also that there apparently is no way to make
> >> CharsetEncoder
> >>>> beat .toString().getBytes() in Java 9+, until a CE specialization gets
> >>>> introduced similar to what has been done in .toString().getBytes().
> >>>> Nevertheless, I was also (naively?) thinking about a similar branching
> >>>> strategy like the one you proposed as a temporary solution: preJava9 ?
> >>>> useCE() : useStringGetBytes(). Would you be able to get this done?
> >>>> 
> >>>> Another idea I thought of yesterday evening was to introduce our own
> >>>> hand-written StringBuilder- or char[]-to-byte[] encoders for common
> >> cases,
> >>>> i.e., ASCII and UTF-8. What do you think?
> >>>> 
> >>>> 
> >>>> On Wed, Sep 22, 2021 at 3:48 AM Carter Kozak <cko...@ckozak.net>
> >> wrote:
> >>>> 
> >>>>> Thanks, Volkan!
> >>>>> 
> >>>>> Rerunning the benchmarks on my branch (specifically the PatternLayout
> >>>>> benchmarks in log4j-perf) produced much better improvements than I had
> >>>>> anticipated. Beyond that, the date format cache invalidation problem
> >>>>> resulted in a substantial speedup. I agree that it would be helpful
> >> to get
> >>>>> a release out the door once this is merged.
> >>>>> 
> >>>>> Re getBytes vs CharsetEncoder, I don't want to use the unsafe hack I
> >> put
> >>>>> together in my benchmark project, that was just for experimentation
> >> :-)
> >>>>> Future java releases (or changes in minor patch releases) could cause
> >> it to
> >>>>> fail in frightening ways. We may be better off recommending the
> >> getBytes
> >>>>> approach for now on some java versions (possibly by changing our
> >> default on
> >>>>> java 9+).
> >>>>> Claes has a potential change[1] which appears to buy us a great deal
> >> of
> >>>>> performance in future Javas (assuming it is merged) and we may be
> >> able to
> >>>>> engage for additional encoding APIs, for example something like this
> >> could
> >>>>> avoid allocations and additional buffers:
> >>>>> 
> >>>>> /** Returns the number of characters encoded to destination, or -1 if
> >> more
> >>>>> space is needed (equivalent to CoderResult.OVERFLOW) */
> >>>>> int CharSetEncoder.encode(charsequence, inputStart, inputLimit, byte[]
> >>>>> destination, int destOffset, int destLimit)
> >>>>> (I haven't put a great deal of thought into this API and it's getting
> >>>>> late, so pardon any terrible ideas!)
> >>>>> 
> >>>>> 1. https://github.com/openjdk/jdk/pull/5621
> >>>>> 
> >>>>> -ck
> >>>>> 
> >>>>> On Tue, Sep 21, 2021, at 15:31, Volkan Yazıcı wrote:
> >>>>>> First and foremost, fantastic job Carter!
> >>>>>> 
> >>>>>> For #573, I see that Gary and Ralph have already shared some
> >> remarks. I
> >>>>>> would appreciate it if we can get this merged and cut the ribbon for
> >>>>> 2.15.0
> >>>>>> release.
> >>>>>> 
> >>>>>> Regarding
> >> `StringBuilder#toString().getBytes()`-vs-`CharsetEncoder`...
> >>>>> That
> >>>>>> is a tough one. In your hack branch there, I am not sure if using
> >>>>> `Unsafe`
> >>>>>> is a future-proof path forward. I was trying to wrap my mind around
> >>>>> Daniel
> >>>>>> Sun's fast-reflection <
> >> https://github.com/danielsun1106/fast-reflection>
> >>>>>> (for the records, I couldn't) and was triggered by his ASM usage
> >> there. I
> >>>>>> was curious if we could do something similar via ASM to hack
> >>>>>> `CharsetEncoder`? (I am probably talking nonsense, I hope it anyway
> >>>>>> triggers a practical idea for you.)
> >>>>> 
> >>> 
> >>> 
> >> 
> 
> 
> 

Reply via email to