On Thu, 2 Apr 2026 09:47:57 GMT, Roman Kennke <[email protected]> wrote:
>> I analyzed the performance of Thread.setName() in response to a customer
>> workload running Cassandra, where Thread.setName() showed up (mostly because
>> of rather pathetic use of setName() from Cassandra *sigh*).
>>
>> Profiling showed that most time (around 75%) is spent in the actual syscall,
>> so there are limits on what we can do.
>>
>> I implemented the following improvements:
>> - Almost all thread names are Latin1/ASCII, and there is no need to convert
>> to UTF8 in that case. Also, the various OS APIs to set the thread name don't
>> even seem to specify the character encoding. Avoiding the UTF8 conversion
>> (if possible) brings down the length-dependent costs. In many cases we can
>> also pass down the backing array of the string and avoid copying altogether.
>> - For truncating the name on Linux to 16 chars, instead of using snprintf
>> with a pattern, we can simply stitch together the name directly (first 7
>> chars, last 6 chars, 2 dots in between), this saves ~100ns.
>>
>> In the end, we bring down performance for the small cases by ~7%, longer
>> names by ~20% and completely removed the conversion overhead that primarily
>> affected longer names.
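As a rough illustration of the truncation described above, here is a minimal sketch in C. It is not the actual HotSpot change: the helper name `truncate_thread_name` and the constant `THREAD_NAME_MAX` are made up for this example, and it assumes the 16 chars include the trailing NUL, so a truncated name is 7 + 2 + 6 = 15 visible characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit: 16 bytes including the trailing NUL. */
#define THREAD_NAME_MAX 16

/* Hypothetical helper (not the HotSpot code): copy `src` into `dst`,
 * which must hold at least THREAD_NAME_MAX bytes. Names that do not fit
 * are abbreviated as: first 7 chars + ".." + last 6 chars. */
static void truncate_thread_name(char dst[THREAD_NAME_MAX], const char *src) {
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len < THREAD_NAME_MAX) {
        memcpy(dst, src, len + 1);      /* fits: copy including the NUL */
        return;
    }
    memcpy(dst, src, 7);                /* first 7 chars */
    dst[7] = '.';                       /* two dots in between */
    dst[8] = '.';
    memcpy(dst + 9, src + len - 6, 6);  /* last 6 chars */
    dst[15] = '\0';
}
```

With this sketch, a name such as "cassandra-worker-thread-42" becomes "cassand..ead-42", while names of up to 15 characters are copied unchanged. The point is that three fixed-size copies replace format-string parsing, which is where the saving over snprintf would come from.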
>>
>> | Benchmark     | (length) | Baseline (ns/op)  | Optimized (ns/op)  | Change  |
>> |---------------|----------|------------------:|-------------------:|--------:|
>> | setName       | 1        |       602.3 ± 2.0 |        561.9 ± 1.5 |   -6.7% |
>> | setName       | 4        |       605.9 ± 2.1 |        570.2 ± 1.2 |   -5.9% |
>> | setName       | 15       |       617.1 ± 2.7 |        570.4 ± 2.8 |   -7.6% |
>> | setName       | 16       |       712.1 ± 6.0 |        569.4 ± 2.7 |  -20.0% |
>> | setName       | 50       |       757.9 ± 5.2 |        566.3 ± 4.6 |  -25.3% |
>> | setName       | 200      |       986.2 ± 2.7 |        569.9 ± 4.9 |  -42.2% |
>> | setNameSame   | 1        |                 — |          7.4 ± 0.0 |       — |
>> | setNameSame   | 4        |                 — |          7.4 ± 0.0 |       — |
>> | setNameSame   | 15       |                 — |          7.4 ± 0.0 |       — |
>> | setNameSame   | 16       |                 — |          7.4 ± 0.0 |       — |
>> | setNameSame   | 50       |                 — |          7.4 ± 0.0 |       — |
>> | setNameSame   | 200      |                 — |          7.4 ± 0.0 |       — |
>>
>>
>> Testing:
>> - [x] tier1
>> - [ ] tier2
>>
>> The failing GHA test in langtools seems unrelated.
>
> Roman Kennke has updated the pull request incrementally with one additional
> commit since the last revision:
>
> Update src/hotspot/os/bsd/os_bsd.cpp
>
> Co-authored-by: David Holmes <[email protected]>

Hi,

I backported the proposed patch to jdk17u-dev and ran the dacapo/cassandra
benchmark; the results show no difference.

I have no objection to this PR. I am just curious how much performance
improves with this PR on a real Cassandra workload.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30374#issuecomment-4177587887
