Hi all,
We recently noticed that in several of our production services there are very
frequent allocations / promotions of coder objects from the StringCoding class
(e.g. java.lang.StringCoding$String{Encoder,Decoder} instances). We are pretty
sure that these are allocated to replace the coders cached in the StringCoding
ThreadLocals (we also see SoftReferences being allocated / promoted along with
the coder objects):

    /** The cached coders for each thread */
    private final static ThreadLocal<SoftReference<StringDecoder>> decoder =
        new ThreadLocal<>();
    private final static ThreadLocal<SoftReference<StringEncoder>> encoder =
        new ThreadLocal<>();

We seem to be encoding / decoding with at least two different charsets (we
also see Encoders / Decoders for those two charsets being allocated /
promoted), and this is what causes the churn: the coder allocations are not
caused by the SoftReferences being cleared; instead, the cached coder is
replaced whenever a coder for a different charset is required. Even though we
observed this behavior with JDK 8, it should also exist in JDK 9 (StringCoding
still has the same two ThreadLocals; we can confirm this very easily).
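
For illustration, here is a minimal sketch of the access pattern described
above: a single thread alternating between two charsets. The class name
CoderChurnDemo, the method roundTrips, and the choice of "UTF-8" and
"ISO-8859-1" are assumptions made for the example, not taken from our
services. It also assumes that the charset-name overloads of String.getBytes
and the String constructor go through the single-entry per-thread cache;
whether these two charsets do so on a given JDK release should be checked
against that release's StringCoding source.

```java
import java.io.UnsupportedEncodingException;

public final class CoderChurnDemo {
    // Hypothetical helper: alternates between two charset names on the
    // calling thread, so a single-entry per-thread coder cache would be
    // replaced on (almost) every call. Returns the number of round trips
    // that reproduced the original text.
    static int roundTrips(int iterations) {
        String[] charsetNames = {"UTF-8", "ISO-8859-1"};
        String text = "hello";
        int ok = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                String csn = charsetNames[i % charsetNames.length];
                byte[] bytes = text.getBytes(csn);    // encode by charset name
                String back = new String(bytes, csn); // decode by charset name
                if (back.equals(text)) {
                    ok++;
                }
            }
        } catch (UnsupportedEncodingException e) {
            // Both names are required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(1_000_000));
    }
}
```

Running this under an allocation profiler is one way to check whether coder
and SoftReference instances show up at the rate described above.
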
Has anyone identified this issue before? We believe that caching a small number
of coders per thread (instead of just one) could avoid this unnecessary churn.
We’ll be happy to contribute such a fix if there’s interest in getting it
accepted.
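
To make the proposal concrete, here is a rough sketch of a small per-thread
cache that keeps a few coders instead of one. It is not the StringCoding
implementation and not a proposed patch: the class SmallCoderCache, the limit
MAX_CODERS, and the creation counter are all hypothetical, and it caches
public CharsetEncoder objects rather than the internal StringEncoder. It uses
strong references and LRU eviction, which is one possible design among
several.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public final class SmallCoderCache {
    // Hypothetical limit on the number of encoders kept per thread.
    static final int MAX_CODERS = 4;

    // Counts encoder creations across all threads (for the demonstration only).
    private static final AtomicInteger CREATED = new AtomicInteger();

    // One small access-ordered (LRU) map per thread, holding strong references.
    private static final ThreadLocal<Map<Charset, CharsetEncoder>> ENCODERS =
        ThreadLocal.withInitial(() ->
            new LinkedHashMap<Charset, CharsetEncoder>(8, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(
                        Map.Entry<Charset, CharsetEncoder> eldest) {
                    return size() > MAX_CODERS;
                }
            });

    static int created() {
        return CREATED.get();
    }

    // Returns this thread's cached encoder for cs, creating it on a miss.
    static CharsetEncoder encoderFor(Charset cs) {
        Map<Charset, CharsetEncoder> cache = ENCODERS.get();
        CharsetEncoder enc = cache.get(cs);
        if (enc == null) {
            enc = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE);
            CREATED.incrementAndGet();
            cache.put(cs, enc);
        }
        return enc;
    }

    static byte[] encode(String s, Charset cs) {
        try {
            // encode(CharBuffer) resets the encoder before encoding,
            // so a cached encoder can be reused safely on its own thread.
            ByteBuffer bb = encoderFor(cs).encode(CharBuffer.wrap(s));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a cache like this, a thread that alternates between two charsets creates
each encoder once and then reuses it, instead of replacing a single cached
coder on every switch.
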
Also, why are SoftReferences used here? In our case, if a thread does some
encoding / decoding once, it will likely keep doing so forever. So
the use of SoftReferences here is not very helpful (it adds unnecessary
overhead to the GC, not only due to the extra copying but also due to the extra
work required during reference processing). Was there a specific reason for the
use of SoftReferences here?
Finally, the StringCoding coder constructor allocates a new Charset coder
(Charset{Encoder,Decoder}) for the specific charset. But such Charset coders
already seem to be cached in ThreadLocals in the sun.nio.cs.ThreadLocalCoders
class. Any reason why we cannot re-use those? (Oh, and also note that this
cache does not use SoftReferences, which makes their use by the StringCoding
class even more perplexing.)
(a tip of the hat to my colleague Peter Beaman for discovering this issue)
Tony
-----
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
@TonyPrintezis