Hi Jiangli,

On 24/05/2019 9:21 am, Jiangli Zhou wrote:
Hi David (and others),

There was a discussion [1] (between you, Jeremy, Martin and others)
back in 2015 regarding a stack size issue caused by a glibc bug
related to TLS (thread-local storage) [2]. The issue manifested as
a StackOverflowError with the reported test in JDK-8130425 [0] when
a large TLS size is used. A workaround was introduced with
-Djdk.lang.processReaperUseDefaultStackSize. Based on the glibc
discussion thread [2], Rust implemented a fix by taking the TLS size
into account. From one of the comments in the OpenJDK discussion
archive [3], it looks like you considered that a similar fix could be
applied to the JVM. I talked to Jeremy about sharing his fix for this particular
issue today. The fix appears to be a more general solution than the
processReaperUseDefaultStackSize workaround. It has been tested/used
for several years and seems to be stable. The link to the changeset is
listed below. Please let me know your thoughts on taking the change in
OpenJDK.

My thoughts haven't really changed since 2015 - and sadly neither has there been any change in glibc in that time. Nor, to my recollection, have there been any other reported issues with this.

If this were to be taken into hotspot then I think it has to be opt-in via a flag so that it doesn't make sudden and unexpected differences in the number of threads an application can create. It may also be worth considering, from the bugzilla discussion, only adding in the TLS size if it is greater than a certain percentage of the stack size being requested. That would limit the impact to threads with small stacks without forcing every thread to have to grow by the TLS size.
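
To make the opt-in and percentage idea concrete, here is a minimal sketch of the size calculation only. It is not taken from the webrev: the function name, the parameters and the threshold are all hypothetical, the TLS size is assumed to be obtained elsewhere, and treating a request of 0 as "use the default stack size" is also an assumption.

```cpp
#include <cstddef>

// Hypothetical helper (illustrative names only, not from the proposed change).
//   requested     - stack size asked for by the caller, in bytes;
//                   0 is assumed here to mean "use the platform default"
//   tls_size      - size of the static TLS area in bytes, obtained elsewhere
//   enabled       - the opt-in flag; when false nothing changes
//   threshold_pct - adjust only if TLS exceeds this percentage of the request
inline std::size_t adjusted_stack_size(std::size_t requested,
                                       std::size_t tls_size,
                                       bool enabled,
                                       unsigned threshold_pct) {
  if (!enabled || requested == 0) {
    return requested;  // existing behaviour is left untouched
  }
  // Divide before multiplying so the product cannot overflow; the integer
  // division makes the threshold approximate, which is fine for a sketch.
  const std::size_t threshold = (requested / 100) * threshold_pct;
  if (tls_size > threshold) {
    return requested + tls_size;  // small stack: make room for the TLS area
  }
  return requested;  // large stack: TLS is a negligible share, leave as is
}
```

With a 10% threshold and 64K of TLS, a 128K request would grow to 192K while an 8M request would be left alone, so only threads with small stacks pay for the adjustment.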

But I'd want to know how often this is actually needed. As Andrew Haley said in the original discussion thread "I think we're rather looking at abuse of TLS here.".

And I'd need to understand better what versions of glibc this would work for (and how they relate to current distros).

Cheers,
David

[0] JDK bug: https://bugs.openjdk.java.net/browse/JDK-8130425
[1] OpenJDK discussion archive:
http://mail.openjdk.java.net/pipermail/core-libs-dev/2015-December/037558.html
[2] glibc discussion archive:
http://sourceware.org/bugzilla/show_bug.cgi?id=11787
[3] change: http://cr.openjdk.java.net/~jiangli/tls_size/webrev/
(contributed by Jeremy Manson)

The #ifdef __GLIBC__ in the change could be removed as os_linux.cpp
already makes assumptions about the use of glibc.

Best regards,
Jiangli
