The process grim reaper ends up being the first point of failure because it lives in a core library and asks for a small stack so as not to waste the user's memory, but in principle it's not special. I think a more general workaround would be to add a HotSpot flag that adds a memory safety zone to every thread. If it's known that TLS is stealing 32k from every thread's stack, then the flag should ensure that every thread stack is 32k larger.
More generally, I was hoping that HotSpot would guarantee that the full -Xss size is available for actual Java stack frames, with the OS overhead added on automatically. That is hard to implement, though, so the best alternative is to let users specify the additional stack size padding themselves. Maybe -XX:StackSizeOverhead=32768?
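
Until something like that exists, the same padding can be approximated per thread from Java code. This is only a rough sketch of the idea, not what HotSpot does: the class and method names are made up for illustration, the 32k overhead is just the figure assumed above, and the stackSize argument to the Thread constructor is documented as a hint that the VM is free to round or ignore.

```java
public final class PaddedStackThreads {
    // Assumed per-thread overhead (e.g. TLS). The 32k figure is the
    // example from this discussion, not a measured value.
    public static final long ASSUMED_OVERHEAD_BYTES = 32 * 1024;

    private PaddedStackThreads() {}

    // Stack size to request so that roughly `requestedBytes` remain
    // usable for Java frames after the overhead is taken out.
    public static long paddedStackSize(long requestedBytes, long overheadBytes) {
        if (requestedBytes < 0 || overheadBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return Math.addExact(requestedBytes, overheadBytes);
    }

    // Hypothetical helper: creates a daemon thread whose requested stack
    // size includes the assumed overhead. The VM treats the size as a
    // suggestion only.
    public static Thread newPaddedThread(Runnable task, String name, long requestedBytes) {
        long stackSize = paddedStackSize(requestedBytes, ASSUMED_OVERHEAD_BYTES);
        Thread t = new Thread(null, task, name, stackSize);
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        // Ask for 128k of usable stack; the request passed to the VM is 160k.
        Thread t = newPaddedThread(
                () -> System.out.println("running with padded stack"),
                "padded-worker",
                128 * 1024);
        t.start();
        t.join();
    }
}
```

This only helps for threads the application creates itself. Threads created inside core libraries, like the reaper, would still need the VM-level flag.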