On Wed, 17 Mar 2021 17:30:25 GMT, Thomas Stuefe <stu...@openjdk.org> wrote:

>> Arbitrary timeouts have been a reliable source of intermittent failures.
>> 
>> Since we have spent a lot of time analyzing this failure, I think it's 
>> worthwhile to fix it properly, which doesn't seem that complicated. That's 
>> better than the same bug happening again a year later and a different set of 
>> people would spend hours to analyze it again.
>
> I don't think this is CPU starvation but memory exhaustion. _beginthreadex 
> fails with EACCES if it has no resources to start the thread, which in this 
> case probably means memory (the other possibility would be 
> out-of-HANDLE-space but seeing that the child just started I don't see how 
> this could be).
> 
> Should we harden tests against resource starvation like this, or rather 
> require the test machine to be beefy enough for tests? Also, I don't 
> understand, if the child has not enough resources to bring the VM fully up 
> how waiting on either stream would help.

I'm not sure which part should be made more robust: the system or the tests.
If the VM can't tolerate the failure of a thread to start, then it should be a 
fatal VM error, not just an unexpected line on stdout (or it should be able to 
proceed without the extra thread).
It is reasonable to take the position that destroying a process can have 
unpredictable effects.
That said, various iterations have attempted to shut down in an orderly 
fashion, tied to calls to System.exit() or to SIGTERM.
(It would be nice if Windows had a way to cleanly request process 
termination; it has no SIGTERM equivalent.)
Avoiding the issue may be easiest, but it may just hide another problem.
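
To make the orderly vs. forcible distinction concrete, here is a minimal 
sketch of how a caller might ask a child to stop. It is not the code in this 
PR; the class name GracefulStop, the method stop() and the graceMillis 
parameter are made up for illustration. It relies only on 
Process.supportsNormalTermination(), destroy(), destroyForcibly() and 
waitFor(), which exist since Java 9.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class GracefulStop {

    /**
     * Asks the child to terminate, escalating to a forcible kill if needed.
     *
     * @return true if the child was gone within the grace period after a
     *         normal destroy(); false if it had to be destroyed forcibly
     */
    static boolean stop(Process child, long graceMillis)
            throws InterruptedException {
        // False where destroy() is itself a forcible kill, i.e. where the
        // platform offers no way to request a clean termination.
        if (child.supportsNormalTermination()) {
            child.destroy();
            if (child.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        }
        // No orderly path, or the child ignored the request: kill it.
        child.destroyForcibly();
        child.waitFor();
        return false;
    }
}
```

On a platform where supportsNormalTermination() returns false, stop() always 
takes the forcible path, so whatever the child was doing at that moment is 
simply cut off.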

-------------

PR: https://git.openjdk.java.net/jdk/pull/3049
