I have some code where I start an external process (ProcessBuilder.start(), etc.) and then spawn two worker threads to read the external process's stdout and stderr. I read the streams returned by process.getInputStream() and process.getErrorStream() directly; I'm not wrapping them in my own streams or anything. The worker threads simply call java.io.InputStream.read(byte[]) in a loop.
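To make the setup concrete, here is a minimal sketch of the pattern described above. It is not the actual application code: the class name ProcessOutputReader is hypothetical, the worker and task names are only borrowed from the stack traces in this post, and the buffer size and error handling are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class name; a sketch of the pattern, not the real application code.
class ProcessOutputReader {

    /** Worker that drains one stream by calling read(byte[]) until end-of-stream. */
    static class ReadAllBytesTask extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream collected = new ByteArrayOutputStream();
        private volatile IOException failure;

        ReadAllBytesTask(String name, InputStream in) {
            super(name);
            this.in = in;
            setDaemon(true);
        }

        @Override
        public void run() {
            byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
            try {
                int count;
                // read() blocks until data arrives or the stream reports end-of-stream (-1).
                while ((count = in.read(buffer)) != -1) {
                    collected.write(buffer, 0, count);
                }
            } catch (IOException e) {
                failure = e;
            }
        }

        /** Bytes read so far; complete once the thread has finished. */
        byte[] getBytes() {
            return collected.toByteArray();
        }

        /** The exception that ended the read loop, or null if there was none. */
        IOException getFailure() {
            return failure;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java ProcessOutputReader <command> [args...]");
            return;
        }
        Process process = new ProcessBuilder(args).start();

        // One worker per stream, reading the process streams directly.
        ReadAllBytesTask stdout = new ReadAllBytesTask("MainWorker", process.getInputStream());
        ReadAllBytesTask stderr = new ReadAllBytesTask("StdErrWorker", process.getErrorStream());
        stdout.start();
        stderr.start();

        int exitCode = process.waitFor();
        // These joins are expected to return shortly after the process exits.
        stdout.join();
        stderr.join();

        System.out.println("exit code: " + exitCode);
        System.out.println("stdout bytes: " + stdout.getBytes().length);
        System.out.println("stderr bytes: " + stderr.getBytes().length);
    }
}
```

In this sketch each read loop ends only when read() returns -1, so the two join() calls at the end are where a hang like the one described in this post would show up.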
I've encountered a situation where the worker threads hang even though the process has already terminated (Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode), Windows 7). I was able to catch this while running the Java program under the debugger. I invoked process.exitValue() in the debugger to see whether the JVM had indeed noticed that the process had terminated. It returned 0, so it seems it knows the process has terminated. Yet the stream reads are still blocked in a native method.

The stdout worker thread is stuck here:

    Daemon Thread [ExternalProcessEclipseHelper.MainWorker] (Suspended)
        owns: BufferedInputStream (id=145)
        FileInputStream.readBytes(byte[], int, int) line: not available [native method]
        FileInputStream.read(byte[], int, int) line: 272
        BufferedInputStream.fill() line: 235
        BufferedInputStream.read1(byte[], int, int) line: 275
        BufferedInputStream.read(byte[], int, int) line: 334
        BufferedInputStream(FilterInputStream).read(byte[]) line: 107
        ExternalProcessNotifyingHelper$1(ExternalProcessHelper$ReadAllBytesTask).doRun() line: 73

The stderr worker thread is similarly stuck:

    Daemon Thread [ExternalProcessEclipseHelper.StdErrWorker] (Suspended)
        FileInputStream.readBytes(byte[], int, int) line: not available [native method]
        FileInputStream.read(byte[]) line: 243
        ExternalProcessNotifyingHelper$2(ExternalProcessHelper$ReadAllBytesTask).doRun() line: 73

Could this be a JVM bug? I don't see how this scenario could ever happen, unless some other part of my code somehow violated some contract and corrupted the JVM state.

I've added a sample of the relevant code I'm using here: https://github.com/bruno-medeiros/Scratchpad/tree/jvm-processio-issue

However, I haven't yet been able to reproduce the bug using the isolated code from there. At the moment, I can only reproduce it when I run my full application. The sample code could be simplified further, but I haven't done so yet, since I couldn't reproduce the bug with it.
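The check done by hand in the debugger (the process has exited, yet a reader thread is still alive) could also be automated, which might help when trying to narrow this down outside the debugger. The following is only a sketch: the class name StuckReaderCheck and both helper methods are hypothetical, and the timeout is something the caller would have to pick.

```java
// Hypothetical helper; names and timeout handling are assumptions for illustration.
class StuckReaderCheck {

    /**
     * Waits up to timeoutMillis for the reader thread to finish.
     * Returns true if it finished, false if it is still running.
     * timeoutMillis must be greater than 0, because join(0) waits forever.
     */
    static boolean finishedWithin(Thread reader, long timeoutMillis) throws InterruptedException {
        reader.join(timeoutMillis);
        return !reader.isAlive();
    }

    /** Formats the thread's name and its current stack, one frame per line. */
    static String describeStack(Thread thread) {
        StringBuilder sb = new StringBuilder(thread.getName()).append('\n');
        for (StackTraceElement frame : thread.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }
}
```

The idea would be to call finishedWithin(worker, someTimeout) for each worker after process.waitFor() returns, and log describeStack(worker) whenever it returns false, so that the hang is recorded even in runs without a debugger attached.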
One interesting detail is that I can only reproduce it the first time I run the application in a given computer session. That is, apparently I need to reboot my computer for the bug to manifest again! I'd like to narrow this down, and I would appreciate some help or suggestions for that. What could affect the JVM such that subsequent invocations apparently don't trigger the bug? Some code cache issue? I also wonder if the OSGi runtime could be a factor here.

--
Bruno Medeiros
https://twitter.com/brunodomedeiros