Hello!
It is reproducible by letting the test create .java_pid* files for all possible process IDs on the system, setting the correct access flags, launching the target VM and attempting to connect. There are some caveats, but it should be doable. I'll convert the repro script to JTREG and add it to the webrev.

Thanks for the reviews!

/peter

From: serguei.spit...@oracle.com [mailto:serguei.spit...@oracle.com]
Sent: Tuesday, July 9, 2013 1:26 AM
To: daniel.daughe...@oracle.com
Cc: Peter Allwin; serviceability-dev@openjdk.java.net; hotspot-runtime-...@openjdk.java.net
Subject: Re: RFR 7162400: Intermittent java.io.IOException: Bad file number during HotSpotVirtualMachine.executeCommand

Ok, thanks!

Peter, did you manage to reproduce this issue with your script? If so, please include it in the bug report and remove the "noreg-sqe" label. It is OK if you did not reproduce it, though.

Thanks,
Serguei

On 7/8/13 4:20 PM, Daniel D. Daugherty wrote:

I definitely don't insist... :-)

BTW, I noticed this in Peter's e-mail:

> Testing:
> JPRT, reproducing script on Solaris, Linux.

so maybe Peter already has this covered with "reproducing script"...

Dan

On 7/8/13 5:07 PM, serguei.spit...@oracle.com wrote:

Dan,

Thank you for the recommendation. But I'm still not sure it is the right thing to do. Even though there are multiple test cases associated with this bug, they cannot be used to verify the fix, because an additional condition must be present as well. This condition is the presence of a stale door file, which is not that easy to reproduce.

However, if you insist, I can change the label to "noreg-sqe" with the corresponding comment.

Thanks,
Serguei

On 7/8/13 3:46 PM, Daniel D. Daugherty wrote:

Serguei,

There are a number of existing tests associated with this bug. I don't think that 'noreg-hard' is the right label. I think 'noreg-sqe' is the right one:

    noreg-sqe
        Change can be verified by running an existing SQE test suite;
        the bug should identify the suite and the specific test case(s).

Dan

On 7/8/13 12:59 PM, serguei.spit...@oracle.com wrote:

Peter,

I've added the label "noreg-hard" with a comment to the report. It is not easy to reproduce the issue and demonstrate the fix in a regression test.

Thanks,
Serguei

On 7/8/13 11:36 AM, serguei.spit...@oracle.com wrote:

Hi Peter,

The fix looks good.

Thanks,
Serguei

On 7/8/13 6:54 AM, Peter Allwin wrote:

Hello!

Looking for reviews of this change:
http://cr.openjdk.java.net/~allwin/7162400/webrev.01/

For CR:
http://bugs.sun.com/view_bug.do?bug_id=7162400
https://jbs.oracle.com/bugs/browse/JDK-7162400

Summary:
This change addresses an issue in the Attach API on Solaris, Linux and BSD where an attaching application can receive IOExceptions such as "Bad file number" (Solaris), "Connection refused" (Linux/BSD), or "well-known file is not secure".

The attach process uses a file in the temporary directory as a door (Solaris) or domain socket (Linux, BSD) to communicate with the VM. In certain circumstances stale files can be left in the file system, which can cause the attaching application to believe that the VM is ready to receive a connection when it's not. With this change the stale file will be removed during VM startup.

Note that there is still an issue if we don't have permission to remove the stale file: the attaching process will fail to connect.

Testing:
JPRT, reproducing script on Solaris, Linux.

Credits:
Thanks to Staffan Larsen who worked on this issue with me.

Regards,
Peter
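
For readers following the summary above, here is a minimal Java sketch of the idea of removing a stale attach file at startup. It is only an illustration and not the code in the webrev: the actual change lives in the VM itself. The class name StaleAttachFile and its helper methods are hypothetical. The only detail taken from this thread is the .java_pid<pid> file name in the temporary directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not the HotSpot implementation.
class StaleAttachFile {

    // Builds the attach file path for a pid, using the .java_pid<pid>
    // naming mentioned in the thread.
    static Path attachFile(Path tmpDir, long pid) {
        return tmpDir.resolve(".java_pid" + pid);
    }

    // Deletes a leftover attach file for the given pid, if there is one.
    // Returns true if a file was removed, false if none existed.
    // An IOException here corresponds to the remaining problem noted in
    // the summary: the file exists but cannot be removed.
    static boolean removeStale(Path tmpDir, long pid) throws IOException {
        return Files.deleteIfExists(attachFile(tmpDir, pid));
    }

    public static void main(String[] args) throws IOException {
        // Use a scratch directory rather than the real temporary directory.
        Path tmpDir = Files.createTempDirectory("attach-demo");
        long pid = 12345L; // arbitrary pid chosen for the demo

        // Simulate a file left behind by an earlier process with the same pid.
        Files.createFile(attachFile(tmpDir, pid));

        System.out.println("removed stale file: " + removeStale(tmpDir, pid)); // true
        System.out.println("second attempt: " + removeStale(tmpDir, pid));     // false

        Files.delete(tmpDir);
    }
}
```

The sketch lets the IOException propagate instead of swallowing it, since the case where the file cannot be deleted is exactly the one the summary says still fails.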