RE: RFR 7162400: Intermittent java.io.IOException: Bad file number during HotSpotVirtualMachine.executeCommand

Peter Allwin Tue, 09 Jul 2013 05:27:17 -0700

Hello!

It is reproducible by letting the test create .java_pid* files for all
possible process IDs on the system, setting the correct access flags,
launching the target VM, and attempting to connect. There are some caveats,
but it should be doable.
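
A minimal Java sketch of the file-creation step of such a repro, for
illustration only. The class and method names are hypothetical, the
directory and pid range are passed in rather than assumed, and owner-only
read/write is an assumption about what "correct access flags" means here.
It does not launch the target VM or attempt to attach.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the webrev or the JDK.
final class StaleAttachFiles {

    // File name pattern taken from the ".java_pid*" files mentioned above.
    static String attachFileName(long pid) {
        return ".java_pid" + pid;
    }

    // Creates an empty file for every pid in [firstPid, lastPid] that does not
    // already have one, and returns the files that were actually created.
    static List<Path> createStaleFiles(Path dir, long firstPid, long lastPid) {
        List<Path> created = new ArrayList<>();
        for (long pid = firstPid; pid <= lastPid; pid++) {
            Path file = dir.resolve(attachFileName(pid));
            try {
                Files.createFile(file);
            } catch (FileAlreadyExistsException e) {
                continue; // leave existing files (possibly live VMs) alone
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            restrictToOwner(file.toFile());
            created.add(file);
        }
        return created;
    }

    // Assumption: "correct access flags" means owner read/write only.
    static void restrictToOwner(File f) {
        f.setReadable(false, false);
        f.setWritable(false, false);
        f.setExecutable(false, false);
        f.setReadable(true, true);
        f.setWritable(true, true);
    }
}
```

Returning the list of created files lets a test clean up exactly what it
made and nothing else.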

I'll convert the repro script to JTREG and add it to the webrev.

Thanks for the reviews!

/peter 

From: [email protected] [mailto:[email protected]] 
Sent: Tuesday, July 9, 2013 1:26 AM
To: [email protected]
Cc: Peter Allwin; [email protected];
[email protected]
Subject: Re: RFR 7162400: Intermittent java.io.IOException: Bad file number
during HotSpotVirtualMachine.executeCommand

Ok, thanks!

Peter, did you manage to reproduce this issue with your script?
If so, then, please, include it into the bug report and remove the
"noreg-sqe" label.

It is Ok if you did not reproduce it, though.

Thanks,
Serguei

On 7/8/13 4:20 PM, Daniel D. Daugherty wrote:

I definitely don't insist... :-)

BTW, I noticed this in Peter's e-mail:

> Testing:
> JPRT, reproducing script on Solaris, Linux.

so maybe Peter already has this covered with "reproducing script"...

Dan

On 7/8/13 5:07 PM, [email protected] wrote:

Dan,

Thank you for the recommendation.
But I'm still not sure it is the right thing to do.
Even though there are multiple test cases associated with this bug, they
cannot be used to verify the fix, because an additional condition
must be present as well.
That condition is the presence of a stale door file, which is not easy to
reproduce.

However, if you insist, I can change the label to "noreg-sqe"
with a corresponding comment.

Thanks,
Serguei

On 7/8/13 3:46 PM, Daniel D. Daugherty wrote:

Serguei,

There are a number of existing tests associated with this bug. I don't
think that 'noreg-hard' is the right label. I think 'noreg-sqe' is
the right one:

noreg-sqe
    Change can be verified by running an existing SQE test suite; the bug
    should identify the suite and the specific test case(s).

Dan

On 7/8/13 12:59 PM, [email protected] wrote:

Peter,

I've added the label "noreg-hard" to the report, with a comment.
It is not easy to reproduce the issue and demonstrate the fix in a
regression test.

Thanks,
Serguei

On 7/8/13 11:36 AM, [email protected] wrote:

Hi Peter,

The fix looks good.

Thanks,
Serguei

On 7/8/13 6:54 AM, Peter Allwin wrote:

Hello!

Looking for reviews of this change:

http://cr.openjdk.java.net/~allwin/7162400/webrev.01/

For CR:

http://bugs.sun.com/view_bug.do?bug_id=7162400

https://jbs.oracle.com/bugs/browse/JDK-7162400

Summary:

This change addresses an issue in the Attach API on Solaris, Linux and BSD
where an attaching application can receive IOExceptions such as "Bad file
number" (Solaris), "Connection refused" (Linux/BSD), or "well-known file is
not secure". 

The attach process uses a file in the temporary directory as a door
(Solaris) or domain socket (Linux, BSD) to communicate with the VM. In
certain circumstances stale files can be left in the file system which can
cause the attaching application to believe that the VM is ready to receive a
connection when it's not. With this change the stale file will be removed
during VM startup.

Note that there is still an issue if we don't have permission to remove the
stale file: in that case the attaching process will still fail to connect.
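
The startup cleanup and its remaining limitation can be sketched roughly as
follows. This is a Java illustration of the idea only, not the HotSpot
change in the webrev; the class, method and enum names are hypothetical,
and the ".java_pid" + pid file name is assumed from the .java_pid* files
discussed in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration, not the code under review.
final class StaleFileCleanup {

    enum Result { NOTHING_TO_REMOVE, REMOVED, COULD_NOT_REMOVE }

    // Called conceptually at VM startup, before the VM's own attach file
    // exists: anything already present under this pid's name must be stale.
    static Result removeStaleFile(Path tmpDir, long pid) {
        Path file = tmpDir.resolve(".java_pid" + pid);
        try {
            return Files.deleteIfExists(file)
                    ? Result.REMOVED
                    : Result.NOTHING_TO_REMOVE;
        } catch (IOException | SecurityException e) {
            // For example, no permission to delete the file. The stale file
            // stays in place, so an attaching process can still be misled.
            return Result.COULD_NOT_REMOVE;
        }
    }
}
```

The COULD_NOT_REMOVE outcome corresponds to the case noted above where the
stale file cannot be deleted and the attach attempt still fails.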

Testing:

JPRT, reproducing script on Solaris, Linux.

Credits:

Thanks to Staffan Larsen who worked on this issue with me.

Regards,

Peter
