On 07/09/2013 05:25 PM, Peter Allwin wrote:
Mikael,
That's a good point, unfortunately attach uses os::get_temp_directory which
is hardcoded to use /tmp. We could add a whitebox API to allow us to
override this but now we're on the border to noreg-hard land again IMO.
Any other opinions on this?
Can you use the "-XX:+PauseAtStartup" vm flag it will create a
vm.paused.<pid> file in the current work directory. You could extract
the pid, touch the correct attach file in /tmp and then remove the
vm.paused to let the VM resume.
I didn't check if PauseAtStartup stops the VM early enough though.
An alternate, even more hacky approach is to do something like (in bash):
(bash -c 'echo $$; touch /tmp/.java_pid$$; exec java -version')
Where you can extract the pid of the subshell process with $$ and then
exec into the java launcher and keep the same pid (at least on Linux,
unsure about the Solaris launcher).
/Mikael
Thanks!
/peter
-----Original Message-----
From: Mikael Gerdin [mailto:mikael.ger...@oracle.com]
Sent: Tuesday, July 9, 2013 2:49 PM
To: Peter Allwin
Cc: serguei.spit...@oracle.com; daniel.daughe...@oracle.com;
serviceability-dev@openjdk.java.net; hotspot-runtime-
d...@openjdk.java.net
Subject: Re: RFR 7162400: Intermittent java.io.IOException: Bad file
number
during HotSpotVirtualMachine.executeCommand
Peter,
On 2013-07-09 14:25, Peter Allwin wrote:
Hello!
It is reproducible by letting the test create .java_pid* files for all
possible process id's on the system, setting correct access flags,
launching the target VM and attempting to connect. There are some
caveats though but it should be doable.
I'll convert the repro script to JTREG and add it to the webrev.
It's probably not a good idea to have a test which taints the system with
stale
.java_pid* files.
If the test execution times out and the script isn't allowed to clean up I
imagine that other subsequent executions could fail.
Is there a way to tell the attach api to use a specific directory so you
won't
need to taint /tmp?
/Mikael
Thanks for the reviews!
/peter
*From:*serguei.spit...@oracle.com [mailto:serguei.spit...@oracle.com]
*Sent:* Tuesday, July 9, 2013 1:26 AM
*To:* daniel.daughe...@oracle.com
*Cc:* Peter Allwin; serviceability-dev@openjdk.java.net;
hotspot-runtime-...@openjdk.java.net
*Subject:* Re: RFR 7162400: Intermittent java.io.IOException: Bad file
number during HotSpotVirtualMachine.executeCommand
Ok, thanks!
Peter, did you manage to reproduce this issue with your script?
If so, then, please, include it into the bug report and remove the
"noreg-sqe" label.
It is Ok if you did not reproduce it, though.
Thanks,
Serguei
On 7/8/13 4:20 PM, Daniel D. Daugherty wrote:
I definitely don't insist... :-)
BTW, I noticed this in Peter's e-mail:
> Testing:
> JPRT, reproducing script on Solaris, Linux.
so maybe Peter already has this covered with "reproducing script"...
Dan
On 7/8/13 5:07 PM, serguei.spit...@oracle.com
<mailto:serguei.spit...@oracle.com> wrote:
Dan,
Dan, thank you for the recommendation.
But I'm still not sure it is a right thing to do.
Even though, there are multiple test cases associated with this
bug they
can not be used to verify that fix because an additional
condition
must be present as well.
This condition is a presence of stale door file which is not
that easy to reproduce.
However, if you insist then I can change the lable to the
"noreg-sqe"
with the corresponding comment.
Thanks,
Serguei
On 7/8/13 3:46 PM, Daniel D. Daugherty wrote:
Serguei,
There are a number of existing tests associated with this
bug. I don't
think that 'noreg-hard' is the right label. I think
'noreg-sqe' is
the right one:
noreg-sqe
Change can be verified by running an existing SQE test
suite; the bug
should identify the suite and the specific test
case(s).
Dan
On 7/8/13 12:59 PM, serguei.spit...@oracle.com
<mailto:serguei.spit...@oracle.com> wrote:
Peter,
I've added the label "noreg-hard" with the comment to
the report.
It is not easy to reproduce the issue and demonstrate
the fix in a regression test.
Thanks,
Serguei
On 7/8/13 11:36 AM, serguei.spit...@oracle.com
<mailto:serguei.spit...@oracle.com> wrote:
Hi Peter,
The fix looks good.
Thanks,
Serguei
On 7/8/13 6:54 AM, Peter Allwin wrote:
Hello!
Looking for reviews of this change:
http://cr.openjdk.java.net/~allwin/7162400/webrev.01/
<http://cr.openjdk.java.net/%7Eallwin/7162400/webrev.01/>
For CR:
http://bugs.sun.com/view_bug.do?bug_id=7162400
https://jbs.oracle.com/bugs/browse/JDK-7162400
Summary:
This change addresses an issue in the Attach API
on Solaris, Linux and BSD where an attaching
application can receive IOExceptions such as
"Bad file number" (Solaris), "Connection
refused" (Linux/BSD), or "well-known file is not
secure".
The attach process uses a file in the temporary
directory as a door (Solaris) or domain socket
(Linux,BSD) to communicate with the VM. In
certain circumstances stale files can be left in
the file system which can cause the attaching
application to believe that the VM is ready to
receive a connection when it's not. With this
change the stale file will be removed during VM
startup.
Note that there is still an issue if we don't
have permission to remove the stale file, the
attaching process will fail to connect.
Testing:
JPRT, reproducing script on Solaris, Linux.
Credits:
Thanks to Staffan Larsen who worked on this
issue with me.
Regards,
Peter