Re: RFR(S): 8067030 JDWP crash in transport_startTransport on OOM

2014-12-11 Thread Dmitry Samersoff
Serguei,

Fixed two more missed error checks (in-place, press shift-reload)

http://cr.openjdk.java.net/~dsamersoff/JDK-8067030/webrev.01/

-Dmitry

On 2014-12-10 01:30, serguei.spit...@oracle.com wrote:
 Hi Dmitry,
 
 The fix looks good.
 
 However, there are a couple of more places in that file
 where the result of the jvmtiAllocate() is not checked:
 
   68 utf8msg = (jbyte*)jvmtiAllocate(maxlen+1);
 
  393 buf = jvmtiAllocate(len*3+3);
 
 
 Could you fix this as well?
 
 
 Thanks,
 Serguei
 
 
 On 12/9/14 11:50 AM, Dmitry Samersoff wrote:
 Hi Everybody,

 Please review small fix.

 http://cr.openjdk.java.net/~dsamersoff/JDK-8067030/webrev.01/

 JDWP crash if allocation fails because it calls strcpy before
 check of allocation results.

 -Dmitry

 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.


Re: RFR(S): 8067030 JDWP crash in transport_startTransport on OOM

2014-12-11 Thread serguei.spit...@oracle.com

Dmitry,

It looks good.
Reviewed.
Thank you for fixing two more cases!

Thanks,
Serguei



Re: RFR(S): 8067030 JDWP crash in transport_startTransport on OOM

2014-12-11 Thread Jaroslav Bachorik

Hi,

Looks good!

-JB-


Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects

2014-12-11 Thread David Holmes

Hi Ivan,

On 11/12/2014 4:52 PM, Ivan Gerasimov wrote:

Hello!

After the fix for JDK-8064694 some more failures of
WaitForMultipleObjects() were observed under heavy load.
The reason was that the limitation on wait object number was overlooked.
The total number of the objects should not be greater than
MAXIMUM_WAIT_OBJECTS (= 64).

The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use
MAXIMUM_WAIT_OBJECTS instead for all kinds of builds.
I also added the last error code to the failure reports, so it would be
easier to identify the cause of a failure.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863
WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/


The webrev changes do not correspond to the description you gave above.

David



Sincerely yours,
Ivan


Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects

2014-12-11 Thread David Holmes

On 11/12/2014 7:48 PM, David Holmes wrote:

Hi Ivan,

On 11/12/2014 4:52 PM, Ivan Gerasimov wrote:

Hello!

After the fix for JDK-8064694 some more failures of
WaitForMultipleObjects() were observed under heavy load.
The reason was that the limitation on wait object number was overlooked.
The total number of the objects should not be greater than
MAXIMUM_WAIT_OBJECTS (= 64).

The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use
MAXIMUM_WAIT_OBJECTS instead for all kinds of builds.
I also added the last error code to the failure reports, so it would be
easier to identify the cause of a failure.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863
WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/


The webrev changes do not correspond to the description you gave above.


Correct webrev URL:

http://cr.openjdk.java.net/~igerasim/8066863/0/webrev/

Seems this saga will never end. :( Changes seem okay.

Thanks,
David


David



Sincerely yours,
Ivan


Re: RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-11 Thread Jaroslav Bachorik

On 12/09/2014 01:25 PM, Jaroslav Bachorik wrote:

On 12/09/2014 01:39 AM, Stuart Marks wrote:

On 12/8/14 12:35 PM, Jaroslav Bachorik wrote:

Please, review the following test change

Issue : https://bugs.openjdk.java.net/browse/JDK-8066708
Webrev: http://cr.openjdk.java.net/~jbachorik/8066708/webrev.00

The test fails very intermittently when RMI registry is trying to bind
to a port
previously used in the test (via ServerSocket).

This seems to be caused by the sockets created via `new
ServerSocket(0)` and
being in reusable mode. The fix attempts to prevent this by explicitly
forbidding the reusable mode.


Hi Jaroslav,

I happened to see this fly by, and there are (I think) some similar
issues going on in the RMI tests.

But first I'll note that I don't think setReuseAddress() will have the
effect that you want. Typically it's set to true before binding a
socket, so that a subsequent bind operation will succeed even if the
address/port is already in use. ServerSockets created with new
ServerSocket(0) are already bound, and I'm not sure what calling
setReuseAddress(false) will do on such sockets. The spec says behavior
is undefined, but my bet is that it does nothing.

I guess it doesn't hurt to try this out to see if it makes a difference,
but I don't have much confidence it will help.

The potential similarity to the RMI tests is exemplified by JDK-8049202
(sorry, this bug report isn't open) but briefly this tests the RMI
registry as follows:

1. Opens port 1099 using new ServerSocket(1099) [1099 is the default
RMI registry port] in order to ensure that 1099 isn't in use by
something else already;

2. If this succeeds, it immediately closes the ServerSocket.

3. Then it creates a new RMI registry on port 1099.

In principle, this should succeed, yet it fails around 10% of the time
on some systems. The error is port already in use. My best theory is
that even though the socket has just been closed by a user program, the
kernel has to run the socket through some of the socket states such as
FIN_WAIT_1, FIN_WAIT_2, or CLOSING before the socket is actually closed
and is available for reuse. If a program -- even the same one --
attempts to open a socket on the same port before the socket has reached
its final state, it will get an already in use error.

If this is true I don't believe that setting SO_REUSEADDR will work if
the socket is in one of these final states. (I remember reading this
somewhere but I'm not sure where at the moment. I can try to dig it up
if there is interest.)

I admit this is just a theory and I'm open to alternatives, and I'm also
open to hearing about ways to deal with this problem.

Could something similar be going on with this JMX test?


Hm, this is exactly what happened with this test :(

The problem is that the port is reported as available while it is still
occupied and RMI registry attempts to start using that port.

If setting SO_REUSEADDR does not work then the only solution would be to
retry the test case when this exception occurs.


Further investigation shows that the problem was rather the client 
connecting to a socket being shut down.


It sounds like setting SO_REUSEADDR to false should prevent this failure.

From the ServerSocket javadoc:
When a TCP connection is closed the connection may remain in a timeout 
state for a period of time after the connection is closed (typically 
known as the TIME_WAIT state or 2MSL wait state). For applications using 
a well known socket address or port it may not be possible to bind a 
socket to the required SocketAddress if there is a connection in the 
timeout state involving the socket address or port.


It also turns out that the test does not close the server sockets 
properly so there might be several sockets being opened or timed out 
dangling around.


I've updated the test so it is setting SO_REUSEADDR for all the new 
ServerSockets instances + introduced the mechanism to run the test code 
while properly cleaning up any allocated ports.


http://cr.openjdk.java.net/~jbachorik/8066708/webrev.01/

-JB-



-JB-



s'marks






Re: RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-11 Thread Dmitry Samersoff
Jaroslav,

You can set SO_LINGER to zero; in this case the socket will be closed
immediately without waiting in TIME_WAIT.

But there is no reliable way to predict whether you can take this port
or not after you close it.

So the only valid solution is to try to connect to a random port and if
this attempt fails try another random port. Everything else will cause
more or less frequent intermittent failures.
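
A minimal sketch of the retry idea above, reading "connect" as "bind the server socket". The class name, method name, and the 49152-65535 port range are assumptions made for illustration; they do not come from the webrev.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

final class RandomPortRetry {
    /** Tries up to maxAttempts random ports and returns the first socket that binds. */
    static ServerSocket bindWithRetry(int maxAttempts) throws IOException {
        BindException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            // Assumed range: the IANA dynamic/private ports.
            int port = ThreadLocalRandom.current().nextInt(49152, 65536);
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                last = e; // port is busy, pick another one
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }
}
```

The caller keeps the returned socket open and uses it directly, so there is no window between choosing the port and binding it.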

-Dmitry





Re: RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-11 Thread olivier.lagn...@oracle.com

Hi Jaroslav,

On 11/12/2014 15:06, Jaroslav Bachorik wrote:
Further investigation shows that the problem was rather the client 
connecting to a socket being shut down.
I remember running into this situation for an RMI fix a while ago and IIRC no
flag setting helped (SO_REUSEADDR included); the port kept being unavailable.


It sounds like setting SO_REUSEADDR to false should prevent this failure.

From the ServerSocket javadoc:
When a TCP connection is closed the connection may remain in a 
timeout state for a period of time after the connection is closed 
(typically known as the TIME_WAIT state or 2MSL wait state). For 
applications using a well known socket address or port it may not be 
possible to bind a socket to the required SocketAddress if there is a 
connection in the timeout state involving the socket address or port.


It also turns out that the test does not close the server sockets 
properly so there might be several sockets being opened or timed out 
dangling around.

I think this is the main reason why we see these intermittent failures.


I've updated the test so it is setting SO_REUSEADDR for all the new 
ServerSockets instances + introduced the mechanism to run the test 
code while properly cleaning up any allocated ports. 

Olivier.


Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects

2014-12-11 Thread Daniel D. Daugherty

On 12/11/14 3:01 AM, David Holmes wrote:

On 11/12/2014 7:48 PM, David Holmes wrote:

Hi Ivan,

On 11/12/2014 4:52 PM, Ivan Gerasimov wrote:

Hello!

After the fix for JDK-8064694 some more failures of
WaitForMultipleObjects() were observed under heavy load.
The reason was that the limitation on wait object number was overlooked.
The total number of the objects should not be greater than
MAXIMUM_WAIT_OBJECTS (= 64).

The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use
MAXIMUM_WAIT_OBJECTS instead for all kinds of builds.
I also added the last error code to the failure reports, so it would be
easier to identify the cause of a failure.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863
WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/


The webrev changes do not correspond to the description you gave above.


Correct webrev URL:

http://cr.openjdk.java.net/~igerasim/8066863/0/webrev/


src/os/windows/vm/os_windows.cpp
Thanks for adding the GetLastError() info to the messages.

Thumbs up.

RT_Baseline has already pushed to Main_Baseline for this week
so you should be good to go if you're happy with your pre-push
testing.



Seems this saga will never end. :( Changes seem okay.


On the plus side, we're seeing fewer and fewer exit_code stomping
failures in nightly so things are getting better...

Dan




Thanks,
David


David



Sincerely yours,
Ivan




Re: RFR(L): 8049716: PPC64: Implement SA on Linux/PPC64

2014-12-11 Thread Dmitry Samersoff
Volker,

Below is the first part of the review - shared files.


* agent/make/Makefile - no comments

* agent/src/os/linux/LinuxDebuggerLocal.c - no comments

* agent/src/os/linux/symtab.c:

438:
  - What is the magic of symbols that start with '.'?
  - As far as I understand, you are getting an indirect value from the opd section.

 Could you reformat it a bit to make it more readable?

 Something like:

uintptr_t sym_value;
...
sym_value = syms->st_value;

#ifdef ppc64
  // Some comments here
  // ppc specific code here
  sym_value =
#endif

symtab->symbols[j].offset = sym_value - baseaddr;


454:

I appreciate detailed comments here.

if (false) can cause an unreachable-code warning and an unused-variable
warning, so it might be better to add #ifdef ppc64 at the caller
site at ll. 516 and leave only a comment here.

But if you decide to guard against try_debuginfo, please replace

if (false)

with

goto quit


* agent/src/share/classes/sun/jvm/hotspot/debugger/MachineDescriptionPPC64.java

38:  return (endian.equals("big"));

is enough
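
For illustration only, a self-contained sketch of the single-expression form. It assumes the string literals lost their quotes in the archive; the class name is hypothetical, and putting the constant first is a choice of this sketch rather than part of the review comment.

```java
final class EndianCheck {
    /** True when the HotSpot-specific property reports a big-endian CPU. */
    static boolean isBigEndian() {
        // Constant-first equals() also returns false when the property is absent (null).
        return "big".equals(System.getProperty("sun.cpu.endian"));
    }

    public static void main(String[] args) {
        System.out.println("big endian: " + isBigEndian());
    }
}
```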


* agent/src/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java - no comments

* agent/src/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThreadContextFactory.java - no comments

* agent/src/share/classes/sun/jvm/hotspot/debugger/proc/ProcDebuggerLocal.java - no comments

* agent/src/share/classes/sun/jvm/hotspot/debugger/remote/RemoteDebuggerClient.java - no comments

* agent/src/share/classes/sun/jvm/hotspot/runtime/Threads.java - no comments

* agent/src/share/classes/sun/jvm/hotspot/runtime/VFrame.java - no comments

* make/linux/makefiles/sa.make - no comments

* make/sa.files - no comments

* src/share/vm/runtime/vmStructs.cpp - no comments

-Dmitry

On 2014-12-09 21:10, Volker Simonis wrote:
 Hi,
 
 can somebody from the serviceability team please review this webrev?
 
 http://cr.openjdk.java.net/~simonis/webrevs/8049716
 https://bugs.openjdk.java.net/browse/JDK-8049716
 
 The shared changes are really all trivial.
 
 Thanks,
 Volker
 
 
 On Fri, Dec 5, 2014 at 4:01 PM, Volker Simonis volker.simo...@gmail.com 
 wrote:
 Hi Sasha,

 thanks for looking at this change.
 I'll incorporate your suggestions into the final version.
 I'm just waiting for one more review before preparing a new webrev.

 Regards,
 Volker


 On Fri, Dec 5, 2014 at 3:10 PM, Maynard Johnson mayna...@us.ibm.com wrote:
 On 12/04/2014 07:50 PM, Alexander Smundak wrote:
 You are correct, but there is no need to have this code for LE at all.
 Agreed. I'm fine with adding the && !defined(ABI_ELFv2) throughout that file
 along with the #if defined(ppc64).

 BTW, a bit of nitpicking in the same file:
 +String endian = System.getProperty("sun.cpu.endian");
 +if (endian.equals("big"))
 +  return true;
 +else
 +  return false;
 can be rewritten as
   return "big".equals(System.getProperty("sun.cpu.endian"));
 Right. A silly piece of coding there.  :-/

 -Maynard


 On Thu, Dec 4, 2014 at 3:43 PM, Maynard Johnson mayna...@us.ibm.com 
 wrote:
 On 12/04/2014 01:15 PM, Alexander Smundak wrote:
 The changes for agent/src/os/linux/symtab.c
 b/agent/src/os/linux/symtab.c in
 http://cr.openjdk.java.net/~simonis/webrevs/8049716 will break
 Linux/PPC64 little-endian:
 it uses ABIv2, which dropped function descriptors. So the preprocessor
 brackets in it should
 read
 #if defined(ppc64) && !defined(ABI_ELFv2)
 instead of just
 #if defined(ppc64)
 Hi, Alexander,
 I think this is actually fine everywhere except one place. The 'opd' 
 variable will be
 set to something other than NULL at line 379 only if running on ppc64 BE. 
 So in
 the rest of that function, opd is checked for non-null before using it.  
 The only
 place where I think there may be a problem is at line 455:

 --
 #if defined(ppc64)
   // On Linux/PPC64 the debuginfo files contain an empty file descriptor
   // section (i.e. '.opd' section) which makes the resolution of symbols
   // with the above algorithm impossible (we would need the have both, the
   // .opd section from the library and the symbol table from the debuginfo
   // file which doesn't match with the current workflow.)
   if (false) {
 #else
   // Look for a separate debuginfo file.
   if (try_debuginfo) {
 #endif
 --

 Here I think we should do as you suggest:
#if defined(ppc64) && !defined(ABI_ELFv2)

 -Maynard

 Sorry for the late notice.
 Sasha

 On Thu, Dec 4, 2014 at 9:50 AM, Volker Simonis 
 volker.simo...@gmail.com wrote:
 Hi,

 I'd like to submit this webrev which adds support for the SA agent on
 Linux/PPC64 on behalf of Maynard Johnson who is the main author of the
 change:

 http://cr.openjdk.java.net/~simonis/webrevs/8049716
 https://bugs.openjdk.java.net/browse/JDK-8049716

 I have already reviewed and tested the change and from my side
 everything looks fine.

 The change touches quite some shared code but all of these changes are
 trivial and straight-forward (i.e. they just add Linux/PPC64 support
 with the help of '#ifdef's in 

Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects

2014-12-11 Thread Ivan Gerasimov


On 11.12.2014 13:01, David Holmes wrote:

On 11/12/2014 7:48 PM, David Holmes wrote:

Hi Ivan,

On 11/12/2014 4:52 PM, Ivan Gerasimov wrote:

Hello!

After the fix for JDK-8064694 some more failures of
WaitForMultipleObjects() were observed under heavy load.
The reason was that the limitation on wait object number was overlooked.
The total number of the objects should not be greater than
MAXIMUM_WAIT_OBJECTS (= 64).

The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use
MAXIMUM_WAIT_OBJECTS instead for all kinds of builds.
I also added the last error code to the failure reports, so it would be
easier to identify the cause of a failure.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863
WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/


The webrev changes do not correspond to the description you gave above.


Correct webrev URL:

http://cr.openjdk.java.net/~igerasim/8066863/0/webrev/


Thank you David for correcting the link and the review!


Seems this saga will never end. :( Changes seem okay.


I still have a hope to have it finished one day.

Sincerely yours,
Ivan


Thanks,
David


David



Sincerely yours,
Ivan







Re: RFR 8066863: bigapps/runThese/nowarnings fails: Java HotSpot(TM) 64-Bit Server VM warning: WaitForMultipleObjects

2014-12-11 Thread Ivan Gerasimov


On 11.12.2014 19:05, Daniel D. Daugherty wrote:

On 12/11/14 3:01 AM, David Holmes wrote:

On 11/12/2014 7:48 PM, David Holmes wrote:

Hi Ivan,

On 11/12/2014 4:52 PM, Ivan Gerasimov wrote:

Hello!

After the fix for JDK-8064694 some more failures of
WaitForMultipleObjects() were observed under heavy load.
The reason was that the limitation on wait object number was overlooked.
The total number of the objects should not be greater than
MAXIMUM_WAIT_OBJECTS (= 64).

The proposed fix is to get rid of constant MAX_EXIT_HANDLES and use
MAXIMUM_WAIT_OBJECTS instead for all kinds of builds.
I also added the last error code to the failure reports, so it would be
easier to identify the cause of a failure.

Would you please help review the fix?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8066863
WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/


The webrev changes do not correspond to the description you gave above.


Correct webrev URL:

http://cr.openjdk.java.net/~igerasim/8066863/0/webrev/


src/os/windows/vm/os_windows.cpp
Thanks for adding the GetLastError() info to the messages.

Thumbs up.

RT_Baseline has already pushed to Main_Baseline for this week
so you should be good to go if you're happy with your pre-push
testing.



Thank you Daniel for the review!

I've run a JPRT job + hotspot test set, without a single failure.

Sincerely yours,
Ivan


Seems this saga will never end. :( Changes seem okay.


On the plus side, we're seeing fewer and fewer exit_code stomping
failures in nightly so things are getting better...

Dan




Thanks,
David


David



Sincerely yours,
Ivan








Re: RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-11 Thread Stuart Marks



On 12/11/14 7:09 AM, olivier.lagn...@oracle.com wrote:

On 11/12/2014 15:43, Dmitry Samersoff wrote:

You can set SO_LINGER to zero, in this case socket will be closed
immediately without waiting in TIME_WAIT

SO_LINGER did not help either in my case (see my previous mail to Jaroslav).
We ended up using another hard-coded (supposedly free) port.
Note that was before RMI tests used randomly allocated ports.


But there are no reliable way to predict whether you can take this port
or not after you close it.

This is what I observed in my case.


So the only valid solution is to try to connect to a random port and if
this attempt fails try another random port. Everything else will cause
more or less frequent intermittent failures.

IIRC this is what is currently done in RMI tests.


The RMI tests are still suffering from this problem, unfortunately.

The RMI test library gets a random port with new ServerSocket(0), gets the 
port number, closes the socket, then returns the port to the caller. The caller 
then assumes that it can use that port as it wishes. That's when the 
BindException can occur. There are about 10 RMI test bugs in the database that 
all seem to have this as their root cause.


There is some retry logic in RMI's test library, but that's to avoid the 
so-called reserved ports that specific RMI tests use, or if new 
ServerSocket(0) fails. It doesn't have anything to do with the BindException 
that occurs when the caller attempts to reuse the port with another socket.


My observation is also that setting SO_REUSEADDR has no effect. I haven't tried 
SO_LINGER. My hunch is that it won't have any effect, since the sockets in 
question aren't actually going into TIME_WAIT state. But I suppose it's worth a try.


I don't have any solution for this; we're still discussing the issue. I think 
the best approach would be to refactor the code so that the eventual user of the 
socket opens it up on an ephemeral port in the first place. That avoids the 
open/close/reopen business. Unfortunately that doesn't help the case where you 
want to tell another JVM to run a service on a specific port. We don't have a 
solution for that case yet.
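
A rough sketch of the "eventual user opens the socket on an ephemeral port" approach, assuming client and server live in the same JVM. The class and method names are hypothetical and this is not the RMI test library code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class EphemeralPortDemo {
    /** Binds to port 0, keeps the socket open, and connects once to the assigned port. */
    static int connectOnce() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The owner of the port binds it once; there is no close/reopen step.
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            int port = server.getLocalPort(); // port chosen by the OS
            try (Socket client = new Socket(loopback, port);
                 Socket accepted = server.accept()) {
                return port;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("connected on port " + connectOnce());
    }
}
```

Because the listening socket is never closed between learning the port number and using it, the BindException window described above does not arise in this arrangement.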


The second-best approach (not really a solution) is to open/close a serversocket 
to get the port, sleep for a little bit, then return the port number to the 
caller. This might give the kernel a chance to clean up the socket after the 
close. Of course, this still has a race condition, but it might reduce the 
incidence of problems to an acceptable level.
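
The open/close/sleep variant could look roughly like the sketch below. The helper name and the settle delay are assumptions; as noted above, the race remains and is only made less likely.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class SleepyPortPicker {
    /** Returns a port that was free a moment ago; callers must still expect BindException. */
    static int pickPort(long settleMillis) throws IOException, InterruptedException {
        int port;
        try (ServerSocket probe = new ServerSocket(0)) {
            port = probe.getLocalPort();
        } // the probe socket is closed here
        // Give the kernel some time to finish tearing down the closed socket.
        Thread.sleep(settleMillis);
        // Still racy: another process may take the port before the caller binds it.
        return port;
    }
}
```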


I'll let you know if we come up with anything better.

s'marks


Re: RFR 8066708: JMXStartStopTest fails to connect to port 38112

2014-12-11 Thread Dmitry Samersoff
Stuart,

As soon as you close the socket, you open a door for the race.

So you need another communication channel to pass a port number (or bind
result) between a client and a server without closing a socket on the
server side.

Typical scenario used by network related code is:

1. Server opens the socket
2. Server binds to port(0)
3. Server gets port number assigned by OS
4. Server informs client (e.g. writes the port to a known file,
broadcasts it, etc.)
5. Client establishes connection.
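
A minimal sketch of the five steps above, assuming a file is used as the side channel. The class name PortHandoff, the method names, and the port file are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PortHandoff {
    // Steps 1-4: bind to port 0, keep the socket open, publish the port.
    static ServerSocket startServer(Path portFile) throws IOException {
        ServerSocket server = new ServerSocket(0);            // OS assigns the port
        String port = Integer.toString(server.getLocalPort());
        Files.write(portFile, port.getBytes(StandardCharsets.UTF_8));
        return server;                                        // never closed and reopened
    }

    // Step 5: the client reads the published port and connects.
    static Socket connectClient(Path portFile) throws IOException {
        String text = new String(Files.readAllBytes(portFile), StandardCharsets.UTF_8).trim();
        return new Socket(InetAddress.getLoopbackAddress(), Integer.parseInt(text));
    }
}
```

Because the server socket stays open from bind to accept, there is no moment at which another process can take the port. With separate processes the client would also have to wait for the file to appear and be completely written; that part is left out here.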

If the server is a black box and has to get a port number from outside,
the scenario looks like:

WHILE(!success and !timeout)
1. Driver chooses random port number
2. Driver runs a server with this number
3. Driver checks that server is actually listening on this port
   (e.g. tries to connect to it itself)
WEND

4. Driver runs a client with this port number or bails out with
   descriptive error message.
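
The driver loop above could look roughly like this. It is only a sketch: ServerLauncher is a hypothetical stand-in for "run the black-box server on this port", and the port range 49152-65535 (the IANA dynamic range) is an assumption, not something the tests prescribe.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.ThreadLocalRandom;

final class PortDriver {
    /** Hypothetical stand-in for "run the black-box server on this port". */
    interface ServerLauncher {
        AutoCloseable launch(int port) throws Exception;
    }

    /** The chosen port plus a handle for stopping the server again. */
    static final class Started {
        final int port;
        final AutoCloseable server;
        Started(int port, AutoCloseable server) { this.port = port; this.server = server; }
    }

    /** Step 3: check that something is listening by connecting to it. */
    static boolean isListening(int port, int timeoutMillis) {
        try (Socket probe = new Socket()) {
            probe.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port),
                          timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    static Started startOnRandomPort(ServerLauncher launcher, long timeoutMillis)
            throws Exception {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (System.nanoTime() - deadline < 0) {
            // Step 1: choose a random port (range is an assumption, see above).
            int port = ThreadLocalRandom.current().nextInt(49152, 65536);
            AutoCloseable server;
            try {
                server = launcher.launch(port);   // step 2
            } catch (Exception e) {
                continue;                         // e.g. BindException: try another port
            }
            if (isListening(port, 1000)) {        // step 3
                return new Started(port, server); // step 4: hand the port to the client
            }
            server.close();                       // started but not reachable: retry
        }
        // Step 4, failure case: bail out with a descriptive message.
        throw new IllegalStateException(
            "could not start a server on a free port within " + timeoutMillis + " ms");
    }
}
```

Note that a successful probe only proves that something is listening on the port. If the black-box server can fail to bind without reporting it, a stronger check (for example a protocol-level handshake) would be needed to be sure the listener is the server that was just started.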

-Dmitry

On 2014-12-11 20:53, Stuart Marks wrote:
 
 
 On 12/11/14 7:09 AM, olivier.lagn...@oracle.com wrote:
 On 11/12/2014 15:43, Dmitry Samersoff wrote:
 You can set SO_LINGER to zero, in this case socket will be closed
 immediately without waiting in TIME_WAIT
 SO-LINGER did not help either in my case (see my previous mail to
 Jaroslav).
 That ended-up in using another hard-coded (supposedly free) port.
 Note that was before RMI tests used randomly allocated ports.

 But there are no reliable way to predict whether you can take this port
 or not after you close it.
 This is what I observed in my case.

 So the only valid solution is to try to connect to a random port and if
 this attempt fails try another random port. Everything else will cause
 more or less frequent intermittent failures.
 IIRC think this is what is currently done in RMI tests.
 
 The RMI tests are still suffering from this problem, unfortunately.
 
 The RMI test library gets a random port with new ServerSocket(0),
 gets the port number, closes the socket, then returns the port to the
 caller. The caller then assumes that it can use that port as it wishes.
 That's when the BindException can occur. There are about 10 RMI test
 bugs in the database that all seem to have this as their root cause.
 
 There is some retry logic in RMI's test library, but that's to avoid the
 so-called reserved ports that specific RMI tests use, or if new
 ServerSocket(0) fails. It doesn't have anything to do with the
 BindException that occurs when the caller attempts to reuse the port
 with another socket.
 
 My observation is also that setting SO_REUSEADDR has no effect. I
 haven't tried SO_LINGER. My hunch is that it won't have any effect,
 since the sockets in question aren't actually going into TIME_WAIT
 state. But I suppose it's worth a try.
 
 I don't have any solution for this; we're still discussing the issue. I
 think the best approach would be to refactor the code so that the
 eventual user of the socket opens it up on an ephemeral port in the
 first place. That avoids the open/close/reopen business. Unfortunately
 that doesn't help the case where you want to tell another JVM to run a
 service on a specific port. We don't have a solution for that case yet.
 
 The second-best approach (not really a solution) is to open/close a
 serversocket to get the port, sleep for a little bit, then return the
 port number to the caller. This might give the kernel a chance to clean
 up the socket after the close. Of course, this still has a race
 condition, but it might reduce the incidence of problems to an
 acceptable level.
 
 I'll let you know if we come up with anything better.
 
 s'marks




RFR 8067241 DeadlockTest.java failed with negative timeout value

2014-12-11 Thread shanliang

Hi,

This is a test bug. The loop condition is not correct:
  while(!wb.done || timeToWait > 0) {

It should be:
  while(!wb.done && timeToWait > 0) {

That is, || should be changed to &&.
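
To illustrate why the loop must continue only while the work is not done AND there is still time left, here is a small sketch of such a wait loop. It is not the code of DeadlockTest.java; the class WaitLoop and its methods are hypothetical. If the two conditions were combined with "or", the loop would keep running after the time is used up and call wait() with a non-positive value; Object.wait throws IllegalArgumentException for a negative timeout.

```java
final class WaitLoop {
    private boolean done;

    synchronized void markDone() {
        done = true;
        notifyAll();
    }

    /** Waits until markDone() was called or the timeout elapsed; returns done. */
    synchronized boolean awaitDone(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        long timeToWait = timeoutMillis;
        // Both conditions must hold to keep waiting, so wait() is never
        // called with a zero or negative argument.
        while (!done && timeToWait > 0) {
            wait(timeToWait);
            timeToWait = (deadline - System.nanoTime()) / 1_000_000L;
        }
        return done;
    }
}
```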

Another issue is that the waiting time might not be long enough (final long 
timeout = 2000).


The fix is to remove the waiting time specified in the test; the timeout 
of the test harness will be used as the maximum waiting time.


bug: https://bugs.openjdk.java.net/browse/JDK-8067241
webrev: http://cr.openjdk.java.net/~sjiang/JDK-8067241/00/

thanks,
Shanliang