Re: RFR: 8032901: WaitForMultipleObjects() return value not handled appropriately

2014-05-14 Thread David Holmes

On 14/05/2014 11:06 AM, Vitaly Davidovich wrote:

In windows, you acquire a mutex by waiting on it using one of the wait
functions, one of them employed in the code in question.  If
WaitForMultipleObjects succeeds and returns the index of the mutex,
current thread has ownership now.


Yes I understand the basic mechanics :)


It's also common to use multi wait functions where the event is a
cancelation token, e.g. manual reset event; this allows someone to
cancel waiting on mutex acquisition and return from the wait function.
Presumably that's the case here, but I'll let Aleksej confirm; just
wanted to throw this out there in the meantime :).


Ah I see - yes cancellable lock acquisition would make sense.

Thanks,
David


Sent from my phone

On May 13, 2014 6:46 PM, David Holmes david.hol...@oracle.com
mailto:david.hol...@oracle.com wrote:

Hi Aleksej,

Thanks for the doc references regarding abandonment.

Let me rephrase my question. What is this logic trying to achieve by
waiting on both a mutex and an event? Do we already own the mutex
when this function is called?

David

On 13/05/2014 11:19 PM, Aleksej Efimov wrote:

David,

The Windows has a different terminology for mutex objects (much
differs
from the POSIX one). This one link gave me some understanding of
it [1].

Here is the MSDN [1] description of what abandoned mutex is:
 If a thread terminates without releasing its ownership of a mutex
object, the mutex object is considered to be abandoned. A
waiting thread
can acquire ownership of an abandoned mutex object, but the wait
function will return*WAIT_ABANDONED*to indicate that the mutex
object is
abandoned. An abandoned mutex object indicates that an error has
occurred and that any shared resource being protected by the mutex
object is in an undefined state. If the thread proceeds as
though the
mutex object had not been abandoned, it is no longer considered
abandoned after the thread releases its ownership. This restores
normal
behavior if a handle to the mutex object is subsequently
specified in a
wait function.


What does it mean to wait on mutex and ownership of the mutex
object:
Any thread with a handle to a mutex object can use one of thewait
functions

http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms687069%28v=vs.85%29.aspx

http://msdn.microsoft.com/en-gb/library/windows/desktop/ms687069%28v=vs.85%29.aspxto
request ownership of the mutex object. If the mutex object is
owned by
another thread, the wait function blocks the requesting thread
until the
owning thread releases the mutex object using the*ReleaseMutex*

http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms685066%28v=vs.85%29.aspx

http://msdn.microsoft.com/en-gb/library/windows/desktop/ms685066%28v=vs.85%29.aspx__function.

How we can release mutex and wait on already owned mutex:
 After a thread obtains ownership of a mutex, it can specify
the same
mutex in repeated calls to the wait-functions

http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms687069%28v=vs.85%29.aspx

http://msdn.microsoft.com/en-gb/library/windows/desktop/ms687069%28v=vs.85%29.aspx__without
blocking its execution. This prevents a thread from deadlocking
itself
while waiting for a mutex that it already owns. To release its
ownership
under such circumstances, the thread must call*ReleaseMutex*

http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms685066%28v=vs.85%29.aspx

http://msdn.microsoft.com/en-gb/library/windows/desktop/ms685066%28v=vs.85%29.aspx__once
for each time that the mutex satisfied the conditions of a wait
function.

[1]

http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms684266(v=vs.85).aspx

http://msdn.microsoft.com/en-gb/library/windows/desktop/ms684266(v=vs.85).aspx

-Aleksej

On 05/13/2014 04:00 PM, David Holmes wrote:

I don't understand this one at all. What is an abandoned
mutex? For
that matter why does the code wait on a mutex and an event?
Do we
already own the mutex? If so what does it mean to wait on
it? If not
then how can we release it?

???

Thanks,
David


On 13/05/2014 8:57 PM, Alan Bateman wrote:


This is debugger's shared memory transport so cc'ing
serviceability-dev
as that is there this code is maintained.

Is there a test case or any outline of the conditions
that cause this? I
think 

Re: RFR: 8032901: WaitForMultipleObjects() return value not handled appropriately

2014-05-14 Thread Aleksej Efimov

David, Vitaly,

I totally agree with Vitaly's explanation (Vitaly - thank you for that) 
and code in shmemBase.c (the usage of enterMutex() function for 
sending/receiving bytes through shared memory connection) illustrates on 
how the connection shutdown event is used as a cancellation token.


Thank you,
-Aleksej


On 05/14/2014 01:05 PM, David Holmes wrote:

On 14/05/2014 11:06 AM, Vitaly Davidovich wrote:

In windows, you acquire a mutex by waiting on it using one of the wait
functions, one of them employed in the code in question.  If
WaitForMultipleObjects succeeds and returns the index of the mutex,
current thread has ownership now.


Yes I understand the basic mechanics :)


It's also common to use multi wait functions where the event is a
cancelation token, e.g. manual reset event; this allows someone to
cancel waiting on mutex acquisition and return from the wait function.
Presumably that's the case here, but I'll let Aleksej confirm; just
wanted to throw this out there in the meantime :).


Ah I see - yes cancellable lock acquisition would make sense.

Thanks,
David


Sent from my phone

On May 13, 2014 6:46 PM, David Holmes david.hol...@oracle.com
mailto:david.hol...@oracle.com wrote:

Hi Aleksej,

Thanks for the doc references regarding abandonment.

Let me rephrase my question. What is this logic trying to achieve by
waiting on both a mutex and an event? Do we already own the mutex
when this function is called?

David

On 13/05/2014 11:19 PM, Aleksej Efimov wrote:

David,

The Windows has a different terminology for mutex objects (much
differs
from the POSIX one). This one link gave me some understanding of
it [1].

Here is the MSDN [1] description of what abandoned mutex is:
 If a thread terminates without releasing its ownership of a 
mutex

object, the mutex object is considered to be abandoned. A
waiting thread
can acquire ownership of an abandoned mutex object, but the wait
function will return*WAIT_ABANDONED*to indicate that the mutex
object is
abandoned. An abandoned mutex object indicates that an error has
occurred and that any shared resource being protected by the 
mutex

object is in an undefined state. If the thread proceeds as
though the
mutex object had not been abandoned, it is no longer considered
abandoned after the thread releases its ownership. This restores
normal
behavior if a handle to the mutex object is subsequently
specified in a
wait function.


What does it mean to wait on mutex and ownership of the mutex
object:
Any thread with a handle to a mutex object can use one of 
thewait

functions
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms687069%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms687069%28v=vs.85%29.aspxto
request ownership of the mutex object. If the mutex object is
owned by
another thread, the wait function blocks the requesting thread
until the
owning thread releases the mutex object using the*ReleaseMutex*
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms685066%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms685066%28v=vs.85%29.aspx__function.

How we can release mutex and wait on already owned mutex:
 After a thread obtains ownership of a mutex, it can specify
the same
mutex in repeated calls to the wait-functions
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms687069%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms687069%28v=vs.85%29.aspx__without
blocking its execution. This prevents a thread from deadlocking
itself
while waiting for a mutex that it already owns. To release its
ownership
under such circumstances, the thread must call*ReleaseMutex*
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms685066%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms685066%28v=vs.85%29.aspx__once
for each time that the mutex satisfied the conditions of a wait
function.

[1]
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms684266(v=vs.85).aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms684266(v=vs.85).aspx

-Aleksej

On 05/13/2014 04:00 PM, David Holmes wrote:

I don't understand this one at all. What is an abandoned
mutex? For
that matter why does the code wait on a mutex and an event?
Do we
already own the mutex? If so what does it mean to wait on
it? If not
then how can we release it?

???

Thanks,
David


On 13/05/2014 8:57 PM, Alan Bateman wrote:


   

Re: RFR: 8041934 com/sun/jdi/RepStep.java fails on all platforms with assert(_cur_stack_depth == count_frames()) failed: cur_stack_depth out of sync

2014-05-14 Thread Christian Thalinger
Looks good.

On May 13, 2014, at 11:58 PM, Staffan Larsen staffan.lar...@oracle.com wrote:

 Thanks Christian,
 
 I will make the change below before I push.
 
 /Staffan
 
 
 diff --git a/src/cpu/x86/vm/sharedRuntime_x86_32.cpp 
 b/src/cpu/x86/vm/sharedRuntime_x86_32.cpp
 --- a/src/cpu/x86/vm/sharedRuntime_x86_32.cpp
 +++ b/src/cpu/x86/vm/sharedRuntime_x86_32.cpp
 @@ -2248,7 +2248,7 @@
  // if we are now in interp_only_mode and in that case we do the jvmti
  // callback.
  Label skip_jvmti_method_exit;
 -__ cmpb(Address(thread, JavaThread::interp_only_mode_offset()), 0);
 +__ cmpl(Address(thread, JavaThread::interp_only_mode_offset()), 0);
  __ jcc(Assembler::zero, skip_jvmti_method_exit, true);
 
  save_native_result(masm, ret_type, stack_slots);
 diff --git a/src/cpu/x86/vm/sharedRuntime_x86_64.cpp 
 b/src/cpu/x86/vm/sharedRuntime_x86_64.cpp
 --- a/src/cpu/x86/vm/sharedRuntime_x86_64.cpp
 +++ b/src/cpu/x86/vm/sharedRuntime_x86_64.cpp
 @@ -2495,7 +2495,7 @@
  // if we are now in interp_only_mode and in that case we do the jvmti
  // callback.
  Label skip_jvmti_method_exit;
 -__ cmpb(Address(r15_thread, JavaThread::interp_only_mode_offset()), 0);
 +__ cmpl(Address(r15_thread, JavaThread::interp_only_mode_offset()), 0);
  __ jcc(Assembler::zero, skip_jvmti_method_exit, true);
 
  save_native_result(masm, ret_type, stack_slots);
 
 
 On 14 maj 2014, at 00:21, Christian Thalinger 
 christian.thalin...@oracle.com wrote:
 
 Since:
 
   int   _interp_only_mode;
 
 is an int field I would prefer to actually read the value as an int instead 
 of just a byte on x86:
 +__ cmpb(Address(r15_thread, JavaThread::interp_only_mode_offset()), 0);
 Otherwise this looks good.
 
 On May 13, 2014, at 11:30 AM, Staffan Larsen staffan.lar...@oracle.com 
 wrote:
 
 
 On 13 maj 2014, at 18:53, Daniel D. Daugherty daniel.daughe...@oracle.com 
 wrote:
 
  new webrev is here: http://cr.openjdk.java.net/~sla/8041934/webrev.02/
 
 src/share/vm/runtime/sharedRuntime.hpp
 No comments.
 
 src/share/vm/runtime/sharedRuntime.cpp
 No comments.
 
 src/cpu/sparc/vm/sharedRuntime_sparc.cpp
 No comments.
 
 src/cpu/x86/vm/sharedRuntime_x86_32.cpp
 No comments.
 
 src/cpu/x86/vm/sharedRuntime_x86_64.cpp
 No comments.
 
 On the switch from call_VM_leaf() - call_VM(), I looked through all
 the mentions of the string call_VM in the three src/cpu files. Your
 change adds the first call_VM() use in all three files and the other
 places that mention the string call_VM seem to have gone through
 a great deal of trouble not to use call_VM() directly. I have no
 specific thing I think is wrong, but I find this data worrisome…
 
 Yes, it would be great if someone from the compiler team could review this, 
 too.
 
 Thanks,
 /Staffan
 
 
 Thumbs up!
 
 Dan
 
 
 On 5/13/14 3:20 AM, Staffan Larsen wrote:
 
 On 9 maj 2014, at 20:18, serguei.spit...@oracle.com wrote:
 
 Staffan,
 
 This is important discovery, thanks!
 The fix looks good to me.
 One question below.
 
 Thanks,
 Serguei
 
 
 On 5/9/14 3:47 AM, Staffan Larsen wrote:
 On 8 maj 2014, at 19:05, Daniel D. Daugherty 
 daniel.daughe...@oracle.com wrote:
 
 webrev: http://cr.openjdk.java.net/~sla/8041934/webrev.00/
 src/share/vm/runtime/sharedRuntime.hpp
No comments.
 
 src/share/vm/runtime/sharedRuntime.cpp
line 994: JRT_LEAF(int, SharedRuntime::jvmti_method_exit(
I'm not sure that JRT_LEAF is right. I would think that
JvmtiExport::post_method_exit() would have to grab one or
more locks with all the state checks it has to make...
 
For reference, InterpreterRuntime::post_method_exit()
is a wrapper around JvmtiExport::post_method_exit()
and it is IRT_ENTRY instead of IRT_LEAF.
 Good catch!
 
 Q: Is correct to use call_VM_leaf (vs call_VM ) in the  
 sharedRuntime_arch.cpp ?
 
 +__ call_VM_leaf(
 + CAST_FROM_FN_PTR(address, SharedRuntime::jvmti_method_exit),
 + thread, rax);
 
 That is another good catch! It should probably be call_VM as you note. 
 The reason for all these leaf references is because we used the dtrace 
 probe as a template - obviously without fully understanding what we did 
 :-/
 
 I have changed to code to use call_VM instead. This also required a 
 change from jccb to jcc for the jump (which is now longer than an 8-bit 
 offset). 
 
 new webrev is here: http://cr.openjdk.java.net/~sla/8041934/webrev.02/
 
 Thanks,
 /Staffan
 
 
 
 src/cpu/sparc/vm/sharedRuntime_sparc.cpp
No comments.
 
 src/cpu/x86/vm/sharedRuntime_x86_32.cpp
No comments.
 
 src/cpu/x86/vm/sharedRuntime_x86_64.cpp
No comments.
 
 I'm guessing that PPC has the same issue, but I'm presuming
 that someone else (Vladimir?) will handle that…
 Yes, I was hoping that I could file a follow-up bug for the platforms I 
 didn’t know how to fix.
 
 Updated review: http://cr.openjdk.java.net/~sla/8041934/webrev.01/
 
 Thanks,
 /Staffan
 
 Dan
 
 
 On 

Re: RFR: 8032901: WaitForMultipleObjects() return value not handled appropriately

2014-05-14 Thread David Holmes

On 14/05/2014 11:18 PM, Aleksej Efimov wrote:

David, Vitaly,

I totally agree with Vitaly's explanation (Vitaly - thank you for that)
and code in shmemBase.c (the usage of enterMutex() function for
sending/receiving bytes through shared memory connection) illustrates on
how the connection shutdown event is used as a cancellation token.


Thanks for clarifying that.

So if we were to encounter an abandoned mutex the code would presently 
have acquired the mutex but return an error, thus preventing a 
subsequent release, and preventing other threads from acquiring (but 
allowing current thread to recurisevely acquire. So this could both hang 
and cause data corruption.


The new code will still return an error but release the mutex. So no 
more hangs (other than by conditions caused by data corruption) but more 
opportunity for data corruption.


Obviously it depends on exactly what happens in the critical sections 
guarded by this mutex, but in general I don't agree with this rationale 
for making the change:


 204 /* If the mutex is abandoned we want to return an error
 205  * and also release the mutex so that we don't cause
 206  * other threads to be blocked. If a mutex was abandoned
 207  * we are in scary state. Data may be corrupted or inconsistent
 208  * but it is still better to let other threads run (and possibly
 209  * crash) than having them blocked (on the premise that a crash
 210  * is always easier to debug than a hang).

Considering something has to have gone drastically wrong for the mutex 
to become abandoned, I'm more inclined to consider this a fatal error 
and abort.


But I'll let the serviceability folk chime in here.

Thanks,
David



Thank you,
-Aleksej


On 05/14/2014 01:05 PM, David Holmes wrote:

On 14/05/2014 11:06 AM, Vitaly Davidovich wrote:

In windows, you acquire a mutex by waiting on it using one of the wait
functions, one of them employed in the code in question.  If
WaitForMultipleObjects succeeds and returns the index of the mutex,
current thread has ownership now.


Yes I understand the basic mechanics :)


It's also common to use multi wait functions where the event is a
cancelation token, e.g. manual reset event; this allows someone to
cancel waiting on mutex acquisition and return from the wait function.
Presumably that's the case here, but I'll let Aleksej confirm; just
wanted to throw this out there in the meantime :).


Ah I see - yes cancellable lock acquisition would make sense.

Thanks,
David


Sent from my phone

On May 13, 2014 6:46 PM, David Holmes david.hol...@oracle.com
mailto:david.hol...@oracle.com wrote:

Hi Aleksej,

Thanks for the doc references regarding abandonment.

Let me rephrase my question. What is this logic trying to achieve by
waiting on both a mutex and an event? Do we already own the mutex
when this function is called?

David

On 13/05/2014 11:19 PM, Aleksej Efimov wrote:

David,

The Windows has a different terminology for mutex objects (much
differs
from the POSIX one). This one link gave me some understanding of
it [1].

Here is the MSDN [1] description of what abandoned mutex is:
 If a thread terminates without releasing its ownership of a
mutex
object, the mutex object is considered to be abandoned. A
waiting thread
can acquire ownership of an abandoned mutex object, but the wait
function will return*WAIT_ABANDONED*to indicate that the mutex
object is
abandoned. An abandoned mutex object indicates that an error has
occurred and that any shared resource being protected by the
mutex
object is in an undefined state. If the thread proceeds as
though the
mutex object had not been abandoned, it is no longer considered
abandoned after the thread releases its ownership. This restores
normal
behavior if a handle to the mutex object is subsequently
specified in a
wait function.


What does it mean to wait on mutex and ownership of the mutex
object:
Any thread with a handle to a mutex object can use one of
thewait
functions
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms687069%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms687069%28v=vs.85%29.aspxto
request ownership of the mutex object. If the mutex object is
owned by
another thread, the wait function blocks the requesting thread
until the
owning thread releases the mutex object using the*ReleaseMutex*
http://msdn.microsoft.com/en-__gb/library/windows/desktop/__ms685066%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-gb/library/windows/desktop/ms685066%28v=vs.85%29.aspx__function.

How we can release mutex and wait on already owned mutex:
 After a thread obtains ownership of a mutex, it can specify
the same
mutex 

Re: RFR: 8032901: WaitForMultipleObjects() return value not handled appropriately

2014-05-14 Thread Vitaly Davidovich
In windows, you acquire a mutex by waiting on it using one of the wait
functions, one of them employed in the code in question.  If
WaitForMultipleObjects succeeds and returns the index of the mutex, current
thread has ownership now.

It's also common to use multi wait functions where the event is a
cancelation token, e.g. manual reset event; this allows someone to cancel
waiting on mutex acquisition and return from the wait function.  Presumably
that's the case here, but I'll let Aleksej confirm; just wanted to throw
this out there in the meantime :).

Sent from my phone
On May 13, 2014 6:46 PM, David Holmes david.hol...@oracle.com wrote:

 Hi Aleksej,

 Thanks for the doc references regarding abandonment.

 Let me rephrase my question. What is this logic trying to achieve by
 waiting on both a mutex and an event? Do we already own the mutex when this
 function is called?

 David

 On 13/05/2014 11:19 PM, Aleksej Efimov wrote:

 David,

 The Windows has a different terminology for mutex objects (much differs
 from the POSIX one). This one link gave me some understanding of it [1].

 Here is the MSDN [1] description of what abandoned mutex is:
  If a thread terminates without releasing its ownership of a mutex
 object, the mutex object is considered to be abandoned. A waiting thread
 can acquire ownership of an abandoned mutex object, but the wait
 function will return*WAIT_ABANDONED*to indicate that the mutex object is
 abandoned. An abandoned mutex object indicates that an error has
 occurred and that any shared resource being protected by the mutex
 object is in an undefined state. If the thread proceeds as though the
 mutex object had not been abandoned, it is no longer considered
 abandoned after the thread releases its ownership. This restores normal
 behavior if a handle to the mutex object is subsequently specified in a
 wait function.


 What does it mean to wait on mutex and ownership of the mutex object:
 Any thread with a handle to a mutex object can use one of thewait
 functions
 http://msdn.microsoft.com/en-gb/library/windows/desktop/
 ms687069%28v=vs.85%29.aspxto
 request ownership of the mutex object. If the mutex object is owned by
 another thread, the wait function blocks the requesting thread until the
 owning thread releases the mutex object using the*ReleaseMutex*
 http://msdn.microsoft.com/en-gb/library/windows/desktop/
 ms685066%28v=vs.85%29.aspxfunction.

 How we can release mutex and wait on already owned mutex:
  After a thread obtains ownership of a mutex, it can specify the same
 mutex in repeated calls to the wait-functions
 http://msdn.microsoft.com/en-gb/library/windows/desktop/
 ms687069%28v=vs.85%29.aspxwithout
 blocking its execution. This prevents a thread from deadlocking itself
 while waiting for a mutex that it already owns. To release its ownership
 under such circumstances, the thread must call*ReleaseMutex*
 http://msdn.microsoft.com/en-gb/library/windows/desktop/
 ms685066%28v=vs.85%29.aspxonce
 for each time that the mutex satisfied the conditions of a wait function.

 [1]
 http://msdn.microsoft.com/en-gb/library/windows/desktop/
 ms684266(v=vs.85).aspx

 -Aleksej

 On 05/13/2014 04:00 PM, David Holmes wrote:

 I don't understand this one at all. What is an abandoned mutex? For
 that matter why does the code wait on a mutex and an event? Do we
 already own the mutex? If so what does it mean to wait on it? If not
 then how can we release it?

 ???

 Thanks,
 David


 On 13/05/2014 8:57 PM, Alan Bateman wrote:


 This is debugger's shared memory transport so cc'ing serviceability-dev
 as that is there this code is maintained.

 Is there a test case or any outline of the conditions that cause this? I
 think that would be useful to understand the issue further.

 -Alan

 On 13/05/2014 11:46, Aleksej Efimov wrote:

 Hi,

 Can I have a review for 8032901 bug [1] fix [2]. There is a possible
 case when 'WaitForMultipleObjects' function can return the
 WAIT_ABANDONED_0 [3] error value.
 In such case it's better to release the mutex and return error value.
 This will prevent other threads to be blocked on abandoned mutex.

 Thank you,
 Aleksej

 [1] https://bugs.openjdk.java.net/browse/JDK-8032901
 [2] http://cr.openjdk.java.net/~aefimov/8032901/9/webrev.00/
 [3]
 http://msdn.microsoft.com/en-gb/library/windows/desktop/
 ms687025(v=vs.85).aspx