Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-19 Thread Artem Aliev

Weldon,


If the current scheme is the same that we had 1 or 2 years ago, the answer
is no


This is just the same scheme!


I am really hoping that all of this is simply an implementation
bug.


There are no open issues on this scheme, and there are no examples that
fail right now because of suspend_all.


 The bottom line is that to make the system easy to reason about, a
thread should always be in suspend_enable mode before it does anything that
might block.


We already talked about that at the top of the thread. I agree with
that, and will add some debug capability to the TM.
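
A minimal sketch of the kind of debug check discussed here, assuming a
hypothetical per-thread nesting counter. The names model_thread,
model_suspend_disable, model_suspend_enable, model_is_suspend_enabled and
model_check_before_blocking are illustrative only and are not the TM API:

```c
#include <assert.h>

/* Hypothetical per-thread state; NOT the real hythread structure. */
typedef struct {
    int suspend_disable_count;   /* 0 means suspension is enabled */
} model_thread;

/* Enter a region where the thread must not be suspended. */
static void model_suspend_disable(model_thread *t) {
    t->suspend_disable_count++;
}

/* Leave such a region; calls must be balanced with disable calls. */
static void model_suspend_enable(model_thread *t) {
    assert(t->suspend_disable_count > 0);
    t->suspend_disable_count--;
}

static int model_is_suspend_enabled(const model_thread *t) {
    return t->suspend_disable_count == 0;
}

/* Debug hook: call before anything that might block. */
static void model_check_before_blocking(const model_thread *t) {
    assert(model_is_suspend_enabled(t) && "blocking in suspend_disable mode");
}
```

With a check like this in front of every potentially blocking call, a
violation of the rule shows up as an assertion failure in debug builds
rather than as a rare deadlock.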


Eugeny,

Actually, the code is not ideal, and there are a lot of things to do on it.
You could contribute your ideas to the code and test them.

The suspend_all code contains a number of compromises that were
produced by failures of different workloads and stress tests.

I attach a patch that implements one of your ideas: to hold the global
thread lock between suspend_all and resume_all.
As I remember, it could cause a deadlock in JVMTI. Probably something
has changed in the meantime.


Thanks
Artem
Index: vm/thread/src/thread_native_suspend.c
===
--- vm/thread/src/thread_native_suspend.c	(revision 464417)
+++ vm/thread/src/thread_native_suspend.c	(working copy)
@@ -420,7 +420,7 @@
 }
 
 hythread_iterator_reset(iter);
-hythread_iterator_release(iter);
+//hythread_iterator_release(iter);
 if(t) 
 {
 *t=iter;
@@ -450,6 +450,8 @@
 }
 
 hythread_iterator_release(iter); 
+hythread_iterator_release(iter);
+
 return TM_ERROR_NONE;
 }
 
-
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-19 Thread Weldon Washburn

On 10/18/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:


 I agree it is required to have a solid model in mind. I also believe
 it is required to have such design/implementation which doesn't allow
 to break that model.

No, that's the point if functionality is so safe that it's impossible
to break the model,
it's not so highly important(as in our case) to understand how it
works, this model will restore itself.



hmmm this is probably a case where we are all saying the same thing only
using different words.  Nikolay, would it be possible for you to try a
different way of explaining the above?  It would really be appreciated.


It's ok to have safe points inside disabled regions only if it is
 really safe to enable GC at that point. All such cases should be taken
 with extreme caution. In our particular case we can't guarantee that
 it is safe to suspend the thread. That's why I think having something
 like assert(hythread_is_suspend_enabled) in the beginning of
 hythread_suspend_all is really required? Of cause it will require some
 changes in VM and TM...

Again, I agree that sometimes safepoints enable suspension in wrong
places an assert must be placed inside conditions, for instance, but
suspend_all is the rare place where safe_point placed in
suspend_disable region intentionally, by design(please refer to the
lock semantics of safe regions in my answer to Weldon).

 Could you give the most impossible thing to do?
Peace All Over the World? :)

 I was thinking of TM
 as of quite independent component. But now it seems like some parts of
 it are really depend on DRLVM implementation. :-(

First of all, TM _is_ independent component, but some of its functions
designed for special usage(it's potentially unsafe to nail up smth
with the gun, for instance).
Also, I believe that TM suspension safe enough an should not be
rewritten w/o special need, and even if it should the performance of
this functionality should be always in mind.
Current suspension scheme was tested on multiple workloads and tuned
to achieve better performance, and note that not even using additional
locks but having CAS to perform suspend_disable/enable leads to
noticeable performance loss.

Actually my idea from the beginning is that while we don't have a
scenario we should not change suspension algorithm. It's good enough,
robust tuned for better performance of GC.

Nik.






--
Weldon Washburn
Intel Middleware Products Division


Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Evgueni Brevnov

On 10/17/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:

Hello All,

first of all I'd like to emphasize that suspend/resume_all functions
are potentially unsafe
and should be used with care.

secondly, those methods were designed mainly to support
stop_the_world_enumeration
and thus usually being used under certain conditions.


Hmmm... it is strange to hear words like that. TM provides a
specification for this particular method, and the method should perform
according to that specification. As a developer, I don't want to care
about a particular use-case scenario.



 1)  I found that hythread_suspend_all calls thread_safe_point_impl
 inside. There is no assertion regarding the thread's state upon entering
 hythread_suspend_all. So it can be called in suspend disabled state,
 and nobody (at least not me) expects to have a safe point during
 hythread_suspend_all. The problem seems to be very similar to the
 one discussed in "[drlvm][threading] Possible race condition in
 implementation of conditional variables?". Your thoughts?

The code of the suspend_all method is dedicated to the cyclic suspension problem.
The fact that this method is called from a suspend_disable region and
has a safe_point within it is all about cyclic suspension. A lot of
time was spent resolving deadlocks caused by two threads trying to
suspend each other.

I agree that the problem is similar to the one with conditions, but I believe
that this one should be discussed as part of a particular scenario.






 2) Assume I need to suspend all threads in a particular group. OK, I pass
 that group to hythread_suspend_all. Later, when all suspended threads
 should be resumed, I pass the same group to hythread_resume_all. But
 while the threads were suspended the group has changed. For example, one
 new thread was added to it. So the questions are: Is it acceptable to have
 such unsafe functionality? Would it be better to lock the group in
 hythread_suspend_all and unlock it in hythread_resume_all?

First of all, I would differentiate j.l.ThreadGroup from the thread groups
defined by the thread manager (by that I mean that this method was not
designed for ordinary use, like ThreadGroup.suspend()), and after that
return to the question of why we would need it (I mean, it would be
better to have a particular scenario), and then we can discuss how to
implement this. Till now suspend_all method was designed to work
within one group (in particular default group, containing java
threads), and called by GC.


Nikolay, I understand there is only one use case for now. Again, I
expect the method to work according to the spec, not according to how it
is used in some particular case. Could you clarify what you mean by saying
"Till now suspend_all method was designed to work within one group (in
particular default group, containing java threads), and called by GC"?
Why do you have such a restriction? Where is it specified?

Thanks
Evgueni



Thank you.
  Nik.

 Thanks
 Evgueni











Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Evgueni Brevnov

Weldon,

I agree with what you are saying above. Do you think it makes
sense to call hythread_suspend_all in enabled state only?

Evgueni

On 10/17/06, Weldon Washburn [EMAIL PROTECTED] wrote:

On 10/17/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:

 Hello All,

 first of all I'd like to emphasize that suspend/resume_all functions
 are potentially unsafe
 and should be used with care.


In specific, with a solid model of system behavior in mind.

secondly, those methods were designed mainly to support
 stop_the_world_enumeration
 and thus usually being used under certain conditions.

  1)  I found that hythread_suspend_all calls thread_safe_point_impl
  inside. There is no assertion regarding thread's state upon entering
  hythread_suspend_all. So it can be called in suspend disabled state
  and nobody (at least me) expects to have a safe point during
  hythread_suspend_all.


The simplest model is to grab the thread lock whenever thread A wants to
suspend thread B at a safepoint.  While this serializes thread suspension
and can potentially be a bottleneck, let's wait until it's a proven
performance problem to change this model.  For thread A to be ready to grab
the thread lock, thread A must have all its java live references put in a
place where the GC will see them.  Why?  Because thread A may block.  Once
thread A obtains the lock, it can disable suspension if it likes, reload the
java live refs and do whatever it needs but make certain it is quick and non
blocking.  If thread A needs to block on some OS call, etc, it will need to
re-enable suspension and abandon the thread lock.  Why? Because if thread A
blocks while holding the global thread lock, there may be deadlock or
latency problems.


Did you try the above approach?  Are there deadlocks?


The problem seems to be very similar with the
  one discussed in [drlvm][threading] Possible race condition in
  implementation of conditional variables? Your thoughts?

 The code of suspend_all method is dedicated to the cyclic suspension
 problem.
 The fact that this method is being called from suspend_disable region and
 have safe_point in within is all about cyclic suspension. A lot of
 time was spent to resolve deadlocks cause by two threads trying to
 suspend each other.

 I agree that problem is similar to one with conditions, but I believe
 that this one should be discussed as a part of particular scenario.

  2) Assume I need to suspend all threads in particular group. Ok I pass
  that group to hythread_suspend_all. Later when all suspended threads
  should be resumed I pass the same group to hythread_resume_all. But
  threads were suspended group has changed. For example one new thread
  was added to it. So the questions are. Is it acceptable to have such
  unsafe functionality? Would it better to lock the group in
  hythread_suspend_all and unlock it in hythread_resume_all.

 First of all I would differentiate j.l.ThreadGroup and thread groups
 defined by thread manager(saying that I mean that this method was not
 designed for ordinary use, like ThreadGroup.suspend()), and after that
 return to the question why we would need it (I mean, it would be
 better to have particular scenario) and then we can discuss how to
 implement this. Till now suspend_all method was designed to work
 within one group(in particular default group, containing java
 threads), and called be GC.

 Thank you.
   Nik.

  Thanks
  Evgueni
 
 
 





--
Weldon Washburn
Intel Middleware Products Division







Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Evgueni Brevnov

On 10/17/06, Salikh Zakirov [EMAIL PROTECTED] wrote:

Evgueni Brevnov wrote:
 Hi All,

 I'd like to hear your opinion regarding the hythread_suspend_all and
 hythread_resume_all functionality provided by TM. Actually I have two
 separate questions:

 1)  I found that hythread_suspend_all calls thread_safe_point_impl
 inside. There is no assertion regarding thread's state upon entering
 hythread_suspend_all. So it can be called in suspend disabled state
 and nobody (at least me) expects to have a safe point during
 hythread_suspend_all. The problem seems to be very similar with the
 one discussed in [drlvm][threading] Possible race condition in
 implementation of conditional variables? Your thoughts?

The code you see is there to prevent the following deadlock scenario:

  Thread A                      Thread B
     |                             |
     |                          suspend(A);
     |                             A->suspend_request = 1;
     |                             wait for A to reach a safepoint...
     |                             |
  suspend_all()                    |
     B->suspend_request = 1
     wait for B to reach a safepoint ...

and then the two threads wait for one another forever.
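
The scenario above can be modelled in a few lines. This is a simplified,
single-threaded simulation, not the TM code: sim_thread, sim_wait_step and
sim_cyclic_suspend are hypothetical names, and the "threads" are just two
structs stepped in turn. It only illustrates why a waiter that polls its
own suspend request (a safe point inside the wait loop) breaks the cycle:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a thread's suspension state. */
typedef struct {
    bool suspend_request;  /* set by another thread that wants us stopped */
    bool at_safepoint;     /* set by ourselves once we honour the request */
} sim_thread;

/* One iteration of "wait until target reaches a safepoint".
   If poll_self is true, the waiter also honours a request against itself. */
static bool sim_wait_step(sim_thread *self, const sim_thread *target,
                          bool poll_self) {
    if (poll_self && self->suspend_request) {
        self->at_safepoint = true;   /* the waiter parks itself */
    }
    return target->at_safepoint;
}

/* Returns true if the mutual suspend resolves within max_steps. */
static bool sim_cyclic_suspend(bool poll_self, int max_steps) {
    sim_thread a = {false, false};
    sim_thread b = {false, false};
    a.suspend_request = true;        /* B called suspend(A)      */
    b.suspend_request = true;        /* A called suspend_all()   */
    for (int i = 0; i < max_steps; i++) {
        /* A waits inside suspend_all(); B waits inside suspend(A),
           which has no safe point in this model. */
        bool a_done = sim_wait_step(&a, &b, poll_self);
        bool b_done = sim_wait_step(&b, &a, false);
        if (a_done || b_done) {
            return true;             /* one side made progress */
        }
    }
    return false;                    /* both still waiting: deadlock */
}
```

In this model sim_cyclic_suspend(false, n) never resolves for any n, while
sim_cyclic_suspend(true, n) resolves on the first step because A
acknowledges B's request while it is waiting.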


Salikh, I see your scenario... I don't suggest removing safe points
from hythread_suspend_all. On the contrary, I believe it makes sense to
suspend other threads only if the suspender thread is in a safe region.
Agree?



 2) Assume I need to suspend all threads in particular group. Ok I pass
 that group to hythread_suspend_all. Later when all suspended threads
 should be resumed I pass the same group to hythread_resume_all. But
 threads were suspended group has changed. For example one new thread
 was added to it. So the questions are. Is it acceptable to have such
 unsafe functionality? Would it better to lock the group in
 hythread_suspend_all and unlock it in hythread_resume_all.

We may as well leave it as the responsibility of application / TI agent
writer not to modify a suspended thread group.
Why do you think this should be enforced?


In general, any good design should strive to eliminate or minimize
cases of illegal use of the interface. In other words, it should be
hard to use it in a buggy way. In our case that means it is better to
ensure integrity inside TM instead of making the application responsible
for it. If we can do it inside, what is the reason not to do it?
Moreover, if you look at the spec of hythread_suspend_all, it states:
"...This method sets a suspend request for the every thread in the
group and then returns the iterator that can be used to traverse
through the suspended threads..." But the implementation contradicts
that. If the group is changed while you are traversing it,
you can get the wrong thread. What is even worse, you can get a crash when
one thread is iterating through the group while another thread inserts
new elements into it (or removes them).
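
A minimal sketch of the "lock the group between suspend_all and resume_all"
idea, as a single-threaded model. The type sim_group and the functions
sim_group_add, sim_suspend_all and sim_resume_all are hypothetical, and the
frozen flag only stands in for a real group lock. In a real implementation
the adding thread would presumably block on that lock rather than get a
failure return:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_GROUP_MAX 8

/* Hypothetical thread group; NOT the real TM group structure. */
typedef struct {
    bool suspended[SIM_GROUP_MAX];  /* per-member suspended flag */
    size_t count;                   /* number of members */
    bool frozen;                    /* stands in for a held group lock */
} sim_group;

/* Add a member; refused while the group is frozen or full. */
static bool sim_group_add(sim_group *g) {
    if (g->frozen || g->count == SIM_GROUP_MAX) {
        return false;
    }
    g->suspended[g->count++] = false;
    return true;
}

/* Freeze membership first, then mark every current member suspended. */
static void sim_suspend_all(sim_group *g) {
    g->frozen = true;
    for (size_t i = 0; i < g->count; i++) {
        g->suspended[i] = true;
    }
}

/* Resume exactly the members that were suspended, then unfreeze. */
static void sim_resume_all(sim_group *g) {
    for (size_t i = 0; i < g->count; i++) {
        g->suspended[i] = false;
    }
    g->frozen = false;
}
```

Because membership cannot change between the two calls, the set of threads
resumed is the same set that was suspended, and an iterator handed out by
the suspend call stays valid until the resume call.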

Evgueni










Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Nikolay Kuznetsov

Evgueni,


 first of all I'd like to emphasize that suspend/resume_all functions
 are potentially unsafe
 and should be used with care.

 secondly, those methods were designed mainly to support
 stop_the_world_enumeration
 and thus usually being used under certain conditions.

hmmm... it is strange to hear words like that. TM provides
specification for this particular method and it should perform
according to that specification. I as a developer don't want to care
about particular use case scenario.


I'd say that w/o stop_the_world enumeration we would not include this function
in the interface, and w/o Thread.stop() (which is deprecated) and
jvmtiThreadStop (which is used in the debugger) we would not include even
the suspend function in the interface, like pthread or the original version
of hythread.

Most other implementations have some special notes about suspend:
MSDN:
"This function is primarily designed for use by debuggers. It is not
intended to be used for thread synchronization."

HP pthread_suspend_np (not portable):
"This functionality enables a process that is multithreaded to
temporarily suspend all activity to a single thread of control. When
the process is single threaded, the address space is not changing, and
a consistent view of the process can be gathered."

So this function is unsafe, and should be used with care and, "in
specific, with a solid model of system behavior in mind" (c) Weldon.

And returning to the question of safepoints inside suspend_disable
regions: we have a lot of such places inside the VM, and suspend_all is
probably the only place where the safepoint was left intentionally
(I mean because of cyclic suspends).


Nikolay, I understand there is only one use case for now. Again I
expect the method works according to the spec but not how it is used
in some particular case. Could you clarify what you mean by saying
Till now suspend_all method was designed to work within one group(in
particular default group, containing java
threads), and called be GC. Why do you have such restriction? Where
it is specified?


We have such restrictions because it's extremely hard to implement a
general version (one not limited to GC plus limited use of TI). The spec on
suspend/suspend_all function usage scenarios can be found at:
drlvm/trunk/vm/thread/doc/ThreadManager.htm paragraphs 6.2 and 6.3

Thank you.
  Nik.




Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Nikolay Kuznetsov

hmm I never thought of it that way.  My initial reaction is no.  Suspend
enable/disable and global thread lock are separate, distinct concepts.  The
thread lock should protect the VM internal thread structs when they
are being modified.   For example, the thread lock should allow only
one thread create/die at any given instant.  The enable/disable state is
incidental to this event. This is independent of the concept of a thread
running native code being in a state where the GC can find all its live
references.  If a thread needs to grab the thread lock, of course, it needs
to put itself in a suspend enable mode because it might have to wait for the
lock.


Yes, I agree that the global lock allows only one thread to create/die (and
so on) at any given moment, while suspend_disable/enable affect only
suspension functionality. But in fact suspend_disable is a per-thread lock
for suspension: if it is taken (suspend_disable called), another thread
can't suspend this particular thread until the lock is released
(suspend_enable called). And I believe that additional synchronization is
excessive and very expensive.

Also, my opinion is that the suspension scheme is the last place in DRLVM
that should be changed w/o any open issue or problem that depends
on it (unless we do have problems in GC with regard to suspension). Do you
really think that the current scheme is unsafe and should be redesigned?

Nik.




Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Evgueni Brevnov

Hi Nikolay!

On 10/18/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:

Evgueni,

  first of all I'd like to emphasize that suspend/resume_all functions
  are potentially unsafe
  and should be used with care.
 
  secondly, those methods were designed mainly to support
  stop_the_world_enumeration
  and thus usually being used under certain conditions.

 hmmm... it is strange to hear words like that. TM provides
 specification for this particular method and it should perform
 according to that specification. I as a developer don't want to care
 about particular use case scenario.

I'd say that w/o stop_the_world enumeration we would not include this function
to the interface, and w/o Thread,stop()(which is deprecated and
jvmtiThreadStop which is
used in debugger) we would not include even suspend function to the
interface, like pthread or original version of hythread.

Most of the others implementations have some special notes about suspend:
MSDN:
This function is primarily designed for use by debuggers. It is not
intended to be used for thread synchronization.

HP pthread_suspend_np(not portable):
This functionality enables a process that is multithreaded to
temporarily suspend all activity to a single thread of control. When
the process is single threaded, the address space is not changing, and
a consistent view of the process can be gathered.

So, this function is unsafe, and should be used with care and In
specific, with a solid model of system behavior in mind.(c) Weldon;


I agree it is required to have a solid model in mind. I also believe
it is required to have a design/implementation that doesn't allow
that model to be broken.



And returning to the question of safepoints inside suspend_disable
regions, we have a lot
of such places inside a VM and suspend_all is probably the only place
where this safepoint was left intentionally(I mean cyclic suspends).


It's OK to have safe points inside disabled regions only if it is
really safe to enable GC at that point. All such cases should be treated
with extreme caution. In our particular case we can't guarantee that
it is safe to suspend the thread. That's why I think having something
like assert(hythread_is_suspend_enabled) at the beginning of
hythread_suspend_all is really required. Of course, it will require some
changes in VM and TM...



 Nikolay, I understand there is only one use case for now. Again I
 expect the method works according to the spec but not how it is used
 in some particular case. Could you clarify what you mean by saying
 Till now suspend_all method was designed to work within one group(in
 particular default group, containing java
 threads), and called be GC. Why do you have such restriction? Where
 it is specified?

We have such restrictions, because it's extremly hard to implement
common(not only for GC + limited use of TI). The spec on
suspend/suspend_all function usage scenarious can be found at:
drlvm/trunk/vm/thread/doc/ThreadManager.htm paragraphs 6.2 and 6.3


Could you give the most impossible thing to do? I was thinking of TM
as quite an independent component. But now it seems that some parts of
it really depend on the DRLVM implementation. :-(

Thanks
Evgueni



Thank you.
  Nik.








Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Nikolay Kuznetsov

I agree it is required to have a solid model in mind. I also believe
it is required to have such design/implementation which doesn't allow
to break that model.


No, that's the point: if functionality is so safe that it's impossible
to break the model, it's not so important (as it is in our case) to
understand how it works; the model will restore itself.


It's ok to have safe points inside disabled regions only if it is
really safe to enable GC at that point. All such cases should be taken
with extreme caution. In our particular case we can't guarantee that
it is safe to suspend the thread. That's why I think having something
like assert(hythread_is_suspend_enabled) in the beginning of
hythread_suspend_all is really required? Of cause it will require some
changes in VM and TM...


Again, I agree that sometimes safepoints enable suspension in the wrong
places, and an assert must be placed inside conditions, for instance. But
suspend_all is the rare place where a safe_point is placed in a
suspend_disable region intentionally, by design (please refer to the
lock semantics of safe regions in my answer to Weldon).


Could you give the most impossible thing to do?

Peace All Over the World? :)


I was thinking of TM
as of quite independent component. But now it seems like some parts of
it are really depend on DRLVM implementation. :-(


First of all, TM _is_ an independent component, but some of its functions
are designed for special usage (it's potentially unsafe to nail smth up
with a gun, for instance).
Also, I believe that TM suspension is safe enough and should not be
rewritten w/o special need, and even if it should be, the performance of
this functionality should always be kept in mind.
The current suspension scheme was tested on multiple workloads and tuned
to achieve better performance. Note that even without additional
locks, merely using CAS to perform suspend_disable/enable leads to a
noticeable performance loss.

Actually, my idea from the beginning is that while we don't have a
scenario we should not change the suspension algorithm. It's good enough,
robust, and tuned for better GC performance.

Nik.




Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Evgueni Brevnov

It seems we are not in sync. I don't suggest changing the current
suspension scheme... I like it. I'm talking about one particular
case, and I still can't see any disadvantages in what I propose...

On 10/18/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:

 I agree it is required to have a solid model in mind. I also believe
 it is required to have such design/implementation which doesn't allow
 to break that model.

No, that's the point if functionality is so safe that it's impossible
to break the model,
it's not so highly important(as in our case) to understand how it
works, this model will restore itself.

 It's ok to have safe points inside disabled regions only if it is
 really safe to enable GC at that point. All such cases should be taken
 with extreme caution. In our particular case we can't guarantee that
 it is safe to suspend the thread. That's why I think having something
 like assert(hythread_is_suspend_enabled) in the beginning of
 hythread_suspend_all is really required? Of cause it will require some
 changes in VM and TM...

Again, I agree that sometimes safepoints enable suspension in wrong
places an assert must be placed inside conditions, for instance, but
suspend_all is the rare place where safe_point placed in
suspend_disable region intentionally, by design(please refer to the
lock semantics of safe regions in my answer to Weldon).

 Could you give the most impossible thing to do?
Peace All Over the World? :)

 I was thinking of TM
 as of quite independent component. But now it seems like some parts of
 it are really depend on DRLVM implementation. :-(

First of all, TM _is_ independent component, but some of its functions
designed for special usage(it's potentially unsafe to nail up smth
with the gun, for instance).
Also, I believe that TM suspension safe enough an should not be
rewritten w/o special need, and even if it should the performance of
this functionality should be always in mind.
Current suspension scheme was tested on multiple workloads and tuned
to achieve better performance, and note that not even using additional
locks but having CAS to perform suspend_disable/enable leads to
noticeable performance loss.

Actually my idea from the beginning is that while we don't have a
scenario we should not change suspension algorithm. It's good enough,
robust tuned for better performance of GC.

Nik.








Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Weldon Washburn

On 10/18/06, Evgueni Brevnov [EMAIL PROTECTED] wrote:


On 10/17/06, Salikh Zakirov [EMAIL PROTECTED] wrote:
 Evgueni Brevnov wrote:
  Hi All,
 
  I'd like to here you opinion regarding hythread_suspend_all and
  hythread_resume_all functionality provided by TM. Actually I have to
  separate questions:
 
  1)  I found that hythread_suspend_all calls thread_safe_point_impl
  inside. There is no assertion regarding thread's state upon entering
  hythread_suspend_all. So it can be called in suspend disabled state
  and nobody (at least me) expects to have a safe point during
  hythread_suspend_all. The problem seems to be very similar with the
  one discussed in [drlvm][threading] Possible race condition in
  implementation of conditional variables? Your thoughts?

 The code you see is there to prevent following deadlock scenario:

   Thread A                      Thread B
      |                             |
      |                          suspend(A);
      |                             A->suspend_request = 1;
      |                             wait for A to reach a safepoint...
      |                             |
   suspend_all()                    |
      B->suspend_request = 1
      wait for B to reach a safepoint ...

 and then two threads are infinitely waiting one another.

Salikh, I see your scenario...I don't suggest to remove safe points
from hythread_suspend_all. Contrary I believe it makes sense to
suspend other threads only if it suspender thread is in a safe region.
Agree?



This seems to make the most sense.  I think there might be an argument that
Thread A trying to suspend all other threads really does not have to be in
suspend_enable mode.  But this corner case probably adds nothing but
clutter/confusion to the design.  As a design rule, I believe any time a
thread calls a function that might block, the thread should be in suspend
enable mode.  This simple rule makes it much easier for the whole team
working on the code base to know what to do.

Suppose Thread A intends to suspend a subset of all java threads.  Thread A
will need to block somehow and wait for the complete subset to get to
suspended state with their stacks enumerable (suspend_enabled).  Meanwhile
over on another CPU in the SMP box, a bunch of non-suspended threads chew up
gobs of memory and one of them calls for a stop-the-world GC.  Ultimately
Thread A as well as the subset that was suspended needs to be in a
suspend_enabled state.

The easiest sync model to reason about is one where Thread A suspends its
target subset of java threads *before* allowing the stop-the-world gc to
proceed.  A global thread/gc lock provides this guarantee.



  2) Assume I need to suspend all threads in particular group. Ok I pass
  that group to hythread_suspend_all. Later when all suspended threads
  should be resumed I pass the same group to hythread_resume_all. But
  threads were suspended group has changed. For example one new thread
  was added to it. So the questions are. Is it acceptable to have such
  unsafe functionality? Would it better to lock the group in
  hythread_suspend_all and unlock it in hythread_resume_all.

 We may as well leave it as the responsibility of application / TI agent
 writer not to modify a suspended thread group.
 Why do you think this should be enforced?

In general, any good design should strive to eliminate/minimize
cases of illegal use of the interface. In other words it should be
hard to use it in a buggy way. In our case that means it is better to
ensure integrity inside TM instead of making application responsible
for that  if we can do it inside what is the reason not to do it?



Yes.  Good idea provided it really can be done.


Moreover, if you look at the spec of hythread_suspend_all, it states:

...This method sets a suspend request for the every thread in the
group and then returns the iterator that can be used to traverse
through the suspended threads... But the implementation contradicts
that. If the group is changed while you are traversing it,
you can get the wrong thread.



Agreed.  What you describe can be a problem.  I would like to see a
conservative, simple design.  Once we get the sync functional part robust, we
can look at the performance problems.  While it is possible to
walk a linked list while the list is changing, this adds way
too much confusion at this stage of VM development.  And the VM does not
yet have enough performance for it to make sense to look at such fine detail.


What is even worse, you can get a crash when
one thread is iterating through the group while another thread inserts
new elements into it (or removes them).



Actually a crash would be the best failure I can think of.  At least the
crash occurs somewhere close to the bad code.  A worse scenario is that you
can fool yourself into thinking you have processed all the threads on the
list when,  in fact, you may have missed a couple.  And the impact may not
surface for 10 seconds...

Evgueni




Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Weldon Washburn

On 10/18/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:


 Hmm, I never thought of it that way.  My initial reaction is no.  Suspend
 enable/disable and the global thread lock are separate, distinct concepts.
 The thread lock should protect the VM internal thread structs when they
 are being modified.  For example, the thread lock should allow only
 one thread to be created or to die at any given instant.  The enable/disable
 state is incidental to this event.  This is independent of the concept of a
 thread running native code being in a state where the GC can find all its
 live references.  If a thread needs to grab the thread lock, of course, it
 needs to put itself in suspend enable mode because it might have to wait
 for the lock.

Yes, I agree that the global lock allows only one thread to be created or to
die (and so on) at any given moment, while suspend_disable/enable affect only
the suspension functionality. But in fact suspend_disable is a per-thread
lock for suspension: while it is taken (suspend_disable called), no other
thread can suspend that particular thread until the lock is released
(suspend_enable called). And I believe that additional synchronization
is excessive and very expensive.



This is interesting.  A thread's suspend enable/disable state is basically
one bit of thread-local storage info that is only written by the owning
thread.  And is only read by other threads in the system.  There is no lock
protocol on this bit.  It should be a very cheap operation.  Is there evidence
that this operation is expensive?

Also, note we have to take into account the hardware memory model.  And, as
fate would have it, different HW has different memory models.  For example,
Intel 32-bit has what is known as write ordering.  Basically this means
that writes inside of a CPU will hit the SMP coherency domain in the order
of the program.  There is no guarantee precisely when the writes hit the
bus.  Bottom line: Thread A can toggle its enable/disable bit, and other CPUs
will _eventually_ see the writes in the order they happened.  PPC
is different, IPF is different.
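A minimal sketch of how such a single-writer bit could be expressed portably, assuming C11 atomics are available. The type and function names (demo_thread, demo_set_suspend_enabled, demo_is_suspend_enabled) are hypothetical and are not part of the hythread API. The release/acquire pair asks the compiler and CPU for the write ordering discussed above on any hardware; it says nothing about how soon another CPU observes the store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread record; not the real hythread structure. */
typedef struct {
    atomic_bool suspend_enabled; /* written only by the owning thread */
} demo_thread;

static void demo_thread_init(demo_thread *t) {
    assert(t != NULL);
    atomic_init(&t->suspend_enabled, true);
}

/* Owner side: publish the new state. The release store keeps the owner's
 * earlier writes (for example, spilled live references) ordered before it. */
static void demo_set_suspend_enabled(demo_thread *self, bool enabled) {
    assert(self != NULL);
    atomic_store_explicit(&self->suspend_enabled, enabled,
                          memory_order_release);
}

/* Reader side: any other thread may read the bit. The acquire load pairs
 * with the release store above; no lock is taken. */
static bool demo_is_suspend_enabled(demo_thread *t) {
    assert(t != NULL);
    return atomic_load_explicit(&t->suspend_enabled, memory_order_acquire);
}
```

On x86 these operations compile to plain loads and stores, which matches the "very cheap" expectation; on weaker memory models the compiler inserts whatever barriers the hardware needs.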

Grabbing the thread system lock will get expensive if it is done at a high
rate.  My initial hunch is that grabbing the thread system lock happens at
low frequency.  Why?  Because operations such as thread create/kill, thread
suspend/resume, get thread group, thread interrupt, etc. happen at rather low
frequency.  Is there evidence that workloads we care about will take the
thread system lock at high frequency?

Also, my opinion is that the suspension scheme is the last place in DRLVM
that should be changed without any open issue or problem that depends
on it (or do we have problems in the GC with regard to suspension?). Do you
really think that the current scheme is unsafe and should be redesigned?



If the current scheme is the same that we had 1 or 2 years ago, the answer
is no.  I am really hoping that all of this is simply an implementation
bug.  The bottom line is that to make the system easy to reason about, a
thread should always be in suspend_enable mode before it does anything that
might block.
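The rule above lends itself to a debug check. The following is only a sketch under stated assumptions: the thread-local flag and the demo_* names are hypothetical stand-ins for whatever per-thread state the TM really keeps, and a plain pthread mutex stands in for any call that might block.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thread-local flag; the real TM tracks this differently. */
static _Thread_local bool demo_suspend_enabled = true;

static void demo_suspend_disable(void) { demo_suspend_enabled = false; }
static void demo_suspend_enable(void)  { demo_suspend_enabled = true; }

/* Wrapper for a potentially blocking call. It refuses to block when the
 * caller is in suspend_disable mode. Returns 0 on success and -1 when the
 * rule is violated or the lock call fails. */
static int demo_checked_lock(pthread_mutex_t *m) {
    assert(m != NULL);
    if (!demo_suspend_enabled) {
        return -1; /* a debug build could abort here instead */
    }
    return pthread_mutex_lock(m) == 0 ? 0 : -1;
}
```

Routing every blocking primitive through a wrapper of this kind would turn a violation of the rule into an immediate, local failure rather than a rare deadlock.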

Nik.







--
Weldon Washburn
Intel Middleware Products Division


Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-18 Thread Evgueni Brevnov

On 10/19/06, Weldon Washburn [EMAIL PROTECTED] wrote:

On 10/18/06, Evgueni Brevnov [EMAIL PROTECTED] wrote:

 On 10/17/06, Salikh Zakirov [EMAIL PROTECTED] wrote:
  Evgueni Brevnov wrote:
   Hi All,
  
   I'd like to hear your opinion regarding the hythread_suspend_all and
   hythread_resume_all functionality provided by TM. Actually I have two
   separate questions:
  
   1)  I found that hythread_suspend_all calls thread_safe_point_impl
   inside. There is no assertion regarding the thread's state upon entering
   hythread_suspend_all. So it can be called in suspend disabled state,
   and nobody (at least not me) expects to have a safe point during
   hythread_suspend_all. The problem seems to be very similar to the
   one discussed in [drlvm][threading] Possible race condition in
   implementation of conditional variables? Your thoughts?
 
  The code you see is there to prevent the following deadlock scenario:
 
  Thread A                          Thread B
     |                                 |
     |                              suspend(A);
     |                              A->suspend_request = 1;
     |                              wait for A to reach a safepoint...
     |                                 |
  suspend_all()                        |
  B->suspend_request = 1
  wait for B to reach a safepoint...
 
  and then the two threads wait for each other forever.

 Salikh, I see your scenario... I don't suggest removing safe points
 from hythread_suspend_all. On the contrary, I believe it makes sense to
 suspend other threads only if the suspender thread is in a safe region.
 Agree?


This seems to make the most sense.  I think there might be an argument that
Thread A trying to suspend all other threads really does not have to be in
suspend_enable mode.  But this corner case probably adds nothing but
clutter/confusion to the design.  As a design rule, I believe any time a
thread calls a function that might block, the thread should be in suspend
enable mode.  This simple rule makes it much easier for the whole team
working on the code base to know what to do.


Strongly agree!



Suppose Thread A intends to suspend a subset of all java threads.  Thread A
will need to block somehow and wait for the complete subset to get to
suspended state with their stacks enumerable (suspend_enabled).  Meanwhile
over on another CPU in the SMP box, a bunch of non-suspended threads chew up
gobs of memory and one of them calls for a stop-the-world GC.  Ultimately
Thread A as well as the subset that was suspended needs to be in a
suspend_enabled state.

The easiest sync model to reason about is one where Thread A suspends its
target subset of java threads *before* allowing the stop-the-world gc to
proceed.  A global thread/gc lock provides this guarantee.


   2) Assume I need to suspend all threads in a particular group. OK, I pass
   that group to hythread_suspend_all. Later, when all suspended threads
   should be resumed, I pass the same group to hythread_resume_all. But
   while the threads were suspended, the group has changed. For example, one
   new thread was added to it. So the questions are: Is it acceptable to have
   such unsafe functionality? Would it be better to lock the group in
   hythread_suspend_all and unlock it in hythread_resume_all?
 
  We may as well leave it as the responsibility of application / TI agent
  writer not to modify a suspended thread group.
  Why do you think this should be enforced?

 In general, any good design should strive to eliminate or minimize
 cases of illegal use of the interface. In other words, it should be
 hard to use it in a buggy way. In our case that means it is better to
 ensure integrity inside TM instead of making the application responsible
 for it. If we can do it inside, what is the reason not to do it?


Yes.  Good idea provided it really can be done.


Moreover, if you look at the spec of hythread_suspend_all, it states:
 ...This method sets a suspend request for the every thread in the
 group and then returns the iterator that can be used to traverse
 through the suspended threads... But the implementation contradicts
 that. If the group is changed while you are traversing it,
 you can get the wrong thread.


Agreed.  What you describe can be a problem.  I would like to see a
conservative, simple design.  Once we get the sync functional part robust, we
can look at the performance problems.  While it is possible to
walk a linked list while the list is changing, this adds way
too much confusion at this stage of VM development.  And the VM does not
yet have enough performance for it to make sense to look at such fine detail.


 What is even worse, you can get a crash when
 one thread is iterating through the group while another thread inserts
 new elements into it (or removes them).


Actually a crash would be the best failure I can think of.  At least the
crash occurs somewhere close to the bad code.  A worse scenario is that you
can fool yourself into thinking you have processed all the threads on the
list when,  in fact, you may have missed a couple.  And the impact may not
surface for 10 

[drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-17 Thread Evgueni Brevnov

Hi All,

I'd like to hear your opinion regarding the hythread_suspend_all and
hythread_resume_all functionality provided by TM. Actually I have two
separate questions:

1)  I found that hythread_suspend_all calls thread_safe_point_impl
inside. There is no assertion regarding the thread's state upon entering
hythread_suspend_all. So it can be called in suspend disabled state,
and nobody (at least not me) expects to have a safe point during
hythread_suspend_all. The problem seems to be very similar to the
one discussed in [drlvm][threading] Possible race condition in
implementation of conditional variables? Your thoughts?

2) Assume I need to suspend all threads in a particular group. OK, I pass
that group to hythread_suspend_all. Later, when all suspended threads
should be resumed, I pass the same group to hythread_resume_all. But
while the threads were suspended, the group has changed. For example, one
new thread was added to it. So the questions are: Is it acceptable to have
such unsafe functionality? Would it be better to lock the group in
hythread_suspend_all and unlock it in hythread_resume_all?
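The second question can be illustrated with a small sketch, assuming a group protected by one mutex that stays held from suspend to resume. Everything here (demo_group, demo_suspend_all, demo_resume_all, demo_group_add) is hypothetical and much simpler than the real TM; in particular, the add operation uses trylock and fails where a real implementation would more likely block until the group is resumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 8

/* Hypothetical thread group: a fixed array of "suspended" flags. */
typedef struct {
    pthread_mutex_t lock;
    int suspended[DEMO_MAX_THREADS];
    size_t count;
} demo_group;

static void demo_group_init(demo_group *g) {
    pthread_mutex_init(&g->lock, NULL);
    g->count = 0;
}

/* Adding a member needs the group lock, so it cannot succeed while the
 * group is frozen between suspend_all and resume_all. */
static int demo_group_add(demo_group *g) {
    int rc = -1;
    if (pthread_mutex_trylock(&g->lock) != 0) {
        return -1; /* group is frozen */
    }
    if (g->count < DEMO_MAX_THREADS) {
        g->suspended[g->count++] = 0;
        rc = 0;
    }
    pthread_mutex_unlock(&g->lock);
    return rc;
}

/* Marks every member suspended and keeps the lock held on return. */
static size_t demo_suspend_all(demo_group *g) {
    pthread_mutex_lock(&g->lock);
    for (size_t i = 0; i < g->count; i++) {
        g->suspended[i] = 1;
    }
    return g->count;
}

/* Resumes exactly the members that were suspended, then unfreezes. */
static size_t demo_resume_all(demo_group *g) {
    size_t resumed = 0;
    for (size_t i = 0; i < g->count; i++) {
        assert(g->suspended[i] == 1); /* membership cannot have changed */
        g->suspended[i] = 0;
        resumed++;
    }
    pthread_mutex_unlock(&g->lock);
    return resumed;
}
```

The cost of this design is that the caller must always pair the two calls on the same thread, and anything that needs the group lock in between (thread creation, for example) has to wait.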

Thanks
Evgueni




Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-17 Thread Nikolay Kuznetsov

Hello All,

First of all, I'd like to emphasize that the suspend/resume_all functions
are potentially unsafe
and should be used with care.

Secondly, those methods were designed mainly to support
stop_the_world_enumeration
and thus are usually used under certain conditions.


1)  I found that hythread_suspend_all calls thread_safe_point_impl
inside. There is no assertion regarding the thread's state upon entering
hythread_suspend_all. So it can be called in suspend disabled state,
and nobody (at least not me) expects to have a safe point during
hythread_suspend_all. The problem seems to be very similar to the
one discussed in [drlvm][threading] Possible race condition in
implementation of conditional variables? Your thoughts?


The code of the suspend_all method is dedicated to the cyclic suspension
problem. The fact that this method is called from a suspend_disable region
and has a safe point within it is all about cyclic suspension. A lot of
time was spent resolving deadlocks caused by two threads trying to
suspend each other.

I agree that the problem is similar to the one with conditions, but I believe
that this one should be discussed as part of a particular scenario.


2) Assume I need to suspend all threads in a particular group. OK, I pass
that group to hythread_suspend_all. Later, when all suspended threads
should be resumed, I pass the same group to hythread_resume_all. But
while the threads were suspended, the group has changed. For example, one
new thread was added to it. So the questions are: Is it acceptable to have
such unsafe functionality? Would it be better to lock the group in
hythread_suspend_all and unlock it in hythread_resume_all?


First of all, I would differentiate j.l.ThreadGroup and the thread groups
defined by the thread manager (by that I mean that this method was not
designed for ordinary use, like ThreadGroup.suspend()), and after that
return to the question of why we would need it (I mean, it would be
better to have a particular scenario), and then we can discuss how to
implement this. Until now the suspend_all method was designed to work
within one group (in particular the default group, containing java
threads), and to be called by the GC.

Thank you.
  Nik.


Thanks
Evgueni








Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-17 Thread Weldon Washburn

On 10/17/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:


Hello All,

First of all, I'd like to emphasize that the suspend/resume_all functions
are potentially unsafe
and should be used with care.



Specifically, with a solid model of system behavior in mind.

Secondly, those methods were designed mainly to support
stop_the_world_enumeration
and thus are usually used under certain conditions.

 1)  I found that hythread_suspend_all calls thread_safe_point_impl
 inside. There is no assertion regarding thread's state upon entering
 hythread_suspend_all. So it can be called in suspend disabled state
 and nobody (at least me) expects to have a safe point during
 hythread_suspend_all.



The simplest model is to grab the thread lock whenever thread A wants to
suspend thread B at a safepoint.  While this serializes thread suspension
and can potentially be a bottleneck, let's wait until it's a proven
performance problem to change this model.  For thread A to be ready to grab
the thread lock, thread A must have all its java live references put in a
place where the GC will see them.  Why?  Because thread A may block.  Once
thread A obtains the lock, it can disable suspension if it likes, reload the
java live refs and do whatever it needs, but it must make certain that work is
quick and non-blocking.  If thread A needs to block on some OS call, etc., it
will need to re-enable suspension and abandon the thread lock.  Why?  Because
if thread A blocks while holding the global thread lock, there may be deadlock
or latency problems.
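The sequence described above can be sketched as follows. This is an illustration under assumptions, not DRLVM code: demo_thread_lock, demo_can_suspend and demo_run_under_thread_lock are hypothetical names, and the step that disables suspension is reduced to flipping a flag, whereas a real VM would also have to honor a pending suspend request at that point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global thread lock and per-thread suspension flag. */
static pthread_mutex_t demo_thread_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool demo_can_suspend = true;

typedef void (*demo_action)(void *arg);

/* Sample action used for illustration: increments an int counter. */
static void demo_count_action(void *arg) {
    ++*(int *)arg;
}

/* Runs a short, non-blocking action under the global thread lock.
 * Returns -1 if the caller is not suspend-enabled, 0 otherwise. */
static int demo_run_under_thread_lock(demo_action action, void *arg) {
    assert(action != NULL);
    if (!demo_can_suspend) {
        return -1;                 /* taking the lock may block: refuse */
    }
    pthread_mutex_lock(&demo_thread_lock);
    demo_can_suspend = false;      /* optional: safe to touch java refs now */
    action(arg);                   /* must be quick and non-blocking */
    demo_can_suspend = true;       /* re-enable before giving up the lock */
    pthread_mutex_unlock(&demo_thread_lock);
    return 0;
}
```

An action that needs to block would not fit this helper; by the rule in the text it has to re-enable suspension and release the lock first.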


Did you try the above approach?  Are there deadlocks?


The problem seems to be very similar with the

 one discussed in [drlvm][threading] Possible race condition in
 implementation of conditional variables? Your thoughts?

The code of the suspend_all method is dedicated to the cyclic suspension
problem. The fact that this method is called from a suspend_disable region
and has a safe point within it is all about cyclic suspension. A lot of
time was spent resolving deadlocks caused by two threads trying to
suspend each other.

I agree that the problem is similar to the one with conditions, but I believe
that this one should be discussed as part of a particular scenario.

 2) Assume I need to suspend all threads in a particular group. OK, I pass
 that group to hythread_suspend_all. Later, when all suspended threads
 should be resumed, I pass the same group to hythread_resume_all. But
 while the threads were suspended, the group has changed. For example, one
 new thread was added to it. So the questions are: Is it acceptable to have
 such unsafe functionality? Would it be better to lock the group in
 hythread_suspend_all and unlock it in hythread_resume_all?

First of all, I would differentiate j.l.ThreadGroup and the thread groups
defined by the thread manager (by that I mean that this method was not
designed for ordinary use, like ThreadGroup.suspend()), and after that
return to the question of why we would need it (I mean, it would be
better to have a particular scenario), and then we can discuss how to
implement this. Until now the suspend_all method was designed to work
within one group (in particular the default group, containing java
threads), and to be called by the GC.

Thank you.
  Nik.

 Thanks
 Evgueni









--
Weldon Washburn
Intel Middleware Products Division


Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-17 Thread Salikh Zakirov
Evgueni Brevnov wrote:
 Hi All,
 
 I'd like to hear your opinion regarding the hythread_suspend_all and
 hythread_resume_all functionality provided by TM. Actually I have two
 separate questions:
 
 1)  I found that hythread_suspend_all calls thread_safe_point_impl
 inside. There is no assertion regarding the thread's state upon entering
 hythread_suspend_all. So it can be called in suspend disabled state,
 and nobody (at least not me) expects to have a safe point during
 hythread_suspend_all. The problem seems to be very similar to the
 one discussed in [drlvm][threading] Possible race condition in
 implementation of conditional variables? Your thoughts?

The code you see is there to prevent the following deadlock scenario:

   Thread A                          Thread B
      |                                 |
      |                              suspend(A);
      |                              A->suspend_request = 1;
      |                              wait for A to reach a safepoint...
      |                                 |
   suspend_all()                        |
   B->suspend_request = 1
   wait for B to reach a safepoint...

and then the two threads wait for each other forever.
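The scenario can be replayed in a single-threaded sketch that models one polling step of "wait for the target to reach a safepoint". The names are hypothetical and this is not the hythread implementation. With the safe point inside the wait loop switched off, neither side ever makes progress; with it switched on, the thread that already has a request pending parks itself and lets the other suspender finish. The sketch ignores the tie-breaking a real implementation needs when both requests are posted at the same instant.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread suspension state. */
typedef struct {
    atomic_bool suspend_request; /* set by a thread that wants to stop us */
    atomic_bool at_safepoint;    /* set by the owner once it is stopped */
} demo_thread;

static void demo_thread_init(demo_thread *t) {
    atomic_init(&t->suspend_request, false);
    atomic_init(&t->at_safepoint, false);
}

/* One polling step of "self suspends target".
 * Returns true once the target is known to be stopped at a safepoint.
 * If safepoint_in_wait is set and somebody has asked self to stop, self
 * parks first instead of insisting on its own request. */
static bool demo_wait_step(demo_thread *self, demo_thread *target,
                           bool safepoint_in_wait) {
    assert(self != NULL && target != NULL && self != target);
    if (safepoint_in_wait && atomic_load(&self->suspend_request)) {
        /* A real thread would block here until it is resumed. */
        atomic_store(&self->at_safepoint, true);
        return false;
    }
    atomic_store(&target->suspend_request, true);
    return atomic_load(&target->at_safepoint);
}
```

Replaying the diagram: B calls the step against A, then A calls it against B. Without the safe point both calls keep returning false; with it, A parks on its first step and B's next step succeeds.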

 2) Assume I need to suspend all threads in a particular group. OK, I pass
 that group to hythread_suspend_all. Later, when all suspended threads
 should be resumed, I pass the same group to hythread_resume_all. But
 while the threads were suspended, the group has changed. For example, one
 new thread was added to it. So the questions are: Is it acceptable to have
 such unsafe functionality? Would it be better to lock the group in
 hythread_suspend_all and unlock it in hythread_resume_all?

We may as well leave it as the responsibility of application / TI agent
writer not to modify a suspended thread group.
Why do you think this should be enforced?





Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-17 Thread Nikolay Kuznetsov

The simplest model is to grab the thread lock whenever thread A wants to
suspend thread B at a safepoint.  While this serializes thread suspension
and can potentially be a bottleneck, let's wait until it's a proven
performance problem to change this model.  For thread A to be ready to grab
the thread lock, thread A must have all its java live references put in a
place where the GC will see them.  Why?  Because thread A may block.  Once
thread A obtains the lock, it can disable suspension if it likes, reload the
java live refs and do whatever it needs, but it must make certain that work is
quick and non-blocking.  If thread A needs to block on some OS call, etc., it
will need to re-enable suspension and abandon the thread lock.  Why?  Because
if thread A blocks while holding the global thread lock, there may be deadlock
or latency problems.


Did you try the above approach?  Are there deadlocks?


I wonder if the suspend_disable call can be treated as a thread lock and,
if so, we already have nearly the same scheme for stop_the_world
suspension.

Nik.




Re: [drlvm][threading] Is it safe to use hythread_suspend_all and hythread_resume_all?

2006-10-17 Thread Weldon Washburn

On 10/17/06, Nikolay Kuznetsov [EMAIL PROTECTED] wrote:


 The simplest model is to grab the thread lock whenever thread A wants to
 suspend thread B at a safepoint.  While this serializes thread suspension
 and can potentially be a bottleneck, let's wait until it's a proven
 performance problem to change this model.  For thread A to be ready to grab
 the thread lock, thread A must have all its java live references put in a
 place where the GC will see them.  Why?  Because thread A may block.  Once
 thread A obtains the lock, it can disable suspension if it likes, reload the
 java live refs and do whatever it needs, but it must make certain that work is
 quick and non-blocking.  If thread A needs to block on some OS call, etc., it
 will need to re-enable suspension and abandon the thread lock.  Why?  Because
 if thread A blocks while holding the global thread lock, there may be deadlock
 or latency problems.


 Did you try the above approach?  Are there deadlocks?

I wonder if the suspend_disable call can be treated as a thread lock and,
if so, we already have nearly the same scheme for stop_the_world
suspension.



Hmm, I never thought of it that way.  My initial reaction is no.  Suspend
enable/disable and the global thread lock are separate, distinct concepts.
The thread lock should protect the VM internal thread structs when they
are being modified.  For example, the thread lock should allow only
one thread to be created or to die at any given instant.  The enable/disable
state is incidental to this event.  This is independent of the concept of a
thread running native code being in a state where the GC can find all its
live references.  If a thread needs to grab the thread lock, of course, it
needs to put itself in suspend enable mode because it might have to wait
for the lock.

Nik.







--
Weldon Washburn
Intel Middleware Products Division