Re: Review request for 6977034 Thread.getState() very slow

2010-12-06 Thread Robert Lougher
Hi Mandy,

On 6 December 2010 19:26, Mandy Chung  wrote:
> Remi, Eamonn, Brian, David, Doug,
>
> Thanks for the feedback.
>

I don't know if you welcome external feedback, but I'd like to point
out (if you're not already aware) that this change modifies the VM
interface.

While I'm cognisant of the reason for the change, my understanding of
the existing map mechanism is that it makes the JDK class library
independent of the underlying VM thread status values.  The value of
Thread.threadStatus is opaque, with the mapping from VM thread status
being determined by the following VM interface functions (see
hotspot/src/share/vm/prims/jvm.h):

--
/*
 * Returns an array of the threadStatus values representing the
 * given Java thread state.  Returns NULL if the VM version is
 * incompatible with the JDK or doesn't support the given
 * Java thread state.
 */
JNIEXPORT jintArray JNICALL
JVM_GetThreadStateValues(JNIEnv* env, jint javaThreadState);

/*
 * Returns an array of the substate names representing the
 * given Java thread state.  Returns NULL if the VM version is
 * incompatible with the JDK or the VM doesn't support
 * the given Java thread state.
 * values must be the jintArray returned from JVM_GetThreadStateValues
 * and javaThreadState.
 */
JNIEXPORT jobjectArray JNICALL
JVM_GetThreadStateNames(JNIEnv* env, jint javaThreadState, jintArray values);

--

These two functions are used by the native method
sun.misc.VM.getThreadStateValues to setup the arrays which are then
used to initialise the map.

This change breaks this abstraction, and requires Thread.threadStatus
to be a JVM TI thread state (which happens to match Hotspot's internal
thread state).  This change will therefore break any VM which does not
have an internal thread state based on JVM TI.

As far as I'm aware, IKVM and CACAO are currently the only other users
of OpenJDK (I'm also nearing completion of a port to JamVM).
Unfortunately, from looking at CACAO I can see that this change will
break it.  It may also break IKVM, but I haven't checked.  I, of
course, can modify the internal thread states of JamVM to be
compatible.

I'm CC'ing CACAO's mailing list and GNU Classpath so that affected
parties can be made aware of this change.  As an aside, will there be
any later clean-up of the native method implementation and VM
interface?

Thanks,
Rob.

> On 12/04/10 04:22, Eamonn McManus wrote:
>
> Hi Mandy,
>
> This test:
>
> if ((threadStatus & JVMTI_THREAD_STATE_RUNNABLE) == 1) {
>
> is always false, since JVMTI_THREAD_STATE_RUNNABLE is 4. (NetBeans 7.0
> helpfully flags this; I'm not sure if earlier versions do.)
>
> Good catch.   This explains why the speed up for RUNNABLE was not as high in
> the microbenchmark measurement.  Correcting it shows that Thread.getState()
> gets 3.5X speed up on a thread in RUNNABLE state.
>
> But, once corrected, I think you could use this idea further to write a much
> simpler and faster method, on these lines:
>
> public static Thread.State toThreadState(int threadStatus) {
> if ((threadStatus & JVMTI_THREAD_STATE_RUNNABLE) != 0) {
> return RUNNABLE;
> } else if ((threadStatus &
> JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER) != 0) {
> return BLOCKED;
> } else if ((threadStatus & JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT)
> != 0) {
> return TIMED_WAITING;
> } else if ((threadStatus & JVMTI_THREAD_STATE_WAITING_INDEFINITELY)
> != 0) {
> return WAITING;
> } else if ((threadStatus & JVMTI_THREAD_STATE_TERMINATED) != 0) {
> return TERMINATED;
> } else {
> return NEW;
> }
> }
>
> I forgot to mention in the email that I implemented this simpler approach to
> compare with the table lookup approach.   There were no significant
> difference.  I now rerun with the corrected fix (checking != 0 rather than
> == 1) and the table lookup approach is about 2-6% faster than the sequence
> of tests approach.
>
> I am also for the simpler approach but I post the table lookup approach as a
> proposed fix to get any opinion on the performance aspect with that
> approach.
>
> Given that the Fork-Join framework doesn't depend on it, I will go for a
> simpler approach (sequence of tests) and further tune its performance when
> there is a use case requiring a perf improvement.
>
> New webrev:
>    http://cr.openjdk.java.net/~mchung/6977034/webrev.01/
>
> Can you review this version?
>
> Thanks
> Mandy
>
> You could tweak the order of the tests based on what might be the relative
> frequency of the different states but it probably isn't worth it.
>
> Regards,
>
> Éamonn
>
> On 3/12/10 11:52 PM, Mandy Chung wrote:
>
> Fix for 6977034: Thread.getState() very slow
>
> Webrev at:
>    http://cr.openjdk.java.net/~mchung/6977034/webrev.00/
>
> This is an improvement to map a Thread's threadStatus field to
> Thread.State.  The VM updates the Thread.threadStatus field directly

Re: Review request for 6977034 Thread.getState() very slow

2010-12-07 Thread Mandy Chung

 On 12/6/10 5:27 PM, Robert Lougher wrote:

Hi Mandy,

On 6 December 2010 19:26, Mandy Chung  wrote:

Remi, Eamonn, Brian, David, Doug,

Thanks for the feedback.


I don't know if you welcome external feedback, but I'd like to point
out (if you're not already aware) that this change modifies the VM
interface.


Any feedback is welcome.   That's the intent to post this for the 
openjdk discussion.  Thanks for the information that this change will 
break CACAO of OpenJDK.

While I'm cognisant of the reason for the change, my understanding of
the existing map mechanism is that it makes the JDK class library
independent of the underlying VM thread status values.  The value of
Thread.threadStatus is opaque, with the mapping from VM thread status
being determined by the following VM interface functions (see
hotspot/src/share/vm/prims/jvm.h):

--
/*
  * Returns an array of the threadStatus values representing the
  * given Java thread state.  Returns NULL if the VM version is
  * incompatible with the JDK or doesn't support the given
  * Java thread state.
  */
JNIEXPORT jintArray JNICALL
JVM_GetThreadStateValues(JNIEnv* env, jint javaThreadState);

/*
  * Returns an array of the substate names representing the
  * given Java thread state.  Returns NULL if the VM version is
  * incompatible with the JDK or the VM doesn't support
  * the given Java thread state.
  * values must be the jintArray returned from JVM_GetThreadStateValues
  * and javaThreadState.
  */
JNIEXPORT jobjectArray JNICALL
JVM_GetThreadStateNames(JNIEnv* env, jint javaThreadState, jintArray values);

--

These two functions are used by the native method
sun.misc.VM.getThreadStateValues to setup the arrays which are then
used to initialise the map.

This change breaks this abstraction, and requires Thread.threadStatus
to be a JVM TI thread state (which happens to match Hotspot's internal
thread state).  This change will therefore break any VM which does not
have an internal thread state based on JVM TI.



Indeed this fix will depend on this private interface between VM and JDK 
to set the threadStatus as JVM TI thread state.  The VM implementation 
of setting Thread.threadStatus hasn't been changed since 1.5 and 
performance required by the profiling tools was one of the reasons done 
in that way.



As far as I'm aware, IKVM and CACAO are currently the only other users
of OpenJDK (I'm also nearing completion of a port to JamVM).
Unfortunately, from looking at CACAO I can see that this change will
break it.  It may also break IKVM, but I haven't checked.  I, of
course, can modify the internal thread states of JamVM to be
compatible.

I'm CC'ing CACAO's mailing list and GNU Classpath so that affected
parties can be made aware of this change.


Thanks for doing that.


As an aside, will there be
any later clean-up of the native method implementation and VM
interface?


I don't know of the project status.  I suggest you to post this question 
to the common VM interface openjdk project:

   http://openjdk.java.net/projects/cvmi/
   cvmi-...@openjdk.java.net

Thanks
Mandy

Thanks,
Rob.


On 12/04/10 04:22, Eamonn McManus wrote:

Hi Mandy,

This test:

 if ((threadStatus&  JVMTI_THREAD_STATE_RUNNABLE) == 1) {

is always false, since JVMTI_THREAD_STATE_RUNNABLE is 4. (NetBeans 7.0
helpfully flags this; I'm not sure if earlier versions do.)

Good catch.   This explains why the speed up for RUNNABLE was not as high in
the microbenchmark measurement.  Correcting it shows that Thread.getState()
gets 3.5X speed up on a thread in RUNNABLE state.

But, once corrected, I think you could use this idea further to write a much
simpler and faster method, on these lines:

 public static Thread.State toThreadState(int threadStatus) {
 if ((threadStatus&  JVMTI_THREAD_STATE_RUNNABLE) != 0) {
 return RUNNABLE;
 } else if ((threadStatus&
JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER) != 0) {
 return BLOCKED;
 } else if ((threadStatus&  JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT)
!= 0) {
 return TIMED_WAITING;
 } else if ((threadStatus&  JVMTI_THREAD_STATE_WAITING_INDEFINITELY)
!= 0) {
 return WAITING;
 } else if ((threadStatus&  JVMTI_THREAD_STATE_TERMINATED) != 0) {
 return TERMINATED;
 } else {
 return NEW;
 }
 }

I forgot to mention in the email that I implemented this simpler approach to
compare with the table lookup approach.   There were no significant
difference.  I now rerun with the corrected fix (checking != 0 rather than
== 1) and the table lookup approach is about 2-6% faster than the sequence
of tests approach.

I am also for the simpler approach but I post the table lookup approach as a
proposed fix to get any opinion on the performance aspect with that
approach.

Given that the Fork-Join framework doesn't depend on it, I will go for a
simpler approach (sequence of tests) and further t