Hi,

I am continuing to work on improving JVMTI Heap Iteration (HARMONY-1635),
in particular object tagging.

The use case I've heard of is tagging *all* objects for memory profiling.
From what I've heard, this causes a 60x slowdown on Sun's VM.
However, the initial tag implementation that I uploaded to HARMONY-1635
is far worse, as it uses a linear search for get/set tag operations.

(* for those who haven't read the JVMTI spec: tags are jlong (8-byte integer)
values that can be attached to arbitrary objects in a get/set manner *)

The alternative approach I came up with, which gives (mostly) constant-time
get/set operations, is to store a tag pointer in each object.
Storing the tag itself in the object is not an option, as JVMTI requires
OBJECT_FREE events to be sent with the tag of each reclaimed object, and this
information would not be available if the tag were reclaimed together with
the object.
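
To illustrate the idea, here is a minimal sketch in C. It is not the code
from the patch: object_t, set_tag(), get_tag() and detach_tag_on_reclaim()
are hypothetical names, and the tag cell is simply malloc'ed. The point is
only that the tag lives outside the object, so it can still be read when the
object is reclaimed. A zero tag is treated as "untagged", as in the JVMTI
spec.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t tag_t;              /* stands in for jlong */

/* Hypothetical object header carrying the optional tag pointer field. */
typedef struct object {
    void     *vtable;
    uint32_t  lockword;
    tag_t    *tag_cell;             /* NULL while the object is untagged */
} object_t;

/* Constant-time set: the cell is allocated lazily on first use. */
static int set_tag(object_t *obj, tag_t tag) {
    assert(obj != NULL);
    if (obj->tag_cell == NULL) {
        if (tag == 0) {
            return 0;               /* untagged object stays untagged */
        }
        obj->tag_cell = malloc(sizeof *obj->tag_cell);
        if (obj->tag_cell == NULL) {
            return -1;              /* out of memory */
        }
    }
    *obj->tag_cell = tag;
    return 0;
}

/* Constant-time get: one pointer check and at most one extra read. */
static tag_t get_tag(const object_t *obj) {
    assert(obj != NULL);
    return obj->tag_cell != NULL ? *obj->tag_cell : 0;
}

/* What a collector could do when reclaiming obj: fetch the tag for the
 * OBJECT_FREE event, then release the separately allocated cell. */
static tag_t detach_tag_on_reclaim(object_t *obj) {
    tag_t tag = get_tag(obj);
    free(obj->tag_cell);
    obj->tag_cell = NULL;
    return tag;
}
```

The cost of this scheme is one extra pointer per object, which is what the
rest of this mail tries to keep optional.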

However, since the general consensus was that increasing the object header
is highly undesirable, I've tried to implement a _conditional_ increase of
the object header. The additional header field is allocated only if a JVMTI
agent has requested the can_tag_objects capability.

The modified object layout I used is as follows:

  +-------------------+
  |   VTable pointer  |
  +-------------------+
  |      lockword     |
  +-------------------+
  |   [array length]  |
  +-------------------+
  |   [tag pointer]   |
  +-------------------+
  |    [padding]      |
  +-------------------+
  | fields or elements|
  |       ...         |
  +-------------------+

Where [array length] is only present in array objects,
[tag pointer] is only present when the can_tag_objects capability has been
enabled at startup, and
[padding] is only present in arrays of longs and doubles, for natural 8-byte
alignment.
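
As a rough illustration of how the first element offset follows from this
layout, here is a sketch in C. The field sizes are my assumptions for a
32-bit build (4 bytes each for the VTable pointer, lockword, array length and
tag pointer); first_element_offset() and align_up() are hypothetical helpers,
not VM functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed field sizes, for illustration only (32-bit build). */
enum {
    VTABLE_SIZE       = 4,
    LOCKWORD_SIZE     = 4,
    ARRAY_LENGTH_SIZE = 4,
    TAG_POINTER_SIZE  = 4
};

/* Round value up to the next multiple of a power-of-two alignment. */
static size_t align_up(size_t value, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Offset of element 0 in an array object under the layout above. */
static size_t first_element_offset(size_t element_size, bool tagging_enabled) {
    size_t offset = VTABLE_SIZE + LOCKWORD_SIZE + ARRAY_LENGTH_SIZE;
    if (tagging_enabled) {
        offset += TAG_POINTER_SIZE;     /* [tag pointer] */
    }
    if (element_size == 8) {
        offset = align_up(offset, 8);   /* [padding] for long/double */
    }
    return offset;
}
```

Under these assumptions an int array starts at offset 12 without tagging and
at 16 with it, while a long array starts at 16 in both cases, because the tag
pointer takes the place of the padding.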

The VTable pointer is really a uint32 offset on em64t/x86_64 and ipf/ia64.

The only difference from the current object layout is the introduction of
the tag pointer field.

I've modified gc_cc to take the dynamically changing object layout into
account, and surprisingly it took only one modification:

* use the VM function vector_first_element_offset_unboxed() instead of
hardcoding the first array element offset. The function is called once for
each class at the loading stage, and gc_cc caches the offset for later use.
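
The caching pattern looks roughly like the sketch below. This is not the
gc_cc code: gc_class_info_t, gc_on_class_loaded() and gc_array_element() are
hypothetical names, and vm_first_element_offset() is a stand-in with made-up
return values for the real VM query, whose actual signature is not shown
here.

```c
#include <assert.h>
#include <stddef.h>

/* Per-class data kept by the collector (hypothetical). */
typedef struct gc_class_info {
    size_t element_size;
    size_t first_element_offset;    /* cached once at class load */
} gc_class_info_t;

/* Stand-in for the VM query; the counter only shows how often it runs. */
static int vm_query_count = 0;

static size_t vm_first_element_offset(size_t element_size) {
    vm_query_count++;
    return element_size == 8 ? 16 : 12;   /* illustrative values only */
}

/* Called once per array class at the loading stage. */
static void gc_on_class_loaded(gc_class_info_t *info, size_t element_size) {
    assert(info != NULL);
    info->element_size = element_size;
    info->first_element_offset = vm_first_element_offset(element_size);
}

/* Array scanning uses the cached offset and never asks the VM again. */
static void *gc_array_element(void *array, const gc_class_info_t *info,
                              size_t index) {
    return (char *)array + info->first_element_offset
                         + index * info->element_size;
}
```

Since the offset is read from the per-class data, scanning costs the same
whether or not the tag pointer field is present.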

I've experimented with putting the tag pointer at a fixed location before
the array length, but it looks expensive, as it adds one more read to GC
array scanning, and we obviously do not want to optimize tagging at the
expense of the common case.

The latest version of the patch is attached to HARMONY-1635
(heap-iteration-optimized.patch).
I would appreciate any comments and concerns.

