Re: "Hot to Write GC" requires improvement

Salikh Zakirov Tue, 17 Oct 2006 02:30:44 -0700

Svetlana,

I've looked through your changes.
Mostly they look okay, and they greatly improve the visual presentation.


Originally, GC-howto was created using AsciiDoc[1] toolchain from the source
text file and source .cpp file. Modifying .html file directly means that we 
cannot
update the document to keep it in sync with the source code.

I guess this is acceptable, since nobody is changing source code inlets in 
GC-howto now,
but be warned: if anyone is to introduce source changes, it would tedious
task to synchronize visual and content changes.

Have you tried to configure asciidoc to produce the content you want?

I will send you the version of gc-howto.txt and gc.cpp that I have,
but Nadya may have a later version, so please check with her.
Since I am not sure attachment will make it to the list, I'll send it to you
directly. (* or to anyone else who might be interested, just ask *)

[1] http://www.methods.co.nz/asciidoc/

Konovalova, Svetlana wrote:
> Sorry about that! 
> I've created a new patch, hope it's the right one you need. Please let
> me know if you still have any problems. 
> 
> [JIRA 1881] http://issues.apache.org/jira/browse/HARMONY-1881
> 
> Cheers,
> Sveta Konovalova
> 
> -----Original Message-----
> From: Geir Magnusson Jr. [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, October 17, 2006 8:50 AM
> To: harmony-dev@incubator.apache.org
> Subject: Re: "Hot to Write GC" requires improvement
> 
> The problem with the patch is that it's to the rendered output
> 
>     site/xdoc/subcomponent/drlvm/gc-howto.html
> 
> when what we need is the patch to the source document
> 
>     site/xdoc/subcomponent/drlvm/gc-howto-content.html
> 
> Can you add a new patch with that please?
> 
> geir
> 
> Rana Dasgupta wrote:
>> This is a good document, thanks Svetlana. Even if a lot of custom gc's
> 
>> don't
>> get written, it helps in understanding the current collecor farmework
> and
>> how it plugs into DRLVM.
>>
>> Rana
>>
>>
>>
>>> On 10/16/06, Konovalova, Svetlana <[EMAIL PROTECTED]>
> wrote:
>>>>
>>>> Folks,
>>>>
>>>> I took a close look at "Hot to Write GC" [1] and created a patch
> for
>>>> this doc [JIRA 1881]. I fixed formatting, brushed up the code,
> removed
>>>> out-dated tags etc.
>>>> It would be great if someone can find a chance to look at the
> patch.
>>>> Thanks in advance!
>>>>
>>>> [1]
>>>>
> http://incubator.apache.org/harmony/subcomponents/drlvm/gc-howto.html
>>>> [JIRA 1881] http://issues.apache.org/jira/browse/HARMONY-1881
>>>>
>>>>
>>>> Cheers,
>>>> Sveta Konovalova
>>>>
>>>>
> 
> ---------------------------------------------------------------------
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> ---------------------------------------------------------------------
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
>

/**
 *  Copyright 2005-2006 The Apache Software Foundation or its licensors, as 
applicable.
 *
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 */

#include <assert.h>
#include <string.h>
#include <list>
#include <algorithm>

#include "open/gc.h"
#include "open/vm_gc.h"
#include "open/vm.h"

#define LOG_DOMAIN "gc"
#include "cxxlog.h"

typedef unsigned char byte;

//////////////////////////////////////////////////
// non-GC utilities

// Included to get critical sections
#include <windows.h>

struct Lock {
    CRITICAL_SECTION cs;
    Lock() { InitializeCriticalSection(&cs); }
    ~Lock() { DeleteCriticalSection(&cs); }
    void lock() { EnterCriticalSection(&cs); }
    void unlock() { LeaveCriticalSection(&cs); }
};

// convenient converter from sizes in bytes to Megabytes
inline int mb (int i) {
    return (i + 1048576/2) / 1048576;
}

//////////////////////////////////////////////////
// GC types


// This structure is allocated for each user thread.
// It contains the thread-local allocation area.

struct TLS {
    byte* current;  // the allocation pointer
    byte* limit;    // the end of the allocation area
};

struct InteriorPointer {
    struct Object*  base;
    int             offset;
    struct Object** interior_ref;
};

// Encapsulates all GC data.
struct GC {

    unsigned int semisize;   // the size of the semispace
    unsigned int chunk_size; // the chunk size for thread-local chunks

    byte* space;    // the pointer to the heap

    Lock lock;      // the lock to protect global heap access

    byte* fromspace;// the allocation space
    byte* current;  // the allocation marker
    byte* limit;    // the allocation limit

    byte* tospace;  // the evacuation space
    byte* scan;     // the scan marker
    byte* copy;     // the copy marker
    byte* toend;    // the evacuation space limit

    // The list of thread-local storage (TLS) 
    // structures allocated for user threads.
    std::list<TLS*> threads; 

    void init();      // configures and initalizes GC
    void wrapup();    // destroys the heap

    // Allocates an object from a thread chunk
    // reserving space for TLS as needed.
    byte* alloc(unsigned size, TLS* tls);

    // Allocates space on the global heap.
    byte* galloc(unsigned size);

    byte* move(void*);      // moves an object
    byte* forwarded(void*); // reads the forwarding pointer
    void root(void**);      // handles a root reference
    void trace(byte*);      // traces one object

    // Collects garbage and allocates the object.
    byte* collect_alloc(unsigned size, TLS* tls);

    std::list<InteriorPointer> interior_pointers;
    void repoint_all_roots_with_interior_points();
};

// Structure OI (from "object information") 
// is used to cache GC information for each Java class
// loaded by the virtual machine.
// Each VTable stores the pointer to an OI (Object information) structure.
struct OI {
    char magic[8];      // used for debugging
    const char *name;   // the class name (handy for debugging)
    bool is_array;      // true if this structure describes array
    bool has_slots;     // true if the described object has reference fields
                        // (aka slots) and thus needs to be traced
                        // during collection
    int size;           // the object size or the array element size
    int* offsets;       // zero-terminated list of slot offsets in an object
                        // undefined for array
};

//////////////////////////////////////////////////
// Static assumptions about object layout

// The VTable structure has 4 bytes reserved
// for GC use at the beginning.
// The pointer to the OI structure is stored there.
struct VTable {
    OI* oi;
    // Other VTable fields are not used in GC.
};

// Describes the object header format assumed by GC.
struct Object {
    VTable *vt;
    uint32 lockword;
};

// Describes the array header format assumed by GC.
struct Array {
    VTable *vt;
    uint32 lockword;
    uint32 length;
};


//////////////////////////////////////////////////
// GC utility functions
//

void init_vt(Managed_Object_Handle p, Allocation_Handle ah) {
    Object* obj = (Object*)p;
    obj->vt = (VTable*)ah;
}

OI* obj_oi(Managed_Object_Handle p) {
    Object* obj = (Object*)p;
    return obj->vt->oi;
}

int array_length(Managed_Object_Handle p) {
    Array* array = (Array*)p;
    return array->length;
}

OI* vt_oi(VTable_Handle p) {
    VTable* vt = (VTable*)p;
    return vt->oi;
}

OI* ah_oi(Allocation_Handle ah) {
    // Allocation_Handle is a VTable pointer on 32-bit platforms.
    return vt_oi((VTable_Handle)ah);
}

int object_size (Managed_Object_Handle obj) {
    OI* oi = obj_oi(obj);
    if (oi->is_array) {
        // 4-byte alignment
        return ((oi->size * array_length(obj) + 12) + 3) & (~3);
    } else {
        return oi->size;
    }
}

/////////////////////////////////////////////////////
// Global GC instance
GC gc;

/////////////////////////////////////////////
// GC functions
//

void GC::init() {
    semisize = 500*1024*1024;
    chunk_size = 64*1024;

    space = (byte*) malloc(semisize*2); 
    assert(space); assert(((int)space & 3) == 0);
    fromspace = space;
    tospace = fromspace + semisize; assert(((int)tospace & 3) == 0);
    toend = tospace + semisize;

    INFO("heap size " << mb(2*semisize) << " Mb "
            << (void*)space << "-" << (void*)(space + 2*semisize));

    current = fromspace;
    limit = fromspace + semisize;

    LOG("allocation from " << (void*)current << "-" << (void*)limit);

    memset(current, 0, limit - current);

    interior_pointers.clear();
}

void GC::wrapup() {
    ::free(space);
}

byte* GC::galloc(unsigned size) {
    byte* r = NULL;
    lock.lock();
    if (current + size <= limit) {
        r = current;
        current += size;
    }
    lock.unlock();
    return r;
}

byte* GC::alloc(unsigned size, TLS* tls) {

    byte* obj = NULL;

    assert(NULL == tls->current || fromspace <= tls->current);
    assert(NULL == tls->limit || tls->limit <= limit);

    if (tls->current + size <= tls->limit) {
        // Allocate from the thread-local chunk if possible.
        obj = tls->current;
        tls->current += size;
        return obj;
    }

    // Allocate "large" objects directly from the heap
    // bypassing the thread-local allocation buffer
    // to prevent inefficient handling of half-filled
    // thread-local allocation buffers.
    if (size >= chunk_size/4) {
        return gc.galloc(size);
    }

    // Allocate a new thread-local chunk.
    obj = gc.galloc(chunk_size);
    if (obj) {
        tls->current = obj + size;
        tls->limit = obj + chunk_size;
    }

    return obj;
}

byte* GC::forwarded (void* obj) {
    int* p = (int*)obj + 1;
    int lockword = *p;
    if (lockword & 1) 
        return (byte*)(lockword & (~1));
    else 
        return NULL;
}

byte* GC::move (void* obj) {
    int size = object_size(obj);
    assert(tospace <= copy); assert(copy + size <= toend);

    byte* nobj = copy;
    TRACE2("gc.move", "move " << (void*)obj << " -> " << (void*)nobj);
    assert(((int)nobj & 3) == 0);
    memcpy(nobj, obj, size);
    copy += size; assert(((int)copy & 3) == 0);

    int* plockword = (int*)obj + 1;
    *plockword = ((int)nobj) | 1;

    return nobj;
}

void GC::root(void** root) {
    byte* obj = (byte*)(*root);
    byte* nobj = forwarded(obj);
    if (NULL == nobj) {
        nobj = move(obj);
    }
    TRACE2("gc.root", "root " << root << " repointed from " 
            << (void*)obj << " to " << (void*)nobj);
    *root = nobj;
}

void GC::trace (byte* obj) {
    OI* oi = obj_oi(obj);
    TRACE2("gc.trace", "trace " << (void*)obj 
        << " (" << (void*)object_size(obj) << ", " << oi->name << ")");
    if (!oi->has_slots) return;
    if (oi->is_array) {
        int len = array_length(obj);
        int i;
        // Trace (len) elements starting from the offset 12.
        // NB: long[] and double[] start at offset 16
        // but never need tracing.
        byte** elem = (byte**)(obj + 12);
        for (i = 0; i < len; i++, elem += 1) {
            if (NULL == *elem) continue;
            byte* nobj = forwarded(*elem);
            if (!nobj) nobj = move(*elem);
            TRACE2("gc.update", "elem " << i << " in array " 
                << (void*)obj << " repointed from " << (void*)(*elem)
                << " to " << (void*)nobj);
            *elem = nobj;
        }
    } else {
        int* poff;
        // Use (offsets) array to get the list of reference
        // field offsets in an object.
        for (poff = oi->offsets; *poff; poff++) {
            byte** field = (byte**) (obj + *poff);
            if (NULL == *field) continue;
            byte* nobj = forwarded(*field);
            if (!nobj) nobj = move(*field);
            TRACE2("gc.update", "field " << *poff << " in object " 
                << (void*)obj << " repointed from " << (void*)(*field)
                << " to " << (void*)nobj);
            *field = nobj;
        }
    }
}

void GC::repoint_all_roots_with_interior_points()
{
    for( std::list<InteriorPointer>::iterator it = interior_pointers.begin();
         it != interior_pointers.end();
         it++ ){
        Object** root_ref = (*it).interior_ref;
        Object* root_base = (*it).base;
        const int root_offset = (*it).offset;
        Object* new_slot_contents = root_base + root_offset;

        if( new_slot_contents != *root_ref ){
            *root_ref = new_slot_contents;
        }
    }

    interior_pointers.clear();

    return;
}

byte* GC::collect_alloc(unsigned size, TLS* tls) {

    scan = tospace;
    copy = tospace;
    toend = tospace + semisize;

    LOG("allocated " << (current - fromspace) << " bytes");
    LOG("collection to " << (void*)tospace << "-" << (void*)toend);

    vm_enumerate_root_set_all_threads();

    LOG("scan");
    while (scan < copy) {
        trace(scan);
        scan += object_size(scan);
    }

    LOG("live " << (copy-tospace) << " bytes");

    if( !interior_pointers.empty() ){
        repoint_all_roots_with_interior_points();
    }

    byte* swap = tospace;
    tospace = fromspace;
    fromspace = swap;

    current = copy;
    limit = fromspace + semisize;
    memset(current, 0, limit - current);

    LOG("allocation from " << (void*)current << "-" << (void*)limit);

    std::list<TLS*>::iterator i;
    int j;
    for (i = gc.threads.begin(), j = 0; i != gc.threads.end(); i++, j++) {
        (*i) -> current = NULL;
        (*i) -> limit = NULL;
    }
    LOG2("gc.threads", "reset thread allocation areas in " << j
            << " user threads");

    byte* obj = NULL;
    if (size > 0) {
        // Allocate an object before resuming threads to maintain 
        // "fairness" and prevent spurious out-of-memory errors.
        obj = alloc(size, tls);
    }

    vm_resume_threads_after();
    return obj;
}

//////////////////////////////////////////////////
// Interface functions
//
void gc_init() {
    gc.init();
}

void gc_vm_initialized() {
}

void gc_wrapup() {
    gc.wrapup();
}

void gc_thread_init(void* tp) {
    TRACE2("gc.thread", "gc_thread_init " << tp);
    TLS* tls = (TLS*) tp;
    std::list<TLS*>::iterator i = 
        std::find(gc.threads.begin(), gc.threads.end(), tls);
    assert(i == gc.threads.end());
    gc.threads.push_back(tls);

    tls->current = NULL;
    tls->limit = NULL;
}

void gc_thread_kill(void* tp) {
    TRACE2("gc.thread", "gc_thread_kill " << tp);
    TLS* tls = (TLS*) tp;
    std::list<TLS*>::iterator i = 
        std::find(gc.threads.begin(), gc.threads.end(), tls);
    assert(i != gc.threads.end());
    gc.threads.erase(i);

    tls->current = NULL;
    tls->limit = NULL;
}

Managed_Object_Handle gc_alloc (unsigned size, Allocation_Handle ah, void *tp) {
    Managed_Object_Handle obj;
    TLS* tls = (TLS*) tp;

    // The next-to-highest bit of the size may be set
    // when allocating objects requiring finalization.
    // Ignore hint for now.
    size = size & 0x3fffffff;

    assert((size & 3) == 0);
    assert(ah_oi(ah)->is_array || size == ah_oi(ah)->size);

    // First, try the allocation
    obj = gc.alloc(size, tls);
    if (!obj) {
        // If the allocation failed, 
        // grab the global GC lock.
        vm_gc_lock_enum();

        // Multiple threads may try to get the GC lock.
        // Only one gets it and does a collection, and others get
        // blocked on vm_gc_lock_enum() during collection.

        // Retry the allocation while holding the GC lock.
        obj = gc.alloc(size, tls);

        // The allocation will succeed if another thread
        // has done collection while this thread was waiting for the GC lock.

        if (!obj) {
            // If the allocation failed, start a garbage collection.
            obj = gc.collect_alloc(size, tls);

            // NULL return value from collect_alloc() indicates out-of-memory.
        }
        vm_gc_unlock_enum();
    }
        
    if (obj) init_vt(obj, ah);
    TRACE2("gc.alloc.slow", "gc_alloc(" << (void*)size << ", " 
            << (*(OI**)ah)->name << ") = " << obj);
    assert(NULL == obj || 
        (gc.fromspace <= obj && obj < gc.limit && ((int)obj & 3) == 0));
    return obj;
}

Managed_Object_Handle gc_alloc_fast (unsigned size, Allocation_Handle ah, void 
*tp) {
    Managed_Object_Handle obj;
    TLS* tls = (TLS*) tp;
    size = size & 0x3fffffff;
    assert((size & 3) == 0);
    assert(ah_oi(ah)->is_array || size == ah_oi(ah)->size);

    obj = gc.alloc(size, tls);

    if (obj) init_vt(obj, ah);
    TRACE2("gc.alloc.fast", "gc_alloc_fast(" 
            << (void*)size <<  ", " << (*(OI**)ah)->name << ") = " << obj);
    assert(NULL == obj || 
        (gc.fromspace <= obj && obj < gc.limit && ((int)obj & 3) == 0));
    return obj;
}

Boolean gc_requires_barriers() {   
    return FALSE;
}

Boolean gc_supports_compressed_references() {   
    return FALSE;
}

void gc_write_barrier(Managed_Object_Handle p_base_of_object_holding_ref) {   
    assert(!"gc does not support write barriers");
}

void *gc_heap_base_address() {   
    return (void *)gc.space;
}

void *gc_heap_ceiling_address() {   
    return (void *)(gc.space + 2*gc.semisize);
}

void gc_add_root_set_entry(Managed_Object_Handle *ref, Boolean is_pinned) {
    assert(!is_pinned);
    TRACE2("gc.root", "gc_add_root_set_entry " << ref << " -> " << *ref);
    if (NULL == *ref) return;
    gc.root(ref);
}

Boolean gc_is_object_pinned(Managed_Object_Handle obj) {
    return FALSE;
} 

static int *build_slot_offset_array(Class_Handle ch)
{
    unsigned num_ref_fields = 0;
    // Get the total number of fields including primitive fields.
    unsigned num_fields = class_num_instance_fields_recursive(ch);

    // Compute the number of reference fields.
    unsigned int i;
    for (i = 0; i < num_fields; i++) {
        // For each field, get its handle and check
        // whether it's a reference type.
        Field_Handle fh = class_get_instance_field_recursive(ch, i);
        if (field_is_reference(fh)) {
            num_ref_fields++;
        }
    }

    if  (0 == num_ref_fields) return NULL;

    // Allocate the offsets array.
    int* ref_array = (int*) malloc((num_ref_fields+1) * sizeof(int));

    // For each reference field, store its offset
    // into the offsets array.
    int* p = ref_array;
    for (i = 0; i < num_fields; i++) {
        Field_Handle fh = class_get_instance_field_recursive(ch, i);
        if (field_is_reference(fh)) {
            *p = field_get_offset(fh);
            p++;
        }
    }

    // It is 0 delimited.
    *p = 0;

    return ref_array;
}


void gc_class_prepared (Class_Handle ch, VTable_Handle vth) {
    TRACE2("gc.prepared", "gc_class_prepared(" 
            << class_get_name(ch) << ")");
    OI** vt = (OI**) vth;
    OI* oi = new OI;
    *vt = oi;

    memset(oi, 0, sizeof(OI));
    strcpy(oi->magic, "   OI  ");
    oi->name = class_get_name(ch);

    if (class_is_array(ch)) {
        oi->is_array = true;
        // Store the array element size in the OI structure > size.
        oi->size = class_element_size(ch);
        // Reference arrays have slots, non-reference arrays don't.
        oi->has_slots = !class_is_non_ref_array(ch);
        assert(NULL == oi->offsets);
    } else {
        oi->is_array = false;
        // Store the object size in the OI structure > size.
        oi->size = class_get_boxed_data_size(ch);
        assert((oi->size & 3) == 0);
        oi->offsets = build_slot_offset_array(ch);
        oi->has_slots = (oi->offsets != NULL);
    }
}

void gc_add_root_set_entry_interior_pointer (void **slot, int offset, Boolean 
is_pinned)
{
    assert(offset > 0);
    assert( !is_pinned );

    Object* p_obj = (Object *)((Byte*)*slot - offset);
    assert( p_obj->vt != NULL );
    InteriorPointer ip;
    ip.offset = offset;
    ip.interior_ref = (Object**)slot;
    ip.base = (Object*)((*(unsigned char**)slot) - offset);

    OI* oi = obj_oi(ip.base);
    TRACE2("gc.ip", "interior_pointer " << (void*)ip.base
        << " (" << (void*)object_size(ip.base) << ", " << oi->name << ")");

    gc.interior_pointers.push_back( ip );
    gc_add_root_set_entry( (Managed_Object_Handle*)&(ip.base), is_pinned );

    return;
}

void gc_force_gc () {
    vm_gc_lock_enum();
    gc.collect_alloc(0, NULL);
    vm_gc_unlock_enum();
}

int64 gc_free_memory() {
    return 0;
}

void gc_pin_object(Managed_Object_Handle*) {
}

void gc_unpin_object(Managed_Object_Handle*) {
}

Managed_Object_Handle gc_get_next_live_object(void*) {
    assert(!"gc does not support heap iteration");
    return NULL;
}

void gc_finalize_on_exit() {
}

unsigned gc_time_since_last_gc() {
    return 0;
}

int64 gc_total_memory() {
    return 2*gc.semisize;
}

int64 gc_max_memory() {
    return 2*gc.semisize;
}

#    Copyright 2005 The Apache Software Foundation or its licensors, as 
applicable.
#  
#    Licensed under the Apache License, Version 2.0 (the "License");
#    you may not use this file except in compliance with the License.
#    You may obtain a copy of the License at
#  
#       http://www.apache.org/licenses/LICENSE-2.0
#  
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS,
#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#    See the License for the specific language governing permissions and
#    limitations under the License.
#

[miscellaneous]
numbered

[specialwords]
monospacedwords = \w+\(\) \w+(?:\/[\w.]*)+ \w+\.dll \w+\.exe \w+\.h(?!\w) 
\w+\.cpp \w+\.xml NULL

[header]
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd";>
<html xmlns="http://www.w3.org/1999/xhtml"; xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset={encoding}" />
<meta name="generator" content="AsciiDoc {asciidoc-version}" />
<style type="text/css">
include::{stylesdir={asciidoc-dir}/stylesheets}/{theme={backend}}.css[]
ifdef::quirks[]
include::{stylesdir={asciidoc-dir}/stylesheets}/{theme={backend}}-quirks.css[]
endif::quirks[]

div.sidebar-content {
  font-size: 70%;
  background: #ffcccc;
  border: 1px solid silver;
  padding: 3pt;
  width: 50%;
  display: float;

  position: relative;
  left: 900px;
  bottom: 2em;
  -moz-border-radius: 12px;
}

body {
  width: 80%;
  text-align: justify;
}

tt {
font-size: 10pt;
}

strong {
  font-weight: bold;
}
</style>
<title>{doctitle}</title>
</head>
<body>
# Article, book header.
<div id="header">
<h1>{doctitle}</h1>
<span id="author">{author}</span><br />
<span id="email"><tt>&lt;<a 
href="mailto:{email}";>{email}</a>&gt;</tt></span><br />
<span id="revision">version {revision}{date?,}</span>
{date}
</div>

How to write DRL GC
===================
[EMAIL PROTECTED], [EMAIL PROTECTED]
revision 1.0, 2006-04-19

//
//  Copyright 2005 The Apache Software Foundation or its licensors, as 
applicable.
//
//  Licensed under the Apache License, Version 2.0 (the "License");
//  you may not use this file except in compliance with the License.
//  You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
//  Unless required by applicable law or agreed to in writing, software
//  distributed under the License is distributed on an "AS IS" BASIS,
//  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
//  See the License for the specific language governing permissions and
//  limitations under the License.
//
//
//  Generate HTML version of this document from text source
//  and configuration file GC-howto.conf using the command
//
//      asciidoc -f GC-howto.conf --unsafe GC-howto.txt
//
//  Download Asciidoc generic distribution archive from
//
//      http://www.methods.co.nz/asciidoc/downloads.html
//
//  unpack it somewhere (e.g. /usr/local/opt/asciidoc), and
//  symlink /usr/local/opt/asciidoc/asciidoc.py to /usr/local/bin/asciidoc
//

This document provides instructions on creating a custom garbage collector 
implementation
in C++ and configuring the DRL virtual machine to use it. The section describes
the major steps of this procedure, namely: 

- Establishing the build infrastructure
- Implementing the GC interface
- Implementing the garbage collector algorithm
- Running the VM with the custom GC

.Note
Plugging-in a user-designed garbage collector presupposes an operating DRL
virtual machine built according to the instructions of the README.txt file
supplied with the VM source package. 

Establishing the build infrastructure
-------------------------------------

At this stage, you create the directory and set up the build infrastructure to
build the dynamic library. At the end of this stage, you will be fully set for
adding the garbage collector code and building it.

DRLVM can load a custom garbage collector from a dynamic library. It is
recommended that you build your dynamic library using a DRLVM build
infrastructure. Below is an example of creating of a
build descriptor on the Windows* / IA-32 architecture.

Create a directory for a new GC module, for example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-----
vm$ mkdir gc_copying
vm$ mkdir gc_copying/src
vm$ cd gc_copying/src
-----
That is where you will put the source code, see Section 3, 
Implementing the garbage collector algorithm. 

Create a build descriptor file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create the build descriptor file build/make/components/vm/gc_copying.xml
with the following content:
-----
sys::[sed -n '/<project/,$p' gc_copying.xml]
-----

You can add other macro definitions, include directories or compiler-specific 
command-line options to match your needs.

Create a C++ file with essential includes, namely:  
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-----
sys::[sed -n '/^#include "open/,/^#include "cxxlog/p' gc.cpp]
-----

These include files are located in directories vm/include/open and
vm/port/include. Consult their content for documentation and details of the
interface.

Test the configuration
~~~~~~~~~~~~~~~~~~~~~~

Run the build system to test whether the infrastructure is set up correctly:
-----
build$ build.bat -DCOMPONENTS=vm.gc_copying
-----

On a successful build, the .dll file is placed to the VM build directory
build/win_ia32_icl_debug/deploy/jre/bin/. The name of the directory may differ
depending on your system and the compiler used.  This empty library will not
work, you have to write your GC first!


Implementing the GC interface
-----------------------------

This section lists the functions that a garbage collector interface must
implement. Declarations of these functions are in gc.h. For details, consult
the Developer's Guide and documentation in gc.h and vm_gc.h. 

GC lifecycle
~~~~~~~~~~~~

* gc_init() initializes the garbage collector
* gc_wrapup() shuts down the GC 
* gc_vm_initialized() notifies the GC about the VM transition from the
  initialization stage to running user applications
* gc_thread_init() and gc_thread_kill() notify the GC about creation and
  termination of user threads that may request memory allocation or other GC
  services
* gc_class_prepared() notifies the GC about loaded and prepared classes

Object allocation
~~~~~~~~~~~~~~~~~

* gc_alloc() performs slow allocation, can trigger garbage collection
* gc_alloc_fast() performs faster allocation, should not trigger garbage
  collection
* gc_add_root_set_entry() is responsible for enumeration
* gc_add_root_set_entry() enumerates one root pointer
 
See the Root set enumeration section in the Developer's Guide for details.

Miscellaneous
~~~~~~~~~~~~~

* gc_supports_compressed_references() indicates whether GC supports compressed 
references 
* gc_is_object_pinned() indicates whether the GC will move an object or not
* gc_force_gc() forces a garbage collection

Optional
~~~~~~~~

The virtual machine can operate without the functions listed below, but certain 
features
will be unavailable.

* gc_free_memory() returns the estimated amount of memory available for 
allocation
* gc_pin_object()  requests that the GC does not move an object
* gc_unpin_object()  removes the restriction on not moving an object
* gc_get_next_live_object()  iterates over live objects during the 
stop-the-world
  phase of garbage collection
* gc_finalize_on_exit()  transfers finalizable queue contents to the VM core on
  shutdown
* gc_time_since_last_gc() returns the amount of time that elapsed since the
  previous collection, in milliseconds
* gc_total_memory() returns the overall amount of memory used for the Java heap
* gc_max_memory() returns the maximum amount of memory that can be used for the
  Java heap

The `VM_GC` interface
~~~~~~~~~~~~~~~~~~~~~

The garbage collector requires VM support in its operation. The virtual machine
exports the VM_GC  interface to meet the needs of the garbage collector.
Besides, the GC uses the VM_common interface.

The VM_GC interface describes the services that the VM provides specifically
for the garbage collector. Please refer to the header file vm_gc.h to see the
complete list and documentation. 

The VM exports two functions to provide the global locking service for the
garbage collector: vm_gc_lock_enum() and vm_gc_unlock_enum(). These two
functions differ from plain system locks in their ability to gracefully
interoperate with VM threading services. In case of contention on the GC lock,
that is, when multiple threads call vm_gc_lock_enum() simultaneously, one
thread gets the lock, and others remain blocked. If the thread that grabbed the 
GC
lock does a garbage collection, the blocked threads are considered safely
suspended.  Other ways to lock user threads for a long time can lead to a
deadlock because the VM will have no way to find out whether the thread is 
blocked
or running.

A detailed description of GC procedure is given in the Developers' Guide.

DRLVM provides two functions to support thread suspension and root set
enumeration simultaneously:

* vm_enumerate_root_set_all_threads() suspends all user threads and initiates
  root set enumeration
* vm_resume_threads_after() resumes execution of user threads 

These functions effectively restrict the garbage collector to stop-the-world 
algorithms only.

Implementing the garbage collector algorithm
--------------------------------------------

This section gives step-by-step instructions on how to implement the garbage
collection algorithm. The example shows a semispace copying collector.

.Note
This example does not implement object finalization and weak references.

[[gc_algorithm]]
Algorithm Overview
~~~~~~~~~~~~~~~~~~

The heap is divided into two equally sized contiguous semispaces. 
During normal operation, only one semispace is used ('current semispace'),
and the other one is 'reserved' for garbage collection.
Allocation requests are satisfied by contiguous allocation
from the current semispace. Each application thread reserves a thread-local
allocation buffer ('TLAB') under a global lock, and serves most of the 
allocation
requests without locking, by incrementing the allocation pointer local
to the buffer. 

When the application requests an allocation that does not fit into the remaining
free space of the current semispace, a garbage collection is initiated. The 
current
semispace becomes the 'evacuation space' ('fromspace'), and the reserved 
semispace
becomes the 'destination space' ('tospace'). The VM suspends all application 
threads and
enumerates root references. 

The GC copies the objects reachable from root references to the destination 
space. 
When an object is copied from evacuation space to destination space, the GC 
installs 
the forwarding pointer in the old copy. Root references are updated to point 
to new object locations.

After the root set enumeration is complete, the GC scans objects in the
destination space. Each reached object is copied to the destination space,
the forwarding pointer is installed in the old copy, and the scanned object 
fields are
updated. For objects with forwarding pointers installed, the GC updates object 
fields. 
In this way, the GC ensures that all live objects are copied to the destination 
space exactly once.

The destination space serves as a queue of objects to be scanned when
more and more objects are copied to the destination space during heap
traversal. Once all live objects are reached and copied, the scan
queue stops growing, and the GC updates object fields only during
the last part of the scanning process.

The GC completes the scanning process when the scan pointer reaches the 
allocation
pointer in the destination space. At this stage, all live objects have been
evacuated to the destination space, and the evacuation space can be safely 
reclaimed.
The GC then changes the semispace roles: it uses the destination space for 
further allocation
and reserves the evacuation space for the next garbage collection. The change of
the semispace roles is commonly referred to as 'flip'.

After the semispace flip, the GC resumes user threads.

Please refer to the excellent survey for detailed description
of this algorithm and other basic garbage collection techniques,
"Uniprocessor Garbage Collection Techniques", Paul R. Wilson.

Source code explained
~~~~~~~~~~~~~~~~~~~~~

The full source code of the collector is available in gc.cpp.


The structure `TLS` (thread-local storage)
is used for the optimizing fast path allocation. The GC allocates
a buffer of free space from the heap with appropriate locking and further uses
this buffer for thread-local allocation.
-----
sys::[sed -n '/^\/\/ This structure is allocated for each user thread/,/^}/p' 
gc.cpp]
-----

Define the main GC structure to contain the Java heap and the data necessary
for GC operation, as shown below. 

-----
sys::[sed -n '/^\/\/ Encapsulates all GC data/,/}/p' gc.cpp]
-----

The following structure stores object information: the object field layout and
the object size.
-----
sys::[sed -n '/^\/\/ Structure OI/,/^}/p' gc.cpp]
-----
The data stored in the `OI` structure is initialized and accessed by the GC 
only.

The following structures convey the static assumptions that GC makes about
object layout. The VM must use the same object layout assumptions for the
correct GC operation.

The `VTable` structure contains the virtual table of the object methods,
and is linked from the object header. The VM reserves some space (at least 4 
bytes)
for exclusive use by GC. The GC uses 4 bytes of GC-private space to put the 
pointer
to the object information structure `struct OI`.
------
sys::[sed -n '/^\/\/ The VTable structure has 4 bytes reserved/,/^}/p' gc.cpp]
------

The GC assumes that each Java* object has a fixed header: (1) a pointer
to the `VTable` structure, and then a (2) 32 bit word with flags.
The 25 highest bits are used by the VM Thread Manager component to
implement Java* monitors and 7 lowest bits are used by GC and for
storing the object hash code.
------
sys::[sed -n '/^\/\/ Describes the object header format assumed by GC./,/^}/p' 
gc.cpp]
------

The array objects have the same header, and a 4 byte length field
at the offset 8.
------
sys::[sed -n '/^\/\/ Describes the array header format assumed by GC./,/^}/p' 
gc.cpp]
-----

.Note
The layout described is valid for the IA-32 platform only. 

A number of convenience functions use object layout knowledge to perform
various data manipulations. The function init_vt() writes the VTable pointer
to an object.

-----
sys::[sed -n '/^void init_vt/,/^}/p' gc.cpp]
-----
The function obj_oi() retrieves object information structure
pointer from an object.
-----
sys::[sed -n '/^OI *\* *obj_oi/,/^}/p' gc.cpp]
-----
The function array_length() retrieves the length of an array
object.
-----
sys::[sed -n '/^int array_length/,/^}/p' gc.cpp]
-----
The function vt_oi() retrieves  the `OI` structure pointer
from the VTable pointer.
-----
sys::[sed -n '/^OI *\* *vt_oi/,/^}/p' gc.cpp]
-----
The function ah_oi() retrieves the `OI` structure pointer
using `Allocation_Handle`. On 32-bit architectures, the
VTable pointer is a 32-bit pointer, and Allocation_Handle is a 32-bit
integer.
-----
sys::[sed -n '/^OI *\* *ah_oi/,/^}/p' gc.cpp]
-----

The object_size() function computes the size of an object. Array size is
calculated by summing the header size and the element size multiplied by array
length. Afterwards the size is aligned to be multiple of 4. The non-array
object size is cached in the OI structure.
-----
sys::[sed -n '/^int object_size/,/^}/p' gc.cpp]
-----
In this example, the garbage collector is created statically as a global
instance of structure GC:
-----
GC gc;
-----

The function init() statically configures size parameters. Normally, this
function uses the function vm_get_property() to read configuration options
specified as property values on the command line. In this example, we use
constant values for simplicity.
-----
sys::[sed -n '/^void GC::init/,/chunk_size/p' gc.cpp]
-----

As the next step, the init() function allocates space for the heap, divides it
into two semispaces, and initializes the allocation semispace.
------
sys::[sed -n '/^ *space = /,/^}/p' gc.cpp]
-----

The global allocation function uses a lock to protect the heap from
simultaneous access from multiple threads. The locking mechanism
is trivially implemented in a platform-dependent way. See the full source code 
in gc.cpp.
-----
sys::[sed -n '/^byte *\* *GC::galloc *(/,/^}/p' gc.cpp]
-----

The local allocation function uses the thread-local allocation area for object
allocation, and uses galloc() to allocate a new chunk for a thread-local
allocation area as needed.

-----
sys::[sed -n '/^byte *\* *GC::alloc *(/,/^}/p' gc.cpp]
-----

The forwarding pointers are installed in the lockword structure, the second word
of an object.
-----
sys::[sed -n '/^byte *\* *GC::forwarded *(/,/^}/p' gc.cpp]
-----

The function move() copies the object data to the evacuation semispace and
installs the forwarding pointer in the old object copy.
-----
sys::[sed -n '/^byte *\* *GC::move *(/,/^}/p' gc.cpp]
-----

The function root() handles one root during root set enumeration. If the root
points to an object already reached, the root is updated with the forwarded
pointer value. Otherwise, the GC moves the object to the destination space and
installs the forwarding pointer in the old object copy.
-----
sys::[sed -n '/^void GC::root *(/,/^}/p' gc.cpp]
-----

The function trace() scans one object.
-----
sys::[sed -n '/^void GC::trace *(/,/^}/p' gc.cpp]
-----

The function collect_alloc() is the main function controlling garbage
collection. This function reclaims unused memory and the retries the 
allocation. The GC
attempts to allocate the memory before resuming other threads. This prevents the
thread that triggered the garbage collection from starving. 

.Note
The thread is 'starving' when it gets no resources for a long time
because other threads grab the resource before it can even try.
If the garbage collector resumes user threads before retrying the allocation,
these threads may use all available space quickly before the allocation 
succeeds. 
In this case, the allocation will fail for an indefinite number of times. 

-----
sys::[sed -n '/^byte\* GC::collect_alloc/,/^}/p' gc.cpp]
-----

The exported GC interface is mostly implemented by delegating the task to the
method of the structure `GC`. The GC initialization function init() is called
from gc_init().
-----
sys::[sed -n '/^void gc_init/,/^}/p' gc.cpp]
-----

Thread local allocation areas are reset on thread creation and thread
termination events.
-----
sys::[sed -n '/^void gc_thread_init/,/^}/p' gc.cpp]

sys::[sed -n '/^void gc_thread_kill/,/^}/p' gc.cpp]
-----

The slow path allocation function gc_alloc() checks whether the allocation 
space is
exhausted and starts garbage collection when necessary.
-----
sys::[sed -n '/^Managed_Object_Handle gc_alloc *(/,/^}/p' gc.cpp]
-----

If the memory is exhausted, the no-collection allocation function
gc_alloc_fast() returns NULL, and does not start garbage collection.

-----
sys::[sed -n '/^Managed_Object_Handle gc_alloc_fast/,/^}/p' gc.cpp]
-----

The root set enumeration function passes the root reference to the root()
function.
-----
sys::[sed -n '/^void gc_add_root_set_entry *(/,/^}/p' gc.cpp]
-----

The function build_slot_offset_array() is used to construct a NULL-terminated
list of offsets of reference fields.
-----
sys::[sed -n '/^static int *\* *build_slot_offset_array/,/^}/p' gc.cpp]
-----

The GC caches object layout information when the function gc_class_prepared()
is called.
------
sys::[sed -n '/^void gc_class_prepared/,/^}/p' gc.cpp]
-----
The function gc_force_gc() starts a forced garbage collection using the global
GC lock to ensure that only one thread is doing a collection at any time. It
passes null arguments to collect_alloc(), because it requires no
allocation.
-----
sys::[sed -n '/^void gc_force_gc/,/^}/p' gc.cpp]
-----

Other functions of the GC interface are empty or trivial, and not described in
this document. You can see the full listing in the gc.cpp file.

After you completed coding the garbage collector, you can build a GC dynamic
library, as described above, by typing 
-----
build$ build.bat -DCOMPONENTS=vm.gc_copying
-----

Running VM with the custom GC
-----------------------------

This section describes how to run the DRL virtual machine with the custom
garbage collector library.

You can specify the name of the dynamic library on the command line. For
example, to load a GC gc_copying.dll, execute the following:

-----
ij -Dvm.dlls=gc_copying Hello
-----

The virtual machine searches for a dynamic library gc_copying.dll in the
default locations, that is, the value for the PATH variable and the location of
executable ij.exe. 
The default garbage collector is gc.dll located in the same bin/ directory as 
ij.exe.

GC-howto.html: GC-howto.txt GC-howto.conf
        asciidoc -o $@ -f GC-howto.conf --unsafe $<

---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: "Hot to Write GC" requires improvement

Reply via email to