Svetlana, I've looked through your changes. Mostly they look okay, and they greatly improve the visual presentation.
Originally, GC-howto was created using AsciiDoc[1] toolchain from the source text file and source .cpp file. Modifying .html file directly means that we cannot update the document to keep it in sync with the source code. I guess this is acceptable, since nobody is changing source code inlets in GC-howto now, but be warned: if anyone is to introduce source changes, it would tedious task to synchronize visual and content changes. Have you tried to configure asciidoc to produce the content you want? I will send you the version of gc-howto.txt and gc.cpp that I have, but Nadya may have a later version, so please check with her. Since I am not sure attachment will make it to the list, I'll send it to you directly. (* or to anyone else who might be interested, just ask *) [1] http://www.methods.co.nz/asciidoc/ Konovalova, Svetlana wrote: > Sorry about that! > I've created a new patch, hope it's the right one you need. Please let > me know if you still have any problems. > > [JIRA 1881] http://issues.apache.org/jira/browse/HARMONY-1881 > > Cheers, > Sveta Konovalova > > -----Original Message----- > From: Geir Magnusson Jr. [mailto:[EMAIL PROTECTED] > Sent: Tuesday, October 17, 2006 8:50 AM > To: harmony-dev@incubator.apache.org > Subject: Re: "Hot to Write GC" requires improvement > > The problem with the patch is that it's to the rendered output > > site/xdoc/subcomponent/drlvm/gc-howto.html > > when what we need is the patch to the source document > > site/xdoc/subcomponent/drlvm/gc-howto-content.html > > Can you add a new patch with that please? > > geir > > Rana Dasgupta wrote: >> This is a good document, thanks Svetlana. Even if a lot of custom gc's > >> don't >> get written, it helps in understanding the current collecor farmework > and >> how it plugs into DRLVM. >> >> Rana >> >> >> >>> On 10/16/06, Konovalova, Svetlana <[EMAIL PROTECTED]> > wrote: >>>> >>>> Folks, >>>> >>>> I took a close look at "Hot to Write GC" [1] and created a patch > for >>>> this doc [JIRA 1881]. I fixed formatting, brushed up the code, > removed >>>> out-dated tags etc. >>>> It would be great if someone can find a chance to look at the > patch. >>>> Thanks in advance! >>>> >>>> [1] >>>> > http://incubator.apache.org/harmony/subcomponents/drlvm/gc-howto.html >>>> [JIRA 1881] http://issues.apache.org/jira/browse/HARMONY-1881 >>>> >>>> >>>> Cheers, >>>> Sveta Konovalova >>>> >>>> > > --------------------------------------------------------------------- > Terms of use : http://incubator.apache.org/harmony/mailing.html > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > --------------------------------------------------------------------- > Terms of use : http://incubator.apache.org/harmony/mailing.html > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >
/** * Copyright 2005-2006 The Apache Software Foundation or its licensors, as applicable. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. * You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, software * distributed under the License is distributed on an "AS IS" BASIS, * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * See the License for the specific language governing permissions and * limitations under the License. */ #include <assert.h> #include <string.h> #include <list> #include <algorithm> #include "open/gc.h" #include "open/vm_gc.h" #include "open/vm.h" #define LOG_DOMAIN "gc" #include "cxxlog.h" typedef unsigned char byte; ////////////////////////////////////////////////// // non-GC utilities // Included to get critical sections #include <windows.h> struct Lock { CRITICAL_SECTION cs; Lock() { InitializeCriticalSection(&cs); } ~Lock() { DeleteCriticalSection(&cs); } void lock() { EnterCriticalSection(&cs); } void unlock() { LeaveCriticalSection(&cs); } }; // convenient converter from sizes in bytes to Megabytes inline int mb (int i) { return (i + 1048576/2) / 1048576; } ////////////////////////////////////////////////// // GC types // This structure is allocated for each user thread. // It contains the thread-local allocation area. struct TLS { byte* current; // the allocation pointer byte* limit; // the end of the allocation area }; struct InteriorPointer { struct Object* base; int offset; struct Object** interior_ref; }; // Encapsulates all GC data. struct GC { unsigned int semisize; // the size of the semispace unsigned int chunk_size; // the chunk size for thread-local chunks byte* space; // the pointer to the heap Lock lock; // the lock to protect global heap access byte* fromspace;// the allocation space byte* current; // the allocation marker byte* limit; // the allocation limit byte* tospace; // the evacuation space byte* scan; // the scan marker byte* copy; // the copy marker byte* toend; // the evacuation space limit // The list of thread-local storage (TLS) // structures allocated for user threads. std::list<TLS*> threads; void init(); // configures and initalizes GC void wrapup(); // destroys the heap // Allocates an object from a thread chunk // reserving space for TLS as needed. byte* alloc(unsigned size, TLS* tls); // Allocates space on the global heap. byte* galloc(unsigned size); byte* move(void*); // moves an object byte* forwarded(void*); // reads the forwarding pointer void root(void**); // handles a root reference void trace(byte*); // traces one object // Collects garbage and allocates the object. byte* collect_alloc(unsigned size, TLS* tls); std::list<InteriorPointer> interior_pointers; void repoint_all_roots_with_interior_points(); }; // Structure OI (from "object information") // is used to cache GC information for each Java class // loaded by the virtual machine. // Each VTable stores the pointer to an OI (Object information) structure. struct OI { char magic[8]; // used for debugging const char *name; // the class name (handy for debugging) bool is_array; // true if this structure describes array bool has_slots; // true if the described object has reference fields // (aka slots) and thus needs to be traced // during collection int size; // the object size or the array element size int* offsets; // zero-terminated list of slot offsets in an object // undefined for array }; ////////////////////////////////////////////////// // Static assumptions about object layout // The VTable structure has 4 bytes reserved // for GC use at the beginning. // The pointer to the OI structure is stored there. struct VTable { OI* oi; // Other VTable fields are not used in GC. }; // Describes the object header format assumed by GC. struct Object { VTable *vt; uint32 lockword; }; // Describes the array header format assumed by GC. struct Array { VTable *vt; uint32 lockword; uint32 length; }; ////////////////////////////////////////////////// // GC utility functions // void init_vt(Managed_Object_Handle p, Allocation_Handle ah) { Object* obj = (Object*)p; obj->vt = (VTable*)ah; } OI* obj_oi(Managed_Object_Handle p) { Object* obj = (Object*)p; return obj->vt->oi; } int array_length(Managed_Object_Handle p) { Array* array = (Array*)p; return array->length; } OI* vt_oi(VTable_Handle p) { VTable* vt = (VTable*)p; return vt->oi; } OI* ah_oi(Allocation_Handle ah) { // Allocation_Handle is a VTable pointer on 32-bit platforms. return vt_oi((VTable_Handle)ah); } int object_size (Managed_Object_Handle obj) { OI* oi = obj_oi(obj); if (oi->is_array) { // 4-byte alignment return ((oi->size * array_length(obj) + 12) + 3) & (~3); } else { return oi->size; } } ///////////////////////////////////////////////////// // Global GC instance GC gc; ///////////////////////////////////////////// // GC functions // void GC::init() { semisize = 500*1024*1024; chunk_size = 64*1024; space = (byte*) malloc(semisize*2); assert(space); assert(((int)space & 3) == 0); fromspace = space; tospace = fromspace + semisize; assert(((int)tospace & 3) == 0); toend = tospace + semisize; INFO("heap size " << mb(2*semisize) << " Mb " << (void*)space << "-" << (void*)(space + 2*semisize)); current = fromspace; limit = fromspace + semisize; LOG("allocation from " << (void*)current << "-" << (void*)limit); memset(current, 0, limit - current); interior_pointers.clear(); } void GC::wrapup() { ::free(space); } byte* GC::galloc(unsigned size) { byte* r = NULL; lock.lock(); if (current + size <= limit) { r = current; current += size; } lock.unlock(); return r; } byte* GC::alloc(unsigned size, TLS* tls) { byte* obj = NULL; assert(NULL == tls->current || fromspace <= tls->current); assert(NULL == tls->limit || tls->limit <= limit); if (tls->current + size <= tls->limit) { // Allocate from the thread-local chunk if possible. obj = tls->current; tls->current += size; return obj; } // Allocate "large" objects directly from the heap // bypassing the thread-local allocation buffer // to prevent inefficient handling of half-filled // thread-local allocation buffers. if (size >= chunk_size/4) { return gc.galloc(size); } // Allocate a new thread-local chunk. obj = gc.galloc(chunk_size); if (obj) { tls->current = obj + size; tls->limit = obj + chunk_size; } return obj; } byte* GC::forwarded (void* obj) { int* p = (int*)obj + 1; int lockword = *p; if (lockword & 1) return (byte*)(lockword & (~1)); else return NULL; } byte* GC::move (void* obj) { int size = object_size(obj); assert(tospace <= copy); assert(copy + size <= toend); byte* nobj = copy; TRACE2("gc.move", "move " << (void*)obj << " -> " << (void*)nobj); assert(((int)nobj & 3) == 0); memcpy(nobj, obj, size); copy += size; assert(((int)copy & 3) == 0); int* plockword = (int*)obj + 1; *plockword = ((int)nobj) | 1; return nobj; } void GC::root(void** root) { byte* obj = (byte*)(*root); byte* nobj = forwarded(obj); if (NULL == nobj) { nobj = move(obj); } TRACE2("gc.root", "root " << root << " repointed from " << (void*)obj << " to " << (void*)nobj); *root = nobj; } void GC::trace (byte* obj) { OI* oi = obj_oi(obj); TRACE2("gc.trace", "trace " << (void*)obj << " (" << (void*)object_size(obj) << ", " << oi->name << ")"); if (!oi->has_slots) return; if (oi->is_array) { int len = array_length(obj); int i; // Trace (len) elements starting from the offset 12. // NB: long[] and double[] start at offset 16 // but never need tracing. byte** elem = (byte**)(obj + 12); for (i = 0; i < len; i++, elem += 1) { if (NULL == *elem) continue; byte* nobj = forwarded(*elem); if (!nobj) nobj = move(*elem); TRACE2("gc.update", "elem " << i << " in array " << (void*)obj << " repointed from " << (void*)(*elem) << " to " << (void*)nobj); *elem = nobj; } } else { int* poff; // Use (offsets) array to get the list of reference // field offsets in an object. for (poff = oi->offsets; *poff; poff++) { byte** field = (byte**) (obj + *poff); if (NULL == *field) continue; byte* nobj = forwarded(*field); if (!nobj) nobj = move(*field); TRACE2("gc.update", "field " << *poff << " in object " << (void*)obj << " repointed from " << (void*)(*field) << " to " << (void*)nobj); *field = nobj; } } } void GC::repoint_all_roots_with_interior_points() { for( std::list<InteriorPointer>::iterator it = interior_pointers.begin(); it != interior_pointers.end(); it++ ){ Object** root_ref = (*it).interior_ref; Object* root_base = (*it).base; const int root_offset = (*it).offset; Object* new_slot_contents = root_base + root_offset; if( new_slot_contents != *root_ref ){ *root_ref = new_slot_contents; } } interior_pointers.clear(); return; } byte* GC::collect_alloc(unsigned size, TLS* tls) { scan = tospace; copy = tospace; toend = tospace + semisize; LOG("allocated " << (current - fromspace) << " bytes"); LOG("collection to " << (void*)tospace << "-" << (void*)toend); vm_enumerate_root_set_all_threads(); LOG("scan"); while (scan < copy) { trace(scan); scan += object_size(scan); } LOG("live " << (copy-tospace) << " bytes"); if( !interior_pointers.empty() ){ repoint_all_roots_with_interior_points(); } byte* swap = tospace; tospace = fromspace; fromspace = swap; current = copy; limit = fromspace + semisize; memset(current, 0, limit - current); LOG("allocation from " << (void*)current << "-" << (void*)limit); std::list<TLS*>::iterator i; int j; for (i = gc.threads.begin(), j = 0; i != gc.threads.end(); i++, j++) { (*i) -> current = NULL; (*i) -> limit = NULL; } LOG2("gc.threads", "reset thread allocation areas in " << j << " user threads"); byte* obj = NULL; if (size > 0) { // Allocate an object before resuming threads to maintain // "fairness" and prevent spurious out-of-memory errors. obj = alloc(size, tls); } vm_resume_threads_after(); return obj; } ////////////////////////////////////////////////// // Interface functions // void gc_init() { gc.init(); } void gc_vm_initialized() { } void gc_wrapup() { gc.wrapup(); } void gc_thread_init(void* tp) { TRACE2("gc.thread", "gc_thread_init " << tp); TLS* tls = (TLS*) tp; std::list<TLS*>::iterator i = std::find(gc.threads.begin(), gc.threads.end(), tls); assert(i == gc.threads.end()); gc.threads.push_back(tls); tls->current = NULL; tls->limit = NULL; } void gc_thread_kill(void* tp) { TRACE2("gc.thread", "gc_thread_kill " << tp); TLS* tls = (TLS*) tp; std::list<TLS*>::iterator i = std::find(gc.threads.begin(), gc.threads.end(), tls); assert(i != gc.threads.end()); gc.threads.erase(i); tls->current = NULL; tls->limit = NULL; } Managed_Object_Handle gc_alloc (unsigned size, Allocation_Handle ah, void *tp) { Managed_Object_Handle obj; TLS* tls = (TLS*) tp; // The next-to-highest bit of the size may be set // when allocating objects requiring finalization. // Ignore hint for now. size = size & 0x3fffffff; assert((size & 3) == 0); assert(ah_oi(ah)->is_array || size == ah_oi(ah)->size); // First, try the allocation obj = gc.alloc(size, tls); if (!obj) { // If the allocation failed, // grab the global GC lock. vm_gc_lock_enum(); // Multiple threads may try to get the GC lock. // Only one gets it and does a collection, and others get // blocked on vm_gc_lock_enum() during collection. // Retry the allocation while holding the GC lock. obj = gc.alloc(size, tls); // The allocation will succeed if another thread // has done collection while this thread was waiting for the GC lock. if (!obj) { // If the allocation failed, start a garbage collection. obj = gc.collect_alloc(size, tls); // NULL return value from collect_alloc() indicates out-of-memory. } vm_gc_unlock_enum(); } if (obj) init_vt(obj, ah); TRACE2("gc.alloc.slow", "gc_alloc(" << (void*)size << ", " << (*(OI**)ah)->name << ") = " << obj); assert(NULL == obj || (gc.fromspace <= obj && obj < gc.limit && ((int)obj & 3) == 0)); return obj; } Managed_Object_Handle gc_alloc_fast (unsigned size, Allocation_Handle ah, void *tp) { Managed_Object_Handle obj; TLS* tls = (TLS*) tp; size = size & 0x3fffffff; assert((size & 3) == 0); assert(ah_oi(ah)->is_array || size == ah_oi(ah)->size); obj = gc.alloc(size, tls); if (obj) init_vt(obj, ah); TRACE2("gc.alloc.fast", "gc_alloc_fast(" << (void*)size << ", " << (*(OI**)ah)->name << ") = " << obj); assert(NULL == obj || (gc.fromspace <= obj && obj < gc.limit && ((int)obj & 3) == 0)); return obj; } Boolean gc_requires_barriers() { return FALSE; } Boolean gc_supports_compressed_references() { return FALSE; } void gc_write_barrier(Managed_Object_Handle p_base_of_object_holding_ref) { assert(!"gc does not support write barriers"); } void *gc_heap_base_address() { return (void *)gc.space; } void *gc_heap_ceiling_address() { return (void *)(gc.space + 2*gc.semisize); } void gc_add_root_set_entry(Managed_Object_Handle *ref, Boolean is_pinned) { assert(!is_pinned); TRACE2("gc.root", "gc_add_root_set_entry " << ref << " -> " << *ref); if (NULL == *ref) return; gc.root(ref); } Boolean gc_is_object_pinned(Managed_Object_Handle obj) { return FALSE; } static int *build_slot_offset_array(Class_Handle ch) { unsigned num_ref_fields = 0; // Get the total number of fields including primitive fields. unsigned num_fields = class_num_instance_fields_recursive(ch); // Compute the number of reference fields. unsigned int i; for (i = 0; i < num_fields; i++) { // For each field, get its handle and check // whether it's a reference type. Field_Handle fh = class_get_instance_field_recursive(ch, i); if (field_is_reference(fh)) { num_ref_fields++; } } if (0 == num_ref_fields) return NULL; // Allocate the offsets array. int* ref_array = (int*) malloc((num_ref_fields+1) * sizeof(int)); // For each reference field, store its offset // into the offsets array. int* p = ref_array; for (i = 0; i < num_fields; i++) { Field_Handle fh = class_get_instance_field_recursive(ch, i); if (field_is_reference(fh)) { *p = field_get_offset(fh); p++; } } // It is 0 delimited. *p = 0; return ref_array; } void gc_class_prepared (Class_Handle ch, VTable_Handle vth) { TRACE2("gc.prepared", "gc_class_prepared(" << class_get_name(ch) << ")"); OI** vt = (OI**) vth; OI* oi = new OI; *vt = oi; memset(oi, 0, sizeof(OI)); strcpy(oi->magic, " OI "); oi->name = class_get_name(ch); if (class_is_array(ch)) { oi->is_array = true; // Store the array element size in the OI structure > size. oi->size = class_element_size(ch); // Reference arrays have slots, non-reference arrays don't. oi->has_slots = !class_is_non_ref_array(ch); assert(NULL == oi->offsets); } else { oi->is_array = false; // Store the object size in the OI structure > size. oi->size = class_get_boxed_data_size(ch); assert((oi->size & 3) == 0); oi->offsets = build_slot_offset_array(ch); oi->has_slots = (oi->offsets != NULL); } } void gc_add_root_set_entry_interior_pointer (void **slot, int offset, Boolean is_pinned) { assert(offset > 0); assert( !is_pinned ); Object* p_obj = (Object *)((Byte*)*slot - offset); assert( p_obj->vt != NULL ); InteriorPointer ip; ip.offset = offset; ip.interior_ref = (Object**)slot; ip.base = (Object*)((*(unsigned char**)slot) - offset); OI* oi = obj_oi(ip.base); TRACE2("gc.ip", "interior_pointer " << (void*)ip.base << " (" << (void*)object_size(ip.base) << ", " << oi->name << ")"); gc.interior_pointers.push_back( ip ); gc_add_root_set_entry( (Managed_Object_Handle*)&(ip.base), is_pinned ); return; } void gc_force_gc () { vm_gc_lock_enum(); gc.collect_alloc(0, NULL); vm_gc_unlock_enum(); } int64 gc_free_memory() { return 0; } void gc_pin_object(Managed_Object_Handle*) { } void gc_unpin_object(Managed_Object_Handle*) { } Managed_Object_Handle gc_get_next_live_object(void*) { assert(!"gc does not support heap iteration"); return NULL; } void gc_finalize_on_exit() { } unsigned gc_time_since_last_gc() { return 0; } int64 gc_total_memory() { return 2*gc.semisize; } int64 gc_max_memory() { return 2*gc.semisize; }
# Copyright 2005 The Apache Software Foundation or its licensors, as applicable. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # [miscellaneous] numbered [specialwords] monospacedwords = \w+\(\) \w+(?:\/[\w.]*)+ \w+\.dll \w+\.exe \w+\.h(?!\w) \w+\.cpp \w+\.xml NULL [header] <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset={encoding}" /> <meta name="generator" content="AsciiDoc {asciidoc-version}" /> <style type="text/css"> include::{stylesdir={asciidoc-dir}/stylesheets}/{theme={backend}}.css[] ifdef::quirks[] include::{stylesdir={asciidoc-dir}/stylesheets}/{theme={backend}}-quirks.css[] endif::quirks[] div.sidebar-content { font-size: 70%; background: #ffcccc; border: 1px solid silver; padding: 3pt; width: 50%; display: float; position: relative; left: 900px; bottom: 2em; -moz-border-radius: 12px; } body { width: 80%; text-align: justify; } tt { font-size: 10pt; } strong { font-weight: bold; } </style> <title>{doctitle}</title> </head> <body> # Article, book header. <div id="header"> <h1>{doctitle}</h1> <span id="author">{author}</span><br /> <span id="email"><tt><<a href="mailto:{email}">{email}</a>></tt></span><br /> <span id="revision">version {revision}{date?,}</span> {date} </div>
How to write DRL GC =================== [EMAIL PROTECTED], [EMAIL PROTECTED] revision 1.0, 2006-04-19 // // Copyright 2005 The Apache Software Foundation or its licensors, as applicable. // // Licensed under the Apache License, Version 2.0 (the "License"); // you may not use this file except in compliance with the License. // You may obtain a copy of the License at // // http://www.apache.org/licenses/LICENSE-2.0 // // Unless required by applicable law or agreed to in writing, software // distributed under the License is distributed on an "AS IS" BASIS, // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. // // // Generate HTML version of this document from text source // and configuration file GC-howto.conf using the command // // asciidoc -f GC-howto.conf --unsafe GC-howto.txt // // Download Asciidoc generic distribution archive from // // http://www.methods.co.nz/asciidoc/downloads.html // // unpack it somewhere (e.g. /usr/local/opt/asciidoc), and // symlink /usr/local/opt/asciidoc/asciidoc.py to /usr/local/bin/asciidoc // This document provides instructions on creating a custom garbage collector implementation in C++ and configuring the DRL virtual machine to use it. The section describes the major steps of this procedure, namely: - Establishing the build infrastructure - Implementing the GC interface - Implementing the garbage collector algorithm - Running the VM with the custom GC .Note Plugging-in a user-designed garbage collector presupposes an operating DRL virtual machine built according to the instructions of the README.txt file supplied with the VM source package. Establishing the build infrastructure ------------------------------------- At this stage, you create the directory and set up the build infrastructure to build the dynamic library. At the end of this stage, you will be fully set for adding the garbage collector code and building it. DRLVM can load a custom garbage collector from a dynamic library. It is recommended that you build your dynamic library using a DRLVM build infrastructure. Below is an example of creating of a build descriptor on the Windows* / IA-32 architecture. Create a directory for a new GC module, for example: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ----- vm$ mkdir gc_copying vm$ mkdir gc_copying/src vm$ cd gc_copying/src ----- That is where you will put the source code, see Section 3, Implementing the garbage collector algorithm. Create a build descriptor file ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Create the build descriptor file build/make/components/vm/gc_copying.xml with the following content: ----- sys::[sed -n '/<project/,$p' gc_copying.xml] ----- You can add other macro definitions, include directories or compiler-specific command-line options to match your needs. Create a C++ file with essential includes, namely: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ----- sys::[sed -n '/^#include "open/,/^#include "cxxlog/p' gc.cpp] ----- These include files are located in directories vm/include/open and vm/port/include. Consult their content for documentation and details of the interface. Test the configuration ~~~~~~~~~~~~~~~~~~~~~~ Run the build system to test whether the infrastructure is set up correctly: ----- build$ build.bat -DCOMPONENTS=vm.gc_copying ----- On a successful build, the .dll file is placed to the VM build directory build/win_ia32_icl_debug/deploy/jre/bin/. The name of the directory may differ depending on your system and the compiler used. This empty library will not work, you have to write your GC first! Implementing the GC interface ----------------------------- This section lists the functions that a garbage collector interface must implement. Declarations of these functions are in gc.h. For details, consult the Developer's Guide and documentation in gc.h and vm_gc.h. GC lifecycle ~~~~~~~~~~~~ * gc_init() initializes the garbage collector * gc_wrapup() shuts down the GC * gc_vm_initialized() notifies the GC about the VM transition from the initialization stage to running user applications * gc_thread_init() and gc_thread_kill() notify the GC about creation and termination of user threads that may request memory allocation or other GC services * gc_class_prepared() notifies the GC about loaded and prepared classes Object allocation ~~~~~~~~~~~~~~~~~ * gc_alloc() performs slow allocation, can trigger garbage collection * gc_alloc_fast() performs faster allocation, should not trigger garbage collection * gc_add_root_set_entry() is responsible for enumeration * gc_add_root_set_entry() enumerates one root pointer See the Root set enumeration section in the Developer's Guide for details. Miscellaneous ~~~~~~~~~~~~~ * gc_supports_compressed_references() indicates whether GC supports compressed references * gc_is_object_pinned() indicates whether the GC will move an object or not * gc_force_gc() forces a garbage collection Optional ~~~~~~~~ The virtual machine can operate without the functions listed below, but certain features will be unavailable. * gc_free_memory() returns the estimated amount of memory available for allocation * gc_pin_object() requests that the GC does not move an object * gc_unpin_object() removes the restriction on not moving an object * gc_get_next_live_object() iterates over live objects during the stop-the-world phase of garbage collection * gc_finalize_on_exit() transfers finalizable queue contents to the VM core on shutdown * gc_time_since_last_gc() returns the amount of time that elapsed since the previous collection, in milliseconds * gc_total_memory() returns the overall amount of memory used for the Java heap * gc_max_memory() returns the maximum amount of memory that can be used for the Java heap The `VM_GC` interface ~~~~~~~~~~~~~~~~~~~~~ The garbage collector requires VM support in its operation. The virtual machine exports the VM_GC interface to meet the needs of the garbage collector. Besides, the GC uses the VM_common interface. The VM_GC interface describes the services that the VM provides specifically for the garbage collector. Please refer to the header file vm_gc.h to see the complete list and documentation. The VM exports two functions to provide the global locking service for the garbage collector: vm_gc_lock_enum() and vm_gc_unlock_enum(). These two functions differ from plain system locks in their ability to gracefully interoperate with VM threading services. In case of contention on the GC lock, that is, when multiple threads call vm_gc_lock_enum() simultaneously, one thread gets the lock, and others remain blocked. If the thread that grabbed the GC lock does a garbage collection, the blocked threads are considered safely suspended. Other ways to lock user threads for a long time can lead to a deadlock because the VM will have no way to find out whether the thread is blocked or running. A detailed description of GC procedure is given in the Developers' Guide. DRLVM provides two functions to support thread suspension and root set enumeration simultaneously: * vm_enumerate_root_set_all_threads() suspends all user threads and initiates root set enumeration * vm_resume_threads_after() resumes execution of user threads These functions effectively restrict the garbage collector to stop-the-world algorithms only. Implementing the garbage collector algorithm -------------------------------------------- This section gives step-by-step instructions on how to implement the garbage collection algorithm. The example shows a semispace copying collector. .Note This example does not implement object finalization and weak references. [[gc_algorithm]] Algorithm Overview ~~~~~~~~~~~~~~~~~~ The heap is divided into two equally sized contiguous semispaces. During normal operation, only one semispace is used ('current semispace'), and the other one is 'reserved' for garbage collection. Allocation requests are satisfied by contiguous allocation from the current semispace. Each application thread reserves a thread-local allocation buffer ('TLAB') under a global lock, and serves most of the allocation requests without locking, by incrementing the allocation pointer local to the buffer. When the application requests an allocation that does not fit into the remaining free space of the current semispace, a garbage collection is initiated. The current semispace becomes the 'evacuation space' ('fromspace'), and the reserved semispace becomes the 'destination space' ('tospace'). The VM suspends all application threads and enumerates root references. The GC copies the objects reachable from root references to the destination space. When an object is copied from evacuation space to destination space, the GC installs the forwarding pointer in the old copy. Root references are updated to point to new object locations. After the root set enumeration is complete, the GC scans objects in the destination space. Each reached object is copied to the destination space, the forwarding pointer is installed in the old copy, and the scanned object fields are updated. For objects with forwarding pointers installed, the GC updates object fields. In this way, the GC ensures that all live objects are copied to the destination space exactly once. The destination space serves as a queue of objects to be scanned when more and more objects are copied to the destination space during heap traversal. Once all live objects are reached and copied, the scan queue stops growing, and the GC updates object fields only during the last part of the scanning process. The GC completes the scanning process when the scan pointer reaches the allocation pointer in the destination space. At this stage, all live objects have been evacuated to the destination space, and the evacuation space can be safely reclaimed. The GC then changes the semispace roles: it uses the destination space for further allocation and reserves the evacuation space for the next garbage collection. The change of the semispace roles is commonly referred to as 'flip'. After the semispace flip, the GC resumes user threads. Please refer to the excellent survey for detailed description of this algorithm and other basic garbage collection techniques, "Uniprocessor Garbage Collection Techniques", Paul R. Wilson. Source code explained ~~~~~~~~~~~~~~~~~~~~~ The full source code of the collector is available in gc.cpp. The structure `TLS` (thread-local storage) is used for the optimizing fast path allocation. The GC allocates a buffer of free space from the heap with appropriate locking and further uses this buffer for thread-local allocation. ----- sys::[sed -n '/^\/\/ This structure is allocated for each user thread/,/^}/p' gc.cpp] ----- Define the main GC structure to contain the Java heap and the data necessary for GC operation, as shown below. ----- sys::[sed -n '/^\/\/ Encapsulates all GC data/,/}/p' gc.cpp] ----- The following structure stores object information: the object field layout and the object size. ----- sys::[sed -n '/^\/\/ Structure OI/,/^}/p' gc.cpp] ----- The data stored in the `OI` structure is initialized and accessed by the GC only. The following structures convey the static assumptions that GC makes about object layout. The VM must use the same object layout assumptions for the correct GC operation. The `VTable` structure contains the virtual table of the object methods, and is linked from the object header. The VM reserves some space (at least 4 bytes) for exclusive use by GC. The GC uses 4 bytes of GC-private space to put the pointer to the object information structure `struct OI`. ------ sys::[sed -n '/^\/\/ The VTable structure has 4 bytes reserved/,/^}/p' gc.cpp] ------ The GC assumes that each Java* object has a fixed header: (1) a pointer to the `VTable` structure, and then a (2) 32 bit word with flags. The 25 highest bits are used by the VM Thread Manager component to implement Java* monitors and 7 lowest bits are used by GC and for storing the object hash code. ------ sys::[sed -n '/^\/\/ Describes the object header format assumed by GC./,/^}/p' gc.cpp] ------ The array objects have the same header, and a 4 byte length field at the offset 8. ------ sys::[sed -n '/^\/\/ Describes the array header format assumed by GC./,/^}/p' gc.cpp] ----- .Note The layout described is valid for the IA-32 platform only. A number of convenience functions use object layout knowledge to perform various data manipulations. The function init_vt() writes the VTable pointer to an object. ----- sys::[sed -n '/^void init_vt/,/^}/p' gc.cpp] ----- The function obj_oi() retrieves object information structure pointer from an object. ----- sys::[sed -n '/^OI *\* *obj_oi/,/^}/p' gc.cpp] ----- The function array_length() retrieves the length of an array object. ----- sys::[sed -n '/^int array_length/,/^}/p' gc.cpp] ----- The function vt_oi() retrieves the `OI` structure pointer from the VTable pointer. ----- sys::[sed -n '/^OI *\* *vt_oi/,/^}/p' gc.cpp] ----- The function ah_oi() retrieves the `OI` structure pointer using `Allocation_Handle`. On 32-bit architectures, the VTable pointer is a 32-bit pointer, and Allocation_Handle is a 32-bit integer. ----- sys::[sed -n '/^OI *\* *ah_oi/,/^}/p' gc.cpp] ----- The object_size() function computes the size of an object. Array size is calculated by summing the header size and the element size multiplied by array length. Afterwards the size is aligned to be multiple of 4. The non-array object size is cached in the OI structure. ----- sys::[sed -n '/^int object_size/,/^}/p' gc.cpp] ----- In this example, the garbage collector is created statically as a global instance of structure GC: ----- GC gc; ----- The function init() statically configures size parameters. Normally, this function uses the function vm_get_property() to read configuration options specified as property values on the command line. In this example, we use constant values for simplicity. ----- sys::[sed -n '/^void GC::init/,/chunk_size/p' gc.cpp] ----- As the next step, the init() function allocates space for the heap, divides it into two semispaces, and initializes the allocation semispace. ------ sys::[sed -n '/^ *space = /,/^}/p' gc.cpp] ----- The global allocation function uses a lock to protect the heap from simultaneous access from multiple threads. The locking mechanism is trivially implemented in a platform-dependent way. See the full source code in gc.cpp. ----- sys::[sed -n '/^byte *\* *GC::galloc *(/,/^}/p' gc.cpp] ----- The local allocation function uses the thread-local allocation area for object allocation, and uses galloc() to allocate a new chunk for a thread-local allocation area as needed. ----- sys::[sed -n '/^byte *\* *GC::alloc *(/,/^}/p' gc.cpp] ----- The forwarding pointers are installed in the lockword structure, the second word of an object. ----- sys::[sed -n '/^byte *\* *GC::forwarded *(/,/^}/p' gc.cpp] ----- The function move() copies the object data to the evacuation semispace and installs the forwarding pointer in the old object copy. ----- sys::[sed -n '/^byte *\* *GC::move *(/,/^}/p' gc.cpp] ----- The function root() handles one root during root set enumeration. If the root points to an object already reached, the root is updated with the forwarded pointer value. Otherwise, the GC moves the object to the destination space and installs the forwarding pointer in the old object copy. ----- sys::[sed -n '/^void GC::root *(/,/^}/p' gc.cpp] ----- The function trace() scans one object. ----- sys::[sed -n '/^void GC::trace *(/,/^}/p' gc.cpp] ----- The function collect_alloc() is the main function controlling garbage collection. This function reclaims unused memory and the retries the allocation. The GC attempts to allocate the memory before resuming other threads. This prevents the thread that triggered the garbage collection from starving. .Note The thread is 'starving' when it gets no resources for a long time because other threads grab the resource before it can even try. If the garbage collector resumes user threads before retrying the allocation, these threads may use all available space quickly before the allocation succeeds. In this case, the allocation will fail for an indefinite number of times. ----- sys::[sed -n '/^byte\* GC::collect_alloc/,/^}/p' gc.cpp] ----- The exported GC interface is mostly implemented by delegating the task to the method of the structure `GC`. The GC initialization function init() is called from gc_init(). ----- sys::[sed -n '/^void gc_init/,/^}/p' gc.cpp] ----- Thread local allocation areas are reset on thread creation and thread termination events. ----- sys::[sed -n '/^void gc_thread_init/,/^}/p' gc.cpp] sys::[sed -n '/^void gc_thread_kill/,/^}/p' gc.cpp] ----- The slow path allocation function gc_alloc() checks whether the allocation space is exhausted and starts garbage collection when necessary. ----- sys::[sed -n '/^Managed_Object_Handle gc_alloc *(/,/^}/p' gc.cpp] ----- If the memory is exhausted, the no-collection allocation function gc_alloc_fast() returns NULL, and does not start garbage collection. ----- sys::[sed -n '/^Managed_Object_Handle gc_alloc_fast/,/^}/p' gc.cpp] ----- The root set enumeration function passes the root reference to the root() function. ----- sys::[sed -n '/^void gc_add_root_set_entry *(/,/^}/p' gc.cpp] ----- The function build_slot_offset_array() is used to construct a NULL-terminated list of offsets of reference fields. ----- sys::[sed -n '/^static int *\* *build_slot_offset_array/,/^}/p' gc.cpp] ----- The GC caches object layout information when the function gc_class_prepared() is called. ------ sys::[sed -n '/^void gc_class_prepared/,/^}/p' gc.cpp] ----- The function gc_force_gc() starts a forced garbage collection using the global GC lock to ensure that only one thread is doing a collection at any time. It passes null arguments to collect_alloc(), because it requires no allocation. ----- sys::[sed -n '/^void gc_force_gc/,/^}/p' gc.cpp] ----- Other functions of the GC interface are empty or trivial, and not described in this document. You can see the full listing in the gc.cpp file. After you completed coding the garbage collector, you can build a GC dynamic library, as described above, by typing ----- build$ build.bat -DCOMPONENTS=vm.gc_copying ----- Running VM with the custom GC ----------------------------- This section describes how to run the DRL virtual machine with the custom garbage collector library. You can specify the name of the dynamic library on the command line. For example, to load a GC gc_copying.dll, execute the following: ----- ij -Dvm.dlls=gc_copying Hello ----- The virtual machine searches for a dynamic library gc_copying.dll in the default locations, that is, the value for the PATH variable and the location of executable ij.exe. The default garbage collector is gc.dll located in the same bin/ directory as ij.exe.
GC-howto.html: GC-howto.txt GC-howto.conf asciidoc -o $@ -f GC-howto.conf --unsafe $<
--------------------------------------------------------------------- Terms of use : http://incubator.apache.org/harmony/mailing.html To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]