On 10/20/05, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]> wrote:
> Robin, Rodrigo,
>
> Perhaps the two of you could get your heads together
> on GC issues? I think both of you have been thinking
> along related lines on the structure of GC for this JVM.
> What do you think?
>
>
> Further comments follow...
>
> -----Original Message-----
> From: Rodrigo Kumpera <[EMAIL PROTECTED]>
> Sent: Oct 19, 2005 4:49 PM
> To: harmony-dev@incubator.apache.org
> Subject: Re: Some questions about the architecture
>
> On 10/19/05, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]> wrote:
> >
> >
> > -----Original Message-----
> > From: Rodrigo Kumpera <[EMAIL PROTECTED]>
> > Sent: Oct 19, 2005 1:49 PM
> > To: harmony-dev@incubator.apache.org, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]>
> > Subject: Re: Some questions about the architecture
> >
> > On 10/19/05, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]> wrote:
> >
> > ...snip...
> >
> > Notice that in 'jvm/src/jvmcfg.h' there is a JVMCFG_GC_THREAD
> > that is used in jvm_run() as a regular thread like any other.
> > It calls gc_run() on a scheduled basis. Also, any time an object
> > finalize() is done, gc_run() is possible. Yes, I treat GC as a
> > stop-the-world process, but here is the key: Due to the lack
> > of asynchronous native POSIX threads, there are no safe points
> > required. The only thread is the SIGALRM target that sets the
> > volatile boolean in timeslice_tick() for use by opcode_run() to
> > test. <b>This is the _only_ formally asynchronous data structure in
> > the whole machine.</b> (Bold if you use an HTML browser, otherwise
> > clutter meant for emphasis.) Objects that contain no references can
> > be GC'd since they are merely table entries. Depending on how the
> > GC algorithm is done, gc_run() may or may not even need to look
> > at a particular object.
> >
> > Notice also that classes are treated in the same way by the GC API.
> > If a class is no longer referenced by any objects, it may be GC'd also.
> > First, its intrinsic class object must be GC'd, then the class itself. This
> > may take more than one pass of gc_run() to make it happen.
> >
> > ---
>
> How exactly is the Java thread stack scanned for references in this
> scheme? Safepoints are required for 2 reasons: first, to allow native
> threads to pause properly, and second, to make it easier for the garbage
> collector to identify what on the stack is a reference and what is not.
>
> The first one is a non-issue in this case, but the second one is, as
> precise "java stack" scanning is required for any moving collector
> (e.g. semi-space, train or mark-sweep-compact). The solution for the
> second problem is either to have a tagged stack (we tag each slot in the
> stack as a reference or not) or to generate gc_maps for all bytecodes
> of a method (memory-wise, this is not practical, and with JIT'ed code it
> is even worse).
>
> ---
>
> That depends on the GC implementation. Look at 'jvm/src/gc_stub.c'
> for the stub reference implementation. To see the mechanics of
> how to fit it into the compile environment, look at the GC and heap
> setup in 'config.sh' and at 'jvm/src/heap.h' for how multiple heap
> implementations get configured in.
>
> The GC interface API that I defined may or may not be adequate
> for everything. I basically set it up so that any time an object reference
> was added or deleted, I called a GC function. The same goes for
> class loading and unloading. For local variables on the JVM stack for each
> thread, the GC functions are slightly different than for fields in an object,
> but the principle is the same.
>
> ---
>
> >
> >
> > For example, as I understand it, JikesRVM implements gc safepoints (the
> > points in the bytecode where gc maps are generated) at loop backedges
> > and method calls.
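
To make the tagged-stack option above concrete, here is a minimal sketch in C. Every name in it (tagged_stack, push_int, push_ref, scan_roots) is hypothetical and nothing is taken from the Harmony sources; it only shows how a per-slot tag lets a collector tell a reference from an integer that happens to have the same bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; none are taken from the Harmony sources. */
#define STACK_SLOTS 64

typedef struct {
    uintptr_t slot[STACK_SLOTS];   /* raw value: an int or an object handle */
    uint8_t   is_ref[STACK_SLOTS]; /* tag: 1 if the slot holds a reference  */
    size_t    top;                 /* number of slots currently in use      */
} tagged_stack;

/* Push a primitive; the tag tells the collector to ignore this slot. */
static void push_int(tagged_stack *s, int32_t v)
{
    assert(s->top < STACK_SLOTS);
    s->slot[s->top] = (uintptr_t)(uint32_t)v;
    s->is_ref[s->top] = 0;
    s->top++;
}

/* Push an object reference; the tag makes this slot a GC root. */
static void push_ref(tagged_stack *s, uintptr_t handle)
{
    assert(s->top < STACK_SLOTS);
    s->slot[s->top] = handle;
    s->is_ref[s->top] = 1;
    s->top++;
}

/* Precise scan: copy only tagged slots into out[] and return how many
 * references were found (only the first cap of them are stored). */
static size_t scan_roots(const tagged_stack *s, uintptr_t *out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < s->top; i++) {
        if (!s->is_ref[i])
            continue;
        if (n < cap)
            out[n] = s->slot[i];
        n++;
    }
    return n;
}
```

The cost of this approach is one tag write per push and one extra byte per slot; the benefit is that a moving collector can update exactly the slots that scan_roots reports, without gc maps.
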
> >
> > > The priorities that I set were (1) get the logic working
> > > without resorting to design changes such as multi-threading,
> > > then (2) optimize the implementation and evaluate
> > > improvements and architectural changes, then (3) implement
> > > improvements and architectural changes. The same goes
> > > for the object model using the OBJECT() macro and the
> > > 'robject' structure in 'jvm/src/object.h'. And the CLASS()
> > > macro, and the STACK() macro, and other components
> > > that I have tried to implement in a modular fashion (see 'README'
> > > for a discussion of this issue). Let's get it working, then look into
> > > design changes, even having more than one option available at
> > > configuration time, compile time, or even run time, such as is
> > > now the case with the HEAP_xxx() macros and the GC_xxx()
> > > macros that Robin Garner has been asking about.
> > >
> > > As to the 'jvm/src/timeslice.c' code, notice that each
> > > time that SIGALRM is received, the handler sets a
> > > volatile boolean that is read by the JVM inner loop
> > > in 'while ( ... || (rfalse == pjvm->timeslice_expired))'
> > > in 'jvm/src/opcode.c' to check if it is time to give the
> > > next thread some time. I don't expect this to be the
> > > most efficient check, but it _should_ work properly
> > > since I have unit tested the time slicing code, both
> > > the while() test and the setting of the boolean in
> > > timeslice_tick(). One thing I have heard on this
> > > list is that one of the implementations, I think it was
> > > IBM's Jikes (?), chose an interpreter
> > > over a JIT. Now that is not directly related to time
> > > slicing, but it does mean that a mechanism like what I
> > > implemented does not have to have compile-time
> > > support.
> > >
> > > *** How about you JVM experts out there? Do you have
> > > any wisdom for me on the subject of time slicing
> > > on an outer/inner loop interpreter
> > > implementation?
> > > And compared to JIT? Archie Cobb,
> > > what do you think? How about you lurkers out there? ***
> >
> > All open source JVMs I checked use native threads. You can take a look
> > at how IBM did it with the Native POSIX Threading Library (NPTL), as it
> > implements userland threads on Linux.
> >
> > ---
> >
> > I would be interested in your evaluation of the existing implementation
> > against what could be done to implement such an approach.
>
> First, it was not NPTL but NGPT, the project IBM created; my fault.
> The IBM site seems to be offline. From what I remember, it implemented
> userland threads with coordination of the kernel to do context switching
> and scheduling, basically using signals to perform the context switch.
>
> Anyway, I think it would be a good decision to switch soon to a
> native thread implementation, as it requires less code to have proper
> scheduling and good I/O primitives.
>
> ---
>
> There is a case to be made for this approach since it may perform
> better under load than the one that I have implemented.
>
> ---
>
> > ---
> >
> > > As to your question about setjmp/longjmp, I agree that
> > > there are other ways to do it. In fact, I originally used
> > > stack walking in one sense to return from fatal errors
> > > instead for my original implementation of the heap
> > > allocator, which used malloc/free. If I got an error
> > > from malloc(), I simply returned a NULL pointer, which
> > > I tested from the calling function. If I got this error,
> > > I returned to its caller with an error, and so on, all the
> > > way up. However, what happens when you have a
> > > normally (void) return? Use TRUE/FALSE instead?
> > > Could be. But the more I developed the code, the
> > > harder this became to support.
> > > Therefore, since fatal
> > > errors kill the application anyway, I decided to _VASTLY_
> > > simplify the code by using what is effectively the OO concept
> > > of an exception as available in the 'C' runtime library
> > > with setjmp/longjmp. Notice that many complicated models
> > > can end up with irresolvable terminal conditions and that
> > > the simplest way to escape is back to a known good state.
> > > This is the purpose of setjmp/longjmp. Try this on for size
> > > with any communication protocol implementation, such as
> > > TCP/IP some time. When you get to a snarled condition where
> > > there just is not any graceful way out, the non-local character
> > > of setjmp/longjmp cuts that knot instead of untying it with
> > > horrible error code checking back up the stack. This is why
> > > I finally decided to go this way. (Does this answer your main
> > > question here?)
> >
> > It does, but by stack walking I meant not returning null, but having
> > the code analyze the call stack for a proper IP address to use.
> >
> > ---
> > What do you mean by 'IP address' in this context? I think I am
> > missing something.
> > ---
> >
> By IP I mean Instruction Pointer, the EIP register in x86 for example.
> What I meant was something like this:
>
> void throw_exception(jobject_t *ex) {
>     long * ip = (long *)(*(&ex - 1)); //the return address is after the arguments
>     long * sp = (long *)(*(&ex - 2)); //the old frame pointer is after the return address
>     jclass_t * cl = ex->vtable->class_obj;
>
>     printf("obj %p ip %p sp %p\n", (void *)ex, (void *)ip, (void *)sp);
>
>     printf("------\n");
>     //this code performs stack unwinding, it misses synchronized methods.
>     while(isNotThreadBaseFunction(ip)) {
>         printf("trace element ip %p sp %p\n", (void *)ip, (void *)sp);
>         catch_info * info = find_catch_info(ip, cl);
>         if(info) restore_to(ip, sp, ex, info);
>         ip = (long *)*(sp + 1);
>         sp = (long *)*sp;
>     }
>     printf("-----\n");
>     fflush(stdout);
>     //uncaught exception, must never happen, this is a JVM bug.
>     //in my vm, at least, uncaught exceptions were handled by the
>     //implementation of Thread.
> }
>
> find_catch_info was implemented in java, but looks something like this
> (don't bother with the linear search for now):
>
> catch_info * find_catch_info(long *ip, jclass_t *ex) {
>     if(ip < vm->compiledMethodsStart || ip > vm->compiledMethodsEnd)
>         return 0;
>     foreach(compiled_method_t * m, vm->compiledMethods)
>         if(m->owns(ip)) //this instruction pointer belongs to this method
>             return m->findCatch(ip, ex); //find a catch block for the exception
>     return 0;
> }
>
> restore_to is implemented this way:
>
> static void restore_to(long *ip, long *frame, jobject_t *ex, catch_info *info)
> {
>     asm("movl %0, %%eax;"
>         "movl %1, %%ebx;"
>         "movl %2, %%ecx;"
>         "movl %3, %%edx;"
>         "movl %%ebx, %%ebp;"
>         "movl %%ebp, %%esp;"
>         "subl %%edx, %%esp;"
>         "pushl %%ecx;"
>         "pushl %%eax;"
>         "ret;"
>         :
>         :"m"(ip), "m"(frame), "m"(ex), "m"(info->stackDelta)
>         //stackDelta is local storage + temp storage
>         :"%eax", "%ebx", "%ecx", "%edx");
> }
>
> This stuff works only in a JIT-only environment, but only some minor
> tweaks would be required to work in a hybrid environment.
>
> ---
>
> Thanks for your clarification on the term 'IP address'. Back to your
> question:
>
> > It does, but by stack walking I meant not returning null, but having
> > the code analyze the call stack for a proper IP address to use.
>
> In this implementation, unprotected exceptions are handled in
> 'jvm/src/opcode.c' by references to thread_throw_exception()
> in 'jvm/src/thread.c'. Stack printing is available through the
> various utilities (esp. jvmutil_print_stack_common())
> in 'jvm/src/jvmutil.c'. Protected exceptions are handled by the
> exception list found in the 'jvm_pc' field 'excpatridx'. When an
> exception is found, this list is queried (by the ATHROW opcode,
> which will be available with 0.0.2) and, if found, JVM thread control
> is transferred to that handler.
> If it is _not_ found, thread_throw_exception()
> is called and the thread dies at the end of opcode_run(). This functionality
> looks very similar to your code shown above.
>
> ---
>
> > ...snip...
> >
> > Dan Lydick
> >
> >
>
>
> Dan Lydick
>
Dan, I'm confused by all this classification of exception types (in the
code you make a distinction between caught/uncaught and
Exception/Error/Throwable). My view is that these are not an issue for the
runtime, apart from the verifier. We could have the following code in
java.lang.Thread:

private void doRun() {
    try {
        if (runnable != null)
            runnable.run();
        else
            this.run();
    } catch (Throwable t) {
        // ThreadGroup.uncaughtException takes the thread and the throwable
        this.threadGroup.uncaughtException(this, t);
    }
    terminate();
}

The runtime would only assert that the exception has not fallen out of
doRun(), and would not care how it is handled.
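
On the runtime side, that assertion could look roughly like the sketch below. The names vm_thread, pending_exception and run_thread_base are hypothetical and are not part of the bootstrap JVM; the Java-level doRun() is modelled as a plain C function pointer, and the two do_run_* functions are stand-ins used only to exercise the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread state; not taken from the Harmony sources. */
typedef struct {
    const void *pending_exception; /* non-NULL while a throw is unhandled */
} vm_thread;

/* Stand-in for invoking the Java-level Thread.doRun() on a thread. */
typedef void (*java_entry)(vm_thread *t);

enum { THREAD_OK = 0, THREAD_VM_BUG = -1 };

/* Thread base function: run doRun() and check that nothing escaped it.
 * The runtime never inspects what kind of exception was thrown. */
static int run_thread_base(vm_thread *t, java_entry do_run)
{
    assert(t != NULL && do_run != NULL);
    t->pending_exception = NULL;
    do_run(t);
    return (t->pending_exception == NULL) ? THREAD_OK : THREAD_VM_BUG;
}

/* Stand-in: a doRun() whose catch(Throwable) handled everything. */
static void do_run_catches_all(vm_thread *t)
{
    (void)t;
}

/* Stand-in: a broken doRun() that lets an exception fall out. */
static void do_run_leaks(vm_thread *t)
{
    static const char ex[] = "Throwable";
    t->pending_exception = ex;
}
```

With this split, deciding between caught and uncaught stays in the class library, and the only condition the VM treats as its own bug is a pending exception at the thread base.
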