Re: Some questions about the architecture

Apache Harmony Bootstrap JVM Thu, 20 Oct 2005 08:09:10 -0700

Robin, Rodrigo,

Perhaps the two of you could get your heads together
on GC issues?  I think both of you have been thinking
along related lines on the structure of GC for this JVM.
What do you think?



Further comments follow...

-----Original Message-----
From: Rodrigo Kumpera <[EMAIL PROTECTED]>
Sent: Oct 19, 2005 4:49 PM
To: harmony-dev@incubator.apache.org
Subject: Re: Some questions about the architecture

On 10/19/05, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]> wrote:
>
>
> -----Original Message-----
> From: Rodrigo Kumpera <[EMAIL PROTECTED]>
> Sent: Oct 19, 2005 1:49 PM
> To: harmony-dev@incubator.apache.org, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]>
> Subject: Re: Some questions about the architecture
>
> On 10/19/05, Apache Harmony Bootstrap JVM <[EMAIL PROTECTED]> wrote:
> >
...snip...
>
> Notice that in 'jvm/src/jvmcfg.h' there is a JVMCFG_GC_THREAD
> that is used in jvm_run() as a regular thread like any other.
> It calls gc_run() on a scheduled basis.  Also, any time an object
> finalize() is done, gc_run() is possible.  Yes, I treat GC as a
> stop-the-world process, but here is the key:  Due to the lack
> of asynchronous native POSIX threads, there are no safe points
> required.  The only thread is the SIGALRM target that sets the
> volatile boolean in timeslice_tick() for use by opcode_run() to
> test.  <b>This is the _only_ formally asynchronous data structure in
> the whole machine.</b>  (Bold if you use an HTML browser, otherwise
> clutter meant for emphasis.)  Objects that contain no references can
> be GC'd since they are merely table entries.  Depending on how the
> GC algorithm is done, gc_run() may or may not even need to look
> at a particular object.
>
> Notice also that classes are treated in the same way by the GC API.
> If a class is no longer referenced by any objects, it may be GC'd also.
> First, its intrinsic class object must be GC'd, then the class itself.  This
> may take more than one pass of gc_run() to make it happen.
>
> ---

How exactly is the Java thread stack scanned for references in this
scheme? Safepoints are required for two reasons: first, to allow native
threads to pause properly, and second, to make it easier for the garbage
collector to identify what on the stack is a reference and what is not.

The first one is a non-issue in this case, but the second one is an issue, as
precise "Java stack" scanning is required for any moving collector
(e.g. semi-space, train or mark-sweep-compact). The solution for the
second problem is either to have a tagged stack (we tag each slot in the
stack as a reference or not) or to generate gc_maps for all bytecodes
of a method (memory-wise, this is not practical, and with JIT'ed code it is
even worse).
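
To make the tagged-stack option concrete, here is a minimal sketch in C.
All names (stack_slot_t, tagged_stack_t, push_int, push_ref, scan_roots)
are hypothetical and are not taken from the Harmony bootstrap JVM sources;
object references are assumed to be plain integer handles.  Each slot
carries a tag, so a collector can find the roots precisely without gc_maps:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag: records whether a slot holds a reference. */
typedef enum { SLOT_PRIMITIVE = 0, SLOT_REFERENCE = 1 } slot_tag_t;

typedef struct {
    slot_tag_t tag;
    intptr_t   value;  /* primitive value, or an object handle if tagged */
} stack_slot_t;

#define STACK_MAX 64

typedef struct {
    stack_slot_t slots[STACK_MAX];
    size_t       sp;   /* number of slots currently in use */
} tagged_stack_t;

/* Push a primitive; the tag tells the collector to skip this slot. */
static void push_int(tagged_stack_t *s, int32_t v)
{
    assert(s->sp < STACK_MAX);
    s->slots[s->sp].tag   = SLOT_PRIMITIVE;
    s->slots[s->sp].value = v;
    s->sp++;
}

/* Push an object handle; the tag marks this slot as a GC root. */
static void push_ref(tagged_stack_t *s, intptr_t handle)
{
    assert(s->sp < STACK_MAX);
    s->slots[s->sp].tag   = SLOT_REFERENCE;
    s->slots[s->sp].value = handle;
    s->sp++;
}

/* Copy every reference on the stack into roots[]; returns how many
 * were found (at most max_roots). */
static size_t scan_roots(const tagged_stack_t *s,
                         intptr_t *roots, size_t max_roots)
{
    size_t n = 0;
    for (size_t i = 0; i < s->sp && n < max_roots; i++) {
        if (s->slots[i].tag == SLOT_REFERENCE)
            roots[n++] = s->slots[i].value;
    }
    return n;
}
```

The cost is one extra tag per slot on every push, which is the usual
trade-off against generating maps per bytecode.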

---

That depends on the GC implementation.  Look at 'jvm/src/gc_stub.c'
for the stub reference implementation.  To see the mechanics of
how to fit it into the compile environment, look at the GC and heap
setup in 'config.sh' and at 'jvm/src/heap.h' for how multiple heap
implementations get configured in.

The GC interface API that I defined may or may not be adequate
for everything.  I basically set it up so that any time an object reference
was added or deleted, I called a GC function.  The same goes for
class loading and unloading.  For local variables on the JVM stack for each
thread, the GC functions are slightly different than for fields in an object,
but the principle is the same.
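
As an illustration of the "call a GC function whenever a reference is
added or deleted" idea, here is a minimal sketch in C.  It assumes a
simple reference-counting table; the names (obj_table, obj_new,
gc_hook_ref_added, gc_hook_ref_removed, gc_pass) are hypothetical and are
not the GC_xxx() macros or the gc_stub.c interface:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 16

/* Hypothetical object table entry: objects are merely table slots. */
typedef struct {
    bool in_use;
    int  refcount;
} obj_entry_t;

static obj_entry_t obj_table[MAX_OBJECTS];

/* Allocate a table slot; returns its index, or -1 if the table is full. */
static int obj_new(void)
{
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (!obj_table[i].in_use) {
            obj_table[i].in_use   = true;
            obj_table[i].refcount = 0;
            return i;
        }
    }
    return -1;
}

/* Called whenever a field or local variable starts referring to obj. */
static void gc_hook_ref_added(int obj)
{
    assert(obj >= 0 && obj < MAX_OBJECTS && obj_table[obj].in_use);
    obj_table[obj].refcount++;
}

/* Called whenever a field or local variable stops referring to obj. */
static void gc_hook_ref_removed(int obj)
{
    assert(obj >= 0 && obj < MAX_OBJECTS && obj_table[obj].refcount > 0);
    obj_table[obj].refcount--;
}

/* One collection pass: free every unreferenced slot; returns the
 * number of slots freed. */
static int gc_pass(void)
{
    int freed = 0;
    for (int i = 0; i < MAX_OBJECTS; i++) {
        if (obj_table[i].in_use && obj_table[i].refcount == 0) {
            obj_table[i].in_use = false;
            freed++;
        }
    }
    return freed;
}
```

A real collector behind such hooks need not count references at all; the
hooks only guarantee that it is told about every change.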

---


>
> For example, as I understand, JikesRVM implements gc safepoints (the
> points in the bytecode where gc maps are generated) at loop backedges
> and method calls.
>
> > The priorities that I set were (1) get the logic working
> > without resorting to design changes such as multi-threading,
> > then (2) optimize the implementation and evaluate
> > improvements and architectural changes, then (3) implement
> > improvements and architectural changes.  The same goes
> > for the object model using the OBJECT() macro and the
> > 'robject' structure in 'jvm/src/object.h'.  And the CLASS()
> > macro, and the STACK() macro, and other components
> > that I have tried to implement in a modular fashion (see 'README'
> > for a discussion of this issue).  Let's get it working, then look into
> > design changes, even having more than one option available at
> > configuration time, compile time, or even run time, such as is
> > now the case with the HEAP_xxx() macros and the GC_xxx()
> > macros that Robin Garner has been asking about.
> >
> > As to the 'jvm/src/timeslice.c' code, notice that each
> > time that SIGALRM is received, the handler sets a
> > volatile boolean that is read by the JVM inner loop
> > in 'while ( ... || (rfalse == pjvm->timeslice_expired))'
> > in 'jvm/src/opcode.c' to check if it is time to give the
> > next thread some time.  I don't expect this to be the
> > most efficient check, but it _should_ work properly
> > since I have unit tested the time slicing code, both
> > the while() test and the setting of the boolean in
> > timeslice_tick().  One thing I have heard on this
> > list is that one of the implementations, I think it was
> > IBM's Jikes (?), chose an interpreter
> > over a JIT.  Now that is not directly related to time
> > slicing, but it does mean that a mechanism like what I
> > implemented does not have to have compile-time
> > support.
> >
> > *** How about you JVM experts out there?  Do you have
> >       any wisdom for me on the subject of time slicing
> >       on an outer/inner interpreter loop interpreter
> >       implementation?  And compared to JIT?  Archie Cobb,
> >       what do you think?  How about you lurkers out there? ***
>
> All open source JVMs I checked use native threads.  You can take a look
> at what IBM did with the Native POSIX Threading Library (NPTL), as it
> implements userland threads on Linux.
>
> ---
>
> I would be interested in your evaluation of the existing implementation
> against what could be done to implement such an approach.

First, it was not NPTL but NGPT, the project IBM created; my fault.
The IBM site seems to be offline.  From what I remember, it implemented
userland threads with coordination from the kernel to do context switches
and scheduling, basically using signals to perform the context switch.

Anyway, I think it would be a good decision to switch soon to a
native thread implementation, as it requires less code to get proper
scheduling and good I/O primitives.

---

There is a case to be made for this approach since it may perform
better under load than the one that I have implemented.
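
For comparison, the signal-driven time slicing described earlier in this
thread can be sketched as follows.  This is a minimal illustration under
stated assumptions, not the code in 'jvm/src/timeslice.c': the names
(slice_expired, on_tick, install_tick_handler, run_slice) are
hypothetical, and the "opcode" is just a counter increment.  A real setup
would also arm a periodic timer, for example with setitimer() or alarm():

```c
#include <assert.h>
#include <signal.h>

/* The only asynchronously written datum: set by the signal handler,
 * read by the interpreter loop. */
static volatile sig_atomic_t slice_expired = 0;

/* SIGALRM handler: do nothing except record that the slice is over. */
static void on_tick(int signo)
{
    (void)signo;
    slice_expired = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_tick_handler(void)
{
    return signal(SIGALRM, on_tick) == SIG_ERR ? -1 : 0;
}

/* Inner loop for one thread: execute up to max_ops dummy opcodes,
 * stopping early once the tick has arrived.  Returns the number of
 * opcodes executed. */
static long run_slice(long max_ops)
{
    long executed = 0;
    while (executed < max_ops && !slice_expired) {
        executed++;            /* stand-in for executing one opcode */
    }
    slice_expired = 0;         /* acknowledge the tick before switching */
    return executed;
}
```

Because the flag is polled between opcodes, a thread is only ever
switched at an opcode boundary, which is why no separate safe points are
needed in this scheme.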

---

> ---
>
> > As to your question about setjmp/longjmp, I agree that
> > there are other ways to do it.  In fact, I originally used
> > stack walking in one sense to return from fatal errors
> > instead for my original implementation of the heap
> > allocator, which used malloc/free.  If I got an error
> > from malloc(), I simply returned a NULL pointer, which
> > I tested from the calling function.  If I got this error,
> > I returned to its caller with an error, and so on, all the
> > way up.  However, what happens when you have a
> > normally (void) return?  Use TRUE/FALSE instead?
> > Could be.  But the more I developed the code, the
> > harder this became to support.  Therefore, since fatal
> > errors kill the application anyway, I decided to _VASTLY_
> > simplify the code by using what is effectively the OO concept
> > of an exception as available in the 'C' runtime library
> > with setjmp/longjmp.  Notice that many complicated models
> > can end up with irresolvable terminal conditions and that
> > the simplest way to escape is back to a known good state.
> > This is the purpose of setjmp/longjmp.  Try this on for size
> > with any communication protocol implementation, such as
> > TCP/IP some time.  When you get to a snarled condition where
> > there just is not any graceful way out, the non-local character
> > of setjmp/longjmp cuts that knot instead of untying it with
> > horrible error code checking back up the stack.  This is why
> > I finally decided to go this way.  (Does this answer your main
> > question here?)
>
> It does, but by stack walking I meant not returning null, but having
> the code analyze the call stack for a proper IP address to use.
>
> ---
> What do you mean by 'IP address' in this context?  I think I am
> missing something.
> ---


By IP I mean Instruction Pointer, e.g. the EIP register on x86. What I
meant was something like this:

void throw_exception(jobject_t *ex) {
        //relies on the x86 stack frame layout, this is not portable C
        long * ip = (long *)(*(&ex - 1)); //the return address is after the arguments
        long * sp = (long *)(*(&ex - 2)); //the old frame pointer is after the return address
        jclass_t * cl = ex->vtable->class_obj;

        printf("obj %p ip %p sp %p\n", (void *)ex, (void *)ip, (void *)sp);

        printf("------\n");
        //this code performs stack unwinding, it misses synchronized methods.
        while(isNotThreadBaseFunction(ip)) {
                printf("trace element ip %p sp %p\n", (void *)ip, (void *)sp);
                catch_info_t * info = find_catch_info(ip, cl);
                if(info) restore_to(ip, sp, ex, info);
                ip = (long *)*(sp + 1);
                sp = (long *)*sp;
        }
        printf("-----\n");
        fflush(stdout);
        //uncaught exception, must never happen, this is a JVM bug.
        //in my vm, at least, uncaught exceptions were handled by the
        //implementation of Thread.
}

find_catch_info was implemented in java, but looks something like this
(don't bother with the linear search for now):

catch_info_t * find_catch_info(long *ip, jclass_t *ex) {
  if(ip < vm->compiledMethodsStart || ip > vm->compiledMethodsEnd)
      return 0;
  foreach(compiled_method_t * m, vm->compiledMethods)
      if(m->owns(ip)) //this instruction pointer belongs to this method
         return m->findCatch(ip, ex); //find a catch block for the exception
  return 0;
}

restore_to is implemented this way:

static void restore_to(long *ip, long *frame, jobject_t *ex, catch_info_t *info)  {
   asm("movl %0, %%eax;"
                "movl %1, %%ebx;"
                "movl %2, %%ecx;"
                "movl %3, %%edx;"
                "movl %%ebx, %%ebp;"
                "movl %%ebp, %%esp;"
                "subl %%edx, %%esp;"
                "pushl %%ecx;"
                "pushl %%eax;"
                "ret;"
                        :
                        :"m"(ip), "m"(frame), "m"(ex), "m"(info->stackDelta)
//stackDelta is local storage + temp storage
                        :"%eax", "%ebx", "%ecx", "%edx");
}

This stuff works only in a JIT-only environment, but only some minor
tweaks would be required for it to work in a hybrid environment.

---

Thanks for your clarification on the term 'IP address'.  Back to your
question:

    > It does, but by stack walking I meant not returning null, but having
    > the code analyze the call stack for a proper IP address to use.

In this implementation, unprotected exceptions are handled in
'jvm/src/opcode.c' by references to thread_throw_exception()
in 'jvm/src/thread.c'.  Stack printing is available through the
various utilities (esp. jvmutil_print_stack_common())
in 'jvm/src/jvmutil.c'.  Protected exceptions are handled by the
exception list found in the 'jvm_pc' field 'excpatridx'.  When an
exception is found, this list is queried (by the ATHROW opcode,
which will be available with 0.0.2) and, if found, JVM thread control
is transferred to that handler.  If it is _not_ found, thread_throw_exception()
is called and the thread dies at the end of opcode_run().  This functionality
looks very similar to your code shown above.
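
The handler lookup described above can be sketched in C as follows.  This
is a minimal illustration, not the code in 'jvm/src/opcode.c': the names
(exc_entry_t, find_handler, NO_HANDLER) are hypothetical, the entry layout
follows the exception table of the class file format (start_pc, end_pc,
handler_pc, catch_type), and class matching is simplified to an exact
comparison of class identifiers where a real JVM must also accept
subclasses of the catch type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one exception table entry. */
typedef struct {
    uint16_t start_pc;    /* first protected PC, inclusive */
    uint16_t end_pc;      /* end of protected range, exclusive */
    uint16_t handler_pc;  /* where control goes if this entry matches */
    uint16_t catch_type;  /* 0 matches any exception, else a class id */
} exc_entry_t;

#define NO_HANDLER (-1)

/* Search the table in order and return the handler PC of the first
 * entry that covers pc and accepts thrown_class, or NO_HANDLER.
 * ASSUMPTION: exact class id match only, no subclass check. */
static int find_handler(const exc_entry_t *table, size_t count,
                        uint16_t pc, uint16_t thrown_class)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        const exc_entry_t *e = &table[i];
        if (pc >= e->start_pc && pc < e->end_pc &&
            (e->catch_type == 0 || e->catch_type == thrown_class)) {
            return e->handler_pc;
        }
    }
    return NO_HANDLER;
}
```

When the result is NO_HANDLER, the caller would fall back to the
uncaught path, which in the description above is the call to
thread_throw_exception().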

---
>
...snip...
>
> Dan Lydick
>




Dan Lydick
