Some more thoughts for people who are into dynamic translation
and performance...

Picture translating code as we first see it, following
static conditional/unconditional branches where possible.  Each
code sequence would be translated into a code cache buffer.
For an application like plex86, a lot of translations would
actually just pass the instruction through 1:1.

One side effect of this is that for code sequences like:

    i0
  label0:
    i1
    i2
    i3
    i4
  label1:
    i5
    i6
    i7
    i8

you might see the sequence at label1 first.  Later, when
translating the sequence at label0, it would be necessary
to generate a jump from sequence0 to sequence1 in the
translation buffer.  This is merely because of the order in which
the sequences were encountered.  If the code were encountered
in the order label0, label1, then no jump would be necessary.
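
To make the ordering effect concrete, here is a rough sketch in
Python.  It is not plex86 code: the FALLTHROUGH table and the
jumps_needed() helper are made-up names, and the code cache is
modeled as nothing more than a list of sequence names in the order
they were translated.

```python
# Hypothetical model of the guest code above: label0 falls through
# into label1, and label1 has no fall-through successor we care about.
FALLTHROUGH = {"label0": "label1", "label1": None}


def jumps_needed(cache):
    """Return (source, target) pairs that need an explicit jump.

    `cache` lists sequence names in the order they were placed in the
    code cache.  A sequence needs a jump when its fall-through
    successor is not the very next sequence in the cache.
    """
    jumps = []
    for i, name in enumerate(cache):
        succ = FALLTHROUGH.get(name)
        if succ is None:
            continue
        next_in_cache = cache[i + 1] if i + 1 < len(cache) else None
        if next_in_cache != succ:
            jumps.append((name, succ))
    return jumps


# label1 was seen first, so label0 ends up after it in the cache.
seen_label1_first = jumps_needed(["label1", "label0"])
# Same code, encountered in program order.
seen_in_order = jumps_needed(["label0", "label1"])
```

With label1 translated first, the model reports one jump from
label0 to label1; in program order it reports none.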

As an optimization, what might be cool is to have a separate
thread running in host user space, examining regions of translated
code.  Using the usage metadata maintained by the monitor, this
thread could reorder a region of translated code, flattening out
jumps like the one above, and post the results to the monitor.
The monitor would be free to accept the changes, or ignore them.
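
A minimal sketch of that hand-off, again in Python and again with
invented names (Monitor, flatten, optimizer are all hypothetical).
It assumes one particular acceptance rule that the idea above does
not specify: the monitor keeps a version counter and ignores any
proposal computed from a layout that has since changed.

```python
import queue
import threading

# Hypothetical guest code: label0 falls through into label1.
FALLTHROUGH = {"label0": "label1", "label1": None}


def flatten(layout):
    """Reorder so each sequence is followed by its fall-through successor."""
    members = set(layout)
    targets = {s for s in FALLTHROUGH.values() if s is not None}
    heads = [n for n in layout if n not in targets]
    placed, result = set(), []
    # Walk chains from their heads first, then sweep up anything left over.
    for start in heads + list(layout):
        node = start
        while node in members and node not in placed:
            placed.add(node)
            result.append(node)
            node = FALLTHROUGH.get(node)
    return result


class Monitor:
    """Owns the code cache layout; free to accept or ignore proposals."""

    def __init__(self, layout):
        self.layout = list(layout)
        self.version = 0
        self.lock = threading.Lock()

    def snapshot(self):
        with self.lock:
            return self.version, list(self.layout)

    def translate(self, name):
        # New translation appended; any outstanding snapshot is now stale.
        with self.lock:
            self.layout.append(name)
            self.version += 1

    def consider(self, proposal):
        version, new_layout = proposal
        with self.lock:
            if version != self.version:
                return False  # assumed policy: ignore stale proposals
            self.layout = new_layout
            return True


def optimizer(monitor, proposals):
    """Runs in a separate thread: snapshot, flatten, post the result."""
    version, layout = monitor.snapshot()
    proposals.put((version, flatten(layout)))


monitor = Monitor(["label1", "label0"])
proposals = queue.Queue()
worker = threading.Thread(target=optimizer, args=(monitor, proposals))
worker.start()
worker.join()
accepted = monitor.consider(proposals.get())
```

The point of the version check is only that the optimizer never
touches the live cache; it works on a copy, and the monitor decides
whether the result is still worth taking.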

This would allow an idle processor (in an SMP machine) to dynamically
optimize translated guest code.

Later when we support allowing some guest pages to be host swappable,
we might offload such duties to a separate host user thread in a similar
fashion.

-Kevin

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Kevin Lawton                        [EMAIL PROTECTED]
MandrakeSoft, Inc.                  Plex86 developer
http://www.linux-mandrake.com/      http://www.plex86.org/
