Ulrich Weigand wrote:

>...
> - Add an instruction analyzer/decoder to be used e.g. inside the GPF
>   handler to find out which instruction caused the fault.  (I have an
>   decoder already written for another project, this might be useful ...)
> 
> - Start virtualizing certain system instructions.  We'll probably start
>   with those that are used by the nullkernel; most important seem to be
>   LDT/GDT/IDT accesses.  This will require to set up a framework to
>   move the real monitor tables to the linear address expected by the
>   guest, protected by page access privileges, and maintain the 'shadow'
>   tables at some other location ...

I'll work on the pre-scanning technique for virtualizing
arbitrary instructions.  Will need to implement parts of
the other items here, but probably will do so minimally
at first to get started.

Here's the strategy I came up with for a first go at
instruction pre-scanning.  It requires an additional page
of memory for each code page which has some amount of
pre-scanning activity.  Will refine memory usage later.

Essentially, what we are trying to accomplish here is
to never let execution pass through unscanned code.


- For each new code page we encounter, allocate a page
  which represents attributes of the instructions within
  the page.  Zero out page to begin with.

- Each byte in this corresponding attribute page denotes the
  attributes of the instruction which starts at that offset
  in the code page.  Here's a possible layout of the bitfields
  in each byte:

  7 6 5 4 3 2 1 0
  | | | | | | | |
  | | | | +-+-+-+---- instruction length 1..15
  | | | |
  | | +-+------------ available for future use
  | |
  | +---------------- 0=execute native, 1=virtualize
  |
  +------------------ 0=not yet scanned, 1=scanned

  When bit 7 is 0, all the other bits are meaningless, since we
  have not yet scanned the instruction.
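
A minimal sketch of how that byte could be packed and unpacked in C.
The bit positions come from the layout above; the macro and function
names are hypothetical, picked only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit assignments follow the layout above; all names are hypothetical. */
#define ATTR_SCANNED     0x80u  /* bit 7: 1 = instruction has been scanned */
#define ATTR_VIRTUALIZE  0x40u  /* bit 6: 1 = virtualize, 0 = execute native */
#define ATTR_LEN_MASK    0x0Fu  /* bits 3..0: instruction length 1..15 */

/* Build the attribute byte for an instruction we have just scanned. */
static uint8_t attr_make(unsigned len, int virtualize)
{
    assert(len >= 1 && len <= 15);   /* x86 instructions are 1..15 bytes */
    return (uint8_t)(ATTR_SCANNED | (virtualize ? ATTR_VIRTUALIZE : 0u) | len);
}

static int attr_is_scanned(uint8_t a)
{
    return (a & ATTR_SCANNED) != 0;
}

/* The remaining bits only mean something once bit 7 is set. */
static int attr_virtualize(uint8_t a)
{
    return attr_is_scanned(a) && (a & ATTR_VIRTUALIZE) != 0;
}

static unsigned attr_length(uint8_t a)
{
    return attr_is_scanned(a) ? (a & ATTR_LEN_MASK) : 0u;
}
```

Since a zeroed attribute page has bit 7 clear everywhere, every offset
starts out as "not yet scanned" with no extra initialization.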

- At first, when we encounter new local-page branch instructions
  (static offsets) the target address in the page may very well
  not have been scanned yet.  We could mark this instruction
  as one to virtualize for now.  The virtualization logic
  could simulate the branch until the target address has been
  scanned, at which point we could mark the instruction to execute
  natively from then on.  Lazy processing at its best.

  The next step beyond this strategy would be to use a recursive
  descent technique and branch out pre-scanning the one (unconditional
  branch) or two (conditional branch) possible target addresses.
  Upon returning from the recursion, provided both are in the local
  code page, we would likely be able to let the branch instruction
  execute natively.  Code downstream could generate breakpoints
  where necessary.  Terminals in the recursion could be:

  - instructions which are already scanned
  - out of page branches
  - instructions which require virtualization, though we
    could scan right through these
  - instructions which cross page boundaries

  We may also want to establish a maximum recursion depth.
  We win if the code we pre-scan is well used before we have
  to dump the attribute page.  We lose when it is not, for
  instance when the code page is modified frequently by
  self-modifying code or is a shared code/data page.  We may
  find a comfortable max depth of N which wins much more often
  than it loses on average.
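
Here is a rough sketch of the recursive descent idea in C.  The toy
decode() below is an assumption made so the example runs: it only
knows NOP, JMP rel8 and Jcc rel8, and treats every other opcode as a
one-byte instruction to virtualize, which a real decoder would not do.
MAX_DEPTH, the struct and the function names are hypothetical too.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE        4096
#define ATTR_SCANNED     0x80u
#define ATTR_VIRTUALIZE  0x40u
#define MAX_DEPTH        8       /* hypothetical value for the depth limit N */

enum kind { K_PLAIN, K_JMP, K_JCC, K_VIRT };
struct insn { enum kind kind; unsigned len; int32_t rel; };

/* Toy decoder (an assumption): NOP, JMP rel8, Jcc rel8; all else virtualized. */
static struct insn decode(const uint8_t *code, unsigned off)
{
    struct insn i = { K_VIRT, 1, 0 };
    uint8_t op = code[off];
    if (op == 0x90) {
        i.kind = K_PLAIN;                              /* NOP */
    } else if (op == 0xEB || (op & 0xF0) == 0x70) {    /* JMP rel8 / Jcc rel8 */
        i.kind = (op == 0xEB) ? K_JMP : K_JCC;
        i.len = 2;
        if (off + 1 < PAGE_SIZE)
            i.rel = (int8_t)code[off + 1];
    }
    return i;
}

/* Scan forward from 'off'; returns 1 if the instruction there is scanned. */
static int prescan(const uint8_t *code, uint8_t *attr, unsigned off, int depth)
{
    unsigned first = off;
    while (off < PAGE_SIZE && !(attr[off] & ATTR_SCANNED)) {
        struct insn i = decode(code, off);
        if (off + i.len > PAGE_SIZE)
            break;                          /* terminal: crosses the page boundary */
        /* Mark branches as virtualized first, so loops end the recursion. */
        attr[off] = (uint8_t)(ATTR_SCANNED | i.len |
                              (i.kind != K_PLAIN ? ATTR_VIRTUALIZE : 0u));
        unsigned next = off + i.len;
        if (i.kind == K_PLAIN) {
            off = next;
            continue;
        }
        if (i.kind == K_JMP || i.kind == K_JCC) {
            int32_t target = (int32_t)next + i.rel;
            int ok = target >= 0 && target < PAGE_SIZE && depth < MAX_DEPTH &&
                     prescan(code, attr, (unsigned)target, depth + 1);
            if (ok && i.kind == K_JCC)      /* fall-through path must be scanned too */
                ok = prescan(code, attr, next, depth + 1);
            if (ok)                         /* every target scanned: run natively */
                attr[off] = (uint8_t)(attr[off] & ~ATTR_VIRTUALIZE);
        }
        break;                              /* terminal: branch or virtualized insn */
    }
    return first < PAGE_SIZE && (attr[first] & ATTR_SCANNED) != 0;
}
```

A branch whose target is out of page, or past the depth limit, just
stays marked as virtualized, which is the lazy fallback from above.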

- When we detect a write to a code page, using methods we talked
  about previously, we could examine the addresses affected, and look
  at the corresponding attributes.  If they pertain to one or more
  instructions which we have pre-scanned, then we need to dump all
  the mappings for the entire attribute page.  This is because the
  technique I have here doesn't record which instructions branch
  into these addresses, so there is no way to know which other
  instructions to invalidate.
  This is the tradeoff for a simple algorithm.

  If such writes go to areas in the page which are not yet marked
  as pre-scanned, then we can step through the instructions.  We
  can consider the writes as ones to data in a shared code/data page,
  and as such, we don't need to dump the page attributes we have
  accumulated thus far.
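
The write check could look something like the sketch below.  It
assumes the attribute byte layout given earlier and a write that
stays within one page; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE      4096u
#define ATTR_SCANNED   0x80u
#define ATTR_LEN_MASK  0x0Fu
#define MAX_INSN_LEN   15u

/* Returns 1 if the write [start, start+len) touches a pre-scanned instruction. */
static int write_hits_scanned(const uint8_t *attr, unsigned start, unsigned len)
{
    assert(len > 0 && start < PAGE_SIZE && len <= PAGE_SIZE - start);
    unsigned end = start + len;
    /* An instruction starting up to 14 bytes before the write can reach into it. */
    unsigned from = start >= MAX_INSN_LEN - 1 ? start - (MAX_INSN_LEN - 1) : 0;
    for (unsigned off = from; off < end; off++) {
        uint8_t a = attr[off];
        if ((a & ATTR_SCANNED) && off + (a & ATTR_LEN_MASK) > start)
            return 1;
    }
    return 0;
}

/* Returns 1 if the attribute page had to be dumped. */
static int handle_code_page_write(uint8_t *attr, unsigned start, unsigned len)
{
    if (!write_hits_scanned(attr, start, len))
        return 0;                   /* treat as a data write; keep the attributes */
    memset(attr, 0, PAGE_SIZE);     /* no back-references recorded: drop them all */
    return 1;
}
```

For example, with a 5-byte instruction recorded at offset 10, a write
to offset 15 leaves the attributes alone, while a write to offset 14
dumps the whole page.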

- For now, out of page branches and computed branches will be
  marked as needing to be virtualized, and emulated by the monitor.
  I have some further thoughts on this, but it's worth keeping
  things simple for now, and working on code.

- This technique handles overlapping instructions well.  Each
  byte in the attribute page holds info about only the instruction
  which *starts* there.  So there can be attributes for instructions
  which start in the middle of a previous instruction with no conflict.
  This technique also makes things simple for the code which handles
  writes to a code page.  Given the affected addresses, you can
  easily scan forward and backward to see if a pre-scanned instruction
  was hit, and then invalidate the attribute page.  Or in other words,
  it can handle self-modifying code well.

Let me know what you think of this strategy.  It would be good
to get the kinks out before getting too far into coding this.

As far as I can think of, this is the last major piece we need for now.

-Kevin
