I spent some time thinking about how the various modes
of guest code should be run in the VM, the transitions
between them, etc., and jotted down some notes while
this stuff was fresh in my mind. My notes are attached.
Looking into a few changes first, then back to
testing and getting plex86 to run Linux without
cosimulation.
-Kevin
Guest Mode/Ring Level                  Run at Host Mode/Ring Level
=====================================================================
Protected Mode (PM)
-------------------
Ring 0 code .......................... PM, Ring 3
Ring 1 code .......................... PM, Ring 3
Ring 2 code .......................... PM, Ring 3
Ring 3 code .......................... PM, Ring 3
Notes:
- All guest privilege levels are run at ring3. Since guest
code is really always running at the same level, we must build
virtualized page tables either geared for ring3 guest code, or
ring0/1/2 guest code. This means that a transition to/from
ring3 and any other level in the guest necessitates a rebuild
of virtualized page tables.
- The RPL of guest selectors needs to be modified to
accomplish running all guest code at ring3. Thus we must
modify accesses to the selectors in ring0/1/2 guest code,
so SBE control must be on for these rings. The behavior of
ring3 guest code running at ring3 in the VM, of course, more
closely models what is expected, though there are a few
instructions/circumstances where behavior may differ. It
is conceivable to allow SBE to be turned off for running
guest ring3 code, controlled by a user configuration file option.
This would result in a significant performance win for user-mode
code running in the VM.
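
The two PM notes above can be sketched as a small model. This is only
an illustration, not plex86 code: MonitorState, page_table_flavor,
virtualized_selector, sbe_required and change_guest_cpl are hypothetical
names, and forcing the RPL to 3 is an assumption about what "modified"
means here.

```python
from dataclasses import dataclass

HOST_RING = 3  # per the table above, all guest PM code runs at host ring 3


@dataclass
class MonitorState:
    """Hypothetical monitor bookkeeping (not a plex86 structure)."""
    guest_cpl: int = 0
    page_tables_for: str = "supervisor"  # "supervisor" = guest ring 0/1/2
    rebuilds: int = 0                    # how many page-table rebuilds so far


def page_table_flavor(guest_cpl: int) -> str:
    """Pick which set of virtualized page tables a guest ring needs."""
    if not 0 <= guest_cpl <= 3:
        raise ValueError("guest CPL must be 0..3")
    return "user" if guest_cpl == 3 else "supervisor"


def virtualized_selector(guest_selector: int) -> int:
    """Keep index and TI bits, force RPL (bits 0-1) to HOST_RING.

    Assumption: 'modifying the RPL' means forcing it to 3.
    """
    return (guest_selector & 0xFFFC) | HOST_RING


def sbe_required(guest_cpl: int, allow_sbe_off_for_ring3: bool = False) -> bool:
    """SBE must stay on for rings 0/1/2; ring 3 may opt out via config."""
    return not (guest_cpl == 3 and allow_sbe_off_for_ring3)


def change_guest_cpl(state: MonitorState, new_cpl: int) -> None:
    """Rebuild page tables only when crossing the ring3 boundary."""
    flavor = page_table_flavor(new_cpl)
    if flavor != state.page_tables_for:
        state.page_tables_for = flavor
        state.rebuilds += 1
    state.guest_cpl = new_cpl
```

Note that a guest transition between rings 0, 1 and 2 costs no rebuild
in this model; only crossing to or from ring3 does.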
Real Mode (RM)
--------------
After PM->RM transition, before ...... PM, Ring 3, 16-bit segment
all segments have been reloaded.
Normal, all segments have been ....... Same as above, or V86M
reloaded (or initialized at
power-up).
Notes:
- With SBE on, RM guest code could conceivably be run in
the VM in either V86M, or as a Ring3 PM 16-bit segment.
V86M would be the natural choice as it offers the native
segmentation loading scheme of RM, so less virtualization
needs to be done on the guest code. However, look at
the possible mode transitions in the guest, involving RM:
(1) RM -> PM: After the transition, descriptor cache values
retain their values from RM, until the segments are reloaded
in the new PM environment. This is no problem, as PM code
is actually run as PM code, and virtualized segment selectors
can point to descriptors with RM compatible values.
(2) PM -> RM: After the transition, descriptor cache values
retain their values from PM, until the segments are reloaded
in the new RM environment. If we chose to run RM guest code
in V86M, this can be a problem. The transition from our
monitor handler code to the guest code running in V86M will
necessarily reload the segment registers, making it impossible
to keep the segment descriptor caches loaded with legacy values
compatible with the previous PM.
- Thus, for the period of time until all descriptors have been
reloaded, we have to either use Ring3 PM 16-bit segments and
virtualize selector accesses in the guest, or do some extra
emulation using V86M. For example, we could start by emulating
all data access instructions. Then, adaptively, command SBE to
virtualize fewer instructions as the data segment registers are
reloaded. A more brute-force method would be to keep plex86
in a pure emulation loop until conditions are met (all segment
registers are reloaded), then use V86M. It is also possible
to start out using 16-bit PM segments, then switch to using
V86M after such registers are reloaded.
- Depending on the accuracy of virtualization required, there
may be situations where SBE is not needed, once the segment
registers are sane. One known problematic instruction is
SMSW, which can be executed in any mode. It unfortunately
lets guest code look at the lower 16 bits of CR0, one of those
bits being the PE bit. With SBE off, code that tests this value
can determine whether it is being run in RM or V86M. Guest code
that tests virtualized CR0 values in this way (or perhaps
certain EFLAGS values) would be problematic. At any rate,
if such accuracy is not needed, it may be possible to allow
SBE to be turned off by a configuration option.
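
One of the options above (start with 16-bit PM segments, then switch
to V86M once everything is reloaded) might be tracked roughly as
follows. This is a sketch under assumptions: RealModeTracker and its
method names are made up for illustration, and the assumption is that
the monitor gets notified of every guest segment register load while
legacy values are outstanding.

```python
SEGMENT_REGISTERS = ("CS", "DS", "ES", "FS", "GS", "SS")


class RealModeTracker:
    """Hypothetical tracker for legacy PM values in descriptor caches."""

    def __init__(self):
        # At power-up all segments are initialized, so nothing is stale.
        self.stale = set()

    def on_pm_to_rm(self):
        """After PM -> RM, every descriptor cache still holds PM values."""
        self.stale = set(SEGMENT_REGISTERS)

    def on_segment_reload(self, reg):
        """The guest reloaded one segment register in the RM environment."""
        if reg not in SEGMENT_REGISTERS:
            raise ValueError("unknown segment register: %r" % (reg,))
        self.stale.discard(reg)

    def host_mode(self):
        """Choose how the guest RM code is run on the host."""
        if self.stale:
            return "PM, Ring 3, 16-bit segment"
        return "V86M"


tracker = RealModeTracker()
tracker.on_pm_to_rm()
for reg in ("CS", "DS", "ES", "SS"):
    tracker.on_segment_reload(reg)
# FS and GS still carry PM-era values, so V86M is not yet safe.
print(tracker.host_mode(), sorted(tracker.stale))
```

The same stale set could also drive the adaptive SBE approach, by
virtualizing only those instructions that use a still-stale register.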
V86 Mode (V86M)
---------------
...................................... Same as for RM "normal" case
Notes:
- The transition from PM to V86M in a guest is more "clean",
in that it always reloads all the segment registers. Thus we
don't have to support "legacy values" in the descriptor caches.
So we can just run guest V86M code in V86M. Using 16-bit PM
segments would also be possible, but that mode doesn't have
the benefit of the natural RM-like segment loading, thus more
virtualization overhead is needed.
- As with RM, there may be issues with turning off SBE,
for example if any of the lower 16 bits of CR0 have to be
virtualized (modified).
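
To make the SMSW/CR0 concern concrete, here is a small model of what
the guest would see with and without virtualization. The function
names and the sample CR0 values are assumptions chosen for
illustration; the only hardware facts relied on are that SMSW stores
the low 16 bits of CR0, that PE is bit 0 and TS is bit 3, and that
V86M requires PE to be set.

```python
CR0_PE = 0x0001  # CR0 bit 0: Protection Enable
CR0_TS = 0x0008  # CR0 bit 3: Task Switched


def smsw_native(host_cr0):
    """What SMSW returns if executed natively (SBE off): host CR0 low word."""
    return host_cr0 & 0xFFFF


def smsw_virtualized(guest_cr0):
    """What the guest should see: the low word of its own virtual CR0."""
    return guest_cr0 & 0xFFFF


def guest_can_tell(host_cr0, guest_cr0):
    """True if a native SMSW would expose a value the guest does not expect."""
    return smsw_native(host_cr0) != smsw_virtualized(guest_cr0)


# Guest believes it is in RM (PE clear); host runs it in V86M (PE set).
rm_guest_cr0 = 0x00000010
host_cr0 = 0x80000011
print(hex(smsw_native(host_cr0)), hex(smsw_virtualized(rm_guest_cr0)))

# Guest V86M code: PE matches, but another low bit (here TS) may not.
v86_guest_cr0 = 0x80000011
host_cr0_ts = host_cr0 | CR0_TS
print(guest_can_tell(host_cr0_ts, v86_guest_cr0))
```

If the low words always agree for a given guest, this model suggests
SMSW alone would not force SBE to stay on; other instructions may.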