Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-06-22 Thread Wei-Ren Chen
Hi Xin Tong,

  O.K., after studying KVM a little bit, here are my 2 cents. :)

On Fri, Jan 20, 2012 at 12:12:00AM -0500, Xin Tong wrote:
 I am wondering about the possibility of using the nested page table
 mechanism available on x86 processors to do page translation for
 non-x86 operating system emulation.

  Orit mentioned there was a project called QuickTransit which runs
Solaris/SPARC on Linux/x86. It seems to use shadow page tables rather
than NPT/EPT to help with guest memory translation [1]. Maybe you have
targets in mind which have page table structures similar to x86, so
that NPT/EPT can be used?

 So, when nested paging is enabled, you can control the gCR3 and hCR3.
 The gCR3 can be used to point to the page table of the running process
 in the guest operating system, and the hCR3 can be used to point to the
 page table of the QEMU process. Assuming the page table layouts of
 both operating systems are exactly the same, I think this can be
 done. However, there are a few problems I see here. I would like to
 hear some suggestions or corrections.
 
 1. The control of gCR3 and hCR3 needs kernel access. They can be set
 with a device module, as is done in kvm, but trapping into the
 kernel every time gCR3 is reset might be too expensive.

  How can you set gCR3/hCR3 through the device module (sorry, I am not
a KVM expert)? Do you think moving QEMU into kernel mode could mitigate
the overhead?
 
 2. After setting the gCR3 and hCR3, whatever memory references fall
 within the guest memory will be done correctly. However, memory
 references done by the host will be broken. Therefore, when we load
 from the CPUstates, or call helpers on exits from the code cache,
 we need to change the paging mechanism back to non-nested. Can this be
 done? How expensive will this be?

  What I have in mind is that maybe we can launch a KVM_RUN in the TCG
prologue, so that the translated host binary runs with two-level memory
translation. If we run into a helper function, we do a VMExit and fall
back to the original (one-level) memory translation. Is that what you
mean?

  How about making the address mapping of helper functions an identity
mapping, i.e., making the spte actually do nothing?
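
  To make the identity-mapping idea concrete, here is a minimal sketch
over a toy one-level table. The names toy_identity_map and
toy_translate, the 16-page table, and the 4 KiB page size are all
assumptions for illustration; this is not KVM's spte code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NPAGES     16   /* toy address space: 16 pages (assumption) */
#define TOY_PAGE_SHIFT 12   /* 4 KiB pages (assumption) */

/* Map pages [first, first + count) onto themselves, so that translating
 * an address in that range returns the same address. */
static void toy_identity_map(uint32_t table[TOY_NPAGES],
                             uint32_t first, uint32_t count)
{
    assert(first + count <= TOY_NPAGES);
    for (uint32_t page = first; page < first + count; page++)
        table[page] = page;
}

/* Translate addr through the toy table: page number -> frame number. */
static uint32_t toy_translate(const uint32_t table[TOY_NPAGES], uint32_t addr)
{
    uint32_t page = addr >> TOY_PAGE_SHIFT;
    assert(page < TOY_NPAGES);
    return (table[page] << TOY_PAGE_SHIFT)
         | (addr & ((1u << TOY_PAGE_SHIFT) - 1));
}
```

  With the helper region mapped this way, an address inside it comes
out of the translation unchanged, which is the "do nothing" behaviour
I mean.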
 
 3. Lastly and most importantly, the code cache is based on host
 addresses. What about fetching instructions from the code cache? Does
 this have to happen in non-nested mode?

  When a guest VM (with KVM enabled) fetches instructions from its
memory, does it also have the same issue?

Regards,
chenwj

[1] http://www.mail-archive.com/qemu-devel@nongnu.org/msg117254.html

-- 
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj



Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-06-22 Thread Xin Tong
On Fri, Jun 22, 2012 at 3:28 AM, 陳韋任 (Wei-Ren Chen)
che...@iis.sinica.edu.tw wrote:
 Hi Xin Tong,

  O.K., after studying KVM a little bit, here are my 2 cents. :)

 On Fri, Jan 20, 2012 at 12:12:00AM -0500, Xin Tong wrote:
 I am wondering about the possibility of using the nested page table
 mechanism available on x86 processors to do page translation for
 non-x86 operating system emulation.

  Orit mentioned there was a project called QuickTransit which runs
 Solaris/SPARC on Linux/x86. It seems to use shadow page tables rather
 than NPT/EPT to help with guest memory translation [1]. Maybe you have
 targets in mind which have page table structures similar to x86, so
 that NPT/EPT can be used?

I did have a target in mind when I sent out the question. It was s390
from IBM (a.k.a. System z or z/Architecture). In one of the emulators I
had the chance to work on, we spent a fair amount of time in the
memory emulation code, so making use of hardware to accelerate this
translation could improve the performance of the emulator.
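
For context, the memory emulation cost described above typically comes
from a software TLB lookup on every guest load and store. What follows
is only a sketch of such a fast path, loosely in the spirit of a
softmmu; the names (soft_tlb, soft_tlb_fill, soft_tlb_lookup) and the
sizes are hypothetical and not taken from any particular emulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12                         /* 4 KiB pages (assumption) */
#define PAGE_MASK ((uint64_t)((1u << PAGE_BITS) - 1))
#define TLB_BITS  8
#define TLB_SIZE  (1u << TLB_BITS)           /* 256 direct-mapped entries */

typedef struct {
    uint64_t tag;        /* page-aligned guest address, UINT64_MAX if empty */
    uint8_t *host_page;  /* host pointer to the start of the backing page */
} soft_tlb_entry;

typedef struct {
    soft_tlb_entry e[TLB_SIZE];
} soft_tlb;

/* Invalidate every entry; a page-aligned tag can never be UINT64_MAX. */
static void soft_tlb_flush(soft_tlb *tlb)
{
    for (unsigned i = 0; i < TLB_SIZE; i++) {
        tlb->e[i].tag = UINT64_MAX;
        tlb->e[i].host_page = NULL;
    }
}

/* Install a mapping after the slow path has walked the guest page table. */
static void soft_tlb_fill(soft_tlb *tlb, uint64_t gaddr, uint8_t *host_page)
{
    assert(host_page != NULL);
    soft_tlb_entry *ent = &tlb->e[(gaddr >> PAGE_BITS) & (TLB_SIZE - 1)];
    ent->tag = gaddr & ~PAGE_MASK;
    ent->host_page = host_page;
}

/* Fast path: return the host pointer for gaddr, or NULL on a miss. */
static uint8_t *soft_tlb_lookup(const soft_tlb *tlb, uint64_t gaddr)
{
    const soft_tlb_entry *ent = &tlb->e[(gaddr >> PAGE_BITS) & (TLB_SIZE - 1)];
    if (ent->tag != (gaddr & ~PAGE_MASK))
        return NULL;
    return ent->host_page + (gaddr & PAGE_MASK);
}
```

Even on a hit, this is a shift, a mask, a load, a compare and an add
per guest access, which is the work hardware translation would remove.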

 So, when nested paging is enabled, you can control the gCR3 and hCR3.
 The gCR3 can be used to point to the page table of the running process
 in the guest operating system, and the hCR3 can be used to point to the
 page table of the QEMU process. Assuming the page table layouts of
 both operating systems are exactly the same, I think this can be
 done. However, there are a few problems I see here. I would like to
 hear some suggestions or corrections.

 1. The control of gCR3 and hCR3 needs kernel access. They can be set
 with a device module, as is done in kvm, but trapping into the
 kernel every time gCR3 is reset might be too expensive.

  How can you set gCR3/hCR3 through the device module (sorry, I am not
 a KVM expert)? Do you think moving QEMU into kernel mode could mitigate
 the overhead?

I do not remember the details, but once you have a driver running in
ring 0, there are many things you can do which you cannot do in ring 3,
e.g. changing the state of the CRx registers. If you really want the
details, I think you should look at the transitions between the
virtual machine and the host machine in the kvm code.


 2. After setting the gCR3 and hCR3, whatever memory references fall
 within the guest memory will be done correctly. However, memory
 references done by the host will be broken. Therefore, when we load
 from the CPUstates, or call helpers on exits from the code cache,
 we need to change the paging mechanism back to non-nested. Can this be
 done? How expensive will this be?

  What I have in mind is that maybe we can launch a KVM_RUN in the TCG
 prologue, so that the translated host binary runs with two-level memory
 translation. If we run into a helper function, we do a VMExit and fall
 back to the original (one-level) memory translation. Is that what you
 mean?

  How about making the address mapping of helper functions an identity
 mapping, i.e., making the spte actually do nothing?

 3. Lastly and most importantly, the code cache is based on host
 addresses. What about fetching instructions from the code cache? Does
 this have to happen in non-nested mode?

  When a guest VM (with KVM enabled) fetches instructions from its
 memory, does it also have the same issue?

No, I do not think so. In the case of an x86 guest running with KVM,
the guest code is in guest memory and addressable using the guest page
table. But in my case, the emulation code is in host memory and may
not be addressable by the guest. Remember, the emulation code is
generated by the host here :).

 Regards,
 chenwj

 [1] http://www.mail-archive.com/qemu-devel@nongnu.org/msg117254.html




Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-05-29 Thread Wei-Ren Chen
Hi Xin Tong,

On Fri, Jan 20, 2012 at 08:54:12AM -0500, Xin Tong wrote:
 On Fri, Jan 20, 2012 at 3:23 AM, 陳韋任 che...@iis.sinica.edu.tw wrote:
  1. The control of gCR3 and hCR3 needs kernel access. They can be set
  with a device module, as is done in kvm, but trapping into the
  kernel every time gCR3 is reset might be too expensive.
 
   Why does the control of gCR3 need kernel access? Isn't gCR3 just a field of the
  CPUX86State? QEMU should have control of it. Or do you mean the trapping
  thing?
 
 I do not think gCR3 is a field in the CPUX86State. I think in order to
 change the guest CR3, we need to trap into the kernel as kvm does.

  I read up on Intel's nested paging (EPT) again. Is gCR3 a field of the VMCS,
which is then loaded into CR3 at runtime? Thanks!

Regards,
chenwj
 



Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-01-31 Thread 陳韋任
On Fri, Jan 20, 2012 at 08:54:12AM -0500, Xin Tong wrote:
 On Fri, Jan 20, 2012 at 3:23 AM, 陳韋任 che...@iis.sinica.edu.tw wrote:
  1. The control of gCR3 and hCR3 needs kernel access. They can be set
  with a device module, as is done in kvm, but trapping into the
  kernel every time gCR3 is reset might be too expensive.
 
   Why does the control of gCR3 need kernel access? Isn't gCR3 just a field of the
  CPUX86State? QEMU should have control of it. Or do you mean the trapping
  thing?
 
 I do not think gCR3 is a field in the CPUX86State. I think in order to
 change the guest CR3, we need to trap into the kernel as kvm does.

  If your scenario is pure QEMU (without kvm), I think gCR3 is a field in the
CPUX86State. See below,

typedef struct CPUX86State {

...

    target_ulong cr[5]; /* NOTE: cr1 is unused */

...
} CPUX86State;
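
  To illustrate what I mean: in the pure-TCG case, a guest write to CR3
can be handled as an ordinary user-space store into that array, plus a
flush of the emulator's software TLB. The sketch below uses a cut-down
stand-in structure; ToyCPUX86State, toy_update_cr3 and the tlb_flushes
counter are hypothetical names, and a 64-bit target_ulong is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t target_ulong;    /* assumption: 64-bit target */

/* Cut-down stand-in for the real CPU state structure. */
typedef struct ToyCPUX86State {
    target_ulong cr[5];           /* cr1 is unused, as in the excerpt */
    unsigned tlb_flushes;         /* hypothetical: counts software TLB flushes */
} ToyCPUX86State;

/* Emulate "mov %reg, %cr3": store the new value and invalidate cached
 * translations. No trap into the host kernel is involved. */
static void toy_update_cr3(ToyCPUX86State *env, target_ulong new_cr3)
{
    assert(env != NULL);
    env->cr[3] = new_cr3;
    env->tlb_flushes++;           /* stand-in for a real software TLB flush */
}
```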

Or do I misunderstand what you're trying to do?

Regards,
chenwj




Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-01-20 Thread Xin Tong
On Fri, Jan 20, 2012 at 3:23 AM, 陳韋任 che...@iis.sinica.edu.tw wrote:
 1. The control of gCR3 and hCR3 needs kernel access. They can be set
 with a device module, as is done in kvm, but trapping into the
 kernel every time gCR3 is reset might be too expensive.

  Why does the control of gCR3 need kernel access? Isn't gCR3 just a field of the
 CPUX86State? QEMU should have control of it. Or do you mean the trapping
 thing?

I do not think gCR3 is a field in the CPUX86State. I think in order to
change the guest CR3, we need to trap into the kernel as kvm does.

 2. After setting the gCR3 and hCR3, whatever memory references fall
 within the guest memory will be done correctly. However, memory
 references done by the host will be broken. Therefore, when we load
 from the CPUstates, or call helpers on exits from the code cache,
 we need to change the paging mechanism back to non-nested. Can this be
 done? How expensive will this be?

  Why will the memory references done by the host be broken?

The CPUstate is in host memory. If nested paging is enabled, the guest
page table is walked and then the host one. However, for memory
accesses to the CPUstate, we do not want the guest page table to be
walked.
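
As a rough illustration of what "walked and then the host" costs on a
TLB miss: every guest page-table pointer is itself a guest-physical
address that has to go through the host-side table. The sketch below
counts memory references under the assumption of plain radix tables and
no paging-structure caches; nested_walk_refs is a hypothetical helper,
not something from kvm or QEMU.

```c
#include <assert.h>

/* Memory references for one TLB miss when a guest table with
 * guest_levels levels is walked under a nested (host-side) table with
 * host_levels levels. */
static unsigned nested_walk_refs(unsigned guest_levels, unsigned host_levels)
{
    assert(guest_levels > 0 && host_levels > 0);

    /* Per guest level: translate the guest table pointer through the
     * host table (host_levels reads), then read the guest entry (1). */
    unsigned per_guest_level = host_levels + 1;

    /* The final guest-physical address needs one more host-side walk. */
    return guest_levels * per_guest_level + host_levels;
}
```

For 4-level tables on both sides this gives 24 references, compared
with 4 for a non-nested walk.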


 Regards,
 chenwj




Re: [Qemu-devel] nested page table translation for non-x86 operating system

2012-01-20 Thread 陳韋任
 1. The control of gCR3 and hCR3 needs kernel access. They can be set
 with a device module, as is done in kvm, but trapping into the
 kernel every time gCR3 is reset might be too expensive.

  Why does the control of gCR3 need kernel access? Isn't gCR3 just a field of the
CPUX86State? QEMU should have control of it. Or do you mean the trapping thing?
 
 2. After setting the gCR3 and hCR3, whatever memory references fall
 within the guest memory will be done correctly. However, memory
 references done by the host will be broken. Therefore, when we load
 from the CPUstates, or call helpers on exits from the code cache,
 we need to change the paging mechanism back to non-nested. Can this be
 done? How expensive will this be?

  Why will the memory references done by the host be broken?

Regards,
chenwj
 



[Qemu-devel] nested page table translation for non-x86 operating system

2012-01-19 Thread Xin Tong
I am wondering about the possibility of using the nested page table
mechanism available on x86 processors to do page translation for
non-x86 operating system emulation.

So, when nested paging is enabled, you can control the gCR3 and hCR3.
The gCR3 can be used to point to the page table of the running process
in the guest operating system, and the hCR3 can be used to point to the
page table of the QEMU process. Assuming the page table layouts of
both operating systems are exactly the same, I think this can be
done. However, there are a few problems I see here. I would like to
hear some suggestions or corrections.
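
To pin down the two-stage scheme, here is a toy model of it: one table
stands in for what gCR3 points to (guest virtual to guest physical) and
a second for the host-side table (guest physical to host physical).
Everything here is an assumption made for illustration: the one-level
tables, the 16-page address space, and the names toy_pt, toy_walk and
toy_nested_walk. Real x86 tables are multi-level and walked by hardware.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)
#define NPAGES     16            /* toy address space: 16 pages */
#define UNMAPPED   UINT32_MAX    /* marks a missing mapping / a fault */

/* One-level toy page table: index = page number, value = frame number. */
typedef struct {
    uint32_t frame[NPAGES];
} toy_pt;

static void toy_pt_init(toy_pt *pt)
{
    for (unsigned i = 0; i < NPAGES; i++)
        pt->frame[i] = UNMAPPED;
}

static void toy_map(toy_pt *pt, uint32_t page, uint32_t frame)
{
    assert(page < NPAGES && frame < NPAGES);
    pt->frame[page] = frame;
}

/* Translate addr through one table; UNMAPPED models a page fault. */
static uint32_t toy_walk(const toy_pt *pt, uint32_t addr)
{
    uint32_t page = addr >> PAGE_SHIFT;
    if (page >= NPAGES || pt->frame[page] == UNMAPPED)
        return UNMAPPED;
    return (pt->frame[page] << PAGE_SHIFT) | (addr & PAGE_MASK);
}

/* Nested translation: guest virtual -> guest physical -> host physical. */
static uint32_t toy_nested_walk(const toy_pt *guest, const toy_pt *host,
                                uint32_t gva)
{
    uint32_t gpa = toy_walk(guest, gva);
    if (gpa == UNMAPPED)
        return UNMAPPED;
    return toy_walk(host, gpa);
}
```

Note that an address which is not mapped in the guest-side table
faults in this model, whatever the host-side table says.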

1. The control of gCR3 and hCR3 needs kernel access. They can be set
with a device module, as is done in kvm, but trapping into the
kernel every time gCR3 is reset might be too expensive.

2. After setting the gCR3 and hCR3, whatever memory references fall
within the guest memory will be done correctly. However, memory
references done by the host will be broken. Therefore, when we load
from the CPUstates, or call helpers on exits from the code cache,
we need to change the paging mechanism back to non-nested. Can this be
done? How expensive will this be?

3. Lastly and most importantly, the code cache is based on host
addresses. What about fetching instructions from the code cache? Does
this have to happen in non-nested mode?


Thanks


Xin