Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-19 Thread Alexander Graf



 On 19.11.2014 at 06:48, Aravinda Prasad aravi...@linux.vnet.ibm.com wrote:
 
 
 
 On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
 
 [..]
 
 
 So, this may not still be possible depending on whether the KVM side
 of this is already merged, but it occurs to me that there's a simpler
 way.
 
 Rather than mucking about with having to update the hypervisor on the
 RTAS location, then have qemu copy the code out of RTAS, patch it and
 copy it back into the vector, you could instead do this:
 
  1. Make KVM instead of immediately delivering a 0x200 for a guest
 machine check, cause a special exit to qemu.
 
  2. Have the register-nmi RTAS call store the guest side MC handler
 address in the spapr structure, but perform no actual guest code
 patching.
 
  3. Allocate the error log buffer independently from the RTAS blob,
 so qemu always knows where it is.
 
  4. When qemu gets the MC exit condition, instead of going via a
 patched 0x200 vector, just directly set the guest register state and
 jump straight into the guest side MC handler.
 
 Before I proceed further I would like to know what others think about
 the approach proposed above (except for step 3 - as per PAPR the error
 log buffer should be part of RTAS blob and hence we cannot have error
 log buffer independent of RTAS blob).
 
 Alex, Alexey, Ben: Any thoughts?

If in doubt, stick to PAPR please.

Alex




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-19 Thread David Gibson
On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote:
 
 
 
  On 19.11.2014 at 06:48, Aravinda Prasad
  aravi...@linux.vnet.ibm.com wrote:
  
  
  
  On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
  
  [..]
  
  
  So, this may not still be possible depending on whether the KVM side
  of this is already merged, but it occurs to me that there's a simpler
  way.
  
  Rather than mucking about with having to update the hypervisor on the
  RTAS location, then have qemu copy the code out of RTAS, patch it and
  copy it back into the vector, you could instead do this:
  
   1. Make KVM instead of immediately delivering a 0x200 for a guest
  machine check, cause a special exit to qemu.
  
   2. Have the register-nmi RTAS call store the guest side MC handler
  address in the spapr structure, but perform no actual guest code
  patching.
  
   3. Allocate the error log buffer independently from the RTAS blob,
  so qemu always knows where it is.
  
   4. When qemu gets the MC exit condition, instead of going via a
  patched 0x200 vector, just directly set the guest register state and
  jump straight into the guest side MC handler.
  
  Before I proceed further I would like to know what others think about
  the approach proposed above (except for step 3 - as per PAPR the error
  log buffer should be part of RTAS blob and hence we cannot have error
  log buffer independent of RTAS blob).
  
  Alex, Alexey, Ben: Any thoughts?
 
 If in doubt, stick to PAPR please.

Apart from (3), which was a misunderstanding on my part, this doesn't
diverge from PAPR - it's just a question of how we're implementing the
PAPR behaviour.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-19 Thread Alexander Graf



 On 19.11.2014 at 12:44, David Gibson da...@gibson.dropbear.id.au wrote:
 
 On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote:
 
 
 
 On 19.11.2014 at 06:48, Aravinda Prasad
 aravi...@linux.vnet.ibm.com wrote:
 
 
 
 On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
 
 [..]
 
 
 So, this may not still be possible depending on whether the KVM side
 of this is already merged, but it occurs to me that there's a simpler
 way.
 
 Rather than mucking about with having to update the hypervisor on the
 RTAS location, then have qemu copy the code out of RTAS, patch it and
 copy it back into the vector, you could instead do this:
 
 1. Make KVM instead of immediately delivering a 0x200 for a guest
 machine check, cause a special exit to qemu.
 
 2. Have the register-nmi RTAS call store the guest side MC handler
 address in the spapr structure, but perform no actual guest code
 patching.
 
 3. Allocate the error log buffer independently from the RTAS blob,
 so qemu always knows where it is.
 
 4. When qemu gets the MC exit condition, instead of going via a
 patched 0x200 vector, just directly set the guest register state and
 jump straight into the guest side MC handler.
 
 Before I proceed further I would like to know what others think about
 the approach proposed above (except for step 3 - as per PAPR the error
 log buffer should be part of RTAS blob and hence we cannot have error
 log buffer independent of RTAS blob).
 
 Alex, Alexey, Ben: Any thoughts?
 
 If in doubt, stick to PAPR please.
 
 Apart from (3), which was a misunderstanding on my part, this doesn't
 diverge from PAPR - it's just a question of how we're implementing the
 PAPR behaviour.

Do we need a guest handler at all? Couldn't we make MCs a new exit type and 
handle it all straight from QEMU?


Alex

 



Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-19 Thread David Gibson
On Wed, Nov 19, 2014 at 01:22:01PM +0100, Alexander Graf wrote:
 
 
 
  On 19.11.2014 at 12:44, David Gibson da...@gibson.dropbear.id.au wrote:
  
  On Wed, Nov 19, 2014 at 11:32:56AM +0100, Alexander Graf wrote:
  
  
  
  On 19.11.2014 at 06:48, Aravinda Prasad
  aravi...@linux.vnet.ibm.com wrote:
  
  
  
  On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
  
  [..]
  
  
  So, this may not still be possible depending on whether the KVM side
  of this is already merged, but it occurs to me that there's a simpler
  way.
  
  Rather than mucking about with having to update the hypervisor on the
  RTAS location, then have qemu copy the code out of RTAS, patch it and
  copy it back into the vector, you could instead do this:
  
  1. Make KVM instead of immediately delivering a 0x200 for a guest
  machine check, cause a special exit to qemu.
  
  2. Have the register-nmi RTAS call store the guest side MC handler
  address in the spapr structure, but perform no actual guest code
  patching.
  
  3. Allocate the error log buffer independently from the RTAS blob,
  so qemu always knows where it is.
  
  4. When qemu gets the MC exit condition, instead of going via a
  patched 0x200 vector, just directly set the guest register state and
  jump straight into the guest side MC handler.
  
  Before I proceed further I would like to know what others think about
  the approach proposed above (except for step 3 - as per PAPR the error
  log buffer should be part of RTAS blob and hence we cannot have error
  log buffer independent of RTAS blob).
  
  Alex, Alexey, Ben: Any thoughts?
  
  If in doubt, stick to PAPR please.
  
  Apart from (3), which was a misunderstanding on my part, this doesn't
  diverge from PAPR - it's just a question of how we're implementing the
  PAPR behaviour.
 
 Do we need a guest handler at all? Couldn't we make MCs a new exit
 type and handle it all straight from QEMU?

Well, PAPR allows the OS to register a handler, which existing guests
will expect to be able to do.  The registered handler expects various
information collated for it though, so it isn't a raw 0x200 vector.

IIUC, traditionally pHyp implemented this by patching the guest's 0x200
vector to collate the necessary information then jump to the supplied
handler.

I'm suggesting that instead we indeed make a new exit type, have qemu
collate the information internally then jump directly back into the
guest registered handler.

I'm not sure if that's quite what you were suggesting, but I think we
have pretty close to the same idea here.
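
To make the "new exit type, collate in qemu, then jump to the registered handler" idea concrete, here is a minimal Python model. It is only a sketch, not QEMU or KVM code: the `ExitReason` names, `collate_mc_info` and `handle_exit` are hypothetical, and the collated record is a placeholder because the real layout is defined by PAPR and is not reproduced in this thread.

```python
from enum import Enum, auto


class ExitReason(Enum):
    """Hypothetical exit reasons; the names are illustrative, not KVM's."""
    IO = auto()
    MACHINE_CHECK = auto()


def collate_mc_info(fault_addr: int) -> dict:
    # Placeholder for the information the registered handler expects.
    # The real format comes from PAPR and is not modelled here.
    return {"fault_addr": fault_addr}


def handle_exit(reason: ExitReason, fault_addr: int, handler_addr: int):
    """Return (next_pc, info) for a machine check exit, None for other exits."""
    if reason is ExitReason.MACHINE_CHECK:
        info = collate_mc_info(fault_addr)
        # Re-enter the guest directly at its registered handler,
        # without bouncing through a patched 0x200 vector.
        return handler_addr, info
    return None
```

The point of the sketch is that nothing in guest memory is rewritten: the decision and the data collection both happen on the qemu side of the exit.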

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-18 Thread Aravinda Prasad


On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:

[..]

 
 So, this may not still be possible depending on whether the KVM side
 of this is already merged, but it occurs to me that there's a simpler
 way.
 
 Rather than mucking about with having to update the hypervisor on the
 RTAS location, then have qemu copy the code out of RTAS, patch it and
 copy it back into the vector, you could instead do this:
 
   1. Make KVM instead of immediately delivering a 0x200 for a guest
 machine check, cause a special exit to qemu.
 
   2. Have the register-nmi RTAS call store the guest side MC handler
 address in the spapr structure, but perform no actual guest code
 patching.
 
   3. Allocate the error log buffer independently from the RTAS blob,
 so qemu always knows where it is.
 
   4. When qemu gets the MC exit condition, instead of going via a
 patched 0x200 vector, just directly set the guest register state and
 jump straight into the guest side MC handler.


Before I proceed further I would like to know what others think about
the approach proposed above (except for step 3 - as per PAPR the error
log buffer should be part of RTAS blob and hence we cannot have error
log buffer independent of RTAS blob).

Alex, Alexey, Ben: Any thoughts?

-- 
Regards,
Aravinda




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-12 Thread David Gibson
On Tue, Nov 11, 2014 at 12:45:05PM +0530, Aravinda Prasad wrote:
 
 
 On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
  On Wed, Nov 05, 2014 at 12:42:03PM +0530, Aravinda Prasad wrote:
  This series of patches adds support for fwnmi in powerKVM guests.
 
  Currently upon machine check exception, if the address in
  error belongs to guest then KVM invokes guest's NMI interrupt
  vector 0x200.
 
  This patch series adds functionality where the guest's 0x200
  interrupt vector is patched such that QEMU gets control. QEMU
  then builds error log and reports the error to OS registered
  machine check handlers through RTAS space.
 
  Apart from this, the patch series also takes care of synchronization
  when multiple processors encounter machine check at or about the
  same time.
 
  The patch set was tested by simulating a machine check error in
  the guest.
 
  Changes in v3:
  - Incorporated review comments
  - Byte codes in patch 4/4 are now moved to
pc-bios/spapr-rtas/spapr-rtas.S as instructions.
  - Defined the RTAS blob in-memory layout.
  - FIX: save and restore cr register in the trampoline
 
  Changes in v2:
  - Re-based to github.com/agraf/qemu.git  branch: ppc-next
  - Merged patches 4 and 5.
  - Incorporated other review comments
  
  So, this may not still be possible depending on whether the KVM side
  of this is already merged, but it occurs to me that there's a simpler
  way.
 
 The KVM part is already merged. Commit ID: 74845bc

Ok, that makes life harder, though I guess without the qemu code
merged, no-one would be using it yet, so it's not impossible to change still.

  Rather than mucking about with having to update the hypervisor on the
  RTAS location, then have qemu copy the code out of RTAS, patch it and
  copy it back into the vector, you could instead do this:
 
 Though this is possible, I have a couple of comments below
 
  
1. Make KVM instead of immediately delivering a 0x200 for a guest
  machine check, cause a special exit to qemu.
  
2. Have the register-nmi RTAS call store the guest side MC handler
  address in the spapr structure, but perform no actual guest code
  patching.
  
3. Allocate the error log buffer independently from the RTAS blob,
  so qemu always knows where it is.
 
 As per PAPR, the error log buffer should be part of RTAS blob and the
 guest kernel explicitly checks if error log is inside RTAS blob.
 This requires qemu to know the updated RTAS location by the OS which is
 handled in patch 2/4.

Ugh, ok.  That's a pretty stupid interface requirement, even by PAPR
standards, but I guess we're stuck with it.

4. When qemu gets the MC exit condition, instead of going via a
  patched 0x200 vector, just directly set the guest register state and
  jump straight into the guest side MC handler.
 
 PAPR mentions:
 
 R1–7.3.14–8: Once the OS has registered for NMI notification, the
 platform firmware must intercept all System Reset Interrupts on all of
 the OS’s processors.
 
 So do we need to go via 0x200?

I don't see why.  The hypervisor is already intercepting system resets
and machine checks because it's a hypervisor, and from the PAPR
guest's point of view, all it cares about is that you enter its
registered handler with the expected information available.

I don't see that the guest cares whether you bounce via a vector in
guest space or directly enter the guest supplied handler using
hypervisor magic.  Patching the guest's vector actually seems a pretty
awful hack that would only be necessary to work around limitations in
the virtualization capabilities which I don't think we have as of POWER8.

Btw, isn't a System Reset Interrupt vector 0x100, not vector 0x200?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-12 Thread Aravinda Prasad


On Thursday 13 November 2014 09:27 AM, David Gibson wrote:
 On Tue, Nov 11, 2014 at 12:45:05PM +0530, Aravinda Prasad wrote:


 On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
 On Wed, Nov 05, 2014 at 12:42:03PM +0530, Aravinda Prasad wrote:
 This series of patches adds support for fwnmi in powerKVM guests.

 Currently upon machine check exception, if the address in
 error belongs to guest then KVM invokes guest's NMI interrupt
 vector 0x200.

 This patch series adds functionality where the guest's 0x200
 interrupt vector is patched such that QEMU gets control. QEMU
 then builds error log and reports the error to OS registered
 machine check handlers through RTAS space.

 Apart from this, the patch series also takes care of synchronization
 when multiple processors encounter machine check at or about the
 same time.

 The patch set was tested by simulating a machine check error in
 the guest.

 Changes in v3:
 - Incorporated review comments
 - Byte codes in patch 4/4 are now moved to
   pc-bios/spapr-rtas/spapr-rtas.S as instructions.
 - Defined the RTAS blob in-memory layout.
 - FIX: save and restore cr register in the trampoline

 Changes in v2:
 - Re-based to github.com/agraf/qemu.git  branch: ppc-next
 - Merged patches 4 and 5.
 - Incorporated other review comments

 So, this may not still be possible depending on whether the KVM side
 of this is already merged, but it occurs to me that there's a simpler
 way.

 The KVM part is already merged. Commit ID: 74845bc
 
 Ok, that makes life harder, though I guess without the qemu code
 merged, no-one would be using it yet, so it's not impossible to change still.
 
 Rather than mucking about with having to update the hypervisor on the
 RTAS location, then have qemu copy the code out of RTAS, patch it and
 copy it back into the vector, you could instead do this:

 Though this is possible, I have a couple of comments below


   1. Make KVM instead of immediately delivering a 0x200 for a guest
 machine check, cause a special exit to qemu.

   2. Have the register-nmi RTAS call store the guest side MC handler
 address in the spapr structure, but perform no actual guest code
 patching.

   3. Allocate the error log buffer independently from the RTAS blob,
 so qemu always knows where it is.

 As per PAPR, the error log buffer should be part of RTAS blob and the
 guest kernel explicitly checks if error log is inside RTAS blob.
 This requires qemu to know the updated RTAS location by the OS which is
 handled in patch 2/4.
 
 Ugh, ok.  That's a pretty stupid interface requirement, even by PAPR
 standards, but I guess we're stuck with it.
 
   4. When qemu gets the MC exit condition, instead of going via a
 patched 0x200 vector, just directly set the guest register state and
 jump straight into the guest side MC handler.

 PAPR mentions:

 R1–7.3.14–8: Once the OS has registered for NMI notification, the
 platform firmware must intercept all System Reset Interrupts on all of
 the OS’s processors.

 So do we need to go via 0x200?
 
 I don't see why.  The hypervisor is already intercepting system resets
 and machine checks because it's a hypervisor, and from the PAPR
 guest's point of view, all it cares about is that you enter its
 registered handler with the expected information available.
 
 I don't see that the guest cares whether you bounce via a vector in
 guest space or directly enter the guest supplied handler using
 hypervisor magic.  Patching the guest's vector actually seems a pretty
 awful hack that would only be necessary to work around limitations in
 the virtualization capabilities which I don't think we have as of POWER8.
 

Agree.

 Btw, isn't a System Reset Interrupt vector 0x100, not vector 0x200?

The System Reset Interrupt vector is 0x100; the Machine Check Interrupt
vector is 0x200. The R1–7.3.14–8 extract above was for System Reset. The
corresponding requirement for Machine Check is R1–7.3.14–10.

 

-- 
Regards,
Aravinda




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-10 Thread David Gibson
On Wed, Nov 05, 2014 at 12:42:03PM +0530, Aravinda Prasad wrote:
 This series of patches adds support for fwnmi in powerKVM guests.
 
 Currently upon machine check exception, if the address in
 error belongs to guest then KVM invokes guest's NMI interrupt
 vector 0x200.
 
 This patch series adds functionality where the guest's 0x200
 interrupt vector is patched such that QEMU gets control. QEMU
 then builds error log and reports the error to OS registered
 machine check handlers through RTAS space.
 
 Apart from this, the patch series also takes care of synchronization
 when multiple processors encounter machine check at or about the
 same time.
 
 The patch set was tested by simulating a machine check error in
 the guest.
 
 Changes in v3:
 - Incorporated review comments
 - Byte codes in patch 4/4 are now moved to
   pc-bios/spapr-rtas/spapr-rtas.S as instructions.
 - Defined the RTAS blob in-memory layout.
 - FIX: save and restore cr register in the trampoline
 
 Changes in v2:
 - Re-based to github.com/agraf/qemu.git  branch: ppc-next
 - Merged patches 4 and 5.
 - Incorporated other review comments

So, this may not still be possible depending on whether the KVM side
of this is already merged, but it occurs to me that there's a simpler
way.

Rather than mucking about with having to update the hypervisor on the
RTAS location, then have qemu copy the code out of RTAS, patch it and
copy it back into the vector, you could instead do this:

  1. Make KVM instead of immediately delivering a 0x200 for a guest
machine check, cause a special exit to qemu.

  2. Have the register-nmi RTAS call store the guest side MC handler
address in the spapr structure, but perform no actual guest code
patching.

  3. Allocate the error log buffer independently from the RTAS blob,
so qemu always knows where it is.

  4. When qemu gets the MC exit condition, instead of going via a
patched 0x200 vector, just directly set the guest register state and
jump straight into the guest side MC handler.
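
Steps 1, 2 and 4 above can be sketched as a small Python model. This is an illustration under stated assumptions, not QEMU code: `Vcpu` and `SpaprModel` are hypothetical stand-ins for the real CPU state and spapr structure, passing the error log address in r3 is an assumption of this sketch, and the fallback to the plain 0x200 vector when no handler is registered simply mirrors the current behaviour described in the cover letter.

```python
from dataclasses import dataclass, field
from typing import Optional

MACHINE_CHECK_VECTOR = 0x200  # guest machine check interrupt vector


@dataclass
class Vcpu:
    """Tiny stand-in for guest CPU state (hypothetical, not QEMU's type)."""
    pc: int = 0
    gpr: list = field(default_factory=lambda: [0] * 32)


@dataclass
class SpaprModel:
    """Hypothetical stand-in for the per-machine spapr structure."""
    guest_mc_handler: Optional[int] = None  # step 2: address only, no patching
    error_log_addr: int = 0                 # where qemu placed the error log

    def rtas_register_nmi(self, mc_handler_addr: int) -> None:
        # Step 2: remember the handler; guest memory is left untouched.
        self.guest_mc_handler = mc_handler_addr

    def on_machine_check_exit(self, vcpu: Vcpu) -> None:
        # Steps 1 and 4: runs when KVM exits to qemu for a guest machine check.
        if self.guest_mc_handler is None:
            # No handler registered: deliver through the plain 0x200 vector.
            vcpu.pc = MACHINE_CHECK_VECTOR
            return
        # Assumption of this sketch: the handler finds the log address in r3.
        vcpu.gpr[3] = self.error_log_addr
        vcpu.pc = self.guest_mc_handler
```

A caller would invoke `rtas_register_nmi()` once when the guest makes the RTAS call, then `on_machine_check_exit()` for each machine check exit before resuming the vCPU.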

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2014-11-10 Thread Aravinda Prasad


On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
 On Wed, Nov 05, 2014 at 12:42:03PM +0530, Aravinda Prasad wrote:
 This series of patches adds support for fwnmi in powerKVM guests.

 Currently upon machine check exception, if the address in
 error belongs to guest then KVM invokes guest's NMI interrupt
 vector 0x200.

 This patch series adds functionality where the guest's 0x200
 interrupt vector is patched such that QEMU gets control. QEMU
 then builds error log and reports the error to OS registered
 machine check handlers through RTAS space.

 Apart from this, the patch series also takes care of synchronization
 when multiple processors encounter machine check at or about the
 same time.

 The patch set was tested by simulating a machine check error in
 the guest.

 Changes in v3:
 - Incorporated review comments
 - Byte codes in patch 4/4 are now moved to
   pc-bios/spapr-rtas/spapr-rtas.S as instructions.
 - Defined the RTAS blob in-memory layout.
 - FIX: save and restore cr register in the trampoline

 Changes in v2:
 - Re-based to github.com/agraf/qemu.git  branch: ppc-next
 - Merged patches 4 and 5.
 - Incorporated other review comments
 
 So, this may not still be possible depending on whether the KVM side
 of this is already merged, but it occurs to me that there's a simpler
 way.

The KVM part is already merged. Commit ID: 74845bc

 
 Rather than mucking about with having to update the hypervisor on the
 RTAS location, then have qemu copy the code out of RTAS, patch it and
 copy it back into the vector, you could instead do this:

Though this is possible, I have a couple of comments below

 
   1. Make KVM instead of immediately delivering a 0x200 for a guest
 machine check, cause a special exit to qemu.
 
   2. Have the register-nmi RTAS call store the guest side MC handler
 address in the spapr structure, but perform no actual guest code
 patching.
 
   3. Allocate the error log buffer independently from the RTAS blob,
 so qemu always knows where it is.

As per PAPR, the error log buffer should be part of the RTAS blob, and the
guest kernel explicitly checks that the error log is inside the RTAS blob.
This requires qemu to know the RTAS location as updated by the OS, which is
handled in patch 2/4.
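
The containment requirement can be illustrated with a generic range check in Python. This is only a sketch: the exact test the guest kernel performs is not shown in this thread, and `error_log_in_rtas` and its parameters are hypothetical names chosen for the example.

```python
def error_log_in_rtas(rtas_base: int, rtas_size: int,
                      log_addr: int, log_len: int) -> bool:
    """Return True if [log_addr, log_addr + log_len) lies inside the RTAS blob."""
    if rtas_size <= 0 or log_len <= 0:
        return False
    rtas_end = rtas_base + rtas_size  # one past the last byte of the blob
    return rtas_base <= log_addr and log_addr + log_len <= rtas_end
```

Because `rtas_base` changes when the OS relocates RTAS, a check like this only gives the right answer if qemu is told the new base, which is why the location update matters.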

 
   4. When qemu gets the MC exit condition, instead of going via a
 patched 0x200 vector, just directly set the guest register state and
 jump straight into the guest side MC handler.

PAPR mentions:

R1–7.3.14–8: Once the OS has registered for NMI notification, the
platform firmware must intercept all System Reset Interrupts on all of
the OS’s processors.

So do we need to go via 0x200?

-- 
Regards,
Aravinda