Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-09 Thread Alexander Graf

On 09.07.2010, at 11:11, MJ embd wrote:

 On Thu, Jul 1, 2010 at 4:13 PM, Alexander Graf ag...@suse.de wrote:
 We just introduced a new PV interface that screams for documentation. So here
 it is - a shiny new and awesome text file describing the internal works of
 the PPC KVM paravirtual interface.
 
 Signed-off-by: Alexander Graf ag...@suse.de
 
 +
 +
 +Some instructions require more logic to determine what's going on than a load
 +or store instruction can deliver. To enable patching of those, we keep some
 +RAM around where we can live translate instructions to. What happens is the
 +following:
 +
 +   1) copy emulation code to memory
 +   2) patch that code to fit the emulated instruction
 +   3) patch that code to return to the original pc + 4
 +   4) patch the original instruction to branch to the new code
 +
 +That way we can inject an arbitrary amount of code as replacement for a single
 +instruction. This allows us to check for pending interrupts when setting EE=1
 +for example.
 +
 
 Which patch does this mapping ? Can you please point to that.

The branch patching is in patch 22/27. For the respective users, see patches
23-26/27.
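
For reference, the four steps quoted above can be sketched in C roughly as
follows. This is not the code from patch 22/27; it is a minimal model that
treats guest memory as an array of instruction words. The helper names
(ppc_branch, ppc_branch_offset, patch_instruction) and the template/slot
layout are assumptions made for illustration. Only the relative branch
encoding (opcode 0x48000000 plus a word-aligned 26-bit signed offset) is
taken from the PowerPC instruction set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PPC_INST_NOP   0x60000000u  /* ori 0,0,0 */
#define PPC_INST_B     0x48000000u  /* b <relative target>, AA=0, LK=0 */
#define PPC_B_OFF_MASK 0x03fffffcu  /* LI field, always word aligned */

/* Encode a relative branch from byte address 'from' to byte address 'to'. */
static uint32_t ppc_branch(uint32_t from, uint32_t to)
{
    int32_t off = (int32_t)(to - from);

    /* The LI field reaches +/- 32 MiB; anything further cannot be encoded. */
    assert(off >= -0x02000000 && off < 0x02000000 && (off & 3) == 0);
    return PPC_INST_B | ((uint32_t)off & PPC_B_OFF_MASK);
}

/* Decode the signed byte offset of a relative branch. */
static int32_t ppc_branch_offset(uint32_t inst)
{
    uint32_t li = inst & PPC_B_OFF_MASK;

    if (li & 0x02000000u)       /* sign-extend the 26-bit offset */
        li |= 0xfc000000u;
    return (int32_t)li;
}

/*
 * Hypothetical patcher. 'mem' is indexed by address / 4, 'pc' is the
 * instruction being replaced, 'scratch' is free space for the new code,
 * 'tmpl' is the emulation template and 'slot' is the template word that
 * gets specialised for this instruction. Returns the first free scratch
 * address after the emitted code. A real implementation would also have
 * to flush the instruction cache for both patched regions.
 */
static uint32_t patch_instruction(uint32_t *mem, uint32_t pc, uint32_t scratch,
                                  const uint32_t *tmpl, uint32_t tmpl_len,
                                  uint32_t slot, uint32_t slot_val)
{
    uint32_t *code = &mem[scratch / 4];
    uint32_t ret_addr = scratch + tmpl_len * 4;

    assert(slot < tmpl_len);
    memcpy(code, tmpl, tmpl_len * sizeof(*tmpl));   /* 1) copy emulation code */
    code[slot] = slot_val;                          /* 2) fit it to the instruction */
    code[tmpl_len] = ppc_branch(ret_addr, pc + 4);  /* 3) return to original pc + 4 */
    mem[pc / 4] = ppc_branch(pc, scratch);          /* 4) branch to the new code */
    return ret_addr + 4;
}
```

The original instruction is overwritten last, so the scratch code is complete
before anything can branch into it.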


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 02.07.2010, at 21:10, Scott Wood wrote:

 On Fri, 2 Jul 2010 20:47:44 +0200
 Alexander Graf ag...@suse.de wrote:
 
 
 On 02.07.2010, at 19:59, Hollis Blanchard wrote:
 
 [Resending...]
 
 Please reconcile this with
 http://www.linux-kvm.org/page/PowerPC_Hypercall_ABI, which has been
 discussed in the (admittedly closed) Power.org embedded hypervisor
 working group. Bear in mind that other hypervisors are already
 implementing the documented ABI, so if you have concerns, you should
 probably raise them with that audience...
 
 We cannot use sc with LV=1 because that would break the "KVM inside
 something else" case, which is KVM's strong point on PPC.
 
 The current proposal involves the hypervisor specifying the hcall opcode
 sequence in the device tree -- to allow either sc 1 or sc 0 plus
 magic GPR depending on whether you've got the hardware hypervisor
 feature (hereafter HHV).

Ah right, so you can still trap a hypercall with HHV. Makes sense.

 
 With HHV, sc 0 plus magic GPR just doesn't work, since it won't trap
 to the hypervisor.  sc 1 plus magic GPR might be problematic on some
 non-HHV implementations, especially if you *do* have HHV but the
 non-HHV hypervisor is running as an HHV guest.

Yes, that's why I need sc 0 plus magic GPR in r0 and r3 - to accommodate all
the non-HHV cases. And it would be clever to have a way to expose the same
functionality when we do use the HHV features.

So, is that draft available anywhere? The wiki page Hollis pointed to is very 
vague.


Alex



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 00:41, Benjamin Herrenschmidt wrote:

 On Fri, 2010-07-02 at 18:27 +0200, Segher Boessenkool wrote:
 +To find out if we're running on KVM or not, we overlay the PVR register. Usually
 +the PVR register contains an id that identifies your CPU type. If, however, you
 +pass KVM_PVR_PARA in the register that you want the PVR result in, the register
 +still contains KVM_PVR_PARA after the mfpvr call.
 +
 +   LOAD_REG_IMM(r5, KVM_PVR_PARA)
 +   mfpvr   r5
 +   [r5 still contains KVM_PVR_PARA]
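
The probe above can be modelled in plain C as follows. This is only a sketch
of the behaviour the quoted text describes: EXAMPLE_PVR_PARA is a made-up
placeholder (the real KVM_PVR_PARA value comes from the patch series), and
model_mfpvr() and guest_detects_pv() are hypothetical names standing in for
the host's mfpvr emulation and the guest's check.

```c
#include <stdint.h>

/* Placeholder magic; NOT the real KVM_PVR_PARA value. */
#define EXAMPLE_PVR_PARA 0x4b564d3fu

/*
 * Host-side model of an intercepted mfpvr: returns the value the target
 * register holds afterwards. With the PV interface enabled, a register
 * that already holds the magic is left untouched; otherwise the real PVR
 * is written as usual.
 */
static uint32_t model_mfpvr(uint32_t reg_before, uint32_t real_pvr,
                            int pv_enabled)
{
    if (pv_enabled && reg_before == EXAMPLE_PVR_PARA)
        return reg_before;
    return real_pvr;
}

/* Guest-side probe, mirroring the three quoted assembly lines. */
static int guest_detects_pv(uint32_t real_pvr, int pv_enabled)
{
    uint32_t r5 = EXAMPLE_PVR_PARA;             /* LOAD_REG_IMM(r5, KVM_PVR_PARA) */

    r5 = model_mfpvr(r5, real_pvr, pv_enabled); /* mfpvr r5 */
    return r5 == EXAMPLE_PVR_PARA;              /* r5 still contains the magic? */
}
```

The scheme only works if the magic can never equal a real PVR, since a CPU
reporting that value would look like a PV host to the guest.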
 
 I love this part :-)
 
 Me not :-)
 
 It should be in the device-tree instead, or something like that. Enough
 games with PVR...

My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM? Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) 
when I could go with patching a single one (Linux)?

Alex



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Alexander Graf

On 04.07.2010, at 11:17, Alexander Graf wrote:

 
 On 04.07.2010, at 11:10, Avi Kivity wrote:
 
 On 07/04/2010 12:04 PM, Alexander Graf wrote:
 
 My biggest concern about putting things in the device-tree is that I was 
 trying to keep things as separate as possible. Why does the firmware have 
 to know that it's running in KVM?
 
 It doesn't need to know about kvm, it needs to know that a particular 
 hypercall protocol is available.
 
 Judging by the parts of the draft that I have read, that's not what its
 authors have in mind. PPC people love to see the BIOS be part of the
 virtualization solution. I don't. That's the biggest difference here and
 reason for us going different directions.
 
 I think what they thought of is something like
 
 if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
 }
 
 which then the OS reads out. But that's useless, as the hypercalls are 
 hypervisor specific. So why make the detection on the Linux side generic?

In fact, it's even worse. Right now with KVM for PPC we have 3 different ways 
of generating the device tree:

1) OpenBIOS (Mac emulation)
2) Qemu libfdt (BookE)
3) MOL OF implementation

So I'd have to touch even more projects. Just for the sake of splitting out 
something that belongs together anyway. And probably even create new interfaces 
just for that sake (qemu asking the kernel which type of hypercalls the vm 
should use) even though the guest could just query all that itself.

Alex



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:17 PM, Alexander Graf wrote:

On 04.07.2010, at 11:10, Avi Kivity wrote:

   

On 07/04/2010 12:04 PM, Alexander Graf wrote:
 

My biggest concern about putting things in the device-tree is that I was trying 
to keep things as separate as possible. Why does the firmware have to know that 
it's running in KVM?
   

It doesn't need to know about kvm, it needs to know that a particular hypercall 
protocol is available.
 

Judging by the parts of the draft that I have read, that's not what its
authors have in mind. PPC people love to see the BIOS be part of the
virtualization solution. I don't. That's the biggest difference here and reason
for us going different directions.
   


Regardless of which direction is correct, you need to go in one direction.


I think what they thought of is something like

if (in_kvm()) {
   device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
   device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?
   


Looks like the benefit is less magic in the detection code.  x86 has 
(more or less) standardized feature detection.  Is this an attempt to 
bring something similar to ppc-land?



Why do I have to patch 3 projects (Linux, OpenBIOS, Qemu) when I could go with 
patching a single one (Linux)?

   

That's not a valid argument.  You patch as many projects as it takes to get it 
right (not that I have an opinion in this particular discussion).
 

If you can put code in Linux that touches 3 submaintainer's directories or one 
submaintainer's directory with both ending up being functionally equivalent, 
which way would you go?
   


We would do the right thing.  Trivial examples include adding defines to 
include/asm/processor.h or include/asm/msr-index.h, more complicated 
ones are the topic for my talk in kvm forum 2010.


Yes, coordinating the acks and trees and merge windows is not as fun as 
coding.  Yes, it's even more difficult with separate trees.  No, that's 
not an excuse if we[1] determine that the right thing to do is the most 
complicated.


[1] we in this case are the powerpc Linux arch maintainers and/or 
whoever defines the hardware specification



At the very least you have to patch qemu for reasons described before 
(backwards compatible live migration).
 

There is no live migration on PPC (yet). That point is completely moot atm.
   


You still need to make that feature disableable from userspace.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-04 Thread Avi Kivity

On 07/04/2010 12:30 PM, Alexander Graf wrote:



Judging by the parts of the draft that I have read, that's not what its
authors have in mind. PPC people love to see the BIOS be part of the
virtualization solution. I don't. That's the biggest difference here and reason
for us going different directions.

I think what they thought of is something like

if (in_kvm()) {
  device_tree_put(/hypervisor/exit, EXIT_TYPE_MAGIC);
  device_tree_put(/hypervisor/exit_magic, EXIT_MAGIC);
}

which then the OS reads out. But that's useless, as the hypercalls are 
hypervisor specific. So why make the detection on the Linux side generic?
 

In fact, it's even worse. Right now with KVM for PPC we have 3 different ways 
of generating the device tree:

1) OpenBIOS (Mac emulation)
2) Qemu libfdt (BookE)
3) MOL OF implementation
   


I sympathize.  But, if the arch says that's how you do things, then 
that's how you do things.



So I'd have to touch even more projects. Just for the sake of splitting out 
something that belongs together anyway. And probably even create new interfaces 
just for that sake (qemu asking the kernel which type of hypercalls the vm 
should use) even though the guest could just query all that itself.
   


qemu needs to be involved, in case one day you support more than one 
type of hypercalls (like x86 does with hyper-v) or if you want to live 
migrate from a host that has hypercall support to another host that has 
this feature removed (as has already happened on x86 with the pvmmu).


Planning for the future means a lot of boring interfaces.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-03 Thread Benjamin Herrenschmidt
On Fri, 2010-07-02 at 18:27 +0200, Segher Boessenkool wrote:
  +To find out if we're running on KVM or not, we overlay the PVR register. Usually
  +the PVR register contains an id that identifies your CPU type. If, however, you
  +pass KVM_PVR_PARA in the register that you want the PVR result in, the register
  +still contains KVM_PVR_PARA after the mfpvr call.
  +
  +   LOAD_REG_IMM(r5, KVM_PVR_PARA)
  +   mfpvr   r5
  +   [r5 still contains KVM_PVR_PARA]
 
 I love this part :-)

Me not :-)

It should be in the device-tree instead, or something like that. Enough
games with PVR...

Ben.

  +   __u64 scratch3;
  +   __u64 critical; /* Guest may not get interrupts if == r1 */
  +   __u64 sprg0;
  +   __u64 sprg1;
  +   __u64 sprg2;
  +   __u64 sprg3;
  +   __u64 srr0;
  +   __u64 srr1;
  +   __u64 dar;
  +   __u64 msr;
  +   __u32 dsisr;
  +   __u32 int_pending;  /* Tells the guest if we have an interrupt */
  +};
  +
  +Additions to the page must only occur at the end. Struct fields are always 32
  +bit aligned.
 
 The u64s are 64-bit aligned, should they always be?
 
  +The ld and std instructions are transformed to lwz and stw instructions
  +respectively on 32 bit systems with an added offset of 4 to accommodate big
  +endianness.
 
 Will this add never overflow?  Is there anything that checks for it?
 
  +mtmsrd rX, 0   b   special mtmsr section
  +mtmsr  b   special mtmsr section
 
 mtmsr rX
 
 
 Segher
 
 ___
 Linuxppc-dev mailing list
 linuxppc-...@lists.ozlabs.org
 https://lists.ozlabs.org/listinfo/linuxppc-dev




Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-03 Thread Benjamin Herrenschmidt
On Fri, 2010-07-02 at 20:41 +0200, Alexander Graf wrote:
 The u64s are 64-bit aligned, should they always be?
 
 That's obvious, isn't it? And the ABI only specifies u64s to be 32 bit
 aligned, no? At least that's what ld and std specify.

No, the PowerPC ABI specifies u64's to be 64-bit aligned, even for
32-bit binaries.
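
The alignment question can be checked mechanically. The sketch below uses
only the fields quoted in this thread; the head of the real struct is elided
in the quote and is not reconstructed here, so the offsets are relative to
scratch3 and the result carries over only if the elided head is a multiple of
8 bytes long. The struct name and the helper are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Tail of the shared page as quoted; the elided head is left out on purpose. */
struct shared_tail_example {
    uint64_t scratch3;
    uint64_t critical;
    uint64_t sprg0;
    uint64_t sprg1;
    uint64_t sprg2;
    uint64_t sprg3;
    uint64_t srr0;
    uint64_t srr1;
    uint64_t dar;
    uint64_t msr;
    uint32_t dsisr;
    uint32_t int_pending;
};

#define FIELD_OFF(f) offsetof(struct shared_tail_example, f)

/* Compile-time spot checks: 64-bit fields on 8 bytes, 32-bit ones on 4. */
_Static_assert(FIELD_OFF(dar) % 8 == 0 && FIELD_OFF(msr) % 8 == 0,
               "64-bit fields must be 64-bit aligned");
_Static_assert(FIELD_OFF(dsisr) % 4 == 0 && FIELD_OFF(int_pending) % 4 == 0,
               "32-bit fields must be 32-bit aligned");

/* Run-time variant of the same check over every 64-bit field. */
static int all_u64_fields_aligned(void)
{
    const size_t offs[] = {
        FIELD_OFF(scratch3), FIELD_OFF(critical), FIELD_OFF(sprg0),
        FIELD_OFF(sprg1), FIELD_OFF(sprg2), FIELD_OFF(sprg3),
        FIELD_OFF(srr0), FIELD_OFF(srr1), FIELD_OFF(dar), FIELD_OFF(msr),
    };

    for (size_t i = 0; i < sizeof(offs) / sizeof(offs[0]); i++)
        if (offs[i] % 8 != 0)
            return 0;
    return 1;
}
```

Because the two 32-bit fields come last and pair up to 8 bytes, a 64-bit
field appended at the end would also land on an 8-byte boundary.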

Ben.

  
  +The ld and std instructions are transformed to lwz and stw instructions
  +respectively on 32 bit systems with an added offset of 4 to accommodate big
  +endianness.
  
  Will this add never overflow?  Is there anything that checks for it?
 
 It basically means that to access dar, we either do
 
 ld  rX, DAR(0)
 
 or
 
 lwz rX, DAR+4(0)
 
 
  
  +mtmsrd  rX, 0   b   special mtmsr section
  +mtmsr   b   special mtmsr section
  
  mtmsr rX
 
 Nod. 



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-02 Thread Segher Boessenkool
+To find out if we're running on KVM or not, we overlay the PVR register. Usually
+the PVR register contains an id that identifies your CPU type. If, however, you
+pass KVM_PVR_PARA in the register that you want the PVR result in, the register
+still contains KVM_PVR_PARA after the mfpvr call.
+
+   LOAD_REG_IMM(r5, KVM_PVR_PARA)
+   mfpvr   r5
+   [r5 still contains KVM_PVR_PARA]


I love this part :-)


+   __u64 scratch3;
+   __u64 critical; /* Guest may not get interrupts if == r1 */
+   __u64 sprg0;
+   __u64 sprg1;
+   __u64 sprg2;
+   __u64 sprg3;
+   __u64 srr0;
+   __u64 srr1;
+   __u64 dar;
+   __u64 msr;
+   __u32 dsisr;
+   __u32 int_pending;  /* Tells the guest if we have an interrupt */
+};
+
+Additions to the page must only occur at the end. Struct fields are always 32
+bit aligned.


The u64s are 64-bit aligned, should they always be?

+The ld and std instructions are transformed to lwz and stw instructions
+respectively on 32 bit systems with an added offset of 4 to accommodate big
+endianness.


Will this add never overflow?  Is there anything that checks for it?


+mtmsrd rX, 0   b   special mtmsr section
+mtmsr  b   special mtmsr section


mtmsr rX


Segher



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-02 Thread Alexander Graf

On 02.07.2010, at 18:27, Segher Boessenkool wrote:

 +To find out if we're running on KVM or not, we overlay the PVR register. Usually
 +the PVR register contains an id that identifies your CPU type. If, however, you
 +pass KVM_PVR_PARA in the register that you want the PVR result in, the register
 +still contains KVM_PVR_PARA after the mfpvr call.
 +
 +LOAD_REG_IMM(r5, KVM_PVR_PARA)
 +mfpvr   r5
 +[r5 still contains KVM_PVR_PARA]
 
 I love this part :-)

:)

 
 +__u64 scratch3;
 +__u64 critical; /* Guest may not get interrupts if == r1 */
 +__u64 sprg0;
 +__u64 sprg1;
 +__u64 sprg2;
 +__u64 sprg3;
 +__u64 srr0;
 +__u64 srr1;
 +__u64 dar;
 +__u64 msr;
 +__u32 dsisr;
 +__u32 int_pending;  /* Tells the guest if we have an interrupt */
 +};
 +
 +Additions to the page must only occur at the end. Struct fields are always 32
 +bit aligned.
 
 The u64s are 64-bit aligned, should they always be?

That's obvious, isn't it? And the ABI only specifies u64s to be 32 bit aligned, 
no? At least that's what ld and std specify.

 
 +The ld and std instructions are transformed to lwz and stw instructions
 +respectively on 32 bit systems with an added offset of 4 to accommodate big
 +endianness.
 
 Will this add never overflow?  Is there anything that checks for it?

It basically means that to access dar, we either do

ld  rX, DAR(0)

or

lwz rX, DAR+4(0)
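
To make the "+4" concrete, here is a small host-independent sketch, with
helper names invented for the example. It stores a 64-bit value in big-endian
byte order and shows that the low 32 bits sit at offset 4. It also includes a
range check for the overflow question: ld uses a DS-form displacement (a
multiple of 4 between -32768 and 32764) while lwz takes a signed 16-bit
displacement, so adding 4 can only overflow for a displacement of 32764.
Whether the patch series performs such a check is not stated in this thread.

```c
#include <stdint.h>

/* Store a 64-bit value in big-endian byte order, as std would on PowerPC. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Load a 32-bit big-endian word, as lwz would on PowerPC. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Turn an ld displacement into the lwz displacement that reads the low
 * word. Returns 0 on success, or -1 if disp + 4 no longer fits into the
 * signed 16-bit displacement field of lwz.
 */
static int lwz_disp_from_ld(int32_t ld_disp, int16_t *lwz_disp)
{
    int32_t d = ld_disp + 4;

    if (d < INT16_MIN || d > INT16_MAX)
        return -1;
    *lwz_disp = (int16_t)d;
    return 0;
}
```

A shared page of 4096 bytes addressed from base 0 stays far below the 32764
limit, so the overflow case would only matter for other base/offset schemes.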


 
 +mtmsrd  rX, 0   b   special mtmsr section
 +mtmsr   b   special mtmsr section
 
 mtmsr rX

Nod.


Alex



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-02 Thread Alexander Graf

On 02.07.2010, at 19:59, Hollis Blanchard wrote:

 [Resending...]
 
 Please reconcile this with
 http://www.linux-kvm.org/page/PowerPC_Hypercall_ABI, which has been
 discussed in the (admittedly closed) Power.org embedded hypervisor
 working group. Bear in mind that other hypervisors are already
 implementing the documented ABI, so if you have concerns, you should
 probably raise them with that audience...

We cannot use sc with LV=1 because that would break the "KVM inside something
else" case, which is KVM's strong point on PPC.

Alex



Re: [PATCH 27/27] KVM: PPC: Add Documentation about PV interface

2010-07-02 Thread Scott Wood
On Fri, 2 Jul 2010 20:47:44 +0200
Alexander Graf ag...@suse.de wrote:

 
 On 02.07.2010, at 19:59, Hollis Blanchard wrote:
 
  [Resending...]
  
  Please reconcile this with
  http://www.linux-kvm.org/page/PowerPC_Hypercall_ABI, which has been
  discussed in the (admittedly closed) Power.org embedded hypervisor
  working group. Bear in mind that other hypervisors are already
  implementing the documented ABI, so if you have concerns, you should
  probably raise them with that audience...
 
 We cannot use sc with LV=1 because that would break the "KVM inside
 something else" case, which is KVM's strong point on PPC.

The current proposal involves the hypervisor specifying the hcall opcode
sequence in the device tree -- to allow either sc 1 or sc 0 plus
magic GPR depending on whether you've got the hardware hypervisor
feature (hereafter HHV).

With HHV, sc 0 plus magic GPR just doesn't work, since it won't trap
to the hypervisor.  sc 1 plus magic GPR might be problematic on some
non-HHV implementations, especially if you *do* have HHV but the
non-HHV hypervisor is running as an HHV guest.
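
As a rough illustration of the choice described above, the sketch below picks
the hcall instruction the way a hypervisor might before advertising it to the
guest. The struct, the function names and the reading that only the sc 0
variant needs a magic GPR are assumptions for this example, not part of the
proposal text. The only hard fact used is the sc encoding: 0x44000002 with
the LEV field starting at bit 5, so sc 1 is 0x44000022.

```c
#include <stdint.h>

#define PPC_INST_SC      0x44000002u  /* sc with LEV = 0 */
#define PPC_SC_LEV_SHIFT 5            /* LEV is a 7-bit field */

/* Encode "sc LEV". */
static uint32_t ppc_sc(unsigned int lev)
{
    return PPC_INST_SC | ((lev & 0x7fu) << PPC_SC_LEV_SHIFT);
}

/* Hypothetical description of the hcall sequence handed to the guest. */
struct hcall_desc {
    uint32_t inst;        /* instruction the guest should execute */
    int needs_magic_gpr;  /* nonzero if a magic GPR value must accompany it */
};

/*
 * With the hardware hypervisor feature (HHV), sc 0 does not trap to the
 * hypervisor, so sc 1 is used. Without it, sc 0 is used and a magic GPR
 * value tells hypercalls apart from ordinary guest system calls.
 */
static struct hcall_desc choose_hcall(int has_hhv)
{
    struct hcall_desc d;

    if (has_hhv) {
        d.inst = ppc_sc(1);
        d.needs_magic_gpr = 0;
    } else {
        d.inst = ppc_sc(0);
        d.needs_magic_gpr = 1;
    }
    return d;
}
```

The nested case mentioned above (a non-HHV hypervisor running as an HHV
guest) is what makes a fixed choice awkward: the inner hypervisor has to
advertise sc 0 plus magic GPR even though the hardware has HHV.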

-Scott