from:"Jan Kiszka"

Re: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined

2013-03-14 Thread Jan Kiszka

On 2013-03-14 13:09, Alexander Graf wrote:
 
 On 14.03.2013, at 12:57, Jan Kiszka wrote:
 
 On 2013-03-14 12:54, Alexander Graf wrote:

 On 14.03.2013, at 05:42, Bhushan Bharat-R65777 wrote:



 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 07, 2013 6:51 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined


 On 28.02.2013, at 05:13, Bharat Bhushan wrote:

 This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
 ioctl support. Follow up patches will use this for setting up hardware
 breakpoints, watchpoints and software breakpoints.

 Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
 This is because I am not sure what is required for book3s. So this
 ioctl behaviour will not change for book3s.

 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/uapi/asm/kvm.h |   23 +++
 arch/powerpc/kvm/book3s.c   |6 ++
 arch/powerpc/kvm/booke.c|6 ++
 arch/powerpc/kvm/powerpc.c  |6 --
 4 files changed, 35 insertions(+), 6 deletions(-)

 diff --git a/arch/powerpc/include/uapi/asm/kvm.h
 b/arch/powerpc/include/uapi/asm/kvm.h
 index c2ff99c..15f9a00 100644
 --- a/arch/powerpc/include/uapi/asm/kvm.h
 +++ b/arch/powerpc/include/uapi/asm/kvm.h
 @@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {

 /* for KVM_SET_GUEST_DEBUG */
 struct kvm_guest_debug_arch {
 +struct {
 +/* H/W breakpoint/watchpoint address */
 +__u64 addr;
 +/*
 + * Type denotes h/w breakpoint, read watchpoint, write
 + * watchpoint or watchpoint (both read and write).
 + */
 +#define KVMPPC_DEBUG_NOTYPE 0x0
 +#define KVMPPC_DEBUG_BREAKPOINT (1UL  1)
 +#define KVMPPC_DEBUG_WATCH_WRITE(1UL  2)
 +#define KVMPPC_DEBUG_WATCH_READ (1UL  3)
 +__u32 type;
 +__u32 reserved;
 +} bp[16];
 };

 +/* Debug related defines */
 +/*
 + * kvm_guest_debug-control is a 32 bit field. The lower 16 bits are
 +generic
 + * and upper 16 bits are architecture specific. Architecture specific
 +defines
 + * that ioctl is for setting hardware breakpoint or software breakpoint.
 + */
 +#define KVM_GUESTDBG_USE_SW_BP  0x0001
 +#define KVM_GUESTDBG_USE_HW_BP  0x0002

 You only need

 #define KVM_GUESTDBG_HW_BP 0x0001

 In absence of the flag, it's a SW breakpoint.

 We kept this for 2 reasons; 1) Same logic is applied for i386, so trying 
 to keep consistent 2) better clarity.

 Jan, was there any special reason to have 2 flags for HW/SW breakpoint on 
 x86 rather than one bit that indicates which one is used?

 Different mechanics on x86: HW goes via debug registers and shows up as
 INT1, SW is INT3 (plus guest patching done by user land).
 
 Well, the same thing goes for us. What I'm asking is whether there is a 
 specific reason (extensibility, oversight, taste, ...) that you did
 
 #define KVM_GUESTDBG_USE_SW_BP   0x0001
 #define KVM_GUESTDBG_USE_HW_BP   0x0002
 
 rather than
 
 #define KVM_GUESTDBG_BP_TYPE  0x0001
 #define KVM_GUESTDBG_BP_TYPE_SW   0x0001
 #define KVM_GUESTDBG_BP_TYPE_HW   0x
 
 :)

Those bits enable or disable the features separately. You may also leave
both off if you like (and just use single stepping).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/7] KVM: PPC: debug stub interface parameter defined

2013-03-14 Thread Jan Kiszka

On 2013-03-14 13:19, Alexander Graf wrote:
 
 On 14.03.2013, at 13:13, Jan Kiszka wrote:
 
 On 2013-03-14 13:09, Alexander Graf wrote:

 On 14.03.2013, at 12:57, Jan Kiszka wrote:

 On 2013-03-14 12:54, Alexander Graf wrote:

 On 14.03.2013, at 05:42, Bhushan Bharat-R65777 wrote:



 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de]
 Sent: Thursday, March 07, 2013 6:51 PM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; 
 Bhushan
 Bharat-R65777
 Subject: Re: [PATCH 3/7] KVM: PPC: debug stub interface parameter 
 defined


 On 28.02.2013, at 05:13, Bharat Bhushan wrote:

 This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
 ioctl support. Follow up patches will use this for setting up hardware
 breakpoints, watchpoints and software breakpoints.

 Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
 This is because I am not sure what is required for book3s. So this
 ioctl behaviour will not change for book3s.

 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/uapi/asm/kvm.h |   23 +++
 arch/powerpc/kvm/book3s.c   |6 ++
 arch/powerpc/kvm/booke.c|6 ++
 arch/powerpc/kvm/powerpc.c  |6 --
 4 files changed, 35 insertions(+), 6 deletions(-)

 diff --git a/arch/powerpc/include/uapi/asm/kvm.h
 b/arch/powerpc/include/uapi/asm/kvm.h
 index c2ff99c..15f9a00 100644
 --- a/arch/powerpc/include/uapi/asm/kvm.h
 +++ b/arch/powerpc/include/uapi/asm/kvm.h
 @@ -272,8 +272,31 @@ struct kvm_debug_exit_arch {

 /* for KVM_SET_GUEST_DEBUG */
 struct kvm_guest_debug_arch {
 +  struct {
 +  /* H/W breakpoint/watchpoint address */
 +  __u64 addr;
 +  /*
 +   * Type denotes h/w breakpoint, read watchpoint, write
 +   * watchpoint or watchpoint (both read and write).
 +   */
 +#define KVMPPC_DEBUG_NOTYPE   0x0
 +#define KVMPPC_DEBUG_BREAKPOINT   (1UL  1)
 +#define KVMPPC_DEBUG_WATCH_WRITE  (1UL  2)
 +#define KVMPPC_DEBUG_WATCH_READ   (1UL  3)
 +  __u32 type;
 +  __u32 reserved;
 +  } bp[16];
 };

 +/* Debug related defines */
 +/*
 + * kvm_guest_debug-control is a 32 bit field. The lower 16 bits are
 +generic
 + * and upper 16 bits are architecture specific. Architecture specific
 +defines
 + * that ioctl is for setting hardware breakpoint or software 
 breakpoint.
 + */
 +#define KVM_GUESTDBG_USE_SW_BP0x0001
 +#define KVM_GUESTDBG_USE_HW_BP0x0002

 You only need

 #define KVM_GUESTDBG_HW_BP 0x0001

 In absence of the flag, it's a SW breakpoint.

 We kept this for 2 reasons; 1) Same logic is applied for i386, so trying 
 to keep consistent 2) better clarity.

 Jan, was there any special reason to have 2 flags for HW/SW breakpoint on 
 x86 rather than one bit that indicates which one is used?

 Different mechanics on x86: HW goes via debug registers and shows up as
 INT1, SW is INT3 (plus guest patching done by user land).

 Well, the same thing goes for us. What I'm asking is whether there is a 
 specific reason (extensibility, oversight, taste, ...) that you did

#define KVM_GUESTDBG_USE_SW_BP   0x0001
#define KVM_GUESTDBG_USE_HW_BP   0x0002

 rather than

#define KVM_GUESTDBG_BP_TYPE 0x0001
#define KVM_GUESTDBG_BP_TYPE_SW  0x0001
#define KVM_GUESTDBG_BP_TYPE_HW  0x

 :)

 Those bits enable or disable the features separately. You may also leave
 both off if you like (and just use single stepping).
 
 Ah, so these are global configuration bits, not per-breakpoint configuration?

Yes, the are meant for kvm_guest_debug.control on x86. I see that this
is apparently different for ppc. Those bits you cited just control the
general enabling of hard or soft BPs, not the activation of individual
one. That is encoded into the BP registers on x86.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-27 Thread Jan Kiszka

On 2012-10-27 00:03, Benjamin Herrenschmidt wrote:
 On Sat, 2012-10-27 at 07:45 +1100, Benjamin Herrenschmidt wrote:
 On Fri, 2012-10-26 at 14:39 +0200, Jan Kiszka wrote:

 But we are just talking about sending messages from A to B or soldering
 an input to an output pin. That's pretty generic. Give each output event
 a virtual IRQ number and define where its output line should be linked
 to (input pin of target controller). All what will be specific are the
 IDs of those controllers.

 Hrm you seem to be saying something very different from Paolo here.
 Unless it's just a very very confused terminology.

 So let's see the powerpc pseries case. Things like embedded etc...
 might be quite different.
 
 So I had a chat with Anthony who explained to me a bit more about what
 the x86 stuff is about. It's pretty horrible I must say :-)
 
 So correct me if I'm wrong but you essentially have to differentiate
 between MSI outputs and other (GSI) outputs due to the fact that
 MSIs in x86 land don't act as normal interrupts going through a source
 controller but instead get shot directly to the target CPU.
 
 Then you have to establish some kind of routing from those GSIs to
 some IO/APIC, and from MSIs to local APICs.

I'm afraid there are still some misconceptions about what is happening
on x86 and what role the bits play that are more generic.

The fact that we can inject MSI messages directly to the target APIC
doesn't affect the need to have IRQ routing support. That is used for
two reason on x86:

 - define the wiring from a classic IRQ line to the various (legacy) IRQ
   controllers we have, namely the IOAPIC and the PIC
 - define the IRQ input that should be generated when an irqfd triggers
   (that's currently just irqfd-MSI associations, but irqfd-irqchip
   may come as well for vfio)

 
 That's where I think there is a fairly fundamental difference with us.
 
 So let's cut that problem in two. The GSI bit and the MSI bit. The
 reason is that the way x86 does MSIs seems to be fairly x86 specific, I
 wouldn't be surprised if everybody else did MSIs like we do them, that
 is turn them into normal interrupts (ie, GSIs). But let's discuss that
 below.
 
 So the GSI bit. We can assume that GSIs in that context are basically
 our global interrupt number. This would apply to pretty much every
 platform indeed.
 
 The routing here, if I understand things correctly, consists of
 associating such a global interrupt number with a specific input pin (or
 virtual pin) of a specific source controller (ie, IO APIC).

...or PIC or whatever you have on your platform.

 
 This would generally make sense in embedded space as well I suppose,
 where you can have multiple or even cascaded interrupt controllers of
 different breeds etc...
 
 However, in the pseries system, this routing is essentially encoded in
 the interrupt number itself. As I think I explained earlier, the number
 is arbitrarily split in two parts, the top bits indicating the source
 controller and the bottom bits the source within that controller. In
 qemu/kvm we have made an arbitrary split (whose size I don't remember
 precisely) and we currently create only one fairly big source controller
 but we might change that in the future.
 
 This there is no such thing as needing to associate or create routing
 entries here. qemu will directly shoot GSIs using an ioctl and our
 code can directly map that to a source controller without any routing
 table of any sort. In fact, adding one would complicate things since
 we'd have a requirement that it's populated 1:1 or thing would get very
 confused indeed so overall, there's no point for us to implement or use
 that API or the generic code behind it, it would be pure bloat,
 complication and problems.

OK, that puts the IRQ injection IOCTL on pseries in the same category as
KVM_SIGNAL_MSI on x86. Both require no routing as the target address is
fully encoded and not otherwise dynamically remapped in the kernel.

 
 However, making  that code more generic might make sense for other
 platforms (including other powerpc platforms such as embedded) where
 multiple interrupt controllers may exist though here too, it's probably
 going to be fairly common that the GSI numbers are essentially be a bit
 field split with entire ranges assigned to a given PIC. We don't have to
 emulate x86+ACPI ability to individually remap interrupts.

In KVM, a GSI is an index to its central IRQ routing table where the
target IRQ controller or MSI message is defined. Other interpretations
of the number passed down the IRQ injection IOCTL are of course possible
(like you do on pseries), but then it becomes arch-specific.

 
 The case if MSIs now. My understanding from what Anthony says is that
 your MSIs essentially bypass the IO APIC and route directly to the local
 APIC, which is equivalent to our presentation controller. You thus need
 specific APIs to associate an MSI (which isn't a GSI) to as specific
 local APIC.

First part is right

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-26 Thread Jan Kiszka

On 2012-10-26 12:15, Paolo Bonzini wrote:
 Il 26/10/2012 12:09, Peter Maydell ha scritto:

 The other problem is configuring the redirection table.  If you need 64
 sources you need ioctls like KVM_GET/SET_IRQCHIP_ONE_REG.
 Why would you want an extra ONE_REG-like ioctl? The existing ONE_REG
 ioctls have plenty of space in the ID range to allow you to devote
 a subsection of it to your irqchip. (This is exactly how the ARM
 VGIC save/load is going to work.)
 
 Ok, I stand corrected. :)
 
 Whether you want to do startup configuration and board wiring via
 the same ioctl that handles runtime state save/load/migration is
 a different question, of course.
 
 QEMU's MSI-X routing is not x86-specific, so it should use the same
 KVM_SET_GSI_ROUTING ioctl that x86 uses.

And it's not only MSI[-X]. Most IRQ sources need to be rounted, either
from userspace or from irqfd or from some other in-kernel source to a
specific IRQ controller. That allows to customize things according to a
specific board / SoC emulation.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-26 Thread Jan Kiszka

On 2012-10-26 13:39, Benjamin Herrenschmidt wrote:
 On Fri, 2012-10-26 at 12:17 +0100, Peter Maydell wrote:
 Well, that's the thing, I haven't managed to figure that out so far,
 it
 looks very x86-specific to me. To begin with there's no such thing
 as a
 GSI in our world.

 This was roughly the feeling I had looking at these APIs. There
 might be some underlying generic concept but there is a definite
 tendency for the surface representation to use x86 specific
 terminology to the extent that you can't tell whether an API
 is x86 specific or merely apparently so...
 
 Right. Which is why I'm sure I'm actually missing something there :-)
 And I'm hoping Paolo and Jan will help shed some light.
 
 It might help if somebody could explain a bit more what a GSI is in x86
 land and how it relates to the various APICs, along with what exactly
 they mean by routing , ie. what are the different elements that get
 associated. Basically, if somebody could describe how the x86 stuff
 works, that might help.
 
 From my view of things, we have various sources of interrupts. On my
 list are emulated device LSIs, emulated device MSIs, both in qemu, then
 vhost and finally pass-through, I suppose on some platforms IPIs come in
 as well though. Those sources need, one way or another, to hit a
 source controller which will then itself, in a platform specific way,
 shoot the interrupt to a presentation controller.
 
 The routing between source and presentation controllers is fairly
 platform specific as far as I can tell even within a given CPU family.
 Ie the way an OpenPIC (aka MPIC, used on macs) does it is different than
 the way the XICS system does it on pseries, and is different from most
 embedded stuff (which typically doesn't have that source/presentation
 distinction but just cascaded dumber PICs). The amount of
 configurability, the type of configuration information etc... that
 governs such a layout is also very specific to the platform and the type
 of interrupt controller system used on it.

But we are just talking about sending messages from A to B or soldering
an input to an output pin. That's pretty generic. Give each output event
a virtual IRQ number and define where its output line should be linked
to (input pin of target controller). All what will be specific are the
IDs of those controllers.

Of course, all that provided you do their emulation in kernel space. For
x86, that even makes sense when the IRQ sources are in user space as the
guest may still have to interact during IRQ delivery with IOAPIC, thus
we save some costly heavy-weight exits when putting it in the kernel.

 
 Remains the routing between source of events and actual inputs to
 a source controller.
 
 This too doesn't seem totally obvious to generalize. For example an
 embedded platform with a bunch of cascaded dumb interrupt controllers
 doesn't have a concept of a flat number space in HW, an interrupt
 input to be identified properly, needs to identify the controller and
 the interrupt within that controller. However, within KVM/qemu, it's
 pretty easy to assign to each controller a number and by collating the
 two, get some kind of flat space, though it's not arbitrary and the
 routing is thus fairly constrained if not totally fixed.

IRQ routing entry:
 - virq number (gsi)
 - type (controller ID, MSI, whatever you like)
 - some flags (to extend it)
 - type-specific data (MSI message, controller input pin, etc.)

And there can be multiple entries with the same virq, thus you can
deliver to multiple targets. I bet you can model quite a lot of your
platform specific routing this way. I'm not saying our generic code will
work out of the box, but at least the interfaces and concepts are there.

 
 In the pseries case, the global number is split in two bit fields, the
 BUID identifying the specific source controller and the source within
 that controller. Here too it's fairly fixed though. So the ioctl we use
 to create a source controller in the kernel takes the BUID as an
 argument, and from there the kernel will find the right source
 controller based solely on the interrupt number.
 
 So basically on one side we have a global interrupt number that
 identifies an input, I assume that's what x86 calls a GSI ?

Right. The virtual IRQ numbers we call GSI is partially occupied by
the actual x86-GSIs (0..n, with n=23 so far), directed to the IOAPIC and
PIC there, and then followed by IRQs that are mapped on MSI messages.
But that's just how we _use_ it on x86, not how it has to work for other
archs.

 
 Remains how to associate the various sources of interrupts to that
 'global number'... and that is fairly specific to each source type isn't
 it ?
 
 In our current powerpc code, the emulated devices toggle the qirq which
 ends up shooting an ioctl to set/reset or message (for MSIs) the
 corresponding global interrupt. The mapping is established entirely
 within qemu, we just tell the kernel to trigger a given interrupt.
 
 We haven't

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-26 Thread Jan Kiszka

On 2012-10-26 14:08, Peter Maydell wrote:
 On 26 October 2012 12:57, Paolo Bonzini pbonz...@redhat.com wrote:
 If you exclude old-style PCI pass-through and limit yourself to vhost
 and VFIO, you can treat irqfd as the in-kernel source of the
 interrupt.  Then you need a mapping between MSIs and numbers used in
 KVM_IRQFD (GSIs).

 This is what KVM_SET_GSI_ROUTING modifies, and basically the mapping is
 modified every time a vector is masked/unmasked in the MSI-X table.
 
 So SET_GSI_ROUTING sets the routing for MSIs? Very logical...

See my reply to Ben: It is used for MSIs as well, but not only. The
concept is absolutely generic, you just need to define specific target
types and provide ways to associate specific sources with a virtual IRQ
number (GSI).

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-25 Thread Jan Kiszka

On 2012-10-24 02:50, Paul Mackerras wrote:
 On Tue, Oct 23, 2012 at 12:48:28PM +0200, Jan Kiszka wrote:
 
 The current irqchip API is like this:

 KVM_CREATE_IRQCHIP (without any parameters)
 ...
 KVM_CREATE_VCPU
 KVM_SET_IRQCHIP (or the other way around)
 ...
 KVM_RUN

 The arguments you cannot pass via KVM_CREATE_IRQCHIP - which is more
 like a Hey, there will be an IRQ chip! - could be passed via
 KVM_SET_IRQCHIP (it has 512 bytes space). Provided there are sane
 configuration defaults, at least after KVM_CREATE_VCPU, KVM_SET_IRQCHIP
 becomes optional. Then you don't need the a check on KVM_RUN.
 
 Interesting.  How many times do you call KVM_CREATE_IRQCHIP per VM?
 Just once?

Yes, just once. The model is that we switch between user space and
kernel space emulation (or hardware-assisted virtualization) of the
irqchip(s).

 
 What do we need in addition to this in any of the non-x86 archs?
 
 What we have with the XICS, and to some extent with the OpenPIC, is a
 separation between source and presentation controllers, with a
 message-passing fabric between them.  The source controllers handle
 the details of some number of interrupt sources, such as the priority
 and destination of each interrupt source, and the presentation
 controllers handle the interface to the CPUs, so there is one
 presentation controller per CPU.  The presentation controller for a
 CPU has registers for the CPU priority, IPI request priority, and
 pending interrupt status.
 
 So we could indeed use the existing KVM_CREATE_IRQCHIP to tell KVM to
 create a presentation controller per vcpu.  But then how do we tell
 KVM how many source controllers we want and how many interrupts each
 source controller should handle?  The source controllers are not tied
 to any particular vcpu, and a source controller could potentially have
 100s of interrupts or more (particularly with MSIs).  Configuration of
 each source fits into 64 bits, so if we tried to use KVM_SET_IRQCHIP
 for configuring a source controller we'd be limited to 64 interrupts
 per source controller.
 
 What I have in my patches to do XICS emulation in the kernel is a new
 KVM_CREATE_IRQCHIP_ARGS ioctl, which takes an argument struct with a
 type, and for source controllers, an identifying number (bus unit ID
 or BUID, since that's what the hardware calls it) and the number of
 sources.  Then for saving/restoring the presentation controller state
 there's a register identifier for the KVM_GET/SET_ONE_REG ioctls, and
 for the source controllers there are new KVM_IRQCHIP_GET/SET_SOURCES
 ioctls that take an argument struct like this:
 
 struct kvm_irq_sources {
   __u32 start_irq_number;
   __u32 nr_irqs;
   __u64 *irqbuf;
 };
 
 OpenPIC also can handle 100s or 1000s of interrupt sources and can
 have the sources divided up into blocks (which tends to make it
 desirable to have multiple source controllers).  It also has per-CPU
 interrupt delivery registers and per-source interrupt source
 registers.
 
 So I think all this could be shoehorned into KVM_CREATE/GET/SET_IRQCHIP
 for small configurations, but it seems like it would run out of space
 for larger configurations.

Our architectures are not that different. We'll have the same challenge
on x86 one day as well as there can be several IOAPICs (source
controllers), not just one as today. Those should be addressed via
chip_id of struct kvm_irqchip (we have a 32-bit address space there).

Also there the question is when to instantiate the chips. Without adding
another IOCTL, they could be created on first SET_IRQCHIP. For Power,
the number of IRQ lines can become a set-once field in the source
controller state, i.e. must never be written twice with different values.

But, of course, some KVM_CREATE_IRQCHIP[2|_ARGS] that takes extra
arguments and specifies those details is also be an option. There the
question is how often it should be called: once with a list of all
necessary parameters or multiple times as in your model. As I'd like to
see a new IOCTL being able to replace the old one (though we will still
support it for older user space, of course), I'm leaning more toward the
first option.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-25 Thread Jan Kiszka

On 2012-10-25 18:14, Paolo Bonzini wrote:
 Il 24/10/2012 02:50, Paul Mackerras ha scritto:
 So we could indeed use the existing KVM_CREATE_IRQCHIP to tell KVM to
 create a presentation controller per vcpu.  But then how do we tell
 KVM how many source controllers we want and how many interrupts each
 source controller should handle?  The source controllers are not tied
 to any particular vcpu, and a source controller could potentially have
 100s of interrupts or more (particularly with MSIs).  Configuration of
 each source fits into 64 bits, so if we tried to use KVM_SET_IRQCHIP
 for configuring a source controller we'd be limited to 64 interrupts
 per source controller.
 
 As Jan said, there's more than a few similarities between the x86 and
 PPC model.
 
 Taking inspiration from the x86 API, configuring the source controller
 seems to be more of a task for KVM_SET_GSI_ROUTING (a very x86-centric
 name, I must admit).

For wiring things together, I agree. But the IOAPIC has a fixed number
of input lines per chip, Power needs to configure them. I don't think
deriving this from addressed lines is the best solution. Some lines may
be configured (too) late, when the chip should have defined its
configuration already.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-23 Thread Jan Kiszka

On 2012-10-18 15:48, Christoffer Dall wrote:
 On Wed, Oct 17, 2012 at 7:58 PM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:
 On Thu, 2012-10-18 at 09:10 +1100, Paul Mackerras wrote:

 With the XICS, there are two types of irqchip: a source controller and
 a presentation controller.  There is one presentation controller per
 vcpu and typically one source controller per PCI host bridge (a source
 controller can manage multiple sources).  The buid above is
 basically an identifier for a source controller.

 So with the above, it would be quite easy to add new types and
 arguments for them.

 The only possible issue is that afiak, the ioctl number depends on the
 structure size, no ? If it does, then we should add some padding to the
 union to leave room for new types.

 It sounds overall ok to me, however, Peter Maydell pointed out that
 QEMU is architected in such a way that creating the IRQ chip happens
 before QEMU knows how the chip fits with the model it is emulating and
 therefore lacks bits of information that we require for initializing
 the irq chip.
 
 Specifically on ARM the information needed is the base address of a
 hardware peripheral, which is virtualization aware, and is mapped
 directly into the guest's physical address space.
 
 One could argue that's not a concern for designing a good kernel api,
 but on the other hand, qemu is already quite a large system on its
 own, and we have to be practical.
 
 Another alternative is to just let qemu init the device, later give
 the base address and check before each execution of the vcpu whether
 the base address has been set and simply return an error if not. This
 latter approach just seems horrible and unintuitive to me.
 
 Thoughts?

The current irqchip API is like this:

KVM_CREATE_IRQCHIP (without any parameters)
...
KVM_CREATE_VCPU
KVM_SET_IRQCHIP (or the other way around)
...
KVM_RUN

The arguments you cannot pass via KVM_CREATE_IRQCHIP - which is more
like a Hey, there will be an IRQ chip! - could be passed via
KVM_SET_IRQCHIP (it has 512 bytes space). Provided there are sane
configuration defaults, at least after KVM_CREATE_VCPU, KVM_SET_IRQCHIP
becomes optional. Then you don't need the a check on KVM_RUN.

What do we need in addition to this in any of the non-x86 archs?

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-23 Thread Jan Kiszka

On 2012-10-23 12:52, Peter Maydell wrote:
 On 23 October 2012 11:48, Jan Kiszka jan.kis...@siemens.com wrote:
 The current irqchip API is like this:

 KVM_CREATE_IRQCHIP (without any parameters)
 ...
 KVM_CREATE_VCPU
 KVM_SET_IRQCHIP (or the other way around)
 ...
 KVM_RUN

 The arguments you cannot pass via KVM_CREATE_IRQCHIP - which is more
 like a Hey, there will be an IRQ chip! - could be passed via
 KVM_SET_IRQCHIP (it has 512 bytes space). Provided there are sane
 configuration defaults, at least after KVM_CREATE_VCPU, KVM_SET_IRQCHIP
 becomes optional. Then you don't need the a check on KVM_RUN.
 
 KVM_SET_IRQCHIP is the state load ioctl, right (companion to
 KVM_GET_IRQCHIP)? It seems like a bit of an abuse to use that
 for configuration rather than for migration/sync of state
 with userspace...

Depends on how reconfigurable your chip is. x86 also sends the mapping
down this way, which happens to be guest configurable but is practically
static.

 
 (For ARM we will just use the ONE_REG ABI for irqchip state
 save/load anyway, so KVM_SET_IRQCHIP isn't relevant.)

The less problems you should have using the SET interface to perform
one-time configuration.

BTW, I guess we will regret that one-reg ABI one day and have to
introduce a multi-reg version again for hot-standby, i.e. continuous
state migration. I know we also do this for c86 MSRs - that interface
has the same limitation.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmarm] [RFC PATCH 0/3] KVM: ARM: Get rid of hardcoded VGIC addresses

2012-10-23 Thread Jan Kiszka

On 2012-10-23 13:04, Peter Maydell wrote:
 On 23 October 2012 12:00, Jan Kiszka jan.kis...@siemens.com wrote:
 BTW, I guess we will regret that one-reg ABI one day and have to
 introduce a multi-reg version again for hot-standby, i.e. continuous
 state migration. I know we also do this for c86 MSRs - that interface
 has the same limitation.
 
 The multi-reg ABI idea has been floated, it would be easy enough to add.
 We just don't want to tie up getting ARM KVM support in with arguments
 over yet another ABI -- ONE_REG is sufficient for our current
 purposes.

Good that the number of IOCTLs we can encode is still large. ;)

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support

2012-09-25 Thread Jan Kiszka

On 2012-09-24 16:46, Alexander Graf wrote:
 
 On 07.09.2012, at 00:56, Scott Wood wrote:
 
 On 09/06/2012 09:56 AM, Bhushan Bharat-R65777 wrote:


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, September 06, 2012 4:57 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Bhushan 
 Bharat-
 R65777
 Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support

 On 09/05/2012 06:23 PM, Scott Wood wrote:
 On 08/21/2012 08:52 AM, Bharat Bhushan wrote:
 This patch adds the debug stub support on booke/bookehv.
 Now QEMU debug stub can use hw breakpoint, watchpoint and software
 breakpoint to debug guest.

 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/asm/kvm.h|   29 ++-
 arch/powerpc/include/asm/kvm_host.h   |5 +
 arch/powerpc/kernel/asm-offsets.c |   26 ++
 arch/powerpc/kvm/booke.c  |  144 
 +--
 --
 arch/powerpc/kvm/booke_interrupts.S   |  110 +
 arch/powerpc/kvm/bookehv_interrupts.S |  141
 +++-
 arch/powerpc/kvm/e500mc.c |3 +-
 7 files changed, 435 insertions(+), 23 deletions(-)

 diff --git a/arch/powerpc/include/asm/kvm.h
 b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644
 --- a/arch/powerpc/include/asm/kvm.h
 +++ b/arch/powerpc/include/asm/kvm.h
 @@ -25,6 +25,7 @@
 /* Select powerpc specific features in linux/kvm.h */  #define
 __KVM_HAVE_SPAPR_TCE  #define __KVM_HAVE_PPC_SMT
 +#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
  __u64 pc;
 @@ -264,7 +265,31 @@ struct kvm_fpu {
  __u64 fpr[32];
 };

 +
 +/*
 + * Defines for h/w breakpoint, watchpoint (read, write or both) and
 + * software breakpoint.
 + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
 + * for KVM_DEBUG_EXIT.
 + */
 +#define KVMPPC_DEBUG_NONE   0x0
 +#define KVMPPC_DEBUG_BREAKPOINT (1UL  1)
 +#define KVMPPC_DEBUG_WATCH_WRITE(1UL  2)
 +#define KVMPPC_DEBUG_WATCH_READ (1UL  3)
 struct kvm_debug_exit_arch {

 That says arch, but it's not in an arch-specific file.

 Sigh, I can't read today apparently.

 +__u64 pc;
 +/*
 + * exception - returns the exception number. If the 
 KVM_DEBUG_EXIT
 + * exit is not handled (say not h/w breakpoint or software 
 breakpoint
 + * set for this address) by qemu then it is supposed to inject 
 this
 + * exception to guest.
 + */
 +__u32 exception;
 +/*
 + * exiting to userspace because of h/w breakpoint, watchpoint
 + * (read, write or both) and software breakpoint.
 + */
 +__u32 status;
 };

 What does exception number mean in a generic API?

 Still, exception number is not a well-defined concept powerpc-wide.

 Just for background why we added is that, on x86 this exception number is 
 used to inject the exception to guest if QEMU is not able to handle the 
 debug exception.

 Should we just through a print with clearing the exception condition? Or 
 something else you would like to suggest?

 We can pass up the exception type; it just needs more documentation
 about what exactly you're referring to, and probably some enumeration
 that says which exception numberspace it is.

 For booke the exception number should probably be related to the fixed
 offsets rather than the IVOR number, as IVORs are phased out.
 
 Jan, I would like to get your comment on this one.
 
 Since we don't have standardized exception vectors like x86 does, we need to 
 convert things between different semantics in user space if we want to make 
 use of the exception type. Do we actually need to know about it in user space 
 or do we only need to store it in case we get a migration at that point?
 
 If it's the latter, can we maybe keep the reinjection logic internal to KVM 
 and make DEBUG exits non-migratable similar to how we already handle MMIO/PIO 
 exits today?

Reinjection depends on userspace. It has to check if the debug event was
related to the host debugging the guest or if it was due to a
guest-internal debugging event. That's because the kernel does not track
individual breakpoints, just controls the trapping. So we need a way to
manage the reinjection, and we do this (on x86) by passing the exception
up and then using it for SET_VCPU_EVENTS to reinject it if necessary.

The general logic (reinjection filter in userspace) should be generic
enough, but all the details can, of course, be defined in an
arch-specific manner.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support

2012-09-25 Thread Jan Kiszka

On 2012-09-25 12:58, Alexander Graf wrote:
 
 On 25.09.2012, at 12:56, Jan Kiszka wrote:
 
 On 2012-09-25 12:47, Alexander Graf wrote:

 On 25.09.2012, at 12:38, Jan Kiszka wrote:

 On 2012-09-24 16:46, Alexander Graf wrote:

 On 07.09.2012, at 00:56, Scott Wood wrote:

 On 09/06/2012 09:56 AM, Bhushan Bharat-R65777 wrote:


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, September 06, 2012 4:57 AM
 To: Bhushan Bharat-R65777
 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; 
 Bhushan Bharat-
 R65777
 Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support

 On 09/05/2012 06:23 PM, Scott Wood wrote:
 On 08/21/2012 08:52 AM, Bharat Bhushan wrote:
 This patch adds the debug stub support on booke/bookehv.
 Now QEMU debug stub can use hw breakpoint, watchpoint and software
 breakpoint to debug guest.

 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
 ---
 arch/powerpc/include/asm/kvm.h|   29 ++-
 arch/powerpc/include/asm/kvm_host.h   |5 +
 arch/powerpc/kernel/asm-offsets.c |   26 ++
 arch/powerpc/kvm/booke.c  |  144 
 +--
 --
 arch/powerpc/kvm/booke_interrupts.S   |  110 
 +
 arch/powerpc/kvm/bookehv_interrupts.S |  141
 +++-
 arch/powerpc/kvm/e500mc.c |3 +-
 7 files changed, 435 insertions(+), 23 deletions(-)

 diff --git a/arch/powerpc/include/asm/kvm.h
 b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644
 --- a/arch/powerpc/include/asm/kvm.h
 +++ b/arch/powerpc/include/asm/kvm.h
 @@ -25,6 +25,7 @@
 /* Select powerpc specific features in linux/kvm.h */  #define
 __KVM_HAVE_SPAPR_TCE  #define __KVM_HAVE_PPC_SMT
 +#define __KVM_HAVE_GUEST_DEBUG

 struct kvm_regs {
  __u64 pc;
 @@ -264,7 +265,31 @@ struct kvm_fpu {
  __u64 fpr[32];
 };

 +
 +/*
 + * Defines for h/w breakpoint, watchpoint (read, write or both) and
 + * software breakpoint.
 + * These are used as type in KVM_SET_GUEST_DEBUG ioctl and 
 status
 + * for KVM_DEBUG_EXIT.
 + */
 +#define KVMPPC_DEBUG_NONE   0x0
 +#define KVMPPC_DEBUG_BREAKPOINT (1UL  1)
 +#define KVMPPC_DEBUG_WATCH_WRITE(1UL  2)
 +#define KVMPPC_DEBUG_WATCH_READ (1UL  3)
 struct kvm_debug_exit_arch {

 That says arch, but it's not in an arch-specific file.

 Sigh, I can't read today apparently.

 +__u64 pc;
 +/*
 + * exception - returns the exception number. If the 
 KVM_DEBUG_EXIT
 + * exit is not handled (say not h/w breakpoint or software 
 breakpoint
 + * set for this address) by qemu then it is supposed to inject 
 this
 + * exception to guest.
 + */
 +__u32 exception;
 +/*
 + * exiting to userspace because of h/w breakpoint, watchpoint
 + * (read, write or both) and software breakpoint.
 + */
 +__u32 status;
 };

 What does exception number mean in a generic API?

 Still, exception number is not a well-defined concept powerpc-wide.

 Just for background why we added is that, on x86 this exception number 
 is used to inject the exception to guest if QEMU is not able to handle 
 the debug exception.

 Should we just through a print with clearing the exception condition? 
 Or something else you would like to suggest?

 We can pass up the exception type; it just needs more documentation
 about what exactly you're referring to, and probably some enumeration
 that says which exception numberspace it is.

 For booke the exception number should probably be related to the fixed
 offsets rather than the IVOR number, as IVORs are phased out.

 Jan, I would like to get your comment on this one.

 Since we don't have standardized exception vectors like x86 does, we need 
 to convert things between different semantics in user space if we want to 
 make use of the exception type. Do we actually need to know about it in 
 user space or do we only need to store it in case we get a migration at 
 that point?

 If it's the latter, can we maybe keep the reinjection logic internal to 
 KVM and make DEBUG exits non-migratable similar to how we already handle 
 MMIO/PIO exits today?

 Reinjection depends on userspace. It has to check if the debug event was
 related to the host debugging the guest or if it was due to a
 guest-internal debugging event. That's because the kernel does not track
 individual breakpoints, just controls the trapping. So we need a way to
 manage the reinjection, and we do this (on x86) by passing the exception
 up and then using it for SET_VCPU_EVENTS to reinject it if necessary.

 The general logic (reinjection filter in userspace) should be generic
 enough, but all the details can, of course, be defined in an
 arch-specific manner.

 Well, we could always just pass a payload to user space that it passes into 
 an ioctl to the kernel to reinject the interrupt. But that would break 
 migratability as this payload wouldn't get stored in env.

 Right.


 But the same thing applies

Re: [PATCH] msi/msix: added API to set MSI message address and data

2012-07-06 Thread Jan Kiszka

On 2012-07-06 17:36, Alexander Graf wrote:
 
 On 02.07.2012, at 09:24, Jan Kiszka wrote:
 
 On 2012-07-02 06:28, Alexey Kardashevskiy wrote:
 Ping?


 On 22/06/12 11:15, Alexey Kardashevskiy wrote:
 Added (msi|msix)_set_message() function for whoever might
 want to use them.

 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 written there by the guest or QEMU.

 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
 hw/msi.c  |   13 +
 hw/msi.h  |1 +
 hw/msix.c |9 +
 hw/msix.h |2 ++
 4 files changed, 25 insertions(+)

 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..cc6102f 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,19 @@ static inline uint8_t msi_pending_off(const 
 PCIDevice* dev, bool msi64bit)
 return dev-msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : 
 PCI_MSI_PENDING_32);
 }

 +void msi_set_message(PCIDevice *dev, MSIMessage msg)
 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +
 +if (msi64bit) {
 +pci_set_quad(dev-config + msi_address_lo_off(dev), msg.address);
 +} else {
 +pci_set_long(dev-config + msi_address_lo_off(dev), msg.address);
 +}
 +pci_set_word(dev-config + msi_data_off(dev, msi64bit), msg.data);
 +}
 +
 bool msi_enabled(const PCIDevice *dev)
 {
 return msi_present(dev) 
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..6ec1f99 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -31,6 +31,7 @@ struct MSIMessage {

 extern bool msi_supported;

 +void msi_set_message(PCIDevice *dev, MSIMessage msg);
 bool msi_enabled(const PCIDevice *dev);
 int msi_init(struct PCIDevice *dev, uint8_t offset,
  unsigned int nr_vectors, bool msi64bit, bool 
 msi_per_vector_mask);
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..5f7d6d3 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev, 
 unsigned vector)
 return msg;
 }

 +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
 +{
 +uint8_t *table_entry = dev-msix_table_page + vector * 
 PCI_MSIX_ENTRY_SIZE;
 +
 +pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
 +pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
 +table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] = 
 ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 +
 /* Add MSI-X capability to the config space for the device. */
 /* Given a bar and its size, add MSI-X table on top of it
  * and fill MSI-X capability in the config space.
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..26a437e 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -4,6 +4,8 @@
 #include qemu-common.h
 #include pci.h

 +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
 +
 int msix_init(PCIDevice *pdev, unsigned short nentries,
   MemoryRegion *bar,
   unsigned bar_nr, unsigned bar_size);



 Acked-by: Jan Kiszka jan.kis...@siemens.com
 
 So I take it that you expect this patch to go through my tree, not yours?

I suppose you wanted to address Michael, right? He has to decide.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] msi/msix: added API to set MSI message address and data

2012-07-02 Thread Jan Kiszka

On 2012-07-02 06:28, Alexey Kardashevskiy wrote:
 Ping?
 
 
 On 22/06/12 11:15, Alexey Kardashevskiy wrote:
 Added (msi|msix)_set_message() function for whoever might
 want to use them.

 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 written there by the guest or QEMU.

 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/msi.c  |   13 +
  hw/msi.h  |1 +
  hw/msix.c |9 +
  hw/msix.h |2 ++
  4 files changed, 25 insertions(+)

 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..cc6102f 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,19 @@ static inline uint8_t msi_pending_off(const PCIDevice* 
 dev, bool msi64bit)
  return dev-msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : 
 PCI_MSI_PENDING_32);
  }
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg)
 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +
 +if (msi64bit) {
 +pci_set_quad(dev-config + msi_address_lo_off(dev), msg.address);
 +} else {
 +pci_set_long(dev-config + msi_address_lo_off(dev), msg.address);
 +}
 +pci_set_word(dev-config + msi_data_off(dev, msi64bit), msg.data);
 +}
 +
  bool msi_enabled(const PCIDevice *dev)
  {
  return msi_present(dev) 
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..6ec1f99 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -31,6 +31,7 @@ struct MSIMessage {
  
  extern bool msi_supported;
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg);
  bool msi_enabled(const PCIDevice *dev);
  int msi_init(struct PCIDevice *dev, uint8_t offset,
   unsigned int nr_vectors, bool msi64bit, bool 
 msi_per_vector_mask);
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..5f7d6d3 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev, 
 unsigned vector)
  return msg;
  }
  
 +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
 +{
 +uint8_t *table_entry = dev-msix_table_page + vector * 
 PCI_MSIX_ENTRY_SIZE;
 +
 +pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
 +pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
 +table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] = ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 +
  /* Add MSI-X capability to the config space for the device. */
  /* Given a bar and its size, add MSI-X table on top of it
   * and fill MSI-X capability in the config space.
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..26a437e 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -4,6 +4,8 @@
  #include qemu-common.h
  #include pci.h
  
 +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
 +
  int msix_init(PCIDevice *pdev, unsigned short nentries,
MemoryRegion *bar,
unsigned bar_nr, unsigned bar_size);
 
 

Acked-by: Jan Kiszka jan.kis...@siemens.com

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] msi/msix: added functions to API to set up message address, and data

2012-06-21 Thread Jan Kiszka

On 2012-06-21 08:46, Alexey Kardashevskiy wrote:
 
 Ok, another try. Is it any better now? :)

No - posted the old version accidentally?

Jan

 
 
 Normally QEMU expects the guest to initialize MSI/MSIX vectors.
 However on POWER the guest uses RTAS subsystem to configure MSI/MSIX and
 does not write these vectors to device's config space or MSIX BAR.
 
 On the other hand, msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so we have to write correct vectors
 to the devices in order not to change every user of MSI/MSIX.
 
 The first aim is to support MSIX for virtio-pci on POWER. There is
 another patch for POWER coming which introduces a special memory region
 where MSI/MSIX vectors point to.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/msi.c  |   14 ++
  hw/msi.h  |1 +
  hw/msix.c |8 
  hw/msix.h |3 +++
  4 files changed, 26 insertions(+)
 
 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..c7b3e6a 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -363,3 +363,17 @@ unsigned int msi_nr_vectors_allocated(const PCIDevice 
 *dev)
  uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
  return msi_nr_vectors(flags);
  }
 +
 +void msi_set_address_data(PCIDevice *dev, uint64_t address, uint16_t data)
 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +
 +if (msi64bit) {
 +pci_set_quad(dev-config + msi_address_lo_off(dev), address);
 +} else {
 +pci_set_long(dev-config + msi_address_lo_off(dev), address);
 +}
 +pci_set_word(dev-config + msi_data_off(dev, msi64bit), data);
 +}
 +
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..353386e 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -39,6 +39,7 @@ void msi_reset(PCIDevice *dev);
  void msi_notify(PCIDevice *dev, unsigned int vector);
  void msi_write_config(PCIDevice *dev, uint32_t addr, uint32_t val, int len);
  unsigned int msi_nr_vectors_allocated(const PCIDevice *dev);
 +void msi_set_address_data(PCIDevice *dev, uint64_t address, uint16_t data);
  
  static inline bool msi_present(const PCIDevice *dev)
  {
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..08e773d 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -526,3 +526,11 @@ void msix_unset_vector_notifiers(PCIDevice *dev)
  dev-msix_vector_use_notifier = NULL;
  dev-msix_vector_release_notifier = NULL;
  }
 +void msix_set_address_data(PCIDevice *dev, int vector,
 +   uint64_t address, uint32_t data)
 +{
 +uint8_t *table_entry = dev-msix_table_page + vector * 
 PCI_MSIX_ENTRY_SIZE;
 +pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, address);
 +pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, data);
 +table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] = ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..901f101 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -35,4 +35,7 @@ int msix_set_vector_notifiers(PCIDevice *dev,
MSIVectorUseNotifier use_notifier,
MSIVectorReleaseNotifier release_notifier);
  void msix_unset_vector_notifiers(PCIDevice *dev);
 +void msix_set_address_data(PCIDevice *dev, int vector,
 +   uint64_t address, uint32_t data);
 +
  #endif
 


-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] msi/msix: added public API to set/get MSI message address, and data

2012-06-21 Thread Jan Kiszka

On 2012-06-21 09:18, Alexey Kardashevskiy wrote:
 
 agrhhh. sha1 of the patch changed after rebasing :)
 
 
 
 Added (msi|msix)_(set|get)_message() function for whoever might
 want to use them.
 
 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 written there by the guest or QEMU.
 
 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.
 
 As only set* function are required by now, the get functions were added
 or made public for a symmetry.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/msi.c  |   29 +
  hw/msi.h  |2 ++
  hw/msix.c |   11 ++-
  hw/msix.h |3 +++
  4 files changed, 44 insertions(+), 1 deletion(-)
 
 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..9ad84a4 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,35 @@ static inline uint8_t msi_pending_off(const PCIDevice* 
 dev, bool msi64bit)
  return dev-msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : 
 PCI_MSI_PENDING_32);
  }
  
 +MSIMessage msi_get_message(PCIDevice *dev)

MSIMessage msi_get_message(PCIDevice *dev, unsigned vector)

 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +MSIMessage msg;
 +
 +if (msi64bit) {
 +msg.address = pci_get_quad(dev-config + msi_address_lo_off(dev));
 +} else {
 +msg.address = pci_get_long(dev-config + msi_address_lo_off(dev));
 +}
 +msg.data = pci_get_word(dev-config + msi_data_off(dev, msi64bit));

And I have this here in addition:

unsigned int nr_vectors = msi_nr_vectors(flags);
...

if (nr_vectors  1) {
msg.data = ~(nr_vectors - 1);
msg.data |= vector;
}

See PCI spec and existing code.

 +
 +return msg;
 +}
 +
 +void msi_set_message(PCIDevice *dev, MSIMessage msg)
 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +
 +if (msi64bit) {
 +pci_set_quad(dev-config + msi_address_lo_off(dev), msg.address);
 +} else {
 +pci_set_long(dev-config + msi_address_lo_off(dev), msg.address);
 +}
 +pci_set_word(dev-config + msi_data_off(dev, msi64bit), msg.data);
 +}
 +
  bool msi_enabled(const PCIDevice *dev)
  {
  return msi_present(dev) 
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..4b0f4f8 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -31,6 +31,8 @@ struct MSIMessage {
  
  extern bool msi_supported;
  
 +MSIMessage msi_get_message(PCIDevice *dev);
 +void msi_set_message(PCIDevice *dev, MSIMessage msg);
  bool msi_enabled(const PCIDevice *dev);
  int msi_init(struct PCIDevice *dev, uint8_t offset,
   unsigned int nr_vectors, bool msi64bit, bool 
 msi_per_vector_mask);
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..9e8d8bb 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -35,7 +35,7 @@
  #define MSIX_PAGE_PENDING (MSIX_PAGE_SIZE / 2)
  #define MSIX_MAX_ENTRIES 32
  
 -static MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
 +MSIMessage msix_get_message(PCIDevice *dev, unsigned vector)
  {
  uint8_t *table_entry = dev-msix_table_page + vector * 
 PCI_MSIX_ENTRY_SIZE;
  MSIMessage msg;
 @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev, 
 unsigned vector)
  return msg;
  }
  
 +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
 +{
 +uint8_t *table_entry = dev-msix_table_page + vector * 
 PCI_MSIX_ENTRY_SIZE;
 +
 +pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
 +pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
 +table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] = ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 +
  /* Add MSI-X capability to the config space for the device. */
  /* Given a bar and its size, add MSI-X table on top of it
   * and fill MSI-X capability in the config space.
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..3374cf8 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -4,6 +4,9 @@
  #include qemu-common.h
  #include pci.h
  
 +MSIMessage msix_get_message(PCIDevice *dev, unsigned vector);
 +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
 +
  int msix_init(PCIDevice *pdev, unsigned short nentries,
MemoryRegion *bar,
unsigned bar_nr, unsigned bar_size);
 

General remark: You will make the life of the maintainers easier by
formatting your patch in a way that a clean merge via git works without
hand-editing. E.g. this patch was no scissor line (---8--- etc.)
between introductory text and the patch description. And the subject is
not [PATCH] 

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


--
To unsubscribe from this list: send the line unsubscribe

Re: [PATCH] msi/msix: added public API to set/get MSI message address, and data

2012-06-21 Thread Jan Kiszka

On 2012-06-21 12:50, Alexey Kardashevskiy wrote:
 On 21/06/12 20:38, Jan Kiszka wrote:
 On 2012-06-21 12:28, Alexey Kardashevskiy wrote:
 On 21/06/12 17:39, Jan Kiszka wrote:
 On 2012-06-21 09:18, Alexey Kardashevskiy wrote:

 agrhhh. sha1 of the patch changed after rebasing :)



 Added (msi|msix)_(set|get)_message() function for whoever might
 want to use them.

 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 written there by the guest or QEMU.

 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.

 As only set* function are required by now, the get functions were added
 or made public for a symmetry.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/msi.c  |   29 +
  hw/msi.h  |2 ++
  hw/msix.c |   11 ++-
  hw/msix.h |3 +++
  4 files changed, 44 insertions(+), 1 deletion(-)

 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..9ad84a4 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,35 @@ static inline uint8_t msi_pending_off(const 
 PCIDevice* dev, bool msi64bit)
  return dev-msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : 
 PCI_MSI_PENDING_32);
  }
  
 +MSIMessage msi_get_message(PCIDevice *dev)

 MSIMessage msi_get_message(PCIDevice *dev, unsigned vector)


 Who/how/why is going to calculate the vector here?


 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +MSIMessage msg;
 +
 +if (msi64bit) {
 +msg.address = pci_get_quad(dev-config + 
 msi_address_lo_off(dev));
 +} else {
 +msg.address = pci_get_long(dev-config + 
 msi_address_lo_off(dev));
 +}
 +msg.data = pci_get_word(dev-config + msi_data_off(dev, msi64bit));

 And I have this here in addition:

 unsigned int nr_vectors = msi_nr_vectors(flags);
 ...

 if (nr_vectors  1) {
 msg.data = ~(nr_vectors - 1);
 msg.data |= vector;
 }

 See PCI spec and existing code.


 What for? I really do not get it why someone might want to read something 
 but not real value.
 What PCI code should I look?

 I'm not sure what your use case for reading the message is. For KVM
 device assignment it is preparing an alternative message delivery path
 for MSI vectors. And for this we will need vector notifier support for
 MSI as well. You can check the MSI-X code for corresponding use cases of
 msix_get_message.
 
 And when we already have msi_get_message, another logical use case is
 msi_notify. See msix.c again.
 
 Aaaa.
 
 I have no case for reading the message. All I need is writing. And I want it 
 public as I want to use
 it from hw/spapr_pci.c. You suggested to add reading, I added get to be 
 _symmetric_ to set
 (get returns what set wrote). You want a different thing which I can do 
 but it is not
 msi_get_message(), it is something like msi_prepare_message(MSImessage msg) or
 msi_set_vector(uint16_t data) or simply internal kitchen of msi_notify().
 
 Still can do what you suggested, it just does not seem right.

It is right - when looking at it from a different angle. ;)

I don't mind if you add msi_get_message now or leave this to me. Likely
the latter is better as you have no use case for msi_get_message (and
also msix_get_message!) outside of their modules, thus we should not
export those functions anyway.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] msi/msix: added API to set MSI message address and data

2012-06-21 Thread Jan Kiszka

On 2012-06-21 13:39, Alexey Kardashevskiy wrote:
 Added (msi|msix)_set_message() functions.
 
 Currently msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so the correct values have to
 written there by the guest or QEMU.
 
 For example, POWER guest never initializes MSI/MSIX vectors, instead
 it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
 POWER we have to initialize MSI/MSIX message from QEMU.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/msi.c  |   13 +
  hw/msi.h  |1 +
  hw/msix.c |9 +
  hw/msix.h |2 ++
  4 files changed, 25 insertions(+)
 
 diff --git a/hw/msi.c b/hw/msi.c
 index 5233204..cc6102f 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -105,6 +105,19 @@ static inline uint8_t msi_pending_off(const PCIDevice* 
 dev, bool msi64bit)
  return dev-msi_cap + (msi64bit ? PCI_MSI_PENDING_64 : 
 PCI_MSI_PENDING_32);
  }
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg)
 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +
 +if (msi64bit) {
 +pci_set_quad(dev-config + msi_address_lo_off(dev), msg.address);
 +} else {
 +pci_set_long(dev-config + msi_address_lo_off(dev), msg.address);
 +}
 +pci_set_word(dev-config + msi_data_off(dev, msi64bit), msg.data);
 +}
 +
  bool msi_enabled(const PCIDevice *dev)
  {
  return msi_present(dev) 
 diff --git a/hw/msi.h b/hw/msi.h
 index 75747ab..6ec1f99 100644
 --- a/hw/msi.h
 +++ b/hw/msi.h
 @@ -31,6 +31,7 @@ struct MSIMessage {
  
  extern bool msi_supported;
  
 +void msi_set_message(PCIDevice *dev, MSIMessage msg);
  bool msi_enabled(const PCIDevice *dev);
  int msi_init(struct PCIDevice *dev, uint8_t offset,
   unsigned int nr_vectors, bool msi64bit, bool 
 msi_per_vector_mask);
 diff --git a/hw/msix.c b/hw/msix.c
 index ded3c55..5f7d6d3 100644
 --- a/hw/msix.c
 +++ b/hw/msix.c
 @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev, 
 unsigned vector)
  return msg;
  }
  
 +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
 +{
 +uint8_t *table_entry = dev-msix_table_page + vector * 
 PCI_MSIX_ENTRY_SIZE;
 +
 +pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
 +pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
 +table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] = ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 +}
 +
  /* Add MSI-X capability to the config space for the device. */
  /* Given a bar and its size, add MSI-X table on top of it
   * and fill MSI-X capability in the config space.
 diff --git a/hw/msix.h b/hw/msix.h
 index 50aee82..26a437e 100644
 --- a/hw/msix.h
 +++ b/hw/msix.h
 @@ -4,6 +4,8 @@
  #include qemu-common.h
  #include pci.h
  
 +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
 +
  int msix_init(PCIDevice *pdev, unsigned short nentries,
MemoryRegion *bar,
unsigned bar_nr, unsigned bar_size);
 

Interface looks good as fas as I can tell (can't asses the POWER need
for clearing the mask bit on msix_set_message).

 -- 
 1.7.10
 
 ps. double '-' and git version is an end-of-patch scissor as I read 
 somewhere, cannot recall where exactly 

Check man git-am.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [PATCH 1/3] msi/msix: added functions to API to set up message address and data

2012-06-13 Thread Jan Kiszka

On 2012-06-14 07:17, Alexey Kardashevskiy wrote:
 On 14/06/12 14:56, Alex Williamson wrote:
 On Thu, 2012-06-14 at 14:31 +1000, Alexey Kardashevskiy wrote:
 Normally QEMU expects the guest to initialize MSI/MSIX vectors.
 However on POWER the guest uses RTAS subsystem to configure MSI/MSIX and
 does not write these vectors to device's config space or MSIX BAR.

 On the other hand, msi_notify()/msix_notify() write to these vectors to
 signal the guest about an interrupt so we have to write correct vectors
 to the devices in order not to change every user of MSI/MSIX.

 The first aim is to support MSIX for virtio-pci on POWER. There is
 another patch for POWER coming which introduces a special memory region
 where MSI/MSIX vectors point to.

 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
 ---
  hw/msi.c  |   14 ++
  hw/msi.h  |1 +
  hw/msix.c |   10 ++
  hw/msix.h |3 +++
  4 files changed, 28 insertions(+), 0 deletions(-)

 diff --git a/hw/msi.c b/hw/msi.c
 index 5d6ceb6..124878a 100644
 --- a/hw/msi.c
 +++ b/hw/msi.c
 @@ -358,3 +358,17 @@ unsigned int msi_nr_vectors_allocated(const PCIDevice 
 *dev)
  uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
  return msi_nr_vectors(flags);
  }
 +
 +void msi_set_address_data(PCIDevice *dev, uint64_t address, uint16_t data)
 +{
 +uint16_t flags = pci_get_word(dev-config + msi_flags_off(dev));
 +bool msi64bit = flags  PCI_MSI_FLAGS_64BIT;
 +
 +if (msi64bit) {
 +pci_set_quad(dev-config + msi_address_lo_off(dev), address);
 +} else {
 +pci_set_long(dev-config + msi_address_lo_off(dev), address);
 +}
 +pci_set_word(dev-config + msi_data_off(dev, msi64bit), data);
 +}

 Why not make it msi_set_message() and pass MSIMessage?  I'd be great if
 you tossed in a msi_get_message() as well, I think we need it to be able
 to do a kvm_irqchip_add_msi_route() with MSI.  Thanks,
 
 
 I am missing the point. What is that MSIMessage?
 It is just an address and data, making a struct from this is a bit too much :)

This is about consistent APIs. In practice (assembly), MSIMessage will
make no difference from open-coded address/data tuples, but it is
cleaner to pass around. Please follow the existing patterns.

 I am totally unfamiliar with kvm_irqchip_add_msi_route to see the bigger 
 picture, sorry.

The kvm_irqchip_* API come into play when you implement some interrupt
controller models in the kernel for performance reasons. We have this on
x86, we will see it on Book3E and ARM soon. You are targeting Book3S,
right? Not sure what the plans are there.

And, yes, msi_get_message() will be needed soon as well, at latest when
I push vector notifiers for classic MSI. If you have a need for reading
the message back, please add this helper already.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Virtual machine does not respond to keyboard input when guest OS idles for a while

2012-06-07 Thread Jan Kiszka

On 2012-06-07 03:26, Fei K Chen wrote:
 hi everyone,
 
 This is Fei Chen from IBM Research China. I and my team are working on 
 enabling qemu-kvm on IBM prism A2 processor.

Just a note: Don't use qemu-kvm as basis, use upstream QEMU or
qemu-kvm's uq/master branch (if you have generic kvm subsystem changes).
qemu-kvm is an x86-only fork that is fading out.

Jan



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PULL 00/20 1.2] kvm updates

2012-06-05 Thread Jan Kiszka

On 2012-06-05 10:10, Alexander Graf wrote:
 
 
 On 05.06.2012, at 09:42, Avi Kivity a...@redhat.com wrote:
 
 On 06/05/2012 04:58 AM, Anthony Liguori wrote:
 On 06/05/2012 08:52 AM, Andreas Färber wrote:
 Am 04.06.2012 07:46, schrieb Anthony Liguori:
 On 05/22/2012 12:37 AM, Avi Kivity wrote:
 Please pull from:

git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git uq/master

 Pulled.  Thanks.

 This broke the ppc build. Guys, why wasn't this tested? There's only
 three KVM targets to test compared to the 14 I'm struggling with...

 Is build bot running against uq/master?  If it's not, maybe we should
 add it to build bot to catch this sort of thing.

 A build bot is overkill.  I usually do a full build (and autotest),
 don't know how this was missed.
 
 Not sure why it would be overkill. We have build bots running on more exotic 
 hw (ppc and s390) with kvm available for that exact purpose. They're running 
 either way - all we need to do is enable uq/master and we catch these before 
 the next pull request :).

I think this makes some sense. uq/master is supposed to be the staging
branch for generic and (at least) x86 kvm changes in QEMU. So it should
cover at least their builds.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] KVM: Factor out kvm_vcpu_kick to arch-generic code

2012-02-06 Thread Jan Kiszka

On 2012-02-06 19:25, Marcelo Tosatti wrote:
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index c38efd7..a1761ff 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -2252,7 +2252,6 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
  kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
  if (vcpu-cpu != cpu)
  kvm_migrate_timers(vcpu);
 -vcpu-cpu = cpu;
  }
 
 This is wrong, kvm_sched_in fails to see vcpu-cpu properly. Please
 keep vcpu-cpu assignment in arch code.

True, but then kvm_sched_in is a better place for this assignment (as
it's central), no?

Jan



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] KVM: Factor out kvm_vcpu_kick to arch-generic code

2012-01-20 Thread Jan Kiszka

On 2012-01-20 04:22, Christoffer Dall wrote:
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index d9cfb78..cd30b68 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -153,6 +153,7 @@ void vcpu_load(struct kvm_vcpu *vcpu)
   cpu = get_cpu();
   preempt_notifier_register(vcpu-preempt_notifier);
   kvm_arch_vcpu_load(vcpu, cpu);
 + vcpu-cpu = cpu;

That means setting it in the various kvm_arch_vcpu_load handlers becomes
redundant and should be cleaned up.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: State of KVM guest debugging support on Power

2011-11-03 Thread Jan Kiszka

On 2011-11-03 19:59, Stuart Yoder wrote:
 On Tue, Nov 1, 2011 at 9:22 AM, Jan Kiszka jan.kis...@web.de wrote:
 Hi there,

 I'm generating some slides on guest debugging via kvm. What's the
 current state for Book-E and Book-S? Works out of box, mostly usable, or
 to be implemented? Is anyone using it?
 
 Are you talking about guest debug using the QEMU stub or using
 the virtual CPU's debug registers and interrupts?For Freescale
 Book E we have both approaches working, and patches to be sent upstream
 soon.

It's good to see both features coming into mainline.

I'll talk about the former, ie. debugging without guest help/awareness
(virtual hardware debugger). Is gdb well prepared for it (all registers
available, real-mode support, etc.)?

Thanks,
Jan



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 09/14] KVM: PPC: Add generic single register ioctls

2011-10-31 Thread Jan Kiszka

On 2011-10-31 14:36, Avi Kivity wrote:
 On 10/31/2011 09:53 AM, Alexander Graf wrote:
 Right now we transfer a static struct every time we want to get or set
 registers. Unfortunately, over time we realize that there are more of
 these than we thought of before and the extensibility and flexibility of
 transferring a full struct every time is limited.

 So this is a new approach to the problem. With these new ioctls, we can
 get and set a single register that is identified by an ID. This allows for
 very precise and limited transmittal of data. When we later realize that
 it's a better idea to shove over multiple registers at once, we can reuse
 most of the infrastructure and simply implement a GET_MANY_REGS / 
 SET_MANY_REGS
 interface.

 The only downpoint I see to this one is that it needs to pad to 1024 bits
 (hardware is already on 512 bit registers, so I wanted to leave some room)
 which is slightly too much for transmitting only 64 bits. But if that's all
 the tradeoff we have to do for getting an extensible interface, I'd say go
 for it nevertheless.
 
 Do we want this for x86 too?  How often do we want just one register?

On x86, a single register is probably only interesting for debugging
purposes. Things that matter (performance wise) are in kvm.run, anything
else does not worry that much about speed. At least for now. I'm still
waiting for Kemari to propose some get/set optimizations, but there is
obviously not that much to gain or still bigger fish to fry.

Also, x86 is less regular than PPC. And where it is fairly regular, we
already have a GET/SET_MANY interface: MSRs.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: qemu compiling error on ppc64: kvm.c:81: error: struct kvm_sregs has no member named pvr

2011-02-04 Thread Jan Kiszka

On 2011-02-04 15:43, Alexander Graf wrote:
 
 On 04.02.2011, at 14:25, Dushyant Bansal cs5070...@cse.iitd.ernet.in 
 wrote:
 
 Hi,
  I am trying to install kvm on ppc64 system (imac G5). I have built
 kernel with kvm module. When I try to install qemu, I am getting this
 error

 $ ./configure --enable-kvm --target-list=ppc-softmmu
 $ make

 [...]
  CCslirp/tftp.o
  CClibdis/ppc-dis.o
  GEN   config-target.h
  CCppc-softmmu/arch_init.o
  CCppc-softmmu/cpus.o
  GEN   ppc-softmmu/hmp-commands.h
  GEN   ppc-softmmu/qmp-commands.h
  CCppc-softmmu/monitor.o
  CCppc-softmmu/machine.o
  CCppc-softmmu/gdbstub.o
  CCppc-softmmu/balloon.o
  CCppc-softmmu/virtio-blk.o
  CCppc-softmmu/virtio-balloon.o
  CCppc-softmmu/virtio-net.o
  CCppc-softmmu/virtio-serial-bus.o
  CCppc-softmmu/virtio-pci.o
  CCppc-softmmu/vhost_net.o
  CCppc-softmmu/rwhandler.o
  CCppc-softmmu/kvm.o
 /home/user/project/qemu/target-ppc/kvm.c: In function ‘kvm_arch_init_vcpu’:
 /home/user/project/qemu/target-ppc/kvm.c:81: error: ‘struct kvm_sregs’ has
 no member named ‘pvr’
 
 Hrm. This means that your kernel headers in /usr/include/linux are too old. 
 Can you try and find out which kernel version they are from please?

 2.6.33. Lacking build-time KVM_CAP check? Or is such a kernel too old
anyway? Then catch it during configure and point the user to... well...
there is no kvm-kmod with kernel header updates for PowerPC. Hmm... ;)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Guest reboot interface

2011-01-07 Thread Jan Kiszka

Am 07.01.2011 13:40, Alexander Graf wrote:
 
 On 07.01.2011, at 13:08, Liu Yu-B13201 wrote:
 

 Hi all,

 When guest reboot, it's better to reset status by qemu, and then sync them 
 back to kvm
 However as you know there's no BOOKE mmu emulation in qemu,
 So that qemu cannot directly reset BOOKE TLB.
 Moreover a lot SPR registers also cannot be reset by qemu.

 As a workaround, we need a interface that qemu can notify kvm to reset the 
 status.
 My question is, what should the interface be?
 Looks like there's no alike interface in other archs.
 Should we define a ioctl command number only used by BOOKE or powerpc?
 
 Wouldn't it make more sense to just fake-implement the registers (SPRs and 
 unused TLB) and sync that back on reset? You don't have to implement the 
 actual mmu logic, but that keeps things at the same place at least.
 

This is how it's done on x86, and it's the preferred way. There will
never be such things as KVM_IOCTL_RESET.

BTW, isn't the TLB state relevant for vmsave/restore? Then it should be
made accessible for user space anyway.

Jan



signature.asc
Description: OpenPGP digital signature

Re: Software breakpoint in kvmppc guest debug

2010-11-10 Thread Jan Kiszka

Am 10.11.2010 04:39, Alexander Graf wrote:
 
 On 10.11.2010, at 04:31, Yoder Stuart-B08248 wrote:
 


 -Original Message-
 From: kvm-ppc-ow...@vger.kernel.org
 [mailto:kvm-ppc-ow...@vger.kernel.org]
 On Behalf Of Alexander Graf
 Sent: Tuesday, November 09, 2010 12:50 PM
 To: Wood Scott-B07421
 Cc: Hollis Blanchard; Liu Yu-B13201; kvm-ppc@vger.kernel.org; Jan
 Kiszka
 Subject: Re: Software breakpoint in kvmppc guest debug


 On 09.11.2010, at 19:43, Scott Wood wrote:

 On Tue, 9 Nov 2010 19:26:25 +0100
 Alexander Graf ag...@suse.de wrote:


 On 09.11.2010, at 19:17, Scott Wood wrote:

 On Tue, 9 Nov 2010 18:14:31 +0100
 Alexander Graf ag...@suse.de wrote:

 Now, if we can get away with not using an undefined instruction
 (be it
 sc 64 or trap) I don't know. I'm not even sure we can get away with
 trap.
 Basically, WARN_ON should also trigger a trap, so you'd end up in gdb
 for
 that when having a breakpoint defined.

 There'd need to be a way for qemu/gdb to have KVM reflect the
 exception to the guest if it doesn't have a breakpoint on file for
 that
 address.

 Yes, but that piece is missing for every KVM target right now. I'd
 like
 to see something generic emerge here that we can reuse across
 different
 architectures :).

 We should try to define something that will make sense on other
 architectures, to whatever extent is practical -- but that doesn't
 mean we need to wait for them to act first. :-)

 Oh, I agree. I'm saying we should have Jan in the discussion :).

 Trap seems very tricky too though. According to page 1082 of the
 2.06
 spec, trap can issue a debug or a program interrupt depending on
 MSR_DE. I
 don't see any mentioning in the spec that we intercept program or
 debug
 interrupts. So I guess we'd have to override the offset vectors and
 handle
 _every_ interrupt ourselves. Bleks.

 Program and debug interrupts always go to the hypervisor.  Only the
 exceptions for which GIVORs are defined (DSI, ISI, external IRQ,
 syscall, TLB miss) can go directly to the guest, and even those are
 generally optional based on EPCR.

 Ah, so it's a positive list. I guess I missed that part :). Thanks for
 the
 clarification.

 So making it a trap instruction for now should be ok. Let's align the
 code
 to the same x86 looks like for now. In the meanwhile, let's discuss
 with
 Jan on how to get a list of patched IPs into the kernel and implement
 that
 on top of the code we'll have aligned with x86 by then :).

 What about using the ehpriv instruction (defined in 2.06)?
 We could define a unique qemu software breakpoint, and the
 hypervisor's ehpriv handler will be able to easily distinguish them.

 On e500v2, this would be an illegal instruction.

 We don't need to worry about a conflict with future opcodes
 and it seems like the whole point of ehpriv being defined-- to let
 a hypervisor define special opcodes for stuff like this.
 
 Yes, it even states so in the little box below the description. Very good 
 catch!
 
 If we use trap, we will need the list of patched IPs in 
 the kernel.   ehpriv avoids the need for that.
 
 Well, the way it's handled on x86 now is that the kernel asks user space if 
 it knows about the breakpoint and if not user space reinjects the debug 
 interrupt into the guest.

Right. That's fine for x86 as we have a dedicated soft BP instruction.
It's shared with the guest, but as long as the latter does not start
using it heavily for whatever purpose, this lengthy loop to find the
trap reason is OK. It avoids dispatching tables in the kernel.

Still, if it actually turns out that PowerPC (or some other arch) needs
this for performance reasons as no proper instructions could be defined,
we should probably introduce a generic service to set a soft BP
dispatching table in the kernel. That could also be used by x86 then,
even though the practical improvement would be minimal there.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] KVM: Clean up vm creation and release

2010-11-10 Thread Jan Kiszka

Am 10.11.2010 10:17, Avi Kivity wrote:
 On 11/09/2010 11:01 PM, Alexander Graf wrote:
 On 09.11.2010, at 17:02, Jan Kiszka wrote:

  IA64 support forces us to abstract the allocation of the kvm structure.
  But instead of mixing this up with arch-specific initialization and
  doing the same on destruction, split both steps. This allows to move
  generic destruction calls into generic code.

  It also fixes error clean-up on failures of kvm_create_vm for IA64.

  Signed-off-by: Jan Kiszkajan.kis...@siemens.com
  ---

  Changes in v2:
  - Fixed s390 conversion and added linux/slab.h
as remarked by Christian (thanks!)

  arch/ia64/include/asm/kvm_host.h |4 
  arch/ia64/kvm/kvm-ia64.c |   28 +++-
  arch/powerpc/kvm/powerpc.c   |   20 +++-
  arch/s390/kvm/kvm-s390.c |   23 ++-
  arch/x86/kvm/x86.c   |   12 ++--
  include/linux/kvm_host.h |   15 ++-
  virt/kvm/kvm_main.c  |   19 +--
  7 files changed, 49 insertions(+), 72 deletions(-)

  diff --git a/arch/ia64/include/asm/kvm_host.h 
 b/arch/ia64/include/asm/kvm_host.h
  index 2f229e5..2689ee5 100644
  --- a/arch/ia64/include/asm/kvm_host.h
  +++ b/arch/ia64/include/asm/kvm_host.h
  @@ -590,6 +590,10 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu);
  int kvm_pal_emul(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run);
  void kvm_sal_emul(struct kvm_vcpu *vcpu);

  +#define __KVM_HAVE_ARCH_VM_ALLOC 1
  +struct kvm *kvm_arch_alloc_vm(void);
  +void kvm_arch_free_vm(struct kvm *kvm);
  +
  #endif /* __ASSEMBLY__*/

  #endif
  diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
  index f56a631..48a48bd 100644
  --- a/arch/ia64/kvm/kvm-ia64.c
  +++ b/arch/ia64/kvm/kvm-ia64.c
  @@ -749,7 +749,7 @@ out:
 return r;
  }

  -static struct kvm *kvm_alloc_kvm(void)
  +struct kvm *kvm_arch_alloc_vm(void)
  {

 struct kvm *kvm;
  @@ -760,7 +760,7 @@ static struct kvm *kvm_alloc_kvm(void)
 vm_base = __get_free_pages(GFP_KERNEL, get_order(KVM_VM_DATA_SIZE));

 if (!vm_base)
  -  return ERR_PTR(-ENOMEM);
  +  return NULL;

 memset((void *)vm_base, 0, KVM_VM_DATA_SIZE);
 kvm = (struct kvm *)(vm_base +
  @@ -806,10 +806,12 @@ static void kvm_build_io_pmt(struct kvm *kvm)
  #define GUEST_PHYSICAL_RR4 0x2739
  #define VMM_INIT_RR0x1660

  -static void kvm_init_vm(struct kvm *kvm)
  +int kvm_arch_init_vm(struct kvm *kvm)
  {
 BUG_ON(!kvm);

  +  kvm-arch.is_sn2 = ia64_platform_is(sn2);
  +
 kvm-arch.metaphysical_rr0 = GUEST_PHYSICAL_RR0;
 kvm-arch.metaphysical_rr4 = GUEST_PHYSICAL_RR4;
 kvm-arch.vmm_init_rr = VMM_INIT_RR;
  @@ -823,21 +825,8 @@ static void kvm_init_vm(struct kvm *kvm)

 /* Reserve bit 0 of irq_sources_bitmap for userspace irq source */
 set_bit(KVM_USERSPACE_IRQ_SOURCE_ID,kvm-arch.irq_sources_bitmap);
  -}
  -
  -struct  kvm *kvm_arch_create_vm(void)
  -{
  -  struct kvm *kvm = kvm_alloc_kvm();
  -
  -  if (IS_ERR(kvm))
  -  return ERR_PTR(-ENOMEM);
  -
  -  kvm-arch.is_sn2 = ia64_platform_is(sn2);
  -
  -  kvm_init_vm(kvm);
  -
  -  return kvm;

  +  return 0;
  }

  static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm,
  @@ -1357,7 +1346,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct 
 kvm_vcpu *vcpu,
 return -EINVAL;
  }

  -static void free_kvm(struct kvm *kvm)
  +void kvm_arch_free_vm(struct kvm *kvm)
  {
 unsigned long vm_base = kvm-arch.vm_base;

  @@ -1399,9 +1388,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
  #endif
 kfree(kvm-arch.vioapic);
 kvm_release_vm_pages(kvm);
  -  kvm_free_physmem(kvm);
  -  cleanup_srcu_struct(kvm-srcu);
  -  free_kvm(kvm);
  }

  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
  diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
  index 38f756f..ce3dd65 100644
  --- a/arch/powerpc/kvm/powerpc.c
  +++ b/arch/powerpc/kvm/powerpc.c
  @@ -145,18 +145,12 @@ void kvm_arch_check_processor_compat(void *rtn)
 *(int *)rtn = kvmppc_core_check_processor_compat();
  }

  -struct kvm *kvm_arch_create_vm(void)
  +int kvm_arch_init_vm(struct kvm *)

 Eh, no. This doesn't compile :).
 
 What's the problem? lack of a formal argument?

The unnamed argument is referenced as 'kvm' in this function. Will you
fix this up on merge or should I resend?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/4] kvmppc/booke: guest debug support

2010-02-03 Thread Jan Kiszka

Liu Yu wrote:
 According to user's gdb command,
 we set the corresponding debug control bits in shadow.
 
 Signed-off-by: Liu Yu yu.liu-kzfg59tc24xl57midrc...@public.gmane.org
 ---
  arch/powerpc/include/asm/kvm_ppc.h |3 +
  arch/powerpc/kvm/booke.c   |   93 
 ++--
  arch/powerpc/kvm/e500.c|8 ---
  arch/powerpc/kvm/powerpc.c |2 +-
  4 files changed, 93 insertions(+), 13 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
 b/arch/powerpc/include/asm/kvm_ppc.h
 index e264282..8918aac 100644
 --- a/arch/powerpc/include/asm/kvm_ppc.h
 +++ b/arch/powerpc/include/asm/kvm_ppc.h
 @@ -94,6 +94,9 @@ extern int kvmppc_core_emulate_op(struct kvm_run *run, 
 struct kvm_vcpu *vcpu,
  extern int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int 
 rs);
  extern int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int 
 rt);
  
 +extern int kvmppc_core_set_guest_debug(struct kvm_vcpu *vcpu,
 +   struct kvm_guest_debug *dbg);
 +
  extern int kvmppc_booke_init(void);
  extern void kvmppc_booke_exit(void);
  
 diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
 index 4d686cc..ec2722d 100644
 --- a/arch/powerpc/kvm/booke.c
 +++ b/arch/powerpc/kvm/booke.c
 @@ -267,6 +267,16 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
 kvm_vcpu *vcpu,
   break;
   }
  
 + if (unlikely(vcpu-guest_debug  KVM_GUESTDBG_ENABLE) 

This should better check for KVM_GUESTDBG_USE_SW_BP.

 +  (vcpu-arch.last_inst == KVM_INST_GUESTGDB)) {
 + run-exit_reason = KVM_EXIT_DEBUG;
 + run-debug.arch.pc = vcpu-arch.pc;
 + run-debug.arch.exception = exit_nr;
 + kvmppc_account_exit(vcpu, DEBUG_EXITS);
 + r = RESUME_HOST;
 + break;
 + }
 +
   er = kvmppc_emulate_instruction(run, vcpu);
   switch (er) {
   case EMULATE_DONE:
 @@ -293,6 +303,12 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
 kvm_vcpu *vcpu,
   default:
   BUG();
   }
 +
 + if (unlikely(vcpu-guest_debug  KVM_GUESTDBG_ENABLE) 
 + (vcpu-guest_debug  KVM_GUESTDBG_SINGLESTEP)) {

Checking for KVM_GUESTDBG_ENABLE is redundant as you enforce guest_debug
= 0 in kvmppc_core_set_guest_debug if KVM_GUESTDBG_ENABLE is not set.

 + run-exit_reason = KVM_EXIT_DEBUG;
 + r = RESUME_HOST;
 + }
   break;
  
   case BOOKE_INTERRUPT_FP_UNAVAIL:
 @@ -421,12 +437,27 @@ int kvmppc_handle_exit(struct kvm_run *run, struct 
 kvm_vcpu *vcpu,
   u32 dbsr;
  
   vcpu-arch.pc = mfspr(SPRN_CSRR0);
 -
 - /* clear IAC events in DBSR register */
   dbsr = mfspr(SPRN_DBSR);
 - dbsr = DBSR_IAC1 | DBSR_IAC2 | DBSR_IAC3 | DBSR_IAC4;
 - mtspr(SPRN_DBSR, dbsr);
 + run-debug.arch.pc = vcpu-arch.pc;
 + run-debug.arch.status = 0;
 +
 + if (dbsr  (DBSR_IAC1 | DBSR_IAC2 | DBSR_IAC3 | DBSR_IAC4)) {
 + run-debug.arch.status |= KVMPPC_DEBUG_BREAKPOINT;
 + } else {
 + if (dbsr  (DBSR_DAC1W | DBSR_DAC2W))
 + run-debug.arch.status |= 
 KVMPPC_DEBUG_WATCH_WRITE;
 + else if (dbsr  (DBSR_DAC1R | DBSR_DAC2R))
 + run-debug.arch.status |= 
 KVMPPC_DEBUG_WATCH_READ;
 + if (dbsr  (DBSR_DAC1R | DBSR_DAC1W))
 + run-debug.arch.pc = 
 vcpu-arch.shadow_dbg_reg.dac1;
 + else if (dbsr  (DBSR_DAC2R | DBSR_DAC2W))
 + run-debug.arch.pc = 
 vcpu-arch.shadow_dbg_reg.dac2;
 + }
  
 + /* clear events in DBSR register */
 + mtspr(SPRN_DBSR, ~0);
 +
 + run-debug.arch.exception = exit_nr;
   run-exit_reason = KVM_EXIT_DEBUG;
   kvmppc_account_exit(vcpu, DEBUG_EXITS);
   r = RESUME_HOST;
 @@ -560,6 +591,60 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct 
 kvm_dirty_log *log)
   return -ENOTSUPP;
  }
  
 +int kvmppc_core_set_guest_debug(struct kvm_vcpu *vcpu,
 +struct kvm_guest_debug *dbg)
 +{
 + if (!(dbg-control  KVM_GUESTDBG_ENABLE)) {
 + vcpu-guest_debug = 0;
 + return 0;
 + }
 +
 + vcpu-guest_debug = dbg-control;
 + vcpu-arch.shadow_dbg_reg.dbcr0 = 0;
 +
 + if (vcpu-guest_debug  KVM_GUESTDBG_SINGLESTEP)
 + vcpu-arch.shadow_dbg_reg.dbcr0 |= DBCR0_IDM | DBCR0_IC;
 +
 + if (vcpu-guest_debug  KVM_GUESTDBG_USE_HW_BP) {
 + struct kvmppc_debug_reg *gdbgr = (vcpu-arch.shadow_dbg_reg);
 + int n, b = 0, w

Re: [PATCH] Enable 32bit dirty log pointers on 64bit host

2009-10-23 Thread Jan Kiszka

Alexander Graf wrote:
 On 23.10.2009, at 10:41, Jan Kiszka wrote:
 
 Alexander Graf wrote:
 From: Arnd Bergmann a...@arndb.de

 With big endian userspace, we can't quite figure out if a pointer
 is 32 bit (shifted  32) or 64 bit when we read a 64 bit pointer.

 This is what happens with dirty logging. To get the pointer  
 interpreted
 correctly, we thus need Arnd's patch to implement a compat layer for
 the ioctl:

 A better way to do this is to add a separate compat_ioctl() method  
 that
 converts this for you.

 From: Arnd Bergmann a...@arndb.de
 Signed-off-by: Arnd Bergmann a...@arndb.de
 Acked-by: Alexander Graf ag...@suse.de

 ---

 Changes from Arnd's example version:

  - s/log.log/log/ (Avi)
  - use sizeof(compat_log) (Avi)
  - compile fixes
 ---
 virt/kvm/kvm_main.c |   49 + 
 +++-
 1 files changed, 48 insertions(+), 1 deletions(-)

 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index cac69c4..54a272f 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -43,6 +43,7 @@
 #include linux/swap.h
 #include linux/bitops.h
 #include linux/spinlock.h
 +#include linux/compat.h

 #include asm/processor.h
 #include asm/io.h
 @@ -1542,6 +1543,52 @@ out:
 return r;
 }

 +#ifdef CONFIG_COMPAT
 +struct compat_kvm_dirty_log {
 +   __u32 slot;
 +   __u32 padding1;
 +   union {
 +   compat_uptr_t dirty_bitmap; /* one bit per page */
 +   __u64 padding2;
 +   };
 +};
 +
 +static long kvm_vm_compat_ioctl(struct file *filp,
 +  unsigned int ioctl, unsigned long arg)
 +{
 +   struct kvm *kvm = filp-private_data;
 +   int r;
 +
 +   if (kvm-mm != current-mm)
 +   return -EIO;
 +   switch (ioctl) {
 +   case KVM_GET_DIRTY_LOG: {
 +   struct compat_kvm_dirty_log compat_log;
 +   struct kvm_dirty_log log;
 +
 +   r = -EFAULT;
 +   if (copy_from_user(compat_log, (void __user *)arg,
 +  sizeof(compat_log)))
 +   goto out;
 +   log.slot = compat_log.slot;
 +   log.padding1 = compat_log.padding1;
 +   log.padding2 = compat_log.padding2;
 +   log.dirty_bitmap = compat_ptr(compat_log.dirty_bitmap);
 +
 +   r = kvm_vm_ioctl_get_dirty_log(kvm, log);
 +   if (r)
 +   goto out;
 +   break;
 +   }
 +   default:
 +   r = kvm_vm_ioctl(filp, ioctl, arg);
 +   }
 +
 +out:
 +   return r;
 +}
 +#endif
 +
 static int kvm_vm_fault(struct vm_area_struct *vma, struct vm_fault  
 *vmf)
 {
 struct page *page[1];
 @@ -1576,7 +1623,7 @@ static int kvm_vm_mmap(struct file *file,  
 struct vm_area_struct *vma)
 static struct file_operations kvm_vm_fops = {
 .release= kvm_vm_release,
 .unlocked_ioctl = kvm_vm_ioctl,
 -   .compat_ioctl   = kvm_vm_ioctl,
 +   .compat_ioctl   = kvm_vm_compat_ioctl,
 This fails in the absence of CONFIG_COMPAT.
 
 
 So should it rather be
 
 #ifdef CONFIG_COMPAT
   .compat_ioctl = kvm_vm_compat_ioctl,
 #else
   .compat_ioctl = kvm_vm_ioctl,
 #endif
 
 or
 
 #ifdef CONFIG_COMPAT
   .compat_ioctl = kvm_vm_compat_ioctl,
 #endif
 
 ?

I would say the latter as .compat_ioctl should simply be unused in case
of !CONFIG_COMPAT.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/5] Fix booke registers init

2009-07-25 Thread Jan Kiszka

Liu Yu wrote:
 Commit 8d2ba1fb9c8e7006e10d71fa51a020977f14c8b0
 introduces a new new reset order.
 
 So that we have to synchronize registers explicitly.
 
 Signed-off-by: Liu Yu yu.liu-kzfg59tc24xl57midrc...@public.gmane.org
 ---
  hw/ppc440_bamboo.c |4 +++-
  hw/ppce500_mpc8544ds.c |4 +++-
  2 files changed, 6 insertions(+), 2 deletions(-)
 
 diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
 index d9ef3ec..f1ba130 100644
 --- a/hw/ppc440_bamboo.c
 +++ b/hw/ppc440_bamboo.c
 @@ -182,8 +182,10 @@ static void bamboo_init(ram_addr_t ram_size,
  /* XXX we currently depend on KVM to create some initial TLB 
 entries. */
  }
  
 -if (kvm_enabled())
 +if (kvm_enabled()) {
 +kvm_arch_put_registers(env);
  kvmppc_init();
 +}
  }
  
  static QEMUMachine bamboo_machine = {
 diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
 index c0e367d..f1b3c1a 100644
 --- a/hw/ppce500_mpc8544ds.c
 +++ b/hw/ppce500_mpc8544ds.c
 @@ -276,8 +276,10 @@ static void mpc8544ds_init(ram_addr_t ram_size,
  /* XXX we currently depend on KVM to create some initial TLB 
 entries. */
  }
  
 -if (kvm_enabled())
 +if (kvm_enabled()) {
 +kvm_arch_put_registers(env);
  kvmppc_init();
 +}
  
  return;
  }

These are required when loading a device tree and, thus, changing some
registers after cpu_init, right? Then please add
cpu_synchronize_state(env, 1) to the corresponding code blocks instead
of this explicit, kvm-specific loading.

Jan



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 3/5] Add guest debug support for kvmppc

2009-07-25 Thread Jan Kiszka

Liu Yu wrote:
 Signed-off-by: Liu Yu yu.liu-kzfg59tc24xl57midrc...@public.gmane.org
 ---
  target-ppc/kvm.c |  197 
 ++
  1 files changed, 197 insertions(+), 0 deletions(-)
 
 diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
 index b53d6e9..d8dbdb4 100644
 --- a/target-ppc/kvm.c
 +++ b/target-ppc/kvm.c
 @@ -8,6 +8,9 @@
   *  Christian Ehrhardt 
 ehrhardt-23vcf4htsmix0ybbhkvfkdbpr1lh4...@public.gmane.org
   *  Hollis Blanchard hollisb-r/jw6+rmf7hqt0dzr+a...@public.gmane.org
   *
 + * Copyright (C) 2009 Freescale Semiconductor, Inc. All rights reserved.
 + *  Yu Liu yu.liu-kzfg59tc24xl57midrc...@public.gmane.org
 + *
   * This work is licensed under the terms of the GNU GPL, version 2 or later.
   * See the COPYING file in the top-level directory.
   *
 @@ -18,6 +21,7 @@
  #include sys/mman.h
  
  #include linux/kvm.h
 +#include asm/kvm_asm.h
  
  #include qemu-common.h
  #include qemu-timer.h
 @@ -26,6 +30,7 @@
  #include kvm_ppc.h
  #include cpu.h
  #include device_tree.h
 +#include gdbstub.h
  
  //#define DEBUG_KVM
  
 @@ -216,3 +221,195 @@ int kvm_arch_handle_exit(CPUState *env, struct kvm_run 
 *run)
  return ret;
  }
  
 +#ifdef KVM_CAP_SET_GUEST_DEBUG
 +int kvm_arch_insert_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint 
 *bp)
 +{
 +uint32_t sc = tswap32(KVM_INST_GUESTGDB);
 +uint32_t tmp;
 +
 +if (cpu_memory_rw_debug(env, bp-pc, (uint8_t *)bp-saved_insn, 4, 0) ||
 +cpu_memory_rw_debug(env, bp-pc, (uint8_t *)sc, 4, 1))
 +return -EINVAL;
 +cpu_memory_rw_debug(env, bp-pc, (uint8_t *)tmp, 4, 0);
 +return 0;
 +}
 +
 +int kvm_arch_remove_sw_breakpoint(CPUState *env, struct kvm_sw_breakpoint 
 *bp)
 +{
 +uint32_t sc;
 +
 +if (cpu_memory_rw_debug(env, bp-pc, (uint8_t *)sc, 4, 0) ||
 +sc != tswap32(KVM_INST_GUESTGDB) ||
 +cpu_memory_rw_debug(env, bp-pc, (uint8_t *)bp-saved_insn, 4, 1))
 +return -EINVAL;
 +return 0;
 +}
 +
 +static struct {
 +target_ulong addr;
 +int type;
 +} hw_breakpoint[6];
 +
 +static int nb_hw_breakpoint;
 +static int nb_hw_watchpoint;
 +static int max_hw_breakpoint;
 +static int max_hw_watchpoint;
 +
 +void kvmppc_debug_init(int max_hw_bp, int max_hw_wp)
 +{
 +max_hw_breakpoint = max_hw_bp  4? 4 : max_hw_bp;
 +max_hw_watchpoint = max_hw_wp  2? 2 : max_hw_wp;
 +}
 +
 +static int find_hw_breakpoint(target_ulong addr, int type)
 +{
 +int n;
 +
 +for (n = 0; n  nb_hw_breakpoint + nb_hw_watchpoint; n++)
 +if (hw_breakpoint[n].addr == addr  hw_breakpoint[n].type == type)
 +return n;
 +return -1;
 +}
 +
 +int kvm_arch_insert_hw_breakpoint(target_ulong addr,
 +  target_ulong len, int type)
 +{
 +hw_breakpoint[nb_hw_breakpoint + nb_hw_watchpoint].addr = addr;
 +hw_breakpoint[nb_hw_breakpoint + nb_hw_watchpoint].type = type;
 +
 +switch (type) {
 +case GDB_BREAKPOINT_HW:
 +if (nb_hw_breakpoint = max_hw_breakpoint)
 +return -ENOBUFS;
 +
 +if (find_hw_breakpoint(addr, type) = 0)
 +return -EEXIST;
 +
 +nb_hw_breakpoint++;
 +break;
 +
 +case GDB_WATCHPOINT_WRITE:
 +case GDB_WATCHPOINT_ACCESS:
 +if (nb_hw_watchpoint = max_hw_watchpoint)
 +return -ENOBUFS;
 +
 +if (find_hw_breakpoint(addr, type) = 0)
 +return -EEXIST;
 +
 +nb_hw_watchpoint++;
 +break;
 +
 +default:
 +return -ENOSYS;
 +}
 +
 +return 0;
 +}
 +
 +int kvm_arch_remove_hw_breakpoint(target_ulong addr,
 +  target_ulong len, int type)
 +{
 +int n;
 +
 +n = find_hw_breakpoint(addr, type);
 +if (n  0)
 +return -ENOENT;
 +
 +switch (type) {
 +case GDB_BREAKPOINT_HW:
 +nb_hw_breakpoint--;
 +break;
 +
 +case GDB_WATCHPOINT_WRITE:
 +case GDB_WATCHPOINT_ACCESS:
 +nb_hw_watchpoint--;
 +break;
 +
 +default:
 +return -ENOSYS;
 +}
 +hw_breakpoint[n] = hw_breakpoint[nb_hw_breakpoint + nb_hw_watchpoint];
 +
 +return 0;
 +}
 +
 +void kvm_arch_remove_all_hw_breakpoints(void)
 +{
 +nb_hw_breakpoint = nb_hw_watchpoint = 0;
 +}
 +
 +static CPUWatchpoint hw_watchpoint;
 +
 +int kvm_arch_debug(struct kvm_debug_exit_arch *arch_info)
 +{
 +int handle = 0;
 +int n;
 +
 +if (cpu_single_env-singlestep_enabled) {
 +handle = 1;
 +
 +} else if (arch_info-status) {
 +if (arch_info-status == KVMPPC_DEBUG_BREAKPOINT) {
 +n = find_hw_breakpoint(arch_info-pc, GDB_BREAKPOINT_HW);
 +if (n = 0)
 +handle = 1;
 +
 +} else if (arch_info-status == KVMPPC_DEBUG_WATCH_ACCESS) {
 +n = find_hw_breakpoint(arch_info-pc, GDB_WATCHPOINT_ACCESS);
 +if (n = 0) {
 +handle = 1;
 +cpu_single_env-watchpoint_hit = hw_watchpoint;
 +hw_watchpoint.vaddr =

Re: [PATCH 5/5] guest debug init for 440 and e500 core

2009-07-25 Thread Jan Kiszka

Liu Yu wrote:
 e500 only support 2 hardware breakpoints,
 440(BOOKE) supports 4.
 
 Signed-off-by: Liu Yu yu.liu-kzfg59tc24xl57midrc...@public.gmane.org
 ---
  hw/ppc440_bamboo.c |1 +
  hw/ppce500_mpc8544ds.c |1 +
  target-ppc/kvm_ppc.h   |1 +
  3 files changed, 3 insertions(+), 0 deletions(-)
 
 diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
 index f1ba130..8c9c3b6 100644
 --- a/hw/ppc440_bamboo.c
 +++ b/hw/ppc440_bamboo.c
 @@ -185,6 +185,7 @@ static void bamboo_init(ram_addr_t ram_size,
  if (kvm_enabled()) {
  kvm_arch_put_registers(env);
  kvmppc_init();
 +kvmppc_debug_init(4, 2);
  }
  }
  
 diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
 index f1b3c1a..6c2aa61 100644
 --- a/hw/ppce500_mpc8544ds.c
 +++ b/hw/ppce500_mpc8544ds.c
 @@ -279,6 +279,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
  if (kvm_enabled()) {
  kvm_arch_put_registers(env);
  kvmppc_init();
 +kvmppc_debug_init(2, 2); /* E500v2 doesn't support IAC3,IAC4 */

I think those two are better moved to kvm_arch_init_vcpu.

  }
  
  return;
 diff --git a/target-ppc/kvm_ppc.h b/target-ppc/kvm_ppc.h
 index 3792ef7..8b4edca 100644
 --- a/target-ppc/kvm_ppc.h
 +++ b/target-ppc/kvm_ppc.h
 @@ -13,5 +13,6 @@ void kvmppc_init(void);
  void kvmppc_fdt_update(void *fdt);
  int kvmppc_read_host_property(const char *node_path, const char *prop,
   void *val, size_t len);
 +void kvmppc_debug_init(int max_hw_bp, int max_hw_wp);
  
  #endif /* __KVM_PPC_H__ */

Jan



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 0/5]

2009-07-25 Thread Jan Kiszka

Liu Yu wrote:
 The whole patchset includes:
 patch 1: fix kvmppc build error
 patch 2: fix kvmppc init error
 patch 3~5: add kvmppc guest debug support
 
 The guest debug still have some problems I haven't solved.
 
 1. gdb 'next' command uses software breakpoint
 software breakpoint is implemented via modify guest's code.
 In most case it works well,
 but when used by 'next' it's easy to make trouble on powerpc booke.
 
 For example booke has a code template for
 jumping to and returning from interrupt handlers:
 
   bl transfer
   .long handler_addr
   .long ret_addr
 
 when call transfer, it never return but
 in transfer assembly code it will read the handler_addr
 and ultimately call the handler.
 Gdb doesn't know that and treat it as a normal function call.
 so gdb put a software breakpoint instruction at handler_addr,
 in order to get trap there when return from transfer.
 
 Then guest will read software breakpoint as handler_addr and jump to there..
 
 I'm not sure if x86 suffer this kind of issue.

It would if it had such a pattern.

 Is there any way to avoid this?

Unless there is a mechanism via the debug infos of a binary to tell gdb
about this, I think one can only avoid it by not using next here.

 
 
 2. gdb 'watch' command
 Jan told me gdb6.8 can issue hardware watchpoint request via command 'watch',
 my gdb is 6.8.50.20080821-cvs and our toolchain provider confirm that it 
 supports hardware watch
 However when I use 'watch', I can only see single step from gdbstub side.
 Did I miss anything?

Did you install a watchpoint on a symbol? If yes, try if placing one on
an absolute address changes the picture.

Frankly, I didn't understand gdb's logic for selecting soft or hard
watchpoints so far. Soft watchpoints are those you saw: single step to
the program, checking after each step if the watched variable has
changed. In theory it should be clear when to use which. But practice
appears to be non-deterministic, at least with the versions we recently
tried on x86.

Jan



signature.asc
Description: OpenPGP digital signature

Re: qemu-kvm.git now live

2009-04-30 Thread Jan Kiszka

Avi Kivity wrote:
 Jan Kiszka wrote:
 Avi Kivity wrote:
  
 Where/how does the
 migration code disable dirty logging?
   
 Should be phase 3 of ram_save_live().
 
 But only in qemu-kvm. What is the plan about pushing it upstream? Then
 we could discuss how to extend the exiting support best.
 
 Pushing things upstream is quite difficult because of the very different
 infrastructure.
 

 Isn't the midterm goal to get rid of most of these differences (namely
 libkvm)?
   
 
 Yes, but not by removing existing functionality.

No one said this.

 
  
 It's unfortunate that upstream rewrote everything
 instead of changing things incrementally.  Rewrites are almost always a
 mistake since they throw away accumulated knowledge.
 

 I disagree, at least in this particular case. Upstream already diverged
 from qemu-kvm, and the latter provided no comparable alternative for
 slot management and dirty logging. And I still don't see that we lost
 anything that could not easily be re-integrated into upstream (ie.
 global dirty logging), finally leading to a cleaner and more complete
 result.
   
 
 It could have been done differently, by morphing the existing support
 into something mergable, and merging that.  In this way, we'd ensure no
 needed functionality is lost.

The existing support lacked features upstream already had and instead
required additional hacks to make qemu-kvm work.

 
 As is, we're adding something simple, then discovering it's
 insufficient.  We're throwing away information, that's not a good way to
 make progress.

I doubt this applies here.

 
 So, what bits are missing to make KVM migration work in upstream?
   
 
 I don't know of anything beyond dirty logging.
 

OK, then I will pick this up and have a look at something comparable to
cpu_physical_memory_set_dirty_tracking() for upstream.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [PATCH 4/7] kvm: Add sanity checks to slot management

2009-04-29 Thread Jan Kiszka

Liu Yu-B13201 wrote:

 -Original Message-
 From: qemu-devel-bounces+yu.liu=freescale@nongnu.org 
 [mailto:qemu-devel-bounces+yu.liu=freescale@nongnu.org] 
 On Behalf Of Jan Kiszka
 Sent: Sunday, April 12, 2009 1:20 AM
 To: qemu-de...@nongnu.org
 Subject: [Qemu-devel] [PATCH 4/7] kvm: Add sanity checks to 
 slot management

 Fail loudly if we run out of memory slot.

 Make sure that dirty log start/stop works with consistent 
 memory regions
 by reporting invalid parameters. This reveals several 
 inconsistencies in
 the vga code, patch to fix them follows later in this series.

 And, for simplicity reasons, also catch and report unaligned memory
 regions passed to kvm_set_phys_mem (KVM works on page basis).

 Commit d3f8d37fe2d0c24ec8bac9c94d5b0e2dc09c0d2a hurts kvm/powerpc
 The alignment check in kvm_set_phys_mem prevents pci controller and mpic 
 initializing mmio regions.

What is the alignment of those regions then? None? And do regions of
different types overlap even on the same page? Maybe the check reveals
some deeper conflict /wrt KVM. Can you point me to the involved code files?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: qemu-kvm.git now live

2009-04-29 Thread Jan Kiszka

Avi Kivity wrote:
 Jan Kiszka wrote:
 Avi Kivity wrote:
  
 After a lengthy testing phase, qemu-kvm.git has replaced
 kvm-userspace.git as the source repository for kvm userspace
 development.

 Differences from kvm-userspace.git are as follows:
 - everything under qemu/ has been moved to the top-level directory
 - everything not under qemu/ has been moved under a new top-level
 directory, kvm/
 - all qemu subversion commits have been rewritten to be compatible with
 what will become the master qemu git repository
 - all branches and tags have been converted
 - qemu-kvm.git builds standalone (does not require kvm.git)

 The repository is compatible with upstream qemu.git; merges and
 cherry-picking will work fine in either direction

 Still missing:
 - I have not tested powerpc or ia64.  Patches welcome!
 - testsuite (kvm/user/ directory) does not build
 

 - proper guest memory setup

 During merge with upstream, the call to qemu_alloc_physram dropped under
 the table (due to removal of phys_ram_base from qemu). And that means we
 no longer call kvm_setup_guest_memory (fork will cause troubles, deja
 vu...) and -mempath is broken. Sorry, I'm not deep enough into all parts
 to provide a quick-fix.
   
 
 That's unrelated to the repository conversion.

Then it's a merge artifact, but it's a problematic regression compared
to still-working kvm-userspace.

 
 I'm unhappy with both qemu.git and qemu-kvm.git memory slot management;
 qemu-kvm.git is clumsy, and qemu.git is too simplistic (for example, it
 ignores the fact that dirty logging is a global resource with multiple
 users).

Don't get your point yet. Can you name a concrete scenario that is
problematic in upstream? I'd really like to get slot management right
there before we drop the old version of qemu-kvm.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [Qemu-devel] [PATCH 4/7] kvm: Add sanity checks to slot management

2009-04-29 Thread Jan Kiszka

Liu Yu-B13201 wrote:
 -Original Message-
 From: Jan Kiszka [mailto:jan.kis...@siemens.com] 
 Sent: Wednesday, April 29, 2009 6:38 PM
 To: Liu Yu-B13201
 Cc: qemu-de...@nongnu.org; kvm-ppc@vger.kernel.org
 Subject: Re: [Qemu-devel] [PATCH 4/7] kvm: Add sanity checks 
 to slot management

 Liu Yu-B13201 wrote:
 -Original Message-
 From: qemu-devel-bounces+yu.liu=freescale@nongnu.org 
 [mailto:qemu-devel-bounces+yu.liu=freescale@nongnu.org] 
 On Behalf Of Jan Kiszka
 Sent: Sunday, April 12, 2009 1:20 AM
 To: qemu-de...@nongnu.org
 Subject: [Qemu-devel] [PATCH 4/7] kvm: Add sanity checks to 
 slot management

 Fail loudly if we run out of memory slot.

 Make sure that dirty log start/stop works with consistent 
 memory regions
 by reporting invalid parameters. This reveals several 
 inconsistencies in
 the vga code, patch to fix them follows later in this series.

 And, for simplicity reasons, also catch and report unaligned memory
 regions passed to kvm_set_phys_mem (KVM works on page basis).

 Commit d3f8d37fe2d0c24ec8bac9c94d5b0e2dc09c0d2a hurts kvm/powerpc
 The alignment check in kvm_set_phys_mem prevents pci 
 controller and mpic initializing mmio regions.

 What is the alignment of those regions then? None? 
 And do regions of
 different types overlap even on the same page?
 I think so.

 Maybe the check reveals
 some deeper conflict /wrt KVM. Can you point me to the 
 involved code files?

 hw/ppc4xx_pci.c   ppc4xx_pci_init()
 hw/ppce500_pci.c  ppce500_pci_init()
 hw/openpic.c  mpic_init()
 hw/ppce500_mpc8544ds.cserial_mm_init()

Hmm, too bad that I have no platform at hand to test this. Could you
instrument kvm_set_phys_mem (on an older, working version), dumping its
arguments and post this trace?

TIA,
Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] [PATCH] kvm-userspace: gdb: fix new gdb function types

2008-12-13 Thread Jan Kiszka

Christian Ehrhardt wrote:
 # HG changeset patch
 # User Christian Ehrhardt ehrha...@linux.vnet.ibm.com
 # Date 1229085659 -3600
 # Node ID 37967a80a2757505488685aac135681945e6da91
 # Parent  f0ed33f14658fe91a14ec02501cb42d26e32f01f
 [PATCH] kvm-userspace: gdb: fix new gdb function types
 
 From: Christian Ehrhardt ehrha...@linux.vnet.ibm.com
 
 The types changed in the header but not in the powerpc and ia64 
 implementation.
 This patch fix that build error by changing the stubs to the right prototype.

Thanks for the fix! Will merge it into the master patch, adding your
signed-off - if that's fine for you.

Jan



signature.asc
Description: OpenPGP digital signature

Re: [PATCH] [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu gdbstub

2008-12-13 Thread Jan Kiszka

Christian Ehrhardt wrote:
 Hollis Blanchard wrote:
 On Thu, 2008-12-11 at 17:05 +0100, Jan Kiszka wrote:
  
 Hollis Blanchard wrote:

 On Thu, 2008-12-11 at 13:53 +0100, Christian Ehrhardt wrote:
  
 This is v2 as version one had a type in it occured when splitting
 patches.
 Mercurial somehow lost my changes to the patch description
 explaining that, but the patch is right this way.

 Christian Ehrhardt wrote:

 # HG changeset patch
 # User Christian Ehrhardt
 ehrhardt-23vcf4htsmix0ybbhkvfkdbpr1lh4...@public.gmane.org
 # Date 1228999833 -3600
 # Node ID dc1466c9077ab162f4637fffee1869f26be02299
 # Parent  4c07fe2a56c7653a9113e05bb08c2de9aec210ce
 [PATCH] qemu: ppc: kvm-userspace: KVM PowerPC support for qemu
 gdbstub

 From: Hollis Blanchard
 hollisb-r/jw6+rmf7hqt0dzr+a...@public.gmane.org

 Add basic KVM PowerPC support to qemu's gdbstub introducing a kvm
 ppc style
 mmu implementation that uses the kvm_translate ioctl.
 This also requires to save the kvm registers prior to the 'm' gdb
 operations.

 Signed-off-by: Hollis Blanchard
 hollisb-r/jw6+rmf7hqt0dzr+a...@public.gmane.org
 Signed-off-by: Christian Ehrhardt
 ehrhardt-23vcf4htsmix0ybbhkvfkdbpr1lh4...@public.gmane.org
   
 Let's *not* apply this to kvm-userspace. We will submit this to qemu,
 and once we work out the right solution there it will be merged
 naturally.

   
 I don't oversee yet what you want to push upstream, but in case it's the
 gdbstub support for kvm (including ppc bits): please note that I plan to
 push the new interface once it is merged into kvm-userspace, avoiding to
 spread the current, limited one as far as possible.

 BTW, would be great if you could have a look / provide patches for ppc
 to support the new interface already. I am open for feedback,
 specifically regarding its suitability beyond x86.
 

 I've been meaning to do this for a while, sorry. We'll take a look soon.

   
 Hi Jan,
 I saw that you already had that env-s-g_cpu fix, so if you change all
 that
 anyway it might really be better to test/extend your patches for powerpc
 now.
 
 If it is ok for you I would submit my patches that apply on top of yours to
 you and cc the kvm list. But as Hollis mentioned I would prefer go for qemu
 upstream first and then assist Avi in merging it into kvm-userspace because
 this is the natural direction patches flow atm (and if you need to
 change it
 multiple times until you get qemu acceptance you would have to extensivly
 patch both projects to match again).

My current roadmap is first merging kernel bits and corresponding
kvm-userspace changes so that we can test both extensively in the
context of full-blown kvm, and then push an adopted userspace interface
into qemu. The other way around would create the risk of missing
problems that only pop up under full-featured kvm (upstream is still
fairly limited, specifically as there is no threaded smp support).

That said, if you have (ppc-)changes that can be pushed immediately and
independently, there is surely no need to wait for the kvm-gdb series.

 
 As my code in that case depend on your patches it would be nice if you
 could
 put them into your series once you are happy with it.
 

I will happily carry them, no problem.

Jan



signature.asc
Description: OpenPGP digital signature

42 matches

Mail list logo