RE: [RFC 0/6] KVM: arm/arm64: gsi routing support

2015-06-18 Thread Pavel Fedin
 Hello!

> The series therefore allows and mandates the usage of KVM_SET_GSI_ROUTING
> ioctl along with KVM_IRQFD. If the userspace does not define any routing
> table, no irqfd injection can happen. The user-space can use
> KVM_CAP_IRQ_ROUTING to detect whether a routing table is needed.

 Yesterday, half-asleep on the train back home, I got a simple idea for how to resolve
conflicts with the existing static GSI->SPI routing without bringing in any more
inconsistencies.
 So far, in the current implementation a GSI is an SPI index (leaving aside KVM_IRQ_LINE,
because that is already another story on ARM). In order to maintain this convention we could
simply implement a default routing which sets all GSIs to the corresponding SPI pins. So, if
userland never cares about KVM_SET_GSI_ROUTING, everything works as before. But it will be
possible to re-route GSIs to MSI. It will work perfectly because SPI signaling is used
with GICv2m, and MSI with GICv3(+), which cannot be used at the same time.
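
 A minimal sketch of what such a default table could look like, assuming a
simplified route structure. The names struct gsi_route, setup_default_routing()
and irqchip_route_intid() are hypothetical and are not existing KVM symbols; the
"+ 32" offset is the SPI convention documented for KVM_IRQFD on ARM:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPI_BASE 32u  /* GIC interrupt IDs of SPIs start at 32 */

/* Hypothetical, simplified stand-in for a KVM routing entry. */
enum route_type { ROUTE_IRQCHIP, ROUTE_MSI };

struct gsi_route {
    enum route_type type;
    uint32_t pin;       /* SPI index, meaningful for ROUTE_IRQCHIP only */
};

/* Default routing: GSI n is wired to SPI pin n (identity mapping). */
static void setup_default_routing(struct gsi_route *table, size_t nr_gsis)
{
    assert(table != NULL || nr_gsis == 0);
    for (size_t gsi = 0; gsi < nr_gsis; gsi++) {
        table[gsi].type = ROUTE_IRQCHIP;
        table[gsi].pin = (uint32_t)gsi;
    }
}

/* GIC interrupt ID injected for an irqchip route. */
static uint32_t irqchip_route_intid(const struct gsi_route *route)
{
    assert(route->type == ROUTE_IRQCHIP);
    return route->pin + SPI_BASE;
}
```

 With such a table installed at VM creation, userland that never calls
KVM_SET_GSI_ROUTING would still see GSI n arrive as SPI n, and could later
overwrite individual entries with MSI routes.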

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Nested EPT Write Protection

2015-06-18 Thread Paolo Bonzini


On 19/06/2015 03:52, Hu Yaohui wrote:
> Hi All,
> In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement
> nested EPT. The shadow EPT (EPT02) is a shadow of the guest EPT (EPT12). If
> the L1 guest writes to the guest EPT (EPT12), how can the shadow
> EPT (EPT02) be modified accordingly?

Because the EPT02 is write protected, writes to the EPT12 will trap to
the hypervisor.  The hypervisor will execute the write instruction
before reentering the guest and invalidate the modified parts of the
EPT02.  When the invalidated part of the EPT02 is accessed, the
hypervisor will rebuild it according to the EPT12 and the KVM memslots.
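
The flow can be modelled with a small toy program. This is only a sketch
under strong simplifications (flat single-level tables, one entry per
page); struct toy_mmu, handle_ept12_write() and translate_l2() are made-up
names and do not correspond to functions in the KVM MMU code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_ENTRIES 8

/* Toy model with flat tables; all names here are hypothetical. */
struct toy_mmu {
    uint64_t ept12[NR_ENTRIES];       /* L1's table: L2 gfn -> L1 gfn */
    uint64_t memslot[NR_ENTRIES];     /* L0's memslots: L1 gfn -> host pfn */
    uint64_t ept02[NR_ENTRIES];       /* shadow table: L2 gfn -> host pfn */
    bool     ept02_valid[NR_ENTRIES];
};

/* Runs when L1's write to the write-protected EPT12 traps to L0. */
static void handle_ept12_write(struct toy_mmu *m, unsigned idx, uint64_t val)
{
    assert(idx < NR_ENTRIES);
    m->ept12[idx] = val;              /* perform the write on L1's behalf */
    m->ept02_valid[idx] = false;      /* invalidate the stale shadow entry */
}

/* Runs on an L2 access; lazily rebuilds an invalidated shadow entry. */
static uint64_t translate_l2(struct toy_mmu *m, unsigned idx)
{
    assert(idx < NR_ENTRIES);
    if (!m->ept02_valid[idx]) {
        uint64_t l1_gfn = m->ept12[idx];
        assert(l1_gfn < NR_ENTRIES);
        m->ept02[idx] = m->memslot[l1_gfn];  /* combine EPT12 and memslots */
        m->ept02_valid[idx] = true;
    }
    return m->ept02[idx];
}
```

The point of the model is that the shadow entry is never patched in place:
the write only invalidates it, and the next access rebuilds it.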

Paolo


Nested EPT Write Protection

2015-06-18 Thread Hu Yaohui
Hi All,
In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement
nested EPT. The shadow EPT (EPT02) is a shadow of the guest EPT (EPT12). If
the L1 guest writes to the guest EPT (EPT12), how can the shadow
EPT (EPT02) be modified accordingly?

Thanks,
Yaohui


Re: [PATCH] kvmtool: Makefile: allow overriding CC and LD

2015-06-18 Thread Michael Ellerman
On Thu, 2015-06-18 at 16:50 +0100, Andre Przywara wrote:
> Currently we set CC unconditionally to ${CROSS_COMPILE}gcc, the same
> for LD.
> Allow people to override the compiler name by specifying it explicitly
> on the command line or via the environment.
> Beside calling a certain compiler binary this allows to pass in
> options to the compiler, which lets us get rid of the PowerPC
> overrides in the Makefile. Possible uses:
> $ make CC="gcc -m64" LD="ld -melf64ppc"
> (build kvmtool on a PowerPC toolchain defaulting to 32-bit)
> $ make CC="gcc -m32" LD="ld -melf_i386"
> (build a 32-bit binary on a multilib-enabled x86-64 compiler)


I'm not a big fan of that.

Your examples are all about overriding CFLAGS and LDFLAGS, not CC and LD. So
if anything you should be allowing that. Adding flags to CC and LD is asking
for trouble.

cheers




Re: [PATCH 2/3] powerpc: use default endianness for converting guest/init

2015-06-18 Thread Michael Ellerman
On Thu, 2015-06-18 at 15:52 +0100, Andre Przywara wrote:
> Hi,
> 
> On 06/17/2015 10:43 AM, Andre Przywara wrote:
> > For converting the guest/init binary into an object file, we call
> > the linker binary, setting the endianness to big endian explicitly
> > when compiling kvmtool for powerpc.
> > This breaks if the compiler is actually targeting little endian
> > (which is true for the Debian port, for instance).
> > Remove the explicit big endianness switch from the linker call to
> > allow linking on little endian PowerPC builds again.
> > 
> > Signed-off-by: Andre Przywara 
> > ---
> > Hi,
> > 
> > this fixed the powerpc64le build for me, while still compiling fine
> > for big endian. Admittedly this whole init->guest_init.o conversion
> > has its issues (with MIPS, for instance), which deserve proper fixing,
> > but let's just fix that build for now.
> 
> Will was concerned about breaking toolchains where the linker does not
> default to 64-bit. Is that an issue we care about?

Yeah, that would be Debian & Ubuntu BE at least, and maybe Fedora too? I'm not
sure how you compiled it big endian?

> AFAICT LDFLAGS is only used in this dodgy binary-to-object-file
> conversion of guest/init. For this we rely on the resulting .o file to
> have the same ELF target as the other object files to be finally linked
> into the lkvm binary. As we don't compile guest/init with CFLAGS, there
> is a possible mismatch.
> 
> I am looking into a proper fix for this now (compiling guest/init with
> CFLAGS, calling $CC with linker options instead of $LD and allowing CC
> and LD override). Still struggling with MIPS, though :-(

Yeah that's obviously a better solution medium term.

Can you do something like this? Sorry untested:

diff --git a/Makefile b/Makefile
index 6110b8e..8663d67 100644
--- a/Makefile
+++ b/Makefile
@@ -149,7 +149,11 @@ ifeq ($(ARCH), powerpc)
OBJS+= powerpc/xics.o
ARCH_INCLUDE := powerpc/include
CFLAGS  += -m64
-   LDFLAGS += -m elf64ppc
+   ifeq ($(call try-build,$(SOURCE_HELLO),$(CFLAGS),-m elf64ppc),y)
+   LDFLAGS += -m elf64ppc
+   else
+   LDFLAGS += -m elf64leppc
+   endif
 
ARCH_WANT_LIBFDT := y
 endif


cheers




Re: [PATCH v2 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64

2015-06-18 Thread Mario Smarduch
On 06/18/2015 10:27 AM, Marc Zyngier wrote:
> On 16/06/15 22:50, Mario Smarduch wrote:
>> After enhancing arm64 FP/SIMD exit handling, FP/SIMD exit branch is moved
>> to guest trap handling. This keeps exiting handling flow between both
>> architectures consistent.
>>
>> Signed-off-by: Mario Smarduch 
>> ---
>>  arch/arm/kvm/interrupts.S |   12 +++-
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
>> index 79caf79..fca2c56 100644
>> --- a/arch/arm/kvm/interrupts.S
>> +++ b/arch/arm/kvm/interrupts.S
>> @@ -363,10 +363,6 @@ hyp_hvc:
>>  @ Check syndrome register
>>  mrc p15, 4, r1, c5, c2, 0   @ HSR
>>  lsr r0, r1, #HSR_EC_SHIFT
>> -#ifdef CONFIG_VFPv3
>> -cmp r0, #HSR_EC_CP_0_13
>> -beq switch_to_guest_vfp
>> -#endif
>>  cmp r0, #HSR_EC_HVC
>>  bne guest_trap  @ Not HVC instr.
>>  
>> @@ -406,6 +402,12 @@ THUMB(  orr lr, #1)
>>  1:  eret
>>  
>>  guest_trap:
>> +#ifdef CONFIG_VFPv3
>> +/* Guest accessed VFP/SIMD registers, save host, restore Guest */
>> +cmp r0, #HSR_EC_CP_0_13
>> +beq switch_to_guest_fpsimd
>> +#endif
>> +
>>  load_vcpu   @ Load VCPU pointer to r0
>>  str r1, [vcpu, #VCPU_HSR]
>>  
>> @@ -478,7 +480,7 @@ guest_trap:
>>   * inject an undefined exception to the guest.
>>   */
>>  #ifdef CONFIG_VFPv3
>> -switch_to_guest_vfp:
>> +switch_to_guest_fpsimd:
> 
> Ah, I think I managed to confuse you in my previous comment.
> On ARMv7, we call the floating point stuff VFP.
> On ARMv8, we call it FP/SIMD.

Ah I see, I'll update.
> 
> Not very consistent, I know...
> 
>>  load_vcpu   @ Load VCPU pointer to r0

How about moving it here? Then it does not stick out like
before.

guest_trap:
load_vcpu   @ Load VCPU pointer to r0
str r1, [vcpu, #VCPU_HSR]

@ Check if we need the fault information
lsr r1, r1, #HSR_EC_SHIFT
#ifdef CONFIG_VFPv3
/* Guest accessed VFP/SIMD registers, save host, restore Guest */
cmp r1, #HSR_EC_CP_0_13
beq switch_to_guest_vfp
#endif


Regarding "host_switch_to_hyp:", it has no references but looks
like a clean separator; is that on purpose?

Thanks

> 
> It would be interesting to find out if we can make this load_vcpu part
> of the common sequence (without spilling another register, of course).
> Probably involves moving the exception class to r2.
> 
> Thanks,
> 
>   M.
> 



Re: [PATCH v5] i386: Introduce ARAT CPU feature

2015-06-18 Thread Eduardo Habkost
On Sun, Jun 07, 2015 at 11:15:08AM +0200, Jan Kiszka wrote:
> From: Jan Kiszka 
> 
> ARAT signals that the APIC timer does not stop in power saving states.
> As our APICs are emulated, it's fine to expose this feature to guests,
> at least when asking for KVM host features or with CPU types that
> include the flag. The exact model number that introduced the feature is
> not known, but reports can be found that it's at least available since
> Sandy Bridge.
> 
> Signed-off-by: Jan Kiszka 

The code looks good now, but: what are the real consequences of
enabling/disabling the flag? What exactly do guests use it for?

Isn't this going to make guests have additional expectations about the
APIC timer that may be broken when live-migrating or pausing the VM?
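
For reference, guests discover the flag through CPUID leaf 06H, EAX bit 2.
A minimal sketch of the check follows; cpuid6_eax_has_arat() is a
hypothetical helper name, not a function taken from QEMU or the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ARAT (Always Running APIC Timer) is reported in CPUID.06H:EAX, bit 2. */
#define CPUID_6_EAX_ARAT (1u << 2)

static_assert(CPUID_6_EAX_ARAT == 0x4, "ARAT is bit 2 of CPUID.06H:EAX");

/* Takes the EAX value returned by CPUID leaf 6 and tests the ARAT bit. */
static bool cpuid6_eax_has_arat(uint32_t eax)
{
    return (eax & CPUID_6_EAX_ARAT) != 0;
}
```

On x86 with GCC or Clang, the EAX value can be obtained with __get_cpuid(6, ...)
from <cpuid.h>.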

-- 
Eduardo


Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

2015-06-18 Thread Alex Williamson
[Adding Joerg since he was part of this original idea]

On Thu, 2015-06-18 at 09:16 +, Wu, Feng wrote:
> 
> 
> > -Original Message-
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Tuesday, June 16, 2015 12:45 AM
> > To: Eric Auger
> > Cc: Avi Kivity; Wu, Feng; kvm@vger.kernel.org; linux-ker...@vger.kernel.org;
> > pbonz...@redhat.com; mtosa...@redhat.com
> > Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> > 
> > On Mon, 2015-06-15 at 18:17 +0200, Eric Auger wrote:
> > > Hi Alex, all,
> > > On 06/12/2015 09:03 PM, Alex Williamson wrote:
> > > > On Fri, 2015-06-12 at 21:48 +0300, Avi Kivity wrote:
> > > >> On 06/12/2015 06:41 PM, Alex Williamson wrote:
> > > >>> On Fri, 2015-06-12 at 00:23 +, Wu, Feng wrote:
> > > > -Original Message-
> > > > From: Avi Kivity [mailto:avi.kiv...@gmail.com]
> > > > Sent: Friday, June 12, 2015 3:59 AM
> > > > To: Wu, Feng; kvm@vger.kernel.org; linux-ker...@vger.kernel.org
> > > > Cc: pbonz...@redhat.com; mtosa...@redhat.com;
> > > > alex.william...@redhat.com; eric.au...@linaro.org
> > > > Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> > > >
> > > > On 06/11/2015 01:51 PM, Feng Wu wrote:
> > > >> From: Eric Auger 
> > > >>
> > > > >> This patch adds and documents a new KVM_DEV_VFIO_DEVICE group
> > > > >> and 2 device attributes: KVM_DEV_VFIO_DEVICE_FORWARD_IRQ,
> > > > >> KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ. The purpose is to be able
> > > > >> to set a VFIO device IRQ as forwarded or not forwarded.
> > > > >> the command takes as argument a handle to a new struct named
> > > > >> kvm_vfio_dev_irq.
> > > > > Is there no way to do this automatically?  After all, vfio knows that a
> > > > > device interrupt is forwarded to some eventfd, and kvm knows that some
> > > > > eventfd is forwarded to a guest interrupt.  If they compare notes
> > > > > through a central registry, they can figure out that the interrupt needs
> > > > > to be forwarded.
> > > >  Oh, just like Eric mentioned in his reply, this description is out of context of
> > > >  this series, I will remove them in the next version.
> > > >>>
> > > >>> I suspect Avi's question was more general.  While forward/unforward is
> > > >>> out of context for this series, it's very similar in nature to
> > > >>> enabling/disabling posted interrupts.  So I think the question remains
> > > >>> whether we really need userspace to participate in creating this
> > > >>> shortcut or if kvm and vfio can some how orchestrate figuring it out
> > > >>> automatically.
> > > >>>
> > > > >>> Personally I don't know how we could do it automatically.  We've always
> > > > >>> relied on userspace to independently setup vfio and kvm such that
> > > > >>> neither have any idea that the other is there and update each side
> > > > >>> independently when anything changes.  So it seems consistent to continue
> > > > >>> that here.  It doesn't seem like there's much to gain performance-wise
> > > > >>> either, updates should be a relatively rare event I'd expect.
> > > > >>>
> > > > >>> There's really no metadata associated with an eventfd, so "comparing
> > > > >>> notes" automatically might imply some central registration entity.  That
> > > > >>> immediately sounds like a much more complex solution, but maybe Avi has
> > > > >>> some ideas to manage it.  Thanks,
> > > >>>
> > > >>
> > > > >> The idea is to have a central registry maintained by a posted interrupts
> > > > >> manager.  Both vfio and kvm pass the filp (along with extra information)
> > > > >> to the posted interrupts manager, which, when it detects a filp match,
> > > > >> tells each of them what to do.
> > > >>
> > > >> The advantages are:
> > > >> - old userspace gains the optimization without change
> > > >> - a userspace API is more expensive to maintain than internal kernel
> > > >> interfaces (CVEs, documentation, maintaining backwards compatibility)
> > > >> - if you can do it without a new interface, this indicates that all the
> > > >> information in the new interface is redundant.  That means you have to
> > > >> check it for consistency with the existing information, so it's extra
> > > >> work (likely, it's exactly what the posted interrupt manager would be
> > > >> doing anyway).
> > > >
> > > > Yep, those all sound like good things and I believe that's similar in
> > > > design to the way we had originally discussed this interaction at
> > > > LPC/KVM Forum several years ago.  I'd be in favor of that approach.
> > >
> > > I guess this discussion also is relevant wrt "[RFC v6 00/16] KVM-VFIO
> > > IRQ forward control" series? Or is that "central registry maintained by
> > > a posted interrupts manager" something more specific to x86?
> > 
> > I'd think we'd want it for any sort of offload and supporting both
> > posted-interrupts and irq-forwa

Re: [PATCH 13/13] KVM: arm64: enable ITS emulation as a virtual MSI controller

2015-06-18 Thread Andre Przywara
On 06/18/2015 04:03 PM, Pavel Fedin wrote:
>  Hello!
> 
>> But that fails compilation on ARM (which uses this file as well),
>> because we have a dummy fail function in the header if
>> CONFIG_HAVE_KVM_MSI is not defined.
> 
>  Maybe remove that fail function too, then? Too many #ifdefs are not good...

Yes, that seems to work - now. I think I had more code in there before
that prevented exposure without #ifdef guarding.

Cheers,
Andre.

> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 
> 


Re: [RFC 4/6] KVM: arm/arm64: enable irqchip routing

2015-06-18 Thread Marc Zyngier
On 18/06/15 19:00, Eric Auger wrote:
> Hi Marc,
> On 06/18/2015 07:53 PM, Marc Zyngier wrote:
>> Hi Eric,
>>
>> On 18/06/15 18:40, Eric Auger wrote:
>>> This patch adds compilation and link against irqchip.
>>>
>>> On ARM, irqchip routing is not really useful since there is
>>> a single irqchip. However main motivation behind using irqchip
>>> code is to enable MSI routing code. With the support of in-kernel
>>> GICv3 ITS emulation, it now seems to be a MUST HAVE requirement.
>>>
>>> Functions previously implemented in vgic.c and substitute
>>> to more complex irqchip implementation are removed:
>>>
>>> - kvm_send_userspace_msi
>>> - kvm_irq_map_chip_pin
>>> - kvm_set_irq
>>> - kvm_irq_map_gsi.
>>>
>>> They implemented a kernel default identity GSI routing. This is now
>>> replaced by user-side provided routing.
>>>
>>> Routing standard hooks are now implemented in vgic:
>>> - kvm_set_routing_entry
>>> - kvm_set_irq
>>> - kvm_set_msi
>>>
>>> Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
>>> KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.
>>>
>>> MSI routing is not yet allowed.
>>>
>>> Signed-off-by: Eric Auger 
>>> ---
>>>  Documentation/virtual/kvm/api.txt | 11 --
>>>  arch/arm/include/asm/kvm_host.h   |  2 +
>>>  arch/arm/kvm/Kconfig  |  2 +
>>>  arch/arm/kvm/Makefile |  2 +-
>>>  arch/arm64/include/asm/kvm_host.h |  1 +
>>>  arch/arm64/kvm/Kconfig|  2 +
>>>  arch/arm64/kvm/Makefile   |  2 +-
>>>  include/kvm/arm_vgic.h|  9 -
>>>  virt/kvm/arm/vgic.c   | 78 
>>> ---
>>>  virt/kvm/irqchip.c|  2 +
>>>  10 files changed, 67 insertions(+), 44 deletions(-)
>>>
>>> diff --git a/Documentation/virtual/kvm/api.txt 
>>> b/Documentation/virtual/kvm/api.txt
>>> index bcec91e..2bc96e1 100644
>>> --- a/Documentation/virtual/kvm/api.txt
>>> +++ b/Documentation/virtual/kvm/api.txt
>>> @@ -1395,7 +1395,7 @@ KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or 
>>> guest IRQ is allowed.
>>>  4.52 KVM_SET_GSI_ROUTING
>>>  
>>>  Capability: KVM_CAP_IRQ_ROUTING
>>> -Architectures: x86 s390
>>> +Architectures: x86 s390 arm arm64
>>>  Type: vm ioctl
>>>  Parameters: struct kvm_irq_routing (in)
>>>  Returns: 0 on success, -1 on error
>>> @@ -2310,9 +2310,12 @@ Note that closing the resamplefd is not sufficient 
>>> to disable the
>>>  irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
>>>  and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
>>>  
>>> -On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
>>> -Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
>>> -given by gsi + 32.
>>> +On ARM/ARM64, when GSI routing is not used, the gsi field in the
>>> +kvm_irqfd struct specifies the Shared Peripheral Interrupt (SPI) index,
>>> +such that the GIC interrupt ID is given by gsi + 32. When GSI routing is
>>> +setup:
>>> +- if irqchip routing: irqchip.pin + 32 is the SPI ID that is injected
>>> +- if MSI routing: the MSI data is used as interrupt ID (SPI or LPI).
>>
>> This feels just wrong. With GICv3, the MSI data is not the LPI at all.
>> It is an opaque value that gets translated into an LPI when combined
>> with the DeviceID.
> I agree with you. I need to rephrase that. In practice this is what
> should happen in the code since I use Andre's MSI injection routine
> which does the translation; except for GICv2 where last patch attempts
> to do direct gsi mapping from msi msg data!

Agreed. The code seems to do the right thing, only the documentation is
misleading.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


[PATCH] MAINTAINERS: Add vfio-platform sub-maintainer

2015-06-18 Thread Alex Williamson
Add Baptiste Reynal as the VFIO platform driver sub-maintainer.

Signed-off-by: Alex Williamson 
Cc: Baptiste Reynal 
---
 MAINTAINERS |6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d8afd29..c6bf7f6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10545,6 +10545,12 @@ F: drivers/vfio/
 F: include/linux/vfio.h
 F: include/uapi/linux/vfio.h
 
+VFIO PLATFORM DRIVER
+M: Baptiste Reynal 
+L: kvm@vger.kernel.org
+S: Maintained
+F: drivers/vfio/platform/
+
 VIDEOBUF2 FRAMEWORK
 M: Pawel Osciak 
 M: Marek Szyprowski 



Re: [RFC 4/6] KVM: arm/arm64: enable irqchip routing

2015-06-18 Thread Eric Auger
Hi Marc,
On 06/18/2015 07:53 PM, Marc Zyngier wrote:
> Hi Eric,
> 
> On 18/06/15 18:40, Eric Auger wrote:
>> This patch adds compilation and link against irqchip.
>>
>> On ARM, irqchip routing is not really useful since there is
>> a single irqchip. However main motivation behind using irqchip
>> code is to enable MSI routing code. With the support of in-kernel
>> GICv3 ITS emulation, it now seems to be a MUST HAVE requirement.
>>
>> Functions previously implemented in vgic.c and substitute
>> to more complex irqchip implementation are removed:
>>
>> - kvm_send_userspace_msi
>> - kvm_irq_map_chip_pin
>> - kvm_set_irq
>> - kvm_irq_map_gsi.
>>
>> They implemented a kernel default identity GSI routing. This is now
>> replaced by user-side provided routing.
>>
>> Routing standard hooks are now implemented in vgic:
>> - kvm_set_routing_entry
>> - kvm_set_irq
>> - kvm_set_msi
>>
>> Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
>> KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.
>>
>> MSI routing is not yet allowed.
>>
>> Signed-off-by: Eric Auger 
>> ---
>>  Documentation/virtual/kvm/api.txt | 11 --
>>  arch/arm/include/asm/kvm_host.h   |  2 +
>>  arch/arm/kvm/Kconfig  |  2 +
>>  arch/arm/kvm/Makefile |  2 +-
>>  arch/arm64/include/asm/kvm_host.h |  1 +
>>  arch/arm64/kvm/Kconfig|  2 +
>>  arch/arm64/kvm/Makefile   |  2 +-
>>  include/kvm/arm_vgic.h|  9 -
>>  virt/kvm/arm/vgic.c   | 78 
>> ---
>>  virt/kvm/irqchip.c|  2 +
>>  10 files changed, 67 insertions(+), 44 deletions(-)
>>
>> diff --git a/Documentation/virtual/kvm/api.txt 
>> b/Documentation/virtual/kvm/api.txt
>> index bcec91e..2bc96e1 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -1395,7 +1395,7 @@ KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or 
>> guest IRQ is allowed.
>>  4.52 KVM_SET_GSI_ROUTING
>>  
>>  Capability: KVM_CAP_IRQ_ROUTING
>> -Architectures: x86 s390
>> +Architectures: x86 s390 arm arm64
>>  Type: vm ioctl
>>  Parameters: struct kvm_irq_routing (in)
>>  Returns: 0 on success, -1 on error
>> @@ -2310,9 +2310,12 @@ Note that closing the resamplefd is not sufficient to 
>> disable the
>>  irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
>>  and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
>>  
>> -On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
>> -Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
>> -given by gsi + 32.
>> +On ARM/ARM64, when GSI routing is not used, the gsi field in the
>> +kvm_irqfd struct specifies the Shared Peripheral Interrupt (SPI) index,
>> +such that the GIC interrupt ID is given by gsi + 32. When GSI routing is
>> +setup:
>> +- if irqchip routing: irqchip.pin + 32 is the SPI ID that is injected
>> +- if MSI routing: the MSI data is used as interrupt ID (SPI or LPI).
> 
> This feels just wrong. With GICv3, the MSI data is not the LPI at all.
> It is an opaque value that gets translated into an LPI when combined
> with the DeviceID.
I agree with you. I need to rephrase that. In practice this is what
should happen in the code since I use Andre's MSI injection routine
which does the translation; except for GICv2 where last patch attempts
to do direct gsi mapping from msi msg data!

Thanks

Eric
> 
> Thanks,
> 
>   M.
> 



Re: [RFC 4/6] KVM: arm/arm64: enable irqchip routing

2015-06-18 Thread Marc Zyngier
Hi Eric,

On 18/06/15 18:40, Eric Auger wrote:
> This patch adds compilation and link against irqchip.
> 
> On ARM, irqchip routing is not really useful since there is
> a single irqchip. However main motivation behind using irqchip
> code is to enable MSI routing code. With the support of in-kernel
> GICv3 ITS emulation, it now seems to be a MUST HAVE requirement.
> 
> Functions previously implemented in vgic.c and substitute
> to more complex irqchip implementation are removed:
> 
> - kvm_send_userspace_msi
> - kvm_irq_map_chip_pin
> - kvm_set_irq
> - kvm_irq_map_gsi.
> 
> They implemented a kernel default identity GSI routing. This is now
> replaced by user-side provided routing.
> 
> Routing standard hooks are now implemented in vgic:
> - kvm_set_routing_entry
> - kvm_set_irq
> - kvm_set_msi
> 
> Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
> KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.
> 
> MSI routing is not yet allowed.
> 
> Signed-off-by: Eric Auger 
> ---
>  Documentation/virtual/kvm/api.txt | 11 --
>  arch/arm/include/asm/kvm_host.h   |  2 +
>  arch/arm/kvm/Kconfig  |  2 +
>  arch/arm/kvm/Makefile |  2 +-
>  arch/arm64/include/asm/kvm_host.h |  1 +
>  arch/arm64/kvm/Kconfig|  2 +
>  arch/arm64/kvm/Makefile   |  2 +-
>  include/kvm/arm_vgic.h|  9 -
>  virt/kvm/arm/vgic.c   | 78 
> ---
>  virt/kvm/irqchip.c|  2 +
>  10 files changed, 67 insertions(+), 44 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index bcec91e..2bc96e1 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1395,7 +1395,7 @@ KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or 
> guest IRQ is allowed.
>  4.52 KVM_SET_GSI_ROUTING
>  
>  Capability: KVM_CAP_IRQ_ROUTING
> -Architectures: x86 s390
> +Architectures: x86 s390 arm arm64
>  Type: vm ioctl
>  Parameters: struct kvm_irq_routing (in)
>  Returns: 0 on success, -1 on error
> @@ -2310,9 +2310,12 @@ Note that closing the resamplefd is not sufficient to 
> disable the
>  irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
>  and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
>  
> -On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
> -Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
> -given by gsi + 32.
> +On ARM/ARM64, when GSI routing is not used, the gsi field in the
> +kvm_irqfd struct specifies the Shared Peripheral Interrupt (SPI) index,
> +such that the GIC interrupt ID is given by gsi + 32. When GSI routing is
> +setup:
> +- if irqchip routing: irqchip.pin + 32 is the SPI ID that is injected
> +- if MSI routing: the MSI data is used as interrupt ID (SPI or LPI).

This feels just wrong. With GICv3, the MSI data is not the LPI at all.
It is an opaque value that gets translated into an LPI when combined
with the DeviceID.
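
To illustrate the distinction, here is a toy lookup, assuming a flat
mapping table. A real ITS walks a device table and a per-device interrupt
translation table; struct its_mapping and its_translate() are hypothetical
names used only for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy translation entry; not the layout used by any real ITS. */
struct its_mapping {
    uint32_t device_id;  /* out-of-band DeviceID of the writing device */
    uint32_t event_id;   /* the MSI data payload */
    uint32_t lpi;        /* resulting LPI (LPI interrupt IDs start at 8192) */
};

/* Returns the LPI for (device_id, event_id), or 0 if nothing is mapped. */
static uint32_t its_translate(const struct its_mapping *map, size_t n,
                              uint32_t device_id, uint32_t event_id)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (map[i].device_id == device_id && map[i].event_id == event_id)
            return map[i].lpi;
    }
    return 0;
}
```

Two devices writing the same MSI data can therefore end up with different
LPIs, which is why the data alone cannot be documented as the interrupt ID.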

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH 10/10] KVM: arm/arm64: vgic: Allow non-shared device HW interrupts

2015-06-18 Thread Eric Auger
On 06/18/2015 10:37 AM, Marc Zyngier wrote:
> On 17/06/15 16:50, Eric Auger wrote:
>> On 06/17/2015 05:37 PM, Marc Zyngier wrote:
>>> On 17/06/15 16:11, Eric Auger wrote:
 Hi Marc,
 On 06/08/2015 07:04 PM, Marc Zyngier wrote:
> So far, the only use of the HW interrupt facility is the timer,
> implying that the active state is context-switched for each vcpu,
> as the device is is shared across all vcpus.
 s/is//
>
> This does not work for a device that has been assigned to a VM,
> as the guest is entierely in control of that device (the HW is
 entirely?
> not shared). In that case, it makes sense to bypass the whole
> active state srtwitchint, and only track the deactivation of the
 switching
>>>
>>> Congratulations, I think you're now ready to try deciphering my
>>> handwriting... ;-)
>> good to see you're not a machine or maybe you do it on purpose some
>> times ;-)
>>>
> interrupt.
>
> Signed-off-by: Marc Zyngier 
> ---
>  include/kvm/arm_vgic.h|  5 +++--
>  virt/kvm/arm/arch_timer.c |  2 +-
>  virt/kvm/arm/vgic.c   | 37 -
>  3 files changed, 28 insertions(+), 16 deletions(-)
>
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 1c653c1..5d47d60 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -164,7 +164,8 @@ struct irq_phys_map {
>   u32 virt_irq;
>   u32 phys_irq;
>   u32 irq;
> - boolactive;
> + boolshared;
> + boolactive; /* Only valid if shared */
>  };
>  
>  struct vgic_dist {
> @@ -347,7 +348,7 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 
> reg);
>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>  struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
> -int virt_irq, int irq);
> +int virt_irq, int irq, bool shared);
>  int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>  bool vgic_get_phys_irq_active(struct irq_phys_map *map);
>  void vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index b9fff78..9544d79 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -202,7 +202,7 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>* Tell the VGIC that the virtual interrupt is tied to a
>* physical interrupt. We do that once per VCPU.
>*/
> - timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
> + timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq, true);
>   WARN_ON(!timer->map);
>  }
>  
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index f376b56..4223166 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1125,18 +1125,21 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu 
> *vcpu, int irq,
>   map = vgic_irq_map_search(vcpu, irq);
>  
>   if (map) {
> - int ret;
> -
> - BUG_ON(!map->active);
>   vlr.hwirq = map->phys_irq;
>   vlr.state |= LR_HW;
>   vlr.state &= ~LR_EOI_INT;
>  
> - ret = irq_set_irqchip_state(map->irq,
> - IRQCHIP_STATE_ACTIVE,
> - true);
>   vgic_irq_set_queued(vcpu, irq);

 the queued state is set again in vgic_queue_hwirq for level_sensitive
 IRQs although not harmful.
>>>
>>> Indeed. We still need it for edge interrupts though. I'll try to find a
>>> nicer way...
>>>
> - WARN_ON(ret);
> +
> + if (map->shared) {
> + int ret;
> +
> + BUG_ON(!map->active);
> + ret = irq_set_irqchip_state(map->irq,
> + 
> IRQCHIP_STATE_ACTIVE,
> + true);
> + WARN_ON(ret);
> + }
>   }
>   }
>  
> @@ -1368,21 +1371,28 @@ static bool vgic_process_maintenance(struct 
> kvm_vcpu *vcpu)
>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>  {
>   struct irq_phys_map *map;
> + bool active;
>   int ret;
>  
>   if (!(vlr.state & LR_HW))
>   return 0;
>  
>   map = vgic_irq_map_search(vcpu, vlr.irq);
> - BUG_ON(!map || !map->active);

[RFC 1/6] KVM: api: add kvm_irq_routing_extended_msi

2015-06-18 Thread Eric Auger
On ARM, the MSI msg (address and data) comes along with
out-of-band device ID information. The device ID identifies the device
that composes the MSI msg. Let's create a new routing entry structure
that allows encoding that information on top of the standard MSI
message.

Signed-off-by: Eric Auger 
---
 Documentation/virtual/kvm/api.txt | 9 +
 include/uapi/linux/kvm.h  | 9 +
 2 files changed, 18 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d20fd94..bcec91e 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1419,6 +1419,7 @@ struct kvm_irq_routing_entry {
struct kvm_irq_routing_irqchip irqchip;
struct kvm_irq_routing_msi msi;
struct kvm_irq_routing_s390_adapter adapter;
+   struct kvm_irq_routing_extended_msi ext_msi;
__u32 pad[8];
} u;
 };
@@ -1427,6 +1428,7 @@ struct kvm_irq_routing_entry {
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
 #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 
 No flags are specified so far, the corresponding field must be set to zero.
 
@@ -1442,6 +1444,13 @@ struct kvm_irq_routing_msi {
__u32 pad;
 };
 
+struct kvm_irq_routing_extended_msi {
+   __u32 address_lo;
+   __u32 address_hi;
+   __u32 data;
+   __u32 devid;
+};
+
 struct kvm_irq_routing_s390_adapter {
__u64 ind_addr;
__u64 summary_addr;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2a23705..e3f65a0 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -829,6 +829,13 @@ struct kvm_irq_routing_msi {
__u32 pad;
 };
 
+struct kvm_irq_routing_extended_msi {
+   __u32 address_lo;
+   __u32 address_hi;
+   __u32 data;
+   __u32 devid;
+};
+
 struct kvm_irq_routing_s390_adapter {
__u64 ind_addr;
__u64 summary_addr;
@@ -841,6 +848,7 @@ struct kvm_irq_routing_s390_adapter {
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
 #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 
 struct kvm_irq_routing_entry {
__u32 gsi;
@@ -851,6 +859,7 @@ struct kvm_irq_routing_entry {
struct kvm_irq_routing_irqchip irqchip;
struct kvm_irq_routing_msi msi;
struct kvm_irq_routing_s390_adapter adapter;
+   struct kvm_irq_routing_extended_msi ext_msi;
__u32 pad[8];
} u;
 };
-- 
1.9.1
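
For illustration only, here is a minimal userspace-side sketch of how an extended MSI routing entry with the layout proposed above might be filled in. The `demo_` structures are local, hypothetical mirrors of the structs in the diff (they are not part of any header), so the fragment builds without a patched `linux/kvm.h`. The helper name `demo_make_ext_msi` is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of the proposed kvm_irq_routing_extended_msi. */
struct demo_ext_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t devid;
};

/* Hypothetical local mirror of kvm_irq_routing_entry, reduced to one member. */
struct demo_routing_entry {
	uint32_t gsi;
	uint32_t type;
	uint32_t flags;
	uint32_t pad;
	union {
		struct demo_ext_msi ext_msi;
		uint32_t pad[8];
	} u;
};

#define DEMO_IRQ_ROUTING_EXTENDED_MSI 4	/* value proposed by the patch */

/* The new member must not grow the union beyond its 8-word padding. */
_Static_assert(sizeof(struct demo_ext_msi) <= 8 * sizeof(uint32_t),
	       "extended MSI entry must fit in the existing union");

/* Build one routing entry that maps a gsi to an MSI plus device ID. */
static struct demo_routing_entry demo_make_ext_msi(uint32_t gsi, uint64_t addr,
						   uint32_t data, uint32_t devid)
{
	struct demo_routing_entry e;

	memset(&e, 0, sizeof(e));	/* flags and padding stay zero */
	e.gsi = gsi;
	e.type = DEMO_IRQ_ROUTING_EXTENDED_MSI;
	e.u.ext_msi.address_lo = (uint32_t)(addr & 0xffffffffu);
	e.u.ext_msi.address_hi = (uint32_t)(addr >> 32);
	e.u.ext_msi.data = data;
	e.u.ext_msi.devid = devid;
	return e;
}
```

The static assertion captures the one ABI constraint visible in the diff: the new union member has to fit inside the existing `pad[8]` so that the size of the routing entry does not change.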

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC 0/6] KVM: arm/arm64: gsi routing support

2015-06-18 Thread Eric Auger
With the advent of GICv3 ITS in-kernel emulation, KVM GSI routing
appears to be required. More specifically, MSI routing is needed.
irqchip routing does not seem really useful on ARM, but using
MSI routing also requires integrating irqchip routing. The initial
implementation of irqfd on ARM must be upgraded by integrating
the KVM irqchip.c code and implementing its standard hooks
in the architecture-specific part.

The series therefore allows and mandates the usage of KVM_SET_GSI_ROUTING
ioctl along with KVM_IRQFD. If the userspace does not define any routing
table, no irqfd injection can happen. The user-space can use
KVM_CAP_IRQ_ROUTING to detect whether a routing table is needed.

For irqchip routing, the convention is that only SPIs can be injected and the
SPI ID corresponds to irqchip.pin + 32. For MSI routing, the interrupt ID
matches the MSI msg data. The API evolves to support associating a device ID
with a routing entry.
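
The two conventions above can be sketched as a small helper. This is only an illustration under stated assumptions: the `demo_` names are hypothetical, and the pin limit of 988 is the value this series picks for KVM_IRQCHIP_NUM_PINS.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SPI_BASE	32u	/* first SPI interrupt ID on the GIC */
#define DEMO_NUM_PINS	988u	/* 1020 - 32, as chosen in this series */

enum demo_route_type { DEMO_ROUTE_IRQCHIP, DEMO_ROUTE_MSI };

/* Return the interrupt ID to inject, or -1 if the irqchip pin is out of range. */
static int64_t demo_route_to_intid(enum demo_route_type type, uint32_t pin_or_data)
{
	switch (type) {
	case DEMO_ROUTE_IRQCHIP:
		if (pin_or_data >= DEMO_NUM_PINS)
			return -1;
		return (int64_t)pin_or_data + DEMO_SPI_BASE;	/* pin + 32 */
	case DEMO_ROUTE_MSI:
		return pin_or_data;	/* MSI data is used as the interrupt ID */
	}
	return -1;
}
```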

Known Issues of this RFC:

- One of the biggest is the API inconsistency on ARM. Blame me.
  Routing should apply to the KVM_IRQ_LINE ioctl, which is not the case yet
  in this series. It only applies to irqfd.
  On x86, KVM_IRQ_LINE is typically plugged onto irqchip.c kvm_set_irq,
  whereas on ARM we inject directly through kvm_vgic_inject_irq.
  On arm/arm64, the gsi has a specific structure:
    bits:  | 31 ... 24 | 23  ... 16 | 15...0 |
    field: | irq_type  | vcpu_index | irq_id |
  where irq_id matches the Interrupt ID.
- For KVM_IRQFD without routing (current implementation), the gsi field
  corresponds to an SPI index = irq_id (above) - 32.
- As far as I understand the QEMU integration, gsi is supposed to be within
  [0, KVM_MAX_IRQ_ROUTES]. It is difficult to use the KVM_IRQ_LINE gsi.
- It remains to be defined what convention we choose when irqchip routing is
  applied: gsi -> irqchip input pin.
- Or shouldn't we simply rule out any userspace irqchip routing and stick
  to MSI routing? We could define a fixed identity in-kernel irqchip mapping
  and only offer MSI routing.
- Static allocation of chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
  KVM_IRQCHIP_NUM_PINS is arbitrarily set to 1020 - 32 (SPI count). On s390
  this is even bigger.
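
To make the mismatch between the two gsi encodings listed above concrete, here is a small decoding sketch. The struct and function names are hypothetical; only the bit layout and the "irq_id - 32" relation come from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Fields packed into the 32-bit value described in the layout above. */
struct demo_irq_line {
	uint8_t  irq_type;	/* bits 31..24 */
	uint8_t  vcpu_index;	/* bits 23..16 */
	uint16_t irq_id;	/* bits 15..0, the GIC interrupt ID */
};

static struct demo_irq_line demo_decode_irq_line(uint32_t irq)
{
	struct demo_irq_line d;

	d.irq_type   = (uint8_t)((irq >> 24) & 0xff);
	d.vcpu_index = (uint8_t)((irq >> 16) & 0xff);
	d.irq_id     = (uint16_t)(irq & 0xffff);
	return d;
}

/*
 * irqfd without routing uses an SPI index instead: gsi = irq_id - 32.
 * Only meaningful for SPIs, i.e. irq_id >= 32.
 */
static uint32_t demo_irqfd_gsi_from_irq_id(uint16_t irq_id)
{
	return (uint32_t)irq_id - 32u;
}
```

The same interrupt is thus named by two different numbers depending on the ioctl, which is the inconsistency the first item refers to.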

Currently tested on irqchip routing only (Calxeda midway only),
ie NOT TESTED on MSI routing yet.

This is a very preliminary RFC to ease the discussion.

Code can be found at 
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.1-rc8-gsi-routing-rfc

It applies on Andre's [PATCH 00/13] arm64: KVM: GICv3 ITS emulation
(http://www.spinics.net/lists/kvm/msg117402.html)

Eric Auger (6):
  KVM: api: add kvm_irq_routing_extended_msi
  KVM: kvm_host: add kvm_extended_msi
  KVM: irqchip: convey devid to kvm_set_msi
  KVM: arm/arm64: enable irqchip routing
  KVM: arm/arm64: enable MSI routing
  KVM: arm: implement kvm_set_msi by gsi direct mapping

 Documentation/virtual/kvm/api.txt | 20 ++--
 arch/arm/include/asm/kvm_host.h   |  2 +
 arch/arm/kvm/Kconfig  |  3 ++
 arch/arm/kvm/Makefile |  2 +-
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/Kconfig|  2 +
 arch/arm64/kvm/Makefile   |  2 +-
 include/kvm/arm_vgic.h|  9 
 include/linux/kvm_host.h  | 10 
 include/uapi/linux/kvm.h  |  9 
 virt/kvm/arm/vgic.c   | 96 +++
 virt/kvm/irqchip.c| 20 ++--
 12 files changed, 128 insertions(+), 48 deletions(-)

-- 
1.9.1



[RFC 5/6] KVM: arm/arm64: enable MSI routing

2015-06-18 Thread Eric Auger
Up to now, only irqchip routing entries could be set. This patch
adds the capability to insert MSI routing entries, extended or
standard ones. Although standard MSI entries can be set, their
injection still is not supported. For ARM64, let's also increase
KVM_MAX_IRQ_ROUTES to 4096.

Signed-off-by: Eric Auger 
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/arm/vgic.c  | 13 +
 2 files changed, 15 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e1c1c0d..6cacf11 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -927,6 +927,8 @@ static inline int mmu_notifier_retry(struct kvm *kvm, 
unsigned long mmu_seq)
 
 #ifdef CONFIG_S390
 #define KVM_MAX_IRQ_ROUTES 4096 //FIXME: we can have more than that...
+#elif defined(CONFIG_ARM64)
+#define KVM_MAX_IRQ_ROUTES 4096 //FIXME: we can have more than that too...
 #else
 #define KVM_MAX_IRQ_ROUTES 1024
 #endif
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 212a5ff..16d232f 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2256,6 +2256,19 @@ int kvm_set_routing_entry(struct 
kvm_kernel_irq_routing_entry *e,
(e->irqchip.irqchip >= KVM_NR_IRQCHIPS))
goto out;
break;
+   case KVM_IRQ_ROUTING_MSI:
+   e->set = kvm_set_msi;
+   e->msi.address_lo = ue->u.msi.address_lo;
+   e->msi.address_hi = ue->u.msi.address_hi;
+   e->msi.data = ue->u.msi.data;
+   break;
+   case KVM_IRQ_ROUTING_EXTENDED_MSI:
+   e->set = kvm_set_msi;
+   e->ext_msi.address_lo = ue->u.ext_msi.address_lo;
+   e->ext_msi.address_hi = ue->u.ext_msi.address_hi;
+   e->ext_msi.data = ue->u.ext_msi.data;
+   e->ext_msi.devid = ue->u.ext_msi.devid;
+   break;
default:
goto out;
}
-- 
1.9.1



[RFC 3/6] KVM: irqchip: convey devid to kvm_set_msi

2015-06-18 Thread Eric Auger
On ARM, a devid field is conveyed in the kvm_msi struct. Let's choose the
routing type and struct according to its availability and fill in the
corresponding struct. Also remove the flag check now that the flags can
be non-zero.

Signed-off-by: Eric Auger 
---
 virt/kvm/irqchip.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 1d56a90..e76c7d2 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -73,12 +73,22 @@ int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi 
*msi)
 {
struct kvm_kernel_irq_routing_entry route;
 
-   if (!irqchip_in_kernel(kvm) || msi->flags != 0)
+   if (!irqchip_in_kernel(kvm))
return -EINVAL;
 
-   route.msi.address_lo = msi->address_lo;
-   route.msi.address_hi = msi->address_hi;
-   route.msi.data = msi->data;
+   if (msi->flags & KVM_MSI_VALID_DEVID) {
+   route.type = KVM_IRQ_ROUTING_EXTENDED_MSI;
+   route.ext_msi.address_lo = msi->address_lo;
+   route.ext_msi.address_hi = msi->address_hi;
+   route.ext_msi.data = msi->data;
+   route.ext_msi.devid= msi->devid;
+   }
+   else {
+   route.type = KVM_IRQ_ROUTING_MSI;
+   route.msi.address_lo = msi->address_lo;
+   route.msi.address_hi = msi->address_hi;
+   route.msi.data = msi->data;
+   }
 
return kvm_set_msi(&route, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1, false);
 }
-- 
1.9.1
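
The selection logic in this patch can be sketched outside the kernel as follows. This is a simplified illustration, not the kernel code: the `demo_` types flatten the kernel's union into a single struct, and the flag value `1u << 0` for the device-ID-valid bit is an assumption, since this thread does not show its definition.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MSI_VALID_DEVID		(1u << 0)  /* assumed flag value */
#define DEMO_ROUTING_MSI		2
#define DEMO_ROUTING_EXTENDED_MSI	4

/* Hypothetical stand-in for the userspace MSI request. */
struct demo_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t flags;
	uint32_t devid;
};

/* Hypothetical flattened stand-in for the kernel routing entry. */
struct demo_route {
	uint32_t type;
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t devid;		/* only meaningful for the extended type */
};

/* Pick the routing type from the flags, then copy the matching fields. */
static struct demo_route demo_route_from_msi(const struct demo_msi *msi)
{
	struct demo_route r = { 0, 0, 0, 0, 0 };

	r.address_lo = msi->address_lo;
	r.address_hi = msi->address_hi;
	r.data = msi->data;
	if (msi->flags & DEMO_MSI_VALID_DEVID) {
		r.type = DEMO_ROUTING_EXTENDED_MSI;
		r.devid = msi->devid;
	} else {
		r.type = DEMO_ROUTING_MSI;	/* devid is ignored */
	}
	return r;
}
```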



[RFC 6/6] KVM: arm: implement kvm_set_msi by gsi direct mapping

2015-06-18 Thread Eric Auger
If the ITS modality is not available, let's simply support MSI
injection by transforming the MSI.data into an SPI ID.

It then becomes possible to use the KVM_SIGNAL_MSI ioctl on arm too.

Signed-off-by: Eric Auger 
---
 arch/arm/kvm/Kconfig | 1 +
 virt/kvm/arm/vgic.c  | 5 +
 2 files changed, 6 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 151e710..0f58baf 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -31,6 +31,7 @@ config KVM
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
+   select HAVE_KVM_MSI
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 16d232f..40e96f9 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2293,6 +2293,11 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
return kvm->arch.vgic.vm_ops.inject_msi(kvm, &msi);
else
return -ENODEV;
+   case KVM_IRQ_ROUTING_MSI:
+   if (kvm->arch.vgic.vm_ops.inject_msi)
+   return -EINVAL;
+   else
+   return kvm_vgic_inject_irq(kvm, 0, e->msi.data, level);
default:
return -EINVAL;
}
-- 
1.9.1
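
The resulting dispatch can be modelled in plain C as below. This is a sketch under assumptions: `demo_vgic`, `demo_set_msi` and `demo_its_inject` are hypothetical names, and the direct injection is reduced to recording the interrupt ID instead of calling kvm_vgic_inject_irq.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*demo_inject_msi_fn)(uint32_t devid, uint32_t data);

struct demo_vgic {
	demo_inject_msi_fn inject_msi;	/* NULL when no ITS emulation exists */
	uint32_t last_spi;		/* last interrupt ID injected directly */
};

enum { DEMO_ROUTING_MSI = 2, DEMO_ROUTING_EXTENDED_MSI = 4 };

/* Stub ITS injection hook, used only to exercise the ITS path. */
static int demo_its_inject(uint32_t devid, uint32_t data)
{
	(void)devid;
	(void)data;
	return 1;
}

static int demo_set_msi(struct demo_vgic *vgic, int type,
			uint32_t devid, uint32_t data)
{
	switch (type) {
	case DEMO_ROUTING_EXTENDED_MSI:
		/* Extended entries need an ITS to translate devid + data. */
		if (vgic->inject_msi)
			return vgic->inject_msi(devid, data);
		return -ENODEV;
	case DEMO_ROUTING_MSI:
		/* Standard entries are only accepted when there is no ITS. */
		if (vgic->inject_msi)
			return -EINVAL;
		vgic->last_spi = data;	/* MSI data is taken as the SPI ID */
		return 0;
	default:
		return -EINVAL;
	}
}
```

Note that the two entry types are mutually exclusive in this model: with an ITS only extended entries are injected, and without one only standard entries are.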



[RFC 4/6] KVM: arm/arm64: enable irqchip routing

2015-06-18 Thread Eric Auger
This patch adds compilation of, and linking against, the irqchip code.

On ARM, irqchip routing is not really useful since there is
a single irqchip. However, the main motivation behind using the irqchip
code is to enable the MSI routing code. With the support of in-kernel
GICv3 ITS emulation, it now seems to be a MUST HAVE requirement.

Functions previously implemented in vgic.c as substitutes for
the more complex irqchip implementation are removed:

- kvm_send_userspace_msi
- kvm_irq_map_chip_pin
- kvm_set_irq
- kvm_irq_map_gsi.

They implemented a kernel default identity GSI routing. This is now
replaced by routing provided by userspace.

Standard routing hooks are now implemented in vgic:
- kvm_set_routing_entry
- kvm_set_irq
- kvm_set_msi

Both HAVE_KVM_IRQCHIP and HAVE_KVM_IRQ_ROUTING are defined.
KVM_CAP_IRQ_ROUTING is advertised and KVM_SET_GSI_ROUTING is allowed.

MSI routing is not yet allowed.

Signed-off-by: Eric Auger 
---
 Documentation/virtual/kvm/api.txt | 11 --
 arch/arm/include/asm/kvm_host.h   |  2 +
 arch/arm/kvm/Kconfig  |  2 +
 arch/arm/kvm/Makefile |  2 +-
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/Kconfig|  2 +
 arch/arm64/kvm/Makefile   |  2 +-
 include/kvm/arm_vgic.h|  9 -
 virt/kvm/arm/vgic.c   | 78 ---
 virt/kvm/irqchip.c|  2 +
 10 files changed, 67 insertions(+), 44 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index bcec91e..2bc96e1 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1395,7 +1395,7 @@ KVM_ASSIGN_DEV_IRQ. Partial deassignment of host or guest 
IRQ is allowed.
 4.52 KVM_SET_GSI_ROUTING
 
 Capability: KVM_CAP_IRQ_ROUTING
-Architectures: x86 s390
+Architectures: x86 s390 arm arm64
 Type: vm ioctl
 Parameters: struct kvm_irq_routing (in)
 Returns: 0 on success, -1 on error
@@ -2310,9 +2310,12 @@ Note that closing the resamplefd is not sufficient to 
disable the
 irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
 and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
 
-On ARM/ARM64, the gsi field in the kvm_irqfd struct specifies the Shared
-Peripheral Interrupt (SPI) index, such that the GIC interrupt ID is
-given by gsi + 32.
+On ARM/ARM64, when GSI routing is not used, the gsi field in the
+kvm_irqfd struct specifies the Shared Peripheral Interrupt (SPI) index,
+such that the GIC interrupt ID is given by gsi + 32. When GSI routing is
+setup:
+- if irqchip routing: irqchip.pin + 32 is the SPI ID that is injected
+- if MSI routing: the MSI data is used as interrupt ID (SPI or LPI).
 
 4.76 KVM_PPC_ALLOCATE_HTAB
 
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index d71607c..452697e 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -42,6 +42,8 @@
 
 #define KVM_VCPU_MAX_FEATURES 2
 
+#define KVM_IRQCHIP_NUM_PINS 988 /* 1020 -32 is the number of SPI */
+
 #include 
 
 u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index bfb915d..151e710 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -31,6 +31,8 @@ config KVM
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
+   select HAVE_KVM_IRQCHIP
+   select HAVE_KVM_IRQ_ROUTING
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
---help---
  Support hosting virtualized guest machines.
diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index c5eef02c..1a8f48a 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -15,7 +15,7 @@ AFLAGS_init.o := -Wa,-march=armv7-a$(plus_virt)
 AFLAGS_interrupts.o := -Wa,-march=armv7-a$(plus_virt)
 
 KVM := ../../../virt/kvm
-kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o 
$(KVM)/vfio.o
+kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o 
$(KVM)/vfio.o $(KVM)/irqchip.o
 
 obj-y += kvm-arm.o init.o interrupts.o
 obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index f0f58c9..751210a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -44,6 +44,7 @@
 #include 
 
 #define KVM_VCPU_MAX_FEATURES 3
+#define KVM_IRQCHIP_NUM_PINS 988 /* 1020 -32 is the number of SPI */
 
 int __attribute_const__ kvm_target_cpu(void);
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ff9722f..1a9900d 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -32,6 +32,8 @@ config KVM
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
select HAVE_KVM_MSI
+   select HAVE_KVM_IRQCHIP
+   select HAVE_KVM_IRQ_ROUTING
---help---
  Support hosting virtualized guest machines.
 
diff --git a/

[RFC 2/6] KVM: kvm_host: add kvm_extended_msi

2015-06-18 Thread Eric Auger
As a follow-up to the user API extension, let's create a corresponding
kernel-side structure.

Signed-off-by: Eric Auger 
---
 include/linux/kvm_host.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ad45054..e1c1c0d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -304,6 +304,13 @@ struct kvm_s390_adapter_int {
u32 adapter_id;
 };
 
+struct kvm_extended_msi {
+   u32 address_lo; /* low 32 bits of msi message address */
+   u32 address_hi; /* high 32 bits of msi message address */
+   u32 data;   /* 16 bits of msi message data */
+   u32 devid;  /* out-of-band device ID */
+};
+
 struct kvm_kernel_irq_routing_entry {
u32 gsi;
u32 type;
@@ -317,6 +324,7 @@ struct kvm_kernel_irq_routing_entry {
} irqchip;
struct msi_msg msi;
struct kvm_s390_adapter_int adapter;
+   struct kvm_extended_msi ext_msi;
};
struct hlist_node link;
 };
-- 
1.9.1



Re: [PATCH v2 2/2] arm: KVM: keep arm vfp/simd exit handling consistent with arm64

2015-06-18 Thread Marc Zyngier
On 16/06/15 22:50, Mario Smarduch wrote:
> After enhancing arm64 FP/SIMD exit handling, FP/SIMD exit branch is moved
> to guest trap handling. This keeps exiting handling flow between both
> architectures consistent.
> 
> Signed-off-by: Mario Smarduch 
> ---
>  arch/arm/kvm/interrupts.S |   12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index 79caf79..fca2c56 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
> @@ -363,10 +363,6 @@ hyp_hvc:
>   @ Check syndrome register
>   mrc p15, 4, r1, c5, c2, 0   @ HSR
>   lsr r0, r1, #HSR_EC_SHIFT
> -#ifdef CONFIG_VFPv3
> - cmp r0, #HSR_EC_CP_0_13
> - beq switch_to_guest_vfp
> -#endif
>   cmp r0, #HSR_EC_HVC
>   bne guest_trap  @ Not HVC instr.
>  
> @@ -406,6 +402,12 @@ THUMB(   orr lr, #1)
>  1:   eret
>  
>  guest_trap:
> +#ifdef CONFIG_VFPv3
> + /* Guest accessed VFP/SIMD registers, save host, restore Guest */
> + cmp r0, #HSR_EC_CP_0_13
> + beq switch_to_guest_fpsimd
> +#endif
> +
>   load_vcpu   @ Load VCPU pointer to r0
>   str r1, [vcpu, #VCPU_HSR]
>  
> @@ -478,7 +480,7 @@ guest_trap:
>   * inject an undefined exception to the guest.
>   */
>  #ifdef CONFIG_VFPv3
> -switch_to_guest_vfp:
> +switch_to_guest_fpsimd:

Ah, I think I managed to confuse you in my previous comment.
On ARMv7, we call the floating point stuff VFP.
On ARMv8, we call it FP/SIMD.

Not very consistent, I know...

>   load_vcpu   @ Load VCPU pointer to r0

It would be interesting to find out if we can make this load_vcpu part
of the common sequence (without spilling another register, of course).
Probably involves moving the exception class to r2.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: KVM slow LAMP guest

2015-06-18 Thread David Matlack
On Thu, Jun 18, 2015 at 1:25 AM, Hansa  wrote:
> Hi,
>
> I have a LAMP server as guest in KVM. Whenever the server is idle for some
> time it takes about 30 seconds to load a Wordpress site.
> If the server is not idle the site shows up in max 5 seconds. I've already
> turned of power management in the guest by passing
>
> GRUB_CMDLINE_LINUX_DEFAULT="apm=off"
>
> in /etc/default/grub. This has no effect.
> Does KVM do some power management on guests? If so, how do I turn this off
> for my LAMP guest?

KVM doesn't do any power management of guests. But if everything is idle on
the host (including your guest), then host power management could kick in.
Have you tried playing with host pm?

Could you try running your workload with the guest kernel parameter "idle=poll"
and let me know the performance?

Also, if you are running Linux 4.0 or later on the host, could you try running
your workload with the KVM module parameter "halt_poll_ns=50"?

>
> Best, Hansa
>
>


Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-06-18 Thread Andre Przywara
Hi Will,

On 06/16/2015 06:06 PM, Will Deacon wrote:
> On Mon, Jun 15, 2015 at 11:45:38AM +0100, Andre Przywara wrote:
>> On 06/05/2015 05:41 PM, Will Deacon wrote:
>>> On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
 In PCI config space there is an interrupt line field (offset 0x3f),
 which is used to initially communicate the IRQ line number from
 firmware to the OS. _Hardware_ should never use this information,
 as the OS is free to write any information in there.
 But kvmtool uses this number when it triggers IRQs in the guest,
 which fails starting with Linux 3.19-rc1, where the PCI layer starts
 writing the virtual IRQ number in there.

 Fix that by storing the IRQ number in a separate field in
 struct virtio_pci, which is independent from the PCI config space
 and cannot be influenced by the guest.
 This fixes ARM/ARM64 guests using PCI with newer kernels.

 Signed-off-by: Andre Przywara 
 ---
  include/kvm/virtio-pci.h | 8 
  virtio/pci.c | 9 ++---
  2 files changed, 14 insertions(+), 3 deletions(-)

 diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
 index c795ce7..b70cadd 100644
 --- a/include/kvm/virtio-pci.h
 +++ b/include/kvm/virtio-pci.h
 @@ -30,6 +30,14 @@ struct virtio_pci {
u8  isr;
u32 features;
  
 +  /*
 +   * We cannot rely on the INTERRUPT_LINE byte in the config space once
 +   * we have run guest code, as the OS is allowed to use that field
 +   * as a scratch pad to communicate between driver and PCI layer.
 +   * So store our legacy interrupt line number in here for internal use.
 +   */
 +  u8  legacy_irq_line;
 +
/* MSI-X */
u16 config_vector;
u32 config_gsi;
 diff --git a/virtio/pci.c b/virtio/pci.c
 index 7556239..e17e5a9 100644
 --- a/virtio/pci.c
 +++ b/virtio/pci.c
 @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, 
 struct kvm_cpu *vcpu, u16 p
break;
case VIRTIO_PCI_ISR:
ioport__write8(data, vpci->isr);
 -  kvm__irq_line(kvm, vpci->pci_hdr.irq_line, VIRTIO_IRQ_LOW);
 +  kvm__irq_line(kvm, vpci->legacy_irq_line, VIRTIO_IRQ_LOW);
vpci->isr = VIRTIO_IRQ_LOW;
break;
default:
 @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct 
 virtio_device *vdev, u32 vq)
kvm__irq_trigger(kvm, vpci->gsis[vq]);
} else {
vpci->isr = VIRTIO_IRQ_HIGH;
 -  kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
 +  kvm__irq_trigger(kvm, vpci->legacy_irq_line);
}
return 0;
  }
 @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct 
 virtio_device *vdev)
kvm__irq_trigger(kvm, vpci->config_gsi);
} else {
vpci->isr = VIRTIO_PCI_ISR_CONFIG;
 -  kvm__irq_trigger(kvm, vpci->pci_hdr.irq_line);
 +  kvm__irq_trigger(kvm, vpci->legacy_irq_line);
}
  
return 0;
 @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, 
 struct virtio_device *vdev,
if (r < 0)
goto free_msix_mmio;
  
 +  /* save the IRQ that device__register() has allocated */
 +  vpci->legacy_irq_line = vpci->pci_hdr.irq_line;
>>>
>>> I'd rather we used the container_of trick that we do for virtio-mmio
>>> devices when assigning the irq in device__register. Then we can avoid
>>> this line completely.
>>
>> Not completely sure I get what you mean, I take it you want to assign
>> legacy_irq_line in pci__assign_irq() directly (where the IRQ number is
>> allocated).
>> But this function is PCI generic code and is used by the VESA
>> framebuffer and the shmem device on x86 as well. For those devices
>> dev_hdr is not part of a struct virtio_pci, so we can't do container_of
>> to assign the legacy_irq_line here directly.
>> Admittedly this fix should apply to the other two users as well, but
>> VESA does not use interrupts and pci-shmem is completely broken anyway,
>> so I didn't bother to fix it in this regard.
>> Would it be justified to provide an IRQ number field in struct
>> device_header to address all users?
>>
>> Or what am I missing here?
> 
> If VESA and shmem are broken, they should either be fixed or removed.

I am tempted to remove shmem, since it's broken:
a) there is no upstream driver, only some out-of-tree uio driver module
in some Github repo
b) the PCI device BARs do not match what QEMU implements and what the
uio driver expects (IO BAR vs. MMIO BAR)
c) there is (at least one) bug in kvmtool (easily fixed, though)
I haven't completely given up yet fixing it, but that's for another

Re: [PATCH v2 1/2] arm64: KVM: Optimize arm64 fp/simd save/restore

2015-06-18 Thread Marc Zyngier
On 16/06/15 22:50, Mario Smarduch wrote:
> This patch only saves and restores FP/SIMD registers on Guest access. To do
> this cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit.
> lmbench, hackbench show significant improvements, for 30-50% exits FP/SIMD
> context is not saved/restored
> 
> Signed-off-by: Mario Smarduch 

Looks nice and clean.

Reviewed-by: Marc Zyngier 

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Paolo Bonzini


On 18/06/2015 16:47, Michael S. Tsirkin wrote:
>> However, with Igor's patches a memory_region_del_subregion will cause a
>> mmap(MAP_NORESERVE), which _does_ have the effect of making the hva go away.
>>
>> I guess one way to do it would be to alias the same page in two places,
>> one for use by vhost and one for use by everything else.  However, the
>> kernel does not provide the means to do this kind of aliasing for
>> anonymous mmaps.
> 
> Basically pages go away on munmap, so won't simple
>   lock
>   munmap
>   mmap(MAP_NORESERVE)
>   unlock
> do the trick?

Not sure I follow.  Here we have this:

VCPU 1                             VCPU 2                        I/O worker
----------------------------------------------------------------------------
take big QEMU lock
p = address_space_map(hva, len)
pass I/O request to worker thread
                                                                 read(fd, p, len)
release big QEMU lock

                                   memory_region_del_subregion
                                     mmap(MAP_NORESERVE)

                                                                 read returns EFAULT
                                                                 wake up VCPU 1
take big QEMU lock
EFAULT?  What's that?

In another scenario you are less lucky: the memory accesses
between address_space_map/unmap aren't done in the kernel and
you get a plain old SIGSEGV.

This is not something that you can fix with a lock.  The very
purpose of the map/unmap API is to do stuff asynchronously while
the lock is released.
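
As a rough illustration of why no lock helps here, the following Linux-only sketch shows that remapping a range in place discards whatever an earlier user of the pointer had stored there. It is only a sketch under assumptions: the function name is hypothetical, and the use of MAP_FIXED with a readable and writable replacement mapping is a guess, since the exact mmap flags and protection used by the patches are not shown in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if data written before the remap is gone afterwards,
 * 0 if it survived, and -1 if mmap failed.
 */
static int demo_remap_discards_data(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p, *q;
	int gone;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	p[0] = 0xaa;	/* stands in for RAM an I/O worker still points at */

	/* Replace the range in place while the old pointer is still in use. */
	q = mmap(p, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE, -1, 0);
	if (q == MAP_FAILED) {
		munmap(p, len);
		return -1;
	}
	gone = (q[0] != 0xaa);	/* fresh anonymous pages read as zero */
	munmap(q, len);
	return gone;
}
```

The pointer value does not change, so a worker holding it from address_space_map has no way to notice that the memory underneath it was replaced.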

Thanks,

Paolo


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Igor Mammedov
On Thu, 18 Jun 2015 16:47:33 +0200
"Michael S. Tsirkin"  wrote:

> On Thu, Jun 18, 2015 at 03:46:14PM +0200, Paolo Bonzini wrote:
> > 
> > 
> > On 18/06/2015 15:19, Michael S. Tsirkin wrote:
> > > On Thu, Jun 18, 2015 at 01:50:32PM +0200, Paolo Bonzini wrote:
> > >>
> > >>
> > >> On 18/06/2015 13:41, Michael S. Tsirkin wrote:
> > >>> On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
> >  Lets leave decision upto users instead of making them live with
> >  crashing guests.
> > >>>
> > >>> Come on, let's fix it in userspace.
> > >>
> > >> It's not trivial to fix it in userspace.  Since QEMU uses RCU there
> > >> isn't a single memory map to use for a linear gpa->hva map.
> > > 
> > > Could you elaborate?
> > > 
> > > I'm confused by this mention of RCU.
> > > You use RCU for accesses to the memory map, correct?
> > > So memory map itself is a write side operation, as such all you need to
> > > do is take some kind of lock to prevent conflicting with other memory
> > > maps, do rcu sync under this lock.
> > 
> > You're right, the problem isn't directly related to RCU.  RCU would be
> > easy to handle by using synchronize_rcu instead of call_rcu.  While I
> > identified an RCU-related problem with Igor's patches, it's much more
> > entrenched.
> > 
> > RAM can be used by asynchronous operations while the VM runs, between
> > address_space_map and address_space_unmap.  It is possible and common to
> > have a quiescent state between the map and unmap, and a memory map
> > change can happen in the middle of this.  Normally this is not a
> > problem, because changes to the memory map do not make the hva go away
> > (memory regions are reference counted).
> 
> Right, so you want mmap(MAP_NORESERVE) when that reference
> count becomes 0.
> 
> > However, with Igor's patches a memory_region_del_subregion will cause a
> > mmap(MAP_NORESERVE), which _does_ have the effect of making the hva go away.
> > 
> > I guess one way to do it would be to alias the same page in two places,
> > one for use by vhost and one for use by everything else.  However, the
> > kernel does not provide the means to do this kind of aliasing for
> > anonymous mmaps.
> > 
> > Paolo
> 
> Basically pages go away on munmap, so won't simple
>   lock
>   munmap
>   mmap(MAP_NORESERVE)
>   unlock
> do the trick?
At what point are you suggesting to do this?





[PATCH] kvmtool: Makefile: allow overriding CC and LD

2015-06-18 Thread Andre Przywara
Currently we set CC unconditionally to ${CROSS_COMPILE}gcc, and the same
goes for LD.
Allow people to override the compiler name by specifying it explicitly
on the command line or via the environment.
Besides selecting a particular compiler binary, this allows passing
options to the compiler, which lets us get rid of the PowerPC
overrides in the Makefile. Possible uses:
$ make CC="gcc -m64" LD="ld -melf64ppc"
(build kvmtool on a PowerPC toolchain defaulting to 32-bit)
$ make CC="gcc -m32" LD="ld -melf_i386"
(build a 32-bit binary on a multilib-enabled x86-64 compiler)

Signed-off-by: Andre Przywara 
---
 Makefile | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 6110b8e..888bee5 100644
--- a/Makefile
+++ b/Makefile
@@ -14,9 +14,13 @@ export E Q
 include config/utilities.mak
 include config/feature-tests.mak
 
-CC := $(CROSS_COMPILE)gcc
+ifeq ($(origin CC), default)
+   CC  := $(CROSS_COMPILE)gcc
+endif
 CFLAGS :=
-LD := $(CROSS_COMPILE)ld
+ifeq ($(origin LD), default)
+   LD  := $(CROSS_COMPILE)ld
+endif
 LDFLAGS:=
 
 FIND   := find
@@ -148,8 +152,6 @@ ifeq ($(ARCH), powerpc)
OBJS+= powerpc/spapr_pci.o
OBJS+= powerpc/xics.o
ARCH_INCLUDE := powerpc/include
-   CFLAGS  += -m64
-   LDFLAGS += -m elf64ppc
 
ARCH_WANT_LIBFDT := y
 endif
-- 
2.3.5



RE: [PATCH 13/13] KVM: arm64: enable ITS emulation as a virtual MSI controller

2015-06-18 Thread Pavel Fedin
 Hello!

> But that fails compilation on ARM (which uses this file as well),
> because we have a dummy fail function in the header if
> CONFIG_HAVE_KVM_MSI is not defined.

 Maybe then remove that fail function too? Too many #ifdefs are not good...

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




Re: [PATCH 2/3] powerpc: use default endianness for converting guest/init

2015-06-18 Thread Andre Przywara
Hi,

On 06/17/2015 10:43 AM, Andre Przywara wrote:
> For converting the guest/init binary into an object file, we call
> the linker binary, setting the endianness to big endian explicitly
> when compiling kvmtool for powerpc.
> This breaks if the compiler is actually targetting little endian
> (which is true for the Debian port, for instance).
> Remove the explicit big endianness switch from the linker call to
> allow linking on little endian PowerPC builds again.
> 
> Signed-off-by: Andre Przywara 
> ---
> Hi,
> 
> this fixed the powerpc64le build for me, while still compiling fine
> for big endian. Admittedly this whole init->guest_init.o conversion
> has its issues (with MIPS, for instance), which deserve proper fixing,
> but lets just fix that build for now.
> 

Will was concerned about breaking toolchains where the linker does not
default to 64-bit. Is that an issue we care about?
AFAICT LDFLAGS is only used in this dodgy binary-to-object-file
conversion of guest/init. For this we rely on the resulting .o file to
have the same ELF target as the other object files to be finally linked
into the lkvm binary. As we don't compile guest/init with CFLAGS, there
is a possible mismatch.

I am looking into a proper fix for this now (compiling guest/init with
CFLAGS, calling $CC with linker options instead of $LD and allowing CC
and LD override). Still struggling with MIPS, though :-(

If someone is eager to fix compilation on PowerPC meanwhile, feel free
to use this fix for the time being.

Cheers,
Andre.

> 
>  Makefile | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index 6110b8e..c118e1a 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -149,7 +149,6 @@ ifeq ($(ARCH), powerpc)
>   OBJS+= powerpc/xics.o
>   ARCH_INCLUDE := powerpc/include
>   CFLAGS  += -m64
> - LDFLAGS += -m elf64ppc
>  
>   ARCH_WANT_LIBFDT := y
>  endif
> 


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Michael S. Tsirkin
On Thu, Jun 18, 2015 at 03:46:14PM +0200, Paolo Bonzini wrote:
> 
> 
> On 18/06/2015 15:19, Michael S. Tsirkin wrote:
> > On Thu, Jun 18, 2015 at 01:50:32PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 18/06/2015 13:41, Michael S. Tsirkin wrote:
> >>> On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
> >>>> Lets leave decision upto users instead of making them live with
> >>>> crashing guests.
> >>>
> >>> Come on, let's fix it in userspace.
> >>
> >> It's not trivial to fix it in userspace.  Since QEMU uses RCU there
> >> isn't a single memory map to use for a linear gpa->hva map.
> > 
> > Could you elaborate?
> > 
> > I'm confused by this mention of RCU.
> > You use RCU for accesses to the memory map, correct?
> > So memory map itself is a write side operation, as such all you need to
> > do is take some kind of lock to prevent conflicting with other memory
> > maps, do rcu sync under this lock.
> 
> You're right, the problem isn't directly related to RCU.  RCU would be
> easy to handle by using synchronize_rcu instead of call_rcu.  While I
> identified an RCU-related problem with Igor's patches, it's much more
> entrenched.
> 
> RAM can be used by asynchronous operations while the VM runs, between
> address_space_map and address_space_unmap.  It is possible and common to
> have a quiescent state between the map and unmap, and a memory map
> change can happen in the middle of this.  Normally this is not a
> problem, because changes to the memory map do not make the hva go away
> (memory regions are reference counted).

Right, so you want mmap(MAP_NORESERVE) when that reference
count becomes 0.

> However, with Igor's patches a memory_region_del_subregion will cause a
> mmap(MAP_NORESERVE), which _does_ have the effect of making the hva go away.
> 
> I guess one way to do it would be to alias the same page in two places,
> one for use by vhost and one for use by everything else.  However, the
> kernel does not provide the means to do this kind of aliasing for
> anonymous mmaps.
> 
> Paolo

Basically pages go away on munmap, so won't a simple

	lock
	munmap
	mmap(MAP_NORESERVE)
	unlock

do the trick?
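
For illustration, the four steps above could look roughly like this in userspace C. This is only a sketch under assumptions: `region_lock` and `retire_region()` are hypothetical names, and PROT_NONE plus MAP_FIXED are choices made here so the address range stays reserved after the pages are dropped; nothing in this thread prescribes them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical lock serializing all memory-map updates. */
static pthread_mutex_t region_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Drop the pages backing [addr, addr + len) but keep the address range
 * reserved, so a stale hva faults instead of hitting unrelated memory.
 * Returns 0 on success, -1 on failure.
 */
int retire_region(void *addr, size_t len)
{
	void *p;
	int ret = -1;

	assert(addr && len > 0);

	pthread_mutex_lock(&region_lock);
	if (munmap(addr, len) == 0) {
		/* Re-reserve the same range: no access, no swap reservation. */
		p = mmap(addr, len, PROT_NONE,
			 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
			 -1, 0);
		if (p == addr)
			ret = 0;
	}
	pthread_mutex_unlock(&region_lock);
	return ret;
}
```

The lock only helps against other paths that take the same lock. A thread calling mmap() elsewhere could still grab the range between the two calls, so a single mmap() with MAP_FIXED (which replaces the old mapping in one step) may be the safer variant.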

-- 
MST


Re: [PATCH 13/13] KVM: arm64: enable ITS emulation as a virtual MSI controller

2015-06-18 Thread Andre Przywara
Hi Eric,

On 06/18/2015 09:43 AM, Eric Auger wrote:
> On 05/29/2015 11:53 AM, Andre Przywara wrote:
>> If userspace has provided a base address for the ITS register frame,
>> we enable the bits that advertise LPIs in the GICv3.
>> When the guest has enabled LPIs and the ITS, we enable the emulation
>> part by initializing the ITS data structures and trapping on ITS
>> register frame accesses by the guest.
>> Also we enable the KVM_SIGNAL_MSI feature to allow userland to inject
>> MSIs into the guest. Not having enabled the ITS emulation will lead
>> to a -ENODEV when trying to inject a MSI.
>>



>> Signed-off-by: Andre Przywara 
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 9f7b05f..09b1f46 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -2254,3 +2254,13 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>>  {
>>  return 0;
>>  }
>> +
>> +#ifdef CONFIG_HAVE_KVM_MSI
> I don't think the #ifdef is required, since the entry is already
> prevented in kvm_main.c, in case KVM_SIGNAL_MSI.

But that fails compilation on ARM (which uses this file as well),
because we have a dummy fail function in the header if
CONFIG_HAVE_KVM_MSI is not defined.
So you get: error: redefinition of 'kvm_send_userspace_msi'

Cheers,
Andre.

>> +int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
>> +{
>> +if (kvm->arch.vgic.vm_ops.inject_msi)
>> +return kvm->arch.vgic.vm_ops.inject_msi(kvm, msi);
>> +else
>> +return -ENODEV;
>> +}
>> +#endif
>>
> 


RE: IRQFD support with GICv3 ITS (WAS: RE: [PATCH 00/13] arm64: KVM: GICv3 ITS emulation)

2015-06-18 Thread Pavel Fedin
 Hello!

> I also have an implementation of GSI routing on ARM, basically a rebase
> of my old/first implementation of irqfd
> (https://patches.linaro.org/32261/) based on irqchip gsi routing & qemu
> part (https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg01090.html).

 I took a glance at it, and it looks like it's already obsolete. We already have
a convention of GSI number == SPI number: a kind of hardcoded default routing
table which cannot be changed. It is used at least by GICv2m emulation.
 I think we should maintain backwards compatibility with it. I thought about
something like:
 a) GSI < 8192 - correspond to SPIs and cannot be re-routed.
 b) GSI >= 8192 - correspond to MSIs and need to be routed before use.
During routing setup we could use either a GSI with an offset (starting from
8192) or a raw number (starting from 0). In the case of a raw number we would
have a more complex structure of the GSI field in the KVM_IRQFD ioctl, similar
to KVM_IRQ_LINE. Something like:
 bits:  | 31 ... 24 | 23 ... 0 |
 field: | irq_type  | irq_id   |
 irq_type[0]: irq_id = SPI
 irq_type[3]: irq_id = GSI number routed to MSI

 Consequently, we have to implement only the KVM_IRQ_ROUTING_MSI type and can
completely ignore KVM_IRQ_ROUTING_IRQCHIP.
 I hope I am clear enough...
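
 To make the proposed field layout concrete, here is a minimal sketch of packing and unpacking such a GSI value. All macro and function names are hypothetical and only illustrate the idea; this is not an existing KVM ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the proposed layout; not an existing KVM ABI. */
#define GSI_TYPE_SHIFT	24
#define GSI_ID_MASK	0x00ffffffu
#define GSI_TYPE_SPI	0u	/* irq_id is an SPI number */
#define GSI_TYPE_MSI	3u	/* irq_id is a GSI number routed to MSI */

/* Pack irq_type into bits 31..24 and irq_id into bits 23..0. */
uint32_t gsi_encode(uint32_t irq_type, uint32_t irq_id)
{
	assert(irq_type <= 0xff && irq_id <= GSI_ID_MASK);
	return (irq_type << GSI_TYPE_SHIFT) | irq_id;
}

/* Extract the irq_type field (bits 31..24). */
uint32_t gsi_type(uint32_t gsi)
{
	return gsi >> GSI_TYPE_SHIFT;
}

/* Extract the irq_id field (bits 23..0). */
uint32_t gsi_id(uint32_t gsi)
{
	return gsi & GSI_ID_MASK;
}
```

 With irq_type 0 the encoded value is just the SPI number, so existing users of the GSI == SPI convention would see no change.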

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Paolo Bonzini


On 18/06/2015 15:19, Michael S. Tsirkin wrote:
> On Thu, Jun 18, 2015 at 01:50:32PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 18/06/2015 13:41, Michael S. Tsirkin wrote:
>>> On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
>>>> Lets leave decision upto users instead of making them live with
>>>> crashing guests.
>>>
>>> Come on, let's fix it in userspace.
>>
>> It's not trivial to fix it in userspace.  Since QEMU uses RCU there
>> isn't a single memory map to use for a linear gpa->hva map.
> 
> Could you elaborate?
> 
> I'm confused by this mention of RCU.
> You use RCU for accesses to the memory map, correct?
> So memory map itself is a write side operation, as such all you need to
> do is take some kind of lock to prevent conflicting with other memory
> maps, do rcu sync under this lock.

You're right, the problem isn't directly related to RCU.  RCU would be
easy to handle by using synchronize_rcu instead of call_rcu.  While I
identified an RCU-related problem with Igor's patches, it's much more
entrenched.

RAM can be used by asynchronous operations while the VM runs, between
address_space_map and address_space_unmap.  It is possible and common to
have a quiescent state between the map and unmap, and a memory map
change can happen in the middle of this.  Normally this is not a
problem, because changes to the memory map do not make the hva go away
(memory regions are reference counted).

However, with Igor's patches a memory_region_del_subregion will cause a
mmap(MAP_NORESERVE), which _does_ have the effect of making the hva go away.

I guess one way to do it would be to alias the same page in two places,
one for use by vhost and one for use by everything else.  However, the
kernel does not provide the means to do this kind of aliasing for
anonymous mmaps.

Paolo


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Michael S. Tsirkin
On Thu, Jun 18, 2015 at 01:50:32PM +0200, Paolo Bonzini wrote:
> 
> 
> On 18/06/2015 13:41, Michael S. Tsirkin wrote:
> > On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
> >> Lets leave decision upto users instead of making them live with
> >> crashing guests.
> > 
> > Come on, let's fix it in userspace.
> 
> It's not trivial to fix it in userspace.  Since QEMU uses RCU there
> isn't a single memory map to use for a linear gpa->hva map.

Could you elaborate?

I'm confused by this mention of RCU.
You use RCU for accesses to the memory map, correct?
So memory map itself is a write side operation, as such all you need to
do is take some kind of lock to prevent conflicting with other memory
maps, do rcu sync under this lock.


> I find it absurd that we're fighting over 12K of memory.
> 
> Paolo

I wouldn't worry so much if it didn't affect kernel/userspace API.
Need to be careful there.

-- 
MST


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Igor Mammedov
On Thu, 18 Jun 2015 13:41:22 +0200
"Michael S. Tsirkin"  wrote:

> On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
> > Lets leave decision upto users instead of making them live with
> > crashing guests.
> 
> Come on, let's fix it in userspace.
I'm not abandoning the userspace approach either, but it might take time
to implement in a robust manner, as it's much more complex and has many
more places to backfire than a straightforward kernel fix, which will
work for both old userspace and a new one.



Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Paolo Bonzini


On 18/06/2015 13:41, Michael S. Tsirkin wrote:
> On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
>> Lets leave decision upto users instead of making them live with
>> crashing guests.
> 
> Come on, let's fix it in userspace.

It's not trivial to fix it in userspace.  Since QEMU uses RCU there
isn't a single memory map to use for a linear gpa->hva map.

I find it absurd that we're fighting over 12K of memory.
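
For reference, the numbers in this thread can be reproduced with a bit of arithmetic. The sketch below uses local stand-ins for the vhost UAPI structs (four 64-bit fields per region, an 8-byte table header) and assumes the allocator rounds each request up to a power of two; the struct and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the vhost UAPI structs, for size arithmetic only. */
struct region {			/* cf. struct vhost_memory_region */
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
	uint64_t flags_padding;
};

struct table_hdr {		/* cf. the fixed part of struct vhost_memory */
	uint32_t nregions;
	uint32_t padding;
};

/* Bytes needed for a table with n regions. */
size_t table_bytes(size_t n)
{
	return sizeof(struct table_hdr) + n * sizeof(struct region);
}

/* Round up to the next power of two (assumed allocator behaviour). */
size_t round_pow2(size_t v)
{
	size_t p = 1;

	assert(v > 0);
	while (p < v)
		p <<= 1;
	return p;
}
```

Under those assumptions a 64-region table needs 2056 bytes and lands in a 4K allocation, a 509-region table needs 16296 bytes and lands in 16K, and the difference is the 12K in question.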

Paolo


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Michael S. Tsirkin
On Thu, Jun 18, 2015 at 01:39:12PM +0200, Igor Mammedov wrote:
> Lets leave decision upto users instead of making them live with
> crashing guests.

Come on, let's fix it in userspace.

-- 
MST


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Igor Mammedov
On Thu, 18 Jun 2015 11:50:22 +0200
"Michael S. Tsirkin"  wrote:

> On Thu, Jun 18, 2015 at 11:12:24AM +0200, Igor Mammedov wrote:
> > On Wed, 17 Jun 2015 18:30:02 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Wed, Jun 17, 2015 at 06:09:21PM +0200, Igor Mammedov wrote:
> > > > On Wed, 17 Jun 2015 17:38:40 +0200
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > On Wed, Jun 17, 2015 at 05:12:57PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 17 Jun 2015 16:32:02 +0200
> > > > > > "Michael S. Tsirkin"  wrote:
> > > > > > 
> > > > > > > On Wed, Jun 17, 2015 at 03:20:44PM +0200, Paolo Bonzini wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 17/06/2015 15:13, Michael S. Tsirkin wrote:
> > > > > > > > > > > Considering userspace can be malicious, I guess yes.
> > > > > > > > > > I don't think it's a valid concern in this case,
> > > > > > > > > > setting limit back from 509 to 64 will not help here in any
> > > > > > > > > > way, userspace still can create as many vhost instances as
> > > > > > > > > > it needs to consume memory it desires.
> > > > > > > > > 
> > > > > > > > > Not really since vhost char device isn't world-accessible.
> > > > > > > > > It's typically opened by a priveledged tool, the fd is
> > > > > > > > > then passed to an unpriveledged userspace, or permissions
> > > > > > > > > dropped.
> > > > > > > > 
> > > > > > > > Then what's the concern anyway?
> > > > > > > > 
> > > > > > > > Paolo
> > > > > > > 
> > > > > > > Each fd now ties up 16K of kernel memory.  It didn't use to, so
> > > > > > > priveledged tool could safely give the unpriveledged userspace
> > > > > > > a ton of these fds.
> > > > > > if privileged tool gives out unlimited amount of fds then it
> > > > > > doesn't matter whether fd ties 4K or 16K, host still could be DoSed.
> > > > > > 
> > > > > 
> > > > > Of course it does not give out unlimited fds, there's a way
> > > > > for the sysadmin to specify the number of fds. Look at how libvirt
> > > > > uses vhost, it should become clear I think.
> > > > then it just means that tool has to take into account a new limits
> > > > to partition host in sensible manner.
> > > 
> > > Meanwhile old tools are vulnerable to OOM attacks.
> > I've chatted with libvirt folks, it doesn't care about how much memory
> > vhost would consume nor do any host capacity planning in that regard.
> 
> Exactly, it's up to host admin.
> 
> > But lets assume that there are tools that do this so
> > how about instead of hardcoding limit make it a module parameter
> > with default set to 64. That would allow users to set higher limit
> > if they need it and nor regress old tools. it will also give tools
> > interface for reading limit from vhost module.
> 
> And now you need to choose between security and functionality :(
There is no conflict here and it's not about choosing.
If the admin has a method to estimate guest memory footprint
for capacity partitioning, then he/she would need to redo the
partitioning, taking the new footprint into account, when
raising the limit manually.

(BTW libvirt has tried and reverted patches that were trying to
predict required memory; an admin might be able to do it manually
better, but how to do it is another topic and is not related
to this thread.)

Lets leave decision upto users instead of making them live with
crashing guests.

> 
> > > 
> > > > Exposing limit as module parameter might be of help to tool for
> > > > getting/setting it in a way it needs.
> > > 



[PATCH v2] powerpc: add hvcall.h header from Linux

2015-06-18 Thread Andre Przywara
The powerpc code uses some PAPR hypercalls, of which we need the
hypercall number. Copy just the needed macro definitions from the
kernel's (private) hvcall.h file and remove the extra tricks formerly
used to be able to include this header file directly.

Signed-off-by: Andre Przywara 
---
Hi,

this version of the header file just contains the definitions we
need, while still being easily diff-able against the original file.
Please consider applying this one.
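
As an illustration of how the call numbers and MAX_HCALL_OPCODE fit together, here is a minimal sketch of a dispatch table indexed by opcode. The macro values are the ones from this patch; the function names, the handler type and the stub handler are hypothetical and not taken from the kvmtool sources.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the trimmed-down hvcall.h in this patch. */
#define H_SUCCESS		0
#define H_FUNCTION		-2
#define H_SET_DABR		0x28
#define H_SET_MODE		0x31C
#define MAX_HCALL_OPCODE	H_SET_MODE

typedef long (*hcall_fn)(unsigned long opcode, unsigned long *args);

/* The opcodes listed are multiples of 4, so index the table by opcode / 4. */
static hcall_fn hcall_table[MAX_HCALL_OPCODE / 4 + 1];

/* Install a handler for one hypercall number. */
void hcall_register(unsigned long opcode, hcall_fn fn)
{
	assert(opcode <= MAX_HCALL_OPCODE && (opcode & 3) == 0);
	hcall_table[opcode / 4] = fn;
}

/* Call the handler, or report "function not supported" if there is none. */
long hcall_dispatch(unsigned long opcode, unsigned long *args)
{
	if (opcode > MAX_HCALL_OPCODE || (opcode & 3) ||
	    !hcall_table[opcode / 4])
		return H_FUNCTION;
	return hcall_table[opcode / 4](opcode, args);
}

/* Example handler, hypothetical: accept the call and do nothing. */
long h_set_dabr_stub(unsigned long opcode, unsigned long *args)
{
	(void)opcode;
	(void)args;
	return H_SUCCESS;
}
```

A caller would register its handlers once at startup and route every guest hypercall through hcall_dispatch(); unknown or unaligned numbers then fall back to H_FUNCTION.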

Cheers,
Andre.

 powerpc/include/asm/hvcall.h | 33 +
 powerpc/spapr.h  |  3 ---
 2 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 powerpc/include/asm/hvcall.h

diff --git a/powerpc/include/asm/hvcall.h b/powerpc/include/asm/hvcall.h
new file mode 100644
index 000..9d58f9b
--- /dev/null
+++ b/powerpc/include/asm/hvcall.h
@@ -0,0 +1,33 @@
+#ifndef _ASM_POWERPC_HVCALL_H
+#define _ASM_POWERPC_HVCALL_H
+
+/* This file is a trimmed-down version of arch/powerpc/include/asm/hvcall.h. */
+
+#define H_SUCCESS              0
+
+#define H_HARDWARE             -1  /* Hardware error */
+#define H_FUNCTION             -2  /* Function not supported */
+#define H_PRIVILEGE            -3  /* Caller not privileged */
+#define H_PARAMETER            -4  /* Parameter invalid, out-of-range or conflicting */
+
+#define H_SET_DABR             0x28
+#define H_LOGICAL_CI_LOAD      0x3c
+#define H_LOGICAL_CI_STORE     0x40
+#define H_LOGICAL_CACHE_LOAD   0x44
+#define H_LOGICAL_CACHE_STORE  0x48
+#define H_LOGICAL_ICBI         0x4c
+#define H_LOGICAL_DCBF         0x50
+
+#define H_GET_TERM_CHAR        0x54
+#define H_PUT_TERM_CHAR        0x58
+
+#define H_EOI                  0x64
+#define H_CPPR                 0x68
+#define H_IPI                  0x6c
+#define H_IPOLL                0x70
+#define H_XIRR                 0x74
+
+#define H_SET_MODE             0x31C
+#define MAX_HCALL_OPCODE       H_SET_MODE
+
+#endif /* _ASM_POWERPC_HVCALL_H */
diff --git a/powerpc/spapr.h b/powerpc/spapr.h
index 0537f88..4c6e349 100644
--- a/powerpc/spapr.h
+++ b/powerpc/spapr.h
@@ -16,10 +16,7 @@
 
 #include 
 
-/* We need some of the H_ hcall defs, but they're __KERNEL__ only. */
-#define __KERNEL__
 #include 
-#undef __KERNEL__
 
 #include "kvm/kvm.h"
 #include "kvm/kvm-cpu.h"
-- 
2.3.5



Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Paolo Bonzini


On 18/06/2015 11:50, Michael S. Tsirkin wrote:
> > But lets assume that there are tools that do this so
> > how about instead of hardcoding limit make it a module parameter
> > with default set to 64. That would allow users to set higher limit
> > if they need it and nor regress old tools. it will also give tools
> > interface for reading limit from vhost module.
> 
> And now you need to choose between security and functionality :(

Don't call "security" a 16K allocation that can fall back to vmalloc
please.  That's an insult to actual security problems...

Paolo


Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Michael S. Tsirkin
On Thu, Jun 18, 2015 at 11:12:24AM +0200, Igor Mammedov wrote:
> On Wed, 17 Jun 2015 18:30:02 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Wed, Jun 17, 2015 at 06:09:21PM +0200, Igor Mammedov wrote:
> > > On Wed, 17 Jun 2015 17:38:40 +0200
> > > "Michael S. Tsirkin"  wrote:
> > > 
> > > > On Wed, Jun 17, 2015 at 05:12:57PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 17 Jun 2015 16:32:02 +0200
> > > > > "Michael S. Tsirkin"  wrote:
> > > > > 
> > > > > > On Wed, Jun 17, 2015 at 03:20:44PM +0200, Paolo Bonzini wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 17/06/2015 15:13, Michael S. Tsirkin wrote:
> > > > > > > > > > Considering userspace can be malicious, I guess yes.
> > > > > > > > > I don't think it's a valid concern in this case,
> > > > > > > > > setting limit back from 509 to 64 will not help here in any
> > > > > > > > > way, userspace still can create as many vhost instances as
> > > > > > > > > it needs to consume memory it desires.
> > > > > > > > 
> > > > > > > > Not really since vhost char device isn't world-accessible.
> > > > > > > > It's typically opened by a priveledged tool, the fd is
> > > > > > > > then passed to an unpriveledged userspace, or permissions
> > > > > > > > dropped.
> > > > > > > 
> > > > > > > Then what's the concern anyway?
> > > > > > > 
> > > > > > > Paolo
> > > > > > 
> > > > > > Each fd now ties up 16K of kernel memory.  It didn't use to, so
> > > > > > priveledged tool could safely give the unpriveledged userspace
> > > > > > a ton of these fds.
> > > > > if privileged tool gives out unlimited amount of fds then it
> > > > > doesn't matter whether fd ties 4K or 16K, host still could be DoSed.
> > > > > 
> > > > 
> > > > Of course it does not give out unlimited fds, there's a way
> > > > for the sysadmin to specify the number of fds. Look at how libvirt
> > > > uses vhost, it should become clear I think.
> > > then it just means that tool has to take into account a new limits
> > > to partition host in sensible manner.
> > 
> > Meanwhile old tools are vulnerable to OOM attacks.
> I've chatted with libvirt folks, it doesn't care about how much memory
> vhost would consume nor do any host capacity planning in that regard.

Exactly, it's up to host admin.

> But lets assume that there are tools that do this so
> how about instead of hardcoding limit make it a module parameter
> with default set to 64. That would allow users to set higher limit
> if they need it and nor regress old tools. it will also give tools
> interface for reading limit from vhost module.

And now you need to choose between security and functionality :(

> > 
> > > Exposing limit as module parameter might be of help to tool for
> > > getting/setting it in a way it needs.
> > 


Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-18 Thread Will Deacon
On Thu, Jun 18, 2015 at 10:11:58AM +0100, Michael Ellerman wrote:
> On Wed, 2015-06-17 at 11:15 +0100, Will Deacon wrote:
> > On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
> > > Instead of referring to the Linux header including the barrier
> > > macros, copy over the rather simple implementation for the PowerPC
> > > barrier instructions kvmtool uses. This fixes build for powerpc.
> > > 
> > > Signed-off-by: Andre Przywara 
> > > ---
> > > Hi,
> > > 
> > > I just took what kvmtool seems to have used before, I actually have
> > > no idea if "sync" is the right instruction or "lwsync" would do.
> > > Would be nice if some people with PowerPC knowledge could comment.
> > 
> > I *think* we can use lwsync for rmb and wmb, but would want confirmation
> > from a ppc guy before making that change!
> 
> Ugh, memory barriers :)

I prefer to call them "Job Security" :)

> You probably can use lwsync, assuming you're only ordering cacheable vs
> cacheable.
> 
> But, lwsync has given us pain in the past[1], so I'd be happier if you just
> used sync.

No probs. I pushed Andre's original patch.
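
For readers following along, a minimal sketch of what "just use sync" means for the three barrier macros, plus a typical publish/consume pairing. The exact macro bodies in the pushed patch are not shown in this thread, so treat these definitions as an assumption; the mailbox type and helpers are made up for the example, and the non-PowerPC branch exists only so the sketch builds on other hosts.

```c
#include <assert.h>

/*
 * Full "sync" for all three barriers on PowerPC, as preferred above.
 * The non-PowerPC fallback is only there so the sketch builds elsewhere.
 */
#if defined(__powerpc__) || defined(__powerpc64__)
#define mb()	__asm__ __volatile__("sync" : : : "memory")
#else
#define mb()	__sync_synchronize()
#endif
#define rmb()	mb()
#define wmb()	mb()

struct mailbox {
	int data;
	volatile int ready;
};

/* Producer: make the payload visible before the flag. */
void mailbox_post(struct mailbox *m, int value)
{
	assert(m);
	m->data = value;
	wmb();
	m->ready = 1;
}

/* Consumer: read the flag first, then the payload. Returns 1 if data was read. */
int mailbox_poll(struct mailbox *m, int *value)
{
	if (!m->ready)
		return 0;
	rmb();
	*value = m->data;
	return 1;
}
```

Switching rmb()/wmb() to lwsync later would only touch the two macro lines, which keeps the conservative choice cheap to revisit.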

Will


RE: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

2015-06-18 Thread Wu, Feng


> -Original Message-
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Tuesday, June 16, 2015 12:45 AM
> To: Eric Auger
> Cc: Avi Kivity; Wu, Feng; kvm@vger.kernel.org; linux-ker...@vger.kernel.org;
> pbonz...@redhat.com; mtosa...@redhat.com
> Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> 
> On Mon, 2015-06-15 at 18:17 +0200, Eric Auger wrote:
> > Hi Alex, all,
> > On 06/12/2015 09:03 PM, Alex Williamson wrote:
> > > On Fri, 2015-06-12 at 21:48 +0300, Avi Kivity wrote:
> > >> On 06/12/2015 06:41 PM, Alex Williamson wrote:
> > >>> On Fri, 2015-06-12 at 00:23 +, Wu, Feng wrote:
> > > -Original Message-
> > > From: Avi Kivity [mailto:avi.kiv...@gmail.com]
> > > Sent: Friday, June 12, 2015 3:59 AM
> > > To: Wu, Feng; kvm@vger.kernel.org; linux-ker...@vger.kernel.org
> > > Cc: pbonz...@redhat.com; mtosa...@redhat.com;
> > > alex.william...@redhat.com; eric.au...@linaro.org
> > > Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> > >
> > > On 06/11/2015 01:51 PM, Feng Wu wrote:
> > >> From: Eric Auger 
> > >>
> > > >> This patch adds and documents a new KVM_DEV_VFIO_DEVICE group
> > > >> and 2 device attributes: KVM_DEV_VFIO_DEVICE_FORWARD_IRQ,
> > > >> KVM_DEV_VFIO_DEVICE_UNFORWARD_IRQ. The purpose is to be able
> > > >> to set a VFIO device IRQ as forwarded or not forwarded.
> > > >> the command takes as argument a handle to a new struct named
> > > >> kvm_vfio_dev_irq.
> > > > Is there no way to do this automatically?  After all, vfio knows that a
> > > > device interrupt is forwarded to some eventfd, and kvm knows that some
> > > > eventfd is forwarded to a guest interrupt.  If they compare notes
> > > > through a central registry, they can figure out that the interrupt needs
> > > > to be forwarded.
> >  Oh, just like Eric mentioned in his reply, this description is out of context of
> >  this series, I will remove them in the next version.
> > >>>
> > >>> I suspect Avi's question was more general.  While forward/unforward is
> > >>> out of context for this series, it's very similar in nature to
> > >>> enabling/disabling posted interrupts.  So I think the question remains
> > >>> whether we really need userspace to participate in creating this
> > >>> shortcut or if kvm and vfio can some how orchestrate figuring it out
> > >>> automatically.
> > >>>
> > >>> Personally I don't know how we could do it automatically.  We've always
> > >>> relied on userspace to independently setup vfio and kvm such that
> > >>> neither have any idea that the other is there and update each side
> > > >>> independently when anything changes.  So it seems consistent to continue
> > >>> that here.  It doesn't seem like there's much to gain performance-wise
> > >>> either, updates should be a relatively rare event I'd expect.
> > >>>
> > >>> There's really no metadata associated with an eventfd, so "comparing
> > >>> notes" automatically might imply some central registration entity.  That
> > > >>> immediately sounds like a much more complex solution, but maybe Avi has
> > >>> some ideas to manage it.  Thanks,
> > >>>
> > >>
> > >> The idea is to have a central registry maintained by a posted interrupts
> > >> manager.  Both vfio and kvm pass the filp (along with extra information)
> > >> to the posted interrupts manager, which, when it detects a filp match,
> > >> tells each of them what to do.
> > >>
> > >> The advantages are:
> > >> - old userspace gains the optimization without change
> > >> - a userspace API is more expensive to maintain than internal kernel
> > >> interfaces (CVEs, documentation, maintaining backwards compatibility)
> > >> - if you can do it without a new interface, this indicates that all the
> > >> information in the new interface is redundant.  That means you have to
> > >> check it for consistency with the existing information, so it's extra
> > >> work (likely, it's exactly what the posted interrupt manager would be
> > >> doing anyway).
> > >
> > > Yep, those all sound like good things and I believe that's similar in
> > > design to the way we had originally discussed this interaction at
> > > LPC/KVM Forum several years ago.  I'd be in favor of that approach.
> >
> > I guess this discussion also is relevant wrt "[RFC v6 00/16] KVM-VFIO
> > IRQ forward control" series? Or is that "central registry maintained by
> > a posted interrupts manager" something more specific to x86?
> 
> I'd think we'd want it for any sort of offload and supporting both
> posted-interrupts and irq-forwarding would be a good validation.  I
> imagine there would be registration/de-registration callbacks separate
> for interrupt producers vs interrupt consumers.  Each registration
> function would likely provide a struct of callbacks, probably similar to
> the get_symbol callbacks proposed for the kvm-vfio device on the IRQ
> producer sid

Re: [PATCH 3/3] powerpc: add hvcall.h header from Linux

2015-06-18 Thread Michael Ellerman
On Wed, 2015-06-17 at 11:13 +0100, Will Deacon wrote:
> On Wed, Jun 17, 2015 at 10:43:50AM +0100, Andre Przywara wrote:
> > The powerpc code uses some PAPR hypercalls, of which we need the
> > hypercall number. Copy the macro definition parts from the kernel's
> > (private) hvcall.h file and remove the extra tricks formerly used
> > to be able to include this header file directly.
> > 
> > Signed-off-by: Andre Przywara 
> > ---
> > Hi,
> > 
> > I copied most of the Linux header, without removing
> > definitions that kvmtool doesn't use. That should make updates
> > easier. If people would prefer a bespoke header, let me know.
> 
> I'd rather just #define the stuff we need now that we're outside of the
> kernel source tree.

Yeah that's probably cleaner.

I think you only need:

  H_CPPR
  H_EOI
  H_FUNCTION
  H_GET_TERM_CHAR
  H_HARDWARE
  H_IPI
  H_LOGICAL_CACHE_LOAD
  H_LOGICAL_CACHE_STORE
  H_LOGICAL_CI_LOAD
  H_LOGICAL_CI_STORE
  H_LOGICAL_DCBF
  H_LOGICAL_ICBI
  H_PARAMETER
  H_PUT_TERM_CHAR
  H_SET_DABR
  H_SUCCESS
  H_XIRR
  KVMPPC_H_RTAS

cheers




Re: [PATCH 3/5] vhost: support upto 509 memory regions

2015-06-18 Thread Igor Mammedov
On Wed, 17 Jun 2015 18:30:02 +0200
"Michael S. Tsirkin"  wrote:

> On Wed, Jun 17, 2015 at 06:09:21PM +0200, Igor Mammedov wrote:
> > On Wed, 17 Jun 2015 17:38:40 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Wed, Jun 17, 2015 at 05:12:57PM +0200, Igor Mammedov wrote:
> > > > On Wed, 17 Jun 2015 16:32:02 +0200
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > On Wed, Jun 17, 2015 at 03:20:44PM +0200, Paolo Bonzini wrote:
> > > > > > 
> > > > > > 
> > > > > > On 17/06/2015 15:13, Michael S. Tsirkin wrote:
> > > > > > > > > Considering userspace can be malicious, I guess yes.
> > > > > > > > I don't think it's a valid concern in this case,
> > > > > > > > setting limit back from 509 to 64 will not help here in any
> > > > > > > > way, userspace still can create as many vhost instances as
> > > > > > > > it needs to consume memory it desires.
> > > > > > > 
> > > > > > > Not really since vhost char device isn't world-accessible.
> > > > > > > It's typically opened by a priveledged tool, the fd is
> > > > > > > then passed to an unpriveledged userspace, or permissions
> > > > > > > dropped.
> > > > > > 
> > > > > > Then what's the concern anyway?
> > > > > > 
> > > > > > Paolo
> > > > > 
> > > > > Each fd now ties up 16K of kernel memory.  It didn't use to, so
> > > > > a privileged tool could safely give the unprivileged userspace
> > > > > a ton of these fds.
> > > > if privileged tool gives out unlimited amount of fds then it
> > > > doesn't matter whether fd ties 4K or 16K, host still could be DoSed.
> > > > 
> > > 
> > > Of course it does not give out unlimited fds, there's a way
> > > for the sysadmin to specify the number of fds. Look at how libvirt
> > > uses vhost, it should become clear I think.
> > then it just means that tool has to take into account a new limits
> > to partition host in sensible manner.
> 
> Meanwhile old tools are vulnerable to OOM attacks.
I've chatted with the libvirt folks: libvirt doesn't care how much memory
vhost consumes, nor does it do any host capacity planning in that regard.

But let's assume that there are tools that do this. How about, instead of
hardcoding the limit, making it a module parameter with the default set
to 64? That would allow users to set a higher limit if they need it and
not regress old tools. It would also give tools an interface for reading
the limit from the vhost module.
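
To make the tool side of this concrete, here is a rough userspace sketch of how a management tool might read such a limit and estimate the per-fd region table size. The sysfs path and parameter name are assumptions (no such parameter exists in this series), the helper names are made up, and the two structs are local mirrors of the vhost memory table layout from <linux/vhost.h>.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Used when the parameter cannot be read (old kernel, module not
 * loaded): the historical limit of 64 regions. */
#define VHOST_DEFAULT_MAX_REGIONS 64u

/* Local mirror of one vhost memory region entry (4 x 64-bit fields). */
struct region_entry {
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;
	uint64_t flags_padding;
};

/* Local mirror of the fixed header that precedes the region array. */
struct region_table_hdr {
	uint32_t nregions;
	uint32_t padding;
};

/* Read an unsigned limit from a sysfs-style file; return 'fallback'
 * if the file is missing or does not contain a number. */
static unsigned int read_region_limit(const char *path, unsigned int fallback)
{
	unsigned int val;
	FILE *f = fopen(path, "r");

	if (!f)
		return fallback;
	if (fscanf(f, "%u", &val) != 1)
		val = fallback;
	fclose(f);
	return val;
}

/* Bytes needed for a full memory table holding 'limit' regions. */
static size_t region_table_bytes(unsigned int limit)
{
	return sizeof(struct region_table_hdr) +
	       (size_t)limit * sizeof(struct region_entry);
}
```

A tool could call read_region_limit("/sys/module/vhost/parameters/max_mem_regions", VHOST_DEFAULT_MAX_REGIONS), where the parameter name is hypothetical, and then budget region_table_bytes(limit) per fd. With this layout a 509-entry table comes to 16296 bytes, which is roughly consistent with the 16K figure quoted above, against 2056 bytes for 64 entries.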

> 
> > Exposing limit as module parameter might be of help to tool for
> > getting/setting it in a way it needs.
> 



Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-18 Thread Michael Ellerman
On Wed, 2015-06-17 at 11:15 +0100, Will Deacon wrote:
> On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
> > Instead of referring to the Linux header including the barrier
> > macros, copy over the rather simple implementation for the PowerPC
> > barrier instructions kvmtool uses. This fixes build for powerpc.
> > 
> > Signed-off-by: Andre Przywara 
> > ---
> > Hi,
> > 
> > I just took what kvmtool seems to have used before, I actually have
> > no idea if "sync" is the right instruction or "lwsync" would do.
> > Would be nice if some people with PowerPC knowledge could comment.
> 
> I *think* we can use lwsync for rmb and wmb, but would want confirmation
> from a ppc guy before making that change!

Ugh, memory barriers :)

You probably can use lwsync, assuming you're only ordering cacheable vs
cacheable.

But, lwsync has given us pain in the past[1], so I'd be happier if you just used
sync.
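
A minimal sketch of what that would look like, assuming GCC-style inline asm. The non-powerpc fallback and the mailbox example are only there so the snippet builds and can be tried anywhere; they are not taken from kvmtool.

```c
/* Barrier sketch following the advice above: use the heavyweight
 * "sync" for all three barriers on powerpc. */
#if defined(__powerpc__) || defined(__powerpc64__)
#define mb()	__asm__ __volatile__("sync" : : : "memory")
#define rmb()	__asm__ __volatile__("sync" : : : "memory")
#define wmb()	__asm__ __volatile__("sync" : : : "memory")
#else
/* Assumption: on other hosts fall back to the GCC/Clang full-barrier
 * builtin so the example still compiles. */
#define mb()	__sync_synchronize()
#define rmb()	__sync_synchronize()
#define wmb()	__sync_synchronize()
#endif

/* Hypothetical producer/consumer pair showing where the barriers go. */
struct mailbox {
	int data;
	volatile int ready;
};

static inline void mailbox_publish(struct mailbox *m, int value)
{
	m->data = value;
	wmb();		/* order the data store before the flag store */
	m->ready = 1;
}

static inline int mailbox_consume(struct mailbox *m, int *out)
{
	if (!m->ready)
		return 0;
	rmb();		/* order the flag load before the data load */
	*out = m->data;
	return 1;
}
```

Switching rmb()/wmb() to "lwsync" later would be a one-line change per macro, so starting with sync costs little.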

cheers

[1]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=51d7d5205d3389a32859f9939f1093f267409929




Re: [PATCH 13/13] KVM: arm64: enable ITS emulation as a virtual MSI controller

2015-06-18 Thread Eric Auger
On 05/29/2015 11:53 AM, Andre Przywara wrote:
> If userspace has provided a base address for the ITS register frame,
> we enable the bits that advertise LPIs in the GICv3.
> When the guest has enabled LPIs and the ITS, we enable the emulation
> part by initializing the ITS data structures and trapping on ITS
> register frame accesses by the guest.
> Also we enable the KVM_SIGNAL_MSI feature to allow userland to inject
> MSIs into the guest. Not having enabled the ITS emulation will lead
> to a -ENODEV when trying to inject a MSI.
> 
> Signed-off-by: Andre Przywara 
> ---
>  Documentation/virtual/kvm/api.txt |  2 +-
>  arch/arm64/kvm/Kconfig            |  1 +
>  include/kvm/arm_vgic.h            | 10 ++
>  virt/kvm/arm/its-emul.c           |  9 -
>  virt/kvm/arm/vgic-v3-emul.c       | 20 +++-
>  virt/kvm/arm/vgic.c               | 10 ++
>  6 files changed, 45 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 891d64a..d20fd94 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -2108,7 +2108,7 @@ after pausing the vcpu, but before it is resumed.
>  4.71 KVM_SIGNAL_MSI
>  
>  Capability: KVM_CAP_SIGNAL_MSI
> -Architectures: x86
> +Architectures: x86 arm64
>  Type: vm ioctl
>  Parameters: struct kvm_msi (in)
>  Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 5105e29..6c432c0 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -30,6 +30,7 @@ config KVM
>   select SRCU
>   select HAVE_KVM_EVENTFD
>   select HAVE_KVM_IRQFD
> + select HAVE_KVM_MSI
>   ---help---
> Support hosting virtualized guest machines.
>  
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 6bb138d..8f1be6a 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -162,6 +162,7 @@ struct vgic_io_device {
>  
>  struct vgic_its {
>   bool                    enabled;
> + struct vgic_io_device   iodev;
>   spinlock_t              lock;
>   u64                     cbaser;
>   int                     creadr;
> @@ -365,4 +366,13 @@ static inline int vgic_v3_probe(struct device_node *vgic_node,
>  }
>  #endif
>  
> +#ifdef CONFIG_HAVE_KVM_MSI
> +int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi);
> +#else
> +static inline int kvm_send_userspace_msi(struct kvm *kvm, struct kvm_msi *msi)
> +{
> + return -ENODEV;
> +}
> +#endif
> +
>  #endif
> diff --git a/virt/kvm/arm/its-emul.c b/virt/kvm/arm/its-emul.c
> index 35e886c..864de19 100644
> --- a/virt/kvm/arm/its-emul.c
> +++ b/virt/kvm/arm/its-emul.c
> @@ -964,6 +964,7 @@ int vits_init(struct kvm *kvm)
>  {
>   struct vgic_dist *dist = &kvm->arch.vgic;
>   struct vgic_its *its = &dist->its;
> + int ret;
>  
>   if (IS_VGIC_ADDR_UNDEF(dist->vgic_its_base))
>   return -ENXIO;
> @@ -977,9 +978,15 @@ int vits_init(struct kvm *kvm)
>   INIT_LIST_HEAD(&its->device_list);
>   INIT_LIST_HEAD(&its->collection_list);
>  
> + ret = vgic_register_kvm_io_dev(kvm, dist->vgic_its_base,
> +KVM_VGIC_V3_ITS_SIZE, vgicv3_its_ranges,
> +-1, &its->iodev);
> + if (ret)
> + return ret;
> +
>   its->enabled = false;
>  
> - return -ENXIO;
> + return 0;
>  }
>  
>  void vits_destroy(struct kvm *kvm)
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> index 4513551..71d0bcf 100644
> --- a/virt/kvm/arm/vgic-v3-emul.c
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -89,10 +89,11 @@ static bool handle_mmio_ctlr(struct kvm_vcpu *vcpu,
>  /*
>   * As this implementation does not provide compatibility
>   * with GICv2 (ARE==1), we report zero CPUs in bits [5..7].
> - * Also LPIs and MBIs are not supported, so we set the respective bits to 0.
> - * Also we report at most 2**10=1024 interrupt IDs (to match 1024 SPIs).
> + * Also we report at most 2**10=1024 interrupt IDs (to match 1024 SPIs)
> + * and provide 16 bits worth of LPI number space (to give 8192 LPIs).
>   */
> -#define INTERRUPT_ID_BITS 10
> +#define INTERRUPT_ID_BITS_SPIS 10
> +#define INTERRUPT_ID_BITS_ITS 16
>  static bool handle_mmio_typer(struct kvm_vcpu *vcpu,
> struct kvm_exit_mmio *mmio, phys_addr_t offset)
>  {
> @@ -100,7 +101,12 @@ static bool handle_mmio_typer(struct kvm_vcpu *vcpu,
>  
>   reg = (min(vcpu->kvm->arch.vgic.nr_irqs, 1024) >> 5) - 1;
>  
> - reg |= (INTERRUPT_ID_BITS - 1) << 19;
> + if (vgic_has_its(vcpu->kvm)) {
> + reg |= GICD_TYPER_LPIS;
> + reg |= (INTERRUPT_ID_BITS_ITS - 1) << 19;
> + } else {
> + reg |= (INTERRUPT_ID_BITS_SPIS - 1) << 19;
> + }
>  
>   vgic_reg_access(mmio, &reg, offset,
>   ACCESS_READ_VALUE | 
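
As a worked example of the GICD_TYPER value built in the hunk above, here is a standalone model of the computation. The helper name is made up and it takes plain arguments instead of a vcpu; it assumes nr_irqs is at least 32, as the quoted code does.

```c
#include <stdint.h>

#define GICD_TYPER_LPIS		(1U << 17)	/* LPIs supported */
#define INTERRUPT_ID_BITS_SPIS	10
#define INTERRUPT_ID_BITS_ITS	16

/* Standalone model of the value handle_mmio_typer() reports. */
static uint32_t gicd_typer_value(unsigned int nr_irqs, int has_its)
{
	uint32_t reg;

	if (nr_irqs > 1024)
		nr_irqs = 1024;

	/* ITLinesNumber, bits [4:0]: (number of IRQs / 32) - 1 */
	reg = (nr_irqs >> 5) - 1;

	if (has_its) {
		/* Advertise LPIs and a 16-bit interrupt ID space. */
		reg |= GICD_TYPER_LPIS;
		reg |= (uint32_t)(INTERRUPT_ID_BITS_ITS - 1) << 19;
	} else {
		/* IDbits, bits [23:19]: 10 bits cover the SPI range. */
		reg |= (uint32_t)(INTERRUPT_ID_BITS_SPIS - 1) << 19;
	}

	return reg;
}
```

For a guest with 256 interrupts this gives 0x480007 without an ITS and 0x7A0007 with one, so the only guest-visible differences are the LPIS bit and the wider IDbits field.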

Re: [PATCH 10/10] KVM: arm/arm64: vgic: Allow non-shared device HW interrupts

2015-06-18 Thread Marc Zyngier
On 17/06/15 16:50, Eric Auger wrote:
> On 06/17/2015 05:37 PM, Marc Zyngier wrote:
>> On 17/06/15 16:11, Eric Auger wrote:
>>> Hi Marc,
>>> On 06/08/2015 07:04 PM, Marc Zyngier wrote:
>>>> So far, the only use of the HW interrupt facility is the timer,
>>>> implying that the active state is context-switched for each vcpu,
>>>> as the device is is shared across all vcpus.
>>> s/is//
>>>>
>>>> This does not work for a device that has been assigned to a VM,
>>>> as the guest is entierely in control of that device (the HW is
>>> entirely?
>>>> not shared). In that case, it makes sense to bypass the whole
>>>> active state srtwitchint, and only track the deactivation of the
>>> switching
>>
>> Congratulations, I think you're now ready to try deciphering my
>> handwriting... ;-)
> good to see you're not a machine or maybe you do it on purpose some
> times ;-)
>>
>>>> interrupt.
>>>>
>>>> Signed-off-by: Marc Zyngier
>>>> ---
>>>>  include/kvm/arm_vgic.h    |  5 +++--
>>>>  virt/kvm/arm/arch_timer.c |  2 +-
>>>>  virt/kvm/arm/vgic.c       | 37 -
>>>>  3 files changed, 28 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>> index 1c653c1..5d47d60 100644
>>>> --- a/include/kvm/arm_vgic.h
>>>> +++ b/include/kvm/arm_vgic.h
>>>> @@ -164,7 +164,8 @@ struct irq_phys_map {
>>>>    u32 virt_irq;
>>>>    u32 phys_irq;
>>>>    u32 irq;
>>>> -  bool active;
>>>> +  bool shared;
>>>> +  bool active; /* Only valid if shared */
>>>>  };
>>>>
>>>>  struct vgic_dist {
>>>> @@ -347,7 +348,7 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
>>>>  int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
>>>>  int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
>>>>  struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
>>>> -                                       int virt_irq, int irq);
>>>> +                                       int virt_irq, int irq, bool shared);
>>>>  int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
>>>>  bool vgic_get_phys_irq_active(struct irq_phys_map *map);
>>>>  void vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
>>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>>> index b9fff78..9544d79 100644
>>>> --- a/virt/kvm/arm/arch_timer.c
>>>> +++ b/virt/kvm/arm/arch_timer.c
>>>> @@ -202,7 +202,7 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>>>     * Tell the VGIC that the virtual interrupt is tied to a
>>>>     * physical interrupt. We do that once per VCPU.
>>>>     */
>>>> -  timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
>>>> +  timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq, true);
>>>>    WARN_ON(!timer->map);
>>>>  }
>>>>
>>>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>>>> index f376b56..4223166 100644
>>>> --- a/virt/kvm/arm/vgic.c
>>>> +++ b/virt/kvm/arm/vgic.c
>>>> @@ -1125,18 +1125,21 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
>>>>    map = vgic_irq_map_search(vcpu, irq);
>>>>
>>>>    if (map) {
>>>> -          int ret;
>>>> -
>>>> -          BUG_ON(!map->active);
>>>>            vlr.hwirq = map->phys_irq;
>>>>            vlr.state |= LR_HW;
>>>>            vlr.state &= ~LR_EOI_INT;
>>>>
>>>> -          ret = irq_set_irqchip_state(map->irq,
>>>> -                                      IRQCHIP_STATE_ACTIVE,
>>>> -                                      true);
>>>>            vgic_irq_set_queued(vcpu, irq);
>>>
>>> the queued state is set again in vgic_queue_hwirq for level_sensitive
>>> IRQs although not harmful.
>>
>> Indeed. We still need it for edge interrupts though. I'll try to find a
>> nicer way...
>>
>>>> -          WARN_ON(ret);
>>>> +
>>>> +          if (map->shared) {
>>>> +                  int ret;
>>>> +
>>>> +                  BUG_ON(!map->active);
>>>> +                  ret = irq_set_irqchip_state(map->irq,
>>>> +                                              IRQCHIP_STATE_ACTIVE,
>>>> +                                              true);
>>>> +                  WARN_ON(ret);
>>>> +          }
>>>>    }
>>>>  }
>>>>
>>>> @@ -1368,21 +1371,28 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>>>>  static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
>>>>  {
>>>>    struct irq_phys_map *map;
>>>> +  bool active;
>>>>    int ret;
>>>>
>>>>    if (!(vlr.state & LR_HW))
>>>>            return 0;
>>>>
>>>>    map = vgic_irq_map_search(vcpu, vlr.irq);
>>>> -  BUG_ON(!map || !map->active);
>>>> +  BUG_ON(!map);
>>>> +  BUG_ON(map->shared && !map->active);
>>>>
>>>>    ret = irq_get_irqchip_state(map->irq,
 

KVM slow LAMP guest

2015-06-18 Thread Hansa

Hi,

I have a LAMP server running as a guest in KVM. Whenever the server has been
idle for some time, it takes about 30 seconds to load a WordPress site.
If the server is not idle, the site shows up in at most 5 seconds. I've already
turned off power management in the guest by passing

GRUB_CMDLINE_LINUX_DEFAULT="apm=off"

in /etc/default/grub. This has no effect.
Does KVM do some power management on guests? If so, how do I turn this off for 
my LAMP guest?

Best, Hansa




Re: [PATCH v2 12/13] KVM: x86: add SMM to the MMU role, support SMRAM address space

2015-06-18 Thread Paolo Bonzini


On 18/06/2015 07:02, Xiao Guangrong wrote:
> However, role->level is hotter than role->smm, so it's also a good
> candidate for this kind of trick.

Right, we could give the first 8 bits to role->level, so it can be
accessed with a single memory load and extracted with a single AND.
Those two are definitely the hottest fields.

> And this is only 32 bits, which can be brought into a CPU register by a
> single memory load; that is why I was worried whether it is really needed.

However, an 8-bit field can be loaded from memory with a single movz
instruction.
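
To illustrate the layout being discussed, here is a small sketch with the level in the low 8 bits of the 32-bit role word. The placement of smm in bits 8..15, the macro names and the helpers are assumptions for the example, not the actual kvm_mmu_page_role definition.

```c
#include <stdint.h>

/* Hypothetical layout: level in bits 0..7, smm in bits 8..15,
 * everything else in bits 16..31. */
#define ROLE_LEVEL_MASK	0xffu
#define ROLE_SMM_SHIFT	8
#define ROLE_SMM_MASK	(0xffu << ROLE_SMM_SHIFT)

static inline uint32_t role_pack(unsigned int level, unsigned int smm,
				 uint32_t rest)
{
	return (level & 0xffu) | ((smm & 0xffu) << ROLE_SMM_SHIFT) |
	       (rest << 16);
}

/* Low byte: one load of the word plus a single AND, no shift. */
static inline unsigned int role_level(uint32_t word)
{
	return word & ROLE_LEVEL_MASK;
}

/* Any other byte-sized field needs a mask and a shift when taken from
 * the word, but it is still byte-aligned in memory. */
static inline unsigned int role_smm(uint32_t word)
{
	return (word & ROLE_SMM_MASK) >> ROLE_SMM_SHIFT;
}
```

Because both fields are byte-aligned in this layout, a compiler targeting x86 is free to read either one directly from memory with a zero-extending byte load instead of loading the whole word first.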

Paolo