RE: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-02-26 Thread Wu, Feng


> -Original Message-
> From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
> Sent: Thursday, February 26, 2015 5:50 AM
> To: Wu, Feng
> Cc: t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
> g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
> j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com;
> eric.au...@linaro.org; linux-ker...@vger.kernel.org;
> io...@lists.linux-foundation.org; kvm@vger.kernel.org
> Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU
> is blocked
> 
> On Fri, Dec 12, 2014 at 11:14:58PM +0800, Feng Wu wrote:
> > This patch updates the Posted-Interrupts Descriptor when vCPU
> > is blocked.
> >
> > pre-block:
> > - Add the vCPU to the blocked per-CPU list
> > - Clear 'SN'
> > - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> >
> > post-block:
> > - Remove the vCPU from the per-CPU list
> >
> > Signed-off-by: Feng Wu 
> > ---
> 
> Don't see this is needed, can use the existing POSTED_INTR_VECTOR:
> 
> If in guest mode, IPI will be handled in VMX non-root by performed
> PIR->IRR transfer.
> 
> If outside guest mode, POSTED_INTR_VECTOR IPI will be handled by host
> which can wakeup the guest (in case it is halted).

Please see the following scenario:

1. vCPU0 is running on pCPU0
2. vCPU0 is halted and vCPU1 is currently running on pCPU0
3. An interrupt occurs for vCPU0, if we still use POSTED_INTR_VECTOR
for vCPU0, the notification event for vCPU0 (the event will go to pCPU1)
will be consumed by vCPU1 incorrectly. The worst case is that vCPU0
will never be woken up again since the wakeup event for it is always
consumed by other vCPUs incorrectly.

Thanks,
Feng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] Fix WARNING: quoted string split across lines inkvm_main.c

2015-02-26 Thread Xiubo Li


On 26/02/2015 15:24, Thomas Huth wrote:

On Thu, 26 Feb 2015 14:58:18 +0800
Xiubo Li  wrote:


WARNING: quoted string split across lines
+   printk(KERN_INFO "kvm: enabling virtualization on "
+"CPU%d failed\n", cpu);

When fails to enable virtualization on CPUx for kvm, this log will
be output in only one line, and it will be a little confusing for us
to grep this log in kernel source code.

In some case, the user maybe using on script to searching the error
log in kernel source code, if so, won't it always fail?

So this patch fix this issue.

Signed-off-by: Xiubo Li 
---
  virt/kvm/kvm_main.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a109370..3f08716 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2817,12 +2817,11 @@ static void hardware_enable_nolock(void *junk)
cpumask_set_cpu(cpu, cpus_hardware_enabled);

r = kvm_arch_hardware_enable();
-
if (r) {
cpumask_clear_cpu(cpu, cpus_hardware_enabled);
atomic_inc(&hardware_enable_failed);
-   printk(KERN_INFO "kvm: enabling virtualization on "
-"CPU%d failed\n", cpu);
+   printk(KERN_INFO "kvm: enabling virtualization on CPU%d 
failed\n",
+   cpu);

You could use pr_info() here instead of printk(KERN_INFO, ...) to avoid
exceeding the 80 columns limit.

  Thomas

Yes, agree.

I have fix this in another patch later, but i think i should fix it in 
this patch at the same time.



Thanks very much,

BRs
Xiubo







--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] vhost: drop hard-coded num_buffers size

2015-02-26 Thread Michael S. Tsirkin
On Wed, Feb 25, 2015 at 04:56:52PM -0500, David Miller wrote:
> From: "Michael S. Tsirkin" 
> Date: Tue, 24 Feb 2015 17:31:10 +0100
> 
> > The 2 that we use for copy_to_iter comes from sizeof(u16),
> > it used to be that way before the iov iter update.
> > Fix it up, making it obvious the size of stack access
> > is right.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> 
> Michael, what tree do you want these two patches to go through?

Sorry about being unclear.
They are net specific so I'd like them merged for 4.0 through net tree.
If that's a problem, pls let me know.

Thanks,
-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Michael S. Tsirkin
On Thu, Feb 26, 2015 at 11:50:42AM +1030, Rusty Russell wrote:
> Thomas Huth  writes:
> >  Hi all,
> >
> > with the recent kernel 3.19, I get a kernel warning when I start my
> > KVM guest on s390 with virtio balloon enabled:
> 
> The deeper problem is that virtio_ccw_get_config just silently fails on
> OOM.
> 
> Neither get_config nor set_config are expected to fail.
> 
> Cornelia, I think ccw and config_area should be allocated inside vcdev.
> You could either use pointers, or simply allocate vcdev with GDP_DMA.
> 
> This would avoid the kmalloc inside these calls.
> 
> Thanks,
> Rusty.

But it won't solve the problem of nested sleepers
with ccw: ATM is invokes ccw_io_helper to execute
commands, and that one calls wait_event
to wait for an interrupt.

Might be fixable but I think my patch looks like a safer
solution for 4.0/3.19, no?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Michael S. Tsirkin
On Thu, Feb 26, 2015 at 11:50:42AM +1030, Rusty Russell wrote:
> Thomas Huth  writes:
> >  Hi all,
> >
> > with the recent kernel 3.19, I get a kernel warning when I start my
> > KVM guest on s390 with virtio balloon enabled:
> 
> The deeper problem is that virtio_ccw_get_config just silently fails on
> OOM.

Same problem with virtio_ccw_reset.
But avoiding kmalloc calls in virtio_ccw_get_config isn't enough I think,
it might still sleep.


> 
> Neither get_config nor set_config are expected to fail.
> 
> Cornelia, I think ccw and config_area should be allocated inside vcdev.
> You could either use pointers, or simply allocate vcdev with GDP_DMA.
> 
> This would avoid the kmalloc inside these calls.
> 
> Thanks,
> Rusty.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Cornelia Huck
On Thu, 26 Feb 2015 11:50:42 +1030
Rusty Russell  wrote:

> Thomas Huth  writes:
> >  Hi all,
> >
> > with the recent kernel 3.19, I get a kernel warning when I start my
> > KVM guest on s390 with virtio balloon enabled:
> 
> The deeper problem is that virtio_ccw_get_config just silently fails on
> OOM.
> 
> Neither get_config nor set_config are expected to fail.

It is a problem that we cannot relay failures back to the caller: not
only for the memory allocations. We need to do channel I/O, and any
channel I/O can fail. For our virtio case, we don't have to deal with
the failures that may happen on real hardware (like path failures), but
what can happen is a hotunplug, which means we cannot talk to the
device anymore from one moment to the other.

> 
> Cornelia, I think ccw and config_area should be allocated inside vcdev.
> You could either use pointers, or simply allocate vcdev with GDP_DMA.
> 
> This would avoid the kmalloc inside these calls.

I can certainly look into that, but I'm not sure it's worth it. We still
have to deal with possible failures from doing channel I/O.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/4] KVM: APIC improvements (with bonus mixed mode)

2015-02-26 Thread Nadav Amit
Marcello, Radim,

As you know - I can run some tests on the patches and whether they comply
with real hardware. Please let me know which version to test and I’ll try to
do next week.

Regards,
Nadav

Marcelo Tosatti  wrote:

> Radim,
> 
> On Thu, Feb 12, 2015 at 07:41:30PM +0100, Radim Krčmář wrote:
>> Each patch has a diff from v1, here is only a prologue on the mythical
>> mixed xAPIC and x2APIC mode:
>> 
>> There is one interesting alias in xAPIC and x2APIC ICR destination, the
>> 0xff00, which is a broadcast in xAPIC and either a physical message
>> to high x2APIC ID or a message to an empty set in x2APIC cluster.
>> 
>> This corner case in mixed mode can be triggered by a weird message
>> 1) when we have no x2APIC ID 0xff00, but send x2APIC message there
> 
> 10.7 SYSTEM AND APIC BUS ARBITRATION
> 
> Note that except for the SIPI IPI (see Section 10.6.1, “Interrupt
> Command Register (ICR)”), all bus messages that fail to be delivered
> to their specified destination or destinations are automatically
> retried. Software should avoid situations in which IPIs are sent to
> disabled or nonexistent local APICs, causing the messages to be resent
> repeatedly.
> 
>> or after something that isn't forbidden in SDM most likely because they
>> didn't think that anyone would ever consider it
>> 2) we have x2APIC ID 0xff00 and reset other x2APICs into xAPIC mode
> 
> Or just x2APIC initialization in non lockstep:
> 
> 
>   vcpu0   vcpu1
> T0)   xapic mode  xapic mode
> T1)   x2apic enable
> T2)   broadcast IPI
> 
>> Current KVM doesn't need to consider (2), so there only is a slim chance
>> that some hobbyist OS pushes the specification to its limits.
>> 
>> The problem is that SDM for xAPIC doesn't believably specify[1] if all
>> messages beginning with 0xff are considered as broadcasts, 10.6.2.1:
>>  A broadcast IPI (bits 28-31 of the MDA are 1's)
>> 
>> No volunteer came to clear this up, so I hacked Linux to have one x2APIC
>> core between xAPIC ones.  Physical IPI to 0xff00 got delivered only
>> to CPU0, like most other destinations, Logical IPI to 0xff00 got
>> dropped and only 0x worked as a broadcast in both modes.
>> 
>> I think it is because Intel never considered mixed mode to be valid, and
>> seen delivery might be an emergent feature of QPI routing.
> 
>> Luckily, broadcast from xAPIC isn't delivered to x2APIC.
> 
> In real hardware?
> 
>> Real exploration would require greater modifications to Linux (ideally
>> writing a custom kernel), so this series implements something that makes
>> some sense and isn't too far from reality.
> 
> Ok, so the problem is that broadcast (ICR.destination == 0xff00)
> from xAPIC CPU is not delivered to x2APIC CPUs ?
> 
>> Radim Krčmář (4):
>>  KVM: x86: use MDA for interrupt matching
>>  KVM: x86: fix mixed APIC mode broadcast
>>  KVM: x86: avoid logical_map when it is invalid
>>  KVM: x86: simplify kvm_apic_map
> 
> I can't find any restriction against (cpu0==x2APIC, cpu1==xAPIC) in 
> the documentation.
> 
> Anyway, emulation should match physical hardware. From your message above,
> it is not clear what is the behaviour there:
> 
> "No volunteer came to clear this up, so I hacked Linux to have one x2APIC
> core between xAPIC ones.  Physical IPI to 0xff00 got delivered only
> to CPU0, like most other destinations, Logical IPI to 0xff00 got
> dropped and only 0x worked as a broadcast in both modes.
> 
> Luckily, broadcast from xAPIC isn't delivered to x2APIC."
> 
> From the x2APIC CPU or the xAPIC ones?
> 
> It should be easy to write kvm-unit-test testcases that match physical
> hardware behaviour (in general, i am having a hard time figure out
> in what way "mixed mode" is supposed to behave, please describe it
> clearly).
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Cornelia Huck
On Thu, 26 Feb 2015 09:45:29 +0100
"Michael S. Tsirkin"  wrote:

> On Thu, Feb 26, 2015 at 11:50:42AM +1030, Rusty Russell wrote:
> > Thomas Huth  writes:
> > >  Hi all,
> > >
> > > with the recent kernel 3.19, I get a kernel warning when I start my
> > > KVM guest on s390 with virtio balloon enabled:
> > 
> > The deeper problem is that virtio_ccw_get_config just silently fails on
> > OOM.
> 
> Same problem with virtio_ccw_reset.
> But avoiding kmalloc calls in virtio_ccw_get_config isn't enough I think,
> it might still sleep.

It is probably a problem with all calls into the transport that assume
an implementation that cannot fail: If we have a channel I/O backing,
we need to be able to handle things not working.

The only case we need to care about for virtio is probably a -ENODEV
triggered by a hotunplug, though.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: stand-alone kvmtool

2015-02-26 Thread Alex Bennée

Andre Przywara  writes:

> Hi Will,
>
> On 18/02/15 15:50, Will Deacon wrote:
>> Hi Andre,
>> 
>> Thanks for doing this. Since it looks unlikely that kvmtool will ever be
>> merged back into the kernel tree, it makes sense to cut the dependency
>> in my opinion.
>> 

>
> P.S. Although both approaches still provide the kvmtool patch history,
> they do not compile before the dependency cut patches. If that is an
> issue, one could think about injecting those new patches back into the
> repository time line. Admittedly that sounds scary, but would solve the
> problem.

If you can have it all it would be nice to preserve buildability all
through your history for bisecting (and the moon on a stick please ;-)

Is the dependency on the kernel sources something that has been stable
over the projects history or something that's been declining/increasing
over time?

-- 
Alex Bennée
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [kvm-unit-tests PATCH 00/18] arm/arm64: add smp support

2015-02-26 Thread Christoffer Dall
On Sun, Feb 01, 2015 at 07:34:28PM +0100, Andrew Jones wrote:
> This series extends the kvm-unit-tests/arm[64] framework to support smp.
> A break down of the patches is as follows
> 
> 01-02: prepare general framework for smp use
> 03-06: arm/arm64 fixups not 100% related to this series,
>but need to post some time...
> 07-09: add thread_info (for per-thread data) and suck some global
>data into it
> 10-11: add cpumask support (for per-cpu data) and suck some more
>global data in
>12: add arm64 simple spinlock implementation
> 13-14: add some PSCI support
> 15-16: further prep for smp_boot_secondary
>17: finally add smp_boot_secondary
>18: as usual, add a selftest to make sure it all works
> 
> These patches are also available here:
> https://github.com/rhdrjones/kvm-unit-tests/tree/arm/smp
> 
I've tested these patches on Juno and they seem to run fine, however,
since we don't support big.LITTLE yet, you have to run them under
'taskset ', but the config script uses $(getconf
_NPROCESSORS_CONF), which returns 6, and QEMU fails.  The interesting
bit is that the unit-tests still report 'PASS' - not sure why.

I'm fine with merging these and fixing anything up as we go along - this
is a test harness after all and I don't have time for an in-depth review
of these patches right now, sorry.

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/18] arm: fixups: add barriers, actually set MAIR

2015-02-26 Thread Christoffer Dall
On Sun, Feb 01, 2015 at 07:34:31PM +0100, Andrew Jones wrote:
> Sprinkle in some more isbs after context-changing operations,
> as the ARM ARM states we should. I haven't seen any problems
> without them, but we should do it right. Also, *actually* set
> the MAIR in asm_mmu_enable. We were reading, not writing...
> Luckily this was just spotted while adding the isbs, rather
> than leading to a nightmare debug session some day...
> 
> Signed-off-by: Andrew Jones 
> ---
>  arm/cstart.S| 8 ++--
>  lib/arm/processor.c | 1 +
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arm/cstart.S b/arm/cstart.S
> index da496e9eae7e0..39fac8f1e1bd8 100644
> --- a/arm/cstart.S
> +++ b/arm/cstart.S
> @@ -50,6 +50,7 @@ start:
>  .macro set_mode_stack mode, stack
>   add \stack, #S_FRAME_SIZE
>   msr cpsr_c, #(\mode | PSR_I_BIT | PSR_F_BIT)
> + isb
>   mov sp, \stack
>  .endm
>  
> @@ -70,6 +71,7 @@ exceptions_init:
>   set_mode_stack  FIQ_MODE, r1
>  
>   msr cpsr_cxsf, r2   @ back to svc mode
> + isb
>   mov pc, lr
>  
I don't think you really need those after the cpsr, but whatever.

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: SVM: fix interrupt injection (apic->isr_count always 0)

2015-02-26 Thread Borislav Petkov
On Wed, Feb 25, 2015 at 08:41:41PM +0100, Radim Krčmář wrote:
> this patch should fix it.

Yap, seems so :-)

> ---8<---
> In commit b4eef9b36db4, we started to use hwapic_isr_update() != NULL
> instead of kvm_apic_vid_enabled(vcpu->kvm).  This didn't work because
> SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
> change apic->isr_count, because it should always be 1.  The initial
> value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
> which is always 0 for SVM, so KVM could have injected interrupts when it
> shouldn't.
> 
> Fix it by setting SVM's hwapic_isr_update to NULL and make the initial
> isr_count depend on hwapic_isr_update() for good measure.
> 
> Fixes: b4eef9b36db4 ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of 
> !enable_apicv")
> Reported-by: Borislav Petkov 

Reported-and-tested-by: Borislav Petkov 

> Signed-off-by: Radim Krčmář 

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] target-arm: kvm: save/restore mp state

2015-02-26 Thread Paolo Bonzini


On 25/02/2015 17:02, Alex Bennée wrote:
> +#if defined CONFIG_KVM
> +{
> +.name = "mp_state",
> +.version_id = 0,
> +.size = sizeof(uint32_t),
> +.info = &vmstate_mpstate,
> +.flags = VMS_SINGLE,
> +.offset = 0,
> +},
> +#endif

This makes TCG migration state incompatible depending on whether QEMU
was built on ARM or x86.

You can instead add a subsection, and mark it as needed only iff
kvm_enabled().

Paolo

>  {
>  .name = "cpsr",
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [kvm-unit-tests PATCH 00/18] arm/arm64: add smp support

2015-02-26 Thread Andrew Jones
On Thu, Feb 26, 2015 at 12:34:02PM +0100, Christoffer Dall wrote:
> On Sun, Feb 01, 2015 at 07:34:28PM +0100, Andrew Jones wrote:
> > This series extends the kvm-unit-tests/arm[64] framework to support smp.
> > A break down of the patches is as follows
> > 
> > 01-02: prepare general framework for smp use
> > 03-06: arm/arm64 fixups not 100% related to this series,
> >but need to post some time...
> > 07-09: add thread_info (for per-thread data) and suck some global
> >data into it
> > 10-11: add cpumask support (for per-cpu data) and suck some more
> >global data in
> >12: add arm64 simple spinlock implementation
> > 13-14: add some PSCI support
> > 15-16: further prep for smp_boot_secondary
> >17: finally add smp_boot_secondary
> >18: as usual, add a selftest to make sure it all works
> > 
> > These patches are also available here:
> > https://github.com/rhdrjones/kvm-unit-tests/tree/arm/smp
> > 
> I've tested these patches on Juno and they seem to run fine, however,
> since we don't support big.LITTLE yet, you have to run them under
> 'taskset ', but the config script uses $(getconf
> _NPROCESSORS_CONF), which returns 6, and QEMU fails.  The interesting

Should I try to read the number of host cpus from some other source?
If you know something I can read that also works on big.LITTLE, then
I can change it now.

> bit is that the unit-tests still report 'PASS' - not sure why.

Ah, this is due to the weird way qemu's debugexit device sets its exit
code

hw/misc/debugexit.c:debug_exit_write()
{
  exit((val << 1) | 1);
}

To be consistent with that we made chr-testdev do the same thing (see
backends/testdev.c:testdev_eat_packet():case 'q'). Now, the
kvm-unit-tests run_tests.sh script knows about that, so it has

  eval $cmdline >> test.log
  if [ $? -le 1 ]; then
 echo -e "\e[32mPASS\e[0m $1"
  else
 echo -e "\e[31mFAIL\e[0m $1"
  fi

Yes, this sucks, as we can't tell the difference between qemu failing
to run the test, and exiting with 1 vs. the test running, passing -
exiting with (0 << 1) | 1. It's too bad debugexit didn't set a higher
bit (like 5 or 6) to flag a "debug exit". Maybe it's not too late to
change it? Paolo?

> 
> I'm fine with merging these and fixing anything up as we go along - this
> is a test harness after all and I don't have time for an in-depth review
> of these patches right now, sorry.

Thanks!
drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [kvm-unit-tests PATCH 00/18] arm/arm64: add smp support

2015-02-26 Thread Paolo Bonzini


On 26/02/2015 14:50, Andrew Jones wrote:
> Yes, this sucks, as we can't tell the difference between qemu failing
> to run the test, and exiting with 1 vs. the test running, passing -
> exiting with (0 << 1) | 1. It's too bad debugexit didn't set a higher
> bit (like 5 or 6) to flag a "debug exit". Maybe it's not too late to
> change it? Paolo?

That would indeed have been a better idea. :(  I'm not sure it's worth
it because in theory exit codes of 1 should never happen in
kvm-unit-tests.  That said, it does suck when they happen due to a bug
in the harness.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: SVM: fix interrupt injection (apic->isr_count always 0)

2015-02-26 Thread Paolo Bonzini


On 25/02/2015 20:41, Radim Krčmář wrote:
> 2015-02-25 17:00+0100, Borislav Petkov:
>> Hi,
>>
>> commit in $Subject breaks my kvm guest on AMD host, causing it to do the
>> following below. Mouse doesn't work anymore in the guest, network is
>> gone too.
>>
>> Reverting it fixes the issue.
> 
> Thanks,
> 
> this patch should fix it.
> 
> ---8<---
> In commit b4eef9b36db4, we started to use hwapic_isr_update() != NULL
> instead of kvm_apic_vid_enabled(vcpu->kvm).  This didn't work because
> SVM had it defined and "apicv" path in apic_{set,clear}_isr() does not
> change apic->isr_count, because it should always be 1.  The initial
> value of apic->isr_count was based on kvm_apic_vid_enabled(vcpu->kvm),
> which is always 0 for SVM, so KVM could have injected interrupts when it
> shouldn't.
> 
> Fix it by setting SVM's hwapic_isr_update to NULL and make the initial
> isr_count depend on hwapic_isr_update() for good measure.
> 
> Fixes: b4eef9b36db4 ("kvm: x86: vmx: NULL out hwapic_isr_update() in case of 
> !enable_apicv")
> Reported-by: Borislav Petkov 
> Signed-off-by: Radim Krčmář 
> ---
>  arch/x86/kvm/lapic.c | 4 ++--
>  arch/x86/kvm/svm.c   | 7 +--
>  2 files changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index e55b5fc344eb..bd4e34de24c7 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1572,7 +1572,7 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu)
>   apic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
>   }
>   apic->irr_pending = kvm_apic_vid_enabled(vcpu->kvm);
> - apic->isr_count = kvm_apic_vid_enabled(vcpu->kvm);
> + apic->isr_count = kvm_x86_ops->hwapic_isr_update ? 1 : 0;
>   apic->highest_isr_cache = -1;
>   update_divide_count(apic);
>   atomic_set(&apic->lapic_timer.pending, 0);
> @@ -1782,7 +1782,7 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>   update_divide_count(apic);
>   start_apic_timer(apic);
>   apic->irr_pending = true;
> - apic->isr_count = kvm_apic_vid_enabled(vcpu->kvm) ?
> + apic->isr_count = kvm_x86_ops->hwapic_isr_update ?
>   1 : count_vectors(apic->regs + APIC_ISR);
>   apic->highest_isr_cache = -1;
>   if (kvm_x86_ops->hwapic_irr_update)
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index d319e0c24758..54c7b36ad12d 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -3649,11 +3649,6 @@ static void svm_load_eoi_exitmap(struct kvm_vcpu 
> *vcpu, u64 *eoi_exit_bitmap)
>   return;
>  }
>  
> -static void svm_hwapic_isr_update(struct kvm *kvm, int isr)
> -{
> - return;
> -}
> -
>  static void svm_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  {
>   return;
> @@ -4403,7 +4398,7 @@ static struct kvm_x86_ops svm_x86_ops = {
>   .set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
>   .vm_has_apicv = svm_vm_has_apicv,
>   .load_eoi_exitmap = svm_load_eoi_exitmap,
> - .hwapic_isr_update = svm_hwapic_isr_update,
> + .hwapic_isr_update = NULL,

Please remove the line altogether.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Peter Zijlstra
On Thu, Feb 26, 2015 at 09:30:31AM +0100, Michael S. Tsirkin wrote:
> On Thu, Feb 26, 2015 at 11:50:42AM +1030, Rusty Russell wrote:
> > Thomas Huth  writes:
> > >  Hi all,
> > >
> > > with the recent kernel 3.19, I get a kernel warning when I start my
> > > KVM guest on s390 with virtio balloon enabled:
> > 
> > The deeper problem is that virtio_ccw_get_config just silently fails on
> > OOM.
> > 
> > Neither get_config nor set_config are expected to fail.
> > 
> > Cornelia, I think ccw and config_area should be allocated inside vcdev.
> > You could either use pointers, or simply allocate vcdev with GDP_DMA.
> > 
> > This would avoid the kmalloc inside these calls.
> > 
> > Thanks,
> > Rusty.
> 
> But it won't solve the problem of nested sleepers
> with ccw: ATM is invokes ccw_io_helper to execute
> commands, and that one calls wait_event
> to wait for an interrupt.
> 
> Might be fixable but I think my patch looks like a safer
> solution for 4.0/3.19, no?

I've no idea what your patch was since I'm not subscribed to any of the
lists this discussion is had on.

But you can annotate the warning away; _however_ with the annotation
needs to be a big comment explaining why its safe to do so. Typically to
involved talking about how its actually rare for the call to sleep.

So occasional sleeps inside a wait_event() are ok-ish, we'll just get to
go around once more. But once you consistently sleep inside a
wait_event() things go a bit funny.

So for instance; if in ccw_io_helper() we expect that wait_event(,
!doing_io()) to be (mostly) true on first go, then we'll never get into
__wait_event() and ->state won't actually be mucked about with.

The thing to avoid is not actually sleeping (much) but setting
TASK_RUNNING and turning the entire thing into a giant poll loop.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Michael S. Tsirkin
On Thu, Feb 26, 2015 at 06:08:49PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 26, 2015 at 09:30:31AM +0100, Michael S. Tsirkin wrote:
> > On Thu, Feb 26, 2015 at 11:50:42AM +1030, Rusty Russell wrote:
> > > Thomas Huth  writes:
> > > >  Hi all,
> > > >
> > > > with the recent kernel 3.19, I get a kernel warning when I start my
> > > > KVM guest on s390 with virtio balloon enabled:
> > > 
> > > The deeper problem is that virtio_ccw_get_config just silently fails on
> > > OOM.
> > > 
> > > Neither get_config nor set_config are expected to fail.
> > > 
> > > Cornelia, I think ccw and config_area should be allocated inside vcdev.
> > > You could either use pointers, or simply allocate vcdev with GDP_DMA.
> > > 
> > > This would avoid the kmalloc inside these calls.
> > > 
> > > Thanks,
> > > Rusty.
> > 
> > But it won't solve the problem of nested sleepers
> > with ccw: ATM is invokes ccw_io_helper to execute
> > commands, and that one calls wait_event
> > to wait for an interrupt.
> > 
> > Might be fixable but I think my patch looks like a safer
> > solution for 4.0/3.19, no?
> 
> I've no idea what your patch was since I'm not subscribed to any of the
> lists this discussion is had on.

Oh, sorry about that.
Here it is, below:

- Forwarded message from "Michael S. Tsirkin"  -

Date: Wed, 25 Feb 2015 15:36:02 +0100
From: "Michael S. Tsirkin" 
To: linux-ker...@vger.kernel.org
Cc: sta...@vger.kernel.org, Thomas Huth , Rusty 
Russell ,
virtualizat...@lists.linux-foundation.org, kvm@vger.kernel.org, 
Cornelia Huck 
Subject: [PATCH v2] virtio-balloon: do not call blocking ops when !TASK_RUNNING
Message-ID: <1424874878-17155-1-git-send-email-...@redhat.com>

virtio balloon has this code:
wait_event_interruptible(vb->config_change,
 (diff = towards_target(vb)) != 0
 || vb->need_stats_update
 || kthread_should_stop()
 || freezing(current));

Which is a problem because towards_target() call might block after
wait_event_interruptible sets task state to TAST_INTERRUPTIBLE, causing
the task_struct::state collision typical of nesting of sleeping
primitives

See also http://lwn.net/Articles/628628/ or Thomas's
bug report
http://article.gmane.org/gmane.linux.kernel.virtualization/24846
for a fuller explanation.

To fix, rewrite using wait_woken.

Cc: sta...@vger.kernel.org
Reported-by: Thomas Huth 
Signed-off-by: Michael S. Tsirkin 
---

changes from v1:
remove wait_event_interruptible
noticed by Cornelia Huck 

 drivers/virtio/virtio_balloon.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 0413157..5a6ad6d 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Balloon device works in 4K page units.  So each page is pointed to by
@@ -334,17 +335,25 @@ static int virtballoon_oom_notify(struct notifier_block 
*self,
 static int balloon(void *_vballoon)
 {
struct virtio_balloon *vb = _vballoon;
+   DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
set_freezable();
while (!kthread_should_stop()) {
s64 diff;
 
try_to_freeze();
-   wait_event_interruptible(vb->config_change,
-(diff = towards_target(vb)) != 0
-|| vb->need_stats_update
-|| kthread_should_stop()
-|| freezing(current));
+
+   add_wait_queue(&vb->config_change, &wait);
+   for (;;) {
+   if ((diff = towards_target(vb)) != 0 ||
+   vb->need_stats_update ||
+   kthread_should_stop() ||
+   freezing(current))
+   break;
+   wait_woken(&wait, TASK_INTERRUPTIBLE, 
MAX_SCHEDULE_TIMEOUT);
+   }
+   remove_wait_queue(&vb->config_change, &wait);
+
if (vb->need_stats_update)
stats_handle_request(vb);
if (diff > 0)
-- 
MST

- End forwarded message -
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio balloon: do not call blocking ops when !TASK_RUNNING

2015-02-26 Thread Michael S. Tsirkin
On Thu, Feb 26, 2015 at 06:27:43PM +0100, Michael S. Tsirkin wrote:
> On Thu, Feb 26, 2015 at 06:08:49PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 26, 2015 at 09:30:31AM +0100, Michael S. Tsirkin wrote:
> > > On Thu, Feb 26, 2015 at 11:50:42AM +1030, Rusty Russell wrote:
> > > > Thomas Huth  writes:
> > > > >  Hi all,
> > > > >
> > > > > with the recent kernel 3.19, I get a kernel warning when I start my
> > > > > KVM guest on s390 with virtio balloon enabled:
> > > > 
> > > > The deeper problem is that virtio_ccw_get_config just silently fails on
> > > > OOM.
> > > > 
> > > > Neither get_config nor set_config are expected to fail.
> > > > 
> > > > Cornelia, I think ccw and config_area should be allocated inside vcdev.
> > > > You could either use pointers, or simply allocate vcdev with GDP_DMA.
> > > > 
> > > > This would avoid the kmalloc inside these calls.
> > > > 
> > > > Thanks,
> > > > Rusty.
> > > 
> > > But it won't solve the problem of nested sleepers
> > > with ccw: ATM is invokes ccw_io_helper to execute
> > > commands, and that one calls wait_event
> > > to wait for an interrupt.
> > > 
> > > Might be fixable but I think my patch looks like a safer
> > > solution for 4.0/3.19, no?
> > 
> > I've no idea what your patch was since I'm not subscribed to any of the
> > lists this discussion is had on.
> 
> Oh, sorry about that.
> Here it is, below:
> 
> - Forwarded message from "Michael S. Tsirkin"  -
> 
> Date: Wed, 25 Feb 2015 15:36:02 +0100
> From: "Michael S. Tsirkin" 
> To: linux-ker...@vger.kernel.org
> Cc: sta...@vger.kernel.org, Thomas Huth , Rusty 
> Russell ,
>   virtualizat...@lists.linux-foundation.org, kvm@vger.kernel.org, 
> Cornelia Huck 
> Subject: [PATCH v2] virtio-balloon: do not call blocking ops when 
> !TASK_RUNNING
> Message-ID: <1424874878-17155-1-git-send-email-...@redhat.com>
> 
> virtio balloon has this code:
> wait_event_interruptible(vb->config_change,
>  (diff = towards_target(vb)) != 0
>  || vb->need_stats_update
>  || kthread_should_stop()
>  || freezing(current));
> 
> Which is a problem because towards_target() call might block after
> wait_event_interruptible sets task state to TAST_INTERRUPTIBLE, causing
> the task_struct::state collision typical of nesting of sleeping
> primitives
> 
> See also http://lwn.net/Articles/628628/ or Thomas's
> bug report
> http://article.gmane.org/gmane.linux.kernel.virtualization/24846
> for a fuller explanation.
> 
> To fix, rewrite using wait_woken.
> 
> Cc: sta...@vger.kernel.org
> Reported-by: Thomas Huth 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> changes from v1:
>   remove wait_event_interruptible
>   noticed by Cornelia Huck 
> 
>  drivers/virtio/virtio_balloon.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 0413157..5a6ad6d 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /*
>   * Balloon device works in 4K page units.  So each page is pointed to by
> @@ -334,17 +335,25 @@ static int virtballoon_oom_notify(struct notifier_block 
> *self,
>  static int balloon(void *_vballoon)
>  {
>   struct virtio_balloon *vb = _vballoon;
> + DEFINE_WAIT_FUNC(wait, woken_wake_function);
>  
>   set_freezable();
>   while (!kthread_should_stop()) {
>   s64 diff;
>  
>   try_to_freeze();
> - wait_event_interruptible(vb->config_change,
> -  (diff = towards_target(vb)) != 0
> -  || vb->need_stats_update
> -  || kthread_should_stop()
> -  || freezing(current));
> +
> + add_wait_queue(&vb->config_change, &wait);
> + for (;;) {
> + if ((diff = towards_target(vb)) != 0 ||
> + vb->need_stats_update ||
> + kthread_should_stop() ||
> + freezing(current))
> + break;
> + wait_woken(&wait, TASK_INTERRUPTIBLE, 
> MAX_SCHEDULE_TIMEOUT);
> + }
> + remove_wait_queue(&vb->config_change, &wait);
> +
>   if (vb->need_stats_update)
>   stats_handle_request(vb);
>   if (diff > 0)
> -- 
> MST

WRT which, I have a question. IIUC it's OK for towards_target
in this code to call wait_event in its turn, assuming that
*that* wait_event is not not calling blocking ops.
Right?


> - End forwarded message -
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at 

[BUG] Balloon malfunctions with memory hotplug

2015-02-26 Thread Luiz Capitulino
Hello,

Reproducer:

1. Start QEMU with balloon and memory hotplug support:

# qemu [...] -m 1G,slots=2,maxmem=2G -balloon virtio

2. Check balloon size:

(qemu) info balloon
balloon: actual=1024
(qemu) 

3. Hotplug some memory:

(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

4. This is step is _not_ needed to reproduce the problem,
   but you may need to online memory manually on Linux so
   that it becomes available in the guest

5. Check balloon size again:

(qemu) info balloon
balloon: actual=1024
(qemu) 

BUG: The guest now has 2GB of memory, but the balloon thinks
 the guest has 1GB

One may think that the problem is that the balloon driver is
ignoring hotplugged memory. This is not what's happening. If
you do balloon your guest, there's nothing stopping the
balloon driver in the guest from ballooning hotplugged memory.

The problem is that the balloon device in QEMU needs to know
the current amount of memory available to the guest.

Before memory hotplug this information was easy to obtain: the
current amount of memory available to the guest is the memory the
guest was booted with. This value is stored in the ram_size global
variable in QEMU and this is what the balloon device emulation
code uses today. However, when memory is hotplugged ram_size is
_not_ updated and the balloon device breaks.

I see two possible solutions for this problem:

1. In addition to reading ram_size, the balloon device in QEMU
   could scan pc-dimm devices to account for hotplugged memory.

   This solution was already implemented by zhanghailiang:

http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg02362.html

   It works, except that on Linux memory hotplug is a two-step
   procedure: first memory is inserted then it has to be onlined
   from user-space. So, if memory is inserted but not onlined
   this solution gives the opposite problem: the balloon device
   will report a larger memory amount than the guest actually has.

   Can we live with that? I guess not, but I'm open for discussion.

   If QEMU could be notified when Linux makes memory online, then
   the problem would be gone. But I guess this can't be done.

2. Modify the balloon driver in the guest to inform the balloon
   device on the host about the current memory available to the
   guest. This way, whenever the balloon device in QEMU needs
   to know the current amount of memory in the guest, it asks
   the guest. This drops any usage of ram_size in the balloon
   device

   I'm not completely sure this is feasible though. For example,
   what happens if the guest reports a memory amount to QEMU and
   right after this more memory is plugged?

   Besides, this solution is more complex than solution 1 and
   won't address older guests.

Another important detail is that, I *suspect* that a very similar
bug already exists with 32-bit guests even without memory
hotplug: what happens if you assign 6GB to a 32-bit without PAE
support? I think the same problem we're seeing with memory
hotplug will happen and solution 1 won't fix this, although
no one seems to care about 32-bit guests...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KVM/s390: Fix trivial typo in comments

2015-02-26 Thread Yannick Guerrini
Change 'architecuture' to 'architecture'

Signed-off-by: Yannick Guerrini 
---
 arch/s390/kvm/gaccess.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index 267523c..633fe9b 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -333,7 +333,7 @@ static int deref_table(struct kvm *kvm, unsigned long gpa, 
unsigned long *val)
  * @write: indicates if access is a write access
  *
  * Translate a guest virtual address into a guest absolute address by means
- * of dynamic address translation as specified by the architecuture.
+ * of dynamic address translation as specified by the architecture.
  * If the resulting absolute address is not available in the configuration
  * an addressing exception is indicated and @gpa will not be changed.
  *
-- 
1.9.5.msysgit.0

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/4] KVM: APIC improvements (with bonus mixed mode)

2015-02-26 Thread Marcelo Tosatti
On Thu, Feb 26, 2015 at 10:49:53AM +0200, Nadav Amit wrote:
> Marcello, Radim,
> 
> As you know - I can run some tests on the patches and whether they comply
> with real hardware. Please let me know which version to test and I’ll try to
> do next week.
> 
> Regards,
> Nadav

>From what i understand Radim did run a set of tests (with a modified
Linux kernel) to find out the information on mixed mode.

However its not clear what that behaviour is.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

2015-02-26 Thread Andy Lutomirski
On Thu, Jan 8, 2015 at 2:43 PM, Andy Lutomirski  wrote:
> On Thu, Jan 8, 2015 at 2:31 PM, Marcelo Tosatti  wrote:
>> On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote:
>>> On Tue, Jan 6, 2015 at 10:45 AM, Marcelo Tosatti  
>>> wrote:
>>> > On Tue, Jan 06, 2015 at 10:26:22AM -0800, Andy Lutomirski wrote:
>>> >> On Tue, Jan 6, 2015 at 10:13 AM, Marcelo Tosatti  
>>> >> wrote:
>>> >> > On Tue, Jan 06, 2015 at 08:56:40AM -0800, Andy Lutomirski wrote:
>>> >> >> On Jan 6, 2015 4:01 AM, "Paolo Bonzini"  wrote:
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > On 06/01/2015 09:42, Paolo Bonzini wrote:
>>> >> >> > > > > Still confused.  So we can freeze all vCPUs in the host, then 
>>> >> >> > > > > update
>>> >> >> > > > > pvti 1, then resume vCPU 1, then update pvti 0?  In that 
>>> >> >> > > > > case, we have
>>> >> >> > > > > a problem, because vCPU 1 can observe pvti 0 mid-update, and 
>>> >> >> > > > > KVM
>>> >> >> > > > > doesn't increment the version pre-update, and we can return 
>>> >> >> > > > > completely
>>> >> >> > > > > bogus results.
>>> >> >> > > > Yes.
>>> >> >> > > But then the getcpu test would fail (1->0).  Even if you have an 
>>> >> >> > > ABA
>>> >> >> > > situation (1->0->1), it's okay because the pvti that is fetched 
>>> >> >> > > is the
>>> >> >> > > one returned by the first getcpu.
>>> >> >> >
>>> >> >> > ... this case of partial update of pvti, which is caught by the 
>>> >> >> > version
>>> >> >> > field, if of course different from the other (extremely unlikely) 
>>> >> >> > that
>>> >> >> > Andy pointed out.  That is when the getcpus are done on the same 
>>> >> >> > vCPU,
>>> >> >> > but the rdtsc is another.
>>> >> >> >
>>> >> >> > That one can be fixed by rdtscp, like
>>> >> >> >
>>> >> >> > do {
>>> >> >> > // get a consistent (pvti, v, tsc) tuple
>>> >> >> > do {
>>> >> >> > cpu = get_cpu();
>>> >> >> > pvti = get_pvti(cpu);
>>> >> >> > v = pvti->version & ~1;
>>> >> >> > // also acts as rmb();
>>> >> >> > rdtsc_barrier();
>>> >> >> > tsc = rdtscp(&cpu1);
>>> >> >>
>>> >> >> Off-topic note: rdtscp doesn't need a barrier at all.  AIUI AMD
>>> >> >> specified it that way and both AMD and Intel implement it correctly.
>>> >> >> (rdtsc, on the other hand, definitely needs the barrier beforehand.)
>>> >> >>
>>> >> >> > // control dependency, no need for rdtsc_barrier?
>>> >> >> > } while(cpu != cpu1);
>>> >> >> >
>>> >> >> > // ... compute nanoseconds from pvti and tsc ...
>>> >> >> > rmb();
>>> >> >> > }   while(v != pvti->version);
>>> >> >>
>>> >> >> Still no good.  We can migrate a bunch of times so we see the same CPU
>>> >> >> all three times and *still* don't get a consistent read, unless we
>>> >> >> play nasty games with lots of version checks (I have a patch for that,
>>> >> >> but I don't like it very much).  The patch is here:
>>> >> >>
>>> >> >> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vdso_paranoia&id=a69754dc5ff33f5187162b5338854ad23dd7be8d
>>> >> >>
>>> >> >> but I don't like it.
>>> >> >>
>>> >> >> Thus far, I've been told unambiguously that a guest can't observe pvti
>>> >> >> while it's being written, and I think you're now telling me that this
>>> >> >> isn't true and that a guest *can* observe pvti while it's being
>>> >> >> written while the low bit of the version field is not set.  If so,
>>> >> >> this is rather strongly incompatible with the spec in the KVM docs.
>>> >> >>
>>> >> >> I don't suppose that you and Marcelo could agree on what the actual
>>> >> >> semantics that KVM provides are and could write it down in a way that
>>> >> >> people who haven't spent a long time staring at the request code
>>> >> >> understand?  And maybe you could even fix the implementation while
>>> >> >> you're at it if the implementation is, indeed, broken.  I have ugly
>>> >> >> patches to fix it here:
>>> >> >>
>>> >> >> https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/commit/?h=x86/vdso_paranoia&id=3b718a050cba52563d831febc2e1ca184c02bac0
>>> >> >>
>>> >> >> but I'm not thrilled with them.
>>> >> >>
>>> >> >> --Andy
>>> >> >
>>> >> > I suppose that separating the version write from the rest of the 
>>> >> > pvclock
>>> >> > structure is sufficient, as that would guarantee the writes are not
>>> >> > reordered even with fast string REP MOVS.
>>> >> >
>>> >> > Thanks for catching this Andy!
>>> >> >
>>> >>
>>> >> Don't you stil need:
>>> >>
>>> >> version++;
>>> >> write the rest;
>>> >> version++;
>>> >>
>>> >> with possible smp_wmb() in there to keep the compiler from messing 
>>> >> around?
>>> >
>>> > Correct. Could just as well follow the protocol and use odd/even, which
>>> > is what your patch does.
>>> >
>>> > What is the point with the new flags bit though?
>>>
>>> To try to work around the problem on old hosts.  I'm not at all
>>> convinced that this is worthwhile or that it helps, though.
>>
>> Andy,
>>
>> Are you going to submit t

Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-02-26 Thread Marcelo Tosatti
On Fri, Dec 12, 2014 at 11:14:58PM +0800, Feng Wu wrote:
> This patch updates the Posted-Interrupts Descriptor when vCPU
> is blocked.
> 
> pre-block:
> - Add the vCPU to the blocked per-CPU list
> - Clear 'SN'
> - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> 
> post-block:
> - Remove the vCPU from the per-CPU list
> 
> Signed-off-by: Feng Wu 
> ---
>  arch/x86/include/asm/kvm_host.h |  2 +
>  arch/x86/kvm/vmx.c  | 96 
> +
>  arch/x86/kvm/x86.c  | 22 +++---
>  include/linux/kvm_host.h|  4 ++
>  virt/kvm/kvm_main.c |  6 +++
>  5 files changed, 123 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 13e3e40..32c110a 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -101,6 +101,8 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t 
> base_gfn, int level)
>  
>  #define ASYNC_PF_PER_VCPU 64
>  
> +extern void (*wakeup_handler_callback)(void);
> +
>  enum kvm_reg {
>   VCPU_REGS_RAX = 0,
>   VCPU_REGS_RCX = 1,
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index bf2e6cd..a1c83a2 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -832,6 +832,13 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
>  static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
>  static DEFINE_PER_CPU(struct desc_ptr, host_gdt);
>  
> +/*
> + * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we
> + * can find which vCPU should be waken up.
> + */
> +static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu);
> +static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock);
> +
>  static unsigned long *vmx_io_bitmap_a;
>  static unsigned long *vmx_io_bitmap_b;
>  static unsigned long *vmx_msr_bitmap_legacy;
> @@ -1921,6 +1928,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int 
> cpu)
>   struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>   struct pi_desc old, new;
>   unsigned int dest;
> + unsigned long flags;
>  
>   memset(&old, 0, sizeof(old));
>   memset(&new, 0, sizeof(new));
> @@ -1942,6 +1950,20 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int 
> cpu)
>   new.nv = POSTED_INTR_VECTOR;
>   } while (cmpxchg(&pi_desc->control, old.control,
>   new.control) != old.control);
> +
> + /*
> +  * Delete the vCPU from the related wakeup queue
> +  * if we are resuming from blocked state
> +  */
> + if (vcpu->blocked) {
> + vcpu->blocked = false;
> + spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
> + vcpu->wakeup_cpu), flags);
> + list_del(&vcpu->blocked_vcpu_list);
> + 
> spin_unlock_irqrestore(&per_cpu(blocked_vcpu_on_cpu_lock,
> + vcpu->wakeup_cpu), flags);
> + vcpu->wakeup_cpu = -1;
> + }
>   }
>  }
>  
> @@ -1950,6 +1972,9 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
>   if (irq_remapping_cap(IRQ_POSTING_CAP)) {
>   struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>   struct pi_desc old, new;
> + unsigned long flags;
> + int cpu;
> + struct cpumask cpu_others_mask;
>  
>   memset(&old, 0, sizeof(old));
>   memset(&new, 0, sizeof(new));
> @@ -1961,6 +1986,54 @@ static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
>   pi_set_sn(&new);
>   } while (cmpxchg(&pi_desc->control, old.control,
>   new.control) != old.control);
> + } else if (vcpu->blocked) {
> + /*
> +  * The vcpu is blocked on the wait queue.
> +  * Store the blocked vCPU on the list of the
> +  * vcpu->wakeup_cpu, which is the destination
> +  * of the wake-up notification event.
> +  */
> + vcpu->wakeup_cpu = vcpu->cpu;
> + spin_lock_irqsave(&per_cpu(blocked_vcpu_on_cpu_lock,
> +   vcpu->wakeup_cpu), flags);
> + list_add_tail(&vcpu->blocked_vcpu_list,
> +   &per_cpu(blocked_vcpu_on_cpu,
> +   vcpu->wakeup_cpu));
> + spin_unlock_irqrestore(
> + &per_cpu(blocked_vcpu_on_cpu_lock,
> + vcpu->wakeup_cpu), flags);
> +
> + do {
> + old.control = new.control = pi_desc->control;
> +
> + /*
> +  * We should n

Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU is blocked

2015-02-26 Thread Marcelo Tosatti
On Thu, Feb 26, 2015 at 08:08:15AM +, Wu, Feng wrote:
> 
> 
> > -Original Message-
> > From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
> > Sent: Thursday, February 26, 2015 5:50 AM
> > To: Wu, Feng
> > Cc: t...@linutronix.de; mi...@redhat.com; h...@zytor.com; x...@kernel.org;
> > g...@kernel.org; pbonz...@redhat.com; dw...@infradead.org;
> > j...@8bytes.org; alex.william...@redhat.com; jiang@linux.intel.com;
> > eric.au...@linaro.org; linux-ker...@vger.kernel.org;
> > io...@lists.linux-foundation.org; kvm@vger.kernel.org
> > Subject: Re: [v3 24/26] KVM: Update Posted-Interrupts Descriptor when vCPU
> > is blocked
> > 
> > On Fri, Dec 12, 2014 at 11:14:58PM +0800, Feng Wu wrote:
> > > This patch updates the Posted-Interrupts Descriptor when vCPU
> > > is blocked.
> > >
> > > pre-block:
> > > - Add the vCPU to the blocked per-CPU list
> > > - Clear 'SN'
> > > - Set 'NV' to POSTED_INTR_WAKEUP_VECTOR
> > >
> > > post-block:
> > > - Remove the vCPU from the per-CPU list
> > >
> > > Signed-off-by: Feng Wu 
> > > ---
> > 
> > Don't see this is needed, can use the existing POSTED_INTR_VECTOR:
> > 
> > If in guest mode, IPI will be handled in VMX non-root by performed
> > PIR->IRR transfer.
> > 
> > If outside guest mode, POSTED_INTR_VECTOR IPI will be handled by host
> > which can wakeup the guest (in case it is halted).
> 
> Please see the following scenario:
> 
> 1. vCPU0 is running on pCPU0
> 2. vCPU0 is halted and vCPU1 is currently running on pCPU0
> 3. An interrupt occurs for vCPU0, if we still use POSTED_INTR_VECTOR
> for vCPU0, the notification event for vCPU0 (the event will go to pCPU1)
> will be consumed by vCPU1 incorrectly. The worst case is that vCPU0
> will never be woken up again since the wakeup event for it is always
> consumed by other vCPUs incorrectly.
> 
> Thanks,
> Feng

Ouch, yes.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch -rt 1/2] KVM: use simple waitqueue for vcpu->wq

2015-02-26 Thread Marcelo Tosatti
On Tue, Feb 17, 2015 at 06:44:19PM +0100, Sebastian Andrzej Siewior wrote:
> * Peter Zijlstra | 2015-01-21 16:07:16 [+0100]:
> 
> >On Tue, Jan 20, 2015 at 01:16:13PM -0500, Steven Rostedt wrote:
> >> I'm actually wondering if we should just nuke the _interruptible()
> >> version of swait. As it should only be all interruptible or all not
> >> interruptible, that the swait_wake() should just do the wake up
> >> regardless. In which case, swait_wake() is good enough. No need to have
> >> different versions where people may think do something special.
> >> 
> >> Peter?
> >
> >Yeah, I think the lastest thing I have sitting here on my disk only has
> >the swake_up() which does TASK_NORMAL, no choice there.
> 
> what is the swait status in terms of mainline? This sounds like it
> beeing worked on.
> I could take the series but then I would drop it again if the mainline
> implementation changes…

Hi Sebastian,

No, you would just adjust it to the upstream kernel interfaces, as the rest of
the -rt users of the swait interfaces.

Can you please include the series?

Thanks

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [kvm-unit-tests PATCH 00/18] arm/arm64: add smp support

2015-02-26 Thread Marcelo Tosatti
On Sun, Feb 01, 2015 at 07:34:28PM +0100, Andrew Jones wrote:
> This series extends the kvm-unit-tests/arm[64] framework to support smp.
> A break down of the patches is as follows
> 
> 01-02: prepare general framework for smp use
> 03-06: arm/arm64 fixups not 100% related to this series,
>but need to post some time...
> 07-09: add thread_info (for per-thread data) and suck some global
>data into it
> 10-11: add cpumask support (for per-cpu data) and suck some more
>global data in
>12: add arm64 simple spinlock implementation
> 13-14: add some PSCI support
> 15-16: further prep for smp_boot_secondary
>17: finally add smp_boot_secondary
>18: as usual, add a selftest to make sure it all works
> 
> These patches are also available here:
> https://github.com/rhdrjones/kvm-unit-tests/tree/arm/smp
> 
> Thanks in advance for reviews!
> 
> 
> Andrew Jones (18):
>   x86: expose spin_lock/unlock to lib code
>   lib/report: guard access to counters
>   arm: fixups: add barriers, actually set MAIR
>   arm64: fixup: use id_aa64mmfr0_el1 to set tcr
>   arm/arm64: processor.[ch] cleanups
>   arm/arm64: get rid of get_sp()
>   arm/arm64: introduce thread_info
>   arm/arm64: add per thread user_mode flag
>   arm/arm64: maintain per thread exception handlers
>   arm/arm64: add simple cpumask API
>   arm/arm64: make mmu_on per cpu
>   arm64: implement spinlocks
>   arm/arm64: import include/uapi/linux/psci.h
>   arm/arm64: add some PSCI API
>   arm/arm64: add cpu_relax() and friends
>   arm: clarify comment about exception stack use
>   arm/arm64: add smp_boot_secondary
>   arm/arm64: Add smp selftest
> 
>  arm/cstart.S |  47 +
>  arm/cstart64.S   |  48 +++---
>  arm/flat.lds |   6 +++
>  arm/selftest.c   |  70 +++--
>  arm/unittests.cfg|  11 +++-
>  config/config-arm-common.mak |   3 ++
>  config/config-arm64.mak  |   1 +
>  lib/arm/asm-offsets.c|   3 ++
>  lib/arm/asm/barrier.h|   5 ++
>  lib/arm/asm/bitops.h |  53 +++
>  lib/arm/asm/cpumask.h| 118 
> +++
>  lib/arm/asm/mmu-api.h|   1 +
>  lib/arm/asm/processor.h  |  12 -
>  lib/arm/asm/psci.h   |  15 ++
>  lib/arm/asm/smp.h|  56 
>  lib/arm/asm/thread_info.h|  60 ++
>  lib/arm/asm/uapi-psci.h  |  73 ++
>  lib/arm/bitops.c |  81 +
>  lib/arm/mmu.c|  15 --
>  lib/arm/processor.c  |  44 +++-
>  lib/arm/psci.c   |  49 ++
>  lib/arm/setup.c  |  11 +++-
>  lib/arm/smp.c|  57 +
>  lib/arm64/asm-offsets.c  |   2 +
>  lib/arm64/asm/barrier.h  |   5 ++
>  lib/arm64/asm/bitops.h   |  51 +++
>  lib/arm64/asm/cpumask.h  |   1 +
>  lib/arm64/asm/processor.h|  18 ++-
>  lib/arm64/asm/psci.h |  15 ++
>  lib/arm64/asm/smp.h  |   1 +
>  lib/arm64/asm/spinlock.h |   8 +--
>  lib/arm64/asm/thread_info.h  |   1 +
>  lib/arm64/asm/uapi-psci.h|   1 +
>  lib/arm64/processor.c|  81 +++--
>  lib/arm64/spinlock.c |  43 
>  lib/report.c |  16 ++
>  lib/x86/asm/spinlock.h   |  11 
>  lib/x86/smp.h|   7 +--
>  38 files changed, 1008 insertions(+), 92 deletions(-)
>  create mode 100644 lib/arm/asm/bitops.h
>  create mode 100644 lib/arm/asm/cpumask.h
>  create mode 100644 lib/arm/asm/psci.h
>  create mode 100644 lib/arm/asm/smp.h
>  create mode 100644 lib/arm/asm/thread_info.h
>  create mode 100644 lib/arm/asm/uapi-psci.h
>  create mode 100644 lib/arm/bitops.c
>  create mode 100644 lib/arm/psci.c
>  create mode 100644 lib/arm/smp.c
>  create mode 100644 lib/arm64/asm/bitops.h
>  create mode 100644 lib/arm64/asm/cpumask.h
>  create mode 100644 lib/arm64/asm/psci.h
>  create mode 100644 lib/arm64/asm/smp.h
>  create mode 100644 lib/arm64/asm/thread_info.h
>  create mode 100644 lib/arm64/asm/uapi-psci.h
>  create mode 100644 lib/arm64/spinlock.c
>  create mode 100644 lib/x86/asm/spinlock.h
> 
> -- 
> 1.9.3

Applied, thanks.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG] Balloon malfunctions with memory hotplug

2015-02-26 Thread zhanghailiang

On 2015/2/27 3:26, Luiz Capitulino wrote:

Hello,

Reproducer:

1. Start QEMU with balloon and memory hotplug support:

# qemu [...] -m 1G,slots=2,maxmem=2G -balloon virtio

2. Check balloon size:

(qemu) info balloon
balloon: actual=1024
(qemu)

3. Hotplug some memory:

(qemu) object_add memory-backend-ram,id=mem1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=mem1

4. This is step is _not_ needed to reproduce the problem,
but you may need to online memory manually on Linux so
that it becomes available in the guest

5. Check balloon size again:

(qemu) info balloon
balloon: actual=1024
(qemu)

BUG: The guest now has 2GB of memory, but the balloon thinks
  the guest has 1GB

One may think that the problem is that the balloon driver is
ignoring hotplugged memory. This is not what's happening. If
you do balloon your guest, there's nothing stopping the
balloon driver in the guest from ballooning hotplugged memory.

The problem is that the balloon device in QEMU needs to know
the current amount of memory available to the guest.

Before memory hotplug this information was easy to obtain: the
current amount of memory available to the guest is the memory the
guest was booted with. This value is stored in the ram_size global
variable in QEMU and this is what the balloon device emulation
code uses today. However, when memory is hotplugged ram_size is
_not_ updated and the balloon device breaks.

I see two possible solutions for this problem:

1. In addition to reading ram_size, the balloon device in QEMU
could scan pc-dimm devices to account for hotplugged memory.

This solution was already implemented by zhanghailiang:

 http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg02362.html

It works, except that on Linux memory hotplug is a two-step
procedure: first memory is inserted then it has to be onlined
from user-space. So, if memory is inserted but not onlined
this solution gives the opposite problem: the balloon device
will report a larger memory amount than the guest actually has.

Can we live with that? I guess not, but I'm open for discussion.

If QEMU could be notified when Linux makes memory online, then
the problem would be gone. But I guess this can't be done.



Yes, it is really a problem, balloon can't work well with memory block 
online/offline now.
virtio-balloon can't be notified when memory block online/offline now, 
actually, we can
add this capability by using the exist kernel memory hotplug/unplug notifier 
mechanism. (
just a simple register_memory_notifier().)



2. Modify the balloon driver in the guest to inform the balloon
device on the host about the current memory available to the
guest. This way, whenever the balloon device in QEMU needs
to know the current amount of memory in the guest, it asks
the guest. This drops any usage of ram_size in the balloon
device

I'm not completely sure this is feasible though. For example,
what happens if the guest reports a memory amount to QEMU and
right after this more memory is plugged?



Hmm, i wonder why we notify the number of pages which should be adjusted to 
virtio-balloon,
why not the memory 'target' size ? Is there any special reason ?

For linux guest, it can always know exactly its current real memory size, but 
QEMU may not, because
guest can do online/offline memory block by themselves.

If virtio-balloon in guest know the balloon's 'target' size, it can calculate 
the exact memory size
that should  be adjuested. and also can do corresponding action (fill or leak 
balloon)
when there is online/offline memory block occurred.


Besides, this solution is more complex than solution 1 and
won't address older guests.

Another important detail is that, I *suspect* that a very similar
bug already exists with 32-bit guests even without memory
hotplug: what happens if you assign 6GB to a 32-bit without PAE
support? I think the same problem we're seeing with memory
hotplug will happen and solution 1 won't fix this, although
no one seems to care about 32-bit guests...

.




--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 5/9] Fix WARNING:EXPORT_SYMBOL(x) should immediately.. in kvm_main.c

2015-02-26 Thread Xiubo Li
WARNING: EXPORT_SYMBOL(foo); should immediately follow its
function/variable
+EXPORT_SYMBOL_GPL(gfn_to_page);

This patch fixes these warnings to reduce noise when checking new
patches in kvm_main.c.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ba5355c..5761b3a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1481,7 +1481,6 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
 
return kvm_pfn_to_page(pfn);
 }
-
 EXPORT_SYMBOL_GPL(gfn_to_page);
 
 void kvm_release_page_clean(struct page *page)
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 1/9] Fix WARNING: quoted string split across lines in kvm_main.c

2015-02-26 Thread Xiubo Li
WARNING: quoted string split across lines
+   printk(KERN_INFO "kvm: enabling virtualization on "
+"CPU%d failed\n", cpu);

When fails to enable virtualization on CPUx for kvm, this log will
be output in only one line, and it will be a little confusing for us
to grep this log in kernel source code.

In some case, the user maybe using on script to searching the error
log in kernel source code, if so, won't it always fail?

So this patch fix this issue.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a109370..62c80fe 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2817,12 +2817,10 @@ static void hardware_enable_nolock(void *junk)
cpumask_set_cpu(cpu, cpus_hardware_enabled);
 
r = kvm_arch_hardware_enable();
-
if (r) {
cpumask_clear_cpu(cpu, cpus_hardware_enabled);
atomic_inc(&hardware_enable_failed);
-   printk(KERN_INFO "kvm: enabling virtualization on "
-"CPU%d failed\n", cpu);
+   pr_info("kvm: enabling virtualization on CPU%d failed\n", cpu);
}
 }
 
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 9/9] Fix WARNINGs about using printk drectly in kvm_main.c

2015-02-26 Thread Xiubo Li
WARNING: Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then
dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
+   printk(KERN_INFO "kvm: exiting hardware virtualization\n");

WARNING: Prefer [subsystem eg: netdev]_err([subsystem]dev, ... then
dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
+   printk(KERN_ERR "kvm: misc device register failed\n");

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9e9a768..db931a5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2900,13 +2900,11 @@ static int kvm_cpu_hotplug(struct notifier_block 
*notifier, unsigned long val,
val &= ~CPU_TASKS_FROZEN;
switch (val) {
case CPU_DYING:
-   printk(KERN_INFO "kvm: disabling virtualization on CPU%d\n",
-  cpu);
+   pr_info("kvm: disabling virtualization on CPU%d\n", cpu);
hardware_disable();
break;
case CPU_STARTING:
-   printk(KERN_INFO "kvm: enabling virtualization on CPU%d\n",
-  cpu);
+   pr_info("kvm: enabling virtualization on CPU%d\n", cpu);
hardware_enable();
break;
}
@@ -2922,7 +2920,7 @@ static int kvm_reboot(struct notifier_block *notifier, 
unsigned long val,
 *
 * And Intel TXT required VMX off for all cpu when system shutdown.
 */
-   printk(KERN_INFO "kvm: exiting hardware virtualization\n");
+   pr_info("kvm: exiting hardware virtualization\n");
kvm_rebooting = true;
on_each_cpu(hardware_disable_nolock, NULL, 1);
return NOTIFY_OK;
@@ -3352,7 +3350,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned 
vcpu_align,
 
r = misc_register(&kvm_dev);
if (r) {
-   printk(KERN_ERR "kvm: misc device register failed\n");
+   pr_err("kvm: misc device register failed\n");
goto out_unreg;
}
 
@@ -3363,7 +3361,7 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned 
vcpu_align,
 
r = kvm_init_debug();
if (r) {
-   printk(KERN_ERR "kvm: create debugfs files failed\n");
+   pr_err("kvm: create debugfs files failed\n");
goto out_undebugfs;
}
 
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 0/9] Fix possible warnings and errors for kvm_main.c

2015-02-26 Thread Xiubo Li

There are to much warning/error noise in kvm_main.c when gernerating and
checking new patches.

This patch series fix most of these warnings and errors to reduce possible
noise.


Change in v2:
- Accept Thomas Huth's advice of using pr_info instead of printk to avoid
  exceeding the 80 columns limit.



Xiubo Li (9):
  Fix WARNING: quoted string split across lines in kvm_main.c
  Fix WARNINGs for 'sizeof(X)' instead of 'sizeof X' in kvm_main.c
  Fix WARNING: labels should not be indented in kvm_main.c
  Fix ERROR:do not initialise statics to 0 or NULL in kvm_main.c
  Fix WARNING:EXPORT_SYMBOL(x) should immediately.. in kvm_main.c
  Fix WARNINGs:Missing blank line after declarationgs in kvm_main.c
  Fix WARNINGs:please, no space before tabs in kvm_main.c
  Fix ERROR & WARNING for better alignment in kvm_main.c
  Fix WARNINGs about using printk drectly in kvm_main.c

 virt/kvm/kvm_main.c | 75 +++--
 1 file changed, 38 insertions(+), 37 deletions(-)

-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 8/9] Fix ERROR & WARNING for better alignment in kvm_main.c

2015-02-26 Thread Xiubo Li
ERROR: code indent should use tabs where possible
+ const struct kvm_io_range *r2)$

WARNING: please, no spaces at the start of a line
+ const struct kvm_io_range *r2)$

This patch fixes this ERROR & WARNING to reduce noise when checking new
patches in kvm_main.c.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fb8a556..9e9a768 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2946,7 +2946,7 @@ static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
 }
 
 static inline int kvm_io_bus_cmp(const struct kvm_io_range *r1,
- const struct kvm_io_range *r2)
+const struct kvm_io_range *r2)
 {
if (r1->addr < r2->addr)
return -1;
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 4/9] Fix ERROR:do not initialise statics to 0 or NULL in kvm_main.c

2015-02-26 Thread Xiubo Li
ERROR: do not initialise statics to 0 or NULL
+static int kvm_usage_count = 0;

The kvm_usage_count will be placed to .bss segment when linking, so
not need to set it to 0 here obviously.

This patch fixes this ERROR to reduce noise when checking new patches
in kvm_main.c.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index cb1c187..ba5355c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -80,7 +80,7 @@ static DEFINE_RAW_SPINLOCK(kvm_count_lock);
 LIST_HEAD(vm_list);
 
 static cpumask_var_t cpus_hardware_enabled;
-static int kvm_usage_count = 0;
+static int kvm_usage_count;
 static atomic_t hardware_enable_failed;
 
 struct kmem_cache *kvm_vcpu_cache;
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 7/9] Fix WARNINGs:please, no space before tabs in kvm_main.c

2015-02-26 Thread Xiubo Li
WARNING: please, no space before tabs
+ * ^I^Ikvm->lock --> kvm->slots_lock --> kvm->irq_lock$

WARNING: please, no space before tabs
+^I^I * ^I- gfn_to_hva (kvm_read_guest, gfn_to_pfn)$

WARNING: please, no space before tabs
+^I^I * ^I- kvm_is_visible_gfn (mmu_check_roots)$

This patch fixes these warnings to reduce noise when checking new
patches in kvm_main.c.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c6106f2..fb8a556 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -72,7 +72,7 @@ module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);
 /*
  * Ordering of locks:
  *
- * kvm->lock --> kvm->slots_lock --> kvm->irq_lock
+ * kvm->lock --> kvm->slots_lock --> kvm->irq_lock
  */
 
 DEFINE_SPINLOCK(kvm_lock);
@@ -888,8 +888,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
 * or moved, memslot will be created.
 *
 * validation of sp->gfn happens in:
-*  - gfn_to_hva (kvm_read_guest, gfn_to_pfn)
-*  - kvm_is_visible_gfn (mmu_check_roots)
+*  - gfn_to_hva (kvm_read_guest, gfn_to_pfn)
+*  - kvm_is_visible_gfn (mmu_check_roots)
 */
kvm_arch_flush_shadow_memslot(kvm, slot);
 
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 2/9] Fix WARNINGs for 'sizeof(X)' instead of 'sizeof X' in kvm_main.c

2015-02-26 Thread Xiubo Li
There are many WARNINGs like this:
WARNING: sizeof tr should be sizeof(tr)
+   if (copy_from_user(&tr, argp, sizeof tr))

In kvm_main.c many places are using 'sizeof(X)', and the other places
are using 'sizeof X', while the kernel recommands to use 'sizeof(X)',
so this patch will replace all 'sizeof X' to 'sizeof(X)' to make them
consistent and at the same time to reduce the WARNINGs noise when we
are checking new patches.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 62c80fe..847de52 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2205,7 +2205,7 @@ out_free1:
if (r)
goto out;
r = -EFAULT;
-   if (copy_to_user(argp, &mp_state, sizeof mp_state))
+   if (copy_to_user(argp, &mp_state, sizeof(mp_state)))
goto out;
r = 0;
break;
@@ -2214,7 +2214,7 @@ out_free1:
struct kvm_mp_state mp_state;
 
r = -EFAULT;
-   if (copy_from_user(&mp_state, argp, sizeof mp_state))
+   if (copy_from_user(&mp_state, argp, sizeof(mp_state)))
goto out;
r = kvm_arch_vcpu_ioctl_set_mpstate(vcpu, &mp_state);
break;
@@ -2223,13 +2223,13 @@ out_free1:
struct kvm_translation tr;
 
r = -EFAULT;
-   if (copy_from_user(&tr, argp, sizeof tr))
+   if (copy_from_user(&tr, argp, sizeof(tr)))
goto out;
r = kvm_arch_vcpu_ioctl_translate(vcpu, &tr);
if (r)
goto out;
r = -EFAULT;
-   if (copy_to_user(argp, &tr, sizeof tr))
+   if (copy_to_user(argp, &tr, sizeof(tr)))
goto out;
r = 0;
break;
@@ -2238,7 +2238,7 @@ out_free1:
struct kvm_guest_debug dbg;
 
r = -EFAULT;
-   if (copy_from_user(&dbg, argp, sizeof dbg))
+   if (copy_from_user(&dbg, argp, sizeof(dbg)))
goto out;
r = kvm_arch_vcpu_ioctl_set_guest_debug(vcpu, &dbg);
break;
@@ -2252,14 +2252,14 @@ out_free1:
if (argp) {
r = -EFAULT;
if (copy_from_user(&kvm_sigmask, argp,
-  sizeof kvm_sigmask))
+  sizeof(kvm_sigmask)))
goto out;
r = -EINVAL;
-   if (kvm_sigmask.len != sizeof sigset)
+   if (kvm_sigmask.len != sizeof(sigset))
goto out;
r = -EFAULT;
if (copy_from_user(&sigset, sigmask_arg->sigset,
-  sizeof sigset))
+  sizeof(sigset)))
goto out;
p = &sigset;
}
@@ -2321,14 +2321,14 @@ static long kvm_vcpu_compat_ioctl(struct file *filp,
if (argp) {
r = -EFAULT;
if (copy_from_user(&kvm_sigmask, argp,
-  sizeof kvm_sigmask))
+  sizeof(kvm_sigmask)))
goto out;
r = -EINVAL;
-   if (kvm_sigmask.len != sizeof csigset)
+   if (kvm_sigmask.len != sizeof(csigset))
goto out;
r = -EFAULT;
if (copy_from_user(&csigset, sigmask_arg->sigset,
-  sizeof csigset))
+  sizeof(csigset)))
goto out;
sigset_from_compat(&sigset, &csigset);
r = kvm_vcpu_ioctl_set_sigmask(vcpu, &sigset);
@@ -2524,7 +2524,7 @@ static long kvm_vm_ioctl(struct file *filp,
 
r = -EFAULT;
if (copy_from_user(&kvm_userspace_mem, argp,
-   sizeof kvm_userspace_mem))
+   sizeof(kvm_userspace_mem)))
goto out;
 
r = kvm_vm_ioctl_set_memory_region(kvm, &kvm_userspace_mem);
@@ -2534,7 +2534,7 @@ static long kvm_vm_ioctl(struct file *filp,
struct kvm_dirty_log log;
 
r = -EFAULT;
-   if (copy_from_user(&log, argp, sizeof log))
+   if (copy_from_user(&log, argp, sizeof(log)))
goto out;
r = kvm_vm_ioctl_get_dirty

[PATCHv2 6/9] Fix WARNINGs:Missing blank line after declarationgs in kvm_main.c

2015-02-26 Thread Xiubo Li
There are many Warnings like this:
WARNING: Missing a blank line after declarations
+   struct kvm_coalesced_mmio_zone zone;
+   r = -EFAULT;

This patch fixes these warnings to reduce noise when checking new
patches in kvm_main.c.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5761b3a..c6106f2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1516,6 +1516,7 @@ void kvm_set_pfn_dirty(pfn_t pfn)
 {
if (!kvm_is_reserved_pfn(pfn)) {
struct page *page = pfn_to_page(pfn);
+
if (!PageReserved(page))
SetPageDirty(page);
}
@@ -1799,6 +1800,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
start = cur = ktime_get();
if (halt_poll_ns) {
ktime_t stop = ktime_add_ns(ktime_get(), halt_poll_ns);
+
do {
/*
 * This sets KVM_REQ_UNHALT if an interrupt
@@ -2134,6 +2136,7 @@ static long kvm_vcpu_ioctl(struct file *filp,
/* The thread running this VCPU changed. */
struct pid *oldpid = vcpu->pid;
struct pid *newpid = get_task_pid(current, PIDTYPE_PID);
+
rcu_assign_pointer(vcpu->pid, newpid);
if (oldpid)
synchronize_rcu();
@@ -2541,6 +2544,7 @@ static long kvm_vm_ioctl(struct file *filp,
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
case KVM_REGISTER_COALESCED_MMIO: {
struct kvm_coalesced_mmio_zone zone;
+
r = -EFAULT;
if (copy_from_user(&zone, argp, sizeof(zone)))
goto out;
@@ -2549,6 +2553,7 @@ static long kvm_vm_ioctl(struct file *filp,
}
case KVM_UNREGISTER_COALESCED_MMIO: {
struct kvm_coalesced_mmio_zone zone;
+
r = -EFAULT;
if (copy_from_user(&zone, argp, sizeof(zone)))
goto out;
@@ -3265,6 +3270,7 @@ struct kvm_vcpu *preempt_notifier_to_vcpu(struct 
preempt_notifier *pn)
 static void kvm_sched_in(struct preempt_notifier *pn, int cpu)
 {
struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
+
if (vcpu->preempted)
vcpu->preempted = false;
 
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2 3/9] Fix WARNING: labels should not be indented in kvm_main.c

2015-02-26 Thread Xiubo Li
WARNING: labels should not be indented
+   out_free_irq_routing:

This patch fixes this WARNING to reduce noise when checking new patches
in kvm_main.c.

Signed-off-by: Xiubo Li 
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 847de52..cb1c187 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2646,7 +2646,7 @@ static long kvm_vm_ioctl(struct file *filp,
goto out_free_irq_routing;
r = kvm_set_irq_routing(kvm, entries, routing.nr,
routing.flags);
-   out_free_irq_routing:
+out_free_irq_routing:
vfree(entries);
break;
}
-- 
1.9.1



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [BUG] Balloon malfunctions with memory hotplug

2015-02-26 Thread Markus Armbruster
Luiz Capitulino  writes:

> Hello,
>
> Reproducer:
>
> 1. Start QEMU with balloon and memory hotplug support:
>
> # qemu [...] -m 1G,slots=2,maxmem=2G -balloon virtio
>
> 2. Check balloon size:
>
> (qemu) info balloon
> balloon: actual=1024
> (qemu) 
>
> 3. Hotplug some memory:
>
> (qemu) object_add memory-backend-ram,id=mem1,size=1G
> (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
>
> 4. This is step is _not_ needed to reproduce the problem,
>but you may need to online memory manually on Linux so
>that it becomes available in the guest
>
> 5. Check balloon size again:
>
> (qemu) info balloon
> balloon: actual=1024
> (qemu) 
>
> BUG: The guest now has 2GB of memory, but the balloon thinks
>  the guest has 1GB

Impact other than "info balloon"?

> One may think that the problem is that the balloon driver is
> ignoring hotplugged memory. This is not what's happening. If
> you do balloon your guest, there's nothing stopping the
> balloon driver in the guest from ballooning hotplugged memory.
>
> The problem is that the balloon device in QEMU needs to know
> the current amount of memory available to the guest.
>
> Before memory hotplug this information was easy to obtain: the
> current amount of memory available to the guest is the memory the
> guest was booted with. This value is stored in the ram_size global
> variable in QEMU and this is what the balloon device emulation
> code uses today. However, when memory is hotplugged ram_size is
> _not_ updated and the balloon device breaks.
>
> I see two possible solutions for this problem:
>
> 1. In addition to reading ram_size, the balloon device in QEMU
>could scan pc-dimm devices to account for hotplugged memory.
>
>This solution was already implemented by zhanghailiang:
>
> http://lists.gnu.org/archive/html/qemu-devel/2014-11/msg02362.html
>
>It works, except that on Linux memory hotplug is a two-step
>procedure: first memory is inserted then it has to be onlined
>from user-space. So, if memory is inserted but not onlined
>this solution gives the opposite problem: the balloon device
>will report a larger memory amount than the guest actually has.
>
>Can we live with that? I guess not, but I'm open for discussion.
>
>If QEMU could be notified when Linux makes memory online, then
>the problem would be gone. But I guess this can't be done.
>
> 2. Modify the balloon driver in the guest to inform the balloon
>device on the host about the current memory available to the
>guest. This way, whenever the balloon device in QEMU needs
>to know the current amount of memory in the guest, it asks
>the guest. This drops any usage of ram_size in the balloon
>device

What happens when the guest lies?

>I'm not completely sure this is feasible though. For example,
>what happens if the guest reports a memory amount to QEMU and
>right after this more memory is plugged?
>
>Besides, this solution is more complex than solution 1 and
>won't address older guests.
>
> Another important detail is that, I *suspect* that a very similar
> bug already exists with 32-bit guests even without memory
> hotplug: what happens if you assign 6GB to a 32-bit without PAE
> support? I think the same problem we're seeing with memory
> hotplug will happen and solution 1 won't fix this, although
> no one seems to care about 32-bit guests...

Fun...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html