RE: [Qemu-devel] Live migration sequence

2015-10-16 Thread Pavel Fedin
 Hello!

> Some thoughts:
>   a) There is a migration state notifier list - see 
> add_migration_state_change_notifier (spice
> calls it)
>  - but I don't think it's called in the right places for your needs;  we
>  could add some more places that gets called.

 I am now trying to add one more state, something like 
MIGRATION_STATUS_FINISHING. It would mean that CPUs are stopped.
 Can you explain me migration code a bit? Where is iteration loop, and where 
are CPUs stopped? I am looking at migration.c but
cannot say that i understand some good portion of it. :)

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 07/20] KVM: ARM64: PMU: Add perf event map and introduce perf event creating function

2015-10-16 Thread Wei Huang


On 09/24/2015 05:31 PM, Shannon Zhao wrote:
> When we use tools like perf on host, perf passes the event type and the
> id of this event type category to kernel, then kernel will map them to
> hardware event number and write this number to PMU PMEVTYPER_EL0
> register. When getting the event number in KVM, directly use raw event
> type to create a perf_event for it.
> 
> Signed-off-by: Shannon Zhao 
> ---
>  arch/arm64/include/asm/pmu.h |   2 +
>  arch/arm64/kvm/Makefile  |   1 +
>  include/kvm/arm_pmu.h|  13 
>  virt/kvm/arm/pmu.c   | 154 
> +++
>  4 files changed, 170 insertions(+)
>  create mode 100644 virt/kvm/arm/pmu.c
> 
> diff --git a/arch/arm64/include/asm/pmu.h b/arch/arm64/include/asm/pmu.h
> index b9f394a..2c025f2 100644
> --- a/arch/arm64/include/asm/pmu.h
> +++ b/arch/arm64/include/asm/pmu.h
> @@ -31,6 +31,8 @@
>  #define ARMV8_PMCR_D (1 << 3) /* CCNT counts every 64th cpu cycle */
>  #define ARMV8_PMCR_X (1 << 4) /* Export to ETM */
>  #define ARMV8_PMCR_DP(1 << 5) /* Disable CCNT if 
> non-invasive debug*/
> +/* Determines which PMCCNTR_EL0 bit generates an overflow */
> +#define ARMV8_PMCR_LC(1 << 6)
>  #define  ARMV8_PMCR_N_SHIFT  11   /* Number of counters 
> supported */
>  #define  ARMV8_PMCR_N_MASK   0x1f
>  #define  ARMV8_PMCR_MASK 0x3f /* Mask for writable bits */
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 1949fe5..18d56d8 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -27,3 +27,4 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic-v3-emul.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += vgic-v3-switch.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
> +kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index bb0cd21..b48cdc6 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -37,4 +37,17 @@ struct kvm_pmu {
>  #endif
>  };
>  
> +#ifdef CONFIG_KVM_ARM_PMU
> +unsigned long kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u32 
> select_idx);
> +void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u32 data,
> + u32 select_idx);
> +#else
> +unsigned long kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u32 
> select_idx)
> +{
> + return 0;
> +}
> +void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u32 data,
> + u32 select_idx) {}
> +#endif
> +
>  #endif
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> new file mode 100644
> index 000..002ec79
> --- /dev/null
> +++ b/virt/kvm/arm/pmu.c
> @@ -0,0 +1,154 @@
> +/*
> + * Copyright (C) 2015 Linaro Ltd.
> + * Author: Shannon Zhao 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see .
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +static void kvm_pmu_set_evttyper(struct kvm_vcpu *vcpu, u32 idx, u32 val)
> +{
> + if (!vcpu_mode_is_32bit(vcpu))
> + vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + idx) = val;
> + else
> + vcpu_cp15(vcpu, c14_PMEVTYPER0 + idx) = val;
> +}
> +
> +static unsigned long kvm_pmu_get_evttyper(struct kvm_vcpu *vcpu, u32 idx)
> +{
> + if (!vcpu_mode_is_32bit(vcpu))
> + return vcpu_sys_reg(vcpu, PMEVTYPER0_EL0 + idx)
> +& ARMV8_EVTYPE_EVENT;
> + else
> + return vcpu_cp15(vcpu, c14_PMEVTYPER0 + idx)
> +& ARMV8_EVTYPE_EVENT;
> +}
> +
> +/**
> + * kvm_pmu_stop_counter - stop PMU counter for the selected counter
> + * @vcpu: The vcpu pointer
> + * @select_idx: The counter index
> + *
> + * If this counter has been configured to monitor some event, disable and
> + * release it.
> + */
> +static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, u32 select_idx)
> +{
> + struct kvm_pmu *pmu = >arch.pmu;
> + struct kvm_pmc *pmc = >pmc[select_idx];

A small suggestion (optional). It might be cleaner to define a macro and
use it here. Something like in arm_pmu.h :

#define VCPU_TO_PMU(vcpu)  (&(vcpu)->arch.pmu)

> +
> + if (pmc->perf_event) {
> + perf_event_disable(pmc->perf_event);
> + perf_event_release_kernel(pmc->perf_event);
> + 

Re: [PATCH] VFIO: platform: AMD xgbe reset module

2015-10-16 Thread Eric Auger
Dear all,
On 10/15/2015 06:53 PM, Alex Williamson wrote:
> On Thu, 2015-10-15 at 16:46 +0200, Eric Auger wrote:
>> Hi Arnd,
>> On 10/15/2015 03:59 PM, Arnd Bergmann wrote:
>>> On Thursday 15 October 2015 14:12:28 Christoffer Dall wrote:
>
> enum vfio_platform_op {
>   VFIO_PLATFORM_BIND,
>   VFIO_PLATFORM_UNBIND,
>   VFIO_PLATFORM_RESET,
> };
>
> struct platform_driver {
> int (*probe)(struct platform_device *);
> int (*remove)(struct platform_device *);
>   ...
>   int (*vfio_manage)(struct platform_device *, enum vfio_platform_op);
> struct device_driver driver;
> };
>
> This would integrate much more closely into the platform driver framework,
> just like the regular vfio driver integrates into the PCI framework.
> Unlike PCI however, you can't just use the generic driver framework to
> unbind the driver, because you still need device specific code.
>
 Thanks for these suggestions, really helpful.

 What I don't understand in the latter example is how VFIO knows which
 struct platform_driver to interact with?
>>>
>>> This would assume that the driver remains bound to the device, so VFIO
>>> gets a pointer to the device from somewhere (as it does today) and then
>>> follows the dev->driver pointer to get to the platform_driver.
> 
> The complexity of managing a bi-modal driver seems like far more than a
> little bit of code duplication in a device specific reset module and
> extends into how userspace makes devices available through vfio, so I
> think it's too late for that discussion.
>   
 Also, just so I'm sure I understand correctly, VFIO_PLATFORM_UNBIND is
 then called by VFIO before the VFIO driver unbinds from the device
 (unbinding the platform driver from the device being a completely
 separate thing)?
>>>
>>> This is where we'd need a little more changes for this approach. Instead
>>> of unbinding the device from its driver, the idea would be that the
>>> driver remains bound as far as the driver model is concerned, but
>>> it would be in a quiescent state where no other subsystem interacts with
>>> it (i.e. it gets unregistered from networking core or whichever it uses).
>>
>> Currently we use the same mechanism as for PCI, ie. unbind the native
>> driver and then bind VFIO platform driver in its place. Don't you think
>> changing this may be a pain for user-space tools that are designed to
>> work that way for PCI?
>>
>> My personal preference would be to start with your first proposal since
>> it looks (to me) less complex and "unknown" that the 2d approach.
>>
>> Let's wait for Alex opinion too...
> 
> I thought the reason we took the approach we have now is so that we
> don't have reset code loaded into the kernel unless we have a device
> that needs it.  Therefore we don't really want to preemptively load all
> the reset drivers and have them do a registration.  The unfortunate
> side-effect of that is the platform code needs to go looking for the
> driver.  We do that via the __symbol_get() trick, which only fails
> without modules because the underscore variant isn't defined in that
> case.  I remember asking Eric previously why we're using that rather
> than symbol_get(),

the rationale behind using __get_symbol is that the function takes a
char * as argument while the get_symbol macro takes the very symbol. I
needed to provide a string since this is what is stored in the lookup
table. Unfortunately I did not see the #if CONFIG_MODULES. I Should have
been warned about "__" and Alex doubts, sin of naïvety.

 I've since forgotten his answer, but the fact that
> __symbol_get() is only defined for modules makes it moot, we either need
> to make symbol_get() work or define __symbol_get() for non-module
> builds.
I currently don't see any solution for any of those. The only solution I
can see is someone registers the reset function pointer to vfio.

I think we could keep the existing reset modules, do the request_module
from VFIO, using their module name registered in the lookup table. But
instead of having the reset function in the look-up table we would have
the reset modules register their reset function pointer to VFIO. I think
this could work around the symbol_get issue.

This still leaves the layer violation argument though.

Assuming this works, would that be an acceptable solution, although I
acknowledge this does not perfectly fit into the driver model?

Best Regards

Eric
> 
> Otherwise, we should probably abandon the idea of these reset functions
> being modules and build them into the vfio platform driver (which would
> still be less loaded, dead code than a bi-modal host driver).  Thanks,

> 
> Alex
> 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH] VFIO: platform: AMD xgbe reset module

2015-10-16 Thread Arnd Bergmann
On Friday 16 October 2015 15:06:45 Eric Auger wrote:

>  I've since forgotten his answer, but the fact that
> > __symbol_get() is only defined for modules makes it moot, we either need
> > to make symbol_get() work or define __symbol_get() for non-module
> > builds.
> I currently don't see any solution for any of those. The only solution I
> can see is someone registers the reset function pointer to vfio.
> 
> I think we could keep the existing reset modules, do the request_module
> from VFIO, using their module name registered in the lookup table. But
> instead of having the reset function in the look-up table we would have
> the reset modules register their reset function pointer to VFIO. I think
> this could work around the symbol_get issue.
> 
> This still leaves the layer violation argument though.
> 
> Assuming this works, would that be an acceptable solution, although I
> acknowledge this does not perfectly fit into the driver model?

I think it's possible to avoid the layering violation that way too,
by loading the module based on the compatible string, with a module_alias.

static void vfio_platform_get_reset(struct vfio_platform_device *vdev,
struct device *dev)
{
const char *compat;
int (*reset)(struct vfio_platform_device *);
int ret, i;
char modname[256];

ret = device_property_read_string(dev, "compatible", );
if (ret)
return;

reset = vfio_platform_lookup_reset(compat);
if (!reset) {
snprintf(modname, "vfio-reset:%s", compat);
request_module(modname);
reset = vfio_platform_lookup_reset(compat);
}

vdev->reset = reset;
}

---

#define module_vfio_reset_handler(compat, reset)\
MODULE_ALIAS("vfio_reset" compat);  \
static int reset ## _module_init(void)  \
{   \
return vfio_reset_register(THIS_MODULE, compat, );\
}

I think that solution is good enough, as it avoids most of the
problems with the current implementation but is a simple enough change.

Arnd
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 18/20] KVM: ARM64: Reset PMU state when resetting vcpu

2015-10-16 Thread Wei Huang


On 09/24/2015 05:31 PM, Shannon Zhao wrote:
> Signed-off-by: Shannon Zhao 

Missing commit message here.

> ---
>  arch/arm64/kvm/reset.c |  3 +++
>  include/kvm/arm_pmu.h  |  2 ++
>  virt/kvm/arm/pmu.c | 18 ++
>  3 files changed, 23 insertions(+)
> 
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 91cf535..4da7f6c 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -120,6 +120,9 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>   /* Reset system registers */
>   kvm_reset_sys_regs(vcpu);
>  
> + /* Reset PMU */
> + kvm_pmu_vcpu_reset(vcpu);
> +
>   /* Reset timer */
>   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>  }
> diff --git a/include/kvm/arm_pmu.h b/include/kvm/arm_pmu.h
> index 953c400..8dacfd3 100644
> --- a/include/kvm/arm_pmu.h
> +++ b/include/kvm/arm_pmu.h
> @@ -38,6 +38,7 @@ struct kvm_pmu {
>  };
>  
>  #ifdef CONFIG_KVM_ARM_PMU
> +void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu);
>  void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu);
>  unsigned long kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u32 
> select_idx);
>  void kvm_pmu_disable_counter(struct kvm_vcpu *vcpu, u32 val);
> @@ -46,6 +47,7 @@ void kvm_pmu_software_increment(struct kvm_vcpu *vcpu, u32 
> val);
>  void kvm_pmu_set_counter_event_type(struct kvm_vcpu *vcpu, u32 data,
>   u32 select_idx);
>  #else
> +void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu) {}
>  void kvm_pmu_sync_hwstate(struct kvm_vcpu *vcpu) {}
>  unsigned long kvm_pmu_get_counter_value(struct kvm_vcpu *vcpu, u32 
> select_idx)
>  {
> diff --git a/virt/kvm/arm/pmu.c b/virt/kvm/arm/pmu.c
> index ca7e849..faa2b76 100644
> --- a/virt/kvm/arm/pmu.c
> +++ b/virt/kvm/arm/pmu.c
> @@ -63,6 +63,24 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, 
> u32 select_idx)
>  }
>  
>  /**
> + * kvm_pmu_vcpu_reset - reset pmu state for cpu
> + * @vcpu: The vcpu pointer
> + *
> + */
> +void kvm_pmu_vcpu_reset(struct kvm_vcpu *vcpu)
> +{
> + int i;
> + struct kvm_pmu *pmu = >arch.pmu;
> +
> + for (i = 0; i < ARMV8_MAX_COUNTERS; i++) {
> + kvm_pmu_stop_counter(vcpu, i);
> + pmu->pmc[i].idx = i;
> + pmu->pmc[i].vcpu = vcpu;
> + }
> + pmu->irq_pending = false;
> +}
> +
> +/**
>   * kvm_pmu_sync_hwstate - sync pmu state for cpu
>   * @vcpu: The vcpu pointer
>   *
> 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH] KVM/arm: kernel low level debug suport for ARM32 virtual platforms

2015-10-16 Thread Mario Smarduch
When booting a VM using QEMU or Kvmtool there are no clear ways to 
enable low level debugging for these virtual platforms. some menu port 
choices are not supported by the virtual platforms at all. And there is no
help on the location of physical and virtual addresses for the ports.
This may lead to wrong debug port and a frozen VM with a blank screen.

This patch adds menu selections for QEMU and Kvmtool virtual platforms for low 
level kernel print debugging. Help section displays port physical and
virutal addresses.

ARM reference models use the MIDR register to run-time select UART port address 
(for ARCH_VEXPRESS) based on A9 or A15 part numbers. Looked for a same approach
but couldn't find a way to differentiate between virtual platforms, something
like a platform register.

Signed-off-by: Mario Smarduch 
---
 arch/arm/Kconfig.debug | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index a2e16f9..d126bd4 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -1155,6 +1155,28 @@ choice
  This option selects UART0 on VIA/Wondermedia System-on-a-chip
  devices, including VT8500, WM8505, WM8650 and WM8850.
 
+   config DEBUG_VIRT_UART_QEMU
+   bool "Kernel low-level debugging on QEMU Virtual Platform"
+   depends on ARCH_VIRT
+   select DEBUG_UART_PL01X
+   help
+ Say Y here if you want the debug print routines to direct
+ their output to PL011 UART port on QEMU Virtual Platform.
+ Appropriate address values are:
+   PHYSVIRT
+   0x900   0xf809
+
+   config DEBUG_VIRT_UART_KVMTOOL
+   bool "Kernel low-level debugging on Kvmtool Virtual Platform"
+   depends on ARCH_VIRT
+   select DEBUG_UART_8250
+   help
+ Say Y here if you want the debug print routines to direct
+ their output to 8250 UART port on Kvmtool Virtual
+ Platform. Appropriate address values are:
+   PHYSVIRT
+   0x3f8   0xf80903f8
+
config DEBUG_ICEDCC
bool "Kernel low-level debugging via EmbeddedICE DCC channel"
help
-- 
1.9.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v3 00/20] KVM: ARM64: Add guest PMU support

2015-10-16 Thread Christopher Covington
On 10/16/2015 12:55 AM, Wei Huang wrote:
> 
> 
> On 09/24/2015 05:31 PM, Shannon Zhao wrote:
>> This patchset adds guest PMU support for KVM on ARM64. It takes
>> trap-and-emulate approach. When guest wants to monitor one event, it
>> will be trapped by KVM and KVM will call perf_event API to create a perf
>> event and call relevant perf_event APIs to get the count value of event.
>>
>> Use perf to test this patchset in guest. When using "perf list", it
>> shows the list of the hardware events and hardware cache events perf
>> supports. Then use "perf stat -e EVENT" to monitor some event. For
>> example, use "perf stat -e cycles" to count cpu cycles and
>> "perf stat -e cache-misses" to count cache misses.
>>
>> Below are the outputs of "perf stat -r 5 sleep 5" when running in host
>> and guest.
>>
>> Host:
>>  Performance counter stats for 'sleep 5' (5 runs):
>>
>>   0.551428  task-clock (msec) #0.000 CPUs utilized   
>>  ( +-  0.91% )
>>  1  context-switches  #0.002 M/sec
>>  0  cpu-migrations#0.000 K/sec
>> 48  page-faults   #0.088 M/sec   
>>  ( +-  1.05% )
>>1150265  cycles#2.086 GHz 
>>  ( +-  0.92% )
>>  stalled-cycles-frontend
>>  stalled-cycles-backend
>> 526398  instructions  #0.46  insns per cycle 
>>  ( +-  0.89% )
>>  branches
>>   9485  branch-misses #   17.201 M/sec   
>>  ( +-  2.35% )
>>
>>5.000831616 seconds time elapsed  
>> ( +-  0.00% )
>>
>> Guest:
>>  Performance counter stats for 'sleep 5' (5 runs):
>>
>>   0.730868  task-clock (msec) #0.000 CPUs utilized   
>>  ( +-  1.13% )
>>  1  context-switches  #0.001 M/sec
>>  0  cpu-migrations#0.000 K/sec
>> 48  page-faults   #0.065 M/sec   
>>  ( +-  0.42% )
>>1642982  cycles#2.248 GHz 
>>  ( +-  1.04% )
>>  stalled-cycles-frontend
>>  stalled-cycles-backend
>> 637964  instructions  #0.39  insns per cycle 
>>  ( +-  0.65% )
>>  branches
>>  10377  branch-misses #   14.198 M/sec   
>>  ( +-  1.09% )
>>
>>5.001289068 seconds time elapsed  
>> ( +-  0.00% )
>>
> 
> Thanks for V3. One suggestion is to run more perf stress tests, such as
> "perf test". So we know the corner cases are covered as much as possible.

I'd also recommend Vince Weaver's perf_event_tests. It tests things like
signal-on-counter-overflow that I've never seen anywhere else (other than some
of my own code).

https://github.com/deater/perf_event_tests

Christopher Covington

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [Qemu-devel] Live migration sequence

2015-10-16 Thread Dr. David Alan Gilbert
* Pavel Fedin (p.fe...@samsung.com) wrote:
>  Hello!
> 
> > Some thoughts:
> >   a) There is a migration state notifier list - see 
> > add_migration_state_change_notifier (spice
> > calls it)
> >  - but I don't think it's called in the right places for your needs;  we
> >  could add some more places that gets called.
> 
>  I am now trying to add one more state, something like 
> MIGRATION_STATUS_FINISHING. It would mean that CPUs are stopped.
>  Can you explain me migration code a bit? Where is iteration loop, and where 
> are CPUs stopped? I am looking at migration.c but
> cannot say that i understand some good portion of it. :)

The outgoing side of migration comes into migrate_fd_connect which does
all the setup and then starts 'migration_thread'.  The big while loop in there 
does
most of the work, and on each loop normally ends up calling either
   qemu_savevm_state_iterate
or
  migration_completion

migration_completion calls vm_stop_force_state to stop the CPU,
and then qemu_savevm_state_complete to save all the remaining devices
out.

Dave


> 
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm