[PATCH 2/2] kvm/x86: use __test_bit

2015-08-30 Thread Michael S. Tsirkin
Let compiler do slightly better optimizations using the non-volatile __test_bit in all cases where the values are set using the non-volatile __set_bit and __clear_bit. I left test_bit in place where the mask is set using the atomic set_bit/clear_bit, for symmetry. This shaves about 100 bytes off
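
The difference between the two flavors of bit test can be sketched in user-space C. This is an illustration only, not the kernel's bitops implementation: all `sketch_*` names are hypothetical, and the only distinction shown is the `volatile` qualifier on the pointer, which forces the compiler to reload the word on every call.

```c
#include <limits.h>
#include <stdbool.h>

/* Number of bits in one bitmap word. */
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Volatile variant (hypothetical name): every call performs a real load,
 * so it pairs naturally with bits that other CPUs may change atomically. */
static inline bool sketch_test_bit(unsigned int nr,
                                   const volatile unsigned long *addr)
{
    return (addr[nr / SKETCH_BITS_PER_WORD] >> (nr % SKETCH_BITS_PER_WORD)) & 1UL;
}

/* Non-volatile variant (hypothetical name): the compiler may cache or
 * combine loads, which is fine when the bits are only changed with
 * non-atomic helpers such as the one below. */
static inline bool sketch_test_bit_nv(unsigned int nr,
                                      const unsigned long *addr)
{
    return (addr[nr / SKETCH_BITS_PER_WORD] >> (nr % SKETCH_BITS_PER_WORD)) & 1UL;
}

/* Non-atomic set, standing in for the role __set_bit plays in the text. */
static inline void sketch_set_bit_nv(unsigned int nr, unsigned long *addr)
{
    addr[nr / SKETCH_BITS_PER_WORD] |= 1UL << (nr % SKETCH_BITS_PER_WORD);
}
```

Both test variants return the same result; the non-volatile one merely gives the optimizer more freedom, which is presumably where the code-size saving mentioned in the message comes from.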

[PATCH RFC 0/3] kvm add ioeventfd pf capability

2015-08-30 Thread Michael S. Tsirkin
One of the reasons MMIO is slower than port IO is because it requires a page table lookup. For normal memory accesses, this is solved by using the TLB cache - but MMIO entries are either not present or reserved and so are never cached. To fix, allow installing an ioeventfd on top of a read only me

[PATCH RFC 1/3] vmx: allow ioeventfd for EPT violations

2015-08-30 Thread Michael S. Tsirkin
Even when we skip data decoding, MMIO is slightly slower than port IO because it uses the page-tables, so the CPU must do a pagewalk on each access. This overhead is normally masked by using the TLB cache: but not so for KVM MMIO, where PTEs are marked as reserved and so are never cached. As ioev

[PATCH RFC 2/3] svm: allow ioeventfd for NPT page faults

2015-08-30 Thread Michael S. Tsirkin
MMIO is slightly slower than port IO because it uses the page-tables, so the CPU must do a pagewalk on each access. This overhead is normally masked by using the TLB cache: but not so for KVM MMIO, where PTEs are marked as reserved and so are never cached. As ioeventfd memory is never read, make

[PATCH RFC 3/3] kvm: add KVM_CAP_IOEVENTFD_PF capability

2015-08-30 Thread Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin --- include/uapi/linux/kvm.h | 1 + arch/x86/kvm/x86.c| 1 + Documentation/virtual/kvm/api.txt | 7 +++ 3 files changed, 9 insertions(+) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 716ad4a..4509aa3 100644 -

[PATCH RFC 3/3] pci-testdev: add RO pages for ioeventfd

2015-08-30 Thread Michael S. Tsirkin
This seems hackish - would it be better to create this region automatically within kvm? Suggestions are welcome. Signed-off-by: Michael S. Tsirkin --- hw/misc/pci-testdev.c | 13 + 1 file changed, 13 insertions(+) diff --git a/hw/misc/pci-testdev.c b/hw/misc/pci-testdev.c index 9414

[PATCH RFC 2/3] pci-testdev: add subregion

2015-08-30 Thread Michael S. Tsirkin
Make mmio a subregion of the BAR. This will allow mapping rom within the same BAR down the road. Signed-off-by: Michael S. Tsirkin --- hw/misc/pci-testdev.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/hw/misc/pci-testdev.c b/hw/misc/pci-testdev.c index 6edc1cd..941

[PATCH RFC 1/3] pci-testdev: separate page for each mmio test

2015-08-30 Thread Michael S. Tsirkin
note: this makes BAR > 4K, which requires kvm unit test patch to support such BAR. Do we need to worry about old kvm unit test binaries? I'm guessing not ... Signed-off-by: Michael S. Tsirkin --- hw/misc/pci-testdev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/mi

[PATCH RFC 0/3] pci-testdev add support for kvm ioeventfd pf

2015-08-30 Thread Michael S. Tsirkin
This adds a test for triggering ioeventfd on pagefaults. This was used to verify that mmio ioeventfd on pagefault is as fast as portio. Michael S. Tsirkin (3): pci-testdev: separate page for each mmio test pci-testdev: add subregion pci-testdev: add RO pages for ioeventfd hw/misc/pci-testd

Re: [PATCH RFC 3/3] pci-testdev: add RO pages for ioeventfd

2015-08-30 Thread Gonglei
On 2015/8/30 17:20, Michael S. Tsirkin wrote: > This seems hackish - would it be better to create this region > automatically within kvm? Suggestions are welcome. > > Signed-off-by: Michael S. Tsirkin > --- > hw/misc/pci-testdev.c | 13 + > 1 file changed, 13 insertions(+) > > diff

[PATCH 0/9] Rework architected timer and fix UEFI reset

2015-08-30 Thread Christoffer Dall
The architected timer integration with the vgic had some shortcomings in that certain guests (one being UEFI) weren't fully supported. In fixing this I also found that we are scheduling the hrtimer for the virtual timer way too often, with a potential performance overhead. This series tries to ad

[PATCH 7/9] arm/arm64: KVM: vgic: Move active state handling to flush_hwstate

2015-08-30 Thread Christoffer Dall
We currently set the physical active state only when we *inject* a new pending virtual interrupt, but this is actually not correct, because we could have been preempted and run something else on the system that resets the active state to clear. This causes us to run the VM with the timer set to fi

[PATCH 4/9] arm/arm64: Implement GICD_ICFGR as RO for PPIs

2015-08-30 Thread Christoffer Dall
The GICD_ICFGR allows the bits for the SGIs and PPIs to be read only. We currently simulate this behavior by writing a hardcoded value to the register for the SGIs and PPIs on every write of these bits to the register (ignoring what the guest actually wrote), and by writing the same value as the re
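
A minimal sketch of what read-only handling for the SGI and PPI configuration bits could look like, assuming the usual GIC layout of two configuration bits per interrupt (so GICD_ICFGR0 covers SGIs 0-15 and GICD_ICFGR1 covers PPIs 16-31). This is not the code from the patch; `sketch_icfgr_write` and the constants are hypothetical names.

```c
#include <stdint.h>

#define SKETCH_ICFGR_IRQS_PER_REG 16u /* 2 config bits per IRQ, 32-bit regs */
#define SKETCH_NR_PRIVATE_IRQS    32u /* SGIs 0-15 plus PPIs 16-31 */

/* Hypothetical write handler: reg_index selects GICD_ICFGR<n>.
 * Writes to the registers that cover SGIs and PPIs are ignored, so the
 * guest always reads back whatever value those registers were reset to. */
static void sketch_icfgr_write(uint32_t *icfgr, unsigned int reg_index,
                               uint32_t value)
{
    if (reg_index < SKETCH_NR_PRIVATE_IRQS / SKETCH_ICFGR_IRQS_PER_REG)
        return; /* read-only: drop the guest's write */

    icfgr[reg_index] = value;
}
```

Ignoring the write outright avoids having to rewrite a hardcoded value on every access, which is the behavior the message describes as the current workaround.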

[PATCH 1/9] KVM: Add kvm_arch_vcpu_{un}blocking callbacks

2015-08-30 Thread Christoffer Dall
Sometimes it is useful for architecture implementations of KVM to know when the VCPU thread is about to block or when it comes back from blocking (arm/arm64 needs to know this to properly implement timers, for example). Therefore provide a generic architecture callback function in line with what

[PATCH 6/9] arm/arm64: KVM: Add mapped interrupts documentation

2015-08-30 Thread Christoffer Dall
Mapped interrupts on arm/arm64 is a tricky concept and the way we deal with them is not apparently easy to understand by reading various specs. Therefore, add a proper documentation file explaining the flow and rationale of the behavior of the vgic. Some of this text was contributed by Marc Zyngi

[PATCH 3/9] arm/arm64: KVM: vgic: Factor out level irq processing on guest exit

2015-08-30 Thread Christoffer Dall
Currently vgic_process_maintenance() processes dealing with a completed level-triggered interrupt directly, but we are soon going to reuse this logic for level-triggered mapped interrupts with the HW bit set, so move this logic into a separate static function. Probably the most scary part of this

[PATCH 2/9] arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block

2015-08-30 Thread Christoffer Dall
We currently schedule a soft timer every time we exit the guest if the timer did not expire while running the guest. This is really not necessary, because the only work we do in the timer work function is to kick the vcpu. Kicking the vcpu does two things: (1) If the vcpu thread is on a waitqueue

[PATCH 8/9] arm/arm64: KVM: Rework the arch timer to use level-triggered semantics

2015-08-30 Thread Christoffer Dall
The arch timer currently uses edge-triggered semantics in the sense that the line is never sampled by the vgic and lowering the line from the timer to the vgic doesn't have any effect on the pending state of virtual interrupts in the vgic. This means that we do not support a guest with the otherwi

[PATCH 5/9] arm/arm64: KVM: Use appropriate define in VGIC reset code

2015-08-30 Thread Christoffer Dall
We currently initialize the SGIs to be enabled in the VGIC code, but we use the VGIC_NR_PPIS define for this purpose, instead of the more natural VGIC_NR_SGIS. Change this slightly confusing use of the defines. Note: This should have no functional change, as both names are defined to the numb

[PATCH 9/9] arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0

2015-08-30 Thread Christoffer Dall
Provide a better quality of implementation and be architecture compliant on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on reset of the timer, and call kvm_timer_update_state(vcpu) at the same time, ensuring the timer output is not asserted after, for example, a PSCI system reset

[PATCH 1/2] arm/arm64: KVM: Add tracepoints for vgic and timer

2015-08-30 Thread Christoffer Dall
The VGIC and timer code for KVM arm/arm64 doesn't have any tracepoints or tracepoint infrastructure defined. Rewriting some of the timer code handling showed me how much we need this, so let's add these simple trace points once and for all and we can easily expand with additional trace points in t

[PATCH 0/2] Improve and add tracepoints for KVM on arm/arm64

2015-08-30 Thread Christoffer Dall
The timer and vgic code didn't have tracepoints for quite a while and we've been adding those ad-hoc when doing development a lot of times. Add some simple tracepoints for those parts of KVM to get the infrastructure in place. Also improve the kvm_exit tracepoint on arm/arm64 to print something me

[PATCH 2/2] arm/arm64: KVM: Improve kvm_exit tracepoint

2015-08-30 Thread Christoffer Dall
The ARM architecture only saves the exit class to the HSR (ESR_EL2 for arm64) on synchronous exceptions, not on asynchronous exceptions like an IRQ. However, we only report the exception class on kvm_exit, which is confusing because an IRQ looks like it exited at some PC with the same reason as th
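
To illustrate why the reported class can mislead, here is a small sketch of extracting the exception class from a syndrome value. The EC field occupies bits [31:26] of HSR/ESR_EL2; the helper names and the -1 sentinel are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

#define SKETCH_EC_SHIFT 26    /* EC field starts at bit 26 */
#define SKETCH_EC_MASK  0x3fu /* EC field is 6 bits wide */

/* Extract the exception class from a 32-bit syndrome value. */
static inline unsigned int sketch_exception_class(uint32_t syndrome)
{
    return (syndrome >> SKETCH_EC_SHIFT) & SKETCH_EC_MASK;
}

/* Hypothetical helper: the syndrome register is only updated on
 * synchronous exceptions, so for an asynchronous exit (e.g. an IRQ) the
 * stored value is stale and no class is reported (-1 here). */
static inline int sketch_reported_class(int is_synchronous, uint32_t syndrome)
{
    return is_synchronous ? (int)sketch_exception_class(syndrome) : -1;
}
```

Keeping the synchronous/asynchronous distinction explicit prevents an IRQ exit from being attributed to whatever class the previous synchronous exit left in the register.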

Re: [PATCH 0/3] KVM: arm64: Implement API for vGICv3 live migration

2015-08-30 Thread Christoffer Dall
On Fri, Aug 28, 2015 at 03:56:09PM +0300, Pavel Fedin wrote: > This patchset adds necessary userspace API in order to support vGICv3 live > migration. This includes accessing GIC distributor and redistributor memory > regions using device attribute ioctls, and system registers of > CPU interface us

Re: [PATCH 1/3] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-08-30 Thread Christoffer Dall
On Fri, Aug 28, 2015 at 03:56:10PM +0300, Pavel Fedin wrote: > The access is done similar to GICv2, using KVM_DEV_ARM_VGIC_GRP_DIST_REGS > and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS with KVM_SET_DEVICE_ATTR and > KVM_GET_DEVICE_ATTR ioctls. > > Registers are always assumed to be of their native size, 4

Re: [PATCH 3/3] KVM: arm64: Implement accessors for vGIC CPU interface registers

2015-08-30 Thread Christoffer Dall
On Fri, Aug 28, 2015 at 03:56:12PM +0300, Pavel Fedin wrote: > This commit adds accessors for all registers, being part of saved vGIC > context in the form of ICH_VMCR_EL2. This is necessary for enabling vGICv3 > live migration. > > Signed-off-by: Pavel Fedin > --- > arch/arm64/kvm/sys_regs.c

Re: [PATCH] KVM: arm64: Decode basic HYP fault information

2015-08-30 Thread Christoffer Dall
On Tue, Aug 11, 2015 at 10:34:07AM +0300, Pavel Fedin wrote: > Print exception vector name, exception class and PC translated to EL1 virtual > address. Significantly aids debugging HYP crashes without special means like > JTAG. my overall concern with this patch is that it adds complexity to an al

Re: [PATCH 3/3] KVM: arm64: Implement accessors for vGIC CPU interface registers

2015-08-30 Thread Peter Maydell
On 30 August 2015 at 17:50, Christoffer Dall wrote: > I had imagined we would encode the GICv3 register accesses through the > device API and not through the system register API, since I'm not crazy > about polluting the general system register handling logic with GIC > registers solely for the pu

Fwd: Data buffer Transfer through Hypercall

2015-08-30 Thread Hu Yaohui
Hi All, Does anyone know how to transfer data buffer through Hypercall? According to the current implementation from "kvm_emulate_hypercall", it only takes a primitive type as parameters through different registers. Can we use hypercall like read/write system call to transfer data between guest and

Re: [PATCH RFC 1/3] vmx: allow ioeventfd for EPT violations

2015-08-30 Thread Xiao Guangrong
On 08/30/2015 05:12 PM, Michael S. Tsirkin wrote: Even when we skip data decoding, MMIO is slightly slower than port IO because it uses the page-tables, so the CPU must do a pagewalk on each access. This overhead is normally masked by using the TLB cache: but not so for KVM MMIO, where PTEs ar

Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses

2015-08-30 Thread Jason Wang
On 08/26/2015 01:10 PM, Jason Wang wrote: > On 08/25/2015 07:51 PM, Michael S. Tsirkin wrote: >> > On Tue, Aug 25, 2015 at 05:05:47PM +0800, Jason Wang wrote: >> > We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS >> > and another is KVM_FAST_MMIO_BUS. This leads to i

Re: [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-08-30 Thread Xiao Guangrong
Hi Stefan, On 08/28/2015 07:58 PM, Stefan Hajnoczi wrote: +goto do_unmap; +} + +nvdimm->device_index = new_device_index(); +sprintf(name, "NVDIMM-%d", nvdimm->device_index); +memory_region_init_ram_ptr(&nvdimm->mr, OBJECT(dev), name, nvdimm_size, +

Re: [Qemu-devel] [PATCH v2 13/18] nvdimm: build namespace config data

2015-08-30 Thread Xiao Guangrong
On 08/28/2015 07:59 PM, Stefan Hajnoczi wrote: On Wed, Aug 26, 2015 at 06:42:01PM +0800, Xiao Guangrong wrote: On 08/26/2015 12:16 AM, Stefan Hajnoczi wrote: On Fri, Aug 14, 2015 at 10:52:06PM +0800, Xiao Guangrong wrote: +#ifdef NVDIMM_DEBUG +#define nvdebug(fmt, ...) fprintf(stderr, "nvd

RE: [PATCH] KVM: arm64: Decode basic HYP fault information

2015-08-30 Thread Pavel Fedin
Hello! > my overall concern with this patch is that it adds complexity to an > already really bad situation, and potentially increases the likelihood > of not seeing any debug info at all. Why? In this case we currently already drop into C code. I do the same, with some more useful printout. W

Re: [Qemu-devel] [PATCH v2 14/18] nvdimm: support NFIT_CMD_IMPLEMENTED function

2015-08-30 Thread Xiao Guangrong
On 08/28/2015 08:01 PM, Stefan Hajnoczi wrote: On Wed, Aug 26, 2015 at 06:46:35PM +0800, Xiao Guangrong wrote: On 08/26/2015 12:23 AM, Stefan Hajnoczi wrote: On Fri, Aug 14, 2015 at 10:52:07PM +0800, Xiao Guangrong wrote: static void dsm_write(void *opaque, hwaddr addr,