On Wed, Jun 17, 2015 at 12:00:56AM +0200, Igor Mammedov wrote:
> On Tue, 16 Jun 2015 23:14:20 +0200
> "Michael S. Tsirkin" wrote:
>
> > On Tue, Jun 16, 2015 at 06:33:37PM +0200, Igor Mammedov wrote:
> > > since commit
> > > 1d4e7e3 kvm: x86: increase user memory slots to 509
> > >
> > > it beca
On Wed, Jun 17, 2015 at 12:19:15AM +0200, Igor Mammedov wrote:
> On Tue, 16 Jun 2015 23:16:07 +0200
> "Michael S. Tsirkin" wrote:
>
> > On Tue, Jun 16, 2015 at 06:33:34PM +0200, Igor Mammedov wrote:
> > > Series extends vhost to support up to 509 memory regions,
> > > and adds some vhost:translate
Acked-by: Thomas Sailer
On 06/17/2015 02:35 AM, Andy Lutomirski wrote:
This is only used if BAYCOM_DEBUG is defined.
Cc: walter harms
Cc: Ralf Baechle
Cc: Thomas Sailer
Cc: linux-h...@vger.kernel.org
Signed-off-by: Andy Lutomirski
---
I'm hoping for an ack for this to go through -tip.
In cdc7957d1954 ("x86: move native_read_tsc() offline"),
native_read_tsc was moved out of line, presumably for some
now-obsolete vDSO-related reason. Undo it.
The entire rdtsc, shl, or sequence is only 11 bytes, and calls via
rdtscl and similar helpers were already inlined.
Signed-off-by: Andy L
This is only used if BAYCOM_DEBUG is defined.
Cc: walter harms
Cc: Ralf Baechle
Cc: Thomas Sailer
Cc: linux-h...@vger.kernel.org
Signed-off-by: Andy Lutomirski
---
I'm hoping for an ack for this to go through -tip.
drivers/net/hamradio/baycom_epp.c | 2 +-
1 file changed, 1 insertion(+), 1
Now that the read_tsc paravirt hook is gone, rdtscll() is just a
wrapper around native_read_tsc(). Unwrap it.
Signed-off-by: Andy Lutomirski
---
arch/x86/boot/compressed/aslr.c | 2 +-
arch/x86/include/asm/msr.h | 3 ---
arch/x86/include/asm/tsc.h
This code is timing 100k indirect calls, so the added overhead of
counting the number of cycles elapsed as a 64-bit number should be
insignificant. Drop the optimization of using a 32-bit count.
Signed-off-by: Andy Lutomirski
---
arch/x86/kernel/cpu/amd.c | 6 +++---
1 file changed, 3 insertion
It wasn't compiled in by default. I suspect that the driver was and
still is broken, though -- it's calling udelay with a parameter
that's derived from loops_per_jiffy.
Cc: Jarod Wilson
Cc: de...@driverdev.osuosl.org
Cc: Greg Kroah-Hartman
Signed-off-by: Andy Lutomirski
---
drivers/staging/me
They have no users. Leave native_read_tscp, which seems potentially
useful despite also having no callers.
Signed-off-by: Andy Lutomirski
---
arch/x86/include/asm/msr.h | 9 -
1 file changed, 9 deletions(-)
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 7273
Now that there is no paravirt TSC, the "native" is inappropriate.
The function does RDTSC, so give it the obvious name: rdtsc()
Suggested-by: Borislav Petkov
Signed-off-by: Andy Lutomirski
---
arch/x86/boot/compressed/aslr.c | 2 +-
arch/x86/entry/vdso/vclock_gettime.c
It has no more callers, and it was never a very sensible interface
to begin with. Users of the TSC should either read all 64 bits or
explicitly throw out the high bits.
Signed-off-by: Andy Lutomirski
---
arch/x86/include/asm/msr.h | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/x86/i
This timing code is hideous, and this doesn't help. It gets rid of
one of the last users of rdtscl, though.
Acked-by: Dmitry Torokhov
Cc: linux-in...@vger.kernel.org
Signed-off-by: Andy Lutomirski
---
drivers/input/joystick/analog.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
di
It's unclear to me why this code exists in the first place.
Acked-by: Dmitry Torokhov
Cc: linux-in...@vger.kernel.org
Signed-off-by: Andy Lutomirski
---
drivers/input/gameport/gameport.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/input/gameport/gameport.c
b
There are two logical changes here. First, this removes a check for
cpu_has_tsc. That check is unnecessary, as we don't register the
TSC as a clocksource on systems that have no TSC. Second, it adds a
barrier, thus preventing observable non-monotonicity.
I suspect that the missing barrier was n
Using get_cycles was unnecessary: check_tsc_warp() is not called on
TSC-less systems. Replace rdtsc_barrier(); get_cycles() with
rdtsc_ordered().
While we're at it, make the somewhat more dangerous change of
removing barrier_before_rdtsc after RDTSC in the TSC warp check
code. This should be oka
All callers have been converted to rdtsc_ordered().
Signed-off-by: Andy Lutomirski
---
arch/x86/include/asm/barrier.h | 11 ---
arch/x86/um/asm/barrier.h | 13 -
2 files changed, 24 deletions(-)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.
__pvclock_read_cycles had an unnecessary barrier. Get rid of that
barrier and clean up the code by just using rdtsc_ordered().
Cc: Paolo Bonzini
Cc: Radim Krcmar
Cc: Marcelo Tosatti
Cc: kvm@vger.kernel.org
Signed-off-by: Andy Lutomirski
---
I'm hoping to get an ack for this to go in through
rdtsc_barrier(); rdtsc() is an unnecessary mouthful and requires
more thought than should be necessary. Add an rdtsc_ordered()
helper and replace the trivial call sites with it.
This should not change generated code. The duplication of the fence
asm is temporary.
Signed-off-by: Andy Lutomirski
As a very minor optimization, tsc_delay was only using the low 32
bits of the TSC. It's a delay function, so just use the whole
thing.
Signed-off-by: Andy Lutomirski
---
arch/x86/lib/delay.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/lib/delay.c b/arch/
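To illustrate the trade-off behind "just use the whole thing", here is a minimal userspace sketch, not the kernel's tsc_delay code. The helper names elapsed64() and elapsed32() are hypothetical. Subtracting truncated 32-bit counter values still gives the right answer across a single wrap, but only while the true elapsed count stays below 2^32; the 64-bit difference has no such limit in practice.

```c
#include <stdint.h>

/* Hypothetical helper: elapsed cycles from full 64-bit counter values. */
static uint64_t elapsed64(uint64_t start, uint64_t now)
{
	return now - start;
}

/*
 * Hypothetical helper: elapsed cycles when only the low 32 bits of each
 * reading are kept. Unsigned subtraction wraps modulo 2^32, so the result
 * is correct only if fewer than 2^32 cycles really elapsed.
 */
static uint32_t elapsed32(uint64_t start, uint64_t now)
{
	return (uint32_t)now - (uint32_t)start;
}
```

With start = 0xFFFFFFF0 and now = 0x100000010 both helpers report 0x20 cycles, because the low word wrapped only once. With start = 0 and now = 0x100000005 the 32-bit version reports 5 cycles, silently dropping a full 2^32.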
We've had read_tsc and read_tscp paravirt hooks since the very
beginning of paravirt, i.e., d3561b7fa0fb ("[PATCH] paravirt: header
and stubs for paravirtualisation"). AFAICT the only paravirt guest
implementation that ever replaced these calls was vmware, and it's
gone. Arguably even vmware shou
The only caller was kvm's read_tsc. The only difference between
vget_cycles and native_read_tsc was that vget_cycles returned zero
instead of crashing on TSC-less systems. KVM already checks
vclock_mode before calling that function, so the extra check is
unnecessary.
(Off-topic, but the whole
My sincere apologies for the spam. I sent an unholy mixture of the
real patch set and an old poorly split-up patch set, and the result
is incomprehensible. Here's what I meant to send.
After some recent threads about rdtsc barriers, I remembered
that our RDTSC wrappers are a big mess. Let's
On Tue, 16 Jun 2015 23:16:07 +0200
"Michael S. Tsirkin" wrote:
> On Tue, Jun 16, 2015 at 06:33:34PM +0200, Igor Mammedov wrote:
> > Series extends vhost to support up to 509 memory regions,
> > and adds some vhost:translate_desc() performance improvements
> > so it won't regress when memslots are i
On Tue, 16 Jun 2015 23:14:20 +0200
"Michael S. Tsirkin" wrote:
> On Tue, Jun 16, 2015 at 06:33:37PM +0200, Igor Mammedov wrote:
> > since commit
> > 1d4e7e3 kvm: x86: increase user memory slots to 509
> >
> > it became possible to use a bigger amount of memory
> > slots, which is used by memory
After enhancing arm64 FP/SIMD exit handling, the FP/SIMD exit branch is moved
to guest trap handling. This keeps the exit handling flow consistent between
both architectures.
Signed-off-by: Mario Smarduch
---
arch/arm/kvm/interrupts.S | 12 +++-
1 file changed, 7 insertions(+), 5 deletions(
This patch only saves and restores FP/SIMD registers on Guest access. To do
this, the cptr_el2 FP/SIMD trap is set on Guest entry and later checked on exit.
lmbench and hackbench show significant improvements: for 30-50% of exits the
FP/SIMD context is not saved/restored.
Signed-off-by: Mario Smarduch
---
arch/
Currently we save/restore fp/simd on each exit. The first patch optimizes arm64
save/restore so that we only do so on Guest access. hackbench and
several lmbench tests show anywhere from 30% to above 50% optimization
achieved.
In the second patch the 32-bit handler is updated to keep exit handling consistent
with 6
On Tue, Jun 16, 2015 at 06:33:34PM +0200, Igor Mammedov wrote:
> Series extends vhost to support up to 509 memory regions,
> and adds some vhost:translate_desc() performance improvements
> so it won't regress when memslots are increased to 509.
>
> It fixes running VM crashing during memory hotplug
On Tue, Jun 16, 2015 at 06:33:37PM +0200, Igor Mammedov wrote:
> since commit
> 1d4e7e3 kvm: x86: increase user memory slots to 509
>
> it became possible to use a bigger amount of memory
> slots, which is used by memory hotplug for
> registering hotplugged memory.
> However QEMU crashes if it's
On Tue, 16 Jun 2015 23:07:24 +0200
"Michael S. Tsirkin" wrote:
> On Tue, Jun 16, 2015 at 06:33:35PM +0200, Igor Mammedov wrote:
> > For default region layouts performance stays the same
> > as linear search i.e. it takes around 210ns average for
> > translate_desc() that inlines find_region().
>
On Tue, Jun 16, 2015 at 06:33:39PM +0200, Igor Mammedov wrote:
> when translating descriptors they are typically less than
> memory region that holds them and translated into 1 iov
> enty,
entry
> so it's not nessesary to check remaining length
> twice and calculate used length and next address
>
On Tue, Jun 16, 2015 at 06:33:35PM +0200, Igor Mammedov wrote:
> For default region layouts performance stays the same
> as linear search i.e. it takes around 210ns average for
> translate_desc() that inlines find_region().
>
> But it scales better with larger amount of regions,
> 235ns BS vs 300n
On Mon, Jun 15, 2015 at 12:49:41PM +0100, Andreas Herrmann wrote:
> Following some patches to fix misc issues found when testing the
> standalone kvmtool version.
>
> Please apply.
All applied, apart from the ioeventfd patch which I'm not sure about.
Will
--
To unsubscribe from this list: send t
On Mon, Jun 15, 2015 at 12:49:45PM +0100, Andreas Herrmann wrote:
> W/o dedicated endianness it's impossible to reliably find a match
> e.g. in kernel/virt/kvm/eventfd.c ioeventfd_in_range.
Hmm, but shouldn't this be the endianness of the guest, rather than just
forcing things to little-endian?
Wi
On Mon, Jun 15, 2015 at 11:45:38AM +0100, Andre Przywara wrote:
> On 06/05/2015 05:41 PM, Will Deacon wrote:
> > On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
> >> In PCI config space there is an interrupt line field (offset 0x3f),
> >> which is used to initially communicate the I
On Sun, Jun 14, 2015 at 05:13:05PM +0100, zichao wrote:
> Marc and I are talking about how to plug the guest debug exploit in an
> easier way.
>
> I remembered that you mentioned disabling monitor mode had proven to be
> extremely fragile in practice on 32-bit ARM SoCs, what if I save/restore
> th
For default region layouts performance stays the same
as linear search i.e. it takes around 210ns average for
translate_desc() that inlines find_region().
But it scales better with larger amount of regions,
235ns BS vs 300ns LS with 55 memory regions
and it will be about the same values when allow
that brings down translate_desc() cost to around 210ns
if accessed descriptors are from the same memory region.
Signed-off-by: Igor Mammedov
---
that's what netperf/iperf workloads were during testing.
---
drivers/vhost/vhost.c | 16 +---
drivers/vhost/vhost.h | 1 +
2 files changed
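The two ideas in the snippets above (binary search over memory regions, plus remembering the last region hit) can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the vhost code: struct mem_region, struct region_table and lookup_region() are hypothetical names, and the regions are assumed to be sorted by start address and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor; field names are illustrative only. */
struct mem_region {
	uint64_t start;	/* first address covered by the region */
	uint64_t size;	/* length in bytes */
};

/* Hypothetical table: regions sorted by start, non-overlapping. */
struct region_table {
	const struct mem_region *regions;
	size_t count;
	const struct mem_region *last_hit;	/* one-entry cache, may be NULL */
};

static int region_contains(const struct mem_region *r, uint64_t addr)
{
	return addr >= r->start && addr - r->start < r->size;
}

/* Returns the region containing addr, or NULL if no region does. */
static const struct mem_region *lookup_region(struct region_table *t,
					      uint64_t addr)
{
	size_t lo = 0, hi;

	assert(t != NULL);
	hi = t->count;

	/* Fast path: consecutive lookups often land in the same region. */
	if (t->last_hit && region_contains(t->last_hit, addr))
		return t->last_hit;

	/* Binary search over the half-open index range [lo, hi). */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct mem_region *r = &t->regions[mid];

		if (addr < r->start) {
			hi = mid;
		} else if (!region_contains(r, addr)) {
			lo = mid + 1;
		} else {
			t->last_hit = r;	/* remember for the next lookup */
			return r;
		}
	}
	return NULL;
}
```

The cache only pays off when successive lookups hit the same region, which matches the "if accessed descriptors are from the same memory region" caveat above; a miss costs one extra comparison before the search.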
since commit
1d4e7e3 kvm: x86: increase user memory slots to 509
it became possible to use a bigger amount of memory
slots, which is used by memory hotplug for
registering hotplugged memory.
However QEMU crashes if it's used with more than ~60
pc-dimm devices and vhost-net since host kernel
in mo
Series extends vhost to support up to 509 memory regions,
and adds some vhost:translate_desc() performance improvements
so it won't regress when memslots are increased to 509.
It fixes running VM crashing during memory hotplug due
to vhost refusing to accept more than 64 memory regions.
It's only h
when translating descriptors they are typically smaller than
the memory region that holds them and are translated into 1 iov
entry, so it's not necessary to check remaining length
twice and calculate used length and next address
in such cases.
so replace a remaining length and 'size' increment branches
with a
with large number of memory regions we could end up with
high order allocations and kmalloc could fail if
host is under memory pressure.
Considering that memory regions array is used on hot path
try harder to allocate using kmalloc and if it fails resort
to vmalloc.
It's still better than just fail
On Fri, Jun 12, 2015 at 01:34:52AM -0400, Wei Huang wrote:
> This patchset is directly applicable on kvm.git/queue.
>
> V5:
> * Remove the get_pmu_ops from sub_arch; instead define pmu dispatcher
> in kvm_x86_ops->pmu_ops. The dispatcher is initialized in sub-arch.
> The PMU interface f
On Fri, Jun 12, 2015 at 01:34:54AM -0400, Wei Huang wrote:
> This patch splits existing vPMU code into a common vPMU interface (pmc.c)
> and Intel specific vPMU code (pmu_intel.c) using the following steps:
>
> - Part of architectural vPMU code is extracted and moved to pmu_intel.c
> file. They
On Tue, 16 Jun 2015, Juergen Gross wrote:
> AFAIK there are no outstanding questions for more than one month now.
> I'd appreciate some feedback or accepting these patches.
They are against dead code, which will be gone soon. We switched over
to queued locks.
Thanks,
tglx
AFAIK there are no outstanding questions for more than one month now.
I'd appreciate some feedback or accepting these patches.
Juergen
On 04/30/2015 12:53 PM, Juergen Gross wrote:
Paravirtualized spinlocks produce some overhead even if the kernel is
running on bare metal. The main reason are t
In the future, please add the KVM/ARM maintainers on Cc.
On 12/06/15 22:57, Timur Tabi wrote:
> From: Shanker Donthineni
>
> This patch enables assignment of 32/64bit guest VCPU to
> Qualcomm Technologies ARMv8 CPU. Added KVM_ARM_TARGET_QCOM_KRYO
> to the KVM target list and modified vm_target_c
On Mon, Jun 15, 2015 at 08:41:24PM -1000, Linus Torvalds wrote:
> On Mon, Jun 15, 2015 at 12:19 PM, Andrea Arcangeli
> wrote:
> >
> > Yes, it would leave the other blocked, how is it different from having
> > just 1 reader and it gets killed?
>
> Either is completely wrong. But the read() code c
Linus,
Please pull from
git://git.kernel.org/pub/scm/virt/kvm/kvm.git master
To receive the following KVM bug fix, which restores
APIC migration functionality.
Radim Krčmář (1):
KVM: x86: fix lapic.timer_mode on restore
arch/x86/kvm/lapic.c | 26 --
1 file c
Tabs rather than spaces
Signed-off-by: Kevin Mulvey
---
virt/kvm/coalesced_mmio.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/coalesced_mmio.h b/virt/kvm/coalesced_mmio.h
index b280c20..5cbf190 100644
--- a/virt/kvm/coalesced_mmio.h
+++ b/virt/kvm/coalesced_m
fix brace spacing
Signed-off-by: Kevin Mulvey
---
virt/kvm/async_pf.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/async_pf.h b/virt/kvm/async_pf.h
index e7ef6447..ec4cfa2 100644
--- a/virt/kvm/async_pf.h
+++ b/virt/kvm/async_pf.h
@@ -29,8 +29,8 @@ void kvm_as
On 06/16/2015 10:28 AM, Marc Zyngier wrote:
> Hi Eric,
>
> On 15/06/15 16:44, Eric Auger wrote:
>> Hi Marc,
>> On 06/08/2015 07:04 PM, Marc Zyngier wrote:
>>> In order to be able to feed physical interrupts to a guest, we need
>>> to be able to establish the virtual-physical mapping between the tw
On 16/06/15 04:04, Mario Smarduch wrote:
> On 06/15/2015 11:20 AM, Marc Zyngier wrote:
>> On 15/06/15 19:04, Mario Smarduch wrote:
>>> On 06/15/2015 03:00 AM, Marc Zyngier wrote:
Hi Mario,
> [ ... ]
On 13/06/15 23:20, Mario Smarduch wrote:
> Currently VFP/SIMD registers are
Hi Eric,
On 15/06/15 16:44, Eric Auger wrote:
> Hi Marc,
> On 06/08/2015 07:04 PM, Marc Zyngier wrote:
>> In order to be able to feed physical interrupts to a guest, we need
>> to be able to establish the virtual-physical mapping between the two
>> worlds.
>>
>> The mapping is kept in a rbtree, in
KVM populates max_tsc_khz with tsc_khz at arch init stage on a
constant tsc machine and creates VMs with max_tsc_khz as tsc. However,
tsc_khz may be changed while the tsc clocksource driver refines calibration.
This causes VMs to be created with a slow tsc and produces the following
warning. To fix the is
55 matches