Re: [PATCH 1/4] PCI: Export MSI message relevant functions
On Mon, May 19, 2014 at 01:01:07PM +1000, Gavin Shan wrote: >The patch exports 2 MSI message relevant functions, which will be >used by VFIO PCI driver. The VFIO PCI driver would be built as >a module. > >Signed-off-by: Gavin Shan Bjorn, could you help ack it if you don't have objection to it? I guess Alex is probably waiting to merge the subsequent patch, which depends on this one. >--- > drivers/pci/msi.c | 2 ++ > 1 file changed, 2 insertions(+) > >diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >index 955ab79..2350271 100644 >--- a/drivers/pci/msi.c >+++ b/drivers/pci/msi.c >@@ -324,6 +324,7 @@ void get_cached_msi_msg(unsigned int irq, struct msi_msg >*msg) > > __get_cached_msi_msg(entry, msg); > } >+EXPORT_SYMBOL_GPL(get_cached_msi_msg); > > void __write_msi_msg(struct msi_desc *entry, struct msi_msg *msg) > { >@@ -368,6 +369,7 @@ void write_msi_msg(unsigned int irq, struct msi_msg *msg) > > __write_msi_msg(entry, msg); > } >+EXPORT_SYMBOL_GPL(write_msi_msg); > > static void free_msi_irqs(struct pci_dev *dev) > { Thanks, Gavin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Question about the kvm emulator
Hi, I noticed that there is a file call emulate.c, under the directory of arch/x86/kvm/, in its header part, it says: "Generic x86 (32-bit and 64-bit) instruction decoder and emulator." I am confused about this, since qemu will be the emulator, why does kvm itself also includes such an emulator? In particular, I added some printk debug statement in x86_emulate_insn() and emulate_instruction(), but I never see them being invoked. Can someone kindly explain this, thank you! -Jidong -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #17 from Jidong Xiao --- Hi,Paolo, I am not familiar with kvm-unit-tests, and I cannot find any documents describe it. So I use gdb to debug. And yes I can reproduce the problem. I just use Jatin's sample code to construct a c program, like this: linux:~/code/cvedr # cat ss.c #include main(){ asm __volatile__( "pushfl \n\t" "orl $0x100, (%%esp) \n\t" "popfl \n\t" "nop \n\t" "nop \n\t" "outb %b0, %w1 \n\t" "nop \n\t" "nop \n\t" "pushfl \n\t" "xorl $0x100, (%%esp) \n\t" "popfl \n\t" :: "a"(2), "Nd" (80) ); } And I ran the c program with gdb. (gdb) disas main Dump of assembler code for function main: 0x080483e4 <+0>: push %ebp 0x080483e5 <+1>: mov%esp,%ebp 0x080483e7 <+3>: mov$0x2,%eax 0x080483ec <+8>: pushf 0x080483ed <+9>: orl$0x100,(%esp) 0x080483f4 <+16>:popf 0x080483f5 <+17>:nop 0x080483f6 <+18>:nop => 0x080483f7 <+19>:out%al,$0x50 0x080483f9 <+21>:nop 0x080483fa <+22>:nop 0x080483fb <+23>:pushf 0x080483fc <+24>:xorl $0x100,(%esp) 0x08048403 <+31>:popf 0x08048404 <+32>:pop%ebp 0x08048405 <+33>:ret End of assembler dump. (gdb) nexti Program received signal SIGSEGV, Segmentation fault. 0x080483f7 in main () (gdb) You can see when the program counter points to the out instruction, and I used nexti command to do single step execution, and it ends up a segmentation fault. According to the kvm code, it seems that x86_emulate_insn() will be called, indeed I don't see any code in that function takes care of the out instruction. But why this only affect single step execution? I actually have some other program that includes some out instructions, and the program runs okay. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
KVM call agenda for 2014-05-27
Hi Please, send any topic that you are interested in covering. Thanks, Juan. Call details: 15:00 CEST 13:00 UTC 09:00 EDT Every two weeks If you need phone number details, contact me privately. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #16 from Jidong Xiao --- Alright, thank you Paolo, I will try and let you know once I am done.(In reply to Paolo Bonzini from comment #15) -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 06/15] MIPS: Add minimal support for OCTEON3 to c-r4k.c
On Wed, May 21, 2014 at 02:40:41PM +0200, Ralf Baechle wrote: > On Tue, May 20, 2014 at 04:47:07PM +0200, Andreas Herrmann wrote: > > > +static inline void r4k_blast_dcache_page_dc128(unsigned long addr) > > +{ > > + R4600_HIT_CACHEOP_WAR_IMPL; > > The R4600 has 32 byte cache lines that is this line will never be > executed on an R4600 thus can be dropped. So the line can also be removed from r4k_blast_dcache_page_dc64? > > + blast_dcache128_page(addr); > > +} > > + > > static void r4k_blast_dcache_page_setup(void) > > { > > unsigned long dc_lsize = cpu_dcache_line_size(); > > @@ -121,6 +127,8 @@ static void r4k_blast_dcache_page_setup(void) > > r4k_blast_dcache_page = r4k_blast_dcache_page_dc32; > > else if (dc_lsize == 64) > > r4k_blast_dcache_page = r4k_blast_dcache_page_dc64; > > + else if (dc_lsize == 128) > > + r4k_blast_dcache_page = r4k_blast_dcache_page_dc128; > > > For another patch - let's see if this can be turned into a switch > construct which hopefully is more readable and produces just as > afficient code with reasonable vintage of gcc. Ok. Andreas -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 76621] Bridged networking stopped working after yum update
https://bugzilla.kernel.org/show_bug.cgi?id=76621 Michael Tokarev changed: What|Removed |Added CC||m...@tls.msk.ru --- Comment #1 from Michael Tokarev --- Please report this to fedora bugzilla. Not everyone is running fedora, and not everyone even understand what yum is. This problem is most likely qemu-related, not kernel, at least it has very small chance to have anything to do with kvm part of the kernel. And you completely failed to even mention the versions, from and to, of the components you upgraded. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #15 from Paolo Bonzini --- > if I use gdb in a guest OS, like to debug a program inside the Guest OS, and > run the single step command in gdb, that should trigger this bug right Yes. For kvm-unit-tests you would modify x86/debug.c, which already tests singlestep. Jatin, can you confirm that _any_ single-stepping over sti triggers the bug? Does it matter if IF=0 or IF=1 before the sti? -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #14 from Jatin Kumar --- (In reply to Jidong Xiao from comment #8) > Hi, Jatin, > > "from inside the OS" means from within the Guest OS right? > (In reply to Jatin Kumar from comment #5) Hello Jidong, Yes it means from within guest OS. Sample code block is like this: intr_register_int(1, 3, INTR_OFF, intr_debug_handler_out, "#DB Debug Exception"); asm volatile( "pushfl \n\t" "orl $0x100, (%%esp) \n\t" "popfl \n\t" "nop \n\t" "nop \n\t" "outb %b0, %w1 \n\t" "nop \n\t" <--- Not getting Debug Trap before executing this "nop \n\t" "pushfl \n\t" "xorl $0x100, (%%esp) \n\t" "popfl \n\t" :: "a"(2), "Nd" (80) ); Debug handler simply prints the EIP from interrupt frame and I don't see the EIP of marked instruction. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 08/07] qspinlock: integrate pending bit into queue
2014-05-21 18:49+0200, Radim Krčmář: > 2014-05-19 16:17-0400, Waiman Long: > > As for now, I will focus on just having one pending bit. > > I'll throw some ideas at it, One of the ideas follows; it seems sound, but I haven't benchmarked it thoroughly. (Wasted a lot of time by writing/playing with various tools and loads.) Dbench on ext4 ramdisk, hackbench and ebizzy have shown a small improvement in performance, but my main drive was the weird design of Pending Bit. Does your setup yield improvements too? (A minor code swap noted in the patch might help things.) It is meant to be aplied on top of first 7 patches, because the virt stuff would just get in the way. I have preserved a lot of dead code and made some questionable decisions just to keep the diff short and in one patch, sorry about that. (It is work in progress, double slashed lines mark points of interest.) ---8<--- Pending Bit wasn't used if we already had a node queue with one cpu, which meant that we suffered from these drawbacks again: - unlock path was more complicated (last queued CPU had to clear the tail) - cold node cacheline was just one critical section away With this patch, Pending Bit is used as an additional step in the queue. Waiting for lock is the same: we try Pending Bit and if it is taken, we append to Node Queue. Unlock is different: pending CPU moves into critical section and first CPU from Node Queue takes Pending Bit and notifies next in line or clears the tail. This allows the pending CPU to take the lock as fast as possible, because all bookkeeping was done when entering Pending Queue. Node Queue operations can also be slower without affecting the performance, because we have an additional buffer of one critical section. --- kernel/locking/qspinlock.c | 180 + 1 file changed, 135 insertions(+), 45 deletions(-) diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c index 0ee1a23..76cafb0 100644 --- a/kernel/locking/qspinlock.c +++ b/kernel/locking/qspinlock.c @@ -98,7 +98,10 @@ struct __qspinlock { union { atomic_t val; #ifdef __LITTLE_ENDIAN - u8 locked; + struct { + u8 locked; + u8 pending; + }; struct { u16 locked_pending; u16 tail; @@ -109,7 +112,8 @@ struct __qspinlock { u16 locked_pending; }; struct { - u8 reserved[3]; + u8 reserved[2]; + u8 pending; u8 locked; }; #endif @@ -314,6 +318,59 @@ static inline int trylock_pending(struct qspinlock *lock, u32 *pval) return 1; } +// nice comment here +static inline bool trylock(struct qspinlock *lock, u32 *val) { + if (!(*val = atomic_read(&lock->val)) && + (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) == 0)) { + *val = _Q_LOCKED_VAL; + return 1; + } + return 0; +} + +// here +static inline bool trypending(struct qspinlock *lock, u32 *pval) { + u32 old, val = *pval; + // optimizer might produce the same code if we use *pval directly + + // we could use 'if' and a xchg that touches only the pending bit to + // save some cycles at the price of a longer line cutting window + // (and I think it would bug without changing the rest) + while (!(val & (_Q_PENDING_MASK | _Q_TAIL_MASK))) { + old = atomic_cmpxchg(&lock->val, val, val | _Q_PENDING_MASK); + if (old == val) { + *pval = val | _Q_PENDING_MASK; + return 1; + } + val = old; + } + *pval = val; + return 0; +} + +// here +static inline void set_pending(struct qspinlock *lock, u8 pending) +{ + struct __qspinlock *l = (void *)lock; + + // take a look if this is necessary, and if we don't have an + // abstraction already + barrier(); + ACCESS_ONCE(l->pending) = pending; + barrier(); +} + +// and here +static inline u32 cmpxchg_tail(struct qspinlock *lock, u32 tail, u32 newtail) +// API-incompatible with set_pending and the shifting is ugly, so I'd rather +// refactor this one, xchg_tail() and encode_tail() ... another day +{ + struct __qspinlock *l = (void *)lock; + + return (u32)cmpxchg(&l->tail, tail >> _Q_TAIL_OFFSET, + newtail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET; +} + /** * queue_spin_lock_slowpath - acquire the queue spinlock * @lock: Pointer to queue spinlock structure @@ -324,21 +381,21 @@ static inline int trylock_pending(struct qspinlock *lock, u32 *pval) * fast :slow :unlock * :
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #13 from Jidong Xiao --- Great, I will try. Yes I have the inter manual and I have studied it for a while, so basically I know the data structure of VMCS. To reproduce the failure, so, if I use gdb in a guest OS, like to debug a program inside the Guest OS, and run the single step command in gdb, that should trigger this bug right? Assuming the program includes a sti instruction. (In reply to Paolo Bonzini from comment #12) -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/15] MIPS: Add code for new system 'paravirt'.
On 21/05/14 17:31, David Daney wrote: >>> +paravirt_smp_sp[cpu] = __KSTK_TOS(idle); >>> +mb(); >> >> is this barrier necessary? > > Really it is just make_writes_visible_asap(), but for OCTEON mb() or > smp_wmb() is the closest that the kernel has. > > It may not be necessary, but it doesn't really harm anything. Okay, fair enough. I suggest adding a comment to that effect (I think checkpatch now complains about barriers without comments :) ). >>> diff --git a/arch/mips/paravirt/serial.c b/arch/mips/paravirt/serial.c >>> new file mode 100644 >>> index 000..e3f98b2 >>> --- /dev/null >>> +++ b/arch/mips/paravirt/serial.c >>> @@ -0,0 +1,38 @@ >>> +/* >>> + * This file is subject to the terms and conditions of the GNU >>> General Public >>> + * License. See the file "COPYING" in the main directory of this >>> archive >>> + * for more details. >>> + * >>> + * Copyright (C) 2013 Cavium, Inc. >>> + */ >>> + >>> +#include >>> +#include >>> + >>> +#include >>> + >>> +/* >>> + * Emit one character to the boot console. >>> + */ >>> +int prom_putchar(char c) >>> +{ >>> +hypcall3(0 /* Console output */, 0 /* port 0 */, (unsigned >>> long)&c, 1 /* len == 1 */); >> >> I think the hypcall API needs to be clearly specified and Documented >> somewhere along with its HYPCALL codes and scope. I.e. is it specific to >> kvmtool, or attempting to be a standard API across MIPS hypervisors. >> > > I was intending it to be the later. (standard API across MIPS > hypervisors.) > > The idea being that the first argument would be broken up into several > ranges. > > 0..x : Globally available HYPCALL provided by all hypervisors. > > m..n : MIPS KVM specific. > > y..z : Reserved for the vendor. > > > For some values of x, m, n, y and z. > > But perhaps it should just be MIPS KVM specific. If making it global is > too much trouble. I don't think making it global should be a problem (sounds ideal if it works without changes on multiple hypervisors), but it probably makes sense to ensure that other stakeholders are aware of it (those working on other hypervisors and semihosting stuff). Cheers James -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #12 from Paolo Bonzini --- > Hi, Paolo, thanks for your explanation. I am interested in fixing this. So > what's the technical challenge here? The first step is to reproduce the failure. To do this you can make a patch to kvm-unit-tests (git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git). 0x8021 means invalid guest state in the VMCS (the VM information that is passed to the processor). So you could try adding some printk to understand what is the invalid guest state. To do this you need to download the Intel manuals (known as "Intel SDM", google is your friend). You can post the testcase patch on kvm@vger.kernel.org once you have a reproducer. > When you say "support for single-stepping and breakpoints in the emulator is > quite minimal", do you mean it's a problem in the Qemu side, rather than > something wrong in the kvm kernel modules? If so, and if we want to fix this, > we need to make some changes in the Qemu code, right? No, the emulator is part of KVM, see arch/x86/kvm/emulate.c. I have some pending patches for it, destined to 3.17. There is a small chance that they fix the bug. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/15] MIPS: Add code for new system 'paravirt'.
On 05/21/2014 06:39 AM, James Hogan wrote: [...] diff --git a/arch/mips/paravirt/paravirt-irq.c b/arch/mips/paravirt/paravirt-irq.c new file mode 100644 index 000..e1603dd --- /dev/null +++ b/arch/mips/paravirt/paravirt-irq.c [...] +static void irq_core_set_enable_local(void *arg) +{ + struct irq_data *data = arg; + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); + unsigned int mask = 0x100 << cd->bit; + + /* +* Interrupts are already disabled, so these are atomic. Really? Even when called directly from irq_core_bus_sync_unlock with only a single core online? Yes, but... +*/ + if (cd->desired_en) + set_c0_status(mask); + else + clear_c0_status(mask); + +} + +static void irq_core_disable(struct irq_data *data) +{ + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); + cd->desired_en = false; +} + +static void irq_core_enable(struct irq_data *data) +{ + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); + cd->desired_en = true; +} + +static void irq_core_bus_lock(struct irq_data *data) +{ + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); + + mutex_lock(&cd->core_irq_mutex); +} + +static void irq_core_bus_sync_unlock(struct irq_data *data) +{ + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); + + if (cd->desired_en != cd->current_en) { + /* +* Can be called in early init when on_each_cpu() will +* unconditionally enable irqs, so handle the case +* where only a single CPU is online specially, and +* directly call. +*/ + if (num_online_cpus() == 1) + irq_core_set_enable_local(data); + else + on_each_cpu(irq_core_set_enable_local, data, 1); + ... This code is not correct. It was initially done as a workaround for the issues fixed in commit 202da4005. Now that on_each_cpu() is less buggy, we can unconditionally use it and the assertion above about "Interrupts are already disabled" will be true. + cd->current_en = cd->desired_en; + } + + mutex_unlock(&cd->core_irq_mutex); +} +static int irq_pci_set_affinity(struct irq_data *data, const struct cpumask *dest, bool force) +{ + return 0; +} Is there any point even providing this callback? I guess we can add them only when they are implemented. + +static void irq_pci_cpu_offline(struct irq_data *data) +{ +} Or this one? Same. + +static struct irq_chip irq_chip_pci = { + .name = "PCI", + .irq_enable = irq_pci_enable, + .irq_disable = irq_pci_disable, + .irq_ack = irq_pci_ack, + .irq_mask = irq_pci_mask, + .irq_unmask = irq_pci_unmask, + .irq_set_affinity = irq_pci_set_affinity, + .irq_cpu_offline = irq_pci_cpu_offline, +}; diff --git a/arch/mips/paravirt/paravirt-smp.c b/arch/mips/paravirt/paravirt-smp.c new file mode 100644 index 000..52f86eb --- /dev/null +++ b/arch/mips/paravirt/paravirt-smp.c +static void paravirt_smp_finish(void) +{ + /* to generate the first CPU timer interrupt */ + write_c0_compare(read_c0_count() + mips_hpt_frequency / HZ); This strikes me as a bit hacky. Are you sure it's actually necessary? (I would have expected some generic hotplug notifier somewhere to ensure that percpu clocksources gets initialised sensibly when a new CPU is brought up) +static void paravirt_boot_secondary(int cpu, struct task_struct *idle) +{ + paravirt_smp_gp[cpu] = (unsigned long)(task_thread_info(idle)); spurious brackets around task_thread_info(idle) + wmb(); Wouldn't smp_wmb() be more accurate? Probably. + paravirt_smp_sp[cpu] = __KSTK_TOS(idle); + mb(); is this barrier necessary? Really it is just make_writes_visible_asap(), but for OCTEON mb() or smp_wmb() is the closest that the kernel has. It may not be necessary, but it doesn't really harm anything. diff --git a/arch/mips/paravirt/serial.c b/arch/mips/paravirt/serial.c new file mode 100644 index 000..e3f98b2 --- /dev/null +++ b/arch/mips/paravirt/serial.c @@ -0,0 +1,38 @@ +/* + * This file is subject to the terms and conditions of the GNU General Public + * License. See the file "COPYING" in the main directory of this archive + * for more details. + * + * Copyright (C) 2013 Cavium, Inc. + */ + +#include +#include + +#include + +/* + * Emit one character to the boot console. + */ +int prom_putchar(char c) +{ + hypcall3(0 /* Console output */, 0 /* port 0 */, (unsigned long)&c, 1 /* len == 1 */); I think the hypcall API needs to be clearly specified and Documented somewhere along with its HYPCALL codes and scope. I.e. is it specific to kvmtool, or attempting to be a standard API across MIPS hypervisors. I was intending it to be the later
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #11 from Jidong Xiao --- Hi, Paolo, thanks for your explanation. I am interested in fixing this. So what's the technical challenge here? When you say "support for single-stepping and breakpoints in the emulator is quite minimal", do you mean it's a problem in the Qemu side, rather than something wrong in the kvm kernel modules? If so, and if we want to fix this, we need to make some changes in the Qemu code, right? (In reply to Paolo Bonzini from comment #10) > No, hardware error 0x8021 includes pretty much everything that could go > wrong in vmx.c. :) -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 06/15] MIPS: Add minimal support for OCTEON3 to c-r4k.c
On 05/21/2014 03:04 AM, James Hogan wrote: On 20/05/14 15:47, Andreas Herrmann wrote: From: David Daney These are needed to boot a generic mips64r2 kernel on OCTEONIII. Signed-off-by: David Daney Signed-off-by: Andreas Herrmann --- arch/mips/include/asm/r4kcache.h |2 ++ arch/mips/mm/c-r4k.c | 32 2 files changed, 34 insertions(+) diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c index 1c74a6a..789ede9 100644 --- a/arch/mips/mm/c-r4k.c +++ b/arch/mips/mm/c-r4k.c @@ -1094,6 +1110,21 @@ static void probe_pcache(void) c->dcache.waybit = 0; break; + case CPU_CAVIUM_OCTEON3: + /* For now lie about the number of ways. */ Is this to work around the finite length of way_string[]? Can we fix that to be more dynamic instead? (admittedly special casing "direct mapped" looks like a bit of a pain). The OCTEON ICache is a weird size that is not (and I think cannot be) represented by the CP0_Config* bits. However, it doesn't matter, as any operation that attempts to invalidate any part of it, operates on the entire cache, so everything works out in the end. The DCache is fully coherent, so any invalidate/flush operations are redundant. So for both of these, we just need to supply values that are both plausible, and don't result in panics and/or OOPS messages being printed. Cheers James + c->icache.linesz = 128; + c->icache.sets = 16; + c->icache.ways = 8; + c->icache.flags |= MIPS_CACHE_VTAG; + icache_size = c->icache.sets * c->icache.ways * c->icache.linesz; + + c->dcache.linesz = 128; + c->dcache.ways = 8; + c->dcache.sets = 8; + dcache_size = c->dcache.sets * c->dcache.ways * c->dcache.linesz; + c->options |= MIPS_CPU_PREFETCH; + break; + -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #10 from Paolo Bonzini --- No, hardware error 0x8021 includes pretty much everything that could go wrong in vmx.c. :) -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] kvm_stat: allow choosing between tracepoints and old stats
The old stats contain information not available in the tracepoints. By default, keep the old behavior, but allow choosing which set of stats to present, or even both. Inspired by a patch from Marcelo Tosatti. Signed-off-by: Paolo Bonzini --- scripts/kvm/kvm_stat | 60 +++- 1 file changed, 41 insertions(+), 19 deletions(-) diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat index 762544b..d7e97e7 100755 --- a/scripts/kvm/kvm_stat +++ b/scripts/kvm/kvm_stat @@ -352,8 +352,8 @@ class TracepointProvider(object): return ret class Stats: -def __init__(self, provider, fields = None): -self.provider = provider +def __init__(self, providers, fields = None): +self.providers = providers self.fields_filter = fields self._update() def _update(self): @@ -362,22 +362,25 @@ class Stats: if not self.fields_filter: return True return re.match(self.fields_filter, key) is not None -self.values = dict([(key, None) -for key in provider.fields() -if wanted(key)]) -self.provider.select(self.values.keys()) +self.values = dict() +for d in providers: +provider_fields = [key for key in d.fields() if wanted(key)] +for key in provider_fields: +self.values[key] = None +d.select(provider_fields) def set_fields_filter(self, fields_filter): self.fields_filter = fields_filter self._update() def get(self): -new = self.provider.read() -for key in self.provider.fields(): -oldval = self.values.get(key, (0, 0)) -newval = new[key] -newdelta = None -if oldval is not None: -newdelta = newval - oldval[0] -self.values[key] = (newval, newdelta) +for d in providers: +new = d.read() +for key in d.fields(): +oldval = self.values.get(key, (0, 0)) +newval = new[key] +newdelta = None +if oldval is not None: +newdelta = newval - oldval[0] +self.values[key] = (newval, newdelta) return self.values if not os.access('/sys/kernel/debug', os.F_OK): @@ -487,6 +490,18 @@ options.add_option('-l', '--log', dest = 'log', help = 'run in logging mode (like vmstat)', ) +options.add_option('-t', '--tracepoints', + action = 'store_true', + default = False, + dest = 'tracepoints', + help = 'retrieve statistics from tracepoints', + ) +options.add_option('-d', '--debugfs', + action = 'store_true', + default = False, + dest = 'debugfs', + help = 'retrieve statistics from debugfs', + ) options.add_option('-f', '--fields', action = 'store', default = None, @@ -495,12 +510,19 @@ options.add_option('-f', '--fields', ) (options, args) = options.parse_args(sys.argv) -try: -provider = TracepointProvider() -except: -provider = DebugfsProvider() +providers = [] +if options.tracepoints: +providers.append(TracepointProvider()) +if options.debugfs: +providers.append(DebugfsProvider()) + +if len(providers) == 0: +try: +providers = [TracepointProvider()] +except: +providers = [DebugfsProvider()] -stats = Stats(provider, fields = options.fields) +stats = Stats(providers, fields = options.fields) if options.log: log(stats) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm_stat: always display non tracepoint based stats
On Wed, May 21, 2014 at 02:42:35PM +0200, Paolo Bonzini wrote: > Il 21/05/2014 14:27, Marcelo Tosatti ha scritto: > >On Wed, May 21, 2014 at 12:44:51PM +0200, Paolo Bonzini wrote: > >>The old stats contain information not available in the tracepoints. > >> > >>Inspired by a patch from Marcelo Tosatti. > > > >I avoided that because there is overlapping information. > > > >Now you report kvm_exit count twice, for example. > > What about doing both, i.e. adding the options and the ability to > choose multiple providers? That is, adding --tracepoint and > --debugfs options on top of my patch? So if you use --tracepoint > and tracing is not available, you get an error unlike the current > behavior, and you can also use --tracepoint --debugfs together. > > Paolo Sounds good to me. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #9 from Jidong Xiao --- Hi, Paolo, It seems that Gleb's patch commit 03617c188f41eeeb4223c919ee7e66e5a114f2c6 "KVM: VMX: mark unusable segment as nonpresent" fixed a similar problem like this, look at this: https://bugzilla.redhat.com/show_bug.cgi?id=854983, Even though they are triggered by different reasons, they both cause the "hardware error 0x8021" issue. So I feel that patch could also fix this single step issue. (In reply to Paolo Bonzini from comment #7) -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #8 from Jidong Xiao --- Hi, Jatin, "from inside the OS" means from within the Guest OS right? (In reply to Jatin Kumar from comment #5) -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 76621] New: Bridged networking stopped working after yum update
https://bugzilla.kernel.org/show_bug.cgi?id=76621 Bug ID: 76621 Summary: Bridged networking stopped working after yum update Product: Virtualization Version: unspecified Kernel Version: Very latest with Fedora 20 yum update 2014-05-21 Hardware: x86-64 OS: Linux Tree: Fedora Status: NEW Severity: low Priority: P1 Component: kvm Assignee: virtualization_...@kernel-bugs.osdl.org Reporter: hne.jwe...@eit.se Regression: No Hi KVM, We just moved to KVM from VirtualBox and VmWare. We like the performance improvement but now there is a small problem with bridged networking. There is an easy workaround, just don't do "yum update". But I write this since others might get problems too. The host machine is HP Z620, dual XEON, 70GB memory, Samsung SSD 840 EVO and Radeon HD5450 We are running Fedora 20 64 bit. The guest machines are Windows XP converted from VmWare via VirtualBox to KVM. We install Fedora, KVM and add the virtual machines and it works. We did a "yum update" perhaps 3 weeks ago and were still fine. After "yum update" last week the guest machines can no longer be reached from the network. We have reinstalled and things work again and after update the problems happens again. So it is repeatable. We think something has changed beween 3 weeks and 1 week ago that break the feature. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - This is what we did in more detail. Standard Fedora from DVD Add proxy setting in /etc/yum.conf. yum install openssh-server systemctl enable sshd.service systemctl start sshd.service yum install nano yum install samba nano /etc/samba/smb.conf systemctl enable smb.service systemctl start smb.service systemctl enable nmb.service systemctl start nmb.service Turn off firewall and selinux nano /etc/selinux/config SELINUX=disabled systemctl stop firewalld.service systemctl disable firewalld.service reboot Install kvm yum install @virtualization yum install libvirt systemctl start libvirtd Add/Import virtual machines Copy virtual machine images VirtualDeveloperMachines (VDM) and VitrualTestMachines (VTM) if they are not already on disk. VDMs need two network interfraces one bridged an one NAT. VTMs need only NAT. virt-manager Create a NAT network atpcutest with a dhcp 192.168.4.101 to 132 Create new machine. Give path to the image files type Windows XP Give VDMs 2 cores and 3096MB memory Give VTMs 1 core and 1024MB memory Set configure bofore install Set start VM at boot Set Disk driver to virtio For VDMs add an extra NIC, Set one NIC to bridged em1 set other NIC to atpcutest For VTMs set NIC to atpcutest Both NICs as virtio Set Grapchic as VMVGA After booting VDMs set their static IPs (both) - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Doing the above work. Also after rebooting. After doing "yum update" 2014-05-21 it does not work. -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 15/15] MIPS: paravirt: Provide _machine_halt function to exit VM on shutdown of guest
On 20/05/14 15:47, Andreas Herrmann wrote: > Signed-off-by: Andreas Herrmann Does it make sense to provide a _machine_restart too? I think this should be squashed into patch 10 really, or else patch 10 split up into several parts (irq, smp, serial, other). Cheers James > --- > arch/mips/paravirt/setup.c |7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/mips/paravirt/setup.c b/arch/mips/paravirt/setup.c > index f80c3bc..6d2781c 100644 > --- a/arch/mips/paravirt/setup.c > +++ b/arch/mips/paravirt/setup.c > @@ -8,6 +8,7 @@ > > #include > > +#include > #include > #include > #include > @@ -27,6 +28,11 @@ void __init plat_time_init(void) > preset_lpj = mips_hpt_frequency / (2 * HZ); > } > > +static void pv_machine_halt(void) > +{ > + hypcall0(1 /* Exit VM */); > +} > + > /* > * Early entry point for arch setup > */ > @@ -47,6 +53,7 @@ void __init prom_init(void) > if (i < argc - 1) > strlcat(arcs_cmdline, " ", COMMAND_LINE_SIZE); > } > + _machine_halt = pv_machine_halt; > register_smp_ops(¶virt_smp_ops); > } > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/15] MIPS: Add code for new system 'paravirt'.
On 20/05/14 15:47, Andreas Herrmann wrote: > diff --git a/arch/mips/include/asm/mach-paravirt/kernel-entry-init.h > b/arch/mips/include/asm/mach-paravirt/kernel-entry-init.h > new file mode 100644 > index 000..c812efa > --- /dev/null > +++ b/arch/mips/include/asm/mach-paravirt/kernel-entry-init.h > @@ -0,0 +1,49 @@ > +/* > + * Do SMP slave processor setup necessary before we can safely execute > + * C code. > + */ > + .macro smp_slave_setup > + mfc0t0, CP0_EBASE > + andit0, t0, 0x3ff # CPUNum > + sltit1, t0, NR_CPUS > + bnezt1, 1f > +2: > + di > + wait > + b 2b # Unknown CPU, loop forever. > +1: > + PTR_LA t1, paravirt_smp_sp > + PTR_SLL t0, PTR_SCALESHIFT > + PTR_ADDU t1, t1, t0 > +3: > + PTR_L sp, 0(t1) > + beqzsp, 3b # Spin until told to proceed. > + > + PTR_LA t1, paravirt_smp_gp > + PTR_ADDU t1, t1, t0 Usually smp_wmb() at the writer needs to be paired with smp_rmb() at the reader (i.e. here) to guarantee that the two memory locations become visible to this CPU in the correct order, so I think you need a sync of some kind between here to be portable beyond Octeon. > + PTR_L gp, 0(t1) > + .endm > diff --git a/arch/mips/paravirt/paravirt-irq.c > b/arch/mips/paravirt/paravirt-irq.c > new file mode 100644 > index 000..e1603dd > --- /dev/null > +++ b/arch/mips/paravirt/paravirt-irq.c > @@ -0,0 +1,388 @@ > +/* > + * This file is subject to the terms and conditions of the GNU General Public > + * License. See the file "COPYING" in the main directory of this archive > + * for more details. > + * > + * Copyright (C) 2013 Cavium, Inc. > + */ > + > +#include > +#include > +#include > +#include > + > +#include > + > +#define MBOX_BITS_PER_CPU 2 > + > +int cpunum_for_cpu(int cpu) static? > +{ > +#ifdef CONFIG_SMP > + return cpu_logical_map(cpu); > +#else > + return mips_cpunum(); > +#endif > +} > +static void irq_core_set_enable_local(void *arg) > +{ > + struct irq_data *data = arg; > + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); > + unsigned int mask = 0x100 << cd->bit; > + > + /* > + * Interrupts are already disabled, so these are atomic. Really? Even when called directly from irq_core_bus_sync_unlock with only a single core online? > + */ > + if (cd->desired_en) > + set_c0_status(mask); > + else > + clear_c0_status(mask); > + > +} > + > +static void irq_core_disable(struct irq_data *data) > +{ > + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); > + cd->desired_en = false; > +} > + > +static void irq_core_enable(struct irq_data *data) > +{ > + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); > + cd->desired_en = true; > +} > + > +static void irq_core_bus_lock(struct irq_data *data) > +{ > + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); > + > + mutex_lock(&cd->core_irq_mutex); > +} > + > +static void irq_core_bus_sync_unlock(struct irq_data *data) > +{ > + struct core_chip_data *cd = irq_data_get_irq_chip_data(data); > + > + if (cd->desired_en != cd->current_en) { > + /* > + * Can be called in early init when on_each_cpu() will > + * unconditionally enable irqs, so handle the case > + * where only a single CPU is online specially, and > + * directly call. > + */ > + if (num_online_cpus() == 1) > + irq_core_set_enable_local(data); > + else > + on_each_cpu(irq_core_set_enable_local, data, 1); > + > + cd->current_en = cd->desired_en; > + } > + > + mutex_unlock(&cd->core_irq_mutex); > +} > +static int irq_pci_set_affinity(struct irq_data *data, const struct cpumask > *dest, bool force) > +{ > + return 0; > +} Is there any point even providing this callback? > + > +static void irq_pci_cpu_offline(struct irq_data *data) > +{ > +} Or this one? > + > +static struct irq_chip irq_chip_pci = { > + .name = "PCI", > + .irq_enable = irq_pci_enable, > + .irq_disable = irq_pci_disable, > + .irq_ack = irq_pci_ack, > + .irq_mask = irq_pci_mask, > + .irq_unmask = irq_pci_unmask, > + .irq_set_affinity = irq_pci_set_affinity, > + .irq_cpu_offline = irq_pci_cpu_offline, > +}; > diff --git a/arch/mips/paravirt/paravirt-smp.c > b/arch/mips/paravirt/paravirt-smp.c > new file mode 100644 > index 000..52f86eb > --- /dev/null > +++ b/arch/mips/paravirt/paravirt-smp.c > +static void paravirt_smp_finish(void) > +{ > + /* to generate the first CPU timer interrupt */ > + write_c0_compare(read_c0_count() + mips_hpt_frequency / HZ); This strikes me as a bit hacky. Are you sure it's actually necessary? (I would have expected some generic hotplug notifier somewhere to ensure that percpu clocksources gets
Re: [PATCH 11/15] MIPS: paravirt: Add pci controller for virtio
On Tue, May 20, 2014 at 04:47:12PM +0200, Andreas Herrmann wrote: > + > +union pci_config_address { > + struct { > +#ifdef __LITTLE_ENDIAN > + unsignedregister_number : 8;/* 7 .. 0 */ > + unsigneddevfn_number: 8;/* 15 .. 8 */ > + unsignedbus_number : 8;/* 23 .. 16 */ > + unsignedreserved: 7;/* 30 .. 24 */ > + unsignedenable_bit : 1;/* 31 */ > +#else > + unsignedenable_bit : 1;/* 31 */ > + unsignedreserved: 7;/* 30 .. 24 */ > + unsignedbus_number : 8;/* 23 .. 16 */ > + unsigneddevfn_number: 8;/* 15 .. 8 */ > + unsignedregister_number : 8;/* 7 .. 0 */ > +#endif For this kind of endianess dependency there is a more elegant way of defining things in linux-next's like: #include ... struct { __BITFIELD_FIELD(unsigned enable_bit : 1,/* 31 */ __BITFIELD_FIELD(unsigned reserved: 7,/* 30 .. 24 */ __BITFIELD_FIELD(unsigned bus_number : 8,/* 23 .. 16 */ __BITFIELD_FIELD(unsigned devfn_number: 8,/* 15 .. 8 */ __BITFIELD_FIELD(unsigned register_number : 8,/* 7 .. 0 */ ); }; No ifdef, no duplication! Ralf -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH v1 RFC 6/6] KVM: s390: add cpu model support
On 21.05.14 14:56, Michael Mueller wrote: On Tue, 20 May 2014 12:10:23 +0200 Alexander Graf wrote: On 20.05.14 12:02, Michael Mueller wrote: On Mon, 19 May 2014 22:14:00 +0200 Alexander Graf wrote: On 19.05.14 19:03, Michael Mueller wrote: On Mon, 19 May 2014 16:49:28 +0200 Alexander Graf wrote: [...] What user and thus also user space wants depends on other factors: 1. reliability 2. performance 3. availability It's not features, that's what programmers want. That's why I have designed the model and migration capability around the hardware and not around the software features and don't allow them to be enabled currently together. A software feature is a nice add on that is helpful for evaluation or development purpose. There is few space for it on productions systems. One option that I currently see to make software implemented facility migration capable is to calculate some kind of hash value derived from the full set of active software facilities. That value can be compared with pre-calculated values also stored in the supported model table of qemu. This value could be seen like a virtual model extension that has to match like the model name. But I have said it elsewhere already, a soft facility should be an exception and not the rule. So all we need is a list of "features the guest sees available" which is the same as "features user space wants the guest to see" which then gets masked through "features the host can do in hardware". For emulation we can just check on the global feature availability on whether we should emulate them or not. Also, if user space wants to make sure that its feature list is actually workable on the host kernel, it needs to set and get the features again and then compare that with the ones it set? That's different from x86's cpuid implementation but probably workable. User space will probe what facilities are available and match them with the predefined cpu model set. Only those models which use a partial or full subset of the hard/host facility list are selectable. Why? If a host does not offer the features required for a model it is not able to run efficiently. Please take a look at how x86 does cpuid masking :). In fact, I'm not 100% convinced that it's a good idea to link cpuid / feature list exposure to the guest and actual feature implementation inside the guest together. On POWER there is a patch set pending that implements these two things separately - admittedly mostly because hardware sucks and we can't change the PVR. That is maybe the big difference with s390. The cpuid in the S390 case is not directly comparable with the processor version register of POWER. In the S390 world we have a well defined CPU model room spanned by the machine type and its GA count. Thus we can define a bijective mapping between (type, ga) <-> (cpuid, ibc, facility set). From type and ga we form the model name which BTW is meaningful also for a human user. Same thing as POWER. By means of this name, a management interface (libvirt) will draw decisions if migration to a remote hypervisor is a good idea or not. For that it just needs to compare if the current model of the guest on the source hypervisor ("query-cpu-model"), is contained in the supported model list of the target hypervisor ("query-cpu-definitions"). I don't think this works, since QEMU should always return all the cpu definitions it's aware of on query-cpu-definitions, not just the ones that it thinks may be compatible with the host at a random point in time. It does not return model names that it thinks they are compatible at some point in time. In s390 mode, it returns all definitions (CPU models) that a given host system is capable to run. Together with the CPU model run by the guest, some upper management interface knows if the hypervisor supports the required CPU model and uses a guest definition with the same CPU model on the target hypervisor. The information for that is taken from the model table which QEMU builds up during startup time. This list limits the command line selectable CPU models as well. This makes s390 derive from the way x86 handles things. NAK. One second, that goes a little fast here :-). x86 returns a list they support which happens to be the full list they define and s390 does logically the same because we know that certain models are not supported due to probing. BTW that happens only if you run Qemu on back level hardware and that is perfectly correct. It's not what other architectures do and I'd hate to see s390 deviate just because. Only these four architectures implement the query and they all differ a little... target-arm/helper.c:CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp) target-i386/cpu.c:CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp) target-ppc/translate_init.c:CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp) target-s390x/cpu.c:CpuDefinitionInfoList *arch_query_cpu_definitions(E
Re: [PATCH 05/15] MIPS: Don't build fast TLB refill handler with 32-bit kernels.
On Wed, May 21, 2014 at 03:04:20PM +0200, Ralf Baechle wrote: > On Wed, May 21, 2014 at 10:38:39AM +0100, James Hogan wrote: > > > On 20/05/14 15:47, Andreas Herrmann wrote: > > > From: David Daney > > > > > > The fast handler only supports 64-bit kernels. > > > > > > Signed-off-by: David Daney > > > Signed-off-by: Andreas Herrmann > > > --- > > > arch/mips/mm/tlbex.c |8 ++-- > > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > > > diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c > > > index ee88367..781e183 100644 > > > --- a/arch/mips/mm/tlbex.c > > > +++ b/arch/mips/mm/tlbex.c > > > @@ -1250,13 +1250,17 @@ static void build_r4000_tlb_refill_handler(void) > > > unsigned int final_len; > > > struct mips_huge_tlb_info htlb_info __maybe_unused; > > > enum vmalloc64_mode vmalloc_mode __maybe_unused; > > > - > > > +#ifdef CONFIG_64BIT > > > + bool is64bit = true; > > > +#else > > > + bool is64bit = false; > > > +#endif > > > memset(tlb_handler, 0, sizeof(tlb_handler)); > > > memset(labels, 0, sizeof(labels)); > > > memset(relocs, 0, sizeof(relocs)); > > > memset(final_handler, 0, sizeof(final_handler)); > > > > > > - if ((scratch_reg >= 0 || scratchpad_available()) && use_bbit_insns()) { > > > + if (is64bit && (scratch_reg >= 0 || scratchpad_available()) && > > > use_bbit_insns()) { > > > htlb_info = build_fast_tlb_refill_handler(&p, &l, &r, K0, K1, > > > scratch_reg); > > > vmalloc_mode = refill_scratch; > > > > > > > This looks like a good place to use IS_ENABLED(CONFIG_64BIT) to reduce > > ifdefery. > > Or even the classic "if (sizeof(unsigned long) == 8)" which is a little less > expressive but more portable. Ok, makes sense. (Will use one of the two suggestions.) Thanks, Andreas -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/15] MIPS: Don't build fast TLB refill handler with 32-bit kernels.
On Wed, May 21, 2014 at 10:38:39AM +0100, James Hogan wrote: > On 20/05/14 15:47, Andreas Herrmann wrote: > > From: David Daney > > > > The fast handler only supports 64-bit kernels. > > > > Signed-off-by: David Daney > > Signed-off-by: Andreas Herrmann > > --- > > arch/mips/mm/tlbex.c |8 ++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c > > index ee88367..781e183 100644 > > --- a/arch/mips/mm/tlbex.c > > +++ b/arch/mips/mm/tlbex.c > > @@ -1250,13 +1250,17 @@ static void build_r4000_tlb_refill_handler(void) > > unsigned int final_len; > > struct mips_huge_tlb_info htlb_info __maybe_unused; > > enum vmalloc64_mode vmalloc_mode __maybe_unused; > > - > > +#ifdef CONFIG_64BIT > > + bool is64bit = true; > > +#else > > + bool is64bit = false; > > +#endif > > memset(tlb_handler, 0, sizeof(tlb_handler)); > > memset(labels, 0, sizeof(labels)); > > memset(relocs, 0, sizeof(relocs)); > > memset(final_handler, 0, sizeof(final_handler)); > > > > - if ((scratch_reg >= 0 || scratchpad_available()) && use_bbit_insns()) { > > + if (is64bit && (scratch_reg >= 0 || scratchpad_available()) && > > use_bbit_insns()) { > > htlb_info = build_fast_tlb_refill_handler(&p, &l, &r, K0, K1, > > scratch_reg); > > vmalloc_mode = refill_scratch; > > > > This looks like a good place to use IS_ENABLED(CONFIG_64BIT) to reduce > ifdefery. Or even the classic "if (sizeof(unsigned long) == 8)" which is a little less expressive but more portable. Ralf -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH v1 RFC 6/6] KVM: s390: add cpu model support
On Tue, 20 May 2014 12:10:23 +0200 Alexander Graf wrote: > > On 20.05.14 12:02, Michael Mueller wrote: > > On Mon, 19 May 2014 22:14:00 +0200 > > Alexander Graf wrote: > > > >> On 19.05.14 19:03, Michael Mueller wrote: > >>> On Mon, 19 May 2014 16:49:28 +0200 > >>> Alexander Graf wrote: > >>> > > [...] > > > What user and thus also user space wants depends on other factors: > > > > 1. reliability > > 2. performance > > 3. availability > > > > It's not features, that's what programmers want. > > > > That's why I have designed the model and migration capability around > > the hardware > > and not around the software features and don't allow them to be enabled > > currently > > together. > > > > A software feature is a nice add on that is helpful for evaluation or > > development > > purpose. There is few space for it on productions systems. > > > > One option that I currently see to make software implemented facility > > migration > > capable is to calculate some kind of hash value derived from the full > > set of > > active software facilities. That value can be compared with > > pre-calculated > > values also stored in the supported model table of qemu. This value > > could be > > seen like a virtual model extension that has to match like the model > > name. > > > > But I have said it elsewhere already, a soft facility should be an > > exception and > > not the rule. > > > So all we need is a list of "features the guest sees available" > which is > the same as "features user space wants the guest to see" which then > gets > masked through "features the host can do in hardware". > > For emulation we can just check on the global feature availability on > whether we should emulate them or not. > > >> Also, if user space wants to make sure that its feature list is > >> actually > >> workable on the host kernel, it needs to set and get the features > >> again > >> and then compare that with the ones it set? That's different from > >> x86's > >> cpuid implementation but probably workable. > > User space will probe what facilities are available and match them > > with the predefined > > cpu model set. Only those models which use a partial or full subset > > of the hard/host > > facility list are selectable. > Why? > >>> If a host does not offer the features required for a model it is not > >>> able to > >>> run efficiently. > >>> > Please take a look at how x86 does cpuid masking :). > > In fact, I'm not 100% convinced that it's a good idea to link cpuid / > feature list exposure to the guest and actual feature implementation > inside the guest together. On POWER there is a patch set pending that > implements these two things separately - admittedly mostly because > hardware sucks and we can't change the PVR. > >>> That is maybe the big difference with s390. The cpuid in the S390 > >>> case is not > >>> directly comparable with the processor version register of POWER. > >>> > >>> In the S390 world we have a well defined CPU model room spanned by > >>> the machine > >>> type and its GA count. Thus we can define a bijective mapping between > >>> (type, ga) <-> (cpuid, ibc, facility set). From type and ga we form > >>> the model > >>> name which BTW is meaningful also for a human user. > >> Same thing as POWER. > >> > >>> By means of this name, a management interface (libvirt) will draw > >>> decisions if > >>> migration to a remote hypervisor is a good idea or not. For that it > >>> just needs > >>> to compare if the current model of the guest on the source hypervisor > >>> ("query-cpu-model"), is contained in the supported model list of the > >>> target > >>> hypervisor ("query-cpu-definitions"). > >> I don't think this works, since QEMU should always return all the cpu > >> definitions it's aware of on query-cpu-definitions, not just the ones > >> that it thinks may be compatible with the host at a random point in > >> time. > > It does not return model names that it thinks they are compatible at > > some point > > in time. In s390 mode, it returns all definitions (CPU models) that a > > given host > > system is capable to run. Together with the CPU model run by the guest, > > some upper > > management interface knows if the hypervisor supports the required CPU > > model and > > uses a guest definition with the same CPU model on the target > > hypervisor. > > > > The information for that is taken from the model table which QEMU > > builds up
Re: [PATCH 06/15] MIPS: Add minimal support for OCTEON3 to c-r4k.c
On Tue, May 20, 2014 at 04:47:07PM +0200, Andreas Herrmann wrote: > +static inline void r4k_blast_dcache_page_dc128(unsigned long addr) > +{ > + R4600_HIT_CACHEOP_WAR_IMPL; The R4600 has 32 byte cache lines that is this line will never be executed on an R4600 thus can be dropped. > + blast_dcache128_page(addr); > +} > + > static void r4k_blast_dcache_page_setup(void) > { > unsigned long dc_lsize = cpu_dcache_line_size(); > @@ -121,6 +127,8 @@ static void r4k_blast_dcache_page_setup(void) > r4k_blast_dcache_page = r4k_blast_dcache_page_dc32; > else if (dc_lsize == 64) > r4k_blast_dcache_page = r4k_blast_dcache_page_dc64; > + else if (dc_lsize == 128) > + r4k_blast_dcache_page = r4k_blast_dcache_page_dc128; For another patch - let's see if this can be turned into a switch construct which hopefully is more readable and produces just as afficient code with reasonable vintage of gcc. Ralf -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm_stat: always display non tracepoint based stats
Il 21/05/2014 14:27, Marcelo Tosatti ha scritto: On Wed, May 21, 2014 at 12:44:51PM +0200, Paolo Bonzini wrote: The old stats contain information not available in the tracepoints. Inspired by a patch from Marcelo Tosatti. I avoided that because there is overlapping information. Now you report kvm_exit count twice, for example. What about doing both, i.e. adding the options and the ability to choose multiple providers? That is, adding --tracepoint and --debugfs options on top of my patch? So if you use --tracepoint and tracing is not available, you get an error unlike the current behavior, and you can also use --tracepoint --debugfs together. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm_stat: always display non tracepoint based stats
On Wed, May 21, 2014 at 12:44:51PM +0200, Paolo Bonzini wrote: > The old stats contain information not available in the tracepoints. > > Inspired by a patch from Marcelo Tosatti. I avoided that because there is overlapping information. Now you report kvm_exit count twice, for example. > > Signed-off-by: Paolo Bonzini > --- > scripts/kvm/kvm_stat | 37 - > 1 file changed, 20 insertions(+), 17 deletions(-) > > diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat > index 762544b..b97542f 100755 > --- a/scripts/kvm/kvm_stat > +++ b/scripts/kvm/kvm_stat > @@ -352,8 +352,8 @@ class TracepointProvider(object): > return ret > > class Stats: > -def __init__(self, provider, fields = None): > -self.provider = provider > +def __init__(self, providers, fields = None): > +self.providers = providers > self.fields_filter = fields > self._update() > def _update(self): > @@ -362,22 +362,25 @@ class Stats: > if not self.fields_filter: > return True > return re.match(self.fields_filter, key) is not None > -self.values = dict([(key, None) > -for key in provider.fields() > -if wanted(key)]) > -self.provider.select(self.values.keys()) > +self.values = dict() > +for d in providers: > +provider_fields = [key for key in d.fields() if wanted(key)] > +for key in provider_fields: > +self.values[key] = None > +d.select(provider_fields) > def set_fields_filter(self, fields_filter): > self.fields_filter = fields_filter > self._update() > def get(self): > -new = self.provider.read() > -for key in self.provider.fields(): > -oldval = self.values.get(key, (0, 0)) > -newval = new[key] > -newdelta = None > -if oldval is not None: > -newdelta = newval - oldval[0] > -self.values[key] = (newval, newdelta) > +for d in providers: > +new = d.read() > +for key in d.fields(): > +oldval = self.values.get(key, (0, 0)) > +newval = new[key] > +newdelta = None > +if oldval is not None: > +newdelta = newval - oldval[0] > +self.values[key] = (newval, newdelta) > return self.values > > if not os.access('/sys/kernel/debug', os.F_OK): > @@ -496,11 +499,11 @@ options.add_option('-f', '--fields', > (options, args) = options.parse_args(sys.argv) > > try: > -provider = TracepointProvider() > +providers = [TracepointProvider(), DebugfsProvider()] > except: > -provider = DebugfsProvider() > +providers = [DebugfsProvider()] > > -stats = Stats(provider, fields = options.fields) > +stats = Stats(providers, fields = options.fields) > > if options.log: > log(stats) > -- > 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 11/15] MIPS: paravirt: Add pci controller for virtio
On 20/05/14 15:47, Andreas Herrmann wrote: > From: David Daney > > Signed-off-by: David Daney > Signed-off-by: Andreas Herrmann > --- > arch/mips/Kconfig|1 + > arch/mips/paravirt/Kconfig |6 ++ > arch/mips/pci/Makefile |2 +- > arch/mips/pci/pci-virtio-guest.c | 140 > ++ > 4 files changed, 148 insertions(+), 1 deletion(-) > create mode 100644 arch/mips/paravirt/Kconfig > create mode 100644 arch/mips/pci/pci-virtio-guest.c If I understand correctly this just drives a simple PCI controller for a PCI bus that a virtio device happens to be usually plugged in to, yeh? It sounds like it would make sense to take advantage of Will Deacon's recent efforts to make a generic pci controller driver for this sort of thing which specifically mentions emulation by kvmtool? Is it effectively the same PCI controller that is being emulated? http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/thread.html#233491 http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/233491.html http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/233490.html Cheers James -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 07/15] MIPS: Add mips_cpunum() function.
On 20/05/14 15:47, Andreas Herrmann wrote: > From: David Daney > > This returns the CPUNum from the low order Ebase bits. > > Signed-off-by: David Daney > Signed-off-by: Andreas Herrmann > --- > arch/mips/include/asm/mipsregs.h |5 + > 1 file changed, 5 insertions(+) > > diff --git a/arch/mips/include/asm/mipsregs.h > b/arch/mips/include/asm/mipsregs.h > index 3e025b5..f110d48 100644 > --- a/arch/mips/include/asm/mipsregs.h > +++ b/arch/mips/include/asm/mipsregs.h > @@ -1916,6 +1916,11 @@ __BUILD_SET_C0(brcm_cmt_ctrl) > __BUILD_SET_C0(brcm_config) > __BUILD_SET_C0(brcm_mode) > > +static inline unsigned int mips_cpunum(void) > +{ > + return read_c0_ebase() & 0x3ff; /* Low 10 bits of ebase. */ > +} If this is going to go in mips generic code I think it should be clearly defined, especially in the presence of MT, otherwise perhaps it makes sense for it to go in a paravirt specific header? I.e. does it return the core number of the running VPE (if so it should probably do something like below as in decode_configs() and go in smp.h), or does it simply always return that field in ebase register (in which case it should probably have ebase in the name and a comment to clarify that it doesn't necessarily map directly to core/vpe number). unsigned int core = read_c0_ebase() & 0x3ff; if (cpu_has_mipsmt) core >>= fls(smp_num_siblings) - 1; Cheers James -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH uq/master] kvm_stat: add option to display non tracepoint based stats
Il 20/05/2014 20:38, Marcelo Tosatti ha scritto: The old stats contain information not available in the tracepoints. Signed-off-by: Marcelo Tosatti diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat index 762544b..6ac3b21 100755 --- a/scripts/kvm/kvm_stat +++ b/scripts/kvm/kvm_stat @@ -493,12 +493,21 @@ options.add_option('-f', '--fields', dest = 'fields', help = 'fields to display (regex)', ) +options.add_option('-o', '--old', + action = 'store_true', + default = False, + dest = 'old', + help = 'show kvm statistics counters (not tracepoint based)', + ) (options, args) = options.parse_args(sys.argv) -try: -provider = TracepointProvider() -except: +if options.old: provider = DebugfsProvider() +else: +try: +provider = TracepointProvider() +except: +provider = DebugfsProvider() stats = Stats(provider, fields = options.fields) Let's instead show information from both providers. I am sending a patch. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kvm_stat: always display non tracepoint based stats
The old stats contain information not available in the tracepoints. Inspired by a patch from Marcelo Tosatti. Signed-off-by: Paolo Bonzini --- scripts/kvm/kvm_stat | 37 - 1 file changed, 20 insertions(+), 17 deletions(-) diff --git a/scripts/kvm/kvm_stat b/scripts/kvm/kvm_stat index 762544b..b97542f 100755 --- a/scripts/kvm/kvm_stat +++ b/scripts/kvm/kvm_stat @@ -352,8 +352,8 @@ class TracepointProvider(object): return ret class Stats: -def __init__(self, provider, fields = None): -self.provider = provider +def __init__(self, providers, fields = None): +self.providers = providers self.fields_filter = fields self._update() def _update(self): @@ -362,22 +362,25 @@ class Stats: if not self.fields_filter: return True return re.match(self.fields_filter, key) is not None -self.values = dict([(key, None) -for key in provider.fields() -if wanted(key)]) -self.provider.select(self.values.keys()) +self.values = dict() +for d in providers: +provider_fields = [key for key in d.fields() if wanted(key)] +for key in provider_fields: +self.values[key] = None +d.select(provider_fields) def set_fields_filter(self, fields_filter): self.fields_filter = fields_filter self._update() def get(self): -new = self.provider.read() -for key in self.provider.fields(): -oldval = self.values.get(key, (0, 0)) -newval = new[key] -newdelta = None -if oldval is not None: -newdelta = newval - oldval[0] -self.values[key] = (newval, newdelta) +for d in providers: +new = d.read() +for key in d.fields(): +oldval = self.values.get(key, (0, 0)) +newval = new[key] +newdelta = None +if oldval is not None: +newdelta = newval - oldval[0] +self.values[key] = (newval, newdelta) return self.values if not os.access('/sys/kernel/debug', os.F_OK): @@ -496,11 +499,11 @@ options.add_option('-f', '--fields', (options, args) = options.parse_args(sys.argv) try: -provider = TracepointProvider() +providers = [TracepointProvider(), DebugfsProvider()] except: -provider = DebugfsProvider() +providers = [DebugfsProvider()] -stats = Stats(provider, fields = options.fields) +stats = Stats(providers, fields = options.fields) if options.log: log(stats) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 06/15] MIPS: Add minimal support for OCTEON3 to c-r4k.c
On 20/05/14 15:47, Andreas Herrmann wrote: > From: David Daney > > These are needed to boot a generic mips64r2 kernel on OCTEONIII. > > Signed-off-by: David Daney > Signed-off-by: Andreas Herrmann > --- > arch/mips/include/asm/r4kcache.h |2 ++ > arch/mips/mm/c-r4k.c | 32 > 2 files changed, 34 insertions(+) > diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c > index 1c74a6a..789ede9 100644 > --- a/arch/mips/mm/c-r4k.c > +++ b/arch/mips/mm/c-r4k.c > @@ -1094,6 +1110,21 @@ static void probe_pcache(void) > c->dcache.waybit = 0; > break; > > + case CPU_CAVIUM_OCTEON3: > + /* For now lie about the number of ways. */ Is this to work around the finite length of way_string[]? Can we fix that to be more dynamic instead? (admittedly special casing "direct mapped" looks like a bit of a pain). Cheers James > + c->icache.linesz = 128; > + c->icache.sets = 16; > + c->icache.ways = 8; > + c->icache.flags |= MIPS_CACHE_VTAG; > + icache_size = c->icache.sets * c->icache.ways * > c->icache.linesz; > + > + c->dcache.linesz = 128; > + c->dcache.ways = 8; > + c->dcache.sets = 8; > + dcache_size = c->dcache.sets * c->dcache.ways * > c->dcache.linesz; > + c->options |= MIPS_CPU_PREFETCH; > + break; > + -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] kvmclock: Ensure time in migration never goes backward
On 19.05.14 13:31, Paolo Bonzini wrote: Il 18/05/2014 15:20, Marcelo Tosatti ha scritto: Reviewed-by: Marcelo Tosatti Thanks Marcelo, applying to uq/master. Same here, please also CC to stable :). Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM: x86: disable master clock if TSC is reset during suspend (v3)
Il 21/05/2014 10:26, Alexander Graf ha scritto: On 14.05.14 18:00, Paolo Bonzini wrote: Il 14/05/2014 17:43, Marcelo Tosatti ha scritto: Updating system_time from the kernel clock once master clock has been enabled can result in time backwards event, in case kernel clock frequency is lower than TSC frequency. Disable master clock in case its necessary to update it from the resume path. Signed-off-by: Marcelo Tosatti diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index de0931c..c098728 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -106,6 +106,8 @@ EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz); static u32 tsc_tolerance_ppm = 250; module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR); +static bool backwards_tsc_observed = false; + #define KVM_NR_SHARED_MSRS 16 struct kvm_shared_msrs_global { @@ -1471,7 +1473,8 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm) &ka->master_kernel_ns, &ka->master_cycle_now); -ka->use_master_clock = host_tsc_clocksource & vcpus_matched; +ka->use_master_clock = host_tsc_clocksource && vcpus_matched +&& !backwards_tsc_observed; if (ka->use_master_clock) atomic_set(&kvm_guest_has_master_clock, 1); @@ -6939,6 +6942,7 @@ int kvm_arch_hardware_enable(void *garbage) */ if (backwards_tsc) { u64 delta_cyc = max_tsc - local_tsc; +backwards_tsc_observed = true; list_for_each_entry(kvm, &vm_list, vm_list) { kvm_for_each_vcpu(i, vcpu, kvm) { vcpu->arch.tsc_offset_adjustment += delta_cyc; Applying to kvm/master, thanks. Thanks. Can you please make sure to CC qemu-stable when sending a pull request? Now I will. Paolo :) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 05/15] MIPS: Don't build fast TLB refill handler with 32-bit kernels.
On 20/05/14 15:47, Andreas Herrmann wrote: > From: David Daney > > The fast handler only supports 64-bit kernels. > > Signed-off-by: David Daney > Signed-off-by: Andreas Herrmann > --- > arch/mips/mm/tlbex.c |8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c > index ee88367..781e183 100644 > --- a/arch/mips/mm/tlbex.c > +++ b/arch/mips/mm/tlbex.c > @@ -1250,13 +1250,17 @@ static void build_r4000_tlb_refill_handler(void) > unsigned int final_len; > struct mips_huge_tlb_info htlb_info __maybe_unused; > enum vmalloc64_mode vmalloc_mode __maybe_unused; > - > +#ifdef CONFIG_64BIT > + bool is64bit = true; > +#else > + bool is64bit = false; > +#endif > memset(tlb_handler, 0, sizeof(tlb_handler)); > memset(labels, 0, sizeof(labels)); > memset(relocs, 0, sizeof(relocs)); > memset(final_handler, 0, sizeof(final_handler)); > > - if ((scratch_reg >= 0 || scratchpad_available()) && use_bbit_insns()) { > + if (is64bit && (scratch_reg >= 0 || scratchpad_available()) && > use_bbit_insns()) { > htlb_info = build_fast_tlb_refill_handler(&p, &l, &r, K0, K1, > scratch_reg); > vmalloc_mode = refill_scratch; > This looks like a good place to use IS_ENABLED(CONFIG_64BIT) to reduce ifdefery. Cheers James -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM: x86: disable master clock if TSC is reset during suspend (v3)
On 14.05.14 18:00, Paolo Bonzini wrote: Il 14/05/2014 17:43, Marcelo Tosatti ha scritto: Updating system_time from the kernel clock once master clock has been enabled can result in time backwards event, in case kernel clock frequency is lower than TSC frequency. Disable master clock in case its necessary to update it from the resume path. Signed-off-by: Marcelo Tosatti diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index de0931c..c098728 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -106,6 +106,8 @@ EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz); static u32 tsc_tolerance_ppm = 250; module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR); +static bool backwards_tsc_observed = false; + #define KVM_NR_SHARED_MSRS 16 struct kvm_shared_msrs_global { @@ -1471,7 +1473,8 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm) &ka->master_kernel_ns, &ka->master_cycle_now); -ka->use_master_clock = host_tsc_clocksource & vcpus_matched; +ka->use_master_clock = host_tsc_clocksource && vcpus_matched +&& !backwards_tsc_observed; if (ka->use_master_clock) atomic_set(&kvm_guest_has_master_clock, 1); @@ -6939,6 +6942,7 @@ int kvm_arch_hardware_enable(void *garbage) */ if (backwards_tsc) { u64 delta_cyc = max_tsc - local_tsc; +backwards_tsc_observed = true; list_for_each_entry(kvm, &vm_list, vm_list) { kvm_for_each_vcpu(i, vcpu, kvm) { vcpu->arch.tsc_offset_adjustment += delta_cyc; Applying to kvm/master, thanks. Thanks. Can you please make sure to CC qemu-stable when sending a pull request? Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3] PCI: Introduce new device binding path using pci_dev.driver_override
On 20.05.14 16:53, Alex Williamson wrote: The driver_override field allows us to specify the driver for a device rather than relying on the driver to provide a positive match of the device. This shortcuts the existing process of looking up the vendor and device ID, adding them to the driver new_id, binding the device, then removing the ID, but it also provides a couple advantages. First, the above existing process allows the driver to bind to any device matching the new_id for the window where it's enabled. This is often not desired, such as the case of trying to bind a single device to a meta driver like pci-stub or vfio-pci. Using driver_override we can do this deterministically using: echo pci-stub > /sys/bus/pci/devices/:03:00.0/driver_override echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind echo :03:00.0 > /sys/bus/pci/drivers_probe Previously we could not invoke drivers_probe after adding a device to new_id for a driver as we get non-deterministic behavior whether the driver we intend or the standard driver will claim the device. Now it becomes a deterministic process, only the driver matching driver_override will probe the device. To return the device to the standard driver, we simply clear the driver_override and reprobe the device: echo > /sys/bus/pci/devices/:03:00.0/driver_override echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind echo :03:00.0 > /sys/bus/pci/drivers_probe Another advantage to this approach is that we can specify a driver override to force a specific binding or prevent any binding. For instance when an IOMMU group is exposed to userspace through VFIO we require that all devices within that group are owned by VFIO. However, devices can be hot-added into an IOMMU group, in which case we want to prevent the device from binding to any driver (override driver = "none") or perhaps have it automatically bind to vfio-pci. With driver_override it's a simple matter for this field to be set internally when the device is first discovered to prevent driver matches. Signed-off-by: Alex Williamson Cc: Greg Kroah-Hartman Reviewed-by: Alexander Graf I suppose Konrad's RB stays as well? Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/15] MIPS: Add functions for hypervisor call
On Wed, May 21, 2014 at 01:16:30AM +0100, James Hogan wrote: > On Tuesday 20 May 2014 16:47:10 Andreas Herrmann wrote: > > From: David Daney > > > > Signed-off-by: David Daney > > Signed-off-by: Andreas Herrmann > These look similar to the kvm_hypercall${n} functions in > arch/{x86,s390}/include/asm/kvm_para.h. Does it make sense to define > that API in kvm_para.h for MIPS instead of this one? Yes, the functions should be moved to a separate header. Andreas -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[Bug 65561] KVM:Entry failed on Single stepping sti instruction
https://bugzilla.kernel.org/show_bug.cgi?id=65561 --- Comment #7 from Paolo Bonzini --- Hi Jidong, no this is not fixed yet. Basically OUT instructions are emulated by KVM, and support for single-stepping and breakpoints in the emulator is quite minimal. 3.12 added some support but OUTs and writes to memory are still broken: commit 0912c9771e9902f752e890e93af495cc06a786ac Author: Paolo Bonzini Date: Tue Aug 27 15:41:43 2013 +0200 KVM: x86: add comments where MMIO does not return to the emulator Support for single-step in the emulator (new in 3.12) does not work for MMIO or PIO writes, because they are completed without returning to the emulator. This is not worse than what we had in 3.11; still, add comments so that the issue is not forgotten. Signed-off-by: Paolo Bonzini Signed-off-by: Gleb Natapov -- You are receiving this mail because: You are watching the assignee of the bug. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html