[kvm-devel] [PATCH RFC 0/5] Add kvm trace support V2
Hi,

The following patches add trace support to KVM using relayfs and trace_mark, updated according to the last round of comments. To use it, first enable CONFIG_KVM_TRACE, which depends on MARKERS and SYSFS; in that case it adds only a tiny time penalty to check whether a probe function is hooked up. If MARKERS is not configured, the trace functions are no-ops. I referred to the blktrace code for the implementation.

Comments welcome.

--Eric (Liu, Feng)

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH RFC 4/5] The tool for parsing trace data
From 182054d6c1e6a6213f3fa56aab1ca56c39f9ba80 Mon Sep 17 00:00:00 2001
From: Feng (Eric) Liu [EMAIL PROTECTED]
Date: Mon, 31 Mar 2008 10:52:57 -0400
Subject: [PATCH] kvm: kvmtrace_format parses the binary trace data output by
 kvmtrace, and reformats it according to the rules in the file of format
 definitions.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
---
 user/formats         |   23 ++
 user/kvmtrace_format |  201 ++
 2 files changed, 224 insertions(+), 0 deletions(-)
 create mode 100644 user/formats
 create mode 100755 user/kvmtrace_format

diff --git a/user/formats b/user/formats
new file mode 100644
index 000..e85b219
--- /dev/null
+++ b/user/formats
@@ -0,0 +1,23 @@
+0x00000000  %(tsc)d (+%(reltsc)8d)  unknown (0x%(event)016x)  vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ 0x%(1)08x 0x%(2)08x 0x%(3)08x 0x%(4)08x 0x%(5)08x ]
+
+0x00010001  %(tsc)d (+%(reltsc)8d)  VMENTRY      vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x
+0x00010002  %(tsc)d (+%(reltsc)8d)  VMEXIT       vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ exitcode = 0x%(1)08x, rip = 0x%(2)08x ]
+0x00020001  %(tsc)d (+%(reltsc)8d)  PAGE_FAULT   vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ errorcode = 0x%(1)08x, virt = 0x%(2)08x ]
+0x00020002  %(tsc)d (+%(reltsc)8d)  INJ_VIRQ     vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ vector = 0x%(1)02x ]
+0x00020003  %(tsc)d (+%(reltsc)8d)  PEND_INTR    vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ vector = 0x%(1)02x ]
+0x00020004  %(tsc)d (+%(reltsc)8d)  IO_READ      vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ port = 0x%(1)04x, size = %(2)d ]
+0x00020005  %(tsc)d (+%(reltsc)8d)  IO_WRITE     vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ port = 0x%(1)04x, size = %(2)d ]
+0x00020006  %(tsc)d (+%(reltsc)8d)  CR_READ      vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ CR# = %(1)d, value = 0x%(2)08x ]
+0x00020007  %(tsc)d (+%(reltsc)8d)  CR_WRITE     vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ CR# = %(1)d, value = 0x%(2)08x ]
+0x00020008  %(tsc)d (+%(reltsc)8d)  DR_READ      vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ DR# = %(1)d, value = 0x%(2)08x ]
+0x00020009  %(tsc)d (+%(reltsc)8d)  DR_WRITE     vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ DR# = %(1)d, value = 0x%(2)08x ]
+0x0002000A  %(tsc)d (+%(reltsc)8d)  MSR_READ     vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ MSR# = 0x%(1)08x, data_lo = 0x%(2)08x data_hi = 0x%(3)08x ]
+0x0002000B  %(tsc)d (+%(reltsc)8d)  MSR_WRITE    vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ MSR# = 0x%(1)08x, data_lo = 0x%(2)08x data_hi = 0x%(3)08x ]
+0x0002000C  %(tsc)d (+%(reltsc)8d)  CPUID        vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ func = 0x%(1)08x, eax = 0x%(2)08x, ebx = 0x%(3)08x, ecx = 0x%(4)08x edx = 0x%(5)08x ]
+0x0002000D  %(tsc)d (+%(reltsc)8d)  INTR         vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ vector = 0x%(1)02x ]
+0x0002000E  %(tsc)d (+%(reltsc)8d)  NMI          vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x
+0x0002000F  %(tsc)d (+%(reltsc)8d)  VMMCALL      vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ func = 0x%(1)08x ]
+0x00020010  %(tsc)d (+%(reltsc)8d)  HLT          vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x
+0x00020011  %(tsc)d (+%(reltsc)8d)  CLTS         vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x
+0x00020012  %(tsc)d (+%(reltsc)8d)  LMSW         vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ value = 0x%(1)08x ]
+0x00020013  %(tsc)d (+%(reltsc)8d)  APIC_ACCESS  vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ offset = 0x%(1)08x ]
diff --git a/user/kvmtrace_format b/user/kvmtrace_format
new file mode 100755
index 000..7b71ba5
--- /dev/null
+++ b/user/kvmtrace_format
@@ -0,0 +1,201 @@
+#!/usr/bin/env python
+
+# by Mark Williamson, (C) 2004 Intel Research Cambridge
+
+# Program for reformatting trace buffer output according to user-supplied rules
+
+import re, sys, string, signal, struct, os, getopt
+
+def usage():
+    print >> sys.stderr, \
+          "Usage: " + sys.argv[0] + """ defs-file
+          Parses trace data in binary format, as output by kvmtrace and
+          reformats it according to the rules in a file of definitions.  The
+          rules in this file should have the format ({ and } show grouping
+          and are not part of the syntax):
+
+          {event_id}{whitespace}{text format string}
+
+          The textual format string may include format specifiers, such as:
+            %(tsc)d, %(event)d, %(vcpu)d %(pid)d %(1)d, %(2)d,
+            %(3)d, %(4)d, %(5)d
+          [ the 'd' format specifier outputs in decimal, alternatively 'x'
+            will output in hexadecimal and 'o' will output in octal ]
+
+          These correspond to the event ID, timestamp counter, vcpu,
+          pid and the 5 data fields from the trace record.  There should be
+          one such rule for each type of event.
+
+          Depending on your system and the volume of trace buffer data,
+          this script may not be able to keep up with the output of kvmtrace
+          if it is piped directly.  In these circumstances you should have
+          kvmtrace output to a file for processing off-line.
+          """
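The defs-file mechanism described above maps an event ID to a Python `%`-dict format string. A minimal Python 3 sketch of that mechanism (the helper names `parse_rules` and `format_record` are illustrative; the real kvmtrace_format is considerably more elaborate and also decodes the binary stream):

```python
def parse_rules(text):
    """Map event_id -> format string, one rule per non-empty line:
    {event_id}{whitespace}{text format string}."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        event_id, fmt = line.split(None, 1)
        rules[int(event_id, 16)] = fmt
    return rules

def format_record(rules, rec):
    """Pick the rule for rec['event'] and interpolate the record fields."""
    fmt = rules.get(rec["event"], "unknown event 0x%(event)x")
    return fmt % rec

# One rule from the formats file, applied to a decoded record:
rules = parse_rules("0x00010001 VMENTRY vcpu = 0x%(vcpu)08x pid = 0x%(pid)08x")
rec = {"event": 0x00010001, "vcpu": 0, "pid": 0x1234}
print(format_record(rules, rec))
```

The `%(name)08x` specifiers are ordinary Python dict interpolation, which is why one rule per event type is all the script needs.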
[kvm-devel] [PATCH RFC 5/5] Modify userspace Kbuild and kvm_stat for kvm_trace
From a18df6cb088a07ce34fb1981ca3077f5e6925ae2 Mon Sep 17 00:00:00 2001
From: Feng (Eric) Liu [EMAIL PROTECTED]
Date: Mon, 31 Mar 2008 10:35:59 -0400
Subject: [PATCH] kvm: Modify Kbuild for kvm trace and ensure kvm_stat works
 when kvm trace is enabled by a userspace app.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
---
 kernel/Kbuild |    3 +++
 kvm_stat      |    3 +++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/Kbuild b/kernel/Kbuild
index 014cc17..e3e97ab 100644
--- a/kernel/Kbuild
+++ b/kernel/Kbuild
@@ -2,6 +2,9 @@
 EXTRA_CFLAGS := -I$(src)/include -include $(src)/external-module-compat.h
 obj-m := kvm.o kvm-intel.o kvm-amd.o
 kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o anon_inodes.o irq.o i8259.o \
	lapic.o ioapic.o preempt.o i8254.o
+ifeq ($(CONFIG_KVM_TRACE),y)
+kvm-objs += kvm_trace.o
+endif
 kvm-intel-objs := vmx.o vmx-debug.o
 kvm-amd-objs := svm.o
diff --git a/kvm_stat b/kvm_stat
index 07773b0..75910c2 100755
--- a/kvm_stat
+++ b/kvm_stat
@@ -2,12 +2,15 @@
 import curses
 import sys, os, time, optparse
+import string

 class Stats:
     def __init__(self):
         self.base = '/sys/kernel/debug/kvm'
         self.values = {}
         for key in os.listdir(self.base):
+            if key.startswith('trace'):
+                continue
             self.values[key] = None
     def get(self):
         for key, oldval in self.values.iteritems():
--
1.5.1

--Eric (Liu, Feng)
[kvm-devel] [PATCH RFC 1/5] Add some trace entries and define interfaces for tracing
From d56731ffc6d5742a88a157dfe0e4344d35f7db58 Mon Sep 17 00:00:00 2001
From: Feng (Eric) Liu [EMAIL PROTECTED]
Date: Mon, 31 Mar 2008 10:08:55 -0400
Subject: [PATCH] KVM: Add some trace entries in current code and define some
 interfaces for a userspace app to control and use tracing data.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.c         |   34 +-
 arch/x86/kvm/x86.c         |   26 +++
 include/asm-x86/kvm.h      |   19 +
 include/asm-x86/kvm_host.h |   19 +
 include/linux/kvm.h        |   48 +++-
 include/linux/kvm_host.h   |   14
 virt/kvm/kvm_main.c        |    7 +-
 7 files changed, 163 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9951ec9..8f70405 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1794,6 +1794,10 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu, int irq)
 {
	struct vcpu_vmx *vmx = to_vmx(vcpu);

+	KVMTRACE_1D(INJ_VIRQ, vcpu,
+		    (u32)(irq | INTR_TYPE_SOFT_INTR | INTR_INFO_VALID_MASK),
+		    handler);
+
	if (vcpu->arch.rmode.active) {
		vmx->rmode.irq.pending = true;
		vmx->rmode.irq.vector = irq;
@@ -1944,6 +1948,7 @@ static int handle_exception(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
		error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
	if (is_page_fault(intr_info)) {
		cr2 = vmcs_readl(EXIT_QUALIFICATION);
+		KVMTRACE_2D(PAGE_FAULT, vcpu, error_code, (u32)cr2, handler);
		return kvm_mmu_page_fault(vcpu, cr2, error_code);
	}
@@ -1972,6 +1977,7 @@ static int handle_external_interrupt(struct kvm_vcpu *vcpu,
				     struct kvm_run *kvm_run)
 {
	++vcpu->stat.irq_exits;
+	KVMTRACE_1D(INTR, vcpu, vmcs_read32(VM_EXIT_INTR_INFO), handler);
	return 1;
 }
@@ -2029,6 +2035,8 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
	reg = (exit_qualification >> 8) & 15;
	switch ((exit_qualification >> 4) & 3) {
	case 0: /* mov to cr */
+		KVMTRACE_2D(CR_WRITE, vcpu, (u32)cr, (u32)vcpu->arch.regs[reg],
+			    handler);
		switch (cr) {
		case 0:
			vcpu_load_rsp_rip(vcpu);
@@ -2061,6 +2069,7 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
		vcpu->arch.cr0 &= ~X86_CR0_TS;
		vmcs_writel(CR0_READ_SHADOW, vcpu->arch.cr0);
		vmx_fpu_activate(vcpu);
+		KVMTRACE_0D(CLTS, vcpu, handler);
		skip_emulated_instruction(vcpu);
		return 1;
	case 1: /*mov from cr*/
@@ -2069,12 +2078,16 @@ static int handle_cr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
			vcpu_load_rsp_rip(vcpu);
			vcpu->arch.regs[reg] = vcpu->arch.cr3;
			vcpu_put_rsp_rip(vcpu);
+			KVMTRACE_2D(CR_READ, vcpu, (u32)cr,
+				    (u32)vcpu->arch.regs[reg], handler);
			skip_emulated_instruction(vcpu);
			return 1;
		case 8:
			vcpu_load_rsp_rip(vcpu);
			vcpu->arch.regs[reg] = kvm_get_cr8(vcpu);
			vcpu_put_rsp_rip(vcpu);
+			KVMTRACE_2D(CR_READ, vcpu, (u32)cr,
+				    (u32)vcpu->arch.regs[reg], handler);
			skip_emulated_instruction(vcpu);
			return 1;
		}
@@ -2120,6 +2133,7 @@ static int handle_dr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
			val = 0;
		}
		vcpu->arch.regs[reg] = val;
+		KVMTRACE_2D(DR_READ, vcpu, (u32)dr, (u32)val, handler);
	} else {
		/* mov to dr */
	}
@@ -2144,6 +2158,9 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
		return 1;
	}

+	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
+		    handler);
+
	/* FIXME: handling of bits 32:63 of rax, rdx */
	vcpu->arch.regs[VCPU_REGS_RAX] = data & -1u;
	vcpu->arch.regs[VCPU_REGS_RDX] = (data >> 32) & -1u;
@@ -2157,6 +2174,9 @@ static int handle_wrmsr(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
	u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u)
		| ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32);

+	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
+		    handler);
+
	if (vmx_set_msr(vcpu, ecx, data) != 0) {
		kvm_inject_gp(vcpu, 0);
		return 1;
@@ -2181,6 +2201,9 @@ static int handle_interrupt_window(struct kvm_vcpu *vcpu,
[kvm-devel] [PATCH RFC 3/5] Add a userspace tool for collecting tracing data, based on blktrace
From e0d0f33ce1a536d4f2c3a1763f23a89f0a726cd6 Mon Sep 17 00:00:00 2001
From: Feng (Eric) Liu [EMAIL PROTECTED]
Date: Mon, 31 Mar 2008 10:57:16 -0400
Subject: [PATCH] kvm: Add a tool, kvmtrace, for collecting binary trace data
 from relayfs. The code is based on blktrace.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
---
 user/Makefile              |    7 +-
 user/config-x86-common.mak |    4 +-
 user/kvmtrace.c            |  698 ++++
 3 files changed, 705 insertions(+), 4 deletions(-)
 create mode 100644 user/kvmtrace.c

diff --git a/user/Makefile b/user/Makefile
index 225a435..131baad 100644
--- a/user/Makefile
+++ b/user/Makefile
@@ -31,15 +31,18 @@
 CXXFLAGS = $(autodepend-flags)

 autodepend-flags = -MMD -MF $(dir $*).$(notdir $*).d

-kvmctl: LDFLAGS += -pthread -lrt
+LDFLAGS += -pthread -lrt

 kvmctl: $(kvmctl_objs)
	$(CC) $(LDFLAGS) $^ -o $@

+kvmtrace: $(kvmtrace_objs)
+	$(CC) $(LDFLAGS) $^ -o $@
+
 %.o: %.S
	$(CC) $(CFLAGS) -c -nostdlib -o $@ $^

 -include .*.d

 clean: arch_clean
-	$(RM) kvmctl *.o *.a .*.d
+	$(RM) kvmctl kvmtrace *.o *.a .*.d
diff --git a/user/config-x86-common.mak b/user/config-x86-common.mak
index 8cfdd45..4c90fe6 100644
--- a/user/config-x86-common.mak
+++ b/user/config-x86-common.mak
@@ -1,9 +1,9 @@
 #This is a make file with common rules for both x86 & x86-64

-all: kvmctl test_cases
+all: kvmctl kvmtrace test_cases

 kvmctl_objs= main.o iotable.o ../libkvm/libkvm.a
-
+kvmtrace_objs= kvmtrace.o
 balloon_ctl: balloon_ctl.o

 FLATLIBS = $(TEST_DIR)/libcflat.a $(libgcc)
diff --git a/user/kvmtrace.c b/user/kvmtrace.c
new file mode 100644
index 000..d9db60d
--- /dev/null
+++ b/user/kvmtrace.c
@@ -0,0 +1,698 @@
+/*
+ * kvm tracing application
+ *
+ * This tool is used for collecting trace buffer data
+ * for kvm trace.
+ *
+ * Based on blktrace 0.99.3
+ *
+ * Copyright (C) 2005 Jens Axboe [EMAIL PROTECTED]
+ * Copyright (C) 2006 Jens Axboe [EMAIL PROTECTED]
+ * Copyright (C) 2008 Eric Liu [EMAIL PROTECTED]
+ *
+ * This work is licensed under the GNU LGPL license, version 2.
+ */
+
+#define _GNU_SOURCE
+
+#include <pthread.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <signal.h>
+#include <fcntl.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/param.h>
+#include <sys/statfs.h>
+#include <sys/poll.h>
+#include <sys/mman.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <ctype.h>
+#include <getopt.h>
+#include <errno.h>
+#include <sched.h>
+
+#ifndef __user
+#define __user
+#endif
+#include <linux/kvm.h>
+
+static char kvmtrace_version[] = "0.1";
+
+/*
+ * You may want to increase this even more, if you are logging at a high
+ * rate and see skipped/missed events
+ */
+#define BUF_SIZE	(512 * 1024)
+#define BUF_NR		(8)
+
+#define OFILE_BUF	(128 * 1024)
+
+#define DEBUGFS_TYPE	0x64626720
+
+#define max(a, b)	((a) > (b) ? (a) : (b))
+
+#define S_OPTS	"r:o:w:?V:b:n:D:"
+static struct option l_opts[] = {
+	{
+		.name = "relay",
+		.has_arg = required_argument,
+		.flag = NULL,
+		.val = 'r'
+	},
+	{
+		.name = "output",
+		.has_arg = required_argument,
+		.flag = NULL,
+		.val = 'o'
+	},
+	{
+		.name = "stopwatch",
+		.has_arg = required_argument,
+		.flag = NULL,
+		.val = 'w'
+	},
+	{
+		.name = "version",
+		.has_arg = no_argument,
+		.flag = NULL,
+		.val = 'V'
+	},
+	{
+		.name = "buffer-size",
+		.has_arg = required_argument,
+		.flag = NULL,
+		.val = 'b'
+	},
+	{
+		.name = "num-sub-buffers",
+		.has_arg = required_argument,
+		.flag = NULL,
+		.val = 'n'
+	},
+	{
+		.name = "output-dir",
+		.has_arg = required_argument,
+		.flag = NULL,
+		.val = 'D'
+	},
+	{
+		.name = NULL,
+	}
+};
+
+struct thread_information {
+	int cpu;
+	pthread_t thread;
+
+	int fd;
+	char fn[MAXPATHLEN + 64];
+
+	FILE *ofile;
+	char *ofile_buffer;
+
+	int (*get_subbuf)(struct thread_information *, unsigned int);
+	int (*read_data)(struct thread_information *, void *, unsigned int);
+
+	unsigned long long data_read;
+
+	struct kvm_trace_information *trace_info;
+
+	int exited;
+
+	/*
+	 * mmap controlled output files
+	 */
+	unsigned long long fs_size;
+	unsigned long long fs_max_size;
+	unsigned long fs_off;
+	void *fs_buf;
+	unsigned long fs_buf_len;
+
+};
+
+struct kvm_trace_information {
+	int fd;
+	volatile int trace_started;
+	unsigned long lost_records;
+	struct thread_information *threads;
+
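The core of the tool is one reader per CPU draining a per-CPU relay file into a per-CPU output file. That loop can be sketched in Python (the file names `trace<N>`/`trace<N>.out` are illustrative; the real kvmtrace.c spawns one pthread per CPU and consumes relay sub-buffers with mmap'ed output rather than a plain read loop):

```python
import os

def drain_relay(src_dir, out_dir, ncpus, bufsize=64 * 1024):
    """Copy each per-CPU relay file to a matching output file.
    A stand-in for kvmtrace's per-CPU reader threads."""
    for cpu in range(ncpus):
        src = os.path.join(src_dir, "trace%d" % cpu)
        dst = os.path.join(out_dir, "trace%d.out" % cpu)
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            while True:
                chunk = fin.read(bufsize)
                if not chunk:
                    break
                fout.write(chunk)
```

Keeping one file per CPU avoids any locking between readers; merging and time-ordering the streams is left to the post-processing step (kvmtrace_format).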
[kvm-devel] [PATCH RFC 2/5] Create relay channels and add trace data
From 41d65b55580d3f07f9f1c50e89e3d64c5d10fbaf Mon Sep 17 00:00:00 2001
From: Feng (Eric) Liu [EMAIL PROTECTED]
Date: Tue, 1 Apr 2008 07:26:14 -0400
Subject: [PATCH] KVM: Add kvm trace support. When CONFIG_KVM_TRACE is set, it
 allows a userspace app to read the trace of kvm-related events through
 relayfs.

Signed-off-by: Feng (Eric) Liu [EMAIL PROTECTED]
---
 arch/x86/kvm/Kconfig  |   10 ++
 arch/x86/kvm/Makefile |    3 +
 virt/kvm/kvm_trace.c  |  245 +
 3 files changed, 258 insertions(+), 0 deletions(-)
 create mode 100644 virt/kvm/kvm_trace.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 76c70ab..d51ffb3 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -50,6 +50,16 @@ config KVM_AMD
	  Provides support for KVM on AMD processors equipped with the AMD-V
	  (SVM) extensions.

+config KVM_TRACE
+	bool "KVM trace support"
+	depends on KVM && MARKERS && SYSFS
+	select RELAY
+	select DEBUG_FS
+	default n
+	---help---
+	  This option allows reading a trace of kvm-related events through
+	  the relayfs.
+
 # OK, it's a little counter-intuitive to do this, but it puts it neatly under
 # the virtualization menu.
 source drivers/lguest/Kconfig
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 4d0c22e..c97d35c 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -3,6 +3,9 @@
 #

 common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o)
+ifeq ($(CONFIG_KVM_TRACE),y)
+common-objs += $(addprefix ../../../virt/kvm/, kvm_trace.o)
+endif

 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
diff --git a/virt/kvm/kvm_trace.c b/virt/kvm/kvm_trace.c
new file mode 100644
index 000..a07760d
--- /dev/null
+++ b/virt/kvm/kvm_trace.c
@@ -0,0 +1,245 @@
+/*
+ * kvm trace
+ *
+ * It is designed to allow debugging traces of kvm to be generated
+ * on UP / SMP machines. Each trace entry can be timestamped so that
+ * it's possible to reconstruct a chronological record of trace events.
+ *
+ * Copyright (c) 2008 Intel Corporation
+ *
+ * Authors: Eric Liu, [EMAIL PROTECTED]
+ *
+ * Date:    Feb 2008
+ */
+
+#include <linux/module.h>
+#include <linux/relay.h>
+#include <linux/debugfs.h>
+
+#include <linux/kvm_host.h>
+
+#define KVM_TRACE_STATE_RUNNING		(1 << 0)
+#define KVM_TRACE_STATE_CLEARUP		(1 << 1)
+
+struct kvm_trace {
+	int trace_state;
+	struct rchan *rchan;
+	struct dentry *lost_file;
+	atomic_t lost_records;
+};
+static struct kvm_trace *kvm_trace;
+
+struct kvm_trace_probe {
+	const char *name;
+	const char *format;
+	u32 cycle_in;
+	marker_probe_func *probe_func;
+};
+
+static inline int calc_rec_size(int cycle, int extra)
+{
+	int rec_size = KVM_TRC_HEAD_SIZE;
+
+	rec_size += extra;
+	return cycle ? rec_size += KVM_TRC_CYCLE_SIZE : rec_size;
+}
+
+static void kvm_add_trace(void *probe_private, void *call_data,
+			  const char *format, va_list *args)
+{
+	struct kvm_trace_probe *p = probe_private;
+	struct kvm_trace *kt = kvm_trace;
+	struct kvm_trace_rec rec;
+	struct kvm_vcpu *vcpu;
+	int i, extra, size;
+
+	if (unlikely(kt->trace_state != KVM_TRACE_STATE_RUNNING))
+		return;
+
+	rec.event	= va_arg(*args, u32);
+	vcpu		= va_arg(*args, struct kvm_vcpu *);
+	rec.pid		= current->tgid;
+	rec.vcpu_id	= vcpu->vcpu_id;
+
+	extra		= va_arg(*args, u32);
+	WARN_ON(!(extra <= KVM_TRC_EXTRA_MAX));
+	extra		= min_t(u32, extra, KVM_TRC_EXTRA_MAX);
+	rec.extra_u32	= extra;
+
+	rec.cycle_in	= p->cycle_in;
+
+	if (rec.cycle_in) {
+		u64 cycle = 0;
+
+		cycle = get_cycles();
+		rec.u.cycle.cycle_lo = (u32)cycle;
+		rec.u.cycle.cycle_hi = (u32)(cycle >> 32);
+
+		for (i = 0; i < rec.extra_u32; i++)
+			rec.u.cycle.extra_u32[i] = va_arg(*args, u32);
+	} else {
+		for (i = 0; i < rec.extra_u32; i++)
+			rec.u.nocycle.extra_u32[i] = va_arg(*args, u32);
+	}
+
+	size = calc_rec_size(rec.cycle_in, rec.extra_u32 * sizeof(u32));
+	relay_write(kt->rchan, &rec, size);
+}
+
+static struct kvm_trace_probe kvm_trace_probes[] = {
+	{ "kvm_trace_entryexit", "%u %p %u %u %u %u %u %u", 1, kvm_add_trace },
+	{ "kvm_trace_handler",   "%u %p %u %u %u %u %u %u", 0, kvm_add_trace },
+};
+
+static int lost_records_get(void *data, u64 *val)
+{
+	struct kvm_trace *kt = data;
+
+	*val = atomic_read(&kt->lost_records);
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(kvm_trace_lost_ops, lost_records_get, NULL, "%llu\n");
+
+/*
+ * The relay channel is used in no-overwrite mode; it keeps track of how
+ * many times we encountered a full subbuffer, to tell the user space app
+ * how many records were lost.
+ */
+static int
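Given how kvm_add_trace() fills a record above (event/pid/vcpu header, optional 64-bit cycle count split into lo/hi words, then up to KVM_TRC_EXTRA_MAX extra u32s), a user-space decoder might unpack a record roughly as follows. This is a Python sketch; the field order and little-endian u32 packing are assumptions here, since the authoritative layout is struct kvm_trace_rec in the kernel headers, which this chunk does not show:

```python
import struct

def decode_record(buf, has_cycle, n_extra):
    """Decode one trace record: assumed 3 x u32 header (event, vcpu, pid),
    then an optional u64 TSC as cycle_lo/cycle_hi, then n_extra x u32."""
    event, vcpu, pid = struct.unpack_from("<III", buf, 0)
    off = 12
    cycle = None
    if has_cycle:
        lo, hi = struct.unpack_from("<II", buf, off)
        cycle = (hi << 32) | lo   # reassemble as in kvm_add_trace()
        off += 8
    extra = list(struct.unpack_from("<%dI" % n_extra, buf, off))
    return event, vcpu, pid, cycle, extra
```

Note that only the "entryexit" probe carries a cycle count (cycle_in = 1 in kvm_trace_probes[]); "handler" records omit it, which is why the record size is variable.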
Re: [kvm-devel] [PATCH] KVM: MMU: Fix rmap_remove() race
Andrea Arcangeli wrote:

I thought some more about this. BTW, for completeness: normally (with the exception of vm_destroy) the put_page run by rmap_remove won't be the last one, but the page can still go onto the freelist a moment after put_page runs (leading to the same problem). The VM is prevented from freeing the page while it's pinned, but the VM can do the final free on the page before rmap_remove returns. And without mmu notifiers there's no serialization that makes the core VM stop on the mmu_lock to wait for the tlb flush to run before the VM finally executes the last free of the page.

mmu notifiers fix this race for regular swapping, as the core VM will block on the mmu_lock waiting for the tlb flush (for this to work the tlb flush must always happen inside the mmu_lock, unless the order is exactly: spte = nonpresent; tlbflush; put_page). A VM_LOCKED on the vmas backing the anonymous memory will fix this for regular swapping too (I did something like this in a patch at the end, as a band-aid).

But thinking about it more: the moment we pretend to allow anybody to randomly __munmap__ any part of the guest physical address space, as for ballooning while the guest runs (think of an unprivileged user owning /dev/kvm and running munmap at will), not even VM_LOCKED (which munmap ignores) and not even the mmu notifiers can prevent the page from being queued in the kernel freelists immediately after rmap_remove returns. This is because rmap_remove may run on a different host cpu in between unmap_vmas and invalidate_range_end. Running the ioctl before munmap won't help to prevent the race, as the guest can still re-instantiate the sptes with page faults between the ioctl and munmap.

However, we have invalidate_range_begin. If we invalidate all sptes in invalidate_range_begin and hold off the page faults in between _begin/_end, then we can fix this with the mmu notifiers. This can be done by taking mmu_lock in _begin and releasing it in _end, unless there's a lock dependency issue.
So I think I can allow munmap safely (for an unprivileged user too) by using _range_begin somehow. For this to work, any relevant tlb flush must happen inside the _same_ mmu_lock critical section where spte = nonpresent and rmap_remove run too (thanks to the mmu_lock, the ordering of those operations won't matter anymore, and no queue will be needed).

I don't understand your conclusion: you prove that mlock() is not good enough, then post a patch to do it?

I'll take another shot at fixing rmap_remove(); I don't like to cripple swapping for 2.6.25 (though it will only be really dependable in .26).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [kvm-devel] [kvm-ppc-devel] virtio-net working on PowerPC KVM
On Mar 31, 2008, at 6:52 AM, Christian Ehrhardt wrote:

Avi Kivity wrote:

Hollis Blanchard wrote:
I'm pleased to report that we now have working network support in the guest, via the virtio-net driver. In fact, we can use NFS for the guest's root filesystem. :) Boot log attached.

Congrats!

The bad news is that it's very slow, but the good news is that it's nice to be improving performance rather than debugging mysterious crashes... ;) With this milestone reached, in the near future I intend to start sending patches to Avi and linuxppc-dev for review, hopefully for inclusion in 2.6.26. However, I do want to see if we can improve the performance a little bit first...

Low virtio net performance may be due to the virtio net host timer. What's your guest/host ping latency?

I would be happy about 0.25ms atm :-). The current ping latency to the host or other PCs is around 7-8ms (native is ~0.15ms). We are investigating performance improvements in general and also some changes in the setup, e.g. booting from virtio-block as an alternative, for some speedup.

I am experiencing 7-8ms ping latencies (native 0.1ms) on x86_64 as well, when pinging the virtual machine. Maybe it's not related to PowerPC? Is it supposed to be that slow?

Even if you have a good hrtimer implementation, I think you'll see 0.25ms latency, and that may be enough to slow down nfs. Unfortunately virtio is tuned for throughput at this time (it should be easy to disable the timer when we detect the queue is usually empty).

Alex
Re: [kvm-devel] [kvm-ppc-devel] virtio-net working on PowerPC KVM
Alexander Graf wrote:

I am experiencing 7-8ms ping latencies (native 0.1ms) on x86_64 as well, when pinging the virtual machine. Maybe it's not related to PowerPC? Is it supposed to be that slow?

If you have a really old host kernel, or a misconfigured one, it can happen. What's your host kernel? Are you using hrtimers?

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
Re: [kvm-devel] [kvm-ia64-devel] [09/17] [PATCH] kvm/ia64: Add mmio decoder for kvm/ia64.
[EMAIL PROTECTED] wrote:

Hi,

Selon Zhang, Xiantao [EMAIL PROTECTED]:

From 5f82ea88c095cf89cbae920944c05e578f35365f Mon Sep 17 00:00:00 2001
From: Xiantao Zhang [EMAIL PROTECTED]
Date: Wed, 12 Mar 2008 14:48:09 +0800
Subject: [PATCH] kvm/ia64: Add mmio decoder for kvm/ia64.
[...]
+	post_update = (inst.M5.i << 7) + inst.M5.imm7;
+	if (inst.M5.s)
+		temp -= post_update;
+	else
+		temp += post_update;

The sign extension is not done correctly here. (This has been fixed in the Xen code.)

Included the fix in the latest merge candidate patchset. Thanks! :)
git://git.kernel.org/pub/scm/linux/kernel/git/xiantao/kvm-ia64.git kvm-ia64-mc8

+	post_update = (inst.M3.i << 7) + inst.M3.imm7;
+	if (inst.M3.s)
+		temp -= post_update;
+	else
+		temp += post_update;

Ditto.

+	post_update = (inst.M10.i << 7) + inst.M10.imm7;
+	if (inst.M10.s)
+		temp -= post_update;
+	else
+		temp += post_update;

Ditto.

+	post_update = (inst.M10.i << 7) + inst.M10.imm7;
+	if (inst.M10.s)
+		temp -= post_update;
+	else
+		temp += post_update;

Ditto.

+	post_update = (inst.M15.i << 7) + inst.M15.imm7;
+	if (inst.M15.s)
+		temp -= post_update;
+	else
+		temp += post_update;

Ditto.

Tristan.
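The bug being pointed out is the classic one: s:i:imm7 forms a 9-bit two's-complement immediate, and treating the sign bit as a flag that merely negates the low 8 bits is not equivalent to sign-extending the whole field (for s = 1, the correct displacement is `(i << 7) + imm7 - 256`, not `-((i << 7) + imm7)`). A generic Python sketch of the correct pattern (illustrative, not the actual committed fix):

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

def post_update(s, i, imm7):
    """Assemble the ia64 s:i:imm7 immediate and sign-extend it as 9 bits
    (field names mirror the M5-style instruction forms quoted above)."""
    return sign_extend((s << 8) | (i << 7) | imm7, 9)
```

For example, s=1, i=1, imm7=0x7F encodes -1, whereas the buggy subtract-the-magnitude version would compute -255 for the same bit pattern.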
Re: [kvm-devel] [kvm-ppc-devel] virtio-net working on PowerPC KVM
On Mar 31, 2008, at 9:43 AM, Avi Kivity wrote:

Alexander Graf wrote:
I am experiencing 7-8ms ping latencies (native 0.1ms) on x86_64 as well, when pinging the virtual machine. Maybe it's not related to PowerPC? Is it supposed to be that slow?

If you have a really old host kernel, or a misconfigured one, it can happen. What's your host kernel? Are you using hrtimers?

This is a 2.6.22 SUSE kernel. As far as I can see, CONFIG_HIGH_RES_TIMERS is not set. Guess it's my fault then; nevertheless, a warning when starting kvm would be nice.

Alex
[kvm-devel] [02/17][PATCH] Implement smp_call_function_mask for ia64 - V8
From 697d50286088e98da5ac8653c80aaa96c81abf87 Mon Sep 17 00:00:00 2001
From: Xiantao Zhang [EMAIL PROTECTED]
Date: Mon, 31 Mar 2008 09:50:24 +0800
Subject: [PATCH] KVM:IA64: Implement smp_call_function_mask for ia64

This function provides a more flexible interface for the smp
infrastructure.

Signed-off-by: Xiantao Zhang [EMAIL PROTECTED]
---
 arch/ia64/kernel/smp.c |   84 +++++++++++++++++++++++++++++++++--------------
 include/asm-ia64/smp.h |    3 ++
 2 files changed, 69 insertions(+), 18 deletions(-)

diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c
index 4e446aa..5bb241f 100644
--- a/arch/ia64/kernel/smp.c
+++ b/arch/ia64/kernel/smp.c
@@ -213,6 +213,19 @@ send_IPI_allbutself (int op)
  * Called with preemption disabled.
  */
 static inline void
+send_IPI_mask(cpumask_t mask, int op)
+{
+	unsigned int cpu;
+
+	for_each_cpu_mask(cpu, mask) {
+		send_IPI_single(cpu, op);
+	}
+}
+
+/*
+ * Called with preemption disabled.
+ */
+static inline void
 send_IPI_all (int op)
 {
	int i;
@@ -401,33 +414,36 @@ smp_call_function_single (int cpuid, void (*func) (void *info), void *info, int
 }
 EXPORT_SYMBOL(smp_call_function_single);

-/*
- * this function sends a 'generic call function' IPI to all other CPUs
- * in the system.
- */
-
-/*
- * [SUMMARY]	Run a function on all other CPUs.
- * <func>	The function to run. This must be fast and non-blocking.
- * <info>	An arbitrary pointer to pass to the function.
- * <nonatomic>	currently unused.
- * <wait>	If true, wait (atomically) until function has completed on other CPUs.
- * [RETURNS]	0 on success, else a negative status code.
+/**
+ * smp_call_function_mask(): Run a function on a set of other CPUs.
+ * <mask>	The set of cpus to run on.  Must not include the current cpu.
+ * <func>	The function to run. This must be fast and non-blocking.
+ * <info>	An arbitrary pointer to pass to the function.
+ * <wait>	If true, wait (atomically) until function
+ *		has completed on other CPUs.
  *
- * Does not return until remote CPUs are nearly ready to execute <func> or are or have
- * executed.
+ * Returns 0 on success, else a negative status code.
+ *
+ * If @wait is true, then returns once @func has returned; otherwise
+ * it returns just before the target cpu calls @func.
  *
  * You must not call this function with disabled interrupts or from a
  * hardware interrupt handler or from a bottom half handler.
  */
-int
-smp_call_function (void (*func) (void *info), void *info, int nonatomic, int wait)
+int smp_call_function_mask(cpumask_t mask,
+			   void (*func)(void *), void *info,
+			   int wait)
 {
	struct call_data_struct data;
+	cpumask_t allbutself;
	int cpus;

	spin_lock(&call_lock);
-	cpus = num_online_cpus() - 1;
+	allbutself = cpu_online_map;
+	cpu_clear(smp_processor_id(), allbutself);
+
+	cpus_and(mask, mask, allbutself);
+	cpus = cpus_weight(mask);
	if (!cpus) {
		spin_unlock(&call_lock);
		return 0;
@@ -445,7 +461,12 @@
	call_data = &data;
	mb();	/* ensure store to call_data precedes setting of IPI_CALL_FUNC */
-	send_IPI_allbutself(IPI_CALL_FUNC);
+
+	/* Send a message to other CPUs */
+	if (cpus_equal(mask, allbutself))
+		send_IPI_allbutself(IPI_CALL_FUNC);
+	else
+		send_IPI_mask(mask, IPI_CALL_FUNC);

	/* Wait for response */
	while (atomic_read(&data.started) != cpus)
@@ -458,6 +479,33 @@
	spin_unlock(&call_lock);
	return 0;
+
+}
+EXPORT_SYMBOL(smp_call_function_mask);
+
+/*
+ * this function sends a 'generic call function' IPI to all other CPUs
+ * in the system.
+ */
+
+/*
+ * [SUMMARY]	Run a function on all other CPUs.
+ * <func>	The function to run. This must be fast and non-blocking.
+ * <info>	An arbitrary pointer to pass to the function.
+ * <nonatomic>	currently unused.
+ * <wait>	If true, wait (atomically) until function has completed on other CPUs.
+ * [RETURNS]	0 on success, else a negative status code.
+ *
+ * Does not return until remote CPUs are nearly ready to execute <func> or are or have
+ * executed.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler.
+ */
+int
+smp_call_function (void (*func) (void *info), void *info, int nonatomic, int wait)
+{
+	return smp_call_function_mask(cpu_online_map, func, info, wait);
 }
 EXPORT_SYMBOL(smp_call_function);
diff --git a/include/asm-ia64/smp.h b/include/asm-ia64/smp.h
index 4fa733d..ec5f355 100644
--- a/include/asm-ia64/smp.h
+++ b/include/asm-ia64/smp.h
@@ -38,6 +38,9 @@ ia64_get_lid (void)
	return
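The mask plumbing in smp_call_function_mask() above is easy to model with sets. This Python sketch (illustrative names) mirrors the decision between the broadcast IPI and per-CPU IPIs: drop the current CPU, intersect with the online mask, then pick the cheaper broadcast path only when the target set is exactly "everyone but me":

```python
def plan_ipis(online, self_cpu, mask):
    """Mirror the cpumask arithmetic of smp_call_function_mask():
    allbutself = online - {self}; targets = mask & allbutself."""
    allbutself = set(online) - {self_cpu}
    targets = set(mask) & allbutself
    if not targets:
        return "none", set()            # early return, 0 CPUs to call
    if targets == allbutself:
        return "allbutself", targets    # send_IPI_allbutself()
    return "mask", targets              # send_IPI_mask(), one IPI per CPU
```

This also shows why smp_call_function() can simply forward cpu_online_map: after removing the caller, the intersection equals allbutself, so the old broadcast behavior is preserved.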
[kvm-devel] [09/17] [PATCH] kvm/ia64: Add mmio decoder for kvm/ia64. V8
From cb572f8887ccfb939457c79fb2d2893ead2a3632 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Mon, 31 Mar 2008 10:08:09 +0800 Subject: [PATCH] KVM: IA64: Add mmio decoder for kvm/ia64. mmio.c includes the mmio decoder routines. Signed-off-by: Anthony Xu [EMAIL PROTECTED] Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- arch/ia64/kvm/mmio.c | 340 ++ 1 files changed, 340 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/mmio.c diff --git a/arch/ia64/kvm/mmio.c b/arch/ia64/kvm/mmio.c new file mode 100644 index 000..9ba879f --- /dev/null +++ b/arch/ia64/kvm/mmio.c @@ -0,0 +1,340 @@ +/* + * mmio.c: MMIO emulation components. + * Copyright (c) 2004, Intel Corporation. + * Yaozu Dong (Eddie Dong) ([EMAIL PROTECTED]) + * Kun Tian (Kevin Tian) ([EMAIL PROTECTED]) + * + * Copyright (c) 2007 Intel Corporation KVM support. + * Xuefei Xu (Anthony Xu) ([EMAIL PROTECTED]) + * Xiantao Zhang ([EMAIL PROTECTED]) + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + */ + +#include <linux/kvm_host.h> + +#include "vcpu.h" + +static void vlsapic_write_xtp(VCPU *v, uint8_t val) +{ + VLSAPIC_XTP(v) = val; +} + +/* + * LSAPIC OFFSET + */ +#define PIB_LOW_HALF(ofst) !(ofst & (1 << 20)) +#define PIB_OFST_INTA 0x1E +#define PIB_OFST_XTP 0x1E0008 + +/* + * execute write IPI op.
+ */ +static void vlsapic_write_ipi(VCPU *vcpu, uint64_t addr, uint64_t data) +{ + struct exit_ctl_data *p = &current_vcpu->arch.exit_data; + unsigned long psr; + + local_irq_save(psr); + + p->exit_reason = EXIT_REASON_IPI; + p->u.ipi_data.addr.val = addr; + p->u.ipi_data.data.val = data; + vmm_transition(current_vcpu); + + local_irq_restore(psr); + +} + +void lsapic_write(VCPU *v, unsigned long addr, unsigned long length, + unsigned long val) +{ + addr &= (PIB_SIZE - 1); + + switch (addr) { + case PIB_OFST_INTA: + /*panic_domain(NULL, "Undefined write on PIB INTA\n");*/ + panic_vm(v); + break; + case PIB_OFST_XTP: + if (length == 1) { + vlsapic_write_xtp(v, val); + } else { + /*panic_domain(NULL, + "Undefined write on PIB XTP\n");*/ + panic_vm(v); + } + break; + default: + if (PIB_LOW_HALF(addr)) { + /* lower half */ + if (length != 8) + /*panic_domain(NULL, + "Can't LHF write with size %ld!\n", + length);*/ + panic_vm(v); + else + vlsapic_write_ipi(v, addr, val); + } else { /* upper half + printk("IPI-UHF write %lx\n", addr);*/ + panic_vm(v); + } + break; + } +} + +unsigned long lsapic_read(VCPU *v, unsigned long addr, + unsigned long length) +{ + uint64_t result = 0; + + addr &= (PIB_SIZE - 1); + + switch (addr) { + case PIB_OFST_INTA: + if (length == 1) /* 1 byte load */ + ; /* There is no i8259, there is no INTA access*/ + else + /*panic_domain(NULL, "Undefined read on PIB INTA\n"); */ + panic_vm(v); + + break; + case PIB_OFST_XTP: + if (length == 1) { + result = VLSAPIC_XTP(v); + /* printk("read xtp %lx\n", result); */ + } else { + /*panic_domain(NULL, + "Undefined read on PIB XTP\n");*/ + panic_vm(v); + } + break; + default: + panic_vm(v); + break; + } + return result; +} + +static void mmio_access(VCPU *vcpu, u64 src_pa, u64 *dest, + u16 s, int ma, int dir) +{ + unsigned long iot; + struct exit_ctl_data *p = &vcpu->arch.exit_data; + unsigned long psr; + + iot = __gpfn_is_io(src_pa >> PAGE_SHIFT); + + local_irq_save(psr); +
[kvm-devel] [12/17][PATCH] kvm/ia64: add optimization for some virtualization faults - V8
From a2bf407dd4dbcec75a076b9ed9a6d22ab98c54b7 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Wed, 12 Mar 2008 13:49:38 +0800 Subject: [PATCH] KVM: IA64: add optimization for some virtualization faults optvfault.S adds optimization for some performance-critical virtualization faults. Signed-off-by: Anthony Xu [EMAIL PROTECTED] Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- arch/ia64/kvm/optvfault.S | 918 + 1 files changed, 918 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/optvfault.S diff --git a/arch/ia64/kvm/optvfault.S b/arch/ia64/kvm/optvfault.S new file mode 100644 index 000..5de210e --- /dev/null +++ b/arch/ia64/kvm/optvfault.S @@ -0,0 +1,918 @@ +/* + * arch/ia64/vmx/optvfault.S + * optimize virtualization fault handler + * + * Copyright (C) 2006 Intel Co + * Xuefei Xu (Anthony Xu) [EMAIL PROTECTED] + */ + +#include <asm/asmmacro.h> +#include <asm/processor.h> + +#include "vti.h" +#include "asm-offsets.h" + +#define ACCE_MOV_FROM_AR +#define ACCE_MOV_FROM_RR +#define ACCE_MOV_TO_RR +#define ACCE_RSM +#define ACCE_SSM +#define ACCE_MOV_TO_PSR +#define ACCE_THASH + +//mov r1=ar3 +GLOBAL_ENTRY(kvm_asm_mov_from_ar) +#ifndef ACCE_MOV_FROM_AR +br.many kvm_virtualization_fault_back +#endif +add r18=VMM_VCPU_ITC_OFS_OFFSET, r21 +add r16=VMM_VCPU_LAST_ITC_OFFSET,r21 +extr.u r17=r25,6,7 +;; +ld8 r18=[r18] +mov r19=ar.itc +mov r24=b0 +;; +add r19=r19,r18 +addl r20=@gprel(asm_mov_to_reg),gp +;; +st8 [r16] = r19 +adds r30=kvm_resume_to_guest-asm_mov_to_reg,r20 +shladd r17=r17,4,r20 +;; +mov b0=r17 +br.sptk.few b0 +;; +END(kvm_asm_mov_from_ar) + + +// mov r1=rr[r3] +GLOBAL_ENTRY(kvm_asm_mov_from_rr) +#ifndef ACCE_MOV_FROM_RR +br.many kvm_virtualization_fault_back +#endif +extr.u r16=r25,20,7 +extr.u r17=r25,6,7 +addl r20=@gprel(asm_mov_from_reg),gp +;; +adds r30=kvm_asm_mov_from_rr_back_1-asm_mov_from_reg,r20 +shladd r16=r16,4,r20 +mov r24=b0 +;; +add r27=VMM_VCPU_VRR0_OFFSET,r21 +mov b0=r16 +br.many b0 +;;
+kvm_asm_mov_from_rr_back_1: +adds r30=kvm_resume_to_guest-asm_mov_from_reg,r20 +adds r22=asm_mov_to_reg-asm_mov_from_reg,r20 +shr.u r26=r19,61 +;; +shladd r17=r17,4,r22 +shladd r27=r26,3,r27 +;; +ld8 r19=[r27] +mov b0=r17 +br.many b0 +END(kvm_asm_mov_from_rr) + + +// mov rr[r3]=r2 +GLOBAL_ENTRY(kvm_asm_mov_to_rr) +#ifndef ACCE_MOV_TO_RR +br.many kvm_virtualization_fault_back +#endif +extr.u r16=r25,20,7 +extr.u r17=r25,13,7 +addl r20=@gprel(asm_mov_from_reg),gp +;; +adds r30=kvm_asm_mov_to_rr_back_1-asm_mov_from_reg,r20 +shladd r16=r16,4,r20 +mov r22=b0 +;; +add r27=VMM_VCPU_VRR0_OFFSET,r21 +mov b0=r16 +br.many b0 +;; +kvm_asm_mov_to_rr_back_1: +adds r30=kvm_asm_mov_to_rr_back_2-asm_mov_from_reg,r20 +shr.u r23=r19,61 +shladd r17=r17,4,r20 +;; +//if rr6, go back +cmp.eq p6,p0=6,r23 +mov b0=r22 +(p6) br.cond.dpnt.many kvm_virtualization_fault_back +;; +mov r28=r19 +mov b0=r17 +br.many b0 +kvm_asm_mov_to_rr_back_2: +adds r30=kvm_resume_to_guest-asm_mov_from_reg,r20 +shladd r27=r23,3,r27 +;; // vrr.rid<<4 |0xe +st8 [r27]=r19 +mov b0=r30 +;; +extr.u r16=r19,8,26 +extr.u r18 =r19,2,6 +mov r17 =0xe +;; +shladd r16 = r16, 4, r17 +extr.u r19 =r19,0,8 +;; +shl r16 = r16,8 +;; +add r19 = r19, r16 +;; //set ve 1 +dep r19=-1,r19,0,1 +cmp.lt p6,p0=14,r18 +;; +(p6) mov r18=14 +;; +(p6) dep r19=r18,r19,2,6 +;; +cmp.eq p6,p0=0,r23 +;; +cmp.eq.or p6,p0=4,r23 +;; +adds r16=VMM_VCPU_MODE_FLAGS_OFFSET,r21 +(p6) adds r17=VMM_VCPU_META_SAVED_RR0_OFFSET,r21 +;; +ld4 r16=[r16] +cmp.eq p7,p0=r0,r0 +(p6) shladd r17=r23,1,r17 +;; +(p6) st8 [r17]=r19 +(p6) tbit.nz p6,p7=r16,0 +;; +(p7) mov rr[r28]=r19 +mov r24=r22 +br.many b0 +END(kvm_asm_mov_to_rr) + + +//rsm +GLOBAL_ENTRY(kvm_asm_rsm) +#ifndef ACCE_RSM +br.many kvm_virtualization_fault_back +#endif +add r16=VMM_VPD_BASE_OFFSET,r21 +extr.u r26=r25,6,21 +extr.u r27=r25,31,2 +;; +ld8 r16=[r16] +extr.u r28=r25,36,1 +dep r26=r27,r26,21,2 +;; +add r17=VPD_VPSR_START_OFFSET,r16 +add r22=VMM_VCPU_MODE_FLAGS_OFFSET,r21 +//r26 is imm24 +dep 
r26=r28,r26,23,1 +;; +ld8 r18=[r17] +movl r28=IA64_PSR_IC+IA64_PSR_I+IA64_PSR_DT+IA64_PSR_SI +ld4 r23=[r22] +sub r27=-1,r26 +mov r24=b0 +;; +mov r20=cr.ipsr +or r28=r27,r28 +and r19=r18,r27 +;; +st8 [r17]=r19 +and r20=r20,r28 +/* Commented out due to lack of fp lazy algorithm support +adds r27=IA64_VCPU_FP_PSR_OFFSET,r21 +;; +ld8 r27=[r27] +;; +tbit.nz p8,p0= r27,IA64_PSR_DFH_BIT +;; +(p8) dep
[kvm-devel] [13/17][PATCH] kvm/ia64: Generate offset values for assembly code use. V8
From b0f8c3bf3b020077c14bebd9d052cec455ccedaf Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Wed, 12 Mar 2008 13:50:13 +0800 Subject: [PATCH] KVM: IA64: Generate offset values for assembly code use. asm-offsets.c will generate the offset values of some fields of special structures, for use by assembly code. Signed-off-by: Anthony Xu [EMAIL PROTECTED] Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- arch/ia64/kvm/asm-offsets.c | 251 +++ 1 files changed, 251 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/asm-offsets.c diff --git a/arch/ia64/kvm/asm-offsets.c b/arch/ia64/kvm/asm-offsets.c new file mode 100644 index 000..fc2ac82 --- /dev/null +++ b/arch/ia64/kvm/asm-offsets.c @@ -0,0 +1,251 @@ +/* + * asm-offsets.c Generate definitions needed by assembly language modules. + * This code generates raw asm output which is post-processed + * to extract and format the required data. + * + * Anthony Xu [EMAIL PROTECTED] + * Xiantao Zhang [EMAIL PROTECTED] + * Copyright (c) 2007 Intel Corporation KVM support. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA.
+ * + */ + +#include <linux/autoconf.h> +#include <linux/kvm_host.h> + +#include "vcpu.h" + +#define task_struct kvm_vcpu + +#define DEFINE(sym, val) \ + asm volatile("\n->" #sym " (%0) " #val : : "i" (val)) + +#define BLANK() asm volatile("\n->" : :) + +#define OFFSET(_sym, _str, _mem) \ +DEFINE(_sym, offsetof(_str, _mem)); + +void foo(void) +{ + DEFINE(VMM_TASK_SIZE, sizeof(struct kvm_vcpu)); + DEFINE(VMM_PT_REGS_SIZE, sizeof(struct kvm_pt_regs)); + + BLANK(); + + DEFINE(VMM_VCPU_META_RR0_OFFSET, + offsetof(struct kvm_vcpu, arch.metaphysical_rr0)); + DEFINE(VMM_VCPU_META_SAVED_RR0_OFFSET, + offsetof(struct kvm_vcpu, + arch.metaphysical_saved_rr0)); + DEFINE(VMM_VCPU_VRR0_OFFSET, + offsetof(struct kvm_vcpu, arch.vrr[0])); + DEFINE(VMM_VPD_IRR0_OFFSET, + offsetof(struct vpd, irr[0])); + DEFINE(VMM_VCPU_ITC_CHECK_OFFSET, + offsetof(struct kvm_vcpu, arch.itc_check)); + DEFINE(VMM_VCPU_IRQ_CHECK_OFFSET, + offsetof(struct kvm_vcpu, arch.irq_check)); + DEFINE(VMM_VPD_VHPI_OFFSET, + offsetof(struct vpd, vhpi)); + DEFINE(VMM_VCPU_VSA_BASE_OFFSET, + offsetof(struct kvm_vcpu, arch.vsa_base)); + DEFINE(VMM_VCPU_VPD_OFFSET, + offsetof(struct kvm_vcpu, arch.vpd)); + DEFINE(VMM_VCPU_IRQ_CHECK, + offsetof(struct kvm_vcpu, arch.irq_check)); + DEFINE(VMM_VCPU_TIMER_PENDING, + offsetof(struct kvm_vcpu, arch.timer_pending)); + DEFINE(VMM_VCPU_META_SAVED_RR0_OFFSET, + offsetof(struct kvm_vcpu, arch.metaphysical_saved_rr0)); + DEFINE(VMM_VCPU_MODE_FLAGS_OFFSET, + offsetof(struct kvm_vcpu, arch.mode_flags)); + DEFINE(VMM_VCPU_ITC_OFS_OFFSET, + offsetof(struct kvm_vcpu, arch.itc_offset)); + DEFINE(VMM_VCPU_LAST_ITC_OFFSET, + offsetof(struct kvm_vcpu, arch.last_itc)); + DEFINE(VMM_VCPU_SAVED_GP_OFFSET, + offsetof(struct kvm_vcpu, arch.saved_gp)); + + BLANK(); + + DEFINE(VMM_PT_REGS_B6_OFFSET, + offsetof(struct kvm_pt_regs, b6)); + DEFINE(VMM_PT_REGS_B7_OFFSET, + offsetof(struct kvm_pt_regs, b7)); + DEFINE(VMM_PT_REGS_AR_CSD_OFFSET, + offsetof(struct kvm_pt_regs, ar_csd)); + 
DEFINE(VMM_PT_REGS_AR_SSD_OFFSET, + offsetof(struct kvm_pt_regs, ar_ssd)); + DEFINE(VMM_PT_REGS_R8_OFFSET, + offsetof(struct kvm_pt_regs, r8)); + DEFINE(VMM_PT_REGS_R9_OFFSET, + offsetof(struct kvm_pt_regs, r9)); + DEFINE(VMM_PT_REGS_R10_OFFSET, + offsetof(struct kvm_pt_regs, r10)); + DEFINE(VMM_PT_REGS_R11_OFFSET, + offsetof(struct kvm_pt_regs, r11)); + DEFINE(VMM_PT_REGS_CR_IPSR_OFFSET, + offsetof(struct
[kvm-devel] [16/17] [PATCH] kvm:ia64 Enable kvm build for ia64 - V8
From 9b38270a4c01d8cfe85cd022e22a6f5c0efe45e7 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Fri, 28 Mar 2008 14:58:47 +0800 Subject: [PATCH] KVM: IA64: Enable kvm build for ia64 Update the related Makefile and Kconfig for the kvm build. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- arch/ia64/Kconfig | 3 ++ arch/ia64/Makefile | 1 + arch/ia64/kvm/Kconfig | 46 arch/ia64/kvm/Makefile | 61 4 files changed, 111 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/Kconfig create mode 100644 arch/ia64/kvm/Makefile diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 8fa3faf..a7bb62e 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -19,6 +19,7 @@ config IA64 select HAVE_OPROFILE select HAVE_KPROBES select HAVE_KRETPROBES + select HAVE_KVM default y help The Itanium Processor Family is Intel's 64-bit successor to @@ -589,6 +590,8 @@ config MSPEC source "fs/Kconfig" +source "arch/ia64/kvm/Kconfig" + source "lib/Kconfig" # diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile index f1645c4..ec4cca4 100644 --- a/arch/ia64/Makefile +++ b/arch/ia64/Makefile @@ -57,6 +57,7 @@ core-$(CONFIG_IA64_GENERIC) += arch/ia64/dig/ core-$(CONFIG_IA64_HP_ZX1) += arch/ia64/dig/ core-$(CONFIG_IA64_HP_ZX1_SWIOTLB) += arch/ia64/dig/ core-$(CONFIG_IA64_SGI_SN2) += arch/ia64/sn/ +core-$(CONFIG_KVM) += arch/ia64/kvm/ drivers-$(CONFIG_PCI) += arch/ia64/pci/ drivers-$(CONFIG_IA64_HP_SIM) += arch/ia64/hp/sim/ diff --git a/arch/ia64/kvm/Kconfig b/arch/ia64/kvm/Kconfig new file mode 100644 index 000..d2e54b9 --- /dev/null +++ b/arch/ia64/kvm/Kconfig @@ -0,0 +1,46 @@ +# +# KVM configuration +# +config HAVE_KVM + bool + +menuconfig VIRTUALIZATION + bool "Virtualization" + depends on HAVE_KVM || IA64 + default y + ---help--- + Say Y here to get to see options for using your Linux host to run other + operating systems inside virtual machines (guests). + This option alone does not add any kernel code. + + If you say N, all options in this submenu will be skipped and disabled.
+ +if VIRTUALIZATION + +config KVM + tristate "Kernel-based Virtual Machine (KVM) support" + depends on HAVE_KVM && EXPERIMENTAL + select PREEMPT_NOTIFIERS + select ANON_INODES + ---help--- + Support hosting fully virtualized guest machines using hardware + virtualization extensions. You will need a fairly recent + processor equipped with virtualization extensions. You will also + need to select one or more of the processor modules below. + + This module provides access to the hardware capabilities through + a character device node named /dev/kvm. + + To compile this as a module, choose M here: the module + will be called kvm. + + If unsure, say N. + +config KVM_INTEL + tristate "KVM for Intel Itanium 2 processors support" + depends on KVM && m + ---help--- + Provides support for KVM on Itanium 2 processors equipped with the VT + extensions. + +endif # VIRTUALIZATION diff --git a/arch/ia64/kvm/Makefile b/arch/ia64/kvm/Makefile new file mode 100644 index 000..cde7d8e --- /dev/null +++ b/arch/ia64/kvm/Makefile @@ -0,0 +1,61 @@ +# This Makefile generates asm-offsets.h and builds the source.
+# + +# Generate asm-offsets.h for vmm module build +offsets-file := asm-offsets.h + +always := $(offsets-file) +targets := $(offsets-file) +targets += arch/ia64/kvm/asm-offsets.s +clean-files := $(addprefix $(objtree)/,$(targets) $(obj)/memcpy.S $(obj)/memset.S) + +# Default sed regexp - multiline due to syntax constraints +define sed-y + /^->/{s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;} +endef + +quiet_cmd_offsets = GEN $@ +define cmd_offsets + (set -e; \ + echo "#ifndef __ASM_KVM_OFFSETS_H__"; \ + echo "#define __ASM_KVM_OFFSETS_H__"; \ + echo "/*"; \ + echo " * DO NOT MODIFY."; \ + echo " *"; \ + echo " * This file was generated by Makefile"; \ + echo " *"; \ + echo " */"; \ + echo ""; \ + sed -ne $(sed-y) $<; \ + echo ""; \ + echo "#endif" ) > $@ +endef +# We use internal rules to avoid the "is up to date" message from make +arch/ia64/kvm/asm-offsets.s: arch/ia64/kvm/asm-offsets.c + $(call if_changed_dep,cc_s_c) + +$(obj)/$(offsets-file): arch/ia64/kvm/asm-offsets.s + $(call cmd,offsets) + +# +# Makefile for Kernel-based Virtual Machine module +# + +EXTRA_CFLAGS += -Ivirt/kvm -Iarch/ia64/kvm/ + +$(addprefix $(objtree)/,$(obj)/memcpy.S $(obj)/memset.S): + $(shell ln -snf ../lib/memcpy.S $(src)/memcpy.S) + $(shell ln -snf ../lib/memset.S $(src)/memset.S) + +common-objs = $(addprefix ../../../virt/kvm/,
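The sed-y rule above can be exercised outside make. The sample input mimics the "->NAME value comment" markers that the DEFINE() asm statements emit into asm-offsets.s (the values are made up, and the Makefile's `$$` escaping collapses to a single `$` in raw shell):

```shell
# Hypothetical slice of the compiler's asm-offsets.s output:
cat > /tmp/asm-offsets.s <<'EOF'
->VMM_TASK_SIZE $27952 sizeof(struct kvm_vcpu)
->VMM_PT_REGS_B6_OFFSET $16 offsetof(struct kvm_pt_regs, b6)
EOF

# The same regexp as sed-y: keep only "->" marker lines and turn each
# into a #define with the original expression preserved as a comment.
sed -ne '/^->/{s:^->\([^ ]*\) [$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; s:->::; p;}' \
    /tmp/asm-offsets.s
```

This prints `#define VMM_TASK_SIZE 27952 /* sizeof(struct kvm_vcpu) */` and the corresponding line for the b6 offset, which is exactly the body that cmd_offsets wraps in the include-guard boilerplate.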
[kvm-devel] [15/17][PATCH] kvm/ia64: Add kvm sal/pal virtualization support. V8
From e9f15f3838626eacface8a863394e6b8825182be Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Wed, 12 Mar 2008 13:42:18 +0800 Subject: [PATCH] KVM: IA64: Add kvm sal/pal virtualization support. Some sal/pal calls from the guest firmware are trapped into kvm for virtualization. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- arch/ia64/kvm/kvm_fw.c | 500 1 files changed, 500 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/kvm_fw.c diff --git a/arch/ia64/kvm/kvm_fw.c b/arch/ia64/kvm/kvm_fw.c new file mode 100644 index 000..077d6e7 --- /dev/null +++ b/arch/ia64/kvm/kvm_fw.c @@ -0,0 +1,500 @@ +/* + * PAL/SAL call delegation + * + * Copyright (c) 2004 Li Susie [EMAIL PROTECTED] + * Copyright (c) 2005 Yu Ke [EMAIL PROTECTED] + * Copyright (c) 2007 Xiantao Zhang [EMAIL PROTECTED] + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + */ + +#include <linux/kvm_host.h> +#include <linux/smp.h> + +#include "vti.h" +#include "misc.h" + +#include <asm/pal.h> +#include <asm/sal.h> +#include <asm/tlb.h> + +/* + * Handy macros to make sure that the PAL return values start out + * as something meaningful.
+ */ +#define INIT_PAL_STATUS_UNIMPLEMENTED(x) \ + { \ + x.status = PAL_STATUS_UNIMPLEMENTED;\ + x.v0 = 0; \ + x.v1 = 0; \ + x.v2 = 0; \ + } + +#define INIT_PAL_STATUS_SUCCESS(x) \ + { \ + x.status = PAL_STATUS_SUCCESS; \ + x.v0 = 0; \ + x.v1 = 0; \ + x.v2 = 0; \ +} + +static void kvm_get_pal_call_data(struct kvm_vcpu *vcpu, + u64 *gr28, u64 *gr29, u64 *gr30, u64 *gr31) { + struct exit_ctl_data *p; + + if (vcpu) { + p = &vcpu->arch.exit_data; + if (p->exit_reason == EXIT_REASON_PAL_CALL) { + *gr28 = p->u.pal_data.gr28; + *gr29 = p->u.pal_data.gr29; + *gr30 = p->u.pal_data.gr30; + *gr31 = p->u.pal_data.gr31; + return ; + } + } + printk(KERN_DEBUG "Error occurs in kvm_get_pal_call_data!!\n"); +} + +static void set_pal_result(struct kvm_vcpu *vcpu, + struct ia64_pal_retval result) { + + struct exit_ctl_data *p; + + p = kvm_get_exit_data(vcpu); + if (p && p->exit_reason == EXIT_REASON_PAL_CALL) { + p->u.pal_data.ret = result; + return ; + } + INIT_PAL_STATUS_UNIMPLEMENTED(p->u.pal_data.ret); +} + +static void set_sal_result(struct kvm_vcpu *vcpu, + struct sal_ret_values result) { + struct exit_ctl_data *p; + + p = kvm_get_exit_data(vcpu); + if (p && p->exit_reason == EXIT_REASON_SAL_CALL) { + p->u.sal_data.ret = result; + return ; + } + printk(KERN_WARNING "Error occurs!!!\n"); +} + +struct cache_flush_args { + u64 cache_type; + u64 operation; + u64 progress; + long status; +}; + +cpumask_t cpu_cache_coherent_map; + +static void remote_pal_cache_flush(void *data) +{ + struct cache_flush_args *args = data; + long status; + u64 progress = args->progress; + + status = ia64_pal_cache_flush(args->cache_type, args->operation, + &progress, NULL); + if (status != 0) + args->status = status; +} + +static struct ia64_pal_retval pal_cache_flush(struct kvm_vcpu *vcpu) +{ + u64 gr28, gr29, gr30, gr31; + struct ia64_pal_retval result = {0, 0, 0, 0}; + struct cache_flush_args args = {0, 0, 0, 0}; + long psr; + + gr28 = gr29 = gr30 = gr31 = 0; + kvm_get_pal_call_data(vcpu, &gr28, &gr29, &gr30, &gr31); + + if (gr31 != 0) + printk(KERN_ERR "vcpu:%p called cache_flush error!\n", vcpu); + + /* Always call Host Pal in int=1 */ + gr30 &= ~PAL_CACHE_FLUSH_CHK_INTRS; + args.cache_type = gr29; + args.operation = gr30; +
Re: [kvm-devel] [kvm-ppc-devel] virtio-net working on PowerPC KVM
Alexander Graf wrote: On Mar 31, 2008, at 9:43 AM, Avi Kivity wrote: Alexander Graf wrote: I am experiencing 7-8ms ping latencies (native 0.1ms) on x86_64 as well, when pinging the virtual machine. Maybe it's not related to PowerPC? Is it supposed to be that slow? If you have a really old host kernel, or a misconfigured one, it can happen. What's your host kernel? are you using hrtimers? This is a 2.6.22 SUSE kernel. As far as I can see CONFIG_HIGH_RES_TIMERS is not set. An alternative is -clock hpet or -clock rtc. Guess it's my fault then - nevertheless a warning when starting kvm would be nice. Yes, indeed. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
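A quick way to check the CONFIG_HIGH_RES_TIMERS setting Avi asks about (config file locations vary by distro; on a live host point the helper at /boot/config-$(uname -r), or unpack /proc/config.gz first -- this is a sketch, not the only place to look):

```shell
# Report whether a kernel config file enables high-resolution timers.
check_hrt() {
    if grep -q '^CONFIG_HIGH_RES_TIMERS=y' "$1"; then
        echo "hrtimers: on"
    else
        echo "hrtimers: off"
    fi
}

# Demo against a sample config file:
printf 'CONFIG_HIGH_RES_TIMERS=y\n' > /tmp/sample.config
check_hrt /tmp/sample.config   # prints "hrtimers: on"
```

On a host like the 2.6.22 kernel discussed above, "hrtimers: off" would explain the millisecond-scale guest timer granularity, and `-clock hpet` or `-clock rtc` are the suggested workarounds.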
[kvm-devel] [17/17][PATCH] kvm/ia64: How to boot up guests on kvm/ia64 -V8
From b04624ce5ff919d776bf1d64b157d67410c6bc27 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Wed, 12 Mar 2008 13:57:33 +0800 Subject: [PATCH] KVM: IA64: How to boot up guests on kvm/ia64 Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- Documentation/ia64/kvm-howto.txt | 74 ++ 1 files changed, 74 insertions(+), 0 deletions(-) create mode 100644 Documentation/ia64/kvm-howto.txt diff --git a/Documentation/ia64/kvm-howto.txt b/Documentation/ia64/kvm-howto.txt new file mode 100644 index 000..5a8049c --- /dev/null +++ b/Documentation/ia64/kvm-howto.txt @@ -0,0 +1,74 @@ + Guide: How to boot up guests on kvm/ia64 + +1. Get the kvm source from git.kernel.org. + Userspace source: + git clone git://git.kernel.org/pub/scm/virt/kvm/kvm-userspace.git + Kernel source: + git clone git://git.kernel.org/pub/scm/linux/kernel/git/xiantao/kvm-ia64.git + +2. Compile the source code. + 2.1 Compile the userspace code: + (1) cd ./kvm-userspace + (2) ./configure + (3) cd kernel + (4) make sync LINUX=$kernel_dir (kernel_dir is the directory of the kernel source.) + (5) cd .. + (6) make qemu + (7) cd qemu; make install + + 2.2 Compile the kernel source code: + (1) cd ./$kernel_dir + (2) make menuconfig + (3) Enter the virtualization option, and choose kvm. + (4) make + (5) Once (4) is done, make modules_install + (6) Make the initrd, and reboot the host machine with the new kernel. + (7) Once (6) is done, cd $kernel_dir/arch/ia64/kvm + (8) insmod kvm.ko; insmod kvm-intel.ko + +Note: For step 2, please make sure that the host page size == TARGET_PAGE_SIZE of qemu, otherwise it may fail. + +3. Get the guest firmware named Flash.fd, and put it in the right place: + (1) If you have the guest firmware (binary) released by Intel Corp for Xen, you can use it directly. + (2) If you want to build the guest firmware from source code, please download the source with + hg clone http://xenbits.xensource.com/ext/efi-vfirmware.hg + and use the guide in the source to build the open Guest Firmware.
+ (3) Rename it to Flash.fd, and copy it to /usr/local/share/qemu +Note: For step 3, kvm uses guest firmware which complies with the one Xen uses. + +4. Boot up Linux or Windows guests: + 4.1 Create or install an image for guest boot. If you have xen experience, it should be easy. + + 4.2 Boot up guests using the following command. + /usr/local/bin/qemu-system-ia64 -smp xx -m 512 -hda $your_image + (xx is the number of virtual processors for the guest; currently the maximum value is 4) + +5. Known possible issue on some platforms with old firmware + +If you meet strange host crashes, you may try to solve them through either of the following methods. +(1): Upgrade your firmware to the latest one. + +(2): Apply the patch below to the kernel source. +diff --git a/arch/ia64/kernel/pal.S b/arch/ia64/kernel/pal.S +index 0b53344..f02b0f7 100644 +--- a/arch/ia64/kernel/pal.S ++++ b/arch/ia64/kernel/pal.S +@@ -84,7 +84,8 @@ GLOBAL_ENTRY(ia64_pal_call_static) + mov ar.pfs = loc1 + mov rp = loc0 + ;; +- srlz.d // serialize restoration of psr.l ++ srlz.i // serialize restoration of psr.l ++ ;; + br.ret.sptk.many b0 + END(ia64_pal_call_static) + +6. Bug report: + If you find any issues when using kvm/ia64, please post the bug info to the kvm-ia64-devel mailing list. + https://lists.sourceforge.net/lists/listinfo/kvm-ia64-devel/ + +Thanks for your interest! Let's work together, and make kvm/ia64 stronger and stronger! + + + Xiantao Zhang [EMAIL PROTECTED] + 2008.3.10 -- 1.5.2
[kvm-devel] [03/15][PATCH] kvm/ia64: Add header files for kvm/ia64. V8
From 03259a60f3c8104cd61f523f9ddeccce0e635782 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Fri, 28 Mar 2008 09:48:10 +0800 Subject: [PATCH] KVM: IA64: Add header files for kvm/ia64. Three header files are added: asm-ia64/kvm.h asm-ia64/kvm_host.h asm-ia64/kvm_para.h Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- include/asm-ia64/kvm.h | 205 + include/asm-ia64/kvm_host.h | 530 +++ include/asm-ia64/kvm_para.h | 29 +++ 3 files changed, 764 insertions(+), 0 deletions(-) create mode 100644 include/asm-ia64/kvm.h create mode 100644 include/asm-ia64/kvm_host.h create mode 100644 include/asm-ia64/kvm_para.h diff --git a/include/asm-ia64/kvm.h b/include/asm-ia64/kvm.h new file mode 100644 index 000..8c70dd6 --- /dev/null +++ b/include/asm-ia64/kvm.h @@ -0,0 +1,205 @@ +#ifndef __ASM_KVM_IA64_H +#define __ASM_KVM_IA64_H + +/* + * asm-ia64/kvm.h: kvm structure definitions for ia64 + * + * Copyright (C) 2007 Xiantao Zhang [EMAIL PROTECTED] + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + */ + +#include <asm/types.h> +#include <asm/fpu.h> + +#include <linux/ioctl.h> + +/* Architectural interrupt line count.
*/ +#define KVM_NR_INTERRUPTS 256 + +#define KVM_IOAPIC_NUM_PINS 24 + +struct kvm_ioapic_state { + __u64 base_address; + __u32 ioregsel; + __u32 id; + __u32 irr; + __u32 pad; + union { + __u64 bits; + struct { + __u8 vector; + __u8 delivery_mode:3; + __u8 dest_mode:1; + __u8 delivery_status:1; + __u8 polarity:1; + __u8 remote_irr:1; + __u8 trig_mode:1; + __u8 mask:1; + __u8 reserve:7; + __u8 reserved[4]; + __u8 dest_id; + } fields; + } redirtbl[KVM_IOAPIC_NUM_PINS]; +}; + +#define KVM_IRQCHIP_PIC_MASTER 0 +#define KVM_IRQCHIP_PIC_SLAVE 1 +#define KVM_IRQCHIP_IOAPIC 2 + +#define KVM_CONTEXT_SIZE 8*1024 + +typedef union context { + /* 8K size */ + char dummy[KVM_CONTEXT_SIZE]; + struct { + unsigned long psr; + unsigned long pr; + unsigned long caller_unat; + unsigned long pad; + unsigned long gr[32]; + unsigned long ar[128]; + unsigned long br[8]; + unsigned long cr[128]; + unsigned long rr[8]; + unsigned long ibr[8]; + unsigned long dbr[8]; + unsigned long pkr[8]; + struct ia64_fpreg fr[128]; + }; +} context_t; + +typedef struct thash_data { + union { + struct { + unsigned long p: 1; /* 0 */ + unsigned long rv1 : 1; /* 1 */ + unsigned long ma : 3; /* 2-4 */ + unsigned long a: 1; /* 5 */ + unsigned long d: 1; /* 6 */ + unsigned long pl : 2; /* 7-8 */ + unsigned long ar : 3; /* 9-11 */ + unsigned long ppn : 38; /* 12-49 */ + unsigned long rv2 : 2; /* 50-51 */ + unsigned long ed : 1; /* 52 */ + unsigned long ig1 : 11; /* 53-63 */ + }; + struct { + unsigned long __rv1 : 53; /* 0-52 */ + unsigned long contiguous : 1; /* 53 */ + unsigned long tc : 1; /* 54 TR or TC */ + unsigned long cl : 1; + /* 55 I side or D side cache line */ + unsigned long len : 4; /* 56-59 */ + unsigned long io : 1; /* 60 entry is for io or not */ + unsigned long nomap : 1; + /* 61 entry can't be inserted into machine TLB.*/ + unsigned long checked : 1; + /* 62 for VTLB/VHPT sanity check */ + unsigned long invalid : 1;
[kvm-devel] [07/17][PATCH] kvm/ia64: Add TLB virtualization support. - V8
From 6b731c15afa8cec84f16408c421c286f1dd1b7d3 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Wed, 12 Mar 2008 13:45:40 +0800 Subject: [PATCH] KVM: IA64: Add TLB virtualization support. vtlb.c includes tlb/VHPT virtualization. Signed-off-by: Anthony Xu [EMAIL PROTECTED] Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] --- arch/ia64/kvm/vtlb.c | 631 ++ 1 files changed, 631 insertions(+), 0 deletions(-) create mode 100644 arch/ia64/kvm/vtlb.c diff --git a/arch/ia64/kvm/vtlb.c b/arch/ia64/kvm/vtlb.c new file mode 100644 index 000..6e6ed25 --- /dev/null +++ b/arch/ia64/kvm/vtlb.c @@ -0,0 +1,631 @@ +/* + * vtlb.c: guest virtual tlb handling module. + * Copyright (c) 2004, Intel Corporation. + * Yaozu Dong (Eddie Dong) [EMAIL PROTECTED] + * Xuefei Xu (Anthony Xu) [EMAIL PROTECTED] + * + * Copyright (c) 2007, Intel Corporation. + * Xuefei Xu (Anthony Xu) [EMAIL PROTECTED] + * Xiantao Zhang [EMAIL PROTECTED] + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., 59 Temple + * Place - Suite 330, Boston, MA 02111-1307 USA. + * + */ + +#include "vcpu.h" + +#include <linux/rwsem.h> +/* + * Check to see if the address rid:va is translated by the TLB + */ + +static int __is_tr_translated(thash_data_t *trp, u64 rid, u64 va) +{ + return ((trp->p) && (trp->rid == rid) + && ((va - trp->vadr) < PSIZE(trp->ps))); +} + +/* + * Only for GUEST TR format.
+ */ +static int __is_tr_overlap(thash_data_t *trp, u64 rid, u64 sva, u64 eva) +{ + u64 sa1, ea1; + + if (!trp->p || trp->rid != rid) + return 0; + + sa1 = trp->vadr; + ea1 = sa1 + PSIZE(trp->ps) - 1; + eva -= 1; + if ((sva > ea1) || (sa1 > eva)) + return 0; + else + return 1; + +} + +void machine_tlb_purge(u64 va, u64 ps) +{ + ia64_ptcl(va, ps << 2); +} + +void local_flush_tlb_all(void) +{ + int i, j; + unsigned long flags, count0, count1; + unsigned long stride0, stride1, addr; + + addr = current_vcpu->arch.ptce_base; + count0 = current_vcpu->arch.ptce_count[0]; + count1 = current_vcpu->arch.ptce_count[1]; + stride0 = current_vcpu->arch.ptce_stride[0]; + stride1 = current_vcpu->arch.ptce_stride[1]; + + local_irq_save(flags); + for (i = 0; i < count0; ++i) { + for (j = 0; j < count1; ++j) { + ia64_ptce(addr); + addr += stride1; + } + addr += stride0; + } + local_irq_restore(flags); + ia64_srlz_i(); /* srlz.i implies srlz.d */ +} + +int vhpt_enabled(VCPU *vcpu, u64 vadr, vhpt_ref_t ref) +{ + ia64_rr vrr; + ia64_pta vpta; + ia64_psr vpsr; + + vpsr.val = VCPU(vcpu, vpsr); + vrr.val = vcpu_get_rr(vcpu, vadr); + vpta.val = vcpu_get_pta(vcpu); + + if (vrr.ve && vpta.ve) { + switch (ref) { + case DATA_REF: + case NA_REF: + return vpsr.dt; + case INST_REF: + return vpsr.dt && vpsr.it && vpsr.ic; + case RSE_REF: + return vpsr.dt && vpsr.rt; + + } + } + return 0; +} + +thash_data_t *vsa_thash(ia64_pta vpta, u64 va, u64 vrr, u64 *tag) +{ + u64 index, pfn, rid, pfn_bits; + + pfn_bits = vpta.size - 5 - 8; + pfn = REGION_OFFSET(va) >> _REGION_PAGE_SIZE(vrr); + rid = _REGION_ID(vrr); + index = ((rid & 0xff) << pfn_bits) | (pfn & ((1UL << pfn_bits) - 1)); + *tag = ((rid >> 8) & 0xffff) | ((pfn >> pfn_bits) << 16); + + return (thash_data_t *)((vpta.base << PTA_BASE_SHIFT) + (index << 5)); +} + +thash_data_t *__vtr_lookup(VCPU *vcpu, u64 va, int type) +{ + + thash_data_t *trp; + int i; + u64 rid; + + rid = vcpu_get_rr(vcpu, va); + rid = rid & RR_RID_MASK; + if (type == D_TLB) { + if (vcpu_quick_region_check(vcpu->arch.dtr_regions, va)) { + for (trp = (thash_data_t *)&vcpu->arch.dtrs, i = 0; + i < NDTRS; i++, trp++) { + if (__is_tr_translated(trp, rid, va)) + return trp; + } + } + } else { + if (vcpu_quick_region_check(vcpu->arch.itr_regions,
Re: [kvm-devel] [01/17]PATCH Add API for allocating dynamic TR resouce. V8
Hi Xiantao, In general I think the code in this patch is fine. I have a couple of nit-picking comments:

+	if (target_mask&0x1) {

The formatting here isn't quite what most of the kernel does. It would be better if you added spaces so it's a little easier to read, ie:

	if (target_mask & 0x1) {

+	p = &__per_cpu_idtrs[cpu][0][0];
+	for (i = IA64_TR_ALLOC_BASE; i <= per_cpu(ia64_tr_used, cpu);
+		i++, p++) {
+		if (p->pte&0x1)

Same thing here.

+#define RR_TO_RID(rr) ((rr)<<32>>40)

I would prefer to have this one defined like this:

#define RR_TO_RID(rr) (((rr) >> 8) & 0xffffff)

It should generate the same code, but is more intuitive for the reader.

Otherwise I think this patch is fine - this is really just cosmetics.

Cheers, Jes - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] The automobile as a fixed asset (seminar)
The automobile in your organization — a one-day seminar / April 2, 2008. Program: 1. Purchasing a car, re-registration and registration — what an accountant needs to know. 2. Primary documents for vehicle accounting in an organization. Forming the inventory object "automobile". Forms OS-1, OS-4a and OS-6 for a vehicle. The car, spare tire set, alarm system and stereo — one object or several. 3. Valuation when purchased for payment, when received as a contribution to charter capital, when received free of charge, when acquired under a barter agreement. VAT deduction — difficult questions, changes of 2006-2007. 4. Receiving a vehicle for use without transfer of ownership — tax advantages. 5. The automobile as a fixed asset — useful life, revaluation, depreciation, modernization and additional equipment. What's new in vehicle accounting. The 0.5 depreciation coefficient for cars costing over 300,000 rubles and passenger minibuses over 400,000 rubles — it cannot be ignored, but it can be neutralized. 6. Renting a car — paperwork and taxation for the lessor and the lessee. Specifics of renting from individuals and from other organizations. Renting a car with a driver. 7. Leasing of motor vehicles. Specifics of buying out leased cars. Property tax and transport tax on the leased object. 8. Expenses for alarms, spare parts and car repair. Capital and operational repair and maintenance. Vehicle inspections. Tire accounting. Spare parts — write-off specifics. Accounting for parking expenses — new in 2007. 9. Fuel and lubricant expenses. Consumption norms — myth or reality. Justifying fuel expenses with in-house waybill forms. VAT when buying fuel for cash. 10. Specifics of driver pay. Traveling nature of the work. Allowances for traveling work, overtime. Drivers' working and rest time. Optimal driver work schedules. 11. Insurance of automobiles — KASKO, OSAGO, green card. Writing off insurance premiums as expenses. 12. Valuing a car on write-off. Primary documentation. Duration of training: from 10:00 to 17:00 (with a lunch break and a coffee break). Training location: Moscow, 5 minutes' walk from the Akademicheskaya metro station. Training cost: 4,900 rubles (including VAT). (The price includes handouts, a coffee break, and lunch at a restaurant.) If you cannot attend the seminar, we offer its video version on DVD/CD or videotape (the author's handouts are included). The video course costs 3,500 rubles including VAT. To register for the seminar, send us by fax or e-mail: your organization's details, the seminar topic and date, the participants' full names, and a contact phone and fax number. To order the video course, send us by fax or e-mail: your organization's details, the video course topic, the media type (DVD or CD discs), phone, fax, contact person, and exact delivery address. For more information and to register: tel/fax (495) 543-88-46, e-mail: [EMAIL PROTECTED]
Re: [kvm-devel] [PATCH] KVM: MMU: Fix rmap_remove() race
On Mon, Mar 31, 2008 at 09:35:00AM +0300, Avi Kivity wrote: This can be done by taking mmu_lock in _begin and releasing it in _end, unless there's a lock dependency issue. The main problem is if we want to be able to co-exist with XPMEM methods registered in the same notifier chain for the same MM as the KVM methods. The ideal would be to solve the race with a non-blocking lock like a seqlock. I don't understand your conclusion: you prove that mlock() is not good enough, then post a patch to do it? mlock isn't good enough to allow munmap/madvise(MADV_DONTNEED). So mlock fixes the race in the current kvm code, but only as long as you don't use ballooning. This is because VM_LOCKED should be ignored by madvise(MADV_DONTNEED). But at least this is only a problem for SMP guests: it would require rmap_remove to run on a different physical cpu while another qemu thread runs madvise. So supposedly with a UP guest, the guest won't run rmap_remove while madvise runs. To better explain the race: if we could take the mmu_lock around madvise, that would fix it for SMP guests too (however currently it's userland calling into madvise, so that's not feasible with the current model). I'll take another shot at fixing rmap_remove(), I don't like to cripple swapping for 2.6.25 (though it will only be really dependable in .26). Ok! Clearly it would look more robust if rmap_remove is capable of doing the last free on the page and doesn't rely on the page not being freed until mmu_lock is released.
[kvm-devel] KVM Test result, kernel 2ac530f.., userspace 5ef6cd8..
Hi All, This is today's KVM test result against kvm.git 2ac530fcd40ad730d4cc7961768e412eaf9b7caa and kvm-userspace.git 5ef6cd806d5c8dd3ad2d1dfe01752103702fc34f.

Issue list:
1. Booting four guests likely fails
   https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1919354&group_id=180599
2. Booting SMP Windows guests has a 30% chance of hanging
   https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1910923&group_id=180599

Test environment:
Platform:    Woodcrest
CPU:         4
Memory size: 8G

Details

IA32-pae:
1. boot guest with 256M memory                            PASS
2. boot two windows xp guests                             PASS
3. boot 4 same guests in parallel                         PASS
4. boot linux and windows guests in parallel              PASS
5. boot guest with 1500M memory                           PASS
6. boot windows 2003 with ACPI enabled                    PASS
7. boot Windows xp with ACPI enabled                      PASS
8. boot Windows 2000 without ACPI                         PASS
9. kernel build on SMP linux guest                        PASS
10. LTP on SMP linux guest                                PASS
11. boot base kernel linux                                PASS
12. save/restore 32-bit HVM guests                        PASS
13. live migration 32-bit HVM guests                      PASS
14. boot SMP Windows xp with ACPI enabled                 PASS
15. boot SMP Windows 2003 with ACPI enabled               PASS
16. boot SMP Windows 2000 with ACPI enabled               PASS

IA32e:
1. boot four 32-bit guests in parallel                    PASS
2. boot four 64-bit guests in parallel                    PASS
3. boot 4G 64-bit guest                                   PASS
4. boot 4G pae guest                                      PASS
5. boot 32-bit linux and 32-bit windows guest in parallel PASS
6. boot 32-bit guest with 1500M memory                    PASS
7. boot 64-bit guest with 1500M memory                    PASS
8. boot 32-bit guest with 256M memory                     PASS
9. boot 64-bit guest with 256M memory                     PASS
10. boot two 32-bit windows xp in parallel                PASS
11. boot four 32-bit different guests in parallel         PASS
12. save/restore 64-bit linux guests                      PASS
13. save/restore 32-bit linux guests                      PASS
14. boot 32-bit SMP windows 2003 with ACPI enabled        FAIL
15. boot 32-bit SMP Windows 2000 with ACPI enabled        FAIL
16. boot 32-bit SMP Windows xp with ACPI enabled          PASS
17. boot 32-bit Windows 2000 without ACPI                 PASS
18. boot 64-bit Windows xp with ACPI enabled              PASS
19. boot 32-bit Windows xp without ACPI                   PASS
20. boot 64-bit UP vista                                  PASS
21. boot 64-bit SMP vista                                 FAIL
22. kernel build in 32-bit linux guest OS                 PASS
23. kernel build in 64-bit linux guest OS                 PASS
24. LTP on SMP 32-bit linux guest OS                      PASS
25. LTP on SMP 64-bit linux guest OS                      PASS
26. boot 64-bit guests with ACPI enabled                  PASS
27. boot 32-bit x-server                                  PASS
28. boot 64-bit SMP windows XP with ACPI enabled          PASS
29. boot 64-bit SMP windows 2003 with ACPI enabled        PASS
30. live migration 64bit linux guests                     PASS
31. live migration 32bit linux guests                     PASS
32. reboot 32bit windows xp guest                         PASS
33. reboot 32bit windows xp guest                         PASS

Report Summary on IA32-pae

Summary Test Report of Last Session
=====================================================
                  Total   Pass   Fail  NoResult  Crash
=====================================================
control_panel         7      7      0         0      0
Restart               2      2      0         0      0
gtest                15     15      0         0      0
=====================================================
control_panel         7      7      0         0      0
  :KVM_LM_PAE_gPAE    1      1      0         0      0
Re: [kvm-devel] [kvm-ppc-devel] virtio-net working on PowerPC KVM
On Mon, 2008-03-31 at 10:17 +0200, Alexander Graf wrote: On Mar 31, 2008, at 9:43 AM, Avi Kivity wrote: Alexander Graf wrote: I am experiencing 7-8ms ping latencies (native 0.1ms) on x86_64 as well, when pinging the virtual machine. Maybe it's not related to PowerPC? Is it supposed to be that slow? If you have a really old host kernel, or a misconfigured one, it can happen. What's your host kernel? are you using hrtimers? This is a 2.6.22 SUSE kernel. As far as I can see CONFIG_HIGH_RES_TIMERS is not set. Guess it's my fault then - nevertheless a warning when starting kvm would be nice. You can also try with -clock=hpet or -clock=rtc Alex
Re: [kvm-devel] [02/17][PATCH] Implement smp_call_function_mask for ia64 - V8
Zhang, Xiantao wrote: From 697d50286088e98da5ac8653c80aaa96c81abf87 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Mon, 31 Mar 2008 09:50:24 +0800 Subject: [PATCH] KVM:IA64: Implement smp_call_function_mask for ia64 This function provides more flexible interface for smp infrastructure. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] Hi Xiantao, I'm a little wary of the performance impact of this change. Doing a cpumask compare on all smp_call_function calls seems a little expensive. Maybe it's just noise in the big picture compared to the actual cost of the IPIs, but I thought I'd bring it up. Keep in mind that a cpumask can be fairly big these days, max NR_CPUS is currently 4096. For those booting a kernel with NR_CPUS at 4096 on a dual CPU machine, it would be a bit expensive. Why not keep smp_call_function() the way it was before, rather than implementing it via the call to smp_call_function_mask()? Cheers, Jes
Re: [kvm-devel] [01/17]PATCH Add API for allocating dynamic TR resouce. V8
Jes Sorensen wrote: Hi Xiantao, In general I think the code in this patch is fine. I have a couple of nit-picking comments:

+	if (target_mask&0x1) {

The formatting here isn't quite what most of the kernel does. It would be better if you added spaces so it's a little easier to read, ie:

Good suggestion!

	if (target_mask & 0x1) {

+	p = &__per_cpu_idtrs[cpu][0][0];
+	for (i = IA64_TR_ALLOC_BASE; i <= per_cpu(ia64_tr_used, cpu);
+		i++, p++) {
+		if (p->pte&0x1)

Same thing here.

+#define RR_TO_RID(rr) ((rr)<<32>>40)

I would prefer to have this one defined like this:

#define RR_TO_RID(rr) (((rr) >> 8) & 0xffffff)

It should generate the same code, but is more intuitive for the reader.

Looks better :)

Otherwise I think this patch is fine - this is really just cosmetics.

Thank you! Xiantao

[Attachment: 0001-KVM-IA64-Add-API-for-allocating-Dynamic-TR-resource.patch]
[kvm-devel] (no subject)
Dear managers: Hello! Sincere wishes that in 2008 you have endless happiness, endless harvests, endless money, endless good fortune, and an endlessly fulfilling life! May your whole family be happy, healthy and safe! I am the person in charge of Shenzhen Shanhudao Import & Export Co., Ltd. We can provide export customs declaration forms, verification forms, and the whole series of related paperwork; we act as agents for export customs declaration, commodity inspection, domestic and overseas transport, and so on; we can also handle EU export licenses and EU certificates of origin; in addition, we have booths at the Guangzhou international trade fair (Canton Fair) available for transfer. If interested, please contact us by e-mail or phone. Tel: 0755-81153047. Fax: 0755-81172940. Mobile: 15817477278. Contact: Zhong Wenhui. Regards!
Re: [kvm-devel] [02/17][PATCH] Implement smp_call_function_mask for ia64 - V8
Jes Sorensen wrote: Zhang, Xiantao wrote: From 697d50286088e98da5ac8653c80aaa96c81abf87 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Mon, 31 Mar 2008 09:50:24 +0800 Subject: [PATCH] KVM:IA64: Implement smp_call_function_mask for ia64 This function provides more flexible interface for smp infrastructure. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED] Hi Xiantao, I'm a little wary of the performance impact of this change. Doing a cpumask compare on all smp_call_function calls seems a little expensive. Maybe it's just noise in the big picture compared to the actual cost of the IPIs, but I thought I'd bring it up. Keep in mind that a cpumask can be fairly big these days, max NR_CPUS is currently 4096. For those booting a kernel with NR_CPUS at 4096 on a dual CPU machine, it would be a bit expensive. Why not keep smp_call_function() the way it was before, rather than implementing it via the call to smp_call_function_mask()? Hi, Jes. I wasn't aware of the performance impact before. In the worst case it needs 64 comparisons? Maybe keeping the old smp_call_function is better? Xiantao
Re: [kvm-devel] [17/17][PATCH] kvm/ia64: How to boot up guests on kvm/ia64 -V8
Hi, Xiantao

+3. Get Guest Firmware named as Flash.fd, and put it in the right place:
+   (1) If you have the guest firmware (binary) released by Intel Corp for Xen, you can use it directly.
+   (2) If you want to build a guest firmware from source code, please download the source with
+       hg clone http://xenbits.xensource.com/ext/efi-vfirmware.hg
+       Use the Guide of the source to build the open Guest Firmware.

The Guide is not included in this README, so we may want to make a link to the Guide in the wiki. Or we may add a link to a binary release of the GFW.

Best Regards, Akio Takebe
Re: [kvm-devel] [17/17][PATCH] kvm/ia64: How to boot up guests on kvm/ia64 -V8
Akio Takebe wrote: Hi, Xiantao +3. Get Guest Firmware named as Flash.fd, and put it in the right place: + (1) If you have the guest firmware (binary) released by Intel Corp for Xen, you can use it directly. + (2) If you want to build a guest firmware from source code, please download the source with: hg clone http://xenbits.xensource.com/ext/efi-vfirmware.hg + Use the Guide of the source to build the open Guest Firmware. The Guide is not included in this README, so we may want to make a link to the Guide in the wiki. Or we may add a link to a binary release of the GFW. Good suggestion! That way users don't need to build the firmware by themselves. :) Xiantao
Re: [kvm-devel] [03/15][PATCH] kvm/ia64: Add header files for kvm/ia64. V8
Hi Xiantao, Some more nit-picking, though some of this is a bit more important to fixup. Cheers, Jes

+typedef struct thash_data {

Urgh! argh! Please avoid typedefs unless you really need them, see Chapter 5 of Documentation/CodingStyle for details.

diff --git a/include/asm-ia64/kvm_host.h b/include/asm-ia64/kvm_host.h
new file mode 100644
index 000..522bde0
--- /dev/null
+++ b/include/asm-ia64/kvm_host.h
@@ -0,0 +1,530 @@
+/* -*- Mode:C; c-basic-offset:4; tab-width:4; indent-tabs-mode:nil -*- */

The standard indentation for Linux is 8 characters using tabs. If possible it's preferred to comply with that to make the entire kernel tree easier for everybody to deal with. See CodingStyle for details.

+struct kvm_mmio_req {
+	uint64_t addr;	/* physical address */
+	uint64_t size;	/* size in bytes */
+	uint64_t data;	/* data (or paddr of data) */
+	uint8_t state:4;
+	uint8_t dir:1;	/* 1=read, 0=write */
+};
+typedef struct kvm_mmio_req mmio_req_t;

More typedefs

+/* Pal data struct */
+typedef struct pal_call {

and again.

+	/* In area */
+	uint64_t gr28;
+	uint64_t gr29;
+	uint64_t gr30;
+	uint64_t gr31;
+	/* Out area */
+	struct ia64_pal_retval ret;
+} pal_call_t;
+
+/* Sal data structure */
+typedef struct sal_call {

and again...

+	/* In area */
+	uint64_t in0;
+	uint64_t in1;
+	uint64_t in2;
+	uint64_t in3;
+	uint64_t in4;
+	uint64_t in5;
+	uint64_t in6;
+	uint64_t in7;
+	/* Out area */
+	struct sal_ret_values ret;
+} sal_call_t;
Re: [kvm-devel] [04/17] [PATCH] Add kvm arch-specific core code for kvm/ia64.-V8
Zhang, Xiantao wrote: From 62895ff991d48398a77afdbf7f2bef127e802230 Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Fri, 28 Mar 2008 09:49:57 +0800 Subject: [PATCH] KVM: IA64: Add kvm arch-specific core code for kvm/ia64. kvm_ia64.c is created to handle kvm ia64-specific core logic. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED]

More comments, a couple of bugs in this one.

+#include <linux/module.h>
+#include <linux/vmalloc.h>

Don't think you need vmalloc.h here.

+int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
+{
[snip]
+	copy_from_user(&vcpu->arch.guest, regs->saved_guest,
+			sizeof(union context));
+	copy_from_user(vcpu + 1, regs->saved_stack + sizeof(struct kvm_vcpu),
+			IA64_STK_OFFSET - sizeof(struct kvm_vcpu));

You need to check the return values from copy_from_user() here and deal with possible failure.

+	vcpu->arch.apic = kzalloc(sizeof(struct kvm_lapic), GFP_KERNEL);
+	vcpu->arch.apic->vcpu = vcpu;

Whoops! Missing NULL pointer check here after the kzalloc.

+	copy_to_user(regs->saved_guest, &vcpu->arch.guest,
+			sizeof(union context));
+	copy_to_user(regs->saved_stack, (void *)vcpu, IA64_STK_OFFSET);

Same problem as above - check the return values.

Cheers, Jes
[kvm-devel] kvm-64 hang
hi,
while kvm-62 works for us, kvm-64 hangs at a random position, and even when it starts we are not able to log in either from the net (can't connect) or from the console (through virt-manager's vnc I can give the root username, but after the password nothing happens). Attached is a screenshot of the boot hang of the 32bit system. There isn't any kind of log or error in any logfile.
our system:
- host:
  - Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
  - Intel S3000AHV
  - 8GB RAM
  - CentOS-5.1
  - kernel-2.6.18-53.1.14.el5 x86_64 64bit
- guest-1:
  - CentOS-5.1
  - kernel-2.6.18-53.1.14.el5 i386 32bit
- guest-2:
  - CentOS-5.1
  - kernel-2.6.18-53.1.14.el5 x86_64 64bit
- guest-3:
  - Mandrake-9
  - kernel-2.4.19.16mdk-1-1mdk 32bit
- guest-4:
  - Windows XP Professional 32bit
--
Levente "Si vis pacem para bellum!"
[attachment: Screenshot-devel-i386VirtualMachineConsole.png]
Re: [kvm-devel] [05/17][PATCH] kvm/ia64 : Add head files for kvm/ia64
Hi Xiantao, More comments.

Zhang, Xiantao wrote: From 696b9eea9f5001a7b7a07c0e58514aa10306b91a Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Fri, 28 Mar 2008 09:51:36 +0800 Subject: [PATCH] KVM: IA64: Add head files for kvm/ia64. ia64_regs: some definitions for special registers which aren't defined in asm-ia64/ia64regs.

Please put missing definitions of registers into asm-ia64/ia64regs.h if they are official definitions from the spec.

kvm_minstate.h: Macros for the min-save routines. lapic.h: apic structure definition. vcpu.h: routines related to vcpu virtualization. vti.h: Some macros or routines for VT support on Itanium. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED]

+/*
+ * Flushrs instruction stream.
+ */
+#define ia64_flushrs() asm volatile ("flushrs;;":::"memory")
+
+#define ia64_loadrs() asm volatile ("loadrs;;":::"memory")

Please put these into include/asm-ia64/gcc_intrin.h

+#define ia64_get_rsc()						\
+({								\
+	unsigned long val;					\
+	asm volatile ("mov %0=ar.rsc;;" : "=r"(val) :: "memory");	\
+	val;							\
+})
+
+#define ia64_set_rsc(val)					\
+	asm volatile ("mov ar.rsc=%0;;" :: "r"(val) : "memory")

Please update the ia64_get/set_reg macros to handle the RSC register and use those macros.

+#define ia64_get_bspstore()					\
+({								\
+	unsigned long val;					\
+	asm volatile ("mov %0=ar.bspstore;;" : "=r"(val) :: "memory");	\
+	val;							\
+})

Ditto for AR.BSPSTORE

+#define ia64_get_rnat()						\
+({								\
+	unsigned long val;					\
+	asm volatile ("mov %0=ar.rnat;" : "=r"(val) :: "memory");	\
+	val;							\
+})

Ditto for AR.RNAT

+static inline unsigned long ia64_get_itc(void)
+{
+	unsigned long result;
+	result = ia64_getreg(_IA64_REG_AR_ITC);
+	return result;
+}

This exists in include/asm-ia64/delay.h

+static inline void ia64_set_dcr(unsigned long dcr)
+{
+	ia64_setreg(_IA64_REG_CR_DCR, dcr);
+}

Please just call ia64_setreg() in your code rather than defining a wrapper for it.

+#define ia64_ttag(addr)							\
+({									\
+	__u64 ia64_intri_res;						\
+	asm volatile ("ttag %0=%1" : "=r"(ia64_intri_res) : "r" (addr));	\
+	ia64_intri_res;							\
+})

Please add to include/asm-ia64/gcc_intrin.h instead.

diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
new file mode 100644
index 000..152cbdc
--- /dev/null
+++ b/arch/ia64/kvm/lapic.h
@@ -0,0 +1,27 @@
+#ifndef __KVM_IA64_LAPIC_H
+#define __KVM_IA64_LAPIC_H
+
+#include "iodev.h"

I don't understand why iodev.h is included here?

--- /dev/null
+++ b/arch/ia64/kvm/vcpu.h

The formatting of this file is dodgy, please try and make it comply with the Linux standards in Documentation/CodingStyle

+#define _vmm_raw_spin_lock(x)		\
[snip]
+#define _vmm_raw_spin_unlock(x)		\

Could you explain the reasoning behind these two macros? Whenever I see open coded spin lock modifications like these, I have to admit I get a bit worried.

+typedef struct kvm_vcpu VCPU;
+typedef struct kvm_pt_regs REGS;
+typedef enum { DATA_REF, NA_REF, INST_REF, RSE_REF } vhpt_ref_t;
+typedef enum { INSTRUCTION, DATA, REGISTER } miss_type;

ARGH! Please see previous mail about typedefs! I suspect this is code inherited from Xen? Xen has a lot of really nasty and pointless typedefs like these :-(

+static inline void vcpu_set_dbr(VCPU *vcpu, u64 reg, u64 val)
+{
+	/* TODO: need to virtualize */
+	__ia64_set_dbr(reg, val);
+}
+
+static inline void vcpu_set_ibr(VCPU *vcpu, u64 reg, u64 val)
+{
+	/* TODO: need to virtualize */
+	ia64_set_ibr(reg, val);
+}
+
+static inline u64 vcpu_get_dbr(VCPU *vcpu, u64 reg)
+{
+	/* TODO: need to virtualize */
+	return ((u64)__ia64_get_dbr(reg));
+}
+
+static inline u64 vcpu_get_ibr(VCPU *vcpu, u64 reg)
+{
+	/* TODO: need to virtualize */
+	return ((u64)ia64_get_ibr(reg));
+}

More wrapper macros that really should just use ia64_get/set_reg() directly in the code.

diff --git a/arch/ia64/kvm/vti.h b/arch/ia64/kvm/vti.h
new file mode 100644
index 000..591ab22
[snip]
+/* -*- Mode:C; c-basic-offset:4; tab-width:4; indent-tabs-mode:nil -*- */

Evil formatting again!

Cheers, Jes
Re: [kvm-devel] kvm-64 hang
Farkas Levente wrote: hi, while kvm-62 works for us kmv-64 is hang at random position and even if it's start it's not able to login neither from net (can't connect) nor from console (through virt-manager's vnc i can give root username but after the password nothing happened). attached a screenshot of the boot hang of the 32bit system. there isn't any kind of log or error in any logfile.

Please try:
- kvm-63
- kvm-64 with the -no-kvm-pit command line option

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [01/17]PATCH Add API for allocating dynamic TR resouce. V8
Zhang, Xiantao wrote:

+/* mca_insert_tr
+ *
+ * Switch rid when TR reload and needed!
+ * iord: 1: itr, 2: dtr
+ *
+*/
+static void mca_insert_tr(u64 iord)
+{
+
+	int i;
+	u64 old_rr;
+	struct ia64_tr_entry *p;
+	unsigned long psr;
+	int cpu = smp_processor_id();

What if CONFIG_PREEMPT is on, and we're being preempted and scheduled to a different CPU here? Are we running preempt disabled here? If so, the function header should state that this function needs to be called preempt_disabled.

+/*
+ * ia64_insert_tr in virtual mode. Allocate a TR slot
+ *
+ * target_mask : 0x1 : itr, 0x2 : dtr, 0x3 : idtr
+ *
+ * va: virtual address.
+ * pte: pte entries inserted.
+ * log_size: range to be covered.
+ *
+ * Return value: <0 : error No.
+ *               >=0 : slot number allocated for TR.
+ */
+int ia64_itr_entry(u64 target_mask, u64 va, u64 pte, u64 log_size)
+{
+	int i, r;
+	unsigned long psr;
+	struct ia64_tr_entry *p;
+	int cpu = smp_processor_id();

Same here.

+/*
+ * ia64_purge_tr
+ *
+ * target_mask: 0x1: purge itr, 0x2: purge dtr, 0x3: purge idtr.
+ *
+ * slot: slot number to be freed.
+ */
+void ia64_ptr_entry(u64 target_mask, int slot)
+{
+	int cpu = smp_processor_id();
+	int i;
+	struct ia64_tr_entry *p;

Here again.
Re: [kvm-devel] [03/15][PATCH] kvm/ia64: Add header files for kvm/ia64. V8
Zhang, Xiantao wrote:

+typedef union context {
+	/* 8K size */
+	char	dummy[KVM_CONTEXT_SIZE];
+	struct {
+		unsigned long	psr;
+		unsigned long	pr;
+		unsigned long	caller_unat;
+		unsigned long	pad;
+		unsigned long	gr[32];
+		unsigned long	ar[128];
+		unsigned long	br[8];
+		unsigned long	cr[128];
+		unsigned long	rr[8];
+		unsigned long	ibr[8];
+		unsigned long	dbr[8];
+		unsigned long	pkr[8];
+		struct ia64_fpreg	fr[128];
+	};
+} context_t;

This looks ugly to me. I'd rather prefer to have a straight struct with elements psr...fr[], and cast the pointer to char* when needed. KVM_CONTEXT_SIZE can be used as the parameter to kzalloc() on allocation; it's too large to be on the stack anyway.

+typedef struct thash_data {
+	union {
+		struct {
+			unsigned long p    :  1; /* 0 */
+			unsigned long rv1  :  1; /* 1 */
+			unsigned long ma   :  3; /* 2-4 */
+			unsigned long a    :  1; /* 5 */
+			unsigned long d    :  1; /* 6 */
+			unsigned long pl   :  2; /* 7-8 */
+			unsigned long ar   :  3; /* 9-11 */
+			unsigned long ppn  : 38; /* 12-49 */
+			unsigned long rv2  :  2; /* 50-51 */
+			unsigned long ed   :  1; /* 52 */
+			unsigned long ig1  : 11; /* 53-63 */
+		};
+		struct {
+			unsigned long __rv1      : 53; /* 0-52 */
+			unsigned long contiguous :  1; /* 53 */
+			unsigned long tc         :  1; /* 54 TR or TC */
+			unsigned long cl         :  1; /* 55 I side or D side cache line */
+			unsigned long len        :  4; /* 56-59 */
+			unsigned long io         :  1; /* 60 entry is for io or not */
+			unsigned long nomap      :  1; /* 61 entry can't be inserted into machine TLB */
+			unsigned long checked    :  1; /* 62 for VTLB/VHPT sanity check */
+			unsigned long invalid    :  1; /* 63 invalid entry */
+		};
+		unsigned long page_flags;
+	};	/* same for VHPT and TLB */
+
+	union {
+		struct {
+			unsigned long rv3  :  2;
+			unsigned long ps   :  6;
+			unsigned long key  : 24;
+			unsigned long rv4  : 32;
+		};
+		unsigned long itir;
+	};
+	union {
+		struct {
+			unsigned long ig2  : 12;
+			unsigned long vpn  : 49;
+			unsigned long vrn  :  3;
+		};
+		unsigned long ifa;
+		unsigned long vadr;
+		struct {
+			unsigned long tag  : 63;
+			unsigned long ti   :  1;
+		};
+		unsigned long etag;
+	};
+	union {
+		struct thash_data *next;
+		unsigned long rid;
+		unsigned long gpaddr;
+	};
+} thash_data_t;

A matter of taste, but I'd prefer an unsigned long mask, and #define MASK_BIT_FOR_PURPOSE over bitfields. This structure could be much smaller that way.

+struct kvm_regs {
+	char *saved_guest;
+	char *saved_stack;
+	struct saved_vpd vpd;
+	/* Arch-regs */
+	int mp_state;
+	unsigned long vmm_rr;
+	/* TR and TC. */
+	struct thash_data itrs[NITRS];
+	struct thash_data dtrs[NDTRS];
+	/* Bit is set if there is a tr/tc for the region. */
+	unsigned char itr_regions;
+	unsigned char dtr_regions;
+	unsigned char tc_regions;
+
+	char irq_check;
+	unsigned long saved_itc;
+	unsigned long itc_check;
+	unsigned long timer_check;
+	unsigned long timer_pending;
+	unsigned long last_itc;
+
+	unsigned long vrr[8];
+	unsigned long ibr[8];
+	unsigned long dbr[8];
+	unsigned long insvc[4];	/* Interrupt in service. */
+	unsigned long xtp;
+
+	unsigned long metaphysical_rr0;	/* from kvm_arch (so is pinned) */
+	unsigned long metaphysical_rr4;	/* from kvm_arch (so is pinned) */
+	unsigned long metaphysical_saved_rr0;	/* from kvm_arch */
+	unsigned long metaphysical_saved_rr4;	/* from kvm_arch */
+	unsigned long fp_psr;	/* used for lazy float register */
+	unsigned long saved_gp;
+	/* for physical emulation */
+};

This looks like it doesn't just have guest register content in it. It seems to me preferable to have another ioctl
[kvm-devel] [PATCH 07/40] KVM: x86 emulator: add group 7 decoding
This adds group decoding for opcode 0x0f 0x01 (group 7). Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c |9 +++-- 1 files changed, 7 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 7310368..ef6de16 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -70,7 +70,7 @@ #define GroupMask 0xff/* Group number stored in bits 0:7 */ enum { - Group1A, Group3_Byte, Group3, Group4, Group5, + Group1A, Group3_Byte, Group3, Group4, Group5, Group7, }; static u16 opcode_table[256] = { @@ -179,7 +179,7 @@ static u16 opcode_table[256] = { static u16 twobyte_table[256] = { /* 0x00 - 0x0F */ - 0, SrcMem | ModRM | DstReg, 0, 0, 0, 0, ImplicitOps, 0, + 0, Group | GroupDual | Group7, 0, 0, 0, 0, ImplicitOps, 0, ImplicitOps, ImplicitOps, 0, 0, 0, ImplicitOps | ModRM, 0, 0, /* 0x10 - 0x1F */ 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | ModRM, 0, 0, 0, 0, 0, 0, 0, @@ -252,9 +252,14 @@ static u16 group_table[] = { [Group5*8] = DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM, 0, 0, SrcMem | ModRM, 0, SrcMem | ModRM | Stack, 0, + [Group7*8] = + 0, 0, ModRM | SrcMem, ModRM | SrcMem, + SrcNone | ModRM | DstMem, 0, SrcMem | ModRM, SrcMem | ModRM | ByteOp, }; static u16 group2_table[] = { + [Group7*8] = + SrcNone | ModRM, 0, 0, 0, SrcNone | ModRM | DstMem, 0, SrcMem | ModRM, 0, }; /* EFLAGS bit definitions. */ -- 1.5.4.5 - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH 06/40] KVM: x86 emulator: Group decoding for groups 4 and 5
Add group decoding support for opcode 0xfe (group 4) and 0xff (group 5). Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c | 40 ++-- 1 files changed, 10 insertions(+), 30 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 52e65ae..7310368 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -70,7 +70,7 @@ #define GroupMask 0xff/* Group number stored in bits 0:7 */ enum { - Group1A, Group3_Byte, Group3, + Group1A, Group3_Byte, Group3, Group4, Group5, }; static u16 opcode_table[256] = { @@ -174,7 +174,7 @@ static u16 opcode_table[256] = { ImplicitOps, ImplicitOps, Group | Group3_Byte, Group | Group3, /* 0xF8 - 0xFF */ ImplicitOps, 0, ImplicitOps, ImplicitOps, - 0, 0, ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM + 0, 0, Group | Group4, Group | Group5, }; static u16 twobyte_table[256] = { @@ -246,6 +246,12 @@ static u16 group_table[] = { DstMem | SrcImm | ModRM | SrcImm, 0, DstMem | SrcNone | ModRM, ByteOp | DstMem | SrcNone | ModRM, 0, 0, 0, 0, + [Group4*8] = + ByteOp | DstMem | SrcNone | ModRM, ByteOp | DstMem | SrcNone | ModRM, + 0, 0, 0, 0, 0, 0, + [Group5*8] = + DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM, 0, 0, + SrcMem | ModRM, 0, SrcMem | ModRM | Stack, 0, }; static u16 group2_table[] = { @@ -1097,7 +1103,6 @@ static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops) { struct decode_cache *c = ctxt-decode; - int rc; switch (c-modrm_reg) { case 0: /* inc */ @@ -1107,36 +1112,11 @@ static inline int emulate_grp45(struct x86_emulate_ctxt *ctxt, emulate_1op(dec, c-dst, ctxt-eflags); break; case 4: /* jmp abs */ - if (c-b == 0xff) - c-eip = c-dst.val; - else { - DPRINTF(Cannot emulate %02x\n, c-b); - return X86EMUL_UNHANDLEABLE; - } + c-eip = c-src.val; break; case 6: /* push */ - - /* 64-bit mode: PUSH always pushes a 64-bit operand. 
*/ - - if (ctxt-mode == X86EMUL_MODE_PROT64) { - c-dst.bytes = 8; - rc = ops-read_std((unsigned long)c-dst.ptr, - c-dst.val, 8, ctxt-vcpu); - if (rc != 0) - return rc; - } - register_address_increment(c-regs[VCPU_REGS_RSP], - -c-dst.bytes); - rc = ops-write_emulated(register_address(ctxt-ss_base, - c-regs[VCPU_REGS_RSP]), c-dst.val, - c-dst.bytes, ctxt-vcpu); - if (rc != 0) - return rc; - c-dst.type = OP_NONE; + emulate_push(ctxt); break; - default: - DPRINTF(Cannot emulate %02x\n, c-b); - return X86EMUL_UNHANDLEABLE; } return 0; } -- 1.5.4.5 - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH 05/40] KVM: x86 emulator: Group decoding for group 3
This adds group decoding support for opcodes 0xf6, 0xf7 (group 3). Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c | 34 ++ 1 files changed, 10 insertions(+), 24 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index cf1ce7c..52e65ae 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -70,7 +70,7 @@ #define GroupMask 0xff/* Group number stored in bits 0:7 */ enum { - Group1A, + Group1A, Group3_Byte, Group3, }; static u16 opcode_table[256] = { @@ -171,8 +171,7 @@ static u16 opcode_table[256] = { 0, 0, 0, 0, /* 0xF0 - 0xF7 */ 0, 0, 0, 0, - ImplicitOps, ImplicitOps, - ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM, + ImplicitOps, ImplicitOps, Group | Group3_Byte, Group | Group3, /* 0xF8 - 0xFF */ ImplicitOps, 0, ImplicitOps, ImplicitOps, 0, 0, ByteOp | DstMem | SrcNone | ModRM, DstMem | SrcNone | ModRM @@ -239,6 +238,14 @@ static u16 twobyte_table[256] = { static u16 group_table[] = { [Group1A*8] = DstMem | SrcNone | ModRM | Mov | Stack, 0, 0, 0, 0, 0, 0, 0, + [Group3_Byte*8] = + ByteOp | SrcImm | DstMem | ModRM, 0, + ByteOp | DstMem | SrcNone | ModRM, ByteOp | DstMem | SrcNone | ModRM, + 0, 0, 0, 0, + [Group3*8] = + DstMem | SrcImm | ModRM | SrcImm, 0, + DstMem | SrcNone | ModRM, ByteOp | DstMem | SrcNone | ModRM, + 0, 0, 0, 0, }; static u16 group2_table[] = { @@ -1070,26 +1077,6 @@ static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt, switch (c-modrm_reg) { case 0 ... 1: /* test */ - /* -* Special case in Grp3: test has an immediate -* source operand. -*/ - c-src.type = OP_IMM; - c-src.ptr = (unsigned long *)c-eip; - c-src.bytes = (c-d ByteOp) ? 
1 : c-op_bytes; - if (c-src.bytes == 8) - c-src.bytes = 4; - switch (c-src.bytes) { - case 1: - c-src.val = insn_fetch(s8, 1, c-eip); - break; - case 2: - c-src.val = insn_fetch(s16, 2, c-eip); - break; - case 4: - c-src.val = insn_fetch(s32, 4, c-eip); - break; - } emulate_2op_SrcV(test, c-src, c-dst, ctxt-eflags); break; case 2: /* not */ @@ -1103,7 +1090,6 @@ static inline int emulate_grp3(struct x86_emulate_ctxt *ctxt, rc = X86EMUL_UNHANDLEABLE; break; } -done: return rc; } -- 1.5.4.5 - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH 13/40] KVM: VMX: Enable Virtual Processor Identification (VPID)
From: Sheng Yang [EMAIL PROTECTED] To allow TLB entries to be retained across VM entry and VM exit, the VMM can now identify distinct address spaces through a new virtual-processor ID (VPID) field of the VMCS. [avi: drop vpid_sync_all()] [avi: add cc to asm constraints] Signed-off-by: Sheng Yang [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/vmx.c | 80 +--- arch/x86/kvm/vmx.h |6 +++ include/asm-x86/kvm_host.h |1 + 3 files changed, 82 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 8e14628..1157e8a 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -37,6 +37,9 @@ MODULE_LICENSE(GPL); static int bypass_guest_pf = 1; module_param(bypass_guest_pf, bool, 0); +static int enable_vpid = 1; +module_param(enable_vpid, bool, 0); + struct vmcs { u32 revision_id; u32 abort; @@ -71,6 +74,7 @@ struct vcpu_vmx { unsigned rip; } irq; } rmode; + int vpid; }; static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu) @@ -86,6 +90,9 @@ static DEFINE_PER_CPU(struct vmcs *, current_vmcs); static struct page *vmx_io_bitmap_a; static struct page *vmx_io_bitmap_b; +static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS); +static DEFINE_SPINLOCK(vmx_vpid_lock); + static struct vmcs_config { int size; int order; @@ -204,6 +211,12 @@ static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm) (irqchip_in_kernel(kvm))); } +static inline int cpu_has_vmx_vpid(void) +{ + return (vmcs_config.cpu_based_2nd_exec_ctrl + SECONDARY_EXEC_ENABLE_VPID); +} + static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr) { int i; @@ -214,6 +227,20 @@ static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr) return -1; } +static inline void __invvpid(int ext, u16 vpid, gva_t gva) +{ +struct { + u64 vpid : 16; + u64 rsvd : 48; + u64 gva; +} operand = { vpid, 0, gva }; + +asm volatile (ASM_VMX_INVVPID + /* CF==1 or ZF==1 -- rc = -1 */ + ; ja 1f ; ud2 ; 1: + : : a(operand), c(ext) : cc, memory); +} + static struct 
kvm_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr) { int i; @@ -257,6 +284,14 @@ static void vcpu_clear(struct vcpu_vmx *vmx) vmx-launched = 0; } +static inline void vpid_sync_vcpu_all(struct vcpu_vmx *vmx) +{ + if (vmx-vpid == 0) + return; + + __invvpid(VMX_VPID_EXTENT_SINGLE_CONTEXT, vmx-vpid, 0); +} + static unsigned long vmcs_readl(unsigned long field) { unsigned long value; @@ -490,6 +525,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) if (vcpu-cpu != cpu) { vcpu_clear(vmx); kvm_migrate_apic_timer(vcpu); + vpid_sync_vcpu_all(vmx); } if (per_cpu(current_vmcs, cpu) != vmx-vmcs) { @@ -971,7 +1007,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf) if (_cpu_based_exec_control CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) { min = 0; opt = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | - SECONDARY_EXEC_WBINVD_EXITING; + SECONDARY_EXEC_WBINVD_EXITING | + SECONDARY_EXEC_ENABLE_VPID; if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS2, _cpu_based_2nd_exec_control) 0) return -EIO; @@ -1239,6 +1276,11 @@ static void exit_lmode(struct kvm_vcpu *vcpu) #endif +static void vmx_flush_tlb(struct kvm_vcpu *vcpu) +{ + vpid_sync_vcpu_all(to_vmx(vcpu)); +} + static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu) { vcpu-arch.cr4 = KVM_GUEST_CR4_MASK; @@ -1275,6 +1317,7 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3) { + vmx_flush_tlb(vcpu); vmcs_writel(GUEST_CR3, cr3); if (vcpu-arch.cr0 X86_CR0_PE) vmx_fpu_deactivate(vcpu); @@ -1494,6 +1537,22 @@ out: return r; } +static void allocate_vpid(struct vcpu_vmx *vmx) +{ + int vpid; + + vmx-vpid = 0; + if (!enable_vpid || !cpu_has_vmx_vpid()) + return; + spin_lock(vmx_vpid_lock); + vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS); + if (vpid VMX_NR_VPIDS) { + vmx-vpid = vpid; + __set_bit(vpid, vmx_vpid_bitmap); + } + spin_unlock(vmx_vpid_lock); +} + /* * Sets up the vmcs for emulated real mode. 
*/ @@ -1532,6 +1591,8 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx) if (!vm_need_virtualize_apic_accesses(vmx-vcpu.kvm)) exec_control =
[kvm-devel] [PATCH 09/40] KVM: Only x86 has pio
Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- virt/kvm/kvm_main.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 04595fe..121e65c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -678,8 +678,10 @@ static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf) if (vmf->pgoff == 0) page = virt_to_page(vcpu->run); +#ifdef CONFIG_X86 else if (vmf->pgoff == KVM_PIO_PAGE_OFFSET) page = virt_to_page(vcpu->arch.pio_data); +#endif else return VM_FAULT_SIGBUS; get_page(page); -- 1.5.4.5
[kvm-devel] [PATCH 11/40] KVM: MMU: Decouple mmio from shadow page tables
Currently an mmio guest pte is encoded in the shadow pagetable as a not-present trapping pte, with the SHADOW_IO_MARK bit set. However nothing is ever done with this information, so maintaining it is a useless complication. This patch moves the check for mmio to before shadow ptes are instantiated, so the shadow code is never invoked for ptes that reference mmio. The code is simpler, and with future work, can be made to handle mmio concurrently. Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/mmu.c | 34 +++--- arch/x86/kvm/paging_tmpl.h | 17 - 2 files changed, 23 insertions(+), 28 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 6f8392d..6651dfa 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -101,8 +101,6 @@ static int dbg = 1; #define PT_FIRST_AVAIL_BITS_SHIFT 9 #define PT64_SECOND_AVAIL_BITS_SHIFT 52 -#define PT_SHADOW_IO_MARK (1ULL PT_FIRST_AVAIL_BITS_SHIFT) - #define VALID_PAGE(x) ((x) != INVALID_PAGE) #define PT64_LEVEL_BITS 9 @@ -200,7 +198,6 @@ static int is_present_pte(unsigned long pte) static int is_shadow_present_pte(u64 pte) { - pte = ~PT_SHADOW_IO_MARK; return pte != shadow_trap_nonpresent_pte pte != shadow_notrap_nonpresent_pte; } @@ -215,11 +212,6 @@ static int is_dirty_pte(unsigned long pte) return pte PT_DIRTY_MASK; } -static int is_io_pte(unsigned long pte) -{ - return pte PT_SHADOW_IO_MARK; -} - static int is_rmap_pte(u64 pte) { return is_shadow_present_pte(pte); @@ -538,7 +530,7 @@ static int is_empty_shadow_page(u64 *spt) u64 *end; for (pos = spt, end = pos + PAGE_SIZE / sizeof(u64); pos != end; pos++) - if ((*pos ~PT_SHADOW_IO_MARK) != shadow_trap_nonpresent_pte) { + if (*pos != shadow_trap_nonpresent_pte) { printk(KERN_ERR %s: %p %llx\n, __FUNCTION__, pos, *pos); return 0; @@ -926,13 +918,6 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte, if (pte_access ACC_USER_MASK) spte |= PT_USER_MASK; - if (is_error_page(page)) { - set_shadow_pte(shadow_pte, - shadow_trap_nonpresent_pte 
| PT_SHADOW_IO_MARK); - kvm_release_page_clean(page); - return; - } - spte |= page_to_phys(page); if ((pte_access ACC_WRITE_MASK) @@ -1002,7 +987,7 @@ static int __nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, if (level == 1) { mmu_set_spte(vcpu, table[index], ACC_ALL, ACC_ALL, 0, write, 1, pt_write, gfn, page); - return pt_write || is_io_pte(table[index]); + return pt_write; } if (table[index] == shadow_trap_nonpresent_pte) { @@ -1039,6 +1024,13 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn) page = gfn_to_page(vcpu-kvm, gfn); up_read(current-mm-mmap_sem); + /* mmio */ + if (is_error_page(page)) { + kvm_release_page_clean(page); + up_read(vcpu-kvm-slots_lock); + return 1; + } + spin_lock(vcpu-kvm-mmu_lock); kvm_mmu_free_some_pages(vcpu); r = __nonpaging_map(vcpu, v, write, gfn, page); @@ -1406,10 +1398,14 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, return; gfn = (gpte PT64_BASE_ADDR_MASK) PAGE_SHIFT; - down_read(current-mm-mmap_sem); + down_read(vcpu-kvm-slots_lock); page = gfn_to_page(vcpu-kvm, gfn); - up_read(current-mm-mmap_sem); + up_read(vcpu-kvm-slots_lock); + if (is_error_page(page)) { + kvm_release_page_clean(page); + return; + } vcpu-arch.update_pte.gfn = gfn; vcpu-arch.update_pte.page = page; } diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index c2fd2b9..4b55f46 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -399,6 +399,14 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t addr, page = gfn_to_page(vcpu-kvm, walker.gfn); up_read(current-mm-mmap_sem); + /* mmio */ + if (is_error_page(page)) { + pgprintk(gfn %x is mmio\n, walker.gfn); + kvm_release_page_clean(page); + up_read(vcpu-kvm-slots_lock); + return 1; + } + spin_lock(vcpu-kvm-mmu_lock); kvm_mmu_free_some_pages(vcpu); shadow_pte = FNAME(fetch)(vcpu, addr, walker, user_fault, write_fault, @@ -409,15 +417,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, 
gva_t addr, if (!write_pt) vcpu-arch.last_pt_write_count = 0; /* reset fork detector
[kvm-devel] [PATCH 04/40] KVM: x86 emulator: group decoding for group 1A
This adds group decode support for opcode 0x8f. Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c |8 +++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 46ecf34..cf1ce7c 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -69,6 +69,10 @@ #define GroupDual (1<<15) /* Alternate decoding of mod == 3 */ #define GroupMask 0xff /* Group number stored in bits 0:7 */ +enum { + Group1A, +}; + static u16 opcode_table[256] = { /* 0x00 - 0x07 */ ByteOp | DstMem | SrcReg | ModRM, DstMem | SrcReg | ModRM, @@ -133,7 +137,7 @@ static u16 opcode_table[256] = { /* 0x88 - 0x8F */ ByteOp | DstMem | SrcReg | ModRM | Mov, DstMem | SrcReg | ModRM | Mov, ByteOp | DstReg | SrcMem | ModRM | Mov, DstReg | SrcMem | ModRM | Mov, - 0, ModRM | DstReg, 0, DstMem | SrcNone | ModRM | Mov | Stack, + 0, ModRM | DstReg, 0, Group | Group1A, /* 0x90 - 0x9F */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ImplicitOps | Stack, ImplicitOps | Stack, 0, 0, @@ -233,6 +237,8 @@ static u16 twobyte_table[256] = { }; static u16 group_table[] = { + [Group1A*8] = + DstMem | SrcNone | ModRM | Mov | Stack, 0, 0, 0, 0, 0, 0, 0, }; static u16 group2_table[] = { -- 1.5.4.5
[kvm-devel] [PATCH 32/40] KVM: paravirtualized clocksource: host part
From: Glauber de Oliveira Costa [EMAIL PROTECTED] This is the host part of kvm clocksource implementation. As it does not include clockevents, it is a fairly simple implementation. We only have to register a per-vcpu area, and start writing to it periodically. The area is binary compatible with xen, as we use the same shadow_info structure. [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME] [avi: save full value of the msr, even if enable bit is clear] [avi: clear previous value of time_page] Signed-off-by: Glauber de Oliveira Costa [EMAIL PROTECTED] Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86.c | 113 +++- include/asm-x86/kvm_host.h |7 +++ include/asm-x86/kvm_para.h | 25 ++ include/linux/kvm.h|1 + 4 files changed, 145 insertions(+), 1 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0c910c7..256c0fc 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -19,6 +19,7 @@ #include irq.h #include mmu.h +#include linux/clocksource.h #include linux/kvm.h #include linux/fs.h #include linux/vmalloc.h @@ -424,7 +425,7 @@ static u32 msrs_to_save[] = { #ifdef CONFIG_X86_64 MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR, #endif - MSR_IA32_TIME_STAMP_COUNTER, + MSR_IA32_TIME_STAMP_COUNTER, MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK, }; static unsigned num_msrs_to_save; @@ -482,6 +483,70 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data) return kvm_set_msr(vcpu, index, *data); } +static void kvm_write_wall_clock(struct kvm *kvm, gpa_t wall_clock) +{ + static int version; + struct kvm_wall_clock wc; + struct timespec wc_ts; + + if (!wall_clock) + return; + + version++; + + down_read(kvm-slots_lock); + kvm_write_guest(kvm, wall_clock, version, sizeof(version)); + + wc_ts = current_kernel_time(); + wc.wc_sec = wc_ts.tv_sec; + wc.wc_nsec = wc_ts.tv_nsec; + wc.wc_version = version; + + kvm_write_guest(kvm, wall_clock, wc, sizeof(wc)); + + version++; + 
kvm_write_guest(kvm, wall_clock, version, sizeof(version)); + up_read(kvm-slots_lock); +} + +static void kvm_write_guest_time(struct kvm_vcpu *v) +{ + struct timespec ts; + unsigned long flags; + struct kvm_vcpu_arch *vcpu = v-arch; + void *shared_kaddr; + + if ((!vcpu-time_page)) + return; + + /* Keep irq disabled to prevent changes to the clock */ + local_irq_save(flags); + kvm_get_msr(v, MSR_IA32_TIME_STAMP_COUNTER, + vcpu-hv_clock.tsc_timestamp); + ktime_get_ts(ts); + local_irq_restore(flags); + + /* With all the info we got, fill in the values */ + + vcpu-hv_clock.system_time = ts.tv_nsec + +(NSEC_PER_SEC * (u64)ts.tv_sec); + /* +* The interface expects us to write an even number signaling that the +* update is finished. Since the guest won't see the intermediate +* state, we just write 2 at the end +*/ + vcpu-hv_clock.version = 2; + + shared_kaddr = kmap_atomic(vcpu-time_page, KM_USER0); + + memcpy(shared_kaddr + vcpu-time_offset, vcpu-hv_clock, + sizeof(vcpu-hv_clock)); + + kunmap_atomic(shared_kaddr, KM_USER0); + + mark_page_dirty(v-kvm, vcpu-time PAGE_SHIFT); +} + int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data) { @@ -511,6 +576,44 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data) case MSR_IA32_MISC_ENABLE: vcpu-arch.ia32_misc_enable_msr = data; break; + case MSR_KVM_WALL_CLOCK: + vcpu-kvm-arch.wall_clock = data; + kvm_write_wall_clock(vcpu-kvm, data); + break; + case MSR_KVM_SYSTEM_TIME: { + if (vcpu-arch.time_page) { + kvm_release_page_dirty(vcpu-arch.time_page); + vcpu-arch.time_page = NULL; + } + + vcpu-arch.time = data; + + /* we verify if the enable bit is set... 
*/ + if (!(data 1)) + break; + + /* ...but clean it before doing the actual write */ + vcpu-arch.time_offset = data ~(PAGE_MASK | 1); + + vcpu-arch.hv_clock.tsc_to_system_mul = + clocksource_khz2mult(tsc_khz, 22); + vcpu-arch.hv_clock.tsc_shift = 22; + + down_read(current-mm-mmap_sem); + down_read(vcpu-kvm-slots_lock); + vcpu-arch.time_page = + gfn_to_page(vcpu-kvm, data PAGE_SHIFT); + up_read(vcpu-kvm-slots_lock); + up_read(current-mm-mmap_sem); + + if
[kvm-devel] [PATCH 24/40] KVM: MMU: make the __nonpaging_map function generic
From: Joerg Roedel [EMAIL PROTECTED] The mapping function for the nonpaging case in the softmmu does basically the same as required for Nested Paging. Make this function generic so it can be used for both. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/mmu.c |7 +++ 1 files changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 21cfa28..33cd7c9 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -979,10 +979,9 @@ static void nonpaging_new_cr3(struct kvm_vcpu *vcpu) { } -static int __nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, - gfn_t gfn, struct page *page) +static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write, + gfn_t gfn, struct page *page, int level) { - int level = PT32E_ROOT_LEVEL; hpa_t table_addr = vcpu->arch.mmu.root_hpa; int pt_write = 0; @@ -1042,7 +1041,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn) spin_lock(&vcpu->kvm->mmu_lock); kvm_mmu_free_some_pages(vcpu); - r = __nonpaging_map(vcpu, v, write, gfn, page); + r = __direct_map(vcpu, v, write, gfn, page, PT32E_ROOT_LEVEL); spin_unlock(&vcpu->kvm->mmu_lock); up_read(&vcpu->kvm->slots_lock); -- 1.5.4.5
[kvm-devel] [PATCH 34/40] KVM: x86 emulator: add ad_mask static inline
From: Harvey Harrison [EMAIL PROTECTED] Replaces open-coded mask calculation in macros. Signed-off-by: Harvey Harrison [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c | 11 --- 1 files changed, 8 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 22900f7..f6f6544 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -480,10 +480,15 @@ static u16 group2_table[] = { (_type)_x; \ }) +static inline unsigned long ad_mask(struct decode_cache *c) +{ + return (1UL << (c->ad_bytes << 3)) - 1; +} + /* Access/update address held in a register, based on addressing mode. */ #define address_mask(reg) \ ((c->ad_bytes == sizeof(unsigned long)) ? \ - (reg) : ((reg) & ((1UL << (c->ad_bytes << 3)) - 1))) + (reg) : ((reg) & ad_mask(c))) #define register_address(base, reg) \ ((base) + address_mask(reg)) #define register_address_increment(reg, inc)\ @@ -494,9 +499,9 @@ static u16 group2_table[] = { (reg) += _inc; \ else\ (reg) = ((reg) & \ -~((1UL << (c->ad_bytes << 3)) - 1)) | \ +~ad_mask(c)) | \ (((reg) + _inc) &\ -((1UL << (c->ad_bytes << 3)) - 1));\ +ad_mask(c)); \ } while (0) #define JMP_REL(rel) \ -- 1.5.4.5
[kvm-devel] [PATCH 35/40] KVM: x86 emulator: make register_address, address_mask static inlines
From: Harvey Harrison [EMAIL PROTECTED] Signed-off-by: Harvey Harrison [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c | 48 ++- 1 files changed, 29 insertions(+), 19 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index f6f6544..008db4d 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -486,11 +486,21 @@ static inline unsigned long ad_mask(struct decode_cache *c) } /* Access/update address held in a register, based on addressing mode. */ -#define address_mask(reg) \ - ((c-ad_bytes == sizeof(unsigned long)) ? \ - (reg) : ((reg) ad_mask(c))) -#define register_address(base, reg) \ - ((base) + address_mask(reg)) +static inline unsigned long +address_mask(struct decode_cache *c, unsigned long reg) +{ + if (c-ad_bytes == sizeof(unsigned long)) + return reg; + else + return reg ad_mask(c); +} + +static inline unsigned long +register_address(struct decode_cache *c, unsigned long base, unsigned long reg) +{ + return base + address_mask(c, reg); +} + #define register_address_increment(reg, inc)\ do {\ /* signed type ensures sign extension to long */\ @@ -1056,7 +1066,7 @@ static inline void emulate_push(struct x86_emulate_ctxt *ctxt) c-dst.bytes = c-op_bytes; c-dst.val = c-src.val; register_address_increment(c-regs[VCPU_REGS_RSP], -c-op_bytes); - c-dst.ptr = (void *) register_address(ctxt-ss_base, + c-dst.ptr = (void *) register_address(c, ctxt-ss_base, c-regs[VCPU_REGS_RSP]); } @@ -1066,7 +1076,7 @@ static inline int emulate_grp1a(struct x86_emulate_ctxt *ctxt, struct decode_cache *c = ctxt-decode; int rc; - rc = ops-read_std(register_address(ctxt-ss_base, + rc = ops-read_std(register_address(c, ctxt-ss_base, c-regs[VCPU_REGS_RSP]), c-dst.val, c-dst.bytes, ctxt-vcpu); if (rc != 0) @@ -1388,11 +1398,11 @@ special_insn: register_address_increment(c-regs[VCPU_REGS_RSP], -c-op_bytes); c-dst.ptr = (void *) register_address( - ctxt-ss_base, c-regs[VCPU_REGS_RSP]); + c, 
ctxt-ss_base, c-regs[VCPU_REGS_RSP]); break; case 0x58 ... 0x5f: /* pop reg */ pop_instruction: - if ((rc = ops-read_std(register_address(ctxt-ss_base, + if ((rc = ops-read_std(register_address(c, ctxt-ss_base, c-regs[VCPU_REGS_RSP]), c-dst.ptr, c-op_bytes, ctxt-vcpu)) != 0) goto done; @@ -1417,9 +1427,9 @@ special_insn: 1, (c-d ByteOp) ? 1 : c-op_bytes, c-rep_prefix ? - address_mask(c-regs[VCPU_REGS_RCX]) : 1, + address_mask(c, c-regs[VCPU_REGS_RCX]) : 1, (ctxt-eflags EFLG_DF), - register_address(ctxt-es_base, + register_address(c, ctxt-es_base, c-regs[VCPU_REGS_RDI]), c-rep_prefix, c-regs[VCPU_REGS_RDX]) == 0) { @@ -1433,9 +1443,9 @@ special_insn: 0, (c-d ByteOp) ? 1 : c-op_bytes, c-rep_prefix ? - address_mask(c-regs[VCPU_REGS_RCX]) : 1, + address_mask(c, c-regs[VCPU_REGS_RCX]) : 1, (ctxt-eflags EFLG_DF), - register_address(c-override_base ? + register_address(c, c-override_base ? *c-override_base : ctxt-ds_base, c-regs[VCPU_REGS_RSI]), @@ -1525,10 +1535,10 @@ special_insn: case 0xa4 ... 0xa5: /* movs */ c-dst.type = OP_MEM; c-dst.bytes = (c-d ByteOp) ? 1 : c-op_bytes; - c-dst.ptr = (unsigned long *)register_address( + c-dst.ptr = (unsigned long *)register_address(c, ctxt-es_base, c-regs[VCPU_REGS_RDI]); - if ((rc =
[kvm-devel] [PATCH 37/40] KVM: Add API to retrieve the number of supported vcpus per vm
Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86.c |3 +++ include/linux/kvm.h |1 + 2 files changed, 4 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 256c0fc..955d2ee 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -811,6 +811,9 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_VAPIC: r = !kvm_x86_ops->cpu_has_accelerated_tpr(); break; + case KVM_CAP_NR_VCPUS: + r = KVM_MAX_VCPUS; + break; default: r = 0; break; diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 94540b3..deb9c38 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -234,6 +234,7 @@ struct kvm_vapic_addr { #define KVM_CAP_VAPIC 6 #define KVM_CAP_EXT_CPUID 7 #define KVM_CAP_CLOCKSOURCE 8 +#define KVM_CAP_NR_VCPUS 9 /* returns max vcpus per vm */ /* * ioctls for VM fds -- 1.5.4.5
[kvm-devel] [PATCH 36/40] KVM: x86 emulator: make register_address_increment and JMP_REL static inlines
From: Harvey Harrison [EMAIL PROTECTED]

Change jmp_rel() to a function as well.

Signed-off-by: Harvey Harrison [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86_emulate.c | 56 ++++++++++++++++++++++++++------------------------------
 1 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c
index 008db4d..cacdcf5 100644
--- a/arch/x86/kvm/x86_emulate.c
+++ b/arch/x86/kvm/x86_emulate.c
@@ -501,23 +501,19 @@ register_address(struct decode_cache *c, unsigned long base, unsigned long reg)
 	return base + address_mask(c, reg);
 }
 
-#define register_address_increment(reg, inc)				\
-	do {								\
-		/* signed type ensures sign extension to long */	\
-		int _inc = (inc);					\
-		if (c->ad_bytes == sizeof(unsigned long))		\
-			(reg) += _inc;					\
-		else							\
-			(reg) = ((reg) & ~ad_mask(c)) |			\
-				(((reg) + _inc) & ad_mask(c));		\
-	} while (0)
+static inline void
+register_address_increment(struct decode_cache *c, unsigned long *reg, int inc)
+{
+	if (c->ad_bytes == sizeof(unsigned long))
+		*reg += inc;
+	else
+		*reg = (*reg & ~ad_mask(c)) | ((*reg + inc) & ad_mask(c));
+}
 
-#define JMP_REL(rel)							\
-	do {								\
-		register_address_increment(c->eip, rel);		\
-	} while (0)
+static inline void jmp_rel(struct decode_cache *c, int rel)
+{
+	register_address_increment(c, &c->eip, rel);
+}
 
 static int do_fetch_insn_byte(struct x86_emulate_ctxt *ctxt,
 			      struct x86_emulate_ops *ops,
@@ -1065,7 +1061,7 @@ static inline void emulate_push(struct x86_emulate_ctxt *ctxt)
 	c->dst.type = OP_MEM;
 	c->dst.bytes = c->op_bytes;
 	c->dst.val = c->src.val;
-	register_address_increment(c->regs[VCPU_REGS_RSP], -c->op_bytes);
+	register_address_increment(c, &c->regs[VCPU_REGS_RSP], -c->op_bytes);
 	c->dst.ptr = (void *) register_address(c, ctxt->ss_base,
 					       c->regs[VCPU_REGS_RSP]);
 }
@@ -1082,7 +1078,7 @@ static inline int emulate_grp1a(struct x86_emulate_ctxt *ctxt,
 	if (rc != 0)
 		return rc;
 
-	register_address_increment(c->regs[VCPU_REGS_RSP], c->dst.bytes);
+	register_address_increment(c, &c->regs[VCPU_REGS_RSP], c->dst.bytes);
 	return 0;
 }
@@ -1395,7 +1391,7 @@ special_insn:
 		c->dst.type = OP_MEM;
 		c->dst.bytes = c->op_bytes;
 		c->dst.val = c->src.val;
-		register_address_increment(c->regs[VCPU_REGS_RSP],
+		register_address_increment(c, &c->regs[VCPU_REGS_RSP],
 					   -c->op_bytes);
 		c->dst.ptr = (void *) register_address(
 			c, ctxt->ss_base, c->regs[VCPU_REGS_RSP]);
@@ -1407,7 +1403,7 @@ special_insn:
 				       c->op_bytes, ctxt->vcpu)) != 0)
 			goto done;
-		register_address_increment(c->regs[VCPU_REGS_RSP],
+		register_address_increment(c, &c->regs[VCPU_REGS_RSP],
 					   c->op_bytes);
 		c->dst.type = OP_NONE;	/* Disable writeback. */
 		break;
@@ -1459,7 +1455,7 @@ special_insn:
 		int rel = insn_fetch(s8, 1, c->eip);
 
 		if (test_cc(c->b, ctxt->eflags))
-			JMP_REL(rel);
+			jmp_rel(c, rel);
 		break;
 	}
 	case 0x80 ... 0x83:	/* Grp1 */
@@ -1545,10 +1541,10 @@ special_insn:
 				       c->dst.val, c->dst.bytes, ctxt->vcpu)) != 0)
 			goto done;
-		register_address_increment(c->regs[VCPU_REGS_RSI],
+		register_address_increment(c, &c->regs[VCPU_REGS_RSI],
 					   (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
 								    : c->dst.bytes);
-		register_address_increment(c->regs[VCPU_REGS_RDI],
+		register_address_increment(c, &c->regs[VCPU_REGS_RDI],
 					   (ctxt->eflags & EFLG_DF) ? -c->dst.bytes
 								    : c->dst.bytes);
 		break;
@@ -1580,10 +1576,10 @@ special_insn:
 		emulate_2op_SrcV("cmp", c->src, c->dst,
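The converted helper keeps the macro's masking semantics: with a full-width effective address size it is a plain addition, otherwise the increment wraps inside the low `ad_bytes * 8` bits while the upper bits are preserved. A standalone sketch (the `decode_cache` struct is reduced to the one field the helpers use, and `ad_mask()` is reconstructed from the surrounding patch context, so treat both as illustrative):

```c
#include <assert.h>

/* Toy model of the emulator's decode cache; only ad_bytes matters here. */
struct decode_cache {
	int ad_bytes;		/* effective address size: 2, 4 or 8 bytes */
};

/* Mask covering the low ad_bytes*8 bits. The shift would be undefined for
 * ad_bytes == sizeof(unsigned long), but that case is guarded below. */
static unsigned long ad_mask(struct decode_cache *c)
{
	return (1UL << (c->ad_bytes << 3)) - 1;
}

static inline void
register_address_increment(struct decode_cache *c, unsigned long *reg, int inc)
{
	if (c->ad_bytes == sizeof(unsigned long))
		*reg += inc;	/* full width: ordinary addition */
	else			/* narrower: wrap within the address-size mask */
		*reg = (*reg & ~ad_mask(c)) | ((*reg + inc) & ad_mask(c));
}
```

With a 16-bit address size, incrementing past 0xFFFF wraps to 0x0000 without disturbing the upper register bits, exactly what real-mode stack pointer arithmetic requires.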
[kvm-devel] [PATCH 38/40] KVM: Increase vcpu count to 16
With NPT support, scalability is much improved.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 include/linux/kvm_host.h | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b90ca36..f4deb99 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -24,7 +24,7 @@
 #include <asm/kvm_host.h>
 
-#define KVM_MAX_VCPUS 4
+#define KVM_MAX_VCPUS 16
 #define KVM_MEMORY_SLOTS 8
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
--
1.5.4.5
[kvm-devel] [PATCH 25/40] KVM: export the load_pdptrs() function to modules
From: Joerg Roedel [EMAIL PROTECTED]

The load_pdptrs() function is required in the SVM module for NPT support.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86.c         | 1 +
 include/asm-x86/kvm_host.h | 2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 38edb2f..0c910c7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -213,6 +213,7 @@ out:
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(load_pdptrs);
 
 static bool pdptrs_changed(struct kvm_vcpu *vcpu)
 {
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 15fb2e8..da61255 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -410,6 +410,8 @@ void kvm_mmu_zap_all(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
 
+int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
+
 enum emulation_result {
 	EMULATE_DONE,       /* no further processing */
 	EMULATE_DO_MMIO,    /* kvm_run filled with mmio request */
--
1.5.4.5
[kvm-devel] [PATCH 33/40] x86: KVM guest: paravirtualized clocksource
From: Glauber de Oliveira Costa [EMAIL PROTECTED]

This is the guest part of the kvm clock implementation. It does not do tsc-only timing, as the tsc can have deltas between cpus, and it did not seem worthwhile to keep adjusting them. We do use it, however, for fine-grained adjustment. Other than that, time comes from the host.

[randy dunlap: add missing include]
[randy dunlap: disallow on Voyager or Visual WS]

Signed-off-by: Glauber de Oliveira Costa [EMAIL PROTECTED]
Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/Kconfig           |  11 +++
 arch/x86/kernel/Makefile   |   1 +
 arch/x86/kernel/kvmclock.c | 160 ++++
 arch/x86/kernel/setup_32.c |   5 ++
 arch/x86/kernel/setup_64.c |   5 ++
 5 files changed, 182 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kernel/kvmclock.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6c70fed..e59ea05 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -370,6 +370,17 @@ config VMI
 	  at the moment), by linking the kernel to a GPL-ed ROM module
 	  provided by the hypervisor.
 
+config KVM_CLOCK
+	bool "KVM paravirtualized clock"
+	select PARAVIRT
+	depends on !(X86_VISWS || X86_VOYAGER)
+	help
+	  Turning on this option will allow you to run a paravirtualized clock
+	  when running over the KVM hypervisor. Instead of relying on a PIT
+	  (or probably other) emulation by the underlying device model, the host
+	  provides the guest with timing infrastructure such as time of day, and
+	  system time
+
 source arch/x86/lguest/Kconfig
 
 config PARAVIRT
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 4eb5ce8..a3379a3 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -77,6 +77,7 @@ obj-$(CONFIG_DEBUG_RODATA_TEST) += test_rodata.o
 obj-$(CONFIG_DEBUG_NX_TEST)	+= test_nx.o
 
 obj-$(CONFIG_VMI)		+= vmi_32.o vmiclock_32.o
+obj-$(CONFIG_KVM_CLOCK)		+= kvmclock.o
 obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o
 
 ifdef CONFIG_INPUT_PCSPKR
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
new file mode 100644
index 000..b999f5e
--- /dev/null
+++ b/arch/x86/kernel/kvmclock.c
@@ -0,0 +1,160 @@
+/*  KVM paravirtual clock driver. A clocksource implementation
+    Copyright (C) 2008 Glauber de Oliveira Costa, Red Hat Inc.
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+*/
+
+#include <linux/clocksource.h>
+#include <linux/kvm_para.h>
+#include <asm/arch_hooks.h>
+#include <asm/msr.h>
+#include <asm/apic.h>
+#include <linux/percpu.h>
+
+#define KVM_SCALE 22
+
+static int kvmclock = 1;
+
+static int parse_no_kvmclock(char *arg)
+{
+	kvmclock = 0;
+	return 0;
+}
+early_param("no-kvmclock", parse_no_kvmclock);
+
+/* The hypervisor will put information about time periodically here */
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct kvm_vcpu_time_info, hv_clock);
+#define get_clock(cpu, field) per_cpu(hv_clock, cpu).field
+
+static inline u64 kvm_get_delta(u64 last_tsc)
+{
+	int cpu = smp_processor_id();
+	u64 delta = native_read_tsc() - last_tsc;
+	return (delta * get_clock(cpu, tsc_to_system_mul)) >> KVM_SCALE;
+}
+
+static struct kvm_wall_clock wall_clock;
+static cycle_t kvm_clock_read(void);
+
+/*
+ * The wallclock is the time of day when we booted. Since then, some time may
+ * have elapsed since the hypervisor wrote the data. So we try to account for
+ * that with system time
+ */
+unsigned long kvm_get_wallclock(void)
+{
+	u32 wc_sec, wc_nsec;
+	u64 delta;
+	struct timespec ts;
+	int version, nsec;
+	int low, high;
+
+	low = (int)__pa(&wall_clock);
+	high = ((u64)__pa(&wall_clock) >> 32);
+
+	delta = kvm_clock_read();
+
+	native_write_msr(MSR_KVM_WALL_CLOCK, low, high);
+	do {
+		version = wall_clock.wc_version;
+		rmb();
+		wc_sec = wall_clock.wc_sec;
+		wc_nsec = wall_clock.wc_nsec;
+		rmb();
+	} while ((wall_clock.wc_version != version) || (version & 1));
+
+	delta = kvm_clock_read() - delta;
+	delta += wc_nsec;
+	nsec = do_div(delta, NSEC_PER_SEC);
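The wallclock read above is a seqlock-style retry loop: the hypervisor bumps `wc_version` before and after updating the record (an odd version means an update is in flight), so the guest rereads until it observes an unchanged, even version. A minimal user-space model of that loop (the struct layout is simplified and the `rmb()` barriers are reduced to comments, so treat it as a sketch, not the ABI):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the hypervisor-shared wallclock record. */
struct wall_clock {
	uint32_t wc_version;	/* odd while the writer is mid-update */
	uint32_t wc_sec;
	uint32_t wc_nsec;
};

static void read_wallclock(volatile struct wall_clock *wc,
			   uint32_t *sec, uint32_t *nsec)
{
	uint32_t version;

	do {
		version = wc->wc_version;
		/* rmb() here on real hardware: order version read first */
		*sec  = wc->wc_sec;
		*nsec = wc->wc_nsec;
		/* rmb() again before rechecking the version */
	} while (wc->wc_version != version || (version & 1));
}
```

If the writer raced with the reader, either the version changed or it was odd, and the loop simply retries; no lock is shared between guest and hypervisor.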
[kvm-devel] [PATCH 40/40] KVM: Increase the number of user memory slots per vm
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 include/linux/kvm_host.h | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f4deb99..eb88d32 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -25,7 +25,7 @@
 #include <asm/kvm_host.h>
 
 #define KVM_MAX_VCPUS 16
-#define KVM_MEMORY_SLOTS 8
+#define KVM_MEMORY_SLOTS 32
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
--
1.5.4.5
[kvm-devel] [PATCH 39/40] KVM: Add API for determining the number of supported memory slots
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86.c  | 3 +++
 include/linux/kvm.h | 1 +
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 955d2ee..b7c32f6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -814,6 +814,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_NR_VCPUS:
 		r = KVM_MAX_VCPUS;
 		break;
+	case KVM_CAP_NR_MEMSLOTS:
+		r = KVM_MEMORY_SLOTS;
+		break;
 	default:
 		r = 0;
 		break;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index deb9c38..e92e703 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -235,6 +235,7 @@ struct kvm_vapic_addr {
 #define KVM_CAP_EXT_CPUID 7
 #define KVM_CAP_CLOCKSOURCE 8
 #define KVM_CAP_NR_VCPUS 9       /* returns max vcpus per vm */
+#define KVM_CAP_NR_MEMSLOTS 10   /* returns max memory slots per vm */
 
 /*
  * ioctls for VM fds
--
1.5.4.5
[kvm-devel] [PATCH 31/40] KVM: SVM: enable LBR virtualization
From: Joerg Roedel [EMAIL PROTECTED]

This patch implements the Last Branch Record Virtualization (LBRV) feature of the AMD Barcelona and Phenom processors into the kvm-amd module. It will only be enabled if the guest enables last branch recording in the DEBUG_CTL MSR. So there is no increased world switch overhead when the guest doesn't use these MSRs.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Markus Rechberger [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 39 +--
 1 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 281a2ff..7d73e93 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -47,6 +47,8 @@ MODULE_LICENSE("GPL");
 #define SVM_FEATURE_LBRV (1 << 1)
 #define SVM_DEATURE_SVML (1 << 2)
 
+#define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
+
 /* enable NPT for AMD64 and X86 with PAE */
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
 static bool npt_enabled = true;
@@ -387,6 +389,28 @@ static void svm_vcpu_init_msrpm(u32 *msrpm)
 	set_msr_interception(msrpm, MSR_IA32_SYSENTER_EIP, 1, 1);
 }
 
+static void svm_enable_lbrv(struct vcpu_svm *svm)
+{
+	u32 *msrpm = svm->msrpm;
+
+	svm->vmcb->control.lbr_ctl = 1;
+	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
+	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1);
+	set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
+	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
+}
+
+static void svm_disable_lbrv(struct vcpu_svm *svm)
+{
+	u32 *msrpm = svm->msrpm;
+
+	svm->vmcb->control.lbr_ctl = 0;
+	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
+	set_msr_interception(msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0);
+	set_msr_interception(msrpm, MSR_IA32_LASTINTFROMIP, 0, 0);
+	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 0, 0);
+}
+
 static __init int svm_hardware_setup(void)
 {
 	int cpu;
@@ -1231,8 +1255,19 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 data)
 		svm->vmcb->save.sysenter_esp = data;
 		break;
 	case MSR_IA32_DEBUGCTLMSR:
-		pr_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
-				__FUNCTION__, data);
+		if (!svm_has(SVM_FEATURE_LBRV)) {
+			pr_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTL 0x%llx, nop\n",
+					__FUNCTION__, data);
+			break;
+		}
+		if (data & DEBUGCTL_RESERVED_BITS)
+			return 1;
+
+		svm->vmcb->save.dbgctl = data;
+		if (data & (1ULL << 0))
+			svm_enable_lbrv(svm);
+		else
+			svm_disable_lbrv(svm);
 		break;
 	case MSR_K7_EVNTSEL0:
 	case MSR_K7_EVNTSEL1:
--
1.5.4.5
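The new DEBUGCTL write path reduces to a simple policy: reject writes that touch reserved bits, otherwise latch the value and toggle LBR tracking on bit 0. A standalone model of that policy (the reserved-bits mask is taken from the patch; the `lbr_enabled` flag stands in for the `svm_enable_lbrv()`/`svm_disable_lbrv()` calls):

```c
#include <assert.h>
#include <stdint.h>

/* From the patch: only the low 6 DEBUGCTL bits are architecturally defined. */
#define DEBUGCTL_RESERVED_BITS (~(0x3fULL))

/* Sketch of the MSR-write policy; returns nonzero when the write would be
 * rejected (the real code then injects #GP into the guest). */
static int write_debugctl(uint64_t data, int *lbr_enabled)
{
	if (data & DEBUGCTL_RESERVED_BITS)
		return 1;			/* reserved bit set: refuse */
	*lbr_enabled = !!(data & (1ULL << 0));	/* bit 0 = LBR enable */
	return 0;
}
```

This mirrors why the patch claims no overhead for guests that never write DEBUGCTL: LBR state in the VMCB only changes on an explicit, valid guest write.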
[kvm-devel] [PATCH 16/40] KVM: make EFER_RESERVED_BITS configurable for architecture code
From: Joerg Roedel [EMAIL PROTECTED]

This patch gives the SVM and VMX implementations the ability to add some bits the guest can set in its EFER register.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/x86.c         | 11 +--
 include/asm-x86/kvm_host.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b01552..ec9265b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -41,7 +41,7 @@
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
-#define EFER_RESERVED_BITS 0xfffffffffffff2fe
+static u64 __read_mostly efer_reserved_bits = 0xfffffffffffff2fe;
 
 #define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
 #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
@@ -428,7 +428,7 @@ static u32 emulated_msrs[] = {
 
 static void set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
-	if (efer & EFER_RESERVED_BITS) {
+	if (efer & efer_reserved_bits) {
 		printk(KERN_DEBUG "set_efer: 0x%llx #GP, reserved bits\n",
 		       efer);
 		kvm_inject_gp(vcpu, 0);
@@ -452,6 +452,13 @@ static void set_efer(struct kvm_vcpu *vcpu, u64 efer)
 
 #endif
 
+void kvm_enable_efer_bits(u64 mask)
+{
+	efer_reserved_bits &= ~mask;
+}
+EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
+
+
 /*
  * Writes msr value into the appropriate register.
  * Returns 0 on success, non-0 otherwise.
 */
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 67ae307..92668fa 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -429,6 +429,7 @@ void realmode_lmsw(struct kvm_vcpu *vcpu, unsigned long msw,
 unsigned long realmode_get_cr(struct kvm_vcpu *vcpu, int cr);
 void realmode_set_cr(struct kvm_vcpu *vcpu, int cr,
 		     unsigned long value, unsigned long *rflags);
+void kvm_enable_efer_bits(u64);
 
 int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *data);
 int kvm_set_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
--
1.5.4.5
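The mechanism is an allow-list expressed as its complement: a set bit in `efer_reserved_bits` means "the guest may not set this EFER bit", and `kvm_enable_efer_bits()` clears bits out of the mask. A self-contained sketch (the 64-bit default mask and the `EFER_NX` bit value follow the patch series; the helper names mirror the patch but the snippet is illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define EFER_SCE (1ULL << 0)	/* syscall enable */
#define EFER_NX  (1ULL << 11)	/* no-execute enable */

/* 64-bit default: SCE, LME and LMA allowed; everything else reserved. */
static uint64_t efer_reserved_bits = 0xfffffffffffffafeULL;

/* Vendor modules call this once they detect host support for a feature. */
static void kvm_enable_efer_bits(uint64_t mask)
{
	efer_reserved_bits &= ~mask;
}

/* A guest EFER write is legal only if it sets no reserved bit. */
static int efer_write_allowed(uint64_t efer)
{
	return !(efer & efer_reserved_bits);
}
```

So after `svm_hardware_setup()` or VMX's `hardware_setup()` calls `kvm_enable_efer_bits(EFER_NX)` on NX-capable hosts, a guest setting EFER.NX no longer takes a #GP.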
[kvm-devel] [PATCH 21/40] KVM: SVM: add detection of Nested Paging feature
From: Joerg Roedel [EMAIL PROTECTED]

Let SVM detect if the Nested Paging feature is available on the hardware. Disable it to keep this patch series bisectable.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 5f527dc..c12a759 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -47,6 +47,8 @@ MODULE_LICENSE("GPL");
 #define SVM_FEATURE_LBRV (1 << 1)
 #define SVM_DEATURE_SVML (1 << 2)
 
+static bool npt_enabled = false;
+
 static void kvm_reput_irq(struct vcpu_svm *svm);
 
 static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
@@ -413,6 +415,12 @@ static __init int svm_hardware_setup(void)
 
 	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
+	if (!svm_has(SVM_FEATURE_NPT))
+		npt_enabled = false;
+
+	if (npt_enabled)
+		printk(KERN_INFO "kvm: Nested Paging enabled\n");
+
 	return 0;
 
 err_2:
--
1.5.4.5
[kvm-devel] [PATCH 17/40] KVM: align valid EFER bits with the features of the host system
From: Joerg Roedel [EMAIL PROTECTED]

This patch aligns the bits the guest can set in the EFER register with the features of the host processor. Currently it leaves EFER.NX disabled if the processor does not support it, and enables EFER.LME and EFER.LMA only for KVM on 64 bit hosts.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |  3 +++
 arch/x86/kvm/vmx.c |  4 ++++
 arch/x86/kvm/x86.c | 10 +-
 3 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1a582f1..ff3bc74 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -403,6 +403,9 @@ static __init int svm_hardware_setup(void)
 	set_msr_interception(msrpm_va, MSR_IA32_SYSENTER_ESP, 1, 1);
 	set_msr_interception(msrpm_va, MSR_IA32_SYSENTER_EIP, 1, 1);
 
+	if (boot_cpu_has(X86_FEATURE_NX))
+		kvm_enable_efer_bits(EFER_NX);
+
 	for_each_online_cpu(cpu) {
 		r = svm_cpu_init(cpu);
 		if (r)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1157e8a..a509910 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1117,6 +1117,10 @@ static __init int hardware_setup(void)
 {
 	if (setup_vmcs_config(&vmcs_config) < 0)
 		return -EIO;
+
+	if (boot_cpu_has(X86_FEATURE_NX))
+		kvm_enable_efer_bits(EFER_NX);
+
 	return alloc_kvm_area();
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ec9265b..db16f23 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -41,7 +41,15 @@
 			  | X86_CR4_OSXMMEXCPT | X86_CR4_VMXE))
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
-static u64 __read_mostly efer_reserved_bits = 0xfffffffffffff2fe;
+/* EFER defaults:
+ * - enable syscall per default because its emulated by KVM
+ * - enable LME and LMA per default on 64 bit KVM
+ */
+#ifdef CONFIG_X86_64
+static u64 __read_mostly efer_reserved_bits = 0xfffffffffffffafeULL;
+#else
+static u64 __read_mostly efer_reserved_bits = 0xfffffffffffffffeULL;
+#endif
 
 #define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
 #define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
--
1.5.4.5
[kvm-devel] [PATCH 22/40] KVM: SVM: add module parameter to disable Nested Paging
From: Joerg Roedel [EMAIL PROTECTED]

To disable the use of the Nested Paging feature even if it is available in hardware, this patch adds a module parameter. Nested Paging can be disabled by passing npt=0 to the kvm_amd module.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c12a759..fb5d6c2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -48,6 +48,9 @@ MODULE_LICENSE("GPL");
 #define SVM_DEATURE_SVML (1 << 2)
 
 static bool npt_enabled = false;
+static int npt = 1;
+
+module_param(npt, int, S_IRUGO);
 
 static void kvm_reput_irq(struct vcpu_svm *svm);
 
@@ -418,6 +421,11 @@ static __init int svm_hardware_setup(void)
 	if (!svm_has(SVM_FEATURE_NPT))
 		npt_enabled = false;
 
+	if (npt_enabled && !npt) {
+		printk(KERN_INFO "kvm: Nested Paging disabled\n");
+		npt_enabled = false;
+	}
+
 	if (npt_enabled)
 		printk(KERN_INFO "kvm: Nested Paging enabled\n");
--
1.5.4.5
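Taken together with the detection patch (21/40) and the compile-time default added later in the series, the effective NPT decision is: compiled default, AND'ed with hardware support, AND'ed with the `npt=` parameter. A sketch of the combined logic (the `SVM_FEATURE_NPT` bit position is an assumption here, not quoted from these patches):

```c
#include <assert.h>
#include <stdbool.h>

#define SVM_FEATURE_NPT (1 << 0)	/* assumed bit position for illustration */

/* Mirrors the order of checks in svm_hardware_setup(): start from the
 * compiled-in default, drop NPT if CPUID doesn't report it, then honor
 * the npt= module parameter. */
static bool decide_npt(bool compiled_default, unsigned int svm_features,
		       int npt_param)
{
	bool npt_enabled = compiled_default;

	if (!(svm_features & SVM_FEATURE_NPT))
		npt_enabled = false;		/* hardware lacks the feature */
	if (npt_enabled && !npt_param)
		npt_enabled = false;		/* administrator opted out */
	return npt_enabled;
}
```

Note the asymmetry the patch preserves: `npt=0` can only disable NPT; it cannot force NPT on when the hardware or the build configuration rules it out.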
[kvm-devel] [PATCH 23/40] KVM: export information about NPT to generic x86 code
From: Joerg Roedel [EMAIL PROTECTED]

The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Two Dimensional Paging (TDP) to avoid confusion with (future) TDP implementations of other vendors. This patch exports the availability of TDP to the generic x86 code.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c         | 15 +++
 arch/x86/kvm/svm.c         |  4 +++-
 include/asm-x86/kvm_host.h |  2 ++
 3 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6651dfa..21cfa28 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -32,6 +32,15 @@
 #include <asm/cmpxchg.h>
 #include <asm/io.h>
 
+/*
+ * When setting this variable to true it enables Two-Dimensional-Paging
+ * where the hardware walks 2 page tables:
+ * 1. the guest-virtual to guest-physical
+ * 2. while doing 1. it walks guest-physical to host-physical
+ * If the hardware supports that we don't need to do shadow paging.
+ */
+static bool tdp_enabled = false;
+
 #undef MMU_DEBUG
 #undef AUDIT
 
@@ -1582,6 +1591,12 @@ out:
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_page_fault);
 
+void kvm_enable_tdp(void)
+{
+	tdp_enabled = true;
+}
+EXPORT_SYMBOL_GPL(kvm_enable_tdp);
+
 static void free_mmu_pages(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_page *sp;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index fb5d6c2..9e29a13 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -426,8 +426,10 @@ static __init int svm_hardware_setup(void)
 		npt_enabled = false;
 	}
 
-	if (npt_enabled)
+	if (npt_enabled) {
 		printk(KERN_INFO "kvm: Nested Paging enabled\n");
+		kvm_enable_tdp();
+	}
 
 	return 0;
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 92668fa..15fb2e8 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -492,6 +492,8 @@ int kvm_fix_hypercall(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code);
 
+void kvm_enable_tdp(void);
+
 int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
 
 int complete_pio(struct kvm_vcpu *vcpu);
--
1.5.4.5
[kvm-devel] [PATCH 27/40] KVM: SVM: add support for Nested Paging
From: Joerg Roedel [EMAIL PROTECTED]

This patch contains the SVM architecture dependent changes for KVM to enable support for the Nested Paging feature of AMD Barcelona and Phenom processors.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 72 +++---
 1 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9e29a13..8e9d4a5 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -47,7 +47,12 @@ MODULE_LICENSE("GPL");
 #define SVM_FEATURE_LBRV (1 << 1)
 #define SVM_DEATURE_SVML (1 << 2)
 
+/* enable NPT for AMD64 and X86 with PAE */
+#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
+static bool npt_enabled = true;
+#else
 static bool npt_enabled = false;
+#endif
 
 static int npt = 1;
 module_param(npt, int, S_IRUGO);
@@ -187,7 +192,7 @@ static inline void flush_guest_tlb(struct kvm_vcpu *vcpu)
 
 static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
-	if (!(efer & EFER_LMA))
+	if (!npt_enabled && !(efer & EFER_LMA))
 		efer &= ~EFER_LME;
 
 	to_svm(vcpu)->vmcb->save.efer = efer | MSR_EFER_SVME_MASK;
@@ -573,6 +578,22 @@ static void init_vmcb(struct vmcb *vmcb)
 	save->cr0 = 0x00000010 | X86_CR0_PG | X86_CR0_WP;
 	save->cr4 = X86_CR4_PAE;
 	/* rdx = ?? */
+
+	if (npt_enabled) {
+		/* Setup VMCB for Nested Paging */
+		control->nested_ctl = 1;
+		control->intercept_exceptions &= ~(1 << PF_VECTOR);
+		control->intercept_cr_read &= ~(INTERCEPT_CR0_MASK|
+						INTERCEPT_CR3_MASK);
+		control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK|
+						 INTERCEPT_CR3_MASK);
+		save->g_pat = 0x0007040600070406ULL;
+		/* enable caching because the QEMU Bios doesn't enable it */
+		save->cr0 = X86_CR0_ET;
+		save->cr3 = 0;
+		save->cr4 = 0;
+	}
+
 }
 
 static int svm_vcpu_reset(struct kvm_vcpu *vcpu)
@@ -807,6 +828,9 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 		}
 	}
 #endif
+	if (npt_enabled)
+		goto set;
+
 	if ((vcpu->arch.cr0 & X86_CR0_TS) && !(cr0 & X86_CR0_TS)) {
 		svm->vmcb->control.intercept_exceptions &= ~(1 << NM_VECTOR);
 		vcpu->fpu_active = 1;
@@ -814,18 +838,26 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 
 	vcpu->arch.cr0 = cr0;
 	cr0 |= X86_CR0_PG | X86_CR0_WP;
-	cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
 	if (!vcpu->fpu_active) {
 		svm->vmcb->control.intercept_exceptions |= (1 << NM_VECTOR);
 		cr0 |= X86_CR0_TS;
 	}
+set:
+	/*
+	 * re-enable caching here because the QEMU bios
+	 * does not do it - this results in some delay at
+	 * reboot
+	 */
+	cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
 	svm->vmcb->save.cr0 = cr0;
 }
 
 static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	vcpu->arch.cr4 = cr4;
-	to_svm(vcpu)->vmcb->save.cr4 = cr4 | X86_CR4_PAE;
+	if (!npt_enabled)
+		cr4 |= X86_CR4_PAE;
+	to_svm(vcpu)->vmcb->save.cr4 = cr4;
 }
 
 static void svm_set_segment(struct kvm_vcpu *vcpu,
@@ -1313,14 +1345,34 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm,
 	[SVM_EXIT_WBINVD]			= emulate_on_interception,
 	[SVM_EXIT_MONITOR]			= invalid_op_interception,
 	[SVM_EXIT_MWAIT]			= invalid_op_interception,
+	[SVM_EXIT_NPF]				= pf_interception,
 };
-
 static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	u32 exit_code = svm->vmcb->control.exit_code;
 
+	if (npt_enabled) {
+		int mmu_reload = 0;
+		if ((vcpu->arch.cr0 ^ svm->vmcb->save.cr0) & X86_CR0_PG) {
+			svm_set_cr0(vcpu, svm->vmcb->save.cr0);
+			mmu_reload = 1;
+		}
+		vcpu->arch.cr0 = svm->vmcb->save.cr0;
+		vcpu->arch.cr3 = svm->vmcb->save.cr3;
+		if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
+			if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
+				kvm_inject_gp(vcpu, 0);
+				return 1;
+			}
+		}
+		if (mmu_reload) {
+			kvm_mmu_reset_context(vcpu);
+			kvm_mmu_load(vcpu);
+		}
+	}
+
 	kvm_reput_irq(svm);
 
 	if (svm->vmcb->control.exit_code == SVM_EXIT_ERR) {
@@ -1331,7 +1383,8 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 	}
 
 	if
[kvm-devel] [PATCH 18/40] KVM: VMX: unifdef the EFER specific code
From: Joerg Roedel [EMAIL PROTECTED]

To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner.

[avi: add check for EFER-less hosts]

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.c | 10 ++
 1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a509910..76944f2 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1335,14 +1335,14 @@ static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 	vcpu->arch.cr4 = cr4;
 }
 
-#ifdef CONFIG_X86_64
-
 static void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct kvm_msr_entry *msr = find_msr_entry(vmx, MSR_EFER);
 
 	vcpu->arch.shadow_efer = efer;
+	if (!msr)
+		return;
 	if (efer & EFER_LMA) {
 		vmcs_write32(VM_ENTRY_CONTROLS,
 			     vmcs_read32(VM_ENTRY_CONTROLS) |
@@ -1359,8 +1359,6 @@ static void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 	setup_msrs(vmx);
 }
 
-#endif
-
 static u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg)
 {
 	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
@@ -1775,9 +1773,7 @@ static int vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 	vmx->vcpu.arch.cr0 = 0x60000010;
 	vmx_set_cr0(&vmx->vcpu, vmx->vcpu.arch.cr0); /* enter rmode */
 	vmx_set_cr4(&vmx->vcpu, 0);
-#ifdef CONFIG_X86_64
 	vmx_set_efer(&vmx->vcpu, 0);
-#endif
 	vmx_fpu_activate(&vmx->vcpu);
 	update_exception_bitmap(&vmx->vcpu);
 
@@ -2668,9 +2664,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.set_cr0 = vmx_set_cr0,
 	.set_cr3 = vmx_set_cr3,
 	.set_cr4 = vmx_set_cr4,
-#ifdef CONFIG_X86_64
 	.set_efer = vmx_set_efer,
-#endif
 	.get_idt = vmx_get_idt,
 	.set_idt = vmx_set_idt,
 	.get_gdt = vmx_get_gdt,
--
1.5.4.5
[kvm-devel] [PATCH 20/40] KVM: SVM: move feature detection to hardware setup code
From: Joerg Roedel [EMAIL PROTECTED]

By moving the SVM feature detection from the per-cpu code to the hardware setup code it runs only once. As an additional advantage the feature check is now available earlier in the module setup process.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ff3bc74..5f527dc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -302,7 +302,6 @@ static void svm_hardware_enable(void *garbage)
 	svm_data->asid_generation = 1;
 	svm_data->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	svm_data->next_asid = svm_data->max_asid + 1;
-	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
 	asm volatile ("sgdt %0" : "=m"(gdt_descr));
 	gdt = (struct desc_struct *)gdt_descr.address;
@@ -411,6 +410,9 @@ static __init int svm_hardware_setup(void)
 		if (r)
 			goto err_2;
 	}
+
+	svm_features = cpuid_edx(SVM_CPUID_FUNC);
+
 	return 0;
 
 err_2:
--
1.5.4.5
[kvm-devel] [PATCH 14/40] KVM: Use CONFIG_PREEMPT_NOTIFIERS around struct preempt_notifier
From: Hollis Blanchard [EMAIL PROTECTED]

This allows kvm_host.h to be #included even when struct preempt_notifier is undefined. This is needed to build ppc asm-offsets.h.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]
---
 include/linux/kvm_host.h | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 928b0d5..b90ca36 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -67,7 +67,9 @@ void kvm_io_bus_register_dev(struct kvm_io_bus *bus,
 
 struct kvm_vcpu {
 	struct kvm *kvm;
+#ifdef CONFIG_PREEMPT_NOTIFIERS
 	struct preempt_notifier preempt_notifier;
+#endif
 	int vcpu_id;
 	struct mutex mutex;
 	int cpu;
--
1.5.4.5
[kvm-devel] [PATCH 30/40] KVM: SVM: allocate the MSR permission map per VCPU
From: Joerg Roedel [EMAIL PROTECTED] This patch changes the kvm-amd module to allocate the SVM MSR permission map per VCPU instead of a global map for all VCPUs. With this we have more flexibility allowing specific guests to access virtualized MSRs. This is required for LBR virtualization. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] Signed-off-by: Markus Rechberger [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/kvm_svm.h |2 + arch/x86/kvm/svm.c | 67 +++- 2 files changed, 34 insertions(+), 35 deletions(-) diff --git a/arch/x86/kvm/kvm_svm.h b/arch/x86/kvm/kvm_svm.h index ecdfe97..65ef0fc 100644 --- a/arch/x86/kvm/kvm_svm.h +++ b/arch/x86/kvm/kvm_svm.h @@ -39,6 +39,8 @@ struct vcpu_svm { unsigned long host_db_regs[NUM_DB_REGS]; unsigned long host_dr6; unsigned long host_dr7; + + u32 *msrpm; }; #endif diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index d934819..281a2ff 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -65,7 +65,6 @@ static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu) } unsigned long iopm_base; -unsigned long msrpm_base; struct kvm_ldttss_desc { u16 limit0; @@ -370,12 +369,29 @@ static void set_msr_interception(u32 *msrpm, unsigned msr, BUG(); } +static void svm_vcpu_init_msrpm(u32 *msrpm) +{ + memset(msrpm, 0xff, PAGE_SIZE * (1 MSRPM_ALLOC_ORDER)); + +#ifdef CONFIG_X86_64 + set_msr_interception(msrpm, MSR_GS_BASE, 1, 1); + set_msr_interception(msrpm, MSR_FS_BASE, 1, 1); + set_msr_interception(msrpm, MSR_KERNEL_GS_BASE, 1, 1); + set_msr_interception(msrpm, MSR_LSTAR, 1, 1); + set_msr_interception(msrpm, MSR_CSTAR, 1, 1); + set_msr_interception(msrpm, MSR_SYSCALL_MASK, 1, 1); +#endif + set_msr_interception(msrpm, MSR_K6_STAR, 1, 1); + set_msr_interception(msrpm, MSR_IA32_SYSENTER_CS, 1, 1); + set_msr_interception(msrpm, MSR_IA32_SYSENTER_ESP, 1, 1); + set_msr_interception(msrpm, MSR_IA32_SYSENTER_EIP, 1, 1); +} + static __init int svm_hardware_setup(void) { int cpu; struct page *iopm_pages; - 
struct page *msrpm_pages; - void *iopm_va, *msrpm_va; + void *iopm_va; int r; iopm_pages = alloc_pages(GFP_KERNEL, IOPM_ALLOC_ORDER); @@ -388,37 +404,13 @@ static __init int svm_hardware_setup(void) clear_bit(0x80, iopm_va); /* allow direct access to PC debug port */ iopm_base = page_to_pfn(iopm_pages) PAGE_SHIFT; - - msrpm_pages = alloc_pages(GFP_KERNEL, MSRPM_ALLOC_ORDER); - - r = -ENOMEM; - if (!msrpm_pages) - goto err_1; - - msrpm_va = page_address(msrpm_pages); - memset(msrpm_va, 0xff, PAGE_SIZE * (1 MSRPM_ALLOC_ORDER)); - msrpm_base = page_to_pfn(msrpm_pages) PAGE_SHIFT; - -#ifdef CONFIG_X86_64 - set_msr_interception(msrpm_va, MSR_GS_BASE, 1, 1); - set_msr_interception(msrpm_va, MSR_FS_BASE, 1, 1); - set_msr_interception(msrpm_va, MSR_KERNEL_GS_BASE, 1, 1); - set_msr_interception(msrpm_va, MSR_LSTAR, 1, 1); - set_msr_interception(msrpm_va, MSR_CSTAR, 1, 1); - set_msr_interception(msrpm_va, MSR_SYSCALL_MASK, 1, 1); -#endif - set_msr_interception(msrpm_va, MSR_K6_STAR, 1, 1); - set_msr_interception(msrpm_va, MSR_IA32_SYSENTER_CS, 1, 1); - set_msr_interception(msrpm_va, MSR_IA32_SYSENTER_ESP, 1, 1); - set_msr_interception(msrpm_va, MSR_IA32_SYSENTER_EIP, 1, 1); - if (boot_cpu_has(X86_FEATURE_NX)) kvm_enable_efer_bits(EFER_NX); for_each_online_cpu(cpu) { r = svm_cpu_init(cpu); if (r) - goto err_2; + goto err; } svm_features = cpuid_edx(SVM_CPUID_FUNC); @@ -438,10 +430,7 @@ static __init int svm_hardware_setup(void) return 0; -err_2: - __free_pages(msrpm_pages, MSRPM_ALLOC_ORDER); - msrpm_base = 0; -err_1: +err: __free_pages(iopm_pages, IOPM_ALLOC_ORDER); iopm_base = 0; return r; @@ -449,9 +438,8 @@ err_1: static __exit void svm_hardware_unsetup(void) { - __free_pages(pfn_to_page(msrpm_base PAGE_SHIFT), MSRPM_ALLOC_ORDER); __free_pages(pfn_to_page(iopm_base PAGE_SHIFT), IOPM_ALLOC_ORDER); - iopm_base = msrpm_base = 0; + iopm_base = 0; } static void init_seg(struct vmcb_seg *seg) @@ -536,7 +524,7 @@ static void init_vmcb(struct vcpu_svm *svm) (1ULL 
INTERCEPT_MWAIT); control->iopm_base_pa = iopm_base; - control->msrpm_base_pa = msrpm_base; + control->msrpm_base_pa = __pa(svm->msrpm); control->tsc_offset = 0; control->int_ctl = V_INTR_MASKING_MASK; @@ -615,6 +603,7 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id) { struct vcpu_svm *svm; struct page *page; +
[kvm-devel] [PATCH 00/40] KVM updates for the 2.6.26 merge window (part I)
These are the first forty of about a hundred patches I have queued for 2.6.26. Note that a few will go through git-s390, maybe a couple through x86.git, and a few to 2.6.25-rc. The ia64 patches are not included as they are being reviewed, but I hope to have them merged in time for the 2.6.26 merge window. Happy reviewing! Diffstat for this batch: arch/x86/Kconfig | 11 ++ arch/x86/kernel/Makefile |1 + arch/x86/kernel/kvmclock.c | 160 arch/x86/kernel/setup_32.c |5 + arch/x86/kernel/setup_64.c |5 + arch/x86/kvm/kvm_svm.h |2 + arch/x86/kvm/mmu.c | 168 +++--- arch/x86/kvm/mmu.h |6 + arch/x86/kvm/paging_tmpl.h | 24 ++--- arch/x86/kvm/svm.c | 211 arch/x86/kvm/vmx.c | 100 +++--- arch/x86/kvm/vmx.h | 10 ++- arch/x86/kvm/x86.c | 147 +++-- arch/x86/kvm/x86_emulate.c | 252 --- include/asm-x86/kvm_host.h | 16 +++- include/asm-x86/kvm_para.h | 25 + include/linux/kvm.h|3 + include/linux/kvm_host.h |6 +- virt/kvm/kvm_main.c| 13 ++- 19 files changed, 919 insertions(+), 246 deletions(-)
[kvm-devel] [PATCH 02/40] KVM: MMU: Simplify hash table indexing
From: Dong, Eddie [EMAIL PROTECTED] Signed-off-by: Yaozu (Eddie) Dong [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/mmu.c | 10 +- include/asm-x86/kvm_host.h |3 ++- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 28f9a44..6f8392d 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -559,7 +559,7 @@ static void kvm_mmu_free_page(struct kvm *kvm, struct kvm_mmu_page *sp) static unsigned kvm_page_table_hashfn(gfn_t gfn) { - return gfn; + return gfn & ((1 << KVM_MMU_HASH_SHIFT) - 1); } static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, @@ -663,7 +663,7 @@ static struct kvm_mmu_page *kvm_mmu_lookup_page(struct kvm *kvm, gfn_t gfn) struct hlist_node *node; pgprintk("%s: looking for gfn %lx\n", __FUNCTION__, gfn); - index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES; + index = kvm_page_table_hashfn(gfn); bucket = &kvm->arch.mmu_page_hash[index]; hlist_for_each_entry(sp, node, bucket, hash_link) if (sp->gfn == gfn && !sp->role.metaphysical) { @@ -701,7 +701,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, } pgprintk("%s: looking gfn %lx role %x\n", __FUNCTION__, gfn, role.word); - index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES; + index = kvm_page_table_hashfn(gfn); bucket = &vcpu->kvm->arch.mmu_page_hash[index]; hlist_for_each_entry(sp, node, bucket, hash_link) if (sp->gfn == gfn && sp->role.word == role.word) { @@ -840,7 +840,7 @@ static int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn) pgprintk("%s: looking for gfn %lx\n", __FUNCTION__, gfn); r = 0; - index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES; + index = kvm_page_table_hashfn(gfn); bucket = &kvm->arch.mmu_page_hash[index]; hlist_for_each_entry_safe(sp, node, n, bucket, hash_link) if (sp->gfn == gfn && !sp->role.metaphysical) { @@ -1450,7 +1450,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa, vcpu->arch.last_pt_write_count = 1; vcpu->arch.last_pte_updated = NULL; } - index = kvm_page_table_hashfn(gfn) % KVM_NUM_MMU_PAGES; + index = kvm_page_table_hashfn(gfn); bucket = &vcpu->kvm->arch.mmu_page_hash[index]; hlist_for_each_entry_safe(sp, node, n, bucket, hash_link) { if (sp->gfn != gfn || sp->role.metaphysical) diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 4702b04..d6db0de 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -57,7 +57,8 @@ #define KVM_PERMILLE_MMU_PAGES 20 #define KVM_MIN_ALLOC_MMU_PAGES 64 -#define KVM_NUM_MMU_PAGES 1024 +#define KVM_MMU_HASH_SHIFT 10 +#define KVM_NUM_MMU_PAGES (1 << KVM_MMU_HASH_SHIFT) #define KVM_MIN_FREE_MMU_PAGES 5 #define KVM_REFILL_PAGES 25 #define KVM_MAX_CPUID_ENTRIES 40 -- 1.5.4.5
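The new hash function relies on KVM_NUM_MMU_PAGES being a power of two: masking with (size - 1) selects the same bucket the old `% KVM_NUM_MMU_PAGES` did, without a division at every call site. A minimal sketch with the constants from the patch:

```c
#include <assert.h>

#define KVM_MMU_HASH_SHIFT 10
#define KVM_NUM_MMU_PAGES  (1 << KVM_MMU_HASH_SHIFT)

typedef unsigned long gfn_t;

/* Bucket index: keep only the low KVM_MMU_HASH_SHIFT bits of the gfn. */
unsigned page_table_hashfn(gfn_t gfn)
{
    return gfn & ((1 << KVM_MMU_HASH_SHIFT) - 1);
}
```

For any power-of-two N and unsigned x, `x & (N - 1) == x % N`, which is what lets the modulo be folded into the hash function itself.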
[kvm-devel] [PATCH 03/40] KVM: x86 emulator: add support for group decoding
Certain x86 instructions use bits 3:5 of the byte following the opcode as an opcode extension, with the decode sometimes depending on bits 6:7 as well. Add support for this in the main decoding table rather than an ad hoc adaptation per opcode. Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- arch/x86/kvm/x86_emulate.c | 33 +++-- 1 files changed, 27 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/x86_emulate.c b/arch/x86/kvm/x86_emulate.c index 7958600..46ecf34 100644 --- a/arch/x86/kvm/x86_emulate.c +++ b/arch/x86/kvm/x86_emulate.c @@ -65,6 +65,9 @@ #define MemAbs (1<<9) /* Memory operand is absolute displacement */ #define String (1<<10) /* String instruction (rep capable) */ #define Stack (1<<11) /* Stack instruction (push/pop) */ +#define Group (1<<14) /* Bits 3:5 of modrm byte extend opcode */ +#define GroupDual (1<<15) /* Alternate decoding of mod == 3 */ +#define GroupMask 0xff /* Group number stored in bits 0:7 */ static u16 opcode_table[256] = { /* 0x00 - 0x07 */ @@ -229,6 +232,12 @@ static u16 twobyte_table[256] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; +static u16 group_table[] = { +}; + +static u16 group2_table[] = { +}; + /* EFLAGS bit definitions. */ #define EFLG_OF (1<<11) #define EFLG_DF (1<<10) @@ -763,7 +772,7 @@ x86_decode_insn(struct x86_emulate_ctxt *ctxt, struct x86_emulate_ops *ops) struct decode_cache *c = ctxt->decode; int rc = 0; int mode = ctxt->mode; - int def_op_bytes, def_ad_bytes; + int def_op_bytes, def_ad_bytes, group; /* Shadow copy of register state. Committed on successful emulation. */ @@ -864,12 +873,24 @@ done_prefixes: c->b = insn_fetch(u8, 1, c->eip); c->d = twobyte_table[c->b]; } + } - /* Unrecognised? */ - if (c->d == 0) { - DPRINTF("Cannot emulate %02x\n", c->b); - return -1; - } + if (c->d & Group) { + group = c->d & GroupMask; + c->modrm = insn_fetch(u8, 1, c->eip); + --c->eip; + + group = (group << 3) + ((c->modrm >> 3) & 7); + if ((c->d & GroupDual) && (c->modrm >> 6) == 3) + c->d = group2_table[group]; + else + c->d = group_table[group]; + } + + /* Unrecognised? */ + if (c->d == 0) { + DPRINTF("Cannot emulate %02x\n", c->b); + return -1; } if (mode == X86EMUL_MODE_PROT64 && (c->d & Stack)) -- 1.5.4.5
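The group lookup hinges on the ModRM byte layout: mod in bits 7:6, the reg/opcode-extension field in bits 5:3, and rm in bits 2:0. A small sketch of the field extraction and of the flat group-table indexing the decoder performs (helper names are mine, not the emulator's):

```c
#include <assert.h>

unsigned modrm_mod(unsigned char modrm) { return modrm >> 6; }       /* bits 7:6 */
unsigned modrm_ext(unsigned char modrm) { return (modrm >> 3) & 7; } /* bits 5:3 */
unsigned modrm_rm(unsigned char modrm)  { return modrm & 7; }        /* bits 2:0 */

/* Index into a flat table of 8 entries per group, as the decoder
 * computes it: (group << 3) + opcode extension. */
unsigned group_index(unsigned group, unsigned char modrm)
{
    return (group << 3) + modrm_ext(modrm);
}
```

The GroupDual case then simply chooses between two such flat tables depending on whether mod == 3 (register operand) or not.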
[kvm-devel] [PATCH 08/40] KVM: constify function pointer tables
From: Jan Engelhardt [EMAIL PROTECTED] Signed-off-by: Jan Engelhardt [EMAIL PROTECTED] Signed-off-by: Avi Kivity [EMAIL PROTECTED] --- virt/kvm/kvm_main.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b2e1289..04595fe 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -705,7 +705,7 @@ static int kvm_vcpu_release(struct inode *inode, struct file *filp) return 0; } -static struct file_operations kvm_vcpu_fops = { +static const struct file_operations kvm_vcpu_fops = { .release= kvm_vcpu_release, .unlocked_ioctl = kvm_vcpu_ioctl, .compat_ioctl = kvm_vcpu_ioctl, @@ -1005,7 +1005,7 @@ static int kvm_vm_mmap(struct file *file, struct vm_area_struct *vma) return 0; } -static struct file_operations kvm_vm_fops = { +static const struct file_operations kvm_vm_fops = { .release= kvm_vm_release, .unlocked_ioctl = kvm_vm_ioctl, .compat_ioctl = kvm_vm_ioctl, -- 1.5.4.5
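The point of the constification: a table of function pointers that is never modified at run time can be placed in read-only memory, so stray writes (or exploits) cannot redirect the pointers. A user-space sketch of the same idiom (all names hypothetical):

```c
#include <assert.h>

struct demo_ops {
    int (*release)(int fd);
};

int demo_release(int fd)
{
    return fd >= 0 ? 0 : -1;
}

/* const: the compiler may place the table in .rodata, and any attempt
 * to reassign demo_fops.release is rejected at compile time. */
const struct demo_ops demo_fops = {
    .release = demo_release,
};

int call_release(const struct demo_ops *ops, int fd)
{
    return ops->release(fd);
}
```

Callers are unaffected: dispatching through a `const struct demo_ops *` works exactly as before, only mutation is forbidden.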
Re: [kvm-devel] [04/17] [PATCH] Add kvm arch-specific core code for kvm/ia64.-V8
Zhang, Xiantao wrote: +static struct kvm_vcpu *lid_to_vcpu(struct kvm *kvm, unsigned long id, + unsigned long eid) +{ + ia64_lid_t lid; + int i; + + for (i = 0; i KVM_MAX_VCPUS; i++) { + if (kvm-vcpus[i]) { + lid.val = VCPU_LID(kvm-vcpus[i]); + if (lid.id == id lid.eid == eid) + return kvm-vcpus[i]; + } + } + + return NULL; +} + +static int handle_ipi(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) +{ + struct exit_ctl_data *p = kvm_get_exit_data(vcpu); + struct kvm_vcpu *target_vcpu; + struct kvm_pt_regs *regs; + ia64_ipi_a addr = p-u.ipi_data.addr; + ia64_ipi_d data = p-u.ipi_data.data; + + target_vcpu = lid_to_vcpu(vcpu-kvm, addr.id, addr.eid); + if (!target_vcpu) + return handle_vm_error(vcpu, kvm_run); + + if (!target_vcpu-arch.launched) { + regs = vcpu_regs(target_vcpu); + + regs-cr_iip = vcpu-kvm-arch.rdv_sal_data.boot_ip; + regs-r1 = vcpu-kvm-arch.rdv_sal_data.boot_gp; + + target_vcpu-arch.mp_state = VCPU_MP_STATE_RUNNABLE; + if (waitqueue_active(target_vcpu-wq)) + wake_up_interruptible(target_vcpu-wq); + } else { + vcpu_deliver_ipi(target_vcpu, data.dm, data.vector); + if (target_vcpu != vcpu) + kvm_vcpu_kick(target_vcpu); + } + + return 1; +} *Shrug*. This looks highly racy to me. You do access various values in target_vcpu without any lock! I know that taking the target vcpu's lock does'nt work because that one is held all the time during KVM_VCPU_RUN. My solution to that was struct local_interrupt, which has its own lock, and has the waitqueue plus everything I need to send a sigp [that's our flavor of ipi]. +int kvm_emulate_halt(struct kvm_vcpu *vcpu) +{ + + ktime_t kt; + long itc_diff; + unsigned long vcpu_now_itc; + + unsigned long expires; + struct hrtimer *p_ht = vcpu-arch.hlt_timer; That makes me jealous, I'd love to have hrtimer on s390 for this. I've got to round up to the next jiffie. 
*Sigh* +int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, + struct kvm_sregs *sregs) +{ + printk(KERN_WARNINGkvm:kvm_arch_vcpu_ioctl_set_sregs called!!\n); + return 0; +} + +int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu, + struct kvm_sregs *sregs) +{ + printk(KERN_WARNINGkvm:kvm_arch_vcpu_ioctl_get_sregs called!!\n); + return 0; + +} Suggestion: if get/set sregs does'nt seem useful on ia64, why not return -EINVAL? In that case, you could also not print a kern warning, the user will either handle that situation or complain. +int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) +{ snip + /*FIXME:Need to removed it later!!\n*/ + vcpu-arch.apic = kzalloc(sizeof(struct kvm_lapic), GFP_KERNEL); + vcpu-arch.apic-vcpu = vcpu; Fixme! +static int vti_vcpu_setup(struct kvm_vcpu *vcpu, int id) +{ + unsigned long psr; + int r; + + local_irq_save(psr); + r = kvm_insert_vmm_mapping(vcpu); + if (r) + goto fail; + r = kvm_vcpu_init(vcpu, vcpu-kvm, id); + if (r) + goto fail; Maybe change to return r, rather then goto fail? +int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu) +{ + printk(KERN_WARNINGkvm:IA64 doesn't need to export + fpu to userspace!\n); + return 0; +} + +int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu) +{ + printk(KERN_WARNINGkvm:IA64 doesn't need to export + fpu to userspace !\n); + return 0; +} maybe -EINVAL? +static int find_highest_bits(int *dat) +{ + u32 bits, bitnum; + int i; + + /* loop for all 256 bits */ + for (i = 7; i = 0 ; i--) { + bits = dat[i]; + if (bits) { + bitnum = fls(bits); + return i * 32 + bitnum - 1; + } + } + + return -1; +} Should be in asm/bitops.h. Look at find_first_bit() and friends, this is duplicate. - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. 
Re: [kvm-devel] [02/17][PATCH] Implement smp_call_function_mask for ia64 - V8
Jes Sorensen wrote: I'm a little wary of the performance impact of this change. Doing a cpumask compare on all smp_call_function calls seems a little expensive. Maybe it's just noise in the big picture compared to the actual cost of the IPIs, but I thought I'd bring it up. Keep in mind that a cpumask can be fairly big these days, max NR_CPUS is currently 4096. For those booting a kernel with NR_CPUS at 4096 on a dual CPU machine, it would be a bit expensive. Unless your hardware has remarkably fast IPIs, I think really the cost of scanning 512 bytes is going to be in the noise... This change has been on the x86 side for ages, and not even Ingo made a peep about it ;) Why not keep smp_call_function() the way it was before, rather than implementing it via the call to smp_call_function_mask()? Because Xen needs a different core implementation (because of its different IPI implementation), and it would be better to just have to do one of them rather than N. J
Re: [kvm-devel] [05/17][PATCH] kvm/ia64 : Add head files for kvm/ia64
+/** + VCPU control register access routines + **/ +static inline u64 vcpu_get_itir(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, itir)); +} + +static inline void vcpu_set_itir(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, itir) = val; +} + +static inline u64 vcpu_get_ifa(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, ifa)); +} + +static inline void vcpu_set_ifa(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, ifa) = val; +} + +static inline u64 vcpu_get_iva(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, iva)); +} + +static inline u64 vcpu_get_pta(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, pta)); +} + +static inline u64 vcpu_get_lid(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, lid)); +} + +static inline u64 vcpu_get_tpr(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, tpr)); +} + +static inline u64 vcpu_get_eoi(VCPU *vcpu) +{ + return (0UL); /*reads of eoi always return 0 */ +} + +static inline u64 vcpu_get_irr0(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, irr[0])); +} + +static inline u64 vcpu_get_irr1(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, irr[1])); +} + +static inline u64 vcpu_get_irr2(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, irr[2])); +} + +static inline u64 vcpu_get_irr3(VCPU *vcpu) +{ + return ((u64)VCPU(vcpu, irr[3])); +} + +static inline void vcpu_set_dcr(VCPU *vcpu, u64 val) +{ + ia64_set_dcr(val); +} + +static inline void vcpu_set_isr(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, isr) = val; +} + +static inline void vcpu_set_lid(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, lid) = val; +} + +static inline void vcpu_set_ipsr(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, ipsr) = val; +} + +static inline void vcpu_set_iip(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, iip) = val; +} + +static inline void vcpu_set_ifs(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, ifs) = val; +} + +static inline void vcpu_set_iipa(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, iipa) = val; +} + +static inline void vcpu_set_iha(VCPU *vcpu, u64 val) +{ + VCPU(vcpu, iha) = val; +} + + +static inline u64 vcpu_get_rr(VCPU *vcpu, u64 reg) +{ + return vcpu-arch.vrr[reg61]; +} Looks to me 
like most of them can be replaced by a few macros using macro_##. +static inline int highest_bits(int *dat) +{ + u32 bits, bitnum; + int i; + + /* loop for all 256 bits */ + for (i = 7; i >= 0 ; i--) { + bits = dat[i]; + if (bits) { + bitnum = fls(bits); + return i * 32 + bitnum - 1; + } + } + return NULL_VECTOR; +} duplicate to asm/bitops.h find_first_bit().
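The reviewer's point is that this scan duplicates existing bitops. For reference, here is what the quoted loop computes, with a portable stand-in for the kernel's fls(); this is a sketch, not the kernel implementation:

```c
#include <assert.h>

#define NULL_VECTOR (-1)

/* Portable stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, 0 if the word is zero. */
int demo_fls(unsigned int x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

/* Highest set bit in a 256-bit bitmap stored as eight 32-bit words,
 * scanning from the top word down, as the quoted highest_bits() does. */
int demo_highest_bits(const unsigned int *dat)
{
    int i;

    for (i = 7; i >= 0; i--)
        if (dat[i])
            return i * 32 + demo_fls(dat[i]) - 1;
    return NULL_VECTOR;
}
```

In-kernel, fls() and find_last_bit() from asm/bitops.h cover this, which is the consolidation the reviewer suggests.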
[kvm-devel] [0/3] -reserved-ram for PCI passthrough without VT-d and without paravirt
Hello, These three patches (one against host kernel, one against kvm.git, one against kvm-userland.git) forces KVM to map all RAM mapped in the virtualized e820 map provided to the guest with gfn = hfn. In turn it's now possible to give direct hardware access to the guest, all DMA will work fine on the virtualized guest ram. The bios has to be updated to alter the end of the first ram slot in the virtualized e820 map. This is unfixable as the address hardcoded in the current bios is higher than what's marked as ram in my hardware e820 map. The only exception where gfn != hfn for ranges included in the virtualized e820 map is for the magic bios page at host physical address zero (bytes from 0 to 4096). All linux versions will definitely never attempt to dma on such a page. If all OS are like linux there will be no problem and pci passthrough will work regardless of the guest OS without requiring any paravirtualization, nor VT-d. This only implements the memory management side, the logic to map mmio regions into kvm address space will require further changes. The limit of the reserved ram is around 1G and it has to be set at compile time, so the guest will run with no more than 1G of ram (it's fairly easy to extend it to 2G though). You can't run more than one guest with -reserved-ram at once or they'll be overwriting themself. You need access to /dev/mem on the userland side, and CAP_ADMIN on the kernel side to run this. I choosed an approach to require the minimal number of changes given this is a short term approach due the lack of hardware features in lots of cpus out there. This is how the memory layout looks like when live guest runs (physical start set to 512M and kvm -m 300). 
7f108d429000-7f108d42b000 rw-p 7f108d429000 00:00 0 7f108d42b000-7f108d4ba000 rw-s 00:0e 275 /dev/mem 7f108d4ba000-7f108d52a000 rw-p 7f108d4ba000 00:00 0 7f108d52a000-7f10a002a000 rw-s 0010 00:0e 275 /dev/mem 7f10a002a000-7f10a142e000 rw-p 7f10a002a000 00:00 0 PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 5522 ?SLl4:06 1 1568 427067 33144 7.0 bin/x86_64/kvm/bin/qemu-system-x86_64 -hda tmp/vir RSS isn't including the reserved ram of course.
Re: [kvm-devel] hugetlbfs not working
On Fri, Mar 28, 2008 at 09:41:06AM +0300, Avi Kivity wrote: mmap() should fail if anything goes wrong with ftruncate and the file length is not extended on tmpfs. --- vl.c.orig 2008-03-27 18:51:31.0 -0300 +++ vl.c2008-03-27 18:52:40.0 -0300 @@ -8749,11 +8749,7 @@ memory = (memory+hpagesize-1) & ~(hpagesize-1); -if (ftruncate(fd, memory) == -1) { - perror("ftruncate"); - close(fd); - return NULL; -} +ftruncate(fd, memory); I'm sure a patch will follow to add a missing error check, so how about a comment here? btw, we use signoffs on kvm-userspace, too. QEMU/KVM: ftruncate() is not supported by hugetlbfs on older hosts Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] diff --git a/qemu/vl.c b/qemu/vl.c index 5627862..6a240bf 100644 --- a/qemu/vl.c +++ b/qemu/vl.c @@ -8739,11 +8739,13 @@ void *alloc_mem_area(unsigned long memory, const char *path) memory = (memory+hpagesize-1) & ~(hpagesize-1); -if (ftruncate(fd, memory) == -1) { - perror("ftruncate"); - close(fd); - return NULL; -} +/* + * ftruncate is not supported by hugetlbfs in older + * hosts, so don't bother checking for errors. + * If anything goes wrong with it under other filesystems, + * mmap will fail. + */ +ftruncate(fd, memory); area = mmap(0, memory, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); if (area == MAP_FAILED) {
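The control flow Marcelo relies on — ignore ftruncate() and let mmap() report any real failure — can be sketched in user space. This is a hedged approximation using a regular temporary file instead of hugetlbfs, with error handling trimmed to the essentials:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

void *alloc_file_area(size_t memory, size_t pagesize)
{
    char path[] = "/tmp/demo_area_XXXXXX";
    int fd = mkstemp(path);
    void *area;

    if (fd < 0)
        return NULL;
    unlink(path); /* the file vanishes once fd is closed */

    /* Round the size up to a page boundary. */
    memory = (memory + pagesize - 1) & ~(pagesize - 1);

    /* Deliberately unchecked: on filesystems where ftruncate() is
     * unsupported (old hugetlbfs), the mmap() below fails instead. */
    (void)ftruncate(fd, memory);

    area = mmap(0, memory, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    return area == MAP_FAILED ? NULL : area;
}
```

The single failure path through mmap() is what makes dropping the ftruncate() check safe: the caller still sees NULL when the mapping cannot be created.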
[kvm-devel] [1/3] -reserved-ram for PCI passthrough without VT-d and without paravirt
This is the kvm.git patch to enable -reserved-ram (without this kvm will simply gracefully fail to emulate the illegal instruction inside the bad_page). This trick avoids altering the ioctl api with libkvm, in short if get_user_pages fails on a host kernel with reserved ram config option enabled, it tries to see if it's a remap_pfn_range mapping backing the memslot. In such case it checks if the ram was reserved with page_count 0 and if so it disables the reference counting as those pages are invisibile to linux. As long as pfn_valid returns, pfn_to_page should be safe, so shall the memmap array be allocated with holes corresponding to the holes generated in the e820 map, simply bad_page will be returned gracefully without risk like if this patch wasn't applied to kvm.git. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h index e9ae5db..e7a9c82 100644 --- a/arch/x86/kvm/paging_tmpl.h +++ b/arch/x86/kvm/paging_tmpl.h @@ -263,7 +263,8 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *page, npage = vcpu-arch.update_pte.page; if (!npage) return; - get_page(npage); + if (!page_is_reserved(npage)) + get_page(npage); mmu_set_spte(vcpu, spte, page-role.access, pte_access, 0, 0, gpte PT_DIRTY_MASK, NULL, largepage, gpte_to_gfn(gpte), npage, true); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index f4e1436..7f087ac 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -281,6 +281,18 @@ static inline void kvm_migrate_apic_timer(struct kvm_vcpu *vcpu) set_bit(KVM_REQ_MIGRATE_TIMER, vcpu-requests); } +#ifdef CONFIG_RESERVE_PHYSICAL_START +static inline int page_is_reserved(struct page * page) +{ + return !page_count(page); +} +#else /* CONFIG_RESERVE_PHYSICAL_START */ +static inline int page_is_reserved(struct page * page) +{ + return 0; +} +#endif /* CONFIG_RESERVE_PHYSICAL_START */ + enum kvm_stat_kind { KVM_STAT_VM, KVM_STAT_VCPU, diff --git 
a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 30bf832..50a7b3e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -498,6 +524,65 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn) return (slot-userspace_addr + (gfn - slot-base_gfn) * PAGE_SIZE); } + +#ifdef CONFIG_RESERVE_PHYSICAL_START +static struct page *direct_page(struct mm_struct *mm, + unsigned long address) +{ + pgd_t *pgd; + pud_t *pud; + pmd_t *pmd; + pte_t *ptep, pte; + spinlock_t *ptl; + struct page *page; + struct vm_area_struct *vma; + unsigned long pfn; + + page = NULL; + if (!capable(CAP_SYS_ADMIN)) /* go safe */ + goto out; + + vma = find_vma(current-mm, address); + if (!vma || vma-vm_start address || + !(vma-vm_flags VM_PFNMAP)) + goto out; + + pgd = pgd_offset(mm, address); + if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd))) + goto out; + + pud = pud_offset(pgd, address); + if (pud_none(*pud) || unlikely(pud_bad(*pud))) + goto out; + + pmd = pmd_offset(pud, address); + if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd))) + goto out; + + ptep = pte_offset_map_lock(mm, pmd, address, ptl); + if (!ptep) + goto out; + + pte = *ptep; + if (!pte_present(pte)) + goto unlock; + + pfn = pte_pfn(pte); + if (!pfn_valid(pfn)) + goto unlock; + + page = pfn_to_page(pfn); + if (!page_is_reserved(page)) { + page = NULL; + goto unlock; + } +unlock: + pte_unmap_unlock(ptep, ptl); +out: + return page; +} +#endif /* CONFIG_RESERVE_PHYSICAL_START */ + /* * Requires current-mm-mmap_sem to be held */ @@ -519,6 +604,11 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn) NULL); if (npages != 1) { +#ifdef CONFIG_RESERVE_PHYSICAL_START + page[0] = direct_page(current-mm, addr); + if (page[0]) + return page[0]; +#endif get_page(bad_page); return bad_page; } @@ -530,15 +620,18 @@ EXPORT_SYMBOL_GPL(gfn_to_page); void kvm_release_page_clean(struct page *page) { - put_page(page); + if (!page_is_reserved(page)) + put_page(page); } EXPORT_SYMBOL_GPL(kvm_release_page_clean); void kvm_release_page_dirty(struct page 
*page) { - if (!PageReserved(page)) - SetPageDirty(page); - put_page(page); + if (!page_is_reserved(page)) { + if (!PageReserved(page)) + SetPageDirty(page); + put_page(page); + } }
[kvm-devel] [2/3] -reserved-ram for PCI passthrough without VT-d and without paravirt
This is the kvm-userland.git patch overwriting the ranges in the virtualized e820 map with /dev/mem. All is validated through /proc/iomem, so shall the hardware e820 map be weird, there will be zero risk of corruption, simply it will fail to startup with a verbose error. The bios has to be rebuilt to pass the variable address near 640k where to stop the virtualized e820 slot in function of the ram available in the host, and in function of the eary-reserve for things like the smp trampoline page that we don't want to pass as available ram to the guest. Only the page at address zero is magic and it's mapped as ram in the guest, but it's allocated through regular anonymous memory as you can see from the first /dev/mem mapping starting at area+reserved[0]. To rebuild the bios make bios before make install should do the trick. If you don't rebuild the bios everything will work fine if you don't use pci-passthrough, but then pci passthrough will randomly memory corrupt the host. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/bios/rombios.c b/bios/rombios.c index 318de57..f93a6c6 100644 --- a/bios/rombios.c +++ b/bios/rombios.c @@ -4251,6 +4251,7 @@ int15_function32(regs, ES, DS, FLAGS) Bit32u extra_lowbits_memory_size=0; Bit16u CX,DX; Bit8u extra_highbits_memory_size=0; + Bit32u below_640_end; BX_DEBUG_INT15(int15 AX=%04x\n,regs.u.r16.ax); @@ -4305,6 +4306,11 @@ ASM_END case 0x20: // coded by osmaker aka K.J. 
if(regs.u.r32.edx == 0x534D4150) {
+below_640_end = inb_cmos(0x16);
+below_640_end <<= 8;
+below_640_end |= inb_cmos(0x15);
+below_640_end *= 1024;
+
 extended_memory_size = inb_cmos(0x35);
 extended_memory_size <<= 8;
 extended_memory_size |= inb_cmos(0x34);
@@ -4334,7 +4340,7 @@ ASM_END
 { case 0: set_e820_range(ES, regs.u.r16.di,
- 0x0000000000000000L, 0x0009fc00L, 0, 0, 1);
+ 0x0000000000000000L, below_640_end, 0, 0, 1);
 regs.u.r32.ebx = 1; regs.u.r32.eax = 0x534D4150; regs.u.r32.ecx = 0x14;
@@ -4343,7 +4349,7 @@ ASM_END
 break; case 1: set_e820_range(ES, regs.u.r16.di,
- 0x0009fc00L, 0x000a0000L, 0, 0, 2);
+ below_640_end, 0x000a0000L, 0, 0, 2);
 regs.u.r32.ebx = 2; regs.u.r32.eax = 0x534D4150; regs.u.r32.ecx = 0x14;
diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index 0d2e6c3..a6b28c8 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -198,6 +198,8 @@ static void cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
 /* memory size */
 val = 640; /* base memory in K */
+if (reserved_ram)
+    val = reserved[1] / 1024;
 rtc_set_memory(s, 0x15, val);
 rtc_set_memory(s, 0x16, val >> 8);
diff --git a/qemu/pc-bios/bios.bin b/qemu/pc-bios/bios.bin
index 2e7d3e0..90d626d 100644
Binary files a/qemu/pc-bios/bios.bin and b/qemu/pc-bios/bios.bin differ
diff --git a/qemu/sysemu.h b/qemu/sysemu.h
index c728605..db0dda4 100644
--- a/qemu/sysemu.h
+++ b/qemu/sysemu.h
@@ -103,6 +103,8 @@ extern int autostart;
 extern int old_param;
 extern int hpagesize;
 extern const char *bootp_filename;
+extern int reserved_ram;
+extern int64_t reserved[4];
 #ifdef USE_KQEMU
diff --git a/qemu/vl.c b/qemu/vl.c
index 3570388..31adc90 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -240,6 +240,8 @@ int time_drift_fix = 0;
 unsigned int kvm_shadow_memory = 0;
 const char *mem_path = NULL;
 int hpagesize = 0;
+int reserved_ram = 0;
+int64_t reserved[4];
 const char *cpu_vendor_string;
 #ifdef TARGET_ARM
 int old_param = 0;
@@ -8313,6 +8315,7 @@ enum {
 QEMU_OPTION_tdf,
 QEMU_OPTION_kvm_shadow_memory,
 QEMU_OPTION_mempath,
+QEMU_OPTION_reserved_ram,
};
typedef struct QEMUOption {
@@ -8439,6 +8442,7 @@ const QEMUOption qemu_options[] = {
 { "clock", HAS_ARG, QEMU_OPTION_clock },
 { "startdate", HAS_ARG, QEMU_OPTION_startdate },
 { "mem-path", HAS_ARG, QEMU_OPTION_mempath },
+{ "reserved-ram", 0, QEMU_OPTION_reserved_ram },
 { NULL },
};
@@ -8724,6 +8728,80 @@ static int gethugepagesize(void)
 return hugepagesize;
}
+static int find_reserved_ram(int64_t *_start, int64_t *_end,
+    unsigned long below, unsigned long above,
+    unsigned long min_size)
+{
+    int ret, fd;
+    char buf[4096];
+    char *needle = "reserved RAM\n";
+    //char *needle = "System RAM\n";
+    char *size, *curr;
+    int64_t start, end;
+
+    fd = open("/proc/iomem", O_RDONLY);
+    if (fd < 0) {
+        perror("open");
+        exit(0);
+    }
+
+    ret = read(fd,
Re: [kvm-devel] hugetlbfs not working
Marcelo Tosatti wrote: QEMU/KVM: ftruncate() is not supported by hugetlbfs on older hosts Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] Applied, thanks. -- error compiling committee.c: too many arguments to function - Check out the new SourceForge.net Marketplace. It's the best place to buy or sell services for just about anything Open Source. http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [3/3] -reserved-ram for PCI passthrough without iommu and without paravirt
Hello, The reserved RAM can be mapped by virtualization software with /dev/mem to create a 1:1 mapping between guest physical (bus) address and host physical (bus) address. Please let me know if something like this can be merged in -mm (this is the minimal possible change to achieve the feature). The part at the end is more a fix, but it's only required with this applied (unless you want to have a kexec above ~40M). Here is the complete patchset: http://marc.info/?l=kvm-devel&m=120698256716369&w=2 http://marc.info/?l=kvm-devel&m=120698299317253&w=2 http://marc.info/?l=kvm-devel&m=120698328617835&w=2 Note: current mainline is buggy, so this patch should also be applied with -R for the host kernel to boot, but hopefully the regression will be solved sooner rather than later (no reply yet though). http://marc.info/?l=kvm-devel&m=120673375913890&w=2
[EMAIL PROTECTED] ~ $ cat /proc/iomem | head
00000000-00000fff : reserved RAM failed
00001000-0008ffff : reserved RAM
00090000-00091fff : reserved RAM failed
00092000-0009efff : reserved RAM
0009f000-0009ffff : reserved
000cd600-000cffff : pnp 00:0d
000f0000-000fffff : reserved
00100000-1fffffff : reserved RAM
20000000-3dedffff : System RAM
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1107,8 +1107,36 @@ config CRASH_DUMP
 (CONFIG_RELOCATABLE=y). For more details see Documentation/kdump/kdump.txt
+config RESERVE_PHYSICAL_START
+	bool "Reserve all RAM below PHYSICAL_START (EXPERIMENTAL)"
+	depends on !RELOCATABLE && X86_64
+	help
+	  This makes the kernel use only RAM above __PHYSICAL_START.
+	  All memory below __PHYSICAL_START will be left unused and
+	  marked as "reserved RAM" in /proc/iomem. The few special
+	  pages that can't be relocated at addresses above
+	  __PHYSICAL_START and that can't be guaranteed to be unused
+	  by the running kernel will be marked "reserved RAM failed"
+	  in /proc/iomem.
+	  Those may or may not be used by the kernel
+	  (for example the SMP trampoline pages would only be used if
+	  CPU hotplug is enabled).
+
+	  The reserved RAM can be mapped by virtualization software
+	  with /dev/mem to create a 1:1 mapping between guest physical
+	  (bus) address and host physical (bus) address. This will
+	  allow PCI passthrough with DMA for the guest using the RAM
+	  with the 1:1 mapping. The only detail to take care of is the
+	  RAM marked "reserved RAM failed". The virtualization
+	  software must create for the guest an e820 map that only
+	  includes the "reserved RAM" regions, but if the guest touches
+	  memory with a guest physical address in the "reserved RAM
+	  failed" ranges (a Linux guest will do that even if the RAM
+	  isn't present in the e820 map), it should provide that as
+	  RAM and map it with a non-linear mapping. This should allow
+	  any Linux kernel to run fine, and hopefully any other OS too.
+
 config PHYSICAL_START
- hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP)
+ hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP || RESERVE_PHYSICAL_START)
 default 0x100 if X86_NUMAQ
 default 0x20 if X86_64
 default 0x10
diff --git a/arch/x86/kernel/e820_64.c b/arch/x86/kernel/e820_64.c
--- a/arch/x86/kernel/e820_64.c
+++ b/arch/x86/kernel/e820_64.c
@@ -91,6 +91,11 @@ void __init early_res_to_bootmem(void)
 printk(KERN_INFO "early res: %d [%lx-%lx] %s\n", i,
        r->start, r->end - 1, r->name);
 reserve_bootmem_generic(r->start, r->end - r->start);
+#ifdef CONFIG_RESERVE_PHYSICAL_START
+	if (r->start < __PHYSICAL_START)
+		add_memory_region(r->start, r->end - r->start,
+				  E820_RESERVED_RAM_FAILED);
+#endif
 }
}
@@ -231,6 +236,10 @@ void __init e820_reserve_resources(struc
 struct resource *data_resource, struct resource *bss_resource)
{
 int i;
+#ifdef CONFIG_RESERVE_PHYSICAL_START
+	/* solve E820_RESERVED_RAM vs E820_RESERVED_RAM_FAILED conflicts */
+	update_e820();
+#endif
 for (i = 0; i < e820.nr_map; i++) {
 struct resource *res;
 res =
alloc_bootmem_low(sizeof(struct resource));
@@ -238,6 +247,16 @@ void __init e820_reserve_resources(struc
 case E820_RAM: res->name = "System RAM"; break;
 case E820_ACPI: res->name = "ACPI Tables"; break;
 case E820_NVS: res->name = "ACPI Non-volatile Storage"; break;
+#ifdef CONFIG_RESERVE_PHYSICAL_START
+	case E820_RESERVED_RAM_FAILED:
+		res->name = "reserved RAM failed";
+		break;
+	case E820_RESERVED_RAM:
+
[kvm-devel] [PATCH 1/2] Refactor in-kernel PIT to a separate device
This patch refactors the in-kernel PIT to be a logically separate device. Signed-off-by: Anthony Liguori [EMAIL PROTECTED]
diff --git a/qemu/hw/i8254.c b/qemu/hw/i8254.c
index e215f8b..bcd9dba 100644
--- a/qemu/hw/i8254.c
+++ b/qemu/hw/i8254.c
@@ -30,6 +30,9 @@
 //#define DEBUG_PIT
+#define PIT_SAVEVM_NAME "i8254"
+#define PIT_SAVEVM_VERSION 1
+
 #define RW_STATE_LSB 1
 #define RW_STATE_MSB 2
 #define RW_STATE_WORD0 3
@@ -414,78 +417,12 @@ static void pit_irq_timer(void *opaque)
 pit_irq_timer_update(s, s->next_transition_time);
}
-#ifdef KVM_CAP_PIT
-
-static void kvm_kernel_pit_save_to_user(PITState *s)
-{
-    struct kvm_pit_state pit;
-    struct kvm_pit_channel_state *c;
-    struct PITChannelState *sc;
-    int i;
-
-    kvm_get_pit(kvm_context, &pit);
-
-    for (i = 0; i < 3; i++) {
-        c = &pit.channels[i];
-        sc = &s->channels[i];
-        sc->count = c->count;
-        sc->latched_count = c->latched_count;
-        sc->count_latched = c->count_latched;
-        sc->status_latched = c->status_latched;
-        sc->status = c->status;
-        sc->read_state = c->read_state;
-        sc->write_state = c->write_state;
-        sc->write_latch = c->write_latch;
-        sc->rw_mode = c->rw_mode;
-        sc->mode = c->mode;
-        sc->bcd = c->bcd;
-        sc->gate = c->gate;
-        sc->count_load_time = c->count_load_time;
-    }
-}
-
-static void kvm_kernel_pit_load_from_user(PITState *s)
-{
-    struct kvm_pit_state pit;
-    struct kvm_pit_channel_state *c;
-    struct PITChannelState *sc;
-    int i;
-
-    for (i = 0; i < 3; i++) {
-        c = &pit.channels[i];
-        sc = &s->channels[i];
-        c->count = sc->count;
-        c->latched_count = sc->latched_count;
-        c->count_latched = sc->count_latched;
-        c->status_latched = sc->status_latched;
-        c->status = sc->status;
-        c->read_state = sc->read_state;
-        c->write_state = sc->write_state;
-        c->write_latch = sc->write_latch;
-        c->rw_mode = sc->rw_mode;
-        c->mode = sc->mode;
-        c->bcd = sc->bcd;
-        c->gate = sc->gate;
-        c->count_load_time = sc->count_load_time;
-    }
-
-    kvm_set_pit(kvm_context, &pit);
-}
-
-#endif
-
 static void pit_save(QEMUFile *f, void *opaque)
{
 PITState *pit = opaque;
 PITChannelState *s;
 int i;
-#ifdef KVM_CAP_PIT
-if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
-    kvm_kernel_pit_save_to_user(pit);
-}
-#endif
-
 for(i = 0; i < 3; i++) {
  s = &pit->channels[i];
  qemu_put_be32(f, s->count);
@@ -538,12 +475,6 @@ static int pit_load(QEMUFile *f, void *opaque, int version_id)
 } }
-#ifdef KVM_CAP_PIT
-if (kvm_enabled() && qemu_kvm_pit_in_kernel()) {
-    kvm_kernel_pit_load_from_user(pit);
-}
-#endif
-
 return 0;
}
@@ -566,14 +497,13 @@ PITState *pit_init(int base, qemu_irq irq)
 PITState *pit = &pit_state;
 PITChannelState *s;
-if (!kvm_enabled() || !qemu_kvm_pit_in_kernel()) {
-    s = &pit->channels[0];
-    /* the timer 0 is connected to an IRQ */
-    s->irq_timer = qemu_new_timer(vm_clock, pit_irq_timer, s);
-    s->irq = irq;
-}
+s = &pit->channels[0];
+/* the timer 0 is connected to an IRQ */
+s->irq_timer = qemu_new_timer(vm_clock, pit_irq_timer, s);
+s->irq = irq;
-register_savevm("i8254", base, 1, pit_save, pit_load, pit);
+register_savevm(PIT_SAVEVM_NAME, base, PIT_SAVEVM_VERSION,
+                pit_save, pit_load, pit);
 qemu_register_reset(pit_reset, pit);
 register_ioport_write(base, 4, 1, pit_ioport_write, pit);
@@ -583,3 +513,83 @@ PITState *pit_init(int base, qemu_irq irq)
 return pit;
}
+
+#ifdef KVM_CAP_PIT
+
+static void kvm_pit_save(QEMUFile *f, void *opaque)
+{
+    PITState *s = opaque;
+    struct kvm_pit_state pit;
+    struct kvm_pit_channel_state *c;
+    struct PITChannelState *sc;
+    int i;
+
+    kvm_get_pit(kvm_context, &pit);
+
+    for (i = 0; i < 3; i++) {
+        c = &pit.channels[i];
+        sc = &s->channels[i];
+        sc->count = c->count;
+        sc->latched_count = c->latched_count;
+        sc->count_latched = c->count_latched;
+        sc->status_latched = c->status_latched;
+        sc->status = c->status;
+        sc->read_state = c->read_state;
+        sc->write_state = c->write_state;
+        sc->write_latch = c->write_latch;
+        sc->rw_mode = c->rw_mode;
+        sc->mode = c->mode;
+        sc->bcd = c->bcd;
+        sc->gate = c->gate;
+        sc->count_load_time = c->count_load_time;
+    }
+
+    pit_save(f, s);
+}
+
+static int kvm_pit_load(QEMUFile *f, void *opaque, int version_id)
+{
+    PITState *s = opaque;
+    struct kvm_pit_state pit;
+    struct kvm_pit_channel_state *c;
+    struct PITChannelState *sc;
+    int i;
+
+    pit_load(f, s, version_id);
+
+    for (i = 0; i < 3; i++) {
+        c = &pit.channels[i];
+        sc = &s->channels[i];
+        c->count = sc->count;
+        c->latched_count = sc->latched_count;
+        c->count_latched = sc->count_latched;
+        c->status_latched = sc->status_latched;
+        c->status =
[kvm-devel] Ubuntu Gutsy host / XP guest / -smp 2
With the combination in the subject line (Ubuntu Gutsy host, XP guest, -smp 2), the guest takes nearly 100% of my real CPU time and still only sees one CPU. Is this a known problem, and does it have a known solution? Thanks in advance, -- Dave Abrahams Boost Consulting http://boost-consulting.com
Re: [kvm-devel] [05/17][PATCH] kvm/ia64 : Add head files for kvm/ia64
Jes Sorensen wrote: Hi Xiantao, Hi, Jes I fixed the coding style issues. Thanks! More comments. Zhang, Xiantao wrote: From 696b9eea9f5001a7b7a07c0e58514aa10306b91a Mon Sep 17 00:00:00 2001 From: Xiantao Zhang [EMAIL PROTECTED] Date: Fri, 28 Mar 2008 09:51:36 +0800 Subject: [PATCH] KVM: IA64: Add header files for kvm/ia64. ia64_regs: some definitions for special registers which aren't defined in asm-ia64/ia64regs. Please put missing definitions of registers into asm-ia64/ia64regs.h if they are official definitions from the spec. Moved! kvm_minstate.h: macros for the min-state save routines. lapic.h: apic structure definition. vcpu.h: routines related to vcpu virtualization. vti.h: some macros and routines for VT support on Itanium. Signed-off-by: Xiantao Zhang [EMAIL PROTECTED]
+/*
+ * Flush register stack.
+ */
+#define ia64_flushrs() asm volatile ("flushrs;;" ::: "memory")
+
+#define ia64_loadrs() asm volatile ("loadrs;;" ::: "memory")
Please put these into include/asm-ia64/gcc_intrin.h OK.
+#define ia64_get_rsc() \
+({ \
+	unsigned long val; \
+	asm volatile ("mov %0=ar.rsc;;" : "=r"(val) :: "memory"); \
+	val; \
+})
+
+#define ia64_set_rsc(val) \
+	asm volatile ("mov ar.rsc=%0;;" :: "r"(val) : "memory")
Please update the ia64_get/set_reg macros to handle the RSC register and use those macros. Moved.
+#define ia64_get_bspstore() \
+({ \
+	unsigned long val; \
+	asm volatile ("mov %0=ar.bspstore;;" : "=r"(val) :: "memory"); \
+	val; \
+})
Ditto for AR.BSPSTORE
+#define ia64_get_rnat() \
+({ \
+	unsigned long val; \
+	asm volatile ("mov %0=ar.rnat;" : "=r"(val) :: "memory"); \
+	val; \
+})
Ditto for AR.RNAT
+static inline unsigned long ia64_get_itc(void)
+{
+	unsigned long result;
+	result = ia64_getreg(_IA64_REG_AR_ITC);
+	return result;
+}
This exists in include/asm-ia64/delay.h
+static inline void ia64_set_dcr(unsigned long dcr)
+{
+	ia64_setreg(_IA64_REG_CR_DCR, dcr);
+}
Please just call ia64_setreg() in your code rather than defining a wrapper for it. Sure.
+#define ia64_ttag(addr) \
+({ \
+	__u64 ia64_intri_res; \
+	asm volatile ("ttag %0=%1" : "=r"(ia64_intri_res) : "r"(addr)); \
+	ia64_intri_res; \
+})
Please add to include/asm-ia64/gcc_intrin.h instead.
diff --git a/arch/ia64/kvm/lapic.h b/arch/ia64/kvm/lapic.h
new file mode 100644
index 000..152cbdc
--- /dev/null
+++ b/arch/ia64/kvm/lapic.h
@@ -0,0 +1,27 @@
+#ifndef __KVM_IA64_LAPIC_H
+#define __KVM_IA64_LAPIC_H
+
+#include "iodev.h"
I don't understand why iodev.h is included here? It was inherited from the x86 side, and I forgot to remove it. It seems redundant.
--- /dev/null
+++ b/arch/ia64/kvm/vcpu.h
The formatting of this file is dodgy, please try and make it comply with the Linux standards in Documentation/CodingStyle
+#define _vmm_raw_spin_lock(x) \ [snip]
+
+#define _vmm_raw_spin_unlock(x) \
Could you explain the reasoning behind these two macros? Whenever I see open-coded spin lock modifications like these, I have to admit I get a bit worried. In the architecture of kvm/ia64, gvmm and host are in two different worlds, and gvmm can't call the host's interfaces. In the migration case, we need to take a lock to sync the status of dirty memory. In order to make it work, this spin_lock is defined and used.
+typedef struct kvm_vcpu VCPU;
+typedef struct kvm_pt_regs REGS;
+typedef enum { DATA_REF, NA_REF, INST_REF, RSE_REF } vhpt_ref_t;
+typedef enum { INSTRUCTION, DATA, REGISTER } miss_type;
ARGH! Please see previous mail about typedefs! I suspect this is code inherited from Xen? Xen has a lot of really nasty and pointless typedefs like these :-( Removed.
+static inline void vcpu_set_dbr(VCPU *vcpu, u64 reg, u64 val)
+{
+	/* TODO: need to virtualize */
+	__ia64_set_dbr(reg, val);
+}
+
+static inline void vcpu_set_ibr(VCPU *vcpu, u64 reg, u64 val)
+{
+	/* TODO: need to virtualize */
+	ia64_set_ibr(reg, val);
+}
+
+static inline u64 vcpu_get_dbr(VCPU *vcpu, u64 reg)
+{
+	/* TODO: need to virtualize */
+	return ((u64)__ia64_get_dbr(reg));
+}
+
+static inline u64 vcpu_get_ibr(VCPU *vcpu, u64 reg)
+{
+	/* TODO: need to virtualize */
+	return ((u64)ia64_get_ibr(reg));
+}
More wrapper macros that really should just use ia64_get/set_reg() directly in the code. Removed, and used the one without a wrapper.
diff --git a/arch/ia64/kvm/vti.h b/arch/ia64/kvm/vti.h
new file mode 100644
index 000..591ab22 [snip]
+/* -*- Mode:C; c-basic-offset:4; tab-width:4; indent-tabs-mode:nil -*- */
Evil formatting again! Cheers, Jes
Re: [kvm-devel] [kvm-ia64-devel] [03/15][PATCH] kvm/ia64: Add header files forkvm/ia64. V8
Carsten Otte wrote: Zhang, Xiantao wrote:
+typedef union context {
+	/* 8K size */
+	char dummy[KVM_CONTEXT_SIZE];
+	struct {
+		unsigned long psr;
+		unsigned long pr;
+		unsigned long caller_unat;
+		unsigned long pad;
+		unsigned long gr[32];
+		unsigned long ar[128];
+		unsigned long br[8];
+		unsigned long cr[128];
+		unsigned long rr[8];
+		unsigned long ibr[8];
+		unsigned long dbr[8];
+		unsigned long pkr[8];
+		struct ia64_fpreg fr[128];
+	};
+} context_t;
This looks ugly to me. I'd rather prefer to have a straight struct with elements psr...fr[], and cast the pointer to char* when needed. KVM_CONTEXT_SIZE can be used as a parameter to kzalloc() on allocation; it's too large to be on the stack anyway. We need to allocate a fixed-size area that is large enough, considering backward compatibility. In the migration or save/restore case, we need to save this area. If migration happens between different kvm versions and the sizes differ, it may cause issues. For example, if we added a new field in a newer kvm and restore a new snapshot on an older version, it may fail.
+typedef struct thash_data {
+	union {
+		struct {
+			unsigned long p    : 1;  /* 0 */
+			unsigned long rv1  : 1;  /* 1 */
+			unsigned long ma   : 3;  /* 2-4 */
+			unsigned long a    : 1;  /* 5 */
+			unsigned long d    : 1;  /* 6 */
+			unsigned long pl   : 2;  /* 7-8 */
+			unsigned long ar   : 3;  /* 9-11 */
+			unsigned long ppn  : 38; /* 12-49 */
+			unsigned long rv2  : 2;  /* 50-51 */
+			unsigned long ed   : 1;  /* 52 */
+			unsigned long ig1  : 11; /* 53-63 */
+		};
+		struct {
+			unsigned long __rv1 : 53;      /* 0-52 */
+			unsigned long contiguous : 1;  /* 53 */
+			unsigned long tc : 1;          /* 54 TR or TC */
+			unsigned long cl : 1;          /* 55 I side or D side cache line */
+			unsigned long len : 4;         /* 56-59 */
+			unsigned long io : 1;          /* 60 entry is for io or not */
+			unsigned long nomap : 1;       /* 61 entry can't be inserted into machine TLB */
+			unsigned long checked : 1;     /* 62 for VTLB/VHPT sanity check */
+			unsigned long invalid : 1;     /* 63 invalid entry */
+		};
+		unsigned long page_flags;
+	}; /* same for VHPT and TLB */
+
+	union {
+		struct {
+			unsigned long rv3 : 2;
+			unsigned long ps : 6;
+			unsigned long key : 24;
+			unsigned long rv4 : 32;
+		};
+		unsigned long itir;
+	};
+	union {
+		struct {
+			unsigned long ig2 : 12;
+			unsigned long vpn : 49;
+			unsigned long vrn : 3;
+		};
+		unsigned long ifa;
+		unsigned long vadr;
+		struct {
+			unsigned long tag : 63;
+			unsigned long ti : 1;
+		};
+		unsigned long etag;
+	};
+	union {
+		struct thash_data *next;
+		unsigned long rid;
+		unsigned long gpaddr;
+	};
+} thash_data_t;
A matter of taste, but I'd prefer unsigned long mask, and #define MASK_BIT_FOR_PURPOSE over bitfields. This structure could be much smaller that way. Yes, but it may not be as flexible to use.
+struct kvm_regs {
+	char *saved_guest;
+	char *saved_stack;
+	struct saved_vpd vpd;
+	/* Arch-regs */
+	int mp_state;
+	unsigned long vmm_rr;
+	/* TR and TC. */
+	struct thash_data itrs[NITRS];
+	struct thash_data dtrs[NDTRS];
+	/* Bit is set if there is a tr/tc for the region.
*/ + unsigned char itr_regions; + unsigned char dtr_regions; +unsigned char tc_regions; + +char irq_check; +unsigned long saved_itc; +unsigned long itc_check; +unsigned long timer_check; +unsigned long timer_pending; +unsigned long last_itc; + +unsigned long vrr[8]; +unsigned long ibr[8]; +unsigned long dbr[8]; +unsigned long insvc[4]; /* Interrupt in service. */ + unsigned long xtp; + +unsigned long metaphysical_rr0; /* from kvm_arch (so is pinned) */ +unsigned long metaphysical_rr4; /* from kvm_arch (so is pinned) */ +unsigned long metaphysical_saved_rr0; /* from kvm_arch */ +unsigned
Re: [kvm-devel] [04/17] [PATCH] Add kvm arch-specific core code for kvm/ia64.-V8
Carsten Otte wrote: Zhang, Xiantao wrote:
+static struct kvm_vcpu *lid_to_vcpu(struct kvm *kvm, unsigned long id,
+				    unsigned long eid)
+{
+	ia64_lid_t lid;
+	int i;
+
+	for (i = 0; i < KVM_MAX_VCPUS; i++) {
+		if (kvm->vcpus[i]) {
+			lid.val = VCPU_LID(kvm->vcpus[i]);
+			if (lid.id == id && lid.eid == eid)
+				return kvm->vcpus[i];
+		}
+	}
+
+	return NULL;
+}
+
+static int handle_ipi(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+	struct exit_ctl_data *p = kvm_get_exit_data(vcpu);
+	struct kvm_vcpu *target_vcpu;
+	struct kvm_pt_regs *regs;
+	ia64_ipi_a addr = p->u.ipi_data.addr;
+	ia64_ipi_d data = p->u.ipi_data.data;
+
+	target_vcpu = lid_to_vcpu(vcpu->kvm, addr.id, addr.eid);
+	if (!target_vcpu)
+		return handle_vm_error(vcpu, kvm_run);
+
+	if (!target_vcpu->arch.launched) {
+		regs = vcpu_regs(target_vcpu);
+
+		regs->cr_iip = vcpu->kvm->arch.rdv_sal_data.boot_ip;
+		regs->r1 = vcpu->kvm->arch.rdv_sal_data.boot_gp;
+
+		target_vcpu->arch.mp_state = VCPU_MP_STATE_RUNNABLE;
+		if (waitqueue_active(&target_vcpu->wq))
+			wake_up_interruptible(&target_vcpu->wq);
+	} else {
+		vcpu_deliver_ipi(target_vcpu, data.dm, data.vector);
+		if (target_vcpu != vcpu)
+			kvm_vcpu_kick(target_vcpu);
+	}
+
+	return 1;
+}
*Shrug*. This looks highly racy to me. You do access various values in target_vcpu without any lock! I know that taking the target vcpu's lock doesn't work, because that one is held all the time during KVM_VCPU_RUN. My solution to that was struct local_interrupt, which has its own lock, and has the waitqueue plus everything I need to send a sigp [that's our flavor of ipi].
+int kvm_emulate_halt(struct kvm_vcpu *vcpu)
+{
+
+	ktime_t kt;
+	long itc_diff;
+	unsigned long vcpu_now_itc;
+
+	unsigned long expires;
+	struct hrtimer *p_ht = &vcpu->arch.hlt_timer;
That makes me jealous, I'd love to have hrtimer on s390 for this. I've got to round up to the next jiffie.
*Sigh*
+int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
+				  struct kvm_sregs *sregs)
+{
+	printk(KERN_WARNING "kvm: kvm_arch_vcpu_ioctl_set_sregs called!!\n");
+	return 0;
+}
+
+int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
+				  struct kvm_sregs *sregs)
+{
+	printk(KERN_WARNING "kvm: kvm_arch_vcpu_ioctl_get_sregs called!!\n");
+	return 0;
+
+}
Suggestion: if get/set sregs doesn't seem useful on ia64, why not return -EINVAL? In that case, you could also not print a kern warning; the user will either handle that situation or complain.
+int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
+{ [snip]
+	/* FIXME: need to remove it later!! */
+	vcpu->arch.apic = kzalloc(sizeof(struct kvm_lapic), GFP_KERNEL);
+	vcpu->arch.apic->vcpu = vcpu;
Fixme! Removed!
+static int vti_vcpu_setup(struct kvm_vcpu *vcpu, int id)
+{
+	unsigned long psr;
+	int r;
+
+	local_irq_save(psr);
+	r = kvm_insert_vmm_mapping(vcpu);
+	if (r)
+		goto fail;
+	r = kvm_vcpu_init(vcpu, vcpu->kvm, id);
+	if (r)
+		goto fail;
Maybe change to return r, rather than goto fail? It should be the same.
+int kvm_arch_vcpu_ioctl_get_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+	printk(KERN_WARNING "kvm: IA64 doesn't need to export fpu to userspace!\n");
+	return 0;
+}
+
+int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
+{
+	printk(KERN_WARNING "kvm: IA64 doesn't need to export fpu to userspace!\n");
+	return 0;
+}
maybe -EINVAL? Good suggestion!
+static int find_highest_bits(int *dat)
+{
+	u32 bits, bitnum;
+	int i;
+
+	/* loop for all 256 bits */
+	for (i = 7; i >= 0; i--) {
+		bits = dat[i];
+		if (bits) {
+			bitnum = fls(bits);
+			return i * 32 + bitnum - 1;
+		}
+	}
+
+	return -1;
+}
Should be in asm/bitops.h. Look at find_first_bit() and friends, this is duplicate. It seems find_first_bit can only be used to find the lowest bit? Xiantao
Re: [kvm-devel] [04/17] [PATCH] Add kvm arch-specific core code for kvm/ia64.-V8
Carsten Otte wrote: Zhang, Xiantao wrote: [snip: same lid_to_vcpu()/handle_ipi() hunk as quoted in the previous message] *Shrug*. This looks highly racy to me. You do access various values in target_vcpu without any lock! I know that taking the target vcpu's lock doesn't work, because that one is held all the time during KVM_VCPU_RUN. My solution to that was struct local_interrupt, which has its own lock, and has the waitqueue plus everything I need to send a sigp [that's our flavor of ipi]. Hi, Carsten Why do you think it is racy? In this function, target_vcpu->arch.launched should be set to 1 on the first run, and keeps its value all the time. Except for the first IPI that wakes up the vcpu, all IPIs received by the target vcpu should take the else branch. So you mean the race condition exists in the else code? Xiantao