[kvm-devel] [RFC][PATCH 0/2]In-kernel PIT model
Hi,

The patch moves the PIT from QEMU into the kernel, which greatly increases the accuracy of the KVM guest timer. The code is mostly based on QEMU's and Xen's code.

The patch works well on an IA32e host (passed 2.6.22, 2.6.20, 2.6.18, 2.6.16 with hpet=disable, 2.6.9 with clock=pit) and is mostly OK on a PAE host (passed 2.6.18, 2.6.9 with clock=pit). But the Linux 2.6.16 guest's timer on a PAE host is inaccurate with the patch, running about 1/10 faster. This is a regression, since it is OK with QEMU. The same kernel on IA32e needs hpet=disable to stay accurate, but I can't find a parameter that makes it accurate on a PAE host with 2.6.16.

The patch doesn't contain the save/restore part, because that function is broken in KVM now (I may spend some effort on this issue later).

Any comments are welcome!

Thanks
Yang, Sheng

From 56a50952929f9a7e78fc3ec812dd4550c623b956 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Mon, 21 Jan 2008 16:42:37 +0800
Subject: [PATCH] KVM: In-kernel PIT model

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/Makefile      |    3 +-
 arch/x86/kvm/i8254.c       |  589
 arch/x86/kvm/i8254.h       |   60 +
 arch/x86/kvm/irq.c         |    3 +
 arch/x86/kvm/x86.c         |    9 +
 include/asm-x86/kvm_host.h |    1 +
 include/linux/kvm.h        |    2 +
 7 files changed, 666 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/kvm/i8254.c
 create mode 100644 arch/x86/kvm/i8254.h

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index ffdd0b3..4d0c22e 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -6,7 +6,8 @@ common-objs = $(addprefix ../../../virt/kvm/, kvm_main.o ioapic.o)
 
 EXTRA_CFLAGS += -Ivirt/kvm -Iarch/x86/kvm
 
-kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o
+kvm-objs := $(common-objs) x86.o mmu.o x86_emulate.o i8259.o irq.o lapic.o \
+	i8254.o
 obj-$(CONFIG_KVM) += kvm.o
 kvm-intel-objs = vmx.o
 obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
new file mode 100644
index 0000000..f5b53a5
--- /dev/null
+++ b/arch/x86/kvm/i8254.c
@@ -0,0 +1,589 @@
+/*
+ * 8253/8254 interval timer emulation
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2006 Intel Corporation
+ * Copyright (c) 2007 Keir Fraser, XenSource Inc
+ * Copyright (c) 2008 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ * Authors:
+ *   Sheng Yang [EMAIL PROTECTED]
+ *   Port from QEMU and Xen.
+ */
+
+#include <linux/kvm_host.h>
+
+#include "i8254.h"
+
+#define pit_debug(fmt, arg...) printk(KERN_WARNING fmt, ##arg)
+/* #define pit_debug(fmt, arg...) */
+
+#ifndef CONFIG_X86_64
+#define mod_64(x, y) ((x) - (y) * div64_64(x, y))
+#else
+#define mod_64(x, y) ((x) % (y))
+#endif
+
+#define RW_STATE_LSB 1
+#define RW_STATE_MSB 2
+#define RW_STATE_WORD0 3
+#define RW_STATE_WORD1 4
+
+/* Compute with 96 bit intermediate result: (a*b)/c */
+static u64 muldiv64(u64 a, u32 b, u32 c)
+{
+	union {
+		u64 ll;
+		struct {
+#ifdef WORDS_BIGENDIAN
+			u32 high, low;
+#else
+			u32 low, high;
+#endif
+		} l;
+	} u, res;
+	u64 rl, rh;
+
+	u.ll = a;
+	rl = (u64)u.l.low * (u64)b;
+	rh = (u64)u.l.high * (u64)b;
+	rh += (rl >> 32);
+	res.l.high = div64_64(rh, c);
+	res.l.low = div64_64(((mod_64(rh, c) << 32) + (rl & 0xffffffff)), c);
+	return res.ll;
+}
+
+static void pit_set_gate(struct kvm *kvm, int channel, u32 val)
+{
+	struct PITChannelState *c =
+		&kvm->arch.vpit->pit_state.channels[channel];
+	struct kvm_vcpu *vcpu = kvm->vcpus[0];
+
+	ASSERT(mutex_is_locked(&kvm->arch.vpit->pit_state.lock));
+
+	switch (c->mode) {
+	default:
+	case 0:
+	case 4:
+		/* XXX: just disable/enable counting */
+		break;
+	case 1:
+	case 2:
+	case 3:
+	case 5:
+		/* Restart counting on rising edge. */
+		if (c->gate < val)
+			kvm_get_msr(vcpu, MSR_IA32_TIME_STAMP_COUNTER,
+				    &c->count_load_time);
+		break;
+	}
+
+	c->gate = val;
+}
+
+int pit_get_gate(struct kvm
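The muldiv64() helper in the patch computes (a*b)/c through a 96-bit intermediate so that the product cannot overflow 64 bits. A minimal userspace sketch of the same idea follows. It assumes plain C99 integer types and the native `/` and `%` operators instead of the kernel's u64, div64_64() and mod_64(), so it illustrates the technique and is not the patch's code. It also assumes c is non-zero and that the quotient fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Compute (a * b) / c with a 96-bit intermediate product.
 * Assumption: c != 0 and the quotient fits in 64 bits. */
static uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)
{
    uint64_t lo = (a & 0xffffffffu) * b;  /* low half of a times b */
    uint64_t hi = (a >> 32) * b;          /* high half of a times b */

    hi += lo >> 32;                       /* carry from low into high part */

    /* Long division, one 32-bit digit of the quotient at a time. */
    uint64_t q_hi = hi / c;
    uint64_t q_lo = (((hi % c) << 32) + (lo & 0xffffffffu)) / c;

    return (q_hi << 32) | q_lo;
}
```

For instance, muldiv64(elapsed_ns, 1193181, 1000000000) would scale elapsed nanoseconds to ticks at the PIT_FREQ value quoted later in the thread. Whether the patch calls the helper exactly this way is not visible in the truncated excerpt.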
[kvm-devel] [RFC][PATCH 2/2] kvm: libkvm: In-kernel PIT model
From 5f7e9bf8856602cf8ffcb50ff744ee1d0058a850 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Mon, 21 Jan 2008 16:41:47 +0800
Subject: [PATCH] kvm: libkvm: In-kernel PIT model

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 kernel/Kbuild       |    2 +-
 libkvm/kvm-common.h |    2 ++
 libkvm/libkvm.c     |   20 ++++++++++++++++++++
 qemu/qemu-kvm.c     |    4 ++++
 qemu/qemu-kvm.h     |    1 +
 qemu/vl.c           |    6 ++++++
 6 files changed, 34 insertions(+), 1 deletions(-)

diff --git a/kernel/Kbuild b/kernel/Kbuild
index ed02f5a..014cc17 100644
--- a/kernel/Kbuild
+++ b/kernel/Kbuild
@@ -1,7 +1,7 @@
 EXTRA_CFLAGS := -I$(src)/include -include $(src)/external-module-compat.h
 obj-m := kvm.o kvm-intel.o kvm-amd.o
 kvm-objs := kvm_main.o x86.o mmu.o x86_emulate.o anon_inodes.o irq.o i8259.o \
-	lapic.o ioapic.o preempt.o
+	lapic.o ioapic.o preempt.o i8254.o
 kvm-intel-objs := vmx.o vmx-debug.o
 kvm-amd-objs := svm.o
diff --git a/libkvm/kvm-common.h b/libkvm/kvm-common.h
index f4040be..bd9f1de 100644
--- a/libkvm/kvm-common.h
+++ b/libkvm/kvm-common.h
@@ -47,6 +47,8 @@ struct kvm_context {
 	int no_irqchip_creation;
 	/// in-kernel irqchip status
 	int irqchip_in_kernel;
+	/// do not create in-kernel pit if set
+	int no_pit_creation;
 };
 
 void init_slots(void);
diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index 45f58d6..15e7c0d 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -271,6 +271,11 @@ void kvm_disable_irqchip_creation(kvm_context_t kvm)
 	kvm->no_irqchip_creation = 1;
 }
 
+void kvm_disable_pit_creation(kvm_context_t kvm)
+{
+	kvm->no_pit_creation = 1;
+}
+
 int kvm_create_vcpu(kvm_context_t kvm, int slot)
 {
 	long mmap_size;
@@ -368,6 +373,20 @@ void kvm_create_irqchip(kvm_context_t kvm)
 #endif
 }
 
+void kvm_create_pit(kvm_context_t kvm)
+{
+	int r;
+
+	if (!kvm->no_pit_creation) {
+		r = ioctl(kvm->fd, KVM_CHECK_EXTENSION, KVM_CAP_PIT);
+		if (r > 0) {
+			r = ioctl(kvm->vm_fd, KVM_CREATE_PIT);
+			if (r < 0)
+				printf("Create kernel PIC irqchip failed\n");
+		}
+	}
+}
+
 int kvm_create(kvm_context_t kvm, unsigned long phys_mem_bytes, void **vm_mem)
 {
 	int r;
@@ -383,6 +402,7 @@ int kvm_create(kvm_context_t kvm, unsigned long phys_mem_bytes, void **vm_mem)
 	if (r < 0)
 		return r;
 	kvm_create_irqchip(kvm);
+	kvm_create_pit(kvm);
 	r = kvm_create_vcpu(kvm, 0);
 	if (r < 0)
 		return r;
diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index fddbbd6..a4f4761 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -10,6 +10,7 @@
 
 int kvm_allowed = KVM_ALLOWED_DEFAULT;
 int kvm_irqchip = 1;
+int kvm_pit = 1;
 
 #ifdef USE_KVM
 
@@ -556,6 +557,9 @@ int kvm_qemu_create_context(void)
     if (!kvm_irqchip) {
         kvm_disable_irqchip_creation(kvm_context);
     }
+    if (!kvm_pit) {
+        kvm_disable_pit_creation(kvm_context);
+    }
     if (kvm_create(kvm_context, phys_ram_size, (void**)&phys_ram_base) < 0) {
         kvm_qemu_destroy();
         return -1;
diff --git a/qemu/qemu-kvm.h b/qemu/qemu-kvm.h
index c4514bb..883a4da 100644
--- a/qemu/qemu-kvm.h
+++ b/qemu/qemu-kvm.h
@@ -42,6 +42,7 @@ void kvm_arch_update_regs_for_sipi(CPUState *env);
 
 extern int kvm_allowed;
 extern int kvm_irqchip;
+extern int kvm_pit;
 
 void kvm_tpr_opt_setup(CPUState *env);
 void kvm_tpr_access_report(CPUState *env, uint64_t rip, int is_write);
diff --git a/qemu/vl.c b/qemu/vl.c
index 756e13d..5b76c8d 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -8015,6 +8015,7 @@ static void help(int exitcode)
 #ifdef USE_KVM
            "-no-kvm         disable KVM hardware virtualization\n"
            "-no-kvm-irqchip disable KVM kernel mode PIC/IOAPIC/LAPIC\n"
+           "-no-kvm-pit     disable KVM kernel mode PIT\n"
 #endif
 #ifdef TARGET_I386
            "-std-vga        simulate a standard VGA card with VESA Bochs Extensions\n"
@@ -8131,6 +8132,7 @@ enum {
     QEMU_OPTION_no_acpi,
     QEMU_OPTION_no_kvm,
     QEMU_OPTION_no_kvm_irqchip,
+    QEMU_OPTION_no_kvm_pit,
     QEMU_OPTION_no_reboot,
     QEMU_OPTION_show_cursor,
     QEMU_OPTION_daemonize,
@@ -8213,6 +8215,7 @@ const QEMUOption qemu_options[] = {
 #ifdef USE_KVM
     { "no-kvm", 0, QEMU_OPTION_no_kvm },
     { "no-kvm-irqchip", 0, QEMU_OPTION_no_kvm_irqchip },
+    { "no-kvm-pit", 0, QEMU_OPTION_no_kvm_pit },
 #endif
 #if defined(TARGET_PPC) || defined(TARGET_SPARC)
     { "g", 1, QEMU_OPTION_g },
@@ -9046,6 +9049,9 @@ int main(int argc, char **argv)
             case QEMU_OPTION_no_kvm_irqchip:
                 kvm_irqchip = 0;
                 break;
+            case QEMU_OPTION_no_kvm_pit:
+                kvm_pit = 0;
+                break;
 #endif
             case QEMU_OPTION_usb:
                 usb_enabled = 1;
-- 
debian.1.5.3.7.1-dirty
Re: [kvm-devel] [PATCH] virtio: remove explicit pci ids from virtio_pci.c
Anthony Liguori wrote:
> Rusty Russell wrote:
>> Qumranet let us use their PCI vendor ID, with device ids >= 0x1000. We can specify that we accept all of them in the device ID table, and then return -ENODEV in the probe routine.
>
> I thought the device id range was smaller. Avi?

Yes, 0x1000-0x10ff IIRC.

Also, if we break compatibility in the future, won't this claim devices we don't actually support?

-- 
error compiling committee.c: too many arguments to function

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] fix reading from character devices
Hi Avi,

commit "kvm: qemu: consume all pending I/O in I/O loop" (8ab8bb09f1115b9bf733f885cc92b6c63d83f420) broke reading data bursts from serial devices (and maybe from other character devices as well) by guests. Reason: serial devices do input flow control via fd_read_poll, but qemu now ignores this fact by pushing all data into the virtual device as soon as it is available.

Patch below is not really nice (just as the whole internal virtual I/O interface at the moment, IMHO), but it re-enables the serial ports for now.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

---
 qemu/vl.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: kvm-userspace/qemu/vl.c
===================================================================
--- kvm-userspace.orig/qemu/vl.c
+++ kvm-userspace/qemu/vl.c
@@ -7751,7 +7751,10 @@ void main_loop_wait(int timeout)
         for(ioh = first_io_handler; ioh != NULL; ioh = ioh->next) {
             if (!ioh->deleted && ioh->fd_read && FD_ISSET(ioh->fd, &rfds)) {
                 ioh->fd_read(ioh->opaque);
-                more = 1;
+                if (!ioh->fd_read_poll || ioh->fd_read_poll(ioh->opaque))
+                    more = 1;
+                else
+                    FD_CLR(ioh->fd, &rfds);
             }
             if (!ioh->deleted && ioh->fd_write && FD_ISSET(ioh->fd, &wfds)) {
                 ioh->fd_write(ioh->opaque);
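The rule the patch restores is: after each read, ask the device whether it can take more, and only keep looping if it says yes (or if it has no poll callback at all). A minimal sketch of that rule in isolation is below. The struct and function names (io_handler, deliver_once, toy_uart, feed_until_full) are hypothetical stand-ins chosen for illustration; they are not qemu's actual IOHandlerRecord or serial code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for qemu's I/O handler record. */
struct io_handler {
    int (*can_read)(void *opaque);  /* NULL means "always ready" */
    void (*read)(void *opaque);     /* push one chunk into the device */
    void *opaque;
};

/* Deliver one chunk, then report whether another pass is allowed. */
static int deliver_once(struct io_handler *h)
{
    h->read(h->opaque);
    return !h->can_read || h->can_read(h->opaque) > 0;
}

/* Toy device with a small FIFO, used only to exercise the sketch. */
struct toy_uart {
    int fifo_free;  /* free FIFO slots */
    int received;   /* chunks accepted so far */
};

static int toy_can_read(void *opaque)
{
    return ((struct toy_uart *)opaque)->fifo_free;
}

static void toy_read(void *opaque)
{
    struct toy_uart *u = opaque;
    u->fifo_free--;
    u->received++;
}

/* Feed a device with fifo_slots free slots (assumed >= 1) until it
 * applies back-pressure; returns the number of chunks accepted. */
static int feed_until_full(int fifo_slots)
{
    struct toy_uart u = { fifo_slots, 0 };
    struct io_handler h = { toy_can_read, toy_read, &u };

    while (deliver_once(&h))
        ;
    return u.received;
}
```

With three free slots the loop stops after exactly three deliveries instead of pushing everything that happens to be pending on the host side.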
Re: [kvm-devel] [RFC][PATCH 1/2] KVM: In-kernel PIT model
Yang, Sheng wrote:
> --- /dev/null
> +++ b/arch/x86/kvm/i8254.h
...
> +#define PIT_BASE_ADDRESS 0x40
> +#define SPEAKER_BASE_ADDRESS 0x61
> +#define PIT_MEM_LENGTH 4
> +#define PIT_FREQ 1193181

The PIT may not be limited to x86 platforms. So I would propose to make the setup more generic and flexible. And I would move the code out of arch/x86, just the speaker support should remain there.

I'm currently struggling with emulating a proprietary platform that has (among other specialties...) a different PIT base frequency, and I already had to patch user space qemu for customizable frequencies. Maybe this kernel extension is a good chance to generalize the PIT setup, and I would be happy to contribute to this if there is a consensus.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [kvm-devel] [RFC][PATCH 1/2] KVM: In-kernel PIT model
Avi Kivity wrote:
> Jan Kiszka wrote:
>> The PIT may not be limited to x86 platforms. So I would propose to make the setup more generic and flexible. And I would move the code out of arch/x86, just the speaker support should remain there.
>>
>> I'm currently struggling with emulating a proprietary platform that has (among other specialties...) a different PIT base frequency, and I already had to patch user space qemu for customizable frequencies. Maybe this kernel extension is a good chance to generalize the PIT setup, and I would be happy to contribute to this if there is a consensus.
>
> Certainly an ioctl() to configure the PIT can be added. I think that we can leave that to a later patch though.

I would rather stuff these parameters into KVM_CREATE_PIT right from the beginning than later breaking the kernel/user ABI or adding a clumsy KVM_CREATE_PIT_SPECIAL_EXTENDED_VERSION. :->

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [kvm-devel] [RFC][PATCH 1/2] KVM: In-kernel PIT model
Jan Kiszka wrote:
> The PIT may not be limited to x86 platforms. So I would propose to make the setup more generic and flexible. And I would move the code out of arch/x86, just the speaker support should remain there.

It should also not be common among all archs. On s390 we have the CPU timer, which is way superior to the PIT (clock cycle granularity, no vmexit to set it up or deliver the irq, no hypervisor support needed because it works transparently).
Re: [kvm-devel] [PATCH] add acpi powerbutton support
Hi Guido,

[posting via gmane sucks, just re-enabled mail delivery in this account...]

Guido Guenther wrote:
> Hi Jan,
> On Sat, Jan 19, 2008 at 04:40:06PM +0100, Jan Kiszka wrote:
>> What about additionally listening on signals? If you run qemu from the console, you can then just press ctrl-c to shut the guest down (instead
> Catching ctrl-c sounds like a good idea but "ctrl-c, ctrl-c" should probably kill qemu then, since the machine might have no acpid running - in that case hitting ctrl-c would have no effect.

Good idea.

>> of killing it that way). The same happens on host shutdown (if the guest is faster than the host's grace period before SIGKILL...).
>
>> +signal(SIGINT, qemu_powerdown_sighand);
>> +signal(SIGTERM, qemu_powerdown_sighand);
> We shouldn't catch SIGTERM here since libvirt uses it for domainDestroy() (in contrast to domainShutdown() which uses system_powerdown).

Something like this? I also included the SDL window this time, and at least I like it this way now :->

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

---
 qemu/sdl.c    |    2 +-
 qemu/sysemu.h |    2 +-
 qemu/vl.c     |   21 ++++++++++++++++++++-
 3 files changed, 22 insertions(+), 3 deletions(-)

Index: b/qemu/vl.c
===================================================================
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -7653,9 +7653,21 @@ void qemu_system_shutdown_request(void)
         cpu_interrupt(cpu_single_env, CPU_INTERRUPT_EXIT);
 }
 
+/* more than one requests within 2 s => hard powerdown */
+#define HARD_POWERDOWN_WINDOW 2
+
 void qemu_system_powerdown_request(void)
 {
-    powerdown_requested = 1;
+    static time_t last_request;
+    time_t now, delta;
+
+    now = time(NULL);
+    delta = now-last_request;
+    last_request = now;
+    if (delta < 0 || delta > HARD_POWERDOWN_WINDOW)
+        powerdown_requested = 1;
+    else
+        shutdown_requested = 1;
     if (cpu_single_env)
         cpu_interrupt(cpu_single_env, CPU_INTERRUPT_EXIT);
 }
@@ -8501,6 +8513,11 @@ void qemu_get_launch_info(int *argc, cha
     *opt_incoming = incoming;
 }
 
+void qemu_powerdown_sighand(int signal)
+{
+    qemu_system_powerdown_request();
+}
+
 int main(int argc, char **argv)
 {
 #ifdef CONFIG_GDBSTUB
@@ -9475,6 +9492,8 @@ int main(int argc, char **argv)
         }
     }
 
+    signal(SIGINT, qemu_powerdown_sighand);
+
     machine->init(ram_size, vga_ram_size, boot_devices, ds,
                   kernel_filename, kernel_cmdline, initrd_filename, cpu_model);
 
Index: b/qemu/sysemu.h
===================================================================
--- a/qemu/sysemu.h
+++ b/qemu/sysemu.h
@@ -35,7 +35,7 @@ int qemu_reset_requested(void);
 int qemu_powerdown_requested(void);
 #if !defined(TARGET_SPARC) && !defined(TARGET_I386)
 // Please implement a power failure function to signal the OS
-#define qemu_system_powerdown() do{}while(0)
+#define qemu_system_powerdown() exit(0)
 #else
 void qemu_system_powerdown(void);
 #endif
Index: b/qemu/sdl.c
===================================================================
--- a/qemu/sdl.c
+++ b/qemu/sdl.c
@@ -469,7 +469,7 @@ static void sdl_refresh(DisplayState *ds
             break;
         case SDL_QUIT:
             if (!no_quit) {
-                qemu_system_shutdown_request();
+                qemu_system_powerdown_request();
                 vm_start();     /* In case we're paused */
             }
             break;
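The decision inside qemu_system_powerdown_request() can be looked at on its own: a request that arrives within HARD_POWERDOWN_WINDOW seconds of the previous one escalates to a hard shutdown, anything else is a soft ACPI powerdown. The sketch below is a pure version of that check with both timestamps passed in, which makes it easy to test. The enum and the function name classify_request are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <time.h>

#define HARD_POWERDOWN_WINDOW 2  /* seconds, same value as in the patch */

enum request { SOFT_POWERDOWN, HARD_SHUTDOWN };

/* Pure form of the patch's check: the caller supplies both timestamps.
 * A negative delta (clock stepped backwards) is treated as a first request. */
static enum request classify_request(time_t last, time_t now)
{
    time_t delta = now - last;

    if (delta < 0 || delta > HARD_POWERDOWN_WINDOW)
        return SOFT_POWERDOWN;
    return HARD_SHUTDOWN;
}
```

Note that a delta of exactly 2 seconds still counts as a repeat, because the comparison is strictly greater-than.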
Re: [kvm-devel] [PATCH] fix reading from character devices
Jan Kiszka wrote:
> Hi Avi,
>
> commit "kvm: qemu: consume all pending I/O in I/O loop" (8ab8bb09f1115b9bf733f885cc92b6c63d83f420) broke reading data bursts from serial devices (and maybe from other character devices as well) by guests. Reason: serial devices do input flow control via fd_read_poll, but qemu now ignores this fact by pushing all data into the virtual device as soon as it is available.
>
> Patch below is not really nice (just as the whole internal virtual I/O interface at the moment, IMHO), but it re-enables the serial ports for now.

I'm worried that it will break Dor's hack that speeds up virtio. Dor?

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC][PATCH 1/2] KVM: In-kernel PIT model
Jan Kiszka wrote:
> The PIT may not be limited to x86 platforms. So I would propose to make the setup more generic and flexible. And I would move the code out of arch/x86, just the speaker support should remain there.
>
> I'm currently struggling with emulating a proprietary platform that has (among other specialties...) a different PIT base frequency, and I already had to patch user space qemu for customizable frequencies. Maybe this kernel extension is a good chance to generalize the PIT setup, and I would be happy to contribute to this if there is a consensus.

Certainly an ioctl() to configure the PIT can be added. I think that we can leave that to a later patch though.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH] fix reading from character devices
On Mon, 2008-01-21 at 12:13 +0200, Avi Kivity wrote:
> Jan Kiszka wrote:
>> Hi Avi,
>>
>> commit "kvm: qemu: consume all pending I/O in I/O loop" (8ab8bb09f1115b9bf733f885cc92b6c63d83f420) broke reading data bursts from serial devices (and maybe from other character devices as well) by guests. Reason: serial devices do input flow control via fd_read_poll, but qemu now ignores this fact by pushing all data into the virtual device as soon as it is available.
>>
>> Patch below is not really nice (just as the whole internal virtual I/O interface at the moment, IMHO), but it re-enables the serial ports for now.
>
> I'm worried that it will break Dor's hack that speeds up virtio. Dor?

It should be fine. Tap device without my hack has fd_read_poll null and I hacked it to have a handler that returns false if virtio is used and true otherwise. So Jan's patch should set 'more' to 1 and it will be like before.
Re: [kvm-devel] [RFC][PATCH 1/2] KVM: In-kernel PIT model
Jan Kiszka wrote:
> I would rather stuff these parameters into KVM_CREATE_PIT right from the beginning than later breaking the kernel/user ABI or adding a clumsy KVM_CREATE_PIT_SPECIAL_EXTENDED_VERSION. :->

Binary compatibility is only guaranteed between released kernel versions; development versions might have changes.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH] add acpi powerbutton support
Jan Kiszka wrote:
> Hi Guido,
>
> [posting via gmane sucks, just re-enabled mail delivery in this account...]

Much appreciated.

> Guido Guenther wrote:
>> Hi Jan,
>> On Sat, Jan 19, 2008 at 04:40:06PM +0100, Jan Kiszka wrote:
>>> What about additionally listening on signals? If you run qemu from the console, you can then just press ctrl-c to shut the guest down (instead
>> Catching ctrl-c sounds like a good idea but "ctrl-c, ctrl-c" should probably kill qemu then, since the machine might have no acpid running - in that case hitting ctrl-c would have no effect.
>
> Good idea.

I'm worried about the 30+ second shutdown latency. Is there precedent for SIGTERM or SIGINT requiring this long to take effect?

Maybe we should suspend the VM instead (using qemu suspend, not guest suspend).

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH] fix reading from character devices
Dor Laor wrote:
> On Mon, 2008-01-21 at 12:13 +0200, Avi Kivity wrote:
>> Jan Kiszka wrote:
>>> Hi Avi,
>>>
>>> commit "kvm: qemu: consume all pending I/O in I/O loop" (8ab8bb09f1115b9bf733f885cc92b6c63d83f420) broke reading data bursts from serial devices (and maybe from other character devices as well) by guests. Reason: serial devices do input flow control via fd_read_poll, but qemu now ignores this fact by pushing all data into the virtual device as soon as it is available.
>>>
>>> Patch below is not really nice (just as the whole internal virtual I/O interface at the moment, IMHO), but it re-enables the serial ports for now.
>>
>> I'm worried that it will break Dor's hack that speeds up virtio. Dor?
>
> It should be fine. Tap device without my hack has fd_read_poll null and I hacked it to have a handler that returns false if virtio is used and true otherwise. So Jan's patch should set 'more' to 1 and it will be like before.

But if virtio is used with this patch, it won't set 'more' to 1. Will virtio handle it or will throughput drop to be related to whatever timers we have set up?

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC][PATCH 0/2]In-kernel PIT model
On Mon, Jan 21, 2008 at 05:18:21PM +0800, Yang, Sheng wrote:
> The patch works well on IA32e host(passed 2.6.22, 2.6.20, 2.6.18, 2.6.16 with hpet=disable, 2.6.9 with clock=pit),

slightly off-topic, but how did you get around building KVM and loading it with the undefined symbols from hpet (hrtimer_{cancel,start,init}) and smp (cpu_online_map) with 2.6.18? Or did you patch it and build it statically into the kernel?

Carlo
Re: [kvm-devel] [PATCH] virtio: remove explicit pci ids from virtio_pci.c
On Monday 21 January 2008 20:12:46 Avi Kivity wrote:
> Anthony Liguori wrote:
>> Rusty Russell wrote:
>>> Qumranet let us use their PCI vendor ID, with device ids >= 0x1000. We can specify that we accept all of them in the device ID table, and then return -ENODEV in the probe routine.
>>
>> I thought the device id range was smaller. Avi?
>
> Yes, 0x1000-0x10ff IIRC.
>
> Also, if we break compatibility in the future, won't this claim devices we don't actually support?

Yep, I reduced it to 0x1000 - 0x103F for the moment. We can always expand long before we get our 64th device type.

Cheers,
Rusty.
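The approach discussed here, matching every device ID under the vendor and rejecting unsupported ones in the probe routine, boils down to a range check. A minimal sketch of that check follows; the macro and function names are hypothetical and this is not the actual virtio_pci.c code. Only the bounds 0x1000 and 0x103F come from Rusty's message.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds quoted in the thread; the names are made up for this sketch. */
#define VIRTIO_PCI_DEVICE_ID_MIN 0x1000
#define VIRTIO_PCI_DEVICE_ID_MAX 0x103f

/* Returns nonzero if a probe routine following the thread's plan would
 * accept the device, zero if it would bail out (e.g. with -ENODEV). */
static int virtio_pci_id_supported(uint16_t device_id)
{
    return device_id >= VIRTIO_PCI_DEVICE_ID_MIN &&
           device_id <= VIRTIO_PCI_DEVICE_ID_MAX;
}
```

The range 0x1000 through 0x103F covers 64 IDs, which matches the "64th device type" remark.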
Re: [kvm-devel] [PATCH] fix reading from character devices
Avi Kivity wrote:
> Dor Laor wrote:
>> On Mon, 2008-01-21 at 12:13 +0200, Avi Kivity wrote:
>>> Jan Kiszka wrote:
>>>> Hi Avi,
>>>>
>>>> commit "kvm: qemu: consume all pending I/O in I/O loop" (8ab8bb09f1115b9bf733f885cc92b6c63d83f420) broke reading data bursts from serial devices (and maybe from other character devices as well) by guests. Reason: serial devices do input flow control via fd_read_poll, but qemu now ignores this fact by pushing all data into the virtual device as soon as it is available.
>>>>
>>>> Patch below is not really nice (just as the whole internal virtual I/O interface at the moment, IMHO), but it re-enables the serial ports for now.
>>>
>>> I'm worried that it will break Dor's hack that speeds up virtio. Dor?
>>
>> It should be fine. Tap device without my hack has fd_read_poll null and I hacked it to have a handler that returns false if virtio is used and true otherwise. So Jan's patch should set 'more' to 1 and it will be like before.
>
> But if virtio is used with this patch, it won't set 'more' to 1. Will virtio handle it or will throughput drop to be related to whatever timers we have set up?

As long as virtio_net_can_receive returns 1, the loop will keep on feeding this device as well. And if it returns 0, I see no reason for ignoring this signal. Anything else would be, well, weird.

However, the current io-polling loop in git urgently needs fixing, or even a redesign.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Re: [kvm-devel] [PATCH] add acpi powerbutton support
Avi Kivity wrote:
> Jan Kiszka wrote:
>> Guido Guenther wrote:
>>> Hi Jan,
>>> On Sat, Jan 19, 2008 at 04:40:06PM +0100, Jan Kiszka wrote:
>>>> What about additionally listening on signals? If you run qemu from the console, you can then just press ctrl-c to shut the guest down (instead
>>> Catching ctrl-c sounds like a good idea but "ctrl-c, ctrl-c" should probably kill qemu then, since the machine might have no acpid running - in that case hitting ctrl-c would have no effect.
>>
>> Good idea.
>
> I'm worried about the 30+ second shutdown latency. Is there precedent for SIGTERM or SIGINT requiring this long to take effect?

Sorry, can't follow this yet: Are you talking about host system shutdown that should wait on the guest system that long?

> Maybe we should suspend the VM instead (using qemu suspend, not guest suspend).

You mean on SIGTERM? I think Guido's concern was that this signal is expected to actually kill the qemu instance. Therefore I dropped this signal from my second patch.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
[kvm-devel] fix a potential issue
From ab229d437b59317a322ca0efd2b0d239b74e Mon Sep 17 00:00:00 2001
From: Feng(Eric) Liu [EMAIL PROTECTED]
Date: Tue, 22 Jan 2008 17:01:29 -0500
Subject: [PATCH] kvm: mmu: fix a potential issue

Since gpa is a u64, extend the pte_size mask to u64 as well. Otherwise the mask may be computed in a narrower type and clear the upper bits of gpa.

Signed-off-by: Feng(Eric) Liu [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index cb62ef6..c85b904 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1477,7 +1477,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	if ((gpa & (pte_size - 1)) || (bytes < pte_size)) {
 		gentry = 0;
 		r = kvm_read_guest_atomic(vcpu->kvm,
-					  gpa & ~(pte_size - 1),
+					  gpa & ~(u64)(pte_size - 1),
 					  &gentry, pte_size);
 		new = (const void *)&gentry;
 		if (r < 0)

--Eric (Liu, Feng)
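The "potential issue" is a C integer-conversion effect. If pte_size is a 32-bit unsigned value (an assumption; the excerpt does not show its declaration), then ~(pte_size - 1) is a 32-bit mask that gets zero-extended when combined with a 64-bit gpa, which clears the upper 32 bits of the address. A small sketch with hypothetical helper names shows the difference the cast makes:

```c
#include <assert.h>
#include <stdint.h>

/* Mask computed in 32 bits, then zero-extended: upper half of gpa is lost. */
static uint64_t align_narrow(uint64_t gpa, uint32_t pte_size)
{
    return gpa & ~(pte_size - 1);
}

/* Mask widened to 64 bits before inversion, as in the patch: upper half kept. */
static uint64_t align_wide(uint64_t gpa, uint32_t pte_size)
{
    return gpa & ~(uint64_t)(pte_size - 1);
}
```

The two versions agree for guest physical addresses below 4 GiB, which is presumably why the problem was only described as potential.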
[kvm-devel] [PATCH] janitor: revert accidental type change
While trying to reduce the warning noise (to identify warnings of homebrewed patches), I also came across this bogus but fortunately harmless type change in bdrv_commit. Fix below.

Jan

Index: kvm-userspace/qemu/block.c
===================================================================
--- kvm-userspace.orig/qemu/block.c
+++ kvm-userspace/qemu/block.c
@@ -460,7 +460,7 @@ int bdrv_commit(BlockDriverState *bs)
     BlockDriver *drv = bs->drv;
     int64_t i, total_sectors;
     int n, j;
-    unsigned char *sector[512];
+    unsigned char sector[512];
 
     if (!drv)
         return -ENOMEDIUM;

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
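Why the type change was bogus yet harmless can be seen from the sizes involved: an array of 512 pointers is at least as large as a 512-byte sector buffer, so reads into it do not overflow, they just use the wrong type and waste stack. A minimal sketch with hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>

/* The accidental declaration: 512 pointers to unsigned char. */
static size_t pointer_array_bytes(void)
{
    unsigned char *sector[512];
    return sizeof(sector);
}

/* The intended declaration: a 512-byte buffer, one disk sector. */
static size_t byte_buffer_bytes(void)
{
    unsigned char sector[512];
    return sizeof(sector);
}
```

On a typical 64-bit host the pointer array occupies 4096 bytes instead of 512.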
[kvm-devel] [PATCH] kvm memslot read-locking with mmu_lock
This adds locking to the memslots so they can be looked up with only the mmu_lock. Entries whose memslot->userspace_addr is not yet set have to be ignored because they're not fully inserted yet.

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8a90403..35a2ee0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3219,14 +3249,20 @@ int kvm_arch_set_memory_region(struct kvm *kvm,
 	 */
 	if (!user_alloc) {
 		if (npages && !old.rmap) {
-			memslot->userspace_addr = do_mmap(NULL, 0,
-						     npages * PAGE_SIZE,
-						     PROT_READ | PROT_WRITE,
-						     MAP_SHARED | MAP_ANONYMOUS,
-						     0);
-
-			if (IS_ERR((void *)memslot->userspace_addr))
-				return PTR_ERR((void *)memslot->userspace_addr);
+			unsigned long userspace_addr;
+
+			userspace_addr = do_mmap(NULL, 0,
+						 npages * PAGE_SIZE,
+						 PROT_READ | PROT_WRITE,
+						 MAP_SHARED | MAP_ANONYMOUS,
+						 0);
+			if (IS_ERR((void *)userspace_addr))
+				return PTR_ERR((void *)userspace_addr);
+
+			/* set userspace_addr atomically for kvm_hva_to_rmapp */
+			spin_lock(&kvm->mmu_lock);
+			memslot->userspace_addr = userspace_addr;
+			spin_unlock(&kvm->mmu_lock);
 		} else {
 			if (!old.user_alloc && old.rmap) {
 				int ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4295623..a67e38f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -298,7 +299,15 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		memset(new.rmap, 0, npages * sizeof(*new.rmap));
 
 		new.user_alloc = user_alloc;
-		new.userspace_addr = mem->userspace_addr;
+		/*
+		 * hva_to_rmmap() serializes with the mmu_lock and to be
+		 * safe it has to ignore memslots with !user_alloc &&
+		 * !userspace_addr.
+		 */
+		if (user_alloc)
+			new.userspace_addr = mem->userspace_addr;
+		else
+			new.userspace_addr = 0;
 	}
 
 	/* Allocate page dirty bitmap if needed */
@@ -311,14 +320,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		memset(new.dirty_bitmap, 0, dirty_bytes);
 	}
 
+	spin_lock(&kvm->mmu_lock);
 	if (mem->slot >= kvm->nmemslots)
 		kvm->nmemslots = mem->slot + 1;
 
 	*memslot = new;
+	spin_unlock(&kvm->mmu_lock);
 
 	r = kvm_arch_set_memory_region(kvm, mem, old, user_alloc);
 	if (r) {
+		spin_lock(&kvm->mmu_lock);
 		*memslot = old;
+		spin_unlock(&kvm->mmu_lock);
 		goto out_free;
 	}
 
Re: [kvm-devel] [PATCH] qemu: fix some warnings
On Mon, Jan 21, 2008 at 01:46:11PM +0100, Jan Kiszka wrote:
> Here are 4 more warnings fixes (actually, I should sent 2 of them to
> qemu...). Nothing critical, just less noise during compilation.

Probably a good idea to have them in independent patches, as they are unrelated (other than the fact that they are all warnings in current git).

> Index: b/qemu/vl.c
> ===================================================================
> --- a/qemu/vl.c
> +++ b/qemu/vl.c
> @@ -8862,7 +8862,7 @@ int main(int argc, char **argv)
>              if (ram_size <= 0)
>                  help(1);
>              if (ram_size > PHYS_RAM_MAX_SIZE) {
> -                fprintf(stderr, "qemu: at most %d MB RAM can be simulated\n",
> +                fprintf(stderr, "qemu: at most %llu MB RAM can be simulated\n",
>                          PHYS_RAM_MAX_SIZE / (1024 * 1024));

Using TARGET_FMT_lu instead of %llu would seem more appropriate here because this is meant to reflect a physical memory address, but then the fact that kvm is using the x86_64 target also for 32 bit will mean that the definition of PHYS_RAM_MAX_SIZE has to be made somehow also HOST specific.

For my take on that (which will need to be updated with your version of the exec.c changes and re-tested) look at:

http://tapir.sajinet.com.pe/gentoo/portage/app-emulation/kvm/files/kvm-51-qemu-ramaddr.patch

Carlo
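To illustrate the format-specifier point discussed above, here is a minimal sketch in plain C. It does not use qemu's TARGET_FMT_lu macro; instead it assumes the limit is held in a uint64_t and relies on the standard PRIu64 macro, so the specifier matches the argument width on both 32-bit and 64-bit hosts. The function name format_ram_limit() is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the "at most N MB" message for a byte limit held in a
 * fixed-width 64-bit type. PRIu64 expands to the right conversion
 * specifier for uint64_t on the host, which avoids the %d vs. %llu
 * mismatch warning.
 */
static int format_ram_limit(char *buf, size_t len, uint64_t max_bytes)
{
    return snprintf(buf, len,
                    "qemu: at most %" PRIu64 " MB RAM can be simulated\n",
                    max_bytes / (1024 * 1024));
}
```

The trade-off is the one Carlo points out: a fixed 64-bit type sidesteps the warning, but it does not by itself make the limit host specific.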
[kvm-devel] [RFC][PATCH 2/5] add new exported function replace_page()
-- 
woof.

From c6fc21397e37481696723115cb1680f42661be48 Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED]
Date: Mon, 21 Jan 2008 17:04:45 +0200
Subject: [PATCH] memory.c: add new exported function replace_page()

replace_page() - replace the pte mapping related to vm area between two pages
(from oldpage to newpage)

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 include/linux/mm.h |    5 +++-
 mm/memory.c        |   60 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+), 1 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1b7b95c..a311c25 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1094,7 +1094,10 @@ int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
 int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
-			unsigned long pfn);
+			unsigned long pfn);
+
+int replace_page(struct vm_area_struct *vma, struct page *oldpage,
+		 struct page *newpage, pgprot_t prot);
 
 struct page *follow_page(struct vm_area_struct *, unsigned long address,
 			unsigned int foll_flags);
diff --git a/mm/memory.c b/mm/memory.c
index 4bf0b6d..d8cb36b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2360,6 +2360,66 @@ static int do_linear_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	return __do_fault(mm, vma, address, pmd, pgoff, flags, orig_pte);
 }
 
+/**
+ * replace_page - replace the pte mapping related to vm area between two pages
+ * (from oldpage to newpage)
+ */
+int replace_page(struct vm_area_struct *vma, struct page *oldpage,
+		 struct page *newpage, pgprot_t prot)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *ptep;
+	spinlock_t *ptl;
+	unsigned long addr;
+	int ret;
+
+	BUG_ON(!PageLocked(oldpage));
+
+	ret = -EFAULT;
+	addr = page_address_in_vma(oldpage, vma);
+	if (addr == -EFAULT)
+		goto out;
+
+	pgd = pgd_offset(mm, addr);
+	if (!pgd_present(*pgd))
+		goto out;
+
+	pud = pud_offset(pgd, addr);
+	if (!pud_present(*pud))
+		goto out;
+
+	pmd = pmd_offset(pud, addr);
+	if (!pmd_present(*pmd))
+		goto out;
+
+	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
+	if (!ptep)
+		goto out;
+
+	ret = 0;
+	get_page(newpage);
+	page_add_file_rmap(newpage);
+
+	flush_cache_page(vma, addr, pte_pfn(*ptep));
+	ptep_clear_flush(vma, addr, ptep);
+	set_pte_at(mm, addr, ptep, mk_pte(newpage, prot));
+
+	page_remove_rmap(oldpage, vma);
+	if (PageAnon(oldpage)) {
+		dec_mm_counter(mm, anon_rss);
+		inc_mm_counter(mm, file_rss);
+	}
+	put_page(oldpage);
+
+	pte_unmap_unlock(ptep, ptl);
+out:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(replace_page);
+
 /*
  * do_no_pfn() tries to create a new page mapping for a page without
-- 
1.5.3.6
[kvm-devel] [RFC][PATCH 1/5] rmap: add new exported function: page_wrprotect()
From 45e5a255b004e0d578007576304a6b1e272fcb67 Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED]
Date: Mon, 21 Jan 2008 16:59:35 +0200
Subject: [PATCH] rmap: add new exported function: page_wrprotect(),

page_wrprotect() make the page as read only by setting the ptes point to it
as read only.

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 include/linux/rmap.h |    7 ++++
 mm/rmap.c            |   91 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+), 0 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 97347f2..dac41af 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -108,6 +108,8 @@ unsigned long page_address_in_vma(struct page *, struct vm_area_struct *);
  */
 int page_mkclean(struct page *);
 
+int page_wrprotect(struct page *page);
+
 #else	/* !CONFIG_MMU */
 
 #define anon_vma_init()	do {} while (0)
@@ -122,6 +124,11 @@ static inline int page_mkclean(struct page *page)
 	return 0;
 }
 
+static inline int page_wrprotect(struct page *page)
+{
+	return 0;
+}
+
 #endif	/* CONFIG_MMU */
 
diff --git a/mm/rmap.c b/mm/rmap.c
index dbc2ca2..462b7c9 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -484,6 +484,97 @@ int page_mkclean(struct page *page)
 }
 EXPORT_SYMBOL_GPL(page_mkclean);
 
+static int page_wrprotect_one(struct page *page, struct vm_area_struct *vma)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	unsigned long address;
+	pte_t *pte;
+	spinlock_t *ptl;
+	int ret = 0;
+
+	address = vma_address(page, vma);
+	if (address == -EFAULT)
+		goto out;
+
+	pte = page_check_address(page, mm, address, &ptl);
+	if (!pte)
+		goto out;
+
+	if (pte_write(*pte)) {
+		pte_t entry;
+
+		flush_cache_page(vma, address, pte_pfn(*pte));
+		entry = ptep_clear_flush(vma, address, pte);
+		entry = pte_wrprotect(entry);
+		set_pte_at(mm, address, pte, entry);
+	}
+	ret = 1;
+
+	pte_unmap_unlock(pte, ptl);
+out:
+	return ret;
+}
+
+static int page_wrprotect_file(struct page *page)
+{
+	struct address_space *mapping;
+	struct prio_tree_iter iter;
+	struct vm_area_struct *vma;
+	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	int ret = 0;
+
+	mapping = page_mapping(page);
+	if (!mapping)
+		return ret;
+
+	spin_lock(&mapping->i_mmap_lock);
+
+	vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff)
+		ret += page_wrprotect_one(page, vma);
+
+	spin_unlock(&mapping->i_mmap_lock);
+
+	return ret;
+}
+
+static int page_wrprotect_anon(struct page *page)
+{
+	struct vm_area_struct *vma;
+	struct anon_vma *anon_vma;
+	int ret = 0;
+
+	anon_vma = page_lock_anon_vma(page);
+	if (!anon_vma)
+		return ret;
+
+	list_for_each_entry(vma, &anon_vma->head, anon_vma_node)
+		ret += page_wrprotect_one(page, vma);
+
+	page_unlock_anon_vma(anon_vma);
+
+	return ret;
+}
+
+/**
+ * set all the ptes pointed to a page as read only
+ * return the number of ptes that were set as read only
+ * (ptes that were read only before this was called are counted as well)
+ */
+int page_wrprotect(struct page *page)
+{
+	int ret = 0;
+
+	BUG_ON(!PageLocked(page));
+
+	if (PageAnon(page))
+		ret = page_wrprotect_anon(page);
+	else
+		ret = page_wrprotect_file(page);
+
+	return ret;
+}
+EXPORT_SYMBOL(page_wrprotect);
+
 /**
  * page_set_anon_rmap - setup new anonymous rmap
  * @page:	the page to add the mapping to
-- 
1.5.3.6
[kvm-devel] [RFC][PATCH 0/5] Memory merging driver for Linux
When kvm is used on production servers, it often runs the same guest operating system more than once. The idea of this module is to find the identical pages in different guests and share them, so we can save memory. Because many guests run identical operating systems, a lot of the data in RAM is equal between the guests. This module finds this identical data (pages) and merges it into one single page. The new page is write protected, so whenever a guest tries to write to it, do_wp_page will duplicate the page.

The module simply goes over a list of pages that were registered and finds the identical pages (using a hash table). The pages that it scans are anonymous. Each time it finds identical pages, it creates a file-mapped page (right now it is just kernel allocated) that will be the shared page. For now swapping support is missing (I will add it soon using non-linear vmas).

This module can be used for any other purpose and works without kvm (I used it for qemu). To make it work for kvm, the mmu notifiers sent by Andrea should be used.

I added 2 new functions to the kernel:
one: page_wrprotect() makes the page read only by setting the ptes pointing to it as read only.
second: replace_page() replaces the pte mapping related to a vm area between two pages.

A few numbers: for freshly started Windows guests I can share almost the whole memory (as Windows zeroes all the pages), so I can start many more Windows guests than I have memory for (as long as no one touches it). For Linux guests I was able to share 800MB+ across 4 CentOS guests that each had 512MB of memory allocated (again, this was without workload, and they ran X).

-- 
woof.
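The scan described in the cover letter above (hash each registered page, then compare candidates byte for byte) can be sketched in userspace C. This is an illustration under stated assumptions, not the driver's code: the 4096-byte page size, the FNV-1a hash (standing in for the kernel's jhash()), and the names page_hash() and count_mergeable_pages() are all hypothetical, and the quadratic search replaces the hash table buckets a real scanner would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u /* assumed page size for this sketch */

/* FNV-1a over one page; a stand-in for the jhash() the driver uses. */
static uint32_t page_hash(const unsigned char *page)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < PAGE_BYTES; i++) {
        h ^= page[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Count the pages that duplicate an earlier page in the buffer, i.e.
 * the pages a merging driver could replace with a shared copy.
 * The hash is only a cheap filter; memcmp() makes the final decision,
 * so hash collisions never cause a false merge.
 */
static size_t count_mergeable_pages(const unsigned char *mem, size_t npages)
{
    size_t dups = 0;

    for (size_t i = 1; i < npages; i++) {
        const unsigned char *cur = mem + i * PAGE_BYTES;
        uint32_t h = page_hash(cur);

        for (size_t j = 0; j < i; j++) {
            const unsigned char *prev = mem + j * PAGE_BYTES;
            if (page_hash(prev) == h &&
                memcmp(prev, cur, PAGE_BYTES) == 0) {
                dups++;
                break;
            }
        }
    }
    return dups;
}
```

For a buffer of four pages where three are all zeroes, the function reports two mergeable pages, which mirrors the observation that freshly zeroed guest memory shares almost completely.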
[kvm-devel] [RFC][PATCH 3/5] ksm source code
-- 
woof.

/*
 * Memory merging driver for Linux
 *
 * This module enables dynamic sharing of identical pages found in different
 * memory areas, even if they are not shared by fork()
 *
 * Copyright (C) 2008 Qumranet, Inc.
 *
 * This work is licensed under the terms of the GNU GPL, version 2.
 *
 */

#include "ksm.h"

#include <linux/module.h>
#include <linux/errno.h>
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/vmalloc.h>
#include <linux/file.h>
#include <linux/pagemap.h>
#include <linux/mman.h>
#include <linux/sched.h>
#include <linux/anon_inodes.h>
#include <linux/rwsem.h>
#include <linux/pagemap.h>
#include <linux/vmalloc.h>
#include <linux/sched.h>
#include <linux/rmap.h>
#include <linux/spinlock.h>
#include <asm-x86/kvm_host.h>
#include <linux/jhash.h>
#include <linux/ksm.h>

#include <asm/tlbflush.h>

MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");

struct ksm *ksm;

static int page_hash_size = 0;
module_param(page_hash_size, int, 0);
MODULE_PARM_DESC(page_hash_size, "Hash table size for the pages checksum");

static int rmap_hash_size = 0;
module_param(rmap_hash_size, int, 0);
MODULE_PARM_DESC(rmap_hash_size, "Hash table size for the reverse mapping");

int ksm_slab_init(void)
{
	int ret = 1;

	ksm->page_item_cache = kmem_cache_create("ksm_page_item",
						 sizeof(struct page_item), 0, 0,
						 NULL);
	if (!ksm->page_item_cache)
		goto out;

	ksm->rmap_item_cache = kmem_cache_create("ksm_rmap_item",
						 sizeof(struct rmap_item), 0, 0,
						 NULL);
	if (!ksm->rmap_item_cache)
		goto out_free;
	return 0;

out_free:
	kmem_cache_destroy(ksm->page_item_cache);
out:
	return ret;
}

void ksm_slab_free(void)
{
	kmem_cache_destroy(ksm->rmap_item_cache);
	kmem_cache_destroy(ksm->page_item_cache);
}

static struct page_item *alloc_page_item(void)
{
	void *obj;

	obj = kmem_cache_zalloc(ksm->page_item_cache, GFP_KERNEL);
	return (struct page_item *)obj;
}

static void free_page_item(struct page_item *page_item)
{
	kfree(page_item);
}

static struct rmap_item *alloc_rmap_item(void)
{
	void *obj;

	obj = kmem_cache_zalloc(ksm->rmap_item_cache, GFP_KERNEL);
	return (struct rmap_item *)obj;
}

static void free_rmap_item(struct rmap_item *rmap_item)
{
	kfree(rmap_item);
}

static int inline PageKsm(struct page *page)
{
	return !PageAnon(page);
}

static int page_hash_init(void)
{
	if (!page_hash_size) {
		struct sysinfo sinfo;

		si_meminfo(&sinfo);
		page_hash_size = sinfo.totalram;
	}
	ksm->npages_hash = page_hash_size;
	ksm->page_hash = vmalloc(ksm->npages_hash *
				 sizeof(struct hlist_head *));
	if (IS_ERR(ksm->page_hash))
		return PTR_ERR(ksm->page_hash);
	memset(ksm->page_hash, 0,
	       ksm->npages_hash * sizeof(struct hlist_head *));
	return 0;
}

static void page_hash_free(void)
{
	int i;
	struct hlist_head *bucket;
	struct hlist_node *node, *n;
	struct page_item *page_item;

	for (i = 0; i < ksm->npages_hash; ++i) {
		bucket = &ksm->page_hash[i];
		hlist_for_each_entry_safe(page_item, node, n, bucket, link) {
			hlist_del(&page_item->link);
			free_page_item(page_item);
		}
	}
	vfree(ksm->page_hash);
}

static int rmap_hash_init(void)
{
	if (!rmap_hash_size) {
		struct sysinfo sinfo;

		si_meminfo(&sinfo);
		rmap_hash_size = sinfo.totalram;
	}
	ksm->nrmaps_hash = rmap_hash_size;
	ksm->rmap_hash = vmalloc(ksm->nrmaps_hash *
				 sizeof(struct hlist_head *));
	if (IS_ERR(ksm->rmap_hash))
		return PTR_ERR(ksm->rmap_hash);
	memset(ksm->rmap_hash, 0,
	       ksm->nrmaps_hash * sizeof(struct hlist_head *));
	return 0;
}

static void rmap_hash_free(void)
{
	int i;
	struct hlist_head *bucket;
	struct hlist_node *node, *n;
	struct rmap_item *rmap_item;

	for (i = 0; i < ksm->nrmaps_hash; ++i) {
		bucket = &ksm->rmap_hash[i];
		hlist_for_each_entry_safe(rmap_item, node, n, bucket, link) {
			hlist_del(&rmap_item->link);
			free_rmap_item(rmap_item);
		}
	}
	vfree(ksm->rmap_hash);
}

static inline u32 calc_hash_index(void *addr)
{
	return jhash(addr, PAGE_SIZE, 17) % ksm->npages_hash;
}

static void remove_page_from_hash(struct mm_struct *mm, unsigned long addr)
{
	struct rmap_item *rmap_item;
	struct hlist_head *bucket;
	struct hlist_node *node, *n;

	bucket = &ksm->rmap_hash[addr % ksm->nrmaps_hash];
	hlist_for_each_entry_safe(rmap_item, node, n, bucket, link) {
		if (mm == rmap_item->page_item->mm &&
		    rmap_item->page_item->addr == addr) {
			hlist_del(&rmap_item->page_item->link);
			free_page_item(rmap_item->page_item);
			hlist_del(&rmap_item->link);
			free_rmap_item(rmap_item);
			return;
		}
	}
}

static int ksm_sma_ioctl_register_memory_region(struct ksm_sma *ksm_sma,
						struct ksm_memory_region *mem)
{
	struct ksm_mem_slot *slot;
	int ret = 1;

	if (!current->mm)
		goto out;

	slot = kzalloc(sizeof(struct ksm_mem_slot), GFP_KERNEL);
	if (IS_ERR(slot)) {
		ret = PTR_ERR(slot);
		goto out;
	}

	slot->mm = get_task_mm(current);
	slot->addr = mem->addr;
	slot->npages = mem->npages;
	list_add_tail(&slot->link, &ksm->slots);
	list_add_tail(&slot->sma_link, &ksm_sma->sma_slots);
	ret = 0;
out:
	return ret;
}

static void remove_mm_from_hash(struct mm_struct *mm)
{
Re: [kvm-devel] [PATCH] add acpi powerbutton support
Jan Kiszka wrote:
> Avi Kivity wrote:
>> Jan Kiszka wrote:
>>> Guido Guenther wrote:
>>>> Hi Jan,
>>>> On Sat, Jan 19, 2008 at 04:40:06PM +0100, Jan Kiszka wrote:
>>>>> What about additionally listening on signals? If you run qemu from the
>>>>> console, you can then just press ctrl-c to shut the guest down (instead
>>>> Catching ctrl-c sounds like a good idea but ctrl-c, ctrl-c should
>>>> probably kill qemu then, since the machine might have no acpid running -
>>>> in that case hitting ctrl-c would have no effect.
>>> Good idea.
>> I'm worried about the 30+ second shutdown latency. Is there precedent
>> for SIGTERM or SIGINT requiring this long to take effect?
> Sorry, can't follow this yet: Are you talking about host system shutdown
> that should wait on the guest system that long?

Mostly that ctrl-C shouldn't take long, especially without some kind of visual indication (say with headless or vnc).

>> Maybe we should suspend the VM instead (using qemu suspend, not guest
>> suspend).
> You mean on SIGTERM? I think Guido's concern was that this signal is
> expected to actually kill the qemu instance. Therefore I dropped this
> signal from my second patch.

As a fast, non-guest dependent to shutdown, whichever signals we eventually hook into it.

I think that for the unmanaged case we need to improve the windowed interface instead of hooking signals. The managed case can handle anything we throw at it as long as it's well defined.

-- 
error compiling committee.c: too many arguments to function
[kvm-devel] [RFC][PATCH 5/5] example of userspace application register it memory in ksm
(this is the modification for qemu to make it work with ksm)

-- 
woof.

From 813266c7d59acb64c104448c255cc2c9d4f0187e Mon Sep 17 00:00:00 2001
From: Izik Eidus [EMAIL PROTECTED]
Date: Mon, 21 Jan 2008 17:41:09 +0200
Subject: [PATCH] qemu: register the memory of qemu with ksm, so it can be shared

Signed-off-by: Izik Eidus [EMAIL PROTECTED]
---
 qemu/vl.c |   35 +++++++++++++++++++++++++++++++++++
 1 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/qemu/vl.c b/qemu/vl.c
index 756e13d..691db36 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -21,6 +21,8 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
+
+#include "ksm.h"
 #include "hw/hw.h"
 #include "hw/boards.h"
 #include "hw/usb.h"
@@ -8501,6 +8503,37 @@ void qemu_get_launch_info(int *argc, char ***argv, int *opt_daemonize, const cha
     *opt_incoming = incoming;
 }
 
+int ksm_register_memory(void)
+{
+    int fd;
+    int ksm_fd;
+    int r = 1;
+    struct ksm_memory_region ksm_region;
+
+    fd = open("/dev/ksm", O_RDWR | O_TRUNC, (mode_t)0600);
+    if (fd == -1)
+        goto out;
+
+    ksm_fd = ioctl(fd, KSM_CREATE_SHARED_MEMORY_AREA);
+    if (ksm_fd == -1)
+        goto out_free;
+
+    ksm_region.npages = phys_ram_size / TARGET_PAGE_SIZE;
+    ksm_region.addr = phys_ram_base;
+    r = ioctl(ksm_fd, KSM_REGISTER_MEMORY_REGION, &ksm_region);
+    if (r)
+        goto out_free1;
+
+    return r;
+
+out_free1:
+    close(ksm_fd);
+out_free:
+    close(fd);
+out:
+    return r;
+}
+
 int main(int argc, char **argv)
 {
 #ifdef CONFIG_GDBSTUB
@@ -9353,6 +9386,8 @@ int main(int argc, char **argv)
     }
 #endif
 
+    ksm_register_memory();
+
     bdrv_init();
 
     /* we always create the cdrom drive, even if no disk is there */
-- 
1.5.3.6
[kvm-devel] [RFC][PATCH 4/5] example for userspace scanner
-- 
woof.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/ksm.h>

int main()
{
	int fd;
	int fd_scan;
	int r;

	fd = open("/dev/ksm", O_RDWR | O_TRUNC, (mode_t)0600);
	if (fd == -1) {
		fprintf(stderr, "couldnt even open it\n");
		exit(1);
	}

	fd_scan = ioctl(fd, KSM_CREATE_SCAN);
	if (fd_scan == -1) {
		printf("KSM_CREATE_SCAN failed\n");
		exit(1);
	}
	printf("created scanner!\n");

	while(1) {
		r = ioctl(fd_scan, KSM_SCAN, 100);
		usleep(1000);
	}
	return 0;
}
Re: [kvm-devel] KVM swapping with mmu notifiers
Andrea Arcangeli wrote:
> On Sun, Jan 20, 2008 at 05:16:03PM +0200, Avi Kivity wrote:
>> Yes, it's supposed to work (we can't prevent userspace from doing it).
>
> Hmm, I think we already prevent it, so I don't think I need to update
> my swap code until the below is removed.
>
> 	/* Check for overlaps */
> 	r = -EEXIST;
> 	for (i = 0; i < KVM_MEMORY_SLOTS; ++i) {
> 		struct kvm_memory_slot *s = &kvm->memslots[i];
>
> 		if (s == memslot)
> 			continue;
> 		if (!((base_gfn + npages <= s->base_gfn) ||
> 		      (base_gfn >= s->base_gfn + s->npages)))
> 			goto out_free;
> 	}

Right. We will have to eventually remove it (to support aliases on non-x86), but no hurry now.

-- 
error compiling committee.c: too many arguments to function
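The overlap test quoted above is the usual half-open interval check: two ranges [base, base + npages) are disjoint exactly when one ends at or before the other begins. A standalone sketch follows; struct slot, ranges_overlap() and find_overlap() are hypothetical names for illustration and are not KVM's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a memory slot: a range of guest frame numbers. */
struct slot {
    unsigned long base_gfn;
    unsigned long npages;
};

/* True when [base_a, base_a + n_a) and [base_b, base_b + n_b) intersect. */
static bool ranges_overlap(unsigned long base_a, unsigned long n_a,
                           unsigned long base_b, unsigned long n_b)
{
    return !((base_a + n_a <= base_b) || (base_a >= base_b + n_b));
}

/*
 * Return the index of the first slot that overlaps the proposed range,
 * skipping the slot being updated (self), or -1 if the range is free.
 */
static int find_overlap(const struct slot *slots, size_t nslots, size_t self,
                        unsigned long base_gfn, unsigned long npages)
{
    for (size_t i = 0; i < nslots; i++) {
        if (i == self)
            continue;
        if (ranges_overlap(base_gfn, npages,
                           slots[i].base_gfn, slots[i].npages))
            return (int)i;
    }
    return -1;
}
```

Note that ranges that merely touch (one ends where the next begins) do not count as overlapping, which matches the `<=` and `>=` comparisons in the quoted loop.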
Re: [kvm-devel] [PATCH] SVM: fix lazy FPU switching
Joerg Roedel wrote:
> If the guest writes to cr0 and leaves the TS flag at 0 while vcpu->fpu_active
> is also 0, the TS flag in the guest's cr0 gets lost. This leads to corrupt FPU
> state and causes Windows Vista 64bit to crash very soon after boot. This patch
> fixes this bug.

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH]Add TR insert/purge interface for add-on component
Looks pretty good.

Stylistically it would be nicer to initialize ia64_max_tr_num to 8 (with a comment that this is the smallest value allowed by the architecture - SDM p.2:44 section 4.1.1.1) and increase this if PAL_VM_SUMMARY indicates that the current processor model supports a larger value. That way you are sure that it never has a larger value than it should.

N.B. SGI ship systems with mixed processor models, so to be completely correct here you either need ia64_max_tr_num to be a per-cpu value, or to make sure you only set it to the smallest value supported by any cpu on the system.

Your overlap checking code only looks like it checks for overlaps among the new entries being inserted via this interface. Is there some other non-obvious way that these are prevented from overlapping with the TR entries in use by the base kernel (ITR[0]+DTR[0] mapping the kernel in region 5, ITR[1] for the PAL code and DTR[1] for the current kernel stack granule)? I don't know how kvm will use this interface, perhaps the virtual address range is limited to areas that can't overlap? If so, perhaps the ia64_itr_entry() routine should check that the va,size arguments are in the virtual address range that KVM will use?

ia64_itr_entry() should check that 'log_size' is a supported page size for this processor, and that 'va' is suitably aligned for that size.

ia64_ptr_entry() perhaps this should just take a 'target_mask' and 'reg' argument. Then it could skip all the overlap checks and just lookup the address+size in the __per_cpu_idtrs[][][] array and return an error if you try to purge something that you didn't set up (->pte == 0). Calling this routine 'ia64_purge_tr' (which is the name in the header comment :-) would help note the non-symmetric calling arguments between insert and purge.

What is the expected usage pattern for itr, dtr, itr+dtr mappings by KVM? If you are going to allocate enough that there might be contention, it could be better to start the allocation search for i+d entries at the top and work downwards, while allocating just-i and just-d entries from low numbers working up. That might avoid issues with not having an i+d pair available because all the odd entries were allocated for 'i' and all the even ones allocated for 'd' ... so even though there are plenty of free TR registers, none of the free ones are in pairs.

Maybe we should put a 'kvm_' into the names of the exported interfaces? Sadly there isn't a way to just make these visible to KVM ... but I'd like to make it crystal clear that other drivers should not use these.

-Tony
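The allocation order suggested in the review can be sketched as follows. This is an illustration only, under loudly stated assumptions: the names tr_state, tr_init(), tr_alloc_pair() and tr_alloc_single() are hypothetical, MAX_TR is fixed at the architectural minimum of 8 mentioned in the review, slots 0 and 1 are treated as reserved for the base kernel, and an i+d mapping is assumed to need the same index free in both the ITR and DTR sets.

```c
#include <assert.h>

#define MAX_TR     8 /* architectural minimum cited in the review */
#define FIRST_FREE 2 /* assumption: slots 0 and 1 belong to the base kernel */

struct tr_state {
    unsigned itr_used; /* bit n set: ITR[n] is in use */
    unsigned dtr_used; /* bit n set: DTR[n] is in use */
};

static void tr_init(struct tr_state *s)
{
    s->itr_used = (1u << FIRST_FREE) - 1;
    s->dtr_used = (1u << FIRST_FREE) - 1;
}

/* i+d mapping: search from the top down for an index free in both sets. */
static int tr_alloc_pair(struct tr_state *s)
{
    for (int n = MAX_TR - 1; n >= FIRST_FREE; n--) {
        unsigned bit = 1u << n;
        if (!(s->itr_used & bit) && !(s->dtr_used & bit)) {
            s->itr_used |= bit;
            s->dtr_used |= bit;
            return n;
        }
    }
    return -1;
}

/* i-only or d-only mapping: search from the bottom up in one set. */
static int tr_alloc_single(unsigned *used)
{
    for (int n = FIRST_FREE; n < MAX_TR; n++) {
        unsigned bit = 1u << n;
        if (!(*used & bit)) {
            *used |= bit;
            return n;
        }
    }
    return -1;
}
```

Because single allocations grow upward and pairs grow downward, free indices stay aligned across the two sets for as long as possible, which is the fragmentation concern the review raises.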
[kvm-devel] kvm crash
Hi all --

I have kvm 1:28-4ubuntu2 installed under ubuntu 7.10 running 2.6.22-14-generic

Attempting to install plan9 I get the following crash:

$ kvm -nographic --cdrom ISO/plan9.iso -boot d KVM/plan9-qcow.img
(qemu) apm ax=f000 cx=f000 dx=f000 di=fff0 ebx=9e42 esi=-f0010
initial probe, to find plan9.ini...
dev A0 port 1F0 config 0040 capabilities 0B00 mwdma 0007 udma 203F LLBA sectors 4194304
dev A0 port 170 config 85C0 capabilities 0300 mwdma 0007 udma 203F
found partition sdD0!cdboot; 50782+1440
using sdD0!cdboot!plan9.ini
.
Plan 9 Startup Menu:
 1. Install Plan 9 from this CD
 2. Boot Plan 9 from this CD
 3. Boot Plan 9 from this CD and debug 9load
Selection: 1
1
booting sdD0!cdboot!9pcflop.gz
found 9pcflop.gz
.gz.. 1208343 = 792836+1045244+130816=1968896
entry: f0100020
exception 13 (0)
rax 00010010 rbx 0001 rcx f0012000 rdx 00a1
rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001320
r8 r9 r10 r11
r12 r13 r14 r15
rip 00100239 rflags 00033002
cs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ds (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
ss (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
fs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
gs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr (0885/2088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 14000/4f
idt 0/3ff
cr0 10010 cr2 0 cr3 12000 cr4 d0 cr8 0 efer 0
code: ea 3e 02 00 08 b8 00 00 8e d0 bd 00 7c 8b 86 2c 00 8e d8 8b 86 28 00 8e c0 66 8b be 00 00
Aborted (core dumped)

The install succeeds if I use qemu and leave it running all night. But booting the installed image using kvm in the morning gets the identical crash except for the rbp value:

< rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001320
---
> rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001314

-- rec --
[kvm-devel] [ kvm-Bugs-1876714 ] core dump
Bugs item #1876714, was opened at 2008-01-21 12:59
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1876714&group_id=180599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: amd
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: n7zzt (n7zzt)
Assigned to: Nobody/Anonymous (nobody)
Summary: core dump

Initial Comment:
The following error code was generated while attempting to start a previously installed qemu image under kvm:

***
[EMAIL PROTECTED]:~/qemu-kvm$ kvm -no-acpi -m 512 -boot c win2k.img
unhandled vm exit: 0x0
rax rbx 00060184 rcx 00ff rdx 001f
rsi fffe0080 rdi fffe0080 rsp 0004ffe0 rbp 0004fff0
r8 r9 r10 r11
r12 r13 r14 r15
rip 804689b1 rflags 0256
cs 0008 (/ p 1 dpl 0 db 1 s 1 type b l 0 g 0 avl 0)
ds 0023 (/ p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
es 0023 (/ p 1 dpl 3 db 1 s 1 type 3 l 0 g 1 avl 0)
ss 0010 (/ p 1 dpl 0 db 1 s 1 type 3 l 0 g 1 avl 0)
fs 0030 (ffdff000/1fff p 1 dpl 0 db 1 s 1 type 3 l 0 g 1 avl 0)
gs (/ p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
tr 0028 (806e1000/20ab p 1 dpl 0 db 0 s 0 type 9 l 0 g 0 avl 0)
ldt (/ p 0 dpl 0 db 0 s 0 type 0 l 0 g 0 avl 0)
gdt 80036000/3ff
idt 80036400/7ff
cr0 8001 cr2 fffe0080 cr3 3 cr4 0 cr8 0 efer 0
Aborted (core dumped)
[EMAIL PROTECTED]:~/qemu-kvm$
**

unfortunately, I do not know where the actual core dump file was written (if at all). I hope the above helps.
Re: [kvm-devel] qemu_mutex deadlock with -smp 2
On Sun, Jan 20, 2008 at 05:15:07PM +0200, Avi Kivity wrote:
> Is this immediately after reboot, or during the boot process?

After reboot.

--D
Re: [kvm-devel] [RFC][PATCH 0/2]In-kernel PIT model
On Monday 21 January 2008 19:17:08 Carlo Marcelo Arenas Belon wrote:
> On Mon, Jan 21, 2008 at 05:18:21PM +0800, Yang, Sheng wrote:
>> The patch works well on IA32e host(passed 2.6.22, 2.6.20, 2.6.18, 2.6.16
>> with hpet=disable, 2.6.9 with clock=pit),
>
> slightly off-topic but how you got around on building KVM and loading it
> with the undefined symbols from hpet (hrtimer_{cancel,start,init}) and smp
> (cpu_online_map) with 2.6.18? or you patched and enabled it statically into
> the kernel?
>
> Carlo

Oh, sorry for the confusion. The versions I mentioned are the guest kernel versions; the host kernel is 2.6.22, on both IA32e and PAE.

-- 
Thanks
Yang, Sheng
Re: [kvm-devel] [RFC][PATCH 1/2] KVM: In-kernel PIT model
On Monday 21 January 2008 18:28:32 Carsten Otte wrote:
> Jan Kiszka wrote:
>> The PIT may not be limited to x86 platforms. So I would propose to make
>> the setup more generic and flexible. And I would move the code out of
>> arch/x86, just the speaker support should remain there.
>
> It should also not be common among all archs. On s390 we have CPU timer,
> which is way superior to PIT (clock cycle granularity, no vmexit to set
> it up or deliver the irq, no hypervisor support needed because it works
> transparent).

Yeah, I also checked the IA64 side, and it doesn't have a PIT either. So I put it in the x86 directory.

-- 
Thanks
Yang, Sheng
Re: [kvm-devel] [PATCH] mmu notifiers #v3
On Mon, 21 Jan 2008 13:52:04 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote:
> Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

Reviewed-by: Rik van Riel [EMAIL PROTECTED]

-- 
All rights reversed.
Re: [kvm-devel] [PATCH/RFC 0/2] CPU hotplug virtio driver
Avi Kivity wrote:
> Glauber de Oliveira Costa wrote:
>> That said, if acpi is really the preference here, and this patches have
>> chance, no problem. But it will take me a little more time to implement
>> them ;-)
>
> The power button support that was recently added at least proves that
> the host-guest notification path works, so that's one tooth that does
> not need pulling out.
>
> The real trouble with acpi is understanding the spec which is very badly
> written.
> _AVI

Edwin implemented device hotplug support for Xen/Qemu, probably we can reuse the effort.

Edwin: Can u share the device hotplug patch with Glauber?

thx,eddie
Re: [kvm-devel] [PATCH]Add TR insert/purge interface for add-on component
Thanks for your comments

Luck, Tony wrote:
> Looks pretty good. Stylistically it would be nicer to initialize
> ia64_max_tr_num to 8 (with a comment that this is the smallest value
> allowed by the architecture - SDM p.2:44 section 4.1.1.1) and
> increase this if PAL_VM_SUMMARY indicates that the current processor
> model supports a larger value. That way you are sure that it never
> has a larger value than it should. N.B. SGI ship systems with mixed
> processor models, so to be completely correct here you either need
> ia64_max_tr_num to be a per-cpu value, or to make sure you only set
> it to the smallest value supported by any cpu on the system.

Agree, we can initialize ia64_max_tr_num to 8.

> Your overlap checking code only looks like it checks for overlaps
> among the new entries being inserted via this interface. Is there
> some other non-obvious way that these are prevented from overlapping
> with the TR entries in use by the base kernel (ITR[0]+DTR[0] mapping
> the kernel in region 5, ITR[1] for the PAL code and DTR[1] for the
> current kernel stack granule)? I don't know how kvm will use this
> interface, perhaps the virtual address range is limited to areas
> that can't overlap? If so, perhaps the ia64_itr_entry() routine
> should check that the va,size arguments are in the virtual address
> range that KVM will use?

KVM only uses TRs whose virtual address starts with 0xD000..
I will add a virtual address check for va in ia64_itr_entry and
ia64_ptr_entry.

> ia64_itr_entry() should check that 'log_size' is a supported page
> size for this processor, and that 'va' is suitably aligned for that
> size.

Agree, we need to check log_size in ia64_itr_entry and ia64_ptr_entry.
Va doesn't need to be aligned for that size, if you look at the itr
spec, but we can make it aligned for that size.

> ia64_ptr_entry() perhaps this should just take a 'target_mask' and
> 'reg' argument. Then it could skip all the overlap checks and just
> look up the address+size in the __per_cpu_idtrs[][][] array and
> return an error if you try to purge something that you didn't set up
> (->pte == 0). Calling this routine 'ia64_purge_tr' (which is the name
> in the header comment :-) would help note the non-symmetric calling
> arguments between insert and purge.

Ok, we will do it.

> What is the expected usage pattern for itr, dtr, itr+dtr mappings by
> KVM?

KVM/IA64 uses two itrs and two dtrs.

> If you are going to allocate enough that there might be contention,
> it could be better to start the allocation search for i+d entries at
> the top and work downwards, while allocating just-i and just-d
> entries from low numbers working up. That might avoid issues with
> not having an i+d pair available because all the odd entries were
> allocated for 'i' and all the even ones allocated for 'd' ... so even
> though there are plenty of free TR registers, none of the free ones
> are in pairs.

The __per_cpu_idtrs array reflects the machine ITRs and DTRs, and ITRs
and DTRs are separate. It is impossible that odd entries are for 'i'
and even ones for 'd'. In theory, an i+d pair may be unavailable even
if plenty of entries are free. Since KVM/IA64 only uses two pairs,
that will not happen. If we want to provide a general TR insert/purge
interface, we need to handle this issue. One possible solution is to
not support i+d pair allocation.

> Maybe we should put a 'kvm_' into the names of the exported
> interfaces? Sadly there isn't a way to just make these visible to
> KVM ... but I'd like to make it crystal clear that other drivers
> should not use these.

Agree.

Will send out a patch soon per your comments.

Thanks
- Anthony
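The checks discussed in this message (supported page size, alignment of 'va' to 2^log_size, and overlap against entries already in use) could be sketched roughly as below. This is a minimal user-space illustration, not the patch's code: `struct tr_range`, `tr_check_insert` and the `page_size_mask` parameter are hypothetical names introduced only for this example, and the sketch assumes every entry already recorded in `used` passed the same alignment check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one translation-register mapping. */
struct tr_range {
    uint64_t va;        /* start virtual address */
    unsigned log_size;  /* log2 of the mapping size in bytes */
};

/* True when va is a multiple of 2^log_size (log_size must be < 64). */
bool tr_va_aligned(uint64_t va, unsigned log_size)
{
    return (va & ((UINT64_C(1) << log_size) - 1)) == 0;
}

/*
 * True when the two ranges share at least one byte. Both ranges are
 * assumed to be aligned to their own size, so the inclusive end
 * address cannot wrap around.
 */
bool tr_ranges_overlap(const struct tr_range *a, const struct tr_range *b)
{
    uint64_t a_end = a->va + ((UINT64_C(1) << a->log_size) - 1);
    uint64_t b_end = b->va + ((UINT64_C(1) << b->log_size) - 1);

    return a->va <= b_end && b->va <= a_end;
}

/*
 * Validate an insert request. page_size_mask is a hypothetical bitmask
 * with bit n set when a page size of 2^n bytes is supported.
 * Returns 0 when the request is acceptable, -1 otherwise.
 */
int tr_check_insert(const struct tr_range *req, uint64_t page_size_mask,
                    const struct tr_range *used, size_t n_used)
{
    if (req->log_size >= 64)
        return -1;
    if (((page_size_mask >> req->log_size) & 1) == 0)
        return -1;                      /* unsupported page size */
    if (!tr_va_aligned(req->va, req->log_size))
        return -1;                      /* va not aligned to the size */
    for (size_t i = 0; i < n_used; i++) {
        if (tr_ranges_overlap(req, &used[i]))
            return -1;                  /* clashes with an existing entry */
    }
    return 0;
}
```

Passing the base kernel's own TR entries in `used` alongside the ones inserted through the new interface would cover the overlap concern raised above; whether the real code should track them that way is left open in the thread.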
Re: [kvm-devel] [PATCH]Add TR insert/purge interface for add-on component
Xu, Anthony wrote:
> Thanks for your comments
>
> Luck, Tony wrote:
>> Looks pretty good. Stylistically it would be nicer to initialize
>> ia64_max_tr_num to 8 (with a comment that this is the smallest value
>> allowed by the architecture - SDM p.2:44 section 4.1.1.1) and
>> increase this if PAL_VM_SUMMARY indicates that the current processor
>> model supports a larger value. That way you are sure that it never
>> has a larger value than it should. N.B. SGI ship systems with mixed
>> processor models, so to be completely correct here you either need
>> ia64_max_tr_num to be a per-cpu value, or to make sure you only set
>> it to the smallest value supported by any cpu on the system.
>
> Agree, we can initialize ia64_max_tr_num to 8.

We will use a per-cpu ia64_max_tr_num.

>> Your overlap checking code only looks like it checks for overlaps
>> among the new entries being inserted via this interface. Is there
>> some other non-obvious way that these are prevented from overlapping
>> with the TR entries in use by the base kernel (ITR[0]+DTR[0] mapping
>> the kernel in region 5, ITR[1] for the PAL code and DTR[1] for the
>> current kernel stack granule)? I don't know how kvm will use this
>> interface, perhaps the virtual address range is limited to areas
>> that can't overlap? If so, perhaps the ia64_itr_entry() routine
>> should check that the va,size arguments are in the virtual address
>> range that KVM will use?
>
> KVM only uses TRs whose virtual address starts with 0xD000..
> I will add a virtual address check for va in ia64_itr_entry and
> ia64_ptr_entry.

One concern about this: where can we put the macro
#define KVM_ADDRESS 0xD000..?
It's not suitable to put it in tlb.c.

>> ia64_itr_entry() should check that 'log_size' is a supported page
>> size for this processor, and that 'va' is suitably aligned for that
>> size.
>
> Agree, we need to check log_size in ia64_itr_entry and
> ia64_ptr_entry. Va doesn't need to be aligned for that size, if you
> look at the itr spec, but we can make it aligned for that size.
>
>> ia64_ptr_entry() perhaps this should just take a 'target_mask' and
>> 'reg' argument. Then it could skip all the overlap checks and just
>> look up the address+size in the __per_cpu_idtrs[][][] array and
>> return an error if you try to purge something that you didn't set up
>> (->pte == 0). Calling this routine 'ia64_purge_tr' (which is the
>> name in the header comment :-) would help note the non-symmetric
>> calling arguments between insert and purge.
>
> Ok, we will do it.
>
>> What is the expected usage pattern for itr, dtr, itr+dtr mappings by
>> KVM?
>
> KVM/IA64 uses two itrs and two dtrs.
>
>> If you are going to allocate enough that there might be contention,
>> it could be better to start the allocation search for i+d entries at
>> the top and work downwards, while allocating just-i and just-d
>> entries from low numbers working up. That might avoid issues with
>> not having an i+d pair available because all the odd entries were
>> allocated for 'i' and all the even ones allocated for 'd' ... so
>> even though there are plenty of free TR registers, none of the free
>> ones are in pairs.
>
> The __per_cpu_idtrs array reflects the machine ITRs and DTRs, and
> ITRs and DTRs are separate. It is impossible that odd entries are for
> 'i' and even ones for 'd'. In theory, an i+d pair may be unavailable
> even if plenty of entries are free. Since KVM/IA64 only uses two
> pairs, that will not happen. If we want to provide a general TR
> insert/purge interface, we need to handle this issue. One possible
> solution is to not support i+d pair allocation.
>
>> Maybe we should put a 'kvm_' into the names of the exported
>> interfaces? Sadly there isn't a way to just make these visible to
>> KVM ... but I'd like to make it crystal clear that other drivers
>> should not use these.
>
> Agree.

Some more thinking on this. If we put kvm_ into the names of the
exported interfaces, they become KVM-specific interfaces, and then it
does not seem appropriate to put them in tlb.c. Basically this patch
provides a common TR insert/purge interface for all modules, not one
specific to KVM. Currently only KVM/IA64 needs these interfaces, but
other modules may need them later on.

What do you think?

Thanks,
- Anthony

> Will send out a patch soon per your comments.
>
> Thanks
> - Anthony

-
To unsubscribe from this list: send the line "unsubscribe linux-ia64" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
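Tony Luck's allocation suggestion quoted in this message (search for i+d pairs from the top index downwards, and for single i or d entries from low indices upwards) might look roughly like the sketch below. It is only an illustration under stated assumptions: `struct tr_state`, `tr_alloc_single`, `tr_alloc_pair` and the fixed `TR_SLOTS` of 8 are hypothetical and are not taken from the patch, and the sketch assumes a pair means the same index in both the ITR and DTR sets.

```c
#include <stdbool.h>

/* 8 is the architectural minimum mentioned in the thread. */
#define TR_SLOTS 8

/* Hypothetical bookkeeping: which ITR and DTR slots are in use. */
struct tr_state {
    bool itr_used[TR_SLOTS];
    bool dtr_used[TR_SLOTS];
};

/*
 * Allocate one slot from a single set (ITR or DTR), scanning upwards
 * from first_free. Returns the slot index, or -1 if none is free.
 */
int tr_alloc_single(bool *used, int first_free)
{
    for (int i = first_free; i < TR_SLOTS; i++) {
        if (!used[i]) {
            used[i] = true;
            return i;
        }
    }
    return -1;
}

/*
 * Allocate the same index in both sets, scanning downwards from the
 * top so that pairs and singles grow towards each other.
 * Returns the slot index, or -1 if no index is free in both sets.
 */
int tr_alloc_pair(struct tr_state *s, int first_free)
{
    for (int i = TR_SLOTS - 1; i >= first_free; i--) {
        if (!s->itr_used[i] && !s->dtr_used[i]) {
            s->itr_used[i] = true;
            s->dtr_used[i] = true;
            return i;
        }
    }
    return -1;
}
```

With slots 0 and 1 of each set reserved for the base kernel, as described in the quoted text, callers would pass `first_free = 2`. A per-cpu limit below `TR_SLOTS` could replace the constant if the per-cpu ia64_max_tr_num approach is adopted.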