Re: Asynchronous interruption of a compute-intensive guest
On Wed, Apr 13, 2011 at 1:09 AM, Tommaso Cucinotta tommaso.cucino...@sssup.it wrote:
> I'd like to intercept from the host the exact times at which an incoming
> network packet directed to a guest VM:
> a) is delivered from the host OS to the KVM process;
> b) is delivered to the CPU thread of the KVM process.
>
> Specifically, I don't have a clean idea of how b) happens when the CPU
> thread is doing compute-intensive activities within the VM. How is the flow
> of control of such thread asynchronously interrupted so as to hand over
> control to the proper network driver in kvm? Any pointers to the exact
> points to look at in the KVM code are also very much appreciated.

If you are using userspace virtio-net (not in-kernel vhost-net), then an incoming (rx) packet results in the qemu-kvm iothread's select(2) system call returning with a readable tap file descriptor:

  vl.c:main_loop_wait()

(During this time the vcpu thread may still be executing guest code.)

The iothread runs the tap receive function:

  net/tap.c:tap_send()

The iothread places the received packet into the rx virtqueue and interrupts the guest:

  hw/virtio-net.c:virtio_net_receive()
  hw/virtio-pci.c:virtio_pci_notify()

The interrupt is injected by the KVM kernel module:

  arch/x86/kvm/x86.c:kvm_arch_vm_ioctl() KVM_IRQ_LINE

There is some guest mode exiting logic here to kick the vcpu:

  arch/x86/kvm/lapic.c:__apic_accept_irq()

During this whole time the vcpu may be executing guest code. Only at the very end has the interrupt been injected and the vcpu notified.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
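The first step of the flow described above (the iothread waking up in select(2) because the tap file descriptor became readable, then reading the packet) can be sketched in isolation. This is only an illustration, not qemu-kvm code: receive_when_ready() is a hypothetical helper, and any readable descriptor (a pipe, for instance) can stand in for the tap device.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read one chunk from it.
 * Returns the number of bytes read, 0 on timeout (or EOF), -1 on error.
 */
ssize_t receive_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    assert(buf != NULL);
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;      /* select(2) cannot watch this descriptor */

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Blocks here; other threads keep running in the meantime. */
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0;       /* timed out, nothing arrived */

    return read(fd, buf, len);
}
```

The point of the sketch is only that the waiting happens in a different thread from the one doing the computation; the later steps (virtqueue update, KVM_IRQ_LINE, vcpu kick) are not modeled here.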
Re: [PATCH] KVM: Add CPUID support for VIA CPU
On 04/13/2011 06:26 AM, brill...@viatech.com.cn wrote:
> The CPUIDs for Centaur are added, and then the features of PadLock hardware
> engine on VIA CPU, such as ace, ace_en and so on, can be passed into the
> kvm guest.

Nice to see this. Please post a link to the documentation describing these features.

> +	/* cpuid 0xC0000001.edx */
> +	const u32 kvm_supported_word5_x86_features =
> +		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
> +		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
> +		F(PMM) | F(PMM_EN);
> +

Are all of these features safe wrt save/restore? (Do they all act on state in standard registers?) Do they need any control register bits to be active or MSRs to configure?

> @@ -2484,6 +2504,17 @@ static int kvm_dev_ioctl_get_supported_c
>  	r = -E2BIG;
>  	if (nent >= cpuid->nent)
> +		goto out_free;
> +
> +	/* Add support for Centaur's CPUID instruction. */
> +	do_cpuid_ent(&cpuid_entries[nent], 0xC0000000, 0, &nent, cpuid->nent);

nent overflow check missing here. Also, should probably skip if not a Via.

> +	limit = cpuid_entries[nent - 1].eax;
> +	for (func = 0xC0000001; func <= limit && nent < cpuid->nent; ++func)
> +		do_cpuid_ent(&cpuid_entries[nent], func, 0,
> +			     &nent, cpuid->nent);
> +
> +	r = -E2BIG;
> +	if (nent >= cpuid->nent)
>  		goto out_free;
>
>  	do_cpuid_ent(&cpuid_entries[nent], KVM_CPUID_SIGNATURE, 0, &nent,

--
error compiling committee.c: too many arguments to function
RE: [PATCH] KVM: Add CPUID support for VIA CPU
> On 04/13/2011 06:26 AM, brill...@viatech.com.cn wrote:
>> The CPUIDs for Centaur are added, and then the features of PadLock hardware
>> engine on VIA CPU, such as ace, ace_en and so on, can be passed into the
>> kvm guest.
>
> Nice to see this. Please post a link to the documentation describing these features.

These features are defined in the Linux kernel (arch/x86/include/asm/cpufeature.h), and their description can be found at the following address:
http://www.via.com.tw/en/initiatives/padlock/hardware.jsp

>> +	/* cpuid 0xC0000001.edx */
>> +	const u32 kvm_supported_word5_x86_features =
>> +		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
>> +		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
>> +		F(PMM) | F(PMM_EN);
>> +
>
> Are all of these features safe wrt save/restore? (Do they all act on state in standard registers?) Do they need any control register bits to be active or MSRs to configure?

These features depend on instructions for the PadLock hardware engine of the VIA CPU. The PadLock instructions act only on standard registers, like general x86 instructions, and the features are already enabled when the CPU leaves the factory, so there is no need to activate any control register bits or configure MSRs.

>> @@ -2484,6 +2504,17 @@ static int kvm_dev_ioctl_get_supported_c
>>  	r = -E2BIG;
>>  	if (nent >= cpuid->nent)
>> +		goto out_free;
>> +
>> +	/* Add support for Centaur's CPUID instruction. */
>> +	do_cpuid_ent(&cpuid_entries[nent], 0xC0000000, 0, &nent, cpuid->nent);
>
> nent overflow check missing here. Also, should probably skip if not a Via.

If the CPU is not a VIA, the limit will be 0, so the following loop cannot run. Moreover, there seems to be no way to know whether the CPU is a VIA in this function. The nent overflow check is placed after the loop, as in the 0x80000000 case, and on a VIA the returned limit is not large (generally it is 0xC0000004). Is it necessary to add one more check here?

>> +	limit = cpuid_entries[nent - 1].eax;
>> +	for (func = 0xC0000001; func <= limit && nent < cpuid->nent; ++func)
>> +		do_cpuid_ent(&cpuid_entries[nent], func, 0,
>> +			     &nent, cpuid->nent);
>> +
>> +	r = -E2BIG;
>> +	if (nent >= cpuid->nent)
>>  		goto out_free;
>>
>>  	do_cpuid_ent(&cpuid_entries[nent], KVM_CPUID_SIGNATURE, 0, &nent,
Re: [PATCH 1/3 v2] KVM: x86 emulator: Disable writeback for CMP emulation
On 04/12/2011 06:24 PM, Takuya Yoshikawa wrote:
> From: Takuya Yoshikawa yoshikawa.tak...@oss.ntt.co.jp
>
> This stops CMP r/m, reg from writing the data back into memory.
>
> Pointed out by Avi.
>
> The writeback suppression now covers CMP, CMPS, SCAS.

Patchset looks good, nice cleanup.

--
error compiling committee.c: too many arguments to function
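To illustrate what "disabling writeback" means for a compare instruction, here is a toy model. It is a sketch only and does not reproduce the kernel's x86 emulator: the operand type, the OP_NONE marker and the helper names are all assumptions made for the example. CMP computes the same flags as SUB but marks its destination so that the writeback step skips it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy operand model (hypothetical, not the kernel's struct operand). */
enum op_type { OP_NONE, OP_MEM };   /* OP_NONE: nothing to write back */

struct operand {
    enum op_type type;
    uint32_t *addr;     /* backing memory when type == OP_MEM */
    uint32_t val;       /* working copy used during emulation */
};

struct flags { bool zf, cf; };

/* Shared arithmetic: dst.val -= src, updating ZF and CF. */
static void do_sub(struct operand *dst, uint32_t src, struct flags *f)
{
    f->cf = dst->val < src;
    dst->val -= src;
    f->zf = dst->val == 0;
}

/* Commit the working copy to memory unless writeback is disabled. */
static void writeback(struct operand *dst)
{
    if (dst->type == OP_MEM)
        *dst->addr = dst->val;
}

void emulate_sub(struct operand *dst, uint32_t src, struct flags *f)
{
    do_sub(dst, src, f);
    writeback(dst);
}

void emulate_cmp(struct operand *dst, uint32_t src, struct flags *f)
{
    do_sub(dst, src, f);
    dst->type = OP_NONE;    /* compare only: suppress the memory write */
    writeback(dst);
}
```

With this model, emulate_cmp() leaves the memory operand untouched while still producing the flags, whereas emulate_sub() stores the result.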
Re: [PATCH] KVM: Add CPUID support for VIA CPU
On 04/13/2011 02:05 PM, brill...@viatech.com.cn wrote:
>>> +	/* cpuid 0xC0000001.edx */
>>> +	const u32 kvm_supported_word5_x86_features =
>>> +		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
>>> +		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
>>> +		F(PMM) | F(PMM_EN);
>>> +
>>
>> Are all of these features safe wrt save/restore? (Do they all act on state in standard registers?) Do they need any control register bits to be active or MSRs to configure?
>
> These features depend on instructions for the PadLock hardware engine of VIA CPU. The PadLock instructions just act on standard registers like general X86 instructions, and the features have been enabled when the CPU leaves the factory, so there is no need to activate any control register bits or configure MSRs.

I see there is a dependency on EFLAGS[30]. Does a VM entry clear this bit? If not, we have to do it ourselves.

>>> @@ -2484,6 +2504,17 @@ static int kvm_dev_ioctl_get_supported_c
>>>  	r = -E2BIG;
>>>  	if (nent >= cpuid->nent)
>>> +		goto out_free;
>>> +
>>> +	/* Add support for Centaur's CPUID instruction. */
>>> +	do_cpuid_ent(&cpuid_entries[nent], 0xC0000000, 0, &nent, cpuid->nent);
>>
>> nent overflow check missing here. Also, should probably skip if not a Via.
>
> If not a VIA, the limit will be 0, so the following cycle can not run.

I think Intel defines CPUID to return the highest standard leaf, so it will be equivalent to cpuid(0x1a) or something like that.

> Moreover, it seems that there is no method to know whether the CPU is a VIA or not in this function.

Can't you check the vendor ID? See boot_cpu_data.

> The nent overflow check is put after the cycle like the 0x80000000 case, and when on a VIA, the returned limit is not large (generally it is 0xC0000004), is it necessary to add one more check here?

Yes, otherwise userspace can supply a buffer that is exactly the wrong size and cause an overflow.

--
error compiling committee.c: too many arguments to function
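The overflow concern raised in this thread (a caller-supplied buffer that is exactly full before the first Centaur leaf is written) can be illustrated outside the kernel. The following is a minimal sketch, not the KVM code: struct leaf_entry, append_leaf(), append_leaf_range() and the stand-in fake_centaur_eax() are all hypothetical names, and the vendor check suggested in the review is not modeled. The idea is simply that the bound is tested before every single write, including the first.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a CPUID entry. */
struct leaf_entry {
    uint32_t function;
    uint32_t eax;
};

typedef uint32_t (*leaf_reader)(uint32_t function);

/* Stand-in for the real CPUID instruction: pretends leaf 0xC0000000
 * reports 0xC0000004 as the highest supported Centaur leaf. */
uint32_t fake_centaur_eax(uint32_t function)
{
    return function == 0xC0000000u ? 0xC0000004u : 0;
}

/* Append one entry, refusing to write past maxnent. */
int append_leaf(struct leaf_entry *buf, int *nent, int maxnent,
                uint32_t function, leaf_reader read_eax)
{
    if (*nent >= maxnent)
        return -E2BIG;          /* checked before the write, every time */
    buf[*nent].function = function;
    buf[*nent].eax = read_eax(function);
    ++*nent;
    return 0;
}

/* Append the base leaf and then every leaf up to the limit it reports. */
int append_leaf_range(struct leaf_entry *buf, int *nent, int maxnent,
                      uint32_t base, leaf_reader read_eax)
{
    uint32_t limit, func;
    int r = append_leaf(buf, nent, maxnent, base, read_eax);

    if (r)
        return r;
    limit = buf[*nent - 1].eax;
    /* "func > base" stops the loop if func wraps around to 0. */
    for (func = base + 1; func <= limit && func > base; ++func) {
        r = append_leaf(buf, nent, maxnent, func, read_eax);
        if (r)
            return r;
    }
    return 0;
}
```

With a buffer that is already full (nent == maxnent), append_leaf_range() returns -E2BIG without touching the buffer, which is the case the extra check is meant to cover.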
[PATCH] kvm tools: Implement virtio network device
This patch implements a virtio network device.
Use '-n virtio' or '--network=virtio' to enable it.

The current implementation uses tap, which needs root privileges to create a
virtual network device (tap0) on the host side. Actually, what we need is
CAP_NET_ADMIN.

The host side tap0 is set to 192.168.33.2/24.
You need to configure the guest side eth0 to any IP address in 192.168.33.0/24.

Here are some scp performance tests for different implementations:

None of rx and tx as thread:
guest to host 3.2MB/s
host to guest 3.1MB/s

Only rx as thread:
guest to host 14.7MB/s
host to guest 33.4MB/s

Both rx and tx as thread (this patch works this way):
guest to host 19.8MB/s
host to guest 32.5MB/s

Signed-off-by: Asias He asias.he...@gmail.com
---
 tools/kvm/Makefile                 |    1 +
 tools/kvm/include/kvm/ioport.h     |    2 +
 tools/kvm/include/kvm/types.h      |    7 +
 tools/kvm/include/kvm/virtio-net.h |    7 +
 tools/kvm/kvm-run.c                |   11 ++
 tools/kvm/virtio-net.c             |  318
 6 files changed, 346 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/include/kvm/types.h
 create mode 100644 tools/kvm/include/kvm/virtio-net.h
 create mode 100644 tools/kvm/virtio-net.c

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index 7a2863d..6895113 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -14,6 +14,7 @@ TAGS = ctags
 
 OBJS	+= 8250-serial.o
 OBJS	+= virtio-blk.o
+OBJS	+= virtio-net.o
 OBJS	+= virtio-console.o
 OBJS	+= cpuid.o
 OBJS	+= read-write.o
diff --git a/tools/kvm/include/kvm/ioport.h b/tools/kvm/include/kvm/ioport.h
index 0218329..2fdcca4 100644
--- a/tools/kvm/include/kvm/ioport.h
+++ b/tools/kvm/include/kvm/ioport.h
@@ -10,6 +10,8 @@
 #define IOPORT_VIRTIO_BLK_SIZE		256
 #define IOPORT_VIRTIO_CONSOLE		0xd200	/* Virtio console device */
 #define IOPORT_VIRTIO_CONSOLE_SIZE	256
+#define IOPORT_VIRTIO_NET		0xe200	/* Virtio network device */
+#define IOPORT_VIRTIO_NET_SIZE		256
 
 struct kvm;
 
diff --git a/tools/kvm/include/kvm/types.h b/tools/kvm/include/kvm/types.h
new file mode 100644
index 0000000..0cbc5fb
--- /dev/null
+++ b/tools/kvm/include/kvm/types.h
@@ -0,0 +1,7 @@
+#ifndef KVM_TYPES_H
+#define KVM_TYPES_H
+
+/* FIXME: include/linux/if_tun.h and include/linux/if_ether.h complains */
+#define __be16 u16
+
+#endif /* KVM_TYPES_H */
diff --git a/tools/kvm/include/kvm/virtio-net.h b/tools/kvm/include/kvm/virtio-net.h
new file mode 100644
index 0000000..a1cab15
--- /dev/null
+++ b/tools/kvm/include/kvm/virtio-net.h
@@ -0,0 +1,7 @@
+#ifndef KVM__VIRTIO_NET_H
+#define KVM__VIRTIO_NET_H
+
+struct kvm;
+void virtio_net__init(struct kvm *self);
+
+#endif /* KVM__VIRTIO_NET_H */
diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 65c4787..6046a0a 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -16,6 +16,7 @@
 #include <kvm/kvm-cpu.h>
 #include <kvm/8250-serial.h>
 #include <kvm/virtio-blk.h>
+#include <kvm/virtio-net.h>
 #include <kvm/virtio-console.h>
 #include <kvm/disk-image.h>
 #include <kvm/util.h>
@@ -29,6 +30,7 @@
 
 #define DEFAULT_KVM_DEV		"/dev/kvm"
 #define DEFAULT_CONSOLE		"serial"
+#define DEFAULT_NETWORK		"none"
 
 #define MB_SHIFT		(20)
 #define MIN_RAM_SIZE_MB		(64ULL)
@@ -63,6 +65,7 @@ static const char *initrd_filename;
 static const char *image_filename;
 static const char *console;
 static const char *kvm_dev;
+static const char *network;
 static bool single_step;
 static bool readonly_image;
 extern bool ioport_debug;
@@ -84,6 +87,8 @@ static const struct option options[] = {
 			"Don't write changes back to disk image"),
 	OPT_STRING('c', "console", &console, "serial or virtio",
 			"Console to use"),
+	OPT_STRING('n', "network", &network, "virtio",
+			"Network to use"),
 
 	OPT_GROUP("Kernel options:"),
 	OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -250,6 +255,12 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 
 	virtio_console__init(kvm);
 
+	if (!network)
+		network = DEFAULT_NETWORK;
+
+	if (!strncmp(network, "virtio", 6))
+		virtio_net__init(kvm);
+
 	kvm__start_timer(kvm);
 
 	for (i = 0; i < nrcpus; i++) {
diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
new file mode 100644
index 0000000..ec70d5c
--- /dev/null
+++ b/tools/kvm/virtio-net.c
@@ -0,0 +1,318 @@
+#include "kvm/virtio-net.h"
+#include "kvm/virtio-pci.h"
+#include "kvm/virtio.h"
+#include "kvm/ioport.h"
+#include "kvm/types.h"
+#include "kvm/mutex.h"
+#include "kvm/util.h"
+#include "kvm/kvm.h"
+#include "kvm/pci.h"
+
+#include <linux/virtio_net.h>
+#include <linux/if_tun.h>
+#include <net/if.h>
+#include <sys/ioctl.h>
+#include <assert.h>
+#include <fcntl.h>
+
+#define VIRTIO_NET_IRQ		14
+#define VIRTIO_NET_QUEUE_SIZE	128
+#define VIRTIO_NET_NUM_QUEUES	2
+#define
Re: [PATCH] kvm tools: Implement virtio network device
On 4/13/11 2:48 PM, Asias He wrote:
> This patch implement virtio network device.
> Use '-n virtio or --network=virtio' to enable it.
>
> The current implementation uses tap which needs root privileges to create a
> virtual network device (tap0) on host side. Actually, what we need is
> CAP_NET_ADMIN.
>
> The host side tap0 is set to 192.168.33.2/24.
> You need to configure the guest side eth0 to any ip address in 192.168.33.0/24.
>
> Here are some scp performance tests for different implementations:
>
> None of rx and tx as thread:
> guest to host 3.2MB/s
> host to guest 3.1MB/s
>
> Only rx as thread:
> guest to host 14.7MB/s
> host to guest 33.4MB/s
>
> Both rx and tx as thread (This patch works this way):
> guest to host 19.8MB/s
> host to guest 32.5MB/s
>
> Signed-off-by: Asias He asias.he...@gmail.com

This is already in master. Thanks!
Re: [PATCH 0/4] [RFC rev2] Implement multiqueue (RX TX) virtio-net
On 04/05/2011 06:08 PM, Krishna Kumar wrote:
> This patchset implements both RX and TX MQ. Patches against virtio-net,
> vhost and qemu are included.
>
> Changes from rev1:
> ------------------
> 1. vqs are allocated as: rx/tx, rx/tx, rx/tx, etc. Lot of code in
>    guest/host/qemu changes, but code becomes simpler.
> 2. vhost: cache-align vhost_dev correctly.
> 3. virtio-net: cleanup properly on errors (e.g. detach buf for vq0, as
>    pointed out by Michael).
> 4. Minor changes:
>    - Fixed some typos.
>    - Changed vhost_get_thread_index to use MAX_VHOST_THREADS.
>    - Removed VIRTIO_MAX_TXQS.
>    - Changed capability to VIRTIO_NET_F_MULTIQUEUE.
>    - Modified numtxqs in virtnet_info to num_queue_pairs.
>      virtnet_info still has numtxqs as it is more convenient.
>    - Moved code for VIRTIO_NET_F_CTRL_VLAN into probe function.
>    - Improve check for return value of virtio_config_val().
>    - Removed cache align directives in guest as it was redundant.
> 5. "If we have a wrapper to init all vqs, pls add a wrapper to clean up
>    all vqs as well": I haven't done this as some errors are very
>    specific to the failure location (and what was initialized till
>    then). So only those errors are cleaned up using goto's like the rest
>    of the code. I can change in next version if you feel this is still
>    required.
> 6. "I think we should have free_unused_bufs that handles a single queue,
>    and call it in a loop": I haven't done this as I think the caller
>    wants all rx/tx queues to be cleaned up by calling this function.
>
> TODO's:
> 1. Reduce vectors for find_vqs().
> 2. Make vhost changes minimal. For now, I have restricted the number of
>    vhost threads to 4. This can be either made unrestricted; or if the
>    userspace vhost works, it can be removed altogether.
>
> Please review and provide feedback. I am travelling a bit in the next
> few days but will respond at the earliest.

Do you have an update to the virtio-pci spec for this?

--
error compiling committee.c: too many arguments to function
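The interleaved layout mentioned in item 1 of the changelog (rx/tx, rx/tx, rx/tx, ...) implies a simple mapping between a queue pair and its two virtqueue indices. The helpers below are a sketch of that arithmetic under the assumption that rx comes first within each pair and that no other virtqueue (such as a control vq) is interleaved; the function names are made up for illustration and are not taken from the patchset.

```c
#include <assert.h>

/* Hypothetical helpers for an rx/tx, rx/tx, ... virtqueue layout. */

/* Index of the receive virtqueue of queue pair 'pair'. */
unsigned rx_vq_index(unsigned pair)
{
    return 2 * pair;
}

/* Index of the transmit virtqueue of queue pair 'pair'. */
unsigned tx_vq_index(unsigned pair)
{
    return 2 * pair + 1;
}

/* Queue pair that virtqueue 'vq' belongs to. */
unsigned vq_to_pair(unsigned vq)
{
    return vq / 2;
}

/* Nonzero if virtqueue 'vq' is a transmit queue (odd index). */
int vq_is_tx(unsigned vq)
{
    return vq & 1;
}
```

For example, pair 0 uses vqs 0 and 1, pair 1 uses vqs 2 and 3, and so on, which is what keeps the per-pair code simple.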
Re: [PATCH] kvm tools: Implement virtio network device
On 04/13/2011 07:51 PM, Pekka Enberg wrote:
> On 4/13/11 2:48 PM, Asias He wrote:
>> This patch implement virtio network device.
>> Use '-n virtio or --network=virtio' to enable it.
>>
>> The current implementation uses tap which needs root privileges to create a
>> virtual network device (tap0) on host side. Actually, what we need is
>> CAP_NET_ADMIN.
>>
>> The host side tap0 is set to 192.168.33.2/24.
>> You need to configure the guest side eth0 to any ip address in 192.168.33.0/24.
>>
>> Here are some scp performance tests for different implementations:
>>
>> None of rx and tx as thread:
>> guest to host 3.2MB/s
>> host to guest 3.1MB/s
>>
>> Only rx as thread:
>> guest to host 14.7MB/s
>> host to guest 33.4MB/s
>>
>> Both rx and tx as thread (This patch works this way):
>> guest to host 19.8MB/s
>> host to guest 32.5MB/s
>>
>> Signed-off-by: Asias He asias.he...@gmail.com
>
> This is already in master. Thanks!

Ingo suggested CCing the updated version of this patch to the kvm list, so I am posting it again.

--
Best Regards,
Asias He
[PATCH 2/2] kvm tools: Make host side IP configurable
Add --host-ip-addr parameter to allow changing the host-side IP address.

Signed-off-by: Sasha Levin levinsasha...@gmail.com
---
 tools/kvm/include/kvm/virtio-net.h |    2 +-
 tools/kvm/kvm-run.c                |    9 -
 tools/kvm/virtio-net.c             |    9 -
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/tools/kvm/include/kvm/virtio-net.h b/tools/kvm/include/kvm/virtio-net.h
index a1cab15..03eb623 100644
--- a/tools/kvm/include/kvm/virtio-net.h
+++ b/tools/kvm/include/kvm/virtio-net.h
@@ -2,6 +2,6 @@
 #define KVM__VIRTIO_NET_H
 
 struct kvm;
-void virtio_net__init(struct kvm *self);
+void virtio_net__init(struct kvm *self, const char* host_ip_addr);
 
 #endif /* KVM__VIRTIO_NET_H */
diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 6046a0a..910a8d8 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -31,6 +31,7 @@
 #define DEFAULT_KVM_DEV		"/dev/kvm"
 #define DEFAULT_CONSOLE		"serial"
 #define DEFAULT_NETWORK		"none"
+#define DEFAULT_HOST_ADDR	"192.168.33.2"
 
 #define MB_SHIFT		(20)
 #define MIN_RAM_SIZE_MB		(64ULL)
@@ -66,6 +67,7 @@ static const char *image_filename;
 static const char *console;
 static const char *kvm_dev;
 static const char *network;
+static const char *host_ip_addr;
 static bool single_step;
 static bool readonly_image;
 extern bool ioport_debug;
@@ -89,6 +91,8 @@ static const struct option options[] = {
 			"Console to use"),
 	OPT_STRING('n', "network", &network, "virtio",
 			"Network to use"),
+	OPT_STRING('\0', "host-ip-addr", &host_ip_addr, "a.b.c.d",
+			"Assign this address to the host side networking"),
 
 	OPT_GROUP("Kernel options:"),
 	OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -218,6 +222,9 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 	else
 		active_console = CONSOLE_8250;
 
+	if (!host_ip_addr)
+		host_ip_addr = DEFAULT_HOST_ADDR;
+
 	term_init();
 
 	kvm = kvm__init(kvm_dev, ram_size);
@@ -259,7 +266,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 		network = DEFAULT_NETWORK;
 
 	if (!strncmp(network, "virtio", 6))
-		virtio_net__init(kvm);
+		virtio_net__init(kvm, host_ip_addr);
 
 	kvm__start_timer(kvm);
 
diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
index 5f9bf07..ec65779 100644
--- a/tools/kvm/virtio-net.c
+++ b/tools/kvm/virtio-net.c
@@ -276,7 +276,7 @@ static struct pci_device_header virtio_net_pci_device = {
 	.irq_line		= VIRTIO_NET_IRQ,
 };
 
-static void virtio_net__tap_init(void)
+static void virtio_net__tap_init(const char *host_ip_addr)
 {
 	struct ifreq ifr;
 	int sock = socket(AF_INET, SOCK_STREAM, 0);
@@ -298,8 +298,7 @@ static void virtio_net__tap_init(void)
 
 	strncpy(ifr.ifr_name, net_device.tap_name, sizeof(net_device.tap_name));
 
-	/*FIXME: Remove this after user can specify ip address and netmask*/
-	((struct sockaddr_in *)(&(ifr.ifr_addr)))->sin_addr.s_addr = inet_addr("192.168.33.2");
+	((struct sockaddr_in *)(&(ifr.ifr_addr)))->sin_addr.s_addr = inet_addr(host_ip_addr);
 	ifr.ifr_addr.sa_family = AF_INET;
 
 	if (ioctl(sock, SIOCSIFADDR, &ifr) < 0)
@@ -327,11 +326,11 @@ static void virtio_net__io_thread_init(struct kvm *self)
 	pthread_create(&net_device.io_tx_thread, NULL, virtio_net_tx_thread, (void *)self);
 }
 
-void virtio_net__init(struct kvm *self)
+void virtio_net__init(struct kvm *self, const char* host_ip_addr)
 {
 	pci__register(&virtio_net_pci_device, PCI_VIRTIO_NET_DEVNUM);
 	ioport__register(IOPORT_VIRTIO_NET, &virtio_net_io_ops, IOPORT_VIRTIO_NET_SIZE);
 
-	virtio_net__tap_init();
+	virtio_net__tap_init(host_ip_addr);
 	virtio_net__io_thread_init(self);
 }
--
1.7.5.rc1
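Since the address now comes from the command line, it may be worth rejecting malformed input before it reaches the ioctl. The sketch below is not part of the patch: resolve_host_ip() is a hypothetical helper, and it swaps inet_addr() (used by the patch) for inet_pton(), which reports parse errors unambiguously. The default mirrors DEFAULT_HOST_ADDR from the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

#define DEFAULT_HOST_ADDR "192.168.33.2"   /* same default as the patch */

/*
 * Hypothetical helper: turn the --host-ip-addr argument (or NULL when the
 * option was not given) into a binary IPv4 address.
 * Returns 0 on success, -1 if the string is not a dotted-quad address.
 */
int resolve_host_ip(const char *arg, struct in_addr *out)
{
    if (!arg)
        arg = DEFAULT_HOST_ADDR;

    /* inet_pton() returns 1 only for a well-formed address. */
    return inet_pton(AF_INET, arg, out) == 1 ? 0 : -1;
}
```

Note that a string such as "192.168.33.2/24" is rejected by this helper; handling a netmask suffix would need separate parsing.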
[PATCH 1/2] kvm tools: Set up tun interface using ioctls
Use ioctls to assign IP address and bring interface up instead of using ifconfig.

Signed-off-by: Sasha Levin levinsasha...@gmail.com
---
 tools/kvm/virtio-net.c |   25 ++---
 1 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
index ec70d5c..5f9bf07 100644
--- a/tools/kvm/virtio-net.c
+++ b/tools/kvm/virtio-net.c
@@ -14,6 +14,9 @@
 #include <sys/ioctl.h>
 #include <assert.h>
 #include <fcntl.h>
+#include <arpa/inet.h>
+#include <sys/types.h>
+#include <sys/socket.h>
 
 #define VIRTIO_NET_IRQ		14
 #define VIRTIO_NET_QUEUE_SIZE	128
@@ -276,7 +279,7 @@ static struct pci_device_header virtio_net_pci_device = {
 static void virtio_net__tap_init(void)
 {
 	struct ifreq ifr;
-
+	int sock = socket(AF_INET, SOCK_STREAM, 0);
 	net_device.tap_fd = open("/dev/net/tun", O_RDWR);
 	if (net_device.tap_fd < 0)
 		die("Unable to open /dev/net/tun\n");
@@ -291,9 +294,25 @@ static void virtio_net__tap_init(void)
 
 	ioctl(net_device.tap_fd, TUNSETNOCSUM, 1);
 
+	memset(&ifr, 0, sizeof(ifr));
+
+	strncpy(ifr.ifr_name, net_device.tap_name, sizeof(net_device.tap_name));
+
 	/*FIXME: Remove this after user can specify ip address and netmask*/
-	if (system("ifconfig tap0 192.168.33.2") < 0)
-		warning("Can not set ip address on tap0");
+	((struct sockaddr_in *)(&(ifr.ifr_addr)))->sin_addr.s_addr = inet_addr("192.168.33.2");
+	ifr.ifr_addr.sa_family = AF_INET;
+
+	if (ioctl(sock, SIOCSIFADDR, &ifr) < 0)
+		warning("Can not set ip address on tap device");
+
+	memset(&ifr, 0, sizeof(ifr));
+	strncpy(ifr.ifr_name, net_device.tap_name, sizeof(net_device.tap_name));
+	ioctl(sock, SIOCGIFFLAGS, &ifr);
+	ifr.ifr_flags |= IFF_UP | IFF_RUNNING;
+	if (ioctl(sock, SIOCSIFFLAGS, &ifr) < 0)
+		warning("Could not bring tap device up");
+
+	close(sock);
 }
 
 static void virtio_net__io_thread_init(struct kvm *self)
--
1.7.5.rc1
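The request-building part of this change can be exercised without privileges, which may help when reviewing it. The sketch below is an illustration only and is not the patch's code: ifreq_set_ipv4() is a hypothetical helper that prepares the struct ifreq one would pass to ioctl(sock, SIOCSIFADDR, &ifr). It assumes Linux/glibc headers, bounds the interface name by IFNAMSIZ, and uses inet_pton() in place of the patch's inet_addr(). The ioctl itself, which needs CAP_NET_ADMIN, is deliberately left out.

```c
#define _DEFAULT_SOURCE
#include <arpa/inet.h>
#include <assert.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill 'ifr' so that it names interface 'ifname'
 * and carries IPv4 address 'ip' in ifr_addr.
 * Returns 0 on success, -1 if the name is too long or the address is
 * not a valid dotted quad.
 */
int ifreq_set_ipv4(struct ifreq *ifr, const char *ifname, const char *ip)
{
    struct sockaddr_in sin;

    if (strlen(ifname) >= IFNAMSIZ)
        return -1;                  /* would not fit with its terminator */

    memset(ifr, 0, sizeof(*ifr));
    strcpy(ifr->ifr_name, ifname);  /* length checked above */

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
        return -1;

    /* ifr_addr is a generic struct sockaddr; copy the IPv4 form into it. */
    memcpy(&ifr->ifr_addr, &sin, sizeof(sin));
    return 0;
}
```

A caller would then open an AF_INET socket and issue the SIOCSIFADDR ioctl with the prepared structure, as the patch does.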
Re: [PATCH 2/2] kvm tools: Make host side IP configurable
On 04/13/2011 08:13 PM, Sasha Levin wrote:
> Add --host-ip-addr parameter to allow changing the host-side IP address.

I'd personally prefer something like this:

--network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy

Once we can set up ip address for guest, we can use:

--network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy,guestip=z.z.z.z/mask,guestgw=v.v.v.v

> Signed-off-by: Sasha Levin levinsasha...@gmail.com
> ---
>  tools/kvm/include/kvm/virtio-net.h |    2 +-
>  tools/kvm/kvm-run.c                |    9 -
>  tools/kvm/virtio-net.c             |    9 -
>  3 files changed, 13 insertions(+), 7 deletions(-)
> [...]

--
Best Regards,
Asias He
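For comparison with separate options, the comma-separated form proposed above needs a small sub-parser of its own. The following is a rough sketch of what that could look like; it is not code from the kvm tool, struct net_opts and parse_network_opt() are invented for the example, and only the model, hostip and guestmac keys are handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Hypothetical container for the parsed sub-options. */
struct net_opts {
    char model[16];
    char hostip[32];
    char guestmac[18];
};

/*
 * Parse a "key=value,key=value" string such as
 * "model=virtio,hostip=192.168.33.2/24".
 * Returns 0 on success, -1 on an unknown key, a missing '=',
 * or a value that does not fit.
 */
int parse_network_opt(const char *arg, struct net_opts *o)
{
    char copy[128];
    char *save = NULL, *tok;

    if (strlen(arg) >= sizeof(copy))
        return -1;
    strcpy(copy, arg);              /* strtok_r modifies its input */
    memset(o, 0, sizeof(*o));

    for (tok = strtok_r(copy, ",", &save); tok;
         tok = strtok_r(NULL, ",", &save)) {
        char *eq = strchr(tok, '=');
        char *dst;
        size_t cap;

        if (!eq)
            return -1;
        *eq++ = '\0';               /* split "key=value" in place */

        if (!strcmp(tok, "model")) {
            dst = o->model;     cap = sizeof(o->model);
        } else if (!strcmp(tok, "hostip")) {
            dst = o->hostip;    cap = sizeof(o->hostip);
        } else if (!strcmp(tok, "guestmac")) {
            dst = o->guestmac;  cap = sizeof(o->guestmac);
        } else {
            return -1;
        }

        if (strlen(eq) >= cap)
            return -1;
        strcpy(dst, eq);
    }
    return 0;
}
```

The trade-off is that validation and help text for each key live in this parser rather than in the generic option table.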
Re: [PATCH 2/2] kvm tools: Make host side IP configurable
On 4/13/11 3:28 PM, Asias He wrote:
> On 04/13/2011 08:13 PM, Sasha Levin wrote:
>> Add --host-ip-addr parameter to allow changing the host-side IP address.
>
> I'd personally prefer something like this:
>
> --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy
>
> Once we can set up ip address for guest, we can use:
>
> --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy,guestip=z.z.z.z/mask,guestgw=v.v.v.v

What's the benefit over

  --network=virtio --hostip=x.x.x.x/mask --guestmac= [...]

form? Isn't Sasha's form a much better fit with the git-like front end (for parsing and help text)?

Pekka
Re: [PATCH 1/2] kvm tools: Set up tun interface using ioctls
On 04/13/2011 08:13 PM, Sasha Levin wrote:
> Use ioctls to assign IP address and bring interface up instead of using ifconfig.
>
> Signed-off-by: Sasha Levin levinsasha...@gmail.com
> ---
>  tools/kvm/virtio-net.c |   25 ++---
>  1 files changed, 22 insertions(+), 3 deletions(-)
> [...]

Looks good to me.

Reviewed-by: Asias He asias.he...@gmail.com

--
Best Regards,
Asias He
Re: Kernel oops in host caused by mmaping RAM
On Wed, Apr 13, 2011 at 12:09 AM, Jan Kiszka jan.kis...@web.de wrote:
> On 2011-04-12 21:41, Sasha Levin wrote:
>> Hello,
>>
>> I've tried using mmap to map the RAM of a guest instead of
>> posix_memalign which is used both in the kvm tool and qemu.
>>
>> Doing so caused a kernel Oops, which happens every time I run the code
>> and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
>
> Can you share the test case that triggers it? That's easier than
> guessing what you did precisely.

It's the native Linux kvm tool patched to use mmap() instead of posix_memalign().

Sasha, maybe you should post your patch so other people can try to reproduce the problem?
Re: Kernel oops in host caused by mmaping RAM
On Wed, Apr 13, 2011 at 3:50 PM, Pekka Enberg penb...@kernel.org wrote:
> On Wed, Apr 13, 2011 at 12:09 AM, Jan Kiszka jan.kis...@web.de wrote:
>> On 2011-04-12 21:41, Sasha Levin wrote:
>>> Hello,
>>>
>>> I've tried using mmap to map the RAM of a guest instead of
>>> posix_memalign which is used both in the kvm tool and qemu.
>>>
>>> Doing so caused a kernel Oops, which happens every time I run the code
>>> and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
>>
>> Can you share the test case that triggers it? That's easier than
>> guessing what you did precisely.
>
> It's the native Linux kvm tool patched to use mmap() instead of
> posix_memalign().
>
> Sasha, maybe you should post your patch so other people can try to
> reproduce the problem?

I provided Jan with a patch to the kvm tool yesterday. Jan has reproduced the oops and sent a patch to kernel-side KVM to fix it.

Here's the patch for the Linux kvm tool which triggered the oops.

diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index 08ff63c..bac2a5e 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -158,7 +158,6 @@ struct kvm *kvm__init(const char *kvm_dev, unsigned long ram_size)
 	struct kvm_userspace_memory_region mem;
 	struct kvm_pit_config pit_config = { .flags = 0, };
 	struct kvm *self;
-	long page_size;
 	int ret;
 
 	if (!kvm__cpu_supports_vm())
@@ -199,8 +198,8 @@ struct kvm *kvm__init(const char *kvm_dev, unsigned long ram_size)
 
 	self->ram_size = ram_size;
 
-	page_size = sysconf(_SC_PAGESIZE);
-	if (posix_memalign(&self->ram_start, page_size, self->ram_size) != 0)
+	self->ram_start = mmap(NULL, self->ram_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS, -1, 0);
+	if (self == MAP_FAILED)
 		die("out of memory");
 
 	mem = (struct kvm_userspace_memory_region) {

--
Sasha.
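One detail of the test patch worth noting: it compares self, not self->ram_start, against MAP_FAILED, so an mmap() failure would go undetected. That is unrelated to the oops itself, but for anyone reusing the snippet, here is a small standalone sketch of the same anonymous mapping with the check applied to the returned pointer. alloc_guest_ram() and free_guest_ram() are hypothetical names, not functions from the kvm tool.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve 'size' bytes of anonymous, zero-filled,
 * page-aligned memory with the same flags as the patch above.
 * Returns NULL on failure instead of MAP_FAILED.
 */
void *alloc_guest_ram(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS, -1, 0);

    /* The failure check must be on the pointer mmap() returned. */
    return p == MAP_FAILED ? NULL : p;
}

/* Release a region obtained from alloc_guest_ram(). */
void free_guest_ram(void *p, size_t size)
{
    if (p)
        munmap(p, size);
}
```

mmap() always returns page-aligned memory, so the explicit page-size lookup needed for posix_memalign() is not required here.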
Re: [Qemu-devel] [PATCH 2/2 V7] qemu,qmp: add inject-nmi qmp command
On Tue, 12 Apr 2011 21:31:18 +0300 Blue Swirl blauwir...@gmail.com wrote:
> On Tue, Apr 12, 2011 at 10:52 AM, Avi Kivity a...@redhat.com wrote:
> > On 04/11/2011 08:15 PM, Blue Swirl wrote:
> > > On Mon, Apr 11, 2011 at 10:01 AM, Markus Armbruster arm...@redhat.com wrote:
> > > > Avi Kivity a...@redhat.com writes:
> > > > > On 04/08/2011 12:41 AM, Anthony Liguori wrote:
> > > > > > And it's a good thing to have, but exposing this as the only API to
> > > > > > do something as simple as generating a guest crash dump is not the
> > > > > > friendliest thing in the world to do to users.
> > > > >
> > > > > nmi is a fine name for something that corresponds to a real-life nmi
> > > > > button (often labeled NMI).
> > > >
> > > > Agree.
> > >
> > > We could also introduce an alias mechanism for user friendly names, so
> > > nmi could be used in addition of full path. Aliases could be useful
> > > for device paths as well.
> >
> > Yes. Perhaps limited to the human monitor.
>
> I'd limit all debugging commands (including NMI) to the human monitor.

Why?
Re: buggy emulate_int_real
Quoting Jan Kiszka (jan.kis...@web.de):
> Looks consistent to me.

It's working for me, so if there are no further comments I'll improve the changelog and send to lkml.

thanks,
-serge
Re: [PATCH 2/2] kvm tools: Make host side IP configurable
On 04/13/2011 08:28 PM, Asias He wrote:

On 04/13/2011 08:13 PM, Sasha Levin wrote:

Add --host-ip-addr parameter to allow changing the host-side IP address.

I'd personally prefer something like this:

--network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy

Once we can set up ip address for guest, we can use:

--network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy,guestip=z.z.z.z/mask,guestgw=v.v.v.v

Alternatively, we can use an option group like this:

Network options:
    -n --network   virtio
    -h --hostip    x.x.x.x/n
    -y --guestip   x.x.x.x/n
    -z --guestmac  xx:xx:xx:xx:xx:xx

--
Best Regards,
Asias He
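Editorial sketch, not part of the thread: the single-option form proposed above (--network=key=value,key=value,...) could be parsed roughly as shown below. The struct net_opts layout, the field sizes and the function name net_opts_parse() are assumptions made for illustration. Nothing here reflects what the kvm tool actually implements.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Hypothetical container for the keys named in the proposal. */
struct net_opts {
	char model[16];
	char hostip[32];	/* "x.x.x.x/mask" */
	char guestmac[18];	/* "yy:yy:yy:yy:yy:yy" plus NUL */
	char guestip[32];
	char guestgw[32];
};

/* Copy a value into a fixed-size field; fail if it does not fit. */
static int set_field(char *dst, size_t len, const char *val)
{
	if (strlen(val) >= len)
		return -1;
	strcpy(dst, val);
	return 0;
}

/*
 * Parse "key=value,key=value,..." into *opts.
 * Returns 0 on success, -1 on a malformed pair, an unknown key or an
 * oversized value.
 */
int net_opts_parse(const char *arg, struct net_opts *opts)
{
	char buf[256];
	char *save = NULL;
	char *pair;

	assert(arg && opts);
	memset(opts, 0, sizeof(*opts));
	if (strlen(arg) >= sizeof(buf))
		return -1;
	strcpy(buf, arg);

	for (pair = strtok_r(buf, ",", &save); pair;
	     pair = strtok_r(NULL, ",", &save)) {
		char *val = strchr(pair, '=');
		int rc;

		if (!val)
			return -1;
		*val++ = '\0';	/* split "key=value" in place */

		if (!strcmp(pair, "model"))
			rc = set_field(opts->model, sizeof(opts->model), val);
		else if (!strcmp(pair, "hostip"))
			rc = set_field(opts->hostip, sizeof(opts->hostip), val);
		else if (!strcmp(pair, "guestmac"))
			rc = set_field(opts->guestmac, sizeof(opts->guestmac), val);
		else if (!strcmp(pair, "guestip"))
			rc = set_field(opts->guestip, sizeof(opts->guestip), val);
		else if (!strcmp(pair, "guestgw"))
			rc = set_field(opts->guestgw, sizeof(opts->guestgw), val);
		else
			rc = -1;
		if (rc)
			return -1;
	}
	return 0;
}
```

One argument for the single-option form is that new keys such as guestip or guestgw can be added later without consuming more short option letters.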
Re: [transparent networking] Re: [PATCH] kvm tools: Implement virtio network device
On 04/13/2011 04:02 PM, Ingo Molnar wrote:

* Asias He asias.he...@gmail.com wrote:

Here are some scp performance test for different implementations:

None of rx and tx as thread:
    guest to host 3.2MB/s
    host to guest 3.1MB/s

Only rx as thread:
    guest to host 14.7MB/s
    host to guest 33.4MB/s

Both rx and tx as thread (This patch works this way):
    guest to host 19.8MB/s
    host to guest 32.5MB/s

Signed-off-by: Asias He asias.he...@gmail.com

This is already in master. Thanks!

Ingo suggested to CC the updated version of this patch to kvm list. So I am posting this patch again.

Thanks Asias, cool stuff.

Maybe other KVM developers want to chime in about how to best implement transparent (non-TAP-using) guest-side networking.

The best approach would be to not go down as low as the IP/Ethernet packeting level (it's unnecessary protocol overhead), but to implement some sort of streaming, virtio based TCP connection proxying support.

Strictly speaking the guest does not need ICMP packets to have working Internet connectivity - only passing/tunneling through TCP sockets would be enough. The following highlevel ops are needed:

 - connect/shutdown/close
 - send/receive
 - poll

And would be passed through to the host side and mirrored there into real connect/shutdown TCP socket ops and into send/receive ops.

The guest OS does not need to be 'aware' of this in any way, as long as the bzImage has this magic guest tunneling support included.

Obviously, such a highlevel approach would be much faster as well than any packet level virtual networking approach. Does something like this exist upstream, or do we have to implement it?

macvtap does non-privileged setupless networking.

--
error compiling committee.c: too many arguments to function
Re: [transparent networking] Re: [PATCH] kvm tools: Implement virtio network device
On 04/13/2011 04:33 PM, Avi Kivity wrote: On 04/13/2011 04:02 PM, Ingo Molnar wrote: * Asias Heasias.he...@gmail.com wrote: Here are some scp performance test for differenct implementations: None of rx and tx as thread: guest to host 3.2MB/s host to guest 3.1MB/s Only rx as thread: guest to host 14.7MB/s host to guest 33.4MB/s Both rx and tx as thread(This patch works this way): guest to host 19.8MB/s host to guest 32.5MB/s Signed-off-by: Asias Heasias.he...@gmail.com This is already in master. Thanks! Ingo suggested to CC the updated version of this patch to kvm list. So I am posting this patch again. Thanks Asias, cool stuff. Maybe other KVM developers want to chime in about how to best implement transparent (non-TAP-using) guest-side networking. The best approach would be to not go down as low as the IP/Ethernet packeting level (it's unnecessary protocol overhead), but to implement some sort of streaming, virtio based TCP connection proxying support. Strictly talking the guest does not need ICMP packets to have working Internet connectivity - only passing/tunneling through TCP sockets would be enough. The following highlevel ops are needed: - connect/shutdown/close - send/receive - poll And would be passed through to the host side and mirrored there into real connect/shutdown TCP socket ops and into send/receive ops. The guest OS does not need to be 'aware' of this in any way, as long as the bzImage has this magic guest tunneling support included. Obviously, such a highlevel approach would be much faster as well than any packet level virtual networking approach. Does something like this exist upstream, or do we have to implement it? macvtap does non-privileged setupless networking. But this doesn't really answer your message. No, there is no tcp-level virtio device. However, because of GSO/GRO, I don't think there is a huge win to be gained by bypassing the lower layers. 
If you want to send a megabyte's worth of data into a tcp stream, you prepend a header and post it to virtio-net, and this goes all the way down to the real device. I'm not sure tcp-offload like you propose would pass netdev@. Similar approaches for real hardware were rejected since they would bypass the tcp stack. Things like netfilter would no longer work. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm tools: Make host side IP configurable
On Wed, Apr 13, 2011 at 4:28 PM, Asias He asias.he...@gmail.com wrote: On 04/13/2011 08:28 PM, Asias He wrote: On 04/13/2011 08:13 PM, Sasha Levin wrote: Add --host-ip-addr parameter to allow changing the host-side IP address. I'd personally prefer something like this: --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy Once we can set up ip address for guest, we can use: --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy,guestip=z.z.z.z/mask,guestgw=v.v.v.v Alternatively, we can use an option group like this: Network options: -n --network virtio -h --hostip x.x.x.x/n -y --guestip x.x.x.x/n -z --guestmac xx:xx:xx:xx:xx:xx I agree. We can create a networking group once we start expanding the options. -- Sasha. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [transparent networking] Re: [PATCH] kvm tools: Implement virtio network device
On 04/13/2011 09:33 PM, Avi Kivity wrote: On 04/13/2011 04:02 PM, Ingo Molnar wrote: * Asias Heasias.he...@gmail.com wrote: Here are some scp performance test for differenct implementations: None of rx and tx as thread: guest to host 3.2MB/s host to guest 3.1MB/s Only rx as thread: guest to host 14.7MB/s host to guest 33.4MB/s Both rx and tx as thread(This patch works this way): guest to host 19.8MB/s host to guest 32.5MB/s Signed-off-by: Asias Heasias.he...@gmail.com This is already in master. Thanks! Ingo suggested to CC the updated version of this patch to kvm list. So I am posting this patch again. Thanks Asias, cool stuff. Maybe other KVM developers want to chime in about how to best implement transparent (non-TAP-using) guest-side networking. The best approach would be to not go down as low as the IP/Ethernet packeting level (it's unnecessary protocol overhead), but to implement some sort of streaming, virtio based TCP connection proxying support. Strictly talking the guest does not need ICMP packets to have working Internet connectivity - only passing/tunneling through TCP sockets would be enough. The following highlevel ops are needed: - connect/shutdown/close - send/receive - poll And would be passed through to the host side and mirrored there into real connect/shutdown TCP socket ops and into send/receive ops. The guest OS does not need to be 'aware' of this in any way, as long as the bzImage has this magic guest tunneling support included. Obviously, such a highlevel approach would be much faster as well than any packet level virtual networking approach. Does something like this exist upstream, or do we have to implement it? macvtap does non-privileged setupless networking. Great! Thanks Avi! -- Best Regards, Asias He -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm tools: Set up tun interface using ioctls
On Wed, 13 Apr 2011, Sasha Levin wrote: Use ioctls to assign IP address and bring interface up instead of using ifconfig. Signed-off-by: Sasha Levin levinsasha...@gmail.com I'm seeing this: penberg@tiger:~/linux/tools/kvm$ make GEN include/common-cmds.h CC 8250-serial.o CC virtio-blk.o CC virtio-net.o cc1: warnings being treated as errors virtio-net.c: In function ‘virtio_net__init’: virtio-net.c:302: error: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules virtio-net.c:302: note: initialized from here make: *** [virtio-net.o] Error 1
Re: buggy emulate_int_real
Quoting Avi Kivity (a...@redhat.com):
> On 04/13/2011 04:24 PM, Serge E. Hallyn wrote:
> > Quoting Jan Kiszka (jan.kis...@web.de):
> > > Looks consistent to me.
> >
> > It's working for me, so if there are no further comments
>
> Looks good to me too.
>
> > I'll improve the changelog and send to lkml.
>
> kvm@, surely.

Oh? Didn't realize that was the proper path - absolutely.

thanks,
-serge
Re: [PATCH 1/2] kvm tools: Set up tun interface using ioctls
On 04/13/2011 09:49 PM, Pekka Enberg wrote: On Wed, 13 Apr 2011, Sasha Levin wrote: Use ioctls to assign IP address and bring interface up instead of using ifconfig. Signed-off-by: Sasha Levin levinsasha...@gmail.com I'm seeing this: penberg@tiger:~/linux/tools/kvm$ make GEN include/common-cmds.h CC 8250-serial.o CC virtio-blk.o CC virtio-net.o cc1: warnings being treated as errors virtio-net.c: In function ‘virtio_net__init’: virtio-net.c:302: error: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules virtio-net.c:302: note: initialized from here make: *** [virtio-net.o] Error 1 Hi, Pekka I am wondering why is your gcc always stricter than mine? -- Best Regards, Asias He -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] kvm tools: Set up tun interface using ioctls
On Wed, 2011-04-13 at 21:59 +0800, Asias He wrote:
> On 04/13/2011 09:49 PM, Pekka Enberg wrote:
> > On Wed, 13 Apr 2011, Sasha Levin wrote:
> > > Use ioctls to assign IP address and bring interface up instead of using ifconfig.
> > >
> > > Signed-off-by: Sasha Levin levinsasha...@gmail.com
> >
> > I'm seeing this:
> >
> > penberg@tiger:~/linux/tools/kvm$ make
> >   GEN      include/common-cmds.h
> >   CC       8250-serial.o
> >   CC       virtio-blk.o
> >   CC       virtio-net.o
> > cc1: warnings being treated as errors
> > virtio-net.c: In function ‘virtio_net__init’:
> > virtio-net.c:302: error: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
> > virtio-net.c:302: note: initialized from here
> > make: *** [virtio-net.o] Error 1
>
> Hi, Pekka
>
> I am wondering why is your gcc always stricter than mine?

I have gcc 4.4.3 here. I wouldn't be surprised if those were actually gcc bugs and that we should turn off the aliasing checks. However, for this particular case, it was pretty good at spotting questionable code so... :-)

Pekka
Re: [PATCH 2/2] kvm tools: Make host side IP configurable
On Wed, 2011-04-13 at 16:38 +0300, Sasha Levin wrote: On Wed, Apr 13, 2011 at 4:28 PM, Asias He asias.he...@gmail.com wrote: On 04/13/2011 08:28 PM, Asias He wrote: On 04/13/2011 08:13 PM, Sasha Levin wrote: Add --host-ip-addr parameter to allow changing the host-side IP address. I'd personally prefer something like this: --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy Once we can set up ip address for guest, we can use: --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy,guestip=z.z.z.z/mask,guestgw=v.v.v.v Alternatively, we can use an option group like this: Network options: -n --network virtio -h --hostipx.x.x.x/n -y --guestip x.x.x.x/n -z --guestmac xx:xx:xx:xx:xx:xx I agree. We can create a networking group once we start expanding the options. Asias, is the patch OK to merge? -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
kvm mmu tracing quirks
Hi Xiao,

while tracing a guest with all events enabled, I noticed some issues with kvm mmu instrumentations. A critical one:

TRACE_EVENT(
	kvm_mmu_audit,
	TP_PROTO(struct kvm_vcpu *vcpu, int audit_point),
	TP_ARGS(vcpu, audit_point),

	TP_STRUCT__entry(
		__field(struct kvm_vcpu *, vcpu)
		__field(int, audit_point)
	),

	TP_fast_assign(
		__entry->vcpu = vcpu;
		__entry->audit_point = audit_point;
	),

	TP_printk("vcpu:%d %s", __entry->vcpu->cpu,
		  audit_point_name[__entry->audit_point])
);

Saving the vcpu reference to the trace buffer can break on dump if the vcpu was destroyed in the meantime. I was about to fix that by saving vcpu->cpu instead, but then I wondered what kind of information you actually need here. The triggering host cpu is automatically recorded by ftrace anyway. Can you comment on this / clean it up?

Also, it would be nice to use __print_symbolic for translating audit_point to a string as that would also work for trace-cmd/kernelshark.

And then we have tracepoints for kvm_mmu_sync_page, kvm_mmu_unsync_page and kvm_mmu_prepare_zap_page. All take a kvm_mmu_page as argument but do nothing with it. On the other hand, the kvm plugin of trace-cmd expects values like gfn, role, or unsync from those. Maybe you can have a look.

The latter topic is just for the sake of cleanness, but the former one is actually problematic even if you are not interested in all mmu details but only enable those events by chance.

Thanks,
Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
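Editorial sketch, not part of the thread: the lifetime problem described above can be shown in plain user-space C. A trace record is formatted long after it was written, so it should carry copies of the values it prints and not a pointer to an object that may already be gone. The types, the function names and the strings in toy_audit_point_name[] are all invented for illustration. This is not the kernel TRACE_EVENT machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a vcpu object with its own lifetime. */
struct toy_vcpu {
	int vcpu_id;
};

/* The record keeps scalar copies only, never the vcpu pointer. */
struct audit_entry {
	int vcpu_id;
	int audit_point;
};

/* Invented names; the real audit point names are not shown in the thread. */
static const char *const toy_audit_point_name[] = { "point-0", "point-1" };

/* Capture everything needed for printing at record time. */
struct audit_entry audit_record(const struct toy_vcpu *vcpu, int audit_point)
{
	struct audit_entry e;

	assert(vcpu);
	e.vcpu_id = vcpu->vcpu_id;
	e.audit_point = audit_point;
	return e;
}

/* Format a record later; it no longer depends on the vcpu being alive. */
int audit_format(const struct audit_entry *e, char *buf, size_t len)
{
	size_t n = sizeof(toy_audit_point_name) / sizeof(toy_audit_point_name[0]);
	const char *name = "unknown";

	if (e->audit_point >= 0 && (size_t)e->audit_point < n)
		name = toy_audit_point_name[e->audit_point];
	return snprintf(buf, len, "vcpu:%d %s", e->vcpu_id, name);
}
```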
[PATCH] kvm: fix push of wrong eip when doing softint
When doing a soft int, we need to bump eip before pushing it to the stack. Otherwise we'll do the int a second time.

[a...@canonical.com: merged eip update as per Jan's recommendation.]
Signed-off-by: Serge E. Hallyn serge.hal...@ubuntu.com
Signed-off-by: Andy Whitcroft a...@canonical.com
---
 arch/x86/kvm/vmx.c |   12 +++++++++---
 arch/x86/kvm/x86.c |    5 +++--
 arch/x86/kvm/x86.h |    2 +-
 3 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index bf89ec2..3aad96c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1053,7 +1053,10 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 	}
 
 	if (vmx->rmode.vm86_active) {
-		if (kvm_inject_realmode_interrupt(vcpu, nr) != EMULATE_DONE)
+		int inc_eip = 0;
+		if (kvm_exception_is_soft(nr))
+			inc_eip = vcpu->arch.event_exit_inst_len;
+		if (kvm_inject_realmode_interrupt(vcpu, nr, inc_eip) != EMULATE_DONE)
 			kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 		return;
 	}
@@ -2871,7 +2874,10 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
 
 	++vcpu->stat.irq_injections;
 	if (vmx->rmode.vm86_active) {
-		if (kvm_inject_realmode_interrupt(vcpu, irq) != EMULATE_DONE)
+		int inc_eip = 0;
+		if (vcpu->arch.interrupt.soft)
+			inc_eip = vcpu->arch.event_exit_inst_len;
+		if (kvm_inject_realmode_interrupt(vcpu, irq, inc_eip) != EMULATE_DONE)
 			kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 		return;
 	}
@@ -2905,7 +2911,7 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 
 	++vcpu->stat.nmi_injections;
 	if (vmx->rmode.vm86_active) {
-		if (kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR) != EMULATE_DONE)
+		if (kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR, 0) != EMULATE_DONE)
 			kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 		return;
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bcc0efc..980317a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4293,7 +4293,7 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
 	memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
 }
 
-int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq)
+int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
 {
 	struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
 	int ret;
@@ -4302,7 +4302,8 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq)
 
 	vcpu->arch.emulate_ctxt.decode.op_bytes = 2;
 	vcpu->arch.emulate_ctxt.decode.ad_bytes = 2;
-	vcpu->arch.emulate_ctxt.decode.eip = vcpu->arch.emulate_ctxt.eip;
+	vcpu->arch.emulate_ctxt.decode.eip = vcpu->arch.emulate_ctxt.eip +
+								inc_eip;
 	ret = emulate_int_real(&vcpu->arch.emulate_ctxt, &emulate_ops, irq);
 
 	if (ret != X86EMUL_CONTINUE)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index c600da8..e407ed3 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -77,7 +77,7 @@ static inline u32 bit(int bitno)
 
 void kvm_before_handle_nmi(struct kvm_vcpu *vcpu);
 void kvm_after_handle_nmi(struct kvm_vcpu *vcpu);
-int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq);
+int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
 
 void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data);
 
--
1.7.4.1
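Editorial sketch, not part of the patch: the arithmetic behind the fix can be written down in a few lines. For a software interrupt the return address pushed on the stack has to point past the INT instruction, otherwise the handler's IRET lands on the same INT again. The function name realmode_pushed_ip() is hypothetical. Truncating the result to 16 bits is an assumption based on the patch setting op_bytes to 2 for real mode.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the IP value that ends up on the stack when
 * an event is injected in real mode. insn_len plays the role of
 * event_exit_inst_len and is only applied to software interrupts.
 */
uint16_t realmode_pushed_ip(uint32_t eip, int is_soft, uint32_t insn_len)
{
	uint32_t inc_eip;

	assert(insn_len <= 15);		/* x86 instructions are at most 15 bytes */
	inc_eip = is_soft ? insn_len : 0;
	return (uint16_t)(eip + inc_eip);
}
```

As a usage example, a two-byte `int $0x10` at IP 0x7c10 should push 0x7c12, while an external interrupt arriving at the same IP pushes 0x7c10 unchanged.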
Re: [PATCH 2/2] kvm tools: Make host side IP configurable
On 04/13/2011 10:07 PM, Pekka Enberg wrote: On Wed, 2011-04-13 at 16:38 +0300, Sasha Levin wrote: On Wed, Apr 13, 2011 at 4:28 PM, Asias He asias.he...@gmail.com wrote: On 04/13/2011 08:28 PM, Asias He wrote: On 04/13/2011 08:13 PM, Sasha Levin wrote: Add --host-ip-addr parameter to allow changing the host-side IP address. I'd personally prefer something like this: --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy Once we can set up ip address for guest, we can use: --network=model=virtio,hostip=x.x.x.x/mask,guestmac=yy:yy:yy:yy,guestip=z.z.z.z/mask,guestgw=v.v.v.v Alternatively, we can use an option group like this: Network options: -n --network virtio -h --hostipx.x.x.x/n -y --guestip x.x.x.x/n -z --guestmac xx:xx:xx:xx:xx:xx I agree. We can create a networking group once we start expanding the options. Asias, is the patch OK to merge? The first patch is ok. Sasha needs modification for the second patch. -- Best Regards, Asias He -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: emulator: Handle wraparound in (cs_base + offset) when fetching.
Currently, setting a large (i.e. negative) base address for %cs does not work on a 64-bit host. The JOS teaching operating system, used by MIT and other universities, relies on such segments while bootstrapping its way to full virtual memory management.

Signed-off-by: Nelson Elhage nelh...@ksplice.com
---
 arch/x86/kvm/emulate.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0ad47b8..54e84b2 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -505,9 +505,12 @@ static int do_fetch_insn_byte(struct x86_emulate_ctxt *ctxt,
 	int size, cur_size;
 
 	if (eip == fc->end) {
+		unsigned long linear = eip + ctxt->cs_base;
+		if (ctxt->mode != X86EMUL_MODE_PROT64)
+			linear &= (u32)-1;
 		cur_size = fc->end - fc->start;
 		size = min(15UL - cur_size, PAGE_SIZE - offset_in_page(eip));
-		rc = ops->fetch(ctxt->cs_base + eip, fc->data + cur_size,
+		rc = ops->fetch(linear, fc->data + cur_size,
 				size, ctxt->vcpu, &ctxt->exception);
 		if (rc != X86EMUL_CONTINUE)
 			return rc;
--
1.7.2.43.g36c08.dirty
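Editorial sketch, not part of the patch: the wraparound is easy to see with concrete numbers. On a 64-bit host the sum of segment base and offset is computed in 64 bits, so a carry out of bit 31 survives unless it is masked off. The enum, the function name fetch_linear() and the example addresses are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the emulator's mode constants. */
enum emul_mode { MODE_PROT32, MODE_PROT64 };

/*
 * Compute the linear fetch address from a code segment base and an
 * instruction pointer, truncating to 32 bits outside 64-bit mode.
 */
uint64_t fetch_linear(uint64_t cs_base, uint64_t eip, enum emul_mode mode)
{
	uint64_t linear = cs_base + eip;

	/* Outside long mode the instruction pointer has at most 32 bits. */
	assert(mode == MODE_PROT64 || eip <= UINT32_MAX);

	if (mode != MODE_PROT64)
		linear &= (uint32_t)-1;	/* keep only the low 32 bits */
	return linear;
}
```

With an illustrative base of 0xF0000000 and an offset of 0x10100000, the 64-bit sum is 0x100100000. A 32-bit guest expects the fetch to hit 0x00100000, which is what the mask produces.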
Re: [PATCH] KVM: emulator: Handle wraparound in (cs_base + offset) when fetching.
On 04/13/2011 06:44 PM, Nelson Elhage wrote:
> Currently, setting a large (i.e. negative) base address for %cs does not work on
> a 64-bit host. The JOS teaching operating system, used by MIT and other
> universities, relies on such segments while bootstrapping its way to full
> virtual memory management.
>
> Signed-off-by: Nelson Elhage nelh...@ksplice.com
> ---
>  arch/x86/kvm/emulate.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 0ad47b8..54e84b2 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -505,9 +505,12 @@ static int do_fetch_insn_byte(struct x86_emulate_ctxt *ctxt,
>  	int size, cur_size;
>
>  	if (eip == fc->end) {
> +		unsigned long linear = eip + ctxt->cs_base;
> +		if (ctxt->mode != X86EMUL_MODE_PROT64)
> +			linear &= (u32)-1;
>  		cur_size = fc->end - fc->start;
>  		size = min(15UL - cur_size, PAGE_SIZE - offset_in_page(eip));
> -		rc = ops->fetch(ctxt->cs_base + eip, fc->data + cur_size,
> +		rc = ops->fetch(linear, fc->data + cur_size,
>  				size, ctxt->vcpu, &ctxt->exception);
>  		if (rc != X86EMUL_CONTINUE)
>  			return rc;

A better fix would be to call linearize() here, which does the necessary truncation as well as segment checks. However, this patch is a lot more backportable, so I think it should be applied, and a conversion to linearize() performed afterwards.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] KVM test: Unattended install - Give Linux VMs time to shutdown cleanly
On Tue, Apr 12, 2011 at 07:28:15PM -0300, Lucas Meneghel Rodrigues wrote: During unattended install, right after we receive the ACK from the guest the test is deemed to be finished, and as shutdown_vm = yes, it'll try to end the vm issuing a shutdown command to it. However, on virtually all Linux guests an SSH server is not available at the end of install, so KVM autotest will end the VM forcefully, which is not really safe, although it has served us well so far. We did not fix this 'problem' so far because on RHEL3, a supported guest, the anaconda syntax does not support the 'poweroff' directive, only 'reboot', so if we don't finish the VM right after the ACK from guest we really can't prevent it from starting the install again, getting an infinite loop. On the other hand, RHEL 3 supports 'reboot'[1], so we could simply run the install-finished notification on first boot (it's the method we use for Windows guests, right?), and simply shut down the machine cleanly using ssh or the virtual power button. [1] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html/System_Administration_Guide/s1-kickstart2-options.html -- Eduardo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/4] [RFC rev2] virtio-net changes
Hi Rusty,

Thanks for your feedback. I agree with all the changes, and will make it and resubmit next.

thanks,
- KK

Rusty Russell ru...@rustcorp.com.au wrote on 04/13/2011 06:58:02 AM:

> Rusty Russell ru...@rustcorp.com.au
> 04/13/2011 06:58 AM
>
> To
> Krishna Kumar2/India/IBM@IBMIN, da...@davemloft.net, m...@redhat.com
>
> cc
> eric.duma...@gmail.com, a...@arndb.de, net...@vger.kernel.org, ho...@verge.net.au, a...@redhat.com, anth...@codemonkey.ws, kvm@vger.kernel.org, Krishna Kumar2/India/IBM@IBMIN
>
> Subject
> Re: [PATCH 2/4] [RFC rev2] virtio-net changes
>
> On Tue, 05 Apr 2011 20:38:52 +0530, Krishna Kumar krkum...@in.ibm.com wrote:
> > Implement mq virtio-net driver.
> >
> > Though struct virtio_net_config changes, it works with the old
> > qemu since the last element is not accessed unless qemu sets
> > VIRTIO_NET_F_MULTIQUEUE.
> >
> > Signed-off-by: Krishna Kumar krkum...@in.ibm.com
>
> Hi Krishna!
>
> This change looks fairly solid, but I'd prefer it split into a few
> stages for clarity.
>
> The first patch should extract out the struct send_queue and struct
> receive_queue, even though there's still only one. The second patch
> can then introduce VIRTIO_NET_F_MULTIQUEUE.
>
> You could split into more parts if that makes sense, but I'd prefer to
> see the mechanical changes separate from the feature addition.
>
> > -struct virtnet_info {
> > -	struct virtio_device *vdev;
> > -	struct virtqueue *rvq, *svq, *cvq;
> > -	struct net_device *dev;
> > +/* Internal representation of a send virtqueue */
> > +struct send_queue {
> > +	/* Virtqueue associated with this send _queue */
> > +	struct virtqueue *svq;
>
> You can simply call this vq now it's inside 'send_queue'.
>
> > +
> > +	/* TX: fragments + linear part + virtio header */
> > +	struct scatterlist tx_sg[MAX_SKB_FRAGS + 2];
>
> Similarly, this can just be sg.
>
> > +static void free_receive_bufs(struct virtnet_info *vi)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < vi->numtxqs; i++) {
> > +		BUG_ON(vi->rq[i] == NULL);
> > +		while (vi->rq[i]->pages)
> > +			__free_pages(get_a_page(vi->rq[i], GFP_KERNEL), 0);
> > +	}
> > +}
>
> You can skip the BUG_ON(), since the next line will have the same effect.
>
> > +/* Free memory allocated for send and receive queues */
> > +static void free_rq_sq(struct virtnet_info *vi)
> > +{
> > +	int i;
> > +
> > +	if (vi->rq) {
> > +		for (i = 0; i < vi->numtxqs; i++)
> > +			kfree(vi->rq[i]);
> > +		kfree(vi->rq);
> > +	}
> > +
> > +	if (vi->sq) {
> > +		for (i = 0; i < vi->numtxqs; i++)
> > +			kfree(vi->sq[i]);
> > +		kfree(vi->sq);
> > +	}
>
> This looks weird, even though it's correct.
>
> I think we need a better name than numtxqs and shorter than
> num_queue_pairs. Let's just use num_queues; sure, there are both tx and
> rq queues, but I still think it's pretty clear.
>
> > +	for (i = 0; i < vi->numtxqs; i++) {
> > +		struct virtqueue *svq = vi->sq[i]->svq;
> > +
> > +		while (1) {
> > +			buf = virtqueue_detach_unused_buf(svq);
> > +			if (!buf)
> > +				break;
> > +			dev_kfree_skb(buf);
> > +		}
> > +	}
>
> I know this isn't your code, but it's ugly :)
>
> 	while ((buf = virtqueue_detach_unused_buf(svq)) != NULL)
> 		dev_kfree_skb(buf);
>
> > +	for (i = 0; i < vi->numtxqs; i++) {
> > +		struct virtqueue *rvq = vi->rq[i]->rvq;
> > +
> > +		while (1) {
> > +			buf = virtqueue_detach_unused_buf(rvq);
> > +			if (!buf)
> > +				break;
>
> Here too...
>
> > +#define MAX_DEVICE_NAME 16
>
> This isn't a good idea, see below.
>
> > +static int initialize_vqs(struct virtnet_info *vi, int numtxqs)
> > +{
> > +	vq_callback_t **callbacks;
> > +	struct virtqueue **vqs;
> > +	int i, err = -ENOMEM;
> > +	int totalvqs;
> > +	char **names;
>
> This whole routine is really messy. How about doing find_vqs first,
> then have routines like setup_rxq(), setup_txq() and setup_controlq()
> would make this neater:
>
> 	static int setup_rxq(struct send_queue *sq, char *name);
>
> Also, use kasprintf() instead of kmalloc & sprintf.
>
> > +#if 1
> > +	/* Allocate/initialize parameters for recv/send virtqueues */
>
> Why is this #if 1'd?
>
> I do prefer the #else method of doing two loops, myself (but use
> kasprintf).
>
> Cheers,
> Rusty.
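Editorial sketch, not part of the thread: the loop idiom suggested in the review (assign and test in the loop condition instead of `while (1)` with a `break`) can be tried in user space with a toy queue. struct toy_queue and both functions are invented stand-ins. They only mimic the "return a buffer or NULL" behaviour that the review relies on for virtqueue_detach_unused_buf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a virtqueue that still holds unused buffers. */
struct toy_queue {
	void *bufs[8];
	int count;
};

/* Pop one unused buffer, or return NULL once the queue is empty. */
void *toy_detach_unused_buf(struct toy_queue *q)
{
	assert(q && q->count >= 0 && q->count <= 8);
	return q->count > 0 ? q->bufs[--q->count] : NULL;
}

/* Free every remaining buffer; returns how many were freed. */
int toy_free_unused(struct toy_queue *q)
{
	void *buf;
	int freed = 0;

	/* Assignment and NULL test in one condition, no explicit break. */
	while ((buf = toy_detach_unused_buf(q)) != NULL) {
		free(buf);
		freed++;
	}
	return freed;
}
```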
Re: [PATCH 0/4] [RFC rev2] Implement multiqueue (RX TX) virtio-net
Avi Kivity a...@redhat.com wrote on 04/13/2011 05:30:11 PM:

Hi Avi,

> > 1. Reduce vectors for find_vqs().
> > 2. Make vhost changes minimal. For now, I have restricted the number of
> >    vhost threads to 4. This can be either made unrestricted; or if the
> >    userspace vhost works, it can be removed altogether.
> >
> > Please review and provide feedback. I am travelling a bit in the next
> > few days but will respond at the earliest.
>
> Do you have an update to the virtio-pci spec for this?

Not yet, will keep it in my TODO list.

thanks,
- KK
Re: [transparent networking] Re: [PATCH] kvm tools: Implement virtio network device
On Wed, Apr 13, 2011 at 2:02 PM, Ingo Molnar mi...@elte.hu wrote:
> Strictly talking the guest does not need ICMP packets to have working Internet
> connectivity - only passing/tunneling through TCP sockets would be enough.

Don't forget UDP for DNS.

Stefan
Re: kvm mmu tracing quirks
On 04/13/2011 05:09 PM, Jan Kiszka wrote: Hi Xiao, while tracing a guest with all events enabled, I notices some issues with kvm mmu instrumentations. A critical one: TRACE_EVENT( kvm_mmu_audit, TP_PROTO(struct kvm_vcpu *vcpu, int audit_point), TP_ARGS(vcpu, audit_point), TP_STRUCT__entry( __field(struct kvm_vcpu *, vcpu) __field(int, audit_point) ), TP_fast_assign( __entry-vcpu = vcpu; __entry-audit_point = audit_point; ), TP_printk(vcpu:%d %s, __entry-vcpu-cpu, audit_point_name[__entry-audit_point]) ); Saving the vcpu reference to the trace buffer can break on dump if the vcpu was destroyed in the meantime. I was about to fix that by saving vcpu-cpu instead, but then I wondered what kind of information you actually need here. The triggering host cpu is automatically recorded by ftrace anyway. Can you comment on this / clean it up? Also, it would be nice to use __print_symbolic for translating audit_point to a string as that would also work for trace-cmd/kernelshark. IIRC the audit trace is only used to get a low cost invocation for the audit machinery; it's not actually useful as a user tracepoint. We should probably switch to static_branch() instead (https://lwn.net/Articles/429447/). -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM test: Unattended install - Give Linux VMs time to shutdown cleanly
On Wed, 2011-04-13 at 13:16 -0300, Eduardo Habkost wrote: On Tue, Apr 12, 2011 at 07:28:15PM -0300, Lucas Meneghel Rodrigues wrote: During unattended install, right after we receive the ACK from the guest the test is deemed to be finished, and as shutdown_vm = yes, it'll try to end the vm issuing a shutdown command to it. However, on virtually all Linux guests an SSH server is not available at the end of install, so KVM autotest will end the VM forcefully, which is not really safe, although it has served us well so far. We did not fix this 'problem' so far because on RHEL3, a supported guest, the anaconda syntax does not support the 'poweroff' directive, only 'reboot', so if we don't finish the VM right after the ACK from guest we really can't prevent it from starting the install again, getting an infinite loop. On the other hand, RHEL 3 supports 'reboot'[1], so we could simply run the install-finished notification on first boot (it's the method we use for Windows guests, right?), and simply shut down the machine cleanly using ssh or the virtual power button. Nope, as the VM is started with -kernel and -initrd options, the machine will start anaconda and therefore, install again. I wish we could do it as you described, really. Windows is a different beast, we don't start it with -kernel and -initrd options, just regular boot from the CD + the unattended floppy, so the VM can first boot nice and fine, and then we execute the finish.exe program, telling autotest the install is done. [1] http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/3/html/System_Administration_Guide/s1-kickstart2-options.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kvm tool: Avoid using disk_image-priv member in disk_image__new()
The disk_image->priv is supposed to be a private member for users of
disk_image__new(). The other block device drivers, for example qcow,
might need this pointer to hold their header.

Added a new function disk_image__new_readonly() which calls
disk_image__new() to allocate a new disk and then sets the priv member
to the mmapped address.

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/disk-image.c             | 27 +++
 tools/kvm/include/kvm/disk-image.h |  3 ++-
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/tools/kvm/disk-image.c b/tools/kvm/disk-image.c
index 595d407..c666c04 100644
--- a/tools/kvm/disk-image.c
+++ b/tools/kvm/disk-image.c
@@ -13,7 +13,7 @@
 #include <unistd.h>
 #include <fcntl.h>

-struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_operations *ops, bool readonly)
+struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_operations *ops)
 {
 	struct disk_image *self;

@@ -24,16 +24,24 @@ struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_oper
 	self->fd	= fd;
 	self->size	= size;
 	self->ops	= ops;
-	if (readonly) {
-		self->priv = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_NORESERVE, fd, 0);
-		if (self->priv == MAP_FAILED)
-			die("mmap() failed");
-	} else
-		self->priv = MAP_FAILED;
+	return self;
+}
+
+struct disk_image *disk_image__new_readonly(int fd, uint64_t size, struct disk_image_operations *ops)
+{
+	struct disk_image *self;
+	self = disk_image__new(fd, size, ops);
+	if (!self)
+		return NULL;
+
+	self->priv = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_NORESERVE, fd, 0);
+	if (self->priv == MAP_FAILED)
+		die("mmap() failed");
 	return self;
 }

+
 static int raw_image__read_sector(struct disk_image *self, uint64_t sector, void *dst, uint32_t dst_len)
 {
 	uint64_t offset = sector << SECTOR_SHIFT;
@@ -101,7 +109,10 @@ static struct disk_image *raw_image__probe(int fd, bool readonly)
 	if (fstat(fd, &st) < 0)
 		return NULL;

-	return disk_image__new(fd, st.st_size, readonly ? &raw_image_ro_mmap_ops : &raw_image_ops, readonly);
+	if (readonly)
+		return disk_image__new_readonly(fd, st.st_size, &raw_image_ro_mmap_ops);
+	else
+		return disk_image__new(fd, st.st_size, &raw_image_ops);
 }

 struct disk_image *disk_image__open(const char *filename, bool readonly)
diff --git a/tools/kvm/include/kvm/disk-image.h b/tools/kvm/include/kvm/disk-image.h
index 91240c2..33962c6 100644
--- a/tools/kvm/include/kvm/disk-image.h
+++ b/tools/kvm/include/kvm/disk-image.h
@@ -23,7 +23,8 @@ struct disk_image {
 };

 struct disk_image *disk_image__open(const char *filename, bool readonly);
-struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_operations *ops, bool readonly);
+struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_operations *ops);
+struct disk_image *disk_image__new_readonly(int fd, uint64_t size, struct disk_image_operations *ops);
 void disk_image__close(struct disk_image *self);

 static inline int disk_image__read_sector(struct disk_image *self, uint64_t sector, void *dst, uint32_t dst_len)
--
1.7.1
[PATCH] kvm tool: add a close method for raw_image_ro_mmap_ops
---
 tools/kvm/disk-image.c | 11 ---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/kvm/disk-image.c b/tools/kvm/disk-image.c
index c666c04..df7dd48 100644
--- a/tools/kvm/disk-image.c
+++ b/tools/kvm/disk-image.c
@@ -92,6 +92,13 @@ static int raw_image__write_sector_ro_mmap(struct disk_image *self, uint64_t sec
 	return 0;
 }

+static void raw_image__close_sector_ro_mmap(struct disk_image *self)
+{
+	if (self->priv != MAP_FAILED)
+		munmap(self->priv, self->size);
+}
+
+
 static struct disk_image_operations raw_image_ops = {
 	.read_sector	= raw_image__read_sector,
 	.write_sector	= raw_image__write_sector,
@@ -100,6 +107,7 @@ static struct disk_image_operations raw_image_ops = {
 static struct disk_image_operations raw_image_ro_mmap_ops = {
 	.read_sector	= raw_image__read_sector_ro_mmap,
 	.write_sector	= raw_image__write_sector_ro_mmap,
+	.close		= raw_image__close_sector_ro_mmap,
 };

 static struct disk_image *raw_image__probe(int fd, bool readonly)
@@ -140,9 +148,6 @@ void disk_image__close(struct disk_image *self)
 	if (!self)
 		return;

-	if (self->priv != MAP_FAILED)
-		munmap(self->priv, self->size);
-
 	if (self->ops->close)
 		self->ops->close(self);

--
1.7.1
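Editorial sketch of the dispatch pattern the patch moves to: the generic close path no longer knows about mmap, it only calls an optional per-backend close hook. All names here (struct image_ops, mmap_backend_close(), image_close()) are hypothetical, and the hook just sets a flag where the real one would call munmap().

```c
#include <assert.h>
#include <stddef.h>

struct image;

/* Per-backend operations; close is optional and may be NULL. */
struct image_ops {
	void (*close)(struct image *self);
};

struct image {
	const struct image_ops *ops;
	int backend_released;	/* set by the backend's close hook */
	int closed;		/* set by the generic close path */
};

/* Backend hook: stands in for munmap(self->priv, self->size). */
void mmap_backend_close(struct image *self)
{
	self->backend_released = 1;
}

const struct image_ops plain_ops = { .close = NULL };
const struct image_ops mmap_ops = { .close = mmap_backend_close };

/* Generic close: backend cleanup first (if provided), then common teardown. */
void image_close(struct image *self)
{
	if (!self)
		return;
	assert(self->ops);
	if (self->ops->close)
		self->ops->close(self);
	self->closed = 1;	/* real code would close the fd and free self here */
}
```

Keeping the hook optional means backends without private state, like the plain raw image, need no stub function.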
Re: EuroSec'11 Presentation
Avi Kivity a...@redhat.com writes:

> With EPT or NPT you cannot detect if a page is read only.

Why not? You can always walk the page tables manually again.

> Furthermore, at least Linux (without highmem) maps all of memory with a
> read/write mapping in addition to the per-process mapping, so no page is
> read-only.

Even with 32bit highmem most memory will be eventually mapped writable
by kmap. There's currently no concept of a ro-kmap. However I suspect it
wouldn't be too hard to add one.

-Andi

--
a...@linux.intel.com -- Speaking for myself only
[PATCH] kvm tool: add QCOW version 1 read/write support
The patch only implements the basic read/write support for QCOW version 1
images. Many of the QCOW features are not implemented, for example
 - image creation
 - snapshot
 - copy-on-write
 - encryption

Renamed the file CREDITS-Git to CREDITS and added QEMU credits to CREDITS
file.

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/CREDITS                   | 46 +
 tools/kvm/Makefile                  |  2 +
 tools/kvm/disk-image.c              |  7 +
 tools/kvm/include/kvm/qcow.h        | 55 ++
 tools/kvm/include/linux/byteorder.h |  7 +
 tools/kvm/include/linux/types.h     | 19 ++
 tools/kvm/qcow.c                    | 123 +
 tools/kvm/qcow1.c                   | 325 +++
 8 files changed, 584 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/CREDITS
 create mode 100644 tools/kvm/include/kvm/qcow.h
 create mode 100644 tools/kvm/include/linux/byteorder.h
 create mode 100644 tools/kvm/qcow.c
 create mode 100644 tools/kvm/qcow1.c

diff --git a/tools/kvm/CREDITS b/tools/kvm/CREDITS
new file mode 100644
index 0000000..3e6cf55
--- /dev/null
+++ b/tools/kvm/CREDITS
@@ -0,0 +1,46 @@
+Perf/Git:
+Most of the infrastructure that 'perf' uses here has been reused
+from the Git project, as of version:
+
+66996ec: Sync with 1.6.2.4
+
+Here is an (incomplete!) list of main contributors to those files
+in util/* and elsewhere:
+
+  Alex Riesen
+  Christian Couder
+  Dmitry Potapov
+  Jeff King
+  Johannes Schindelin
+  Johannes Sixt
+  Junio C Hamano
+  Linus Torvalds
+  Matthias Kestenholz
+  Michal Ostrowski
+  Miklos Vajna
+  Petr Baudis
+  Pierre Habouzit
+  René Scharfe
+  Samuel Tardieu
+  Shawn O. Pearce
+  Steffen Prohaska
+  Steve Haslam
+
+Thanks guys!
+
+The full history of the files can be found in the upstream Git commits.
+
+
+QEMU
+The source code of the QEMU was referenced while developing the QCOW support
+for the kvm tool. The relevant QEMU commits were
+
+66f82ce block: Open the underlying image file in generic code
+ea2384d new disk image layer
+
+Here is a possibly incomplete list of main contributors
+  Kevin Wolf kw...@redhat.com
+  Fabrice Bellard
+  Stefan Hajnoczi stefa...@linux.vnet.ibm.com
+
+Thanks a lot all!
diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index 6895113..098b328 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -34,6 +34,8 @@ OBJS	+= util/strbuf.o
 OBJS	+= kvm-help.o
 OBJS	+= kvm-cmd.o
 OBJS	+= kvm-run.o
+OBJS	+= qcow.o
+OBJS	+= qcow1.o

 DEPS	:= $(patsubst %.o,%.d,$(OBJS))

diff --git a/tools/kvm/disk-image.c b/tools/kvm/disk-image.c
index 908a744..ff3c076 100644
--- a/tools/kvm/disk-image.c
+++ b/tools/kvm/disk-image.c
@@ -13,6 +13,9 @@
 #include <unistd.h>
 #include <fcntl.h>

+#include <linux/types.h>
+#include <kvm/qcow.h>
+
 struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_operations *ops)
 {
 	struct disk_image *self;
@@ -124,6 +127,10 @@ struct disk_image *disk_image__open(const char *filename, bool readonly)
 	if (fd < 0)
 		return NULL;

+	self = qcow_probe(fd);
+	if (self)
+		return self;
+
 	self = raw_image__probe(fd, readonly);
 	if (self)
 		return self;
diff --git a/tools/kvm/include/kvm/qcow.h b/tools/kvm/include/kvm/qcow.h
new file mode 100644
index 0000000..96f7ad5
--- /dev/null
+++ b/tools/kvm/include/kvm/qcow.h
@@ -0,0 +1,55 @@
+#ifndef __QEMU_H__
+
+#define __QEMU_H__
+
+#define QCOW_MAGIC		(('Q' << 24) | ('F' << 16) | ('I' << 8) | 0xfb)
+#define QCOW1_VERSION		1
+#define QCOW2_VERSION		2
+
+#define QCOW_OFLAG_COMPRESSED	(1LL << 63)
+
+struct qcow_table {
+	uint32_t table_size;
+	u64 *l1_table;
+};
+
+struct qcow {
+	struct qcow_table *table;
+	void *header;
+	int fd;
+};
+
+/* common qcow header */
+struct qcow_common_header {
+	uint32_t magic;
+	uint32_t version;
+};
+
+/* qcow version 1 header format */
+struct qcow1_header {
+	uint32_t magic;
+	uint32_t version;
+
+	u64 backing_file_offset;
+	uint32_t backing_file_size;
+	uint32_t mtime;
+
+	u64 size; /* in bytes */
+
+	uint8_t cluster_bits;
+	uint8_t l2_bits;
+	uint32_t crypt_method;
+
+	u64 l1_table_offset;
+};
+
+/* qcow common operations */
+struct disk_image *qcow_probe(int fd);
+int qcow_read_l1_table(struct qcow *q);
+int qcow_pwrite_with_sync(int fd, void *buf, size_t count, off_t offset);
+
+/* qcow1 global variables and operations */
+extern struct disk_image_operations qcow1_disk_ops;
+uint32_t qcow1_get_table_size(struct qcow *q);
+struct disk_image *qcow1_probe(int fd);
+#endif
diff --git a/tools/kvm/include/linux/byteorder.h b/tools/kvm/include/linux/byteorder.h
new file mode 100644
index 0000000..c490de8
--- /dev/null
+++ b/tools/kvm/include/linux/byteorder.h
@@ -0,0 +1,7 @@
+#ifndef __BYTE_ORDER_H__
+#define __BYTE_ORDER_H__
+
+#include <asm/byteorder.h>
+#include <linux/byteorder/generic.h>
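Editorial sketch of what a probe against the common header could look like. QCOW header fields are stored big-endian, so the sketch decodes them byte by byte rather than casting the buffer to a struct. The helper names load_be32() and qcow_probe_version() are hypothetical and are not taken from the patch, whose qcow.c is not shown in this archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QCOW_MAGIC (((uint32_t)'Q' << 24) | ((uint32_t)'F' << 16) | \
		    ((uint32_t)'I' << 8) | 0xfb)

/* Decode a big-endian 32-bit value, independent of host byte order. */
uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return the version field if buf starts with the QCOW magic,
 * or 0 if the buffer is too short or the magic does not match.
 */
uint32_t qcow_probe_version(const uint8_t *buf, size_t len)
{
	assert(buf);
	if (len < 8)
		return 0;
	if (load_be32(buf) != QCOW_MAGIC)
		return 0;
	return load_be32(buf + 4);
}
```

A caller would read the first 8 bytes of the image, call qcow_probe_version(), and fall back to the raw probe when it returns 0, matching the probe order added to disk_image__open().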
Re: [PATCH] kvm tool: add QCOW version 1 read/write support
On Wed, Apr 13, 2011 at 10:01 PM, Prasad Joshi prasadjoshi...@gmail.com wrote:
> --- /dev/null
> +++ b/tools/kvm/qcow.c
> @@ -0,0 +1,123 @@
> +/*
> + * This file contains code copied from QEMU source code
> + *
> + * Copyright (c) 2004-2006 Fabrice Bellard
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */

Heh, almost there. Let's just drop these from the individual files now
that we have attributions in CREDITS. Now you get the impression that
the files are directly taken from QEMU, which is not the case at all.

Pekka
[PATCH v2] kvm tool: add QCOW version 1 read/write support
The patch only implements the basic read/write support for QCOW version 1
images. Many of the QCOW features are not implemented, for example
 - image creation
 - snapshot
 - copy-on-write
 - encryption

Renamed the file CREDITS-Git to CREDITS and added QEMU credits to CREDITS
file.

Signed-off-by: Prasad Joshi prasadjoshi...@gmail.com
---
 tools/kvm/CREDITS                   | 46 ++
 tools/kvm/Makefile                  |  2 +
 tools/kvm/disk-image.c              |  7 +
 tools/kvm/include/kvm/qcow.h        | 55 +++
 tools/kvm/include/linux/byteorder.h |  7 +
 tools/kvm/include/linux/types.h     | 19 +++
 tools/kvm/qcow.c                    | 99
 tools/kvm/qcow1.c                   | 301 +++
 8 files changed, 536 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/CREDITS
 create mode 100644 tools/kvm/include/kvm/qcow.h
 create mode 100644 tools/kvm/include/linux/byteorder.h
 create mode 100644 tools/kvm/qcow.c
 create mode 100644 tools/kvm/qcow1.c

diff --git a/tools/kvm/CREDITS b/tools/kvm/CREDITS
new file mode 100644
index 0000000..96cc8d5
--- /dev/null
+++ b/tools/kvm/CREDITS
@@ -0,0 +1,46 @@
+Perf/Git:
+Most of the infrastructure that 'perf' uses here has been reused
+from the Git project, as of version:
+
+66996ec: Sync with 1.6.2.4
+
+Here is an (incomplete!) list of main contributors to those files
+in util/* and elsewhere:
+
+  Alex Riesen
+  Christian Couder
+  Dmitry Potapov
+  Jeff King
+  Johannes Schindelin
+  Johannes Sixt
+  Junio C Hamano
+  Linus Torvalds
+  Matthias Kestenholz
+  Michal Ostrowski
+  Miklos Vajna
+  Petr Baudis
+  Pierre Habouzit
+  René Scharfe
+  Samuel Tardieu
+  Shawn O. Pearce
+  Steffen Prohaska
+  Steve Haslam
+
+Thanks guys!
+
+The full history of the files can be found in the upstream Git commits.
+
+
+QEMU
+The source code of the QEMU was referenced while developing the QCOW support
+for the kvm tool. The relevant QEMU commits were
+
+66f82ce block: Open the underlying image file in generic code
+ea2384d new disk image layer
+
+Here is a possibly incomplete list of main contributors
+  Kevin Wolf kw...@redhat.com
+  Fabrice Bellard
+  Stefan Hajnoczi stefa...@linux.vnet.ibm.com
+
+Thanks a lot all!
diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index 6895113..098b328 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -34,6 +34,8 @@ OBJS	+= util/strbuf.o
 OBJS	+= kvm-help.o
 OBJS	+= kvm-cmd.o
 OBJS	+= kvm-run.o
+OBJS	+= qcow.o
+OBJS	+= qcow1.o

 DEPS	:= $(patsubst %.o,%.d,$(OBJS))

diff --git a/tools/kvm/disk-image.c b/tools/kvm/disk-image.c
index 5d0f342..05b58b3 100644
--- a/tools/kvm/disk-image.c
+++ b/tools/kvm/disk-image.c
@@ -13,6 +13,9 @@
 #include <unistd.h>
 #include <fcntl.h>

+#include <linux/types.h>
+#include <kvm/qcow.h>
+
 struct disk_image *disk_image__new(int fd, uint64_t size, struct disk_image_operations *ops)
 {
 	struct disk_image *self;
@@ -131,6 +134,10 @@ struct disk_image *disk_image__open(const char *filename, bool readonly)
 	if (fd < 0)
 		return NULL;

+	self = qcow_probe(fd);
+	if (self)
+		return self;
+
 	self = raw_image__probe(fd, readonly);
 	if (self)
 		return self;
diff --git a/tools/kvm/include/kvm/qcow.h b/tools/kvm/include/kvm/qcow.h
new file mode 100644
index 0000000..96f7ad5
--- /dev/null
+++ b/tools/kvm/include/kvm/qcow.h
@@ -0,0 +1,55 @@
+#ifndef __QEMU_H__
+
+#define __QEMU_H__
+
+#define QCOW_MAGIC		(('Q' << 24) | ('F' << 16) | ('I' << 8) | 0xfb)
+#define QCOW1_VERSION		1
+#define QCOW2_VERSION		2
+
+#define QCOW_OFLAG_COMPRESSED	(1LL << 63)
+
+struct qcow_table {
+	uint32_t table_size;
+	u64 *l1_table;
+};
+
+struct qcow {
+	struct qcow_table *table;
+	void *header;
+	int fd;
+};
+
+/* common qcow header */
+struct qcow_common_header {
+	uint32_t magic;
+	uint32_t version;
+};
+
+/* qcow version 1 header format */
+struct qcow1_header {
+	uint32_t magic;
+	uint32_t version;
+
+	u64 backing_file_offset;
+	uint32_t backing_file_size;
+	uint32_t mtime;
+
+	u64 size; /* in bytes */
+
+	uint8_t cluster_bits;
+	uint8_t l2_bits;
+	uint32_t crypt_method;
+
+	u64 l1_table_offset;
+};
+
+/* qcow common operations */
+struct disk_image *qcow_probe(int fd);
+int qcow_read_l1_table(struct qcow *q);
+int qcow_pwrite_with_sync(int fd, void *buf, size_t count, off_t offset);
+
+/* qcow1 global variables and operations */
+extern struct disk_image_operations qcow1_disk_ops;
+uint32_t qcow1_get_table_size(struct qcow *q);
+struct disk_image *qcow1_probe(int fd);
+#endif
diff --git a/tools/kvm/include/linux/byteorder.h b/tools/kvm/include/linux/byteorder.h
new file mode 100644
index 0000000..c490de8
--- /dev/null
+++ b/tools/kvm/include/linux/byteorder.h
@@ -0,0 +1,7 @@
+#ifndef __BYTE_ORDER_H__
+#define __BYTE_ORDER_H__
+
+#include <asm/byteorder.h>
+#include
Re: [PATCH v2] kvm tool: add QCOW version 1 read/write support
On Wed, 13 Apr 2011, Prasad Joshi wrote:
> The patch only implements the basic read/write support for QCOW version 1
> images. Many of the QCOW features are not implemented, for example

Applied, thanks!

Pekka
Re: [Qemu-devel] [PATCH 2/2 V7] qemu,qmp: add inject-nmi qmp command
On Wed, Apr 13, 2011 at 4:08 PM, Luiz Capitulino lcapitul...@redhat.com wrote:
> On Tue, 12 Apr 2011 21:31:18 +0300
> Blue Swirl blauwir...@gmail.com wrote:
>
>> On Tue, Apr 12, 2011 at 10:52 AM, Avi Kivity a...@redhat.com wrote:
>>> On 04/11/2011 08:15 PM, Blue Swirl wrote:
>>>> On Mon, Apr 11, 2011 at 10:01 AM, Markus Armbruster arm...@redhat.com wrote:
>>>>> Avi Kivity a...@redhat.com writes:
>>>>>> On 04/08/2011 12:41 AM, Anthony Liguori wrote:
>>>>>>> And it's a good thing to have, but exposing this as the only API
>>>>>>> to do something as simple as generating a guest crash dump is not
>>>>>>> the friendliest thing in the world to do to users.
>>>>>>
>>>>>> nmi is a fine name for something that corresponds to a real-life
>>>>>> nmi button (often labeled NMI).
>>>>>
>>>>> Agree.
>>>>
>>>> We could also introduce an alias mechanism for user friendly names,
>>>> so nmi could be used in addition of full path. Aliases could be
>>>> useful for device paths as well.
>>>
>>> Yes. Perhaps limited to the human monitor.
>>
>> I'd limit all debugging commands (including NMI) to the human monitor.
>
> Why?

Do they have any real use in production environment? Also, we should
have the freedom to change the debugging facilities (for example, to
improve some internal implementation) as we want without regard to
compatibility to previous versions.
RE: [PATCH] KVM: Add CPUID support for VIA CPU
On 04/13/2011 02:05 PM, brill...@viatech.com.cn wrote:
>>>> +	/* cpuid 0xC0000001.edx */
>>>> +	const u32 kvm_supported_word5_x86_features =
>>>> +		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
>>>> +		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
>>>> +		F(PMM) | F(PMM_EN);
>>>> +
>>>
>>> Are all of these features safe wrt save/restore? (do they all act on
>>> state in standard registers?) Do they need any control register bits
>>> to be active or MSRs to configure?
>>
>> These features depend on instructions for the PadLock hardware engine
>> of VIA CPU. The PadLock instructions just act on standard registers
>> like general X86 instructions, and the features have been enabled when
>> the CPU leaves the factory, so there is no need to activate any control
>> register bits or configure MSRs.
>
> I see there is a dependency on EFLAGS[30]. Does a VM entry clear this
> bit? If not, we have to do it ourselves.

Yes, the PadLock hardware engine has some association with EFLAGS[30],
but it just requires that EFLAGS[30] be set to 0 before using PadLock
ACE instructions. It is recommended to execute the instruction sequence
pushf;popf to clear this bit before using ACE instructions. AFAIK, the
VM entry never sets the EFLAGS[30] bit, so it seems that we do not have
to do it ourselves.

>>>> @@ -2484,6 +2504,17 @@ static int kvm_dev_ioctl_get_supported_c
>>>>  	r = -E2BIG;
>>>>  	if (nent >= cpuid->nent)
>>>> +		goto out_free;
>>>> +
>>>> +	/* Add support for Centaur's CPUID instruction. */
>>>> +	do_cpuid_ent(&cpuid_entries[nent], 0xC0000000, 0, &nent, cpuid->nent);
>>>
>>> nent overflow check missing here. Also, should probably skip if not a
>>> Via.
>>
>> If not a VIA, the limit will be 0, so the following cycle can not run.
>
> I think Intel defines CPUID to return the highest standard leaf, so it
> will be equivalent to cpuid(0x1a) or something like that.

Yes, you're right.

>> Moreover, it seems that there is no method to know whether the CPU is a
>> VIA or not in this function.
>
> Can't you check the vendor ID? see boot_cpu_data.

Good idea, thank you very much.

>> The nent overflow check is put after the cycle like the 0x80000000
>> case, and when on a VIA, the returned limit is not large (generally it
>> is 0xC0000004), is it necessary to add one more check here?
>
> Yes, otherwise userspace can supply a buffer that is exactly the wrong
> size and cause an overflow.

OK, I will add the check.
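Editorial sketch of the rule Avi is asking for: check the remaining space before every write into the caller-sized array, not only after a loop. This is a simplified userspace illustration, not the kernel's do_cpuid_ent() logic; struct entry, append_entry() and fill_range() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct kvm_cpuid_entry2. */
struct entry {
	uint32_t function;
};

/* Append one entry, refusing to write past maxnent. */
int append_entry(struct entry *entries, int *nent, int maxnent,
		 uint32_t function)
{
	assert(entries && nent);
	if (*nent >= maxnent)
		return -E2BIG;
	entries[*nent].function = function;
	(*nent)++;
	return 0;
}

/* Emit the base leaf, then base+1..limit, checking space before each write. */
int fill_range(struct entry *entries, int *nent, int maxnent,
	       uint32_t base, uint32_t limit)
{
	uint32_t func;
	int r;

	r = append_entry(entries, nent, maxnent, base);
	if (r)
		return r;
	for (func = base + 1; func <= limit; ++func) {
		r = append_entry(entries, nent, maxnent, func);
		if (r)
			return r;
	}
	return 0;
}
```

With the check folded into the append helper, a buffer that is "exactly the wrong size" produces -E2BIG instead of a write one slot past the end.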
Re: kvm mmu tracing quirks
On 04/14/2011 12:12 AM, Avi Kivity wrote:
> On 04/13/2011 05:09 PM, Jan Kiszka wrote:
>> Hi Xiao,
>>
>> while tracing a guest with all events enabled, I noticed some issues
>> with the kvm mmu instrumentation. A critical one:
>>
>> TRACE_EVENT(
>> 	kvm_mmu_audit,
>> 	TP_PROTO(struct kvm_vcpu *vcpu, int audit_point),
>> 	TP_ARGS(vcpu, audit_point),
>>
>> 	TP_STRUCT__entry(
>> 		__field(struct kvm_vcpu *, vcpu)
>> 		__field(int, audit_point)
>> 	),
>>
>> 	TP_fast_assign(
>> 		__entry->vcpu = vcpu;
>> 		__entry->audit_point = audit_point;
>> 	),
>>
>> 	TP_printk("vcpu:%d %s", __entry->vcpu->cpu,
>> 		  audit_point_name[__entry->audit_point])
>> );
>>
>> Saving the vcpu reference to the trace buffer can break on dump if the
>> vcpu was destroyed in the meantime. I was about to fix that by saving
>> vcpu->cpu instead, but then I wondered what kind of information you
>> actually need here. The triggering host cpu is automatically recorded
>> by ftrace anyway. Can you comment on this / clean it up?
>>
>> Also, it would be nice to use __print_symbolic for translating
>> audit_point to a string as that would also work for
>> trace-cmd/kernelshark.

Thank you for pointing it out, Jan! The fix is good for me!

> IIRC the audit trace is only used to get a low cost invocation for the
> audit machinery; it's not actually useful as a user tracepoint.

Yes, it is.

> We should probably switch to static_branch() instead
> (https://lwn.net/Articles/429447/).

It's a good idea, I'll do that after static_branch is merged!
[PATCH v2] KVM: Add CPUID support for VIA CPU
The CPUIDs for Centaur are added, and then the features of PadLock
hardware engine on VIA CPU, such as ace, ace_en and so on, can be passed
into the kvm guest.

Signed-off-by: BrillyWu brill...@viatech.com.cn
Signed-off-by: KaryJin kary...@viatech.com.cn
---
 arch/x86/kvm/x86.c | 40
 1 file changed, 40 insertions(+)

--- a/arch/x86/kvm/x86.c	2011-04-12 10:16:07.713785938 +0800
+++ b/arch/x86/kvm/x86.c	2011-04-14 11:23:34.673820989 +0800
@@ -2331,6 +2331,12 @@ static void do_cpuid_ent(struct kvm_cpui
 		F(3DNOWPREFETCH) | 0 /* OSVW */ | 0 /* IBS */ | F(XOP) |
 		0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM);

+	/* cpuid 0xC0000001.edx */
+	const u32 kvm_supported_word5_x86_features =
+		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
+		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
+		F(PMM) | F(PMM_EN);
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();
 	do_cpuid_1_ent(entry, function, index);
@@ -2440,6 +2446,20 @@ static void do_cpuid_ent(struct kvm_cpui
 		entry->ecx &= kvm_supported_word6_x86_features;
 		cpuid_mask(&entry->ecx, 6);
 		break;
+	/*Add support for Centaur's CPUID instruction*/
+	case 0xC0000000:
+		/*Just support up to 0xC0000004 now*/
+		entry->eax = min(entry->eax, 0xC0000004);
+		break;
+	case 0xC0000001:
+		entry->edx &= kvm_supported_word5_x86_features;
+		cpuid_mask(&entry->edx, 5);
+		break;
+	case 0xC0000002:
+	case 0xC0000003:
+	case 0xC0000004:
+		/*Now nothing to do, reserved for the future*/
+		break;
 	}

 	kvm_x86_ops->set_supported_cpuid(function, entry);
@@ -2486,6 +2506,26 @@ static int kvm_dev_ioctl_get_supported_c
 	if (nent >= cpuid->nent)
 		goto out_free;

+	/* Add support for Centaur's CPUID instruction. */
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR) {
+		do_cpuid_ent(&cpuid_entries[nent], 0xC0000000, 0,
+				&nent, cpuid->nent);
+
+		r = -E2BIG;
+		if (nent >= cpuid->nent)
+			goto out_free;
+
+		limit = cpuid_entries[nent - 1].eax;
+		for (func = 0xC0000001;
+			func <= limit && nent < cpuid->nent; ++func)
+			do_cpuid_ent(&cpuid_entries[nent], func, 0,
+					&nent, cpuid->nent);
+
+		r = -E2BIG;
+		if (nent >= cpuid->nent)
+			goto out_free;
+	}
+
 	do_cpuid_ent(&cpuid_entries[nent], KVM_CPUID_SIGNATURE, 0, &nent,
 		     cpuid->nent);
[PATCH 1/3] kvm tools: Add a script to setup private bridge
We can use this script to create/delete a private bridge, launch a dhcp
server on the bridge with dnsmasq, and set up an iptables forwarding
rule, so that the guest can access the public network.

# ./set_private_br.sh vbr0 192.168.33
add new private bridge: vbr0
# brctl show
bridge name	bridge id		STP enabled	interfaces
vbr0		8000.000000000000	yes
# ifconfig vbr0
vbr0      Link encap:Ethernet  HWaddr 82:0f:f5:8f:92:47
          inet addr:192.168.33.1  Bcast:192.168.33.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:1979 (1.9 KB)
# ps aux |grep dnsmasq
nobody .. dnsmasq --strict-order --bind-interfaces --listen-address 192.168.33.1 \
	--dhcp-range 192.168.33.1,192.168.33.254

Signed-off-by: Amos Kong kongjian...@gmail.com
---
 tools/kvm/util/set_private_br.sh | 51 ++
 1 files changed, 51 insertions(+), 0 deletions(-)
 create mode 100755 tools/kvm/util/set_private_br.sh

diff --git a/tools/kvm/util/set_private_br.sh b/tools/kvm/util/set_private_br.sh
new file mode 100755
index 0000000..49867dd
--- /dev/null
+++ b/tools/kvm/util/set_private_br.sh
@@ -0,0 +1,51 @@
+#!/bin/bash
+#
+# Author: Amos Kong kongjian...@gmail.com
+# Date: Apr 14, 2011
+# Description: this script is used to create/delete a private bridge,
+# launch a dhcp server on the bridge by dnsmasq.
+#
+# @ ./set_private_br.sh $bridge_name $subnet_prefix
+# @ ./set_private_br.sh vbr0 192.168.33
+
+brname='vbr0'
+subnet='192.168.33'
+
+add_br()
+{
+    echo "add new private bridge: $brname"
+    /usr/sbin/brctl addbr $brname
+    echo 1 > /proc/sys/net/ipv6/conf/$brname/disable_ipv6
+    echo 1 > /proc/sys/net/ipv4/ip_forward
+    /usr/sbin/brctl stp $brname on
+    /usr/sbin/brctl setfd $brname 0
+    ifconfig $brname $subnet.1
+    ifconfig $brname up
+    # Add forward rule, then guest can access public network
+    iptables -t nat -A POSTROUTING -s $subnet.254/24 ! -d $subnet.254/24 -j MASQUERADE
+    /etc/init.d/dnsmasq stop
+    /etc/init.d/tftpd-hpa stop 2>/dev/null
+    dnsmasq --strict-order --bind-interfaces --listen-address $subnet.1 --dhcp-range $subnet.1,$subnet.254 $tftp_cmd
+}
+
+del_br()
+{
+    echo "cleanup bridge setup"
+    kill -9 `pgrep dnsmasq|tail -1`
+    ifconfig $brname down
+    /usr/sbin/brctl delbr $brname
+    iptables -t nat -D POSTROUTING -s $subnet.254/24 ! -d $subnet.254/24 -j MASQUERADE
+}
+
+
+if [ $# = 0 ]; then
+    del_br 2>/dev/null
+    exit
+fi
+if [ $# -ge 1 ]; then
+    brname=$1
+fi
+if [ $# = 2 ]; then
+    subnet=$2
+fi
+add_br
[PATCH 2/3] kvm tools: Add a script to setup tap device
# ./kvm-ifup-vbr0 $tap_name

Signed-off-by: Amos Kong kongjian...@gmail.com
---
 tools/kvm/util/kvm-ifup-vbr0 |  6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)
 create mode 100755 tools/kvm/util/kvm-ifup-vbr0

diff --git a/tools/kvm/util/kvm-ifup-vbr0 b/tools/kvm/util/kvm-ifup-vbr0
new file mode 100755
index 0000000..a91c37f
--- /dev/null
+++ b/tools/kvm/util/kvm-ifup-vbr0
@@ -0,0 +1,6 @@
+#!/bin/sh
+switch=vbr0
+/sbin/ifconfig $1 0.0.0.0 up
+/usr/sbin/brctl addif ${switch} $1
+/usr/sbin/brctl setfd ${switch} 0
+/usr/sbin/brctl stp ${switch} off
[PATCH 3/3] kvm tools: Setup bridged network by a script
Use original hardcode network by default.

#./kvm run ... -n virtio --tapscript=./util/kvm-ifup-vbr0

# brctl show
bridge name     bridge id               STP enabled     interfaces
vbr0            8000.e272c7c391f4       no              tap0

guest)# ifconfig eth6
eth6      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.168.33.192  Bcast:192.168.33.255  Mask:255.255.255.0
          inet6 addr: fe80::211:22ff:fe33:4455/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:22 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3725 (3.6 KiB)  TX bytes:852 (852.0 b)

guest)# ping amosk.info
PING amosk.info (69.175.108.82) 56(84) bytes of data.
64 bytes from nurpulat.uz (69.175.108.82): icmp_seq=1 ttl=43 time=306 ms

Signed-off-by: Amos Kong <kongjian...@gmail.com>
---
 tools/kvm/include/kvm/virtio-net.h |    2 +-
 tools/kvm/kvm-run.c                |    9 ++++++++-
 tools/kvm/virtio-net.c             |   14 ++++++++++----
 3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/tools/kvm/include/kvm/virtio-net.h b/tools/kvm/include/kvm/virtio-net.h
index a1cab15..95bf049 100644
--- a/tools/kvm/include/kvm/virtio-net.h
+++ b/tools/kvm/include/kvm/virtio-net.h
@@ -2,6 +2,6 @@
 #define KVM__VIRTIO_NET_H
 
 struct kvm;
-void virtio_net__init(struct kvm *self);
+void virtio_net__init(struct kvm *self, const char *script);
 
 #endif /* KVM__VIRTIO_NET_H */
diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 6046a0a..9ff3a7d 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -31,6 +31,7 @@
 #define DEFAULT_KVM_DEV		"/dev/kvm"
 #define DEFAULT_CONSOLE		"serial"
 #define DEFAULT_NETWORK		"none"
+#define DEFAULT_SCRIPT		"none"
 
 #define MB_SHIFT		(20)
 #define MIN_RAM_SIZE_MB		(64ULL)
@@ -66,6 +67,7 @@ static const char *image_filename;
 static const char *console;
 static const char *kvm_dev;
 static const char *network;
+static const char *script;
 static bool single_step;
 static bool readonly_image;
 extern bool ioport_debug;
@@ -89,6 +91,8 @@ static const struct option options[] = {
 			"Console to use"),
 	OPT_STRING('n', "network", &network, "virtio",
 			"Network to use"),
+	OPT_STRING('\0', "tapscript", &script, "Script path",
+			"Assign a script to process created tap device"),
 
 	OPT_GROUP("Kernel options:"),
 	OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -218,6 +222,9 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 	else
 		active_console = CONSOLE_8250;
 
+	if (!script)
+		script = DEFAULT_SCRIPT;
+
 	term_init();
 
 	kvm = kvm__init(kvm_dev, ram_size);
@@ -259,7 +266,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 		network = DEFAULT_NETWORK;
 
 	if (!strncmp(network, "virtio", 6))
-		virtio_net__init(kvm);
+		virtio_net__init(kvm, script);
 
 	kvm__start_timer(kvm);
 
diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
index ec70d5c..7e9c508 100644
--- a/tools/kvm/virtio-net.c
+++ b/tools/kvm/virtio-net.c
@@ -273,9 +273,10 @@ static struct pci_device_header virtio_net_pci_device = {
 	.irq_line		= VIRTIO_NET_IRQ,
 };
 
-static void virtio_net__tap_init(void)
+static void virtio_net__tap_init(const char* script)
 {
 	struct ifreq ifr;
+	char cmd[PATH_MAX];
 
 	net_device.tap_fd = open("/dev/net/tun", O_RDWR);
 	if (net_device.tap_fd < 0)
@@ -291,8 +292,13 @@ static void virtio_net__tap_init(void)
 
 	ioctl(net_device.tap_fd, TUNSETNOCSUM, 1);
 
+
+	if (strcmp(script, "none")) {
+		sprintf(cmd, "%s tap0", script);
+		if (system(cmd) < 0)
+			warning("Fail to setup tap by %s", script);
+	} else if (system("ifconfig tap0 192.168.33.2") < 0)
 	/*FIXME: Remove this after user can specify ip address and netmask*/
-	if (system("ifconfig tap0 192.168.33.2") < 0)
 		warning("Can not set ip address on tap0");
 }
 
@@ -308,11 +314,11 @@ static void virtio_net__io_thread_init(struct kvm *self)
 	pthread_create(&net_device.io_tx_thread, NULL, virtio_net_tx_thread, (void *)self);
 }
 
-void virtio_net__init(struct kvm *self)
+void virtio_net__init(struct kvm *self, const char *script)
 {
 	pci__register(&virtio_net_pci_device, PCI_VIRTIO_NET_DEVNUM);
 	ioport__register(IOPORT_VIRTIO_NET, &virtio_net_io_ops, IOPORT_VIRTIO_NET_SIZE);
 
-	virtio_net__tap_init();
+	virtio_net__tap_init(script);
 	virtio_net__io_thread_init(self);
 }
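
A note on the command construction in the tapscript patch: the script path is formatted with sprintf() into a PATH_MAX buffer, so a path close to PATH_MAX plus the " tap0" suffix would not fit. A minimal sketch of a bounded variant follows. The helper name build_tap_cmd() and its signature are hypothetical and not part of the posted patch; only snprintf() from the C library is assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): format "<script> <ifname>"
 * into buf without ever truncating silently.
 * Returns 0 on success, -1 if the command does not fit or formatting fails.
 */
int build_tap_cmd(char *buf, size_t len, const char *script,
		  const char *ifname)
{
	int n;

	assert(buf && script && ifname);

	/* snprintf() reports the length it wanted to write, excluding NUL. */
	n = snprintf(buf, len, "%s %s", script, ifname);
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

A caller could then do something like "if (build_tap_cmd(cmd, sizeof(cmd), script, "tap0") < 0) warning(...)" before handing cmd to system(). Separately, and as far as I can tell, the "system(cmd) < 0" test only catches the case where the command could not be started at all; a script that runs and exits with a non-zero status would need its wait status inspected as well.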
[PATCH 1/2 V2] kvm tools: Set up tun interface using ioctls
Use ioctls to assign IP address and bring interface up instead of using ifconfig.
Not breaking aliasing rules this time.

Signed-off-by: Sasha Levin <levinsasha...@gmail.com>
---
 tools/kvm/virtio-net.c |   29 ++++++++++++++++++++++++++---
 1 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
index ec70d5c..622cfc6 100644
--- a/tools/kvm/virtio-net.c
+++ b/tools/kvm/virtio-net.c
@@ -14,6 +14,9 @@
 #include <sys/ioctl.h>
 #include <assert.h>
 #include <fcntl.h>
+#include <arpa/inet.h>
+#include <sys/types.h>
+#include <sys/socket.h>
 
 #define VIRTIO_NET_IRQ		14
 #define VIRTIO_NET_QUEUE_SIZE	128
@@ -276,7 +279,9 @@ static struct pci_device_header virtio_net_pci_device = {
 static void virtio_net__tap_init(void)
 {
 	struct ifreq ifr;
-
+	int sock = socket(AF_INET, SOCK_STREAM, 0);
+	struct sockaddr_in sin = {0};
+
 	net_device.tap_fd = open("/dev/net/tun", O_RDWR);
 	if (net_device.tap_fd < 0)
 		die("Unable to open /dev/net/tun\n");
@@ -291,9 +296,27 @@ static void virtio_net__tap_init(void)
 
 	ioctl(net_device.tap_fd, TUNSETNOCSUM, 1);
 
+
+	memset(&ifr, 0, sizeof(ifr));
+
+	strncpy(ifr.ifr_name, net_device.tap_name, sizeof(net_device.tap_name));
+
 	/*FIXME: Remove this after user can specify ip address and netmask*/
-	if (system("ifconfig tap0 192.168.33.2") < 0)
-		warning("Can not set ip address on tap0");
+	sin.sin_addr.s_addr = inet_addr("192.168.33.2");
+	memcpy(&(ifr.ifr_addr), &sin, sizeof(ifr.ifr_addr));
+	ifr.ifr_addr.sa_family = AF_INET;
+
+	if (ioctl(sock, SIOCSIFADDR, &ifr) < 0)
+		warning("Can not set ip address on tap device");
+
+	memset(&ifr, 0, sizeof(ifr));
+	strncpy(ifr.ifr_name, net_device.tap_name, sizeof(net_device.tap_name));
+	ioctl(sock, SIOCGIFFLAGS, &ifr);
+	ifr.ifr_flags |= IFF_UP | IFF_RUNNING;
+	if (ioctl(sock, SIOCSIFFLAGS, &ifr) < 0)
+		warning("Could not bring tap device up");
+
+	close(sock);
 }
 
 static void virtio_net__io_thread_init(struct kvm *self)
-- 
1.7.5.rc1
[PATCH 2/2 V2] kvm tools: Make host side IP configurable
Add --host-ip-addr parameter to allow changing the host-side IP address.
Add a networking group to the cmdline menu.

Signed-off-by: Sasha Levin <levinsasha...@gmail.com>
---
 tools/kvm/include/kvm/virtio-net.h |    2 +-
 tools/kvm/kvm-run.c                |   15 ++++++++++++---
 tools/kvm/virtio-net.c             |    9 ++++-----
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/kvm/include/kvm/virtio-net.h b/tools/kvm/include/kvm/virtio-net.h
index a1cab15..9800a41 100644
--- a/tools/kvm/include/kvm/virtio-net.h
+++ b/tools/kvm/include/kvm/virtio-net.h
@@ -2,6 +2,6 @@
 #define KVM__VIRTIO_NET_H
 
 struct kvm;
-void virtio_net__init(struct kvm *self);
+void virtio_net__init(struct kvm *self, const char *host_ip_addr);
 
 #endif /* KVM__VIRTIO_NET_H */
diff --git a/tools/kvm/kvm-run.c b/tools/kvm/kvm-run.c
index 6046a0a..d71057c 100644
--- a/tools/kvm/kvm-run.c
+++ b/tools/kvm/kvm-run.c
@@ -31,6 +31,7 @@
 #define DEFAULT_KVM_DEV		"/dev/kvm"
 #define DEFAULT_CONSOLE		"serial"
 #define DEFAULT_NETWORK		"none"
+#define DEFAULT_HOST_ADDR	"192.168.33.2"
 
 #define MB_SHIFT		(20)
 #define MIN_RAM_SIZE_MB		(64ULL)
@@ -66,6 +67,7 @@ static const char *image_filename;
 static const char *console;
 static const char *kvm_dev;
 static const char *network;
+static const char *host_ip_addr;
 static bool single_step;
 static bool readonly_image;
 extern bool ioport_debug;
@@ -87,8 +89,6 @@ static const struct option options[] = {
 			"Don't write changes back to disk image"),
 	OPT_STRING('c', "console", &console, "serial or virtio",
 			"Console to use"),
-	OPT_STRING('n', "network", &network, "virtio",
-			"Network to use"),
 
 	OPT_GROUP("Kernel options:"),
 	OPT_STRING('k', "kernel", &kernel_filename, "kernel",
@@ -98,6 +98,12 @@ static const struct option options[] = {
 	OPT_STRING('p', "params", &kernel_cmdline, "params",
 			"Kernel command line arguments"),
 
+	OPT_GROUP("Networking options:"),
+	OPT_STRING('n', "network", &network, "virtio",
+			"Network to use"),
+	OPT_STRING('\0', "host-ip-addr", &host_ip_addr, "a.b.c.d",
+			"Assign this address to the host side networking"),
+
 	OPT_GROUP("Debug options:"),
 	OPT_STRING('d', "kvm-dev", &kvm_dev, "kvm-dev", "KVM device file"),
 	OPT_BOOLEAN('s', "single-step", &single_step,
@@ -218,6 +224,9 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 	else
 		active_console = CONSOLE_8250;
 
+	if (!host_ip_addr)
+		host_ip_addr = DEFAULT_HOST_ADDR;
+
 	term_init();
 
 	kvm = kvm__init(kvm_dev, ram_size);
@@ -259,7 +268,7 @@ int kvm_cmd_run(int argc, const char **argv, const char *prefix)
 		network = DEFAULT_NETWORK;
 
 	if (!strncmp(network, "virtio", 6))
-		virtio_net__init(kvm);
+		virtio_net__init(kvm, host_ip_addr);
 
 	kvm__start_timer(kvm);
 
diff --git a/tools/kvm/virtio-net.c b/tools/kvm/virtio-net.c
index 622cfc6..90a6a17 100644
--- a/tools/kvm/virtio-net.c
+++ b/tools/kvm/virtio-net.c
@@ -276,7 +276,7 @@ static struct pci_device_header virtio_net_pci_device = {
 	.irq_line		= VIRTIO_NET_IRQ,
 };
 
-static void virtio_net__tap_init(void)
+static void virtio_net__tap_init(const char *host_ip_addr)
 {
 	struct ifreq ifr;
 	int sock = socket(AF_INET, SOCK_STREAM, 0);
@@ -301,8 +301,7 @@ static void virtio_net__tap_init(void)
 
 	strncpy(ifr.ifr_name, net_device.tap_name, sizeof(net_device.tap_name));
 
-	/*FIXME: Remove this after user can specify ip address and netmask*/
-	sin.sin_addr.s_addr = inet_addr("192.168.33.2");
+	sin.sin_addr.s_addr = inet_addr(host_ip_addr);
 	memcpy(&(ifr.ifr_addr), &sin, sizeof(ifr.ifr_addr));
 	ifr.ifr_addr.sa_family = AF_INET;
 
@@ -331,11 +330,11 @@ static void virtio_net__io_thread_init(struct kvm *self)
 	pthread_create(&net_device.io_tx_thread, NULL, virtio_net_tx_thread, (void *)self);
 }
 
-void virtio_net__init(struct kvm *self)
+void virtio_net__init(struct kvm *self, const char *host_ip_addr)
 {
 	pci__register(&virtio_net_pci_device, PCI_VIRTIO_NET_DEVNUM);
 	ioport__register(IOPORT_VIRTIO_NET, &virtio_net_io_ops, IOPORT_VIRTIO_NET_SIZE);
 
-	virtio_net__tap_init();
+	virtio_net__tap_init(host_ip_addr);
 	virtio_net__io_thread_init(self);
 }
-- 
1.7.5.rc1
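
Since patch 2/2 passes the --host-ip-addr string straight to inet_addr(), a mistyped address would only show up later, when the ioctl is issued with whatever inet_addr() returned. One possible way to combine the default fallback with an early check is sketched below. The helper resolve_host_addr() is hypothetical and not in the posted patch; DEFAULT_HOST_ADDR carries the same value the patch defines.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

#define DEFAULT_HOST_ADDR "192.168.33.2"	/* same default as in the patch */

/*
 * Hypothetical helper (not part of the patch): use the user-supplied
 * address if there is one, the default otherwise, and parse it strictly.
 * Returns 0 on success, -1 if the string is not a dotted-quad IPv4 address.
 */
int resolve_host_addr(const char *arg, struct in_addr *out)
{
	assert(out);

	if (!arg)
		arg = DEFAULT_HOST_ADDR;

	/* inet_pton() returns 1 only for a well-formed address. */
	return inet_pton(AF_INET, arg, out) == 1 ? 0 : -1;
}
```

In kvm_cmd_run() this could replace the plain "if (!host_ip_addr)" fallback, with a die() or warning() on failure, so that the error is reported next to the option parsing rather than inside the tap setup.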