Re: Oracle RAC in libvirt+KVM environment
From the Fedora 19 host:

[root@fedora ~]# sg_inq /dev/sdc
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
  [AERC=0]  [TrmTsk=0]  NormACA=0  HiSUP=0  Resp_data_format=0
  SCCS=1  ACC=0  TPGS=1  3PC=0  Protect=0  [BQue=0]
  EncServ=0  MultiP=0  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=1  Sync=1  Linked=0  [TranDis=0]  CmdQue=1
    length=36 (0x24)   Peripheral device type: disk
 Vendor identification: MacroSAN
 Product identification: LU
 Product revision level: 1.0
 Unit serial number: fd01ece6-8540-f4c7--fe170142b300

From the Fedora 19 VM:

[root@fedoravm ~]# sg_inq /dev/sdb
standard INQUIRY:
  PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
  [AERC=0]  [TrmTsk=0]  NormACA=0  HiSUP=0  Resp_data_format=0
  SCCS=1  ACC=0  TPGS=1  3PC=0  Protect=0  [BQue=0]
  EncServ=0  MultiP=0  [MChngr=0]  [ACKREQQ=0]  Addr16=0
  [RelAdr=0]  WBus16=1  Sync=1  Linked=0  [TranDis=0]  CmdQue=1
    length=36 (0x24)   Peripheral device type: disk
 Vendor identification: MacroSAN
 Product identification: LU
 Product revision level: 1.0
 Unit serial number: fd01ece6-8540-f4c7--fe170142b300

The results from the Fedora 19 host and the Fedora 19 VM are the same.
Does that mean I have the wrong SCSI pass-through driver in the Windows
VM? Or is there a tool like sg_inq for Windows 2008?

On Tue, Aug 20, 2013 at 8:09 PM, Paolo Bonzini wrote:
> On 20/08/2013 13:43, Timon Wang wrote:
>> Thanks, the whole iSCSI LUN has been passed to the VM.
>>
>> But I tested it with scsicmd, and found that the driver may not
>> support SPC-3, but if I use the LUN through the Microsoft iSCSI
>> initiator, I can pass all the scsi3_test tests.
>
> If you are passing the LUN to the VM with device='lun', the driver and
> VMM do not interpret any SCSI command.  You should see exactly the same
> data as in the host, which includes support for SPC-3:
>
> [root@localhost /]# sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
> standard INQUIRY:
>   PQual=0  Device_type=0  RMB=0  version=0x05  [SPC-3]
>   [AERC=0]  [TrmTsk=0]  NormACA=0  HiSUP=0  Resp_data_format=0
>   SCCS=1  ACC=0  TPGS=1  3PC=0  Protect=0  [BQue=0]
>   EncServ=0  MultiP=0  [MChngr=0]  [ACKREQQ=0]  Addr16=0
>   [RelAdr=0]  WBus16=1  Sync=1  Linked=0  [TranDis=0]  CmdQue=1
>     length=36 (0x24)   Peripheral device type: disk
>  Vendor identification: MacroSAN
>  Product identification: LU
>  Product revision level: 1.0
>  Unit serial number: 0d9281ae-aea4-6da0--02180142b300
>
> Can you try using a Linux VM and executing sg_inq in the VM?
>
> Paolo
>

--
Focus on: Server Virtualization, Network security, Scanner, NodeJS, JAVA, WWW
Blog: http://www.nohouse.net
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] vhost: Include linux/uio.h instead of linux/socket.h
From: Asias He
Date: Mon, 19 Aug 2013 09:23:19 +0800

> memcpy_fromiovec is moved from net/core/iovec.c to lib/iovec.c.
> linux/uio.h provides the declaration for memcpy_fromiovec.
>
> Include linux/uio.h instead of linux/socket.h for it.
>
> Signed-off-by: Asias He

Applied.
Re: [PATCH-v3 1/4] idr: Percpu ida
On Fri, 16 Aug 2013 23:09:06 + "Nicholas A. Bellinger" wrote:

> From: Kent Overstreet
>
> Percpu frontend for allocating ids. With percpu allocation (that works),
> it's impossible to guarantee it will always be possible to allocate all
> nr_tags - typically, some will be stuck on a remote percpu freelist
> where the current job can't get to them.
>
> We do guarantee that it will always be possible to allocate at least
> (nr_tags / 2) tags - this is done by keeping track of which and how many
> cpus have tags on their percpu freelists. On allocation failure if
> enough cpus have tags that there could potentially be (nr_tags / 2) tags
> stuck on remote percpu freelists, we then pick a remote cpu at random to
> steal from.
>
> Note that there's no cpu hotplug notifier - we don't care, because
> steal_tags() will eventually get the down cpu's tags. We _could_ satisfy
> more allocations if we had a notifier - but we'll still meet our
> guarantees and it's absolutely not a correctness issue, so I don't think
> it's worth the extra code.
>
> ...
>
>  include/linux/idr.h |  53 +
>  lib/idr.c           | 316 +--

I don't think this should be in idr.[ch] at all.  It has no
relationship with the existing code.  Apart from duplicating its
functionality :(

>
> ...
>
> @@ -243,4 +245,55 @@ static inline int ida_get_new(struct ida *ida, int *p_id)
>
>  void __init idr_init_cache(void);
>
> +/* Percpu IDA/tag allocator */
> +
> +struct percpu_ida_cpu;
> +
> +struct percpu_ida {
> +	/*
> +	 * number of tags available to be allocated, as passed to
> +	 * percpu_ida_init()
> +	 */
> +	unsigned			nr_tags;
> +
> +	struct percpu_ida_cpu __percpu	*tag_cpu;
> +
> +	/*
> +	 * Bitmap of cpus that (may) have tags on their percpu freelists:
> +	 * steal_tags() uses this to decide when to steal tags, and which cpus
> +	 * to try stealing from.
> +	 *
> +	 * It's ok for a freelist to be empty when its bit is set - steal_tags()
> +	 * will just keep looking - but the bitmap _must_ be set whenever a
> +	 * percpu freelist does have tags.
> +	 */
> +	unsigned long			*cpus_have_tags;

Why not cpumask_t?

> +	struct {
> +		spinlock_t		lock;
> +		/*
> +		 * When we go to steal tags from another cpu (see steal_tags()),
> +		 * we want to pick a cpu at random. Cycling through them every
> +		 * time we steal is a bit easier and more or less equivalent:
> +		 */
> +		unsigned		cpu_last_stolen;
> +
> +		/* For sleeping on allocation failure */
> +		wait_queue_head_t	wait;
> +
> +		/*
> +		 * Global freelist - it's a stack where nr_free points to the
> +		 * top
> +		 */
> +		unsigned		nr_free;
> +		unsigned		*freelist;
> +	} ____cacheline_aligned_in_smp;

Why the ____cacheline_aligned_in_smp?

> +};
>
> ...
>
> +
> +/* Percpu IDA */
> +
> +/*
> + * Number of tags we move between the percpu freelist and the global freelist at
> + * a time

"between a percpu freelist" would be more accurate?

> + */
> +#define IDA_PCPU_BATCH_MOVE	32U
> +
> +/* Max size of percpu freelist, */
> +#define IDA_PCPU_SIZE		((IDA_PCPU_BATCH_MOVE * 3) / 2)
> +
> +struct percpu_ida_cpu {
> +	spinlock_t			lock;
> +	unsigned			nr_free;
> +	unsigned			freelist[];
> +};

Data structure needs documentation.  There's one of these per cpu.  I
guess nr_free and freelist are clear enough.

The presence of a lock in a percpu data structure is a surprise.  It's
for cross-cpu stealing, I assume?

> +static inline void move_tags(unsigned *dst, unsigned *dst_nr,
> +			     unsigned *src, unsigned *src_nr,
> +			     unsigned nr)
> +{
> +	*src_nr -= nr;
> +	memcpy(dst + *dst_nr, src + *src_nr, sizeof(unsigned) * nr);
> +	*dst_nr += nr;
> +}
> +
>
> ...
>
> +static inline void alloc_global_tags(struct percpu_ida *pool,
> +				     struct percpu_ida_cpu *tags)
> +{
> +	move_tags(tags->freelist, &tags->nr_free,
> +		  pool->freelist, &pool->nr_free,
> +		  min(pool->nr_free, IDA_PCPU_BATCH_MOVE));
> +}

Document this function?

> +static inline unsigned alloc_local_tag(struct percpu_ida *pool,
> +				       struct percpu_ida_cpu *tags)
> +{
> +	int tag = -ENOSPC;
> +
> +	spin_lock(&tags->lock);
> +	if (tags->nr_free)
> +		tag = tags->freelist[--tags->nr_free];
> +	spin_unlock(&tags->lock);
> +
> +	return tag;
> +}

I guess this one's clear enough, if the data structure relationships
are understood.

> +/**
> + * percpu_ida_alloc - allocate a
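For anyone who wants to poke at the freelist handling discussed in this review, the batch transfer done by move_tags() can be tried in userspace. This is only a minimal sketch, assuming plain arrays in place of the percpu and global freelists and no locking at all. move_tags() follows the quoted patch; BATCH (a small stand-in for IDA_PCPU_BATCH_MOVE) and refill_local() are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for IDA_PCPU_BATCH_MOVE, kept small for the demo. */
#define BATCH 4U

/*
 * Same logic as move_tags() in the quoted patch: both freelists are
 * stacks, so take the top nr entries of src and push them onto dst.
 */
static void move_tags(unsigned *dst, unsigned *dst_nr,
		      unsigned *src, unsigned *src_nr, unsigned nr)
{
	assert(nr <= *src_nr);	/* caller must not take more than src holds */
	*src_nr -= nr;
	memcpy(dst + *dst_nr, src + *src_nr, sizeof(unsigned) * nr);
	*dst_nr += nr;
}

/*
 * Hypothetical helper: refill a local freelist from the global one,
 * moving at most BATCH tags. Returns the number of tags moved.
 */
static unsigned refill_local(unsigned *local, unsigned *local_nr,
			     unsigned *global, unsigned *global_nr)
{
	unsigned nr = *global_nr < BATCH ? *global_nr : BATCH;

	move_tags(local, local_nr, global, global_nr, nr);
	return nr;
}
```

Because both lists are stacks, a refill takes the tags nearest the top of the global list, so repeated refills drain it from the top down; the caller is assumed to have made sure the local array has room for BATCH more entries.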
Re: [PATCH] vfio-pci: Use fdget() rather than eventfd_fget()
On Tue, Aug 20, 2013 at 01:18:07PM -0600, Alex Williamson wrote:
> eventfd_fget() tests to see whether the file is an eventfd file, which
> we then immediately pass to eventfd_ctx_fileget(), which again tests
> whether the file is an eventfd file.  Simplify slightly by using
> fdget() so that we only test that we're looking at an eventfd once.
> fget() could also be used, but fdget() makes use of fget_light() for
> another slight optimization.

Umm...

> @@ -210,8 +210,8 @@ fail:
>  	if (ctx && !IS_ERR(ctx))
>  		eventfd_ctx_put(ctx);
>
> -	if (file && !IS_ERR(file))
> -		fput(file);
> +	if (irqfd.file)
> +		fdput(irqfd);
>
>  	kfree(virqfd);

IMO it's a bad style; you have three failure exits leading here, and those
ifs are nothing but "how far did we get before we'd failed".

fail3:
	eventfd_ctx_put(ctx);
fail2:
	fdput(irqfd);
fail1:
	kfree(virqfd);

is much easier to analyse.  It's a very common pattern and IME it's more
robust than this kind of "flexibility"...
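The staged-label unwind suggested in this reply can be shown outside the kernel. The following is a minimal sketch, not the vfio code: the three flags in struct demo_state merely stand in for the allocated virqfd, the fdget() reference and the eventfd context, and every name in it (demo_state, demo_get, demo_put, demo_enable, fail_at) is hypothetical.

```c
/* Hypothetical tracker: one flag per resource the setup path acquires. */
struct demo_state {
	int buf_held;	/* stands in for the allocated virqfd */
	int fd_held;	/* stands in for the fdget() reference */
	int ctx_held;	/* stands in for the eventfd context */
};

/* Acquire one resource, or fail without touching it. */
static int demo_get(int *held, int should_fail)
{
	if (should_fail)
		return -1;
	*held = 1;
	return 0;
}

static void demo_put(int *held)
{
	*held = 0;
}

/*
 * fail_at: 0 = succeed, 1..3 = make that acquisition fail,
 * 4 = fail in a later step with all three resources held.
 */
static int demo_enable(struct demo_state *s, int fail_at)
{
	int ret;

	ret = demo_get(&s->buf_held, fail_at == 1);
	if (ret)
		return ret;		/* nothing acquired yet */

	ret = demo_get(&s->fd_held, fail_at == 2);
	if (ret)
		goto fail1;

	ret = demo_get(&s->ctx_held, fail_at == 3);
	if (ret)
		goto fail2;

	if (fail_at == 4) {
		ret = -1;
		goto fail3;
	}
	return 0;

	/* Each label undoes exactly what was acquired before the jump. */
fail3:
	demo_put(&s->ctx_held);
fail2:
	demo_put(&s->fd_held);
fail1:
	demo_put(&s->buf_held);
	return ret;
}
```

The point of the layout is that the label jumped to encodes how far setup got, so the cleanup path needs no "was this acquired?" tests.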
[PATCH] vfio-pci: Use fdget() rather than eventfd_fget()
eventfd_fget() tests to see whether the file is an eventfd file, which
we then immediately pass to eventfd_ctx_fileget(), which again tests
whether the file is an eventfd file.  Simplify slightly by using
fdget() so that we only test that we're looking at an eventfd once.
fget() could also be used, but fdget() makes use of fget_light() for
another slight optimization.

Signed-off-by: Alex Williamson
---
 drivers/vfio/pci/vfio_pci_intrs.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 4bc704e..7507975 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -130,7 +130,7 @@ static int virqfd_enable(struct vfio_pci_device *vdev,
 			 void (*thread)(struct vfio_pci_device *, void *),
 			 void *data, struct virqfd **pvirqfd, int fd)
 {
-	struct file *file = NULL;
+	struct fd irqfd;
 	struct eventfd_ctx *ctx = NULL;
 	struct virqfd *virqfd;
 	int ret = 0;
@@ -149,13 +149,13 @@ static int virqfd_enable(struct vfio_pci_device *vdev,
 	INIT_WORK(&virqfd->shutdown, virqfd_shutdown);
 	INIT_WORK(&virqfd->inject, virqfd_inject);

-	file = eventfd_fget(fd);
-	if (IS_ERR(file)) {
-		ret = PTR_ERR(file);
+	irqfd = fdget(fd);
+	if (!irqfd.file) {
+		ret = -EBADF;
 		goto fail;
 	}

-	ctx = eventfd_ctx_fileget(file);
+	ctx = eventfd_ctx_fileget(irqfd.file);
 	if (IS_ERR(ctx)) {
 		ret = PTR_ERR(ctx);
 		goto fail;
@@ -187,7 +187,7 @@ static int virqfd_enable(struct vfio_pci_device *vdev,
 	init_waitqueue_func_entry(&virqfd->wait, virqfd_wakeup);
 	init_poll_funcptr(&virqfd->pt, virqfd_ptable_queue_proc);

-	events = file->f_op->poll(file, &virqfd->pt);
+	events = irqfd.file->f_op->poll(irqfd.file, &virqfd->pt);

 	/*
 	 * Check if there was an event already pending on the eventfd
@@ -202,7 +202,7 @@ static int virqfd_enable(struct vfio_pci_device *vdev,
 	 * Do not drop the file until the irqfd is fully initialized,
 	 * otherwise we might race against the POLLHUP.
 	 */
-	fput(file);
+	fdput(irqfd);

 	return 0;

@@ -210,8 +210,8 @@ fail:
 	if (ctx && !IS_ERR(ctx))
 		eventfd_ctx_put(ctx);

-	if (file && !IS_ERR(file))
-		fput(file);
+	if (irqfd.file)
+		fdput(irqfd);

 	kfree(virqfd);

[PATCH v3] vfio-pci: PCI hot reset interface
The current VFIO_DEVICE_RESET interface only maps to PCI use cases
where we can isolate the reset to the individual PCI function.  This
means the device must support FLR (PCIe or AF), PM reset on D3hot->D0
transition, device specific reset, or be a singleton device on a bus
for a secondary bus reset.  FLR does not have widespread support,
PM reset is not very reliable, and bus topology is dictated by the
system and device design.  We need to provide a means for a user to
induce a bus reset in cases where the existing mechanisms are not
available or not reliable.

This device specific extension to VFIO provides the user with this
ability.  Two new ioctls are introduced:
 - VFIO_DEVICE_PCI_GET_HOT_RESET_INFO
 - VFIO_DEVICE_PCI_HOT_RESET

The first provides the user with information about the extent of
devices affected by a hot reset.  This is essentially a list of
devices and the IOMMU groups they belong to.  The user may then
initiate a hot reset by calling the second ioctl.  We must be
careful that the user has ownership of all the affected devices
found via the first ioctl, so the second ioctl takes a list of file
descriptors for the VFIO groups affected by the reset.  Each group
must have IOMMU protection established for the ioctl to succeed.

Signed-off-by: Alex Williamson
---

v2: Use PCI bus iterators.  Depends on pci_walk_slot() patch
v3: #include per Al Viro

 drivers/vfio/pci/vfio_pci.c | 280 +++
 include/uapi/linux/vfio.h   |  38 ++
 2 files changed, 317 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index cef6002..1dfec392 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -13,6 +13,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -227,6 +228,104 @@ static int vfio_pci_get_irq_count(struct vfio_pci_device *vdev, int irq_type)
 	return 0;
 }

+struct vfio_pci_walk_info {
+	int ret;
+	void *data;
+};
+
+static int vfio_pci_count_devs(struct pci_dev *pdev, void *data)
+{
+	struct vfio_pci_walk_info *walk = data;
+	int *count = walk->data;
+
+	(*count)++;
+	return walk->ret;
+}
+
+struct vfio_pci_fill_info {
+	int max;
+	int cur;
+	struct vfio_pci_dependent_device *devices;
+};
+
+static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data)
+{
+	struct vfio_pci_walk_info *walk = data;
+	struct vfio_pci_fill_info *fill = walk->data;
+	struct iommu_group *iommu_group;
+
+	if (fill->cur == fill->max) {
+		walk->ret = -EAGAIN; /* Something changed, try again */
+		return walk->ret;
+	}
+
+	iommu_group = iommu_group_get(&pdev->dev);
+	if (!iommu_group) {
+		walk->ret = -EPERM; /* Cannot reset non-isolated devices */
+		return walk->ret;
+	}
+
+	fill->devices[fill->cur].group_id = iommu_group_id(iommu_group);
+	fill->devices[fill->cur].segment = pci_domain_nr(pdev->bus);
+	fill->devices[fill->cur].bus = pdev->bus->number;
+	fill->devices[fill->cur].devfn = pdev->devfn;
+	fill->cur++;
+	iommu_group_put(iommu_group);
+	return walk->ret;
+}
+
+struct vfio_pci_group_entry {
+	struct vfio_group *group;
+	int id;
+};
+
+struct vfio_pci_group_info {
+	int count;
+	struct vfio_pci_group_entry *groups;
+};
+
+static int vfio_pci_validate_devs(struct pci_dev *pdev, void *data)
+{
+	struct vfio_pci_walk_info *walk = data;
+	struct vfio_pci_group_info *info = walk->data;
+	struct iommu_group *group;
+	int id, i;
+
+	group = iommu_group_get(&pdev->dev);
+	if (!group) {
+		walk->ret = -EPERM;
+		return walk->ret;
+	}
+
+	id = iommu_group_id(group);
+
+	for (i = 0; i < info->count; i++)
+		if (info->groups[i].id == id)
+			break;
+
+	iommu_group_put(group);
+
+	if (i == info->count)
+		walk->ret = -EINVAL;
+
+	return walk->ret;
+}
+
+static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev,
+					 int (*fn)(struct pci_dev *,
+						   void *data), void *data,
+					 bool slot)
+{
+	struct vfio_pci_walk_info info = { .ret = 0, .data = data };
+
+	if (slot)
+		pci_walk_slot(pdev->slot, fn, &info);
+	else
+		pci_walk_bus(pdev->bus, fn, &info);
+
+	return info.ret;
+}
+
 static long vfio_pci_ioctl(void *device_data,
 			   unsigned int cmd, unsigned long arg)
 {
@@ -407,10 +506,189 @@ static long vfio_pci_ioctl(void *device_data,

 		return ret;

-	} else if (cmd == VFIO_DEVICE_RESET)
+	} else if (cmd == VFIO_DEVICE_RESET) {
 		return vdev->reset_works ?
 			pci_re
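The ownership rule from the changelog (every device affected by the reset must belong to a group the caller handed in) boils down to a membership check. The sketch below is a userspace illustration only, assuming the group ids have already been collected into plain integer arrays; group_is_owned() and validate_reset_groups() are hypothetical names, not functions from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Return 1 if id appears in the caller-supplied list of owned group ids. */
static int group_is_owned(int id, const int *owned, size_t nowned)
{
	size_t i;

	for (i = 0; i < nowned; i++)
		if (owned[i] == id)
			return 1;
	return 0;
}

/*
 * dev_groups[] holds the IOMMU group id of each device the reset would
 * affect. The reset may only go ahead if every one of those groups was
 * passed in by the caller; otherwise reject with -EINVAL, mirroring the
 * error the per-device walk in the patch records.
 */
static int validate_reset_groups(const int *dev_groups, size_t ndevs,
				 const int *owned, size_t nowned)
{
	size_t i;

	for (i = 0; i < ndevs; i++)
		if (!group_is_owned(dev_groups[i], owned, nowned))
			return -EINVAL;
	return 0;
}
```

Several devices may share one group, so the device list can be longer than the list of owned groups and still validate.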
KVM: x86: update masterclock when kvmclock_offset is calculated
The offset to add to the host's monotonic time, kvmclock_offset, is
calculated against the monotonic time at KVM_SET_CLOCK ioctl time.

Request a master clock update at this time, to reduce a potentially
unbounded difference between the values of the masterclock and
the clock value used to calculate kvmclock_offset.

Signed-off-by: Marcelo Tosatti

Index: linux-2.6-kvmclock-fixes/arch/x86/kvm/x86.c
===================================================================
--- linux-2.6-kvmclock-fixes.orig/arch/x86/kvm/x86.c
+++ linux-2.6-kvmclock-fixes/arch/x86/kvm/x86.c
@@ -3806,6 +3806,7 @@ long kvm_arch_vm_ioctl(struct file *filp
 		delta = user_ns.clock - now_ns;
 		local_irq_enable();
 		kvm->arch.kvmclock_offset = delta;
+		kvm_gen_update_masterclock(kvm);
 		break;
 	}
 	case KVM_GET_CLOCK: {
Re: [uq/master PATCH] kvm: i386: fix LAPIC TSC deadline timer save/restore
On 19/08/2013 22:01, Marcelo Tosatti wrote:
> On Mon, Aug 19, 2013 at 08:57:58PM +0200, Paolo Bonzini wrote:
>> On 19/08/2013 19:13, Marcelo Tosatti wrote:
>>>
>>> The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:
>>>
>>> - APIC LVT Timer register.
>>> - TSC value.
>>>
>>> Change the order to respect the dependency.
>>
>> Do you have a testcase?
>>
>> Paolo
>
> Autotest:
>
> python ConfigTest.py --guestname=RHEL.7 --driveformat=virtio_scsi
> --nicmodel=e1000 --mem=2048 --vcpu=4
> --testcase=timedrift..ntp.with_migration --nrepeat=10

Thanks, applied.

Paolo
Re: Fix lapic time counter read for periodic mode
On 13/11/2012 21:40, Marcelo Tosatti wrote:
> On Tue, Nov 13, 2012 at 08:52:54AM +0100, Christian Ehrhardt wrote:
>>
>> Hi,
>>
>> thanks for your reply.
>>
>> On Mon, Nov 12, 2012 at 07:32:37PM -0200, Marcelo Tosatti wrote:
>>>> there is a bug in the emulation of the lapic time counter. In particular
>>>> what we are seeing is that the time counter of a periodic lapic timer
>>>> in the guest reads as zero 99% of the time. The patch below fixes that.
>>>>
>>>> The emulation of the lapic timer is done with the help of a hires
>>>> timer that expires with the same frequency as the lapic counter.
>>>> New expiration times for a periodic timer are calculated incrementally
>>>> based on the last scheduled expiration time. This ensures long term
>>>> accuracy of the emulated timer close to that of the underlying clock.
>>>>
>>>> The actual value of the lapic time counter is calculated from the
>>>> real time difference between current time and scheduled expiration time
>>>> of the hires timer. If this difference is negative, the hires timer
>>>> expired. For oneshot mode this is correctly translated into a zero value
>>>> for the time counter. However, in periodic mode we must use the negative
>>>> difference unmodified.
>>>>
>>>> regards Christian
>>>>
>>>> Fix lapic time counter read for periodic mode.
>>>
>>> In periodic mode the hrtimer is rearmed once expired, see
>>> apic_timer_fn. So _get_remaining should return proper value
>>> even if the guest is not able to process timer interrupts.
>>>
>>> Can you describe your specific scenario in more detail?
>>
>> In my specific case, the host is admittedly somewhat special as it
>> already is a rehosted version of linux, i.e. not running directly on
>> native hardware. It is still unclear if the host has sufficiently accurate
>> timer interrupts. This is most likely part of the problems we are seeing.
>>
>> However, AFAICS apic_timer_fn is only called once per jiffy (at least in
>> some configurations). In particular, it is not called by
>> hrtimer_get_remaining. Thus depending on the frequency of the LAPIC timer
>> in the guest there might be _several_ iterations that are missed. This can
>> probably be mitigated by hires timer interrupts. However, I think
>> the problem is still there even in that case.
>>
>> Additionally, the behaviour that I want to establish matches that of the
>> PIT timer (in a not completely obvious way, though).
>>
>> Having said that the proposed patch in my first mail is incomplete, as
>> the mod_64 does not work correctly for negative values. A fixed version
>> is below.
>>
>> regards Christian
>>
>> Signed-off-by: Christian Ehrhardt
>
> Alright. Please add a comment from the LAPIC documentation describing
> this behaviour (and a nice changelog). Thanks.
>

Christian, did you ever resubmit the patch?

Paolo
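For readers following the periodic-mode argument in this thread, here is a minimal sketch of the arithmetic being discussed. It is not the kernel code and not Christian's patch (which is not included above); it assumes times in nanoseconds, a period that is nonzero and fits in int64_t, and a hypothetical function name, lapic_remaining_ns(). The detail it illustrates is the one raised in the thread: a plain remainder of a negative difference is itself negative (or zero) in C, so it has to be shifted back into [0, period).

```c
#include <stdint.h>

/*
 * remaining_ns: scheduled expiry minus current time; negative once the
 *               backing timer has expired without being rearmed yet.
 * period_ns:    timer period, assumed nonzero and <= INT64_MAX.
 * periodic:     nonzero for periodic mode, zero for oneshot mode.
 *
 * Returns the time left in the current period.
 */
static uint64_t lapic_remaining_ns(int64_t remaining_ns, uint64_t period_ns,
				   int periodic)
{
	int64_t period = (int64_t)period_ns;
	int64_t r;

	if (!periodic)
		return remaining_ns < 0 ? 0 : (uint64_t)remaining_ns;

	/* C remainder takes the sign of the dividend: r is in (-period, period). */
	r = remaining_ns % period;
	if (r < 0)
		r += period;	/* missed expirations wrap into the current period */
	return (uint64_t)r;
}
```

With a 1000 ns period, a timer that is 300 ns or 2300 ns overdue both read as 700 ns remaining in periodic mode, whereas oneshot mode clamps to zero.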
Re: [uq/master patch 2/2] kvm-all.c: max_cpus should not exceed KVM vcpu limit
On 12/08/2013 21:56, Marcelo Tosatti wrote:
> maxcpus, which specifies the maximum number of hotpluggable CPUs,
> should not exceed KVM's vcpu limit.
>
> Signed-off-by: Marcelo Tosatti
>
> Index: qemu/kvm-all.c
> ===================================================================
> --- qemu.orig/kvm-all.c
> +++ qemu/kvm-all.c
> @@ -1391,6 +1391,13 @@ int kvm_init(void)
>          goto err;
>      }
>
> +    if (max_cpus > max_vcpus) {
> +        ret = -EINVAL;
> +        fprintf(stderr, "Number of max_cpus requested (%d) exceeds max cpus "
> +                "supported by KVM (%d)\n", max_cpus, max_vcpus);
> +        goto err;
> +    }
> +
>      s->vmfd = kvm_ioctl(s, KVM_CREATE_VM, 0);
>      if (s->vmfd < 0) {
>  #ifdef TARGET_S390X
>

I applied this patch to uq/master.

Thanks,

Paolo
Re: [Qemu-devel] [PATCH] qemu-kvm bugfix for IA32_FEATURE_CONTROL
On 20/08/2013 05:33, Liu, Jinsong wrote:
> Thanks Andreas!
>
> This patch is for qemu-kvm.

Even though the repository is still called qemu-kvm, the uq/master
branch is the only active one and patches there will end up in upstream
QEMU.  There are no qemu-kvm releases anymore.

I applied the patch to uq/master, thanks.

Paolo

> Per my understanding, some patches are first checked in to the qemu-kvm
> uq/master branch.
> This patch is to fix c/s 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b of the
> qemu-kvm uq/master branch
> (which is to co-work w/ kvm IA32_FEATURE_CONTROL, and currently not yet in
> upstream qemu).
>
> This patch is used to fix the bug introduced by
> 0779caeb1a17f4d3ed14e2925b36ba09b084fb7b of the qemu-kvm uq/master branch.
> The bug is reported as
> https://bugs.launchpad.net/qemu-kvm/+bug/1207623
> https://bugs.launchpad.net/qemu/+bug/1213797
>
> Is there anything I misunderstand about upstream qemu and qemu-kvm?
Re: [PATCH uq/master] kvm: Simplify kvm_handle_io
On 13/08/2013 14:43, Jan Kiszka wrote:
> Now that cpu_in/out is just a wrapper around address_space_rw, we can
> also call the latter directly. As host endianness == guest endianness,
> there is no need for the memory access helpers st*_p/ld*_p as well.
>
> Signed-off-by: Jan Kiszka
> ---
>  kvm-all.c |   28 ++--
>  1 files changed, 2 insertions(+), 26 deletions(-)
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 716860f..c861354 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -1499,32 +1499,8 @@ static void kvm_handle_io(uint16_t port, void *data, int direction, int size,
>      uint8_t *ptr = data;
>
>      for (i = 0; i < count; i++) {
> -        if (direction == KVM_EXIT_IO_IN) {
> -            switch (size) {
> -            case 1:
> -                stb_p(ptr, cpu_inb(port));
> -                break;
> -            case 2:
> -                stw_p(ptr, cpu_inw(port));
> -                break;
> -            case 4:
> -                stl_p(ptr, cpu_inl(port));
> -                break;
> -            }
> -        } else {
> -            switch (size) {
> -            case 1:
> -                cpu_outb(port, ldub_p(ptr));
> -                break;
> -            case 2:
> -                cpu_outw(port, lduw_p(ptr));
> -                break;
> -            case 4:
> -                cpu_outl(port, ldl_p(ptr));
> -                break;
> -            }
> -        }
> -
> +        address_space_rw(&address_space_io, port, ptr, size,
> +                         direction == KVM_EXIT_IO_OUT);
>          ptr += size;
>      }
>  }
>

Applied, thanks.

Paolo
Re: kvm/queue still ahead of kvm/next
On 09/08/2013 21:26, Paolo Bonzini wrote:
> Hi all,
>
> I'm seeing some breakage of shadow-on-shadow and shadow-on-EPT nested
> VMX.  Until I can track more precisely whether it is a regression, and
> on which hosts I can reproduce it, I'm going to leave the patches out of
> kvm/next.
>
> The good news is that nested EPT works pretty well. :)

Yeah, shadow-on-EPT doesn't work on at least the Westmere I tried, so
I'll merge kvm/queue to kvm/next soon.

Paolo
Re: [PATCH] target-ppc: Update slb array with correct index values.
Alexander Graf writes:

> On 19.08.2013, at 09:25, Aneesh Kumar K.V wrote:
>
>> Alexander Graf writes:
>>
>>> On 11.08.2013, at 20:16, Aneesh Kumar K.V wrote:
>>>
>>>> From: "Aneesh Kumar K.V"
>>>>
>>>> Without this, a value of rb=0 and rs=0, result in us replacing the 0th index
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V
>>>
>>> Wrong mailing list again ;).
>>
>> Will post the series again with updated commit message to the qemu list.
>>
>>>
>>>> ---
>>>>  target-ppc/kvm.c | 14 --
>>>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>>> index 30a870e..5d4e613 100644
>>>> --- a/target-ppc/kvm.c
>>>> +++ b/target-ppc/kvm.c
>>>> @@ -1034,8 +1034,18 @@ int kvm_arch_get_registers(CPUState *cs)
>>>>      /* Sync SLB */
>>>>  #ifdef TARGET_PPC64
>>>>      for (i = 0; i < 64; i++) {
>>>> -        ppc_store_slb(env, sregs.u.s.ppc64.slb[i].slbe,
>>>> -                      sregs.u.s.ppc64.slb[i].slbv);
>>>> +        target_ulong rb = sregs.u.s.ppc64.slb[i].slbe;
>>>> +        /*
>>>> +         * KVM_GET_SREGS doesn't return slb entry with slot information
>>>> +         * same as index. So don't depend on the slot information in
>>>> +         * the returned value.
>>>
>>> This is the generating code in book3s_pr.c:
>>>
>>>	if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
>>>		for (i = 0; i < 64; i++) {
>>>			sregs->u.s.ppc64.slb[i].slbe = vcpu->arch.slb[i].orige | i;
>>>			sregs->u.s.ppc64.slb[i].slbv = vcpu->arch.slb[i].origv;
>>>		}
>>>
>>> Where exactly did you see broken slbe entries?
>>>
>>
>> I noticed this when adding support for guest memory dumping via qemu gdb
>> server. Now the array we get would look like below
>>
>> slbe0 slbv0
>> slbe1 slbv1
>> 0
>> 0
>
> Ok, so that's where the problem lies. Why are the entries 0 here?
> Either we try to fetch more entries than we should, we populate
> entries incorrectly or the kernel simply returns invalid SLB entry
> values for invalid entries.

The ioctl zeroes out the sregs and fills only slb_max entries, so we
find zero-filled entries above slb_max. Also, we don't pass slb_max to
user space, so userspace has to look at all 64 entries.

>
> Are you seeing this with PR KVM or HV KVM?
>

HV KVM

-aneesh
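To make the failure mode in this thread concrete: if the store routine takes the slot number from the low bits of rb, then every zero-filled entry above slb_max lands in slot 0 and wipes the valid entry there. The sketch below is a userspace model, not the QEMU code. store_slb() is a hypothetical stand-in for ppc_store_slb(), the 12-bit SLOT_MASK is an assumption made for the illustration, and the "indexed" variant only shows the idea of ignoring the returned slot bits and using the array index instead; the actual remainder of the patch is not shown in the quoted text above.

```c
#include <stdint.h>

#define NSLB      64
#define SLOT_MASK 0xfffULL	/* assumption: slot number in the low 12 bits of rb */

struct slb_entry {
	uint64_t esid;
	uint64_t vsid;
};

/* Hypothetical stand-in for ppc_store_slb(): the slot comes from rb itself. */
static int store_slb(struct slb_entry *slb, uint64_t rb, uint64_t rs)
{
	unsigned slot = (unsigned)(rb & SLOT_MASK);

	if (slot >= NSLB)
		return -1;
	slb[slot].esid = rb & ~SLOT_MASK;
	slb[slot].vsid = rs;
	return 0;
}

/* Trusts the slot bits in each returned slbe: zero entries all hit slot 0. */
static void restore_naive(struct slb_entry *slb,
			  const uint64_t *slbe, const uint64_t *slbv)
{
	int i;

	for (i = 0; i < NSLB; i++)
		store_slb(slb, slbe[i], slbv[i]);
}

/* Ignores the returned slot bits and uses the array index as the slot. */
static void restore_indexed(struct slb_entry *slb,
			    const uint64_t *slbe, const uint64_t *slbv)
{
	int i;

	for (i = 0; i < NSLB; i++)
		store_slb(slb, (slbe[i] & ~SLOT_MASK) | (uint64_t)i, slbv[i]);
}
```

Feeding both variants an array with two valid entries followed by zeros shows the difference: the naive loop ends with slot 0 cleared, the indexed loop keeps it.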
Re: Call for agenda for 2013-08-20
Juan Quintela wrote:
> Hi
>
> Please, send any topic that you are interested in covering.
>

Call cancelled.

As this was the only topic, and neither Frederik nor Konrad is able to
attend today, the topic got moved to the next call in two weeks.

> Thanks, Juan.
>
> Agenda so far:
> - Talk about qemu reverse executing (1st description was done this week)
>   How to handle IO when we want to do reverse execution.
>   How this relate to Kemari needs?
>   And to icount changes?
>
> Call details:
>
> 10:00 AM to 11:00 AM EDT
> Every two weeks
>
> If you need phone number details, contact me privately.
RE: [Qemu-devel] vm performance degradation after kvm live migration or save-restore with EPT enabled
>>> >>> The QEMU command line (/var/log/libvirt/qemu/[domain name].log),
>>> >>> LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ QEMU_AUDIO_DRV=none
>>> >>> /usr/local/bin/qemu-system-x86_64 -name ATS1 -S -M pc-0.12 -cpu qemu32
>>> >>> -enable-kvm -m 12288 -smp 4,sockets=4,cores=1,threads=1
>>> >>> -uuid 0505ec91-382d-800e-2c79-e5b286eb60b5 -no-user-config -nodefaults
>>> >>> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/ATS1.monitor,server,nowait
>>> >>> -mon chardev=charmonitor,id=monitor,mode=control
>>> >>> -rtc base=localtime -no-shutdown
>>> >>> -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>>> >>> -drive file=/opt/ne/vm/ATS1.img,if=none,id=drive-virtio-disk0,format=raw,cache=none
>>> >>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>> >>> -netdev tap,fd=20,id=hostnet0,vhost=on,vhostfd=21
>>> >>> -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:e0:fc:00:0f:00,bus=pci.0,addr=0x3,bootindex=2
>>> >>> -netdev tap,fd=22,id=hostnet1,vhost=on,vhostfd=23
>>> >>> -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:e0:fc:01:0f:00,bus=pci.0,addr=0x4
>>> >>> -netdev tap,fd=24,id=hostnet2,vhost=on,vhostfd=25
>>> >>> -device virtio-net-pci,netdev=hostnet2,id=net2,mac=00:e0:fc:02:0f:00,bus=pci.0,addr=0x5
>>> >>> -netdev tap,fd=26,id=hostnet3,vhost=on,vhostfd=27
>>> >>> -device virtio-net-pci,netdev=hostnet3,id=net3,mac=00:e0:fc:03:0f:00,bus=pci.0,addr=0x6
>>> >>> -netdev tap,fd=28,id=hostnet4,vhost=on,vhostfd=29
>>> >>> -device virtio-net-pci,netdev=hostnet4,id=net4,mac=00:e0:fc:0a:0f:00,bus=pci.0,addr=0x7
>>> >>> -netdev tap,fd=30,id=hostnet5,vhost=on,vhostfd=31
>>> >>> -device virtio-net-pci,netdev=hostnet5,id=net5,mac=00:e0:fc:0b:0f:00,bus=pci.0,addr=0x9
>>> >>> -chardev pty,id=charserial0
>>> >>> -device isa-serial,chardev=charserial0,id=serial0
>>> >>> -vnc *:0 -k en-us -vga cirrus
>>> >>> -device i6300esb,id=watchdog0,bus=pci.0,addr=0xb
>>> >>> -watchdog-action poweroff
>>> >>> -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0xa
>>> >>>
>>> >>Which QEMU version is this? Can you try with e1000 NICs instead of virtio?
>>> >>
>>> >This QEMU version is 1.0.0, but I also tested QEMU 1.5.2; the same problem
>>> >exists, including the performance degradation and the flooding of read-only GFNs.
>>> >I also tried e1000 NICs instead of virtio with QEMU 1.5.2 and saw the same
>>> >performance degradation and read-only GFN flooding.
>>> >With either e1000 or virtio NICs, the GFN flooding begins in the
>>> >post-restore stage (i.e. the running stage): as soon as the restore
>>> >completes, the flooding starts.
>>> >
>>> >Thanks,
>>> >Zhang Haoyu
>>> >
>>> >>--
>>> >>	Gleb.
>>>
>>> Should we focus on the first bad
>>> commit (612819c3c6e67bac8fceaa7cc402f13b1b63f7e4) and the surprising GFN
>>> flooding?
>>>
>>Not really. There is no point in debugging very old version compiled
>>with kvm-kmod, there are too many variables in the environment. I cannot
>>reproduce the GFN flooding on upstream, so the problem may be gone, may
>>be a result of kvm-kmod problem or something different in how I invoke
>>qemu. So the best way to proceed is for you to reproduce with upstream
>>version then at least I will be sure that we are using the same code.
>>
>Thanks, I will test the combos of upstream kvm kernel and upstream qemu.
>And the guest OS version I gave above was wrong; the currently running guest OS is
>SLES10SP4.
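>
I tested below combos of qemu and kernel,

+-----------------+-----------------+-----------------+
|  kvm kernel     |  QEMU           |  test result    |
+-----------------+-----------------+-----------------+
|  kvm-3.11-2     |  qemu-1.5.2     |  GOOD           |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.0.0     |  BAD            |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.4.0     |  BAD            |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.4.2     |  BAD            |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.5.0-rc0 |  GOOD           |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.5.0     |  GOOD           |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.5.1     |  GOOD           |
+-----------------+-----------------+-----------------+
|  SLES11SP2      |  qemu-1.5.2     |  GOOD           |
+-----------------+-----------------+-----------------+

NOTE:
1. above kvm-3.11-2 in the table is the whole tag kernel download from
   https://git.kernel.org/pub/scm/virt/kvm/kvm.git
2. SLES11SP2's kernel version is 3.0.13-0.27

Then I git bisect the qemu changes between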
Re: Multi Queue KVM Support
On 20/08/2013 13:13, Naor Shlomo wrote:
> Hi Paolo and thanks for your help.
>
> I upgraded the following (compiled from source)
> qemu : 1.5.2 stable
> libvirt : 1.1.1
>
> but for some reason when I run the version command inside virsh:
>
> Compiled against library: libvirt 1.1.1
> Using library: libvirt 1.1.1
> Using API: QEMU 1.1.1
> Running hypervisor: QEMU 0.12.1
>
> It says that my running Hypervisor is QEMU 0.12.1
>
> Could you please tell me what did I miss, how do I upgrade the hypervisor?

Not sure. Adding the libvirt-users mailing list.

> Thanks,
> Naor
>
> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
> Sent: Tuesday, August 20, 2013 12:28 PM
> To: Naor Shlomo
> Cc: kvm@vger.kernel.org
> Subject: Re: Multi Queue KVM Support
>
> On 20/08/2013 05:21, Naor Shlomo wrote:
>> Hi Paolo,
>>
>> The host is running CentOS release 6.3 (Final).
>> I did "yum upgrade libvirt" and "yum upgrade qemu-kvm" a couple of days ago
>> and ended up with these versions.
>>
>> What do you suggest regarding qemu? compile 6.5 or later myself?
>
> RHEL/CentOS 6.5 is not yet out, it's still a few months before it's released.
>
> You can compile QEMU 1.6 from source, or wait for CentOS to have the feature.
>
> Paolo
>
>> I appreciate your help,
>> Naor
>>
>> -----Original Message-----
>> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
>> Sent: Monday, August 19, 2013 11:22 PM
>> To: Naor Shlomo
>> Cc: kvm@vger.kernel.org
>> Subject: Re: Multi Queue KVM Support
>>
>> On 19/08/2013 13:29, Naor Shlomo wrote:
>>> Hello experts,
>>>
>>> I am trying to use the multi queue support on a Linux guest running Kernel
>>> 3.9.7.
>>>
>>> The host's virsh version command reports the following output:
>>> Compiled against library: libvirt 0.10.2
>>> Using library: libvirt 0.10.2
>>> Using API: QEMU 0.10.2
>>> Running hypervisor: QEMU 0.12.1
>>
>> Is it RHEL or CentOS or Scientific Linux, or something else? If
>> RHEL/CentOS, what release?
>>
>>> The problem is that virtio_has_feature(vdev, VIRTIO_NET_F_MQ) returns FALSE
>>> and I don't know why.
>>
>> This version of QEMU is too old. It's possible that 6.5 will have
>> multiqueue, but I'm not entirely sure.
>>
>> Paolo
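Editorial note: the "QEMU is too old" answer above can be illustrated with a small sketch of virtio feature negotiation. The negotiated feature set is the intersection of what the device (QEMU) offers and what the guest driver supports, so a guest kernel that knows about multiqueue still sees the bit cleared when the host side never offers it. VIRTIO_NET_F_MQ is bit 22 in the virtio-net feature set; the helpers negotiate_features() and has_feature() are hypothetical stand-ins written for this note and are not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number of the multiqueue feature in the virtio-net feature set. */
#define VIRTIO_NET_F_MQ 22

/* Hypothetical helper: the negotiated set contains only the bits that
 * both the device (host side) and the driver (guest side) support. */
static uint64_t negotiate_features(uint64_t device_offered,
                                   uint64_t driver_supported)
{
    return device_offered & driver_supported;
}

/* Hypothetical stand-in for a feature test such as virtio_has_feature():
 * true only if the given bit survived negotiation. */
static bool has_feature(uint64_t negotiated, unsigned int bit)
{
    return ((negotiated >> bit) & 1u) != 0;
}
```

With a driver mask that includes bit 22 and a device mask that does not, has_feature() is false no matter how new the guest kernel is, which matches the symptom reported in this thread.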
Re: Oracle RAC in libvirt+KVM environment
On 20/08/2013 13:43, Timon Wang wrote:
> Thanks, the whole iSCSI LUN have been passed to the VM.
>
> But I test it with scsicmd, and found that the driver may be not
> support SPC-3, but if i use this by microsoft iscsi initiator, I can
> pass all the scsi3_test tests.

If you are passing the LUN to the VM with device='lun', the driver and
VMM do not interpret any SCSI command. You should see exactly the same
data as in the host, which includes support for SPC-3:

>>> [root@localhost /]# sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>>> standard INQUIRY:
>>> PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
>>> [AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=0
>>> SCCS=1 ACC=0 TPGS=1 3PC=0 Protect=0 [BQue=0]
>>> EncServ=0 MultiP=0 [MChngr=0] [ACKREQQ=0] Addr16=0
>>> [RelAdr=0] WBus16=1 Sync=1 Linked=0 [TranDis=0] CmdQue=1
>>> length=36 (0x24) Peripheral device type: disk
>>> Vendor identification: MacroSAN
>>> Product identification: LU
>>> Product revision level: 1.0
>>> Unit serial number: 0d9281ae-aea4-6da0--02180142b300

Can you try using a Linux VM and executing sg_inq in the VM?

Paolo
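Editorial note: the "version=0x05 [SPC-3]" field that sg_inq prints comes from byte 2 (the VERSION field) of the standard INQUIRY data. The sketch below shows how a guest-side tool could decode that byte from a raw INQUIRY buffer to compare host and guest. The function name inquiry_version_name() is hypothetical, and only a few common version codes are mapped.

```c
#include <stddef.h>
#include <stdint.h>

/* Map the VERSION field (byte 2 of standard INQUIRY data) to a short
 * name. Hypothetical helper; codes not listed here fall through to
 * "other". */
static const char *inquiry_version_name(const uint8_t *inq, size_t len)
{
    if (inq == NULL || len < 3)
        return "too short";

    switch (inq[2]) {
    case 0x00: return "no conformance claimed";
    case 0x03: return "SPC";
    case 0x04: return "SPC-2";
    case 0x05: return "SPC-3";
    case 0x06: return "SPC-4";
    default:   return "other";
    }
}
```

If the buffer returned inside the guest decodes to the same name as on the host, the INQUIRY data is being passed through unchanged, which is what device='lun' is expected to do.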
Re: Oracle RAC in libvirt+KVM environment
I found when I use "scsicmd -d1 -s13" test command to test the
"controller bus reset" request, there will be a blue screen on windows
2008 r2.

The error code is :
BugCheck D1, {4, a, 0, f8800154dd06}

1: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0004, memory referenced
Arg2: 000a, IRQL
Arg3: , value 0 = read operation, 1 = write operation
Arg4: f8800154dd06, address which referenced memory

Debugging Details:
------------------

Page 17c41 not present in the dump file. Type ".hh dbgerr004" for details

READ_ADDRESS: 0004

CURRENT_IRQL: a

FAULTING_IP:
vioscsi+1d06
f880`0154dd06 458b4804 mov r9d,dword ptr [r8+4]

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0xD1

PROCESS_NAME: scsicmd.exe

TRAP_FRAME: f880009f7670 -- (.trap 0xf880009f7670)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0002 rbx= rcx=fa800065e738
rdx=fa800065e8f8 rsi= rdi=
rip=f8800154dd06 rsp=f880009f7800 rbp=fa800065e8f8
r8= r9= r10=fa80009155b0
r11=f880009f7848 r12= r13=
r14= r15=
iopl=0 nv up ei pl zr na po nc
vioscsi+0x1d06:
f880`0154dd06 458b4804 mov r9d,dword ptr [r8+4] ds:e630:0004=
Resetting default scope

LAST_CONTROL_TRANSFER: from f800016ca469 to f800016caf00

STACK_TEXT:
f880`009f7528 f800`016ca469 : `000a `0004 `000a ` : nt!KeBugCheckEx
f880`009f7530 f800`016c90e0 : ` fa80`009151b0 fa80`0155a290 f880`01339110 : nt!KiBugCheckDispatch+0x69
f880`009f7670 f880`0154dd06 : `0001 f880`0154dcec fa80`009151b0 f880`01323934 : nt!KiPageFault+0x260
f880`009f7800 f880`0132abcf : fa80`009151b0 fa80`0065e8f8 fa80`0065e738 `0001 : vioscsi+0x1d06
f880`009f7850 f880`0154d971 : `0001 `0001 `002d5000 fa80`00925000 : storport!StorPortSynchronizeAccess+0x4f
f880`009f7890 f880`01323a0c : fa80`0fb1 fa80`0155a200 `002d5000 fa80`01576010 : vioscsi+0x1971
f880`009f78d0 f880`01333adf : fa80`006eeb30 fa80`006e2070 ` `0801 : storport!RaCallMiniportResetBus+0x1c
f880`009f7900 f880`01333b68 : fa80`0155a290 fa80`006b39f0 0040` ` : storport!RaidAdapterResetBus+0x2f
f880`009f7950 f880`0136de0b : `20206f49 `0001 `0001 `20206f49 : storport!RaidAdapterStorageResetBusIoctl+0x28
f880`009f7980 f880`0136d1d0 : f880`01339110 fa80`00915060 ` fa80`006e2070 : storport! ?? ::NNGAKEGL::`string'+0x3c8
f880`009f79d0 f800`019e33a7 : fa80`006e2070 f880`009f7ca0 fa80`006e2070 fa80`0155a290 : storport!RaDriverDeviceControlIrp+0x90
f880`009f7a10 f800`019e3c06 : ` ` ` ` : nt!IopXxxControlFile+0x607
f880`009f7b40 f800`016ca153 : `001aeb01 `0001 `001aeba0 f800`019db152 : nt!NtDeviceIoControlFile+0x56
f880`009f7bb0 `77a2ff2a : ` ` ` ` : nt!KiSystemServiceCopyEnd+0x13
`001af1d8 ` : ` ` ` ` : 0x77a2ff2a

STACK_COMMAND: kb

FOLLOWUP_IP:
vioscsi+1d06
f880`0154dd06 458b4804 mov r9d,dword ptr [r8+4]

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: vioscsi+1d06

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: vioscsi

IMAGE_NAME: vioscsi.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 5200724f

FAILURE_BUCKET_ID: X64_0xD1_vioscsi+1d06

BUCKET_ID: X64_0xD1_vioscsi+1d06

Followup: MachineOwner
---------

On Tue, Aug 20, 2013 at 7:43 PM, Timon Wang wrote:
> Thanks, the whole iSCSI LUN have been passed to the VM.
>
> But I test it with scsicmd, and found that the driver may be not
> support SPC-3, but if i use this by microsoft iscsi initiator, I can
> pass all the scsi3_test tes
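Editorial note: one plausible reading of the dump above is a NULL structure pointer. The faulting instruction loads a 32-bit value from [r8+4], the reported READ_ADDRESS is 4, and r8 shows no value in the trap frame, so r8 was most likely zero. The sketch below only illustrates that arithmetic; struct example_ctx and faulting_address() are hypothetical and are not taken from the vioscsi source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: any structure with a 32-bit field at offset 4. */
struct example_ctx {
    uint32_t first;   /* offset 0 */
    uint32_t second;  /* offset 4 */
};

/* Address that a load of `second` would touch for a given base pointer
 * value. With a NULL base this is 4, the READ_ADDRESS in the dump. */
static uintptr_t faulting_address(uintptr_t base)
{
    return base + offsetof(struct example_ctx, second);
}
```

Under that assumption, the bus-reset path in the driver reached a structure pointer that had never been set up, and reading its second 32-bit field at raised IRQL produced bugcheck D1.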
Re: Oracle RAC in libvirt+KVM environment
Thanks, the whole iSCSI LUN have been passed to the VM.

But I test it with scsicmd, and found that the driver may be not
support SPC-3, but if i use this by microsoft iscsi initiator, I can
pass all the scsi3_test tests.

Tool can be found here:
http://www.symantec.com/business/support/index?page=content&id=TECH72086

It's this means that the scsi passthrough windows driver does not
support SPC-3 feature, I have read a post about this, it says if
support this we should change both the implementation and the
documents in virtio spec.

I am new to this list, so I don't know what is the situation right now?
Would somebody please give me some advise on it?

On Tue, Aug 20, 2013 at 6:49 PM, Paolo Bonzini wrote:
> On 20/08/2013 12:42, Timon Wang wrote:
>> [root@localhost /]# ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>> lrwxrwxrwx. 1 root root 8 8月 20 17:38
>> /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk -> ../dm-13
>> [root@localhost /]# sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>> standard INQUIRY:
>> PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
>> [AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=0
>> SCCS=1 ACC=0 TPGS=1 3PC=0 Protect=0 [BQue=0]
>> EncServ=0 MultiP=0 [MChngr=0] [ACKREQQ=0] Addr16=0
>> [RelAdr=0] WBus16=1 Sync=1 Linked=0 [TranDis=0] CmdQue=1
>> length=36 (0x24) Peripheral device type: disk
>> Vendor identification: MacroSAN
>> Product identification: LU
>> Product revision level: 1.0
>> Unit serial number: 0d9281ae-aea4-6da0--02180142b300
>>
>> This lun is from a vg build based on iscsi target.
>
> If it is a logical volume, you cannot pass it as a LUN to the guest.
> Only the whole iSCSI LUN can be passed as a LUN.
>
> Paolo
>
>> [root@localhost /]# libvirtd --version
>> libvirtd (libvirt) 1.0.5
>> [root@localhost /]# qemu-kvm --version
>> QEMU emulator version 1.4.1, Copyright (c) 2003-2008 Fabrice Bellard
>> [root@localhost /]# uname -a
>> Linux localhost.localdomain 3.9.2-301.fc19.x86_64 #1 SMP Mon May 13
>> 12:36:24 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>> On Tue, Aug 20, 2013 at 6:16 PM, Paolo Bonzini wrote:
>>> On 20/08/2013 11:59, Timon Wang wrote:
>>>> On Tue, Aug 20, 2013 at 4:33 PM, Paolo Bonzini wrote:
>>>>> On 20/08/2013 08:00, Timon Wang wrote:
>>>>>>
>>>>>
>>>>> I'm not sure this will be enough, but if you want passthrough to the
>>>>> host device you should use device='lun' here. However, you still would
>>>>> not be able to issue SCSI reservations unless you run QEMU with the
>>>>> CAP_SYS_RAWIO capability (using "").
>>>>>
>>>>
>>>> After change the libvirt xml like this:
>>>>
>>>> I got these errors:
>>>> char device redirected to /dev/pts/1 (label charserial0)
>>>> qemu-system-x86_64: -device
>>>> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
>>>> scsi-block: INQUIRY failed
>>>> qemu-system-x86_64: -device
>>>> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
>>>> Device 'scsi-block' could not be initialized
>>>
>>> Can you do
>>>
>>> # ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>>> # sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>>>
>>> ?
>>>
>>> Paolo

--
Focus on: Server Vitualization, Network security,Scanner,NodeJS,JAVA,WWW
Blog: http://www.nohouse.net
RE: Multi Queue KVM Support
Hi Paolo and thanks for your help.

I upgraded the following (compiled from source)
qemu : 1.5.2 stable
libvirt : 1.1.1

but for some reason when I run the version command inside virsh:

Compiled against library: libvirt 1.1.1
Using library: libvirt 1.1.1
Using API: QEMU 1.1.1
Running hypervisor: QEMU 0.12.1

It says that my running Hypervisor is QEMU 0.12.1

Could you please tell me what did I miss, how do I upgrade the hypervisor?

Thanks,
Naor

-----Original Message-----
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
Sent: Tuesday, August 20, 2013 12:28 PM
To: Naor Shlomo
Cc: kvm@vger.kernel.org
Subject: Re: Multi Queue KVM Support

On 20/08/2013 05:21, Naor Shlomo wrote:
> Hi Paolo,
>
> The host is running CentOS release 6.3 (Final).
> I did "yum upgrade libvirt" and "yum upgrade qemu-kvm" a couple of days ago
> and ended up with these versions.
>
> What do you suggest regarding qemu? compile 6.5 or later myself?

RHEL/CentOS 6.5 is not yet out, it's still a few months before it's released.

You can compile QEMU 1.6 from source, or wait for CentOS to have the feature.

Paolo

> I appreciate your help,
> Naor
>
> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
> Sent: Monday, August 19, 2013 11:22 PM
> To: Naor Shlomo
> Cc: kvm@vger.kernel.org
> Subject: Re: Multi Queue KVM Support
>
> On 19/08/2013 13:29, Naor Shlomo wrote:
>> Hello experts,
>>
>> I am trying to use the multi queue support on a Linux guest running Kernel
>> 3.9.7.
>>
>> The host's virsh version command reports the following output:
>> Compiled against library: libvirt 0.10.2
>> Using library: libvirt 0.10.2
>> Using API: QEMU 0.10.2
>> Running hypervisor: QEMU 0.12.1
>
> Is it RHEL or CentOS or Scientific Linux, or something else? If RHEL/CentOS,
> what release?
>
>> The problem is that virtio_has_feature(vdev, VIRTIO_NET_F_MQ) returns FALSE
>> and I don't know why.
>
> This version of QEMU is too old. It's possible that 6.5 will have
> multiqueue, but I'm not entirely sure.
>
> Paolo
Re: Oracle RAC in libvirt+KVM environment
On 20/08/2013 12:42, Timon Wang wrote:
> [root@localhost /]# ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
> lrwxrwxrwx. 1 root root 8 8月 20 17:38
> /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk -> ../dm-13
> [root@localhost /]# sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
> standard INQUIRY:
> PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
> [AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=0
> SCCS=1 ACC=0 TPGS=1 3PC=0 Protect=0 [BQue=0]
> EncServ=0 MultiP=0 [MChngr=0] [ACKREQQ=0] Addr16=0
> [RelAdr=0] WBus16=1 Sync=1 Linked=0 [TranDis=0] CmdQue=1
> length=36 (0x24) Peripheral device type: disk
> Vendor identification: MacroSAN
> Product identification: LU
> Product revision level: 1.0
> Unit serial number: 0d9281ae-aea4-6da0--02180142b300
>
> This lun is from a vg build based on iscsi target.

If it is a logical volume, you cannot pass it as a LUN to the guest.
Only the whole iSCSI LUN can be passed as a LUN.

Paolo

> [root@localhost /]# libvirtd --version
> libvirtd (libvirt) 1.0.5
> [root@localhost /]# qemu-kvm --version
> QEMU emulator version 1.4.1, Copyright (c) 2003-2008 Fabrice Bellard
> [root@localhost /]# uname -a
> Linux localhost.localdomain 3.9.2-301.fc19.x86_64 #1 SMP Mon May 13
> 12:36:24 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>
>
> On Tue, Aug 20, 2013 at 6:16 PM, Paolo Bonzini wrote:
>> On 20/08/2013 11:59, Timon Wang wrote:
>>> On Tue, Aug 20, 2013 at 4:33 PM, Paolo Bonzini wrote:
>>>> On 20/08/2013 08:00, Timon Wang wrote:
>>>>>
>>>>
>>>> I'm not sure this will be enough, but if you want passthrough to the
>>>> host device you should use device='lun' here. However, you still would
>>>> not be able to issue SCSI reservations unless you run QEMU with the
>>>> CAP_SYS_RAWIO capability (using "").
>>>>
>>>
>>> After change the libvirt xml like this:
>>>
>>> I got these errors:
>>> char device redirected to /dev/pts/1 (label charserial0)
>>> qemu-system-x86_64: -device
>>> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
>>> scsi-block: INQUIRY failed
>>> qemu-system-x86_64: -device
>>> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
>>> Device 'scsi-block' could not be initialized
>>
>> Can you do
>>
>> # ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>> # sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>>
>> ?
>>
>> Paolo
Re: Oracle RAC in libvirt+KVM environment
[root@localhost /]# ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
lrwxrwxrwx. 1 root root 8 8月 20 17:38
/dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk -> ../dm-13
[root@localhost /]# sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
standard INQUIRY:
PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
[AERC=0] [TrmTsk=0] NormACA=0 HiSUP=0 Resp_data_format=0
SCCS=1 ACC=0 TPGS=1 3PC=0 Protect=0 [BQue=0]
EncServ=0 MultiP=0 [MChngr=0] [ACKREQQ=0] Addr16=0
[RelAdr=0] WBus16=1 Sync=1 Linked=0 [TranDis=0] CmdQue=1
length=36 (0x24) Peripheral device type: disk
Vendor identification: MacroSAN
Product identification: LU
Product revision level: 1.0
Unit serial number: 0d9281ae-aea4-6da0--02180142b300

This lun is from a vg build based on iscsi target.

[root@localhost /]# libvirtd --version
libvirtd (libvirt) 1.0.5
[root@localhost /]# qemu-kvm --version
QEMU emulator version 1.4.1, Copyright (c) 2003-2008 Fabrice Bellard
[root@localhost /]# uname -a
Linux localhost.localdomain 3.9.2-301.fc19.x86_64 #1 SMP Mon May 13
12:36:24 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

On Tue, Aug 20, 2013 at 6:16 PM, Paolo Bonzini wrote:
> On 20/08/2013 11:59, Timon Wang wrote:
>> On Tue, Aug 20, 2013 at 4:33 PM, Paolo Bonzini wrote:
>>> On 20/08/2013 08:00, Timon Wang wrote:
>>>>
>>>
>>> I'm not sure this will be enough, but if you want passthrough to the
>>> host device you should use device='lun' here. However, you still would
>>> not be able to issue SCSI reservations unless you run QEMU with the
>>> CAP_SYS_RAWIO capability (using "").
>>>
>>
>> After change the libvirt xml like this:
>>
>> I got these errors:
>> char device redirected to /dev/pts/1 (label charserial0)
>> qemu-system-x86_64: -device
>> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
>> scsi-block: INQUIRY failed
>> qemu-system-x86_64: -device
>> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
>> Device 'scsi-block' could not be initialized
>
> Can you do
>
> # ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
> # sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
>
> ?
>
> Paolo

--
Focus on: Server Vitualization, Network security,Scanner,NodeJS,JAVA,WWW
Blog: http://www.nohouse.net
Re: Oracle RAC in libvirt+KVM environment
On 20/08/2013 11:59, Timon Wang wrote:
> On Tue, Aug 20, 2013 at 4:33 PM, Paolo Bonzini wrote:
>> On 20/08/2013 08:00, Timon Wang wrote:
>>>
>>
>> I'm not sure this will be enough, but if you want passthrough to the
>> host device you should use device='lun' here. However, you still would
>> not be able to issue SCSI reservations unless you run QEMU with the
>> CAP_SYS_RAWIO capability (using "").
>>
>
> After change the libvirt xml like this:
>
> I got these errors:
> char device redirected to /dev/pts/1 (label charserial0)
> qemu-system-x86_64: -device
> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
> scsi-block: INQUIRY failed
> qemu-system-x86_64: -device
> scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
> Device 'scsi-block' could not be initialized

Can you do

# ls -l /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk
# sg_inq /dev/VM-IMAGES-BACKUP-DO-NOT-REMOVE/q_disk

?

Paolo
Re: Oracle RAC in libvirt+KVM environment
On Tue, Aug 20, 2013 at 4:33 PM, Paolo Bonzini wrote:
> On 20/08/2013 08:00, Timon Wang wrote:
>>
>
> I'm not sure this will be enough, but if you want passthrough to the
> host device you should use device='lun' here. However, you still would
> not be able to issue SCSI reservations unless you run QEMU with the
> CAP_SYS_RAWIO capability (using "").
>

After change the libvirt xml like this:

I got these errors:
char device redirected to /dev/pts/1 (label charserial0)
qemu-system-x86_64: -device
scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
scsi-block: INQUIRY failed
qemu-system-x86_64: -device
scsi-block,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0:
Device 'scsi-block' could not be initialized

> Most important, it still would be unsafe to do this if the same device
> is passed to multiple virtual machines on the same host. You need to
> have NPIV and create separate virtual HBAs. Then each virtual machine
> should get a separate virtual HBA. Otherwise, persistent reservations
> are not attached to a particular virtual machine, but generically to the
> host.

How to use NPIV virtual HBAs with libvirt xml configurations? I can
define nodedev, but have no idea about how to pass the nodedev to the
vm.

>
>>
>
> You are not exposing a virtio-scsi disk here. You are exposing a
> virtio-blk disk. You can see this from the type='pci' address that
> libvirt gave to the disk.
>
> If you use bus='scsi', you will see that libvirt will use type='drive'
> for the address.
>
>> function='0x0'/>
>
> This is okay.
>
>>
>
> FWIW, this can be replaced with
>
> (you already have the element, but no element inside).

Thanks for this tip.

>
> Paolo
>
>> On 8/19/13, Paolo Bonzini wrote:
>>> On 15/08/2013 12:01, Timon Wang wrote:
>>>> Thanks.
>>>>
>>>> I have read the link you provide, there is another link which tells me
>>>> to pass a NPIV discovery lun as a disk, this is seen as a local direct
>>>> access disk in windows. RAC and Failure Cluster both consider this
>>>> pass through disk as local disk, not a share disk, and the setup
>>>> process failed.
>>>>
>>>> Hyper-v provides a virtual Fiber Channel implementation, so I
>>>> wondering if kvm has the same solution like it.
>>>
>>> Can you include the XML file you are using for the domain?
>>>
>>> Paolo

--
Focus on: Server Vitualization, Network security,Scanner,NodeJS,JAVA,WWW
Blog: http://www.nohouse.net
Re: Multi Queue KVM Support
On 20/08/2013 05:21, Naor Shlomo wrote:
> Hi Paolo,
>
> The host is running CentOS release 6.3 (Final).
> I did "yum upgrade libvirt" and "yum upgrade qemu-kvm" a couple of days ago
> and ended up with these versions.
>
> What do you suggest regarding qemu? compile 6.5 or later myself?

RHEL/CentOS 6.5 is not yet out, it's still a few months before it's released.

You can compile QEMU 1.6 from source, or wait for CentOS to have the feature.

Paolo

> I appreciate your help,
> Naor
>
> -----Original Message-----
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini
> Sent: Monday, August 19, 2013 11:22 PM
> To: Naor Shlomo
> Cc: kvm@vger.kernel.org
> Subject: Re: Multi Queue KVM Support
>
> On 19/08/2013 13:29, Naor Shlomo wrote:
>> Hello experts,
>>
>> I am trying to use the multi queue support on a Linux guest running Kernel
>> 3.9.7.
>>
>> The host's virsh version command reports the following output:
>> Compiled against library: libvirt 0.10.2
>> Using library: libvirt 0.10.2
>> Using API: QEMU 0.10.2
>> Running hypervisor: QEMU 0.12.1
>
> Is it RHEL or CentOS or Scientific Linux, or something else? If RHEL/CentOS,
> what release?
>
>> The problem is that virtio_has_feature(vdev, VIRTIO_NET_F_MQ) returns FALSE
>> and I don't know why.
>
> This version of QEMU is too old. It's possible that 6.5 will have
> multiqueue, but I'm not entirely sure.
>
> Paolo
Re: Oracle RAC in libvirt+KVM environment
On 20/08/2013 08:00, Timon Wang wrote:
>

I'm not sure this will be enough, but if you want passthrough to the
host device you should use device='lun' here. However, you still would
not be able to issue SCSI reservations unless you run QEMU with the
CAP_SYS_RAWIO capability (using "").

Most important, it still would be unsafe to do this if the same device
is passed to multiple virtual machines on the same host. You need to
have NPIV and create separate virtual HBAs. Then each virtual machine
should get a separate virtual HBA. Otherwise, persistent reservations
are not attached to a particular virtual machine, but generically to the
host.

>

You are not exposing a virtio-scsi disk here. You are exposing a
virtio-blk disk. You can see this from the type='pci' address that
libvirt gave to the disk.

If you use bus='scsi', you will see that libvirt will use type='drive'
for the address.

> function='0x0'/>

This is okay.

>

FWIW, this can be replaced with

(you already have the element, but no element inside).

Paolo

> On 8/19/13, Paolo Bonzini wrote:
>> On 15/08/2013 12:01, Timon Wang wrote:
>>> Thanks.
>>>
>>> I have read the link you provide, there is another link which tells me
>>> to pass a NPIV discovery lun as a disk, this is seen as a local direct
>>> access disk in windows. RAC and Failure Cluster both consider this
>>> pass through disk as local disk, not a share disk, and the setup
>>> process failed.
>>>
>>> Hyper-v provides a virtual Fiber Channel implementation, so I
>>> wondering if kvm has the same solution like it.
>>
>> Can you include the XML file you are using for the domain?
>>
>> Paolo
Re: Emulation failure
On 20/08/2013 03:26, Duy Nguyen TN wrote:
> On Mon, 19 Aug 2013 at 11:27 +0200, Paolo Bonzini wrote:
>>> The disassembled code is
>>>
>>>    0x1dd10: push %rbx
>>>    0x1dd11: mov $0x6e,%eax
>>>    0x1dd16: mov %rdi,%rbx
>>>    0x1dd19: sub $0x20,%rsp
>>>    0x1dd1d: test %rdi,%rdi
>>>    0x1dd20: je 0xb1dd92
>>>    0x1dd22: mov 0x4bf1e0(%rip),%eax
>>>    0x1dd28: cmp $0x,%eax
>>>    0x1dd2b: je 0xb1ddd0
>>>    0x1dd31: test %eax,%eax
>>>    0x1dd33: jne 0xb1dd92
>>>    0x1dd35: mov 0xe1f55c(%rip),%rax
>>>    0x1dd3c: cmpq $0x0,0xf0(%rax)
>>>    0x1dd44: fildll 0xf0(%rax)
>>>    0x1dd4a: js 0xb1ddf0
>>>    0x1dd50: mov 0xe1f54a(%rip),%eax
>>>    0x1dd56: mov %rax,-0x80(%rsp)
>>>    0x1dd5b: fildll -0x80(%rsp)
>>>    0x1dd5f: fmulp %st,%st(1)
>>>
>>> Not sure if it helps but rax after 0xb1dd35 contains the pointer to
>>> mmap'd memory of /dev/hpet
>>
>> I think this wouldn't work even with the latest kernel. Emulation of
>> x87 instructions is not supported yet.
>
> I'm confused. How could this program work? It produces similar assembly
> listing

The information you posted is not really enough to get the complete
picture (it is better to grab it from ftrace in the host, or from the
QEMU monitor), but my understanding is that the instruction at 0xb1dd44
doesn't refer to RAM; it refers to a memory-mapped I/O region.

In this case, the instructions are not executed by the processor.
Instead, they are emulated by the hypervisor. KVM does not support
emulation of x87 instructions.

Paolo

> -- 8< --
> #include <stdio.h>
> #include <stdint.h>
>
> uint64_t s_rtcClockPeriod = 10;
> uint64_t mc = 30;
> int main(int ac, char **av)
> {
>     uint64_t value = (uint64_t)((long double)mc *
>                                 (long double)s_rtcClockPeriod /
>                                 10.0L);
>     printf("%lu\n", value);
>     return 0;
> }
> -- 8< --
>
> and the assembly I got is
>
> -- 8< --
> sub $0x18,%rsp
> cmpq $0x0,0x200adc(%rip)
> fildll 0x200ad6(%rip)
> js 0x4005f8
> cmpq $0x0,0x200ac0(%rip)
> fildll 0x200aba(%rip)
> js 0x400612
> fmulp %st,%st(1)
> fdivs 0x1ac(%rip)
> flds 0x1aa(%rip)
> fxch %st(1)
> fucomi %st(1),%st
> jae 0x4005c0
> fstp %st(1)
> fnstcw 0x16(%rsp)
> ...
> -- 8< --
>
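Editorial note: the difference Paolo describes is only where the fildll operand lives. In the test program it is ordinary RAM, so the CPU executes the instruction; in the failing program it is the mmap'd /dev/hpet region, so the hypervisor would have to emulate it. A minimal sketch of one possible workaround, assuming the guest program can be changed, is to read the device register with a plain integer load first and do the floating-point conversion on the local copy. The names mmio_read64() and scale_counter() are hypothetical, and the C standard does not dictate which instruction the compiler selects, so the resulting disassembly should still be checked.

```c
#include <stdint.h>

/* Read a 64-bit device register through a volatile integer pointer.
 * Hypothetical helper: the intent is that the access to the mapped
 * region is an integer load rather than an x87 load from memory. */
static inline uint64_t mmio_read64(const volatile uint64_t *reg)
{
    return *reg;
}

/* Scale a raw counter value by a period and a divisor. Only the local
 * copy `raw` takes part in the long double arithmetic, so any x87
 * memory operand refers to the stack, not to the device mapping. */
static uint64_t scale_counter(const volatile uint64_t *counter_reg,
                              uint64_t period, long double divisor)
{
    uint64_t raw = mmio_read64(counter_reg);
    return (uint64_t)((long double)raw * (long double)period / divisor);
}
```

In the failing program the pointer would be derived from the /dev/hpet mapping; in a test it can point at an ordinary variable, which is enough to check the arithmetic but not the instruction selection.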