Re: [PATCH RESEND 0/5] Add vhost-blk support
Hi Christoph,

On 07/14/2012 03:49 PM, Christoph Hellwig wrote:
> Please send a version that does direct block I/O similar to xen-blkback
> for now.

It seems xen-blkback converts the guest I/O requests to host bios and submits them directly. I was wondering whether this gives a performance gain compared to the AIO implementation.

> If we get proper in-kernel aio support one day you can add back file
> backend support.

I talked with Dave and Zack about the in-kernel aio patch which James pointed out:

http://marc.info/?l=linux-fsdevel&m=133312234313122

Dave will post a new version soon. I will wait for it.

-- 
Asias
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 0/5] Add vhost-blk support
On Thu, Jul 12, 2012 at 4:35 PM, Asias He <as...@redhat.com> wrote:
> This patchset adds vhost-blk support. vhost-blk is an in-kernel virtio-blk
> device accelerator. Compared to userspace virtio-blk implementation,
> vhost-blk gives about 5% to 15% performance improvement.

Why is it 5-15% faster? vhost-blk and the userspace virtio-blk you benchmarked should be doing basically the same thing:

1. An eventfd file descriptor is signalled when the vring has new requests available from the guest.
2. A thread wakes up and processes the virtqueue.
3. Linux AIO is used to issue host I/O.
4. An interrupt is injected into the guest.

Does the vhost-blk implementation do anything fundamentally different from userspace? Where is the overhead that userspace virtio-blk has?

I'm asking because it would be beneficial to fix the overhead (especially if that could speed up all userspace applications) instead of adding a special-purpose kernel module to work around the overhead.

Stefan
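As a rough illustration of steps 1 and 2 above, the following C sketch shows how an eventfd coalesces several notifications into a single wakeup. It is not code from vhost-blk or from any userspace virtio-blk implementation; the function name `kick_and_drain` is hypothetical, and the sketch assumes a Linux host where `eventfd(2)` is available.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: signal an eventfd `kicks` times (standing in for
 * guest notifications), then drain it with one read (standing in for the
 * worker thread waking up). Returns the number of coalesced kicks, 0 if
 * nothing was pending, or -1 on error.
 */
long kick_and_drain(unsigned int kicks)
{
	uint64_t one = 1;
	uint64_t pending = 0;
	unsigned int i;
	int fd = eventfd(0, EFD_NONBLOCK);

	if (fd < 0)
		return -1;

	/* Each write adds 1 to the eventfd counter. */
	for (i = 0; i < kicks; i++) {
		if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
			close(fd);
			return -1;
		}
	}

	/* One read returns the whole counter and resets it to zero. */
	if (read(fd, &pending, sizeof(pending)) != (ssize_t)sizeof(pending)) {
		int err = errno;

		close(fd);
		/* EAGAIN means the counter was zero: no kicks pending. */
		return err == EAGAIN ? 0 : -1;
	}

	close(fd);
	return (long)pending;
}
```

Because the counter is drained in one read, the notification mechanism itself behaves the same whether the reader is a kernel thread or a userspace thread, which is part of why the question about where the 5-15% comes from is a fair one.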
[PATCH 0/2] virtio-scsi spec: event improvements
This makes some changes to the virtio-scsi event specification, so that it is now possible to use virtio-scsi events in the implementation of the QEMU block_resize command.

Thanks to Cong Meng for finally implementing virtio-scsi hotplug, which made me look at block_resize again!

Paolo Bonzini (2):
  virtio-scsi spec: unify event structs
  virtio-scsi spec: add configuration change event

 virtio-spec.lyx | 164 +--
 1 file changed, 160 insertions(+), 4 deletions(-)

-- 
1.7.10.4
[PATCH 1/2] virtio-scsi spec: unify event structs
All currently defined event structs have the same fields. Simplify the driver by enforcing this also for future structs.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 virtio-spec.lyx | 69 +++
 1 file changed, 65 insertions(+), 4 deletions(-)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index 905e619..f8b214b 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -8207,7 +8207,20 @@ struct virtio_scsi_event {

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440791
+
 ...
+\change_inserted 1531152142 1342440791
+u8 lun[8];
+\end_layout
+
+\begin_layout Plain Layout
+
+\change_inserted 1531152142 1342440791
+
+u32 reason;
+\change_unchanged
+
 \end_layout

 \begin_layout Plain Layout
@@ -8221,16 +8234,32 @@ struct virtio_scsi_event {
 \end_layout

 \begin_layout Standard
-If bit 31 is set in the event field, the device failed to report an event
- due to missing buffers.
+If bit 31 is set in the
+\series bold
+event
+\series default
+ field, the device failed to report an event due to missing buffers.
  In this case, the driver should poll the logical units for unit attention
  conditions, and/or do whatever form of bus scan is appropriate for the
  guest operating system.
 \end_layout

 \begin_layout Standard
-Other data that the device writes to the buffer depends on the contents
- of the event field.
+
+\change_deleted 1531152142 1342440830
+Other data that the device writes to the buffer
+\change_inserted 1531152142 1342440839
+The meaning of the
+\series bold
+reason
+\series default
+ field
+\change_unchanged
+ depends on the contents of the
+\series bold
+event
+\series default
+ field.
  The following events are defined:
 \end_layout

@@ -8312,36 +8341,50 @@ status open

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 struct virtio_scsi_event_reset {
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 // Write-only part
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 u32 event;
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 u8 lun[8];
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 u32 reason;
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 }
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440799
+
 \end_layout

 \begin_layout Plain Layout
@@ -8542,40 +8585,58 @@ status open
 \begin_layout Plain Layout

 #define VIRTIO_SCSI_T_ASYNC_NOTIFY 2
+\change_deleted 1531152142 1342440854
+
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 struct virtio_scsi_event_an {
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 // Write-only part
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 u32 event;
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 u8 lun[8];
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 u32 reason;
 \end_layout

 \begin_layout Plain Layout

+\change_deleted 1531152142 1342440854
+
 }
+\change_unchanged
+
 \end_layout

 \end_inset
-- 
1.7.10.4
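For readers who prefer C to LyX markup, the unified layout this patch describes can be sketched as below. The field names and order (event, lun[8], reason) and the bit 31 rule come from the spec text in the patch; the macro name `VIRTIO_SCSI_T_EVENTS_MISSED` and the two helper functions are assumptions made for illustration, and byte-order conversion is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed name for "bit 31 of the event field"; the patch text only
 * describes the bit, it does not name it. */
#define VIRTIO_SCSI_T_EVENTS_MISSED 0x80000000u
#define VIRTIO_SCSI_T_ASYNC_NOTIFY  2

/* Single event layout shared by all event types after this patch. */
struct virtio_scsi_event {
	/* Write-only part (filled in by the device) */
	uint32_t event;
	uint8_t  lun[8];
	uint32_t reason;
};

/* True if the device dropped events because no buffers were posted. */
bool virtio_scsi_events_missed(const struct virtio_scsi_event *ev)
{
	return (ev->event & VIRTIO_SCSI_T_EVENTS_MISSED) != 0;
}

/* Event type with the "missed" flag masked off (hypothetical helper). */
uint32_t virtio_scsi_event_type(const struct virtio_scsi_event *ev)
{
	return ev->event & ~VIRTIO_SCSI_T_EVENTS_MISSED;
}
```

With one struct, a driver can test the missed-events bit first and then switch on the event type, interpreting reason per type, rather than casting the buffer to a different struct for each event.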
[PATCH 2/2] virtio-scsi spec: add configuration change event
This adds an event for changes to LUN parameters, for example capacity. These are reported in virtio-blk via configuration changes, and we want a similar functionality in virtio-scsi too.

There is no list of supported parameter changes, instead we just refer to the list of sense codes in the SCSI specification.

This event will usually be serviced in one of three ways:
1) call an OS service to revalidate the disk, either always or only for some specific sense codes;
2) somehow pass the sense directly to the upper-level driver;
3) inject a TEST UNIT READY command into the upper-level device, so that the OS will see the unit attention code and react.

Of course a mix of the three is also possible, depending on how the driver writer prefers to have his layering violations served.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 virtio-spec.lyx | 95 +++
 1 file changed, 95 insertions(+)

diff --git a/virtio-spec.lyx b/virtio-spec.lyx
index f8b214b..8d2ac9a 100644
--- a/virtio-spec.lyx
+++ b/virtio-spec.lyx
@@ -6995,6 +6995,21 @@ VIRTIO_SCSI_F_HOTPLUG

 (1) The host should enable hot-plug/hot-unplug of new LUNs and targets on
  the SCSI bus.
+\change_inserted 1531152142 1342440342
+
+\end_layout
+
+\begin_layout Description
+
+\change_inserted 1531152142 1342440768
+VIRTIO_SCSI_F_CHANGE
+\begin_inset space ~
+\end_inset
+
+(2) The host will report changes to LUN parameters via a VIRTIO_SCSI_T_PARAM_CHA
+NGE event.
+\change_unchanged
+
 \end_layout

 \end_deeper
@@ -8673,6 +8688,86 @@ reason
 \begin_layout Standard
 When dropped events are reported, the driver should poll for asynchronous
  events manually using SCSI commands.
+\change_inserted 1531152142 1342439104
+
+\end_layout
+
+\end_deeper
+\begin_layout Description
+
+\change_inserted 1531152142 1342440778
+LUN
+\begin_inset space ~
+\end_inset
+
+parameter
+\begin_inset space ~
+\end_inset
+
+change
+\begin_inset space ~
+\end_inset
+
+
+\begin_inset Newline newline
+\end_inset
+
+
+\begin_inset listings
+inline false
+status open
+
+\begin_layout Plain Layout
+
+\change_inserted 1531152142 1342440783
+
+#define VIRTIO_SCSI_T_PARAM_CHANGE 3
+\end_layout
+
+\end_inset
+
+
+\end_layout
+
+\begin_deeper
+\begin_layout Standard
+
+\change_inserted 1531152142 1342440882
+By sending this event, the device signals that the configuration parameters
+ (for example the capacity) of a logical unit have changed.
+ The
+\series bold
+event
+\series default
+ field is set to VIRTIO_SCSI_T_PARAM_CHANGE.
+ The
+\series bold
+lun
+\series default
+ field addresses a logical unit in the SCSI host.
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1342440916
+The same event is also reported as a unit attention condition.
+ The
+\series bold
+reason
+\series default
+ field contains the additional sense code and additional sense code qualifier,
+ respectively in bits 0..7 and 8..15.
+ For example, a change in capacity will be reported as asc 0x2a, ascq 0x09
+ (CAPACITY DATA HAS CHANGED).
+\end_layout
+
+\begin_layout Standard
+
+\change_inserted 1531152142 1342442803
+For MMC devices (inquiry type 5) there would be some overlap between this
+ event and the asynchronous notification event.
+ For simplicity, as of this version of the specification the host must
+ never report this event for MMC devices.
 \end_layout

 \end_deeper
-- 
1.7.10.4
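The bit layout of the reason field for this event can be made concrete with a small decoder. This is only a sketch of the packing described in the patch (asc in bits 0..7, ascq in bits 8..15); the struct `sense_pair` and both function names are hypothetical, and the sketch assumes reason has already been converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_SCSI_T_PARAM_CHANGE 3

/* Hypothetical container for the two sense bytes carried in "reason". */
struct sense_pair {
	uint8_t asc;   /* additional sense code, bits 0..7 */
	uint8_t ascq;  /* additional sense code qualifier, bits 8..15 */
};

/* Split the reason field of a PARAM_CHANGE event into asc/ascq. */
struct sense_pair param_change_sense(uint32_t reason)
{
	struct sense_pair s;

	s.asc = (uint8_t)(reason & 0xff);
	s.ascq = (uint8_t)((reason >> 8) & 0xff);
	return s;
}

/* asc 0x2a / ascq 0x09 is CAPACITY DATA HAS CHANGED per the patch text. */
bool is_capacity_change(uint32_t reason)
{
	struct sense_pair s = param_change_sense(reason);

	return s.asc == 0x2a && s.ascq == 0x09;
}
```

A driver taking option 1 from the commit message might call its disk revalidation path only when `is_capacity_change()` returns true, and fall back to passing the sense through for other codes.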
Re: [Ksummit-2012-discuss] SCSI Performance regression [was Re: [PATCH 0/6] tcm_vhost/virtio-scsi WIP code for-3.6]
On Fri, 6 Jul 2012, James Bottomley wrote:
> What people might pay attention to is evidence that there's a problem in
> 3.5-rc6 (without any OFED crap). If you're not going to bother
> investigating, it has to be in an environment they can reproduce (so
> ordinary hardware, not infiniband) otherwise it gets ignored as an
> esoteric hardware issue.

The OFED stuff in the meantime is part of 3.5-rc6. Infiniband has been supported for a long time and it's a very important technology given the problematic nature of ethernet at high network speeds.

OFED crap exists for those running RHEL5/6. The new enterprise distros are based on the 3.2 kernel which has pretty good Infiniband support out of the box.
Re: [PATCH-v2] virtio-scsi: Add vdrv->scan for post VIRTIO_CONFIG_S_DRIVER_OK LUN scanning
On Wed, 2012-07-11 at 21:22 +0000, Nicholas A. Bellinger wrote:
> From: Nicholas Bellinger <n...@linux-iscsi.org>
>
> This patch changes virtio-scsi to use a new virtio_driver->scan() callback
> so that scsi_scan_host() can be properly invoked once virtio_dev_probe() has
> set add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK) to signal active virtio-ring
> operation, instead of from within virtscsi_probe().
>
> This fixes a bug where SCSI LUN scanning for both virtio-scsi-raw and
> virtio-scsi/tcm_vhost setups was happening before VIRTIO_CONFIG_S_DRIVER_OK
> had been set, causing VIRTIO_SCSI_S_BAD_TARGET to occur. This fixes a bug
> with virtio-scsi/tcm_vhost where LUN scan was not detecting LUNs.
>
> Tested with virtio-scsi-raw + virtio-scsi/tcm_vhost w/ IBLOCK on 3.5-rc2 code.
>
> (nab: Fix up minor apply fuzz against scsi.git/misc)
>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
> Cc: Zhi Yong Wu <wu...@cn.ibm.com>
> Cc: Christoph Hellwig <h...@lst.de>
> Cc: Hannes Reinecke <h...@suse.de>
> Cc: James Bottomley <jbottom...@parallels.com>
> Signed-off-by: Nicholas Bellinger <n...@linux-iscsi.org>

Was the change so great that it needs re-acking? I assume it also now no longer applies to stable because it will reject?

James
Re: [Xen-devel] [PATCH] xen: populate correct number of pages when across mem boundary
On 04/07/12 07:49, zhenzhong.duan wrote:
> When populating pages across a mem boundary at bootup, the populated page
> count isn't correct. This is due to mem being populated to a non-mem region
> and ignored.
>
> The pfn range is also wrongly aligned when the mem boundary isn't page
> aligned.
>
> We also need to consider the rare case when xen_do_chunk() fails (populate).
>
> For a dom0 booted with dom0_mem=3368952K (0xcd9ff000-4k) the dmesg diff is:
>  [    0.000000] Freeing 9e-100 pfn range: 98 pages freed
>  [    0.000000] 1-1 mapping on 9e-100
>  [    0.000000] 1-1 mapping on cd9ff-100000
>  [    0.000000] Released 98 pages of unused memory
>  [    0.000000] Set 206435 page(s) to 1-1 mapping
> -[    0.000000] Populating cd9fe-cda00 pfn range: 1 pages added
> +[    0.000000] Populating cd9fe-cd9ff pfn range: 1 pages added
> +[    0.000000] Populating 100000-100061 pfn range: 97 pages added
>  [    0.000000] BIOS-provided physical RAM map:
>  [    0.000000] Xen: 0000000000000000 - 000000000009e000 (usable)
>  [    0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
>  [    0.000000] Xen: 0000000000100000 - 00000000cd9ff000 (usable)
>  [    0.000000] Xen: 00000000cd9ffc00 - 00000000cda53c00 (ACPI NVS)
>  ...
>  [    0.000000] Xen: 0000000100000000 - 0000000100061000 (usable)
>  [    0.000000] Xen: 0000000100061000 - 000000012c000000 (unusable)
>  ...
>  [    0.000000] MEMBLOCK configuration:
>  ...
> -[    0.000000] reserved[0x4] [0x000000cd9ff000-0x000000cd9ffbff], 0xc00 bytes
> -[    0.000000] reserved[0x5] [0x00000100000000-0x00000100060fff], 0x61000 bytes
>
> Related xen memory layout:
> (XEN) Xen-e820 RAM map:
> (XEN)  0000000000000000 - 000000000009ec00 (usable)
> (XEN)  00000000000f0000 - 0000000000100000 (reserved)
> (XEN)  0000000000100000 - 00000000cd9ffc00 (usable)
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@oracle.com>
> ---
>  arch/x86/xen/setup.c | 24 +++-
>  1 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index a4790bf..bd78773 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -157,50 +157,48 @@ static unsigned long __init xen_populate_chunk(
>  	unsigned long dest_pfn;
>
>  	for (i = 0, entry = list; i < map_size; i++, entry++) {
> -		unsigned long credits = credits_left;
>  		unsigned long s_pfn;
>  		unsigned long e_pfn;
>  		unsigned long pfns;
>  		long capacity;
>
> -		if (credits <= 0)
> +		if (credits_left <= 0)
>  			break;
>
>  		if (entry->type != E820_RAM)
>  			continue;
>
> -		e_pfn = PFN_UP(entry->addr + entry->size);
> +		e_pfn = PFN_DOWN(entry->addr + entry->size);

Ok.

>
>  		/* We only care about E820 after the xen_start_info->nr_pages */
>  		if (e_pfn <= max_pfn)
>  			continue;
>
> -		s_pfn = PFN_DOWN(entry->addr);
> +		s_pfn = PFN_UP(entry->addr);

Ok.

>  		/* If the E820 falls within the nr_pages, we want to start
>  		 * at the nr_pages PFN.
>  		 * If that would mean going past the E820 entry, skip it
>  		 */
> +again:
>  		if (s_pfn <= max_pfn) {
>  			capacity = e_pfn - max_pfn;
>  			dest_pfn = max_pfn;
>  		} else {
> -			/* last_pfn MUST be within E820_RAM regions */
> -			if (*last_pfn && e_pfn >= *last_pfn)
> -				s_pfn = *last_pfn;
>  			capacity = e_pfn - s_pfn;
>  			dest_pfn = s_pfn;
>  		}
> -		/* If we had filled this E820_RAM entry, go to the next one. */
> -		if (capacity <= 0)
> -			continue;
>
> -		if (credits > capacity)
> -			credits = capacity;
> +		if (credits_left < capacity)
> +			capacity = credits_left;
>
> -		pfns = xen_do_chunk(dest_pfn, dest_pfn + credits, false);
> +		pfns = xen_do_chunk(dest_pfn, dest_pfn + capacity, false);
>  		done += pfns;
>  		credits_left -= pfns;
>  		*last_pfn = (dest_pfn + pfns);
> +		if (credits_left > 0 && *last_pfn < e_pfn) {
> +			s_pfn = *last_pfn;
> +			goto again;
> +		}

This looks like it will loop forever if xen_do_chunk() repeatedly fails because Xen is out of pages.

I think if xen_do_chunk() cannot get a page from Xen the repopulation process should stop -- aborting this chunk and any others. This will allow the guest to continue to boot just with less memory than expected.

David
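Two points from this review can be sketched in standalone C: how PFN_UP and PFN_DOWN differ on a boundary that is not page aligned, and how a retry loop can stop once the hypervisor hands back no pages. This is not the kernel code and not a proposed replacement patch; `struct fake_xen`, `fake_do_chunk()` and `populate_range()` are hypothetical stand-ins for Xen and xen_do_chunk(), and 4 KiB pages are assumed.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
/* Round a byte address up or down to a page frame number. */
#define PFN_UP(x)   (((uint64_t)(x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((uint64_t)(x) >> PAGE_SHIFT)

/* Hypothetical model of the hypervisor's free page pool. */
struct fake_xen {
	unsigned long pages_left;    /* pages it can still hand out */
	unsigned long max_per_call;  /* partial success: cap per request */
};

/* Stand-in for xen_do_chunk(): returns how many pages were populated. */
unsigned long fake_do_chunk(struct fake_xen *xen,
			    unsigned long start, unsigned long end)
{
	unsigned long got = end - start;

	if (got > xen->max_per_call)
		got = xen->max_per_call;
	if (got > xen->pages_left)
		got = xen->pages_left;
	xen->pages_left -= got;
	return got;
}

/*
 * Populate [s_pfn, e_pfn) using at most `credits` pages. Retries after a
 * partial success, but stops as soon as a call makes no progress, so it
 * cannot spin when the hypervisor is out of pages.
 */
unsigned long populate_range(struct fake_xen *xen, unsigned long s_pfn,
			     unsigned long e_pfn, unsigned long credits)
{
	unsigned long done = 0;

	while (credits > 0 && s_pfn < e_pfn) {
		unsigned long capacity = e_pfn - s_pfn;
		unsigned long pfns;

		if (capacity > credits)
			capacity = credits;

		pfns = fake_do_chunk(xen, s_pfn, s_pfn + capacity);
		if (pfns == 0)
			break;	/* no progress: abort instead of retrying */

		done += pfns;
		credits -= pfns;
		s_pfn += pfns;
	}
	return done;
}
```

With the Xen e820 boundary 0xcd9ffc00 from the log, PFN_DOWN gives cd9ff and PFN_UP gives cda00, which is consistent with the "cd9fe-cd9ff" versus "cd9fe-cda00" lines in the dmesg diff. A caller that sees `populate_range()` return fewer pages than requested could then skip the remaining E820 entries, which is the behaviour suggested in the review.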
Re: [PATCH 2/2] virtio-scsi spec: add configuration change event
On Mon, 16 Jul 2012 16:24:37 +0200, Paolo Bonzini <pbonz...@redhat.com> wrote:
> This adds an event for changes to LUN parameters, for example capacity.
> These are reported in virtio-blk via configuration changes, and we want
> a similar functionality in virtio-scsi too.

Both applied.

Thanks!
Rusty.