Hi Paolo,

On 12/16/2015 08:16 PM, Paolo Bonzini wrote:
>
>
> On 16/12/2015 17:55, Alex Pyrgiotis wrote:
>> Hi all,
>>
>> This patch is an attempt to boost the performance of "scsi-generic" and
>> "scsi-block" device types, by removing an extra data copy and reducing
>> their memory footprint. More specifically, the problem lies in the
>> functions in the `scsi-generic_req_ops` struct of scsi-generic.c. These
>> functions rely on an intermediate buffer to do the SG_IO ioctl request,
>> without checking if the SCSI controller has provided a scatter-gather
>> list with the request.
>>
>> In a nutshell, our proposal is to map the provided scatter-gather list
>> (if any) to the address space of the QEMU process and use the resulting
>> iovec as the buffer for the ioctl request. You'll find that the logic is
>> quite similar to the one used in scsi-disk.c.
>
> Which commands have large payloads and are on the data path, for
> scsi-block? Or is the use case just scsi-generic (e.g. tape devices?)?
>
> (Just trying to understand before I dive into the patches).

Sure, no problem. The commands that have large payloads and are on the
data path are the classic SCSI READ/WRITE commands. Usually, these
commands are implemented with vectored reads/writes, which use the
controller's scatter-gather list. However, when a "scsi-block" device is
opened with the default cache policy (cache=writeback), QEMU falls back
to the "scsi-generic" functions (i.e., SG_IO ioctl requests) for
reading/writing data [1]. In this case, the data is copied into a bounce
buffer, which is the issue that this patch tackles.

Thanks,
Alex

[1]: I'll quote the comment in the code for the rationale behind this
choice:

"If we are not using O_DIRECT, we might read stale data from the host
cache if writes were made using other commands than these ones (such as
WRITE SAME or EXTENDED COPY, etc.). So, without O_DIRECT everything must
go through SG_IO."
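
P.S. To make the idea concrete, here is a minimal sketch of an SG_IO
request header that points at an iovec instead of a single bounce
buffer. It is not taken from the patch: the helper name
fill_sg_io_read10() and its parameters are hypothetical, and it uses
only the plain Linux sg interface from <scsi/sg.h> (sg_io_hdr_t,
sg_iovec_t), not QEMU's own types. The iovec is assumed to be the
already-mapped scatter-gather list.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>
#include <scsi/sg.h>

/*
 * Hypothetical helper (illustration only, not QEMU code): describe a
 * READ(10) whose data lands directly in the caller's iovec, so that no
 * intermediate buffer is involved. The caller provides storage for the
 * CDB and for one sg_iovec_t per iovec element.
 */
static void fill_sg_io_read10(sg_io_hdr_t *hdr, unsigned char cdb[10],
                              sg_iovec_t *sgv, const struct iovec *iov,
                              int iovcnt, unsigned int lba,
                              unsigned short nblocks)
{
    size_t total = 0;
    int i;

    /* iovec_count is an unsigned short in sg_io_hdr_t. */
    assert(iovcnt > 0 && iovcnt <= USHRT_MAX);

    /* Copy the segment descriptors (not the data) into sg's own type. */
    for (i = 0; i < iovcnt; i++) {
        sgv[i].iov_base = iov[i].iov_base;
        sgv[i].iov_len = iov[i].iov_len;
        total += iov[i].iov_len;
    }

    /* READ(10): opcode 0x28, big-endian LBA and transfer length. */
    memset(cdb, 0, 10);
    cdb[0] = 0x28;
    cdb[2] = (lba >> 24) & 0xff;
    cdb[3] = (lba >> 16) & 0xff;
    cdb[4] = (lba >> 8) & 0xff;
    cdb[5] = lba & 0xff;
    cdb[7] = (nblocks >> 8) & 0xff;
    cdb[8] = nblocks & 0xff;

    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';
    hdr->dxfer_direction = SG_DXFER_FROM_DEV;
    hdr->cmd_len = 10;
    hdr->cmdp = cdb;
    /* A non-zero iovec_count makes dxferp an array of sg_iovec_t. */
    hdr->iovec_count = (unsigned short)iovcnt;
    hdr->dxferp = sgv;
    hdr->dxfer_len = (unsigned int)total;
    hdr->timeout = 5000; /* milliseconds; arbitrary value for the sketch */
}
```

The filled header would then be passed to ioctl(fd, SG_IO, &hdr) on the
device's file descriptor. A real caller would also set sbp/mx_sb_len to
receive sense data and check the status fields afterwards.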