On Fri, Feb 4, 2011 at 4:37 AM, <ronniesahlb...@gmail.com> wrote: > From: Ronnie Sahlberg <ronniesahlb...@gmail.com> > > This patch adds a new block driver : block.iscsi.c > This driver interfaces with the multiplatform posix library > for iscsi initiator/client access to iscsi devices hosted at > git://github.com/sahlberg/libiscsi.git > > The patch adds the driver to interface with the iscsi library. > It also updated the configure script to > * by default, probe is libiscsi is available and if so, build > qemu against libiscsi. > * --enable-libiscsi > Force a build against libiscsi. If libiscsi is not available > the build will fail. > * --disable-libiscsi > Do not link against libiscsi, even if it is available. > > When linked with libiscsi, qemu gains support to access iscsi resources > such as disks and cdrom directly, without having to make the devices visible > to the host. > > You can specify devices using a iscsi url of the form : > iscsi://[<username>%<password>@]<host>[:<port]/<target-iqn-name>/<lun>
>From what I can tell there is no standard iSCSI URI scheme but it's worth checking before we invent our own. I thought that '%' is an escape character in URIs. Are you sure it's valid to use it as a delimeter between the username and password? There is a security risk in passing the password on the QEMU command-line since other users will be able to see it in ps(1) output. Encrypted image files (e.g. qcow2) require keys and this is closest to a mechanism for securely providing the password either via QMP or interactively. That would be worth looking into. > > Example: > -drive file=iscsi://10.1.1.1:3260/iqn.ronnie.test/1 > > -cdrom iscsi://10.1.1.1:3260/iqn.ronnie.test/2 > > Signed-off-by: Ronnie Sahlberg <ronniesahlb...@gmail.com> > --- > Makefile.objs | 1 + > block/iscsi.c | 542 > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > configure | 31 ++++ > trace-events | 7 + > 4 files changed, 581 insertions(+), 0 deletions(-) > create mode 100644 block/iscsi.c > > diff --git a/Makefile.objs b/Makefile.objs > index f1c7bfe..a62fcdd 100644 > --- a/Makefile.objs > +++ b/Makefile.objs > @@ -25,6 +25,7 @@ block-nested-y += qed-check.o > block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o > block-nested-$(CONFIG_WIN32) += raw-win32.o > block-nested-$(CONFIG_POSIX) += raw-posix.o > +block-nested-$(CONFIG_LIBISCSI) += iscsi.o > block-nested-$(CONFIG_CURL) += curl.o > block-nested-$(CONFIG_RBD) += rbd.o > > diff --git a/block/iscsi.c b/block/iscsi.c > new file mode 100644 > index 0000000..8cb1c6f > --- /dev/null > +++ b/block/iscsi.c > @@ -0,0 +1,542 @@ > +/* > + * QEMU Block driver for iSCSI images > + * > + * Copyright (c) 2010 Ronnie Sahlberg <ronniesahlb...@gmail.com> > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > copy > + * of this software and associated documentation files (the "Software"), to > deal > + * in the Software without restriction, including without limitation the > rights > + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > + * copies of the Software, and to permit persons to whom the Software is > + * furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > FROM, > + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN > + * THE SOFTWARE. > + */ > + > +#include "config-host.h" > + > +#include <poll.h> > +#include "sysemu.h" > +#include "qemu-common.h" > +#include "qemu-error.h" > +#include "block_int.h" > +#include "trace.h" > + > +#include <iscsi/iscsi.h> > +#include <iscsi/scsi-lowlevel.h> > + > + > +typedef struct IscsiLun { > + struct iscsi_context *iscsi; > + int lun; > + int block_size; > + unsigned long num_blocks; > +} IscsiLun; > + > +typedef struct IscsiAIOCB { > + BlockDriverAIOCB common; > + QEMUIOVector *qiov; > + QEMUBH *bh; > + IscsiLun *iscsilun; > + uint8_t *buf; > + int canceled; > + int status; > + size_t read_size; > +} IscsiAIOCB; > + > +struct IscsiTask { > + IscsiLun *iscsilun; > + int status; > + int complete; > +}; > + > +static int > +iscsi_is_inserted(BlockDriverState *bs) > +{ > + IscsiLun *iscsilun = bs->opaque; > + struct iscsi_context *iscsi = iscsilun->iscsi; > + > + return iscsi_is_logged_in(iscsi); > +} Why is this needed? > + > + > +static void > +iscsi_aio_cancel(BlockDriverAIOCB *blockacb) > +{ > + IscsiAIOCB *acb = (IscsiAIOCB *)blockacb; > + > + acb->status = -EIO; -ECANCELED? > +static void > +iscsi_readv_writev_bh_cb(void *p) > +{ > + IscsiAIOCB *acb = p; > + > + qemu_bh_delete(acb->bh); > + > + acb->common.cb(acb->common.opaque, acb->status); I think we should double-check that acb->status != -ECANCELED here just in case we scheduled the BH but someone cancelled the request before the BH was executed. This would be a really odd case but I think it's theoretically possible. > +static BlockDriverAIOCB * > +iscsi_aio_writev(BlockDriverState *bs, int64_t sector_num, > + QEMUIOVector *qiov, int nb_sectors, > + BlockDriverCompletionFunc *cb, > + void *opaque) > +{ > + IscsiLun *iscsilun = bs->opaque; > + struct iscsi_context *iscsi = iscsilun->iscsi; > + IscsiAIOCB *acb; > + size_t size; > + int ret; > + > + acb = qemu_aio_get(&iscsi_aio_pool, bs, cb, opaque); > + trace_iscsi_aio_writev(iscsi, sector_num, nb_sectors, opaque, acb); > + if (!acb) { > + return NULL; > + } > + > + acb->iscsilun = iscsilun; > + acb->qiov = qiov; > + > + acb->canceled = 0; > + > + /* XXX we should pass the iovec to write10 to avoid the extra copy */ > + /* this will allow us to get rid of 'buf' completely */ > + size = nb_sectors * BDRV_SECTOR_SIZE; > + acb->buf = qemu_malloc(size); > + qemu_iovec_to_buffer(acb->qiov, acb->buf); > + ret = iscsi_write10_async(iscsi, iscsilun->lun, acb->buf, size, > + sector_qemu2lun(sector_num, iscsilun), > + 0, 0, iscsilun->block_size, > + iscsi_aio_write10_cb, acb); We're not obeying LUN block size here. In order to do that we need read-write-modify. See comments on iscsi_aio_readv() below. > + if (ret != 0) { > + error_report("iSCSI: Failed to send write10 command. %s", > + iscsi_get_error(iscsi)); > + qemu_free(acb->buf); > + qemu_aio_release(acb); > + return NULL; > + } > + > + iscsi_set_events(iscsilun); > + > + return &acb->common; > +} > + > +static void > +iscsi_aio_read10_cb(struct iscsi_context *iscsi, int status, > + void *command_data, void *opaque) > +{ > + IscsiAIOCB *acb = opaque; > + struct scsi_task *scsi = command_data; > + > + trace_iscsi_aio_read10_cb(iscsi, status, acb, acb->canceled); > + > + if (acb->canceled != 0) { > + qemu_aio_release(acb); > + return; > + } > + > + acb->status = 0; > + if (status < 0) { > + error_report("Failed to read10 data from iSCSI lun. %s", > + iscsi_get_error(iscsi)); > + acb->status = -EIO; > + } else { > + qemu_iovec_from_buffer(acb->qiov, scsi->datain.data, acb->read_size); Need to offset from sector_num rounded down to LUN block size. > + } > + > + iscsi_schedule_bh(iscsi_readv_writev_bh_cb, acb); > +} > + > +static BlockDriverAIOCB * > +iscsi_aio_readv(BlockDriverState *bs, int64_t sector_num, > + QEMUIOVector *qiov, int nb_sectors, > + BlockDriverCompletionFunc *cb, > + void *opaque) > +{ > + IscsiLun *iscsilun = bs->opaque; > + struct iscsi_context *iscsi = iscsilun->iscsi; > + IscsiAIOCB *acb; > + size_t qemu_read_size, lun_read_size; > + int ret; > + > + qemu_read_size = BDRV_SECTOR_SIZE * (size_t)nb_sectors; > + lun_read_size = (qemu_read_size + iscsilun->block_size - 1) > + / iscsilun->block_size * iscsilun->block_size; > + > + acb = qemu_aio_get(&iscsi_aio_pool, bs, cb, opaque); > + trace_iscsi_aio_readv(iscsi, sector_num, nb_sectors, opaque, acb); > + if (!acb) { > + return NULL; > + } > + > + acb->iscsilun = iscsilun; > + acb->qiov = qiov; > + > + acb->canceled = 0; > + acb->read_size = qemu_read_size; > + acb->buf = NULL; > + > + ret = iscsi_read10_async(iscsi, iscsilun->lun, > + sector_qemu2lun(sector_num, iscsilun), > + lun_read_size, iscsilun->block_size, > + iscsi_aio_read10_cb, acb); lba and datalen seem incorrect here. lun_read_size has been rounded up to the next LUN block. But sector_num might not be LUN block aligned so we need to extend lun_read_size even further in order to round down lba to LUN block size. The cleanest way to support this would be for BlockDriverState to supply a block size but we currently keep that separate in BlockConf and have it set from device emulation (e.g. scsi-disk.c) code. So in the near term we need to emulated BDRV_SECTOR_SIZE accesses. > + if (ret != 0) { > + error_report("iSCSI: Failed to send read10 command. %s", > + iscsi_get_error(iscsi)); > + qemu_aio_release(acb); > + return NULL; > + } > + > + iscsi_set_events(iscsilun); > + > + return &acb->common; > +} > + > + > +static void synccache10_cb(struct iscsi_context *iscsi, int status, > + void *command_data, void *opaque) > +{ > + struct IscsiTask *task = opaque; > + > + task->status = status; > + task->complete = 1; > +} > + > +static int > +iscsi_flush(BlockDriverState *bs) > +{ > + IscsiLun *iscsilun = bs->opaque; > + struct iscsi_context *iscsi = iscsilun->iscsi; > + struct IscsiTask task; > + int ret; > + > + task.iscsilun = iscsilun; > + task.status = 0; > + task.complete = 0; > + > + ret = iscsi_synchronizecache10_async(iscsi, iscsilun->lun, > + 0, 0, 0, 0, > + synccache10_cb, > + &task); > + if (ret != 0) { > + error_report("iSCSI: Failed to send SYNCHRONIZECACHE10. %s", > + iscsi_get_error(iscsi)); > + return -1; Please use an errno here since this function returns a negative errno value. > +static BlockDriver bdrv_iscsi = { > + .format_name = "iscsi", > + .protocol_name = "iscsi", > + > + .instance_size = sizeof(IscsiLun), > + .bdrv_file_open = iscsi_open, > + .bdrv_close = iscsi_close, > + .bdrv_flush = iscsi_flush, > + > + .bdrv_getlength = iscsi_getlength, > + > + .bdrv_aio_readv = iscsi_aio_readv, > + .bdrv_aio_writev = iscsi_aio_writev, Please implement .bdrv_aio_flush. Synchronous functions like .bdrv_flush will block the VM so asynchronous functions are much preferred. Stefan