On Thu, Sep 06, 2012 at 09:35:04AM +0200, Paolo Bonzini wrote: > > +static int qemu_gluster_open(BlockDriverState *bs, const char *filename, > > + int bdrv_flags) > > +{ > > + BDRVGlusterState *s = bs->opaque; > > + int open_flags = 0; > > + int ret = 0; > > + GlusterURI *uri = g_malloc0(sizeof(GlusterURI)); > > + > > + s->glfs = qemu_gluster_init(uri, filename); > > + if (!s->glfs) { > > + ret = -errno; > > + goto out; > > + } > > + > > + open_flags |= O_BINARY; > > + open_flags &= ~O_ACCMODE; > > + if (bdrv_flags & BDRV_O_RDWR) { > > + open_flags |= O_RDWR; > > + } else { > > + open_flags |= O_RDONLY; > > + } > > + > > + if ((bdrv_flags & BDRV_O_NOCACHE)) { > > + open_flags |= O_DIRECT; > > + } > > + > > + s->fd = glfs_open(s->glfs, uri->image, open_flags); > > + if (!s->fd) { > > + ret = -errno; > > + goto out; > > + } > > + > > + ret = qemu_pipe(s->fds); > > + if (ret < 0) { > > + goto out; > > + } > > + fcntl(s->fds[0], F_SETFL, O_NONBLOCK); > > + fcntl(s->fds[1], F_SETFL, O_NONBLOCK); > > A small thing I noticed while reviewing: since the write end of the pipe > is used from the gluster thread, you do not need to make this nonblocking.
Ok. > > Also, please use GLUSTER_FD_{READ,WRITE} instead. Sure. > > +static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb) > > +{ > > + GlusterAIOCB *acb = (GlusterAIOCB *)blockacb; > > + > > + acb->common.cb(acb->common.opaque, -ECANCELED); > > + acb->canceled = true; > > I think this is wrong, because the write could still complete later and > undo the effects of a second write that is done by the guest. That is: > > gluster QEMU guest > ---------------------------------------------------------- > <--- write #1 > <--- write #1 > <--- cancel write #1 > write #1 canceled ---> > <--- write #2 > <--- write #2 > write #2 completed ---> > write #2 completed --> > write #1 completed ---> > > Now, the persistent storage recorded the effect of write #1, but the > guest thinks that it recorded the effect of write #2 instead. > > You can simply do qemu_aio_flush() here. Based on my discussions with Kevin, I have now followed block/qed.c for aio_cancel which is based on using acb->finished. > > +static int qemu_gluster_send_pipe(BDRVGlusterState *s, GlusterAIOCB *acb) > > +{ > > + int ret = 0; > > + while (1) { > > + fd_set wfd; > > + int fd = s->fds[GLUSTER_FD_WRITE]; > > + > > + ret = write(fd, (void *)&acb, sizeof(acb)); > > + if (ret >= 0) { > > + break; > > + } > > + if (errno == EINTR) { > > + continue; > > + } > > + if (errno != EAGAIN) { > > + break; > > + } > > + > > + FD_ZERO(&wfd); > > + FD_SET(fd, &wfd); > > + do { > > + ret = select(fd + 1, NULL, &wfd, NULL, NULL); > > + } while (ret < 0 && errno == EINTR); > > If you make the fd non-blocking, you can avoid the select here. Will change in in the next iteration. Thanks for taking time to review. Regards, Bharata.