Re: [kvm-devel] More virtio users
On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote: Rusty Russell wrote: Lguest doesn't have a framebuffer, so maybe this is a good thing for me to hack on, but I promised myself I'd finish NAPI for the net device, and tag for block device first. If you're touching the block device, passing a request's io priority to the host can be useful. OK, here's the interdiff. I still don't handle non-fs requests, but I haven't seen any yet. I should probably BUG_ON() there and wait for Jens to scream... Changes: 1) Make virtio_blk.h userspace-friendly. 2) /dev/vbN - /dev/vdN 3) Ordered tags, handed thru to other end. 4) Hand ioprio to other end, too. diff -u b/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c --- b/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.cSun Jun 10 22:09:10 2007 +1000 @@ -33,18 +33,19 @@ struct virtio_blk_inhdr in_hdr; }; -/* Jens gave me this nice helper to end all chunks of a request. */ -static void end_dequeued_request(struct request *req, int uptodate) +static void end_tagged_request(struct request *req, + request_queue_t *q, int uptodate) { if (end_that_request_first(req, uptodate, req-hard_nr_sectors)) BUG(); add_disk_randomness(req-rq_disk); + blk_queue_end_tag(q, req); end_that_request_last(req, uptodate); } static void finish(struct virtio_blk *vblk, struct virtblk_req *vbr) { - end_dequeued_request(vbr-req, !vbr-failed); + end_tagged_request(vbr-req, vblk-disk-queue, !vbr-failed); list_del(vbr-list); mempool_free(vbr, vblk-pool); /* In case queue is stopped waiting for more buffers. */ @@ -120,7 +121,7 @@ goto detach_inbuf_full; pr_debug(Write: %p in=%lu out=%lu\n, vbr, -vbr-out_hdr.id, vbr-out_id); +(long)vbr-out_hdr.id, (long)vbr-out_id); list_add_tail(vbr-list, vblk-reqs); return true; @@ -157,7 +158,7 @@ goto detach_inbuf_full; pr_debug(Read: %p in=%lu out=%lu\n, vbr, -vbr-out_hdr.id, vbr-out_id); +(long)vbr-out_hdr.id, (long)vbr-out_id); list_add_tail(vbr-list, vblk-reqs); return true; @@ -178,10 +179,9 @@ /* FIXME: handle these iff capable. */ if (!blk_fs_request(req)) { - pr_debug(Got non-command 0x%08x\n, req-cmd_type); + printk(Got non-command 0x%08x\n, req-cmd_type); req-errors++; - blkdev_dequeue_request(req); - end_dequeued_request(req, 0); + end_tagged_request(req, vblk-disk-queue, 0); continue; } @@ -193,6 +193,8 @@ vbr-req = req; vbr-out_hdr.type = rq_data_dir(req); vbr-out_hdr.sector = req-sector; + vbr-out_hdr.tag = req-tag; + vbr-out_hdr.ioprio = req-ioprio; if (rq_data_dir(req) == WRITE) { if (!do_write(q, vblk, vbr)) @@ -201,7 +203,6 @@ if (!do_read(q, vblk, vbr)) goto stop; } - blkdev_dequeue_request(req); } sync: @@ -261,16 +262,25 @@ goto out_put_disk; } - sprintf(vblk-disk-disk_name, vb%c, virtblk_index++); + sprintf(vblk-disk-disk_name, vd%c, virtblk_index++); vblk-disk-major = major; vblk-disk-first_minor = 0; vblk-disk-private_data = vblk; vblk-disk-fops = virtblk_fops; + err = blk_queue_init_tags(vblk-disk-queue, 100 /* FIXME */, NULL); + if (err) + goto out_cleanup_queue; + + blk_queue_ordered(vblk-disk-queue, QUEUE_ORDERED_TAG, NULL); + blk_queue_prep_rq(vblk-disk-queue, blk_queue_start_tag); + /* Caller can do blk_queue_max_hw_segments(), set_capacity() * etc then add_disk(). */ return vblk-disk; +out_cleanup_queue: + blk_cleanup_queue(vblk-disk-queue); out_put_disk: put_disk(vblk-disk); out_unregister_blkdev: diff -u b/include/linux/virtio_blk.h b/include/linux/virtio_blk.h --- b/include/linux/virtio_blk.h +++ b/include/linux/virtio_blk.hSun Jun 10 22:09:10 2007 +1000 @@ -3,26 +3,31 @@ #include linux/types.h -struct gendisk; -struct virtio_device; -struct hd_geometry; /* This is the first element of the scatter-gather list. */ struct virtio_blk_outhdr { /* 0 == read, 1 == write */ - u32 type; + __u32 type; + /* Ordered tag. */ + __u16 tag; + /* Linux's ioprio. */ + __u16 ioprio; /* Sector (ie. 512 byte offset) */ - unsigned long sector; + __u64 sector; /* Where to put reply. */ - unsigned long id; + __u64 id; }; struct virtio_blk_inhdr { /* 1 = OK, 0
Re: [kvm-devel] More virtio users
On Mon, 2007-06-11 at 08:41 +0200, Jens Axboe wrote: On Sun, Jun 10 2007, Rusty Russell wrote: On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote: Rusty Russell wrote: Lguest doesn't have a framebuffer, so maybe this is a good thing for me to hack on, but I promised myself I'd finish NAPI for the net device, and tag for block device first. If you're touching the block device, passing a request's io priority to the host can be useful. OK, here's the interdiff. I still don't handle non-fs requests, but I haven't seen any yet. I should probably BUG_ON() there and wait for Jens to scream... Ehm no, that would certainly cause screaming :-) Checking for blk_fs_request() and terminating the request if you don't know how to handle it is the correct thing to do, a BUG() would definitely not be. So the problem is that I'd like to handle all of them, but I'm not clear what requests my device can get. I can't see a source of any other type of request. + blk_queue_prep_rq(vblk-disk-queue, blk_queue_start_tag); + is quite a novel way, I actually had to look that code up to check whether it was correct. I'd much prefer a little wrapper around that, OK, but I got it from the blk_queue_start_tag documentation: * Description: *This can either be used as a stand-alone helper, or possibly be *assigned as the queue prep_rq_fn (in which case struct request *automagically gets a tag assigned). Note that this function *assumes that any type of request can be queued! I'm unsure on the whole ordered tag idea, though. It seems like there are never be multiple requests in the queue with the same tag, so it effectively forces my (userspace) virtio server to serialize and fdatasync on every write request: if (outhdr-type outhdr-tag != vblk-last_tag) { while (vblk-in_progress) handle_io_finish(fd, dev); fdatasync(fd); vblk-last_tag = outhdr-tag; } In fact, AFAICT any implementation of ordered tags will be forced into serial order. Am I better off telling the block layer to drain and flush instead of ordered tags? I can then push that through to the virtio server. Thanks, Rusty. - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] More virtio users
On Mon, Jun 11 2007, Rusty Russell wrote: On Mon, 2007-06-11 at 08:41 +0200, Jens Axboe wrote: On Sun, Jun 10 2007, Rusty Russell wrote: On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote: Rusty Russell wrote: Lguest doesn't have a framebuffer, so maybe this is a good thing for me to hack on, but I promised myself I'd finish NAPI for the net device, and tag for block device first. If you're touching the block device, passing a request's io priority to the host can be useful. OK, here's the interdiff. I still don't handle non-fs requests, but I haven't seen any yet. I should probably BUG_ON() there and wait for Jens to scream... Ehm no, that would certainly cause screaming :-) Checking for blk_fs_request() and terminating the request if you don't know how to handle it is the correct thing to do, a BUG() would definitely not be. So the problem is that I'd like to handle all of them, but I'm not clear what requests my device can get. I can't see a source of any other type of request. The other main request type is blk_pc_request(). In the data setup it's indentical to blk_fs_request(), there's a bio chain off -bio. It's a byte granularity entity though, so you should check -data_len for the size of it. -cmd[] holds a SCSI cdb, which is the command you are supposed to handle. + blk_queue_prep_rq(vblk-disk-queue, blk_queue_start_tag); + is quite a novel way, I actually had to look that code up to check whether it was correct. I'd much prefer a little wrapper around that, OK, but I got it from the blk_queue_start_tag documentation: * Description: *This can either be used as a stand-alone helper, or possibly be *assigned as the queue prep_rq_fn (in which case struct request *automagically gets a tag assigned). Note that this function *assumes that any type of request can be queued! OK, bad documentation, I'll make a note to fix that! Sorry about that. I'm unsure on the whole ordered tag idea, though. It seems like there are never be multiple requests in the queue with the same tag, so it effectively forces my (userspace) virtio server to serialize and fdatasync on every write request: if (outhdr-type outhdr-tag != vblk-last_tag) { while (vblk-in_progress) handle_io_finish(fd, dev); fdatasync(fd); vblk-last_tag = outhdr-tag; } In fact, AFAICT any implementation of ordered tags will be forced into serial order. Am I better off telling the block layer to drain and flush instead of ordered tags? I can then push that through to the virtio server. Perhaps you are misunderstanding what the tag is? The tag is a unique identifier for a pending request, so you will by definition never have requests sharing a tag value. But the tag is not to be considered as ordered, unless it has the barrier flag set as well. So you only need to serialize on the device side when blk_barrier_rq() is true. -- Jens Axboe - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] More virtio users
On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote: On Mon, Jun 11 2007, Rusty Russell wrote: So the problem is that I'd like to handle all of them, but I'm not clear what requests my device can get. I can't see a source of any other type of request. The other main request type is blk_pc_request(). In the data setup it's indentical to blk_fs_request(), there's a bio chain off -bio. It's a byte granularity entity though, so you should check -data_len for the size of it. -cmd[] holds a SCSI cdb, which is the command you are supposed to handle. SCSI? I'm even more lost now. Q: So what *are* the commands? Q: Who puts them in my queue? I have *never* seen anything but an fs request come through to my driver. Perhaps you are misunderstanding what the tag is? The tag is a unique identifier for a pending request, so you will by definition never have requests sharing a tag value. Yes, I started to suspect that. But the tag is not to be considered as ordered, unless it has the barrier flag set as well. So you only need to serialize on the device side when blk_barrier_rq() is true. blk_barrier_rq(req) is never true. I put a BUG_ON(blk_barrier_rq(req)) in my code and booted. This is using the device as a root ext3 filesystem. I reverted all the tag changes, still no barriers. I added blk_queue_ordered(vblk-disk-queue, QUEUE_ORDERED_DRAIN, NULL);, still no barriers. Dazed and confused, Rusty. - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] More virtio users
On Tue, Jun 12 2007, Rusty Russell wrote: On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote: On Mon, Jun 11 2007, Rusty Russell wrote: So the problem is that I'd like to handle all of them, but I'm not clear what requests my device can get. I can't see a source of any other type of request. The other main request type is blk_pc_request(). In the data setup it's indentical to blk_fs_request(), there's a bio chain off -bio. It's a byte granularity entity though, so you should check -data_len for the size of it. -cmd[] holds a SCSI cdb, which is the command you are supposed to handle. SCSI? I'm even more lost now. Q: So what *are* the commands? They are SCSI commands! Q: Who puts them in my queue? If you want to support SG_IO for instance, you'd have to deal with SCSI commands. I have *never* seen anything but an fs request come through to my driver. You probably only did basic IO to it. Perhaps you are misunderstanding what the tag is? The tag is a unique identifier for a pending request, so you will by definition never have requests sharing a tag value. Yes, I started to suspect that. But the tag is not to be considered as ordered, unless it has the barrier flag set as well. So you only need to serialize on the device side when blk_barrier_rq() is true. blk_barrier_rq(req) is never true. I put a BUG_ON(blk_barrier_rq(req)) in my code and booted. This is using the device as a root ext3 filesystem. I reverted all the tag changes, still no barriers. I added blk_queue_ordered(vblk-disk-queue, QUEUE_ORDERED_DRAIN, NULL);, still no barriers. -o barrier=1 for ext3, it doesn't use barriers by default. Or barrier=flush for reiserfs. XFS defaults to barriers on. -- Jens Axboe - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] More virtio users
On Tue, 2007-06-12 at 08:24 +0200, Jens Axboe wrote: On Tue, Jun 12 2007, Rusty Russell wrote: On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote: The other main request type is blk_pc_request(). In the data setup it's indentical to blk_fs_request(), there's a bio chain off -bio. It's a byte granularity entity though, so you should check -data_len for the size of it. -cmd[] holds a SCSI cdb, which is the command you are supposed to handle. SCSI? I'm even more lost now. Q: So what *are* the commands? They are SCSI commands! Q: Who puts them in my queue? If you want to support SG_IO for instance, you'd have to deal with SCSI commands. I do not. If someone wants to implement a SCSI layer over virtio, I think that's wonderful. Fortunately, that's not the problem I'm trying to solve. -o barrier=1 for ext3, it doesn't use barriers by default. That's, um, a little disturbing. But, it works. Thanks! Rusty. - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] More virtio users
On Sunday 10 June 2007, Avi Kivity wrote: There are probably more. Any ideas? * watchdog timer * tty ports (not just console) to attach to via host socket * alsa * hostfs (UML like) Arnd - This SF.net email is sponsored by DB2 Express Download DB2 Express C - the FREE version of DB2 express and take control of your XML. No limits. Just data. Click to get it now. http://sourceforge.net/powerbar/db2/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel