Re: [kvm-devel] More virtio users

2007-06-12 Thread Rusty Russell
On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
 Rusty Russell wrote:
  Lguest doesn't have a framebuffer, so maybe this is a good thing for me
  to hack on, but I promised myself I'd finish NAPI for the net device,
  and tag for block device first.

 
 If you're touching the block device, passing a request's io priority to 
 the host can be useful.

OK, here's the interdiff.  I still don't handle non-fs requests, but I
haven't seen any yet.  I should probably BUG_ON() there and wait for
Jens to scream...

Changes:
1) Make virtio_blk.h userspace-friendly.
2) /dev/vbN - /dev/vdN
3) Ordered tags, handed thru to other end.
4) Hand ioprio to other end, too.

diff -u b/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
--- b/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.cSun Jun 10 22:09:10 2007 +1000
@@ -33,18 +33,19 @@
struct virtio_blk_inhdr in_hdr;
 };
 
-/* Jens gave me this nice helper to end all chunks of a request. */
-static void end_dequeued_request(struct request *req, int uptodate)
+static void end_tagged_request(struct request *req,
+  request_queue_t *q, int uptodate)
 {
if (end_that_request_first(req, uptodate, req-hard_nr_sectors))
BUG();
add_disk_randomness(req-rq_disk);
+   blk_queue_end_tag(q, req);
end_that_request_last(req, uptodate);
 }
 
 static void finish(struct virtio_blk *vblk, struct virtblk_req *vbr)
 {
-   end_dequeued_request(vbr-req, !vbr-failed);
+   end_tagged_request(vbr-req, vblk-disk-queue, !vbr-failed);
list_del(vbr-list);
mempool_free(vbr, vblk-pool);
/* In case queue is stopped waiting for more buffers. */
@@ -120,7 +121,7 @@
goto detach_inbuf_full;
 
pr_debug(Write: %p in=%lu out=%lu\n, vbr,
-vbr-out_hdr.id, vbr-out_id);
+(long)vbr-out_hdr.id, (long)vbr-out_id);
list_add_tail(vbr-list, vblk-reqs);
return true;
 
@@ -157,7 +158,7 @@
goto detach_inbuf_full;
 
pr_debug(Read: %p in=%lu out=%lu\n, vbr,
-vbr-out_hdr.id, vbr-out_id);
+(long)vbr-out_hdr.id, (long)vbr-out_id);
list_add_tail(vbr-list, vblk-reqs);
return true;
 
@@ -178,10 +179,9 @@
 
/* FIXME: handle these iff capable. */
if (!blk_fs_request(req)) {
-   pr_debug(Got non-command 0x%08x\n, req-cmd_type);
+   printk(Got non-command 0x%08x\n, req-cmd_type);
req-errors++;
-   blkdev_dequeue_request(req);
-   end_dequeued_request(req, 0);
+   end_tagged_request(req, vblk-disk-queue, 0);
continue;
}
 
@@ -193,6 +193,8 @@
vbr-req = req;
vbr-out_hdr.type = rq_data_dir(req);
vbr-out_hdr.sector = req-sector;
+   vbr-out_hdr.tag = req-tag;
+   vbr-out_hdr.ioprio = req-ioprio;
 
if (rq_data_dir(req) == WRITE) {
if (!do_write(q, vblk, vbr))
@@ -201,7 +203,6 @@
if (!do_read(q, vblk, vbr))
goto stop;
}
-   blkdev_dequeue_request(req);
}
 
 sync:
@@ -261,16 +262,25 @@
goto out_put_disk;
}
 
-   sprintf(vblk-disk-disk_name, vb%c, virtblk_index++);
+   sprintf(vblk-disk-disk_name, vd%c, virtblk_index++);
vblk-disk-major = major;
vblk-disk-first_minor = 0;
vblk-disk-private_data = vblk;
vblk-disk-fops = virtblk_fops;
 
+   err = blk_queue_init_tags(vblk-disk-queue, 100 /* FIXME */, NULL);
+   if (err)
+   goto out_cleanup_queue;
+
+   blk_queue_ordered(vblk-disk-queue, QUEUE_ORDERED_TAG, NULL);
+   blk_queue_prep_rq(vblk-disk-queue, blk_queue_start_tag);
+
/* Caller can do blk_queue_max_hw_segments(), set_capacity()
 * etc then add_disk(). */
return vblk-disk;
 
+out_cleanup_queue:
+   blk_cleanup_queue(vblk-disk-queue);
 out_put_disk:
put_disk(vblk-disk);
 out_unregister_blkdev:
diff -u b/include/linux/virtio_blk.h b/include/linux/virtio_blk.h
--- b/include/linux/virtio_blk.h
+++ b/include/linux/virtio_blk.hSun Jun 10 22:09:10 2007 +1000
@@ -3,26 +3,31 @@
 #include linux/types.h
-struct gendisk;
-struct virtio_device;
-struct hd_geometry;
 
 /* This is the first element of the scatter-gather list. */
 struct virtio_blk_outhdr
 {
/* 0 == read, 1 == write */
-   u32 type;
+   __u32 type;
+   /* Ordered tag. */
+   __u16 tag;
+   /* Linux's ioprio. */
+   __u16 ioprio;
/* Sector (ie. 512 byte offset) */
-   unsigned long sector;
+   __u64 sector;
/* Where to put reply. */
-   unsigned long id;
+   __u64 id;
 };
 
 struct virtio_blk_inhdr
 {
/* 1 = OK, 0 

Re: [kvm-devel] More virtio users

2007-06-12 Thread Rusty Russell
On Mon, 2007-06-11 at 08:41 +0200, Jens Axboe wrote:
 On Sun, Jun 10 2007, Rusty Russell wrote:
  On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
   Rusty Russell wrote:
Lguest doesn't have a framebuffer, so maybe this is a good thing for me
to hack on, but I promised myself I'd finish NAPI for the net device,
and tag for block device first.
  
   
   If you're touching the block device, passing a request's io priority to 
   the host can be useful.
  
  OK, here's the interdiff.  I still don't handle non-fs requests, but I
  haven't seen any yet.  I should probably BUG_ON() there and wait for
  Jens to scream...
 
 Ehm no, that would certainly cause screaming :-)
 
 Checking for blk_fs_request() and terminating the request if you don't
 know how to handle it is the correct thing to do, a BUG() would
 definitely not be.

So the problem is that I'd like to handle all of them, but I'm not clear
what requests my device can get.  I can't see a source of any other type
of request.

  +   blk_queue_prep_rq(vblk-disk-queue, blk_queue_start_tag);
  +
 
 is quite a novel way, I actually had to look that code up to check
 whether it was correct. I'd much prefer a little wrapper around that,

OK, but I got it from the blk_queue_start_tag documentation:

 *  Description:
 *This can either be used as a stand-alone helper, or possibly be
 *assigned as the queue prep_rq_fn (in which case struct request
 *automagically gets a tag assigned). Note that this function
 *assumes that any type of request can be queued! 

I'm unsure on the whole ordered tag idea, though.  It seems like there
are never be multiple requests in the queue with the same tag, so it
effectively forces my (userspace) virtio server to serialize and
fdatasync on every write request:

if (outhdr-type  outhdr-tag != vblk-last_tag) {
while (vblk-in_progress)
handle_io_finish(fd, dev);
fdatasync(fd);
vblk-last_tag = outhdr-tag;
}


In fact, AFAICT any implementation of ordered tags will be forced into
serial order.  Am I better off telling the block layer to drain and
flush instead of ordered tags?  I can then push that through to the
virtio server.

Thanks,
Rusty.


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] More virtio users

2007-06-12 Thread Jens Axboe
On Mon, Jun 11 2007, Rusty Russell wrote:
 On Mon, 2007-06-11 at 08:41 +0200, Jens Axboe wrote:
  On Sun, Jun 10 2007, Rusty Russell wrote:
   On Sun, 2007-06-10 at 11:16 +0300, Avi Kivity wrote:
Rusty Russell wrote:
 Lguest doesn't have a framebuffer, so maybe this is a good thing for 
 me
 to hack on, but I promised myself I'd finish NAPI for the net device,
 and tag for block device first.
   

If you're touching the block device, passing a request's io priority to 
the host can be useful.
   
   OK, here's the interdiff.  I still don't handle non-fs requests, but I
   haven't seen any yet.  I should probably BUG_ON() there and wait for
   Jens to scream...
  
  Ehm no, that would certainly cause screaming :-)
  
  Checking for blk_fs_request() and terminating the request if you don't
  know how to handle it is the correct thing to do, a BUG() would
  definitely not be.
 
 So the problem is that I'd like to handle all of them, but I'm not clear
 what requests my device can get.  I can't see a source of any other type
 of request.

The other main request type is blk_pc_request(). In the data setup it's
indentical to blk_fs_request(), there's a bio chain off -bio. It's a
byte granularity entity though, so you should check -data_len for the
size of it. -cmd[] holds a SCSI cdb, which is the command you are
supposed to handle.

   + blk_queue_prep_rq(vblk-disk-queue, blk_queue_start_tag);
   +
  
  is quite a novel way, I actually had to look that code up to check
  whether it was correct. I'd much prefer a little wrapper around that,
 
 OK, but I got it from the blk_queue_start_tag documentation:
 
  *  Description:
  *This can either be used as a stand-alone helper, or possibly be
  *assigned as the queue prep_rq_fn (in which case struct request
  *automagically gets a tag assigned). Note that this function
  *assumes that any type of request can be queued! 

OK, bad documentation, I'll make a note to fix that! Sorry about that.

 I'm unsure on the whole ordered tag idea, though.  It seems like there
 are never be multiple requests in the queue with the same tag, so it
 effectively forces my (userspace) virtio server to serialize and
 fdatasync on every write request:
 
   if (outhdr-type  outhdr-tag != vblk-last_tag) {
   while (vblk-in_progress)
   handle_io_finish(fd, dev);
   fdatasync(fd);
   vblk-last_tag = outhdr-tag;
   }
 
 
 In fact, AFAICT any implementation of ordered tags will be forced into
 serial order.  Am I better off telling the block layer to drain and
 flush instead of ordered tags?  I can then push that through to the
 virtio server.

Perhaps you are misunderstanding what the tag is? The tag is a unique
identifier for a pending request, so you will by definition never have
requests sharing a tag value. But the tag is not to be considered as
ordered, unless it has the barrier flag set as well. So you only need to
serialize on the device side when blk_barrier_rq() is true.

-- 
Jens Axboe


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] More virtio users

2007-06-12 Thread Rusty Russell
On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
 On Mon, Jun 11 2007, Rusty Russell wrote:
  So the problem is that I'd like to handle all of them, but I'm not clear
  what requests my device can get.  I can't see a source of any other type
  of request.
 
 The other main request type is blk_pc_request(). In the data setup it's
 indentical to blk_fs_request(), there's a bio chain off -bio. It's a
 byte granularity entity though, so you should check -data_len for the
 size of it. -cmd[] holds a SCSI cdb, which is the command you are
 supposed to handle.

SCSI?  I'm even more lost now.

Q: So what *are* the commands?
Q: Who puts them in my queue?

I have *never* seen anything but an fs request come through to my
driver.

 Perhaps you are misunderstanding what the tag is? The tag is a unique
 identifier for a pending request, so you will by definition never have
 requests sharing a tag value.

Yes, I started to suspect that.

 But the tag is not to be considered as
 ordered, unless it has the barrier flag set as well. So you only need to
 serialize on the device side when blk_barrier_rq() is true.

blk_barrier_rq(req) is never true.  I put a BUG_ON(blk_barrier_rq(req))
in my code and booted.  This is using the device as a root ext3
filesystem.

I reverted all the tag changes, still no barriers.  I added
blk_queue_ordered(vblk-disk-queue, QUEUE_ORDERED_DRAIN, NULL);,
still no barriers.

Dazed and confused,
Rusty.


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] More virtio users

2007-06-12 Thread Jens Axboe
On Tue, Jun 12 2007, Rusty Russell wrote:
 On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
  On Mon, Jun 11 2007, Rusty Russell wrote:
   So the problem is that I'd like to handle all of them, but I'm not clear
   what requests my device can get.  I can't see a source of any other type
   of request.
  
  The other main request type is blk_pc_request(). In the data setup it's
  indentical to blk_fs_request(), there's a bio chain off -bio. It's a
  byte granularity entity though, so you should check -data_len for the
  size of it. -cmd[] holds a SCSI cdb, which is the command you are
  supposed to handle.
 
 SCSI?  I'm even more lost now.
 
 Q: So what *are* the commands?

They are SCSI commands!

 Q: Who puts them in my queue?

If you want to support SG_IO for instance, you'd have to deal with SCSI
commands.

 I have *never* seen anything but an fs request come through to my
 driver.

You probably only did basic IO to it.

  Perhaps you are misunderstanding what the tag is? The tag is a unique
  identifier for a pending request, so you will by definition never have
  requests sharing a tag value.
 
 Yes, I started to suspect that.
 
  But the tag is not to be considered as
  ordered, unless it has the barrier flag set as well. So you only need to
  serialize on the device side when blk_barrier_rq() is true.
 
 blk_barrier_rq(req) is never true.  I put a BUG_ON(blk_barrier_rq(req))
 in my code and booted.  This is using the device as a root ext3
 filesystem.

 I reverted all the tag changes, still no barriers.  I added
 blk_queue_ordered(vblk-disk-queue, QUEUE_ORDERED_DRAIN, NULL);,
 still no barriers.

-o barrier=1 for ext3, it doesn't use barriers by default. Or
barrier=flush for reiserfs. XFS defaults to barriers on.

-- 
Jens Axboe


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] More virtio users

2007-06-12 Thread Rusty Russell
On Tue, 2007-06-12 at 08:24 +0200, Jens Axboe wrote:
 On Tue, Jun 12 2007, Rusty Russell wrote:
  On Mon, 2007-06-11 at 09:33 +0200, Jens Axboe wrote:
   The other main request type is blk_pc_request(). In the data setup it's
   indentical to blk_fs_request(), there's a bio chain off -bio. It's a
   byte granularity entity though, so you should check -data_len for the
   size of it. -cmd[] holds a SCSI cdb, which is the command you are
   supposed to handle.
  
  SCSI?  I'm even more lost now.
  
  Q: So what *are* the commands?
 
 They are SCSI commands!
 
  Q: Who puts them in my queue?
 
 If you want to support SG_IO for instance, you'd have to deal with SCSI
 commands.

I do not.  If someone wants to implement a SCSI layer over virtio, I
think that's wonderful.  Fortunately, that's not the problem I'm trying
to solve.

 -o barrier=1 for ext3, it doesn't use barriers by default.

That's, um, a little disturbing.

But, it works.  Thanks!
Rusty.


-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] More virtio users

2007-06-12 Thread Arnd Bergmann
On Sunday 10 June 2007, Avi Kivity wrote:
 There are probably more.  Any ideas?

* watchdog timer
* tty ports (not just console) to attach to via host socket
* alsa
* hostfs (UML like)

Arnd 

-
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel