Rusty Russell wrote:
> On Mon, 2007-06-18 at 12:09 +0300, Avi Kivity wrote:
>> Rusty Russell wrote:
>>> On Sun, 2007-06-17 at 17:25 +0300, Avi Kivity wrote:
>>>> Rusty Russell wrote:
>>>>> +	/* Set up for reply. */
>>>>> +	vblk->sg[0].page = virt_to_page(&vbr->in_hdr);
>>>>> +	vblk->sg[0].offset = offset_in_page(&vbr->in_hdr);
>>>>> +	vblk->sg[0].length = sizeof(vbr->in_hdr);
>>>>> +	num = blk_rq_map_sg(q, vbr->req, vblk->sg+1);
>>>>> +	vbr->out_hdr.id = vblk->vdev->ops->add_inbuf(vblk->vdev, vblk->sg,
>>>>> +						     1+num, vbr);
>>>>> +	if (IS_ERR_VALUE(vbr->out_hdr.id))
>>>>> +		goto full;
>>>>> +
>>>>> +	vblk->sg[0].page = virt_to_page(&vbr->out_hdr);
>>>>> +	vblk->sg[0].offset = offset_in_page(&vbr->out_hdr);
>>>>> +	vblk->sg[0].length = sizeof(vbr->out_hdr);
>>>>> +
>>>>> +	vbr->out_id = vblk->vdev->ops->add_outbuf(vblk->vdev, vblk->sg, 1,
>>>>> +						  vbr);
>>>>
>>>> This strikes me as wasteful.  Why not set up a single descriptor which
>>>> contains both placement and the data itself?
>>>
>>> We could actually do this for write, but not for read (where the length
>>> & sector need to be sent to the other end, and the data comes from the
>>> other end).
>>
>> So you map the first sg entry for output, and the rest for input.
>>
>> Less pretty, but more efficient.
>
> At the moment there are two kinds of sgs: input and output.  You're
> proposing instead that we have a single kind, and a flag on each segment
> which indicates direction.  That may make block a little more efficient,
> depending on the particular implementation of virtio.
>
> Yet network really wants in and out separate.  I'm not even sure how
> we'd implement a mixed-sg scheme efficiently for that case.  At the
> moment it's very straightforward.
I think the differences are deeper than that.  Network wants two queues,
disk wants one queue.  The directionality of the dma transfer is
orthogonal to that.

For a disk, the initiator is always the guest; you issue a request and
expect a completion within a few milliseconds.  Read and write requests
are similar in nature; you want them in a single queue because you want
the elevator to run on both read and write requests.  When idling, the
queue is empty.

For a network interface, the guest initiates transmits, but the host
initiates receives.  Yes, the guest preloads the queue with rx
descriptors, but it can't expect a response within any given time, hence
the need for two queues.  When idling, the tx queue is empty and the rx
queue is full.

So I'd suggest having a struct virtio_queue to represent a queue (so a
network interface would have two of them), and the dma direction to be
specified separately:

+	unsigned long (*add_buf)(struct virtio_queue *q,
+				 const struct scatterlist out_sg[], int out_nr,
+				 const struct scatterlist in_sg[], int in_nr,
+				 void *data);

I think this fits better with the Xen model (which has one ring for
block, and two each for network and console, if memory serves).

> Of course, I'm carrying two implementations and two drivers, so it's
> natural for me to try to resist changes 8)

Life is rough on the frontier.  But all the glory will be yours!

--
error compiling committee.c: too many arguments to function
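To make the proposal concrete, here is a minimal sketch of what the
block driver's read path might look like on top of the suggested
interface.  struct virtio_queue and add_buf() are taken from the
proposal above; the virtio_blk/virtblk_req layout, the vq member, and
the error handling are hypothetical, loosely modelled on the patch
quoted at the top of the thread:

	/* Hypothetical sketch only: one queue, one call, direction
	 * expressed by which array a segment sits in. */
	static bool do_read(struct request_queue *q, struct virtio_blk *vblk,
			    struct virtblk_req *vbr)
	{
		unsigned int num;
		unsigned long id;

		/* Out: the request header (type, sector, length) that
		 * the host must read to know what is being asked. */
		vblk->sg[0].page = virt_to_page(&vbr->out_hdr);
		vblk->sg[0].offset = offset_in_page(&vbr->out_hdr);
		vblk->sg[0].length = sizeof(vbr->out_hdr);

		/* In: the data pages the host fills, followed by the
		 * status header it writes on completion. */
		num = blk_rq_map_sg(q, vbr->req, vblk->sg + 1);
		vblk->sg[1 + num].page = virt_to_page(&vbr->in_hdr);
		vblk->sg[1 + num].offset = offset_in_page(&vbr->in_hdr);
		vblk->sg[1 + num].length = sizeof(vbr->in_hdr);

		/* A single submission replaces the add_inbuf/add_outbuf
		 * pair; reads and writes share one queue, so the
		 * elevator sees both. */
		id = vblk->vq->ops->add_buf(vblk->vq,
					    vblk->sg, 1,
					    vblk->sg + 1, num + 1,
					    vbr);
		return !IS_ERR_VALUE(id);
	}

A write would be the mirror image: the header plus the data pages go in
out_sg and only the status header goes in in_sg, so the same add_buf()
covers both directions without per-segment direction flags, and a
network driver would simply call it on each of its two queues.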