Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-31 Thread Blaisorblade
On giovedì 29 marzo 2007, Jeff Dike wrote:
> On Thu, Mar 29, 2007 at 02:36:43AM +0200, Blaisorblade wrote:
> > > Sometimes you need to. I'd probably just remove the do_ubd check and
> > > always recall the request function when handling completions, it's
> > > easier and safe.
>
> If I'm understanding this correctly, this is what happens now.  There
> is still the flag check and return if the queue is being run, but I
> don't see the advantage of removing that.
Possibly he just didn't understood what do_ubd was for, maybe there's some 
technical reason.

> That's a lot of mapping and unmapping though.  I wonder if just
> calling mmap would cause the COWed page to be dropped...
Yes, it would.
-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-30 Thread Blaisorblade
On giovedì 29 marzo 2007, Jeff Dike wrote:
> On Thu, Mar 29, 2007 at 02:36:43AM +0200, Blaisorblade wrote:
> > > Sometimes you need to. I'd probably just remove the do_ubd check and
> > > always recall the request function when handling completions, it's
> > > easier and safe.
>
> If I'm understanding this correctly, this is what happens now.  There
> is still the flag check and return if the queue is being run, but I
> don't see the advantage of removing that.
>
> > Anyway, the main speedups to do on the UBD driver are:
> > * implement write barriers (so much less fsync) - this is performance
> > killer n.1
>
> You mean preventing the upper layers from calling fsync?

No. Since we don't know when the upper layers (including the journaling layer) 
wants to fsync, we call it everytime. But they pass this information. Chris 
Lightfoot implemented write barriers just before the API was changed, 
together with much of the other stuff I'm talking about.

It's impressive to check his original mail - the scenario with create a 32M 
file + delete it, where delete takes a minute on vanilla and 1 second on his 
patched code. I've downloaded the patch for future reference, even if I don't 
know when I'll have time to look at it.

> > * possibly to use the new 2.6 request layout with scatter/gather I/O, and
> > vectorized I/O on the host
>
> Yeah, this is something I've thought about on occassion but never
> done.
>
> > * while at vectorizing I/O using async I/O
>
> I have that, but haven't merged it since I see no performance benefit
> for some reason.
>
> > * to avoid passing requests on pipes (n.2) - on fast disk I/O becomes
> > cpu-bound.
>
> Right - I cooked up a scheme a while ago that had the requests on a
> list, being removed from one end and added to the other, with some
> minimal number of bytes going across the pipe to ensure a wakeup if
> the other side was possibly asleep.  But I never implemented it.
>
> > * using futexes instead of pipes for synchronization (required for
> > previous one).
>
> Yup - for this, we either need to test the host for futuxes and use
> pipes as a fallback or give up on 2.4 as the host.
>
> > I forgot one thing: remember ubd=mmap? Something like that could have
> > been done using MAP_PRIVATE, so that write had still to be called
> > explicitly but unchanged data was shared with the host.
> >
> > Once a page gets dirty but is then cleaned, sharing it back is
> > difficult - but even without that good savings could be
> > achievable. That's to explore for the very future though.
>
> Interesting idea.  That does avoid the formerly fatal mmap problem.
> If you unmap it, the private copy goes away because it lost its last
> reference, and if you map it again, you get the shared version.
>
> That's a lot of mapping and unmapping though.  I wonder if just
> calling mmap would cause the COWed page to be dropped...
>
>   Jeff



-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-29 Thread Jeff Dike
On Thu, Mar 29, 2007 at 02:36:43AM +0200, Blaisorblade wrote:
> > Sometimes you need to. I'd probably just remove the do_ubd check and
> > always recall the request function when handling completions, it's
> > easier and safe.

If I'm understanding this correctly, this is what happens now.  There
is still the flag check and return if the queue is being run, but I
don't see the advantage of removing that.

> Anyway, the main speedups to do on the UBD driver are:
> * implement write barriers (so much less fsync) - this is performance killer 
> n.1

You mean preventing the upper layers from calling fsync?

> * possibly to use the new 2.6 request layout with scatter/gather I/O, and 
> vectorized I/O on the host

Yeah, this is something I've thought about on occassion but never
done.

> * while at vectorizing I/O using async I/O

I have that, but haven't merged it since I see no performance benefit
for some reason.

> * to avoid passing requests on pipes (n.2) - on fast disk I/O becomes 
> cpu-bound.

Right - I cooked up a scheme a while ago that had the requests on a
list, being removed from one end and added to the other, with some
minimal number of bytes going across the pipe to ensure a wakeup if
the other side was possibly asleep.  But I never implemented it.

> * using futexes instead of pipes for synchronization (required for previous 
> one).

Yup - for this, we either need to test the host for futuxes and use
pipes as a fallback or give up on 2.4 as the host.

> I forgot one thing: remember ubd=mmap? Something like that could have been
> done using MAP_PRIVATE, so that write had still to be called explicitly but
> unchanged data was shared with the host.

> Once a page gets dirty but is then cleaned, sharing it back is
> difficult - but even without that good savings could be
> achievable. That's to explore for the very future though.

Interesting idea.  That does avoid the formerly fatal mmap problem.
If you unmap it, the private copy goes away because it lost its last
reference, and if you map it again, you get the shared version.

That's a lot of mapping and unmapping though.  I wonder if just
calling mmap would cause the COWed page to be dropped...

Jeff

-- 
Work email - jdike at linux dot intel dot com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-28 Thread Blaisorblade
On giovedì 29 marzo 2007, Blaisorblade wrote:
> On mercoledì 28 marzo 2007, Jeff Dike wrote:
> > [ This patch needs to get into 2.6.21, as it fixes a serious bug
> > introduced soon after 2.6.20 ]
> >
> > Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices
> > queues and locks, which was fine as far as it went, but left in place
> > a global which controlled access to submitting requests to the host.
> > This should have been made per-device as well, since it causes I/O
> > hangs when multiple block devices are in use.
> >
> > This patch fixes that by replacing the global with an activity flag in
> > the device structure in order to tell whether the queue is currently
> > being run.
>
> Finally that variable has a understandable name. However in a mail from
> Jens Axboe, titled:
> "Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup" ,
> with Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag
>
> altogether, so we may explore this for the future:
> > > Add some comments about requirements for ubd_io_lock and expand its
> > > use.
> > >
> > > When an irq signals that the "controller" (i.e. another thread on the
> > > host, which does the actual requests and is the only one blocked on I/O
> > > on the host) has done some work, we call again the request function
> > > ourselves (do_ubd_request).

> > > We now do that with ubd_io_lock held - that's useful to protect against
> > > concurrent calls to elv_next_request and so on.

> > Not only useful, required, as I think I complained about a year or more
> > ago :-)

> > > XXX: Maybe we shouldn't call at all the request function. Input needed
> > > on this. Are we supposed to plug and unplug the queue? That code
> > > "indirectly" does that by setting a flag, called do_ubd, which makes
> > > the request function return (it's a residual of 2.4 block layer
> > > interface).

> > Sometimes you need to. I'd probably just remove the do_ubd check and
> > always recall the request function when handling completions, it's
> > easier and safe.

> Anyway, the main speedups to do on the UBD driver are:
> * implement write barriers (so much less fsync) - this is performance
> killer n.1

> * possibly to use the new 2.6 request layout with scatter/gather I/O, and
> vectorized I/O on the host
> * while at vectorizing I/O using async I/O

> * to avoid passing requests on pipes (n.2) - on fast disk I/O becomes
> cpu-bound.
> To make a different but related example, with a SpeedScale laptop, it's
> interesting to double CPU frequency and observe tuntap speed double too.
> (with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending
> whether UML trasmits or receives data; with 2GHz double rates).
> Update: I now get 150Mbit / 200Mbit (Uml receives/Uml sends) at 1GHz, and
> still the double at 2Ghz.
> This is a different UML though.

> * using futexes instead of pipes for synchronization (required for previous
> one).

I forgot one thing: remember ubd=mmap? Something like that could have been 
done using MAP_PRIVATE, so that write had still to be called explicitly but 
unchanged data was shared with the host.

Once a page gets dirty but is then cleaned, sharing it back is difficult - but 
even without that good savings could be achievable. That's to explore for the 
very future though.
-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [uml-devel] [PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-28 Thread Blaisorblade
On mercoledì 28 marzo 2007, Jeff Dike wrote:
> [ This patch needs to get into 2.6.21, as it fixes a serious bug
> introduced soon after 2.6.20 ]
>
> Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices
> queues and locks, which was fine as far as it went, but left in place
> a global which controlled access to submitting requests to the host.
> This should have been made per-device as well, since it causes I/O
> hangs when multiple block devices are in use.
>
> This patch fixes that by replacing the global with an activity flag in
> the device structure in order to tell whether the queue is currently
> being run.
Finally that variable has a understandable name. However in a mail from Jens 
Axboe, titled:
"Re: [uml-devel] [PATCH 06/11] uml ubd driver: ubd_io_lock usage fixup" , with 
Date: Mon, 30 Oct 2006 09:26:48 +0100, he suggested removing this flag 
altogether, so we may explore this for the future:

> > Add some comments about requirements for ubd_io_lock and expand its use.
> >
> > When an irq signals that the "controller" (i.e. another thread on the
> > host, which does the actual requests and is the only one blocked on I/O
> > on the host) has done some work, we call again the request function
> > ourselves (do_ubd_request).
> >
> > We now do that with ubd_io_lock held - that's useful to protect against
> > concurrent calls to elv_next_request and so on.
>
> Not only useful, required, as I think I complained about a year or more
> ago :-)
>
> > XXX: Maybe we shouldn't call at all the request function. Input needed on
> >  this. Are we supposed to plug and unplug the queue? That code
> > "indirectly" does that by setting a flag, called do_ubd, which makes the
> > request function return (it's a residual of 2.4 block layer interface).
>
> Sometimes you need to. I'd probably just remove the do_ubd check and
> always recall the request function when handling completions, it's
> easier and safe.

Anyway, the main speedups to do on the UBD driver are:
* implement write barriers (so much less fsync) - this is performance killer 
n.1

* possibly to use the new 2.6 request layout with scatter/gather I/O, and 
vectorized I/O on the host
* while at vectorizing I/O using async I/O

* to avoid passing requests on pipes (n.2) - on fast disk I/O becomes 
cpu-bound.
To make a different but related example, with a SpeedScale laptop, it's 
interesting to double CPU frequency and observe tuntap speed double too. 
(with 1GHz I get on TCP numbers like 150 Mbit/s - 100 Mbit/s, depending 
whether UML trasmits or receives data; with 2GHz double rates).
Update: I now get 150Mbit / 200Mbit (Uml receives/Uml sends) at 1GHz, and 
still the double at 2Ghz.
This is a different UML though.

* using futexes instead of pipes for synchronization (required for previous 
one).

-- 
Inform me of my mistakes, so I can add them to my list!
Paolo Giarrusso, aka Blaisorblade
http://www.user-mode-linux.org/~blaisorblade
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] UML - fix I/O hang when multiple devices are in use

2007-03-27 Thread Jeff Dike
[ This patch needs to get into 2.6.21, as it fixes a serious bug
introduced soon after 2.6.20 ]

Commit 62f96cb01e8de7a5daee472e540f726db2801499 introduced per-devices
queues and locks, which was fine as far as it went, but left in place
a global which controlled access to submitting requests to the host.
This should have been made per-device as well, since it causes I/O
hangs when multiple block devices are in use.

This patch fixes that by replacing the global with an activity flag in
the device structure in order to tell whether the queue is currently
being run.

Signed-off-by: Jeff Dike <[EMAIL PROTECTED]>
--
 arch/um/drivers/ubd_kern.c |   13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

Index: linux-2.6.21-mm/arch/um/drivers/ubd_kern.c
===
--- linux-2.6.21-mm.orig/arch/um/drivers/ubd_kern.c 2007-03-27 
20:27:16.0 -0400
+++ linux-2.6.21-mm/arch/um/drivers/ubd_kern.c  2007-03-27 20:28:29.0 
-0400
@@ -108,10 +108,6 @@ static inline void ubd_set_bit(__u64 bit
 
 static DEFINE_MUTEX(ubd_lock);
 
-/* XXX - this made sense in 2.4 days, now it's only used as a boolean, and
- * probably it doesn't make sense even for that. */
-static int do_ubd;
-
 static int ubd_open(struct inode * inode, struct file * filp);
 static int ubd_release(struct inode * inode, struct file * file);
 static int ubd_ioctl(struct inode * inode, struct file * file,
@@ -168,6 +164,7 @@ struct ubd {
struct platform_device pdev;
struct request_queue *queue;
spinlock_t lock;
+   int active;
 };
 
 #define DEFAULT_COW { \
@@ -189,6 +186,7 @@ struct ubd {
.shared =   0, \
 .cow = DEFAULT_COW, \
.lock = SPIN_LOCK_UNLOCKED, \
+   .active =   0, \
 }
 
 /* Protected by ubd_lock */
@@ -506,7 +504,6 @@ static void ubd_handler(void)
struct ubd *dev;
int n;
 
-   do_ubd = 0;
n = os_read_file(thread_fd, &req, sizeof(req));
if(n != sizeof(req)){
printk(KERN_ERR "Pid %d - spurious interrupt in ubd_handler, "
@@ -516,6 +513,7 @@ static void ubd_handler(void)
 
rq = req.req;
dev = rq->rq_disk->private_data;
+   dev->active = 0;
 
ubd_finish(rq, req.error);
reactivate_fd(thread_fd, UBD_IRQ);
@@ -1081,11 +1079,12 @@ static void do_ubd_request(request_queue
}
}
else {
-   if(do_ubd || (req = elv_next_request(q)) == NULL)
+   struct ubd *dev = q->queuedata;
+   if(dev->active || (req = elv_next_request(q)) == NULL)
return;
err = prepare_request(req, &io_req);
if(!err){
-   do_ubd = 1;
+   dev->active = 1;
n = os_write_file(thread_fd, (char *) &io_req,
 sizeof(io_req));
if(n != sizeof(io_req))
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/