Re: ll_rw_block/submit_bh and request limits

2001-02-25 Thread Andrea Arcangeli

On Sun, Feb 25, 2001 at 06:34:01PM +0100, Jens Axboe wrote:
> Any reason why you don't have a lower wake-up limit for the queue?

The watermark diff looked too high (it's 128M in Linus's current tree), but it's
probably a good idea to resurrect it with a maximum difference of a few
full-sized requests (1/2 MB each).
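
Something like this is what I have in mind (an untested user-space sketch
just to show the arithmetic; the struct and helper names are invented, not
actual 2.4 code):

#include <stdio.h>

#define SECTOR_SIZE          512
#define FULL_REQUEST_BYTES   (512 * 1024)	/* one full-sized request */
#define FULL_REQUEST_SECTORS (FULL_REQUEST_BYTES / SECTOR_SIZE)

struct queue_marks {
	unsigned long high_queued_sectors;	/* stop queueing above this */
	unsigned long low_queued_sectors;	/* wake up waiters below this */
};

/* keep the wake-up mark only a few full-sized requests below the high
 * mark, instead of the huge global high/low gap */
static void set_marks(struct queue_marks *q, unsigned long high)
{
	q->high_queued_sectors = high;
	q->low_queued_sectors = high - 3 * FULL_REQUEST_SECTORS;
}

int main(void)
{
	struct queue_marks q;

	set_marks(&q, 65536);	/* 32M worth of queued sectors, say */
	printf("high %lu low %lu gap %lu KB\n",
	       q.high_queued_sectors, q.low_queued_sectors,
	       3UL * FULL_REQUEST_SECTORS * SECTOR_SIZE / 1024);
	return 0;
}

With 512K requests that keeps the hysteresis gap around 1.5M instead of
tens of megabytes.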

> Do you mind if I do some testing with this patch and fold it in,
> possibly?

Go ahead, thanks,
Andrea



Re: ll_rw_block/submit_bh and request limits

2001-02-25 Thread Jens Axboe

On Thu, Feb 22 2001, Andrea Arcangeli wrote:
> On Thu, Feb 22, 2001 at 10:59:20AM -0800, Linus Torvalds wrote:
> > I'd prefer for this check to be a per-queue one.
> 
> I've been running this in my tree for a few weeks, but I never had the
> courage to post it publicly because I haven't benchmarked it carefully yet
> and I'd prefer to finish another thing first. This is actually based on the
> code I had in my blkdev tree after I last merged the 512K I/O requests and
> elevator fixes with Jens. I think it won't generate bad numbers, and it was
> running fine on a 32-way SMP (though I didn't stress the I/O subsystem much
> there), but please don't include it until somebody benchmarks it carefully
> with dbench and tiotest. (It still applies cleanly against 2.4.2.)

Thinking about this a bit, I have to agree with you and Linus. It
is possible to find pathological cases where the per-queue limit suffers
compared to the global one, but in reality I don't think it's worth
worrying about. And the per-queue limit saves us the atomic updates,
since it's done under the io_request_lock (or the queue lock later,
which is still fine), so that's a win too.
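
In code it's just a plain counter touched with the lock held, along these
lines (an illustrative user-space model of the locking pattern, not the
actual 2.4 code; all names are made up):

#include <pthread.h>

struct request_queue {
	pthread_mutex_t lock;		/* plays the role of io_request_lock */
	unsigned long   queued_sectors;	/* plain counter, no atomic_t needed */
};

/* submission path: the counter is only ever touched with the lock held,
 * so a plain add replaces the global atomic_add() */
void queue_account(struct request_queue *q, unsigned long nr_sectors)
{
	pthread_mutex_lock(&q->lock);
	q->queued_sectors += nr_sectors;
	pthread_mutex_unlock(&q->lock);
}

/* completion path, same story */
void queue_unaccount(struct request_queue *q, unsigned long nr_sectors)
{
	pthread_mutex_lock(&q->lock);
	q->queued_sectors -= nr_sectors;
	pthread_mutex_unlock(&q->lock);
}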

I had separate read/write wait queues before; they were removed when I
did the request stealing, which is now gone again. I'm not even sure
it's worth it now; Marcelo and I discussed it last week, and I did some
tests that showed nothing remarkable. But it's mostly free, so we might
as well do it. A rough model of the idea is sketched below.
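
The point of split queues is just that the two classes can be woken
independently, e.g. readers at a laxer level than bulk writers. A sketch
(pthread condition variables standing in for the kernel wait queues; all
names invented):

#include <pthread.h>

struct throttle {
	pthread_mutex_t lock;
	pthread_cond_t  read_wait;	/* throttled readers sleep here */
	pthread_cond_t  write_wait;	/* throttled writers sleep here */
	unsigned long   queued_sectors;
	unsigned long   high_queued_sectors;
	unsigned long   low_queued_sectors;
};

/* completion side: wake the two classes at different watermarks */
void throttle_io_done(struct throttle *t, unsigned long nr_sectors)
{
	pthread_mutex_lock(&t->lock);
	t->queued_sectors -= nr_sectors;
	if (t->queued_sectors < t->high_queued_sectors)
		pthread_cond_broadcast(&t->read_wait);
	if (t->queued_sectors < t->low_queued_sectors)
		pthread_cond_broadcast(&t->write_wait);
	pthread_mutex_unlock(&t->lock);
}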

Any reason why you don't have a lower wake-up limit for the queue?
Do you mind if I do some testing with this patch and fold it in,
possibly?

-- 
Jens Axboe


Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Andrea Arcangeli

On Thu, Feb 22, 2001 at 07:44:11PM -0200, Marcelo Tosatti wrote:
> The global limit on top of the per-queue limit sounds good. 

Probably.

> Since you're talking about the "total_ram / 3" hardcoded value... it
> should be a /proc tunable, IMO. (Andi Kleen already suggested this.)

Yes, IIRC Andi also proposed that a few weeks ago.

Andrea



Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Marcelo Tosatti


On Thu, 22 Feb 2001, Andrea Arcangeli wrote:



> However, if you have hundreds of different queues doing I/O at the same
> time it may make a difference, but probably with tons of hard disks
> you'll also have tons of RAM... In theory we could put a global limit
> on top of the per-queue one. Or we could at least cap the
> total_ram / 3 value.

The global limit on top of the per-queue limit sounds good. 

Since you're talking about the "total_ram / 3" hardcoded value... it
should be a /proc tunable, IMO. (Andi Kleen already suggested this.)
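
Something in the spirit of a 2.4-style sysctl entry would be enough, say
(a rough sketch only; the variable, the /proc file name, and the ctl
number are all invented, and the exact ctl_table layout should be
double-checked against the tree):

static int queued_sectors_divisor = 3;	/* total_ram / 3 today */

static ctl_table blkdev_table[] = {
	{1, "queued-sectors-divisor", &queued_sectors_divisor,
	 sizeof(int), 0644, NULL, &proc_dointvec},
	{0}
};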




Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Andrea Arcangeli

On Thu, Feb 22, 2001 at 11:57:00PM +0100, Andrea Arcangeli wrote:
> insane to wait for kupdate to submit 10G of RAM to a single hard disk
> before unplugging on a 30G machine.

Actually, kupdate will unplug the queue itself, but in theory the queue can
still grow up to such a level after the I/O has started. I think we'd better
add a high limit on the in-flight I/O watermark.

Andrea



Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Andrea Arcangeli

On Thu, Feb 22, 2001 at 06:40:48PM -0200, Marcelo Tosatti wrote:
> You want to throttle I/O if the amount of in-flight data is higher than
> a given percentage of _main memory_.
> 
> As far as I can see, your patch prevents each individual queue from
> growing bigger than the high watermark (which is a percentage of main
> memory). However, it does not prevent multiple queues together from
> growing bigger than the high watermark.

I of course see what you mean, and I considered it, but I tend to believe
that's a minor issue and that most machines will be happier without the
global unplug, even if that means doing without the global limit.

The only reason we added the limit on in-flight I/O is to be allowed a huge
number of requests, so that the elevator can do very large reordering and
merging with seeking I/O (4k-sized I/O requests), _but_ without having to
wait for gigabytes of pages to be locked in RAM before the I/O starts if
the I/O was contiguous. We absolutely need such a sanity limit; it would be
absolutely insane to wait for kupdate to submit 10G of RAM to a single hard
disk before unplugging on a 30G machine.

The limit doesn't need to be exact; it's not as if performance suffers, the
machine crashes, or tasks get killed unless we unplug after precisely 1/3
of global RAM is locked. As Jens noticed, sync_page_buffers will unplug the
queue at some point if we're low on RAM anyway.

The limit just says "unplug after a reasonable point, beyond which it no
longer matters to try to delay requests for this hard disk, no matter
whether there are still I/O requests available".

However, if you have hundreds of different queues doing I/O at the same
time it may make a difference, but probably with tons of hard disks
you'll also have tons of RAM... In theory we could put a global limit
on top of the per-queue one. Or we could at least cap the
total_ram / 3 value.
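
The check would then just become two-level, something like this (a
sketch with invented names; the global counter would be the sum over
all queues):

extern unsigned long total_queued_sectors;	/* summed over all queues */
extern unsigned long global_high_sectors;	/* capped total_ram / 3 */

struct one_queue {
	unsigned long queued_sectors;
	unsigned long high_queued_sectors;
};

static int must_throttle(const struct one_queue *q)
{
	/* throttle when this queue is over its own mark, or when all
	 * the queues together are over the global cap */
	return q->queued_sectors >= q->high_queued_sectors ||
	       total_queued_sectors >= global_high_sectors;
}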

Note that 2.4.0 doesn't enforce a global limit on in-flight I/O either
(while in 2.2 the limit is global, since there is a shared pool of I/O
requests).

Andrea



Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Marcelo Tosatti


On Thu, 22 Feb 2001, Andrea Arcangeli wrote:

> On Thu, Feb 22, 2001 at 10:59:20AM -0800, Linus Torvalds wrote:
> > I'd prefer for this check to be a per-queue one.
> 
> I've been running this in my tree for a few weeks, but I never had the
> courage to post it publicly because I haven't benchmarked it carefully yet
> and I'd prefer to finish another thing first.

You want to throttle I/O if the amount of in-flight data is higher than
a given percentage of _main memory_.

As far as I can see, your patch prevents each individual queue from
growing bigger than the high watermark (which is a percentage of main
memory). However, it does not prevent multiple queues together from
growing bigger than the high watermark.






Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Marcelo Tosatti


On Thu, 22 Feb 2001, Linus Torvalds wrote:

> 
> 
> On Thu, 22 Feb 2001, Jens Axboe wrote:
> 
> > On Thu, Feb 22 2001, Marcelo Tosatti wrote:
> > > The following piece of code in ll_rw_block() aims to limit the number of
> > > locked buffers by making processes throttle on I/O if the number of
> > > in-flight requests is bigger than a high watermark. I/O will only start
> > > again once we're under the low watermark.
> > > 
> > > if (atomic_read(&queued_sectors) >= high_queued_sectors) {
> > > 	run_task_queue(&tq_disk);
> > > 	wait_event(blk_buffers_wait,
> > > 		   atomic_read(&queued_sectors) < low_queued_sectors);
> > > }
> > > 
> > > 
> > > However, if submit_bh() is used to queue I/O (which is used by ->readpage()
> > > for ext2, for example), no throttling happens.
> > > 
> > > It looks like ll_rw_block() users (writes, metadata reads) can be starved
> > > by submit_bh() (data reads). 
> > > 
> > > If I'm not missing something, the watermark check should be moved to
> > > submit_bh(). 
> > 
> > We might as well put it there; the idea was not to lock this one
> > buffer either, but I doubt it would make any difference in reality :-)
> 
> I'd prefer for this check to be a per-queue one.
> 
> Right now a slow device (like a floppy) would artificially throttle a fast
> one, if I read the above right. So instead of moving it down the
> call-chain, I'd rather remove the check completely as it looks wrong to
> me.
> 
> Now, if people want throttling, I'd much rather see that done per-queue.
> 
> (There's another level of throttling that might make sense: right now the
> swap-out code has this "nr_async_pages" throttling, which is very different
> from the queue throttling. It might make sense to move that _VM_-level
> throttling to writepage too - so that syncing of dirty mmaps will not
> cause an overload of pages in flight. This was one of the reasons I
> changed the semantics of writepage - so that shared mappings could do
> that kind of smoothing too.)

And what about write() and read() if you do the throttling with
nr_async_pages?

The current scheme inside the block layer throttles write()s based on the
number of locked buffers in _main memory_.









Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Linus Torvalds



On Thu, 22 Feb 2001, Jens Axboe wrote:

> On Thu, Feb 22 2001, Marcelo Tosatti wrote:
> > The following piece of code in ll_rw_block() aims to limit the number of
> > locked buffers by making processes throttle on I/O if the number of
> > in-flight requests is bigger than a high watermark. I/O will only start
> > again once we're under the low watermark.
> > 
> > if (atomic_read(&queued_sectors) >= high_queued_sectors) {
> > 	run_task_queue(&tq_disk);
> > 	wait_event(blk_buffers_wait,
> > 		   atomic_read(&queued_sectors) < low_queued_sectors);
> > }
> > 
> > 
> > However, if submit_bh() is used to queue I/O (which is used by ->readpage()
> > for ext2, for example), no throttling happens.
> > 
> > It looks like ll_rw_block() users (writes, metadata reads) can be starved
> > by submit_bh() (data reads). 
> > 
> > If I'm not missing something, the watermark check should be moved to
> > submit_bh(). 
> 
> We might as well put it there; the idea was not to lock this one
> buffer either, but I doubt it would make any difference in reality :-)

I'd prefer for this check to be a per-queue one.

Right now a slow device (like a floppy) would artificially throttle a fast
one, if I read the above right. So instead of moving it down the
call-chain, I'd rather remove the check completely as it looks wrong to
me.

Now, if people want throttling, I'd much rather see that done per-queue.

(There's another level of throttling that might make sense: right now the
swap-out code has this "nr_async_pages" throttling, which is very different
from the queue throttling. It might make sense to move that _VM_-level
throttling to writepage too - so that syncing of dirty mmaps will not
cause an overload of pages in flight. This was one of the reasons I
changed the semantics of writepage - so that shared mappings could do
that kind of smoothing too.)

Linus




Re: ll_rw_block/submit_bh and request limits

2001-02-22 Thread Jens Axboe

On Thu, Feb 22 2001, Marcelo Tosatti wrote:
> The following piece of code in ll_rw_block() aims to limit the number of
> locked buffers by making processes throttle on I/O if the number of
> in-flight requests is bigger than a high watermark. I/O will only start
> again once we're under the low watermark.
> 
> if (atomic_read(&queued_sectors) >= high_queued_sectors) {
> 	run_task_queue(&tq_disk);
> 	wait_event(blk_buffers_wait,
> 		   atomic_read(&queued_sectors) < low_queued_sectors);
> }
> 
> 
> However, if submit_bh() is used to queue I/O (which is used by ->readpage()
> for ext2, for example), no throttling happens.
> 
> It looks like ll_rw_block() users (writes, metadata reads) can be starved
> by submit_bh() (data reads). 
> 
> If I'm not missing something, the watermark check should be moved to
> submit_bh(). 

We might as well put it there; the idea was not to lock this one
buffer either, but I doubt it would make any difference in reality :-)

Linus, could you apply?

--- /opt/kernel/linux-2.4.2/drivers/block/ll_rw_blk.c   Thu Feb 22 14:55:22 2001
+++ drivers/block/ll_rw_blk.c   Thu Feb 22 14:53:07 2001
@@ -957,6 +959,20 @@
if (!test_bit(BH_Lock, &bh->b_state))
BUG();
 
+   /*
+* don't lock any more buffers if we are above the high
+* water mark. instead start I/O on the queued stuff.
+*/
+   if (atomic_read(&queued_sectors) >= high_queued_sectors) {
+   run_task_queue(&tq_disk);
+   if (rw == READA) {
+   bh->b_end_io(bh, test_bit(BH_Uptodate, &bh->b_state));
+   return;
+   }
+   wait_event(blk_buffers_wait,
+   atomic_read(&queued_sectors) < low_queued_sectors);
+   }
+
set_bit(BH_Req, &bh->b_state);
 
/*
@@ -1057,16 +1073,6 @@
 
for (i = 0; i < nr; i++) {
struct buffer_head *bh = bhs[i];
-
-   /*
-* don't lock any more buffers if we are above the high
-* water mark. instead start I/O on the queued stuff.
-*/
-   if (atomic_read(&queued_sectors) >= high_queued_sectors) {
-   run_task_queue(&tq_disk);
-   wait_event(blk_buffers_wait,
-atomic_read(&queued_sectors) < low_queued_sectors);
-   }
 
/* Only one thread can actually submit the I/O. */
if (test_and_set_bit(BH_Lock, &bh->b_state))

-- 
Jens Axboe
