Re: Terrible elevator performance in kernel 2.4.0-test8

2000-09-15 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Ingo Molnar  <[EMAIL PROTECTED]> wrote:
>
>i'm seeing similar problems. I think these problems started when the
>elevator was rewritten; i believe it broke the proper unplugging of IO
>devices. Does your performance problem get fixed by the attached
>workaround?

If this helps, that implies that somebody is doing a (critical) IO
request, without ever actually asking for the request to be _started_. 

That sounds like a bug, and the patch looks like a band-aid.

Where do we end up scheduling without starting the disk IO?

I'd rather add the disk schedule to _that_ place, instead of adding it
to every re-schedule event.

(For example, on a multi-CPU system it should _not_ be the case that one
CPU scheduling frequently causes IO performance to go down - yet your
patch will do exactly that).
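
For reference, a minimal sketch of the pattern in question: the waiter
kicks tq_disk itself before sleeping, roughly as the 2.4-era
__wait_on_buffer() in fs/buffer.c does (waitqueue and refcount handling
omitted; a sketch only, not the actual source):

/* Sketch: the code path that depends on the IO unplugs the disk queue
 * itself before sleeping, so the request it waits on actually gets
 * started; no need to unplug on every schedule(). */
void wait_on_buffer_sketch(struct buffer_head *bh)
{
	while (buffer_locked(bh)) {
		/* start any plugged requests before sleeping */
		run_task_queue(&tq_disk);
		set_current_state(TASK_UNINTERRUPTIBLE);
		if (!buffer_locked(bh))
			break;
		schedule();	/* sleep until the IO completes */
	}
	set_current_state(TASK_RUNNING);
}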

Could you make it print out a backtrace instead when this happens? (Make
it do so only for the first few times, so as not to flood your console
forever if it ends up being common.)
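
Something along these lines, perhaps; a hypothetical sketch only, meant
to be called where the workaround currently does the unconditional
run_task_queue() (TQ_ACTIVE() is the 2.4 task-queue emptiness test; the
backtrace call is arch-specific, e.g. show_stack() on i386):

/* Hypothetical debugging aid: if schedule() runs while disk requests
 * are still plugged, report the first few occurrences with a backtrace
 * so the guilty caller can be identified. */
static int plugged_reports;

static void report_plugged_schedule(void)
{
	if (TQ_ACTIVE(tq_disk) && plugged_reports < 5) {
		plugged_reports++;
		printk(KERN_WARNING "schedule() with plugged disk queue\n");
		show_stack(NULL);	/* arch-specific backtrace */
	}
}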

Linus



Re: Terrible elevator performance in kernel 2.4.0-test8

2000-09-14 Thread Rik van Riel

On Thu, 14 Sep 2000, Andrea Arcangeli wrote:
> On Thu, Sep 14, 2000 at 07:40:12PM +1000, Robert Cohen wrote:
> 
> > With kernel version 2.4.0-test1-ac22, I saw adequate performance.
> 
> In 2.4.0-test1-ac22 there was a latency-driven elevator (the
> one we have had since test2 can't provide good latency anymore).
> 
> If anything, it should be the other way around: the elevator
> that we have had since test2 should provide _better_ throughput
> and _fewer_ seeks. Thus it can't be the elevator algorithm;
> maybe, as Ingo said, something in the plugging broke during the
> test2 changes.

Indeed, you're right. The elevator /shouldn't/ be the source
of all these problems. Under very heavy (but regular) IO
loads I'm not seeing the stalls I experience under lighter
IO loads ...

That is bound to be a problem with unplugging.

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: Terrible elevator performance in kernel 2.4.0-test8

2000-09-14 Thread Andrea Arcangeli

On Thu, Sep 14, 2000 at 07:40:12PM +1000, Robert Cohen wrote:

> What I believe is happening is that the elevator isn't merging the
> requests properly.
> I think that this may be the same problem reported here:
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/0008.2/0389.html

The merging issue pointed out by Giuliano Pochini has been there since 2.2.0.

> With kernel version 2.4.0-test1-ac22, I saw adequate performance.

In 2.4.0-test1-ac22 there was a latency-driven elevator (the one we have
had since test2 can't provide good latency anymore).

If anything, it should be the other way around: the elevator that we have
had since test2 should provide _better_ throughput and _fewer_ seeks. Thus
it can't be the elevator algorithm; maybe, as Ingo said, something in the
plugging broke during the test2 changes.
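
To make the distinction concrete, a hedged sketch of the latency-driven
idea (standalone simplified C, not the actual ll_rw_blk.c code; "budget"
stands in for the per-request latency counter): a new request takes its
sorted position only if every request it would thereby pass can still
afford to be passed.

#include <stddef.h>

/* Each queued request tolerates only a bounded number of newcomers
 * jumping ahead of it; once its budget is spent, nothing may pass it. */
struct req {
	long sector;
	int budget;		/* passes this request will still tolerate */
	struct req *next;
};

/* Return the link where a new request for 'sector' should be spliced. */
static struct req **find_slot(struct req **head, long sector)
{
	struct req **p, **slot = NULL;
	struct req *r;

	for (p = head; *p; p = &(*p)->next) {
		if ((*p)->budget <= 0)
			slot = NULL;	/* must stay behind this request */
		else if (slot == NULL && (*p)->sector > sector)
			slot = p;	/* allowed sorted position */
	}
	if (slot == NULL)
		slot = p;		/* nowhere better: append at tail */
	for (r = *slot; r != NULL; r = r->next)
		r->budget--;		/* charge everyone we pass */
	return slot;
}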

> In 2.4.0-test3 through test6, the default max_bombs value became 0, and
> the performance with this setting was terrible.

It's zero because in reality it is not limited anymore; this just means it
should provide better performance and worse latency.

> Although I still saw a tendency for a client to get write starved.

You are doing synchronous writes, right?

> Unfortunately, the benchmarks don't show any improvement.

tiotest should provide better numbers with the elevator from test2
(compared to the test1 one).

Andrea



Re: Terrible elevator performance in kernel 2.4.0-test8

2000-09-14 Thread Ingo Molnar


i'm seeing similar problems. I think these problems started when the
elevator was rewritten; i believe it broke the proper unplugging of IO
devices. Does your performance problem get fixed by the attached
workaround?

Ingo

On Thu, 14 Sep 2000, Robert Cohen wrote:

> For a while, I've been seeing a performance problem with 2.4.0-test
> kernels. The benchmark I am using is a netatalk performance benchmark,
> but I think this is a general performance problem, not appletalk
> related. The benchmark has a varying number of clients reading and
> writing 30 MB files.
> The symptom I see is that with more than 2 or 3 clients, I see a sudden
> and gigantic reduction in write performance. At the same time I can hear
> the disk seeking wildly, and the throughput reported by "vmstat 5" drops
> from 2000-3000 to 100-200.
> 
> What I believe is happening is that the elevator isn't merging the
> requests properly.
> I think that this may be the same problem reported here:
> http://www.uwsg.indiana.edu/hypermail/linux/kernel/0008.2/0389.html
> 
> When stracing the netatalk servers, I can see that they are reading from
> the network, then doing an 8k write, and repeating.
> If I try to simulate the problem by running multiple iozones doing 8k
> writes, I don't see the same kind of problems.
> However, in a non-networked benchmark like iozone, each process does
> many writes in its timeslice, and these writes coalesce naturally.
> In the networked benchmark, the read from the network introduces
> enough delay that we get a context switch, and the writes to different
> files become interleaved.
> This is precisely the sort of situation that the elevator is supposed to
> help with.
> 
> With kernel version 2.4.0-test1-ac22, I saw adequate performance. In
> this version, the default elevator settings had a max_bombs value of 32.
> 
> In 2.4.0-test3 through test6, the default max_bombs value became 0, and
> the performance with this setting was terrible.
> If I increased max_bombs with elvtune, the performance markedly improved,
> although I still saw a tendency for a client to get write starved.
> 
> In 2.4.0-test8, the max_bombs value has been eliminated, so I can't
> change it. I was hoping that meant the algorithm had been improved.
> Unfortunately, the benchmarks don't show any improvement.
> 
> --
> Robert Cohen
> Unix Support, TLTSU
> Australian National University


--- linux/kernel/sched.c.orig	Sun Sep  3 10:03:35 2000
+++ linux/kernel/sched.c	Mon Sep  4 09:23:07 2000
@@ -508,6 +508,7 @@
 	if (tq_scheduler)
 		goto handle_tq_scheduler;
 tq_scheduler_back:
+	run_task_queue(&tq_disk);
 
 	prev = current;
 	this_cpu = prev->processor;



Terrible elevator performance in kernel 2.4.0-test8

2000-09-14 Thread Robert Cohen

For a while, I've been seeing a performance problem with 2.4.0-test
kernels. The benchmark I am using is a netatalk performance benchmark,
but I think this is a general performance problem, not appletalk related.
The benchmark has a varying number of clients reading and writing 30 MB
files.
The symptom I see is that with more than 2 or 3 clients, I see a sudden
and gigantic reduction in write performance. At the same time I can hear
the disk seeking wildly, and the throughput reported by "vmstat 5" drops
from 2000-3000 to 100-200.

What I believe is happening is that the elevator isn't merging the
requests properly.
I think that this may be the same problem reported here:
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0008.2/0389.html

When stracing the netatalk servers, I can see that they are reading from
the network, then doing an 8k write, and repeating.
If I try to simulate the problem by running multiple iozones doing 8k
writes, I don't see the same kind of problems.
However, in a non-networked benchmark like iozone, each process does many
writes in its timeslice, and these writes coalesce naturally.
In the networked benchmark, the read from the network introduces enough
delay that we get a context switch, and the writes to different files
become interleaved.
This is precisely the sort of situation that the elevator is supposed to
help with.
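
For what it's worth, here is a minimal userspace sketch of the pattern I
mean (a hypothetical reproducer, not my actual benchmark): several
writers, each alternating a short delay, standing in for the network
read, with an 8k write to its own file.

/* Hypothetical reproducer sketch: N children each loop "sleep a bit,
 * then write 8k to my own file", so writes to different files
 * interleave at the block layer much as the netatalk clients do. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

#define NCLIENTS 4
#define CHUNK    8192
#define LOOPS    ((30 * 1024 * 1024) / CHUNK)	/* ~30 MB per client */

int main(void)
{
	char buf[CHUNK];
	int n;

	memset(buf, 'x', sizeof(buf));
	for (n = 0; n < NCLIENTS; n++) {
		if (fork() == 0) {
			char name[32];
			int fd, i;

			snprintf(name, sizeof(name), "client%d.dat", n);
			fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
			if (fd < 0) {
				perror("open");
				exit(1);
			}
			for (i = 0; i < LOOPS; i++) {
				usleep(1000);	/* stand-in for the net read */
				if (write(fd, buf, CHUNK) != CHUNK) {
					perror("write");
					exit(1);
				}
			}
			close(fd);
			exit(0);
		}
	}
	while (wait(NULL) > 0)
		;
	return 0;
}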

With kernel version 2.4.0-test1-ac22, I saw adequate performance. In this
version, the default elevator settings had a max_bombs value of 32.

In 2.4.0-test3 through test6, the default max_bombs value became 0, and
the performance with this setting was terrible.
If I increased max_bombs with elvtune, the performance markedly improved,
although I still saw a tendency for a client to get write starved.

In 2.4.0-test8, the max_bombs value has been eliminated, so I can't
change it. I was hoping that meant the algorithm had been improved.
Unfortunately, the benchmarks don't show any improvement.

--
Robert Cohen
Unix Support, TLTSU
Australian National University