Re: 2.2.19pre3 and poor reponse to RT-scheduled processes?

2000-12-30 Thread Linus Torvalds



On Sat, 30 Dec 2000, Alexander Viro wrote:
> On 30 Dec 2000, Linus Torvalds wrote:
> 
> > There are other, equally likely, candidates for these kinds of stalls:
> > 
> >  - filesystem locks. Especially the ext2 superblock lock. You can easily
> >    hit this one, as some ext2 functions actually do a lot of IO while
> >    holding the lock.
> 
> Hmm... In 2.4 we can make the situation with superblock lock on ext2
> much better.

Actually, 2.4.x right now is worse than 2.2.x in this regard, for a really
simple reason: 2.2.x will only do the equivalent of "rebalance_dirty" when
it dirties a previously clean buffer. The current 2.4.x code does that
regardless of whether the buffer was dirty before or not.

I want to see your patches to fix this for good in a 2.5.x timeframe (or,
if they are really clean and obvious, at a later 2.4.x date), but for
2.4.x I think that we'll do either "remove rebalance dirty completely" or
at the very least we'll not re-balance for re-dirtying a dirty buffer.

Re-dirtying a dirty buffer is the common case for the superblock
stuff: bitmap blocks etc are often dirty already, _especially_ in the case
of an active writer. So 2.4.x is actually more likely to hit the
superblock/bdflush contention.

Of course, 2.4.x has had so many improvements in file writing memory
pressure that it might not end up being that noticeable, but even so..

Linus
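
To illustrate the difference between the two policies described in this
message, here is a minimal user-space sketch in C. It is not kernel code:
struct buf, rebalance_dirty() and both mark_dirty_*() helpers are
hypothetical stand-ins, and the only thing modeled is how often the
rebalance step would run when the same buffer is dirtied repeatedly.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a buffer head: only the dirty bit matters here. */
struct buf {
    bool dirty;
};

/* Counts how often the (potentially blocking) rebalance step would run. */
static unsigned long rebalance_calls;

/* Stand-in for the "rebalance dirty" step, which may wait on bdflush. */
static void rebalance_dirty(void)
{
    rebalance_calls++;
}

/* Policy attributed to 2.2.x above: rebalance only on clean -> dirty. */
static void mark_dirty_on_transition(struct buf *b)
{
    bool was_dirty = b->dirty;

    b->dirty = true;
    if (!was_dirty)
        rebalance_dirty();
}

/* Policy attributed to current 2.4.x above: rebalance on every call. */
static void mark_dirty_always(struct buf *b)
{
    b->dirty = true;
    rebalance_dirty();
}

/* Dirty one buffer n times under the given policy; return rebalance count. */
static unsigned long count_rebalances(void (*mark)(struct buf *), int n)
{
    struct buf bitmap_block = { .dirty = false };

    rebalance_calls = 0;
    for (int i = 0; i < n; i++)
        mark(&bitmap_block);
    return rebalance_calls;
}
```

Calling count_rebalances(mark_dirty_on_transition, 1000) yields one
rebalance, while count_rebalances(mark_dirty_always, 1000) yields 1000.
That gap is the extra exposure to bdflush waits for a bitmap block that
an active writer keeps re-dirtying.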

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.2.19pre3 and poor reponse to RT-scheduled processes?

2000-12-30 Thread Alexander Viro



On 30 Dec 2000, Linus Torvalds wrote:

> There are other, equally likely, candidates for these kinds of stalls:
> 
>  - filesystem locks. Especially the ext2 superblock lock. You can easily
>    hit this one, as some ext2 functions actually do a lot of IO while
>    holding the lock.

Hmm... In 2.4 we can make the situation with superblock lock on ext2
much better. I didn't go the whole way down to spinlocks, but right now
I'm sitting on a box with modified ext2 that doesn't do _any_ IO in
protected parts of ext2_new_inode()/ext2_new_block(). I can try to
extract the relevant parts of the patch if you are interested (it also
got directories-in-pagecache stuff and better SMP threading of
get_block()/truncate()). The thing seems to be working fine and I see
no serious contention on lock_super(). Dunno if it's worth doing before
2.4.0, but since it has zero impact on the rest of tree (OK, zero except
that write_on_page() had been exported, but I could trivially get rid
of that)... Maybe 2.4.early would be a good idea.
Cheers,
Al
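
The idea of keeping I/O out of the locked region can be sketched in user
space as follows. This is only an illustration under stated assumptions,
not the patch discussed above: struct fake_super, read_bitmap_from_disk()
and the alloc_*() helpers are hypothetical, a pthread mutex plays the
role of lock_super(), and the "disk read" is simulated by copying a
field.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a superblock with one cached allocation bitmap. */
struct fake_super {
    pthread_mutex_t lock;        /* plays the role of lock_super() */
    uint64_t on_disk_bitmap;     /* what a disk read would return; read-only */
    uint64_t bitmap;             /* cached copy, 1 bit = in use */
    int bitmap_loaded;           /* has the cached copy been filled in? */
    unsigned long io_under_lock; /* simulated reads issued with the lock held */
};

static void fake_super_init(struct fake_super *sb, uint64_t on_disk)
{
    pthread_mutex_init(&sb->lock, NULL);
    sb->on_disk_bitmap = on_disk;
    sb->bitmap = 0;
    sb->bitmap_loaded = 0;
    sb->io_under_lock = 0;
}

/* Simulated blocking read; records whether the caller held the lock. */
static uint64_t read_bitmap_from_disk(struct fake_super *sb, int lock_held)
{
    if (lock_held)
        sb->io_under_lock++;
    return sb->on_disk_bitmap;
}

/* Caller holds the lock. Claims the first free bit, or returns -1 if full. */
static int alloc_bit_locked(struct fake_super *sb)
{
    for (int i = 0; i < 64; i++) {
        if (!(sb->bitmap & (UINT64_C(1) << i))) {
            sb->bitmap |= UINT64_C(1) << i;
            return i;
        }
    }
    return -1;
}

/* Pattern to avoid: the blocking read happens inside the locked region. */
static int alloc_io_inside_lock(struct fake_super *sb)
{
    pthread_mutex_lock(&sb->lock);
    if (!sb->bitmap_loaded) {
        sb->bitmap = read_bitmap_from_disk(sb, 1);
        sb->bitmap_loaded = 1;
    }
    int bit = alloc_bit_locked(sb);
    pthread_mutex_unlock(&sb->lock);
    return bit;
}

/* Read without the lock, then lock only for the in-memory update. */
static int alloc_io_outside_lock(struct fake_super *sb)
{
    pthread_mutex_lock(&sb->lock);
    int need_read = !sb->bitmap_loaded;
    pthread_mutex_unlock(&sb->lock);

    uint64_t fresh = 0;
    if (need_read)
        fresh = read_bitmap_from_disk(sb, 0);  /* may block; lock not held */

    pthread_mutex_lock(&sb->lock);
    if (need_read && !sb->bitmap_loaded) {     /* recheck: another thread may have loaded it */
        sb->bitmap = fresh;
        sb->bitmap_loaded = 1;
    }
    int bit = alloc_bit_locked(sb);
    pthread_mutex_unlock(&sb->lock);
    return bit;
}
```

In the second variant other threads only ever wait for the short bitmap
update, never for the simulated read. The price is the recheck after
re-taking the lock, since another thread may have loaded the bitmap in
the meantime.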




Re: 2.2.19pre3 and poor reponse to RT-scheduled processes?

2000-12-30 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Andrea Arcangeli  <[EMAIL PROTECTED]> wrote:
>On Fri, Dec 29, 2000 at 04:54:23PM -0500, Rafal Boni wrote:
>> Now my box behaves much more reasonably... I'll just have to beat harder
>> on it and see what happens.
>
>Another thing: while writing to disk if you want low latency readers you can
>do:
>
>   elvtune -r 1 /dev/hd[abcd]
>
>The 1-2 second stalls you see could just be due to applications that wait for
>I/O synchronously while the elevator is reordering I/O requests (and even if the
>elevator didn't reorder anything, new requests would go to the end of the
>I/O queue, so they would have somewhat higher latency anyway).

That sounds like too long a stall to be due to elevator ordering except
with some _really_ unlucky access patterns (or with slow disks). 

There are other, equally likely, candidates for these kinds of stalls:

 - filesystem locks. Especially the ext2 superblock lock. You can easily
   hit this one, as some ext2 functions actually do a lot of IO while
   holding the lock.

 - synchronously waiting for bdflush with balance_dirty_buffers().
   Especially mixed with the above.

A mixture of the two above will basically stall the whole machine: almost
any non-cached file access ends up waiting for the superblock lock and
bdflush, and it can easily get quite unfair.

Linus



Re: 2.2.19pre3 and poor reponse to RT-scheduled processes?

2000-12-30 Thread Andrea Arcangeli

On Fri, Dec 29, 2000 at 04:54:23PM -0500, Rafal Boni wrote:
> Now my box behaves much more reasonably... I'll just have to beat harder
> on it and see what happens.

Another thing: while writing to disk if you want low latency readers you can
do:

elvtune -r 1 /dev/hd[abcd]

The 1-2 second stalls you see could just be due to applications that wait for
I/O synchronously while the elevator is reordering I/O requests (and even if the
elevator didn't reorder anything, new requests would go to the end of the
I/O queue, so they would have somewhat higher latency anyway). That's normal,
and if that's the case, the only way to avoid those stalls is to decrease the
I/O load or increase disk throughput ;). The important thing is that the
kernel is not sitting in a tight kernel loop without a reschedule during
those 2 seconds.

However, 2.2.19pre3aa4 also includes the lowlatency bugfixes, in case you have
tons of RAM and you're sending huge buffers to syscalls.

Andrea
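
The queueing argument above can be put into numbers with a deliberately
crude model. The helpers below are hypothetical, as are the figures in
the usage note. max_passed is only a loose stand-in for a read-latency
bound and does not reproduce the exact unit or semantics of the value
given to elvtune -r.

```c
/* Number of already-queued requests served before a newly arrived read.
 * `queued` is the queue depth on arrival; `max_passed` caps how many
 * requests may be served ahead of the read (a loose latency bound). */
static unsigned requests_ahead(unsigned queued, unsigned max_passed)
{
    return queued < max_passed ? queued : max_passed;
}

/* Time the read waits, assuming every request takes `service_ms` to serve. */
static unsigned long read_wait_ms(unsigned queued, unsigned max_passed,
                                  unsigned service_ms)
{
    return (unsigned long)requests_ahead(queued, max_passed) * service_ms;
}
```

With 200 queued writes at an assumed 10 ms each and no effective bound,
read_wait_ms(200, 200, 10) gives 2000 ms, which is the order of stall
reported in this thread. With a bound of 1, read_wait_ms(200, 1, 10)
gives 10 ms.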



Re: 2.2.19pre3 and poor reponse to RT-scheduled processes?

2000-12-29 Thread Rafal Boni

In message <[EMAIL PROTECTED]>, Greg Maxwell wrote:

-> You are running IDE, aren't you?
-> 
-> Enable DMA and/or unmask interrupts.

D'oh!  Thanks to Greg for the clue-by-four!  I *am* running IDE and I had
both DMA (due to misreading the kernel boot messages) and interrupt unmasking
(since I had forgotten that one) off.

I had assumed that DMA was on from the mention of it in kernel messages 
(which on closer reading do indicate CMOS/BIOS configured default modes,
not what the kernel is using), and the lack of an explicit message on
the order of "I know it's there, but I'm not going to use it all the
same" 8-)

Now my box behaves much more reasonably... I'll just have to beat harder
on it and see what happens.

Thanks for the help,
--rafal
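
Since boot messages can be misread, one way to check what the IDE driver
is actually using is to ask it directly. The sketch below uses the
HDIO_GET_DMA and HDIO_GET_UNMASKINTR ioctls from <linux/hdreg.h>, the
same settings that hdparm -d and hdparm -u report. The helper name
ide_setting() is hypothetical, and the device path in the usage note is
an assumption about the machine.

```c
#include <fcntl.h>
#include <linux/hdreg.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the IDE driver for one on/off setting of `dev`.
 * `request` is HDIO_GET_DMA or HDIO_GET_UNMASKINTR.
 * Returns 1 if the setting is on, 0 if it is off, and -1 if the device
 * cannot be opened or does not support the query. */
static int ide_setting(const char *dev, unsigned long request)
{
    long value = 0;
    int fd = open(dev, O_RDONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;

    int rc = ioctl(fd, request, &value);

    close(fd);
    if (rc != 0)
        return -1;
    return value != 0;
}
```

For example, ide_setting("/dev/hda", HDIO_GET_DMA) returning 0 would mean
DMA is off for that drive regardless of what the BIOS advertised at boot.
Opening the device normally needs root.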

- 
Rafal Boni  [EMAIL PROTECTED]
 PGP key C7D3024C, print EA49 160D F5E4 C46A 9E91  524E 11E0 7133 C7D3 024C
Need to get a hold of me?  http:[EMAIL PROTECTED]




Re: 2.2.19pre3 and poor reponse to RT-scheduled processes?

2000-12-29 Thread Gregory Maxwell

On Fri, Dec 29, 2000 at 03:45:23PM -0500, Rafal Boni wrote:
[snip]
>   The box in question is running the linux-ha.org heartbeat package,
>   which is an RT-scheduled, mlock()'ed process, and as such should
>   get as good service as the box is able to manage.  Often, under
>   high disk (and/or MM) loads, the box becomes unresponsive for a
>   period of time from ~ 1 sec to a high of ~ 2.8sec.
[snip]

You are running IDE, aren't you?

Enable DMA and/or unmask interrupts.

man hdparm

Good luck.
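
For readers unfamiliar with the setup quoted above, here is a minimal
sketch of what "RT-scheduled, mlock()'ed" means in practice, together
with a simple way to measure stalls like the ones reported. It is not
the heartbeat package's code: make_realtime() and
worst_wakeup_delay_ns() are hypothetical helpers, and the priority and
period values are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <sys/mman.h>
#include <time.h>

/* Switch the calling process to SCHED_FIFO and lock all of its pages in
 * RAM. Returns 0 on success, -1 if either step fails (typically EPERM
 * when not run as root). */
static int make_realtime(int priority)
{
    struct sched_param sp = { .sched_priority = priority };

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -1;
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -1;
    return 0;
}

/* Current monotonic time in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep `period_ms` at a time for `iterations` rounds and return the
 * worst overshoot in nanoseconds, i.e. how much later than requested
 * the process got to run again. Never returns a negative value. */
static long long worst_wakeup_delay_ns(int iterations, int period_ms)
{
    struct timespec period = {
        .tv_sec = period_ms / 1000,
        .tv_nsec = (period_ms % 1000) * 1000000L,
    };
    long long worst = 0;

    for (int i = 0; i < iterations; i++) {
        long long before = now_ns();

        nanosleep(&period, NULL);

        long long late = now_ns() - before - period_ms * 1000000LL;

        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Running worst_wakeup_delay_ns(1000, 10) after make_realtime(1) while the
disk is under load would show whether the process is still being held
off for seconds at a time. On a healthy box the overshoot should stay
far below that.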


