Re: Break 2.4 VM in five easy steps

2001-06-12 Thread Bernd Jendrissek

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, Jun 11, 2001 at 04:04:45PM -0300, Rik van Riel wrote:
> On Mon, 11 Jun 2001, Maciej Zenczykowski wrote:
> > On Fri, 8 Jun 2001, Pavel Machek wrote:
> >
> > > That modulo is likely slower than dereference.
> > >
> > > > +   if (count % 256 == 0) {
> >
> > You are forgetting that this case should be converted to and 255
> > or a plain byte reference by any optimizing compiler

You read too much into my choice - 256 is a random number ;)

> What matters is that this thing calls schedule() unconditionally
> every 256th time.  Checking current->need_resched will only call
> schedule if it is needed ... not only that, but it will also
> call schedule FASTER if it is needed.

I will try this later today, but it seems right enough.

generic_file_write seems to do enough other work that a dereference
vs. and-255 shouldn't be too bad...

Bernd Jendrissek
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7Jciz/FmLrNfLpjMRAmI9AKCm2EYziCzG0qrobFooGLf3kepb/wCbBQf6
nXmD/OZNhGttwQejZtYi3ic=
=rWL2
-----END PGP SIGNATURE-----
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Break 2.4 VM in five easy steps

2001-06-11 Thread Rik van Riel

On Mon, 11 Jun 2001, Maciej Zenczykowski wrote:
> On Fri, 8 Jun 2001, Pavel Machek wrote:
>
> > That modulo is likely slower than dereference.
> >
> > > +   if (count % 256 == 0) {
>
> You are forgetting that this case should be converted to and 255
> or a plain byte reference by any optimizing compiler

Not relevant.

What matters is that this thing calls schedule() unconditionally
every 256th time.  Checking current->need_resched will only call
schedule if it is needed ... not only that, but it will also
call schedule FASTER if it is needed.

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com/




Re: Break 2.4 VM in five easy steps

2001-06-11 Thread Pavel Machek

Hi!

> If this solves your problem, use it; if your name is Linus or Alan,
> ignore or do it right please.

Well, I guess you should use CONDITIONAL_SCHEDULE (if it is not defined
as a macro, do if (current->need_resched) schedule()).

That modulo is likely slower than dereference.

> diff -u -r1.1 -r1.2
> --- linux-hack/mm/filemap.c 2001/06/06 21:16:28 1.1
> +++ linux-hack/mm/filemap.c 2001/06/07 08:57:52 1.2
> @@ -2599,6 +2599,11 @@
> char *kaddr;
> int deactivate = 1;
>  
> +   /* bernd-hack: give other processes a chance to run */
> +   if (count % 256 == 0) {
> +   schedule();
> +   }
> +
> /*
>  * Try to find the page in the cache. If it isn't there,
>  * allocate a free page.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.0.4 (GNU/Linux)
> Comment: For info see http://www.gnupg.org
> 
> iD8DBQE7H1tb/FmLrNfLpjMRAguAAJ0fYInFbAa6LjFC/CWZbRPQxzZwrwCeNqT0
> /Kod15Nx7AzaM4v0WhOgp88=
> =pyr6
> -----END PGP SIGNATURE-----

-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.




Re: Break 2.4 VM in five easy steps

2001-06-11 Thread Pavel Machek

Hi!

> But if the page in memory is 'dirty', you can't be efficient with swapping
> *in* the page.  The page on disk is invalid and should be released, or am I
> missing something?

Yes. You are missing fragmentation. This keeps it low.
Pavel
-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.




Re: Break 2.4 VM in five easy steps

2001-06-10 Thread Rob Landley

>I realize that assembly is platform-specific. Being 
>that I use the IA32 class machine, that's what I 
>would write for. Others who use other platforms could
>do the deed for their native language.

Meaning we'd still need a good C implementation anyway
for the 75% of platforms nobody's going to get around
to writing an assembly implementation for this year,
so we might as well do that first, eh?

As for IA32 being everywhere, 16 bit 8086 was
everywhere until 1990 or so.  And 64 bitness is right
around the corner (iTanic is a pointless way of
de-optimizing for memory bus bandwidth, which is your
real bottleneck and not whatever happens inside a chip
you've clock multiplied by a factor of 12 or more. 
But x86-64 looks seriously cool if AMD would get off
their rear and actually implement sledgehammer in
silicon within our lifetimes.  And that's probably
transmeta's way of going 64 bit eventually too.  (And
that was obvious even BEFORE the cross licensing
agreement was announced.))

And interestingly, an assembly routine optimized for
the 386 just might get beaten by C code compiled
with Athlon optimizations.  It's not JUST "IA32". 
Memory management code probably has to know about the
PAE addressing extensions, different translation
lookaside buffer versions, and interacting with the
wonderful wide world of DMA.  Luckily in kernel we
just don't do floating point (MMX/3DNow/whatever it
was they're so proud of in Pentium 4 whose acronym
I've forgotten at the moment.  Not SLS, that was a
linux distribution...)

If you're a dyed-in-the-wool assembly hacker, go help
the GCC/EGCS folks make a better compiler.  They could
use you.  The kernel isn't the place for assembly
optimization.

>Being that most users are on the IA32 platform, I'm 
>sure they wouldn't reject an assembly solution to 
>this problem.

If it's unreadable to C hackers, so that nobody
understands it, so that it's black magic that
positively invites subtle bugs from other code that
has to interface with it...

Yes they darn well WOULD reject it.  Simplicity and
clarity are actually slightly MORE important than raw
performance, since if you just wait six months the midrange
hardware gets 30% faster.

The ONLY assembly that's left in the kernel is the
stuff that's unavoidable, like boot sectors and the
setup code that bootstraps the first kernel init
function in C, or perhaps the occasional driver that's
so amazingly timing dependent it's effectively
real-time programming at the nanosecond level.  (And
for most of those, they've either faked a C solution
or restricted the assembly to 5 lines in the middle of
a bunch of C code.  Memo: this is the kind of thing
where profanity gets into kernel comments.)  And of
course there are a few assembly macros for half-dozen
line things like spinlocks that either can't be done
any other way or are real bottleneck cases where the
cost of the extra opacity (which is a major cost, that
is definitely taken into consideration) honestly is
worth it.

> As for kernel acceptance, that's an
>issue for the political eggheads. Not my forte. :-)

The problem in this case is that an O(n^2) or worse
algorithm is being used.  Converting it to assembly
isn't going to fix something that gets quadratically
worse, it just means that instead of blowing up at 2
gigs it now blows up at 6 gigs.  That's not a long
term solution.

If eliminating 5 lines of assembly is a good thing,
rewriting an entire subsystem in assembly isn't going
to happen.  Trust us on this one.

Rob




Re: Break 2.4 VM in five easy steps

2001-06-09 Thread Mike A. Harris

On Sat, 9 Jun 2001, Rik van Riel wrote:

>> Why are half the people here trying to hide behind this diskspace
>> is cheap argument?  If we rely on that, then Linux sucks shit.
>
>Never mind them, I haven't seen any of them contribute
>VM code, even ;)

Nor have I, but I think you guys working on it will get it
cleaned up eventually.  What bugs me is people trying to pretend
that it isn't important to fix, or that spending money to get
newer hardware is an acceptable solution.

>OTOH, disk space _is_ cheap, so the other VM - performance
>related - VM bugs do have a somewhat higher priority at the
>moment.

Yes, it is cheap.  It isn't always an acceptable workaround
though, so I'm glad you guys are working on it - even if we have
to wait a bit.

I have faith in the system.  ;o)

--
Mike A. Harris  -  Linux advocate  -  Open Source advocate
   Opinions and viewpoints expressed are solely my own.
--




Re: Break 2.4 VM in five easy steps

2001-06-09 Thread Rik van Riel

On Wed, 6 Jun 2001, Mike A. Harris wrote:

> Why are half the people here trying to hide behind this diskspace
> is cheap argument?  If we rely on that, then Linux sucks shit.

Never mind them, I haven't seen any of them contribute
VM code, even ;)

OTOH, disk space _is_ cheap, so the other VM - performance
related - VM bugs do have a somewhat higher priority at the
moment.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: Break 2.4 VM in five easy steps

2001-06-09 Thread Rik van Riel

On 6 Jun 2001, Eric W. Biederman wrote:
> Derek Glidden <[EMAIL PROTECTED]> writes:
> 
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.  
> 
> The interesting thing from other reports is that it appears to be
> kswapd using up CPU resources.

This part is being worked on, expect a solution for this thing
soon...


Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: Break 2.4 VM in five easy steps

2001-06-09 Thread Rik van Riel

On Wed, 6 Jun 2001, Derek Glidden wrote:

> Or are you saying that if someone is unhappy with a particular
> situation, they should just keep their mouth shut and accept it?

There are lots of options ...

1) wait until somebody fixes the problem
2) fix the problem yourself
3) start infinite flamewars and make developers
   so sick of the problem nobody wants to fix it
4) pay someone to fix the problem ;)

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: Break 2.4 VM in five easy steps

2001-06-09 Thread Rik van Riel

On Wed, 6 Jun 2001, Sean Hunter wrote:

> A working VM would have several differences from what we have in my
> opinion, among which are:
> - It wouldn't require 8GB of swap on my large boxes
> - It wouldn't suffer from the "bounce buffer" bug on my
>   large boxes
> - It wouldn't cause the disk drive on my laptop to be
>   _constantly_ in use even when all I have done is spawned a
>   shell session and have no large apps or daemons running.
> - It wouldn't kill things saying it was OOM unless it was OOM.

I fully agree these problems need to be fixed. I just wish I
had the time to tackle all of them right now ;)

We should be close to getting the 3rd problem fixed and the
deadlock problem with the bounce buffers seems to be fixed
already.

Getting reclaiming of swap space and OOM fixed is a matter
of time ... I hope I'll have that time in the near future.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://distro.conectiva.com/

Send all your spam to [EMAIL PROTECTED] (spam digging piggy)




Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Mike A. Harris

On 6 Jun 2001, Miles Lane wrote:

>> Precisely.  Saying 8x RAM doesn't change it either.  Sometime
>> next week I'm going to purposefully put a new 60Gb disk in on a
>> separate controller as pure swap on top of 256Mb of RAM.  My
>> guess is after bootup, and login, I'll have 48Gb of stuff in
>> swap "just in case".
>
>Mike and others, I am getting tired of your comments.  Sheesh.

And I'm tired of having people tell me, or tell others to buy a
faster computer or more RAM to work around a real technical
problem.  If a dual 1GHz system with 1GB of RAM and 60GB of disk
space broken across 3 U160 drives is not a modern fast
workstation, I don't know what is.  My 300MHz system, however, works
on its own stuff and doesn't need upgrading.


>The various developers who actually work on the VM have already
>acknowledged the issues and are exploring fixes, including at
>least one patch that already exists.

Precisely, which underscores what I'm saying: the problem is
acknowledged, and is being worked on by talented hackers who know
what they are doing - so why do people keep saying "get more
disk space, it is cheap" and the like?  That is totally nonuseful
advice in most cases.  Many have pointed out already, for example,
how impossible that would be in a 500-computer webserver farm.


>It seems clear that the uproar from the people who are having
>trouble with the new VM's handling of swap space have been
>heard and folks are going to fix these problems.  It may not
>happen today or tomorrow, but soon.  What the heck else do you
>want?

I agree with you.  What I want, is when someone talks about this
stuff or inquires about it, for people to stop telling them that
their computer is out of date and they should upgrade it as that
is bogus advice.  "It worked fine yesterday, why should I
upgrade" reigns supreme.


>Making enflammatory remarks about the current situation does
>nothing to help get the problems fixed, it just wastes our time
>and bandwidth.

It's not like there is someone forcing you to read it though.


>So please, if you have new facts that you want to offer that
>will help us characterize and understand these VM issues better
>or discover new problems, feel free to share them.  But if you
>just want to rant, I, for one, would rather you didn't.

Point noted, however that isn't going to stop anyone from
speaking their personal opinion on things.  Freedom of speech.



--
Mike A. Harris  -  Linux advocate  -  Open Source advocate
   Opinions and viewpoints expressed are solely my own.
--




Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Bernd Jendrissek

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Jun 07, 2001 at 03:38:35PM -0600, Brian D Heaton wrote:
>   Maybe I'm missing something.  I just tried this (with the 262144k/1
> and 128k/2048 params) and my results are within .1s of each other.  This is
> without any special patches.  Am I doing something wrong?

Oh, I don't mean the time elapsed; it's that nothing _else_ can happen
while dd is hogging the kernel.

> Oh yes -
> 
> SMP - dual PIII866/133

Yes, this is what you are doing wrong ;)

My hypothesis is that in your case, one cpu gets pegged copying pages
from /dev/zero into dd's buffer, while the other cpu can do things like
updating mouse cursors, run setiathome, etc.

What happens if you do *two* dd-tortures with huge buffers at the same
time?  And then, please don't happen to have a quad box!

I don't know if my symptom (loss of interactivity on heavy writing) is
related to swapoff -a causing the same symptom on deeply-swapped boxes.

BTW keep in mind my 4-liner is based more on voodoo than on analysis.

Bernd Jendrissek
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7IICU/FmLrNfLpjMRAnpTAJ48/jAFxZqfxUf2NXT0O542KDbNOwCfaoZo
Q2xaNE4GBqnbn/cl2vrRxLc=
=4sGO
-----END PGP SIGNATURE-----






Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Mike A. Harris

On 6 Jun 2001, Miles Lane wrote:

> >  Precisely.  Saying 8x RAM doesn't change it either.  Sometime
> > next week I'm going to purposefully put a new 60Gb disk in on a
> > separate controller as pure swap on top of 256Mb of RAM.  My
> > guess is after bootup, and login, I'll have 48Gb of stuff in
> > swap just in case.

> Mike and others, I am getting tired of your comments.  Sheesh.

And I'm tired of having people tell me, or tell others, to buy a
faster computer or more RAM to work around a real technical
problem.  If a dual 1GHz system with 1Gb of RAM and 60GB of disk
space spread across 3 U160 drives is not a modern fast
workstation, I don't know what is.  My 300MHz system, however, works
on its own stuff, and doesn't need upgrading.


> The various developers who actually work on the VM have already
> acknowledged the issues and are exploring fixes, including at
> least one patch that already exists.

Precisely, which underscores what I'm saying: the problem is
acknowledged, and being worked on by talented hackers who know
what they are doing -- so why must people keep saying "get more
disk space, it is cheap" et al.?  That is totally unhelpful
advice in most cases.  Many have already pointed out, for example,
how impossible that would be in a 500-computer webserver farm.


> It seems clear that the uproar from the people who are having
> trouble with the new VM's handling of swap space has been
> heard and folks are going to fix these problems.  It may not
> happen today or tomorrow, but soon.  What the heck else do you
> want?

I agree with you.  What I want, when someone talks about this
stuff or inquires about it, is for people to stop telling them that
their computer is out of date and they should upgrade it, as that
is bogus advice.  "It worked fine yesterday, why should I
upgrade?" reigns supreme.


> Making inflammatory remarks about the current situation does
> nothing to help get the problems fixed, it just wastes our time
> and bandwidth.

It's not like there is someone forcing you to read it though.


> So please, if you have new facts that you want to offer that
> will help us characterize and understand these VM issues better
> or discover new problems, feel free to share them.  But if you
> just want to rant, I, for one, would rather you didn't.

Point noted; however, that isn't going to stop anyone from
speaking their personal opinion on things.  Freedom of speech.



--
Mike A. Harris  -  Linux advocate  -  Open Source advocate
   Opinions and viewpoints expressed are solely my own.
--




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread C. Martins


  In my everyday desktop workstation (PII 350) I have 64MB of RAM and use
300MB of swap, 150MB on each hard disk.  After upgrading to 2.4, and
maintaining the same set of applications (KDE, Netscape & friends), the
machine's performance is _definitely_ much worse, in terms of
responsiveness and throughput.  Most applications just take much longer to
load, and once you've done something that required more memory for a while
(like compiling a kernel, opening a large JPEG in gimp, etc.) it takes a
lot of time to come back to normal.  Strangely, with 2.4 the workstation
just feels as if someone stole the 64MB DIMM and put in a 16MB one!!

  One thing I find strange is that with 2.4, if you run top or something
similar, you notice that memory allocated for cache is almost always using
more than half of total RAM.  I don't remember seeing this with the 2.2
kernel series...

  Anyway, I think there is something really broken with respect to the 2.4
VM.  It is just NOT acceptable that when you run the same set of apps and
type of work and upgrade your kernel, your hardware is no longer up to the
job, when it fitted perfectly well before.  This is just the MS way of
solving problems here.

  Best regards

 Claudio Martins 


On Wed, Jun 06, 2001 at 06:58:39AM -0700, Gerhard Mack wrote:
> 
> I have several boxes with 2x ram as swap and performance still sucks
> compared to 2.2.17.  
> 



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Marcelo Tosatti



On Thu, 7 Jun 2001, Shane Nay wrote:

> On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote:
> > On Thu, 7 Jun 2001, Shane Nay wrote:
> > > (Oh, BTW, I really appreciate the work that people have done on the VM,
> > > but folks that are just talking..., well, think clearly before you impact
> > > other people that are writing code.)
> >
> > If all the people talking were reporting results we would be really happy.
> >
> > Seriously, we really lack VM reports.
> 
> Okay, I've had some problems with the VM on my machine, what is the most 
> useful way to compile reports for you?  

1) Describe what you're running. (your workload)
2) Describe what you're feeling. (e.g. "interactivity is crap when I run
this or that thing", etc.) 

If we need more info than that I'll request in private. 

Also send these reports to the linux-mm list, so other VM hackers can
get them too and we avoid traffic on l-k.

> I have modified the kernel for a few different ports fixing bugs, and
> device drivers, etc., but the VM is all greek to me, I can just see
> that caching is hyper aggressive and doesn't look like it's going back
> to the pool..., which results in sluggish performance.

By performance you mean interactivity or throughput? 

> Now I know from the work that I've done that anecdotal information is
> almost never even remotely useful.  

If we need more info, we will request. 

> Therefore is there any body of information that I can read up on to
> create a useful set of data points for you or other VM hackers to
> look at?  (Or maybe some report in the past that you thought was
> especially useful?)

Just do what I described above. 

Thanks




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread LA Walsh

"Eric W. Biederman" wrote:

> LA Walsh <[EMAIL PROTECTED]> writes:
>
> > Now for whatever reason, since 2.4, I consistently use at least
> > a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
> > like nscd running 7 copies that take 72M.  Seems like overkill for
> > a laptop.
>
> So the question becomes why you are seeing an increased swap usage.
> Currently there are two candidates in the 2.4.x code path.
>
> 1) Delayed swap deallocation, when a program exits after it
>    has gone into swap, its swap usage is not freed. Ouch.

---
Double ouch.  Swap is backing a non-existent program?

>
>
> 2) Increased tenacity of swap caching.  In particular, in 2.2.x if a page
>    that was in the swap cache was written to, the page in the swap
>    space would be removed.  In 2.4.x the location in swap space is
>    retained, with the goal of getting more efficient swap-ins.


But if the page in memory is 'dirty', you can't be efficient with swapping
*in* the page.  The page on disk is invalid and should be released, or am I
missing something?

> Neither of the known candidates for increasing the swap load applies
> when you aren't swapping in the first place.  They may aggravate the
> usage of swap when you are already swapping, but they do not cause
> swapping themselves.  This is why the initial recommendation for
> increased swap space size was made.  If you are swapping we will use
> more swap.
>
> However what pushes your laptop over the edge into swapping is an
> entirely different question.  And probably what should be solved.


On my laptop, it is insignificant and to my knowledge has no measurable
impact.  It seems like there is always 3-5 Meg used in swap no matter what's
running (or not) on the system.

> > I think that is the point -- it was supported in 2.2, it is, IMO,
> > a serious regression that it is not supported in 2.4.
>
> The problem with this general line of arguing is that it lumps a whole
> bunch of real issues/regressions into one over all perception.  Since
> there are multiple reasons people are seeing problems, they need to be
> tracked down with specifics.

---
Uhhh, yeah, sorta -- it's addressing the statement that a "new requirement of
2.4 is to have double the swap space".  If everyone agrees that's a problem, then
yes, we can go into specifics of what is causing or contributing to the problem.
It's getting past the attitude of some people that 2xMem for swap is somehow
"normal and acceptable -- deal with it".  In my case, it seems like 10Mb of swap
would be all that would generally be used (I don't think I've ever seen swap
usage over 7Mb) on a 512M system.  To be told "oh, you're wrong, you *should*
have 1Gig or you are operating in an 'unsupported' or non-standard
configuration" -- I find that very user-unfriendly.


>
> The swapoff case comes down to dead swap pages in the swap cache.
> These greatly increase the number of swap pages and slow the system
> down, but since these pages are trivial to free we don't generate any
> I/O, so we don't wait for I/O and thus never enter the scheduler --
> making nothing else in the system runnable.

---
I haven't ever *noticed* this on my machine but that could be
because there isn't much in swap to begin with?  Could be I was
just blissfully ignorant of the time it took to do a swapoff.
Hmmm... let's see.  Just tried it.  I didn't get a total lockup,
but cursor movement was definitely jerky:
> time sudo swapoff -a

real0m10.577s
user0m0.000s
sys 0m9.430s

Looking at vmstat, the needed space was taken mostly out of the
page cache (86M->81.8M) and about 700K each out of free and buff.


> Your case is significantly different.  I don't know if you are seeing
> any issues with swapping at all.  With a 5M usage it may simply be
> totally unused pages being pushed out to the swap space.

---
Probably -- I guess the page cache and disk buffers put enough pressure to
push some things off to swap.

-linda
--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Senior MTS, Trust Tech, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338





Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Shane Nay

Uh, last I checked, on my Linux-based embedded device I didn't want to swap to 
flash.  Hmm..., now why was that..., oh, that's right, it's *much* more 
expensive than memory, oh yes, and it actually gets FRIED when you write to a 
block more than 100k times.  Oh, what was that other thing..., oh yes, and 
it's SOLDERED ON THE BOARD.  Damn..., guess I just lost a grand or so.

Seriously folks, Linux isn't just for big webservers...

Thanks,
Shane Nay.
(Oh, BTW, I really appreciate the work that people have done on the VM, but 
folks that are just talking..., well, think clearly before you impact other 
people that are writing code.)

On Wednesday 06 June 2001 02:57, Dr S.M. Huen wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
> > For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
>
> Do I understand you correctly?
> ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> drives.
>
> It will cost you 19x as much to put the RAM in as to put the
> developer's recommended amount of swap space to back up that RAM.  The
> developers gave their reasons for this design some time ago and if the
> ONLY problem was that it required you to allocate more swap, why should
> it be a priority item to fix it for those that refuse to do so?   By all
> means fix it urgently where it doesn't work when used as advised but
> demanding priority to fixing a problem encountered when a user refuses to
> use it in the manner specified seems very unreasonable.  If you can afford
> 4GB RAM, you certainly can afford 8GB swap.



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Shane Nay

On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote:
> On Thu, 7 Jun 2001, Shane Nay wrote:
> > (Oh, BTW, I really appreciate the work that people have done on the VM,
> > but folks that are just talking..., well, think clearly before you impact
> > other people that are writing code.)
>
> If all the people talking were reporting results we would be really happy.
>
> Seriously, we really lack VM reports.

Okay, I've had some problems with the VM on my machine, what is the most 
useful way to compile reports for you?  I have modified the kernel for a few 
different ports, fixing bugs, and device drivers, etc., but the VM is all 
greek to me; I can just see that caching is hyper-aggressive and doesn't look 
like it's going back to the pool..., which results in sluggish performance.  
Now I know from the work that I've done that anecdotal information is almost 
never even remotely useful.  Therefore, is there any body of information that 
I can read up on to create a useful set of data points for you or other VM 
hackers to look at?  (Or maybe some report in the past that you thought was 
especially useful?)

Thank You,
Shane Nay.
(I have in the past had many problems with the VM on embedded machines as 
well, but I'm not actively working on any right this second..., though my 
Psion is sitting next to me begging for me to run some VM tests on it :)



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Marcelo Tosatti



On Thu, 7 Jun 2001, Shane Nay wrote:

> (Oh, BTW, I really appreciate the work that people have done on the VM, but 
> folks that are just talking..., well, think clearly before you impact other 
> people that are writing code.)

If all the people talking were reporting results we would be really happy. 

Seriously, we really lack VM reports.




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Miles Lane

On 07 Jun 2001 11:49:47 -0400, Derek Glidden wrote:
> Miles Lane wrote:
> > 
> > So please, if you have new facts that you want to offer that
> > will help us characterize and understand these VM issues better
> > or discover new problems, feel free to share them.  But if you
> > just want to rant, I, for one, would rather you didn't.
> 
> *sigh*
> 
> Not to prolong an already pointless thread, but that really was the
> intent of my original message.  I had figured out a specific way, with
> easy-to-follow steps, to make the VM misbehave under very certain
> conditions.  I even offered to help figure out a solution in any way I
> could, considering I'm not familiar with kernel code.
> 
> However, I guess this whole "too much swap" issue has a lot of people on
> edge and immediately assumed I was talking about this subject, without
> actually reading my original message.

Actually, I think your original message was useful.  It has
spurred a reevaluation of some design assumptions implicit in the VM
in the 2.4 series and has also surfaced some bugs.  It was not you
who I felt was sending inflammatory remarks, it was the folks who
have been bellyaching about the current swap disk space requirements
without offering any new information to help developers remedy
the situation.

So, thanks for bringing the topic up.  :-)

Cheers,
Miles




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Marcelo Tosatti



On Thu, 7 Jun 2001, Mike Galbraith wrote:

> On 6 Jun 2001, Eric W. Biederman wrote:
> 
> > Mike Galbraith <[EMAIL PROTECTED]> writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than at
> > > > reboot time.  That might help.  Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds.  I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you put in:
> > if (need_resched) schedule();
> > It should be outside all of the locks.  It might just be a matter of everything
> > serializing on the SMP locks, and the kernel refusing to preempt itself.
> 
> That did it.

What about including this workaround in the kernel? 




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread José Luis Domingo López

On Thursday, 07 June 2001, at 09:23:42 +0200,
Helge Hafting wrote:

> Derek Glidden wrote:
> > 
> > Helge Hafting wrote:
> [...]
> The machine froze 10 seconds or so at the end of the minute, I can
> imagine that biting with bigger swap.
> 
Same behavior here with a Pentium III 600, 128 MB RAM and 128 MB of swap.
Filled mem and swap with the infamous glob() "bug" (ls ../*/.. etc.), made
swapoff, and the machine kept very responsive except for the last 10-15
seconds before swapoff ends.

Even scrolling complex pages with Mozilla 0.9 worked smoothly :).

-- 
José Luis Domingo López
Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM)
 
jdomingo EN internautas PUNTO org  => ¿ Spam ? Atente a las consecuencias
jdomingo AT internautas DOT   org  => Spam at your own risk




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Helge Hafting <[EMAIL PROTECTED]> writes:

> A problem with this is that normal paging-in is allowed to page other
> things out as well.  But you can't have that when swap is about to
> be turned off.  My guess is that swapoff functionality was perceived to
> be so seldom used that they didn't bother too much with scheduling 
> or efficiency.

There is some truth in that.  You aren't allowed to allocate new pages
in the swap space currently being removed, however.  The current
swapoff code removes pages from that swap space without breaking
any sharing between swap pages.  Depending on your load this may be
important.  Fixing swapoff to be more efficient while at the same time
keeping sharing between pages is tricky.  The fact that, under loads
that are easy to trigger in 2.4, swapoff never sleeps is a big bug.

> I don't have the same problem myself though.  Shutting down with
> 30M or so in swap never take unusual time on 2.4.x kernels here,
> with a 300MHz processor.  I did a test while typing this letter,
> almost filling the 96M swap partition with 88M.  swapoff
> took 1 minute at 100% cpu.  This is long, but the machine was responsive
> most of that time.  I.e. no worse than during a kernel compile.
> The machine froze 10 seconds or so at the end of the minute, I can
> imagine that biting with bigger swap.

O.k. so at some point you actually wait for I/O and other processes get
a chance to run.  On the larger machines we never wait for I/O and
thus never schedule at all.

The problem is now understood.  Now we just need to fix it.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

LA Walsh <[EMAIL PROTECTED]> writes:

> Now for whatever reason, since 2.4, I consistently use at least
> a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
> like nscd running 7 copies that take 72M.  Seems like overkill for
> a laptop.

So the question becomes why you are seeing an increased swap usage.
Currently there are two candidates in the 2.4.x code path.

1) Delayed swap deallocation: when a program exits after it
   has gone into swap, its swap usage is not freed. Ouch.

2) Increased tenacity of swap caching.  In particular, in 2.2.x if a page
   that was in the swap cache was written to, the page in the swap
   space would be removed.  In 2.4.x the location in swap space is
   retained, with the goal of getting more efficient swap-ins.

Neither of the known candidates for increasing the swap load applies
when you aren't swapping in the first place.  They may aggravate the
usage of swap when you are already swapping, but they do not cause
swapping themselves.  This is why the initial recommendation for
increased swap space size was made.  If you are swapping we will use
more swap.

However what pushes your laptop over the edge into swapping is an
entirely different question.  And probably what should be solved.

> I think that is the point -- it was supported in 2.2, it is, IMO,
> a serious regression that it is not supported in 2.4.

The problem with this general line of arguing is that it lumps a whole
bunch of real issues/regressions into one over all perception.  Since
there are multiple reasons people are seeing problems, they need to be
tracked down with specifics.

The swapoff case comes down to dead swap pages in the swap cache.
These greatly increase the number of swap pages and slow the system
down, but since these pages are trivial to free we don't generate any
I/O, so we don't wait for I/O and thus never enter the scheduler --
making nothing else in the system runnable.

Your case is significantly different.  I don't know if you are seeing 
any issues with swapping at all.  With a 5M usage it may simply be
totally unused pages being pushed out to the swap space.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Derek Glidden

Miles Lane wrote:
> 
> So please, if you have new facts that you want to offer that
> will help us characterize and understand these VM issues better
> or discover new problems, feel free to share them.  But if you
> just want to rant, I, for one, would rather you didn't.

*sigh*

Not to prolong an already pointless thread, but that really was the
intent of my original message.  I had figured out a specific way, with
easy-to-follow steps, to make the VM misbehave under very certain
conditions.  I even offered to help figure out a solution in any way I
could, considering I'm not familiar with kernel code.

However, I guess this whole "too much swap" issue has a lot of people on
edge, who immediately assumed I was talking about that subject without
actually reading my original message.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval 

usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
| extract_mpeg2 | mpeg2dec - 

http://www.eff.org/http://www.opendvd.org/ 
 http://www.cs.cmu.edu/~dst/DeCSS/Gallery/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Mike Galbraith

On Thu, 7 Jun 2001, Bulent Abali wrote:

> I happened to see this one with a debugger attached to the serial port.
> The system was alive.  I think I was watching the free page count and
> it was decreasing very slowly, maybe a couple of pages per second.  The
> bigger the swap usage, the longer it takes to do swapoff.  For example,
> if I had 1GB in the swap space then it would take maybe half an hour to
> shut down...

I took a ~300ms ktrace snapshot of the no IO spot with 2.4.4.ikd..

  % TOTALTOTAL USECSAVG/CALL   NCALLS
  0.0693% 208.540.40  517 c012d4b9 __free_pages
  0.0755% 227.341.01  224 c012cb67 __free_pages_ok
  ...
 34.7195%  104515.150.95   110049 c012de73 unuse_vma
 53.3435%  160578.37  303.55  529 c012dd38 __swap_free
Total entries: 131051  Total usecs:301026.93 Idle: 0.00%

Andrew Morton could be right about that loop not being wonderful.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread LA Walsh

"Eric W. Biederman" wrote:

> There are certain scenarios where you can't avoid virtual mem =
> min(RAM,swap).  Which is what I was trying to say (bad formula).  What
> happens is that pages get referenced  evenly enough and quickly enough
> that you simply cannot reuse the on disk pages.  Basically in the
> worst case all of RAM is pretty much in flight doing I/O.  This is
> true of all paging systems.


So, if I understand, you are talking about thrashing behavior
where your active set is larger than physical RAM.  If that
is the case then requiring 2X+ swap for "better" performance
is reasonable.  However, if your active set is truly larger
than your physical memory on a consistent basis, in this day,
the solution is usually "add more RAM".  I may be wrong, but
my belief is that with today's computers people are used to having
enough memory to do their normal tasks and that swap is for
"peak loads" that don't occur on a sustained basis.  Of course
I imagine that this is my belief as it is my own practice/view.
I want to have considerably more memory than my normal working
set.  Swap on my laptop disk is *slow*.  It's low-power, low-RPM,
slow-seeking, all to conserve power (difference between spinning/off
= 1W).  So I have 50% of my phys mem as swap -- because I want to
'feel' it when I go to swap and start looking for memory hogs.
For me, the pathological case is touching swap *at all*.  So the
idea of the entire active set being >= phys mem is already broken
on my setup.  Thus my expectation of swap only as a 'warning'/'buffer'
zone.

Now for whatever reason, since 2.4, I consistently use at least
a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
like nscd running 7 copies that take 72M.  Seems like overkill for
a laptop.

> However just because in the worst case virtual mem = min(RAM,swap), is
> no reason other cases should use that much swap.  If you are doing a
> lot of swapping it is more efficient to plan on mem = min(RAM,swap) as
> well, because frequently you can save on I/O operations by simply
> reusing the existing swap page.

---
Agreed.  But planning your swap space for a worst
case scenario that you never hit is wasteful.  My worst
case is using any swap.  The system should be able to live
with swap=1/2*phys in my situation.  I don't think I'm
unique in this respect.

> It's a theoretical worst case and they all have it.  In practice it is
> very hard to find a workload where practically every page in the
> system is close to the I/O point, however.

---
Well exactly the point.  It was in such situations in some older
systems that some programs were swapped out and temporarily made
unavailable for running (they showed up in the 'w' space in vmstat).

> Except for removing pages that aren't used, paging with swap < RAM is
> not useful.  Simply removing pages that aren't in active use but might
> possibly be used someday is a common case, so it is worth supporting.

---
I think that is the point -- it was supported in 2.2, it is, IMO,
a serious regression that it is not supported in 2.4.

-linda

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Senior MTS, Trust Tech., Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338






Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Bulent Abali



>> O.k.  I think I'm ready to nominate the dead swap pages for the big
>> 2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
>> instead of being IO bound?  Just wanting to understand this the cheap
way :)
>
>There's no IO being done whatsoever (that I can see with only a blinky).
>I can fire up ktrace and find out exactly what's going on if that would
>be helpful.  Eating the dead swap pages from the active page list prior
>to swapoff cures all but a short freeze.  Eating the rest (few of those)
>might cure the rest, but I doubt it.
>
>-Mike

1)  I second Mike's observation.  swapoff, either from the command line
or during shutdown, just hangs there.  No disk I/O is being done, as I
could see from the blinkers.  This is not an I/O-boundness issue.  It is
more like a deadlock.

I happened to see this one with a debugger attached to the serial port.
The system was alive.  I think I was watching the free page count, and
it was decreasing very slowly, maybe a couple of pages per second.  The
bigger the swap usage, the longer swapoff takes.  For example, with 1GB
in the swap space it would take maybe half an hour to shut down...


2)  Now, why I would have 1GB in the swap space is another problem.
Here is what I observe, and it doesn't make much sense to me.
Let's say I have 1GB of memory and plenty of swap.  And let's
say there is a process a little less than 1GB in size.  Suppose the
system starts swapping because it is short a few megabytes of memory.
Within *seconds* of swapping, I see that the swap disk usage balloons
to nearly 1GB.  Nearly the entire memory moves into the page cache.  If
you run xosview you will know what I mean.  Memory usage suddenly turns
from green to red :-).  And I know for a fact that my disk cannot do
1GB per second :-).  The SHARE column of the big process in "top" goes
up by hundreds of megabytes.
So it appears to me that the MM is marking the whole process memory to
be swapped out, probably reserving nearly 1GB in the swap space, and
furthermore moving the entire process's pages, apparently into the
page cache.  You would think that if you are short a few MB of memory
the MM would put a few MB worth of pages in the swap.  But it wants to
move entire processes into swap.

When the 1GB process exits, the swap usage doesn't change (dead swap
pages?).
And shutdown or swapoff will take forever due to #1 above.

Bulent







Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Bernd Jendrissek

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
NotDashEscaped: You need GnuPG to verify this message

First things first: 1) Please Cc: me when responding, 2) apologies for
dropping any References: headers, 3) sorry for bad formatting

"Jeffrey W. Baker" wrote:
> On Tue, 5 Jun 2001, Derek Glidden wrote: 
> > This isn't trying to test extreme low-memory pressure, just how the 
> > system handles recovering from going somewhat into swap, which is
> > a real 
> > day-to-day problem for me, because I often run a couple of apps
> > that 
> > most of the time live in RAM, but during heavy computation runs,
> > can go 
> > a couple hundred megs into swap for a few minutes at a time.
> > Whenever 
> > that happens, my machine always starts acting up afterwards, so I 
> > started investigating and found some really strange stuff going on. 

Has anyone else noticed the difference between
 dd if=/dev/zero of=bigfile bs=16384k count=1
and
 dd if=/dev/zero of=bigfile bs=8k count=2048
deleting 'bigfile' each time before use?  (You with lots of memory may
(or may not!) want to try bs=262144k)

Once, a few months ago, I thought I traced this to the loop at line ~2597
in linux/mm/filemap.c:generic_file_write
  2593  remove_suid(inode);
  2594  inode->i_ctime = inode->i_mtime = CURRENT_TIME;
  2595  mark_inode_dirty_sync(inode);
  2596  
  2597  while (count) {
  2598  unsigned long index, offset;
  2599  char *kaddr;
  2600  int deactivate = 1;
...
  2659  
  2660  if (status < 0)
  2661  break;
  2662  }
  2663  *ppos = pos;
  2664  
  2665  if (cached_page)

It appears to me that it pseudo-spins (it *does* do useful work) in
this loop for as long as there are pages available.

BTW while the big-bs dd is running, the disk is active.  I assume that
writes are indeed scheduled and start happening even while we're still
dirtying pages?

Does this freezing effect occur on SMP machines too?  Oops, had access
to one until this morning :(  Would an SMP box still have a 'spare'
cpu which isn't dirtying pages like crazy, and can therefore do things
like updating mouse cursors, etc.?

Bernd Jendrissek

P.S. here's my patch that cures this one symptom; it smells and looks
ugly, I know, but at least my mouse cursor doesn't jump across the whole
screen when I do the dd=torture.

I have no idea if this is right or not, whether I'm allowed to call
schedule inside generic_file_write or not, etc.  And the '256' is
just random - small enough to let the cursor move, but large enough
to do work between schedule()s.

If this solves your problem, use it; if your name is Linus or Alan,
ignore or do it right please.

diff -u -r1.1 -r1.2
--- linux-hack/mm/filemap.c 2001/06/06 21:16:28 1.1
+++ linux-hack/mm/filemap.c 2001/06/07 08:57:52 1.2
@@ -2599,6 +2599,11 @@
char *kaddr;
int deactivate = 1;
 
+   /* bernd-hack: give other processes a chance to run */
+   if (count % 256 == 0) {
+   schedule();
+   }
+
/*
 * Try to find the page in the cache. If it isn't there,
 * allocate a free page.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7H1tb/FmLrNfLpjMRAguAAJ0fYInFbAa6LjFC/CWZbRPQxzZwrwCeNqT0
/Kod15Nx7AzaM4v0WhOgp88=
=pyr6
-END PGP SIGNATURE-



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Linus Torvalds <[EMAIL PROTECTED]> writes:

> On 7 Jun 2001, Eric W. Biederman wrote:
> 
> No - I suspect that we're not actually doing all that much IO at all, and
> the real reason for the lock-up is just that the current algorithm is so
> bad that when it starts to act exponentially worse it really _is_ taking
> minutes of CPU time following pointers and generally not being very nice
> on the CPU cache etc..

Hmm.  Unless I am mistaken the complexity is O(SwapPages*VMSize),
which is very bad, but nowhere near exponentially horrible.
 
> The bulk of the work is walking the process page tables thousands and
> thousands of times. Expensive.

Definitely.  I played with following the page tables in a good way a
while back, and even when you do it right the process is slow.  Is
	if (need_resched) {
		schedule();
	}
a good idiom to use when you know you have a loop that will take a
long time?  Because even if we do this right we should do our best to
avoid starving other processes in the system.

Hmm.  There is a nasty case with turning the walk inside out.  When we
read a page into RAM there could be other users of that page that
still refer to the swap entry.  So we cannot immediately remove the
page from the swap cache, unless we want to break sharing and increase
the demands upon the virtual memory when we are shrinking it...

 
> > If this is going on I think we need to look at our delayed
> > deallocation policy a little more carefully.
> 
> Agreed. I already talked in private with some people about just
> re-visiting the issue of the lazy de-allocation. It has nice properties,
> but it certainly appears as if the nasty cases just plain outweigh the
> advantages.

I'm trying to remember the advantages, besides not having to care
that a page is a swap page in free_pte.  If there really is some value
in not handling the pages there (and I seem to recall something about
pages under I/O), it might at least be worth putting the pages on
their own LRU list, so that kswapd can crunch through the list
whenever it wakes up and free a bunch of pages.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Mike Galbraith

On 7 Jun 2001, Eric W. Biederman wrote:

> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > On 7 Jun 2001, Eric W. Biederman wrote:
> >
> > > Does this improve the swapoff speed or just allow other programs to
> > > run at the same time?  If it is still slow under that kind of load it
> > > would be interesting to know what is taking up all time.
> > >
> > > If it is no longer slow a patch should be made and sent to Linus.
> >
> > No, it only cures the freeze.  The other appears to be the slow code
> > pointed out by Andrew Morton being tickled by dead swap pages.
>
> O.k.  I think I'm ready to nominate the dead swap pages for the big
> 2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
> instead of being IO bound?  Just wanting to understand this the cheap way :)

There's no IO being done whatsoever (that I can see with only a blinky).
I can fire up ktrace and find out exactly what's going on if that would
be helpful.  Eating the dead swap pages from the active page list prior
to swapoff cures all but a short freeze.  Eating the rest (few of those)
might cure the rest, but I doubt it.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Linus Torvalds


On 7 Jun 2001, Eric W. Biederman wrote:

> [EMAIL PROTECTED] (Linus Torvalds) writes:
> > 
> > Somebody interested in trying the above add? And looking for other more
> > obvious bandaid fixes.  It won't "fix" swapoff per se, but it might make
> > it bearable and bring it to the 2.2.x levels. 
> 
> A little bit.  The one really bad behavior of not letting any other
> processes run seems to be fixed with an explicit:
> if (need_resched) {
> schedule();
> }
> 
> What I can't figure out is why this is necessary.  Because we should
> be sleeping in alloc_pages if nowhere else.

No - I suspect that we're not actually doing all that much IO at all, and
the real reason for the lock-up is just that the current algorithm is so
bad that when it starts to act exponentially worse it really _is_ taking
minutes of CPU time following pointers and generally not being very nice
on the CPU cache etc..

The bulk of the work is walking the process page tables thousands and
thousands of times. Expensive.

> If this is going on I think we need to look at our delayed
> deallocation policy a little more carefully.

Agreed. I already talked in private with some people about just
re-visiting the issue of the lazy de-allocation. It has nice properties,
but it certainly appears as if the nasty cases just plain outweigh the
advantages.

Linus




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Mike Galbraith <[EMAIL PROTECTED]> writes:

> On 7 Jun 2001, Eric W. Biederman wrote:
> 
> > Does this improve the swapoff speed or just allow other programs to
> > run at the same time?  If it is still slow under that kind of load it
> > would be interesting to know what is taking up all time.
> >
> > If it is no longer slow a patch should be made and sent to Linus.
> 
> No, it only cures the freeze.  The other appears to be the slow code
> pointed out by Andrew Morton being tickled by dead swap pages.

O.k.  I think I'm ready to nominate the dead swap pages for the big
2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
instead of being IO bound?  Just wanting to understand this the cheap way :)

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

[EMAIL PROTECTED] (Linus Torvalds) writes:
> 
> Somebody interested in trying the above add? And looking for other more
> obvious bandaid fixes.  It won't "fix" swapoff per se, but it might make
> it bearable and bring it to the 2.2.x levels. 

A little bit.  The one really bad behavior of not letting any other
processes run seems to be fixed with an explicit:
if (need_resched) {
schedule();
}

What I can't figure out is why this is necessary.  Because we should
be sleeping in alloc_pages if nowhere else.

I suppose if the bulk of our effort really is freeing dead swap cache
pages we can spin without sleeping, and never let another process run
because we are busily recycling dead swap cache pages. Does this sound
right? 

If this is going on I think we need to look at our delayed
deallocation policy a little more carefully.   I suspect we should
have code in kswapd actively removing these dead swap cache pages. 
After we get the latency improvements in exit, these pages do
absolutely nothing for us except clog up the whole system, and
generally give the 2.4 VM a bad name.

Anyone care to check my analysis? 

> Is anybody interested in making "swapoff()" better? Please speak up..

Interested.   But finding the time...

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Mike Galbraith

On 7 Jun 2001, Eric W. Biederman wrote:

> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > On 6 Jun 2001, Eric W. Biederman wrote:
> >
> > > Mike Galbraith <[EMAIL PROTECTED]> writes:
> > >
> > > > > If you could confirm this by calling swapoff sometime other than at
> > > > > reboot time.  That might help.  Say by running top on the console.
> > > >
> > > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > > switch is nogo...
> > > >
> > > > After running his memory hog, swapoff took 18 seconds.  I hacked a
> > > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > > utterly comatose for those 4 seconds though.
> > >
> > > At the top of the while(1) loop in try_to_unuse what happens if you put in.
> > > if (need_resched) schedule();
> > > It should be outside all of the locks.  It might just be a matter of
> > > everything serializing on the SMP locks, and the kernel refusing to
> > > preempt itself.
> >
> > That did it.
>
> Does this improve the swapoff speed or just allow other programs to
> run at the same time?  If it is still slow under that kind of load it
> would be interesting to know what is taking up all time.
>
> If it is no longer slow a patch should be made and sent to Linus.

No, it only cures the freeze.  The other appears to be the slow code
pointed out by Andrew Morton being tickled by dead swap pages.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Helge Hafting

Derek Glidden wrote:
> 
> Helge Hafting wrote:
> >
> > The drive is inactive because it isn't needed, the machine is
> > running loops on data in memory.  And it is unresponsive because
> > nothing else is scheduled, maybe "swapoff" is easier to implement
> 
> I don't quite get what you're saying.  If the system becomes
> unresponsive because  the VM swap recovery parts of the kernel are
> interfering with the kernel scheduler then that's also bad because there
> absolutely *are* other processes that should be getting time, like the
> console windows/shells at which I'm logged in.  If they aren't getting
> it specifically because the VM is preventing them from receiving
> execution time, then that's another bug.
> 
Sure.  The kernel doing a big job without scheduling anything 
is a problem.

> I'm not familiar enough with the swapping bits of the kernel code, so I
> could be totally wrong, but turning off a swap file/partition should
> just call the same parts of the VM subsystem that would normally try to
> recover swap space under memory pressure.  

A problem with this is that normal paging-in is allowed to page other
things out as well.  But you can't have that when swap is about to
be turned off.  My guess is that swapoff functionality was perceived to
be so seldom used that they didn't bother too much with scheduling 
or efficiency.

I don't have the same problem myself though.  Shutting down with
30M or so in swap never takes unusual time on 2.4.x kernels here,
with a 300MHz processor.  I did a test while typing this letter,
almost filling the 96M swap partition with 88M.  swapoff
took 1 minute at 100% cpu.  This is long, but the machine was responsive
most of that time.  I.e. no worse than during a kernel compile.
The machine froze 10 seconds or so at the end of the minute, I can
imagine that biting with bigger swap.

Helge Hafting



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Mike Galbraith <[EMAIL PROTECTED]> writes:

> On 6 Jun 2001, Eric W. Biederman wrote:
> 
> > Mike Galbraith <[EMAIL PROTECTED]> writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than at
> > > > reboot time.  That might help.  Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds.  I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you put in.
> > if (need_resched) schedule();
> > It should be outside all of the locks.  It might just be a matter of
> > everything serializing on the SMP locks, and the kernel refusing to
> > preempt itself.
> 
> That did it.

Does this improve the swapoff speed or just allow other programs to
run at the same time?  If it is still slow under that kind of load it
would be interesting to know what is taking up all time.

If it is no longer slow a patch should be made and sent to Linus.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

LA Walsh <[EMAIL PROTECTED]> writes:

> "Eric W. Biederman" wrote:
> 
> > The hard rule will always be that to cover all pathological cases swap
> > must be greater than RAM.  Because in the worse case all RAM will be
> > in thes swap cache.  That this is more than just the worse case in 2.4
> > is problematic.  I.e. In the worst case:
> > Virtual Memory = RAM + (swap - RAM).
> 
> Hmmm...so my 512M laptop only really has 256M?  Um... I regularly run
> more than 256M of programs.  I don't want it to swap -- it's a special, weird
> condition if I do start swapping.  I don't want to waste 1G of HD (5%) for
> something I never want to use.  IRIX runs just fine with swap < RAM.  In
> Irix, your Virtual Memory = RAM + swap.  Seems like the Linux kernel requires
> more swap than other old OS's (SunOS3: virtual mem = min(mem,swap)).
> I *thought* I remember that restriction being lifted in SunOS4 when they
> upgraded the VM.  Even though I worked there for 6 years, that was
> 6 years ago...

There are certain scenarios where you can't avoid virtual mem =
min(RAM,swap), which is what I was trying to say (bad formula).  What
happens is that pages get referenced evenly enough and quickly enough
that you simply cannot reuse the on-disk pages.  Basically in the
worst case all of RAM is pretty much in flight doing I/O.  This is
true of all paging systems.

However, just because in the worst case virtual mem = min(RAM,swap) is
no reason other cases should use that much swap.  If you are doing a
lot of swapping it is more efficient to plan on mem = min(RAM,swap) as
well, because frequently you can save on I/O operations by simply
reusing the existing swap page.

> 
> > You can't improve the worst case.  We can improve the worst case that
> > many people are facing.
> 
> ---
> Other OS's don't have this pathological 'worst case' scenario.  Even
> my Windows [vm]box seems to operate fine with swap < MEM.  On IRIX,
> virtual space closely approximates physical + disk memory.

It's a theoretical worst case and they all have it.  In practice it is
very hard to find a workload where practically every page in the
system is close to the I/O point, however.

Except for removing pages that aren't used, paging with swap < RAM is
not useful.  Simply removing pages that aren't in active use but might
possibly be used someday is a common case, so it is worth supporting.

> 
> It's worth complaining about.  It is also worth digging into and
> finding out what the real problem is.  I have a hunch that this whole
> conversation on swap sizes being irritating is hiding the real
> problem.
> 
> ---
> Okay, admission of ignorance.  When we speak of "swap space",
> is this term inclusive of both demand paging space and
> swap-out-entire-programs space, or one or the other?

Linux has no method to swap out an entire program, so when I speak of
swapping I'm actually thinking of paging.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

LA Walsh [EMAIL PROTECTED] writes:

 Eric W. Biederman wrote:
 
  The hard rule will always be that to cover all pathological cases swap
  must be greater than RAM.  Because in the worse case all RAM will be
  in thes swap cache.  That this is more than just the worse case in 2.4
  is problematic.  I.e. In the worst case:
  Virtual Memory = RAM + (swap - RAM).
 
 Hmmmso my 512M laptop only really has 256M?  Um...I regularlly run
 more than 256M of programs.  I don't want it to swap -- its a special, weird
 condition if I do start swapping.  I don't want to waste 1G of HD (5%) for
 something I never want to use.  IRIX runs just fine with swapRAM.  In
 Irix, your Virtual Memory = RAM + swap.  Seems like the Linux kernel requires
 more swap than other old OS's (SunOS3 (virtual mem = min(mem,swap)).
 I *thought* I remember that restriction being lifted in SunOS4 when they
 upgraded the VM.  Even though I worked there for 6 years, that was
 6 years ago...

There are cetain scenario's where you can't avoid virtual mem =
min(RAM,swap). Which is what I was trying to say, (bad formula).  What
happens is that pages get referenced  evenly enough and quickly enough
that you simply cannot reuse the on disk pages.  Basically in the
worst case all of RAM is pretty much in flight doing I/O.  This is
true of all paging systems.

However just because in the worst case virtual mem = min(RAM,swap), is
no reason other cases should use that much swap.  If you are doing a
lot of swapping it is more efficient to plan on mem = min(RAM,swap) as
well, because frequently you can save on I/O operations by simply
reusing the existing swap page.

 
  You can't improve the worst case.  We can improve the worst case that
  many people are facing.
 
 ---
 Other OS's don't have this pathological 'worst case' scenario.  Even
 my Windows [vm]box seems to operate fine with swapMEM.  On IRIX,
 virtual space closely approximates physical + disk memory.

It's a theoretical worst case and they all have it.  In practice it is
very hard to find a work load where practically every page in the
system is close to the I/O point howerver.

Except for removing pages that aren't used paging with swap  RAM is
not useful.  Simply removing pages that aren't in active use but might
possibly be used someday is a common case, so it is worth supporting.

 
  It's worth complaining about.  It is also worth digging into and find
  out what the real problem is.  I have a hunch that this hole
  conversation on swap sizes being irritating is hiding the real
  problem.
 
 ---
 Okay, admission of ignorance.  When we speak of swap space,
 is this term inclusive of both demand paging space and
 swap-out-entire-programs space or one or another?

Linux has no method to swap out an entire program so when I speak of
swapping I'm actually thinking paging.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Mike Galbraith

On 7 Jun 2001, Eric W. Biederman wrote:

 Mike Galbraith [EMAIL PROTECTED] writes:

  On 6 Jun 2001, Eric W. Biederman wrote:
 
   Mike Galbraith [EMAIL PROTECTED] writes:
  
 If you could confirm this by calling swapoff sometime other than at
 reboot time.  That might help.  Say by running top on the console.
   
The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
switch is nogo...
   
After running his memory hog, swapoff took 18 seconds.  I hacked a
bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
utterly comatose for those 4 seconds though.
  
   At the top of the while(1) loop in try_to_unuse what happens if you put in.
   if (need_resched) schedule();
   It should be outside all of the locks.  It might just be a matter of
  everything
 
   serializing on the SMP locks, and the kernel refusing to preempt itself.
 
  That did it.

 Does this improve the swapoff speed or just allow other programs to
 run at the same time?  If it is still slow under that kind of load it
 would be interesting to know what is taking up all time.

 If it is no longer slow a patch should be made and sent to Linus.

No, it only cures the freeze.  The other appears to be the slow code
pointed out by Andrew Morton being tickled by dead swap pages.

-Mike

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Helge Hafting

Derek Glidden wrote:
 
 Helge Hafting wrote:
 
  The drive is inactive because it isn't needed, the machine is
  running loops on data in memory.  And it is unresponsive because
  nothing else is scheduled, maybe swapoff is easier to implement
 
 I don't quite get what you're saying.  If the system becomes
 unresponsive because  the VM swap recovery parts of the kernel are
 interfering with the kernel scheduler then that's also bad because there
 absolutely *are* other processes that should be getting time, like the
 console windows/shells at which I'm logged in.  If they aren't getting
 it specifically because the VM is preventing them from receiving
 execution time, then that's another bug.
 
Sure.  The kernel doing a big job without scheduling anything 
is a problem.

 I'm not familiar enough with the swapping bits of the kernel code, so I
 could be totally wrong, but turning off a swap file/partition should
 just call the same parts of the VM subsystem that would normally try to
 recover swap space under memory pressure.  

A problem with this is that normal paging-in is allowed to page other
things out as well.  But you can't have that when swap is about to
be turned off.  My guess is that swapoff functionality was perceived to
be so seldom used that they didn't bother too much with scheduling 
or efficiency.

I don't have the same problem myself though.  Shutting down with
30M or so in swap never take unusual time on 2.4.x kernels here,
with a 300MHz processor.  I did a test while typing this letter,
almost filling the 96M swap partition with 88M.  swapoff
took 1 minute at 100% cpu.  This is long, but the machine was responsive
most of that time.  I.e. no worse than during a kernel compile.
The machine froze 10 seconds or so at the end of the minute, I can
imagine that biting with bigger swap.

Helge Hafting
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

[EMAIL PROTECTED] (Linus Torvalds) writes:
 
 Somebody interested in trying the above add? And looking for other more
 obvious bandaid fixes.  It won't fix swapoff per se, but it might make
 it bearable and bring it to the 2.2.x levels. 

At little bit.  The one really bad behavior of not letting any other
processes run seems to be fixed with an explicit:
if (need_resched) {
schedule();
}

What I can't figure out is why this is necessary.  Because we should
be sleeping in alloc_pages if nowhere else.

I suppose if the bulk of our effort really is freeing dead swap cache
pages we can spin without sleeping, and never let another process run
because we are busily recycling dead swap cache pages. Does this sound
right? 

If this is going on I think we need to look at our delayed
deallocation policy a little more carefully.   I suspect we should
have code in kswapd actively removing these dead swap cache pages. 
After we get the latency improvements in exit these pages do
absolutely nothing for us except clog up the whole system, and
generally give the 2.4 VM a bad name.

Anyone care to check my analysis? 

 Is anybody interested in making swapoff() better? Please speak up..

Interested.   But finding the time...

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Mike Galbraith [EMAIL PROTECTED] writes:

 On 7 Jun 2001, Eric W. Biederman wrote:
 
  Does this improve the swapoff speed or just allow other programs to
  run at the same time?  If it is still slow under that kind of load it
  would be interesting to know what is taking up all time.
 
  If it is no longer slow a patch should be made and sent to Linus.
 
 No, it only cures the freeze.  The other appears to be the slow code
 pointed out by Andrew Morton being tickled by dead swap pages.

O.k.  I think I'm ready to nominate the dead swap pages for the big
2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
instead of being IO bound?  Just wanting to understand this the cheap way :)

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Linus Torvalds


On 7 Jun 2001, Eric W. Biederman wrote:

> [EMAIL PROTECTED] (Linus Torvalds) writes:
> >
> > Somebody interested in trying the above add? And looking for other more
> > obvious bandaid fixes.  It won't fix swapoff per se, but it might make
> > it bearable and bring it to the 2.2.x levels.
>
> A little bit.  The one really bad behavior of not letting any other
> processes run seems to be fixed with an explicit:
> if (need_resched) {
>         schedule();
> }
>
> What I can't figure out is why this is necessary.  Because we should
> be sleeping in alloc_pages if nowhere else.

No - I suspect that we're not actually doing all that much IO at all, and
the real reason for the lock-up is just that the current algorithm is so
bad that when it starts to act exponentially worse it really _is_ taking
minutes of CPU time following pointers and generally not being very nice
on the CPU cache etc..

The bulk of the work is walking the process page tables thousands and
thousands of times. Expensive.

 If this is going on I think we need to look at our delayed
 deallocation policy a little more carefully.

Agreed. I already talked in private with some people about just
re-visiting the issue of the lazy de-allocation. It has nice properties,
but it certainly appears as if the nasty cases just plain outweigh the
advantages.

Linus




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Mike Galbraith

On 7 Jun 2001, Eric W. Biederman wrote:

> Mike Galbraith [EMAIL PROTECTED] writes:
>
> > On 7 Jun 2001, Eric W. Biederman wrote:
> >
> > > Does this improve the swapoff speed or just allow other programs to
> > > run at the same time?  If it is still slow under that kind of load it
> > > would be interesting to know what is taking up all time.
> > >
> > > If it is no longer slow a patch should be made and sent to Linus.
> >
> > No, it only cures the freeze.  The other appears to be the slow code
> > pointed out by Andrew Morton being tickled by dead swap pages.
>
> O.k.  I think I'm ready to nominate the dead swap pages for the big
> 2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
> instead of being IO bound?  Just wanting to understand this the cheap way :)

There's no IO being done whatsoever (that I can see with only a blinky).
I can fire up ktrace and find out exactly what's going on if that would
be helpful.  Eating the dead swap pages from the active page list prior
to swapoff cures all but a short freeze.  Eating the rest (few of those)
might cure the rest, but I doubt it.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Linus Torvalds [EMAIL PROTECTED] writes:

> On 7 Jun 2001, Eric W. Biederman wrote:
>
> No - I suspect that we're not actually doing all that much IO at all, and
> the real reason for the lock-up is just that the current algorithm is so
> bad that when it starts to act exponentially worse it really _is_ taking
> minutes of CPU time following pointers and generally not being very nice
> on the CPU cache etc..

Hmm.  Unless I am mistaken the complexity is O(SwapPages*VMSize).
Which is very bad, but nowhere near exponentially horrible.
 
> The bulk of the work is walking the process page tables thousands and
> thousands of times. Expensive.

Definitely.  I played with following the page tables in a good way a while
back, and even when you do it right the process is slow.  Is
        if (need_resched) {
                schedule();
        }
a good idiom to use when you know you have a loop that will take a
long time?  Because even if we do this right we should do our best to
avoid starving other processes in the system.

Hmm.  There is a nasty case with turning the walk inside out.  When we
read a page into RAM there could still be other users of that page
that still refer to the swap entry.  So we cannot immediately remove
the page from the swap cache.  Unless we want to break sharing and
increase the demands upon the virtual memory when we are shrinking
it...  

 
> > If this is going on I think we need to look at our delayed
> > deallocation policy a little more carefully.
>
> Agreed. I already talked in private with some people about just
> re-visiting the issue of the lazy de-allocation. It has nice properties,
> but it certainly appears as if the nasty cases just plain outweigh the
> advantages.

I'm trying to remember the advantages.  Besides not having to care
that a page is a swap page in free_pte.  If there really is some value
in not handling the pages there (and I seem to recall something about
pages under I/O), it might at least be worth putting the pages on
their own LRU list.  So that kswapd can crunch through the list
whenever it wakes up and give back a bunch of free pages.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Bernd Jendrissek

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
NotDashEscaped: You need GnuPG to verify this message

First things first: 1) Please Cc: me when responding, 2) apologies for
dropping any References: headers, 3) sorry for bad formatting

Jeffrey W. Baker wrote:
> On Tue, 5 Jun 2001, Derek Glidden wrote:
> > This isn't trying to test extreme low-memory pressure, just how the
> > system handles recovering from going somewhat into swap, which is a real
> > day-to-day problem for me, because I often run a couple of apps that
> > most of the time live in RAM, but during heavy computation runs, can go
> > a couple hundred megs into swap for a few minutes at a time.  Whenever
> > that happens, my machine always starts acting up afterwards, so I
> > started investigating and found some really strange stuff going on.

Has anyone else noticed the difference between
 dd if=/dev/zero of=bigfile bs=16384k count=1
and
 dd if=/dev/zero of=bigfile bs=8k count=2048
deleting 'bigfile' each time before use?  (You with lots of memory may
(or may not!) want to try bs=262144k)

Once, a few months ago, I thought I traced this to the loop at line ~2597
in linux/mm/filemap.c:generic_file_write
  2593  remove_suid(inode);
  2594  inode->i_ctime = inode->i_mtime = CURRENT_TIME;
  2595  mark_inode_dirty_sync(inode);
  2596  
  2597  while (count) {
  2598  unsigned long index, offset;
  2599  char *kaddr;
  2600  int deactivate = 1;
...
  2659  
  2660  if (status < 0)
  2661  break;
  2662  }
  2663  *ppos = pos;
  2664  
  2665  if (cached_page)

It appears to me that it pseudo-spins (it *does* do useful work) in this
loop for as long as there are pages available.

BTW while the big-bs dd is running, the disk is active.  I assume that
writes are indeed scheduled and start happening even while we're still
dirtying pages?

Does this freezing effect occur on SMP machines too?  Oops, had access
to one until this morning :(  Would an SMP box still have a 'spare'
cpu which isn't dirtying pages like crazy, and can therefore do things
like updating mouse cursors, etc.?

Bernd Jendrissek

P.S. here's my patch that cures this one symptom; it smells and looks
ugly, I know, but at least my mouse cursor doesn't jump across the whole
screen when I do the dd=torture.

I have no idea if this is right or not, whether I'm allowed to call
schedule inside generic_file_write or not, etc.  And the '256' is
just random - small enough to let the cursor move, but large enough
to do work between schedule()s.

If this solves your problem, use it; if your name is Linus or Alan,
ignore or do it right please.

diff -u -r1.1 -r1.2
--- linux-hack/mm/filemap.c 2001/06/06 21:16:28 1.1
+++ linux-hack/mm/filemap.c 2001/06/07 08:57:52 1.2
@@ -2599,6 +2599,11 @@
char *kaddr;
int deactivate = 1;
 
+   /* bernd-hack: give other processes a chance to run */
+   if (count % 256 == 0) {
+   schedule();
+   }
+
/*
 * Try to find the page in the cache. If it isn't there,
 * allocate a free page.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7H1tb/FmLrNfLpjMRAguAAJ0fYInFbAa6LjFC/CWZbRPQxzZwrwCeNqT0
/Kod15Nx7AzaM4v0WhOgp88=
=pyr6
-END PGP SIGNATURE-



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Bulent Abali



> O.k.  I think I'm ready to nominate the dead swap pages for the big
> 2.4.x VM bug award.  So we are burning cpu cycles in sys_swapoff
> instead of being IO bound?  Just wanting to understand this the cheap
> way :)

There's no IO being done whatsoever (that I can see with only a blinky).
I can fire up ktrace and find out exactly what's going on if that would
be helpful.  Eating the dead swap pages from the active page list prior
to swapoff cures all but a short freeze.  Eating the rest (few of those)
might cure the rest, but I doubt it.

-Mike

1)  I second Mike's observation.  swapoff, either from the command line or
during shutdown, just hangs there.  No disk I/O is being done, as I could see
from the blinkers.  This is not an I/O boundness issue.  It is more like
a deadlock.

I happened to see this one with a debugger attached to the serial port.
The system was alive.  I think I was watching the free page count and
it was decreasing very slowly, maybe a couple of pages per second.  The bigger
the swap usage, the longer it takes to do swapoff.  For example, if I had
1GB in the swap space then it would take maybe half an hour to shutdown...


2)  Now why I would have 1 GB in the swap space, that is another problem.
Here is what I observe and it doesn't make much sense to me.
Let's say I have 1GB of memory and plenty of swap.  And let's
say there is a process with little less than 1GB size.  Suppose the system
starts swapping because it is short a few megabytes of memory.
Within *seconds* of swapping, I see that the swap disk usage balloons to
nearly 1GB.  Nearly the entire memory moves into the page cache.  If you
run xosview you will know what I mean.  Memory usage suddenly turns from
green to red :-).  And I know for a fact that my disk cannot do 1GB per
second :-).  The SHARE column of the big process in top goes up by
hundreds of megabytes.
So it appears to me that MM is marking the whole process memory to be
swapped out, probably reserving nearly 1GB in the swap space, and
furthermore moves the entire process's pages, apparently, to the page cache.
You would think that if you are short a few MB of memory, MM would put a
few MB worth of pages in the swap.  But it wants to move entire processes
into swap.

When the 1GB process exits, the swap usage doesn't change (dead swap pages?).
And shutdown or swapoff will take forever due to #1 above.

Bulent







Re: Break 2.4 VM in five easy steps

2001-06-07 Thread LA Walsh

Eric W. Biederman wrote:

> There are certain scenarios where you can't avoid virtual mem =
> min(RAM,swap). Which is what I was trying to say, (bad formula).  What
> happens is that pages get referenced evenly enough and quickly enough
> that you simply cannot reuse the on disk pages.  Basically in the
> worst case all of RAM is pretty much in flight doing I/O.  This is
> true of all paging systems.


So, if I understand, you are talking about thrashing behavior
where your active set is larger than physical ram.  If that
is the case then requiring 2X+ swap for better performance
is reasonable.  However, if your active set is truly larger
than your physical memory on a consistent basis, in this day,
the solution is usually add more RAM.  I may be wrong, but
my belief is that with today's computers people are used to having
enough memory to do their normal tasks and that swap is for
peak loads that don't occur on a sustained basis.  Of course
I imagine that this is my belief as it is my own practice/view.
I want to have considerably more memory than my normal working
set.  Swap on my laptop disk is *slow*.  It's a low-power, low-RPM,
slow seek rate, all to conserve power (difference between spinning/off
= 1W).  So I have 50% of my phys mem on swap -- because I want to
'feel' it when I go to swap and start looking for memory hogs.
For me, the pathological case is touching swap *at all*.  So the
idea of the entire active set being >= phys mem is already broken
on my setup.  Thus my expectation of swap only as a 'warning'/'buffer'
zone.

Now for whatever reason, since 2.4, I consistently use at least
a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
like nscd running 7 copies that take 72M.  Seems like overkill for
a laptop.

> However just because in the worst case virtual mem = min(RAM,swap), is
> no reason other cases should use that much swap.  If you are doing a
> lot of swapping it is more efficient to plan on mem = min(RAM,swap) as
> well, because frequently you can save on I/O operations by simply
> reusing the existing swap page.

---
Agreed.  But planning your swap space for a worst
case scenario that you never hit is wasteful.  My worst
case is using any swap.  The system should be able to live
with swap=1/2*phys in my situation.  I don't think I'm
unique in this respect.

> It's a theoretical worst case and they all have it.  In practice it is
> very hard to find a work load where practically every page in the
> system is close to the I/O point however.

---
Well exactly the point.  It was in such situations in some older
systems that some programs were swapped out and temporarily made
unavailable for running (they showed up in the 'w' space in vmstat).

> Except for removing pages that aren't used, paging with swap < RAM is
> not useful.  Simply removing pages that aren't in active use but might
> possibly be used someday is a common case, so it is worth supporting.

---
I think that is the point -- it was supported in 2.2, it is, IMO,
a serious regression that it is not supported in 2.4.

-linda

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Senior MTS, Trust Tech., Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338






Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Mike Galbraith

On Thu, 7 Jun 2001, Bulent Abali wrote:

> I happened to see this one with a debugger attached to the serial port.
> The system was alive.  I think I was watching the free page count and
> it was decreasing very slowly, maybe a couple of pages per second.  The bigger
> the swap usage, the longer it takes to do swapoff.  For example, if I had
> 1GB in the swap space then it would take maybe half an hour to shutdown...

I took a ~300ms ktrace snapshot of the no IO spot with 2.4.4.ikd..

  % TOTALTOTAL USECSAVG/CALL   NCALLS
  0.0693% 208.540.40  517 c012d4b9 __free_pages
  0.0755% 227.341.01  224 c012cb67 __free_pages_ok
  ...
 34.7195%  104515.150.95   110049 c012de73 unuse_vma
 53.3435%  160578.37  303.55  529 c012dd38 __swap_free
Total entries: 131051  Total usecs:301026.93 Idle: 0.00%

Andrew Morton could be right about that loop not being wonderful.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Derek Glidden

Miles Lane wrote:
 
> So please, if you have new facts that you want to offer that
> will help us characterize and understand these VM issues better
> or discover new problems, feel free to share them.  But if you
> just want to rant, I, for one, would rather you didn't.

*sigh*

Not to prolong an already pointless thread, but that really was the
intent of my original message.  I had figured out a specific way, with
easy-to-follow steps, to make the VM misbehave under very certain
conditions.  I even offered to help figure out a solution in any way I
could, considering I'm not familiar with kernel code.

However, I guess this whole "too much swap" issue has a lot of people on
edge, and they immediately assumed I was talking about that subject without
actually reading my original message.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join""
,@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval

usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
| extract_mpeg2 | mpeg2dec -

http://www.eff.org/  http://www.opendvd.org/
http://www.cs.cmu.edu/~dst/DeCSS/Gallery/



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

LA Walsh [EMAIL PROTECTED] writes:

> Now for whatever reason, since 2.4, I consistently use at least
> a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
> like nscd running 7 copies that take 72M.  Seems like overkill for
> a laptop.

So the question becomes why you are seeing an increased swap usage.
Currently there are two candidates in the 2.4.x code path.

1) Delayed swap deallocation: when a program exits after it
   has gone into swap, its swap usage is not freed. Ouch.

2) Increased tenacity of swap caching.  In particular, in 2.2.x if a page
   that was in the swap cache was written to, the page in the swap
   space would be removed.  In 2.4.x the location in swap space is
   retained with the goal of getting more efficient swap-ins.

Neither of the known candidates for increasing the swap load applies
when you aren't swapping in the first place.  They may aggravate the
usage of swap when you are already swapping but they do not cause
swapping themselves.  This is why the initial recommendation for
increased swap space size was made.  If you are swapping we will use
more swap.

However what pushes your laptop over the edge into swapping is an
entirely different question.  And probably what should be solved.

> I think that is the point -- it was supported in 2.2, it is, IMO,
> a serious regression that it is not supported in 2.4.

The problem with this general line of arguing is that it lumps a whole
bunch of real issues/regressions into one overall perception.  Since
there are multiple reasons people are seeing problems, they need to be
tracked down with specifics.

The swapoff case comes down to dead swap pages in the swap cache.
This greatly increases the number of swap pages and slows the system
down, but since these pages are trivial to free we don't generate any
I/O, so we don't wait for I/O and thus never enter the scheduler,
making nothing else in the system runnable.

Your case is significantly different.  I don't know if you are seeing 
any issues with swapping at all.  With a 5M usage it may simply be
totally unused pages being pushed out to the swap space.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Eric W. Biederman

Helge Hafting [EMAIL PROTECTED] writes:

> A problem with this is that normal paging-in is allowed to page other
> things out as well.  But you can't have that when swap is about to
> be turned off.  My guess is that swapoff functionality was perceived to
> be so seldom used that they didn't bother too much with scheduling
> or efficiency.

There is some truth in that.  You aren't allowed to allocate new pages
in the swap space currently being removed however.  The current swap
off code removes pages from the current swap space without breaking
any sharing between swap pages.  Depending on your load this may be
important.  Fixing swapoff to be more efficient while at the same time
keeping sharing between pages is tricky.  That under loads that are
easy to trigger in 2.4 swapoff never sleeps is a big bug.

> I don't have the same problem myself though.  Shutting down with
> 30M or so in swap never takes unusual time on 2.4.x kernels here,
> with a 300MHz processor.  I did a test while typing this letter,
> almost filling the 96M swap partition with 88M.  swapoff
> took 1 minute at 100% cpu.  This is long, but the machine was responsive
> most of that time.  I.e. no worse than during a kernel compile.
> The machine froze 10 seconds or so at the end of the minute, I can
> imagine that biting with bigger swap.

O.k. so at some point you actually wait for I/O and other process get
a chance to run.  On the larger machines we never wait for I/O and
thus never schedule at all.

The problem is now understood.  Now we just need to fix it.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread José Luis Domingo López

On Thursday, 07 June 2001, at 09:23:42 +0200,
Helge Hafting wrote:

> Derek Glidden wrote:
> >
> > Helge Hafting wrote:
> [...]
> The machine froze 10 seconds or so at the end of the minute, I can
> imagine that biting with bigger swap.
 
Same behavior here with a Pentium III 600, 128 MB RAM and 128 MB of swap.
Filled mem and swap with the infamous glob() bug (ls ../*/.. etc.), made
swapoff, and the machine kept very responsive except for the last 10-15
seconds before swapoff ends.

Even scrolling complex pages with Mozilla 0.9 worked smoothly :).

-- 
José Luis Domingo López
Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM)
 
jdomingo EN internautas PUNTO org  = ¿ Spam ? Atente a las consecuencias
jdomingo AT internautas DOT   org  = Spam at your own risk




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Marcelo Tosatti



On Thu, 7 Jun 2001, Mike Galbraith wrote:

> On 6 Jun 2001, Eric W. Biederman wrote:
>
> > Mike Galbraith [EMAIL PROTECTED] writes:
> >
> > > > If you could confirm this by calling swapoff sometime other than at
> > > > reboot time.  That might help.  Say by running top on the console.
> > >
> > > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > > switch is nogo...
> > >
> > > After running his memory hog, swapoff took 18 seconds.  I hacked a
> > > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > > utterly comatose for those 4 seconds though.
> >
> > At the top of the while(1) loop in try_to_unuse what happens if you put in
> > if (need_resched) schedule();
> > It should be outside all of the locks.  It might just be a matter of everything
> > serializing on the SMP locks, and the kernel refusing to preempt itself.
>
> That did it.

What about including this workaround in the kernel?




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Miles Lane

On 07 Jun 2001 11:49:47 -0400, Derek Glidden wrote:
> Miles Lane wrote:
> >
> > So please, if you have new facts that you want to offer that
> > will help us characterize and understand these VM issues better
> > or discover new problems, feel free to share them.  But if you
> > just want to rant, I, for one, would rather you didn't.
>
> *sigh*
>
> Not to prolong an already pointless thread, but that really was the
> intent of my original message.  I had figured out a specific way, with
> easy-to-follow steps, to make the VM misbehave under very certain
> conditions.  I even offered to help figure out a solution in any way I
> could, considering I'm not familiar with kernel code.
>
> However, I guess this whole "too much swap" issue has a lot of people on
> edge, and they immediately assumed I was talking about that subject without
> actually reading my original message.

Actually, I think your original message was useful.  It has
spurred a reevaluation of some design assumptions implicit in the VM
in the 2.4 series and has also surfaced some bugs.  It was not you
who I felt was sending inflammatory remarks; it was the folks who
have been bellyaching about the current swap disk space requirements
without offering any new information to help developers remedy
the situation.

So, thanks for bringing the topic up.  :-)

Cheers,
Miles




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Shane Nay

On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote:
> On Thu, 7 Jun 2001, Shane Nay wrote:
> > (Oh, BTW, I really appreciate the work that people have done on the VM,
> > but folks that are just talking..., well, think clearly before you impact
> > other people that are writing code.)
>
> If all the people talking were reporting results we would be really happy.
>
> Seriously, we really lack VM reports.

Okay, I've had some problems with the VM on my machine; what is the most
useful way to compile reports for you?  I have modified the kernel for a few
different ports, fixing bugs, device drivers, etc., but the VM is all
Greek to me.  I can just see that caching is hyper-aggressive and doesn't look
like it's going back to the pool..., which results in sluggish performance.
Now I know from the work that I've done that anecdotal information is almost
never even remotely useful.  Therefore is there any body of information that
I can read up on to create a useful set of data points for you or other VM
hackers to look at?  (Or maybe some report in the past that you thought was
especially useful?)

Thank You,
Shane Nay.
(I have in the past had many problems with the VM on embedded machines as 
well, but I'm not actively working on any right this second..., though my 
Psion is sitting next to me begging for me to run some VM tests on it :)



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Marcelo Tosatti



On Thu, 7 Jun 2001, Shane Nay wrote:

> (Oh, BTW, I really appreciate the work that people have done on the VM, but
> folks that are just talking..., well, think clearly before you impact other
> people that are writing code.)

If all the people talking were reporting results we would be really happy. 

Seriously, we really lack VM reports.




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Shane Nay

Uh, last I checked, on my Linux-based embedded device I didn't want to swap to
flash.  Hmm.., now why was that..., oh, that's right, it's *much* more
expensive than memory, oh yes, and it actually gets FRIED when you write to a
block more than 100k times.  Oh, what was that other thing..., oh yes, and
it's SOLDERED ON THE BOARD.  Damn..., guess I just lost a grand or so.

Seriously folks, Linux isn't just for big webservers...

Thanks,
Shane Nay.
(Oh, BTW, I really appreciate the work that people have done on the VM, but 
folks that are just talking..., well, think clearly before you impact other 
people that are writing code.)

On Wednesday 06 June 2001 02:57, Dr S.M. Huen wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
> > For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
>
> Do I understand you correctly?
> ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> drives.
>
> It will cost you 19x as much to put the RAM in as to put the
> developer's recommended amount of swap space to back up that RAM.  The
> developers gave their reasons for this design some time ago and if the
> ONLY problem was that it required you to allocate more swap, why should
> it be a priority item to fix it for those that refuse to do so?   By all
> means fix it urgently where it doesn't work when used as advised but
> demanding priority to fixing a problem encountered when a user refuses to
> use it in the manner specified seems very unreasonable.  If you can afford
> 4GB RAM, you certainly can afford 8GB swap.



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread Marcelo Tosatti



On Thu, 7 Jun 2001, Shane Nay wrote:

> On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote:
> > On Thu, 7 Jun 2001, Shane Nay wrote:
> > > (Oh, BTW, I really appreciate the work that people have done on the VM,
> > > but folks that are just talking..., well, think clearly before you impact
> > > other people that are writing code.)
> >
> > If all the people talking were reporting results we would be really happy.
> >
> > Seriously, we really lack VM reports.
>
> Okay, I've had some problems with the VM on my machine; what is the most
> useful way to compile reports for you?

1) Describe what you're running. (your workload)
2) Describe what you're feeling. (eg interactivity is crap when I run
this or that thing, etc) 

If we need more info than that I'll request in private. 

Also send these reports to the linux-mm list, so other VM hackers can also
get those reports and we avoid traffic on lk.

> I have modified the kernel for a few different ports, fixing bugs,
> device drivers, etc., but the VM is all Greek to me.  I can just see
> that caching is hyper-aggressive and doesn't look like it's going back
> to the pool..., which results in sluggish performance.

By performance you mean interactivity or throughput? 

> Now I know from the work that I've done that anecdotal information is
> almost never even remotely useful.

If we need more info, we will request. 

> Therefore is there any body of information that I can read up on to
> create a useful set of data points for you or other VM hackers to
> look at?  (Or maybe some report in the past that you thought was
> especially useful?)

Just do what I described above. 

Thanks




Re: Break 2.4 VM in five easy steps

2001-06-07 Thread LA Walsh

Eric W. Biederman wrote:

> LA Walsh [EMAIL PROTECTED] writes:
>
> > Now for whatever reason, since 2.4, I consistently use at least
> > a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
> > like nscd running 7 copies that take 72M.  Seems like overkill for
> > a laptop.
>
> So the question becomes why you are seeing an increased swap usage.
> Currently there are two candidates in the 2.4.x code path.
>
> 1) Delayed swap deallocation: when a program exits after it
>    has gone into swap, its swap usage is not freed. Ouch.

---
Double ouch.  Swap is backing a non-existent program?



> 2) Increased tenacity of swap caching.  In particular, in 2.2.x, if a page
>    that was in the swap cache was written to, the page in the swap
>    space would be removed.  In 2.4.x the location in swap space is
>    retained, with the goal of getting more efficient swap-ins.


But if the page in memory is 'dirty', you can't be efficient with swapping
*in* the page.  The page on disk is invalid and should be released, or am I
missing something?

> Neither of the known candidates for increasing the swap load applies
> when you aren't swapping in the first place.  They may aggravate the
> usage of swap when you are already swapping, but they do not cause
> swapping themselves.  This is why the initial recommendation for
> increased swap space size was made.  If you are swapping, we will use
> more swap.
>
> However, what pushes your laptop over the edge into swapping is an
> entirely different question.  And probably what should be solved.


On my laptop, it is insignificant and to my knowledge has no measurable
impact.  It seems like there is always 3-5 Meg used in swap no matter what's
running (or not) on the system.

> > I think that is the point -- it was supported in 2.2; it is, IMO,
> > a serious regression that it is not supported in 2.4.
>
> The problem with this general line of arguing is that it lumps a whole
> bunch of real issues/regressions into one overall perception.  Since
> there are multiple reasons people are seeing problems, they need to be
> tracked down with specifics.

---
Uhhh, yeah, sorta -- it's addressing the statement that a new requirement of
2.4 is to have double the swap space.  If everyone agrees that's a problem,
then yes, we can go into specifics of what is causing or contributing to it.
It's getting past the attitude of some people that 2x RAM for swap is somehow
"normal and acceptable -- deal with it".  In my case, it seems like 10Mb of
swap would be all that would generally be used (I don't think I've ever seen
swap usage over 7Mb) on a 512M system.  To be told "oh, you're wrong, you
*should* have 1Gig or you are operating in an 'unsupported' or non-standard
configuration" -- I find that very user-unfriendly.



> The swapoff case comes down to dead swap pages in the swap cache.
> This greatly increases the number of swap pages and slows the system
> down; but since these pages are trivial to free, we don't generate any
> I/O, so we don't wait for I/O and thus never enter the scheduler.  Making
> nothing else in the system runnable.

---
I haven't ever *noticed* this on my machine but that could be
because there isn't much in swap to begin with?  Could be I was
just blissfully ignorant of the time it took to do a swapoff.
Hmmm... let's see.  Just tried it.  I didn't get a total lockup,
but cursor movement was definitely jerky:
 time sudo swapoff -a

real0m10.577s
user0m0.000s
sys 0m9.430s

Looking at vmstat, the needed space was taken mostly out of the
page cache (86M-81.8M) and about 700K each out of free and buff.


> Your case is significantly different.  I don't know if you are seeing
> any issues with swapping at all.  With a 5M usage it may simply be
> totally unused pages being pushed out to the swap space.

---
Probably -- I guess the page cache and disk buffers put enough pressure to
push some things off to swap.

-linda
--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Senior MTS, Trust Tech, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338





Re: Break 2.4 VM in five easy steps

2001-06-07 Thread C. Martins


  In my everyday desktop workstation (PII 350) I have 64MB of RAM and use
300MB of swap, 150MB on each hard disk.  After upgrading to 2.4, and
maintaining the same set of applications (KDE, Netscape & friends), the
machine's performance is _definitely_ much worse, in terms of responsiveness
and throughput.  Most applications just take much longer to load, and once
you've done something that required more memory for a while (like compiling
a kernel, opening a large JPEG in gimp, etc.) it takes lots of time to come
back to normal.  Strangely, with 2.4 the workstation just feels as if
someone stole the 64MB DIMM and put in a 16MB one!!
  One thing I find strange is that with 2.4, if you run top or something
similar, you notice that the memory allocated for cache is almost always
using more than half of total RAM.  I don't remember seeing this with the
2.2 kernel series...

  Anyway, I think there is something really broken with respect to the 2.4
VM.  It is just NOT acceptable that, running the same set of apps and doing
the same type of work, when you upgrade your kernel your hardware is no
longer up to the job, when it fitted perfectly well before.  This is just
the MS way of solving problems.

  Best regards

 Claudio Martins 


On Wed, Jun 06, 2001 at 06:58:39AM -0700, Gerhard Mack wrote:
>
> I have several boxes with 2x RAM as swap and performance still sucks
> compared to 2.2.17.
 



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Mike Galbraith

On 6 Jun 2001, Eric W. Biederman wrote:

> Mike Galbraith <[EMAIL PROTECTED]> writes:
>
> > > If you could confirm this by calling swapoff sometime other than at
> > > reboot time.  That might help.  Say by running top on the console.
> >
> > The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> > switch is nogo...
> >
> > After running his memory hog, swapoff took 18 seconds.  I hacked a
> > bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> > utterly comatose for those 4 seconds though.
>
> At the top of the while(1) loop in try_to_unuse what happens if you put in.
> if (need_resched) schedule();
> It should be outside all of the locks.  It might just be a matter of everything
> serializing on the SMP locks, and the kernel refusing to preempt itself.

That did it.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Miles Lane

On 06 Jun 2001 20:34:49 -0400, Mike A. Harris wrote:
> On Wed, 6 Jun 2001, Derek Glidden wrote:
> 
> >>  Derek> overwhelmed.  On the system I'm using to write this, with
> >>  Derek> 512MB of RAM and 512MB of swap, I run two copies of this
> >>
> >> Please see the following message on the kernel mailing list,
> >>
> >> 3086:Linus 2.4.0 notes are quite clear that you need at least twice RAM of swap
> >> Message-Id: <[EMAIL PROTECTED]>
> >
> >Yes, I'm aware of this.
> >
> >However, I still believe that my original problem report is a BUG.  No
> >matter how much swap I have, or don't have, and how much is or isn't
> >being used, running "swapoff" and forcing the VM subsystem to reclaim
> >unused swap should NOT cause my machine to feign death for several
> >minutes.
> >
> >I can easily take 256MB out of this machine, and then I *will* have
> >twice as much swap as RAM and I can still cause the exact same
> >behaviour.
> >
> >It's a bug, and no number of times saying "You need twice as much swap
> >as RAM" will change that fact.
> 
> Precisely.  Saying 8x RAM doesn't change it either.  Sometime
> next week I'm going to purposefully put a new 60Gb disk in on a
> separate controller as pure swap on top of 256Mb of RAM.  My
> guess is after bootup, and login, I'll have 48Gb of stuff in
> swap "just in case".

Mike and others, I am getting tired of your comments.  Sheesh.  
The various developers who actually work on the VM have already
acknowledged the issues and are exploring fixes, including at 
least one patch that already exists.  It seems clear that the 
uproar from the people who are having trouble with the new VM's 
handling of swap space have been heard and folks are going to 
fix these problems.  It may not happen today or tomorrow, but 
soon.  What the heck else do you want?

Making inflammatory remarks about the current situation does
nothing to help get the problems fixed, it just wastes our time 
and bandwidth.

So please, if you have new facts that you want to offer that
will help us characterize and understand these VM issues better
or discover new problems, feel free to share them.  But if you
just want to rant, I, for one, would rather you didn't.

Miles




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Mike A. Harris

On Wed, 6 Jun 2001, android wrote:

>associated with that mindset that made Microsoft such a [fill in the blank].
>As for the 2.4 VM problem, what are you doing with your machine that's
>making it use up so much memory? I have several processes running
>on mine all the time, including a slew in X, and I have yet to see
>significant swap activity.

Try _compiling_ XFree86.  Watch the machine nosedive.

--
Mike A. Harris  -  Linux advocate  -  Open Source advocate
   Opinions and viewpoints expressed are solely my own.
--




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Mike A. Harris

On Wed, 6 Jun 2001, Dr S.M. Huen wrote:

>> For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
>>
>
>Do I understand you correctly?
>ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
>at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
>drives.

Linux is all about technical correctness, and doing the job
properly.  It isn't about "there is a bug in the kernel, but that
is OK because an 8Gb swapfile only costs $2".

Why are half the people here trying to hide behind this diskspace
is cheap argument?  If we rely on that, then Linux sucks shit.

The problem IMHO is widely acknowledged by those who matter as an
official BUG, and that is that.  It is also acknowledged widely
by those who can fix the problem that it will be fixed in time.

So technically speaking - the kernel has a widely known
bug/misfeature, which is acknowledged by core kernel developers
as needing fixing, and that it will get fixed at some point.

Saying it is a nonissue due to the cost of hardware resources is
just plain Microsoft attitude and holds absolutely zero technical
merit.

It *IS* an issue, because it is making Linux suck, and is causing
REAL WORLD PROBLEMS.  The "use 2x RAM" rule is nothing more than a
band-aid workaround, so don't claim that it is the proper fix due to
big wallet sizes.

I have 2.2 doing a software build that takes 40 minutes with
256Mb of RAM, and 1G of swap.  The same build on 2.4 takes 60
minutes.  That is 4x RAM for swap.

Lowering the swap down to 2x RAM makes no difference in the
numbers, down to 1x RAM the 2.4 build slows down horrendously,
and dropping the swap to 20Mb makes it die completely in 2.4.

2.4 is fine for a firewall, or certain other applications, but
regardless of the amount of SWAP,  I'll take the 40minute build
using 2.2 over the 60minute build using 2.4 anyday.

This is the real world.  And no cost isn't an issue to me.
Putting another 80Gb drive in this box for swap isn't going to
help the work get done any faster.


--
Mike A. Harris  -  Linux advocate  -  Open Source advocate
   Opinions and viewpoints expressed are solely my own.
--




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Kai Henningsen

[EMAIL PROTECTED] (Alexander Viro)  wrote on 06.06.01 in 
<[EMAIL PROTECTED]>:

> On Wed, 6 Jun 2001, Sean Hunter wrote:
>
> > This is completely bogus. I am not saying that I can't afford the swap.
> > What I am saying is that it is completely broken to require this amount
> > of swap given the boundaries of efficient use.
>
> Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD
> systems I've used were broken, but I've never thought that swap==2*RAM rule
> was one of them.

As a "will break without" rule, I'd consider a kernel with that property  
completely unsuitable for production use. I certainly don't remember  
thinking of that as more than a recommendation back when I used commercial  
Unices (SysVsomething).

MfG Kai



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton

At 11:27 pm +0100 6/6/2001, android wrote:
>> >I'd be happy to write a new routine in assembly
>>
>>I sincerely hope you're joking.
>>
>>It's the algorithm that needs fixing, not the implementation of that
>>algorithm.  Writing in assembler?  Hope you're proficient at writing in
>>x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
>>architectures we support these days.  And you darn well better hope every
>>other kernel hacker is as proficient as that, to be able to read it.

>As for the algorithm, I'm sure that
>whatever method is used to handle page swapping, it has to comply with
>the kernel's memory management scheme already in place. That's why I would
>need the details so that I wouldn't create more problems than already present.

Have you actually been following this thread?  The algorithm has been
discussed and at least one alternative brought forward.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)





Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Robert Love

On 06 Jun 2001 15:27:57 -0700, android wrote:
> >I sincerely hope you're joking.
>
> I realize that assembly is platform-specific. Being that I use the IA32 class
> machine, that's what I would write for. Others who use other platforms could
> do the deed for their native language.

no, look at the code. it is not going to benefit from assembly (assuming
you can even implement it cleanly in assembly).  it's basically an
iteration of other function calls.

doing a new implementation in assembly for each platform is not
feasible, anyhow. this is the sort of thing that needs to be uniform.

this really has nothing to do with the "iron" of the computer -- it's a
loop to check and free swap pages. assembly will not provide benefit.

-- 
Robert M. Love
[EMAIL PROTECTED]
[EMAIL PROTECTED]




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Antoine

hi,

I have a problem with kswapd: it suddenly takes 98% CPU and crashes my
server.  I don't know why.  I'm running a Linux 2.2.17 kernel (Debian
distro); if anyone can help me... thx ;)

Antoine




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread android


> >I'd be happy to write a new routine in assembly
>
>I sincerely hope you're joking.
>
>It's the algorithm that needs fixing, not the implementation of that
>algorithm.  Writing in assembler?  Hope you're proficient at writing in
>x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
>architectures we support these days.  And you darn well better hope every
>other kernel hacker is as proficient as that, to be able to read it.
I realize that assembly is platform-specific.  Since I use an IA32-class
machine, that's what I would write for.  Others who use other platforms
could do the deed for their native language.  As for the algorithm, I'm sure that
whatever method is used to handle page swapping, it has to comply with
the kernel's memory management scheme already in place. That's why I would
need the details so that I wouldn't create more problems than already present.
Since most users are on the IA32 platform, I'm sure they wouldn't reject
an assembly solution to this problem. As for kernel acceptance, that's an
issue for the political eggheads. Not my forte. :-)

  -- Ted




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Jonathan Morton

>I'd be happy to write a new routine in assembly

I sincerely hope you're joking.

It's the algorithm that needs fixing, not the implementation of that
algorithm.  Writing in assembler?  Hope you're proficient at writing in
x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
architectures we support these days.  And you darn well better hope every
other kernel hacker is as proficient as that, to be able to read it.

IOW, no chance.

--
from: Jonathan "Chromatix" Morton
mail: [EMAIL PROTECTED]  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)





Re: Break 2.4 VM in five easy steps

2001-06-06 Thread LA Walsh

"Eric W. Biederman" wrote:

> The hard rule will always be that to cover all pathological cases swap
> must be greater than RAM.  Because in the worst case all RAM will be
> in the swap cache.  That this is more than just the worst case in 2.4
> is problematic.  I.e. in the worst case:
> Virtual Memory = RAM + (swap - RAM).

Hmmm... so my 512M laptop only really has 256M?  Um... I regularly run
more than 256M of programs.  I don't want it to swap -- it's a special, weird
condition if I do start swapping.  I don't want to waste 1G of HD (5%) for
something I never want to use.  IRIX runs just fine with swap < RAM.

> You can't improve the worst case.  We can improve the worst case that
> many people are facing.

---
Other OS's don't have this pathological 'worst case' scenario.  Even
my Windows [vm]box seems to operate fine with swap < RAM.

> It's worth complaining about.  It is also worth digging into and find
> out what the real problem is.  I have a hunch that this whole
> conversation on swap sizes being irritating is hiding the real
> problem.

---
Okay, admission of ignorance.  When we speak of "swap space",
is this term inclusive of both demand paging space and
swap-out-entire-programs space, or one or the other?
-linda

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338






Re: Break 2.4 VM in five easy steps

2001-06-06 Thread android


>Is anybody interested in making "swapoff()" better? Please speak up..
>
> Linus

I'd be happy to write a new routine in assembly, if I had a clue as to how
the VM algorithm works in Linux. What should swapoff  do if all physical
memory is in use? How does the swapping algorithm balance against
cache memory? Can someone point me to where I can find the exact
details of the VM mechanism in Linux? Thanks!

   -- Ted




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Linus Torvalds

In article <[EMAIL PROTECTED]>,
Derek Glidden  <[EMAIL PROTECTED]> wrote:
>
>After reading the messages to this list for the last couple of weeks and
>playing around on my machine, I'm convinced that the VM system in 2.4 is
>still severely broken.  

Now, this may well be true, but what you actually demonstrated is that
"swapoff()" is extremely (and I mean _EXTREMELY_) inefficient, to the
point that it can certainly be called broken.

It got worse in 2.4.x not so much due to any generic VM worseness, as
due to the fact that the much more persistent swap cache behaviour in
2.4.x just exposes the fundamental inefficiencies of "swapoff()" more
clearly.  I don't think the swapoff() algorithm itself has changed, it's
just that the algorithm was always exponential, I think (and because of
the persistent swap cache, the "n" in the algorithm became much bigger). 

So this is really a separate problem from the general VM balancing
issues. Go and look at the "try_to_unuse()" logic, and wince. 

I'd love to have somebody look a bit more at swap-off.  It may well be,
for example, that swap-off does not correctly notice dead swap-pages at
all - somebody should verify that it doesn't try to read in and
"try_to_unuse()" dead swap entries.  That would make the inefficiency
show up even more clearly. 

(Quick look gives the following: right now try_to_unuse() in
mm/swapfile.c does something like

lock_page(page);
if (PageSwapCache(page))
delete_from_swap_cache_nolock(page);
UnlockPage(page);
read_lock(&tasklist_lock);
for_each_task(p)
unuse_process(p->mm, entry, page);
read_unlock(&tasklist_lock);
shmem_unuse(entry, page);
/* Now get rid of the extra reference to the temporary
   page we've been using. */
page_cache_release(page);

and we should trivially notice that if the page count is 1, it cannot be
mapped in any process, so we should maybe add something like

lock_page(page);
if (PageSwapCache(page))
delete_from_swap_cache_nolock(page);
UnlockPage(page);
+   if (page_count(page) == 1)
+   goto nothing_to_do;
read_lock(&tasklist_lock);
for_each_task(p)
unuse_process(p->mm, entry, page);
read_unlock(&tasklist_lock);
shmem_unuse(entry, page);
+
+   nothing_to_do:
+
/* Now get rid of the extra reference to the temporary
   page we've been using. */
page_cache_release(page);


which should (assuming I got the page count thing right - I've obviously
not tested the above change) make sure that we don't spend tons of time
on dead swap pages. 

Somebody interested in trying the above add? And looking for other more
obvious bandaid fixes.  It won't "fix" swapoff per se, but it might make
it bearable and bring it to the 2.2.x levels. 

The _real_ fix is to really make "swapoff()" work the other way around -
go through each process and look for swap entries in the page tables
_first_, and bring all entries for that device in sanely, and after
everything is brought in just drop all the swap cache pages for that
device. 

The current swapoff() thing is really a quick hack that has lived on
since early 1992 with quick hacks to make it work with the big VM
changes that have happened since. 

That would make swapoff be O(n) in VM size (and you can easily do some
further micro-optimizations at that time by avoiding shared mappings
with backing store and other things that cannot have swap info involved)

Is anybody interested in making "swapoff()" better? Please speak up..

Linus



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Daniel Phillips

On Wednesday 06 June 2001 20:27, Eric W. Biederman wrote:
> The hard rule will always be that to cover all pathological cases
> swap must be greater than RAM.  Because in the worst case all RAM
> will be in the swap cache.

Could you explain in very simple terms how the worst case comes about?

--
Daniel



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Derek Glidden

Mike Galbraith wrote:
> 
> Can you try the patch below to see if it helps?  If you watch
> with vmstat, you should see swap shrinking after your test.
> Let is shrink a while and then see how long swapoff takes.
> Under a normal load, it'll munch a handfull of them at least
> once a second and keep them from getting annoying. (theory;)

Hi Mike,
I'll give that patch a spin this evening after work when I have time to
patch and recompile the kernel.

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Eric W. Biederman

Mike Galbraith <[EMAIL PROTECTED]> writes:

> On 6 Jun 2001, Eric W. Biederman wrote:
> 
> > Derek Glidden <[EMAIL PROTECTED]> writes:
> >
> >
> > > The problem I reported is not that 2.4 uses huge amounts of swap but
> > > that trying to recover that swap off of disk under 2.4 can leave the
> > > machine in an entirely unresponsive state, while 2.2 handles identical
> > > situations gracefully.
> > >
> >
> > The interesting thing from other reports is that it appears to be kswapd
> > using up CPU resources.  Not the swapout code at all.  So it appears
> > to be a fundamental VM issue.  And calling swapoff is just a good way
> > to trigger it.
> >
> > If you could confirm this by calling swapoff sometime other than at
> > reboot time.  That might help.  Say by running top on the console.
> 
> The thing goes comatose here too. SCHED_RR vmstat doesn't run, console
> switch is nogo...
> 
> After running his memory hog, swapoff took 18 seconds.  I hacked a
> bleeder valve for dead swap pages, and it dropped to 4 seconds.. still
> utterly comatose for those 4 seconds though.

At the top of the while(1) loop in try_to_unuse, what happens if you put in:
if (need_resched) schedule();
It should be outside all of the locks.  It might just be a matter of everything
serializing on the SMP locks, and the kernel refusing to preempt itself.

Eric




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Derek Glidden

"Eric W. Biederman" wrote:
> 
> Derek Glidden <[EMAIL PROTECTED]> writes:
> 
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
> >
> 
> The interesting thing from other reports is that it appears to be kswapd
> using up CPU resources.  Not the swapout code at all.  So it appears
> to be a fundamental VM issue.  And calling swapoff is just a good way
> to trigger it.
> 
> If you could confirm this by calling swapoff sometime other than at
> reboot time.  That might help.  Say by running top on the console.

That's exactly what my original test was doing.  I think it was Jeffrey
Baker complaining about "swapoff" at reboot.  See my original post that
started this thread and follow the "five easy steps."  :)  I'm sucking
down a lot of swap, although not all that's available which is something
I am specifically trying to avoid - I wanted to stress the VM/swap
recovery procedure, not "out of RAM and swap" memory pressure - and then
running 'swapoff' from an xterm or a console.

The problem with being able to see what's eating up CPU resources is
that the whole machine stops responding for me to tell.  Consoles stop
updating, the X display freezes, keyboard input is locked out, etc.  As
far as anyone can tell, for several minutes, the whole machine is locked
up. (except, strangely enough, the machine will still respond to ping) 
I've tried running 'top' to see what task is taking up all the CPU time,
but the system hangs before it shows anything meaningful.  I have been
able to tell that it hits 100% "system" utilization very quickly though.

I did notice that the first thing sys_swapoff() does is call
lock_kernel() ... so if sys_swapoff() takes a long time, I imagine
things will get very unresponsive quickly.  (But I'm not intimately
familiar with the various kernel locks, so I don't know what
granularity/atomicity/whatever lock_kernel() enforces.)

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval 

usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
| extract_mpeg2 | mpeg2dec - 

http://www.eff.org/  http://www.opendvd.org/
 http://www.cs.cmu.edu/~dst/DeCSS/Gallery/



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Mike Galbraith

On 6 Jun 2001, Eric W. Biederman wrote:

> Derek Glidden <[EMAIL PROTECTED]> writes:
>
>
> > The problem I reported is not that 2.4 uses huge amounts of swap but
> > that trying to recover that swap off of disk under 2.4 can leave the
> > machine in an entirely unresponsive state, while 2.2 handles identical
> > situations gracefully.
> >
>
> The interesting thing from other reports is that it appears to be kswapd
> using up CPU resources.  Not the swapout code at all.  So it appears
> to be a fundamental VM issue.  And calling swapoff is just a good way
> to trigger it.
>
> If you could confirm this by calling swapoff at some time other than
> reboot, that might help - say, by running top on the console.

The thing goes comatose here too.  A SCHED_RR vmstat doesn't run, and
console switching is a no-go...

After running his memory hog, swapoff took 18 seconds.  I hacked a
bleeder valve for dead swap pages, and it dropped to 4 seconds... still
utterly comatose for those 4 seconds, though.

-Mike




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread android


>Furthermore, I am not demanding anything, much less "priority fixing"
>for this bug. Its my personal opinion that this is the most critical bug
>in the 2.4 series, and if I had the time and skill, this is what I would
>be working on. Because I don't have the time and skill, I am perfectly
>happy to wait until those that do fix the problem. To say it isn't a
>problem because I can buy more disk is nonsense, and its that sort of
>thinking that leads to constant need to upgrade hardware in the
>proprietary OS world.
>
>Sean

This reflects the Microsoft way of programming:
if there's a bug in the system, don't fix it, just upgrade your
hardware.  Why do you think the requirements for Windows are so high?
Most of their code is very inefficient.  I'm sure they programmed
their kernel in Visual Basic.  The worst part is that they get
paid to do this!  I program in Linux because I don't want to be
associated with the mindset that made Microsoft such a [fill in the blank].
As for the 2.4 VM problem, what are you doing with your machine that
makes it use up so much memory?  I have several processes running
on mine all the time, including a slew in X, and I have yet to see
significant swap activity.

   -- Ted

P.S. My faithful Timex Sinclair from the 80's never had swap :-)




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Mike Galbraith

On Tue, 5 Jun 2001, Derek Glidden wrote:

> After reading the messages to this list for the last couple of weeks and
> playing around on my machine, I'm convinced that the VM system in 2.4 is
> still severely broken.

...

Hi,

Can you try the patch below to see if it helps?  If you watch with
vmstat, you should see swap shrinking after your test.  Let it shrink
a while and then see how long swapoff takes.  Under a normal load, it
should munch a handful of dead swap pages at least once a second and
keep them from getting annoying.  (In theory ;)

-Mike


--- linux-2.4.5.ac5/mm/vmscan.c.org	Sat Jun  2 07:37:16 2001
+++ linux-2.4.5.ac5/mm/vmscan.c	Wed Jun  6 18:29:02 2001
@@ -1005,6 +1005,53 @@
 	return ret;
 }
 
+int deadswap_reclaim(unsigned int priority)
+{
+	struct list_head * page_lru;
+	struct page * page;
+	int maxscan = nr_active_pages >> priority;
+	int nr_reclaim = 0;
+
+	/* Take the lock while messing with the list... */
+	spin_lock(&pagemap_lru_lock);
+	while (maxscan-- > 0 && (page_lru = active_list.prev) != &active_list) {
+		page = list_entry(page_lru, struct page, lru);
+
+		/* Wrong page on list?! (list corruption, should not happen) */
+		if (!PageActive(page)) {
+			printk("VM: refill_inactive, wrong page on list.\n");
+			list_del(page_lru);
+			nr_active_pages--;
+			continue;
+		}
+
+		if (PageSwapCache(page) &&
+		    (page_count(page) - !!page->buffers) == 1 &&
+		    swap_count(page) == 1) {
+			if (page->buffers || TryLockPage(page)) {
+				ClearPageReferenced(page);
+				ClearPageDirty(page);
+				page->age = 0;
+				deactivate_page_nolock(page);
+			} else {
+				page_cache_get(page);
+				spin_unlock(&pagemap_lru_lock);
+				delete_from_swap_cache_nolock(page);
+				spin_lock(&pagemap_lru_lock);
+				UnlockPage(page);
+				page_cache_release(page);
+			}
+			nr_reclaim++;
+			continue;
+		}
+		list_del(page_lru);
+		list_add(page_lru, &active_list);
+	}
+	spin_unlock(&pagemap_lru_lock);
+
+	return nr_reclaim;
+}
+
 DECLARE_WAIT_QUEUE_HEAD(kreclaimd_wait);
 /*
  * Kreclaimd will move pages from the inactive_clean list to the
@@ -1027,7 +1074,7 @@
 		 * We sleep until someone wakes us up from
 		 * page_alloc.c::__alloc_pages().
 		 */
-		interruptible_sleep_on(&kreclaimd_wait);
+		interruptible_sleep_on_timeout(&kreclaimd_wait, HZ);
 
 		/*
 		 * Move some pages from the inactive_clean lists to
@@ -1051,6 +1098,7 @@
 			}
 			pgdat = pgdat->node_next;
 		} while (pgdat);
+		deadswap_reclaim(4);
 	}
 }





Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Eric W. Biederman

Derek Glidden <[EMAIL PROTECTED]> writes:


> The problem I reported is not that 2.4 uses huge amounts of swap but
> that trying to recover that swap off of disk under 2.4 can leave the
> machine in an entirely unresponsive state, while 2.2 handles identical
> situations gracefully.  
> 

The interesting thing from other reports is that it appears to be kswapd
using up CPU resources.  Not the swapout code at all.  So it appears
to be a fundamental VM issue.  And calling swapoff is just a good way
to trigger it. 

If you could confirm this by calling swapoff at some time other than
reboot, that might help - say, by running top on the console.

Eric






Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Mark Salisbury

On Wed, 06 Jun 2001, Dr S.M. Huen wrote:
> The whole screaming match is about whether a drastic degradation on using
> swap with less than the 2*RAM swap specified by the developers should lead
> one to conclude that a kernel is "broken".

I would argue that any system that performs substantially worse with
swap == 1*RAM than with swap == 0*RAM is fundamentally broken.  It
seems that with today's 2.4.x kernels, people running programs
totalling LESS THAN their physical DRAM are having swap problems.  They
should not even be using 1 byte of swap.

The whole point of swapping pages is to give you more memory to execute
programs.

If I want to execute 140MB of programs+kernel on a system with 128MB of
RAM, I should be able to do the job effectively with ANY amount of
"total memory" exceeding 140MB, not some hokey 128MB RAM + 256MB swap
just because the kernel is too fscked up to deal with a small swap file.

-- 
/***
**   Mark Salisbury | Mercury Computer Systems**
***/




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Derek Glidden

"Eric W. Biederman" wrote:
> 
> > Or are you saying that if someone is unhappy with a particular
> > situation, they should just keep their mouth shut and accept it?
> 
> It's worth complaining about.  It is also worth digging into to find
> out what the real problem is.  I have a hunch that this whole
> conversation on swap sizes being irritating is hiding the real
> problem.

I totally agree with this, and want to reiterate that the original
problem I posted has /nothing/ to do with the "swap == 2*RAM" issue.

The problem I reported is not that 2.4 uses huge amounts of swap but
that trying to recover that swap off of disk under 2.4 can leave the
machine in an entirely unresponsive state, while 2.2 handles identical
situations gracefully.  

I'm annoyed by 2.4's "requirement" of too much swap, but I consider that
less a bug and more a severe design flaw.  

-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
#!/usr/bin/perl -w
$_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map
{$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;
$t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z)
[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join
"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d=
unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d
>>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*
8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}
print+x"C*",@a}';s/x/pack+/g;eval 

usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \
| extract_mpeg2 | mpeg2dec - 

http://www.eff.org/http://www.opendvd.org/ 
 http://www.cs.cmu.edu/~dst/DeCSS/Gallery/



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Dr S.M. Huen

On Wed, 6 Jun 2001, Kurt Roeckx wrote:

> On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote:
> > On Wed, 6 Jun 2001, Sean Hunter wrote:
> > 
> > > 
> > > For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
> > > 
> > 
> > Do I understand you correctly?
> > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> > drives.
> 
> Maybe you really should reread the statements people made about
> this before.
> 
I think you might do well to read or quote the thread more carefully
yourself before casting such aspersions.

I did not recommend swap use.  I argued that it was not reasonable to
reject a 2*RAM swap requirement on cost grounds.  There are those who
do not think this argument adequate, on grounds other than hardware
cost (e.g. retrofitting existing farms, laptops with zillions of OSes,
etc.)

> 
> That swap = 2 * RAM is just a guideline, you really should look
> at what applications you run, and how memory they use.  If you
> choise your RAM so that all application can always be in memory
> at all time, there is no need for swap.  If they can't be, the
> rule might help you.
> 
I think the whole argument of the thread is against you here.  It seems
that if you do NOT provide 2*RAM you get into trouble much earlier than
you would expect (and a few argue that even if you do, you still get
into trouble).  If it were just a guideline that gracefully degraded
your performance, the other lot wouldn't be screaming.

The whole screaming match is about whether a drastic degradation when
using swap with less than the 2*RAM specified by the developers should
lead one to conclude that the kernel is "broken".

To conclude, this is not a hypothetical argument about whether to
operate completely in core.  There's not a person on LKML who doesn't
know that running in RAM is better than swapping.  It is one where
users do use swap, but allocate less than the recommended size and are
adversely affected.  It is about whether a kernel that reacts this way
can be regarded as stable.  Answe






Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Eric W. Biederman

Derek Glidden <[EMAIL PROTECTED]> writes:

> John Alvord wrote:
> > 
> > On Wed, 06 Jun 2001 11:31:28 -0400, Derek Glidden
> > <[EMAIL PROTECTED]> wrote:
> > 
> > >
> > >I'm beginning to be amazed at the Linux VM hackers' attitudes regarding
> > >this problem.  I expect this sort of behaviour from academics - ignoring
> > >real actual problems being reported by real actual people really and
> > >actually experiencing and reporting them because "technically" or
> > >"theoretically" they "shouldn't be an issue" or because "the "literature
> > >[documentation] says otherwise - but not from this group.
> > 
> > There have been multiple comments that a fix for the problem is
> > forthcoming. Is there some reason you have to keep talking about it?
> 
> Because there have been many more comments of "The rule for 2.4 is
> 'swap == 2*RAM' and that's the way it is" and "disk space is cheap -
> just add more" than of "this is going to be fixed", which is extremely
> discouraging and doesn't instill much confidence that this problem is
> being taken seriously.

The hard rule will always be that, to cover all pathological cases,
swap must be greater than RAM, because in the worst case all of RAM
will be in the swap cache.  That 2.4 hits more than just the worst
case is what is problematic.  I.e. in the worst case:
Virtual Memory = RAM + (swap - RAM).

You can't improve the worst case.  What we can improve is the
near-worst-case behaviour that many people are actually facing.

> Or are you saying that if someone is unhappy with a particular
> situation, they should just keep their mouth shut and accept it?

It's worth complaining about.  It is also worth digging into to find
out what the real problem is.  I have a hunch that this whole
conversation about swap sizes being irritating is hiding the real
problem.

Eric



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread José Luis Domingo López

On Wednesday, 06 June 2001, at 10:19:30 +0200,
Xavier Bestel wrote:

> On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote:
> > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote:
> [...]
> Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M
> swapfile to your box)
>
I'm not a kernel guru, nor can I even pretend to understand how an
operating system's memory management is designed or behaves.  But I
have some questions and thoughts:

1. Is swap = 2*RAM a design issue, or just a recommendation to get the
best results from the current state of the VM subsystem?
2. Wouldn't performance drop quickly once the VM starts swapping
processes/pages to disk instead of keeping them in RAM?  Maybe having a
couple of GB worth of processes on disk is not very wise.
3. Shouldn't an ideal VM manage swap space as an extension of the
system's RAM (taking into account, of course, that RAM is much faster
than disk, and that nothing should be on swap if there is enough room
in RAM)?
4. Wouldn't you say that "adding more swap" (maybe 2*RAM is a
recommendation, maybe a temporary fix, maybe a design decision) is the
M$ way of fixing things?  If there is a _real_ need for more swap to
get a well-behaved system, let's add swap.  But we shouldn't hide inner
design and/or implementation problems behind the "cheap multigigabyte
disks" argument.
5. AFAIK, kernel developers are well aware of current 2.4.x problems in
some areas.  I don't think harping on certain problems without
providing ideas, testing, or support, and limiting oneself to blaming
the authors, is the best way to go.  Maybe the kernel hackers are the
most interested of all in fixing these issues ASAP.

Just some thoughts from someone unable to write C code and help fix
this mess ;).

--
José Luis Domingo López
Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM)
 
jdomingo EN internautas PUNTO org  => ¿ Spam ? Atente a las consecuencias
jdomingo AT internautas DOT   org  => Spam at your own risk




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Kurt Roeckx

On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
> 
> > 
> > For large memory boxes, this is ridiculous.  Should I have 8GB of swap?
> > 
> 
> Do I understand you correctly?
> ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even
> at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB
> drives.

Maybe you really should reread the statements people made about
this before.

One of them being that if you're not using swap under 2.2, you won't
need any under 2.4 either.

2.4 will use more swap in the cases where it does use swap.  It now
works more like other UNIX variants, where the rule is swap = 2 * RAM.

That swap = 2 * RAM is just a guideline; you really should look at
what applications you run and how much memory they use.  If you choose
your RAM so that all applications can always be in memory at all
times, there is no need for swap.  If they can't be, the rule might
help you.

I think someone said that swap should be large enough to hold all the
applications that are running, that is, if you want to use swap at all.

Disk may be a lot cheaper than RAM, but it's also a lot slower.


Kurt




Re: Break 2.4 VM in five easy steps

2001-06-06 Thread Remi Turk

On Wed, Jun 06, 2001 at 06:48:32AM -0400, Alexander Viro wrote:
> On Wed, 6 Jun 2001, Sean Hunter wrote:
> 
> > This is completely bogus. I am not saying that I can't afford the swap.
> > What I am saying is that it is completely broken to require this amount
> > of swap given the boundaries of efficient use. 
> 
> Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD
> systems I've used were broken, but I've never thought that swap==2*RAM rule
> was one of them.
> 
> Not that being more kind on swap would be a bad thing, but that rule for
> amount of swap is pretty common. ISTR similar for (very old) SCO, so it's
> not just BSD world. How are modern Missed'em'V variants in that respect, BTW?

Although I don't have any swap trouble myself, I think what most people
are having problems with is not that Linux lacks the
"you-don't-need-2xRAM-of-swap-if-you-swap-at-all" feature, but that it
lost that feature in 2.4.

-- 
Linux 2.4.5-ac9 #5 Wed Jun 6 18:30:24 CEST 2001


