Re: Freezing bug in all kernels greater than 2.4.5-ac13 *AND* 2.4.6-pre2

2001-06-30 Thread tcm

I'm currently running 2.4.6-pre8 and happy as a clam. The problem
has been found and reverted: from my discussions with Linus, it looks
like the page_launder change introduced in pre3 and also included in
ac14 was causing the hangs/near freezes.

I'm not really much of a coder, so I can't say what was wrong
with it, only what the symptoms were and how to get it to screw up
whenever I wanted to test for it. (See previous messages for how to do
this.) If Rik van Riel, Marcelo Tosatti, or anyone else wants me to gather
information on what is going on just before/after the kernel dies, I'll
do it - just tell me how, and I'll push it along :)

Thanks a bunch Linus,
Tim
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Freezing bug in all kernels greater than 2.4.5-ac13 *AND* 2.4.6-pre2

2001-06-27 Thread Marcelo Tosatti



On Wed, 27 Jun 2001, Marcelo Tosatti wrote:

> 
> 
> On Wed, 27 Jun 2001 [EMAIL PROTECTED] wrote:
> 
> > I decided, for the hell of it, to test the pre series as I've been
> > nudged by many people to try it in favor of the ac kernel series that
> > I've been having problems with. Well, it turns out I have run into
> > exactly the same problem I had with the ac kernel series, which quite
> > frankly is surprising the hell out of me.
> > 
> > To make the kernel freeze/slow down to a crawl with affected kernels on
> > my machine I do this test:
> > 
> > Load X (This fills up my ram and causes me to swap a bit)
> > run an rxvt and su to root (probably unnecessary)
> > du /
> > 
> > Now, somewhere in this test I start swapping a little bit, nothing
> > big... then BAM. Hard disk, mouse, keyboard, all completely and utterly
> > stop. Video continues to work, but my CPU's load goes absolutely INSANE.
> > (If it recovers, gkrellm generally says I've gotten a loadavg somewhere
> > between 3-20, depending on how long it was stuck.) This can last for
> > seconds (usually), minutes (once), or it can simply get worse and hang the
> > machine (many, many, many times).
> > 
> > When it recovers from this, I generally see a MASSIVE write to swap,
> > (I'm using gkrellm to monitor it) and the system continues on as if
> > nothing happened - until, of course, this happens again. A kernel
> > compile can cause it. An rm -R of a large directory can cause it. Loading
> > a large application can cause it.
> > 
> > On some kernels this is more noticeable than others - ac15 does it the
> > worst, although pre3 rivals it, and the symptoms are different on
> > ac17/18 - it'll simply freeze randomly and with no recovery instead of
> > sometimes freezing or sometimes slowing down to a crawl and recovering
> > or freezing. (Which is worse? You decide.)
> > 
> > Now, as before, I tested this with swap and without swap. With swap, I
> > get the hangs/freezes in all the affected kernels. Without swap, I
> > don't. Nada.
> > 
> > Now, the big question of the day folks: What changed between 2.4.6-pre2
> > and 2.4.6-pre3 that ALSO changed between 2.4.5-ac13 and 2.4.5-ac14 - and
> > now, what part of those patches were the VM? Anyone? I don't see in
> > 2.4.6-pre3 what changed that was part of the VM... So I am trying to
> > narrow it down a bit :)
> > 
> > This bug is driving me slightly nuts, so I want it dead. Anyone got an
> > exterminator handy? =)
> 
> Rik's page_launder() changes. 

Eek. I mean Rik's page_launder() changes are _causing_ the problem. (It's
the only VM change between 2.4.6-pre2->pre3 and 2.4.5-ac13->ac14.)

Question:

What's the size of the inactive dirty and clean lists when you're about to
crash?
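
One way to capture those numbers is to sample /proc/meminfo shortly
before the hang. The following is only a minimal Python sketch and is
not taken from this thread. It assumes the inactive lists are reported
under the field names Inact_dirty, Inact_clean and Inact_target; those
names, and the helper names inactive_list_sizes() and sample(), are
assumptions to adjust for the kernel being tested.

```python
import re

# ASSUMPTION: these are the /proc/meminfo labels for the inactive lists
# on the kernel under test; change them if your kernel uses other names.
FIELDS = ("Inact_dirty", "Inact_clean", "Inact_target")


def inactive_list_sizes(meminfo_text, fields=FIELDS):
    """Return {field: size_in_kB} for each requested field found in the text."""
    sizes = {}
    for line in meminfo_text.splitlines():
        # Matches lines of the form "Name:   12345 kB".
        match = re.match(r"^(\w+):\s+(\d+)\s*kB", line)
        if match and match.group(1) in fields:
            sizes[match.group(1)] = int(match.group(2))
    return sizes


def sample(path="/proc/meminfo"):
    """Read the file once and return the inactive list sizes."""
    with open(path) as f:
        return inactive_list_sizes(f.read())
```

Calling sample() once a second and appending the result to a log file
during the du / test would leave the last values seen before the
machine stopped responding.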




Re: Freezing bug in all kernels greater than 2.4.5-ac13 *AND* 2.4.6-pre2

2001-06-27 Thread Marcelo Tosatti



On Wed, 27 Jun 2001 [EMAIL PROTECTED] wrote:

> I decided, for the hell of it, to test the pre series as I've been
> nudged by many people to try it in favor of the ac kernel series that
> I've been having problems with. Well, it turns out I have run into
> exactly the same problem I had with the ac kernel series, which quite
> frankly is surprising the hell out of me.
> 
> To make the kernel freeze/slow down to a crawl with affected kernels on
> my machine I do this test:
> 
> Load X (This fills up my ram and causes me to swap a bit)
> run an rxvt and su to root (probably unnecessary)
> du /
> 
> Now, somewhere in this test I start swapping a little bit, nothing
> big... then BAM. Hard disk, mouse, keyboard, all completely and utterly
> stop. Video continues to work, but my CPU's load goes absolutely INSANE.
> (If it recovers, gkrellm generally says I've gotten a loadavg somewhere
> between 3-20, depending on how long it was stuck.) This can last for
> seconds (usually), minutes (once), or it can simply get worse and hang the
> machine (many, many, many times).
> 
> When it recovers from this, I generally see a MASSIVE write to swap,
> (I'm using gkrellm to monitor it) and the system continues on as if
> nothing happened - until, of course, this happens again. A kernel
> compile can cause it. An rm -R of a large directory can cause it. Loading
> a large application can cause it.
> 
> On some kernels this is more noticeable than others - ac15 does it the
> worst, although pre3 rivals it, and the symptoms are different on
> ac17/18 - it'll simply freeze randomly and with no recovery instead of
> sometimes freezing or sometimes slowing down to a crawl and recovering
> or freezing. (Which is worse? You decide.)
> 
> Now, as before, I tested this with swap and without swap. With swap, I
> get the hangs/freezes in all the affected kernels. Without swap, I
> don't. Nada.
> 
> Now, the big question of the day folks: What changed between 2.4.6-pre2
> and 2.4.6-pre3 that ALSO changed between 2.4.5-ac13 and 2.4.5-ac14 - and
> now, what part of those patches were the VM? Anyone? I don't see in
> 2.4.6-pre3 what changed that was part of the VM... So I am trying to
> narrow it down a bit :)
> 
> This bug is driving me slightly nuts, so I want it dead. Anyone got an
> exterminator handy? =)

Rik's page_launder() changes. 


> 
> Refer to my previous post with this subject for my original description
> of this problem. It's still there in ac18, though I've not tested 19
> (Some have said it's not likely to have been fixed, and I've been
> regress testing 2.4.6pre's today.)
> 
> Subject: Possible freezing bug located after ac13
> 
> Let me know if I can provide any additional information that will help
> nail this bug to the wall. (I want to torture it. =)




Freezing bug in all kernels greater than 2.4.5-ac13 *AND* 2.4.6-pre2

2001-06-27 Thread tcm

I decided, for the hell of it, to test the pre series as I've been
nudged by many people to try it in favor of the ac kernel series that
I've been having problems with. Well, it turns out I have run into
exactly the same problem I had with the ac kernel series, which quite
frankly is surprising the hell out of me.

To make the kernel freeze/slow down to a crawl with affected kernels on
my machine I do this test:

Load X (This fills up my ram and causes me to swap a bit)
run an rxvt and su to root (probably unnecessary)
du /

Now, somewhere in this test I start swapping a little bit, nothing
big... then BAM. Hard disk, mouse, keyboard, all completely and utterly
stop. Video continues to work, but my CPU's load goes absolutely INSANE.
(If it recovers, gkrellm generally says I've gotten a loadavg somewhere
between 3-20, depending on how long it was stuck.) This can last for
seconds (usually), minutes (once), or it can simply get worse and hang the
machine (many, many, many times).

When it recovers from this, I generally see a MASSIVE write to swap,
(I'm using gkrellm to monitor it) and the system continues on as if
nothing happened - until, of course, this happens again. A kernel
compile can cause it. An rm -R of a large directory can cause it. Loading
a large application can cause it.

On some kernels this is more noticeable than others - ac15 does it the
worst, although pre3 rivals it, and the symptoms are different on
ac17/18 - it'll simply freeze randomly and with no recovery instead of
sometimes freezing or sometimes slowing down to a crawl and recovering
or freezing. (Which is worse? You decide.)

Now, as before, I tested this with swap and without swap. With swap, I
get the hangs/freezes in all the affected kernels. Without swap, I
don't. Nada.

Now, the big question of the day folks: What changed between 2.4.6-pre2
and 2.4.6-pre3 that ALSO changed between 2.4.5-ac13 and 2.4.5-ac14 - and
now, which part of those patches touched the VM? Anyone? I don't see in
2.4.6-pre3 what changed that was part of the VM... So I am trying to
narrow it down a bit :)
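
The narrowing-down described above amounts to intersecting the sets of
files touched by the two patches and keeping only those under the VM
directory. A minimal Python sketch of that idea follows, under stated
assumptions: both patches are unified diffs, VM code lives under mm/,
and the function names are made up for illustration. It reads only the
"+++ " header lines, so an added line that itself begins with "++ "
would be miscounted.

```python
def files_touched(patch_text):
    """Collect the paths named on '+++ ' header lines of a unified diff."""
    touched = set()
    for line in patch_text.splitlines():
        if not line.startswith("+++ "):
            continue
        parts = line[4:].split()
        if not parts:
            continue
        path = parts[0]  # drop any trailing timestamp
        # Strip the leading directory (e.g. 'linux/' or 'b/') so paths from
        # differently rooted patches compare equal.
        touched.add(path.split("/", 1)[1] if "/" in path else path)
    return touched


def common_vm_files(patch_a, patch_b, prefix="mm/"):
    """Return sorted paths under `prefix` that both patches modify."""
    both = files_touched(patch_a) & files_touched(patch_b)
    return sorted(p for p in both if p.startswith(prefix))
```

Feeding it the text of the pre2-to-pre3 patch and the ac13-to-ac14
patch would list the VM files changed in both, which is where to start
looking.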

This bug is driving me slightly nuts, so I want it dead. Anyone got an
exterminator handy? =)

Refer to my previous post with this subject for my original description
of this problem. It's still there in ac18, though I've not tested 19
(Some have said it's not likely to have been fixed, and I've been
regress testing 2.4.6pre's today.)

Subject: Possible freezing bug located after ac13

Let me know if I can provide any additional information that will help
nail this bug to the wall. (I want to torture it. =)

Tim


