Re: Freezing bug in all kernels greater than 2.4.5-ac13 *AND* 2.4.6-pre2

2001-06-30 Thread tcm

I'm currently running 2.4.6-pre8 and happy as a clam. The
problem has been found and reverted; from my discussions with
Linus, it looks like the page_launder change introduced in pre3 and also
included in ac14 was causing the hangs/near freezes.

I'm not really much of a coder, so I can't say what was wrong
with it, only what the symptoms were and how to get it to screw up
whenever I wanted to test for it. (See previous messages for how to do
this) If Rik van Riel/Marcelo Tosatti/anyone wants to have me gather
information on what is going on just before/after the kernel dies I'll
do it - just tell me how to, and I'll push it along :)

Thanks a bunch Linus,
Tim
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Freezing bug in all kernels greater than 2.4.5-ac13 *AND* 2.4.6-pre2

2001-06-27 Thread tcm

I decided, for the hell of it, to test the pre series as I've been
nudged by many people to try it in favor of the ac kernel series that
I've been having problems with. Well, it turns out I have run into
exactly the same problem I had with the ac kernel series, which quite
frankly is surprising the hell out of me.

To make the kernel freeze/slow down to a crawl with affected kernels on
my machine I do this test:

Load X (This fills up my ram and causes me to swap a bit)
run an rxvt and su to root (probably unnecessary)
du /

Now, somewhere in this test I start swapping a little bit, nothing
big... then BAM. hard disk, mouse, keyboard, all completely and utterly
stop. Video continues to work, but my cpu's load goes absolutely INSANE.
(If it recovers, gkrellm generally says I've gotten a loadavg somewhere
between 3-20, depending on how long it was stuck) This can last for
seconds (usually) minutes (once) or it can simply get worse and hang the
machine (many, many many times)

When it recovers from this, I generally see a MASSIVE write to swap,
(I'm using gkrellm to monitor it) and the system continues on as if
nothing happened - until, of course, this happens again. A kernel
compile can cause it. a rm -R of a large directory can cause it. Loading
a large application can cause it.

On some kernels this is more noticeable than others - ac15 does it the
worst, although pre3 rivals it, and the symptoms are different on
ac17/18 - it'll simply freeze randomly and with no recovery instead of
sometimes freezing or sometimes slowing down to a crawl and recovering
or freezing. (Which is worse? You decide.)

Now, as before, I tested this with swap and without swap. With swap, I
get the hangs/freezes in all the affected kernels. Without swap, I
don't. Nada.
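
A minimal sketch of how the load and swap figures watched in gkrellm above
could also be written to a file, so the last lines before a freeze survive a
reboot. It assumes /proc/meminfo carries "SwapTotal:" and "SwapFree:" lines
in kB and that /proc/loadavg starts with the 1-minute load average; the
function names and the log file name are placeholders, not anything from the
original report.

```python
import time


def parse_meminfo(text):
    """Return {field: kB} for the 'Name:  12345 kB' lines of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <number> kB" rows; summary rows with several
        # numbers (printed by some older kernels) are skipped.
        if sep and len(parts) == 2 and parts[0].isdigit() and parts[1] == "kB":
            fields[name.strip()] = int(parts[0])
    return fields


def swap_used_kb(meminfo_text):
    """Swap in use, computed as SwapTotal - SwapFree."""
    info = parse_meminfo(meminfo_text)
    return info["SwapTotal"] - info["SwapFree"]


def load1(loadavg_text):
    """The first field of /proc/loadavg is the 1-minute load average."""
    return float(loadavg_text.split()[0])


def sample():
    """Take one (timestamp, load, swap used) reading from /proc."""
    with open("/proc/meminfo") as f:
        mem = f.read()
    with open("/proc/loadavg") as f:
        load = f.read()
    return time.strftime("%H:%M:%S"), load1(load), swap_used_kb(mem)


def log_forever(path="vmlog.txt", interval=1.0):
    """Append one sample per interval until interrupted (path is a placeholder)."""
    with open(path, "a") as out:
        while True:
            stamp, load, swap_kb = sample()
            out.write(f"{stamp} load1={load:.2f} swap_used_kB={swap_kb}\n")
            out.flush()  # hand each line to the OS before a possible freeze
            time.sleep(interval)
```

Calling log_forever() in a second terminal before starting the du test would
leave a timeline to compare between a run with swap and a run without it. A
hard hang may still lose the last few samples.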

Now, the big question of the day folks: What changed between 2.4.6-pre2
and 2.4.6-pre3 that ALSO changed between 2.4.5-ac13 and 2.4.5-ac14 - and
which parts of those patches touched the VM? Anyone? I don't see what
changed in 2.4.6-pre3 that was part of the VM... So I am trying to
narrow it down a bit :)
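
One way to narrow this down is to intersect the files touched by the two
patches. The following is only a sketch: it assumes both patches are unified
diffs whose "+++ " header lines name the changed file, and that VM code lives
under mm/. The function names are made up for illustration.

```python
def touched_files(patch_text):
    """Collect the paths named on '+++ ' header lines of a unified diff."""
    files = set()
    for line in patch_text.splitlines():
        if not line.startswith("+++ "):
            continue
        rest = line[4:].split()
        if not rest:
            continue
        path = rest[0]  # anything after the path (a timestamp) is ignored
        # Drop the leading directory ("linux/", "b/", ...) so that patches
        # made against differently named trees compare equal.
        files.add(path.split("/", 1)[-1])
    return files


def common_files(patch_a, patch_b, prefix="mm/"):
    """Files under `prefix` that both patches modify, sorted by name."""
    both = touched_files(patch_a) & touched_files(patch_b)
    return sorted(f for f in both if f.startswith(prefix))
```

Feeding it the text of the pre2-to-pre3 patch and the ac13-to-ac14 patch
would list the mm/ files common to both. An added source line that itself
begins with "++ " would be misread as a header, so the result is a starting
point for reading the diffs, not a verdict.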

This bug is driving me slightly nuts, so I want it dead. Anyone got an
exterminator handy? =)

Refer to my previous post with this subject for my original description
of this problem. It's still there in ac18, though I've not tested 19
(Some have said it's not likely to have been fixed, and I've been
regress testing 2.4.6pre's today.)

Subject: Possible freezing bug located after ac13

Let me know if I can provide any additional information that will help
nail this bug to the wall. (I want to torture it. =)

Tim



Swap error message I've seen in 2.4.5-ac17

2001-06-26 Thread tcm

Yep, me again. I've been playing around with ac17 on my old 486 machine
for a few days (it seems strange that the 486 works fine while the K6
doesn't, but I digress) and I noticed today something that made my hair
stand on end:

Jun 26 16:17:27 debian kernel: VM: Bad swap entry 0033da00
Jun 26 16:17:27 debian kernel: Unused swap offset entry in swap_count 0033da00
Jun 26 16:17:27 debian kernel: Unused swap offset entry in swap_count 0033da00
Jun 26 16:38:16 debian -- MARK --
Jun 26 16:53:13 debian kernel: PPP BSD Compression module registered
Jun 26 16:53:14 debian kernel: PPP Deflate Compression module registered
Jun 26 16:53:24 debian kernel: VM: Bad swap entry 0033da00

Now I have been told by Rik van Riel that this is a kernel bug. I
initially figured it was a bad disk; thanks to him I can breathe now...

Anyway, at the time the kernel did these messages I was just stopping
playing quake on my K6-III (486 handles packets to/from the modem) and
was reloading the compression modules, changing the mtu of my modem's 
interface to 1500 from 576, and starting fetchmail. And about one
minute later I decided to simply disconnect.

I can't seem to find a way to reproduce this problem all the time like I
can with the freezing bug, but I will reply to this thread if I see it
again and/or can repeatedly reproduce it.
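
Since the error cannot be triggered on demand, a small log scanner can at
least show whether it recurs and whether the same entry is involved each
time. This sketch assumes the message text stays exactly "VM: Bad swap entry"
followed by a hex value, as in the lines quoted above; the function name is
hypothetical.

```python
import re
from collections import Counter

# Matches the kernel message quoted above and captures the hex entry.
BAD_SWAP = re.compile(r"VM: Bad swap entry ([0-9a-fA-F]+)")


def bad_swap_entries(log_text):
    """Count how often each swap entry value is reported as bad."""
    return Counter(m.group(1) for m in BAD_SWAP.finditer(log_text))
```

Running it over the contents of the system log gives a per-entry tally. A
single repeated value, as in this report, would suggest one stale entry
rather than widespread corruption, though that is an inference and not
something the log alone proves.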



Possible freezing bug located after ac13

2001-06-23 Thread tcm

I've recently been going slightly nuts with the fact ac15, 16, and 17
all like deadlocking/slowing to a crawl for seconds/minutes on my K6-III
with 64MB of ram and a swap space of 128MB...

Recently I noticed something VERY odd, I'd been keeping an eye on
gkrellm while I was doing stupid things to produce the problem (a du
as root in X of / generally would always make it pop up) ... And swap
was doing I/O at the time *JUST* before when I'd either deadlock or slow
down to a crawl, and if it recovered, swap would do more I/O...

So. I tried unmounting all swap, and suddenly everything worked fine,
although I couldn't exactly do everything I wanted of course.

I regression tested this, ac 16,15 and even 14 do this. ac 13 does *not*
- IMHO I think the dead swap patches introduced into 14 may be related
to the problem.
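
The regression testing described above amounts to finding the first bad
release between a known-good and a known-bad one. A minimal sketch of doing
that by binary search, assuming the bug stays present in every release after
it first appears; `is_bad` stands in for booting a kernel and running the du
test by hand, and both names are illustrative.

```python
def first_bad(versions, is_bad):
    """Return the first bad entry of `versions` (ordered oldest to newest).

    Assumes versions[0] is known good, versions[-1] is known bad, and that
    every release after the first bad one is also bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]
```

With only four or five releases the saving over testing each one is small,
but the same loop applies to a longer series such as the individual changes
inside one patch.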

Just my two cents.

Tim



irqtune in linux 2.4 (second try)

2001-06-16 Thread tcm

The last time I tried this, I didn't get any replies - at all. I don't
know why, so I request that anyone who wants to send me replies, send
directly to [EMAIL PROTECTED] - as it has been ten days since I last tried
to ask for help, I hope this isn't considered spam.

Note that I'm still having the same problems, this time with ac10, 
and I've still been completely unable to get anyone to tell me what
I'm either doing wrong, or is wrong with my programs/configuration.

Tim


irqtune in kernel 2.4

2001-06-06 Thread tcm

Hi. I will try to keep this as informative as possible, just in
case I've missed something.

First off, I've already searched all the kernel archives I could,
google, I've looked around on IRC for help in four different networks,
I've emailed the debian hwtools package maintainer (who misdirected me
to use /dev/irq to do what I wanted to do), and the irqtune
author (I have not yet received a reply), and come up with absolutely
no way to get this to work.

Problem: Irqtune is not working under any 2.4 kernel I've tried as it
did in kernel 2.2.x, in fact it is not doing anything at all, despite
the fact it says it's working fine. The symptoms of it not working are
that whenever my hard disk writes, all serial and ethernet operations
stop. As you can imagine, this generates quite a bit of packet loss,
which is unacceptable, especially if I have to be writing/reading at the
same time the modem is going.

Description of my configuration:
Kernel: 2.4.5-ac7

irqtune: Debian unstable, using irqtune 0.6 from the hwtools package

hdparm: version v3.9 from the debian unstable hwtools package

Hardware: IBM PS/1 486 dx 50 with 16MB of ram, an add-on ISA card
which provides a 16550A uart for the external zoom 56K faxmodem, and an
NE2000 compatible ethernet card

irqtune -e 7 10 output:
irqtune: version is 0.6
irqtune: kernel version 0.0.0
probe: irqtune must be invoked via the full path -- OK
probe: /sbin in $PATH -- YES
probe: insmod found in $PATH (/sbin) -- OK
probe: insmod simple execution -- OK
probe: insmod has version (2.4.6) -- YES
probe: rmmod found in insmod directory -- OK
probe: insmod version supports command line options -- OK
probe: insmod version (2.4.6) compatible with kernel version (0.0.0) -- OK
probe: insmod version should be 2.1.34 (or better) -- OK
probe: insmod and kernel compatible with CONFIG_MODVERSIONS -- OK
probe: irqtune_mod loading will be tried -- OK
probe: kernel version irqtune built under (1.0.0) matches current system -- NO
probe: kernel IRQ handling is compatible -- OK
probe: kernel has module support (CONFIG_MODULES) -- OK
probe: kernel has symbols -- OK
probe: kernel is using versions (CONFIG_MODVERSIONS) -- NO
probe: kernel symbols are checksummed (CONFIG_MODVERSIONS) -- NO
probe: kernel has /proc/interrupts -- OK
irqtune: setting system IRQ priority to 7/10
irqtune: trying command -- insmod -x -o irqtune_mod -f /usr/lib/hwtools/irqtune_mod.o priority=7,10
Warning: kernel-module version mismatch
/usr/lib/hwtools/irqtune_mod.o was compiled for kernel version 1.0.0
while this kernel is version 2.4.5-ac7
irqtune: trying command -- rmmod irqtune_mod
tblread: SYNTAX 'ERR:  0'
I00/P01: 34152281  XT-PIC  timer
I01/P02:        2  XT-PIC  keyboard
I02/P03:        0  XT-PIC  cascade
I03/P11:        1  XT-PIC  serial
I07/P00:  5957335  XT-PIC  serial
I10/P03:   202481  XT-PIC  NE2000
I13/P06:        0  XT-PIC  fpu
I14/P07:  8104967  XT-PIC  ide0
I15/P08:        0  XT-PIC  ide1
irqtune: complete

As you can imagine I'm slightly perturbed. I think the syntax error is
OK, it's likely barfing on the new 'cpu 0' part of /proc/interrupts...
However the misdetection of the kernel version is making me worry, as
well as the fact that although it SAYS it has done something, in fact
the problems I have been having since I upgraded to 2.4.x continue.
(hard disk reads/writes cause all serial/eth0 operations to generate
massive PL)
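
To illustrate the guess about the syntax error: a reader of /proc/interrupts
that tolerates the "CPU0" header and non-numeric rows such as "ERR:" might
look like the sketch below. It assumes a single-CPU layout (IRQ number,
count, controller, device) and is not how irqtune itself parses the file;
the function name is made up.

```python
def parse_interrupts(text):
    """Return (irq, count, controller, device) for each numbered IRQ row."""
    rows = []
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        # The "CPU0" header has no colon, and rows such as "NMI:" or "ERR:"
        # have a non-numeric label; both are skipped instead of rejected.
        if not sep or not label.isdigit():
            continue
        fields = rest.split()
        if len(fields) < 2:
            continue
        # Single-CPU assumption: one count column before the controller.
        rows.append((int(label), int(fields[0]), fields[1], " ".join(fields[2:])))
    return rows
```

A parser written for the 2.2 layout that expects every line to begin with an
IRQ number would fail on exactly the rows this one skips, which fits the
"SYNTAX 'ERR:  0'" message, although that remains a guess.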

hdparm /dev/hda output:

/dev/hda:
 multcount    =  0 (off)
 I/O support  =  0 (default 16-bit)
 unmaskirq    =  1 (on)
 using_dma    =  0 (off)
 keepsettings =  1 (on)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 4956/16/63, sectors = 4996476, start = 0

As you can see, I've enabled unmaskirq, as it has been reported to help
in my situation... It does in fact, although I wish I had a way to get
irqtune working again. Note that DMA is NOT available on this old
system, and I will actually go back to a 2.2.x kernel rather than spend
money on a dma compatible controller if irqtune or another solution
cannot be found.

I will accept any ideas anyone has to offer. If irqtune is obsolete,
please say so. If there is an in kernel solution PLEASE say so.
(/dev/irq is used for smp systems. This is a single cpu system) If there
is NO present solution, please tell me that too. :)

I am quite willing to downgrade my system to kernel 2.2 if this can't be
fixed somehow, kernel 2.2.x works just fine on my old 486, although
kernel 2.4.x tends to simply do some things better. (Its VM, although
some report it to do strange things, manages memory better in many cases
on my 486 - less disk thrashing when it swaps things, runs obese perl
scripts etc.) The fact I'd be missing out on reiserfs in the kernel
makes me sad though, I really do like that filesystem.

Anyway, please reply to the list with ideas, I'll see them.

Timothy C.

2.2.19pre3's VM is great :)

2001-01-02 Thread tcm

I originally sent this message to Andrea Arcangeli, but he felt I
should also send it here. 2.2.19pre3 is great!

Note that I'm not subscribed to the list - please reply to me and the
list if you would? Thanks :)

You may remember me from an earlier e-mail asking where to find
the VM patch for 2.2.18 - well, I took a gander at 2.2.19pre3's fixes
due to your nudging, and boy am I happy with it. The VM - or at least
whatever controls the swapping in and out of processes - is REALLY
improved. I used to abuse the heck out of my swap whenever I'd run
mozilla, but now the swap barely takes up anything even when I'm
compiling something and using mozilla. (This is a 54MB memory available
machine with 100MB of swap) Boot speed has also increased somewhat as
well, although I've not really taken the time to figure out the
improvement in times... Just so you know, I've noticed the same style of
improvements - some INCREDIBLY noticeable - on my 486 dx 50 with 16MB of
ram and 100MB of swap. It used to page out just about everything it
could to swap even when it had plenty of real memory to play with,
mainly because it seemed stupid and would cache around 4-6MB of stuff in
the ram. Now it does caching much less and actually USES the ram for
programs. (still uses 2MB or less of mem for caching, and this changes
quite a bit depending on what it's up to, but) Whee! That certainly
makes a tremendous difference.

Felt you could use a 'this works great' story, I know if I was you I'd
want one once in a while since you probably get more bug reports than
anything. :)

Tim
