Re: [Cooker] Kernel problem in devfs

2002-04-17 Thread Alan

On Wednesday 17 April 2002 03:57 am, Borsenkow Andrej wrote:

Here is the oops from the current crash.

It is the same as the old crash.

fs/devfs/base.c has the offending call.

Also attached is the dmesg from the boot. You will see a number of odd
messages associated with the zip drive.  (The zip drive works. It shows up at
an odd spot that devfs seems unable to deal with: usually /dev/sda4 instead
of /dev/sda0, but it tries to find the partition table on /dev/sda0. Or
actually /dev/scsi/host1/bus0/target0/lun0, which is about the same
thing.)

Hopefully that helps.

ksymoops 2.4.3 on i686 2.4.18-aro.  Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.4.18-aro/ (default)
 -m /boot/System.map-2.4.18-aro (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Apr 17 04:06:21 kludge kernel: Unable to handle kernel paging request at virtual 
address 204f2f8d
Apr 17 04:06:21 kludge kernel: c0174293
Apr 17 04:06:21 kludge kernel: *pde = 
Apr 17 04:06:21 kludge kernel: Oops: 
Apr 17 04:06:21 kludge kernel: CPU:0
Apr 17 04:06:21 kludge kernel: EIP:0010:[scan_dir_for_removable+19/64]Not 
tainted
Apr 17 04:06:21 kludge kernel: EIP:0010:[]Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Apr 17 04:06:21 kludge kernel: EFLAGS: 00010202
Apr 17 04:06:21 kludge kernel: eax: c0aa8460   ebx: 204f2f49   ecx:    edx: 
c0aa8460
Apr 17 04:06:21 kludge kernel: esi: c42bbba0   edi: c3d3a7a0   ebp: c8fdca60   esp: 
cb631f28
Apr 17 04:06:21 kludge kernel: ds: 0018   es: 0018   ss: 0018
Apr 17 04:06:21 kludge kernel: Process msec_find (pid: 8079, stackpage=cb631000)
Apr 17 04:06:21 kludge kernel: Stack: c42bbba0 c0174726 c3d3a7a0 c027d100  
c42bbba0 c42bbc20 c42bbc0c 
Apr 17 04:06:21 kludge kernel:c8fdca60 c0143770 c8fdca60 cb631fa0 c0143c70 
c8fdca60 fff7 000d 
Apr 17 04:06:21 kludge kernel:bfffecf8 c0143e1f c8fdca60 c0143c70 cb631fa0 
c4bbf540 c4bbf2c0 c8fdca60 
Apr 17 04:06:21 kludge kernel: Call Trace: [devfs_readdir+86/448] [vfs_readdir+96/144] 
[filldir64+0/352] [sys_getdents64+79/185] [filldir64+0/352] 
Apr 17 04:06:21 kludge kernel: Call Trace: [] [] [] 
[] [] 
Apr 17 04:06:21 kludge kernel:[] [] 
Apr 17 04:06:21 kludge kernel: Code: 66 8b 43 44 25 00 f0 00 00 66 3d 00 60 75 0d f6 
43 10 04 74 

>>EIP; c0174292<=
Trace; c0174726 
Trace; c0143770 
Trace; c0143c70 
Trace; c0143e1e 
Trace; c0143c70 
Trace; c0135b6c 
Trace; c0108972 
Code;  c0174292 
 <_EIP>:
Code;  c0174292<=
   0:   66 8b 43 44       mov    0x44(%ebx),%ax   <=
Code;  c0174296
   4:   25 00 f0 00 00    and    $0xf000,%eax
Code;  c017429a
   9:   66 3d 00 60       cmp    $0x6000,%ax
Code;  c017429e
   d:   75 0d             jne    1c <_EIP+0x1c> c01742ae

Code;  c01742a0
   f:   f6 43 10 04       testb  $0x4,0x10(%ebx)
Code;  c01742a4
  13:   74 00             je     15 <_EIP+0x15> c01742a6


Apr 17 17:47:34 kludge kernel: 8139too Fast Ethernet driver 0.9.24
Apr 17 17:47:40 kludge kernel: ac97_codec: AC97 Audio codec, id: 0x8384:0x7609 
(SigmaTel STAC9721/23)

1 warning issued.  Results may not be reliable.
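
A note for anyone reading the oops: the faulting instruction is the load
at offset 0x44 from %ebx, and %ebx (204f2f49) does not look like a kernel
pointer. With the default i386 memory split, kernel addresses start at
c0000000, as every other pointer in the register dump does. The
and $0xf000 / cmp $0x6000 pair that follows uses the same constants as the
usual S_IFMT / S_IFBLK "is this a block device" test. The sketch below
redoes the address arithmetic and prints the bytes of %ebx as text. The
helper names fault_addr and le_ascii are invented for illustration, and
reading the result as "a pointer overwritten by string data" is only a
guess; nothing in the thread confirms it.

```shell
#!/bin/sh
# Values copied from the oops: the ebx register and the displacement in
# the faulting instruction "mov 0x44(%ebx),%ax".
ebx=204f2f49
disp=44

# Faulting address = base register + displacement (both given in hex).
fault_addr() {
    printf '%08x\n' $(( 0x$1 + 0x$2 ))
}

# Print the four bytes of a 32-bit value as ASCII, lowest byte first,
# which is the order they occupy in memory on little-endian i386.
le_ascii() {
    v=$(( 0x$1 ))
    for s in 0 8 16 24; do
        byte=$(( (v >> s) & 0xff ))
        # Build an octal escape such as \111 and let printf emit the byte.
        printf "\\$(printf '%03o' "$byte")"
    done
    printf '\n'
}

fault_addr "$ebx" "$disp"   # prints 204f2f8d, the address in the oops
le_ascii "$ebx"             # prints "I/O " (with a trailing space)
```

The first result matches the "virtual address 204f2f8d" line, so the
disassembly and the register dump agree with each other. The second
suggests the entry pointer may have been overwritten with text, which
would fit a stale or corrupted devfs entry, but that remains an assumption.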


Linux version 2.4.18-aro ([EMAIL PROTECTED]) (gcc version 2.96 2731 
(Mandrake Linux 8.2 2.96-0.76mdk)) #2 Tue Apr 16 18:06:22 PDT 2002
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 0fff (usable)
 BIOS-e820: 0fff - 0fff3000 (ACPI NVS)
 BIOS-e820: 0fff3000 - 1000 (ACPI data)
 BIOS-e820:  - 0001 (reserved)
On node 0 totalpages: 65520
zone(0): 4096 pages.
zone(1): 61424 pages.
zone(2): 0 pages.
Kernel command line: BOOT_IMAGE=2418-aro ro root=305 devfs=mount hdc=ide-scsi
ide_setup: hdc=ide-scsi
Local APIC disabled by BIOS -- reenabling.
Found and enabled local APIC!
Initializing CPU#0
Detected 645.669 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1287.78 BogoMIPS
Memory: 256688k/262080k available (1244k kernel code, 5004k reserved, 353k data, 272k 
init, 0k highmem)
Dentry-cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
Buffer-cache hash table entries: 16384 (order: 4, 65536 bytes)
Page-cache hash table entries: 65536 (order:

Re: [Cooker] Kernel problem in devfs

2002-04-17 Thread Alan

On Wednesday 17 April 2002 03:57 am, Borsenkow Andrej wrote:
> > > I wonder should it not be in FAQ somewhere
> >
> > The steps below are pretty much what I did before.
>
> "pretty much" is not enough. When I compile kernel it boots.

But I know *why* it did not boot.  mkinitrd wants /dev/loop which is not
getting created for some reason. (I have not used /dev/loop much, so I have
not looked into it.)  Ext3 was the problem. Fixed.


>  I think I see what may
>
> > have made the difference, but I can only be sure if I test it.
> >
> > > cd /usr/src/linux
> > > make mrproper
> > > edit Makefile - change EXTRAVERSION to something like 11mdk1alan
> > > cp /boot/config  .config
> >
> > I used the .config in the kernel-source rpm.  The other had been
> > overwritten.
>
> It is NOT THE SAME! You have to get config from your running kernel and
> it is located in kernel-2.4.18.Xmdk-{smp,enterprise,...}.

It does not matter.  I get the same problem with the patches as I do with the 
old kernel.

> rpm2cpio kernel-XXX | cpio -imdu boot/config-XXX
>
> > I will try it with the one left by /boot/config-2.4.18-6mdk and see if
>
> that
>
> > is any better.
>
> DO NOT DO IT! Do NOT mix configs from different kernel releases.

I found that out.  (Make oldconfig started asking for all sorts of new stuff.
I gave up after the third prompt.)

> > > make oldconfig
> > > make dep
> >
> > You forgot "make clean".
>
> I did not. It is not needed.

Not any more. It used to be.  (I have been using Linux since 0.99.  I still
sometimes do things "the old-fashioned way".)

> > > make
> > > make modules
> > > make modules_install
> > > make install
> >
> > I did "make install" then "make modules_install". I wonder if that
>
> makes a
>
> > difference for mkinitrd?
>
> Yes. You have to have modules installed for mkinitrd to work.

You also have to have /dev/loop.

> > It does not seem to like the ext3 boot partition.
>
> It likes it if you do it correctly.

Actually, the reason it was not building had to do with /dev/loop.  Not having
modules installed gives a different error. To get around the problem, I just
rebuilt with ext3 and jfs as part of the kernel instead of as modules.  Worked
fine.
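
The two prerequisites mentioned above (modules installed for the target
kernel version, and a loop device present) can be checked before running
mkinitrd. This is a minimal sketch, not part of any Mandrake tool. The
function name check_initrd_prereqs and its optional ROOT argument are
hypothetical, and the device paths are an assumption: /dev/loop/0 is the
devfs-style name, /dev/loop0 the traditional one.

```shell
#!/bin/sh
# Hypothetical pre-flight check before running mkinitrd.
# Usage: check_initrd_prereqs KERNEL_VERSION [ROOT]
# ROOT defaults to the real root; it exists only so the check can be
# tried against a scratch directory.
check_initrd_prereqs() {
    version=$1
    root=${2:-}
    status=0

    # mkinitrd copies modules from here, so "make modules_install"
    # has to have run for this version already.
    if [ ! -d "$root/lib/modules/$version" ]; then
        echo "missing: $root/lib/modules/$version (run make modules_install)"
        status=1
    fi

    # Accept either the devfs-style or the traditional loop device name.
    if [ ! -e "$root/dev/loop/0" ] && [ ! -e "$root/dev/loop0" ]; then
        echo "missing: loop device under $root/dev"
        status=1
    fi

    return $status
}
```

For example, "check_initrd_prereqs 2.4.18-aro" before "make install" would
report which of the two is missing rather than leaving mkinitrd to fail.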

Either way, the problem still exists.  I need to see if I captured a proper 
kernel oops this time.  (I have not gotten to it yet.)  I will need to move 
it to this machine before sending out in any case.

The problem only seems to occur after mounting and unmounting a zip disc under 
ide-floppy. (msec had run successfully before that point.)

I should have more data once I have checked the oops and a few other things.







Re: [Cooker] Kernel problem in devfs

2002-04-17 Thread Michal 'hramrach' Suchanek

On Tue, Apr 16, 2002 at 07:43:23PM -0700, Alan wrote:

> Shouldn't ext3 be compiled into the main kernel and not a module?
> 
And Reiser, XFS, JFS as well ;-) I wonder what amount of memory this
would waste. There are too many filesystems around for this IMO.

-- 
Michal Suchanek
[EMAIL PROTECTED]




Re: [Cooker] Kernel problem in devfs

2002-04-16 Thread Alan

On Tuesday 16 April 2002 01:51 am, Guillaume Cottenceau wrote:
> Alan <[EMAIL PROTECTED]> writes:
> > The kernel halts with the following message:
> >
> > EXT2-fs: ide0(3,5): couldn't mount because of unsupported optional
> > features (4).
> > Kernel panic: VFS: Unable to mount root fs on 03:05
>
> Can you copy here your /etc/fstab?

I already know what causes the problem.  Ext3 is supported as a module.

If initrd cannot or is not created or loaded, then the ext3 code is not 
available. The ext2 code does not react well to the ext3 partition.

Shouldn't ext3 be compiled into the main kernel and not a module?
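
The "(4)" in the EXT2-fs message can be read as a bit mask. The sketch
below assumes that the number is the hexadecimal mask of incompatible
feature flags the ext2 driver refused, and uses the flag values from the
ext2/ext3 headers (0x1 compression, 0x2 filetype, 0x4 needs_recovery,
0x8 journal_dev). The function name decode_incompat is made up for
illustration.

```shell
#!/bin/sh
# Decode an ext2/ext3 "incompatible features" mask given in hex.
# Flag names follow the ones dumpe2fs prints.
decode_incompat() {
    mask=$(( 0x$1 ))
    if [ $(( mask & 0x1 )) -ne 0 ]; then echo "compression"; fi
    if [ $(( mask & 0x2 )) -ne 0 ]; then echo "filetype"; fi
    if [ $(( mask & 0x4 )) -ne 0 ]; then echo "needs_recovery"; fi
    if [ $(( mask & 0x8 )) -ne 0 ]; then echo "journal_dev"; fi
    return 0
}

decode_incompat 4   # prints needs_recovery
```

Under that reading, the root filesystem still had a journal to replay, which
only the ext3 code can do. That would fit the explanation above: without a
working initrd the ext3 module is never loaded, and ext2 refuses the mount.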





Re: [Cooker] Kernel problem in devfs

2002-04-16 Thread OS

I'll go along with that.  I was about to submit a report about this but then 
saw this thread. I have noticed the system will degrade in the same way if my 
parport Zip drive is attached after the system has booted. When installing 8.2
/ Cooker I had the Zip drive attached. Having never done this before I wanted 
to see what difference it actually makes. Any modprobing and super mounting 
therefore was put in place by the stock install procedure and it does use 
Supermount.

Owen

On Friday 12 Apr 2002 6:59 pm, you wrote:
> On Friday 12 April 2002 10:50 am, Borsenkow Andrej wrote:
> > On Fri, 12.04.2002, at 21:30, Alan wrote:
> > > This is pretty nasty.
> > >
> > > This is using the stock kernel from Mandrake 8.2.  I expect that
> > > similar problems exist in the cooker version.
> > >
> > > What happens is that after some indeterminate time period, the system
> > > does not allow you to start new processes.  Already existing processes
> > > run, but new processes will not start and you cannot restart new
> > > processes.  Shutting down cannot happen because you can't start the
> > > shutdown script!
> > >
> > > After looking through the logs, I think I have found the cause of the
> > > problem.  It appears that devfs is dying. It kills enough of the kernel
> > > to not work correctly, but not enough of the kernel to choke all
> > > together. (Enough to be frustrating.) It looks like the lethal
> > > combination is a remountable ide-scsi device, but that is only a guess
> > > at this point.
> >
> > What device? Is it supermounted read-write?
>
> I believe it is the zip drive that it is choking in.  The oops happens in
> scan_dir_for_removable().  (The only removable devices are two scsi cd-rom
> drives and an ide Zip drive. The zip drive is the only one that gets
> classed as "removable" to my knowledge.)
>
> I assume it is supermounted as read/write. It is a fresh install of 8.2
> with no real changes to devices.  (I have been too busy coding to modify
> the system too much.)




Re: [Cooker] Kernel problem in devfs

2002-04-16 Thread Alan

On Tuesday 16 April 2002 01:41 am, Borsenkow Andrej wrote:
> > > Huh? It is what kernel SRPM is for :-)
> >
> > Yick. Forgot about that...  SRPMS for a source RPM? Who woulda thunk
>
> it...
>
> > Well, I need to check the .config file on this beast.
>
> I wonder should it not be in FAQ somewhere

The steps below are pretty much what I did before.  I think I see what may 
have made the difference, but I can only be sure if I test it.

> cd /usr/src/linux
> make mrproper
> edit Makefile - change EXTRAVERSION to something like 11mdk1alan
> cp /boot/config  .config

I used the .config in the kernel-source rpm.  The other had been overwritten. 
I will try it with the one left by /boot/config-2.4.18-6mdk and see if that 
is any better.

> make oldconfig
> make dep

You forgot "make clean".

> make
> make modules
> make modules_install
> make install

I did "make install" then "make modules_install". I wonder if that makes a 
difference for mkinitrd?

> Leaves you with copy of currently booted kernel installed _and_ mkinitrd
> created _and_ /etc/lilo.conf updated _and_ not interfering with other
> kernels. Now you can hack it around as much as you want.

I already had a working kernel to boot from.  (The original one was not 
"working" on this machine.)

Well, let's see how this works out.

It does not seem to like the ext3 boot partition.





Re: [Cooker] Kernel problem in devfs

2002-04-16 Thread Guillaume Cottenceau

Alan <[EMAIL PROTECTED]> writes:

> The kernel halts with the following message:
> 
> EXT2-fs: ide0(3,5): couldn't mount because of unsupported optional features
> (4).
> Kernel panic: VFS: Unable to mount root fs on 03:05

Can you copy here your /etc/fstab?

-- 
Guillaume Cottenceau - http://www.frozen-bubble.org/




RE: [Cooker] Kernel problem in devfs

2002-04-16 Thread Borsenkow Andrej


> > Huh? It is what kernel SRPM is for :-)
> 
> Yick. Forgot about that...  SRPMS for a source RPM? Who woulda thunk
it...
> 
> Well, I need to check the .config file on this beast. 

I wonder should it not be in FAQ somewhere

cd /usr/src/linux
make mrproper
edit Makefile - change EXTRAVERSION to something like 11mdk1alan
cp /boot/config  .config
make oldconfig
make dep
make
make modules
make modules_install
make install

Leaves you with copy of currently booted kernel installed _and_ mkinitrd
created _and_ /etc/lilo.conf updated _and_ not interfering with other
kernels. Now you can hack it around as much as you want.
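
The "edit Makefile" step can be scripted. This is a rough sketch, assuming
the top-level kernel Makefile has a single line starting with
"EXTRAVERSION =" and that the new value contains no "/" or "&" characters
(sed would misread them). The function name set_extraversion is invented
here. The leading dash in the example value is also an assumption: the
kernel release string is built by appending EXTRAVERSION directly to the
version number, so "-11mdk1alan" gives 2.4.18-11mdk1alan.

```shell
#!/bin/sh
# Rewrite the EXTRAVERSION line of a kernel Makefile.
# Usage: set_extraversion PATH_TO_MAKEFILE NEW_VALUE
set_extraversion() {
    makefile=$1
    value=$2
    # Write to a temporary file first so a failed sed leaves the
    # original Makefile untouched.
    sed "s/^EXTRAVERSION[[:space:]]*=.*/EXTRAVERSION = $value/" \
        "$makefile" > "$makefile.new" && mv "$makefile.new" "$makefile"
}
```

Usage would be "set_extraversion /usr/src/linux/Makefile -11mdk1alan",
run after "make mrproper" and before copying the config.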

-andrej




Re: [Cooker] Kernel problem in devfs

2002-04-16 Thread Alan

On Tuesday 16 April 2002 12:30 am, Borsenkow Andrej wrote:
> > > > SUGGESTION:  Could the patches to the kernel source please be
>
> included
>
> > > in
> > >
> > > > the
> > > > kernel-source rpm?
> > >
> > > Not unless you test them and confirm that they work. :-)
> >
> > I am speaking of patches applied to the kernel in general.  it would
>
> be nice
>
> > to have a copy of all the patches applied to the kernel along with the
> > kernel
> > source.
>
> Huh? It is what kernel SRPM is for :-)

Yick. Forgot about that...  SRPMS for a source RPM? Who woulda thunk it...

Well, I need to check the .config file on this beast. (In the morning, as it 
is 1:21am local time and I need sleep.)

The kernel halts with the following message:

EXT2-fs: ide0(3,5): couldn't mount because of unsupported optional features
(4).
Kernel panic: VFS: Unable to mount root fs on 03:05

It may be due to initrd not getting built right or something else. I will
look into it when I have sleep and caffeine. (In that order.)





RE: [Cooker] Kernel problem in devfs

2002-04-16 Thread Borsenkow Andrej


> > > SUGGESTION:  Could the patches to the kernel source please be
included
> >
> > in
> >
> > > the
> > > kernel-source rpm?
> >
> > Not unless you test them and confirm that they work. :-)
> 
> I am speaking of patches applied to the kernel in general.  it would
be nice
> to have a copy of all the patches applied to the kernel along with the
> kernel
> source. 

Huh? It is what kernel SRPM is for :-)


-andrej




Re: [Cooker] Kernel problem in devfs

2002-04-15 Thread Alan

On Monday 15 April 2002 11:06 pm, Borsenkow Andrej wrote:

> Yes, there are more changes than I expected actually. Which means it may
> not even work at all.
>
> If you remove ungrok_partitions and use --ignore-whitespace the -2, -3
> and -4 patches apply. Still, it may be left in some messy state.
>
> Anyway you may give it a try.

I am finishing up the compile and will do testing tonight.  The 
ungrok_partition is still there. (Because I started the compile before I read 
this...)

> > SUGGESTION:  Could the patches to the kernel source please be included
>
> in
>
> > the
> > kernel-source rpm?
>
> Not unless you test them and confirm that they work. :-)

I am speaking of patches applied to the kernel in general.  It would be nice
to have a copy of all the patches applied to the kernel along with the kernel 
source. It would make it much easier to determine which patch is responsible 
for what modifications to the kernel source.





RE: [Cooker] Kernel problem in devfs

2002-04-15 Thread Borsenkow Andrej

> 
> I really wish that these patches were merged.  They overlap in this
one
> section repeatedly.  (ide-floppy-2.diff makes changes that get deleted
in
> ide-floppy-3.diff, for example.)
> 
> > All failures are in modifying drivers/ide/ide-floppy.c
> >
> > ide-floppy-2.diff failed hunk #14 at line 1740.
> 
> This is due to a function called "ungrok_partitions" that exists in
the
> code.
> I am not certain what patch added this function.  I have left it in
place as
> much as I can.
> 

Yes, there are more changes than I expected actually. Which means it may
not even work at all.

If you remove ungrok_partitions and use --ignore-whitespace the -2, -3
and -4 patches apply. Still, it may be left in some messy state.

Anyway you may give it a try.



> 
> SUGGESTION:  Could the patches to the kernel source please be included
in
> the
> kernel-source rpm? 

Not unless you test them and confirm that they work. :-)


-andrej





Re: [Cooker] Kernel problem in devfs

2002-04-15 Thread Alan

On Monday 15 April 2002 04:16 pm, Alan wrote:
> On Sunday 14 April 2002 09:09 am, Borsenkow Andrej wrote:
> > On Fri, 12.04.2002, at 21:59, Alan wrote:
> > > I believe it is the zip drive that it is choking in.  The oops happens
> > > in scan_dir_for_removable().  (The only removable devices are two scsi
> > > cd-rom drives and an ide Zip drive. The zip drive is the only one that
> > > gets classed as "removable" to my knowledge.)
> > >
> > > I assume it is supermounted as read/write. It is a fresh install of 8.2
> > > with no real changes to devices.  (I have been too busy coding to
> > > modify the system too much.)
> >
> > If you can reliably reproduce the oops, could you please test the
> > following:
> >
> > - get current kernel-source
> >
> > - backout ide-floppy patch. It is available either in CVS,
> > 19-pre4-ide-floppy-devfs-fix.patch?rev=1.1&content-type=text/x-cvsweb-markup>
> > or in kernel SRPM of course.
> >
> > - apply ide-floppy patches from
> > 
> >
> > there may be conflicts of course, they are against vanilla kernel.
> > Still  I hope it should apply.
> >
> > - test with these patches if it works.
> >
> > Please, test _standard_ Mandrake config without any modifications. We
> > are chasing bug in mandrake kernel :-)
>
> The ide-probe.diff goes in correctly, as does the ide-floppy-1.diff.  The
> other ones fail on one part around the same chunk of code.  I am going to
> look at the rejects and see if I can figure out why they failed.

I really wish that these patches were merged.  They overlap in this one 
section repeatedly.  (ide-floppy-2.diff makes changes that get deleted in 
ide-floppy-3.diff, for example.)

> All failures are in modifying drivers/ide/ide-floppy.c
>
> ide-floppy-2.diff failed hunk #14 at line 1740.

This is due to a function called "ungrok_partitions" that exists in the code.  
I am not certain what patch added this function.  I have left it in place as 
much as I can.

> ide-floppy-3.diff failed hunk #2 at line 1755.

This is the same chunk of code that failed in the previous patch.  It is a 
cascade from the previous failed chunk.

> ide-floppy-4.diff failed hunk #7 at line 1778.

Another cascade from previous code.

> The test was against linux-2.4.18-11mdk.

I will post a completed patch if this works. (I am not certain if it will 
compile.  We will see.)

SUGGESTION:  Could the patches to the kernel source please be included in the 
kernel-source rpm?  It would make fixing these problems so much easier.

Also, I need to know which default config file is used in the build.  I am 
going to guess at it, but I would like to know for certain.

Thanks!






Re: [Cooker] Kernel problem in devfs

2002-04-15 Thread Alan

On Sunday 14 April 2002 09:09 am, Borsenkow Andrej wrote:
> On Fri, 12.04.2002, at 21:59, Alan wrote:
> > I believe it is the zip drive that it is choking in.  The oops happens in
> > scan_dir_for_removable().  (The only removable devices are two scsi
> > cd-rom drives and an ide Zip drive. The zip drive is the only one that
> > gets classed as "removable" to my knowledge.)
> >
> > I assume it is supermounted as read/write. It is a fresh install of 8.2
> > with no real changes to devices.  (I have been too busy coding to modify
> > the system too much.)
>
> If you can reliably reproduce the oops, could you please test the
> following:
>
> - get current kernel-source
>
> - backout ide-floppy patch. It is available either in CVS,
> -pre4-ide-floppy-devfs-fix.patch?rev=1.1&content-type=text/x-cvsweb-markup>
> or in kernel SRPM of course.
>
> - apply ide-floppy patches from
> 
>
> there may be conflicts of course, they are against vanilla kernel.
> Still  I hope it should apply.
>
> - test with these patches if it works.
>
> Please, test _standard_ Mandrake config without any modifications. We
> are chasing bug in mandrake kernel :-)

The ide-probe.diff goes in correctly, as does the ide-floppy-1.diff.  The 
other ones fail on one part around the same chunk of code.  I am going to 
look at the rejects and see if I can figure out why they failed.

All failures are in modifying drivers/ide/ide-floppy.c

ide-floppy-2.diff failed hunk #14 at line 1740.
ide-floppy-3.diff failed hunk #2 at line 1755.
ide-floppy-4.diff failed hunk #7 at line 1778.

The test was against linux-2.4.18-11mdk.





Re: [Cooker] Kernel problem in devfs

2002-04-14 Thread Borsenkow Andrej

On Fri, 12.04.2002, at 21:59, Alan wrote:
> 
> I believe it is the zip drive that it is choking in.  The oops happens in 
> scan_dir_for_removable().  (The only removable devices are two scsi cd-rom 
> drives and an ide Zip drive. The zip drive is the only one that gets classed 
> as "removable" to my knowledge.)
> 
> I assume it is supermounted as read/write. It is a fresh install of 8.2 with 
> no real changes to devices.  (I have been too busy coding to modify the 
> system too much.)
> 

If you can reliably reproduce the oops, could you please test the
following:

- get current kernel-source

- backout ide-floppy patch. It is available either in CVS, 

or in kernel SRPM of course.

- apply ide-floppy patches from 


there may be conflicts of course, they are against vanilla kernel.
Still  I hope it should apply.

- test with these patches if it works.

Please, test _standard_ Mandrake config without any modifications. We
are chasing bug in mandrake kernel :-)

-andrej





Re: [Cooker] Kernel problem in devfs

2002-04-13 Thread Joseph Davidson

I am seeing this same exact problem.  I have an IDE ZIP that is
supermounted, and about once a day (when msec_find runs), I get an Oops
in scan_dir_for_removable().  I placed some code in the function that
prints out dir->name.  It appears the Oops occurs when attempting to
access /dev/host1/bus0/target0/lun0,  which is my zip drive.  

I have since patched a stock kernel (2.4.18) with supermount,  and this
has worked flawlessly for over a week.  


--
Joe

On Fri, 2002-04-12 at 13:59, Alan wrote:
> On Friday 12 April 2002 10:50 am, Borsenkow Andrej wrote:
> > On Fri, 12.04.2002, at 21:30, Alan wrote:
> > > This is pretty nasty.
> > >
> > > This is using the stock kernel from Mandrake 8.2.  I expect that similar
> > > problems exist in the cooker version.
> > >
> > > What happens is that after some indeterminate time period, the system
> > > does not allow you to start new processes.  Already existing processes
> > > run, but new processes will not start and you cannot restart new
> > > processes.  Shutting down cannot happen because you can't start the
> > > shutdown script!
> > >
> > > After looking through the logs, I think I have found the cause of the
> > > problem.  It appears that devfs is dying. It kills enough of the kernel
> > > to not work correctly, but not enough of the kernel to choke all
> > > together. (Enough to be frustrating.) It looks like the lethal
> > > combination is a remountable ide-scsi device, but that is only a guess
> > > at this point.
> >
> > What device? Is it supermounted read-write?
> 
> I believe it is the zip drive that it is choking in.  The oops happens in 
> scan_dir_for_removable().  (The only removable devices are two scsi cd-rom 
> drives and an ide Zip drive. The zip drive is the only one that gets classed 
> as "removable" to my knowledge.)
> 
> I assume it is supermounted as read/write. It is a fresh install of 8.2 with 
> no real changes to devices.  (I have been too busy coding to modify the 
> system too much.)
> 
> 
> 
> 






Re: [Cooker] Kernel problem in devfs

2002-04-12 Thread Alan

On Friday 12 April 2002 10:50 am, Borsenkow Andrej wrote:
> On Fri, 12.04.2002, at 21:30, Alan wrote:
> > This is pretty nasty.
> >
> > This is using the stock kernel from Mandrake 8.2.  I expect that similar
> > problems exist in the cooker version.
> >
> > What happens is that after some indeterminate time period, the system
> > does not allow you to start new processes.  Already existing processes
> > run, but new processes will not start and you cannot restart new
> > processes.  Shutting down cannot happen because you can't start the
> > shutdown script!
> >
> > After looking through the logs, I think I have found the cause of the
> > problem.  It appears that devfs is dying. It kills enough of the kernel
> > to not work correctly, but not enough of the kernel to choke all
> > together. (Enough to be frustrating.) It looks like the lethal
> > combination is a remountable ide-scsi device, but that is only a guess
> > at this point.
>
> What device? Is it supermounted read-write?

I did not add in the supermount patches on the new kernel. (The supermount 
code does not come with the stock kernel. And I forgot about them...)

The ksymoops I attached gives you an idea where it died.





Re: [Cooker] Kernel problem in devfs

2002-04-12 Thread Alan

On Friday 12 April 2002 10:50 am, Borsenkow Andrej wrote:
> On Fri, 12.04.2002, at 21:30, Alan wrote:
> > This is pretty nasty.
> >
> > This is using the stock kernel from Mandrake 8.2.  I expect that similar
> > problems exist in the cooker version.
> >
> > What happens is that after some indeterminate time period, the system
> > does not allow you to start new processes.  Already existing processes
> > run, but new processes will not start and you cannot restart new
> > processes.  Shutting down cannot happen because you can't start the
> > shutdown script!
> >
> > After looking through the logs, I think I have found the cause of the
> > problem.  It appears that devfs is dying. It kills enough of the kernel
> > to not work correctly, but not enough of the kernel to choke all
> > together. (Enough to be frustrating.) It looks like the lethal
> > combination is a remountable ide-scsi device, but that is only a guess
> > at this point.
>
> What device? Is it supermounted read-write?

I believe it is the zip drive that it is choking in.  The oops happens in 
scan_dir_for_removable().  (The only removable devices are two scsi cd-rom 
drives and an ide Zip drive. The zip drive is the only one that gets classed 
as "removable" to my knowledge.)

I assume it is supermounted as read/write. It is a fresh install of 8.2 with 
no real changes to devices.  (I have been too busy coding to modify the 
system too much.)







Re: [Cooker] Kernel problem in devfs

2002-04-12 Thread Borsenkow Andrej

On Fri, 12.04.2002, at 21:30, Alan wrote:
> This is pretty nasty.
> 
> This is using the stock kernel from Mandrake 8.2.  I expect that similar
> problems exist in the cooker version.
> 
> What happens is that after some indeterminate time period, the system
> does not allow you to start new processes.  Already existing processes
> run, but new processes will not start and you cannot restart new
> processes.  Shutting down cannot happen because you can't start the
> shutdown script!
> 
> After looking through the logs, I think I have found the cause of the
> problem.  It appears that devfs is dying. It kills enough of the kernel
> to not work correctly, but not enough of the kernel to choke all
> together. (Enough to be frustrating.) It looks like the lethal
> combination is a remountable ide-scsi device, but that is only a guess
> at this point.
> 

What device? Is it supermounted read-write? 

-andrej




[Cooker] Kernel problem in devfs

2002-04-12 Thread Alan

This is pretty nasty.

This is using the stock kernel from Mandrake 8.2.  I expect that similar
problems exist in the cooker version.

What happens is that after some indeterminate time period, the system
does not allow you to start new processes.  Already existing processes
run, but new processes will not start and you cannot restart new
processes.  Shutting down cannot happen because you can't start the
shutdown script!

After looking through the logs, I think I have found the cause of the
problem.  It appears that devfs is dying. It kills enough of the kernel
to not work correctly, but not enough of the kernel to choke all
together. (Enough to be frustrating.) It looks like the lethal
combination is a remountable ide-scsi device, but that is only a guess
at this point.

I believe the cause is one of the patches added to the kernel. (Probably
grsecurity.)

I rebuilt the kernel using the stock 2.4.18 source from ftp.kernel.org,
using the same configuration options needed to keep Mandrake happy.
(Devfs and ide-scsi mostly.)  That kernel has worked flawlessly.  (The
other kernel would not last more than a day.)

There is definitely a problem here. What the solution is will take more
research.

ksymoops log is attached. Please Cc me on all mail, as I do not read the
cooker list very often.  (Far too many other lists to keep up with...)


ksymoops 2.4.3 on i686 2.4.18-6mdk.  Options used
 -V (default)
 -k /proc/ksyms (default)
 -l /proc/modules (default)
 -o /lib/modules/2.4.18-6mdk/ (default)
 -m /boot/System.map-2.4.18-6mdk (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Warning (compare_ksyms_lsmod): module ext3 is in lsmod but not in ksyms, probably no 
symbols exported
Warning (compare_maps): mismatch on symbol partition_name  , ksyms_base says c01ce310, 
System.map says c0157de0.  Ignoring ksyms_base entry
Apr  7 14:31:39 kludge kernel: Unable to handle kernel paging request at virtual 
address 204f2f8d
Apr  7 14:31:39 kludge kernel: c0160783
Apr  7 14:31:39 kludge kernel: *pde = 
Apr  7 14:31:39 kludge kernel: Oops: 
Apr  7 14:31:39 kludge kernel: CPU:0
Apr  7 14:31:39 kludge kernel: EIP:0010:[scan_dir_for_removable+19/64]Not 
tainted
Apr  7 14:31:39 kludge kernel: EIP:0010:[]Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Apr  7 14:31:39 kludge kernel: EFLAGS: 00010202
Apr  7 14:31:39 kludge kernel: eax: cc181240   ebx: 204f2f49   ecx:    edx: 
cc181240
Apr  7 14:31:39 kludge kernel: esi: ce153840   edi: ce3647a0   ebp: ce5f32e0   esp: 
c8955f28
Apr  7 14:31:39 kludge kernel: ds: 0018   es: 0018   ss: 0018
Apr  7 14:31:39 kludge kernel: Process msec_find (pid: 2952, stackpage=c8955000)
Apr  7 14:31:39 kludge kernel: Stack: ce153840 c0160c16 ce3647a0 c0265a40  
ce153840 ce1538c0 ce1538ac 
Apr  7 14:31:39 kludge kernel:ce5f32e0 c0141690 ce5f32e0 c8955fa0 c0141b90 
ce5f32e0 fff7 000d 
Apr  7 14:31:39 kludge kernel:bfffeac8 c0141d3f ce5f32e0 c0141b90 c8955fa0 
ce02dbc0 c01338f7 ce02dbc0 
Apr  7 14:31:39 kludge kernel: Call Trace: [devfs_readdir+86/448] [vfs_readdir+96/144] 
[filldir64+0/352] [sys_getdents64+79/185] [filldir64+0/352] 
Apr  7 14:31:39 kludge kernel: Call Trace: [] [] [] 
[] [] 
Apr  7 14:31:39 kludge kernel:[] [] 
Apr  7 14:31:39 kludge kernel: Code: 66 8b 43 44 25 00 f0 00 00 66 3d 00 60 75 0d f6 
43 10 04 74 

>>EIP; c0160782<=
Trace; c0160c16 
Trace; c0141690 
Trace; c0141b90 
Trace; c0141d3e 
Trace; c0141b90 
Trace; c01338f6 
Trace; c0106f22 
Code;  c0160782 
 <_EIP>:
Code;  c0160782<=
   0:   66 8b 43 44       mov    0x44(%ebx),%ax   <=
Code;  c0160786
   4:   25 00 f0 00 00    and    $0xf000,%eax
Code;  c016078a
   9:   66 3d 00 60       cmp    $0x6000,%ax
Code;  c016078e
   d:   75 0d             jne    1c <_EIP+0x1c> c016079e

Code;  c0160790
   f:   f6 43 10 04       testb  $0x4,0x10(%ebx)
Code;  c0160794
  13:   74 00             je     15 <_EIP+0x15> c0160796


Apr  7 14:39:49 kludge kernel: 8139too Fast Ethernet driver 0.9.24
Apr  7 14:39:57 kludge kernel: ac97_codec: AC97 Audio codec, id: 0x8384:0x7609 
(SigmaTel STAC9721/23)
Apr  8 04:05:56 kludge kernel: Unable to handle kernel paging request at virtual 
address 204f2f8d
Apr  8 04:05:56 kludge kernel: c0160783
Apr  8 04:05:56 kludge kernel: *pde = 
Apr  8 04:05:56 kludge kernel: Oops: 
Apr  8 04:05:56 kludge kernel: CPU:0
Apr  8 04:05:56 kludge kernel: EIP:0010:[scan_dir_for_removable+19/64]Not 
tainted
Apr  8 04:05:56 kludge kernel: EIP: