How silly of me not to have included my dmesg:

Linux version 2.4.9-34 ([EMAIL PROTECTED]) (gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)) #1 Sat Jun 1 06:32:14 EDT 2002
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 0000000007e77000 (usable)
 BIOS-e820: 0000000007e77000 - 0000000007e79000 (ACPI NVS)
 BIOS-e820: 0000000007e79000 - 0000000008000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
On node 0 totalpages: 32375
zone(0): 4096 pages.
zone(1): 28279 pages.
zone(2): 0 pages.
Kernel command line: auto BOOT_IMAGE=linux ro root=305 BOOT_FILE=/boot/vmlinuz-2.4.9-34
Initializing CPU#0
Detected 897.289 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 1789.13 BogoMIPS
Memory: 123348k/129500k available (1741k kernel code, 4884k reserved, 90k data, 216k init, 0k highmem)
Dentry-cache hash table entries: 16384 (order: 5, 131072 bytes)
Inode-cache hash table entries: 8192 (order: 4, 65536 bytes)
Mount-cache hash table entries: 2048 (order: 2, 16384 bytes)
Buffer-cache hash table entries: 4096 (order: 2, 16384 bytes)
Page-cache hash table entries: 32768 (order: 6, 262144 bytes)
CPU: Before vendor init, caps: 0383fbff 00000000 00000000, vendor = 0
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 128K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: After vendor init, caps: 0383fbff 00000000 00000000 00000000
CPU:     After generic, caps: 0383fbff 00000000 00000000 00000000
CPU:             Common caps: 0383fbff 00000000 00000000 00000000
CPU: Intel Celeron (Coppermine) stepping 0a
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
Checking for popad bug... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch ([EMAIL PROTECTED])
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfbe9e, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
Unknown bridge resource 2: assuming transparent
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Simple Boot Flag extension found and enabled.
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.14)
Starting kswapd v1.8
VFS: Diskquotas version dquot_6.5.0 initialized
pty: 512 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS MULTIPORT SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10e
block: queued sectors max/low 81869kB/27289kB, 256 slots per queue
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz PCI bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller on PCI bus 00 dev f9
PIIX4: chipset revision 2
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
hda: Maxtor 2B020H1, ATA DISK drive
hdc: SAMSUNG CD-ROM SC-148C, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
blk: queue c033e6c0, I/O limit 4095Mb (mask 0xffffffff)
blk: queue c033e6c0, I/O limit 4095Mb (mask 0xffffffff)
hda: 39062500 sectors (20000 MB) w/2048KiB Cache, CHS=2431/255/63, UDMA(66)
Partition check:
 hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 >
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 8192 bind 8192)
Linux IP multicast router 0.06 plus PIM-SM
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 325k freed
VFS: Mounted root (ext2 filesystem).
Journalled Block Device driver loaded
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: ide0(3,5): orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 28114
EXT3-fs: ide0(3,5): 1 orphan inode deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
Freeing unused kernel memory: 216k freed
Adding Swap: 257000k swap-space (priority -1)
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
usb-uhci.c: $Revision: 1.259 $ time 06:40:13 Jun  1 2002
usb-uhci.c: High bandwidth mode enabled
PCI: Setting latency timer of device 00:1f.2 to 64
usb-uhci.c: USB UHCI at I/O 0xff80, IRQ 11
usb-uhci.c: Detected 2 ports
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
usb-uhci.c: v1.251:USB Universal Host Controller Interface driver
hub.c: USB new device connect on bus1/2, assigned device number 2
hub.c: USB hub found
hub.c: 3 ports detected
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on ide0(3,5), internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on ide0(3,1), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on ide0(3,2), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on ide0(3,3), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on ide0(3,6), internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting.  Commit interval 5 seconds
EXT3 FS 2.4-0.9.11, 3 Oct 2001 on ide0(3,8), internal journal
EXT3-fs: mounted filesystem with ordered data mode.

=====================================
   Chuck Spencer 
   Associate Information Processing Consultant
   University of Wisconsin-Madison
   Department of Neurosurgery
   Tel: 608 265-0458
=====================================

-----Original Message-----
From: Spencer (Chuck S.) 
Sent: Wednesday, December 11, 2002 10:23 AM
To: '[EMAIL PROTECTED]'
Subject: IDE/SCSI errors, machine locking

I'm running a Red Hat 7.2 box (kernel 2.4.18-18.7.x) with an AMD Thunderbird chip in an 
ASUS A7V board, 2 ATA/100 IDE drives, and an Adaptec 2930 SCSI controller hooked to a 
Promise UltraTrack SX8000 external RAID array.

Recently the machine started hanging when the level 0 backup script, which dumps data 
from one IDE drive to the other, ran. It was choking on a file that locked the machine 
up when read in any way, be it copying or opening. I ran e2fsck -c on the volume, and 
when it got to a particular block on the drive the machine hung once again. I replaced 
the drive in question, as well as the IDE cable for good measure, and patched the 
kernel and xinetd per recent advisories. I got everything running again and proceeded 
to set up a partition on a new RAID level 1 array I'd just created in the external 
controller. During the mkfs the machine hung once again, in much the same way as with 
the previous errors on the IDE disks.
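
One way to narrow down where a read triggers the hang, given that nothing reaches the logs before the lockup, is to record progress durably before each read. The following is only a minimal sketch under stated assumptions: the function name, the chunk size, and the log file path are all hypothetical, and the log file is assumed to live on a different disk from the suspect one. After a power-cycle, the last line of the log gives the byte offset that was being read when the machine stopped.

```python
import os


def read_with_checkpoints(path, log_path, chunk_size=4096):
    """Read `path` in chunks, logging each offset to disk before reading it.

    Returns the total number of bytes read. The names and the 4096-byte
    chunk size are illustrative assumptions, not taken from the original post.
    """
    total = 0
    # buffering=0 gives unbuffered binary reads, so each read() maps to
    # one read request at the offset that was just logged.
    with open(path, "rb", buffering=0) as src, open(log_path, "w") as log:
        while True:
            # Record the offset about to be read and force it to disk,
            # so it survives a hard lockup and power-cycle.
            log.write(f"{total}\n")
            log.flush()
            os.fsync(log.fileno())

            chunk = src.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

If the read completes, the final log line equals the file size. If the machine hangs, the final log line is the offset of the chunk that never came back, which can then be compared against the block that e2fsck -c stalled on.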

What I mean when I say hang is this: the machine stops responding. Hitting enter in 
the shell gives a new line but no prompt, ctrl-c and ctrl-z do nothing, and switching 
to a different virtual terminal is no help: once again commands can be typed but there 
is no response. No amount of waiting yields anything new. Nothing comes up in 
/var/log/messages; everything just stops. A soft reboot is impossible; I have to 
power-cycle the machine to get it going again. 

It strikes me as odd that a drive problem handled at as low a level as an fsck would 
cause this kind of hang. It also strikes me as odd that I'd get the exact same problem 
when working with a separate component (the SCSI array). Could this be a kernel 
problem? Driver issues? 

The machine has run nicely for 1.5 years now. Recent changes include patching the 
kernel and xinetd, and installing the SCSI card and array controller. It is tempting 
to jump to the conclusion that the SCSI card is the culprit, but I've tried 2 
(identical) cards with no change, and why would a goofy SCSI card driver cause the 
machine to lock up when transferring data over the IDE channels? 

This is a production server and this problem is getting pretty drawn out. If I can't 
come up with a solution soon we'll be forced to abandon the Red Hat machine and move 
things to a Windows file server. I'd hate to see that happen! Any help is greatly 
appreciated.




