Dear all,

I have been running raidtools 0.90.990824-5 on a mingo-B1-patched 2.2.14
kernel for quite some time. A few days ago the machine just froze (no
keyboard, no login, no remote login; ping worked, that's all) and had to
be reset. It came up fine, reconstructed etc., but there is really not a
single hint in the logs about the freeze. I 'could' live with that, but
two days later I got ext2fs errors on one of the raid0 partitions. Things
like the following:

Mar 15 01:29:46 kludge kernel: EXT2-fs warning (device md(9,3)): ext2_unlink: Deleting nonexistent file (159), 0
Mar 15 01:49:47 kludge kernel: EXT2-fs warning (device md(9,3)): ext2_unlink: Deleting nonexistent file (160), 0
Mar 15 06:28:11 kludge kernel: EXT2-fs warning (device md(9,3)): ext2_free_inode: bit already cleared for inode 150

I rebooted, then checked and cleaned up the filesystem. Now I have two
problems:

- there are corrupt files left that I cannot move/remove by normal means
  (i.e. rm/mv and an unlink(2) call did not work)
- I still get ext2-fs warnings!

Here are some of the files:

kludge:/var/spool/squid/03# find . -not -type d -ls
   200    0 br-xr-S-w-   1 15730    26990    116, 110 May 29  2000 ./9F
   201    0 b--Sr-S--T   1 28021    28682     97, 112 Sep  2  2002 ./A0
   210    0 cr--rwSrwT   1 8236     12336     46,  54 Aug 21  2021 ./B1
   213 789517 br--r-----   1 8308     13875     61, 114 Feb 27  1996 ./B8
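
To find every such bogus entry in the rest of the cache tree, something like the following could be used. This is only a minimal sketch: the function name find_special_files is made up for illustration, and it assumes that a squid cache directory should contain nothing but regular files and directories, so any device node, fifo, socket or symlink is reported as suspicious.

```python
import os
import stat

# Hypothetical helper: the name and the reporting format are illustrative.
_KINDS = {
    stat.S_IFBLK: "block device",
    stat.S_IFCHR: "char device",
    stat.S_IFIFO: "fifo",
    stat.S_IFSOCK: "socket",
    stat.S_IFLNK: "symlink",
}

def find_special_files(root):
    """Return (path, kind) for every entry under root that is neither a
    regular file nor a directory."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are reported, not followed
                mode = os.lstat(path).st_mode
            except OSError:
                # a damaged directory entry may not even be stat-able
                found.append((path, "unreadable"))
                continue
            kind = _KINDS.get(stat.S_IFMT(mode))
            if kind is not None:
                found.append((path, kind))
    return found
```

Calling find_special_files("/var/spool/squid") would then list candidates for closer inspection. It only reads metadata and does not attempt to remove anything.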

And here are recent warnings:

Mar 17 09:06:06 kludge kernel: EXT2-fs error (device md(9,3)): ext2_readdir: directory #198 contains a hole at offset 20480
Mar 17 09:06:16 kludge kernel: EXT2-fs error (device md(9,3)): empty_dir: bad entry in directory #198: rec_len is smaller than minimal - offset=4096, inode=0, rec_len=0, name_len=0

Does anyone have an idea how to get rid of both? I have attached the
relevant proc info at the end. The machine is a dual PIII with an onboard
Adaptec AIC-7890/1 (AIT tape connected) and a dual-channel AHA-394X with
two chains of 3 IBM disks each. The network card is a 3Com 905B. It
currently shares IRQ 10 with a SCSI controller, so I guess I should move
it to an interrupt of its own. But that should not be the source of the
problem, since the SCSI controller it shares with is the one the tape is
connected to? Hmmm, thinking that over again: I only recently started to
use the tape drive heavily. So this could be the cause?!

-Peter


-------------------
kludge:/proc# cat interrupts 
           CPU0       CPU1       
  0:    3183559    3048539    IO-APIC-edge  timer
  1:        880        917    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          0          1    IO-APIC-edge  rtc
 10:   17670681   17651847   IO-APIC-level  aic7xxx, eth0
 11:     659362     655379   IO-APIC-level  aic7xxx, aic7xxx
 13:          1          0          XT-PIC  fpu
 14:          4          2    IO-APIC-edge  ide0
NMI:          0
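
The shared lines in the listing above (IRQ 10 and IRQ 11) can also be picked out programmatically. The following is a rough sketch only: the function name shared_irqs is made up, and it assumes the 2.2-era layout shown here, i.e. per-CPU counters, then a single controller-type field, then a comma-separated device list. SAMPLE is copied from the output above.

```python
SAMPLE = """\
           CPU0       CPU1
  0:    3183559    3048539    IO-APIC-edge  timer
 10:   17670681   17651847   IO-APIC-level  aic7xxx, eth0
 11:     659362     655379   IO-APIC-level  aic7xxx, aic7xxx
 14:          4          2    IO-APIC-edge  ide0
NMI:          0
"""

def shared_irqs(text):
    """Return {irq_number: [device, ...]} for every IRQ line that lists
    more than one device (assumed /proc/interrupts layout, see above)."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # header line, NMI line, etc.
        fields = rest.split()
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1  # skip the per-CPU interrupt counters
        # fields[i] is the controller type; the remainder is the device list
        devices = [d.strip() for d in " ".join(fields[i + 1:]).split(",")]
        devices = [d for d in devices if d]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared

print(shared_irqs(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/interrupts instead of SAMPLE.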
-----------------------
kludge:/var/log# cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid5] 
read_ahead 1024 sectors
md1 : active raid1 sdd2[1] sda2[0] 96320 blocks [2/2] [UU]
md2 : active raid1 sdd3[1] sda3[0] 1951808 blocks [2/2] [UU]
md5 : active raid0 sdd6[1] sda6[0] 979712 blocks 32k chunks
md7 : active raid0 sdd7[1] sda7[0] 5863424 blocks 32k chunks
md3 : active raid0 sde6[1] sdb6[0] 979712 blocks 32k chunks
md4 : active raid0 sdf6[1] sdc6[0] 979712 blocks 32k chunks
md6 : active raid5 sdf7[3] sde7[2] sdc7[1] sdb7[0] 14650944 blocks level 5, 32k chunk, algorithm 0 [4/4] [UUUU]
md8 : active raid5 sdf8[5] sde8[4] sdd8[3] sdc8[2] sdb8[1] sda8[0] 24418240 blocks level 5, 32k chunk, algorithm 0 [6/6] [UUUUUU]
md9 : active raid5 sdf9[5] sde9[4] sdd9[3] sdc9[2] sdb9[1] sda9[0] 37350400 blocks level 5, 32k chunk, algorithm 0 [6/6] [UUUUUU]
unused devices: <none>


----------------------
Here is the scsi-controller with the tape drive:

kludge:/proc# cat scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 5.1.21/3.2.4
Compile Options:
  TCQ Enabled By Default : Enabled
  AIC7XXX_PROC_STATS     : Enabled
  AIC7XXX_RESET_DELAY    : 5

Adapter Configuration:
           SCSI Adapter: Adaptec AIC-7890/1 Ultra2 SCSI host adapter
                           Ultra-2 LVD/SE Wide Controller
    PCI MMAPed I/O Base: 0xe1000000
    PCI Bus 0x00 Device 0x30
 Adapter SEEPROM Config: SEEPROM found and used.
      Adaptec SCSI BIOS: Enabled
                    IRQ: 10
                   SCBs: Active 0, Max Active 1,
                         Allocated 15, HW 32, Page 255
             Interrupts: 489083
      BIOS Control Word: 0x18a6
   Adapter Control Word: 0x1c5e
   Extended Translation: Enabled
Disconnect Enable Flags: 0xffff
     Ultra Enable Flags: 0x0000
 Tag Queue Enable Flags: 0x0000
Ordered Queue Tag Flags: 0x0000
Default Tag Queue Depth: 8
    Tagged Queue By Device array for aic7xxx host instance 0:
      {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
    Actual queue depth per device for aic7xxx host instance 0:
      {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:

(scsi0:0:4:0)
  Device using Wide/Sync transfers at 20.0 MByte/sec, offset 15
  Transinfo settings: current(25/15/1/0), goal(11/127/1/0), user(11/127/1/0)
  Total transfers 488937 (6 reads and 488931 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:       0       0       0       0       0       6       0       0
  Writes:       0       0       0       0       0  488931       0       0


----------------------
Here is the first scsi-controller with disks:

kludge:/var/log# cat /proc/scsi/aic7xxx/1
Adaptec AIC7xxx driver version: 5.1.21/3.2.4
Compile Options:
  TCQ Enabled By Default : Enabled
  AIC7XXX_PROC_STATS     : Enabled
  AIC7XXX_RESET_DELAY    : 5

Adapter Configuration:
           SCSI Adapter: Adaptec AHA-394X Ultra2 SCSI host adapter
                           Ultra-2 LVD/SE Wide Controller Channel A
    PCI MMAPed I/O Base: 0xe0000000
    PCI Bus 0x00 Device 0x58
 Adapter SEEPROM Config: SEEPROM found and used.
      Adaptec SCSI BIOS: Enabled
                    IRQ: 11
                   SCBs: Active 0, Max Active 24,
                         Allocated 30, HW 32, Page 255
             Interrupts: 687232
      BIOS Control Word: 0x18a6
   Adapter Control Word: 0x1c5e
   Extended Translation: Enabled
Disconnect Enable Flags: 0xffff
     Ultra Enable Flags: 0x0000
 Tag Queue Enable Flags: 0x000b
Ordered Queue Tag Flags: 0x000b
Default Tag Queue Depth: 8
    Tagged Queue By Device array for aic7xxx host instance 1:
      {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
    Actual queue depth per device for aic7xxx host instance 1:
      {8,8,1,8,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:

(scsi1:0:0:0)
  Device using Wide/Sync transfers at 80.0 MByte/sec, offset 31
  Transinfo settings: current(10/31/1/0), goal(10/31/1/0), user(10/127/1/0)
  Total transfers 385985 (207115 reads and 178870 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:    2753     381   80050   23479   26653   22572   51227       0
  Writes:   11409    2913   56308   13887    9520   15661   69172       0


(scsi1:0:1:0)
  Device using Wide/Sync transfers at 80.0 MByte/sec, offset 31
  Transinfo settings: current(10/31/1/0), goal(10/31/1/0), user(10/127/1/0)
  Total transfers 206185 (97241 reads and 108944 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:      10       1   62339   15403    9172    6557    3759       0
  Writes:       5       0   66184   18877    7829   13184    2865       0


(scsi1:0:3:0)
  Device using Wide/Sync transfers at 80.0 MByte/sec, offset 31
  Transinfo settings: current(10/31/1/0), goal(10/31/1/0), user(10/127/1/0)
  Total transfers 94929 (77407 reads and 17522 writes)
             < 2K      2K+     4K+     8K+    16K+    32K+    64K+   128K+
   Reads:       8       0   48031   12718    7647    5512    3491       0
  Writes:       0       0   14411    2162     471      60     418       0




-------------------------------
Email: [EMAIL PROTECTED]
WWW:   http://www.risc.uni-linz.ac.at/people/ppregler
