Re: raid5 checksumming chooses wrong function

2000-03-18 Thread Christian Robottom Reis

On Tue, 14 Mar 2000, Malcolm Beattie wrote:

 Benchmarking it on a stripeset of 7 x 9GB disks on an Ultra3 bus with
 one of the Adaptec 7899 channels, it's impressively fast. 81MB/s block
 reads and 512 seeks/s in bonnie and 50MB/s (500 "netbench" Mbits/sec)
 running dbench with 128 threads. I've done tiotest runs too and I'll
 be doing more benchmarks on RAID5 soon. If anyone wants me to post
 figures, I'll do so.

Go ahead and post the tiobenches as well!

Cheers,
--
_/\ Christian Reis is sometimes [EMAIL PROTECTED] 
\/~ suicide architect | free software advocate | mountain biker 



Re: RAID0: Fast writes, Slow reads...

2000-03-18 Thread Esben Haabendal Soerensen

 "Kent" == Kent Nilsen [EMAIL PROTECTED] writes:

Kent I've got exactly the same problem on a Mylex hardware RAID-
Kent controller, writing is nearly twice as fast as reading. The
Kent drives are Barracuda 50Gb drives, the controller is a
Kent DAC1164P. I use the latest firmware, and latest drivers from
Kent dandelion.com. Kernel version is 2.2.14, distribution=Mandrake
Kent 7.0. Single PIII-600 on dual motherboard (Asus P2B-DS), 392 Mb
Kent RAM.

Kent Do you by any chance have problems with the entire system
Kent freezing after a while or during lots of activity? I've got that
Kent problem, though my firmware should be a fixed version.

Funny (or not), this sounds a lot like my current problems.

I have a HP Netserver LH4 with integrated HP Netraid controller.  When
doing heavy I/O on the Netraid controller the entire system more or
less freezes.  I am using the newest firmware version (have even tried a
few older versions), and have tried it on 2.2.14 and 2.2.15pre14.

Doing bonnie on a single 9GB HP disk shows the following:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  100  7771 99.5  6753  3.8  4513  9.2  7830 99.6 245756 100.8 22572.5 95.9


That is approx. 4 times faster on block read than block write, but
100% cpu load on block read!!

I was coming to the conclusion that I had to look deep into the AMI
Megaraid driver to try and locate the problem, but could the problem
by chance be more generic?

Can anyone offer some qualified guidance about where to start looking
for this problem?

/bart



file system corruption on raid0

2000-03-18 Thread Peter Pregler

Dear all,

I am running raidtools 0.90.990824-5 on a mingo-B1-patched 2.2.14 for
quite some time. Now a few days ago the machine just froze (no keyboard,
no login, no remote login, ping worked, that's all) and had to be
reset. It came up fine, reconstructed etc. but there is really not a
single hint in the logs for the freeze. Now I 'could' live with that but
two days later I got ext2fs errors on one of the raid0 partitions. Things
like the following:

Mar 15 01:29:46 kludge kernel: EXT2-fs warning (device md(9,3)):
ext2_unlink: Deleting nonexistent file (159), 0
Mar 15 01:49:47 kludge kernel: EXT2-fs warning (device md(9,3)):
ext2_unlink: Deleting nonexistent file (160), 0
Mar 15 06:28:11 kludge kernel: EXT2-fs warning (device md(9,3)):
ext2_free_inode: bit already cleared for inode 150

I did a reboot and checked and cleaned up the filesystem. Now I have two
problems:

- there are corrupt files left that I cannot move/remove by normal means
  (i.e. rm/mv and an unlink(2) call did not work)
- I still get ext2-fs warnings!

Here are some of the files:

kludge:/var/spool/squid/03# find . -not -type d -ls
   2000 br-xr-S-w-   1 1573026990116, 110 May 29  2000 ./9F
   2010 b--Sr-S--T   1 2802128682 97, 112 Sep  2  2002 ./A0
   2100 cr--rwSrwT   1 8236 12336 46,  54 Aug 21  2021 ./B1
   213 789517 br--r-   1 8308 13875 61, 114 Feb 27  1996 ./B8
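One generic thing worth trying on the unremovable files (hedged: if unlink(2) itself failed on these inodes, this may fail the same way) is deleting by inode number instead of by name:

```shell
#!/bin/sh
# Generic technique (GNU findutils assumed): delete a file by inode
# number rather than by name.  Demonstrated here on a scratch file;
# substitute the inode numbers from the listing above on the real fs.
tmp=$(mktemp -d)
touch "$tmp/stubborn-file"

# Look up the inode number, then hand the file to rm via find -inum.
inum=$(ls -i "$tmp/stubborn-file" | awk '{print $1}')
find "$tmp" -xdev -inum "$inum" -exec rm -f {} \;
```

If unlink(2) already failed on those inodes this will probably fail the same way, in which case the next resort is an offline e2fsck (or debugfs) on the unmounted device.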

And here are recent warnings:

Mar 17 09:06:06 kludge kernel: EXT2-fs error (device md(9,3)):
ext2_readdir: directory #198 contains a hole at offset 20480
Mar 17 09:06:16 kludge kernel: EXT2-fs error (device md(9,3)): empty_dir:
bad entry in directory #198: rec_len is smaller than minimal - offset=4096,
inode=0, rec_len=0, name_len=0

Anyone got an idea how to get rid of both? I have attached the relevant
proc info at the end. The machine is a dual PIII with an onboard Adaptec
AIC-7890/1 (AIT tape connected) and a dual-channel AHA-394X with two
chains of 3 IBM disks each. The network card is a 3Com 905B. I guess I
should move that to use an extra interrupt, but that should not be the
source of the problem, since the scsi controller is the one the tape is
connected to? Hmmm, thinking it over again: I only recently started to
use the tape drive heavily, so could that be the cause?!

-Peter


---
kludge:/proc# cat interrupts 
   CPU0   CPU1   
  0:31835593048539IO-APIC-edge  timer
  1:880917IO-APIC-edge  keyboard
  2:  0  0  XT-PIC  cascade
  8:  0  1IO-APIC-edge  rtc
 10:   17670681   17651847   IO-APIC-level  aic7xxx, eth0
 11: 659362 655379   IO-APIC-level  aic7xxx, aic7xxx
 13:  1  0  XT-PIC  fpu
 14:  4  2IO-APIC-edge  ide0
NMI:  0
---
kludge:/var/log# cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid5] 
read_ahead 1024 sectors
md1 : active raid1 sdd2[1] sda2[0] 96320 blocks [2/2] [UU]
md2 : active raid1 sdd3[1] sda3[0] 1951808 blocks [2/2] [UU]
md5 : active raid0 sdd6[1] sda6[0] 979712 blocks 32k chunks
md7 : active raid0 sdd7[1] sda7[0] 5863424 blocks 32k chunks
md3 : active raid0 sde6[1] sdb6[0] 979712 blocks 32k chunks
md4 : active raid0 sdf6[1] sdc6[0] 979712 blocks 32k chunks
md6 : active raid5 sdf7[3] sde7[2] sdc7[1] sdb7[0] 14650944 blocks level
5, 32k chunk, algorithm 0 [4/4] [UUUU]
md8 : active raid5 sdf8[5] sde8[4] sdd8[3] sdc8[2] sdb8[1] sda8[0]
24418240 blocks level 5, 32k chunk, algorithm 0 [6/6] [UUUUUU]
md9 : active raid5 sdf9[5] sde9[4] sdd9[3] sdc9[2] sdb9[1] sda9[0]
37350400 blocks level 5, 32k chunk, algorithm 0 [6/6] [UUUUUU]
unused devices: <none>


--
Here is the scsi-controller with the tape drive:

kludge:/proc# cat scsi/aic7xxx/0
Adaptec AIC7xxx driver version: 5.1.21/3.2.4
Compile Options:
  TCQ Enabled By Default : Enabled
  AIC7XXX_PROC_STATS : Enabled
  AIC7XXX_RESET_DELAY: 5

Adapter Configuration:
   SCSI Adapter: Adaptec AIC-7890/1 Ultra2 SCSI host adapter
   Ultra-2 LVD/SE Wide Controller
PCI MMAPed I/O Base: 0xe100
PCI Bus 0x00 Device 0x30
 Adapter SEEPROM Config: SEEPROM found and used.
  Adaptec SCSI BIOS: Enabled
IRQ: 10
   SCBs: Active 0, Max Active 1,
 Allocated 15, HW 32, Page 255
 Interrupts: 489083
  BIOS Control Word: 0x18a6
   Adapter Control Word: 0x1c5e
   Extended Translation: Enabled
Disconnect Enable Flags: 0x
 Ultra Enable Flags: 0x
 Tag Queue Enable Flags: 0x
Ordered Queue Tag Flags: 0x
Default Tag Queue Depth: 8
Tagged Queue By Device array for aic7xxx host instance 0:
  {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
Actual queue depth per device for aic7xxx host instance 0:
  {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}

Statistics:

(scsi0:0:4:0)
  Device using Wide/Sync transfers at 20.0 MByte/sec, offset 15
  

Re: Root on RAID1

2000-03-18 Thread Christian Robottom Reis

On Wed, 15 Mar 2000 [EMAIL PROTECTED] wrote:

  Can you please tell me where can I find the latest raid-tools? I
 found raidtools-  dangerous on http://people.redhat.com/mingo and
 didn't try to use it.
 
 Good question.  From my impression, Linux (software) raid support got
 a bit divided up, so who knows where the latest and most *stable*
 version is?

Dangerous just scares away the sissies :-) It's the right version.

Cheers,
--
_/\ Christian Reis is sometimes [EMAIL PROTECTED] 
\/~ suicide architect | free software advocate | mountain biker 



mkraid secret flag

2000-03-18 Thread James Manning

[ Wednesday, March 15, 2000 ] root wrote:
  mkraid --**-force /dev/md0

/me attempts to get the Stupid Idea Of The Month award

Motivation: trying to keep the Sekret Flag a secret is a failed effort
(the number of linux-raid archives, esp. those that are searchable, make
this a given), and a different approach could help things tremendously.

*** Idea #1:

How about --force / -f look for $HOME/.md_force_warning_read and

if not exists:
 - print huge warning (and beep thousands of times as desired)
 - creat()/close() the file

if exists:
 - Do the Horrifically Dangerous stuff

Benefit:  everyone has to read at least once (or at a minimum create a
  file that says they've read it)
Downside: adds a $HOME/ entry, relies on getenv("HOME"), etc.

*** Idea #2:

--force / -f prints a warning, prompts for input (no fancy term
tricks), and continues only on "yes" being entered (read(1,..) so
we can "echo yes |mkraid --force" in cases we want it automated).

Benefit:  warning always generated
Downside: slightly more complicated to script
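A rough shell sketch of Idea #1's marker-file gate (everything here is illustrative; the messages and behaviour are invented, mkraid does nothing like this today):

```shell
#!/bin/sh
# Sketch of Idea #1: refuse --force until a marker file shows the
# warning has been displayed at least once.  Marker path and messages
# are for illustration only.
MARKER="${HOME}/.md_force_warning_read"
rm -f "$MARKER"    # start the demo from a clean slate

force_gate() {
    if [ ! -f "$MARKER" ]; then
        echo "WARNING: --force can destroy all data on existing arrays!"
        : > "$MARKER"          # the creat()/close() step
        return 1               # refuse this run; re-run after reading
    fi
    return 0                   # marker present: do the dangerous stuff
}

force_gate && echo "first run proceeded (should not happen)"
force_gate && echo "second run proceeds"
```

Idea #2 would replace the file check with reading one line from stdin, which is what keeps `echo yes | mkraid --force` usable for automation.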

Both are fairly trivial patches, so I'll be glad to generate the
patch for whichever (if either :) people seem to like.

James



Re: RAID0: Fast writes, Slow reads...

2000-03-18 Thread Jakob Østergaard

On Tue, 14 Mar 2000, Scott M. Ransom wrote:

 Hello,
 
 I have just set up RAID0 with two 30G DiamondMax (Maxtor) ATA-66 drives
 connected to a Promise Ultra66 controller.
 
 I am using raid 0.90 in kernel 2.3.51 on a dual PII-450 with 256M RAM.
 
 Here are the results from bonnie:
 
  -------Sequential Output-------- ---Sequential Input-- --Random--
  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
   MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
 1200  6813 98.2 41157 42.0 10101 25.9  5205 78.9 14890 27.3 137.8  1.8
 
 Seems like my sequential block writes are 3 times faster than the
 reads.  Any idea why that would be?

Someone (probably Andre Hedrick, or perhaps Andrea Arcangeli -- sorry guys, I
don't recall) explained this on LKML. From memory it has something to do
with ATA modes and the kernel configuration. You haven't enabled ``Generic
busmaster support'', or perhaps one of the other IDE driver options, I don't
exactly remember which.  But I was going to try it out myself as I see the same
odd numbers on a test system.

If you experiment a little and find the right option, please post here   :)

Or search the LKML archives, the mail was posted last week (I think).

-- 

: [EMAIL PROTECTED]  : And I see the elder races, :
:.: putrid forms of man:
:   Jakob Østergaard  : See him rise and claim the earth,  :
:OZ9ABN   : his downfall is at hand.   :
:.:{Konkhra}...:



fsck

2000-03-18 Thread octave klaba

Hi,
I have set up raid by hand on 2.2.12. It works fine :)
The problem is: how do I get the fsck that is usually done at boot
on /dev/sda1 etc. to happen, because it does not want to
run fsck on /dev/md0 :(
At the moment I have /etc/fstab set to skip the check,
but I don't think that is a good solution.
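For the archives: boot-time checking is driven by the sixth field in /etc/fstab (fs_passno), so once fsck actually works on the md device, re-enabling the check is a one-line change; the raid has to be started before the checks run. A hypothetical excerpt (device names are examples only):

```shell
# /etc/fstab excerpt (hypothetical devices).  The sixth field, fs_passno,
# controls the boot-time fsck: 0 = never check, 1 = the root filesystem,
# 2 = check after root.  Changing /dev/md0's last field from 0 back to 2
# re-enables the check.
#
# <device>    <mount>   <type>  <options>  <dump>  <passno>
/dev/sda1     /         ext2    defaults   1       1
/dev/md0      /data     ext2    defaults   1       2
```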

Thanks
Octave



Patch Application Problem

2000-03-18 Thread Brian Lavender

I am trying to apply the raid patch to the 2.2.14 kernel
and I get this error. What is wrong?

brian

everest:/usr/src/linux# patch -p1 < raid-2.2.14-B1.patch
patching file `init/main.c'
Hunk #2 FAILED at 488.
Hunk #3 succeeded at 940 with fuzz 2 (offset 12 lines).
Hunk #4 FAILED at 1438.
2 out of 4 hunks FAILED -- saving rejects to init/main.c.rej
patching file `include/linux/raid/linear.h'
patching file `include/linux/raid/hsm_p.h'
patching file `include/linux/raid/md.h'
patch:  malformed patch at line 411: rint_devices(); }

-- 
Brian Lavender
http://www.brie.com/brian/



RE: upgrade from 2.2.12-20 to 2.2.14 breaks stuff

2000-03-18 Thread Gregory Leblanc

 -Original Message-
 From: Luca Berra [mailto:[EMAIL PROTECTED]]
 Sent: Friday, March 17, 2000 5:02 AM
 To: Linux Raid list (E-mail)
 Subject: Re: upgrade from 2.2.12-20 to 2.2.14 breaks stuff
 
 
 On Wed, Mar 15, 2000 at 12:40:07PM -0800, Gregory Leblanc wrote:
  The patch applied cleanly, and I compiled the kernel with 
 all of the proper
  options (AFAICT).  When I try to boot with the new kernel, I get:
  
  md.c: sizeof(m_super_t) = 4096 cr
  Autodetecting RAID Arrays cr
  Autorun... cr
  ...autorun done. cr
  Bad md_map in ll_rw_block isofs_read_super: bread failed, dev=09:01,
  iso_blknum=16, block=32 cr
  kernel panic: VFS: unable to open root fs on 09:01 cr
 
 can you double-check that scsi, sd, your scsi adapter, raid1
 are all built into the kernel, if they are modules
 you need to build an initrd. also check the partition type.

Aw, son-of-a-BLANK.  Thanks Luca, looks like the stupid idiotic kernel
config program bit me again.  When I hit select, I expect it to SELECT that
option, not module that option.  I'll go re-compile and see what that fixes.
Thanks again,
Greg



How to get RAID (0) working in kernel 2.2.14 & 2.3.47?

2000-03-18 Thread Martin Eriksson

Hi!

I've installed my system with the RH6.1 installation process to be a small
boot partition, a small swap part., a bigger raid0 swap part and a really
big raid0 root partition.

Now this is with kernel 2.2.12-20, and the system uses an initrd to load the
raid0 module.

When I try to upgrade to 2.2.14 or 2.3.47 the raid disk won't start at
bootup! I have made a new initrd with correct modules but it still doesn't
work.

I tried some RAID patch on 2.3.47 but that didn't work either... but that
may be the ramdisk too.

Is it possible to run RAID0 in 2.3.47 and boot from it in some way without
using a ramdisk?
And what's wrong with 2.2.14?

_
|  Martin Eriksson [EMAIL PROTECTED]
|  http://www.fysnet.nu/
|  ICQ: 5869159



Re: Root on RAID1

2000-03-18 Thread Luigi Gangitano

 Pick up lilo-21.3 - I have it here, if you can't find it. Perhaps Ingo
 could place it together with the raid-patches so there would be somewhere
 centralized - or going back to kernel.org - opinions, Jakob?

I found lilo-21.14 on debian unstable. Any hint?


Luigi Gangitano
ICQ: 2406003
WWW: http://www.poboxes.com/l.gangitano



Re: RAID0: Fast writes, Slow reads...

2000-03-18 Thread Scott M. Ransom

Jakob Østergaard wrote:
 
 Someone (probably Andre Hedrick, or perhaps Andrea Arcangeli -- sorry guys, I
 don't recall) explained this on LKML. From memory it has something to do
 with ATA modes and the kernel configuration. You haven't enabled ``Generic
 busmaster support'', or perhaps one of the other IDE driver options, I don't
 exactly remember which.  But I was going to try it out myself as I see the same
 odd numbers on a test system.

Nope.  That's not the problem (I think); here are the applicable defines
from my config:

CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDEFLOPPY=y
CONFIG_BLK_DEV_CMD640=y
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_IDEDMA_PCI_EXPERIMENTAL=y
CONFIG_BLK_DEV_PIIX=y
CONFIG_PIIX_TUNING=y
CONFIG_BLK_DEV_PDC202XX=y
CONFIG_BLK_DEV_IDE_MODES=y

Scott

PS:  Am now using 2.3.99-pre1 and the problem still persists...

-- 
Scott M. Ransom   
Phone:  (781) 320-9867 Address:  75 Sanderson Ave.
email:  [EMAIL PROTECTED]   Dedham, MA  02026
PGP Fingerprint: D2 0E D0 10 CD 95 06 DA  EF 78 FE 2B CB 3A D3 53



RE: How to get RAID (0) working in kernel 2.2.14 & 2.3.47?

2000-03-18 Thread Gregory Leblanc

 -Original Message-
 From: Martin Eriksson [mailto:[EMAIL PROTECTED]]
 Sent: Saturday, March 18, 2000 4:14 AM
 To: [EMAIL PROTECTED]
 Subject: How to get RAID (0) working in kernel 2.2.14 & 2.3.47?
 
 Hi!
 
 I've installed my system with the RH6.1 installation process 
 to be a small
 boot partition, a small swap part., a bigger raid0 swap part 
 and a really
 big raid0 root partition.

Root on RAID0 is stupid.  Don't do it.  RAID 0 your other partitions (/usr,
/var, /home, /whoosiewhatsit), and leave root as non-RAID, or RAID1.  RAID5
might be ok, but I still wouldn't do that either, unless it was on hardware
RAID.

 
 Now this is with kernel 2.2.12-20, and the system uses an 
 initrd to load the
 raid0 module.
 
 When I try to upgrade to 2.2.14 or 2.3.47 the raid disk won't start at
 bootup! I have made a new initrd with correct modules but it 
 still doesn't
 work.

I don't work with the 2.3.x series, but you need to patch 2.2.14 with the
patch from http://www.redhat.com/~mingo/
Greg



Re: mkraid secret flag

2000-03-18 Thread Seth Vidal

 How about --force / -f look for $HOME/.md_force_warning_read and
 
 if not exists:
  - print huge warning (and beep thousands of times as desired)
  - creat()/close() the file
How about an expiration on the timestamp of this file?
I.e., if it is older than 2 weeks, make them read it again.

I know I forget all sorts of warnings after a while :)
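A hedged sketch of that expiry check, using only `find -mtime` (the marker path is a stand-in; nothing like this exists in mkraid):

```shell
#!/bin/sh
# Sketch of the expiry idea: treat the warning-read marker as stale once
# it is more than two weeks old, forcing the warning to be shown again.
MARKER=$(mktemp)                    # stand-in for ~/.md_force_warning_read
touch -t 200001010000 "$MARKER"     # simulate a marker from long ago

# find -mtime +14 prints the file only if it is older than 14 days.
if [ -n "$(find "$MARKER" -mtime +14 2>/dev/null)" ]; then
    stale=yes
    rm -f "$MARKER"                 # expired: show the warning again
else
    stale=no
fi
echo "marker stale: $stale"
```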

-sv




Re: raid5 on 2.2.14

2000-03-18 Thread Seth Vidal

 If the partition types are set to "fd" and you selected the "autorun"
 config option in block devices (it should be turned on on a rawhide-type
 kernel), raidstart shouldn't be necessary.  (the kernel will have
 already started the md arrays itself, and the later initscripts raidstart
 call won't be necessary).  Could you paste any "autorun" section of md
 initialization during boot?
 
 does the same problem appear even if you build-in raid5? (first-pass
 debugging of building-in all raid-related scsi and md modules just to
 get initrd and module ordering issues out of the way might help)
 
 after you boot, does /proc/mdstat show the array?  active?
 if you boot into single-user mode, is the array already active?
 what's the raidtab contents?
 
 Note that as coded, the initscripts should only be attempting to
 raidstart inactive arrays, but I never checked to make sure that
 the code actually worked as intended.
 
 Given that, I don't really think any of the above really helps, but
 it's something to throw out there :)

I think I figured it out.
The drives came off of an older Sun. They still had the Sun disklabels on
them, and I never remade the disk labels before repartitioning. I think
that when I rebooted, the disklabels got in the way of the disks being
recognized correctly, and it ate the drive.

I also found out later that one of the drives I was using had some sort of
fairly heinous fault. It would detect, but would only occasionally be found
by Linux. I took it out of the array; I think I'm going to RMA it.

thanks for the help.

As an additional question. What sort of numbers should I be seeing
(performance wise) on a u2w 4 disk array in raid5.

I'm getting about 15MB/s write and 25MB/s read but I wouldn't mind getting
those numbers cranked up some.

I'm using a 32K chunksize with the stride setting correctly set (as per
Jakob's HOWTO).
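For reference, the HOWTO's rule is stride = chunk-size / filesystem block-size, so 32K chunks over 4K ext2 blocks give stride=8. A tiny sketch that just computes and prints the resulting mke2fs line (/dev/mdX is a placeholder, nothing is formatted):

```shell
#!/bin/sh
# Compute the ext2 stride for a given md chunk size, per the rule
# stride = chunk-size / block-size.  The mke2fs command is only
# printed, not run.
chunk_kb=32
block_kb=4
stride=$((chunk_kb / block_kb))
echo "mke2fs -b $((block_kb * 1024)) -R stride=$stride /dev/mdX"
```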

I'm testing with 500MB/1000MB/1500MB/2000MB bonnie tests.

The machine is a k6-2 500 with 128MB of ram
Scsi controller is a tekram 390U2W

The disks are Seagate 7200 RPM Barracudas (18 and 9 gig versions).

I'm using one 9 gig partition on each of the 18 gig drives and the whole
drive on the two 9 gig drives.

thanks

-sv