Re: md0 won't let go...

2000-05-11 Thread D. Lance Robinson
Harry, Can you do simple things with /dev/hdl, like... ? dd count=10 if=/dev/hdl of=/dev/null It might help to see your device entry and other information; can you give us the output of... ls -l /dev/hdl cat /etc/mtab cat /proc/mdstat cat /etc/mdtab dd count=10 if=/dev/hdl o
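For convenience, those checks can be run together; this is only a wrapper around the commands Lance lists above, assuming /dev/hdl is the suspect device:
  dd count=10 if=/dev/hdl of=/dev/null   # can the raw device be read at all?
  ls -l /dev/hdl                         # device node type and major/minor numbers
  cat /etc/mtab                          # what is currently mounted
  cat /proc/mdstat                       # the kernel's view of the md arrays
  cat /etc/mdtab                         # the raidtools record of the arrays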

Re: raid1 question

2000-05-05 Thread D. Lance Robinson
Ben Ross wrote: > Hi All, > > I'm using a raid1 setup with the raidtools 0.90 and mingo's raid patch > against the 2.2.15 kernel. ... > My concern is if /dev/sdb1 really crashes and I replace it with another > fresh disk, partition it the same as before, and do a resync, everything > on /dev/

Re: Please help - when is a bad disk a bad disk?

2000-04-11 Thread D. Lance Robinson
Darren Nickerson wrote: > +> 4. is there some way to mark this disk bad right now, so that > +> reconstruction is carried out from the disks I trust? I do have a hot > +> spare . . . > > Lance> You can use the 'raidhotremove' utility. > > This has never worked for me when the disk had n

Re: Please help - when is a bad disk a bad disk?

2000-04-11 Thread D. Lance Robinson
I hope this helps. See below. <>< Lance. > my questions are: > > 2. the disk seems to be "cured" by re-enabling DMA . . . but what is the state > of my array likely to be after the errors above? Can I safely assume this was > harmless? I mean, they WERE write errors after all, yes? Is my array

Re: Raid1 - dangerous resync after power-failure?

2000-03-30 Thread D. Lance Robinson
The event counter (and serial number) only indicates that the superblock is the most current. The SB_CLEAN bit is cleared when an array gets started, and is set when it is stopped (this automatically happens during a normal shutdown.) But, if the system crashes or the power gets yanked, the SB_

Re: Raid1 - dangerous resync after power-failure?

2000-03-30 Thread D. Lance Robinson
It is a very bad idea to prevent resyncs after a volume has possibly become out of sync. It is important to have the disks in sync--even if the data is the wrong data. The way raid-1's balancing works, you don't know which disk will be read. For the same block, the system may read different dis

Re: reconstruction problem.

2000-03-16 Thread D. Lance Robinson
> > i have set up an md (raid1) device. it has two hard disks. > > Something has gone bad on the disks, such > that whenever I do a raidstart or mkraid, it > says > raid set md0 not clean. starting background reconstr.. .. > > what can I do to clean my md device. If the raid device isn't stoppe

Re: SV: SV: raid5: bug: stripe->bh_new[4]

2000-03-03 Thread D. Lance Robinson
Johan, Thanks for sending the bulk information about this bug. I have never seen the buffer bug when running local loads, only when using nfs. The bug appears more often when running with 64MB of RAM or less, but has been seen when using more memory. Below is a sample of the errors seen while d

Re: still get max 12 disks limit

2000-02-29 Thread D. Lance Robinson
Perhaps if you also modify MAX_REAL in the md_k.h file to 15, it will accept more than 12 disks. This value is only used by raid0. <>< Lance. [EMAIL PROTECTED] wrote: > i tried this but mkraid still gives the same error of "a maximum of 12 > disks is supported." > i set MD_SB_DISKS_WORDS to 480 to gi
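A minimal sketch of how one might locate the constants mentioned in this thread before changing them, assuming a 2.2 kernel tree with the 0.90 raid patch applied (the header paths are assumptions, not quoted from the original mail):
  cd /usr/src/linux
  grep -n 'MAX_REAL' include/linux/raid/md_k.h     # per-array disk limit used by raid0
  grep -n 'MD_SB_DISKS' include/linux/raid/md_p.h  # superblock disk-slot constants
  # after editing the values, rebuild the kernel and the matching raidtools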

Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure = problems ?

2000-01-14 Thread D. Lance Robinson
Ingo, I can fairly regularly generate corruption (data or ext2 filesystem) on a busy RAID-5 by adding a spare drive to a degraded array and letting it build the parity. Could the problem be from the bad (illegal) buffer interactions you mentioned, or are there other areas that need fixing as well

Re: large ide raid system

2000-01-11 Thread D. Lance Robinson
SCSI works quite well with many devices connected to the same cable. The PCI bus turns out to be the bottleneck with the faster scsi modes, so it doesn't matter how many channels you have. If performance were the issue (though the original poster wasn't interested in performance), multiple channels wou

Re: Swapping Drives on RAID?

2000-01-11 Thread D. Lance Robinson
Scott, 1. Use raidhotremove to take out the IDE drive. Example: raidhotremove /dev/md0 /dev/hda5 2. Use raidhotadd to add the SCSI drive. Example: raidhotadd /dev/md0 /dev/sda5 3. Correct your /etc/raidtab file with the changed device. <>< Lance. Scott Patten wrote: > I'm sorry if this is
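As a sketch, the three steps above look like this at the shell, using the same example names (/dev/md0, /dev/hda5 for the outgoing IDE partition, /dev/sda5 for the replacement SCSI partition):
  raidhotremove /dev/md0 /dev/hda5   # 1) drop the IDE partition from the array
  raidhotadd /dev/md0 /dev/sda5      # 2) add the SCSI partition; a resync starts
  cat /proc/mdstat                   #    watch the reconstruction progress
  # 3) then edit /etc/raidtab so the device entry points at /dev/sda5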

Re: new raid5 says overlapping physical units....

2000-01-07 Thread D. Lance Robinson
Roland, The messages are not to be feared. To prevent thrashing on a drive between multiple resync processes, the raid resync routine checks to see if any of the disks in the array are already active in another resync. If so, then it waits for the other process to finish before starting. Thus,

Re: Adding a spare-disk (continued)

1999-12-25 Thread D. Lance Robinson
Hi, By the mdstat shown below, you have a 3 drive raid-5 device with one spare. The [0], [1] and [2] indicate the raid role for the associated disks. Values of [3] or higher are the spare (for a three disk array.) In general, in an 'n' disk raid array, [0]..[n-1] are the disks that are in the arr

Re: Help:Raid-5 with 12 HDD now on degrade mode.

1999-12-20 Thread D. Lance Robinson
Makoto, The normal raid driver only handles 12 disk entries (or slots). Unfortunately, a spare disk counts as another disk slot, and you need a spare slot to rebuild the failed disk. But, with your setup of 12 disk raid 5, you have already defined all the available disk slots. To recover your 12

Re: kernel SW-RAID implementation questions

1999-11-12 Thread D. Lance Robinson
There is a constant specifying the maximum number of md devices. But, there is no variable stating how many active md devices are around. This wouldn't make much sense anyway since the md devices are not allocated sequentially. You can start with md3, for example. You can have a program analyze t
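A minimal sketch of the kind of analysis being suggested, assuming the 0.90 driver's /proc/mdstat format where each started array appears on a line beginning with its name:
  # count the md devices that are currently active (format assumption: 'mdN : active ...')
  grep -c '^md[0-9].*active' /proc/mdstat
  # or list them by name and state, since active devices need not be numbered sequentially
  grep '^md[0-9]' /proc/mdstat | awk '{ print $1, $3 }'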

Re: Tuning readahead

1999-11-12 Thread D. Lance Robinson
Attached is a program that will let you get or set the read ahead value for any major device. You can easily change the value and then do a performance test. <>< Lance. Jakob Østergaard wrote: > > Hi all ! > > I was looking into tuning the readahead done on disks in a RAID. > It seems as thou

Re: Uping the limit of drives in a single raid.

1999-09-30 Thread D. Lance Robinson
Jakob Østergaard wrote: > IIRC the 12 disk limit is a ``feature''. Actually you can have up to 15 disks. Simply > grep for the 12 disk constant in the raidtools and flip it up to 15. You can't go > further than that though. I forget the exact number, but Ingo said that the drive count can be ch

Re: Slower read access on RAID-1 than regular partition

1999-09-16 Thread D. Lance Robinson
Optimizing the md driver for Bonnie, IMHO, is foolishness. Bonnie is a sequential read/write test and does not produce numbers that mean much in typical data access patterns. Example: the read_ahead value is bumped way up (1024), which kills performance when doing more normal accesses. Linux's aver

Re: the 12 disk limit

1999-08-30 Thread D. Lance Robinson
Lawrence, If you don't care about being 'standard', there is plenty of fluff in the superblock to make room for more disks. I don't know how well behaved all the tools are at using the symbolic constants though. To support 18 devices, you will need to allow at least 19 disks (one for the spare/r

Re: Why RAID1 half-speed?

1999-08-30 Thread D. Lance Robinson
Hi Mike, You are using a very small chunk size. Increase this number to 128. I think you may need to remake the array though. This is kind of silly, since in RAID-1 the data isn't laid out any differently for different chunk sizes, as it is for other raid personalities. It would be nice to be able to ju

Re: seeking advice for linux raid config

1999-07-21 Thread D. Lance Robinson
James, There are currently 128 possible SCSI disk devices allocated in the device map--see linux/Documentation/devices.txt . Now, each of these supports partitions 1..15 (lower 4 bits) with 0 being the raw device, and the other bits for the base device are mapped into various places. There is a s
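A worked example of that numbering, assuming the SCSI-disk allocation in linux/Documentation/devices.txt (16 minors per disk on major 8, the low 4 bits selecting the partition, 0 being the whole disk):
  # minor = disk_index * 16 + partition
  DISK=1; PART=5                  # second disk, partition 5 -> /dev/sdb5
  echo $(( DISK * 16 + PART ))    # prints 21
  ls -l /dev/sdb5                 # should show block major 8, minor 21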

Re: RAID-0 Slowness

1999-07-01 Thread D. Lance Robinson
ng the requests). 128KB may work just as well, but this exceeds the size of some cache buffers and some device drivers cannot request more than 64KB in one request. My two cents worth. <>< Lance. Marc Mutz wrote: > D. Lance Robinson wrote: > > > > Try bumping your chunk-s

Re: RAID-0 Slowness

1999-06-30 Thread D. Lance Robinson
Try bumping your chunk-size up. I usually use 64. When this number is low, you cause more scsi requests to be performed than needed. If really big ( >=256 ), RAID 0 won't help much. <>< Lance. Richard Schroeder wrote: > Help, > I have set up RAID-0 on my Linux Redhat 6.0. I am using RAID-0 > (s
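For reference, the chunk-size is set in /etc/raidtab; a hypothetical raidtools-0.90 style stanza with a 64KB chunk might look like this (device names are placeholders, not taken from the original report):
  raiddev /dev/md0
      raid-level            0
      nr-raid-disks         2
      persistent-superblock 1
      chunk-size            64
      device                /dev/sda1
      raid-disk             0
      device                /dev/sdb1
      raid-disk             1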

Re: What hardware do you recommend for raid?

1999-06-14 Thread D. Lance Robinson
Hi Lucio, Lucio Godoy wrote: > > The idea of using raid is to add more disks onto the scsi > controller (Hot adding ?) when needed and combine the newly > added disk to the previous disks as one physical device. > > Is it possible to add another disk without having to switch off the > machine?
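Lance's reply is truncated above. Purely as a hedged illustration of one mechanism available at the time (not necessarily what the reply recommended), a disk could be hot-added at the SCSI layer through /proc/scsi/scsi, the counterpart of the 'scsi remove-single-device' command that appears elsewhere in this archive:
  # host/channel/id/lun are placeholders for the new disk's address
  echo "scsi add-single-device 0 0 3 0" > /proc/scsi/scsi
  cat /proc/scsi/scsi      # the new disk should now be listed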

Re: How to read /proc/mdstat

1999-05-30 Thread D. Lance Robinson
To identify the spare devices through /proc/mdstat... 1) Look for the [#/#] value on a line. The first number is the number of disks in a complete raid device, as defined. Let's say it is 'n'. 2) The raid role numbers [#] following each device indicate its role, or function, within the raid set. Any
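An illustrative example of the format being described, assuming the 0.90 driver's output (the line below is made up, not taken from the original mail):
  cat /proc/mdstat
  # md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0] ... [3/3] [UUU]
  # [3/3] gives n = 3, so roles [0]..[2] are the active disks and sdd1, with role [3], is the spare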

Re: raid1 on ide decreases read performance

1999-05-29 Thread D. Lance Robinson
Don't start to think that Bonnie gives real world performance numbers. It gives single tasking sequential access throughput values. Sure Bonnie's numbers have some value, but don't think that its results match typical system access patterns. The performance difference with Raid-1 is seen when doi

Re: raid1 on ide decreases read performance

1999-05-28 Thread D. Lance Robinson
> > > > The bottom line: Read performance for a RAID-1 device is better than a > > single (JBOD) device. The bigger the n in n-way mirroring gives better > > read performance, but slightly worse write performance. > > > But using n-way mirrors will also increase cpu utilization during reads > - >

Re: raid1 on ide decreases read performance

1999-05-28 Thread D. Lance Robinson
Osma, RAID-1 does read balancing, which may(?) be better than striping. Each read request is checked against the previous request: if it is contiguous with the previous request, it uses the same device; otherwise it switches to the next mirror. This process cycles through the mirrors (n-way mirro

Re: Add expansion of exisiting RAID 5 config in software RAID?

1999-05-27 Thread D. Lance Robinson
The answer is still the same (May 1999). <>< Lance. Scott Smyth wrote: > > I would like to explore the requirements of expanding > RAID 0,4, and 5 levels from an existing configuration. > For example, if you have 3 disks in a RAID 5 configuration, > you currently cannot add a disk to the RAID 5

Fix for /proc/mdstat & raidstop panic

1999-05-13 Thread D. Lance Robinson
Hi all, Attached is a fix for a problem that happens when /proc/mdstat is read while a raid device is being stopped. A panic could result. Not many users are reading /proc/mdstat much or stopping a raid device manually, but this problem caused us many headaches. The problem happens something lik

Re: Swap on raid

1999-05-10 Thread D. Lance Robinson
Hi, You can run a system without a swap device. But if you do 'swapoff -a' _after_ a swap device failure, you are dead (if swap had any virtual data stored in it.) 'swapoff -a' copies virtual data stored in the swap device to physical memory before closing the device. This is much different than

system panic when reading /proc/mdstat while doing raidstop.

1999-05-07 Thread D. Lance Robinson
There seems to be a major problem when reading /proc/mdstat while a raid set is being stopped. This conflict will very rarely be seen, but I have a daemon that monitors /proc/mdstat every two seconds, and once in a while the system panics when doing testing. While running the script below w
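The script itself is not reproduced here; the following is only a hypothetical sketch of the kind of test described, with one loop polling /proc/mdstat while a placeholder array is repeatedly stopped and restarted:
  while true; do cat /proc/mdstat > /dev/null; done &   # stand-in for the monitoring daemon
  POLLER=$!
  for i in 1 2 3 4 5; do
      raidstop  /dev/md0
      raidstart /dev/md0
  done
  kill $POLLER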

Re: RAID+devfs patch for new kernel?

1999-05-01 Thread D. Lance Robinson
Hi Steve, I made the patches that are on Richard's site for raid+devfs. Unfortunately, I was having too many problems with devfs on my PowerPC system and had to solve problems without devfs. I still have a patch file that I used to help create the raid+devfs patch. I don't know if it fixes all th

Memory buffer corruption with Raid on PPC

1999-04-14 Thread D. Lance Robinson
I have linux 2.2.3 with raid014519990309.. patch. On a PPC (Mac G3) system, I am getting what seems to be memory buffer corruption when using raidstart. The same kernel source run with i386 architecture seems to be fine. To show the problem, I do something like the following... # cd ~me # gc

Re: Day 7 and still no satisfaction

1999-04-02 Thread D. Lance Robinson
Carl, The 2.2.4 kernel does not have the latest raid code. But, the raid patches do not yet cleanly apply to the 2.2.4 kernel. I suggest you start with the 2.2.3 kernel, apply the appropriate raid patches (raid0145-19990309-2_2_3.gz), and get the latest raidtools (raidtools-19990309-0_90_tar.gz)
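As a sketch, and assuming the archives were downloaded into /usr/src with the names given above (the unpacked directory name and patch level are assumptions), the sequence would look roughly like:
  cd /usr/src/linux                                # a clean 2.2.3 source tree
  zcat ../raid0145-19990309-2_2_3.gz | patch -p1   # apply the raid patch, then rebuild the kernel
  cd /usr/src
  tar xzf raidtools-19990309-0_90_tar.gz
  cd raidtools-0.90 && ./configure && make && make install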

Re: Filesystem corruption (was: Re: Linux 2.2.4 & RAID - success report)

1999-03-31 Thread D. Lance Robinson
I have also experienced file system corruption with 2.2.4. The problem most likely lies in fs/buffer.c, which the raid patch had a conflict with. <>< Lance. Tony Wildish wrote: > this sounds to me like bad memory. I had a very similar problem recently > and it was a bad SIMM. I was luc

raid5: md0: unrecoverable I/O error for block x

1999-03-12 Thread D. Lance Robinson
Hi, If I "scsi remove-single-device" two devices from a RAID5, I would expect the RAID device to eventually fail itself. But it seems to be in some sort of loop spitting out "raid5: md0: unrecoverable I/O error for block x", where x seems to be cyclic. Top shows that raid5d is taking 99% of
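For context, 'scsi remove-single-device' refers to the /proc/scsi/scsi interface; a sketch of the kind of removal being tested, with the host/channel/id/lun values as placeholders:
  echo "scsi remove-single-device 0 0 2 0" > /proc/scsi/scsi   # detach one member disk
  cat /proc/mdstat    # check whether the array has marked the member as failed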

read_ahead in md driver.

1999-03-12 Thread D. Lance Robinson
Hi, I have noticed that the read_ahead value is set to 1024 in the md driver. Why is this value so large? I would think a value of 128 or so would be more appropriate. <>< Lance.

md: bug in file raid5.c, line 666 (line of raid5_error code)

1999-02-15 Thread D. Lance Robinson
Hi, I am doing some tests with raid. I will probably have more posts on other situations, but here is a situation that causes raid problems... scenario: 1) mkraid /dev/md/0 # raid5, three drives, no spare (using devfs) 2) Wait for resync to complete 3) Disable one of the drives. 4) mke2fs
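A sketch of the scenario as shell steps, assuming devfs naming as in the report and leaving step 3 as a comment since the report doesn't say how the drive was disabled:
  mkraid /dev/md/0     # 1) raid5, three drives, no spare, per /etc/raidtab
  cat /proc/mdstat     # 2) repeat until the initial resync has completed
  # 3) disable one of the member drives (method not specified)
  mke2fs /dev/md/0     # 4) make the filesystem on the now-degraded array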

Re: disconnecting live disks

1999-02-05 Thread D. Lance Robinson
steve rader wrote: > > Some eec person once told me that disconnecting live molex > (power) scsi connectors can kill a disk drive. And I'm also > not comfortable futzing with scsi connectors on live busses. > > I assume the preferred method is to put the disk-to-kill > on an external power suppl

Re: [BUG] v2.2.0 heavy writing at raid5 array kills processes

1999-02-05 Thread D. Lance Robinson
Markus Linnala wrote: > > v2.2.0 heavy writing at raid5 array kills processes randomly, including init. > > A normal user can force random processes into an out-of-memory > situation when writing stuff at raid5 array. This makes the raid > > I get 'Out of memory for init. ' etc. with following simple

Re: Physical device tracking....

1999-01-29 Thread D. Lance Robinson
James, First of all, you probably want to reboot. This will rename your devices to their typical values. To add a device into a failed raid slot, you can use the raidhotadd command. Do something like: raidhotadd /dev/md0 /dev/hdc2 This will add the device to the raid set and start a res

Re: [BUG] v2.2.0 heavy writing at raid5 array kills processes

1999-01-29 Thread D. Lance Robinson
I have also noticed this type of problem. It seems as though the RAID5 driver generates a growing write backlog and keeps allocating new buffers when new asynchronous write requests get in. Eventually it reserves all the available physical memory. Trying to swap data to virtual memory storage wou

Where is 2.1.131-ac11 kernel

1998-12-18 Thread D. Lance Robinson
I've been hearing about 2.1.131-ac9, and now 2.1.131-ac11. What does the -acX mean and where is it available? Thanks, <>< Lance.

Re: raid0145 & devfs v79

1998-11-30 Thread D. Lance Robinson
Eric van Dijken wrote: > Is there somebody working on joining the devfs patch and the raid patch in > the linux kernel (2.1.130) ? I am planning on working on this issue sometime this week. <>< Lance.

Raid5 pauses when doing mk2efs on PowerPC

1998-11-16 Thread D. Lance Robinson
Hi all, The RAID5 md driver pauses for 10-11 seconds, many times, while doing a mke2fs. The pauses start after 300-400 groups have been written, then a small amount of transfers happen between pauses until the process is done. The spurts of transfers between pauses range from .01 seconds to ma