Re: 2.2.16 RAID patch

2000-06-13 Thread Ingo Molnar


On Tue, 13 Jun 2000, Marc Haber wrote:

 A kernel patched this way doesn't build with Debian's kernel package.
 Complains "The version number 2.2.16-RAID is not all lowercase. Stop."
 
 Could this be changed to 2.2.16-raid for future versions or should I
 better get in touch with kernel-package's maintainer?

if there is any good reason for this rule then i can change it to -raid.
Does anyone know why uppercase characters are a problem?

Ingo




MD_BOOT is _flawed_

2000-06-13 Thread Ingo Molnar


On Tue, 13 Jun 2000, Neil Brown wrote:

 One way is by setting the partition type of the relevant partitions.
 This is nice and easy, but requires you to use MSDOS style partition
 tables (which only 99.4% of Linux users do:-), and works fine for
 RAID0 or 1 or 5 or Linear.

no, (and i told you this before) it does not need MSDOS-style partition
tables. Linux's partition system is 'generic', and when i implemented this
i only added the 'lowlevel glue' to support MSDOS-style partitions (that
was what i could test). It's trivial to add code for every other
partitioning type as well, such as BSD disklabels, which are available on
every platform.

(two minutes later) In fact i've just added BSD-disklabel autostart
support to my tree, it's a two-liner:

--- msdos.c.orig	Mon Jun 12 03:27:25 2000
+++ msdos.c	Tue Jun 13 00:46:42 2000
@@ -256,6 +256,8 @@
 		} /* if the bsd partition is not currently known to linux, we end
 		   * up here 
 		   */
+		if (bsd_p->p_fstype == LINUX_RAID_PARTITION)
+			md_autodetect_dev(MKDEV(hd->major, current_minor));
 		add_gd_partition(hd, current_minor, bsd_p->p_offset, bsd_p->p_size);
 		current_minor++;
 	}

this is the major reason why i consider MD_BOOT an inferior solution, and
i still intend to phase it out (gradually, later on). We do not need two
ways of starting up arrays at boot time, and superblock-less arrays are
dangerous anyway. Especially as MD_BOOT is fundamentally inferior.

 The other way is to explicitly tell the kernel via md= options. This
[...]
 doesn't easily deal with devices changing name (as scsi devices can do
 when you plug in new devices).

it also doesn't deal with disk failures. Autostart is able to start up a
failure-resistant array (such as RAID1/RAID5) even if one of the disks has
failed. MD_BOOT cannot deal with certain types of disk failures.

 So both have short comings, [...]

no, only MD_BOOT has shortcomings, and i'm very convinced it will be
phased out. The only reason i accepted it is that some people want to
start up (legacy) non-persistent arrays at boot-time. I'm going to remove
the ability to start up persistent arrays via MD_BOOT, so that people do
not get into the habit of starting up persistent arrays in an inferior way
with MD_BOOT. It's a pure compatibility thing; the MD_BOOT code is short
and localized.

 be addressed, and probably will over the months, but in any case, it's
 nice to have a choice (unless it is confusing I guess).

MD_BOOT cannot be fixed. It will _not_ be able to start up arrays if the
device name changes (eg. due to a failure), no matter how hard you try.
And yes, MD_BOOT is confusing.

 Me: I choose MD_BOOT because I like explicit control.

Persistent arrays are _not_ identified by their (temporary) device names.
Anything short of autostart arrays does not make full use of Linux-RAID's
capabilities to deal with various failure scenarios, and is fundamentally
flawed. (if you want to have an explicit pre-rootmount startup method then
you can still use initrd.)

Ingo




[patch] RAID 0/1/4/5 release, raid-2.4.0-test1-ac15-B4

2000-06-12 Thread Ingo Molnar


you can find the latest 2.4 RAID code at:

http://www.redhat.com/~mingo/raid-patches/raid-2.4.0-test1-ac15-B4

this is against the latest Alan Cox kernel (ac15), which can be found at:

http://www.kernel.org/pub/linux/kernel/people/alan/2.4.0test

which is against the stock 2.4.0-test1 kernel. I'd urge every 2.4 RAID
user to upgrade to the ac- kernels and this RAID patch, as it fixes
critical bugs. Users of the production 2.2-based RAID code should not
upgrade yet.

this release contains most of the fixes from Neil Brown and Jakob
Oestergaard. (thanks Neil and Jakob!) (i've cleaned up those patches and
fixed bugs in them as well) It also contains the one-liner bugfix from Anton
Altaparmakov.

This RAID release finally also adds Mika Kuoppala's RAID1 read balancing
code, which is a great speedup for RAID1 systems. (cool stuff Mika!)

if any bug is still present in this release then please resend the
bugreport. Please resend patches if any of them didn't make it in (most
likely due to cleanliness issues.)

i'm also very interested in slowdowns relative to 2.2+latest_RAID, for all
RAID levels - do they still happen with this patchset as well?

i've tested the patch and it's stable under all circumstances i could
reproduce - be careful nevertheless. (RAID0/RAID1/RAID5 under SMP is
tested)

Ingo




2.2.16 RAID patch

2000-06-12 Thread Ingo Molnar


the latest 2.2 (production) RAID code against 2.2.16-final can be found
at:

http://www.redhat.com/~mingo/raid-patches/raid-2.2.16-A0

let me know if you have any problems with it.

Ingo





Re: 2.2.16 RAID patch

2000-06-12 Thread Ingo Molnar


On Mon, 12 Jun 2000, Stephen Frost wrote:

   Didn't appear to patch cleanly against a clean 2.2.16 tree, error
 was in md.c and left a rather large .rej file..

ouch, right - i've uploaded a new patch. (this problem was caused by a bug
in creating the patch)

Ingo




Re: 2.2.16 RAID patch

2000-06-12 Thread Ingo Molnar


On Mon, 12 Jun 2000, Stephen Frost wrote:

  ouch, right - i've uploaded a new patch. (this problem was caused by a bug
  in creating the patch)
 
   Much nicer, patched cleanly, thanks.  Now time to see if it compiles
 and works happily. ;)

it should :-) the problem was in creating the patch - the code itself
didn't change.

Ingo




RE: [patch] RAID 0/1/4/5 release, raid-2.4.0-test1-ac15-B4

2000-06-12 Thread Ingo Molnar


On Mon, 12 Jun 2000, Darren Evans wrote:

 can raidtools-19990824-0.90.tar.gz be used with your patch available
 on http://people.redhat.com/mingo/raid-patches/raid-2.2.16-A0 for new
 style RAID on a 2.2.16 kernel instead of the raid0145-19990824-2.2.11
 patch.

yep.

 I noticed the name had an A0 at the end, presumably that's alpha 0 
 release on 2.2.16?

no, it's the 'stable' release. 'A0' is just an internal id for me.

Ingo




Re: Benchmarks, raid0 performance, 1,2,3,4 drives

2000-06-12 Thread Ingo Molnar


could you send me your /etc/raidtab? I've tested the performance of 4-disk
RAID0 on SCSI, and it scales perfectly here, as far as hdparm -t goes.
(could you also send the 'hdparm -t /dev/md0' results - do you see a
degradation in those numbers as well?)

it could either be some special thing in your setup, or an IDE+RAID
performance problem.

Ingo




Re: Disk failure-Error message indicates bug

2000-05-19 Thread Ingo Molnar


On Fri, 19 May 2000, Neil Brown wrote:

 - md2 checks b_rdev to see which device was in error. It gets confused
   because sda12 is not part of md2.
 
 The fix probably involves making sure that b_dev really does refer to
 md0 (a quick look at the code suggests it actually refers to md2!) and
 then using b_dev instead of b_rdev.

the fix i think is to not look at b_rdev in the error path (and anywhere
else), at all. Just like we don't look at rsector. Do we need that
information? b_rdev is in fact just for RAID0 and LINEAR, and i believe it
would be cleaner to get rid of it altogether, and create a new
encapsulated bh for every RAID0 request, like we do in RAID1/RAID5.
OTOH handling this is clearly more complex than RAID0 itself.

 Basically, b_rdev and b_rsector cannot be trusted after a call to
 make_request, but they are being trusted.

yep. What about this solution:

md.c (or buffer.c) implements a generic pool of IO-related buffer-heads.
This pool would have deadlock assurance, and allocation from this pool
could never fail. This would already reduce the complexity of raid1.c and
raid5.c bh-allocation. Then raid0.c and linear.c is changed to create a
new bh for the mapping, which is hung off bh-b_dev_id. bh-b_rdev would
be gone, ll_rw_blk looks at bh-b_dev. This also simplifies the handling
of bhs.

i like this solution much better, and i don't think there is any
significant performance impact (starting IO is heavy anyway), but it would
clean up this issue once and for all.
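
to make this concrete, here is a minimal sketch (not code from any patch)
of what a raid0 ->make_request() along these lines could look like.
mdbh_alloc(), mdbh_end_io() and raid0_map_sector() are hypothetical names;
only the overall shape - clone a bh from a reserved pool, hang the original
off b_dev_id, never touch b_rdev - is what the paragraph above proposes:

        static int raid0_make_request(request_queue_t *q, mddev_t *mddev,
                                      int rw, struct buffer_head *bh)
        {
                struct buffer_head *mbh;
                kdev_t rdev;
                unsigned long rsector;

                /* resolve the single mapping step; on failure the IO error
                 * is initiated right here, in the layer that noticed it */
                if (raid0_map_sector(mddev, bh, &rdev, &rsector)) {
                        bh->b_end_io(bh, 0);
                        return 0;
                }

                mbh = mdbh_alloc();             /* reserved pool, cannot fail */
                *mbh = *bh;                     /* same data page and size (a real
                                                 * version would copy selected fields) */
                mbh->b_dev     = rdev;          /* the mapped member device */
                mbh->b_blocknr = rsector / (bh->b_size >> 9);
                mbh->b_dev_id  = bh;            /* so completion finds the original bh */
                mbh->b_end_io  = mdbh_end_io;   /* ends the original bh, frees mbh */

                generic_make_request(rw, mbh);  /* b_rdev/b_rsector never used */
                return 0;
        }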

Ingo




Re: Can't recover raid5 1 disk failure - Could not import [dev21:01]!

2000-04-12 Thread Ingo Molnar


On Wed, 12 Apr 2000, Darren Nickerson wrote:

 So no problem, I have 3 of the four left, right? The array was marked [_UUU] 
 just before I power cycled (the disk was crashing) and since it had been 
 marked faulty, I was able to raidhotremove the underlined one.
 
 But now, it won't boot into degraded mode. As I try to boot redhat to single 
 user, I am told:

 md: could not lock [dev 21:01], zero size?
   Marking faulty
 Could not import [dev 21:01]!
 Autostart [dev 21:01] failed!

this happens because raidstart looks at the first entry in /etc/raidtab to
start up an array. If that entry is damaged, it does not cycle through the
other entries to start up the array. The workaround is to permute the
entries in /etc/raidtab so that an undamaged disk comes first. (make sure
to restore the original order afterwards)

if you switch to boot-time autostart then this should not happen: RAID
partitions are first collected, then started up, and the code should be
able to start up the array no matter which disk got damaged.

Ingo




Re: Can't recover raid5 1 disk failure - Could not import [dev21:01]!

2000-04-12 Thread Ingo Molnar


On Wed, 12 Apr 2000, Darren Nickerson wrote:

 I'm confused. I thought I WAS boot-time autostarting.  RedHat's
 definitely autodetecting and starting the array very early in the boot
 process, but I'm clearly not entirely properly setup here because my
 partition types are not 0xfd, which seems to be important for some
 reason or another. [...]

well, it was boot-time 'very early' autostarting, but not
RAID-autostarting in the classic sense. I think i'll fix raidstart to
simply iterate through all available partitions, until one is started up
correctly (or until all entries fail). This still doesn't cover all the
cases which are covered by the 0xfd method (such as card failure, device
reshuffling, etc.), but it should cover your case (which is definitely the
most common one).
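
a rough sketch of what that raidstart change could look like (illustrative
only - do_raidstart_one() stands in for the existing 'read this member's
superblock and issue the start ioctl' step, it is not an existing raidtools
function):

        /* try every raidtab member of the array in turn, instead of
         * giving up when the first listed device is missing or damaged */
        static int raidstart_try_all(const char *md_name,
                                     char **disk_name, int nr_disks)
        {
                int i;

                for (i = 0; i < nr_disks; i++) {
                        if (do_raidstart_one(md_name, disk_name[i]) == 0)
                                return 0;   /* started from this member's superblock */
                        /* damaged/missing member: try the next one */
                }
                return -1;                  /* every raidtab entry failed */
        }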

 So, you're saying that the array would have automatically recovered if
 I had had all five partitions set 0xfd?

yes, definitely. Not marking a partition 0xfd is the more conservative
approach from the installer's point of view in a possibly multi-OS
environment; you can always mark it 0xfd later on.

Ingo




[patch] block device stacking support, raid-2.3.47-B6

2000-02-23 Thread Ingo Molnar


Heinz, Andrea, Linus,

various ideas/patches regarding block device stacking support were
floating around in the last couple of days, here is a patch against
vanilla 2.3.47 that solves both RAID's and LVM's needs sufficiently:

http://www.redhat.com/~mingo/raid-patches/raid-2.3.47-B6

(also attached) Andrea's patch from yesterday touches some of the issues
but RAID has different needs wrt. ->make_request():

- RAID1 and RAID5 need truly recursive ->make_request() stacking because
  the relationship between the request-bh and the IO-bh is not 1:1. In the
  case of RAID0/linear and LVM the mapping is 1:1, so no on-stack
  recursion is necessary.

- re-grabbing the device queue in generic_make_request() is necessary,
  just think of RAID0+LVM stacking.

- IO-errors have to be initiated in the layer that notices them.

- i don't agree with moving the ->make_request() function to be
  a per-major thing; in the (near) future i'd like to implement RAID
  personalities via several sub-queues of a single RAID-blockdevice,
  avoiding the current md_make_request internal step completely.

- renaming ->make_request_fn() to ->logical_volume_fn is both misleading
  and unnecessary.

i've added the good bits (i hope i found all of them) from Andrea's patch
as well: the end_io() fix in md.c, the ->make_request() change returning
IO errors, and avoiding an unnecessary get_queue() in the fast path.

the patch changes blkdev->make_request_fn() semantics, but these work
pretty well both for RAID0, LVM & RAID1/RAID5:

  (bh->b_dev, bh->b_blocknr) = just like today, never modified, this is
the 'physical index' of the buffer-cache.

  internally any special ->make_request() function is forbidden to access
  b_dev and b_blocknr too, b_rdev and b_rsector have to be used.
  ll_rw_block() correctly installs an identity mapping first, and all
  stacked devices just iterate one more step.

  bh->b_rdev: the 'current target device'
  bh->b_rsector: the 'current target sector'

  the return values of ->make_request_fn():
ret == 0: don't continue iterating and don't submit IO
ret  > 0: continue iterating
ret  < 0: IO error (already handled by the layer which noticed it)

  we explicitly rely on ll_rw_blk getting the BH_Lock and not calling
  ->make_request() on this bh more than once.

with these semantics all the variations are possible, it's up to the
device to use the one it likes best:

 - device resolves one mapping step and returns 1 (RAID0, LVM)

 - device calls generic_make_request() and returns 1 (RAID1, RAID5)

 - device resolves recursion internally and returns 0 (future RAID0),
  returns 1 if recursion cannot be resolved internally.

generic_make_request() returns 0 if it has submitted IO - thus
generic_make_request() can also be used as a queue's ->make_request_fn()
function - it's completely symmetric. (not that anyone would want to do
this)
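
as a sketch, a generic_make_request() implementing exactly the return-value
protocol above could be as small as the following (this is an illustration,
not the patch text; blk_get_queue() is assumed to map the current b_rdev to
its queue):

        int generic_make_request(int rw, struct buffer_head *bh)
        {
                request_queue_t *q;
                int ret;

                do {
                        /* re-grab the queue every iteration: a stacked device
                         * may have re-targeted bh->b_rdev (RAID0 on LVM etc.) */
                        q = blk_get_queue(bh->b_rdev);
                        if (!q) {
                                /* IO error initiated by the layer that notices it */
                                bh->b_end_io(bh, 0);
                                return -1;
                        }
                        ret = q->make_request_fn(q, rw, bh);
                } while (ret > 0);      /* > 0: one more mapping step to resolve */

                return ret;             /* 0: IO submitted, < 0: error already handled */
        }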

NOTE: a device might still resolve stacking internally, if it can. Eg. the
next version of raid0.c will do a while loop internally if we map
RAID0->RAID0. The performance advantage is obvious: no indirect function
calls and no get_queue(). LVM could do the same as well.

(the patch modifies lvm.c to reflect these new semantics, to not rely on
b_dev and b_blocknr and to not call generic_make_request(), and fixes the
lvm.c hack avoiding MD-LVM stacking. These changes are untested.)

with this method it was pretty straightforward to add stacked RAID0 and
linear device support, here is a sample RAID0+RAID0 = RAID0 stacking:

[root@moon /root]# cat /proc/mdstat
Personalities : [linear] [raid0]
read_ahead 1024 sectors
md2 : active raid0 mdb[1] mda[0]
  1661472 blocks 4k chunks

md1 : active raid0 sdf1[1] sde1[0]
  830736 blocks 4k chunks

md0 : active raid0 sdd1[1] sdc1[0]
  830736 blocks 4k chunks

unused devices: <none>
[root@moon /root]# df /mnt
Filesystem   1k-blocks  Used Available Use% Mounted on
/dev/md2   1607473    13   1524387   0% /mnt

The LVM changes are not tested. The RAID0/linear changes compile/boot/work
just fine and are reasonably well-tested and understood.

any objections?

Ingo



--- linux/include/linux/raid/md_k.h.orig	Wed Feb 23 06:00:20 2000
+++ linux/include/linux/raid/md_k.h	Wed Feb 23 06:22:02 2000
@@ -75,6 +75,8 @@
 
 extern inline mddev_t * kdev_to_mddev (kdev_t dev)
 {
+	if (MAJOR(dev) != MD_MAJOR)
+		BUG();
 	return mddev_map[MINOR(dev)].mddev;
 }
 
@@ -213,7 +215,7 @@
 	char *name;
 	int (*map)(mddev_t *mddev, kdev_t dev, kdev_t *rdev,
 		unsigned long *rsector, unsigned long size);
-	int (*make_request)(mddev_t *mddev, int rw, struct buffer_head * bh);
+	int (*make_request)(request_queue_t *q, mddev_t *mddev, int rw, struct buffer_head * bh);
 	void 

Re: [patch] block device stacking support, raid-2.3.47-B6

2000-02-23 Thread Ingo Molnar


On Wed, 23 Feb 2000, Andrea Arcangeli wrote:

 - renaming ->make_request_fn() to ->logical_volume_fn is both misleading
   and unnecessary.
 
 Note that with my proposal it was make_request_fn that was misleading, because
 none of the code run within the callback had anything to do with the
 make_request code.

ok, your variant was more like a ->map_buffer_fn() thing - like the old
md_map() stuff. ->make_request_fn() is closer to 'make request' in the
context of RAID1 and RAID5. (even if this is not visible now).

  - device resolves recursion internally and returns 0 (future RAID0),
   returns 1 if recursion cannot be resolved internally.
 
 I don't think it's worth handling such a case if it costs something for the
 other cases. I'll check and test the code on the LVM side soon.

the cost is only in the device (not in the generic block IO code); it's a
'if (MAJOR(bh->b_rdev) == MD_MAJOR) goto repeat;' type of thing (analogous
in the LVM code), nothing more. We will see how common it gets - it's just
a nice side-effect that the possibility is there.

Ingo





Re: Current raid driver for 2.3.42?

2000-02-09 Thread Ingo Molnar


On Tue, 8 Feb 2000, Mike Panetta wrote:

 I am looking for an updated raid driver for kernel 2.3.42+ Does such a
 beast exist?  I looked on Ingo's site and only found a patch for
 kernel 2.3.40.  This patch did not patch cleanly at all.

the newest RAID code is being merged into 2.3.43 right now. The RAID0 and
linear changes, plus most of the md.c 'infrastructure' changes are already
in pre5-2.3.43. The next step is RAID1 (including Mika Kuoppala's nice
read balancing patch) and RAID4/5.

WARNING: while this is a 'full merge' (ie. all the latest 0.90 stuff and
more will show up), and RAID0/linear is pretty functional already, do not
consider this to be near the reliability of 2.2+latest_raid, for a couple
of weeks, at least.

-- mingo



Re: Current raid driver for 2.3.42?

2000-02-09 Thread Ingo Molnar


On Wed, 9 Feb 2000, James Manning wrote:

 [ Wednesday, February  9, 2000 ] Ingo Molnar wrote:
  the newest RAID code is being merged into 2.3.43 right now.
 
 (Hopefully) quick question.  Will KNI work?

i'll make sure it works (it certainly didn't in the past) - xor.c can
afford full FPU saves so it needs no generic kernel support.

-- mingo



Re: raid145 patches for 2.2.14 anywhere?

2000-01-14 Thread Ingo Molnar


On Thu, 13 Jan 2000, Thomas Gebhardt wrote:

 just looked for the raid for 2.2.13 or 2.2.14 in the kernel archive.
 The last patches that I have found are for 2.2.11 and at least one
 hunk cannot be applied to the newer kernel sources without making
 the hands dirty. Can I get the patches for the newer kernels
 anywhere?

it's at:

http://www.redhat.com/~mingo/raid-2.2.14-B1

it applies cleanly to vanilla 2.2.14, do a 'patch -p0 < raid-2.2.14-B1'.

-- mingo



Re: [FAQ-answer] Re: soft RAID5 + journalled FS + power failure =problems ?

2000-01-12 Thread Ingo Molnar


On Wed, 12 Jan 2000, Gadi Oxman wrote:

 As far as I know, we took care not to poke into the buffer cache to
 find clean buffers -- in raid5.c, the only code which does a find_buffer()
 is:

yep, this is still the case. (Sorry Stephen, my bad.) We will have these
problems once we try to eliminate the current copying overhead.
Nevertheless there are bad (illegal) interactions between the RAID code
and the buffer cache; i'm cleaning this up for 2.3 right now. Especially
the reconstruction code is a rathole. Unfortunately blocking
reconstruction if b_count == 0 is not acceptable because several
filesystems (such as ext2fs) keep metadata caches around (eg. the block
group descriptors in the ext2fs case) which have b_count == 1 for a longer
time.

If both power and a disk fails at once then we still might get local
corruption for partially written RAID5 stripes. If either power or a disk
fails, then the Linux RAID5 code is safe wrt. journalling, because it
behaves like an ordinary disk. We are '100% journal-safe' if power fails
during resync. We are also 100% journal-safe if power fails during
reconstruction of failed disk or in degraded mode.

the 2.3 buffer-cache enhancements i wrote ensure that 'cache snooping' and
adding to the buffer-cache can be done safely by 'external' cache
managers. I also added means to do atomic IO operations which in fact are
several underlying IO operations - without the need of allocating a
separate bh. The RAID code uses these facilities now.

Ingo



Re: WARNING: raid for kernel 2.2.11 used with 2.2.14 panics

2000-01-06 Thread Ingo Molnar


On Wed, 5 Jan 2000, Robert Dahlem wrote:

 I just wanted to warn everybody not to use raid0145-19990824-2.2.11
 together with kernel 2.2.14: at least in my configuration (two IDE
 drives with RAID-1, root on /dev/mdx) the kernel panics with "B_FREE
 inserted into queues" at boot time.

this should be fixed in:

http://www.redhat.com/~mingo/raid-2.2.14-B1

let me know if you still have any problem. The problem outlined by
Andrea's patch (which reverses a patch of mine) is solved as well.

-- mingo




Re: raidtools for 2.3.36?

2000-01-06 Thread Ingo Molnar


On Thu, 6 Jan 2000 [EMAIL PROTECTED] wrote:

 I am trying to build a raid0 array with 2 500 MB SCSI disks, using 2.3.36.

2.3.36 is broken wrt. RAID0 (even old RAID0 is broken). The new 2.3 RAID
patch i'm working on for 2.3.36 still has some instabilities in RAID1, but
RAID0 is rock solid. Will send a patch today or tomorrow, even if RAID1 is
still unstable, so that RAID0 (and the related ll_rw_blk.c and buffer.c
changes) can be tested separately.

-- mingo



Re: Help Raid for sparc

1999-11-26 Thread Ingo Molnar


chunksize does have an important meaning in the linear case: it's
'rounding'. We cannot change this unilaterally (it breaks backwards
compatibility), and it does make sense i believe. [certain disks serve
requests which have proper alignment and size faster. I do not think we
should assume that an arbitrarily misaligned IO request will perform
identically.] So i'll fix raidtools to enforce chunksize in the linear
case (maybe introduce a 'rounding' keyword?). 
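
as a tiny illustration of what 'rounding' means here (an example, not the
actual raidtools/linear.c code): each member's usable size is simply
rounded down to a multiple of the chunk size before it is appended to the
array:

        unsigned long linear_rounded_size(unsigned long size_kb,
                                          unsigned long chunk_kb)
        {
                /* round the member's size down to a chunk-size multiple so
                 * every member starts and ends on an aligned boundary */
                return size_kb - (size_kb % chunk_kb);
        }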

Ingo

On Fri, 26 Nov 1999, Jakub Jelinek wrote:

 On Fri, Nov 26, 1999 at 09:43:06AM +0100, [EMAIL PROTECTED] wrote:
  Hallo,
  
  I have a Sparc 10 with Linux6.1 running  I have two disks of 1Gb and
  1.7Gb.
  I would like to do a linear raid but  when I do "raidstart -a /dev/md0
   into shell I receive - /dev/md0: Invalid argument -
  and  into consolle
  (read) sdb1's sb offset:1026048 [events: 20202020]
 md: invalid raid superblock magic on sdb1
 md: sdb1 has invalid sb, not importing!
 could not import sdb1!
 autostart sdb1 failed!
  My kernel is  2.2.12-42 and raidtools-0.90
  If I do mkraid /dev/md0 I receive into shell
  -handling MD device /dev/md0
  analyzing super-block
  disk 0: /dev/sdb1, 1026144kB, raid superblock at 1026048kB
  disk 1: /dev/sdc1, 1720345kB, raid superblock at 1720256kB
  /dev/md0: Invalid argument  and into consolle I receive some messages
  that they are into file attach messages.
  My /etc/raidtab is:
  raiddev /dev/md0
  raid-level linear
  nr-raid-disks 2
  nr-spare-disks 0
  persistent-superblock 1
 
 put
   chunk-size 8
 
 here and redo mkraid (possibly with -f).
 It seems that the kernel is checking chunk size always, while raidtools are
 checking chunk size for raid0,1,4,5 only.
 IMHO kernel should not check chunk size for other raid levels, but if Ingo
 thinks it should, then raidtools should either error on not specified
 chunk-size for other levels as well or supply some default which will not
 trigger the md.c MD_BUG().
 
 Cheers,
 Jakub
 ___
 Jakub Jelinek | [EMAIL PROTECTED] | http://sunsite.mff.cuni.cz/~jj
 Linux version 2.3.18 on a sparc64 machine (1343.49 BogoMips)
 ___
 



Re: Problems with persistant superblocks and drive removal

1999-10-18 Thread Ingo Molnar


i suspect this is what happened: 

md: md0, array needs 12 disks, has 7, aborting.
raid0: disks are not ordered, aborting!

raidstart was still using the old raidtab to start up the array. It has
found an old array's superblock and tried to start it up. Some disks were
not available so the raid0 module refused to run and aborted in a safe
manner.

the behavior you saw is normal (unless my analysis is wrong; i do not
claim that there might not be bugs left). The only way we can guarantee
protection against device reordering is marking RAID-enabled partitions as
autostartable. For that to work on Sparc you'll have to introduce a new
partition type (really a small amount of hacking); only MSDOS partitions
can currently be used for autostart. (because this is what i'm using) 

of course the worst thing that should happen with this are arrays not
getting started up. If anything else happens (messed up superblocks or
corrupted data) then that is a bug. 

-- mingo



Re: Problems with persistant superblocks and drive removal

1999-10-18 Thread Ingo Molnar


On Mon, 18 Oct 1999, Florian Lohoff wrote:

 I created 2 Raid 5s with 6 Disks each. I created them one after another 
 alwas disconnecting the other disks - Both raid 5s were created as
 /dev/md1 - Afterwards i duplicated the md1 entry and created an md2 
 attaching all 12 Disks. 
 
 On startup the raidcode wasnt able to initialize both seperate raid5s.

could you send me the raidtab(s) you used, the commands you used to create
the array and the startup method, plus the bootlog of the failure? And
which driver/patch and raidtools version were you using? 

-- mingo



Re: 2.2.13pre15 SMP+IDE test summary

1999-10-06 Thread Ingo Molnar


On Wed, 6 Oct 1999 [EMAIL PROTECTED] wrote:

 One more pre15 test:
 2.2.13pre15 with Unified IDE 2.2.13pre14-19991003 (two rejects in ide.c, one ok, one probably harmless):
 (5) dual P3 machine: NULL deref after 6 hours (i.e. this pre15 kernel survived longest)

is it correct that this failing kernel didn't have the RAID patch applied?

 I can think of these possible reasons for the SMP problems:
 
 (A) SMP race(s) in IDE driver in original 2.2.13pre15
 (B) SMP-deadlock in raid-2.2.11-patch

(B) is quite unlikely if you do not have it applied and the box still
crashes? My understanding is that others who had IDE+SMP problems could
reproduce it without RAID as well. (RAID0 stresses the hardware harder)

-- mingo



Re: Hotswapping successes?

1999-09-22 Thread Ingo Molnar


On Tue, 21 Sep 1999, Daniel Bidwell wrote:

  who has had success on hotswapping scsi devices in raid configuration?
  on which controllers and kernel versions?
  
 I am usinge a Compaq 2500 with 5 18GB disks, on Debian 2.1, Kernel
 2.2.12 (with raid patches).  We pulled a hotswap disk out and kept
 on using the disk system.  A couple of error messages sputtered out
 on the console and it kept on working.  We inserted a new unformated
 18GB disk and did the raidhotremove/raidhotadd thing and it looked like
 it was rebuilding the raid.  We rebooted the system and it came up
 without the new disk.  I ran fdisk to format the new disk and did the
 raidhotadd thing and it is rebuilding the entire disk.  It takes awile.
 It is still running.

yes, you need to fdisk the new disk properly for autostart to work on the
next startup.

-- mingo



Re: [PATCH] adjustable raid1 balancing (was Re: Slower read accesson RAID-1 than regular partition)

1999-09-18 Thread Ingo Molnar


On Fri, 17 Sep 1999, James Manning wrote:

 Since the previous sysctl code had been ripped out, this was pretty

James, are you patching against the latest RAID source? 2.3.18 has a
painfully outdated RAID driver. (i'm working on porting the newest stuff
to 2.3 right now)

 simple, just pulling back in the code from 2.2.11-ac3.  I'm hoping that
 the sysctl getting ripped out was more for acceptance, since speed-limit
 I still think was a good idea, even as a maximum, as it helped make the
 array more usable...

sure, and it's present and used in the latest RAID driver ...

maybe the fact that you are using the old driver explains why you see bad
RAID1 performance? What performance do you see with the newest RAID driver
on 2.2.12?

-- mingo



Re: Kernel probs...

1999-09-18 Thread Ingo Molnar


On Fri, 17 Sep 1999, David A. Cooley wrote:

 Running Kernel 2.2.11 with the raid patch and all is well...
 I'm wanting to upgrade to the 2.2.12 kernel just because it's newer...
 The 2.2.11 raid patch had some problems on the 2.2.12 source.  Is there any 
 benefit of the 2.2.12 kernel over the 2.2.11 (for sparc) or should I just 
 stay with the 2.2.11 for now?

2.2.12 (and the pre-2.2.13 patches) are better than the 2.2.11 kernel. If
you apply the 2.2.11 patch to 2.2.12 then you'll get a single reject,
which you can safely ignore. (it tries to add something that has already
been merged into the main tree)

-- mingo



Re: RAID0 benchmark

1999-09-02 Thread Ingo Molnar


On 31 Aug 1999, Marc SCHAEFER wrote:

 Now, I just changed to have the 4 disks on the QLOGIC 1080 (U2/LVD),
 then 4 (2 each for each aic7xxx)
 
   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
 Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  2000 19712 97.6 85378 85.8 30903 73.0 26272 97.1 83648 92.1 323.2  3.8
 
 And with just the 4 disks on the QLOGIC:
 
   ---Sequential Output ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
 Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
  2000 19662 97.4 63567 64.3 21655 50.0 26064 96.0 68886 69.6 225.2  2.4

 For me, this means that we are saturating. [...]

i think you are hitting hardware limits. First, getting 85.3 MB/sec out of
your RAID0 array isn't all that bad :) But CPU load seems to be pretty
high, that could already be a limit. Also, DMA load is probably very high
(and coming from several devices) as well. It would be interesting to
check out the very same benchmarks with an identical but higher-clocked
CPU, to see how much the saturation point depends on CPU speed. (this
might not be possible with your system i guess) 

-- mingo



[oops] the limit is 27 disks! (Re: the 12 disk limit)

1999-08-31 Thread Ingo Molnar


On Mon, 30 Aug 1999, D. Lance Robinson wrote:

 #define MD_SB_DESCRIPTOR_WORDS   32
 #define MD_SB_DISKS  20
 #define MD_SB_DISK_WORDS (MD_SB_DESCRIPTOR_WORDS * MD_SB_DISKS)

oops. I've just re-checked the superblock layout calculations to prove you
wrong, but actually it turned out that there was a factor of 2 error in
all previous calculations! (done by several people, not only me) - the
actual 'safe limit' for maximum number of disks is 27...
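
(for reference, a reconstruction of the arithmetic from the md_p.h layout -
it is not spelled out in this mail: the superblock is 4096 bytes, i.e. 1024
32-bit words)

        /*
         *   1024 words   total (4096-byte superblock)
         * -   64 words   generic constant + generic state section
         * -   64 words   personality section
         * -   32 words   descriptor of 'this' disk at the end
         * --------------
         *    864 words   left for the disk table
         *
         *    864 / MD_SB_DESCRIPTOR_WORDS (32) = 27 descriptors
         */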

I'll release a new RAID driver probably tomorrow (hopefully) with these
fixes. It looks like the 'big' superblock changes can wait for some time,
27 isn't all that bad a limit after all ... 

this is a 100% safe solution - arrays bigger than 12 disks will not be
backwards compatible, but there are no other problems.

ugh, this is good news indeed for lots of people, thanks Lance for
pointing this out... 

-- mingo



Re: Why RAID1 half-speed?

1999-08-30 Thread Ingo Molnar


On Mon, 30 Aug 1999, Mike Black wrote:

 I just set up a mirror this weekend on an IDE RAID1 - two 5G disks on the
 same IDE bus (primary and master).

 /dev/hda:
  Timing buffer-cache reads:   64 MB in  0.95 seconds =67.37 MB/sec
  Timing buffered disk reads:  32 MB in  3.28 seconds = 9.76 MB/sec
 
 /dev/md0:
  Timing buffer-cache reads:   64 MB in  0.85 seconds =75.29 MB/sec
  Timing buffered disk reads:  32 MB in  6.10 seconds = 5.25 MB/sec

could you also show the RAID0 results? That one should quite accurately
show the effect of master/slave interaction. Some IDE chipsets do not
handle it with high performance at all. 

-- mingo



Re: the 12 disk limit

1999-08-30 Thread Ingo Molnar


On Mon, 30 Aug 1999, Lawrence Dickson wrote:

I guess this has been asked before, but - when will the RAID code get
 past the 12 disk limit? We'd even be willing to use a variant - our
 customer wants 18 disk RAID-5 real bad. 

yes, this has been requested before. I'm now mainly working on the 2.3/2.4
merge; it's working but it has unearthed bugs in the main kernel. I have
patches for up to ~250 disks per array, but these patches are not proven
and it's a major change in the superblock layout.

until these 'big RAID' patches are merged, as a workaround i suggest
combining two (or more) RAID5 arrays with RAID0 or LINEAR to form a
bigger array. Data safety is not compromised with this; system
administration is a bit more complex. (also you cannot get a better than
1:12 parity/data ratio) On the bright side, this setup provides you more
protection than pure 1:18 RAID5. (eg. if you do it as 2x 1:9, then the two
RAID5 arrays can each lose a disk at once - roughly half of all
simultaneous 2-disk failures land in different sub-arrays and are thus
covered.)

-- mingo



Re: AW: AW: more than 16 /dev/sdx ??

1999-07-22 Thread Ingo Molnar


On Thu, 22 Jul 1999, Schackel, Fa. Integrata, ZRZ DA wrote:

 Thx for all your help. All works fine. I had to rebuild kernel
 and reboot with the new one.

the 'hard limit' for per-array disks is 12. Work is underway to raise this
limit. (the new code boots and works, but some migration issues have to be
taken care of as the new superblock layout is incompatible)

-- mingo



Re: RAID 0+1

1999-07-22 Thread Ingo Molnar


On Thu, 22 Jul 1999, Christopher A. Gantz wrote:

 Also was wondering what was the status of providing RAID 1 + 0
 functionality in software for Linux. 

it works just fine:

  [root@moon /root]# cat /proc/mdstat
  Personalities : [linear] [raid0] [raid1] [raid5]
  read_ahead 1024 sectors
  md2 : active raid1 md1[1] md0[0] 8467136 blocks [2/2] [UU]
  md1 : active raid0 sdf[1] sde[0] 8467200 blocks 8k chunks
  md0 : active raid0 sdd[1] sdb[0] 8467200 blocks 8k chunks

  [root@moon /root]# df /mnt
  Filesystem   1k-blocks  Used Available Use% Mounted on
  /dev/md2   8201416    52   7778008   0% /mnt

you can mirror RAID0 arrays no problem.

-- mingo



Re: Raid and SMP, wont reboot after crash

1999-07-14 Thread Ingo Molnar


On Tue, 13 Jul 1999, Michael McLagan wrote:

Basic config: Supermicro P6DGU, dual Pent III 500MHz, 512M RAM, AIC7890 
 chipset, dual Seagate 39140W 9.1G Metalist drives.  This fails on multiple 
 machines of the same config, so it's not a machine specific hardware problem.
 
Using 2.2.7 (what I've got), raid0145-19990421 and a / RAID1
 partition.  It

do you get the same problems with 2.2.10 + raid0145-19990713?

-- mingo



RELEASE: RAID-0,1,4,5 patch 1999.07.13 for 2.2.10 and 2.0.37

1999-07-13 Thread Ingo Molnar


i have released Linux-RAID 1999.07.13, you can find the patches
raid0145-19990713-2.0.37.gz, raid0145-19990713-2.2.10.gz and
raidtools-19990713-0.90.tar.gz in the usual alpha directory:

http://www.country.kernel.org/pub/linux/daemons/raid/alpha

[mirrors should have synced up by the time you receive this email]

the patch adds the following new features:

- the failed-disk patch from Martin Bene [EMAIL PROTECTED]. With this
  feature it's possible to recreate arbitrary RAID superblock
  state. It also makes it easier to install RAID1-only systems.

- new super-fast PIII/KNI RAID-checksumming assembly routines from
  Zach Brown [EMAIL PROTECTED] and Doug Ledford [EMAIL PROTECTED].
  These new checksumming routines not only speed RAID5 writes and
  reconstruction up significantly on PIII boxes, but also use the
  new KNI cache-control instructions to reduce cache footprint.

- initial SMP threading of RAID5 - RAID5 will be fully SMP
  threaded by 2.4 time.

bugfixes in this release (only the resync bug was serious): 

- some annoying documentation and sample config file errors got
  corrected

- raidtools abort message updated

- the 'hanging resync' bug (triggered by 2.2.8) fixed

- a slightly rewritten version of the small raid0.c additional
  sanity check patch floating around on linux-raid - we'll see
  whether it makes a difference.

- *WARNING*. The 2.0.37 RAID-patch includes the 'EGCS patch', so
  if you have applied the EGCS patch you'll get rejects.

... and other small stuff. Let me know if i've missed something.

While these changes might sound extensive, they are well-tested, thanks to
Mike Black and others. 

enjoy,

-- mingo



Re: RAID and Queuing problem

1999-07-13 Thread Ingo Molnar


On Tue, 13 Jul 1999, jiang wrote:

 I 'd like to know how the queueing commands are organized in a RAID
 system where multi-host and multi-LUN are simultanously supported. Are
 all the queuing commands are threaded or only threaded on a LUN basis? 
 Thanks

I'm not sure i understand your question. If it's about restrictions wrt. 
the execution of IO commands, then the answer is that there is no
restriction in the RAID architecture per se - SCSI commands will be
executed in arbitrary order by the block device and SCSI layer, depending
on various reordering optimizations. The RAID layer itself does its own
optimizations as well. Or is your question about SMP-threading?

-- mingo



Re: Swap on Raid ???

1999-07-13 Thread Ingo Molnar


On Mon, 12 Jul 1999 [EMAIL PROTECTED] wrote:

 The HOWTO states that swapping on RAID is unsafe, and that is probably
 unjustified with the latest RAID patches.

yes, swapping is safe. The warning is _slightly_ justified for RAID1, to be
fair - but i've tried it myself and was unable to reproduce anything bad.
Linux handles resource starvation much better these days.

-- mingo




Re: Newbie: Quick patch question

1999-07-13 Thread Ingo Molnar


On Mon, 12 Jul 1999, Solitude wrote:

 The reason for the question:  I want to build a production box with a root
 raid level 1.  I have this kinda sorta working on a test box right now.  
 I have not patched the kernel at all.  I compiled it myself to support
 initrd, but otherwise it is a stock kernel.  I built it from the
 redhat-6.0 distribution 2.2.5-15 kernel source.  I am wondering if the
 patches are something I need to look into or not.

Red Hat 6.0 includes a very recent (and well-tested) version of the RAID
code, so you need no extra patching. If you want to use kernel 2.2.10 (or
later) down the road, then you'll need to fetch the latest RAID patch from
linux.kernel.org. 

-- mingo



RE: resync runs forever

1999-06-24 Thread Ingo Molnar


  #if 0
  if ((blocksize/1024)*j/((jiffies-starttime)/HZ + 1) + 1
   > sysctl_speed_limit) {
  current->priority = 0;
                      ^^
this is the real bug, it should be:
current->priority = 1;

yeah, stupid bug. You don't have to comment out the whole speed limit stuff
(it's rather useful, you'll notice).

-- mingo



Re: RAID0 and RedHat 6.0

1999-05-19 Thread Ingo Molnar


On Mon, 17 May 1999, Robert McPeak wrote:

 Here are the relevant messages from dmesg:

 hdd1's event counter: 000c
 hdb1's event counter: 000c
 request_module[md-personality-2]: Root fs not mounted
 do_md_run() returned -22

hm, this is the problem: it tries to load the RAID personality module but
cannot find it, because the root fs is not yet mounted. But
'md-personality-2' is strange as well; it should be 'md-personality-0' for
RAID0, there is no personality-2 ...

when you run it manually:

 raid0 personality registered

then it correctly registers raid0. You'll definitely get rid of these
problems if you compile RAID into the kernel (this is only a workaround),
but these things are supposed to work. I'm not sure yet what's going on. 

-- mingo



Re: RAID and RedHat 6.0

1999-05-09 Thread Ingo Molnar


On Sun, 9 May 1999, Charles Barrasso wrote:

 I recently upgraded one of my computers to RedHat 6.0 (which includes raid 
 .90).  Before the upgrade I had 2 4.1GB SCSI Hdd's combined into a linear RAID 
 array (created with raidtools-0.50beta10-2) .. after the upgrade I went to 
 re-instate this array and put the following into my /etc/raidtab:
 
 raiddev /dev/md0
 raid-level  linear
 nr-raid-disks   2
 device  /dev/sdb1
 raid-disk   0
 device  /dev/sdc1
 raid-disk   1
 
 
 but.. when I run raidstart -a I get:
 
 [root@news /root]# /sbin/raidstart -a
 /dev/md0: Invalid argument

this is the correct raidtab entry for your config: 

raiddev /dev/md0
raid-level  linear
nr-raid-disks   2
persistent-superblock   0
chunk-size  8
device  /dev/sdb1
raid-disk   0
device  /dev/sdc1
raid-disk   1

to get your array running simply do:

   raid0run /dev/md0

and that's all. Note that to get 'raid0run' you'll have to get and install
the latest raidtools. (RedHat 6.0 includes the latest code, but raid0run
was added shortly after RH 6.0 was released; raid0run will show up in an
errata)

let me know if it still doesn't work,

-- mingo



Re: Raid0 created with old mdtools

1999-04-29 Thread Ingo Molnar


On Thu, 29 Apr 1999, Tuomo Pyhala wrote:

 I upgraded RH6.0 to one machine having raid0 created with some old
 version of mdtools. However new code seems to be unable to start it
 complaining about superblock magic. Has the superblock bee nchanged/Added
 in newer versions making them incompatible with old versions or is there
 some option i can use to get the raid0 running and mounted?

please upgrade to raidtools-19990421-0.90.tar.gz; that raidtools version
handles the RH 6.0 kernel just fine. First create the correct
/etc/raidtab. Then use 'raid0run /dev/md0' or 'raid0run -a' in your init
scripts to start up the old array. 

-- mingo




Re: auto-partiton new blank hotadded disk

1999-04-26 Thread Ingo Molnar


On Mon, 26 Apr 1999, Benno Senoner wrote:

 I am interested more in the idea of automatically repartition a new blank disk
 while it is hot-added.

no need to do this in the kernel (or even in raidtools). I use such
scripts to 'mass-create' partitioned disks: 

[root@moon root]# cat dobigsd

if [ "$#" -ne "1" ]; then
   echo 'sample usage: dobigsd sda'
   exit -1
fi

echo "*** DESTROYING /dev/$1 in 5 seconds!!! ***"
sleep 5
dd if=/dev/zero of=/dev/$1 bs=1024k count=1
(for N in `cat domanydisks`; do echo $N; done) | fdisk /dev/$1

[root@moon root]# cat domanydisks
n e 1   1 200
 n l 1 25
 n l 26 50
 n l 51 75
 n l 76 100
 n l 101 125
 n l 126 150
 n l 151 175
 n l 176 200
 n p 2 300 350
 n p 3 350 400
 n p 4 450 500
t 2 86
t 3 83
t 4 83
t 5 83
t 6 83
t 7 83
t 8 83
t 9 83
t 10 83
t 11 83
t 12 83
w

that's all; fdisk is happy to be put into scripts.

-- mingo



Re: auto-partiton new blank hotadded disk

1999-04-26 Thread Ingo Molnar


On Mon, 26 Apr 1999, Benno Senoner wrote:

  no need to do this in the kernel (or even in raidtools). I use such
  scripts to 'mass-create' partitioned disks:
 
 but it's not unsafe to overwrite the partition-table of disks which are
 actually part of a soft-raid array and in use ? 

it's unsafe, and thus the kernel does not allow it at all. Why don't you
create the partitions before hot-adding the disk?

-- mingo



Re: A couple of... pearls?

1999-04-25 Thread Ingo Molnar


On Sat, 24 Apr 1999, Andy Poling wrote:

  I agree completely with the first statement. But the second sounds
  somewhat odd to me. I can hotadd or hotremove a disk on linux with sw RAID
  and a non-hot swappable capable controller, maybe this is another feature
  of sw RAID over hw RAID? 
 
 Because you're _supposed_ to quiet the SCSI bus while you're swapping your
 disk to prevent errors in active requests when you're removing or inserting
 a device into the bus.

we could as well provide kernel functionality to turn a particular SCSI
bus (from within the kernel) off/on by delaying IO requests. This has to
be done carefully to avoid deadlocks (what if the code to turn the bus on
lies on a disk on that bus :), but can be done i think, without hardware
assistance.

-- mingo



Re: Global hot-spare disk?

1999-04-25 Thread Ingo Molnar


On Sun, 25 Apr 1999, Steve Costaras wrote:

 I'm playing this weekend with v2.2.6  the new patches on a spare server
 trying to get boot-raid working or to see how far off it is. 
 
 Anyway, I noticed that the current code doesn't seems to allow a 'global
 hot spare' disk for the raid arrays.  On my test system here I have 3
 arrays (raid 1  raid 5) and instead of keeping one hot standby disk for
 each array I'd like to keep one disk (obviously large enough to
 accomidate any of the arrays) as a hot spare to be used in any array
 that needs it. 

good idea, i've added this to the TODO list. Until this is implemented you
can keep it 'quasi-global' by leaving it out of the arrays (raidhotremove
it) and raidhotadding it to whichever array needs it when a disk fails.
Probably a small script (to watch for failures) is needed for this too.

-- mingo



Re: Re-naming raid arrays?

1999-04-23 Thread Ingo Molnar


On Thu, 22 Apr 1999, Steve Costaras wrote:

 I have a raid array /dev/md0 on a system here.  I am now looking at
 moving some things around and want to rename this to say /dev/md9 or
 whatever.  Since this data is mapped (initially) out of the /etc/raidtab
 file and then stored in the raid superblock, is there a way to update
 this WITHOUT losing data on the device? 
 
 Ie, I'm keeping all disks/partitions the same that make up the device I
 just want to change its offset (to free up 900 to create a boot-raid
 device).  Is this possible, and if so, how? 

there is no 'safe' way yet to change the number of a persistent-superblock
RAID array. You can do it by taking the array down (stopping it) and
recreating it with a modified /etc/raidtab. BUT! doing this you lose all
protection against device mixups (for this one single mkraid, that is), so
you have to be very careful to have the right partitions mentioned.
Especially on RAID1 and RAID5, reconstruction starts immediately,
overwriting what it thinks to be redundant data ... 

but yes, this works. Alternatively there could be a safe 'tuneraid'
utility to do various smaller changes to an array. (one such function
would be to change the number of the array, another one could be to make
an array's config 'immutable')

-- mingo



Re: New patches against v2.2.6 kernel?

1999-04-17 Thread Ingo Molnar


On Sat, 17 Apr 1999, Steve Costaras wrote:

 Has anyone created any patches against the new 2.2.6 kernel, the latest
 I've seen is against v2.2.3 which doesn't apply cleanly against the newer
 kernels.

i'll release it Real Soon. (probably this weekend)

 Also, Just a side question, what's the status in possibly merging the raid
 code into the current kernel?  I for one have been running raid 5 here on
 several systems for about a year with no problems.  Or am I just lucky?

yep, the upcoming release will be a candidate for a merge.

-- mingo



Re: Swap on raid

1999-04-14 Thread Ingo Molnar


On Wed, 14 Apr 1999 [EMAIL PROTECTED] wrote:

 Hi folks,
 we are trying to set up a mirrored (raid-1) system for reliability
 but it is not possible according
 to the latest HOWTO to swap onto a raid volume. Is there any change on
 this?

it does work for me (i do not actually use it as such, but i've done some
stresstesting under heavy load). Let me know if you find any problems. 

-- mingo


   Has anyone set up a system like this with/without swap configured and
 what is your experience?
   We have decided to disable swap for the moment, does anyone know if
 this causes any problems
 with the general usage of linux. If we enable swap then the system will
 almost certainly crash if the disk
 with the swap partition crashes which would make mirroring the file
 systems a waste of time. Please
 correct me if I am wrong. Presumably swapping to a file has the same
 problems.
If swap does not work on a mirrored volume are there plans to make it
 work in the future? If not
 does anyone know what it would take. Perhaps we can help.
 
 Brian Murphy
 
 



Re: Swap on raid

1999-04-14 Thread Ingo Molnar


On 14 Apr 1999, Osma Ahvenlampi wrote:

 Ingo Molnar [EMAIL PROTECTED] writes:
  it does work for me (i do not actually use it as such, but i've done some
  stresstesting under heavy load). Let me know if you find any problems. 
 
 Hmm? Since when does swapping work on raid-1? How about raid-5?

i've tested it on RAID5: swapping madly to a RAID5 array while parity is
being reconstructed works just fine.

-- mingo



Re: lockup with root raid1 linux 2.2.1

1999-03-23 Thread Ingo Molnar


On Tue, 23 Mar 1999, Thorsten Schwander wrote:

 System: dual pentium 450, linux 2.2.1 SMP, BusLogic BT-958, two 4.5 GB SCSI
 ^^^
 disks with root RAID1, raid0145 and raidtools from mid February
 buslogic scsi driver compiled into the kernel
 
 Symptoms: solid freeze, no response to ALT SysRq, ping, etc. 
 
 A web server is running on the machine, otherwise it was more or less idle at

2.2.1 has a couple of known problems which are fixed in 2.2.3. (or better,
try 2.2.4-pre6).

 Is this a sign of hardware problems or could there be a problem with software
 raid? Is there anything I could do to help debugging?

next time please do a 'Ctrl-ScrollLock' and check out the process list to
see which ones are running. Let me know if it happens again with a newer
kernel and RAID version 19990309.

-- mingo



Re: persistent-superblock 0 makes raidstart fail.

1999-03-19 Thread Ingo Molnar


On Fri, 19 Mar 1999, Piete Brooks wrote:

 Should raidtools-19990309-0.90 manage a linear device without a SB ?
 [ I can "mkraid" it, but once stopped, it can never be restarted ]
 md8 fails, md7 is fine.

Since it's nonpersistent, it can only be re-created. The 'old' mdadd+mdrun
was always re-creating arrays as well. raidstart (and autostart) starts
only 'persistent' arrays.

-- mingo



Re: RAID1 experiences - patches

1999-02-15 Thread Ingo Molnar


just to add one more point, i was waiting for 2.2 to stabilize before
moving the RAID driver to 2.2.x. But when patches began floating around
porting the RAID driver to 2.2.x, i rather decided to move the 'official'
patch to 2.2.x too. This resulted in at least two bogus 'RAID-problems' so
far: the out of memory thing is a generic kernel bug fixed in pre2-2.2.2,
the 'crash when MMX' problem seems to be bogus as well. So the seemingly
increasing number of bugs is actually mostly a side-effect of 2.2
stabilization. The difference in user-space tools between the latest
'stable' RAID package and the current version is unfortunate but
unavoidable. Still waiting for someone with better documentation skills
than mine to pick up the maintenance of RAID-docs :)

-- mingo



Re: LOTS OF BAD STUFF in raid0: raid0145-19990824-2.2.11 is unstable

1999-01-03 Thread Ingo Molnar


On Fri, 5 Nov 1999, David Mansfield wrote:

 Well, I've never gotten a single SCSI error from the controller...  not to
 mention that the block being requested is WAY beyond the end of the
 device.  If this wasn't a RAID device, this would be one of the 'Attempt
 to access beyond end of device' errors that non-raid users have reported
 many times for the 2.2 series kernels.

 I have also gotten the error when not under any load, about once a month
 or so, but never with the alarming frequency of last night!

it's 99.99% a problem with the disk. The RAID0 code has not had any
significant changes (due to its simplicity) in the last couple of years.
We never rule out software bugs, but this is one of those cases where it's
way, way down in the list of potential problem sources.

-- mingo