On Tue, Oct 19, 1999 at 09:47:13PM -0700, Tom Livingston wrote:
[snip]
> 
> A fair thing to note is that this is a "riskier" endeavor than anything else
> in the raidtools package.  That being so, it makes sense to have a good
> amount of checks along the way.  I see you do check the raidtab against the
> superblocks, which is most of it.  I can't remember, though, if you do ext2
> checking & mounted file system checking like mkraid does

A little checking.  But I'll start by putting in algorithms that don't destroy
your data if you use differently sized disks   :)

> Also, some thought needs to go into how this could handle a power fail in
> the middle.  Certainly you don't want the old raid set to auto start when

Yes, checkpointing would be good.  It should be fairly simple, if one can
accept still losing one chunk (or N chunks, where N = # of disks).
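
Roughly what I have in mind, as a minimal sketch and not the raidreconf
code itself.  It is written in Python for brevity, and everything in it is
an assumption: the names (load_checkpoint, save_checkpoint, reconf), the
move_chunk callback, and keeping the checkpoint in a small file.  In
practice the progress record would more likely live in a superblock.  The
chunk that is in flight when the power fails is the one you may lose.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next chunk to move (0 if no checkpoint exists)."""
    try:
        with open(path) as f:
            return json.load(f)["next_chunk"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_chunk):
    """Record progress atomically: write a temp file, fsync it, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_chunk": next_chunk}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic on POSIX file systems


def reconf(total_chunks, move_chunk, checkpoint_path):
    """Move chunks in order, resuming from the checkpoint if one exists.

    move_chunk(i) is a hypothetical callback that relocates chunk i.
    """
    start = load_checkpoint(checkpoint_path)
    for i in range(start, total_chunks):
        move_chunk(i)                            # may be interrupted mid-chunk
        save_checkpoint(checkpoint_path, i + 1)  # chunk i is now safe
    try:
        os.remove(checkpoint_path)               # finished: drop the marker
    except FileNotFoundError:
        pass
```

After a crash, running reconf() again with the same checkpoint path picks
up at the first chunk that was not recorded as finished.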

> the machine is rebooted, doing so could cause all sorts of problems.
> Instead of requiring users to disable the raid by removing the fd partition
> labels, maybe the reconf utility should erase the superblocks on the md
> device it's working on, placing instead a marker that shows it's in the
> middle of being reworked.  If status information about the reconstruction

Good point    :)

> was kept in the superblocks (or just one) the reconf utility could use this
> data to pick up where it left off...
> 
> Also, when allowing a user to reduce the raidset size ( can't remember if
> you already allow this... I read the code sunday and already I've forgotten
> everything ;), you probably want to do a sanity check to see if the ext2
> partition on that device has already been sized down.

I'm trying to use the existing code that mkraid uses (from raid_io.c) to do
the superblock updating, so some of the clever checks are only maintained
in one place.  But the raidreconf utility will need some checks added to
it, in time.

I'll do basic features first, checks later.

> Might also benefit in considering what happens if there's a hardware error
> on one of the old or new disks during the process...  perhaps an area of bad
> sectors on one of the new disks?  I think all of the information is still
> there at this point to do an about face, and start un-reconf'ing the
> drive... walking backwards to put it back in its original state.

This quickly gets hairy.   I think, once it handles a number of basic features
(like shrink+grow of raid[01]) it'll be easier to see what can be done.

One could do a write test on the new disks and a read test on the old ones
before actually moving data.  I guess that would be a pretty good start  (?)
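
Something along these lines, again only a sketch in Python under my own
assumptions: the function names and the 64 KB chunk size are made up for
illustration, and the write test destroys whatever is on the target, so it
is only meant for the new, empty disks.  Note that the read-back may be
served from the page cache, so this is weaker than a real media check.

```python
import os

CHUNK = 64 * 1024  # assumed test block size, not the array's chunk size


def pattern_for(offset, chunk_size, n):
    """Return n bytes of a per-block pattern (a byte value in 1..251)."""
    return bytes([(offset // chunk_size) % 251 + 1]) * n


def read_test(path, chunk_size=CHUNK):
    """Read the whole device or file; return the number of bytes read.

    A failing sector surfaces as an OSError from read().
    """
    total = 0
    with open(path, "rb") as dev:
        while True:
            block = dev.read(chunk_size)
            if not block:
                return total
            total += len(block)


def write_test(path, size, chunk_size=CHUNK):
    """Overwrite `size` bytes with a pattern and verify it. DESTRUCTIVE."""
    with open(path, "r+b") as dev:
        offset = 0
        while offset < size:
            n = min(chunk_size, size - offset)
            dev.write(pattern_for(offset, chunk_size, n))
            offset += n
        dev.flush()
        os.fsync(dev.fileno())  # push the data out before reading it back
        dev.seek(0)
        offset = 0
        while offset < size:
            n = min(chunk_size, size - offset)
            if dev.read(n) != pattern_for(offset, chunk_size, n):
                return False
            offset += n
    return True
```

The idea would be to run read_test() on every old disk and write_test() on
every new disk, and refuse to start moving data if either one fails.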

> I believe in put-up or shut-up, so I'm happy to lend my time to the process.
> I'm in between consulting gigs right now and could probably add something.

I'll write back to the list as soon as I've re-done the basic algorithm.  The
code available for download now is _really_ basic, and wrong.  I've already
changed quite a lot of it. Wait a day or so for an update   :)

Cheers,

................................................................
: [EMAIL PROTECTED]  : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:
