On Sat, 1 Apr 2000, Mike Bilow wrote:

> On Fri, 31 Mar 2000, Michael wrote:

> > hmmm..... the remirroring code is not very smart... as I recall it 
> > does the remirroring in order .. i.e. md0, md1, etc... This would 
> > imply that if you have a power fail or other crash that causes both 
> > md's to be faulty, the system will not be able to swap until md0, 
> > then md1 is remirrored. The boot and swap md's should always be the 
> > first two raid partitions under this scenario as they are very small 
> > and will remirror quickly allowing swapping to proceed while the main 
> > array remirrors.
> 
> Jakob:
> 
> If this is true, it should be added to the HOWTO.

I have some ideas for how to deal with swapping on RAID.

1. Modify or wrap the invocation of the swapon program so that it takes
responsibility for waiting until resyncing of its target swap volume is
completed.  This could be done by adding code to the swapon program itself
which checked to make sure that it was not enabling swapping onto a RAID
volume being resynced.  While this is the more robust approach, it would
also involve modifying a fairly well established and solid program.  Using
a shell test against /proc/mdstat is probably more prudent.  Ideally, the
kernel should be able to act intelligently when swapping is enabled onto a
RAID device which is being resynced and defer actual use until it is safe,
so it makes more sense to fix this in the kernel than in the swapon binary
-- which is really just a wrapper around the swapon() system call anyway.
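
As a rough sketch of the shell test mentioned in item 1, something like
the following could wrap swapon.  The function names md_is_resyncing and
swapon_when_synced are made up for illustration, and the parsing assumes
that each array's stanza in /proc/mdstat starts with a line of the form
"mdN : ..." and that a running or pending resync shows up as a line
containing "resync" or "recovery" inside that stanza.  The exact layout
varies between kernel and RAID patch versions, so check it against your
own /proc/mdstat before relying on it.

```shell
# Hypothetical helper: succeed (exit 0) if the named md device appears to
# be resyncing.  ASSUMPTION: stanzas start with "mdN : ..." and progress
# lines contain "resync" or "recovery".
# Usage: md_is_resyncing md0 [mdstat-file]
md_is_resyncing() {
    mdname=$1
    mdfile=${2:-/proc/mdstat}
    in_stanza=0
    while IFS= read -r line; do
        case $line in
            "$mdname :"*) in_stanza=1 ;;   # header of the device we want
            md*" :"*)     in_stanza=0 ;;   # header of some other device
            *resync*|*recovery*)
                [ "$in_stanza" -eq 1 ] && return 0 ;;
        esac
    done < "$mdfile"
    return 1
}

# Hypothetical wrapper: poll until the resync is finished, then enable
# swapping on the device.  The 10 second interval is arbitrary.
# Usage: swapon_when_synced md0
swapon_when_synced() {
    swapdev=$1
    while md_is_resyncing "$swapdev"; do
        sleep 10
    done
    swapon "/dev/$swapdev"
}
```

A startup script would then call "swapon_when_synced md0 &" in place of a
plain swapon so that the rest of the boot is not held up while it waits.
This only uses shell builtins plus sleep, so it does not need grep.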

2. We should probably get some code added to the resync procedure which
allows controlling the order in which resync is done so that swap
partitions can be specified first.  This would logically be handled as
kernel arguments, something like "raidsync=md1,md0,md2" or maybe a general
algorithm such as "raidsync=smallest-first".  I am also not sure that it
is sensible to have the kernel code default to resyncing in device order
(md0, md1, ...) rather than doing smallest volumes first.  Of course, if
the kernel has been notified via swapon which volumes are to be used for
swapping, then it is easy for it to resync these first.
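
To make the "smallest-first" idea in item 2 concrete, here is a
user-space illustration of the ordering only; it is not kernel code and
does not change what the kernel actually does.  The function name
md_smallest_first is hypothetical, and the parsing assumes that each
"mdN : ..." header in /proc/mdstat is followed by a separate line that
begins with the size, as in "104320 blocks ...".  Layouts that put the
block count on the header line would need different parsing.

```shell
# Hypothetical helper: print md device names ordered by size, smallest
# first.  ASSUMPTION: the size appears on its own line as "<N> blocks ..."
# after the "mdN : ..." header line.
# Usage: md_smallest_first [mdstat-file]
md_smallest_first() {
    mdfile=${1:-/proc/mdstat}
    mdname=
    while read -r first second rest; do
        case $first in
            md*)          mdname=$first ;;   # remember current device
            *[!0-9]*|'')  ;;                 # not a size line, skip it
            *)  [ "$second" = blocks ] && [ -n "$mdname" ] &&
                    echo "$first $mdname" ;;
        esac
    done < "$mdfile" | sort -n | while read -r size name; do
        echo "$name"
    done
}
```

With small boot and swap arrays and one large data array, this lists the
small arrays first, which is the order in which one would want them
resynced so that swapping can start early.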

3. Some way of testing for resync in progress should be added to /proc so
that startup scripts are not dependent upon external tools such as grep.
For example, we could have /proc/md which had a subdirectory for each
device, say /proc/md/0, /proc/md/1, and so on.  Each device subdirectory
could have files with standard names such as "size", "resync", "type", and so on.
This way, it would be possible to do tests such as this:

        if [ "$(cat /proc/md/0/resync)" = 0 ]; then
                swapon /dev/md0
        fi

This sort of facility is going to be needed if the kernel is going to
protect against enabling swapping onto RAID undergoing resync.

-- Mike
