Re: New features?

2006-11-09 Thread Neil Brown
On Friday November 3, [EMAIL PROTECTED] wrote:
 On Fri, Nov 03, 2006 at 02:39:31PM +1100, Neil Brown wrote:
 
  mdadm could probably be changed to be able to remove the device
  anyway.  The only difficulty is: how do you tell it which device to
  remove, given that there is no name in /dev to use.
  Suggestions?
 
 Major:minor? If /sys/block still holds an entry for the removed disk,
 then the user can figure it out from the name. Or mdadm could just
 accept a path under /sys/block instead of a device node.

I like the /sys/block idea.  So if given a directory we look for a
'dev' file and read major:minor from that.
I guess I could also just allow
  mdadm /dev/mdX --remove failed

and all failed devices get removed.
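
For example, a hypothetical sketch of how that could look (neither form
exists in mdadm today, and the device names are only illustrative):

   cat /sys/block/sdc/dev                   # read major:minor, e.g. 8:32
   mdadm /dev/md0 --remove /sys/block/sdc   # remove via a sysfs directory (proposed)
   mdadm /dev/md0 --remove failed           # or: remove every failed device (proposed)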

Thanks for the suggestion.
NeilBrown


Re: New features?

2006-11-03 Thread Gabor Gombas
On Fri, Nov 03, 2006 at 02:39:31PM +1100, Neil Brown wrote:

 mdadm could probably be changed to be able to remove the device
 anyway.  The only difficulty is: how do you tell it which device to
 remove, given that there is no name in /dev to use.
 Suggestions?

Major:minor? If /sys/block still holds an entry for the removed disk,
then the user can figure it out from the name. Or mdadm could just
accept a path under /sys/block instead of a device node.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -


Re: New features?

2006-11-02 Thread Neil Brown
On Tuesday October 31, [EMAIL PROTECTED] wrote:
 Hi,
 
 On Tue, Oct 31, 2006 at 08:50:19AM -0800, Mike Hardy wrote:
   1 Warm swap - replacing drives without taking down the array but maybe
   having to type in a few commands. Presumably a sata or sata/raid
   interface issue. (True hot swap is nice but not worth delaying warm-
   swap.)
   
   I believe that 2.6.18 has SATA hot-swap, so this should be available
   now ... providing you can find out what commands to use.
  
  I forgot 2.6.18 has SATA hot-swap, has anyone tested that?
 
 Yeah, I've tracked the sata EH patches and now 2.6.18 for a while and
 hotswap works. However, if you pull a disk that is part of a raidset, the
 disk is set as faulty and the device (/dev/sda for example) disappears. If
 you replug it, the device does not regain its original device name but
 will use the next free 'slot' available (in a four-disk layout that's
 /dev/sde). Also, trying to --remove the disk doesn't work since the device
 file is gone. So be sure to --remove disks _before_ you pull them.
 
 Does anyone know if there's work being done to fix this issue, and does
 this also happen on SCSI?

The 'correct' way to fix "--remove doesn't work after the device is gone"
is to get udev to run the mdadm --remove before it deletes the entry from /dev.
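
Something along these lines, perhaps (an untested sketch only; the match and
the hard-coded array name are purely illustrative, and the exact mdadm
invocation would depend on the setup):

   # /etc/udev/rules.d/65-md-remove.rules (hypothetical)
   # On removal of a sd* disk, fail and remove it from md0 while udev still
   # has the kernel name (%k) of the vanished device.
   ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd[a-z]", \
       RUN+="/sbin/mdadm /dev/md0 --fail /dev/%k --remove /dev/%k"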

mdadm could probably be changed to be able to remove the device
anyway.  The only difficulty is: how do you tell it which device to
remove, given that there is no name in /dev to use.
Suggestions?

Having the device come back with a different name shouldn't really be
a problem.  You can still find it and re-add it.
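
For example, assuming the re-plugged disk came back as /dev/sde:

   mdadm /dev/md0 --add /dev/sde   # add it back under its new name and let recovery resync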

NeilBrown


Re: New features?

2006-10-31 Thread Neil Brown
On Tuesday October 31, [EMAIL PROTECTED] wrote:
 All this discussion has led me to wonder if we users of linux RAID have
 a clear consensus of what our priorities are, ie what are the things we
 really want to see soon as opposed to the many things that would be nice
 but not worth delaying the important things for. FWIW, here are mine, in
 order although the first two are roughly equal priority.
 
 1 Warm swap - replacing drives without taking down the array but maybe
 having to type in a few commands. Presumably a sata or sata/raid
 interface issue. (True hot swap is nice but not worth delaying warm-
 swap.)

I believe that 2.6.18 has SATA hot-swap, so this should be available
now ... providing you can find out what commands to use.

 
 2 Adding new disks to arrays. Allows incremental upgrades and to take
 advantage of the hard disk equivalent of Moore's law.

Works for raid5 and linear.  Raid6 one day.

 
 3. RAID level conversion (1 to 5, 5 to 6, with single-disk to RAID 1 a
 lower priority).

A single disk is larger than a RAID1 built from it, so this is
non-trivial.  What exactly do you want to do there?

 
 4. Uneven disk sizes, eg adding a 400GB disk to a 2x200GB mirror to
 create a 400GB mirror. Together with 2 and 3, allows me to continuously
 expand a disk array.

So you have a RAID1 (/dev/md0) made from sda and sdb, both 200GB, and you now
have sdc, which is 400GB.
So
   mdadm /dev/md0 -a /dev/sdc
   mdadm /dev/md0 -f /dev/sda
   mdadm /dev/md0 -r /dev/sda
   # wait for recovery
   mdadm /dev/md0 -f /dev/sdb
   mdadm /dev/md0 -r /dev/sdb
   mdadm -C /dev/md1 -l linear -n 2 /dev/sda /dev/sdb
   mdadm /dev/md0 -a /dev/md1
   # wait for recovery
   mdadm --grow /dev/md0 --size=max

You do run with a degraded array for a while, but you can do it
entirely online.
It might be possible to decrease the time when the array is degraded,
but it is too late at night to think about that.

NeilBrown


Re: New features?

2006-10-31 Thread John Rowe
Thanks for this Neil, good to know that most of what I would like is
already available. I think your reply highlights what I almost put in
there as my first priority: documentation, specifically a HOWTO.

 I believe that 2.6.18 has SATA hot-swap, so this should be available
 now ... providing you can find out what commands to use.

Exactly!

  2 Adding new disks to arrays. Allows incremental upgrades and to take
  advantage of the hard disk equivalent of Moore's law.
 
 Works for raid5 and linear.  Raid6 one day.

Am I misinterpreting the mdadm 2.5 man page when it says:

Grow (or shrink) an array, or otherwise reshape it in some way.
Currently supported growth options including changing the active
size of component devices in RAID level 1/4/5/6 and changing the
number of active devices in RAID1.
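
The options being referred to are invoked along these lines (a sketch only,
with illustrative device names):

   mdadm --grow /dev/md0 --raid-devices=3   # change the number of active devices in a RAID1
   mdadm --grow /dev/md0 --size=max         # grow the active size of the component devices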

  3. RAID level conversion (1 to 5, 5 to 6, with single-disk to RAID 1 a
  lower priority).
 
 A single disk is larger than a RAID1 built from it, so this is
 non-trivial.  What exactly do you want to do there?

Single disk to RAID1 is less important, but adding a third disk to a RAID1
pair to make a RAID5 would be nice, as would adding one or more disks
to a RAID5 to make a RAID6.

John




Re: New features?

2006-10-31 Thread Mike Hardy


Neil Brown wrote:
 On Tuesday October 31, [EMAIL PROTECTED] wrote:

 1 Warm swap - replacing drives without taking down the array but maybe
 having to type in a few commands. Presumably a sata or sata/raid
 interface issue. (True hot swap is nice but not worth delaying warm-
 swap.)
 
 I believe that 2.6.18 has SATA hot-swap, so this should be available
 now ... providing you can find out what commands to use.

I forgot 2.6.18 has SATA hot-swap, has anyone tested that?

FWIW, SCSI (or SAS now, using SCSI or SATA drives) has full hot-swap
with completely online drive exchanges. I have done this on recent
kernels in production and it works.

 
 2 Adding new disks to arrays. Allows incremental upgrades and to take
 advantage of the hard disk equivalent of Moore's law.
 
 Works for raid5 and linear.  Raid6 one day.

Also works for raid1!


 4. Uneven disk sizes, eg adding a 400GB disk to a 2x200GB mirror to
 create a 400GB mirror. Together with 2 and 3, allows me to continuously
 expand a disk array.
 
 So you have a RAID1 (/dev/md0) made from sda and sdb, both 200GB, and you now
 have sdc, which is 400GB.
 So
mdadm /dev/md0 -a /dev/sdc
mdadm /dev/md0 -f /dev/sda
mdadm /dev/md0 -r /dev/sda
# wait for recovery

Could be:

mdadm /dev/md0 -a /dev/sdc
mdadm --grow /dev/md0 --raid-devices=3 # 3-disk mirror
# wait for recovery
# don't forget grub-install /dev/sda (or similar)!
mdadm /dev/md0 -f /dev/sda
mdadm /dev/md0 -r /dev/sda
mdadm --grow /dev/md0 --raid-devices=2 # 2-disk again

# Run a 'smartctl -d ata -t long /dev/sdb' before next line...

mdadm /dev/md0 -f /dev/sdb
mdadm /dev/md0 -r /dev/sdb
mdadm -C /dev/md1 -l linear -n 2 /dev/sda /dev/sdb
mdadm /dev/md0 -a /dev/md1
# wait for recovery
mdadm --grow /dev/md0 --size=max
 
 You do run with a degraded array for a while, but you can do it
 entirely online.
 It might be possible to decrease the time when the array is degraded,
 but it is too late at night to think about that.

All I did was decrease the time the array spends degraded, but hey, it could
help. And don't forget the long SMART test before running degraded for real.
It could save you some pain.

-Mike


Re: New features?

2006-10-31 Thread Bill Davidsen

John Rowe wrote:


All this discussion has led me to wonder if we users of linux RAID have
a clear consensus of what our priorities are, ie what are the things we
really want to see soon as opposed to the many things that would be nice
but not worth delaying the important things for. FWIW, here are mine, in
order although the first two are roughly equal priority.

1 Warm swap - replacing drives without taking down the array but maybe
having to type in a few commands. Presumably a sata or sata/raid
interface issue. (True hot swap is nice but not worth delaying warm-
swap.)
 

That seems to work now. It does assume that you have hardware hot swap 
capability.



2 Adding new disks to arrays. Allows incremental upgrades and to take
advantage of the hard disk equivalent of Moore's law.
 


Also seems to work.


3. RAID level conversion (1 to 5, 5 to 6, with single-disk to RAID 1 a
lower priority).
 

Single disk to RAID-N is possible, but involves a good bit of magic, such as
leaving room for superblocks, etc.
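
The usual trick, as I understand it, looks something like this (a sketch
only; it assumes you can leave room at the end of the partitions for the md
superblock, and the device names are illustrative):

   mdadm -C /dev/md0 -l 1 -n 2 /dev/sdb1 missing   # degraded mirror on the new disk
   mkfs -t ext3 /dev/md0                           # or whatever filesystem you prefer
   # copy the data from the old single disk onto /dev/md0, then:
   mdadm /dev/md0 -a /dev/sda1                     # the old disk becomes the second half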



4. Uneven disk sizes, eg adding a 400GB disk to a 2x200GB mirror to
create a 400GB mirror. Together with 2 and 3, allows me to continuously
expand a disk array.
 


???

--

bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: New features?

2006-10-31 Thread Frido Ferdinand
Hi,

On Tue, Oct 31, 2006 at 08:50:19AM -0800, Mike Hardy wrote:
  1 Warm swap - replacing drives without taking down the array but maybe
  having to type in a few commands. Presumably a sata or sata/raid
  interface issue. (True hot swap is nice but not worth delaying warm-
  swap.)
  
  I believe that 2.6.18 has SATA hot-swap, so this should be available
  now ... providing you can find out what commands to use.
 
 I forgot 2.6.18 has SATA hot-swap, has anyone tested that?

Yeah, I've tracked the sata EH patches and now 2.6.18 for a while and
hotswap works. However, if you pull a disk that is part of a raidset, the
disk is set as faulty and the device (/dev/sda for example) disappears. If
you replug it, the device does not regain its original device name but will
use the next free 'slot' available (in a four-disk layout that's /dev/sde).
Also, trying to --remove the disk doesn't work since the device file is
gone. So be sure to --remove disks _before_ you pull them.

Does anyone know if there's work being done to fix this issue, and does
this also happen on SCSI?

Met vriendelijke groet,

-- Frido Ferdinand


Re: new features time-line

2006-10-18 Thread Neil Brown
On Tuesday October 17, [EMAIL PROTECTED] wrote:
 We talked about RAID5E a while ago, is there any thought that this would 
 actually happen, or is it one of the would be nice features? With 
 larger drives I suspect the number of drives in arrays is going down, 
 and anything which offers performance benefits for smaller arrays would 
 be useful.

So ... RAID5E is RAID5 using (N-1)/N of each drive (or close to that)
and not having a hot spare.
On a drive failure, the data is restriped across N-1 drives so that it
becomes plain RAID5.  This means that instead of having an idle spare,
you have spare space at the end of each drive.

To implement this you would need kernel code to restripe an array to
reduce the number of devices (currently we only support increasing the
number of devices).

Probably not too hard - just needs code and motivation.  

Don't know if/when it will happen, but it probably will,
especially if someone tries writing some code (hint hint to any
potential developers out there...).

NeilBrown


Re: new features time-line

2006-10-17 Thread Bill Davidsen

Neil Brown wrote:


On Friday October 13, [EMAIL PROTECTED] wrote:
 


I am curious if there are plans for either of the following;
-RAID6 reshape
-RAID5 to RAID6 migration
   



No concrete plans with timelines and milestones and such, no.
I would like to implement both of these but I really don't know when I
will find/make time.  Probably by the end of 2007, but that is not a
promise.

We talked about RAID5E a while ago, is there any thought that this would 
actually happen, or is it one of the would be nice features? With 
larger drives I suspect the number of drives in arrays is going down, 
and anything which offers performance benefits for smaller arrays would 
be useful.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



RE: new features time-line

2006-10-15 Thread Neil Brown
On Friday October 13, [EMAIL PROTECTED] wrote:
 Good to hear.  
 
 I think when I first built my RAID (a few years ago) I did some research on
 this;
 http://www.google.com/search?hl=en&q=bad+block+replacement+capabilities+mdadm
 
 And found stories where bit errors were an issue.
 http://www.ogre.com/tiki-read_article.php?articleId=7
 
 After your email, I went out and researched it again.  Eleven months ago a
 patch to address this was submitted for RAID5, I would assume RAID6
 benefited from it too? 

Yes.  All appropriate raid levels support auto-overwrite of read
errors.

NeilBrown


new features time-line

2006-10-13 Thread Dan
I am curious if there are plans for either of the following;
-RAID6 reshape
-RAID5 to RAID6 migration

Here is why I ask, and sorry for the length.

I have an aging RAID6 with eight 250G drives as a physical volume in a
volume group.  It is at about 80% capacity.  I have had a couple drives fail
and replaced them with 500G drives.  I plan to migrate the rest over time as
they drop out.  However this could be months or years.

I could just be patient and wait until I have replaced all the drives and
then use -G -z max to grow the array to its maximum size.  But I could use
the extra space sooner.
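
For reference, the long form of that would be something like the following,
once every member has been replaced:

   mdadm --grow /dev/md0 --size=max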

Since I already have the existing RAID (md0) as a physical volume in a
volume group, I thought: why not just use the other half of the drives to
create another RAID6 (md1), add that to the same volume group, and so on as I
grow.  md0 made from devices=/dev/sd[abcdefgh]1; md1 made from
devices=/dev/sd[abcdefgh]2; and so on (I could have the md number match the
partition number for aesthetics, I suppose)...

By doing this I further protect myself from the increasing bit error rate on
ever-larger drives.  So if there are suddenly three bit errors I have a
chance, as long as they are not all on the same partition number: mdadm will
only kick out the bad partitions and not the whole drives.  (I know I am
already doing RAID6; what are the chances of three!)

To get to my point, I would like to make the new half of the drives into a
new physical volume and would 'like' to start using some of the new drives
before I have replaced all the existing 250G drives.  If RAID6 reshape were
an option I could start once I have replaced at least three of the old
drives (building it as a RAID6 with one missing).  But it is not available
yet.  Or, since RAID5 reshape is an option, I could again start when I have
replaced three (building it as a RAID5), then grow it until I get to the
eighth drive and migrate to the final desired RAID6.  But that is not an
option yet.
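
For what it's worth, the 'RAID6 with one missing' variant would be created
along these lines (a sketch only, assuming the three replaced drives are
sda, sdb and sdc):

   mdadm -C /dev/md1 -l 6 -n 4 /dev/sda2 /dev/sdb2 /dev/sdc2 missing

and the RAID5 variant similarly, growing as further drives are replaced:

   mdadm -C /dev/md1 -l 5 -n 3 /dev/sda2 /dev/sdb2 /dev/sdc2
   mdadm /dev/md1 -a /dev/sdd2               # add the next replaced drive
   mdadm --grow /dev/md1 --raid-devices=4    # then reshape onto it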

Thoughts?




Re: new features time-line

2006-10-13 Thread Neil Brown
On Friday October 13, [EMAIL PROTECTED] wrote:
 I am curious if there are plans for either of the following;
 -RAID6 reshape
 -RAID5 to RAID6 migration

No concrete plans with timelines and milestones and such, no.
I would like to implement both of these but I really don't know when I
will find/make time.  Probably by the end of 2007, but that is not a
promise.

Of course someone else could implement them.  RAID6 reshape should be
fairly straightforward given that RAID5 and RAID6 use the same code
and RAID5 reshape is done.
RAID5 to RAID6 conversion would be a bit trickier, but not much.
A point worth noting is that RAID5-RAID6 conversion without growing
the array at the same time is not a good idea.  It will either be
dangerous (a crash during the reshape will cause corruption of data)
or slow (all data needs to be copied one extra time - the 'critical
region' of raid5 reshape becomes the whole array if you don't grow the
array). 

Probably the fastest way for these to get implemented is for someone
else to try and post the results.  I would be very likely to comment
and help get the patch into a reliable and maintainable form and then
pass it on upstream.

Are you any good at coding :-)


 
 Here is why I ask, and sorry for the length.

It all sounds fairly sensible.  Except that, as I say above, the option of
growing a raid5 bit by bit, then adding the last disk and making it
raid6, is not such a good approach.

NeilBrown


RE: new features time-line

2006-10-13 Thread Dan
Good to hear.  

I think when I first built my RAID (a few years ago) I did some research on
this;
http://www.google.com/search?hl=en&q=bad+block+replacement+capabilities+mdadm

And found stories where bit errors were an issue.
http://www.ogre.com/tiki-read_article.php?articleId=7

After your email, I went out and researched it again.  Eleven months ago a
patch to address this was submitted for RAID5, I would assume RAID6
benefited from it too? 

___
http://kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.15-rc1 

Author: NeilBrown [EMAIL PROTECTED]
Date:   Tue Nov 8 21:39:22 2005 -0800

[PATCH] md: better handling of readerrors with raid5.

This patch changes the behaviour of raid5 when it gets a read error.
    Instead of just failing the device, it tries to find out what should
    have been there, and writes it over the bad block.  For some
    media-errors, this has a reasonable chance of fixing the error.  If
    the write succeeds, and a subsequent read succeeds as well, raid5
    decides the address is OK and continues.

    Instead of failing a drive on read-error, we attempt to re-write the
    block, and then re-read.  If that all works, we allow the device to
    remain in the array.

Signed-off-by: Neil Brown [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Signed-off-by: Linus Torvalds [EMAIL PROTECTED]
_


So the vulnerability would exist only if a bad bit hit some parity
information and another hit a data sector that needed that exact parity
information, which is next to impossible, and even less likely with RAID6,
since you would need to lose the data sector and both of the corresponding P
and Q parity blocks at the same time.

Thus splitting the drives up into sections for logical volumes is less
useful.  And RAID6, unlike RAID5, still provides protection against bit
errors while the array is running degraded with a single failed drive.

Nevertheless, I would still use the LVM system to split the new replacement
drives if I had a method to utilize the extra drive space of the few new
replacements prior to replacing all of them.  Otherwise I suppose I'll
practice patience and wait until they are all replaced to use the current
grow -G -z max feature.

Thanks,
Dan.



-Original Message-
From: Mike Hardy [mailto:[EMAIL PROTECTED] 
Sent: Friday, October 13, 2006 5:14 PM
To: Dan
Subject: Re: new features time-line


Not commenting on your overall premise, but I believe bit errors are
already logged and rewritten using parity info by md

-Mike

Dan wrote:
 I am curious if there are plans for either of the following;
 -RAID6 reshape
 -RAID5 to RAID6 migration
 
 Here is why I ask, and sorry for the length.
 
 I have an aging RAID6 with eight 250G drives as a physical volume in a
 volume group.  It is at about 80% capacity.  I have had a couple drives
fail
 and replaced them with 500G drives.  I plan to migrate the rest over time
as
 they drop out.  However this could be months or years.
 
 I could just be patient and wait until I have replaced all the drives and
 use the -G -z max  to grow the RAID to resize the array to the maximum
 space.  But I could use the extra space sooner.
 
 Since I already have the existing RAID (md0) as a physical volume in a
 volume group, I though why not just use the other half of the drives and
 create another RAID6 (md1) add that to the same volume group and so on as
I
 grow. md0 made from devices=/dev/sd[abcdefgh]1; md1 made from
 devices=/dev/sd[abcdefgh]2; and so on (I could have the md number match
the
 partition number for aesthetics I suppose)...  
 
 By doing this I further protect myself from possible bit error rate on
 increasingly large drives.  So if there are suddenly three bit errors I
have
 a chance as long as they are not all on the same partition number.  Mdadm
 will only kick out the bad partitions and not the whole drive. (I know I
am
 already doing RAID6, what are the chances of three!).
 
 To get to my point, I would like to split the new half of the drives into
a
 new physical volume and would 'like' to try to start using some of the
 drives before I have replace all the existing 250G drives.  If RAID6
reshape
 was an option I could start once I have replaced at least three of the old
 drives (built it as a RAID6 with one missing).  But it is not available,
 yet.  Or, since RAID5 reshape is an option, I could again start when I
have
 replaced three (built it as a RAID5) than grow it to until I get to the
 eighth drive and migrate to the final desired RAID6.  But that is not an
 option, yet.
 
 Thoughts?
 
 


