Re: RAID1 and load-balancing during read
On Mon, Sep 10, 2007 at 10:51:37PM +0300, Dimitrios Apostolou wrote:
> On Monday 10 September 2007 22:35:30 Iustin Pop wrote:
> > On Mon, Sep 10, 2007 at 10:29:30PM +0300, Dimitrios Apostolou wrote:
> > > Hello list,
> > >
> > > I just created a RAID1 array consisting of two disks. After experiments
> > > with processes *reading* from the device (badblocks, dd) and the iostat
> > > program, I can see that only one disk is being utilised for reading. To
> > > be exact, every time I execute the command one of the two disks is being
> > > randomly used, but the other one has absolutely no activity.
> > >
> > > My question is: why isn't load balancing happening? Is there an option
> > > I'm missing? Until now I thought it was the default for all RAID1
> > > implementations.
> >
> > Did you read the archives of this list? This question has been answered,
> > like, 4 times already in the last months.
> >
> > And yes, the driver does do load balancing. Just not as RAID0 does,
> > since it's not RAID0.
>
> Of course I did a quick search in the archives but couldn't find anything.

Hmm, it's true that searching does not turn up an easy-to-find response.

> I'll search better, thanks anyway. Moreover, I think I found the answer in
> the code after posting. There is a comment somewhere in read_balance()
> saying "Don't change to another disk for sequential reads". I have to study
> it a bit to figure out *why* you chose that way.

Well, from what I understand, you cannot make a mirror behave like a stripe,
plain and simple. There is no simple algorithm that makes sequential reads
from RAID1 behave better. OTOH, random I/O and multiple sequential threads
are sped up by RAID1. And people have said on the list that using the raid10
module with only two disks and (IIRC) the offset or far layout will give
better read performance, albeit at some cost in write performance.

Hmmm, I think a patch is needed to md.4 in order to explain this right at
the source of the confusion.
thanks,
iustin
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Explain the read-balancing algorithm for RAID1 better in md.4
There are many questions on the mailing list about the RAID1 read
performance profile. This patch adds a new paragraph to the RAID1 section
in md.4 that details what kind of speed-up one should expect from RAID1.

Signed-off-by: Iustin Pop <[EMAIL PROTECTED]>
---

This patch is against the git tree of mdadm.

 md.4 | 7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/md.4 b/md.4
index cf423cb..db39aba 100644
--- a/md.4
+++ b/md.4
@@ -168,6 +168,13 @@ All devices in a RAID1 array should be the same size.  If they are not, then
 only the amount of space available on the smallest device is used (any
 extra space on other devices is wasted).
 
+Note that the read balancing done by the driver does not make the RAID1
+performance profile be the same as for RAID0; a single stream of
+sequential input will not be accelerated (e.g. a single dd), but
+multiple sequential streams or a random workload will use more than one
+spindle. In theory, having an N-disk RAID1 will allow N sequential
+threads to read from all disks.
+
 .SS RAID4
 
 A RAID4 array is like a RAID0 array with an extra device for storing
-- 
1.5.3.1
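To illustrate the behaviour the new md.4 paragraph describes (a single sequential stream stays on one spindle, while multiple sequential streams spread across disks), here is a toy Python model of a "stick with the same disk for sequential reads" policy. This is a deliberate simplification for the curious reader, not the kernel's actual read_balance(); the class and all names in it are made up.

```python
# Toy model of RAID1 read balancing: each disk remembers where its last
# read ended, and a request that continues exactly at that sector sticks
# to the same disk; any other request goes to the next disk round-robin.

class ToyRaid1:
    def __init__(self, n_disks):
        self.next_sector = [None] * n_disks  # where each disk's last read ended
        self.rr = 0                          # fallback round-robin pointer

    def read(self, sector, length):
        """Return the disk index chosen for this read."""
        for disk, nxt in enumerate(self.next_sector):
            if nxt == sector:                # sequential continuation: keep disk
                self.next_sector[disk] = sector + length
                return disk
        disk = self.rr                       # otherwise rotate to another disk
        self.rr = (self.rr + 1) % len(self.next_sector)
        self.next_sector[disk] = sector + length
        return disk

# One sequential stream: every request lands on the same disk.
raid = ToyRaid1(2)
one_stream = [raid.read(s, 8) for s in range(0, 80, 8)]
print("single stream uses disks:", sorted(set(one_stream)))

# Two interleaved sequential streams (two readers): both disks are used.
raid2 = ToyRaid1(2)
two_streams = []
for a, b in zip(range(0, 80, 8), range(1000, 1080, 8)):
    two_streams.append(raid2.read(a, 8))
    two_streams.append(raid2.read(b, 8))
print("two streams use disks:", sorted(set(two_streams)))
```

In this toy model, as in the man-page text above, N sequential readers end up spread over N disks while a lone dd stays on one.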
Unreasonable latencies under heavy write load
I'm having some troubles with the system below:

Linux inhale 2.6.18-4-amd64 #1 SMP Thu May 10 01:01:58 UTC 2007 x86_64 GNU/Linux

md2 is a

md2 : active raid1 sda3[0] sdb3[1]
      484359680 blocks [2/2] [UU]

sda and sdb are both

Vendor: ATA      Model: ST3500630AS      Rev: 3.AA
Type:   Direct-Access                    ANSI SCSI revision: 05

The problem is _extreme_ latencies under a write load, and also weird
accounting as seen in iostat. I know I've complained about this over on
linux-mm, but with the raid1 it seems even worse than usual. Nothing
happens on the system for literally minutes at a stretch. It takes half an
hour to unpack a 1GB .zip archive (into a 7.4GB directory). During the
lengthy pauses, md2_raid1 is the only process that gets any time (normally
< 1% CPU).

iostat reads weirdly (iostat -kx 10):

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.05    0.00    3.55   51.07    0.00   43.33

Device:  rrqm/s   wrqm/s    r/s      w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda       12.30  3495.00   5.80    89.80   916.80  14119.20   314.56   143.68 1522.19  10.46 100.04
sdb        0.00  3494.20   0.10    87.50     0.40  13920.40   317.83    25.08  274.39  10.89  95.44
md0        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00
md1        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00
md2        0.00     0.00  18.10  2425.10   904.40   9700.40     8.68     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.10   49.38    0.00   45.53

Device:  rrqm/s   wrqm/s    r/s      w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00  2954.60   0.20    99.10     0.80  12481.20   251.40   143.49 1449.52  10.07 100.04
sdb        0.00  2954.60   0.10    99.00     0.40  12240.40   247.04    39.32  398.24  10.09 100.04
md0        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00
md1        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00
md2        0.00     0.00   0.30 16917.30     1.20  67669.20     8.00     0.00    0.00   0.00   0.00

That seems a little weird to me. Why is sda + sdb != md2? Is md2 really
issuing 17000 writes per second over a period of ten seconds? If so, why?
Also:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    1.25   48.75    0.00   50.00

Device:  rrqm/s   wrqm/s    r/s      w/s    rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda        0.00  2805.69   0.00   104.69     0.00  11638.72   222.35    87.11  848.35   9.54  99.84
sdb        0.00  2806.29   0.00   102.50     0.00  11471.06   223.84   143.09 1400.85   9.74  99.84
md0        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00
md1        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00
md2        0.00     0.00   0.00     0.00     0.00      0.00     0.00     0.00    0.00   0.00   0.00

Is that normal?

-jwb
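One sanity check that can be run on the first sample: iostat's avgrq-sz column is in 512-byte sectors, so (r/s + w/s) * avgrq-sz * 0.5 should roughly reproduce rkB/s + wkB/s for every row. The check below (a sketch; the figures are copied from the first iostat sample) passes for sda, sdb, and md2 alike, suggesting the md2 row is internally consistent: the md device accounts many small (~4 kB) requests, which get merged into far larger (~150 kB) requests before reaching the disks. That merging is one plausible reason sda + sdb and md2 don't match request-for-request.

```python
# Verify iostat's internal consistency for the first sample above:
# throughput implied by request rate and average request size should
# match the reported rkB/s + wkB/s for each device.

def throughput_kbs(rps, wps, avgrq_sectors):
    """kB/s implied by request rate and avg request size (512 B sectors)."""
    return (rps + wps) * avgrq_sectors * 512 / 1024

# (r/s, w/s, avgrq-sz, rkB/s + wkB/s) taken from the first iostat sample
rows = {
    "sda": (5.80, 89.80, 314.56, 916.80 + 14119.20),
    "sdb": (0.10, 87.50, 317.83, 0.40 + 13920.40),
    "md2": (18.10, 2425.10, 8.68, 904.40 + 9700.40),
}

for dev, (r, w, avgrq, reported) in rows.items():
    implied = throughput_kbs(r, w, avgrq)
    print(f"{dev}: implied {implied:.0f} kB/s vs reported {reported:.0f} kB/s")
```

Note the per-disk request size (~315 sectors, about 157 kB) versus md2's ~8.7 sectors: roughly 35 md-level requests are merged per physical write.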
Re: RAID1 and load-balancing during read
On Monday 10 September 2007 22:35:30 Iustin Pop wrote:
> On Mon, Sep 10, 2007 at 10:29:30PM +0300, Dimitrios Apostolou wrote:
> > Hello list,
> >
> > I just created a RAID1 array consisting of two disks. After experiments
> > with processes *reading* from the device (badblocks, dd) and the iostat
> > program, I can see that only one disk is being utilised for reading. To
> > be exact, every time I execute the command one of the two disks is being
> > randomly used, but the other one has absolutely no activity.
> >
> > My question is: why isn't load balancing happening? Is there an option
> > I'm missing? Until now I thought it was the default for all RAID1
> > implementations.
>
> Did you read the archives of this list? This question has been answered,
> like, 4 times already in the last months.
>
> And yes, the driver does do load balancing. Just not as RAID0 does,
> since it's not RAID0.

Of course I did a quick search in the archives but couldn't find anything.
I'll search better, thanks anyway. Moreover, I think I found the answer in
the code after posting. There is a comment somewhere in read_balance()
saying "Don't change to another disk for sequential reads". I have to study
it a bit to figure out *why* you chose that way.

Thanks,
Dimitris
Re: RAID1 and load-balancing during read
On Mon, Sep 10, 2007 at 10:29:30PM +0300, Dimitrios Apostolou wrote:
> Hello list,
>
> I just created a RAID1 array consisting of two disks. After experiments
> with processes *reading* from the device (badblocks, dd) and the iostat
> program, I can see that only one disk is being utilised for reading. To be
> exact, every time I execute the command one of the two disks is being
> randomly used, but the other one has absolutely no activity.
>
> My question is: why isn't load balancing happening? Is there an option I'm
> missing? Until now I thought it was the default for all RAID1
> implementations.

Did you read the archives of this list? This question has been answered,
like, 4 times already in the last months.

And yes, the driver does do load balancing. Just not as RAID0 does, since
it's not RAID0.

regards,
iustin
RAID1 and load-balancing during read
Hello list,

I just created a RAID1 array consisting of two disks. After experiments
with processes *reading* from the device (badblocks, dd) and the iostat
program, I can see that only one disk is being utilised for reading. To be
exact, every time I execute the command one of the two disks is being
randomly used, but the other one has absolutely no activity.

My question is: why isn't load balancing happening? Is there an option I'm
missing? Until now I thought it was the default for all RAID1
implementations. Even the md man page mentions in the RAID1 section:

    The driver attempts to distribute read requests across all devices to
    maximise performance.

Thanks in advance,
Dimitris
[PATCH] Expose the degraded status of an assembled array through sysfs
The 'degraded' attribute is useful to quickly determine whether the array
is degraded, instead of parsing 'mdadm -D' output or relying on other
techniques (comparing the number of working devices against the number of
defined devices, etc.). The md code already keeps track of this attribute,
so it's useful to export it.

Signed-off-by: Iustin Pop <[EMAIL PROTECTED]>
---

Note: I sent this back in January and people agreed it was a good idea.
However, it has not been picked up, so here I resend it again. Patch is
against 2.6.23-rc5.

Thanks,
Iustin Pop

 drivers/md/md.c | 7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index f883b7e..3e3ad71 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2842,6 +2842,12 @@ sync_max_store(mddev_t *mddev, const char *buf, size_t len)
 static struct md_sysfs_entry md_sync_max =
 __ATTR(sync_speed_max, S_IRUGO|S_IWUSR, sync_max_show, sync_max_store);
 
+static ssize_t
+degraded_show(mddev_t *mddev, char *page)
+{
+	return sprintf(page, "%i\n", mddev->degraded);
+}
+static struct md_sysfs_entry md_degraded = __ATTR_RO(degraded);
 
 static ssize_t
 sync_speed_show(mddev_t *mddev, char *page)
@@ -2985,6 +2991,7 @@ static struct attribute *md_redundancy_attrs[] = {
 	&md_suspend_lo.attr,
 	&md_suspend_hi.attr,
 	&md_bitmap.attr,
+	&md_degraded.attr,
 	NULL,
 };
 static struct attribute_group md_redundancy_group = {
-- 
1.5.3.1
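For completeness, here is how userspace could consume the attribute once it is exported. The helper below is a hypothetical example rather than part of any existing tool; it assumes the usual /sys/block/<dev>/md/ layout, and relies on degraded_show() above printing mddev->degraded as a plain integer (0 when the array has no missing devices).

```python
# Hypothetical monitoring helper: read the 'degraded' sysfs attribute
# instead of parsing 'mdadm -D' output.
from pathlib import Path

def is_degraded(array="md0", sysfs_root="/sys/block"):
    """Return True if the md array reports one or more missing devices.

    The 'degraded' file holds the integer mddev->degraded followed by a
    newline, as formatted by degraded_show() in the patch above.
    """
    path = Path(sysfs_root) / array / "md" / "degraded"
    return int(path.read_text().strip()) > 0
```

A cron job or nagios check could then alert on is_degraded("md2") without any output scraping.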
Re: reducing the number of disks a RAID1 expects
On Sun, Sep 09, 2007 at 09:31:54PM -1000, J. David Beutel wrote:
> [EMAIL PROTECTED] ~]# mdadm --grow /dev/md5 -n2
> mdadm: Cannot set device size/shape for /dev/md5: Device or resource busy
>
> mdadm - v1.6.0 - 4 June 2004
> Linux 2.6.12-1.1381_FC3 #1 Fri Oct 21 03:46:55 EDT 2005 i686 athlon i386
> GNU/Linux

I'm not sure that such an old kernel supports reshaping an array. The mdadm
version should not be a problem, as that message is probably generated by
the kernel.

I'd recommend trying to boot with a newer kernel, even if only for the
duration of the reshape.

regards,
iustin
Re: reducing the number of disks a RAID1 expects
Richard Scobie wrote:
> Have a look at the "Grow Mode" section of the mdadm man page.

Thanks! I overlooked that, although I did look at the man page before
posting.

> It looks as though you should just need to use the same command you used
> to grow it to 3 drives, except specify only 2 this time.

I think I hot-added it. Anyway, --grow looks like what I need, but I'm
having some difficulty with it. The man page says, "Change the size or
shape of an active array." But I got:

[EMAIL PROTECTED] ~]# mdadm --grow /dev/md5 -n2
mdadm: Cannot set device size/shape for /dev/md5: Device or resource busy
[EMAIL PROTECTED] ~]# umount /dev/md5
[EMAIL PROTECTED] ~]# mdadm --grow /dev/md5 -n2
mdadm: Cannot set device size/shape for /dev/md5: Device or resource busy

So I tried stopping it, but got:

[EMAIL PROTECTED] ~]# mdadm --stop /dev/md5
[EMAIL PROTECTED] ~]# mdadm --grow /dev/md5 -n2
mdadm: Cannot get array information for /dev/md5: No such device
[EMAIL PROTECTED] ~]# mdadm --query /dev/md5 --scan
/dev/md5: is an md device which is not active
/dev/md5: is too small to be an md component.
[EMAIL PROTECTED] ~]# mdadm --grow /dev/md5 --scan -n2
mdadm: option s not valid in grow mode

Am I trying the right thing, but running into some limitation of my version
of mdadm or the kernel? Or am I overlooking something fundamental yet
again?

md5 looked like this in /proc/mdstat before I stopped it:

md5 : active raid1 hdc8[2] hdg8[1]
      58604992 blocks [3/2] [_UU]

For -n, the man page says, "This number can only be changed using --grow
for RAID1 arrays, and only on kernels which provide necessary support."
Grow mode says, "Various types of growth may be added during 2.6
development, possibly including restructuring a raid5 array to have more
active devices. Currently the only support available is to change the
"size" attribute for arrays with redundancy, and the raid-disks attribute
of RAID1 arrays. ...
When reducing the number of devices in a RAID1 array, the slots which are
to be removed from the array must already be vacant. That is, the devices
which were in those slots must be failed and removed."

I don't know how I overlooked all that the first time, but I can't see what
I'm overlooking now.

mdadm - v1.6.0 - 4 June 2004
Linux 2.6.12-1.1381_FC3 #1 Fri Oct 21 03:46:55 EDT 2005 i686 athlon i386
GNU/Linux

Cheers,
11011011
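Putting the man-page excerpt into concrete steps: shrinking a 3-device RAID1 to 2 means vacating the extra slot first (fail, then remove the device in it) before raid-disks can be reduced, and doing so on a kernel new enough to support RAID1 reshape. In the mdstat above the slot already shows as vacant ([_UU]), so only the final --grow should be needed. The sketch below merely assembles the mdadm command lines for the general case; the device name hdX is hypothetical and nothing is executed.

```python
# Assemble (but do not run) the mdadm invocations for shrinking a RAID1,
# following the order required by the man page: fail, remove, then --grow.

def shrink_raid1_commands(array, device_to_drop, n_remaining):
    """Build the mdadm command lines for reducing a RAID1's device count.

    The slot must be vacant before --grow can shrink the array, so the
    device occupying it is failed and removed first.
    """
    return [
        f"mdadm {array} --fail {device_to_drop}",
        f"mdadm {array} --remove {device_to_drop}",
        f"mdadm --grow {array} -n{n_remaining}",
    ]

for cmd in shrink_raid1_commands("/dev/md5", "/dev/hdX", 2):
    print(cmd)
```

If the slot is already vacant, as in the thread above, only the last command applies.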