Re: Is shrinking raid5 possible?
Neil Brown wrote:
> Yep. The '--size' option refers to:
>     Amount (in Kibibytes) of space to use from each drive in
>     RAID1/4/5/6. This must be a multiple of the chunk size, and must
>     leave about 128Kb of space at the end of the drive for the RAID
>     superblock.
> (from the man page).
>
> So you were telling md to use the first 600GB of each device in the
> array, and it told you there wasn't that much room.
>
> If your array has N drives, you need to divide the target array size
> by N-1 to find the target device size. So if you have a 5 drive
> array, then you want --size=157286400
>
> NeilBrown

Thanks, and sorry for not being able to read properly -- I read this at
least three times and didn't notice it was the drive size and not the
array size.

Cheers,
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Is shrinking raid5 possible?
On Monday June 19, [EMAIL PROTECTED] wrote:
> Hi,
>
> I'd like to shrink the size of a RAID5 array - is this
> possible? My first attempt shrinking 1.4Tb to 600Gb,
>
> mdadm --grow /dev/md5 --size=629145600
>
> gives
>
> mdadm: Cannot set device size/shape for /dev/md5: No space left on device

Yep. The '--size' option refers to:
    Amount (in Kibibytes) of space to use from each drive in
    RAID1/4/5/6. This must be a multiple of the chunk size, and must
    leave about 128Kb of space at the end of the drive for the RAID
    superblock.
(from the man page).

So you were telling md to use the first 600GB of each device in the
array, and it told you there wasn't that much room.

If your array has N drives, you need to divide the target array size
by N-1 to find the target device size. So if you have a 5 drive
array, then you want --size=157286400

NeilBrown
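Neil's rule is easy to get backwards, so here is a throwaway sketch of the
arithmetic (hypothetical helper name, nothing from mdadm itself) that turns
a desired RAID5 array size into the per-device --size argument:

```python
def raid5_component_size_kib(target_array_kib, n_drives):
    """Per-device --size (in KiB) for an n_drives RAID5 array whose
    usable capacity should be target_array_kib.  One drive's worth of
    space holds parity, so only n_drives - 1 devices carry data."""
    if n_drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return target_array_kib // (n_drives - 1)

# The 600GB-on-5-drives example from the message above:
print(raid5_component_size_kib(629145600, 5))  # 157286400
```

Per the man page text quoted above, the result must also be a multiple of
the array's chunk size (157286400 KiB is a multiple of the common 64 KiB
chunk).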
Is shrinking raid5 possible?
Hi,

I'd like to shrink the size of a RAID5 array - is this possible? My
first attempt shrinking 1.4Tb to 600Gb,

mdadm --grow /dev/md5 --size=629145600

gives

mdadm: Cannot set device size/shape for /dev/md5: No space left on device

which is true but not particularly relevant :). If mdadm doesn't support
this for online arrays, can I do it offline somehow? I'd like to retain
the ext3 filesystem on this device, which I have already shrunk to 400Gb
with resize2fs.

Thanks for any help,
Paul
Re: Raid5 reshape
On Sunday June 18, [EMAIL PROTECTED] wrote:
> This from dmesg might help diagnose the problem:
>

Yes, that helps a lot, thanks.

The problem is that the reshape thread is restarting before the array
is fully set-up, so it ends up dereferencing a NULL pointer. This
patch should fix it. In fact, there is a small chance that next time
you boot it will work without this patch, but the patch makes it more
reliable.

There definitely should be no data-loss due to this bug.

Thanks,
NeilBrown

### Diffstat output
 ./drivers/md/md.c    |    6 ++++--
 ./drivers/md/raid5.c |    3 ---
 2 files changed, 4 insertions(+), 5 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-05-30 15:07:14.0 +1000
+++ ./drivers/md/md.c	2006-06-19 12:01:47.0 +1000
@@ -2719,8 +2719,6 @@ static int do_md_run(mddev_t * mddev)
 	}
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-	md_wakeup_thread(mddev->thread);
-
 	if (mddev->sb_dirty)
 		md_update_sb(mddev);
@@ -2738,6 +2736,10 @@ static int do_md_run(mddev_t * mddev)
 	mddev->changed = 1;
 	md_new_event(mddev);
+
+	md_wakeup_thread(mddev->thread);
+	md_wakeup_thread(mddev->sync_thread);
+
 	return 0;
 }

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2006-06-19 11:56:41.0 +1000
+++ ./drivers/md/raid5.c	2006-06-19 11:56:44.0 +1000
@@ -2373,9 +2373,6 @@ static int run(mddev_t *mddev)
 		set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
 		mddev->sync_thread = md_register_thread(md_do_sync, mddev,
 							"%s_reshape");
-		/* FIXME if md_register_thread fails?? */
-		md_wakeup_thread(mddev->sync_thread);
-
 	}

 	/* read-ahead size must cover two whole stripes, which is
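The bug pattern behind this patch is generic: a worker woken before the
state it dereferences is initialised will hit a NULL pointer. A toy Python
sketch of the fixed ordering (illustrative names only, not the md code):

```python
import threading

class Mddev:
    """Toy stand-in for the kernel's mddev; conf is filled in by set-up."""
    def __init__(self):
        self.conf = None

def reshape(mddev):
    # The reshape worker dereferences state created during set-up.  If it
    # is woken before set-up finishes, mddev.conf is still None and this
    # raises -- the Python analogue of the kernel's NULL dereference.
    return mddev.conf["raid_disks"]

md = Mddev()
md.conf = {"raid_disks": 4}      # complete *all* set-up first ...
worker = threading.Thread(target=reshape, args=(md,))
worker.start()                   # ... then wake the worker, as the patch does
worker.join()
```

This mirrors why the patch moves the md_wakeup_thread() calls to the end of
do_md_run(): the wake-up only happens once every field the threads touch has
been initialised.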
Re: the question about raid0_make_request
On Monday June 19, [EMAIL PROTECTED] wrote:
> When I read the code of raid0_make_request, I meet some questions.
>
> 1\ block = bio->bi_sector >> 1, it's the device offset in kilobytes.
> So why do we subtract zone->zone_offset from block? The
> zone->zone_offset is the zone offset relative to the mddev in sectors.

zone_offset is set to 'curr_zone_offset' in create_strip_zones.
curr_zone_offset is a sum of 'zone->size' values. zone->size is
(typically) calculated by

    (smallest->size - current_offset) * c

'smallest' is an rdev, so the units of 'zone_offset' are ultimately the
same units as those of rdev->size. rdev->size is set in md.c, e.g. from

    calc_dev_size(rdev, sb->chunk_size);

which uses the value from calc_dev_sboffset, which shifts the size in
bytes by BLOCK_SIZE_BITS, defined in fs.h to be 10. So the units of
zone_offset are kilobytes, not sectors.

> 2\ the codes below:
>     x = block >> chunksize_bits;
>     tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];
> actually, we get the underlying device by 'sector_div(x,
> zone->nb_dev)'. The var x is the chunk nr relative to the start of the
> mddev in my opinion. But not all of the zone->nb_dev is the same, so we
> can't get the right rdev by 'sector_div(x, zone->nb_dev)', I think.

x is the chunk number relative to the start of the current zone, not
the start of the mddev:

    sector_t x = (block - zone->zone_offset) >> chunksize_bits;

Taking the remainder after dividing this by the number of devices in
the current zone gives the number of the device to use.

Hope that helps.

NeilBrown
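The mapping Neil walks through can be mimicked in a few lines (illustrative
Python, not the kernel code; offsets are in 1 KiB blocks as explained above,
and the function name is invented for this sketch):

```python
def map_block(block, zone_offset, chunksize_bits, nb_dev):
    """Map a 1 KiB block offset on the md device to a device in its zone.

    block       -- offset from the start of the md device in KiB
                   (block = bio->bi_sector >> 1)
    zone_offset -- start of the current zone, also in KiB
    Returns (device index, chunk number on that device).
    """
    x = (block - zone_offset) >> chunksize_bits  # chunk nr *within the zone*
    return x % nb_dev, x // nb_dev               # what sector_div() yields

# 64 KiB chunks (chunksize_bits = 6) in a 3-device zone starting at offset
# 0: the third chunk of the zone lands on device 2, stripe 0.
print(map_block(128, 0, 6, 3))  # (2, 0)
```

Because x is counted from the start of the zone, each zone's own nb_dev is
the right divisor even when different zones span different numbers of
devices, which is the point of Neil's answer to question 2.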
the question about raid0_make_request
When I read the code of raid0_make_request, I meet some questions.

1\ block = bio->bi_sector >> 1, it's the device offset in kilobytes.
So why do we subtract zone->zone_offset from block? The
zone->zone_offset is the zone offset relative to the mddev in sectors.

2\ the codes below:

    x = block >> chunksize_bits;
    tmp_dev = zone->dev[sector_div(x, zone->nb_dev)];

actually, we get the underlying device by 'sector_div(x,
zone->nb_dev)'. The var x is the chunk nr relative to the start of the
mddev in my opinion. But not all of the zone->nb_dev is the same, so we
can't get the right rdev by 'sector_div(x, zone->nb_dev)', I think.

Why? Could you explain them to me?

Thanks!

Regards,
YangLiu
Re: Raid5 reshape
Nigel J. Terry wrote:

Neil Brown wrote:
> OK, thanks for the extra details. I'll have a look and see what I can
> find, but it'll probably be a couple of days before I have anything
> useful for you.
>
> NeilBrown

This from dmesg might help diagnose the problem:

md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb1 ...
md:  adding sdb1 ...
md:  adding sda1 ...
md:  adding hdc1 ...
md:  adding hdb1 ...
md: created md0
md: bind
md: bind
md: bind
md: bind
md: running:
raid5: automatically using best checksumming function: generic_sse
   generic_sse:  6795.000 MB/sec
raid5: using function: generic_sse (6795.000 MB/sec)
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: reshape will continue
raid5: device sdb1 operational as raid disk 1
raid5: device sda1 operational as raid disk 0
raid5: device hdb1 operational as raid disk 2
raid5: allocated 4268kB for md0
raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
 --- rd:4 wd:3 fd:1
 disk 0, o:1, dev:sda1
 disk 1, o:1, dev:sdb1
 disk 2, o:1, dev:hdb1
...ok start reshape thread
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for reconstruction.
md: using 128k window, over a total of 245111552 blocks.
Unable to handle kernel NULL pointer dereference at RIP:
<>{stext+2145382632}
PGD 7c3f9067 PUD 7cb9e067 PMD 0
Oops: 0010 [1] SMP
CPU 0
Modules linked in: raid5 xor usb_storage video button battery ac lp parport_pc parport floppy nvram snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ehci_hcd ohci1394 ieee1394 sg snd_pcm uhci_hcd i2c_nforce2 i2c_core forcedeth ohci_hcd snd_timer snd soundcore snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd sata_nv libata sd_mod scsi_mod
Pid: 1432, comm: md0_reshape Not tainted 2.6.17-rc6 #1
RIP: 0010:[<>] <>{stext+2145382632}
RSP: :81007aa43d60 EFLAGS: 00010246
RAX: 81007cf72f20 RBX: 81007c682000 RCX: 0006
RDX: RSI: RDI: 81007cf72f20
RBP: 02090900 R08: R09: 810037f497b0
R10: 000b44ffd564 R11: 8022c92a R12:
R13: 0100 R14: R15:
FS: 0066d870() GS:80611000() knlGS:
CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: CR3: 7bebc000 CR4: 06e0
Process md0_reshape (pid: 1432, threadinfo 81007aa42000, task 810037f497b0)
Stack: 803dce42 1d383600
Call Trace:
 {md_do_sync+1307} {thread_return+0} {thread_return+94}
 {keventd_create_kthread+0} {md_thread+248} {keventd_create_kthread+0}
 {md_thread+0} {kthread+254} {child_rip+8} {keventd_create_kthread+0}
 {thread_return+0} {kthread+0} {child_rip+0}
Code: Bad RIP value.
RIP <>{stext+2145382632} RSP CR2:
<6>md: ... autorun DONE.
SW RAID 5 Bug? - Slow After Rebuild (XFS+2.6.16.20)
I set a disk faulty and then rebuilt it; afterwards I got horrible
performance. I was using 2.6.16.20 during the tests. The FS I use is XFS.

# xfs_info /dev/md3
meta-data=/dev/root            isize=256    agcount=16, agsize=1097941 blks
         =                     sectsz=512   attr=0
data     =                     bsize=4096   blocks=17567056, imaxpct=25
         =                     sunit=0      swidth=0 blks, unwritten=1
naming   =version 2            bsize=4096
log      =internal             bsize=4096   blocks=8577, version=1
         =                     sectsz=512   sunit=0 blks
realtime =none                 extsz=65536  blocks=0, rtextents=0

After a raid5 rebuild, before reboot:

$ cat 448mb.img > /dev/null

 0 1 4 25104 64 90556000 4 0 1027 154 0 0 88 12
 0 0 4 14580 64 91412800 1434434 1081 718 0 2 77 21
 0 0 4 14516 64 91236000 10312 184 1128 1376 0 3 97 0
 0 0 4 15244 64 91188400 12660 0 1045 1248 0 3 97 0
 0 0 4 15464 64 91127200 11916 0 1055 1081 0 3 98 0
 0 1 4 15100 64 91548800 7844 0 1080 592 0 3 76 21
 0 1 4 13840 64 91678000 1268 0 1295 1757 0 1 49 49
 0 1 4 13480 64 91718800 38848 1050 142 0 1 50 49
procs ---memory-- ---swap-- -io --system-- cpu
 r b swpd free buff cache si so bi bo in cs us sy id wa
 0 1 4 14816 64 91589600 492 0 1047 321 0 1 49 49
 0 1 4 14504 64 91623600 324 0 1022 108 0 2 50 49
 0 1 4 14144 64 91657600 388 0 1021 108 0 1 50 50
 0 1 4 13904 64 91684800 256 0 1043 159 0 0 50 49
 0 1 4 13728 64 91712000 26024 1032 102 0 1 50 49
 0 0 4 15244 64 91304000 11856 0 1042 1315 0 3 90 7
 0 0 4 14564 64 91365200 12288 0 1068 1137 1 3 97 0
 0 0 4 15252 64 91297200 12288 0 1054 1128 0 3 97 0
 0 0 4 15132 64 91310800 16384 0 1048 1368 0 4 96 0
 0 0 4 15372 64 91283600 12288 0 1062 1125 0 3 97 0
 0 0 4 15660 64 91263200 12288 0 1065 1093 0 3 97 0
 0 0 4 15388 64 91276800 12288 0 1042 1051 0 3 97 0
 0 0 4 15028 64 91331200 12288 0 1040 1122 0 3 97 0

With an ftp:

 0 1 4 208564 64 72366000 8192 0 1945 495 0 4 53 44
 1 0 4 200592 64 73182000 8192 0 1828 459 0 5 52 44
 0 0 4 194472 64 73794000 6144 0 1396 220 0 2 50 47
 0 1 4 186128 64 74616800 8192 0 1622 377 0 4 51 45
 0 1 4 180008 64 75228800 6144 0 1504 339 0 3 51 46
 0 1 4 174012 64 75847600 6144 0 1438 229 0 3 51 47
 0 1 4 167956 64 76459600 6144 0 1498 263 0 2 51 46
 0 1 4 162084 64 77071600 6144 0 1497 326 0 3 51 46
 0 1 4 156152 64 77690400 6144 0 1476 293 0 3 51 47
 0 1 4 150048 64 78302400 614420 1514 273 0 2 51 46

Also note, when I run 'sync' it would take up to 5 minutes!!! And I was
not even doing anything on the array.

After reboot:

`448mb.img' at 161467144 (34%) 42.82M/s eta:7s [Receiving data]
`448mb.img' at 283047424 (60%) 45.23M/s eta:4s [Receiving data]
`448mb.img' at 406802192 (86%) 46.29M/s eta:1s [Receiving data]

Write speed to the RAID5 is also back to normal.

 0 0 0 16664 8 92894000 0 44478 1522 19791 1 35 43 21
 0 0 0 15304 8 9303680020 49816 1437 19260 0 21 59 20
 0 0 4 16964 8 9283240020 50388 1410 20059 0 20 47 33
 0 0 4 13504 8 93192800 0 46792 1449 16712 0 17 69 15
 0 0 4 14952 8 93043200 8 43510 1489 16443 0 16 60 23
 0 0 4 16328 8 9290720036 50316 1498 16972 1 19 59 23
 0 1 4 16708 8 92846000 0 45604 1504 17196 0 19 55 26
procs ---memory-- ---swap-- -io --system-- cpu
 r b swpd free buff cache si so bi bo in cs us sy id wa
 0 0 4 16968 8 92812004 0 47640 1584 17821 0 19 57 25
 0 0 4 15160 8 92988800 0 40836 1637 15335 0 17 63 19
 0 1 4 15372 8 92961600 0 41932 1630 14862 0 17 64 19

Was curious if anyone else had seen this?

/dev/md3:
        Version : 00.90.03
  Creation Time : Sun Jun 11 16:52:00 2006
     Raid Level : raid5
     Array Size : 1562834944 (1490.44 GiB 1600.34 GB)
    Device Size : 390708736 (372.61 GiB 400.09 GB)
   Raid Devices : 5
  Total Devices : 5
Preferre