How mdadm can support > 2T
Hi linux-raid,

I want to create a raid0 using mdadm 2.5.6, kernel 2.6.18-iop3, on the Intel IOP80331 (32-bit), using 5 disks, where every hard disk is 500 GB. But the array can't go beyond 2 TB. How can I get > 2 TB support on a 32-bit CPU?

Command and log:

# mdadm -C /dev/md0 -l0 -n5 /dev/sd[c,d,e,f,g]
[EMAIL PROTECTED]:/# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Jan  1 00:29:29 1970
     Raid Level : raid0
     Array Size : 294448832 (280.81 GiB 301.52 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Jan  1 00:29:29 1970
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 64K

           UUID : ebdd57fe:8eb46fdf:884d06b0:5db18b9d
         Events : 0.1

    Number   Major   Minor   RaidDevice   State
       0       8       32        0        active sync   /dev/sdc
       1       8       48        1        active sync   /dev/sdd
       2       8       64        2        active sync   /dev/sde
       3       8       80        3        active sync   /dev/sdf
       4       8       96        4        active sync   /dev/sdg

- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
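[Editorial aside: the 2 TB ceiling on 32-bit kernels of this era is usually the 32-bit `sector_t`: block devices are addressed in 512-byte sectors, so 2^32 sectors is exactly 2 TiB, and the usual remedy is rebuilding the kernel with CONFIG_LBD ("Support for Large Block Devices") enabled — that remedy is the standard one for this symptom, not something stated in the post. Note too that the reported Array Size, 301.52 GB, is consistent with ~2.5 TB having wrapped modulo 2 TiB (2500 GB - 2199 GB ≈ 301 GB). The arithmetic, as a quick sanity check:]

```shell
# 32-bit sector_t => at most 2^32 addressable 512-byte sectors.
# bash arithmetic; 1024^4 bytes = 1 TiB
echo "$(( 2**32 * 512 / 1024**4 )) TiB"   # prints "2 TiB"
```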
Re: Recovering from default FC6 install
Doug Ledford wrote:
> On Sun, 2006-11-12 at 01:00 -0500, Bill Davidsen wrote:
>> I tried something new on a test system, using the install partitioning
>> tools to partition the disk. I had three drives and went with RAID-1 for
>> boot, and RAID-5+LVM for the rest. After the install was complete I noted
>> that it was solid busy on the drives, and found that the base RAID appears
>> to have been created (a) with no superblock and (b) with no bitmap. That
>> last is an issue: as a test system it WILL be getting hung and rebooted,
>> and recovering the 1.5 TB took hours.
>>
>> Is there an easy way to recover this? The LVM dropped on it has a lot of
>> partitions, and there is a lot of data in them after several hours of
>> feeding with GigE, so I can't readily back up and recreate by hand.
>> Suggestions?
>
> First, the Fedora installer *always* creates persistent arrays, so I'm not
> sure what is making you say it didn't, but they should be persistent.

I got the detail on the md device, then -E on the components, and got a "no superblock found" message, which made me think it wasn't there. Given that, I didn't have much hope for the part which starts "assuming that they are persistent", but I do thank you for the information; I'm sure it will be useful.

I did try recreating, from the running FC6 rather than the rescue, since the large data was on its own RAID and I could umount the f/s and stop the array. Alas, I think a "grow" is needed somewhere: after configuration, start, and mount of the f/s on RAID-5, e2fsck told me my data was toast. Shortest time to solution was to recreate the f/s and reload the data. The RAID-1 stuff is small; a total rebuild is acceptable in the case of a failure.

FC install suggestion: more optional control over the RAID features during creation. Maybe there's an "advanced features" button in the install and I just missed it, but there should be, since the non-average user might be able to do useful things with the chunk size, and specify a bitmap.
I would think that a bitmap would be the default on large arrays, assuming that >1TB is still large for the moment.

Instructions and attachments saved for future use, trimmed here.

-- 
bill davidsen <[EMAIL PROTECTED]>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
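[Editorial aside: a write-intent bitmap can in fact be added to an existing array after the fact with `mdadm --grow`, no re-creation needed — the array must be clean, and very old mdadm/kernel combinations may lack the option. A sketch, with /dev/md0 as a placeholder device name, plus the back-of-envelope resync cost a bitmap avoids (the 50 MB/s rate is an assumption, not from the post):]

```shell
# Add an internal write-intent bitmap to a live array (placeholder name):
# mdadm --grow /dev/md0 --bitmap=internal
# Verify it took effect:
# mdadm --detail /dev/md0 | grep -i bitmap

# Without a bitmap, a crash means a full resync. Rough cost for the
# 1.5 TB array mentioned above at an assumed 50 MB/s:
echo "$(( 1500000 / 50 / 3600 )) hours"   # prints "8 hours"
```

With the bitmap in place, a resync after an unclean shutdown only touches regions marked dirty, so it typically finishes in seconds rather than hours.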
Re: raid5 hang on get_active_stripe
You probably guessed that no matter what I did, I never, ever saw the problem when your trace was installed. I'd guess at some obscure timing-related problem. I can still trigger it consistently with a vanilla 2.6.17_SMP though, but again only when bitmaps are turned on.

Neil Brown wrote:
> On Tuesday October 10, [EMAIL PROTECTED] wrote:
>> Very happy to. Let me know what you'd like me to do.
>
> Cool, thanks. At the end is a patch against 2.6.17.11, though it should
> apply against any later 2.6.17 kernel. Apply this and reboot. Then run
>
>     while true
>     do
>        cat /sys/block/mdX/md/stripe_cache_active
>        sleep 10
>     done > /dev/null
>
> (maybe write a little script or whatever). Leave this running. It affects
> the check for "has raid5 hung". Make sure to change "mdX" to whatever is
> appropriate. Occasionally look in the kernel logs for
>
>     plug problem:
>
> If you find that, send me the surrounding text; there should be about a
> dozen lines following this one. Hopefully this will let me know which is
> the last thing to happen: a plug or an unplug.
>
> If the last is a "plug", then the timer really should still be pending,
> but isn't (this is impossible). So I'll look more closely at that option.
> If the last is an "unplug", then the 'Plugged' flag should really be clear
> but it isn't (this is impossible). So I'll look more closely at that
> option.
>
> Dean is running this, but he only gets the hang every couple of weeks. If
> you get it more often, that would help me a lot.
>
> Thanks,
> NeilBrown

diff ./.patches/orig/block/ll_rw_blk.c ./block/ll_rw_blk.c
--- ./.patches/orig/block/ll_rw_blk.c	2006-08-21 09:52:46.0 +1000
+++ ./block/ll_rw_blk.c	2006-10-05 11:33:32.0 +1000
@@ -1546,6 +1546,7 @@ static int ll_merge_requests_fn(request_
  * This is called with interrupts off and no requests on the queue and
  * with the queue lock held.
  */
+static atomic_t seq = ATOMIC_INIT(0);
 void blk_plug_device(request_queue_t *q)
 {
 	WARN_ON(!irqs_disabled());
@@ -1558,9 +1559,16 @@ void blk_plug_device(request_queue_t *q)
 		return;
 
 	if (!test_and_set_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags)) {
+		q->last_plug = jiffies;
+		q->plug_seq = atomic_read(&seq);
+		atomic_inc(&seq);
 		mod_timer(&q->unplug_timer, jiffies + q->unplug_delay);
 		blk_add_trace_generic(q, NULL, 0, BLK_TA_PLUG);
-	}
+	} else
+		q->last_plug_skip = jiffies;
+	if (!timer_pending(&q->unplug_timer) &&
+	    !q->unplug_work.pending)
+		printk("Neither Timer or work are pending\n");
 }
 
 EXPORT_SYMBOL(blk_plug_device);
@@ -1573,10 +1581,17 @@ int blk_remove_plug(request_queue_t *q)
 {
 	WARN_ON(!irqs_disabled());
 
-	if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
+	if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags)) {
+		q->last_unplug_skip = jiffies;
 		return 0;
+	}
 
 	del_timer(&q->unplug_timer);
+	q->last_unplug = jiffies;
+	q->unplug_seq = atomic_read(&seq);
+	atomic_inc(&seq);
+	if (test_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
+		printk("queue still (or again) plugged\n");
 	return 1;
 }
@@ -1635,7 +1650,7 @@ static void blk_backing_dev_unplug(struc
 static void blk_unplug_work(void *data)
 {
 	request_queue_t *q = data;
-
+	q->last_unplug_work = jiffies;
 	blk_add_trace_pdu_int(q, BLK_TA_UNPLUG_IO, NULL,
 			q->rq.count[READ] + q->rq.count[WRITE]);
@@ -1649,6 +1664,7 @@ static void blk_unplug_timeout(unsigned
 	blk_add_trace_pdu_int(q, BLK_TA_UNPLUG_TIMER, NULL,
 			q->rq.count[READ] + q->rq.count[WRITE]);
 
+	q->last_unplug_timeout = jiffies;
 	kblockd_schedule_work(&q->unplug_work);
 }
diff ./.patches/orig/drivers/md/raid1.c ./drivers/md/raid1.c
--- ./.patches/orig/drivers/md/raid1.c	2006-08-10 17:28:01.0 +1000
+++ ./drivers/md/raid1.c	2006-09-04 21:58:31.0 +1000
@@ -1486,7 +1486,6 @@ static void raid1d(mddev_t *mddev)
 					d = conf->raid_disks;
 				d--;
 				rdev = conf->mirrors[d].rdev;
-				atomic_add(s, &rdev->corrected_errors);
 				if (rdev &&
 				    test_bit(In_sync, &rdev->flags)) {
 					if (sync_page_io(rdev->bdev,
@@ -1509,6 +1508,9 @@ static void raid1d(mddev_t *mddev)
 						 s<<9, conf->tmppage, READ) == 0)
 					/* Well, this device is dead */
Re: Raid 1 up after raidhotadd without rebuild
On Tuesday November 14, [EMAIL PROTECTED] wrote:
>
> Any ideas why this is and how to fix it?

You don't mention a kernel version. If it is 2.6.18, upgrade to 2.6.18.2.

NeilBrown
Re[4]: RAID1 submirror failure causes reboot?
Hello Jens,

JA> Then lets wait for Jim to repeat his testing with all the debugging
JA> options enabled, that should make us a little wiser.

Ok, I'll rebuild the kernel with these options enabled and report any findings later. For now, on to the other questions that have come up.

I remember that when I ran with most of the debugging ON last week, it made the server run terribly slowly, with a load average around 40-60 and little responsiveness, and numerous messages like:

  Hangcheck: hangcheck value past margin!
  BUG: soft lockup detected on CPU#1!

which hide the interesting traces :) However, I can post these captured traces of several system lifetimes to the list or privately.

Concerning the other questions:

1) The workload on the software raid is rather small. It's a set of system partitions which keep the fileserver's logs, etc. The file storage is on 3Ware cards and has substantial load. The MD arrays are checked nightly, though (echo check > sync_action), and most often this triggers the problem. These drives only contain mirrored partitions, so there should be no I/O to these drives around MD, except for rare cases of lilo running :)

2) I installed a third submirror disk this weekend; it's an IDE slave device hdd (near the failing hdc). Since then I have got errors on other partitions, attached below as "2*)".

3) The failures which lead to reboots are usually preceded by a long history of dma_intr errors 0x40 and 0x51, but the sample I sent was already rather full.
A few errors preceded it every 5 seconds, making the full trace look like this:

[87319.049902] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87319.057393] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87319.067205] ide: failed opcode was: unknown
[87323.956399] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87323.963681] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87323.973171] ide: failed opcode was: unknown
[87328.846265] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87328.853485] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87328.862834] ide: failed opcode was: unknown
[87333.736127] hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[87333.743535] hdc: dma_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315631
[87333.752876] ide: failed opcode was: unknown
[87333.806569] ide1: reset: success
[87338.675891] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87338.685143] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87338.694791] ide: failed opcode was: unknown
[87343.557424] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87343.566388] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87343.576105] ide: failed opcode was: unknown
[87348.472226] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87348.481170] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87348.490843] ide: failed opcode was: unknown
[87353.387028] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87353.395735] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315711
[87353.405500] ide: failed opcode was: unknown
[87353.461342] ide1: reset: success
[87358.326783] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87358.335739] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87358.345395] ide: failed opcode was: unknown
[87363.208313] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87363.217319] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87363.228371] ide: failed opcode was: unknown
[87368.106472] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87368.115414] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87368.125275] ide: failed opcode was: unknown
[87372.979686] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87372.988706] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87372.998849] ide: failed opcode was: unknown
[87373.052152] ide1: reset: success
[87377.927744] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87377.936682] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87377.946399] ide: failed opcode was: unknown
[87382.800953] hdc: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[87382.809881] hdc: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=176315718, sector=176315718
[87382.819511] ide: failed opcode was: unknown
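[Editorial aside: the nightly "echo check > sync_action" mentioned in the post uses the md sysfs interface (kernel 2.6.16 and later). A minimal sketch, guarded so it is a no-op on a machine without the array — the device name md0 is a placeholder:]

```shell
sysdir=/sys/block/md0/md                   # placeholder array name
if [ -w "$sysdir/sync_action" ]; then
    echo check > "$sysdir/sync_action"     # start a read-only consistency check
    cat "$sysdir/mismatch_cnt"             # sectors found inconsistent so far
else
    echo "md0 not present; skipping"
fi
```

A "check" only reads and compares; to also rewrite mismatched blocks one would write "repair" instead. Progress is visible in /proc/mdstat while the check runs.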
Re: Raid 1 up after raidhotadd without rebuild
Neil Brown wrote:
> On Tuesday November 14, [EMAIL PROTECTED] wrote:
>> Any ideas why this is and how to fix it?
>
> You don't mention a kernel version. If it is 2.6.18, upgrade to 2.6.18.2.

It's 2.6.18; I'll upgrade and try. I checked the log and guess it is "[PATCH] md: Fix bug where spares don't always get rebuilt properly when they become live." Exactly my problem. Thanks for the fast help.

ciao, elm
Raid 1 up after raidhotadd without rebuild
Hello,

I'm trying to rebuild a raid1 array by adding the replacement disk (sda) to the old one (sdf). I zeroed the new disk (sda) and created a partition table corresponding to the existing one. I adapted my raidtab to the new drive node (changed due to another disk replacement) and now I'm trying to add the new disk, but the raid 1 is not rebuilding; the newly added disk is immediately marked as U(p) in the mdstat output.

Here's what I do step by step (mdstat output reduced to one raid array, but the problem exists with every partition used as a raid 1):

# cat /proc/mdstat
md1 : active raid1 sdf1[1]
      112728000 blocks [2/1] [_U]

# raidhotadd /dev/md1 /dev/sda1 && cat /proc/mdstat
md1 : active raid1 sda1[1] sdf1[0]
      112728000 blocks [2/2] [UU]

That's it: no resync, nothing. When I do a fsck on md1 there are errors en masse. The raidtools don't even seem to touch sda1, since when I create a filesystem with content on it, I can still mount it after adding it to the raid.

Any ideas why this is and how to fix it?

Thanks in advance & ciao, elm

Here is the entry in the syslog:
==
Nov 15 09:34:46 server md: bind
Nov 15 09:34:46 server RAID1 conf printout:
Nov 15 09:34:46 server --- wd:1 rd:2
Nov 15 09:34:46 server disk 0, wo:1, o:1, dev:sda1
Nov 15 09:34:46 server disk 1, wo:0, o:1, dev:sdf1
Nov 15 09:34:46 server md: syncing RAID array md2
Nov 15 09:34:46 server md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Nov 15 09:34:46 server md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for reconstruction.
Nov 15 09:34:46 server md: using 128k window, over a total of 112728000 blocks.
Nov 15 09:34:46 server md: md1: sync done.
Nov 15 09:34:46 server RAID1 conf printout:
Nov 15 09:34:46 server --- wd:2 rd:2
Nov 15 09:34:46 server disk 0, wo:0, o:1, dev:sda1
Nov 15 09:34:46 server disk 1, wo:0, o:1, dev:sdf1
==

Here is the raidtab:
==
raiddev /dev/md1
    raid-level            1
    nr-raid-disks         2
    nr-spare-disks        0
    persistent-superblock 1
    chunk-size            8k
    device                /dev/sda1
    raid-disk             0
    device                /dev/sdf1
    raid-disk             1
==

-- 
"Religion and family are the two greatest enemies of progress."
(André Gide (1869 - 1951), French writer)
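[Editorial aside: raidtools was already deprecated in this era; the mdadm equivalent of the hot-add above, with the device names taken from the post, is sketched below. A genuine resync of this array is also not a huge cost: it is 112728000 1K blocks, and the 50 MB/s rate used in the estimate is an assumption, not a figure from the post.]

```shell
# mdadm equivalent of "raidhotadd /dev/md1 /dev/sda1":
# mdadm /dev/md1 --add /dev/sda1
# then watch the rebuild progress:
# cat /proc/mdstat

# Full resync time for 112728000 1K blocks at an assumed 50 MB/s:
echo "$(( 112728000 / 1024 / 50 / 60 )) minutes"   # prints "36 minutes"
```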
safest way to swap in a new physical disk
Hi. What is the safest way to switch out a disk in a software raid array created with mdadm? I'm not talking about replacing a failed disk; I want to take a healthy disk in the array and swap it for another physical disk.

Specifically, I have an array made up of 10 250 GB software-raid partitions on 8 300 GB disks and 2 250 GB disks, plus a hot spare. I want to switch the 250s to new 300 GB disks so everything matches. Is there a way to do this without risking a rebuild? I can't back everything up, so I want to be as risk-free as possible. I guess what I want is to do something like this:

(1) Unmount the array
(2) Un-create the array
(3) Somehow exactly duplicate partition X to a partition Y on a new disk
(4) Re-create the array with X gone and Y in its place
(5) Check that the array is OK without changing/activating it
(6) If there is a problem, switch from Y back to X and have it as though nothing changed

The part I'm worried about is (3), as I've tried duplicating partition images before and it never works right. Is there a way to do this with mdadm?

For what it's worth, mdadm / linux software raid handles this setup beautifully... easy to set up, easy to maintain, easy to fix... I've never had any trouble. And I didn't go broke buying raid controllers. GREAT software!

Thanks a bunch!
-Will Sheffler
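[Editorial aside on step (3): with the array stopped, a bit-exact copy of the member partition is what's needed, and dd provides that, provided the destination partition is at least as large and the whole partition is copied, superblock included. A safe dry-run of the copy-and-verify idea on ordinary files — the /dev/sdX1 and /dev/sdY1 names are placeholders for the real partitions:]

```shell
# Dry-run of "duplicate partition X to Y, then verify" using temp files.
src=$(mktemp); dst=$(mktemp)
head -c 1048576 /dev/urandom > "$src"      # stand-in for the source partition
dd if="$src" of="$dst" bs=64K 2>/dev/null  # real disks: if=/dev/sdX1 of=/dev/sdY1 bs=1M
cmp -s "$src" "$dst" && echo "copies identical"
rm -f "$src" "$dst"
```

One caveat worth knowing: the version-0.90 md superblock is located relative to the end of the device, so if the new partition is larger than the old one, the copied superblock will not be where mdadm expects it. Making the new partition exactly the same sector count as the old one sidesteps this.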