Re: IBM xSeries stops responding during RAID1 reconstruction
On Thursday 15 June 2006 12:13, you wrote:

> If this is causing a lockup, then there is something else wrong, just
> as any single process should not - by writing constantly to disks -
> be able to clog up the whole system. Maybe if you could get the
> result of alt-sysrq-P

I tried some kernel changes, enabling HyperThreading on the (single) P4 processor and enabling CONFIG_PREEMPT_VOLUNTARY=y, but with no success.

During the lockup, Alt-SysRq-P constantly says that:

  EIP is at mwait_idle+0x1a/0x2e

while Alt-SysRq-T shows - among other processes - the MD sync thread and the locked-up bash. These are the hand-copied call traces:

  md3_resync:
    device_barrier
    default_wake_function
    sync_request
    __generic_unplug_device
    md_do_sync
    schedule
    md_thread
    md_thread
    kthread
    kthread
    kernel_thread_helper

  bash:
    io_schedule
    sync_buffer
    sync_buffer
    __wait_on_bit_lock
    sync_buffer
    out_of_line_wait_on_bit_lock
    wake_bit_function
    __lock_buffer
    do_get_write_access
    __ext3_get_inode_loc
    journal_get_write_access
    ext3_reserve_inode_write
    ext3_mark_inode_dirty
    ext3_dirty_inode
    __mark_inode_dirty
    update_atime
    vfs_readdir
    sys_getdents64
    filldir64
    syscall_call

This is also the output of top, which keeps running regularly during the lockup:

  top - 11:40:41 up 7 min,  2 users,  load average: 8.70, 4.92, 2.04
  Tasks:  70 total,   1 running,  69 sleeping,   0 stopped,   0 zombie
  Cpu(s):  0.2% us,  0.7% sy,  0.0% ni, 98.7% id,  0.0% wa,  0.0% hi,  0.5% si
  Mem:    906212k total,    58620k used,   847592k free,     3420k buffers
  Swap:  1951736k total,        0k used,  1951736k free,    23848k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    829 root      10  -5     0    0    0 S    1  0.0   0:01.70 md3_raid1
   2823 root      10  -5     0    0    0 D    1  0.0   0:01.62 md3_resync
      1 root      16   0  1956  656  560 S    0  0.1   0:00.52 init
      2 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/0
      3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
      4 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
      5 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
      6 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
      7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
      8 root      10  -5     0    0    0 S    0  0.0   0:00.01 events/0
      9 root      10  -5     0    0    0 S    0  0.0   0:00.01 events/1
     10 root      10  -5     0    0    0 S    0  0.0   0:00.00 khelper
     11 root      10  -5     0    0    0 S    0  0.0   0:00.00 kthread
     14 root      10  -5     0    0    0 S    0  0.0   0:00.00 kblockd/0
     15 root      10  -5     0    0    0 S    0  0.0   0:00.00 kblockd/1
     16 root      11  -5     0    0    0 S    0  0.0   0:00.00 kacpid
    152 root      20   0     0    0    0 S    0  0.0   0:00.00 pdflush
    153 root      15   0     0    0    0 D    0  0.0   0:00.00 pdflush
    154 root      17   0     0    0    0 S    0  0.0   0:00.00 kswapd0
    155 root      11  -5     0    0    0 S    0  0.0   0:00.00 aio/0
    156 root      11  -5     0    0    0 S    0  0.0   0:00.00 aio/1
    755 root      10  -5     0    0    0 S    0  0.0   0:00.00 kseriod
    796 root      10  -5     0    0    0 S    0  0.0   0:00.00 ata/0
    797 root      11  -5     0    0    0 S    0  0.0   0:00.00 ata/1
    799 root      11  -5     0    0    0 S    0  0.0   0:00.00 scsi_eh_0
    800 root      11  -5     0    0    0 S    0  0.0   0:00.00 scsi_eh_1
    825 root      15   0     0    0    0 S    0  0.0   0:00.00 kirqd
    831 root      10  -5     0    0    0 D    0  0.0   0:00.00 md2_raid1
    833 root      10  -5     0    0    0 S    0  0.0   0:00.00 md1_raid1
    834 root      10  -5     0    0    0 D    0  0.0   0:00.00 md0_raid1
    835 root      15   0     0    0    0 D    0  0.0   0:00.00 kjournald
    932 root      18  -4  2192  584  368 S    0  0.1   0:00.19 udevd
   1698 root      10  -5     0    0    0 S    0  0.0   0:00.00 khubd
   2031 root      22   0     0    0    0 S    0  0.0   0:00.00 kjournald
   2032 root      15   0     0    0    0 D    0  0.0   0:00.00 kjournald
   2142 daemon    16   0  1708  364  272 S    0  0.0   0:00.00 portmap
   2464 root      16   0  2588  932  796 S    0  0.1   0:00.01 syslogd

--
Niccolo Rigacci
Firenze - Italy

War against Iraq? Not in my name!
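For completeness: when the console keyboard is unusable but the machine still accepts a login, the same dumps can be requested through /proc instead of the keyboard. A minimal sketch, assuming the kernel was built with CONFIG_MAGIC_SYSRQ (output lands in the kernel log):

  # Enable magic SysRq handling, then request the same dumps as
  # Alt-SysRq-P / Alt-SysRq-T from a shell.
  echo 1 > /proc/sys/kernel/sysrq
  echo p > /proc/sysrq-trigger   # register/EIP dump (like Alt-SysRq-P)
  echo t > /proc/sysrq-trigger   # task list with call traces (like Alt-SysRq-T)
  dmesg | tail -n 100            # read back the dumped traces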
Re: Raid5 software problems after losing 4 disks for 48 hours
Neil, great stuff, it's online now!!!

I followed your 2nd suggestion and ran

  mdadm --create /dev/md0 -f -l5 -n15 -c32 /dev/sd[bcdefghijklmnop]1

After 8 hours we reached 99.9% and some errors appeared on sdh1, which was then kicked from the array, but the array was fully online. I'll see if any further errors are reported on sdh, but in the meantime I hot-added it back into the array, which was successful. To my surprise a full fsck reported a clean volume.

I am still unsure how this raid5 volume was partially readable with 4 disks missing. My understanding was that each file is written across all disks apart from one, which is used for CRC. So if 2 disks are offline the whole thing should be unreadable.

Once again, thanks for your help.

On 6/16/06, Neil Brown [EMAIL PROTECTED] wrote:
> On Friday June 16, [EMAIL PROTECTED] wrote:
>> And is there a way, if more than 1 disk goes offline, for the whole
>> array to be taken offline? My understanding of raid5 is lose 1+
>> disks and nothing on the raid would be readable.
>
> This is not the case here. Nothing will be writable, but some blocks
> might be readable.
>
>> All the disks are online now, what do I need to do to rebuild the
>> array?
>
> Have you tried
>
>   mdadm --assemble --force /dev/md0 /dev/sd[bcdefghijklmnop]1
>
> ??
>
> Actually, it occurs to me that that might not do the best thing if 4
> drives disappeared at exactly the same time (though it is unlikely
> that you would notice). You should probably use
>
>   mdadm --create /dev/md0 -f -l5 -n15 -c32 /dev/sd[bcdefghijklmnop]1
>
> This is assuming that e,f,g,h were in that order in the array before
> they died. The '-f' is quite important - it tells mdadm not to
> recover a spare, but to resync the parity blocks.
>
> NeilBrown
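For anyone repeating this recovery: before running --create it can be worth confirming the original device order from each member's superblock rather than assuming it. A minimal sketch (the grep pattern is aimed at 0.90-superblock --examine output; device names as in this thread):

  # Dump each member's superblock; the "this" line shows the slot the
  # disk occupied in the original array, so the --create order can be
  # checked against it.
  for d in /dev/sd[bcdefghijklmnop]1; do
      echo "== $d"
      mdadm --examine "$d" | grep -E 'UUID|Raid Devices|this'
  done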
Re: Raid5 reshape
Neil Brown wrote:
> On Friday June 16, [EMAIL PROTECTED] wrote:
>> Thanks for all the advice. One final question, what kernel and
>> mdadm versions do I need?
>
> For resizing raid5:
>   mdadm-2.4 or later
>   linux-2.6.17-rc2 or later
>
> NeilBrown

Ok, I tried and screwed up! I upgraded my kernel and mdadm. I set the grow going and all looked well, so as it said it was going to take 430 minutes, I went to Starbucks. When I came home there had been a power cut, but my UPS had shut the system down. When power returned I rebooted.

Now I think I had failed to set the new partition on /dev/hdc1 to Raid Autodetect, so it didn't find it at reboot. I tried to hot-add it, but now I seem to have a deadlock situation. Although --detail shows that the array is degraded and recovering, /proc/mdstat shows it is reshaping. In truth there is no disk activity and the count in /proc/mdstat is not changing. I guess the only good news is that I can still mount the device and my data is fine. Please see below... Any ideas what I should do next?

Thanks

Nigel

  [EMAIL PROTECTED] ~]# uname -a
  Linux homepc.nigelterry.net 2.6.17-rc6 #1 SMP Sat Jun 17 11:05:52 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
  [EMAIL PROTECTED] ~]# mdadm --version
  mdadm - v2.5.1 - 16 June 2006
  [EMAIL PROTECTED] ~]# mdadm --detail /dev/md0
  /dev/md0:
          Version : 00.91.03
    Creation Time : Tue Apr 18 17:44:34 2006
       Raid Level : raid5
       Array Size : 490223104 (467.51 GiB 501.99 GB)
      Device Size : 245111552 (233.76 GiB 250.99 GB)
     Raid Devices : 4
    Total Devices : 4
  Preferred Minor : 0
      Persistence : Superblock is persistent

      Update Time : Sat Jun 17 15:15:05 2006
            State : clean, degraded, recovering
   Active Devices : 3
  Working Devices : 4
   Failed Devices : 0
    Spare Devices : 1

           Layout : left-symmetric
       Chunk Size : 128K

   Reshape Status : 6% complete
    Delta Devices : 1, (3->4)

             UUID : 50e3173e:b5d2bdb6:7db3576b:644409bb
           Events : 0.3211829

      Number   Major   Minor   RaidDevice State
         0       8        1        0      active sync   /dev/sda1
         1       8       17        1      active sync   /dev/sdb1
         2       3       65        2      active sync   /dev/hdb1
         3       0        0        3      removed

         4      22        1        -      spare   /dev/hdc1
  [EMAIL PROTECTED] ~]# cat /proc/mdstat
  Personalities : [raid5] [raid4]
  md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
        490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
        [=>...................]  reshape =  6.9% (17073280/245111552) finish=86.3min speed=44003K/sec

  unused devices: <none>
  [EMAIL PROTECTED] ~]#
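For reference, the "Raid Autodetect" partition type Nigel mentions is MBR type fd. A minimal sketch of checking and setting it with the sfdisk interface of that era (device name taken from this thread; this only matters for in-kernel autodetection of 0.90-superblock arrays):

  # Show the current type of partition 1 on /dev/hdc, then mark it
  # "fd" (Linux raid autodetect) so the kernel finds it at boot.
  sfdisk --print-id /dev/hdc 1
  sfdisk --change-id /dev/hdc 1 fd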
Re: Raid5 software problems after losing 4 disks for 48 hours
Wilson Wilson wrote:
> Neil great stuff, its online now!!!

Congratulations :)

> I am still unsure how this raid5 volume was partially readable with
> 4 disks missing. My understanding each file is written across all
> disks apart from one, which is used for CRC. So if 2 disks are
> offline the whole thing should be unreadable.

I'll try :)

md doesn't operate at a file level, it operates on chunks. A chunk could be 64KB in size. For raid5 each stripe is made of n-1 chunks (raid6 would be n-2). When a stripe is read, if your file is in one of the chunks that's still there, then you're in luck. I guess md knows it's degraded and gives back as much data as possible. This means that you have a certain probability of accessing a given file, depending on its size, the filesystem, and the degree to which the array is degraded.

FWIW I'd *never* try a r/w operation on such a degraded array.

Speculation: I'm surprised you could mount such a 'sparse' array though. I wonder if some filesystems (like xfs) would just barf as they mounted, because they have more distributed mount-time data structures and would spot the missing chunks. Others (ext3?) may just mount and try to read blocks on demand.

David
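To make the chunk arithmetic concrete, here is a small sketch. The 64KB chunk size, 15-disk count, and byte offset are illustrative only, and the rotating parity layout is ignored:

  # Which chunk and stripe hold a given byte offset of the array?
  chunk_size=$((64 * 1024))             # 64 KiB chunks (example)
  ndisks=15                             # 14 data + 1 parity per stripe
  offset=$((300 * 1024 * 1024))         # a block 300 MiB into the array

  chunk_nr=$((offset / chunk_size))            # nth data chunk overall
  stripe=$((chunk_nr / (ndisks - 1)))          # raid5: n-1 data chunks/stripe
  chunk_in_stripe=$((chunk_nr % (ndisks - 1))) # position within that stripe

  echo "offset $offset -> stripe $stripe, data chunk $chunk_in_stripe"
  # If the disk holding that one chunk is gone (and the stripe can't be
  # rebuilt from parity), only that 64 KiB is unreadable - the rest of
  # the array may still read fine, which is why a badly degraded array
  # can be partially readable.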
Re: Raid5 reshape
On Saturday June 17, [EMAIL PROTECTED] wrote:
> Any ideas what I should do next? Thanks

Looks like you've probably hit a bug. I'll need a bit more info though.

First:

> [EMAIL PROTECTED] ~]# cat /proc/mdstat
> Personalities : [raid5] [raid4]
> md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
>       490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
>       [=>...................]  reshape =  6.9% (17073280/245111552) finish=86.3min speed=44003K/sec
>
> unused devices: <none>

This really makes it look like the reshape is progressing. How long after the reboot was this taken? How long after hdc1 was hot-added (roughly)? What does it show now?

What happens if you remove hdc1 again? Does the reshape keep going?

What I would expect to happen in this case is that the array reshapes into a degraded array, then the missing disk is recovered onto hdc1.

NeilBrown
Re: Raid5 reshape
Nigel J. Terry wrote:
> Neil Brown wrote:
>> On Saturday June 17, [EMAIL PROTECTED] wrote:
>>> Any ideas what I should do next? Thanks
>>
>> Looks like you've probably hit a bug. I'll need a bit more info
>> though.
>>
>> First:
>>
>>> [EMAIL PROTECTED] ~]# cat /proc/mdstat
>>> Personalities : [raid5] [raid4]
>>> md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
>>>       490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
>>>       [=>...................]  reshape =  6.9% (17073280/245111552) finish=86.3min speed=44003K/sec
>>>
>>> unused devices: <none>
>>
>> This really makes it look like the reshape is progressing. How long
>> after the reboot was this taken? How long after hdc1 was hot-added
>> (roughly)? What does it show now?
>>
>> What happens if you remove hdc1 again? Does the reshape keep going?
>>
>> What I would expect to happen in this case is that the array
>> reshapes into a degraded array, then the missing disk is recovered
>> onto hdc1.
>>
>> NeilBrown

I don't know how long the system was reshaping before the power went off, and then I had to restart when the power came back. It claimed it was going to take 430 minutes, so 6% would be about 25 minutes, which could make good sense; certainly it looked like it was working fine when I went out. Now nothing is happening. It shows:

  [EMAIL PROTECTED] ~]# cat /proc/mdstat
  Personalities : [raid5] [raid4]
  md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
        490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
        [=>...................]  reshape =  6.9% (17073280/245111552) finish=2281.2min speed=1665K/sec

  unused devices: <none>
  [EMAIL PROTECTED] ~]#

so the only thing changing is the time till finish. I'll try removing and adding /dev/hdc1 again. Will it make any difference if the device is mounted or not?

Nigel

Tried remove and add, made no difference:

  [EMAIL PROTECTED] ~]# mdadm /dev/md0 --remove /dev/hdc1
  mdadm: hot removed /dev/hdc1
  [EMAIL PROTECTED] ~]# cat /proc/mdstat
  Personalities : [raid5] [raid4]
  md0 : active raid5 sdb1[1] sda1[0] hdb1[2]
        490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
        [=>...................]  reshape =  6.9% (17073280/245111552) finish=2321.5min speed=1636K/sec

  unused devices: <none>
  [EMAIL PROTECTED] ~]# mdadm /dev/md0 --add /dev/hdc1
  mdadm: re-added /dev/hdc1
  [EMAIL PROTECTED] ~]# cat /proc/mdstat
  Personalities : [raid5] [raid4]
  md0 : active raid5 hdc1[4](S) sdb1[1] sda1[0] hdb1[2]
        490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
        [=>...................]  reshape =  6.9% (17073280/245111552) finish=2329.3min speed=1630K/sec

  unused devices: <none>
  [EMAIL PROTECTED] ~]#
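When a reshape looks stalled like this, the md sysfs files give a second view of what the kernel thinks it is doing. A minimal sketch using the standard 2.6 md sysfs layout (array name from this thread; availability of individual files depends on the exact kernel version):

  # What md believes it is doing right now, e.g. "reshape", "recover"
  # or "idle". (Careful: writing "idle" here aborts the action.)
  cat /sys/block/md0/md/sync_action

  # Sectors done vs. total for the running action; if this does not
  # advance between reads, the reshape is genuinely stuck, not slow.
  cat /sys/block/md0/md/sync_completed

  # The kernel log often says why a resync/reshape is blocked.
  dmesg | tail -n 20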
Re: Raid5 reshape
OK, thanks for the extra details. I'll have a look and see what I can find, but it'll probably be a couple of days before I have anything useful for you.

NeilBrown
Re: Raid5 reshape
Neil Brown wrote:
> OK, thanks for the extra details. I'll have a look and see what I
> can find, but it'll probably be a couple of days before I have
> anything useful for you.
>
> NeilBrown

OK, I'll try and be patient :-) At least everything else is working. Let me know if you need to ssh to my machine.

Nigel