Re: Problem with --manage
On 18.07.2006 15:46:53, Neil Brown wrote: On Monday July 17, [EMAIL PROTECTED] wrote: /dev/md/0 on /boot type ext2 (rw,nogrpid) /dev/md/1 on / type reiserfs (rw) /dev/md/2 on /var type reiserfs (rw) /dev/md/3 on /opt type reiserfs (rw) /dev/md/4 on /usr type reiserfs (rw) /dev/md/5 on /data type reiserfs (rw) I'm running the following kernel: Linux ceres 2.6.16.18-rock #1 SMP PREEMPT Sun Jun 25 10:47:51 CEST 2006 i686 GNU/Linux and mdadm 2.4. Now, hdb seems to be broken, even though smart says everything's fine. After a day or two, hdb would fail: Jul 16 16:58:41 ceres kernel: raid5: Disk failure on hdb3, disabling device. Operation continuing on 2 devices Jul 16 16:58:41 ceres kernel: raid5: Disk failure on hdb5, disabling device. Operation continuing on 2 devices Jul 16 16:59:06 ceres kernel: raid5: Disk failure on hdb7, disabling device. Operation continuing on 2 devices Jul 16 16:59:37 ceres kernel: raid5: Disk failure on hdb8, disabling device. Operation continuing on 2 devices Jul 16 17:02:22 ceres kernel: raid5: Disk failure on hdb6, disabling device. Operation continuing on 2 devices Very odd... no other message from the kernel? You would expect something if there was a real error. This was the only output on the console. But I just checked /var/log/messages now... ouch... 
--- Jul 16 16:59:36 ceres kernel: hdb: status error: status=0x00 { } Jul 16 16:59:36 ceres kernel: ide: failed opcode was: 0xea Jul 16 16:59:36 ceres kernel: hdb: drive not ready for command Jul 16 16:59:36 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:36 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:36 ceres kernel: hdb: drive not ready for command Jul 16 16:59:36 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:36 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:36 ceres kernel: hdb: drive not ready for command Jul 16 16:59:36 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:36 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:36 ceres kernel: hdb: drive not ready for command Jul 16 16:59:36 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:36 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: ide0: reset: success Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x00 { } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: ide0: reset: success Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x00 { } Jul 16 16:59:37 ceres kernel: 
ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: end_request: I/O error, dev hdb, sector 488391932 Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: 0xea Jul 16 16:59:37 ceres kernel: raid5: Disk failure on hdb8, disabling device. Operation continuing on 2 devices Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: RAID5 conf printout: Jul 16 16:59:37 ceres kernel: --- rd:3 wd:2 fd:1 Jul 16 16:59:37 ceres kernel: disk 0, o:0, dev:hdb8 Jul 16 16:59:37 ceres kernel: disk 1, o:1, dev:hda8 Jul 16 16:59:37 ceres kernel: disk 2, o:1, dev:hdc8 Jul 16 16:59:37 ceres kernel: RAID5 conf printout: Jul 16 16:59:37 ceres kernel: --- rd:3 wd:2 fd:1 Jul 16 16:59:37 ceres kernel: disk 1, o:1, dev:hda8 Jul 16 16:59:37 ceres kernel: disk 2, o:1, dev:hdc8 --- Now, is this a broken IDE controller or harddisk? Because smartctl claims that everything is fine. The problem now is, the machine hangs after the last message and I can only turn it off by physically removing the power plug. alt-sysrq-P or alt-sysrq-T give anything useful? I tried alt-sysrq-o and -b, to no avail. Support for it is in my kernel and it works (tested earlier). When I now reboot the machine, `mdadm -A /dev/md[1-5]' will not start the arrays cleanly. They will all be lacking the hdb device and be 'inactive'. `mdadm -R' will not start them in this state. According to `mdadm --manage --help' using `mdadm --manage /dev/md/3 -a
Re: Problem with --manage
On Tuesday July 18, [EMAIL PROTECTED] wrote: Jul 16 16:59:37 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: ide0: reset: success Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x00 { } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: unknown Jul 16 16:59:37 ceres kernel: end_request: I/O error, dev hdb, sector 488391932 Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: hdb: status error: status=0x10 { SeekComplete } Jul 16 16:59:37 ceres kernel: ide: failed opcode was: 0xea Jul 16 16:59:37 ceres kernel: raid5: Disk failure on hdb8, disabling device. Operation continuing on 2 devices Jul 16 16:59:37 ceres kernel: hdb: drive not ready for command Jul 16 16:59:37 ceres kernel: RAID5 conf printout: Jul 16 16:59:37 ceres kernel: --- rd:3 wd:2 fd:1 Jul 16 16:59:37 ceres kernel: disk 0, o:0, dev:hdb8 Jul 16 16:59:37 ceres kernel: disk 1, o:1, dev:hda8 Jul 16 16:59:37 ceres kernel: disk 2, o:1, dev:hdc8 Jul 16 16:59:37 ceres kernel: RAID5 conf printout: Jul 16 16:59:37 ceres kernel: --- rd:3 wd:2 fd:1 Jul 16 16:59:37 ceres kernel: disk 1, o:1, dev:hda8 Jul 16 16:59:37 ceres kernel: disk 2, o:1, dev:hdc8 --- Now, is this a broken IDE controller or harddisk? Because smartctl claims that everything is fine. Ouch indeed. I've no idea whose 'fault' this is. Maybe ask on linux-ide. 
I don't have a script log or something, but here's what I did from an initrd with init=/bin/bash
# mount /dev /proc /sys /tmp
# start udevd udevtrigger udevsettle
while read a dev c ; do
  [ $a != ARRAY ] && continue
  [ -e /dev/md/${dev##*/} ] || /bin/mknod $dev b 9 ${dev##*/}
  /sbin/mdadm -A ${dev}
done < /etc/mdadm.conf

mdadm -As --auto=yes should be sufficient

Personalities : [linear] [raid0] [raid1] [raid5] [raid4]
md5 : inactive raid5 hda8[1] hdc8[2]
451426304 blocks level 5, 64k chunk, algorithm 2 [2/3] [_UU]

Ahh, OK, make that mdadm -As --force --auto=yes
A crash combined with a drive failure can cause undetectable data corruption. You need to give --force to effectively acknowledge that. I should get mdadm to explain what is happening so that I don't have to as much.
NeilBrown
- To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
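The assembly loop above can be sketched as a dry run that only prints what it would do. This is a minimal sketch, not the poster's actual script: `list_assemble_cmds` is a hypothetical helper name, and it assumes ARRAY lines name devices as /dev/md/N, as in this thread.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands needed to assemble
# every array named on an ARRAY line of an mdadm.conf-style file.
# list_assemble_cmds is a hypothetical helper, not part of mdadm.
list_assemble_cmds() {
    conf=$1
    while read -r keyword dev rest; do
        # Skip anything that is not an ARRAY line (DEVICE lines, comments, blanks).
        [ "$keyword" != ARRAY ] && continue
        # Assumption: the device is named /dev/md/N, so the part after the
        # last slash is the md minor number.
        minor=${dev##*/}
        echo "mknod $dev b 9 $minor"
        echo "mdadm -A $dev"
    done < "$conf"
}
```

Calling `list_assemble_cmds /etc/mdadm.conf` prints one mknod and one mdadm line per array. As noted in the reply, `mdadm -As --force --auto=yes` should make the loop unnecessary.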
mdadm -X bitmap status off by 2^16
Hi! Another pseudo-problem :) I've just set up a RAID5 array by creating a three-disk one from two disks, and later adding the third. Everything seems normal, but the mdadm (2.5.2) -X output:
Filename : /dev/hda3
Magic : 6d746962
Version : 4
UUID : 293ceee6.d1811fb1.a8b316e6.b54abcc7
Events : 12
Events Cleared : 12
State : OK
Chunksize : 1 MB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 292784512 (279.22 GiB 299.81 GB)
Bitmap : 285923 bits (chunks), 65536 dirty (22.9%)
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 65536 dirty (22.9%)
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 1 dirty (0.0%)
Bitmap : 285923 bits (chunks), 1 dirty (0.0%)
Bitmap : 285923 bits (chunks), 65537 dirty (22.9%)
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 7 dirty (0.0%)
Bitmap : 285923 bits (chunks), 7 dirty (0.0%)
Bitmap : 285923 bits (chunks), 65543 dirty (22.9%)
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 65536 dirty (22.9%)
# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 hdd3[2] hdb3[0] hda3[1]
585569024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
bitmap: 0/140 pages [0KB], 1024KB chunk
Is this going to bite me later on, or just a harmless display problem?
Janos
Re: XFS and write barrier
On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote: On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
I am currently gathering information to write an article about journal filesystems with emphasis on write barrier functionality, how it works, why journalling filesystems need write barrier and the current implementation of write barrier support for different filesystems.

Cool! Would you by any chance have information on the interaction between journal filesystems with write barrier functionality, and software RAID (md)? Based on my experience with 2.6.17, XFS detects that the underlying software RAID 1 device does not support barriers and therefore disables that functionality.

No one here seems to know, maybe Neil | the other folks on linux-raid can help us out with details on status of MD and write barriers?
cheers.
-- Nathan
Re: trying to brute-force my RAID 5...
What are you expecting fdisk to tell you? fdisk lists partitions and I suspect you didn't have any partitions on /dev/md0 More likely you want something like fsck -n -f /dev/md0 and see which one produces the least noise. Maybe a simple file -s /dev/md0 could do the trick, and would only produce output different from the mere data when the good configuration is found... -- F.-E.B.
Re: trying to brute-force my RAID 5...
Francois Barre wrote: What are you expecting fdisk to tell you? fdisk lists partitions and I suspect you didn't have any partitions on /dev/md0 More likely you want something like fsck -n -f /dev/md0 and see which one produces the least noise. Maybe a simple file -s /dev/md0 could do the trick, and would only produce output different from the mere data when the good configuration is found... More likely to produce an output whenever the 1st disk in the array is in the right place as it will just look at the 1st couple of sectors for the superblock. I'd go with the fsck idea as it will try to inspect the rest of the filesystem also. Brad -- Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so. -- Douglas Adams
Re: trying to brute-force my RAID 5...
More likely to produce an output whenever the 1st disk in the array is in the right place as it will just look at the 1st couple of sectors for the superblock. I'd go with the fsck idea as it will try to inspect the rest of the filesystem also.

Obviously that's true, but it's still a good way to be sure of the first disk, and the time cost of the file -s is negligible... Personally, I would have done both.
Re: Still can't get md arrays that were started from an initrd to shutdown
with lvm you have to stop lvm before you can stop the arrays... i wouldn't be surprised if evms has the same issue... AFAIK there's no counterpart to evms_activate. Besides, I'm no longer using EVMS, I just included it in my testing since this issue bit me there first. Thanks, Christian
read-ahead cache on indiv. raid members and entire md device
Hi, I'm looking for advice on tuning the read-ahead cache for an md device... for example, should I merely set the read-ahead for the md device:
blockdev --setra ### /dev/md2
or should I start touching the individual raid member devices:
blockdev --setra ### /dev/sdc1
blockdev --setra ### /dev/sdd1
blockdev --setra ### /dev/sde1, etc.
I'm not sure about the relation between the two caches. Also, does the read-ahead cache for the entire md device show up in /proc/ or /sys/block/mdX? I of course find the read-ahead for the individual devices under, for example:
/sys/block/sdc/queue/read_ahead_kb
Many thanks in advance. Cheers, -- roy
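One detail that is easy to trip over when comparing the two interfaces mentioned above: blockdev --setra and --getra count 512-byte sectors, while read_ahead_kb under /sys counts kilobytes. A minimal sketch of the conversion follows; the helper names are hypothetical, and it does not answer how the md-level and member-level settings interact.

```shell
#!/bin/sh
# blockdev --setra/--getra use 512-byte sectors;
# /sys/block/<dev>/queue/read_ahead_kb uses kilobytes.
# Hypothetical helpers to convert between the two units.
ra_sectors_to_kb() { echo $(( $1 / 2 )); }
ra_kb_to_sectors() { echo $(( $1 * 2 )); }
```

For example, a value of 256 passed to blockdev --setra corresponds to 128 in read_ahead_kb.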
Re: which disk the the one that data is on?
Hi, So if I were to want to stop the resync process on a very large array (1.4T), since it is in the middle of the day and makes work slower... How can I tell which drive is the one that is being used to check all the rest of the data? Or in other words, how can I stop the resync process and let it continue later? Shai

On 7/18/06, Neil Brown [EMAIL PROTECTED] wrote: On Tuesday July 18, [EMAIL PROTECTED] wrote:
Hi, I rebooted my server today to find out that one of the arrays is being re-synced (see output below).
1. What does the (S) to the right of hdh1[5](S) mean?

Spare.

2. How do I know, from this output, which disk is the one holding the most current data and from which all the other drives are syncing from? Or are they all containing the data and this sync process is something else? Maybe I'm just not understanding what is being done exactly?

This is raid5. The data is distributed over all of the drives. There are also 'parity' blocks used for coping with missing devices. This 'resync' process is checking that the parity blocks are all correct and will correct any that are wrong.
NeilBrown

Thanks in advance for the help. Shai
---
md0 : active raid5 hdd1[1] hdc1[0] hdh1[5](S) hdg1[4] hdf1[3] hde1[2]
781433344 blocks level 5, 64k chunk, algorithm 2 [5/5] [UUUUU]
[=...] resync = 5.1% (9965184/195358336) finish=180.5min speed=17110K/sec
Re: [PATCH] enable auto=yes by default when using udev
I think I'm leaning towards auto-creating names if they look like standard names (or are listed in mdadm.conf?), but requiring auto=whatever to create anything else.

The auto= option has the disadvantage that it is different for partitionable and regular arrays -- is there no way to detect from the array if it is supposed to be partitionable or not? As it is, scripts are better off creating the node with correct major/minor and assembling without auto=. Regards, C.
Re: Raid and LVM and LILO
Hi Du, Did you create a /boot partition? /boot cannot be on LVM (AFAIK), and can be a regular partition or raid1. HTH. Paul

Du wrote: Hi, I was/am trying to install Debian Sarge r2 with 2 SATA HDs working on RAID 1 via software, and on this new MD device I put LVM. All works fine and Debian installs well, but when LILO tries to install, it tells me that I don't have an active partition, and no matter what I do, it doesn't install. Does anybody know what is happening? Is it a bad idea to work with software RAID and LVM?
Re: which disk the the one that data is on?
Shai wrote: Hi, I rebooted my server today to find out that one of the arrays is being re-synced (see output below) . 1. What does the (S) to the right of hdh1[5](S) mean? 2. How do I know, from this output, which disk is the one holding the most current data and from which all the other drives are syncing from? Or are they all containing the data and this sync process is something else? Maybe I'm just not understanding what is being done exactly? In addition to what you have already been told, if you find out that the array is in rebuild I would be a lot more worried to find out why. If it was from unclean shutdown you really should look into a bitmap if you don't have one. -- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979
Re: issue with internal bitmaps
Bill Davidsen wrote: Neil Brown wrote: On Thursday July 6, [EMAIL PROTECTED] wrote:
hello, i just realized that internal bitmaps do not seem to work anymore.

I cannot imagine why. Nothing you have listed shows anything wrong with md... Maybe you were expecting mdadm -X /dev/md100 to do something useful. Like -E, -X must be applied to a component device. Try mdadm -X /dev/sda1

To take this from the other end, why should -X apply to a component? Since the components can and do change names, and you frequently mention assembly by UUID, why aren't the component names determined from the invariant array name when mdadm wants them, instead of having a user or script check the array to get the components?

Boy, I didn't say that well... what I meant to suggest is that when -E or -X are applied to the array as a whole, would it not be useful to iterate them over all of the components rather than looking for nonexistent data in the array itself?
-- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979
Re: Raid and LVM and LILO
I assume that your /boot was raid1... I had similar issues with the Debian installer, trying to install a file server using LVM on top of RAID. I never did work out the problem; I installed Fedora Core :-/ Sorry I can't be of more help :-( Paul

Du wrote: Paul Waldo wrote: Hi Du, Did you create a /boot partition? /boot cannot be on LVM (AFAIK), and can be a regular partition or raid1. HTH.

The second thing I tried was that. I made a 200 MB /dev/md0 to be the /boot partition and the rest to be /dev/md1, where the system will be under LVM. But LILO tells me that I don't have an active partition. I set /dev/md0 to be bootable, and set the 2 "Linux raid autodetect" partitions that make up /dev/md0 bootable too. But LILO never installs...
Re: trying to brute-force my RAID 5...
Sevrin Robstad wrote: I created the RAID when I installed Fedora Core 3 some time ago, didn't do anything special so the chunks should be 64kbyte and parity should be left-symmetric ? I have no idea what's default on FC3, sorry. Any Idea ? I missed that you were trying to fdisk -l /dev/md0.. As others have suggested, search for filesystems using fsck, or mount, or what not ;-).
Re: mdadm -X bitmap status off by 2^16
Janos Farkas wrote:
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 65536 dirty (22.9%)

This indicates that the _on-disk_ bits are cleared on two disks, but set on the third.

# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 hdd3[2] hdb3[0] hda3[1]
585569024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
bitmap: 0/140 pages [0KB], 1024KB chunk

This indicates that the _in-memory_ bits are all cleared. At array startup, md initializes the in-memory bitmap from the on-disk copy. It then uses the in-memory bitmap from that point on, shadowing any changes there into the on-disk bitmap. At the end of a rebuild (which should have happened after you added the third disk), the bits should all be cleared. The on-disk bits get cleared lazily, though. Is there any chance that they are cleared now? If not, it sounds like a bug to me.
-- Paul
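To make the discrepancy between components easy to spot, the per-device dirty counts can be compared mechanically. The following is a minimal sketch; `dirty_spread` is a hypothetical helper that assumes the "Bitmap : N bits (chunks), M dirty (P%)" line format quoted in this thread.

```shell
#!/bin/sh
# dirty_spread: read concatenated `mdadm -X` output on stdin and print the
# difference between the largest and smallest dirty count it finds.
# Assumes lines of the form "Bitmap : N bits (chunks), M dirty (P%)".
dirty_spread() {
    awk '/Bitmap :/ {
             # The dirty count is the field immediately before the word "dirty".
             for (i = 2; i <= NF; i++) if ($i == "dirty") n = $(i - 1) + 0
             if (!seen || n < min) min = n
             if (!seen || n > max) max = n
             seen = 1
         }
         END { if (seen) print max - min }'
}
```

Fed with the 7 / 7 / 65543 sample from the original report, this prints 65536, which is 2^16. A spread of 0 would mean all components agree.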
md reports: unknown partition table
Hi After a powercut I'm trying to mount an array and failing :( teak:~# mdadm --assemble /dev/media --auto=p /dev/sd[bcdef]1 mdadm: /dev/media has been started with 5 drives. Good However: teak:~# mount /media mount: /dev/media1 is not a valid block device teak:~# dd if=/dev/media1 of=/dev/null dd: opening `/dev/media1': No such device or address teak:~# dd if=/dev/media of=/dev/null 792442+0 records in 792441+0 records out 405729792 bytes transferred in 4.363571 seconds (92981135 bytes/sec) (after ^C) dmesg shows: raid5: device sdb1 operational as raid disk 0 raid5: device sdf1 operational as raid disk 4 raid5: device sde1 operational as raid disk 3 raid5: device sdd1 operational as raid disk 2 raid5: device sdc1 operational as raid disk 1 raid5: allocated 5235kB for md_d127 raid5: raid level 5 set md_d127 active with 5 out of 5 devices, algorithm 2 RAID5 conf printout: --- rd:5 wd:5 fd:0 disk 0, o:1, dev:sdb1 disk 1, o:1, dev:sdc1 disk 2, o:1, dev:sdd1 disk 3, o:1, dev:sde1 disk 4, o:1, dev:sdf1 md_d127: bitmap initialized from disk: read 1/1 pages, set 0 bits, status: 0 created bitmap (5 pages) for device md_d127 md_d127: unknown partition table That last line looks odd... It was created like so: mdadm --create /dev/media --level=5 -n 5 -e1.2 --bitmap=internal --name=media --auto=p /dev/sd[bcdef]1 and the xfs fstab entry is: /dev/media1 /media xfs rw,noatime,logdev=/dev/media2 0 0 fdisk /dev/media shows: Device Boot Start End Blocks Id System /dev/media1 1 312536035 1250144138 83 Linux /dev/media2 312536036 312560448 97652 da Non-FS data cfdisk even gets the filesystem right... Which is expected. 
teak:~# ll /dev/media*
brw-rw 1 root disk 254, 192 2006-07-18 17:18 /dev/media
brw-rw 1 root disk 254, 193 2006-07-18 17:18 /dev/media1
brw-rw 1 root disk 254, 194 2006-07-18 17:18 /dev/media2
brw-rw 1 root disk 254, 195 2006-07-18 17:18 /dev/media3
brw-rw 1 root disk 254, 196 2006-07-18 17:18 /dev/media4
teak:~# uname -a
Linux teak 2.6.16.19-teak-060602-01 #3 PREEMPT Sat Jun 3 09:20:24 BST 2006 i686 GNU/Linux
teak:~# mdadm -V
mdadm - v2.5.2 - 27 June 2006
David
Re: mdadm -X bitmap status off by 2^16
Hi! On 2006-07-18 at 11:30:42, Paul Clements wrote: Personalities : [raid1] [raid6] [raid5] [raid4] md0 : active raid5 hdd3[2] hdb3[0] hda3[1] 585569024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU] bitmap: 0/140 pages [0KB], 1024KB chunk This indicates that the _in-memory_ bits are all cleared. Makes sense. At array startup, md initializes the in-memory bitmap from the on-disk copy. It then uses the in-memory bitmap from that point on, shadowing any changes there into the on-disk bitmap. At the end of a rebuild (which should have happened after you added the third disk), the bits should all be cleared. The on-disk bits get cleared lazily, though. Is there any chance that they are cleared now? If not, it sounds like a bug to me. I just removed/readded the bitmap as follows, but before that, the 65536 still was there as of 5 minutes ago. # mdadm /dev/md0 --grow -b none # for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map Bitmap : 285923 bits (chunks), 3 dirty (0.0%) Bitmap : 285923 bits (chunks), 3 dirty (0.0%) Bitmap : 285923 bits (chunks), 65539 dirty (22.9%) # for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map Bitmap : 285923 bits (chunks), 3 dirty (0.0%) Bitmap : 285923 bits (chunks), 3 dirty (0.0%) Bitmap : 285923 bits (chunks), 65539 dirty (22.9%) (Bitmaps still present, probably I was just too impatient after the removal) # mdadm /dev/md0 --grow -b internal # for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map Bitmap : 285923 bits (chunks), 285923 dirty (100.0%) Bitmap : 285923 bits (chunks), 285923 dirty (100.0%) Bitmap : 285923 bits (chunks), 285923 dirty (100.0%) # for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map Bitmap : 285923 bits (chunks), 285923 dirty (100.0%) Bitmap : 285923 bits (chunks), 285923 dirty (100.0%) Bitmap : 285923 bits (chunks), 285923 dirty (100.0%) # cat /proc/mdstat Personalities : [raid1] [raid6] [raid5] [raid4] md0 : active raid5 hdd3[2] hdb3[0] hda3[1] 585569024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU] bitmap: 
140/140 pages [560KB], 1024KB chunk
unused devices: <none>
(Ouch, I hoped there wouldn't be another resync :)
# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 hdd3[2] hdb3[0] hda3[1]
585569024 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
bitmap: 1/140 pages [4KB], 1024KB chunk
unused devices: <none>
(Now the in-memory bitmaps seem to be emptied again)
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 285923 dirty (100.0%)
Bitmap : 285923 bits (chunks), 285923 dirty (100.0%)
Bitmap : 285923 bits (chunks), 285923 dirty (100.0%)
# for i in hdb3 hdd3 hda3 ; mdadm -X /dev/$i|grep map
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
Bitmap : 285923 bits (chunks), 0 dirty (0.0%)
And fortunately the on-disk ones too... This discrepancy was there after at least two reboots after the whole resync had been done. I also did a scrub (check) on the array, and it still did not change.
Re: XFS and write barrier
On Tue, Jul 18, 2006 at 06:58:56PM +1000, Neil Brown wrote: On Tuesday July 18, [EMAIL PROTECTED] wrote: On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote: On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
I am currently gathering information to write an article about journal filesystems with emphasis on write barrier functionality, how it works, why journalling filesystems need write barrier and the current implementation of write barrier support for different filesystems.

"Journalling filesystems need write barrier" isn't really accurate. They can make good use of write barrier if it is supported, and where it isn't supported, they should use blkdev_issue_flush in combination with regular submit/wait.

blkdev_issue_flush() causes a write cache flush - just like a barrier typically causes a write cache flush up to the I/O with the barrier in it. Both of these mechanisms provide the same thing - an I/O barrier that enforces ordering of I/Os to disk. Given that filesystems already indicate to the block layer when they want a barrier, wouldn't it be better to get the block layer to issue this cache flush if the underlying device doesn't support barriers and it receives a barrier request? FWIW, only XFS and Reiser3 use this function, and only then when issuing a fsync when barriers are disabled to make sure a common test (fsync then power cycle) doesn't result in data loss...

No one here seems to know, maybe Neil | the other folks on linux-raid can help us out with details on status of MD and write barriers?

In 2.6.17, md/raid1 will detect if the underlying devices support barriers and if they all do, it will accept barrier requests from the filesystem and pass those requests down to all devices. Other raid levels will reject all barrier requests.

Any particular reason for not supporting barriers on the other types of RAID? Cheers, Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group
Re: md reports: unknown partition table - fixed.
David Greaves wrote: Hi After a powercut I'm trying to mount an array and failing :(

A reboot after tidying up /dev/ fixed it. The first time through I'd forgotten to update the boot scripts and they were assembling the wrong UUID. That was fine; I realised this and ran the manual assemble:
mdadm --assemble /dev/media /dev/sd[bcdef]1
dmesg
cat /proc/mdstat
All OK (but I'd forgotten that this was a partitioned array). I suspect the device entries for /dev/media[1234] from last time were hanging about.
mount /media
fdisk /dev/media
So I guess this fails because the major-minor are for a non-p md device?
mdadm --assemble /dev/media --auto=p /dev/sd[bcdef]1
mdadm --stop /dev/media
This fails because I'm on mdadm 2.4.1
mdadm --assemble /dev/media --auto=p /dev/sd[bcdef]1
cat /proc/mdstat
mdadm --stop /dev/md_d0
mdadm --stop /dev/md0
cat /proc/mdstat
So by now I upgrade to mdadm 2.5.1 in another session.
mdadm --stop /dev/media
dmesg
cat /proc/mdstat
and it stops.
mdadm --assemble /dev/media --auto=p /dev/sd[bcdef]1
But now it won't create working devices... Much messing about with assemble and I try a kernel upgrade - can't because the driver for my video card won't compile under 2.6.17 yet so WTF, I suspect major/minor numbers so just reboot it under the same kernel. All seems well. I think there's a bug here somewhere. I wonder/suspect that the superblock should contain the fact that it's a partitioned/able md device?
David
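The major/minor suspicion can be checked with a little arithmetic. The sketch below assumes that partitionable md arrays (md_dN) reserve 64 minors each, so that minor = N * 64 + partition; that layout is an assumption about the kernel in use here, and the helper names are hypothetical.

```shell
#!/bin/sh
# Assumption: each partitionable md array md_dN owns 64 consecutive minors,
# i.e. minor = N * 64 + partition_number (partition 0 is the whole array).
mdp_decode_minor() {
    echo "md_d$(( $1 >> 6 )) partition $(( $1 & 63 ))"
}
# mdp_minor ARRAY_NUMBER [PARTITION]: the minor a node would need.
mdp_minor() {
    echo $(( ($1 << 6) + ${2:-0} ))
}
```

Under that assumption, the nodes listed in the original report (minors 192 to 196) would decode to md_d3, while the kernel log names the running array md_d127, whose first partition would need minor 8129. That would be consistent with stale device nodes left over from an earlier assembly, though the thread does not confirm it.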
Re: trying to brute-force my RAID 5...
Neil Brown wrote:
I have written some posts about this before... My 6 disk RAID 5 broke down because of hardware failure. When I tried to get it up'n'running again I did a --create without any missing disk, which made it rebuild. I have also lost all information about how the old RAID was set up.. I got a friend of mine to make a list of all the 6^6 combinations of dev 1 2 3 4 5 missing, and set it up this way:
mdadm --create -n 6 -l 5 dev1 2 3 4 5 missing ; fdisk -l /dev/md0 ; mdadm --stop /dev/md0
But a cat logfile | grep Linux of the output of this script tells me that with none of these combinations does it find a valid type 83 partition. Shouldn't this work???

No. What are you expecting fdisk to tell you? fdisk lists partitions and I suspect you didn't have any partitions on /dev/md0. More likely you want something like fsck -n -f /dev/md0 and see which one produces the least noise.

They all produce "Bad magic number in super-block while trying to open /dev/md0". I tried file -s /dev/md0 also, and with one of the disks as first disk I got "ext 3 filedata (needs journal recovery) (errors)". But as fsck -n -f can't do anything with it, there might not be any hope? Or can it still be that I have some wrong setting? Chunk size is (and was) default 64k, yes?
Sevrin
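As a side note on the search space: if each of the six slots holds a distinct entry (five devices plus "missing"), there are 6! = 720 orderings to try, not 6^6. The following is a minimal sketch that only prints candidate commands; `permute` and `print_candidates` are hypothetical helpers, the device names are placeholders, and chunk size or parity layout may also need varying if the defaults were not used.

```shell
#!/bin/sh
# permute PREFIX ITEMS...: print every ordering of ITEMS, one per line.
# Items must be unique and contain no whitespace. Uses `local`, which
# bash and dash both support.
permute() {
    local prefix="$1" item other rest
    shift
    if [ $# -eq 0 ]; then
        echo "$prefix"
        return
    fi
    for item in "$@"; do
        # Build the list of remaining items without the one just picked.
        rest=
        for other in "$@"; do
            [ "$other" = "$item" ] || rest="$rest $other"
        done
        permute "${prefix:+$prefix }$item" $rest
    done
}

# print_candidates SLOTS...: print (never run) one create command per ordering.
print_candidates() {
    local n=$# order
    permute "" "$@" | while read -r order; do
        echo "mdadm --create /dev/md0 -n $n -l 5 $order"
    done
}
```

Each printed line would then be followed by the read-only check suggested above (fsck -n -f /dev/md0) and mdadm --stop /dev/md0 before trying the next ordering.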
Re: issue with internal bitmaps
On Tue, Jul 18, 2006 at 09:34:35AM -0400, Bill Davidsen wrote:
> Boy, I didn't say that well... what I meant to suggest is that when -E or -X are applied to the array as a whole, would it not be useful to iterate them over all of the components rather than looking for non-existent data in the array itself?

The question, I believe, is how to distinguish the case where an md device is a component of another md device...

L.

--
Luca Berra -- [EMAIL PROTECTED]
Communication Media Services S.r.l.
 /\
 \ /     ASCII RIBBON CAMPAIGN
  X        AGAINST HTML MAIL
 / \
Re: only 4 spares and no access to my data
On 18 Jul 2006, Neil Brown moaned:
> The superblock locations for sda and sda1 can only be 'one and the same' if sda1 is at an offset in sda which is a multiple of 64K, and if sda1 ends near the end of sda. This certainly can happen, but it is by no means certain.
>
> For this reason, version-1 superblocks record the offset of the superblock in the device so that if a superblock is written to sda1 and then read from sda, it will look wrong (wrong offset) and so will be ignored (no valid superblock here).

One case where this can happen is Sun slices (and I think BSD disklabels too), where /dev/sda and /dev/sda1 start at the *same place*.

(This causes amusing problems with LVM vgscan unless the raw devices are excluded, too.)

--
`We're sysadmins. We deal with the inconceivable so often I can clearly see the need to define levels of inconceivability.' --- Rik Steenwinkel
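The 64K condition Neil describes can be checked with a little arithmetic. The sketch below assumes the version-0.90 layout as I understand it: the superblock sits at the device size rounded down to a 64K boundary, minus 64K, which is 128 sectors of 512 bytes. The function names sb090_offset and same_sb and the sizes in the example are illustrative only.

```shell
# Sector offset of a 0.90 superblock inside a device of SIZE 512-byte sectors:
# round SIZE down to a multiple of 128 sectors (64K), then step back 64K.
sb090_offset() {
    echo $(( $1 / 128 * 128 - 128 ))
}

# Succeed if the whole disk and one partition would share a superblock location.
# Usage: same_sb DISK_SECTORS PART_START_SECTOR PART_SECTORS
same_sb() {
    local disk_sb part_sb
    disk_sb=$(sb090_offset "$1")
    part_sb=$(( $2 + $(sb090_offset "$3") ))   # absolute position on the disk
    [ "$disk_sb" -eq "$part_sb" ]
}
```

With a made-up 1000000-sector disk, same_sb 1000000 0 1000000 succeeds (the Sun slice case, where the slice starts at the same place as the disk), as does same_sb 1000000 128 999872, while a partition starting at sector 63 (same_sb 1000000 63 999937) does not line up.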
Re: which disk is the one that data is on?
Hi,

Another question on this matter please: if there is a raid5 with 4 disks and 1 missing, and we add that disk, then while it's doing the resync of that disk, how do we know which disk it is (if we forgot which one we added)?

Thanks,
Shai

On 7/18/06, Neil Brown [EMAIL PROTECTED] wrote:
> On Tuesday July 18, [EMAIL PROTECTED] wrote:
> > Hi,
> > So if I were to want to stop the resync process on a very large array (1.4T), since it is in the middle of the day and makes work slower... How can I tell which drive is the one that is being used to check all the rest of the data? Or in other words, how can I stop the resync process and let it continue later?
>
> There is no one that is being used to check all the rest. They are all in this together.
>
> But if you want to stop the resync because it is slowing things down then write a small number to /proc/sys/dev/raid/speed_limit_min, e.g.
>
>   echo 10 > /proc/sys/dev/raid/speed_limit_min
>
> This won't actually stop it, but it will make it go very slowly so as not to interfere with anything else.
>
> NeilBrown
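Neil's suggestion of writing a small number to speed_limit_min can be wrapped so that the previous value is restored after working hours. This is a minimal sketch with invented names (throttle_resync, restore_resync, RAID_LIMIT_FILE and saved_limit are not part of any tool); it needs root to write to /proc, and the values are, to my knowledge, in KB/sec per device.

```shell
# File holding the minimum resync speed; overridable for testing.
RAID_LIMIT_FILE=${RAID_LIMIT_FILE:-/proc/sys/dev/raid/speed_limit_min}

# Remember the current minimum, then write a small value (default 10).
throttle_resync() {
    saved_limit=$(cat "$RAID_LIMIT_FILE") || return 1
    echo "${1:-10}" > "$RAID_LIMIT_FILE"
}

# Put back the value remembered by throttle_resync.
restore_resync() {
    [ -n "$saved_limit" ] || return 1
    echo "$saved_limit" > "$RAID_LIMIT_FILE"
}
```

Running throttle_resync in the morning and restore_resync in the evening, from the same shell session, matches what Neil describes: the resync is not stopped, only slowed.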