Re: [BUG] Raid1/5 over iSCSI trouble
Hello,

Any news about this trouble? Any ideas? I'm trying to fix it, but I don't see any specific interaction between raid5 and istd. Has anyone tried to reproduce this bug on an arch other than sparc64? I only use sparc32 and sparc64 servers and I cannot test on other archs. Of course, I have a laptop, but I cannot create a raid5 array on its internal HD to test this configuration ;-)

Please note that I won't read my mail until next Saturday morning (CEST).

After disconnection of the iSCSI target:

Tasks: 232 total, 7 running, 224 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0%us, 15.2%sy, 0.0%ni, 84.3%id, 0.0%wa, 0.1%hi, 0.3%si, 0.0%st
Mem:   4139032k total, 4127584k used,   11448k free,   95752k buffers
Swap:  7815536k total,       0k used, 7815536k free, 3758792k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9738 root  15  -5     0    0    0 R  100  0.0  4:56.82  md_d0_raid5
 9774 root  15  -5     0    0    0 R  100  0.0  5:52.41  istd1
 9739 root  15  -5     0    0    0 R   14  0.0  0:28.90  md_d0_resync
 9916 root  20   0  3248 1544 1120 R    2  0.0  0:00.56  top
 4129 root  20   0 41648 5024 2432 S    0  0.1  2:56.17  fail2ban-server
    1 root  20   0  2576  960  816 S    0  0.0  0:01.58  init
    2 root  15  -5     0    0    0 S    0  0.0  0:00.00  kthreadd
    3 root  RT  -5     0    0    0 S    0  0.0  0:00.00  migration/0
    4 root  15  -5     0    0    0 S    0  0.0  0:00.02  ksoftirqd/0
    5 root  RT  -5     0    0    0 S    0  0.0  0:00.00  migration/1
    6 root  15  -5     0    0    0 S    0  0.0  0:00.00  ksoftirqd/1

Regards,

JKB
Re: Time to deprecate old RAID formats?
Doug Ledford wrote:
> On Mon, 2007-10-22 at 16:39 -0400, John Stoffel wrote:
>> I don't agree completely. I think the superblock location is a key
>> issue, because if you have a superblock location which moves depending
>> on the filesystem or LVM you use to look at the partition (or full
>> disk), then you need to be even more careful about how to poke at
>> things.
>
> This is the heart of the matter. When you consider that each file
> system and each volume management stack has a superblock, and some
> store their superblocks at the end of devices and some at the
> beginning, and they can be stacked, then it becomes next to impossible
> to make sure a stacked setup is never recognized incorrectly under any
> circumstance.

I wonder if we should not really be talking about superblock versions 1.0, 1.1, 1.2 etc., but rather about a data format (0.9 vs 1.0) and a location (end, start, offset4k)? This would certainly make things a lot clearer to new users:

mdadm --create /dev/md0 --metadata 1.0 --meta-location offset4k

mdadm --detail /dev/md0
/dev/md0:
        Version : 01.0
  Metadata-locn : End-of-device
  Creation Time : Fri Aug  4 23:05:02 2006
     Raid Level : raid0

And there you have the deprecation... only two superblock versions and no real changes to code etc.

David
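For reference, with today's mdadm the on-disk location is implied by the version string passed to --metadata (1.0 = end of device, 1.1 = start, 1.2 = 4K from the start) rather than by a separate option. A minimal sketch, with placeholder device names:

    # superblock at the end of each member (the "end" location)
    mdadm --create /dev/md0 --level=raid1 --raid-devices=2 --metadata=1.0 /dev/sda1 /dev/sdb1

    # superblock at the start, or 4K from the start
    mdadm --create /dev/md1 --level=raid1 --raid-devices=2 --metadata=1.1 /dev/sdc1 /dev/sdd1

    mdadm --examine /dev/sda1    # reports which metadata version a member carries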
Re: Time to deprecate old RAID formats?
Bill == Bill Davidsen [EMAIL PROTECTED] writes:

Bill> John Stoffel wrote:
Bill>> Why do we have three different positions for storing the superblock?

Bill> Why do you suggest changing anything until you get the answer to
Bill> this question? If you don't understand why there are three
Bill> locations, perhaps that would be a good initial investigation.

Because I've asked this question before and not gotten an answer, nor is it answered in the mdadm man page why we have this setup.

Bill> Clearly the short answer is that they reflect three stages of
Bill> Neil's thinking on the topic, and I would bet that he had a good
Bill> reason for moving the superblock when he did it.

So let's hear Neil's thinking about all this? Or should I just work up a patch to do what I suggest and see how that flies?

Bill> Since you have to support all of them or break existing arrays,
Bill> and they all use the same format so there's no saving of code
Bill> size to mention, why even bring this up?

Because of the confusion factor. Again, since no one has been able to articulate a reason why we have three different versions of the 1.x superblock, nor have I seen any good reasons for why we should have them, I'm going by the KISS principle to reduce the options to the best one. And no, I'm not advocating getting rid of legacy support, but I AM advocating that we settle on ONE standard format going forward as the default for all new RAID superblocks.

John
Re: Raid-10 mount at startup always has problem
Daniel L. Miller wrote:
> Richard Scobie wrote:
>> Daniel L. Miller wrote:
>>> And you didn't ask, but my mdadm.conf:
>>>
>>> DEVICE partitions
>>> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a
>>
>> Try adding auto=part at the end of your mdadm.conf ARRAY line.
>
> Thanks - will see what happens on my next reboot.

Current mdadm.conf:

DEVICE partitions
ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part

I still have the problem where on boot one drive is not part of the array. Is there a log file I can check to find out WHY a drive is not being added?

It's been a while since the reboot, but I did find some entries in dmesg - I'm appending both the md lines and the physical-disk-related lines. The bottom shows one disk not being added (this time it was sda) - and the disk that gets skipped on each boot seems to be random; there's no consistent failure:

[...]
md: raid10 personality registered for level 10
[...]
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
[...]
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0xc20001428480 ctl 0xc200014284a0 bmdma 0x00011410 irq 23
ata2: SATA max UDMA/133 cmd 0xc20001428580 ctl 0xc200014285a0 bmdma 0x00011418 irq 23
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata1.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata2.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5
ata1: bounce limit 0x, segment boundary 0x, hw segs 61
scsi 1:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5
ata2: bounce limit 0x, segment boundary 0x, hw segs 61
ACPI: PCI Interrupt Link [LSI1] enabled at IRQ 22
ACPI: PCI Interrupt :00:08.0[A] - Link [LSI1] - GSI 22 (level, high) - IRQ 22
sata_nv :00:08.0: Using ADMA mode
PCI: Setting latency timer of device :00:08.0 to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0xc2000142a480 ctl 0xc2000142a4a0 bmdma 0x00011420 irq 22
ata4: SATA max UDMA/133 cmd 0xc2000142a580 ctl 0xc2000142a5a0 bmdma 0x00011428 irq 22
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata3.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata4.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata4.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5
ata3: bounce limit 0x, segment boundary 0x, hw segs 61
scsi 3:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5
ata4: bounce limit 0x, segment boundary 0x, hw segs 61
sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: unknown partition table
sd 0:0:0:0: [sda] Attached SCSI disk
sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: unknown partition table
sd 1:0:0:0: [sdb] Attached SCSI disk
sd 2:0:0:0: [sdc] 312581808 512-byte hardware sectors (160042 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdc] 312581808 512-byte hardware sectors (160042 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc: unknown partition table
sd 2:0:0:0: [sdc] Attached SCSI disk
sd 3:0:0:0: [sdd] 312581808 512-byte hardware sectors (160042 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd
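A few commands can help narrow down why a member was left out after boot; a hedged sketch, using sda as the example missing device:

    cat /proc/mdstat               # which members the array actually assembled with
    mdadm --detail /dev/md0        # array state and which slot is marked removed
    mdadm --examine /dev/sda       # superblock and event count on the skipped member
    dmesg | grep -i 'md:'          # autorun/bind messages from the boot in question

If the superblock and event count on the skipped disk look sane, it can usually be hot-added back with "mdadm /dev/md0 --add /dev/sda" while the array is running.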
Re: Time to deprecate old RAID formats?
On 10/24/07, John Stoffel [EMAIL PROTECTED] wrote: Bill == Bill Davidsen [EMAIL PROTECTED] writes: Bill John Stoffel wrote: Why do we have three different positions for storing the superblock? Bill Why do you suggest changing anything until you get the answer to Bill this question? If you don't understand why there are three Bill locations, perhaps that would be a good initial investigation. Because I've asked this question before and not gotten an answer, nor is it answered in the man page for mdadm on why we have this setup. Bill Clearly the short answer is that they reflect three stages of Bill Neil's thinking on the topic, and I would bet that he had a good Bill reason for moving the superblock when he did it. So let's hear Neil's thinking about all this? Or should I just work up a patch to do what I suggest and see how that flies? Bill Since you have to support all of them or break existing arrays, Bill and they all use the same format so there's no saving of code Bill size to mention, why even bring this up? Because of the confusion factor. Again, since noone has been able to articulate a reason why we have three different versions of the 1.x superblock, nor have I seen any good reasons for why we should have them, I'm going by the KISS principle to reduce the options to the best one. And no, I'm not advocating getting rid of legacy support, but I AM advocating that we settle on ONE standard format going forward as the default for all new RAID superblocks. Why exactly are you on this crusade to find the one best v1 superblock location? Giving people the freedom to place the superblock where they choose isn't a bad thing. Would adding something like If in doubt, 1.1 is the safest choice. to the mdadm man page give you the KISS warm-fuzzies you're pining for? The fact that, after you read the manpage, you didn't even know that the only difference between the v1.x variants is the location that the superblock is placed indicates that you're not in a position to be so tremendously evangelical about affecting code changes that limit existing options. Mike - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Time to deprecate old RAID formats?
John Stoffel wrote: Bill == Bill Davidsen [EMAIL PROTECTED] writes: Bill John Stoffel wrote: Why do we have three different positions for storing the superblock? Bill Why do you suggest changing anything until you get the answer to Bill this question? If you don't understand why there are three Bill locations, perhaps that would be a good initial investigation. Because I've asked this question before and not gotten an answer, nor is it answered in the man page for mdadm on why we have this setup. Bill Clearly the short answer is that they reflect three stages of Bill Neil's thinking on the topic, and I would bet that he had a good Bill reason for moving the superblock when he did it. So let's hear Neil's thinking about all this? Or should I just work up a patch to do what I suggest and see how that flies? If you are only going to change the default, I think you're done, since people report problems with bootloaders starting versions other than 0.90. And until I hear Neil's thinking on this, I'm not sure that I know what the default location and type should be. In fact, reading the discussion I suspect it should be different for RAID-0 (should be at the end) and all other types (should be near the front). That retains the ability to mount one part of the mirror as a single partition, while minimizing the possibility of bad applications seeing something which looks like a filesystem at the start of a partition and trying to run fsck on it. Bill Since you have to support all of them or break existing arrays, Bill and they all use the same format so there's no saving of code Bill size to mention, why even bring this up? Because of the confusion factor. Again, since noone has been able to articulate a reason why we have three different versions of the 1.x superblock, nor have I seen any good reasons for why we should have them, I'm going by the KISS principle to reduce the options to the best one. And no, I'm not advocating getting rid of legacy support, but I AM advocating that we settle on ONE standard format going forward as the default for all new RAID superblocks. Unfortunately the solution can't be any simpler than the problem, and that's why I'm dubious that anything but the documentation should be changed, or an additional metadata target added per the discussion above, perhaps best1 for best 1.x format based on the raid level. -- bill davidsen [EMAIL PROTECTED] CTO TMR Associates, Inc Doing interesting things with small computers since 1979 - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Raid-10 mount at startup always has problem
On Wed, 2007-10-24 at 07:22 -0700, Daniel L. Miller wrote:
> Daniel L. Miller wrote:
>> Richard Scobie wrote:
>>> Daniel L. Miller wrote:
>>>> And you didn't ask, but my mdadm.conf:
>>>> DEVICE partitions
>>>> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a
>>> Try adding auto=part at the end of your mdadm.conf ARRAY line.
>> Thanks - will see what happens on my next reboot.
>
> Current mdadm.conf:
> DEVICE partitions
> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
>
> still have the problem where on boot one drive is not part of the
> array. Is there a log file I can check to find out WHY a drive is not
> being added?

It usually means either the device is busy at the time the raid startup happened, or the device wasn't created by udev yet at the time the startup happened. Is it failing to start the array properly in the initrd, or is this happening after you've switched to the rootfs and are running the startup scripts?

> md: md0 stopped.
> md: md0 stopped.
> md: bind<sdc>
> md: bind<sdd>
> md: bind<sdb>

Whole disk raid devices == bad. Lots of stuff can go wrong with that setup.

> md: md0: raid array is not clean -- starting background reconstruction
> raid10: raid set md0 active with 3 out of 4 devices
> md: couldn't update array info. -22
> md: resync of RAID array md0
> md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
> md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for resync.
> md: using 128k window, over a total of 312581632 blocks.
> Filesystem md0: Disabling barriers, not supported by the underlying device
> XFS mounting filesystem md0
> Starting XFS recovery on filesystem: md0 (logdev: internal)
> Ending XFS recovery on filesystem: md0 (logdev: internal)

--
Doug Ledford [EMAIL PROTECTED]
GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
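One way to reduce dependence on scan order and udev timing at assembly time is to list the members explicitly instead of relying on "DEVICE partitions" (which only considers devices already visible in /proc/partitions). A hedged sketch of such an mdadm.conf, assuming the four array disks really are sda through sdd:

    DEVICE /dev/sd[abcd]
    ARRAY /dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a

Whether this helps depends on where assembly actually happens (initrd vs. init scripts), and it does not address Doug's point about whole-disk members.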
MD driver document
Hi,

I am looking for the best way of understanding the MD driver (including raid5/6) architecture. I am developing a driver for one of the PPC-based SoCs. I have done some code reading and tried to use a HW debugger to walk through the code, but it was not much help. If you have any pointers or documents, I will greatly appreciate it if you can share them.

Thanks and regards,
Marri
Re: MD driver document
On 10/24/07, tirumalareddy marri [EMAIL PROTECTED] wrote:
> Hi,
>
> I am looking for the best way of understanding the MD driver
> (including raid5/6) architecture. I am developing a driver for one of
> the PPC-based SoCs. I have done some code reading and tried to use a
> HW debugger to walk through the code, but it was not much help. If you
> have any pointers or documents, I will greatly appreciate it if you
> can share them.

I started out with include/linux/raid/raid5.h. Also, running it with the debug print statements turned on will get you familiar with the code flow. Lastly, I wrote the following paper, which is already becoming outdated:

http://downloads.sourceforge.net/xscaleiop/ols_paper_2006.pdf

> Thanks and regards,
> Marri

--
Dan
Re: Software RAID when it works and when it doesn't
Alberto Alonso wrote:
> On Tue, 2007-10-23 at 18:45 -0400, Bill Davidsen wrote:
>> I'm not sure the timeouts are the problem; even if md did its own
>> timeout, it then needs a way to tell the driver (or device) to stop
>> retrying. I don't believe that's available, certainly not everywhere,
>> and anything other than everywhere would turn the md code into a nest
>> of exceptions.
>
> If we lose the ability to communicate with that drive I don't see it
> as a problem (that's the whole point, we kick it out of the array).
> So, if we can't tell the driver about the failure we are still OK, and
> md could successfully deal with misbehaved drivers.

I think what you really want is to notice how long the drive and driver took to recover or fail, and take action based on that. In general, kicking the drive is not optimal for a few bad spots, even if the drive's recovery behaviour sucks.

--
bill davidsen [EMAIL PROTECTED]
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
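There is no per-array timeout knob in md itself, but for SCSI/SATA members the block layer does expose the per-command timeout, which bounds how long a single request can sit in the low-level driver before error handling kicks in. A hedged sketch (the device name is a placeholder; this tunes the driver, not md):

    cat /sys/block/sda/device/timeout      # current command timeout in seconds
    echo 30 > /sys/block/sda/device/timeout

Lowering it can make a misbehaving path fail faster so md notices sooner; it does nothing about a drive's own internal retry loop.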
Re: [BUG] Raid1/5 over iSCSI trouble
BERTRAND Joël wrote:
> Hello,
>
> Any news about this trouble? Any ideas? I'm trying to fix it, but I
> don't see any specific interaction between raid5 and istd. Has anyone
> tried to reproduce this bug on an arch other than sparc64? I only use
> sparc32 and sparc64 servers and I cannot test on other archs. Of
> course, I have a laptop, but I cannot create a raid5 array on its
> internal HD to test this configuration ;-)

Sure you can: a few loopback devices and a few iSCSI targets, and you're in business. I think the ongoing discussion of timeouts and whatnot may bear some fruit eventually, perhaps not as fast as you would like. By Saturday a solution may emerge.

> Please note that I won't read my mail until next Saturday morning (CEST).

--
bill davidsen [EMAIL PROTECTED]
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
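To make Bill's suggestion concrete, a hedged sketch of building a throwaway raid5 array out of loop devices on a laptop (file names, sizes and the md device name are arbitrary); wiring the loop devices through an iSCSI target/initiator pair to mirror the original setup is left to whichever target software is in use:

    # create four sparse 1 GB backing files and attach them to loop devices
    for i in 0 1 2 3; do
        dd if=/dev/zero of=/tmp/loop$i.img bs=1M count=1 seek=1023
        losetup /dev/loop$i /tmp/loop$i.img
    done

    # assemble a raid5 array over the loop devices
    mdadm --create /dev/md9 --level=5 --raid-devices=4 \
        /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3

    cat /proc/mdstat    # watch the initial resync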
Re: Time to deprecate old RAID formats?
Doug Ledford wrote:
> On Mon, 2007-10-22 at 16:39 -0400, John Stoffel wrote:
>> I don't agree completely. I think the superblock location is a key
>> issue, because if you have a superblock location which moves depending
>> on the filesystem or LVM you use to look at the partition (or full
>> disk), then you need to be even more careful about how to poke at
>> things.
>
> This is the heart of the matter. When you consider that each file
> system and each volume management stack has a superblock, and some
> store their superblocks at the end of devices and some at the
> beginning, and they can be stacked, then it becomes next to impossible
> to make sure a stacked setup is never recognized incorrectly under any
> circumstance. It might be possible if you use static device names, but
> our users *long* ago complained very loudly when adding a new disk or
> removing a bad disk caused their setup to fail to boot. So, along came
> mount by label and auto scans for superblocks. Once you do that, you
> *really* need all the superblocks at the same end of a device so when
> you stack things, it always works properly.

Let me be devil's advocate. I noted in another post that the best location might be raid-level dependent. For raid-1, putting the superblock at the end allows the BIOS to treat a single partition as a bootable unit. For all other arrangements, the end location puts the superblock where it is slightly more likely to be overwritten, and where it must be moved if the partition grows or whatever. There really may be no right answer.

--
bill davidsen [EMAIL PROTECTED]
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
Re: Software RAID when it works and when it doesn't
On Wed, 2007-10-24 at 16:04 -0400, Bill Davidsen wrote:
> I think what you really want is to notice how long the drive and
> driver took to recover or fail, and take action based on that. In
> general, kicking the drive is not optimal for a few bad spots, even if
> the drive's recovery behaviour sucks.

The problem is that the driver never comes back and the whole array hangs, waiting forever. That's why a timeout within the md code is needed to recover from these types of drivers.

Alberto
Re: [BUG] Raid1/5 over iSCSI trouble
On 10/24/07, BERTRAND Joël [EMAIL PROTECTED] wrote:
> Hello,
>
> Any news about this trouble? Any ideas? I'm trying to fix it, but I
> don't see any specific interaction between raid5 and istd. Has anyone
> tried to reproduce this bug on an arch other than sparc64? I only use
> sparc32 and sparc64 servers and I cannot test on other archs. Of
> course, I have a laptop, but I cannot create a raid5 array on its
> internal HD to test this configuration ;-)

Can you collect some oprofile data, as Ming suggested, so we can maybe see what md_d0_raid5 and istd1 are fighting about? Hopefully it is as painless to run on sparc as it is on IA:

opcontrol --start --vmlinux=/path/to/vmlinux
wait
opcontrol --stop
opreport --image-path=/lib/modules/`uname -r` -l

--
Dan
Re: Time to deprecate old RAID formats?
On Tuesday October 23, [EMAIL PROTECTED] wrote:
> On Tue, 2007-10-23 at 19:03 -0400, Bill Davidsen wrote:
>> John Stoffel wrote:
>>> Why do we have three different positions for storing the superblock?
>>
>> Why do you suggest changing anything until you get the answer to this
>> question? If you don't understand why there are three locations,
>> perhaps that would be a good initial investigation. Clearly the short
>> answer is that they reflect three stages of Neil's thinking on the
>> topic, and I would bet that he had a good reason for moving the
>> superblock when he did it.
>
> I believe, and Neil can correct me if I'm wrong, that 1.0 (at the end
> of the device) is to satisfy people that want to get at their raid1
> data without bringing up the device or using a loop mount with an
> offset. Version 1.1, at the beginning of the device, is to prevent
> accidental access to a device when the raid array doesn't come up. And
> version 1.2 (4k from the beginning of the device) would be suitable
> for those times when you want to embed a boot sector at the very
> beginning of the device (which really only needs 512 bytes, but a 4k
> offset is as easy to deal with as anything else). From the standpoint
> of wanting to make sure an array is suitable for embedding a boot
> sector, the 1.2 superblock may be the best default.

Exactly correct. Another perspective is that I chickened out of making a decision and chose to support all the credible possibilities that I could think of. And showed that I didn't have enough imagination. The other possibility that I should have included (as has been suggested in this conversation, and previously on this list) is to store the superblock both at the beginning and the end for redundancy. However, I cannot decide whether to combine the 1.0 and 1.1 locations, or the 1.0 and 1.2. And I don't think I want to support both (maybe I've learned my lesson).

As for where the metadata should be placed, it is interesting to observe that the SNIA's DDFv1.2 puts it at the end of the device. And as DDF is an industry standard sponsored by multiple companies it must be .. Sorry. I had intended to say correct, but when it came to it, my fingers refused to type that word in that context.

DDF is in a somewhat different situation though. It assumes that the components are whole devices, and that the controller has exclusive access - there is no way another controller could interpret the devices differently before the DDF controller has a chance.

DDF is also interesting in that it uses 512-byte alignment for metadata. The 'anchor' block is in the last sector of the device. This contrasts with current md metadata, which is all 4K aligned. Given that the drive manufacturers seem to be telling us that 4096 is the new 512, I think 4K alignment was a good idea. It could be that DDF actually specifies the anchor to reside in the last block rather than the last sector, and it could be that the spec allows the block size to be device specific - I'd have to hunt through the spec again to be sure.

For the record, I have no intention of deprecating any of the metadata formats, not even 0.90. It is conceivable that I could change the default, though that would require a decision as to what the new default would be. I think it would have to be 1.0 or it would cause too much confusion.

I think it would be entirely appropriate for a distro (especially an 'enterprise' distro) to choose a format and location that it was going to standardise on and support, and make that the default on that distro (by using a CREATE line in mdadm.conf). Debian has already done this by making 1.0 the default.

I certainly accept that the documentation is probably less than perfect (by a large margin). I am more than happy to accept patches or concrete suggestions on how to improve that. I always think it is best if a non-developer writes documentation (and a developer reviews it), as then it is more likely to address the issues that a non-developer will want to read about, and in a way that will make sense to a non-developer. (i.e. I'm too close to the subject to write good doco.)

NeilBrown
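A hedged sketch of what such a distro default could look like, assuming an mdadm recent enough to honour a metadata default on the CREATE line of mdadm.conf:

    # defaults applied to arrays created without an explicit --metadata
    CREATE metadata=1.0 auto=yes

The per-array ARRAY lines (as produced by "mdadm --detail --scan") are unaffected; existing arrays keep whatever format they were created with.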
Re: [BUG] Raid1/5 over iSCSI trouble
From: Dan Williams [EMAIL PROTECTED]
Date: Wed, 24 Oct 2007 16:49:28 -0700

> Hopefully it is as painless to run on sparc as it is on IA:
>
> opcontrol --start --vmlinux=/path/to/vmlinux
> wait
> opcontrol --stop
> opreport --image-path=/lib/modules/`uname -r` -l

It is painless, I use it all the time.

The only caveat is to make sure the /path/to/vmlinux is the pre-stripped kernel image. The images installed under /boot/ are usually stripped and thus not suitable for profiling.
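A quick, hedged way to check whether a given vmlinux still carries its symbols before pointing oprofile at it (paths are examples; the unstripped image is normally the vmlinux left in the root of the kernel build tree):

    file vmlinux                      # output should end in "not stripped"
    nm vmlinux | grep -c ' [tT] '     # non-zero count of text symbols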
Re: Time to deprecate old RAID formats?
Neil Brown wrote:
> On Tuesday October 23, [EMAIL PROTECTED] wrote:
>
> As for where the metadata should be placed, it is interesting to
> observe that the SNIA's DDFv1.2 puts it at the end of the device. And
> as DDF is an industry standard sponsored by multiple companies it must
> be .. Sorry. I had intended to say correct, but when it came to it, my
> fingers refused to type that word in that context.
>
> DDF is in a somewhat different situation though. It assumes that the
> components are whole devices, and that the controller has exclusive
> access - there is no way another controller could interpret the
> devices differently before the DDF controller has a chance.

<grin> agreed.

> DDF is also interesting in that it uses 512-byte alignment for
> metadata. The 'anchor' block is in the last sector of the device.
> This contrasts with current md metadata, which is all 4K aligned.
> Given that the drive manufacturers seem to be telling us that 4096 is
> the new 512, I think 4K alignment was a good idea. It could be that
> DDF actually specifies the anchor to reside in the last block rather
> than the last sector, and it could be that the spec allows the block
> size to be device specific - I'd have to hunt through the spec again
> to be sure.

It's a bit of a mess. Yes, with 1K and 4K sector devices starting to appear, as long as the underlying partitioning gets the initial partition alignment correct, this /should/ continue functioning as normal. If for whatever reason you wind up with an odd-aligned 1K sector device and your data winds up aligned to even-numbered [hard] sectors, performance will definitely suffer.

Mostly this is out of MD's hands, and up to the sysadmin and partitioning tools to get hard-sector alignment right.

> For the record, I have no intention of deprecating any of the metadata
> formats, not even 0.90.

Strongly agreed.

> It is conceivable that I could change the default, though that would
> require a decision as to what the new default would be. I think it
> would have to be 1.0 or it would cause too much confusion.

A newer default would be nice.

	Jeff
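Since getting that alignment right falls to the partitioning step, a hedged example of creating a partition whose start is aligned to 4 KiB, i.e. to a multiple of 8 512-byte LBAs (device name and sizes are placeholders):

    parted -s /dev/sdb unit s mklabel gpt
    parted -s /dev/sdb unit s mkpart primary 2048 100%
    parted -s /dev/sdb unit s print    # confirm the start sector is a multiple of 8

Starting at sector 2048 (1 MiB) keeps the partition aligned both for 4K-sector drives and for common RAID chunk sizes.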
Re: Raid-10 mount at startup always has problem
Bill Davidsen wrote:
> Daniel L. Miller wrote:
>> Current mdadm.conf:
>>
>> DEVICE partitions
>> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
>>
>> still have the problem where on boot one drive is not part of the
>> array. Is there a log file I can check to find out WHY a drive is not
>> being added? It's been a while since the reboot, but I did find some
>> entries in dmesg - I'm appending both the md lines and the physical
>> disk related lines. The bottom shows one disk not being added (this
>> time it was sda) - and the disk that gets skipped on each boot seems
>> to be random - there's no consistent failure:
>
> I suspect the base problem is that you are using whole disks instead
> of partitions, and the problem with the partition table below is
> probably an indication that you have something on that drive which
> looks like a partition table but isn't. That prevents the drive from
> being recognized as a whole drive. You're lucky: if the data had
> looked enough like a partition table to be valid, the o/s probably
> would have tried to do something with it.
>
> [...]
>
> This may be the rare case where you really do need to specify the
> actual devices to get reliable operation.

OK - I'm officially confused now (I was just unofficially confused before). WHY is it a problem to use whole drives as RAID components? I would have thought that building a RAID storage unit with identically sized drives - and using each drive's full capacity - is exactly the way you're supposed to do it!

I should mention that the boot/system drive is IDE, and NOT part of the RAID. So I'm not worried about losing the system - but I AM concerned about the data. I'm using four drives in a RAID-10 configuration - I thought this would provide a good blend of safety and performance for a small fileserver.

Because it's RAID-10 - I would ASSuME that I can drop one drive (after all, I keep booting one drive short), partition it if necessary, and add it back in. But how would splitting these disks into partitions improve either stability or performance?

--
Daniel
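That one-drive-at-a-time approach is essentially what a conversion from whole-disk members to partition-based members would look like; a hedged sketch for a single member (device names are examples, and a backup first is strongly advised):

    mdadm /dev/md0 --fail /dev/sda --remove /dev/sda    # drop one member from the raid10
    parted -s /dev/sda mklabel msdos                     # write a real partition table
    parted -s /dev/sda mkpart primary 2048s 100%         # one full-size partition, sda1
    mdadm /dev/md0 --add /dev/sda1                       # re-add and let it resync
    cat /proc/mdstat                                     # wait for the rebuild before touching the next disk

One caveat: the partition is slightly smaller than the raw disk, so mdadm may refuse the add unless the array has a little headroom to spare. The stability argument for partitions is mostly that a partition table plus a partition-type hint makes it much harder for other tools (installers, partitioners, firmware) to misread or silently clobber an md member; it does little for performance either way.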