beginner error detection
Hi! I have a simple software RAID1 over two SATA disks. One of the disks has started to complain (S.M.A.R.T. errors), so I think I will witness a disk failure in the near future. But I don't know how this plays out with RAID1, so I have some questions. If these questions are answered somewhere (FAQ, man page, URL), feel free to redirect me to that source. Can RAID1 detect and handle disk errors? If one block goes bad, how can the RAID1 driver choose which copy holds the correct, original value? Can SATA systems die gracefully? When a total disk failure happens on a good SCSI system, the SCSI layer fails the disk, mdadm removes the disk from the array, and in the morning I see a nice e-mail. When a PATA disk dies, the whole system goes down, so I need to call a cab. I don't know how SATA behaves in this situation. Kernel: 2.6.20 (with skas patch); controller: nVidia Corporation CK804 Serial ATA Controller (rev f3). Thanks. -- Tomka Gergely, [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-raid in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
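The "nice e-mail in the morning" works the same way with SATA under md: mdadm's monitor mode watches the arrays and mails on failure events. A minimal sketch of the config fragment involved (device names and mail address are illustrative, not from this thread):

```
# /etc/mdadm.conf (illustrative)
MAILADDR root@localhost
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1
```

With that in place, running `mdadm --monitor --scan --daemonise` (or the distro's mdadm monitoring service) sends mail when a member is failed out of the array, regardless of whether the bus is SCSI, SATA, or PATA.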
Re: Reshaping raid0/10
On Feb 22 2007 06:59, Neil Brown wrote: On Wednesday February 21, [EMAIL PROTECTED] wrote: are there any plans to support reshaping on raid0 and raid10? No concrete plans. It largely depends on time and motivation. I expect that the various flavours of raid5/raid6 reshape will come first. Then probably converting raid0 to raid5. I really haven't given any thought to how you might reshape a raid10... It should not be any different from raid0/raid5 reshaping, should it? Jan
2.6.20: stripe_cache_size goes boom with 32mb
Each of these results is averaged over three runs with 6 SATA disks in a SW RAID 5 configuration (dd if=/dev/zero of=file_1 bs=1M count=2000):

128k_stripe: 69.2MB/s
256k_stripe: 105.3MB/s
512k_stripe: 142.0MB/s
1024k_stripe: 144.6MB/s
2048k_stripe: 208.3MB/s
4096k_stripe: 223.6MB/s
8192k_stripe: 226.0MB/s
16384k_stripe: 215.0MB/s

When I tried a 32768k stripe, this happened:

p34:~# echo 32768 > /sys/block/md4/md/stripe_cache_size
Connection to p34 closed

I was able to Alt-SysRQ+b but I could not access the console/X/etc; it appeared to be frozen. FYI. Justin.
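The sweep above can be scripted. A dry-run sketch follows (md4 and the output filename mirror the post but are examples; drop the leading `echo` on each printed command to actually run it — writing to sysfs needs root and a real array — and note the `>` redirection into the sysfs file):

```shell
#!/bin/sh
# Dry run: print the command pair for each stripe_cache_size step instead
# of executing it.
for size in 128 256 512 1024 2048 4096 8192 16384; do
    echo "echo $size > /sys/block/md4/md/stripe_cache_size"
    echo "dd if=/dev/zero of=file_1 bs=1M count=2000"
done
```

Averaging three runs per step, as Justin did, smooths out page-cache and scheduler noise.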
Re: [PATCH 006 of 6] md: Add support for reshape of a raid6
Andrew Morton wrote: On Thu, 22 Feb 2007 13:39:56 +1100 Neil Brown [EMAIL PROTECTED] wrote: I must right code that Andrew can read. That's write. But more importantly, things that people can immediately see and understand help reduce the possibility of mistakes. Now and in the future. If we did all loops like that, then it'd be the best way to do it in new code, because people's eyes and brains are locked into that idiom and we just don't have to think about it when we see it. I have done lots of loops like that and understood it immediately. Nice, short, _clear_ and no - a loop that counts down instead of up is not difficult at all. Testing i-- instead of i = 0 is also something I consider trivial, even though I don't code that much. If this is among the worst you see, then the kernel source must be in great shape ;-) Helge Hafting
Re: 2.6.20: stripe_cache_size goes boom with 32mb
I have 2GB on this machine. For me, 8192 seems to be the sweet spot; I will probably keep it at 8mb. On Fri, 23 Feb 2007, Jason Rainforest wrote: Hi Justin, I'm not a RAID or kernel developer, but .. do you have enough RAM to support a 32mb stripe_cache_size?! Here on my 7*250Gb SW RAID5 array, decreasing a stripe_cache_size of 8192 to 4096 frees up no less than 120mb of RAM. Using that as a calculation tool, a 32mb stripe_cache_size would require approximately 960mb of RAM! My RAID box only has 1Gb of RAM, so I'm not game to test such a thing. Others on these lists would definitely have a good idea on what's happening :-) Cheers, Jason On Fri, 2007-02-23 at 06:41 -0500, Justin Piszcz wrote: Each of these results is averaged over three runs with 6 SATA disks in a SW RAID 5 configuration: (dd if=/dev/zero of=file_1 bs=1M count=2000) 128k_stripe: 69.2MB/s 256k_stripe: 105.3MB/s 512k_stripe: 142.0MB/s 1024k_stripe: 144.6MB/s 2048k_stripe: 208.3MB/s 4096k_stripe: 223.6MB/s 8192k_stripe: 226.0MB/s 16384k_stripe: 215.0MB/s When I tried a 32768k stripe, this happened: p34:~# echo 32768 > /sys/block/md4/md/stripe_cache_size Connection to p34 closed I was able to Alt-SysRQ+b but I could not access the console/X/etc; it appeared to be frozen. FYI. Justin.
Re: 2.6.20: stripe_cache_size goes boom with 32mb
Hi Justin, I'm not a RAID or kernel developer, but .. do you have enough RAM to support a 32mb stripe_cache_size?! Here on my 7*250Gb SW RAID5 array, decreasing a stripe_cache_size of 8192 to 4096 frees up no less than 120mb of RAM. Using that as a calculation tool, a 32mb stripe_cache_size would require approximately 960mb of RAM! My RAID box only has 1Gb of RAM, so I'm not game to test such a thing. Others on these lists would definitely have a good idea on what's happening :-) Cheers, Jason On Fri, 2007-02-23 at 06:41 -0500, Justin Piszcz wrote: Each of these results is averaged over three runs with 6 SATA disks in a SW RAID 5 configuration: (dd if=/dev/zero of=file_1 bs=1M count=2000) 128k_stripe: 69.2MB/s 256k_stripe: 105.3MB/s 512k_stripe: 142.0MB/s 1024k_stripe: 144.6MB/s 2048k_stripe: 208.3MB/s 4096k_stripe: 223.6MB/s 8192k_stripe: 226.0MB/s 16384k_stripe: 215.0MB/s When I tried a 32768k stripe, this happened: p34:~# echo 32768 > /sys/block/md4/md/stripe_cache_size Connection to p34 closed I was able to Alt-SysRQ+b but I could not access the console/X/etc; it appeared to be frozen. FYI. Justin.
Re: 2.6.20: stripe_cache_size goes boom with 32mb
On Feb 23 2007 06:41, Justin Piszcz wrote: I was able to Alt-SysRQ+b but I could not access the console/X/etc, it appeared to be frozen. No sysrq+t? (Ah, unblanking might hang.) Well, netconsole/serial to the rescue, then ;-) Jan
Re: PATA/SATA Disk Reliability paper
Stephen C Woods wrote: So drives do need to be ventilated, not so much worry about exploding, but rather subtle distortion of the case as the atmospheric pressure changes. I have a '94 Caviar without any apparent holes; and as a bonus, the drive still works. In contrast, ever since these holes appeared, drive failures became the norm. Does anyone remember that you had to let your drives acclimate to your machine room for a day or so before you used them? The problem is, that's not enough; the room temperature/humidity has to be controlled too. In a desktop environment, that's not really feasible. Thanks! -- Al
Linux Software RAID a bit of a weakness?
Hi, We had a small server here that was configured with a RAID 1 mirror, using two IDE disks. Last week one of the drives failed in this. So we replaced the drive and set the array to rebuild. The good disk then found a bad block and the mirror failed. Now I presume that the good disk must have had an underlying bad block in either unallocated space or a file I never access. Now as RAID works at the block level you only ever see this on an array rebuild, when it's often catastrophic. Is this a bit of a flaw? I know there is the definite probability of two drives failing within a short period of time. But this is a bit different, as it's the probability of two drives failing but over a much larger time scale if one of the flaws is hidden in unallocated space (maybe a dirt particle finds its way onto the surface or something). This would make RAID buy you a lot less in reliability, I'd have thought. I seem to remember seeing in the log file for a Dell perc something about scavenging for bad blocks. Do hardware RAID systems have a mechanism that at times of low activity searches the disks for bad blocks to help guard against this sort of failure (so a disk error is reported early)? On Software RAID, I was thinking apart from a three way mirror, which I don't think is at present supported. Is there any merit in, say, cat'ing the whole disk devices to /dev/null every so often to check that the whole surface is readable (I presume just reading the raw device won't upset things; don't worry, I don't plan on trying it on a production system)? Any thoughts? As I presume people have thought of this before and I must be missing something. Colin
Re: Linux Software RAID a bit of a weakness?
Colin Simpson wrote: Hi, We had a small server here that was configured with a RAID 1 mirror, using two IDE disks. Last week one of the drives failed in this. So we replaced the drive and set the array to rebuild. The good disk then found a bad block and the mirror failed. Now I presume that the good disk must have had an underlying bad block in either unallocated space or a file I never access. Now as RAID works at the block level you only ever see this on an array rebuild, when it's often catastrophic. Is this a bit of a flaw? I know there is the definite probability of two drives failing within a short period of time. But this is a bit different, as it's the probability of two drives failing but over a much larger time scale if one of the flaws is hidden in unallocated space (maybe a dirt particle finds its way onto the surface or something). This would make RAID buy you a lot less in reliability, I'd have thought. I seem to remember seeing in the log file for a Dell perc something about scavenging for bad blocks. Do hardware RAID systems have a mechanism that at times of low activity searches the disks for bad blocks to help guard against this sort of failure (so a disk error is reported early)? On Software RAID, I was thinking apart from a three way mirror, which I don't think is at present supported. Is there any merit in, say, cat'ing the whole disk devices to /dev/null every so often to check that the whole surface is readable (I presume just reading the raw device won't upset things; don't worry, I don't plan on trying it on a production system)? Any thoughts? As I presume people have thought of this before and I must be missing something. Yes, this is an important thing to keep on top of, both for hardware RAID and software RAID. For md: echo check > /sys/block/md0/md/sync_action This should be done regularly. I have cron do it once a week.
Check out: http://neil.brown.name/blog/20050727141521-002 Good luck, Steve -- Steve Cousins, Ocean Modeling Group Email: [EMAIL PROTECTED] Marine Sciences, 452 Aubert Hall http://rocky.umeoce.maine.edu Univ. of Maine, Orono, ME 04469 Phone: (207) 581-4302
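The weekly cron job Steve describes can be as small as one line; a sketch of the config fragment (schedule and device name are illustrative):

```
# /etc/cron.d/raid-check (illustrative): scrub md0 every Sunday at 01:00
0 1 * * Sun root /bin/sh -c 'echo check > /sys/block/md0/md/sync_action'
```

The write to sync_action kicks off a background scrub that reads every copy on every member disk, so latent bad blocks surface while the array is still redundant rather than during a rebuild.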
Re: Reshaping raid0/10
On Friday February 23, [EMAIL PROTECTED] wrote: On Feb 22 2007 06:59, Neil Brown wrote: On Wednesday February 21, [EMAIL PROTECTED] wrote: are there any plans to support reshaping on raid0 and raid10? No concrete plans. It largely depends on time and motivation. I expect that the various flavours of raid5/raid6 reshape will come first. Then probably converting raid0 to raid5. I really haven't given any thought to how you might reshape a raid10... It should not be any different from raid0/raid5 reshaping, should it? Depends on what level you look at. If I wanted to reshape a raid0, I would just morph it into a raid4 with a missing parity drive, then use the raid5 code to restripe it. Then morph it back to regular raid0. With raid10 I cannot do that. I would need to do the restriping inside the raid10 module. But raid10 doesn't have a stripe-cache like raid5 does, and the stripe cache is a very integral part of the restripe process. So there would be a substantial amount of design and coding to effect a raid10 reshape - at least as much as the work to produce the initial raid5 reshape and probably more. So conceptually it might be very similar, but at the code level, it is likely to be very different. NeilBrown
Re: Linux Software RAID a bit of a weakness?
On Friday February 23, [EMAIL PROTECTED] wrote: Hi, We had a small server here that was configured with a RAID 1 mirror, using two IDE disks. Last week one of the drives failed in this. So we replaced the drive and set the array to rebuild. The good disk then found a bad block and the mirror failed. Now I presume that the good disk must have had an underlying bad block in either unallocated space or a file I never access. Now as RAID works at the block level you only ever see this on an array rebuild, when it's often catastrophic. Is this a bit of a flaw? Certainly can be unfortunate. I know there is the definite probability of two drives failing within a short period of time. But this is a bit different, as it's the probability of two drives failing but over a much larger time scale if one of the flaws is hidden in unallocated space (maybe a dirt particle finds its way onto the surface or something). This would make RAID buy you a lot less in reliability, I'd have thought. I seem to remember seeing in the log file for a Dell perc something about scavenging for bad blocks. Do hardware RAID systems have a mechanism that at times of low activity searches the disks for bad blocks to help guard against this sort of failure (so a disk error is reported early)? As has been mentioned, this can be done with md/raid too. Some distros (debian/testing at least) schedule a 'check' of all arrays once a month. On Software RAID, I was thinking apart from a three way mirror, which I don't think is at present supported. Is there any merit in, say, cat'ing the whole disk devices to /dev/null every so often to check that the whole surface is readable (I presume just reading the raw device won't upset things; don't worry, I don't plan on trying it on a production system)? Three-way mirroring has always been supported. You can do N-way mirroring if you have N drives. Reading the whole device would not be sufficient as it would only read one copy of every block rather than all copies.
The 'check' process reads all copies and compares them with one another. If there is a difference, it is reported. If you use 'repair' instead of 'check', the difference is arbitrarily corrected. If a read error is detected during the 'check', md/raid1 will attempt to write the data from the good drive to the bad drive, then read it back. If this works, the drive is assumed to be fixed. If not, the bad drive is failed out of the array. NeilBrown
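The outcome of a check pass can be read back from the same sysfs directory; a sketch (md0 is an example device, and the guard makes the script a harmless no-op on machines without that array):

```shell
#!/bin/sh
# Report the outcome of the last scrub: a nonzero mismatch_cnt means the
# copies (or parity) disagreed somewhere during 'check'.
md=/sys/block/md0/md
if [ -r "$md/mismatch_cnt" ]; then
    echo "state: $(cat "$md/sync_action"), mismatches: $(cat "$md/mismatch_cnt")"
else
    echo "no md0 array on this machine"
fi
```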
Re: 2.6.20: stripe_cache_size goes boom with 32mb
On 2/23/07, Justin Piszcz [EMAIL PROTECTED] wrote: I have 2GB on this machine. For me, 8192 seems to be the sweet spot, I will probably keep it at 8mb. Just a note: stripe_cache_size = 8192 = 192MB with six disks. The calculation is: stripe_cache_size * num_disks * PAGE_SIZE = stripe_cache_size_bytes -- Dan
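Plugging Justin's numbers into Dan's formula (PAGE_SIZE assumed to be 4096, as on x86):

```shell
#!/bin/sh
# stripe_cache_size is counted in pages per member device, so the RAM
# pinned is entries * member disks * PAGE_SIZE.
entries=8192
disks=6
page_size=4096   # assumption: x86 page size
bytes=$((entries * disks * page_size))
echo "stripe cache pins $((bytes / 1048576)) MB"
```

That reproduces Dan's 192MB figure; by the same formula the 32768 setting that froze the box would pin 768MB, a large slice of a 2GB machine.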
nonzero mismatch_cnt with no earlier error
I run a 'check' weekly, and yesterday it came up with a non-zero mismatch count (184). There were no earlier RAID errors logged and the count was zero after the run a week ago. Now, the interesting part is that there was one i/o error logged during the check *last week*, however the raid did not see it and the count was zero at the end. No errors were logged during the week since or during the check last night. fsck (ext3 with logging) found no errors but I may have bad data somewhere. Should the raid have noticed the error, checked the offending stripe and taken appropriate action? The messages from that error are below. Naturally, I do not know if the mismatch is related to the failure last week; it could be from a number of other reasons (bad memory? kernel bug?).

system details:
2.6.20 vanilla
/dev/sd[ab]: on motherboard IDE interface: Intel Corp. 82801EB (ICH5) Serial ATA 150 Storage Controller (rev 02)
/dev/sd[cdef]: Promise SATA-II-150-TX4 Unknown mass storage controller: Promise Technology, Inc.: Unknown device 3d18 (rev 02)
All 6 disks are WD 320GB SATA of similar models

Tail of dmesg, showing all messages since last week's 'check':

*** last week check start:
[927080.617744] md: data-check of RAID array md0
[927080.630783] md: minimum _guaranteed_ speed: 24000 KB/sec/disk.
[927080.648734] md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for data-check.
[927080.678103] md: using 128k window, over a total of 312568576 blocks.
*** last week error:
[937567.332751] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x4190002 action 0x2
[937567.354094] ata3.00: cmd b0/d5:01:09:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in
[937567.354096] res 51/04:83:45:00:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
[937568.120783] ata3: soft resetting port
[937568.282450] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[937568.306693] ata3.00: configured for UDMA/100
[937568.319733] ata3: EH complete
[937568.361223] SCSI device sdc: 625142448 512-byte hdwr sectors (320073 MB)
[937568.397207] sdc: Write Protect is off
[937568.408620] sdc: Mode Sense: 00 3a 00 00
[937568.453522] SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

*** last week check end:
[941696.843935] md: md0: data-check done.
[941697.246454] RAID5 conf printout:
[941697.256366] --- rd:6 wd:6
[941697.264718] disk 0, o:1, dev:sda1
[941697.275146] disk 1, o:1, dev:sdb1
[941697.285575] disk 2, o:1, dev:sdc1
[941697.296003] disk 3, o:1, dev:sdd1
[941697.306432] disk 4, o:1, dev:sde1
[941697.316862] disk 5, o:1, dev:sdf1

*** this week check start:
[1530647.746383] md: data-check of RAID array md0
[1530647.759677] md: minimum _guaranteed_ speed: 24000 KB/sec/disk.
[1530647.778041] md: using maximum available idle IO bandwidth (but not more than 20 KB/sec) for data-check.
[1530647.807663] md: using 128k window, over a total of 312568576 blocks.

*** this week check end:
[1545248.680745] md: md0: data-check done.
[1545249.266727] RAID5 conf printout:
[1545249.276930] --- rd:6 wd:6
[1545249.285542] disk 0, o:1, dev:sda1
[1545249.296228] disk 1, o:1, dev:sdb1
[1545249.306923] disk 2, o:1, dev:sdc1
[1545249.317613] disk 3, o:1, dev:sdd1
[1545249.328292] disk 4, o:1, dev:sde1
[1545249.338981] disk 5, o:1, dev:sdf1

-- Eyal Lebedinsky ([EMAIL PROTECTED]) http://samba.org/eyal/ attach .zip as .dat
Re: Linux Software RAID a bit of a weakness?
Neil Brown wrote: The 'check' process reads all copies and compares them with one another. If there is a difference, it is reported. If you use 'repair' instead of 'check', the difference is arbitrarily corrected. If a read error is detected during the 'check', md/raid1 will attempt to write the data from the good drive to the bad drive, then read it back. If this works, the drive is assumed to be fixed. If not, the bad drive is failed out of the array. One thing to note here is that 'repair' was broken for RAID1 until recently - see http://marc.theaimsgroup.com/?l=linux-raid&m=116951242005315&w=2 As this patch was submitted just prior to the release of 2.6.20, this may be the first fixed kernel, but I have not checked. Regards, Richard
end to end error recovery musings
In the IO/FS workshop, one idea we kicked around is the need to provide better and more specific error messages between the IO stack and the file system layer. My group has been working to stabilize a relatively up-to-date libata + MD based box, so I can try to lay out at least one appliance-like typical configuration to help frame the issue. We are working on a relatively large appliance, but you can buy similar home appliances (or build them) that use Linux to provide a NAS in a box for end users. The use case that we have is an ICH6R/AHCI box with 4 large (500+ GB) drives, with some of the small system partitions on a 4-way RAID1 device. The libata version we have is a backport of 2.6.18 onto SLES10, so the error handling at the libata level is a huge improvement over what we had before. Each box has a watchdog timer that can be set to fire after at most 2 minutes. (We have a second flavor of this box with an ICH5 and P-ATA drives using the non-libata drivers that has a similar use case). Using the patches that Mark sent around recently for error injection, we inject media errors into one or more drives and try to see how smoothly error handling runs and, importantly, whether or not the error handling will complete before the watchdog fires and reboots the box. If you want to be especially mean, inject errors into the RAID superblocks on 3 out of the 4 drives. We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read-ahead request triggers a speculative read that includes the bad sector (triggering the error handling mechanisms) right before the real application read does the same thing. Not sure what the answer is here since read-ahead is obviously a huge win in the normal case. (2) the patches that were floating around on how to make sure that we effectively handle single sector errors in a large IO request are critical.
On one hand, we want to combine adjacent IO requests into larger IOs whenever possible. On the other hand, when the combined IO fails, we need to isolate the error to the correct range, avoid reissuing a request that touches that sector again, and communicate up the stack to the file system/MD what really failed. All of this needs to complete in tens of seconds, not multiple minutes. (3) The timeout values on the failed IOs need to be tuned well (as was discussed in an earlier linux-ide thread). We cannot afford to hang for 30 seconds, especially in the MD case, since you might need to fail more than one device for a single IO. Prompt error propagation (say that 4 times quickly!) can allow MD to mask the underlying errors as you would hope; hanging on too long will almost certainly cause a watchdog reboot... (4) The newish libata+SCSI stack is pretty good at handling disk errors, but adding in MD actually can reduce the reliability of your system unless you tune the error handling correctly. We will follow up with specific issues as they arise, but I wanted to lay out a use case that can help frame part of the discussion. I also want to encourage people to inject real disk errors with Mark's patches so we can share the pain ;-) ric
Re: end to end error recovery musings
Ric Wheeler wrote: We still have the following challenges: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector (triggering the error handling mechanisms) right before the real application triggers a read does the same thing. Not sure what the answer is here since read-ahead is obviously a huge win in the normal case. Probably the only sane thing to do is to remember the bad sectors and avoid attempting reading them; that would mean marking automatic versus explicitly requested requests to determine whether or not to filter them against a list of discovered bad blocks. -hpa
Re: end to end error recovery musings
On Feb 23, 2007 16:03 -0800, H. Peter Anvin wrote: Ric Wheeler wrote: (1) read-ahead often means that we will retry every bad sector at least twice from the file system level. The first time, the fs read ahead request triggers a speculative read that includes the bad sector (triggering the error handling mechanisms) right before the real application triggers a read does the same thing. Not sure what the answer is here since read-ahead is obviously a huge win in the normal case. Probably the only sane thing to do is to remember the bad sectors and avoid attempting reading them; that would mean marking automatic versus explicitly requested requests to determine whether or not to filter them against a list of discovered bad blocks. And clearing this list when the sector is overwritten, as it will almost certainly be relocated at the disk level. For that matter, a huge win would be to have the MD RAID layer rewrite only the bad sector (in hopes of the disk relocating it) instead of failing the whole disk. Otherwise, a few read errors on different disks in a RAID set can take the whole system offline. Apologies if this is already done in recent kernels... Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.
Re: end to end error recovery musings
Andreas Dilger wrote: And clearing this list when the sector is overwritten, as it will almost certainly be relocated at the disk level. Certainly if the overwrite is successful. -hpa
Re: nonzero mismatch_cnt with no earlier error
I did a resync since, which ended up with the same mismatch_cnt of 184. I noticed that the count *was* reset to zero when the resync started, but ended up with 184 (same as after the check). I thought that the resync just calculates fresh parity and does not bother checking if it is different. So what does this final count mean? This leads me to ask: why bother doing a check if I will always run a resync after an error - better run a resync in the first place? -- Eyal Lebedinsky ([EMAIL PROTECTED]) http://samba.org/eyal/ attach .zip as .dat