Re: OT: Dell disks
On 06/19/2012 09:13 PM, Michael Butash wrote:

So yeah, no raid is perfect...

I've been using software raid1 (md) for a while now on my desktops and laptops, at work and at home. Since my adventures in ATI GPU land, I've twice had video software/hardware cause my software raid to fail ugly, though both times it was survivable while I rebuilt the array manually. The last time was just a few days ago.

Both times I was using GL functions (this time toggling compositing on/off, last time I think Minecraft) that caused the ATI fglrx drivers to spew hardware errors, seemingly glitching the card itself. That's two separate cards now. Getting back into the desktop, it came up in VESA mode with the GPU unavailable. Then I saw my raid was degraded, again with the same timestamp as the GPU glitch.

The first time, one of the two disks in the md for boot went offline, and I simply added sdb1/2 back. This time one partition on each disk, one from each of the two md's (boot/else), went offline alternately (sda1/sdb2) - very odd. The second disk wouldn't respond to hdparm/fdisk queries until a reboot, which I did very hesitantly and not before backing up everything I cared about to an NFS share. Data on both remained available, which was really the odd part.

To its credit, it rebooted, both disks reported healthy (hdparm, Ubuntu disk utility), I re-added each partition, let it rebuild, and it works again. It still worries me, as one of my last set of SSDs got unstable after less than 9 months of use, and I'm probably about there with these, which are known to get cranky. SMART reports them as OK, so I wonder how badly ATI taints the kernel space that it causes disk controller/driver exceptions.

Moral of the story: know when/how to repair whatever raid you run, as software and hardware are seemingly still prone to exceptions from unlikely places. Last time a disk died with md, I just mounted the secondary in an enclosure, copied the data off as a pluggable disk, and copied it to the new pair of raid disks. Hardware raid is never this easy, especially fakeraids.

-mb

I use software raid strictly on servers, which are headless (of course). I keep data on a server when practical, and run a daily rsync (offsite) backup of the data on my workstation.

Interesting to note the problems with video though. You might want to hop onto the raid list (gmane.linux.raid) and see what they think about it.

I don't know why anyone would run SSDs in a raid. RAID IS NOT A BACKUP. Of course if you're concerned about device failure and need to maintain continuity, raid is entirely appropriate.

One of the nice things about sw raid-1 is that as long as you have one good drive/partition, you can recover. Just start the array in degraded mode, and you're good to go. I know that grub can access a single raid-1 partition w/out starting the array, which makes me wonder if you can simply mount one of the raid-1 partitions straight away w/out starting the array. I should try that so I know for sure if it's possible or not.

BL, *never* use fakeraid, and avoid raid-5 if possible. Disk space is no longer expensive enough to justify using raid-5.

--
-Eric 'shubes'

---
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
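Since both posters stress knowing when an md array has gone degraded, here is a minimal Python sketch of how that could be spotted from the text of /proc/mdstat. It assumes the usual mdstat layout, where each array's status line ends in a member map such as [UU] (all members up) or [_U] (one member missing). The function name degraded_arrays and the SAMPLE text are made up for illustration; the sample is not output from anyone's machine in this thread.

```python
import re


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member.

    Assumes the common /proc/mdstat layout: a header line such as
    "md0 : active raid1 ..." followed by a line ending in "[UU]" or "[_U]".
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        member_map = re.search(r"\[([U_]+)\]", line)
        if current and member_map and "_" in member_map.group(1):
            degraded[current] = member_map.group(1)
    return degraded


# Hypothetical sample, shaped like /proc/mdstat for two raid1 arrays.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

md1 : active raid1 sdb2[1]
      488279488 blocks [2/1] [_U]

unused devices: <none>
"""

if __name__ == "__main__":
    for name, members in degraded_arrays(SAMPLE).items():
        print(f"{name} is degraded: [{members}]")
```

On a real system the text would come from reading /proc/mdstat. Re-adding a dropped partition, as described above, is normally done with something like mdadm /dev/mdX --add /dev/sdXN, with the device names depending entirely on the machine.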
raid (was RE: OT: Dell disks)
-Original Message-
From: plug-discuss-boun...@lists.plug.phoenix.az.us on behalf of Eric Shubert

On 06/19/2012 09:13 PM, Michael Butash wrote:
So yeah, no raid is perfect... ... -mb

I use software raid strictly on servers, which are headless (of course). ...
I don't know why anyone would run SSDs in a raid. RAID IS NOT A BACKUP.

Rather than my guessing, would you mind explaining your reasons? I'm curious.

... BL, *never* use fakeraid, and avoid raid-5 if possible. Disk space is no longer expensive enough to justify using raid-5.

Wow, someone else who agrees with me - IMHO, if it's important enough to need raid, don't try to skimp and save a few bucks so you can lose your data!

Rusty
Re: raid (was RE: OT: Dell disks)
Other than the case when 2 drives failed, Raid 5 worked for me for many years. If you are using simple mirroring, though, 2 drives failing will cause the same issue. I now use Raid 6 for a little more redundancy. Always back up your data to other storage (offsite if possible) in case of disaster.

Gilbert
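The trade-off being argued here (a two-disk mirror and raid-5 both survive one failed drive, raid-6 survives two) can be put into a few lines of arithmetic. This is only a sketch using the textbook capacity formulas and conventional minimum disk counts; the function name raid_summary and the example disk sizes are made up for illustration.

```python
def raid_summary(level, disks, disk_tb):
    """Return (usable_tb, drive_failures_survived) for a simple array.

    Textbook figures only: raid1 mirrors every disk, raid5 spends one
    disk's worth of space on parity, raid6 spends two.
    """
    if level == "raid1":
        if disks < 2:
            raise ValueError("raid1 needs at least 2 disks")
        return disk_tb, disks - 1
    if level == "raid5":
        if disks < 3:
            raise ValueError("raid5 conventionally needs at least 3 disks")
        return (disks - 1) * disk_tb, 1
    if level == "raid6":
        if disks < 4:
            raise ValueError("raid6 conventionally needs at least 4 disks")
        return (disks - 2) * disk_tb, 2
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level, disks in (("raid1", 2), ("raid5", 4), ("raid6", 4)):
        usable, survived = raid_summary(level, disks, 2.0)
        print(f"{level} x{disks}: {usable} TB usable, survives {survived} failure(s)")
```

With four 2 TB disks, raid-6 gives up 2 TB of usable space compared with raid-5 in exchange for surviving a second failure, which is the "little more redundancy" mentioned above. None of these levels replaces a backup.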
Re: raid (was RE: OT: Dell disks)
Agreed - I have a separate NAS that I regularly rsync important data to from my laptop, since I often work away from home; at home I just use the NAS directly. Problem is I don't always have my laptop on at home, and it sometimes goes weeks without replication.

I run the SSDs in raid1 because a) I want *some* disk-level redundancy so as not to rebuild my desktop from scratch yearly when they puke, and b) I want the speed, so I'm willing to deal with their questionable nature.

I'd buy some of Rusty's company's industrial SSDs, but they don't seem to want to sell them easily anywhere, and what I do find for sale is insanely priced ($10-20/GB). Nor does anyone commonly sell them, even if I were made of cash. It's kind of annoying that to get enterprise stuff at home you have to hit secondary markets a la eBay, sorta like a crackhead hitting a swapmeet for off-the-back-of-the-truck goods. Definitely not something I want to get refurbed after some enterprise has run SQL DBs off it for 2 years already. I might pay double to avoid the stress of rebuilding the OS yearly with crap SSDs (which it seems 98% are), but not 10-20x.

-mb
Re: raid (was RE: OT: Dell disks)
I have never liked raid 5 but can still see its use. And while you are 100% correct that raid is not a backup, it is a good feature for performance needs and overall uptime, so you can keep running in case of a single disk failure, which I have dealt with.
Re: OT: Dell disks
I work with Dells at work, including the 620; any HDD will do. The motherboard/case combination, however, is all sorts of proprietary. But the 620 took Ubuntu 12.04 with GNOME 3 like a champ.

On Jun 18, 2012 10:24 PM, Mark Jarvis m.jar...@cox.net wrote:
I'm considering buying a Dell desktop (Inspiron 620), but a few years ago I was warned off them because Dell did something different to their disks so that you had to buy replacement/additional disks only from Dell. Any chance that it's still true?
Re: OT: Dell disks
Hi Mark,

On Mon, Jun 18, 2012 at 10:05 PM, Mark Jarvis m.jar...@cox.net wrote:
I'm considering buying a Dell desktop (Inspiron 620), but a few years ago I was warned off them because Dell did something different to their disks so that you had to buy replacement/additional disks only from Dell. Any chance that it's still true?

Since you are buying a desktop, you should not need enterprise grade drives unless you have a hardware RAID card, but check with Dell Support for the model you are interested in.

You are referring to TLER/ERC/CCTL:

Hard drive manufacturers are drawing a distinction between desktop grade and enterprise grade drives. The desktop grade drives can take a long time (~2 minutes) to respond when they find an error, which causes most RAID systems to label them as failed and drop them from the array. The solution provided by the manufacturers is for us to purchase the enterprise grade drives, at twice the cost, which report errors promptly enough that this isn't a problem. This enterprise feature is variously called TLER, ERC, or CCTL.

*The Problem:*

There are three problems with this situation:

The first is that it flies in the face of the word *Inexpensive* in the acronym *Redundant Arrays of Inexpensive Disks (RAID)* http://www-2.cs.cmu.edu/%7Egarth/RAIDpaper/Patterson88.pdf .

The second is that when a drive starts to fail, you want to know about it, as Miles Nordin wrote in a long thread http://opensolaris.org/jive/thread.jspa?threadID=119639&tstart=0 :

*Possible Solutions:*

For a while, Western Digital released a program (WDTLER.EXE) that made it possible to enable TLER on desktop grade drives. This no longer works.

*Linux:*

This message http://marc.info/?l=linux-raid&m=128640221813394&w=2 implies that it's impossible to tell a drive to cancel its bad read operation:

You can set the ERC values of your drives. Then they'll stop processing their internal error recovery procedure after the timeout and continue to react. Without ERC-timeout, the drive tries to correct the error on its own (not reacting to any requests), mdraid assumes an error after a while and tries to rewrite the missing sector (assembled from the other disks). But the drive will still not react to the write request as it is still doing its internal recovery procedure. Now mdraid assumes the disk to be bad and kicks it. There's nothing you can do about this vicious circle except either enabling ERC or using Raid-Edition disks (which have ERC enabled by default).

Evidence that using ATA ERC commands doesn't always work:

Both Linux and FreeBSD can use normal desktop drives without TLER, and in fact you *would not even want TLER* in such a case, since *TLER can be dangerous* in some circumstances. Read on.

*What is TLER/CCTL/ERC?*

TLER (Time-Limited Error Recovery)
CCTL (Command Completion Time Limit)
ERC (Error Recovery Control)

These basically mean the same thing: limit the number of seconds the hard drive spends trying to recover a weak or bad sector. TLER and the other variants are typically configured to 7 seconds, meaning that if the drive has not managed to recover that sector within 7 seconds, it will give up and forfeit recovery, and return an I/O error to the host instead.

The behavior without TLER is that up to 120 seconds (20-60 is more frequent) may pass before a disk gives up recovery. This behavior wreaks havoc on all Hardware RAID and Windows-based software/onboard/driver RAIDs. The RAID controller is typically configured to consider disks that don't respond in 10 seconds as completely failed, which is bizarre to say the least! This smells like the vendors have some sort of deal causing you to buy HDDs at twice the price just for a simple firmware fix. LOL!! Don't get yourself ripped off; read on!

*When do I need TLER?*

You need TLER-capable disks when using any Hardware RAID or any Windows-based software RAID; bummer if you're on the Windows platform! But this also means Hardware RAID on any OS (FreeBSD/Linux) would also need TLER disks, even when configured to run as a 'JBOD' array. There may be controllers with different firmware that allow you to set the timeout limit for I/O, but I've not yet heard about specific products, except some LSI 1068E in IR mode; reputable vendors like Areca (FW1.43) certainly require TLER-enabled disks, or they will drop out like candy whenever you encounter a bad/weak sector that needs longer recovery than 10 seconds.

Basically, if you use a RAID platform that DEMANDS the disks to respond within 10 seconds, and will KICK OUT disks that do not respond in time, then you need TLER.

*When don't I need TLER?*

When using FreeBSD/Linux software RAID on an HBA controller, which is a RAID-less controller. Areca HW RAID running in JBOD mode is still a RAID controller; it controls whether the disks are detached, not the OS. With a true HBA like LSI 1068E (Intel SASUC8i) your OS would have control about whether to detach the disk or not; and Linux/BSD won't, at least not for
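The timing argument in the message above boils down to comparing two numbers: how long the drive stays busy on internal recovery, and how long the RAID layer waits before declaring the disk dead. A minimal Python sketch of that comparison follows. It is a toy model, not how any controller or driver actually decides; the function name and the example figures (7 s ERC limit, 10 s controller timeout, 60 s recovery) are taken from the message or made up for illustration.

```python
def drive_gets_dropped(recovery_seconds, erc_limit_seconds, host_timeout_seconds):
    """Toy model of the TLER/ERC timing race described above.

    recovery_seconds:     how long the drive would spend on internal recovery
    erc_limit_seconds:    ERC/TLER/CCTL cap, or None if the drive has no cap
    host_timeout_seconds: how long the RAID layer waits before failing the disk
    """
    if erc_limit_seconds is None:
        busy = recovery_seconds
    else:
        busy = min(recovery_seconds, erc_limit_seconds)
    return busy > host_timeout_seconds


if __name__ == "__main__":
    # Desktop drive, no ERC, controller that insists on an answer in 10 s.
    print(drive_gets_dropped(60, None, 10))   # True: the disk is kicked
    # Same bad sector, but the drive gives up after 7 s and reports the error.
    print(drive_gets_dropped(60, 7, 10))      # False: the array handles the error
```

On drives that support it, the ERC limit can usually be read or set with smartctl -l scterc (for example smartctl -l scterc,70,70 /dev/sdX, where the values are in tenths of a second, so 70 means 7.0 seconds). Whether a given drive accepts the command, and whether the setting survives a power cycle, varies by model, so treat that as something to verify rather than a guarantee.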
Re: OT: Dell disks
Thanks to all who responded. Sounds like no problem.

Thanks again,
Mark
Re: OT: Dell disks
Lesser known fact, but I stopped buying Western Digital drives because they purposely began removing the firmware function for toggling TLER, pushing people to buy their 2x cost enterprise drives to *support* raid. It turned into an internet snafu at one point from the backlash, as they ripped it out mid-run of drives, so half worked and half didn't (their popular Black drives too, known for performance and suitability for raid).

The only real difference IS the firmware (and warranty, but meh), so removing the ability to toggle TLER so as NOT to cook my raid is rather offensive, especially when done simply in the name of selling drives at higher margins. I haven't bought a WD HD in a good 3-4 years now because of it.

The sad part is I migrated to Hitachi disks for a few years as they don't cripple their disks, and now they got borg'd by WD, which I'm certain will just muck those up too to push raid==enterprise==high margin. I refuse to buy Seagate since they integrated the dubious Maxtor junk, so now I/we're just about out of options for cost-effective disks that don't suck.

So much for choices; industry consolidation is for the best though!

-mb
RE: OT: Dell disks
Disclaimer time: I work for an SSD manufacturer. We sell high-end 'enterprise' SSDs. However, while some of what I say is based upon my experience there, NONE of it is official, and all of it is my personal opinion. Just remember what you paid for it! ;-)

There are at least 2 levels of 'SSD' - enterprise and consumer. Contrary to what was said about rotating drives and that weird 'error fast' thing, in SSD-land there is a large difference between 'enterprise' and 'consumer'. Enterprise drives are designed to run at very near full rated TPS and/or MB/s continuously 24x7x365. Consumer drives won't last anywhere near that. (Consumer drives are often implemented using the same technology (and expected lifetime) that is used in memory sticks. If that doesn't scare you to death nothing will :-))

In any case, I'd look at the SMART attributes. Use smartctl -a and see if anything is nearing the threshold. Send the output to me personally if you want me to look at it, I'll be happy to. (As long as I don't get 2,000 of those tomorrow, anyway! ;-)

And some other day I'll say what I think of RAID 5 :-)

Rusty

-Original Message-
From: plug-discuss-boun...@lists.plug.phoenix.az.us on behalf of Michael Butash
Sent: Tue 6/19/2012 9:13 PM
To: plug-discuss@lists.plug.phoenix.az.us
Subject: Re: OT: Dell disks

On 06/19/2012 12:48 PM, Eric Shubert wrote:
On 06/19/2012 06:28 AM, Lisa Kachold wrote:
Hi Mark, ...
I'll continue to steer clear of HW raid, as well as raid-5. :)

So yeah, no raid is perfect...

To its credit, it rebooted, both disks reported healthy (hdparm, Ubuntu disk utility), I re-added each partition, let it rebuild, and it works again. It still worries me, as one of my last set of SSDs got unstable after less than 9 months of use, and I'm probably about there with these, which are known to get cranky. SMART reports them as OK, so I wonder how badly ATI taints the kernel space that it causes disk controller/driver exceptions.

Moral of the story: know when/how to repair whatever raid you run, as software and hardware are seemingly still prone to exceptions from unlikely places. Last time a disk died with md, I just mounted the secondary in an enclosure, copied the data off as a pluggable disk, and copied it to the new pair of raid disks. Hardware raid is never this easy, especially fakeraids.

-mb
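Rusty's suggestion to check whether any SMART attribute is nearing its threshold can be sketched in Python as below. The sketch assumes the attribute table layout that smartctl commonly prints for ATA devices (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, ...). The function name near_threshold, the margin of 10, and the SAMPLE rows are all made up for illustration; they are not readings from any drive discussed in this thread, and which attributes matter differs between SSD vendors.

```python
def near_threshold(smart_text, margin=10):
    """Return [(name, value, thresh)] for attributes close to their threshold.

    Assumes rows shaped like the smartctl ATA attribute table:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    Attributes with a threshold of 0 are skipped (they can never "fail").
    """
    flagged = []
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].isdigit():
            continue  # header or unrelated line
        if not (fields[3].isdigit() and fields[5].isdigit()):
            continue
        name, value, thresh = fields[1], int(fields[3]), int(fields[5])
        if thresh > 0 and value - thresh <= margin:
            flagged.append((name, value, thresh))
    return flagged


# Hypothetical sample rows, shaped like smartctl output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   012   012   010    Pre-fail  Always       -       3521
194 Temperature_Celsius     0x0022   064   052   000    Old_age   Always       -       36
"""

if __name__ == "__main__":
    for name, value, thresh in near_threshold(SAMPLE):
        print(f"{name}: value {value} is close to threshold {thresh}")
```

Normalized values count down toward the threshold, so a small gap between VALUE and THRESH is the thing to look for. A clean result here does not prove the drive is healthy, which matches the experience described above where SMART reported OK on a flaky setup.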
Re: OT: Dell disks
I have several Dells and have added HDs to 2 of them w/o any problem. They are a little older. Things could have changed. I'd do more due diligence though, just to make sure.

Keith Smith

--- On Mon, 6/18/12, Mark Jarvis m.jar...@cox.net wrote:

From: Mark Jarvis m.jar...@cox.net
Subject: OT: Dell disks
To: Main PLUG discussion list plug-discuss@lists.plug.phoenix.az.us
Date: Monday, June 18, 2012, 10:05 PM

I'm considering buying a Dell desktop (Inspiron 620), but a few years ago I was warned off them because Dell did something different to their disks so that you had to buy replacement/additional disks only from Dell. Any chance that it's still true?