Re: OT: Dell disks

2012-06-21 Thread Eric Shubert

On 06/19/2012 09:13 PM, Michael Butash wrote:

So yeah, no raid is perfect...

I've been using software raid1 (md) for a while now on my desktops and
laptops at work and home, and since my adventures in ati gpu land, I've
twice now had video software/hardware cause my software raid to fail
ugly, but both times it was survivable while I rebuilt the array manually.
The last time was just a few days ago...

Both times I was using GL functions (this time toggling compositing
on/off, last time I think minecraft) that caused the ati fglrx drivers
to spew hardware errors, seeming to glitch the card itself.  Two separate
cards as well now.  Getting back into the desktop dropped me into vesa mode
with the gpu unavailable.  Then I saw my raid was degraded, again, with the
same timestamp as the gpu glitch.

The first time, one of the two disks in the md for boot went offline, and I
simply added sdb1/2 back.  This time one partition on each disk dropped out
of the two md's (boot/else), alternating between disks (sda1/sdb2) - very odd.
The second disk wouldn't respond to hdparm/fdisk queries until a reboot,
which I did very hesitantly and not before I backed up anything I
cared about to an nfs share.  Data on both remained available, which was
really the odd part.

To its credit, it rebooted, both disks reported healthy (hdparm,
ubuntu disk utility), I re-added each partition, let it rebuild, and
it works again.  It still worries me, as one of my last set of ssd disks got
unstable after less than 9 months of use, and I'm probably about there with
these, which are known to get cranky.  Smart reports them as ok, so I
wonder how badly ati taints the kernel space that it causes disk
controller/driver exceptions.

Moral of story: know when/how to repair whatever raid, as software and
hardware are seemingly still prone to exception from unlikely places.
Last time a disk died with md, I just mounted the secondary in an
enclosure, copied off data as pluggable, and copied to the new pair of
raid disks.  Hardware raid is never this easy, especially fakeraids.

-mb


I use software raid strictly on servers, which are headless (of course). 
I keep data on a server when practical, and run a daily rsync (offsite) 
backup of data on my workstation. Interesting to note the problems with 
video though. You might want to hop onto the raid list 
(gmane.linux.raid) and see what they think about it.


I don't know why anyone would run SSDs in a raid. RAID IS NOT A BACKUP. 
Of course if you're concerned about device failure and need to maintain 
continuity, raid is entirely appropriate.


One of the nice things about sw raid-1 is that as long as you have one 
good drive/partition, then you can recover. Just start the array in 
degraded mode, and you're good to go. I know that grub can access a 
single raid-1 partition w/out starting the array, which makes me wonder 
if you can simply mount one of the raid-1 partitions straight away w/out 
starting the array. I should try that so I know for sure if it's 
possible or not.
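
Before re-adding a partition to a degraded raid-1, it helps to confirm which array is actually missing a member. The following is a minimal Python sketch, assuming the usual /proc/mdstat layout in which each array's status line ends in something like `[2/1] [U_]`. The sample text and device names are made up for illustration, and the function name is hypothetical.

```python
import re

# Made-up sample in the usual /proc/mdstat layout (device names are hypothetical).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      497856 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      976261120 blocks [2/1] [U_]

unused devices: <none>
"""

def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member.

    In the member map, 'U' means the slot is up and '_' means it is missing.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded[current] = status.group(3)
            current = None
    return degraded

if __name__ == "__main__":
    # On a real system you would read the text from /proc/mdstat instead.
    for name, members in degraded_arrays(SAMPLE_MDSTAT).items():
        # A missing partition is typically re-added in mdadm manage mode,
        # e.g. `mdadm /dev/md1 --add /dev/sdb2` (device names hypothetical).
        print(f"{name} is degraded: [{members}]")
```

Once the partition is re-added, the same file shows the rebuild progress until the member map returns to all `U`.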


BL, *never* use fakeraid, and avoid raid-5 if possible. Disk space is no 
longer expensive enough to justify using raid-5.


--
-Eric 'shubes'

---
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss


raid (was RE: OT: Dell disks)

2012-06-21 Thread Carruth, Rusty



-----Original Message-----
From: plug-discuss-boun...@lists.plug.phoenix.az.us on behalf of Eric Shubert

On 06/19/2012 09:13 PM, Michael Butash wrote:
  So yeah, no raid is perfect...
 
 ...
  -mb
 
 I use software raid strictly on servers, which are headless (of course). 
 ...
 
 I don't know why anyone would run SSDs in a raid. RAID IS NOT A BACKUP. 

Rather than my guessing, would you mind explaining your reasons?  I'm curious.

 ...
 
 BL, *never* use fakeraid, and avoid raid-5 if possible. Disk space is no 
 longer expensive enough to justify using raid-5.

Wow, someone else who agrees with me - IMHO, if it's important enough to need 
raid, don't try to skimp and save a few bucks so you can lose your data!

Rusty

Re: raid (was RE: OT: Dell disks)

2012-06-21 Thread Gilbert T. Gutierrez, Jr.
Other than the case when 2 drives failed, Raid 5 worked for me for many 
years. If you are using simple mirroring, though, 2 drives failing will 
cause the same issue. I now use Raid 6 for a little more redundancy. 
Always back up your data to other storage (offsite if possible) in case 
of disaster.
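
The trade-off between the levels mentioned in this thread can be put in numbers. This is a rough Python sketch, assuming equal-sized disks and the textbook definitions of raid-1, raid-5 and raid-6; the function name and the disk sizes are illustrative only, not taken from anyone's actual setup.

```python
def raid_summary(level, disks, disk_tb):
    """Return (usable_tb, failures_tolerated) for equal-sized disks.

    failures_tolerated is the number of disk failures the array is
    guaranteed to survive, regardless of which disks fail.
    """
    if level == 1:
        if disks < 2:
            raise ValueError("raid-1 needs at least 2 disks")
        # An n-way mirror stores one disk's worth of data on every member.
        return disk_tb, disks - 1
    if level == 5:
        if disks < 3:
            raise ValueError("raid-5 needs at least 3 disks")
        # One disk's worth of capacity goes to parity.
        return (disks - 1) * disk_tb, 1
    if level == 6:
        if disks < 4:
            raise ValueError("raid-6 needs at least 4 disks")
        # Two disks' worth of capacity goes to parity.
        return (disks - 2) * disk_tb, 2
    raise ValueError(f"unsupported raid level: {level}")

if __name__ == "__main__":
    for level, disks in [(1, 2), (5, 4), (6, 4)]:
        usable, tolerated = raid_summary(level, disks, 2.0)
        print(f"raid-{level} on {disks} x 2.0 TB: "
              f"{usable} TB usable, survives {tolerated} failure(s)")
```

A two-disk mirror and a raid-5 both stop at one guaranteed failure, which is the point made above; raid-6 buys the second failure at the cost of another disk's capacity.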


Gilbert


Re: raid (was RE: OT: Dell disks)

2012-06-21 Thread Michael Butash
Agreed - I have a separate nas that I regularly rsync important data to 
from my laptop, as I often work away from home; here I just use the 
nas.  Problem is I don't always have my laptop on at home, and it sometimes 
goes weeks without replication.


I run the ssd's in raid1 as a) i want *some* disk-level redundancy so as 
not to rebuild my desktop from scratch yearly when they puke and b) want 
the speed, so am willing to deal with their questionable nature.


I'd buy some of Rusty's company's industrial ssd's, but they don't seem 
to want to sell them easily anywhere, and what I do find for 
sale is insanely priced ($10-20/gb).  Nor does anyone commonly sell them, 
even if I were made of cash.  Kind of annoying that to get enterprise stuff 
at home you have to hit secondary markets ala ebay, sorta like a 
crackhead hitting a swapmeet for off the back of the truck goods. 
Definitely not something I want to get refurbed after some enterprise 
has run sql db's off it for 2 years already.


I might pay double to avoid the stress of rebuilding the os yearly with 
crap ssd's (which it seems 98% are), but not 10-20x.


-mb




Re: raid (was RE: OT: Dell disks)

2012-06-21 Thread Stephen
I have never liked raid 5 but can still see its use. And while you are 100%
correct in the statement that raid is not a backup, it is a good
feature for performance needs and overall uptime, so you can keep running in
case of a single disk failure, which I have dealt with.

Re: OT: Dell disks

2012-06-19 Thread Stephen
I work with Dells at work, including the 620, and any hdd will do. The
motherboard/case combination, however, is all sorts of proprietary.

But the 620 took Ubuntu 12.04 with gnome 3 like a champ.
On Jun 18, 2012 10:24 PM, Mark Jarvis m.jar...@cox.net wrote:


 I'm considering buying a Dell desktop (Inspiron 620), but a few years ago
 I was warned off them because Dell did something different to their disks
 so that you had to buy replacement/additional disks only from Dell. Any
 chance that it's still true?


Re: OT: Dell disks

2012-06-19 Thread Lisa Kachold
Hi Mark,

On Mon, Jun 18, 2012 at 10:05 PM, Mark Jarvis m.jar...@cox.net wrote:


 I'm considering buying a Dell desktop (Inspiron 620), but a few years ago
 I was warned off them because Dell did something different to their disks
 so that you had to buy replacement/additional disks only from Dell. Any
 chance that it's still true?

Unless you have a hardware RAID card, and since you are buying a desktop, you
should not need enterprise grade drives, but check with Dell Support for
the model you are interested in.

You are referring to TLER/ERC/CCTL:

Hard drive manufacturers are drawing a distinction between desktop grade
and enterprise grade drives. The desktop grade drives can take a long
time (~2 minutes) to respond when they find an error, which causes most
RAID systems to label them as failed and drop them from the array. The
solution provided by the manufacturers is for us to purchase the
enterprise grade drives, at twice the cost, which report errors promptly
enough so that this isn't a problem. This enterprise feature is variously
called TLER, ERC, or CCTL.

*The Problem:*

There are three problems with this situation:

The first is that it flies in the face of the word *Inexpensive* in the
acronym *Redundant Arrays of Inexpensive Disks (RAID)*
http://www-2.cs.cmu.edu/%7Egarth/RAIDpaper/Patterson88.pdf.

The second is that when a drive starts to fail, you want to know about it,
as Miles Nordin wrote in a long thread
http://opensolaris.org/jive/thread.jspa?threadID=119639&tstart=0:

*Possible Solutions:*

For a while, Western Digital released a program (WDTLER.EXE) that made it
possible to enable TLER on desktop grade drives. This no longer works.

*Linux:*

This message http://marc.info/?l=linux-raid&m=128640221813394&w=2 implies
that it's impossible to tell a drive to cancel its bad read operation:

You can set the ERC values of your drives. Then they'll stop processing
their internal error recovery procedure after the timeout and continue
to react. Without ERC-timeout, the drive tries to correct the error on
its own (not reacting on any requests), mdraid assumes an error after a
while and tries to rewrite the missing sector (assembled from the
other disks). But the drive will still not react to the write request
as it is still doing its internal recovery procedure. Now mdraid
assumes the disk to be bad and kicks it.

There's nothing you can do about this vicious circle except either
enabling ERC or using Raid-Edition disks (which have ERC enabled by default).
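
On drives that accept it, ERC is commonly set with smartctl's `-l scterc` option, which takes read and write limits in tenths of a second. The sketch below only builds the command line and does not run it. It assumes smartmontools is installed and that the drive supports SCT ERC, which many desktop drives do not; the helper name and the device path are hypothetical.

```python
def scterc_command(device, seconds=7.0):
    """Build a smartctl command that sets read and write ERC limits.

    smartctl expects the limits in deciseconds, so 7 seconds becomes 70.
    """
    deciseconds = int(round(seconds * 10))
    return ["smartctl", "-l", f"scterc,{deciseconds},{deciseconds}", device]

if __name__ == "__main__":
    # Print the command rather than executing it; running it needs root
    # and a drive that actually supports SCT Error Recovery Control.
    print(" ".join(scterc_command("/dev/sda")))
```

Running `smartctl -l scterc` with no values only reports the current setting, which is a safe way to check whether a drive supports the feature at all.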

Evidence that using ATA ERC commands doesn't always work:
Both Linux and FreeBSD can use normal desktop drives without TLER, and in
fact you *would not even want TLER* in such a case, since *TLER can be
dangerous* in some circumstances. Read on.


*What is TLER/CCTL/ERC?*
TLER (Time-Limited Error Recovery)
CCTL (Command Completion Time Limit)
ERC (Error Recovery Control)

These basically mean the same thing: limit the number of seconds the
harddrive spends on trying to recover a weak or bad sector. TLER and the
other variants are typically configured to 7 seconds, meaning that if the
drive has not managed to recover that sector within 7 seconds, it will give
up and forfeit recovery, and return an I/O error to the host instead.

The behavior without TLER is that up to 120 seconds (20-60 is more
frequent) may pass before a disk gives up recovery. This behavior wreaks
havoc on all Hardware RAID and Windows-based software/onboard/driver RAIDs.
The RAID controller is typically configured to consider disks that don't
respond in 10 seconds as completely failed; which is bizarre to say the
least! This smells like the vendors have some sort of deal causing you to
buy HDDs at twice the price just for a simple firmware fix. LOL!! Don't get
ripped off; read on!


*When do I need TLER?*
You need TLER-capable disks when using any Hardware RAID or any
Windows-based software RAID; bummer if you're on the Windows platform! But this
also means Hardware RAID on any OS (FreeBSD/Linux) would also need TLER
disks; even when configured to run as a 'JBOD' array. There may be
controllers with different firmware that allow you to set the timeout limit
for I/O; but I've not yet heard about specific products, except some LSI
1068E in IR mode; but reputable vendors like Areca (FW1.43) certainly
require TLER-enabled disks or they will drop out like candy whenever you
encounter a bad/weak sector that needs longer recovery than 10 seconds.

Basically, if you use a RAID platform that DEMANDS the disks to respond
within 10 seconds, and will KICK OUT disks that do not respond in time,
then you need TLER.
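
The rule above comes down to comparing two timers. Here is a toy Python model of it, assuming the 10-second controller timeout and 7-second ERC limit quoted in this message; the function name is hypothetical and real controllers will differ in the details.

```python
def drive_gets_dropped(recovery_needed_s, erc_limit_s, raid_timeout_s=10.0):
    """Return True if the RAID layer gives up before the drive answers.

    recovery_needed_s: how long the drive would spend retrying a bad sector.
    erc_limit_s: the TLER/ERC/CCTL limit, or None if the drive has none.
    raid_timeout_s: how long the RAID layer waits before failing the disk.
    """
    if erc_limit_s is None:
        # Without ERC the drive stays silent for the whole recovery attempt.
        response_time = recovery_needed_s
    else:
        # With ERC the drive returns an I/O error once the limit is reached.
        response_time = min(recovery_needed_s, erc_limit_s)
    return response_time > raid_timeout_s

if __name__ == "__main__":
    # A desktop drive retrying a weak sector for 60 seconds, no ERC.
    print(drive_gets_dropped(60, None))
    # The same sector on a drive with a 7-second ERC limit.
    print(drive_gets_dropped(60, 7.0))
```

The model ignores what happens after the error is reported; it only shows why a limit below the controller's timeout keeps the disk in the array.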

*When don't I need TLER?*
When using FreeBSD/Linux software RAID on a HBA controller; which is a
RAID-less controller. Areca HW RAID running in JBOD mode is still a RAID
controller; it controls whether the disks are detached, not the OS. With a
true HBA like LSI 1068E (Intel SASUC8i) your OS would have control about
whether to detach the disk or not; and Linux/BSD won't, at least not for 

Re: OT: Dell disks

2012-06-19 Thread Mark Jarvis

  
  

Thanks to all who responded. Sounds like no problem.

Thanks again,

Mark
  
  


Re: OT: Dell disks

2012-06-19 Thread Michael Butash
Lesser known fact, but I stopped buying Western Digital drives because 
they purposely began removing the firmware function to disable tler, 
purposely pushing people to buy their 2x cost enterprise drives to 
*support* raid.  It turned into an inet snafu at one point from the backlash, 
as they ripped it out mid-run of drives so half worked, half didn't 
(their popular black drives too, known for performance and suitability 
for raid).  The only real difference IS the firmware (and warranty, but 
meh), so their removing the tler disable ability to NOT cook my raid is 
rather offensive, especially simply in the name of selling drives at 
higher margins.  I haven't bought a WD HD in a good 3-4 years now 
because of it.


Sad thing is I migrated to using Hitachi disks for a few years as they don't 
cripple their disks, and now they got borg'd by WD, which I'm certain 
will just muck those up too to push for raid==enterprise==high 
margin.  I refuse to buy Seagate since they integrated the dubious Maxtor 
junk, so now I/we're just about out of options for cost-effective disks 
that don't suck.


So much for choices, industry consolidation is for the best though!

-mb



Re: OT: Dell disks

2012-06-19 Thread Eric Shubert

On 06/19/2012 06:28 AM, Lisa Kachold wrote:


Re: OT: Dell disks

2012-06-19 Thread Michael Butash

On 06/19/2012 12:48 PM, Eric Shubert wrote:

On 06/19/2012 06:28 AM, Lisa Kachold wrote:

Hi Mark,

On Mon, Jun 18, 2012 at 10:05 PM, Mark Jarvis m.jar...@cox.net
mailto:m.jar...@cox.net wrote:


I'm considering buying a Dell desktop (Inspiron 620), but a few
years ago I was warned off them because Dell did something different
to their disks so that you had to buy replacement/additional disks
only from Dell. Any chance that it's still true?

Unless you have a hardware RAID card, and you are buying a desktop, you
should not have enterprise grade drives, but check with Dell Support for
the model you are interested in.
You are referring to TLER/ERC/CCTL:

Hard drive manufacturers are drawing a distinction between desktop
grade and enterprise grade drives. The desktop grade drives can take
a long time (~2 minutes) to respond when they find an error, which
causes most RAID systems to label them as failed and drop them from the
array. The solution provided by the manufacturers is for us to purchase
the enterprise grade drives, at twice the cost, which report errors
promptly enough so that this isn't a problem. This enterprise feature
is called TLER, ERC, and CCTL.

*The Problem:*

There are three problems with this situation:

The first is that it flies in the face of the word *Inexpensive* in the
acronym *Redundant Arrays of /Inexpensive/ Disks (RAID)*
http://www-2.cs.cmu.edu/%7Egarth/RAIDpaper/Patterson88.pdf.

The second is that when a drive starts to fail, you want to know about
it, as Miles Nordin wrote in a long thread
http://opensolaris.org/jive/thread.jspa?threadID=119639tstart=0:
*
Posssible Solutions:*

For a while, Western Digital released a program (WDTLER.EXE) that made
it possible to enable TLER on desktop grade drives. This no longer works.

*Linux:*

This message http://marc.info/?l=linux-raidm=128640221813394w=2
implies that it's impossible to tell a drive to cancel its bad read
operation:

You can set the ERC values of your drives. Then they'll stop processing
their internal error recovery procedure after the timeout and continue
to react. Without ERC-timeout, the drive tries to correct the error on
its own (not reacting on any requests), mdraid assumes an error after a
while and tries to rewrite the missing sector (assembled from the
other disks). But the drive will still not react to the write request
as it is still doing its internal recovery procedure. Now mdraid
assumes the disk to be bad and kicks it.

There's nothing you can do about this viscious circle except either
enabling ERC or using Raid-Edition disk (which have ERC enabled by
default).

Evidence that using ATA ERC commands don't always work:
Both Linux and FreeBSD can use normal desktop drives without TLER, and
in fact you *would not even want TLER* in such a case, since *TLER can
be dangerous* in some circumstances. Read on.


*What is TLER/CCTL/ERC?*
TLER (Time-Limited Error Recovery
CCTL (Command Completion Time Limit)
ERC (Error Recovery Control)

These basically mean the same thing: limit the number of seconds the
harddrive spends on trying to recover a weak or bad sector. TLER and the
other variants are typically configured to 7 seconds, meaning that if
the drive has not managed to recover that sector within 7 seconds, it
will give up and forfeit recovery, and return an I/O error to the host
instead.

The behavior without TLER is that up to 120 seconds (20-60 is more
frequent) may pass before a disk gives up recovery. This behavior causes
haywire on all Hardware RAID and Windows-based software/onboard/driver
RAIDs. The RAID consider typically is configured to consider disks that
don't respond in 10 seconds as completely failed; which is bizarre to
say the least! This smells like the vendors have some sort of deal
causing you to buy HDDs at twice the price just for a simple firmware
fix. LOL!! Don't get yourself buttraped; read on!


*When do i need TLER?*
You need TLER-capable disks when using any Hardware RAID or any
Windows-based software RAID; bummer if you're on Windows platform! But
this also means Hardware RAID on any OS (FreeBSD/Linux) would also need
TLER disks; even when configured to run as 'JBOD' array. There may be
controllers with different firmware that allow you to set the timeout
limit for I/O; but i've not yet heard about specific products, except
some LSI 1068E in IR mode; but reputable vendors like Areca (FW1.43)
certainly require TLER-enabled disks or they will drop-out like candy
whenever you encounter a bad/weak sector that needs longer recovery than
10 seconds.

Basically, if you use a RAID platform that DEMANDS the disks to respond
within 10 seconds, and will KICK OUT disks that do not respond in time,
then you need TLER.
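
The rule above can be sketched as a toy model. This is an illustration
only, not how any real controller is implemented: the function name
raid_outcome and the example recovery times are assumptions, while the
10-second and 7-second figures come from the text.

```python
def raid_outcome(recovery_seconds, controller_timeout=10.0, erc_limit=None):
    """Toy model of one bad-sector read behind a RAID layer with a timeout.

    recovery_seconds:   how long the drive would need to recover the sector
    controller_timeout: how long the RAID layer waits before kicking the disk
    erc_limit:          TLER/ERC limit in seconds, or None if the drive has none
    """
    if erc_limit is not None and recovery_seconds > erc_limit:
        # With TLER the drive gives up early and reports an I/O error.
        responded_after, result = erc_limit, "io_error"
    else:
        # Without TLER (or within the limit) the drive keeps trying.
        responded_after, result = recovery_seconds, "recovered"
    if responded_after > controller_timeout:
        # The RAID layer stopped waiting and marked the whole disk as failed.
        return "disk_dropped"
    return result


print(raid_outcome(30))               # no TLER, 30 s recovery
print(raid_outcome(30, erc_limit=7))  # TLER at 7 s
print(raid_outcome(3))                # quick recovery, TLER irrelevant
```

In this model an "io_error" is the good outcome for a RAID member: the
array can rebuild that sector from redundancy instead of losing the disk.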

*When don't I need TLER?*
When using FreeBSD/Linux software RAID on an HBA controller, which is a
RAID-less controller. Areca HW RAID running in JBOD mode is still a RAID
controller; it, not the OS, controls whether the disks are detached. With
a true HBA like LSI 1068E (Intel 

RE: OT: Dell disks

2012-06-19 Thread Carruth, Rusty

Disclaimer time:  I work for an SSD manufacturer.  We sell high-end 
'enterprise' SSDs.

However, while some of what I say is based upon my experience there, NONE of it 
is official, and all of it is my personal opinion.  Just remember what you paid 
for it! ;-)

There are at least 2 levels of 'SSD' - enterprise and consumer.  Contrary to 
what was said about rotating drives and that weird 'error fast' thing, in 
SSD-land there is a large difference between 'enterprise' and 'consumer'.  
Enterprise drives are designed to run at very near full rated TPS and/or MB/S 
continuously 24x7x365.  Consumer drives won't last anywhere near that.  
(Consumer drives are often implemented using the same technology (and expected 
lifetime) that is used in memory sticks.  If that doesn't scare you to death 
nothing will :-))

In any case, I'd look at the SMART attributes.  Use smartctl -a and see if 
anything is nearing the threshold.  Send the output to me personally if you 
want me to look at it, I'll be happy to.  (As long as I don't get 2,000 of 
those tomorrow, anyway! ;-)
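
A rough sketch of that "nearing the threshold" check, assuming the usual
attribute table layout that smartctl prints (ID#, ATTRIBUTE_NAME, FLAG,
VALUE, WORST, THRESH, ...).  The sample rows are made up for illustration
and do not come from any real drive; the function name near_threshold and
the margin of 10 are arbitrary assumptions.

```python
# Made-up sample in the layout of the smartctl attribute table.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4521
177 Wear_Leveling_Count     0x0013   012   012   010    Pre-fail  Always       -       880
"""


def near_threshold(report, margin=10):
    """Return (name, value, thresh) for attributes within margin of THRESH."""
    flagged = []
    for line in report.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        # Skip rows whose VALUE or THRESH column is not a plain number.
        if not (parts[3].isdigit() and parts[5].isdigit()):
            continue
        name, value, thresh = parts[1], int(parts[3]), int(parts[5])
        # A threshold of 0 means the attribute can never trip, so ignore it.
        if thresh > 0 and value - thresh <= margin:
            flagged.append((name, value, thresh))
    return flagged


print(near_threshold(SAMPLE))
```

To try it on a real drive, the text of "smartctl -a /dev/sdX" (device name
is a placeholder) could be passed in place of SAMPLE.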

And some other day I'll say what I think of RAID 5 :-)

Rusty

-----Original Message-----
From: plug-discuss-boun...@lists.plug.phoenix.az.us on behalf of Michael Butash
Sent: Tue 6/19/2012 9:13 PM
To: plug-discuss@lists.plug.phoenix.az.us
Subject: Re: OT: Dell disks
 
On 06/19/2012 12:48 PM, Eric Shubert wrote:
 On 06/19/2012 06:28 AM, Lisa Kachold wrote:
 Hi Mark,
...
 I'll continue to steer clear of HW raid, as well as raid-5. :)


So yeah, no raid is perfect...



To its testament, it rebooted, both disks reported healthy (hdparm, 
ubuntu disk utility), I re-added each partition, let it rebuild, and 
works again.  Still worries me as my last set of ssd disks got unstable 
on one after less than 9 months of use and I'm probably about there with 
these that are known to get cranky.  Smart reports them as ok, so I 
wonder how bad ati taints the kernel space that it causes disk 
controller/driver exceptions.

Moral of story: know when/how to repair whatever raid, as software and 
hardware are seemingly still prone to exception from unlikely places. 
Last time a disk died with md, I just mounted the secondary in an 
enclosure, copied off data as pluggable, and copied to the new pair of 
raid disks.  Hardware is never this easy, especially fakeraids.

-mb
---
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss


Re: OT: Dell disks

2012-06-18 Thread keith smith

I have several Dells and have added HD's to 2 of them w/o any problem.  They 
are a little older.  Things could have changed.  I'd do more due diligence 
though, just to make sure.



Keith Smith

--- On Mon, 6/18/12, Mark Jarvis m.jar...@cox.net wrote:

From: Mark Jarvis m.jar...@cox.net
Subject: OT: Dell disks
To: Main PLUG discussion list plug-discuss@lists.plug.phoenix.az.us
Date: Monday, June 18, 2012, 10:05 PM


  


  
  


  I'm considering buying a Dell desktop (Inspiron 620), but a few
  years ago I was warned off them because Dell did something
  different to their disks so that you had to buy
  replacement/additional disks only from Dell. Any chance that it's
  still true?


  


