Re: Questions about RAID 6
On Sun, 02 May 2010, Mike Bird wrote:
> On Sun May 2 2010 13:24:30 Alexander Samad wrote:
> > My system used to become close to unusable on the 1st Sunday of the
> > month when mdadm did its resync. I had to write my own script so it
> > did not do multiple at the same time, turn off the hung process timer
> > and set cpufreq to performance.
>
> A long time ago there were problems like that. Nowadays s/w RAID
> handles rebuild so well that we don't even have to set
> "/proc/sys/dev/raid/speed_limit_max".

Tried that with a fsck or quotacheck?

--
"One disk to rule them all, One disk to find them. One disk to bring them
all and in the darkness grind them. In the Land of Redmond where the
shadows lie." -- The Silicon Valley Tarot

Henrique Holschuh

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100609014716.ga21...@khazad-dum.debian.net
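For anyone who does want to cap (or unleash) the monthly check, the sysctls Mike mentions are still the knobs to turn. A sketch with illustrative values -- the 200000 KiB/s figure is my own example, not a recommendation from this thread:

```shell
# Inspect the md resync/check bandwidth limits (values are KiB/s).
# Writing to them requires root; the writes are shown commented out.
MIN=1000        # floor md always tries to reach, even under load
MAX=200000      # ceiling during a check/resync (illustrative value)
for f in /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max; do
  if [ -e "$f" ]; then
    printf 'current %s: %s\n' "$f" "$(cat "$f")"
  else
    printf 'not present on this system: %s\n' "$f"
  fi
done
# To apply (as root):
#   echo $MIN > /proc/sys/dev/raid/speed_limit_min
#   echo $MAX > /proc/sys/dev/raid/speed_limit_max
```

Lowering speed_limit_max is the usual way to keep a check from starving interactive I/O; raising it shortens the check window on an idle box.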
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
Hi, Hugo:

On Tuesday 04 May 2010 20:25:52 Hugo Vanwoerkom wrote:
> martin f krafft wrote:
> > also sprach Hugo Vanwoerkom [2010.05.04.1808 +0200]:
> > > I forget your specifics, but you do RAID *and* backup regularly to
> > > an external lvm2?
> >
> > RAID is not a backup solution, it's an availability measure.
>
> But as data availability goes up by using RAID doesn't the need for
> backing up that same data go down? Or is this just semantics?

No, it isn't. While availability relates to a quantitative concept (how much time, out of the total, will this system be online?), backup relates to a qualitative concept (either it works, in which case I don't need the backup at all, or it doesn't work, in which case I need the backup, no excuses). Raising availability means that you'll need your backups (say) once every ten years instead of once a year. But when you need them, there's no option: you need them.

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/201006062103.54532.jesus.nava...@undominio.net
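To put rough numbers on the quantitative side (my own illustrative arithmetic, not figures from the thread), availability translates into downtime per year like this:

```shell
# Downtime per year for a few availability levels.
# Availability = uptime / total time; a non-leap year is 8760 hours.
for a in 99 99.9 99.99 99.999; do
  awk -v a="$a" 'BEGIN { printf "%s%% -> %.1f hours down/year\n", a, (1 - a/100) * 24 * 365 }'
done
```

RAID moves you down this table; a backup is what saves you when the event, however rare, actually happens.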
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
Eduardo M KALINOWSKI wrote:
> On Ter, 04 Mai 2010, Hugo Vanwoerkom wrote:
> > martin f krafft wrote:
> > > RAID is not a backup solution, it's an availability measure.
> >
> > But as data availability goes up by using RAID doesn't the need for
> > backing up that same data go down? Or is this just semantics?
>
> RAID does not protect you against accidentally deleting a file you did
> not want deleted. And the deletion will be immediately reflected on all
> disks in the array.

Good point. I had overlooked that.

Hugo

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/hrv60u$8i...@dough.gmane.org
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On Tue, 2010-05-04 at 14:50 -0500, Ron Johnson wrote:
> On 05/04/2010 11:08 AM, Hugo Vanwoerkom wrote:
> [snip]
> >
> > I forget your specifics, but you do RAID *and* backup regularly to an
> > external lvm2?
>
> No, no RAID for me *at home*. But at work I manage databases on all
> sorts of (to use a quaint old phrase) super-minicomputers, and if
> they ever needed to "resynchronize" monthly, we'd have tossed them
> out immediately, since for the money we spend on them, they're
> supposed to stay in sync *always*.

Maybe resync was the wrong word to use; the consistency of the array is checked, and it shows up in /proc/mdstat as a resync. Hardware RAID controllers do this as well. It's like checking your backup tapes to make sure you can restore if needed.

Alex

> --
> Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1273016433.20116.36.ca...@alex-mini.samad.com.au
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On 05/04/2010 11:08 AM, Hugo Vanwoerkom wrote:
[snip]
> I forget your specifics, but you do RAID *and* backup regularly to an
> external lvm2?

No, no RAID for me *at home*. But at work I manage databases on all sorts of (to use a quaint old phrase) super-minicomputers, and if they ever needed to "resynchronize" monthly, we'd have tossed them out immediately, since for the money we spend on them, they're supposed to stay in sync *always*.

-- Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4be07a94.5070...@cox.net
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On Ter, 04 Mai 2010, Hugo Vanwoerkom wrote:
> martin f krafft wrote:
> > RAID is not a backup solution, it's an availability measure.
>
> But as data availability goes up by using RAID doesn't the need for
> backing up that same data go down? Or is this just semantics?

RAID does not protect you against accidentally deleting a file you did not want deleted. And the deletion will be immediately reflected on all disks in the array.

--
Two is not equal to three, even for large values of two.

Eduardo M KALINOWSKI
edua...@kalinowski.com.br

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100504153038.43696zudmge10...@mail.kalinowski.com.br
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
martin f krafft wrote:
> also sprach Hugo Vanwoerkom [2010.05.04.1808 +0200]:
> > I forget your specifics, but you do RAID *and* backup regularly to an
> > external lvm2?
>
> RAID is not a backup solution, it's an availability measure.

But as data availability goes up by using RAID doesn't the need for backing up that same data go down? Or is this just semantics?

Hugo

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/hrporg$3u...@dough.gmane.org
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
also sprach Hugo Vanwoerkom [2010.05.04.1808 +0200]:
> I forget your specifics, but you do RAID *and* backup regularly to an
> external lvm2?

RAID is not a backup solution, it's an availability measure.

--
.''`.  martin f. krafft        Related projects:
: :' : proud Debian developer  http://debiansystem.info
`. `'` http://people.debian.org/~madduck  http://vcs-pkg.org
  `-   Debian - when you have better things to do than fixing systems

only by counting could humans demonstrate their independence of computers. -- douglas adams, "the hitchhiker's guide to the galaxy"

digital_signature_gpg.asc Description: Digital signature (see http://martin-krafft.net/gpg/)
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
Ron Johnson wrote:
> On 05/03/2010 03:45 AM, martin f krafft wrote:
> > also sprach Ron Johnson [2010.05.03.1039 +0200]:
> > > Is that Q21?
> > > http://git.debian.org/?p=pkg-mdadm/mdadm.git;a=blob_plain;f=debian/FAQ;hb=HEAD
> >
> > Yes.
> >
> > > > 2. You were asked upon mdadm installation whether you wanted it,
> > > >    and you chose to accept yes. dpkg-reconfigure mdadm if you
> > > >    don't believe in this feature.
> > >
> > > Well, not me, since I don't have a soft array...
> >
> > Then you probably have much more annoying problems, or will have. ;)
>
> I regularly backup to an external lvm2 LV.

I forget your specifics, but you do RAID *and* backup regularly to an external lvm2?

Hugo

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/hrpgqm$1a...@dough.gmane.org
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On 05/03/2010 08:04 PM, Sam Leon wrote:
> Ron Johnson wrote:
> > On 05/02/2010 03:24 PM, Alexander Samad wrote:
> > [snip]
> > > My system used to become close to unusable on the 1st Sunday of the
> > > month when mdadm did its resync,
> >
> > That sounds... wrong, on a jillion levels.
>
> I would rather the array fail on a monthly resync than have it fail on
> a resync after replacing a drive and lose everything.

Arrays shouldn't fall out of sync, so a "monthly resync" *is* wrong on a jillion levels. Fortunately, as Martin Krafft pointed out, it's really a monthly self-test, and -- to a Large Systems DBA -- that's OK.

-- Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bdf7fd1.80...@cox.net
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
Ron Johnson wrote:
> On 05/02/2010 03:24 PM, Alexander Samad wrote:
> [snip]
> > My system used to become close to unusable on the 1st Sunday of the
> > month when mdadm did its resync,
>
> That sounds... wrong, on a jillion levels.

I would rather the array fail on a monthly resync than have it fail on a resync after replacing a drive and lose everything.

Sam

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bdf72b7.90...@net153.net
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On 05/03/2010 03:45 AM, martin f krafft wrote:
> also sprach Ron Johnson [2010.05.03.1039 +0200]:
> > Is that Q21?
> > http://git.debian.org/?p=pkg-mdadm/mdadm.git;a=blob_plain;f=debian/FAQ;hb=HEAD
>
> Yes.
>
> > > 2. You were asked upon mdadm installation whether you wanted it,
> > >    and you chose to accept yes. dpkg-reconfigure mdadm if you
> > >    don't believe in this feature.
> >
> > Well, not me, since I don't have a soft array...
>
> Then you probably have much more annoying problems, or will have. ;)

I regularly backup to an external lvm2 LV.

-- Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bde8ff2.6070...@cox.net
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
also sprach Ron Johnson [2010.05.03.1039 +0200]:
> Is that Q21?
> http://git.debian.org/?p=pkg-mdadm/mdadm.git;a=blob_plain;f=debian/FAQ;hb=HEAD

Yes.

> > 2. You were asked upon mdadm installation whether you wanted it, and
> >    you chose to accept yes. dpkg-reconfigure mdadm if you don't
> >    believe in this feature.
>
> Well, not me, since I don't have a soft array...

Then you probably have much more annoying problems, or will have. ;)

Good luck,

--
.''`.  martin f. krafft        Related projects:
: :' : proud Debian developer  http://debiansystem.info
`. `'` http://people.debian.org/~madduck  http://vcs-pkg.org
  `-   Debian - when you have better things to do than fixing systems

#define emacs eighty megabytes and constantly swapping.

digital_signature_gpg.asc Description: Digital signature (see http://martin-krafft.net/gpg/)
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On 05/03/2010 01:21 AM, martin f krafft wrote:
> also sprach Ron Johnson [2010.05.02.2300 +0200]:
> > > My system used to become close to unusable on the 1st Sunday of
> > > the month when mdadm did its resync,
> >
> > That sounds... wrong, on a jillion levels.
>
> It sounds (and is) wrong in exactly two ways:
>
> 1. The operation is not a resync, check the FAQ.

Is that Q21?
http://git.debian.org/?p=pkg-mdadm/mdadm.git;a=blob_plain;f=debian/FAQ;hb=HEAD

> 2. You were asked upon mdadm installation whether you wanted it, and
>    you chose to accept yes. dpkg-reconfigure mdadm if you don't
>    believe in this feature.

Well, not me, since I don't have a soft array...

-- Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bde8bb4.4050...@cox.net
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
also sprach Ron Johnson [2010.05.02.2300 +0200]:
> > My system used to become close to unusable on the 1st sunday of
> > the month when mdadm did it resync,
>
> That sounds... wrong, on a jillion levels.

It sounds (and is) wrong in exactly two ways:

1. The operation is not a resync, check the FAQ.

2. You were asked upon mdadm installation whether you wanted it, and
   you chose to accept yes. dpkg-reconfigure mdadm if you don't
   believe in this feature.

--
.''`.  martin f. krafft        Related projects:
: :' : proud Debian developer  http://debiansystem.info
`. `'` http://people.debian.org/~madduck  http://vcs-pkg.org
  `-   Debian - when you have better things to do than fixing systems

"it takes more keystrokes to enter a windows license key than it takes to do a complete debian desktop install!" -- joey hess

digital_signature_gpg.asc Description: Digital signature (see http://martin-krafft.net/gpg/)
Re: Questions about RAID 6
On Sun, 2010-05-02 at 19:19 -0700, Mike Bird wrote:
> On Sun May 2 2010 13:24:30 Alexander Samad wrote:
> > My system used to become close to unusable on the 1st Sunday of the
> > month when mdadm did its resync. I had to write my own script so it
> > did not do multiple at the same time, turn off the hung process timer
> > and set cpufreq to performance.
>
> A long time ago there were problems like that. Nowadays s/w

Well, this was about 2-3 months ago. I had 3-4 large raid6/raid4 devices; if I didn't set the limit it would have taken too long.

Alex

> RAID handles rebuild so well that we don't even have to set
> "/proc/sys/dev/raid/speed_limit_max".
>
> --Mike Bird

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1272854157.8710.0.ca...@alex-mini.samad.com.au
Re: Questions about RAID 6
On Sun May 2 2010 13:24:30 Alexander Samad wrote:
> My system used to become close to unusable on the 1st Sunday of the
> month when mdadm did its resync. I had to write my own script so it did
> not do multiple at the same time, turn off the hung process timer and
> set cpufreq to performance.

A long time ago there were problems like that. Nowadays s/w RAID handles rebuild so well that we don't even have to set "/proc/sys/dev/raid/speed_limit_max".

--Mike Bird

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/201005021919.06455.mgb-deb...@yosemite.net
Re: md does a monthly resync?? (was Re: Questions about RAID 6)
On Sun, 2010-05-02 at 16:00 -0500, Ron Johnson wrote:
> On 05/02/2010 03:24 PM, Alexander Samad wrote:
> [snip]
> >
> > My system used to become close to unusable on the 1st Sunday of the
> > month when mdadm did its resync,
>
> That sounds... wrong, on a jillion levels.

Depends.

a...@max:~$ dpkg -S /etc/cron.d/mdadm
mdadm: /etc/cron.d/mdadm

cat of the file:

57 0 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet

Alex

> --
> Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1272850630.4909.3.ca...@alex-mini.samad.com.au
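While that cron job is running, progress is visible in /proc/mdstat. A sketch of pulling the progress figure out; the mdstat sample below is made up for illustration -- on a real box, substitute `cat /proc/mdstat`:

```shell
# Extract the check progress from mdstat-style output.
# "check = N%" appears on the progress line while checkarray is running.
sample='md0 : active raid6 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      3906764800 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]
      [===>.................]  check = 16.1% (314572800/1953382400) finish=182.4min speed=149796K/sec'
printf '%s\n' "$sample" | grep -o 'check = [0-9.]*%'
```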
md does a monthly resync?? (was Re: Questions about RAID 6)
On 05/02/2010 03:24 PM, Alexander Samad wrote:
[snip]
> My system used to become close to unusable on the 1st Sunday of the
> month when mdadm did its resync,

That sounds... wrong, on a jillion levels.

-- Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bdde7ef.3060...@cox.net
Re: Questions about RAID 6
On Mon, May 3, 2010 at 6:02 AM, Boyd Stephen Smith Jr. wrote:
> On Sunday 02 May 2010 06:00:38 Stan Hoeppner wrote:
[snip]
> Speeds on my md-RAID devices were comparable to speeds with my Areca HW
> RAID controller (16-port, PCI-X/SATA, battery-powered 128MB cache).
> Number of drives varied from 5 to 10. RAID levels 5 and 6 were both
> tested.
>
> Read throughput for both was the expected (# drives - # parity drives)
> * single drive throughput. Write throughput was less than expected in
> both cases, but I can't recall the exact figures.
[snip]
> It might be different when the system is under load, since the md-RAID
> depends on the host CPU and the HW RAID does not. However, adding an
> additional generic CPU (to reduce load) is both more useful and often
> less expensive than buying a HW RAID controller that is only used for
> RAID operations.

My system used to become close to unusable on the 1st Sunday of the month when mdadm did its resync. I had to write my own script so it did not do multiple at the same time, turn off the hung process timer and set cpufreq to performance.

Now with hardware RAID I don't notice it.

Alex

> [snip]
> --
> Boyd Stephen Smith Jr.
> b...@iguanasuicide.net
> ICQ: 514984 YM/AIM: DaTwinkDaddy
> http://iguanasuicide.net/

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/n2z836a6dcf1005021324k3da07720od5e5e24dc9fbf...@mail.gmail.com
Re: Questions about RAID 6
On Sunday 02 May 2010 06:00:38 Stan Hoeppner wrote:
> Good hardware RAID cards are really nice and give you some features you
> can't really get with md raid such as true "just yank the drive tray
> out" hot swap capability. I've not tried it, but I've read that md raid
> doesn't like it when you just yank an active drive. Drive fault LEDs
> and audible warnings are also nice with HW RAID solutions. The other
> main advantage is performance. Decent HW RAID is almost always faster
> than md raid, sometimes by a factor of 5 or more depending on the disk
> count and RAID level. Typically good HW RAID really trounces md raid
> performance at levels such as 5, 6, 50, 60, basically anything
> requiring parity calculations.

Speeds on my md-RAID devices were comparable to speeds with my Areca HW RAID controller (16-port, PCI-X/SATA, battery-powered 128MB cache). Number of drives varied from 5 to 10. RAID levels 5 and 6 were both tested.

Read throughput for both was the expected (# drives - # parity drives) * single drive throughput. Write throughput was less than expected in both cases, but I can't recall the exact figures.

Both support "just yank the drive out" if the (rest of) the hardware supports hot plugging. Alerting about failure is probably a bit better with a HW RAID controller, since it comes with visual and audible alarms.

It might be different when the system is under load, since the md-RAID depends on the host CPU and the HW RAID does not. However, adding an additional generic CPU (to reduce load) is both more useful and often less expensive than buying a HW RAID controller that is only used for RAID operations.

> Sounds like you're more of a casual user who needs lots of protected
> disk space but not necessarily absolute blazing speed. Linux RAID
> should be fine.

I know I am.

--
Boyd Stephen Smith Jr.
b...@iguanasuicide.net
ICQ: 514984 YM/AIM: DaTwinkDaddy
http://iguanasuicide.net/

signature.asc Description: This is a digitally signed message part.
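Boyd's read-throughput rule of thumb is easy to sanity-check against a drive's datasheet. A back-of-envelope sketch; the 100 MB/s per-spindle figure is an assumption for illustration, not a number from this thread:

```shell
# Expected sequential read throughput for a parity RAID array:
# (total drives - parity drives) * single-drive throughput.
drives=8          # total drives in the array (illustrative)
parity=2          # RAID 6 uses two parity drives' worth of capacity
per_drive_mb=100  # assumed per-spindle sequential read, MB/s
echo "expected sequential read: $(( (drives - parity) * per_drive_mb )) MB/s"
```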
Re: Questions about RAID 6
On Friday 30 April 2010 19:10:52 Mark Allums wrote:
> or even btrfs for the data directories.

While I am beginning experimenting with btrfs, I wouldn't yet use it for data you care about.

/boot, not until/if grub2 gets support for it. Even then, boot is generally small and not often used, so you'll probably not notice any performance change from ext2 to a more modern file system, but you might appreciate the fact that ext2 is mature and well-understood.

/, maybe -- if mkinitramfs figures out how to mount it properly.

/usr, /opt, /var/cache and /var/tmp, maybe. They are easily discarded or restored.

/home, /var, not yet. Data stored there might be irreplaceable. I'd wait until the developers say it is stable, at least!

--
Boyd Stephen Smith Jr.
b...@iguanasuicide.net
ICQ: 514984 YM/AIM: DaTwinkDaddy
http://iguanasuicide.net/

signature.asc Description: This is a digitally signed message part.
Re: Questions about RAID 6
Disclaimer: I'm partial to XFS.

Tim Clewlow put forth on 5/1/2010 2:44 AM:
> My reticence to use ext4 / xfs has been due to long cache before
> write times being claimed as dangerous in the event of kernel lockup
> / power outage.

This is a problem with the Linux buffer cache implementation, not any one filesystem. The problem isn't the code itself, but the fact it is a trade off between performance and data integrity. No journaling filesystem will prevent the loss of data in the Linux buffer cache when the machine crashes. What they will do is zero out or delete any files that were not fully written before the crash in order to keep the FS in a consistent state. You will always lose data that's in flight, but your FS won't get corrupted due to the journal replay after reboot.

If you are seriously concerned about loss of write data that is in the buffer cache when the system crashes, you should mount your filesystems with "-o sync" in the fstab options so all writes get flushed to disk without being queued in the buffer cache.

> There are also reports (albeit perhaps somewhat
> dated) that ext4/xfs still have a few small but important bugs to be
> ironed out - I'd be very happy to hear if people have experience
> demonstrating this is no longer true. My preference would be ext4
> instead of xfs as I believe (just my opinion) this is most likely to
> become the successor to ext3 in the future.

I can't speak well to EXT4, but XFS has been fully production quality for many years, since 1993 on Irix when it was introduced, and since ~2001 on Linux. There was a bug identified that resulted in fs inconsistency after a crash which was fixed in 2007. All bug fix work since has dealt with minor issues unrelated to data integrity. Most of the code fix work for quite some time now has been cleanup work, optimizations, and writing better documentation. Reading the posts to the XFS mailing list is very informative as to the quality and performance of the code. XFS has some really sharp devs. Most are current or former SGI engineers.

> I have been wanting to know if ext3 can handle >16TB fs. I now know
> that delayed allocation / writes can be turned off in ext4 (among
> other tuning options I'm looking at), and with ext4, fs sizes are no
> longer a question. So I'm really hoping that ext4 is the way I can
> go.

XFS has even more tuning options than EXT4--pretty much every FS for that matter. With XFS on a 32 bit kernel the max FS and file size is 16TB. On a 64 bit kernel it is 9 exabytes each. XFS is a better solution than EXT4 at this point. Ted Ts'o admitted last week that one function call in EXT4 is in terrible shape and will take a lot of work to fix:

"On my todo list is to fix ext4 to not call write_cache_pages() at all. We are seriously abusing that function ATM, since we're not actually writing the pages when we call write_cache_pages(). I won't go into what we're doing, because it's too embarrassing, but suffice it to say that we end up calling pagevec_lookup() or pagevec_lookup_tag() *four*, count them *four* times while trying to do writeback. I have a simple patch that gives ext4 our own copy of write_cache_pages(), and then simplifies it a lot, and fixes a bunch of problems, but then I discarded it in favor of fundamentally redoing how we do writeback at all, but it's going to take a while to get things completely right. But I am working to try to fix this."

> I'm also hoping that a cpu/motherboard with suitable grunt and fsb
> bandwidth could reduce performance problems with software raid6. If
> I'm seriously mistaken then I'd love to know beforehand. My
> reticence to use hw raid is that it seems like adding one more point
> of possible failure, but I could easily be paranoid in dismissing
> it for that reason.

Good hardware RAID cards are really nice and give you some features you can't really get with md raid such as true "just yank the drive tray out" hot swap capability. I've not tried it, but I've read that md raid doesn't like it when you just yank an active drive. Drive fault LEDs and audible warnings are also nice with HW RAID solutions. The other main advantage is performance. Decent HW RAID is almost always faster than md raid, sometimes by a factor of 5 or more depending on the disk count and RAID level. Typically good HW RAID really trounces md raid performance at levels such as 5, 6, 50, 60, basically anything requiring parity calculations.

Sounds like you're more of a casual user who needs lots of protected disk space but not necessarily absolute blazing speed. Linux RAID should be fine.

Take a closer look at XFS before making your decision on a FS for this array. It's got a whole lot to like, and it has features to exactly tune XFS to your mdadm RAID setup. In fact it's usually automatically done for you as mkfs.xfs queries the block device driver for stride and width info, then matches it. (~$ man 8 mkfs.xfs)

http://oss.sgi.com/projects/xfs/
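For the stride/width matching Stan mentions, mkfs.xfs normally queries md itself, but the geometry can also be given by hand. A dry-run sketch that only prints the command it would run; the chunk size, drive counts, and /dev/md0 are illustrative values, not from this thread:

```shell
# Compute XFS stripe geometry for an mdadm parity array:
# su = md chunk size, sw = number of data (non-parity) drives.
chunk_kb=64   # mdadm chunk size in KiB (illustrative)
drives=6      # total drives in the array (illustrative)
parity=2      # RAID 6 -> two parity drives' worth
# Printed rather than executed, so nothing is formatted by accident:
echo "mkfs.xfs -d su=${chunk_kb}k,sw=$(( drives - parity )) /dev/md0"
```

Run the printed command by hand once you've checked it against your array's actual chunk size and drive count.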
Re: Questions about RAID 6
> On 4/30/2010 6:39 PM, Ron Johnson wrote: >> On 04/26/2010 09:29 AM, Tim Clewlow wrote: >>> Hi there, >>> >>> I'm getting ready to build a RAID 6 with 4 x 2TB drives to start, >> >> Since two of the drives (yes, I know the parity is striped across >> all >> the drives, but "two drives" is still the effect) are used by >> striping, >> RAID 6 with 4 drives doesn't seem rational. > > We've taken OP to task already for this, but I guess it bears > repeating. > > Use multiple HW controllers, and at least 7-8 drives, I believe was > the > consensus, given that SW RAID 6 is a performance loser and losing a > controller during a rebuild is a real ruin-your-week kind of moment. > > But while some of us were skeptical about just how bad the > performance > of RAID 5 or 6 really is and wanted citation of references, more of > us > just questioned the perceived frugality. With four drives, wouldn't > a > RAID 10 be better use of resources, since you can migrate to bigger > setups later? And there we were content to let it lie, until... > > > >>> but the intention is to add more drives as storage requirements >>> increase. >>> >>> My research/googling suggests ext3 supports 16TB volumes if block >> >> Why ext3? My kids would graduate college before the fsck >> completed. >> >> ext4 or xfs are the way to go. > > I have ceased to have an opinion on this, having been taken to task, > myself, about it. I believe the discussion degenerated into a > nit-picky > banter over the general suitability of XFS, but I may be wrong about > this... > > _ > > > Seriously, ext4 is not suitable if you anticipate possible boot > problems, unless you are experienced at these things. The same is > true > of XFS. If you *are* experienced, then more power to you. > Although, I > would have assumed a very experienced person would have no need to > ask > the question. 
> Someone pointed out what I have come to regard as the best solution,
> and that is to make /boot and / (root) and the usual suspects ext3 for
> safety, and use ext4 or XFS or even btrfs for the data directories.
>
> (Unless OP were talking strictly about the data drives to begin with,
> a possibility I admit I may have overlooked.)
>
> Have I summarized adequately?
>
> MAA

First off, thank you all for the valuable, experience-laden information.

For clarity, the setup has always been intended to be: one system/application drive, and one array made of separate drives; the array protects data, nothing else. The idea is for them to be two clearly distinct entities, with very different levels of protection, because the system and apps can be quite quickly recreated if lost; the data cannot.

More clarity: the data is currently touching 4TB, and expected to exceed that very soon, so I'll be using at least 5 drives, probably 6, in the near future. Yes, I know raid6 on 4 drives is not frugal, I'm just planning ahead.

My reticence to use ext4 / xfs has been due to long cache-before-write times being claimed as dangerous in the event of kernel lockup / power outage. There are also reports (albeit perhaps somewhat dated) that ext4/xfs still have a few small but important bugs to be ironed out - I'd be very happy to hear if people have experience demonstrating this is no longer true. My preference would be ext4 instead of xfs as I believe (just my opinion) this is most likely to become the successor to ext3 in the future.

I have been wanting to know if ext3 can handle >16TB fs. I now know that delayed allocation / writes can be turned off in ext4 (among other tuning options I'm looking at), and with ext4, fs sizes are no longer a question. So I'm really hoping that ext4 is the way I can go.

I'm also hoping that a cpu/motherboard with suitable grunt and fsb bandwidth could reduce performance problems with software raid6. If I'm seriously mistaken then I'd love to know beforehand. My reticence to use hw raid is that it seems like adding one more point of possible failure, but I could easily be paranoid in dismissing it for that reason.

Regards,
Tim.

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/8812562889f9881787e6378e770b269c.squir...@192.168.1.100
Re: Questions about RAID 6
On 04/30/2010 07:10 PM, Mark Allums wrote:
[snip]
> Someone pointed out what I have come to regard as the best solution,
> and that is to make /boot and / (root) and the usual suspects ext3 for
> safety, and use ext4 or XFS or even btrfs for the data directories.

That's what I do. / & /home are ext3 and /data is ext4.

> (Unless OP were talking strictly about the data drives to begin with,
> a possibility I admit I may have overlooked.)
>
> Have I summarized adequately?

:)

-- Dissent is patriotic, remember?

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bdb8e16.8030...@cox.net
Re: Questions about RAID 6
On 4/30/2010 6:39 PM, Ron Johnson wrote:
> On 04/26/2010 09:29 AM, Tim Clewlow wrote:
> > Hi there,
> >
> > I'm getting ready to build a RAID 6 with 4 x 2TB drives to start,
>
> Since two of the drives (yes, I know the parity is striped across all
> the drives, but "two drives" is still the effect) are used by striping,
> RAID 6 with 4 drives doesn't seem rational.

We've taken OP to task already for this, but I guess it bears repeating.

Use multiple HW controllers, and at least 7-8 drives, I believe was the consensus, given that SW RAID 6 is a performance loser and losing a controller during a rebuild is a real ruin-your-week kind of moment.

But while some of us were skeptical about just how bad the performance of RAID 5 or 6 really is and wanted citation of references, more of us just questioned the perceived frugality. With four drives, wouldn't a RAID 10 be better use of resources, since you can migrate to bigger setups later? And there we were content to let it lie, until...

> > but the intention is to add more drives as storage requirements
> > increase.
> >
> > My research/googling suggests ext3 supports 16TB volumes if block
>
> Why ext3? My kids would graduate college before the fsck completed.
>
> ext4 or xfs are the way to go.

I have ceased to have an opinion on this, having been taken to task, myself, about it. I believe the discussion degenerated into a nit-picky banter over the general suitability of XFS, but I may be wrong about this...

Seriously, ext4 is not suitable if you anticipate possible boot problems, unless you are experienced at these things. The same is true of XFS. If you *are* experienced, then more power to you. Although, I would have assumed a very experienced person would have no need to ask the question.

Someone pointed out what I have come to regard as the best solution, and that is to make /boot and / (root) and the usual suspects ext3 for safety, and use ext4 or XFS or even btrfs for the data directories.

(Unless OP were talking strictly about the data drives to begin with, a possibility I admit I may have overlooked.)

Have I summarized adequately?

MAA

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bdb718c.3060...@allums.com
Re: Questions about RAID 6
On 04/26/2010 09:29 AM, Tim Clewlow wrote:
> Hi there, I'm getting ready to build a RAID 6 with 4 x 2TB drives to start,

Since two of the drives (yes, I know the parity is striped across all the drives, but "two drives" is still the effect) are used by striping, RAID 6 with 4 drives doesn't seem rational.

> but the intention is to add more drives as storage requirements increase. My research/googling suggests ext3 supports 16TB volumes if block

Why ext3? My kids would graduate college before the fsck completed. ext4 or xfs are the way to go.

-- Dissent is patriotic, remember? -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bdb6a29.8010...@cox.net
Re: Questions about RAID 6
On Mon, Apr 26, 2010 at 04:44:32PM -0500, Boyd Stephen Smith Jr. wrote:
> On Monday 26 April 2010 09:29:28 Tim Clewlow wrote:
> > I'm getting ready to build a RAID 6 with 4 x 2TB drives to start, but the intention is to add more drives as storage requirements increase.
>
> Since you seem fine with RAID 6, I'll assume you are also fine with RAID 5. I don't know what your requirements / levels of paranoia are, but RAID 5 is probably better than RAID 6 until you are up to 6 or 7 drives; the chance of a double failure in a 5 (or less) drive array is minuscule.

It's not minuscule; it happens all the time. The key is that the double failure won't be simultaneous: first one drive goes, and then the extra stress involved in rebuilding it makes another drive go. Then it's time to replace disks and restore from backup.

-dsr-

-- http://tao.merseine.nu/~dsr/eula.html is hereby incorporated by reference. You can't defend freedom by getting rid of it. -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20100429163442.gh23...@tao.merseine.nu
Re: Questions about RAID 6
On Wednesday 28 April 2010 20:51:18 Stan Hoeppner wrote:
> Mike Bird put forth on 4/28/2010 5:48 PM:
> > On Wed April 28 2010 15:10:32 Stan Hoeppner wrote:
> >> Given the way most database engines do locking, you'll get zero additional seek benefit on reads, and you'll take a 4x hit on writes. I don't know how you could possibly argue otherwise.
> >
> > Linux can overlap seeks on multiple spindles, as can most operating systems of the last fifty years.
>
> Of course it can, and it even performs I/Os in parallel on multicore or smp systems, in addition to overlapped I/O. You're still missing the point that you have to perform 4x the writes with the 4 disk RAID 1 setup, which reduces write performance by a factor of 4 vs a single disk,

4x the data to write does not mean that it takes 4x the time. Since I/O performance is generally measured in B/s or ops/s, reducing it by a factor of 4 would mean taking 4x the time to write the same amount. That doesn't happen in a 4-way RAID-1. Instead, all the writes to the disk are triggered asynchronously, and virtually simultaneously, by the OS. At some time later, the OS waits for the disks to signal that those writes have completed. We'll assume worst-case and make the OS refuse to do anything else until all those writes have finished. Even then, your average time to having all the writes completed is just a tiny bit more than writing to a single disk. [Assuming identical disks, it's based on the average write performance of a single disk, and the standard deviation of write performance of a single disk. Something like (1 + StdDev/30) * Average.] The OS can be smarter and only wait for 2 of the writes to finish, since the array is consistent at that point, which would make that number even better.

In short, RAID-1 does hurt your write throughput, but not by much. It can improve both read response rate and read throughput, although the current kernel implementation isn't great at either.
> and increases write bandwidth by a factor of 4 for writes.

Assuming software RAID, it does increase the bandwidth required on the PCI-X or PCIe bus -- but either is so much faster than disks that it is rarely a bottleneck. Assuming SATA or SAS connection and no port multipliers, it doesn't affect the bandwidth since both are serial interfaces, so all bandwidth is measured per attached device.

> Thus, on a loaded multi-user server, compared to a single disk system, you've actually decreased your overall write throughput compared to a single disk. In other words, if the single disk server can't handle the I/O load, running a 4-way RAID 1 will make the situation worse. Whereas running with RAID 10 you should get almost double the write speed of a single disk due to the striping, even though the total number of writes to disk is the same as with RAID 1.

While I genuinely agree that RAID-1/0 makes more sense than RAID-1 when dealing with 3 or more disks, its comparative performance greatly depends on the various options you've created your array with. (Particularly, since the current kernel implementation would let you do a 1+2 mirroring [original data + 2 copies] across 4 drives and still call it RAID-1/0.)

In the simple case where you have 2 pairs of mirrored drives and you do the striping across the pairs (i.e. as most 4-way RAID-1/0 hardware controllers do), your read response is about the same as a single disk (just slightly less than RAID-1), your read throughput is about 4x a single disk (same as RAID-1), and your write throughput is a little less than 2x a single disk (almost 2x RAID-1). Read response (for either/both) could be better, but again, the current kernel implementation isn't that great. [RAID-1 gives you more redundancy, of course.]

-- Boyd Stephen Smith Jr. ,= ,-_-. =. b...@iguanasuicide.net ((_/)o o(\_)) ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(.
.)`-' http://iguanasuicide.net/\_/
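Boyd's wait-for-the-slowest-disk argument above is easy to sanity-check with a small Monte-Carlo sketch (mine, not from the thread; the 8ms/1ms latency figures are assumptions, not measurements):

```python
# Sketch: how much longer does a 4-way RAID-1 write take than a single-disk
# write, if the mirror must wait for the slowest of 4 parallel writes?
# Per-disk write latency is modeled as i.i.d. normal (an assumption).
import random
import statistics

random.seed(42)

MEAN_MS = 8.0    # assumed average single-disk write latency
SD_MS = 1.0      # assumed standard deviation of that latency
DISKS = 4
TRIALS = 100_000

single = []
mirrored = []
for _ in range(TRIALS):
    lat = [random.gauss(MEAN_MS, SD_MS) for _ in range(DISKS)]
    single.append(lat[0])      # single disk: its own latency
    mirrored.append(max(lat))  # RAID-1: wait for the slowest mirror member

ratio = statistics.mean(mirrored) / statistics.mean(single)
print(f"4-way mirror write takes about {ratio:.3f}x a single-disk write")
```

With these assumed numbers the slowdown comes out around 13%, consistent with "a tiny bit more than writing to a single disk" rather than 4x.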
Re: Questions about RAID 6
On Wed April 28 2010 18:51:18 Stan Hoeppner wrote: > You seem to posses knowledge of these things that is 180 degrees opposite > of fact. OLTP, or online transaction processing, is typified by retail or > web point of sale transactions or call logging by telcos. OLTP databases > are typically much more write than read heavy. OLAP, or online analytical > processing, is exclusively reads, made up entirely of search queries. > Why/how would you think OLTP is mostly reads? If all you're doing is appending to a log file then you're write-intensive. For example, some INN servers using cycbuffs are write-intensive if they can forward the articles out to the peers before they disappear from cache. OLTP databases have indices (or hash tables or whatever) that need to be read even when writing a new record. Then of course, the data that has been written needs to be used for something such as fulfillment and analysis. Both are mostly reads. Backup from the live DB is all reads. I typically saw about 90% reads in OLTP databases. I think this is getting off-topic for debian-user. --Mike Bird -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/201004281917.14982.mgb-deb...@yosemite.net
Re: Questions about RAID 6
Mike Bird put forth on 4/28/2010 5:48 PM:
> On Wed April 28 2010 15:10:32 Stan Hoeppner wrote:
>> Mike Bird put forth on 4/28/2010 1:48 PM:
>>> I've designed commercial database managers and OLTP systems.
>>
>> Are you saying you've put production OLTP databases on N-way software RAID 1 sets?
>
> No. I've used N-way RAID-1 for general servers - mail, web, samba, etc.
>
> Nevertheless N-way RAID-1 would be a reasonable basis for a small OLTP database as the overwhelming majority of OLTP disk transfers are reads.

You seem to possess knowledge of these things that is 180 degrees opposite of fact. OLTP, or online transaction processing, is typified by retail or web point of sale transactions or call logging by telcos. OLTP databases are typically much more write than read heavy. OLAP, or online analytical processing, is exclusively reads, made up entirely of search queries. Why/how would you think OLTP is mostly reads?

> You had claimed that "on a loaded system, such as a transactional database server or busy ftp upload server, such a RAID setup will bring the system to its knees in short order as the CPU overhead for each 'real' disk I/O is now increased 4x and the physical I/O bandwidth is increased 4x".
> Your claim is irrelevant as neither CPU utilisation nor I/O bandwidth are of concern in such systems. They are seek-bound.

Yep, you're right. That must be why one finds so many production OLTP, ftp upload, mail, etc, servers running N-way software RAID 1. Almost no one does it, for exactly the reasons I've stated. The overhead is too great and RAID 10 gives almost the same level of fault tolerance with much better performance.

>> Given the way most database engines do locking, you'll get zero additional seek benefit on reads, and you'll take a 4x hit on writes. I don't know how you could possibly argue otherwise.
>
> Linux can overlap seeks on multiple spindles, as can most operating systems of the last fifty years.
Of course it can, and it even performs I/Os in parallel on multicore or smp systems, in addition to overlapped I/O. You're still missing the point that you have to perform 4x the writes with the 4 disk RAID 1 setup, which reduces write performance by a factor of 4 vs a single disk, and increases write bandwidth by a factor of 4 for writes. Thus, on a loaded multi-user server, compared to a single disk system, you've actually decreased your overall write throughput compared to a single disk. In other words, if the single disk server can't handle the I/O load, running a 4-way RAID 1 will make the situation worse. Whereas running with RAID 10 you should get almost double the write speed of a single disk due to the striping, even though the total number of writes to disk is the same as with RAID 1. -- Stan -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd8e616.5070...@hardwarefreak.com
Re: Questions about RAID 6
On Wed April 28 2010 15:10:32 Stan Hoeppner wrote:
> Mike Bird put forth on 4/28/2010 1:48 PM:
> > I've designed commercial database managers and OLTP systems.
>
> Are you saying you've put production OLTP databases on N-way software RAID 1 sets?

No. I've used N-way RAID-1 for general servers - mail, web, samba, etc.

Nevertheless N-way RAID-1 would be a reasonable basis for a small OLTP database as the overwhelming majority of OLTP disk transfers are reads.

> > If CPU usage had ever become a factor in anything I had designed I would have been fired. If they're not I/O bound they're useless.
>
> That's an odd point to make given that we're discussing N-way RAID 1. By using N-way RAID 1, you're making the system I/O bound before you even create the db.

You had claimed that "on a loaded system, such as a transactional database server or busy ftp upload server, such a RAID setup will bring the system to its knees in short order as the CPU overhead for each 'real' disk I/O is now increased 4x and the physical I/O bandwidth is increased 4x". Your claim is irrelevant as neither CPU utilisation nor I/O bandwidth are of concern in such systems. They are seek-bound.

> Given the way most database engines do locking, you'll get zero additional seek benefit on reads, and you'll take a 4x hit on writes. I don't know how you could possibly argue otherwise.

Linux can overlap seeks on multiple spindles, as can most operating systems of the last fifty years.

--Mike Bird

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/201004281548.15476.mgb-deb...@yosemite.net
Re: Questions about RAID 6
Mike Bird put forth on 4/28/2010 1:48 PM:
> On Wed April 28 2010 01:44:37 Stan Hoeppner wrote:
>> On a sufficiently fast system that is not loaded, the user will likely see no performance degradation, especially given Linux' buffered I/O architecture. However, on a loaded system, such as a transactional database server or busy ftp upload server, such a RAID setup will bring the system to its knees in short order as the CPU overhead for each 'real' disk I/O is now increased 4x and the physical I/O bandwidth is increased 4x.
>
> I've designed commercial database managers and OLTP systems.

Are you saying you've put production OLTP databases on N-way software RAID 1 sets?

> If CPU usage had ever become a factor in anything I had designed I would have been fired. If they're not I/O bound they're useless.

That's an odd point to make given that we're discussing N-way RAID 1. By using N-way RAID 1, you're making the system I/O bound before you even create the db. Given the way most database engines do locking, you'll get zero additional seek benefit on reads, and you'll take a 4x hit on writes. I don't know how you could possibly argue otherwise.

> With a few exceptions such as physical backups, any I/O bound application is going to be seek bound, not bandwidth bound.

Downloads via http/ftp/scp and largish file copies via smb/cifs, as well as any media streaming applications, will be more b/w bound than seek bound. For most day to day mundane stuff such as smtp/imap/web/etc, yes, they're far more seek bound. But again, using N-way RAID 1 will give no performance boost to any of these applications, whether seek or b/w bound. It will give you the same read performance in most cases as a single spindle, on some occasions a slight boost, but will always yield a 4x decrease in seek and b/w performance for writes. The Linux RAID 1 code is optimized for redundancy, not performance.
If you need redundancy and performance with Linux software RAID, your best bet is RAID 10. It costs more per GB than RAID 5 or 6, but doesn't have to generate parity, yielding lower CPU overhead and thus decreasing I/O latency. And it sure as heck costs less than N-way RAID 1. -- Stan -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd8b258.40...@hardwarefreak.com
Re: Questions about RAID 6
On 04/26/2010 04:33 PM, Mike Bird wrote: > On Mon April 26 2010 14:44:32 Boyd Stephen Smith Jr. wrote: >> the chance of a double failure in a 5 (or less) drive array is minuscule. > > A flaky controller knocking one drive out of an array and then > breaking another before you're rebuilt can really ruin your day. > I've been out of the office and so come to this discussion a bit late - my apologies if this has been mentioned ... Greater redundancy can be had by putting disks on several controllers rather than all on one. If the rebuild fails due to a controller problem, it shouldn't affect the disks on the other controllers. > Rebuild is generally the period of most intense activity so > figure on failures being much more likely during a rebuild. > > --Mike Bird > > -- Bob McGowan -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd88b6f.6080...@symantec.com
Re: Questions about RAID 6
On Wed April 28 2010 01:44:37 Stan Hoeppner wrote: > On a sufficiently fast system that is not loaded, the user will likely see > no performance degradation, especially given Linux' buffered I/O > architecture. However, on a loaded system, such as a transactional > database server or busy ftp upload server, such a RAID setup will bring the > system to its knees in short order as the CPU overhead for each 'real' disk > I/O is now increased 4x and the physical I/O bandwidth is increased 4x. I've designed commercial database managers and OLTP systems. If CPU usage had ever become a factor in anything I had designed I would have been fired. If they're not I/O bound they're useless. With a few exceptions such as physical backups, any I/O bound application is going to be seek bound, not bandwidth bound. --Mike Bird -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/201004281148.16763.mgb-deb...@yosemite.net
Re: Questions about RAID 6
Stan,

We are on the same wavelength, I do the same thing myself. (Except that I go ahead and mirror swap.) I love RAID 10.

MAA

On 4/28/2010 5:18 AM, Stan Hoeppner wrote:
[snip]
-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd81a5e.4040...@allums.com
Re: Questions about RAID 6
Mark Allums put forth on 4/27/2010 10:31 PM:
> For DIY, always pair those drives. Consider RAID 10, RAID 50, RAID 60, etc. Alas, that doubles the number of drives, and intensely decreases the MTBF, which is the whole outcome you want to avoid.

This is my preferred mdadm 4 drive setup for a light office server or home media/vanity server. Some minor setup details are omitted from the diagram to keep it simple, such as the fact that /boot is a mirrored 100MB partition set and that there are two non mirrored 1GB swap partitions. / and /var are mirrored partitions in the remaining first 30GB. These sizes are arbitrary, and can be seasoned to taste. I find these sizes work fairly well for a non GUI Debian server.

md raid, 4 x 500GB 7.2K rpm SATAII drives:

      mirror                mirror
     /      \              /      \
 +--------+--------+  +--------+--------+
 | /boot  | /boot  |  | /boot  | /boot  |  \
 | swap1  | swap2  |  | swap1  | swap2  |   } first 30GB
 | /      | /      |  | /      | /      |   } of each drive
 | /var   | /var   |  | /var   | /var   |  /
 +--------+--------+  +--------+--------+
 | /home  | /home  |  | /home  | /home  |
 | /samba | /samba |  | /samba | /samba |
 | other  | other  |  | other  | other  |
 +--------+--------+  +--------+--------+
      \       \      striped      /       /
       ------- RAID 10: 940 GB NET -------

For approximately the same $$ outlay one could simply mirror two 1TB 7.2K rpm drives and have the same usable space and a little less power draw. The 4 drive RAID 10 setup will yield better read and write performance due to the striping, especially under a multiuser workload, and especially for IMAP serving of large mailboxen. For a small/medium office server running say Postfix/Dovecot/Samba/lighty+Roundcube webmail, a small intranet etc, the 4 drive setup would yield significantly better performance than the higher capacity 2 drive setup. Using Newegg's prices, each solution will run a little below or above $200. This 4 drive RAID 10 makes for a nice little inexpensive and speedy setup. 1TB of user space may not seem like much given the capacity of today's drives, but most small/medium offices won't come close to using that much space for a number of years, assuming you have sane email attachment policies.
-- Stan -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd80b6e.2010...@hardwarefreak.com
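A rough sketch of how a layout like Stan's could be created with mdadm. The device names and partition numbers below are assumptions for illustration (his exact partitioning is not in the mail), so adapt them to real hardware before running anything:

```shell
# Assumed layout (illustration only): four disks sda..sdd, each partitioned
# as 1 = /boot (100MB), 2 = swap (1GB), 3 = / and /var area, 5 = large data area.

# /boot as a small mirrored set, so the box can boot from a surviving disk
mdadm --create /dev/md0 --level=1 --raid-devices=4 /dev/sd[abcd]1

# / (and similarly /var) as mirrored partitions in the first 30GB
mdadm --create /dev/md1 --level=1 --raid-devices=4 /dev/sd[abcd]3

# the big data partitions as the 4-device RAID 10 (~940 GB net with 500GB drives)
mdadm --create /dev/md2 --level=10 --raid-devices=4 /dev/sd[abcd]5
```

Swap partitions are deliberately left out of md here, matching the "two non mirrored 1GB swap partitions" in the description.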
Re: Questions about RAID 6
Mike Bird put forth on 4/26/2010 3:04 PM:
> On Mon April 26 2010 12:29:43 Stan Hoeppner wrote:
>> Mark Allums put forth on 4/26/2010 12:51 PM:
>>> Put four drives in a RAID 1, you can suffer a loss of three drives.
>>
>> And you'll suffer pretty abysmal write performance as well.
>
> Write performance of RAID-1 is approximately as good as a simple drive, which is good enough for many applications.

That's simply not correct. The number of operations the software RAID driver performs is equal to the number of drives in the RAID1 set per application I/O operation. If the application writes one sector, 512 bytes, then the RAID driver is going to write 2048 bytes in 4 I/O transfers, one 512 byte write to each disk. The RAID driver code will loop 4 times instead of once on a single drive setup, once per physical I/O, quadrupling the amount of code executed in addition to a quadrupling of the physical I/O transfers to the platters.

On a sufficiently fast system that is not loaded, the user will likely see no performance degradation, especially given Linux' buffered I/O architecture. However, on a loaded system, such as a transactional database server or busy ftp upload server, such a RAID setup will bring the system to its knees in short order as the CPU overhead for each 'real' disk I/O is now increased 4x and the physical I/O bandwidth is increased 4x.

>> Also keep in mind that some software RAID implementations allow more than two drives in RAID 1, most often called a "mirror set". However, I don't know of any hardware RAID controllers that allow more than 2 drives in a RAID 1. RAID 10 yields excellent fault tolerance and a substantial boost to read and write performance. Anyone considering a 4 disk mirror set should do RAID 10 instead.
>
> Some of my RAIDs are N-way RAID-1 because of the superior read performance.
If you perform some actual benchmarks you'll notice that the Linux RAID1 read performance boost is negligible, and very highly application dependent, regardless of the number of drives. The RAID1 driver code isn't optimized for parallel reads. It's mildly opportunistic at best. Don't take my word for it, Google for "Linux software RAID performance" or similar. What you find should be eye opening. -- Stan -- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd7f575.6010...@hardwarefreak.com
Re: Questions about RAID 6
On 4/27/2010 9:56 PM, Mark Allums wrote:
> On 4/26/2010 1:37 PM, Mike Bird wrote:
>> On Mon April 26 2010 10:51:38 Mark Allums wrote:
>>> RAID 6 (and 5) perform well when less than approximately 1/3 full. After that, even reads suffer.
>>
>> Mark, I've been using various kinds of RAID for many many years and was not aware of that. Do you have a link to an explanation? Thanks, --Mike Bird
>
> YMMV.

I should explain better. Three-drive motherboard-fake-RAID-5's perform abysmally when they start to fill up. I do not have a link handy. I could try to find some, if you like. In my experience, real-world performance is always less than theoretical or reported performance. That is, I've never had a RAID 5 work as well as advertised. If you have had better experiences than me, more power to you.

MAA

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd7ae8b.30...@allums.com
Re: Questions about RAID 6
On 4/26/2010 11:11 PM, Tim Clewlow wrote:
>> I don't know what your requirements / levels of paranoia are, but RAID 5 is probably better than RAID 6 until you are up to 6 or 7 drives; the chance of a double failure in a 5 (or less) drive array is minuscule.
>
> I currently have 3 TB of data with another 1TB on its way fairly soon, so 4 drives will become 5 quite soon. Also, I have read that a common rating of drive failure is an unrecoverable read rate of 1 bit in 10^14 - that is 1 bit in every 10TB. While doing a rebuild across 4 or 5 drives that would mean it is likely to hit an unrecoverable read. With RAID 5 (no redundancy during rebuild due to failed drive) that would be game over. Is this correct?

Uh. Well, I guess I would ballpark it similarly. Large arrays are asking for trouble. For serious work, separate out the storage system from the application as much as possible. Be prepared to spend money. For DIY, always pair those drives. Consider RAID 10, RAID 50, RAID 60, etc. Alas, that doubles the number of drives, and intensely decreases the MTBF, which is the whole outcome you want to avoid. But you have to start somewhere. For offline storage, tape is still around...

MAA

I get the feeling some of this is overthinking. Plan ahead, but don't spend money until you need to.

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd7ac0b.9070...@allums.com
Re: Questions about RAID 6
On 4/26/2010 2:29 PM, Stan Hoeppner wrote:
> Mark Allums put forth on 4/26/2010 12:51 PM:
>> Put four drives in a RAID 1, you can suffer a loss of three drives.
>
> And you'll suffer pretty abysmal write performance as well. Also keep in mind that some software RAID implementations allow more than two drives in RAID 1, most often called a "mirror set". However, I don't know of any hardware RAID controllers that allow more than 2 drives in a RAID 1. RAID 10 yields excellent fault tolerance and a substantial boost to read and write performance. Anyone considering a 4 disk mirror set should do RAID 10 instead.

Yeah. All good points. I never make things clear. OP was thinking of doing software RAID with Linux md raid. Most of what I meant to say was/is with that in mind.

MAA

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd7a487.8020...@allums.com
Re: Questions about RAID 6
On 4/26/2010 1:37 PM, Mike Bird wrote:
> On Mon April 26 2010 10:51:38 Mark Allums wrote:
>> RAID 6 (and 5) perform well when less than approximately 1/3 full. After that, even reads suffer.
>
> Mark, I've been using various kinds of RAID for many many years and was not aware of that. Do you have a link to an explanation? Thanks, --Mike Bird

YMMV.

MAA

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4bd7a3da.9070...@allums.com
Re: Questions about RAID 6
Hi

I recently (last week) migrated from 10 x 1TB to an Adaptec 51645 and 5 x 2TB drives.

My experience: I can't get grub2 and the Adaptec to work, so I am booting from an SSD I had. I carved up the 5 x 2TB into a 32G (mirror 1e - mirror stripe + parity) to boot from, mirrored against my SSD. The rest went into a raid5 till I moved my info over - this took around 18 hours. My data was originally on a vg on a pv on my raid6 mdadm. I made the Adaptec 5T into a pv and added it to the vg and then did a pvmove - that took time :)

Next I went and got 2 more 2TB drives and did an online upgrade from 5x2T raid5 to 7x2T raid6, now sitting at 9T - this took about 1 week to resettle.

Other quirks: I had to use parted to install a gpt partition table on the drive as it was over some limit for mbr's. Had a bit of a scare when I resized my pv's partition with parted - it submits each command once it's typed. I had to delete my pv partition and then recreate it - same as deleting a partition and then recreating it, but with fdisk it doesn't really happen until write time; 5T of info potentially gone. I could not use the resize command as it did not understand lvm/pv's. But all is okay now, resized and ready.

I chose raid6 because it's just another drive and I value my data more than another drive. I also have 3 x 1T in the box in the raid 1ee setup, which is a stripe / mirror / parity setup. I don't use battery backup on the machine, I have a ups attached, which can run it for 40 min on battery.

Note - I also backup all my data to another server close by but in another building, and all the important stuff gets backed up off site. I use rdiff-backup.

Alex

On Tue, Apr 27, 2010 at 2:11 PM, Tim Clewlow wrote:
>> I don't know what your requirements / levels of paranoia are, but RAID 5 is probably better than RAID 6 until you are up to 6 or 7 drives; the chance of a double failure in a 5 (or less) drive array is minuscule.
> I currently have 3 TB of data with another 1TB on its way fairly soon, so 4 drives will become 5 quite soon. Also, I have read that a common rating of drive failure is an unrecoverable read rate of 1 bit in 10^14 - that is 1 bit in every 10TB. While doing a rebuild across 4 or 5 drives that would mean it is likely to hit an unrecoverable read. With RAID 5 (no redundancy during rebuild due to failed drive) that would be game over. Is this correct?
>
> Tim.

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/x2i836a6dcf1004270256v9bf62c2bh9fa70af4907fe...@mail.gmail.com
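Alex's LVM migration (new array becomes a PV, join the VG, pvmove everything over, then retire the old PV) can be sketched with the standard LVM2 commands. The device and VG names below are placeholders, not taken from his mail:

```shell
# Illustration only: /dev/sdb1 stands in for the new hardware-RAID array,
# /dev/md0 for the old mdadm array, vg0 for the existing volume group.

pvcreate /dev/sdb1         # make the new array a physical volume
vgextend vg0 /dev/sdb1     # add it to the existing volume group
pvmove /dev/md0 /dev/sdb1  # migrate all extents off the old PV (slow, online)
vgreduce vg0 /dev/md0      # drop the emptied old PV from the group
pvremove /dev/md0          # and remove its PV label
```

pvmove runs while the filesystems stay mounted, which is why the "that took time" step costs hours but no downtime.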
Re: Questions about RAID 6
> I don't know what your requirements / levels of paranoia are, but RAID 5 is probably better than RAID 6 until you are up to 6 or 7 drives; the chance of a double failure in a 5 (or less) drive array is minuscule.

I currently have 3 TB of data with another 1TB on its way fairly soon, so 4 drives will become 5 quite soon. Also, I have read that a common rating of drive failure is an unrecoverable read rate of 1 bit in 10^14 - that is 1 bit in every 10TB. While doing a rebuild across 4 or 5 drives that would mean it is likely to hit an unrecoverable read. With RAID 5 (no redundancy during rebuild due to failed drive) that would be game over. Is this correct?

Tim.

-- To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/706fc98e51cb5ceddd4e32ea1bc05cc3.squir...@192.168.1.100
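Tim's ballpark can be checked directly. At a vendor-spec rate of 1 error per 10^14 bits, one URE is expected roughly every 12.5 TB read (10^14 bits / 8, close to the "10TB" he quotes), and rebuilding a degraded 4-drive RAID 5 of 2TB disks means reading the three survivors end to end. A sketch of the arithmetic (mine, not from the thread):

```python
# Probability of hitting at least one unrecoverable read error (URE) while
# reading all surviving disks during a degraded RAID 5 rebuild.
import math

URE_RATE = 1e-14    # errors per bit read (typical consumer-drive spec)
DISK_BYTES = 2e12   # 2 TB drive
SURVIVORS = 3       # disks read end-to-end in a degraded 4-drive RAID 5

bits_read = SURVIVORS * DISK_BYTES * 8
# P(at least one URE) = 1 - (1 - rate)^bits, computed via log1p for precision
p_ure = 1 - math.exp(bits_read * math.log1p(-URE_RATE))
print(f"P(URE during rebuild) ~= {p_ure:.0%}")
```

So at the quoted spec the rebuild is not certain to fail, but the chance of at least one URE is substantial, which is Tim's point: with RAID 5 there is no remaining redundancy to paper over it, while RAID 6 still has one parity in hand.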
Re: Questions about RAID 6
On Mon April 26 2010 14:44:32 Boyd Stephen Smith Jr. wrote:
> the chance of a double failure in a 5 (or less) drive array is minuscule.

A flaky controller knocking one drive out of an array and then breaking another before you've rebuilt can really ruin your day. Rebuild is generally the period of most intense activity, so figure on failures being much more likely during a rebuild.

--Mike Bird

--
Archive: http://lists.debian.org/201004261633.53139.mgb-deb...@yosemite.net
Re: Questions about RAID 6
On Monday 26 April 2010 09:29:28 Tim Clewlow wrote:
> I'm getting ready to build a RAID 6 with 4 x 2TB drives to start,
> but the intention is to add more drives as storage requirements
> increase.

Since you seem fine with RAID 6, I'll assume you are also fine with RAID 5. I don't know what your requirements / levels of paranoia are, but RAID 5 is probably better than RAID 6 until you are up to 6 or 7 drives; the chance of a double failure in a 5 (or less) drive array is minuscule.

> I intend to use mdadm to build / run the array.

Modern mdadm can migrate from RAID 5 to RAID 6 when you add the 6th/7th drive into the array. Also, modern mdadm has a wealth of RAID 1/0 features that may actually be better performance-wise than RAID 5 or RAID 6.

> If an unrecoverable read error (bad block that the on-disk circuitry
> can't resolve) is discovered on a disk then how does mdadm handle this?
> It appears the possibilities are:
> 1) the disk gets marked as failed in the array - ext3 does not get
> notified of a bad block

This one.

> I would really like to hear it is either 2 or 3 as I would prefer
> not to have an entire disk immediately marked bad due to one
> unrecoverable read error

Sorry.

> I would prefer to be notified instead so I can still have RAID 6
> protecting "most" of the data until the disk gets replaced.

You can add the failed device back into the array and it will re-sync until there is another issue with the device. Just be sure to remember which device needs replacing for when your new HW arrives.

--
Boyd Stephen Smith Jr.
b...@iguanasuicide.net
http://iguanasuicide.net/
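[A rough sketch of the two operations Boyd mentions. The array path and device names are invented for illustration; check mdadm(8) for your version's exact flags before trying this on real data.]

```shell
# Re-add a device that was kicked out after a read error. With a
# write-intent bitmap on the array, only stale regions are re-synced.
mdadm /dev/md0 --re-add /dev/sde1

# Later: reshape RAID 5 into RAID 6 while adding the extra drive.
# The backup file protects the critical section of the reshape
# against a crash or power loss mid-operation.
mdadm /dev/md0 --add /dev/sdf1
mdadm --grow /dev/md0 --level=6 --raid-devices=6 \
      --backup-file=/root/md0-reshape.bak
```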
Re: Questions about RAID 6
On Mon April 26 2010 12:29:43 Stan Hoeppner wrote:
> Mark Allums put forth on 4/26/2010 12:51 PM:
> > Put four drives in a RAID 1, you can suffer a loss of three drives.
>
> And you'll suffer pretty abysmal write performance as well.

Write performance of RAID-1 is approximately as good as a single drive, which is good enough for many applications.

> Also keep in mind that some software RAID implementations allow more
> than two drives in RAID 1, most often called a "mirror set". However,
> I don't know of any hardware RAID controllers that allow more than 2
> drives in a RAID 1. RAID 10 yields excellent fault tolerance and a
> substantial boost to read and write performance. Anyone considering
> a 4 disk mirror set should do RAID 10 instead.

Some of my RAIDs are N-way RAID-1 because of the superior read performance.

--Mike Bird

--
Archive: http://lists.debian.org/201004261304.13643.mgb-deb...@yosemite.net
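[An N-way mdadm mirror like the ones Mike mentions can be created directly; this sketch assumes three spare partitions (device names invented for illustration):]

```shell
# Create a 3-way RAID 1: every block is kept on all three devices, so
# reads can be balanced across them and the array survives two failures.
mdadm --create /dev/md1 --level=1 --raid-devices=3 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1

# Confirm all three members are active.
mdadm --detail /dev/md1
```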
Re: Questions about RAID 6
Mark Allums put forth on 4/26/2010 12:51 PM:
> Put four drives in a RAID 1, you can suffer a loss of three drives.

And you'll suffer pretty abysmal write performance as well. Also keep in mind that some software RAID implementations allow more than two drives in RAID 1, most often called a "mirror set". However, I don't know of any hardware RAID controllers that allow more than 2 drives in a RAID 1. RAID 10 yields excellent fault tolerance and a substantial boost to read and write performance. Anyone considering a 4 disk mirror set should do RAID 10 instead.

--
Stan

--
Archive: http://lists.debian.org/4bd5e9a7.70...@hardwarefreak.com
Re: Questions about RAID 6
On Mon April 26 2010 10:51:38 Mark Allums wrote:
> RAID 6 (and 5) perform well when less than approximately 1/3 full.
> After that, even reads suffer.

Mark, I've been using various kinds of RAID for many, many years and was not aware of that. Do you have a link to an explanation?

Thanks,

--Mike Bird

--
Archive: http://lists.debian.org/201004261137.35915.mgb-deb...@yosemite.net
Re: Questions about RAID 6
On 4/26/2010 11:57 AM, Tim Clewlow wrote:
> > I'm afraid that opinions of RAID vary widely on this list (no
> > surprise) but you may be interested to note that we agree (a
> > consensus) that software-RAID 6 is an unfortunate choice.
>
> Is this for performance reasons or potential data loss? I can live
> with slow writes, reads should not be all that affected, but data
> loss is something I'd really like to avoid.
>
> Regards, Tim.

Performance. RAID 6 (and 5) perform well when less than approximately 1/3 full. After that, even reads suffer. True hardware RAID can compensate somewhat, but you are contemplating mdraid.

Data loss should not be an issue if your array can rebuild fast enough. RAID 6 can usually withstand the loss of two drives. I like RAID 10, but I'm considered peculiar. RAID 10 can often withstand the loss of two drives (but not always) and performs a bit better, with much more graceful degradation of performance as the volume fills up. It performs reasonably well 2/3 full.

If avoiding data loss is crucial, RAIDed arrays can always lose one drive and recover, but for very important data, mirrors (RAID 1) are better. Put four drives in a RAID 1 and you can suffer the loss of three drives.

MAA

--
Archive: http://lists.debian.org/4bd5d2aa.6010...@allums.com
Re: Questions about RAID 6
> I'm afraid that opinions of RAID vary widely on this list (no
> surprise) but you may be interested to note that we agree (a
> consensus) that software-RAID 6 is an unfortunate choice.

Is this for performance reasons or potential data loss? I can live with slow writes, reads should not be all that affected, but data loss is something I'd really like to avoid.

Regards, Tim.

--
Archive: http://lists.debian.org/c9c11079d529273f62a76ba3b0a00359.squir...@192.168.1.100
Re: Questions about RAID 6
On 4/26/2010 10:28 AM, Tim Clewlow wrote:
> Ok, I found the answer to my second question - it fails the entire
> disk. So the first question remains.

I just figured that out - and I see you have too. The difference between what we would like it to do and what it actually does can be frustrating sometimes.

I think you can tell mdadm to (re-)verify the disk if you think it is okay and just has one bad block. But I never trust failing hard disks. It's a losing game.

MAA

--
Archive: http://lists.debian.org/4bd5b441.4080...@allums.com
Re: Questions about RAID 6
On 4/26/2010 9:29 AM, Tim Clewlow wrote:
> Hi there,
>
> I'm getting ready to build a RAID 6 with 4 x 2TB drives to start,
> but the intention is to add more drives as storage requirements
> increase. My research/googling suggests ext3 supports 16TB volumes
> if block size is 4096 bytes, but some sites suggest the 32 bit arch
> means it is restricted to 4TB no matter what block size I use. So,
> does ext3 (and relevant utilities, particularly resize2fs and
> e2fsck) on 32 bit i386 arch support 16TB volumes?
>
> I intend to use mdadm to build / run the array. If an unrecoverable
> read error (a bad block that the on-disk circuitry can't resolve) is
> discovered on a disk, then how does mdadm handle this? It appears
> the possibilities are:
> 1) the disk gets marked as failed in the array - ext3 does not get
> notified of a bad block
> 2) mdadm uses free space to construct a new stripe (from remaining
> raid data) to replace the bad one - ext3 does not get notified of a
> bad block
> 3) mdadm passes the requested data (again reconstructed from
> remaining good blocks) up to ext3 and then tells ext3 that all those
> blocks (from the single stripe) are now bad, and you deal with it
> (ext3 can mark and reallocate storage locations if it is told of bad
> blocks too).
>
> I would really like to hear it is either 2 or 3, as I would prefer
> not to have an entire disk immediately marked bad due to one
> unrecoverable read error - I would prefer to be notified instead so
> I can still have RAID 6 protecting "most" of the data until the disk
> gets replaced.
>
> Regards, Tim.

I'm afraid that opinions of RAID vary widely on this list (no surprise), but you may be interested to note that we agree (a consensus) that software-RAID 6 is an unfortunate choice.

I believe that the answer to your question is none of the above. The closest is (2). As I'm sure you know, RAID 6 uses block-level striping. So, what happens is a matter of policy, but I believe that data that is believed lost is recovered from parity and rewritten to the array.[0] The error is logged, and the status of the drive is changed. If the drive doesn't fail outright, depending on policy[1], the drive may be re-verified or dropped out. However, mdadm handles the error, because it is a lower-level failure than ext3.

The problem is when the drive is completely 100% in use (no spare capacity). In that case, no new stripe is created, because there is no room to put one. The data is moved to an unused area[1], and the status of the drive is changed (your scenario 1). ext3 is still unaware. The file system is a logical layer on top of RAID, and will only become aware of changes to the disk structure when it is unavoidable. RAID guarantees a certain capacity: if you create a volume with 1 TB capacity, the volume will always have that capacity.

If you set this up, be sure to also combine it with LVM2. Then you have much greater flexibility about what to do when recovering from failures.

[0] This depends on the implementation, and I don't know what mdadm does. Some implementations might do this automatically, but I think most would require a rebuild.
[1] Again, I forget what mdadm does in this case. Anybody?

I'm sorry, I seem to have avoided answering a crucial part of your question. I think that the md device documentation is what you want.

MAA

--
Archive: http://lists.debian.org/4bd5b2a0.7060...@allums.com
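[Mark points at the md documentation; the periodic resync discussed elsewhere in this thread is md's sysfs-driven scrub. A hedged sketch, assuming an array at /dev/md0 (Debian normally triggers the monthly "check" via mdadm's cron job):]

```shell
# Start a read-only scrub: every stripe is read and parity verified.
echo check > /sys/block/md0/md/sync_action

# After it finishes, a nonzero count means parity mismatches were seen.
cat /sys/block/md0/md/mismatch_cnt

# "repair" instead rewrites parity wherever it disagrees:
# echo repair > /sys/block/md0/md/sync_action

# Throttle the scrub (KB/s per device) to keep the box responsive.
echo 10000 > /proc/sys/dev/raid/speed_limit_max
```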
Re: Questions about RAID 6
Ok, I found the answer to my second question - it fails the entire disk. So the first question remains: does ext3 (and relevant utilities, particularly resize2fs and e2fsck) on 32 bit i386 arch support 16TB volumes?

Regards, Tim.

--
Archive: http://lists.debian.org/6f4fa734e37bf8efa066ae4152e01429.squir...@192.168.1.100
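[The 16 TB figure follows from ext3's 32-bit block numbers: 2^32 addressable blocks times the block size. A quick check of that arithmetic - whether the 32-bit userspace tools actually cope with such a filesystem is the separate question being asked here:]

```shell
# ext3 addresses blocks with 32-bit numbers, so with 4096-byte blocks
# the on-disk format itself tops out at 2^32 * 4096 bytes = 16 TiB.
awk 'BEGIN {
    blocks = 2^32
    bsize  = 4096
    printf "%d TiB\n", blocks * bsize / 2^40    # prints "16 TiB"
}'
```

With 1 KiB blocks the same formula gives only 4 TiB, which is likely where the lower figure some sites quote comes from.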
Questions about RAID 6
Hi there,

I'm getting ready to build a RAID 6 with 4 x 2TB drives to start, but the intention is to add more drives as storage requirements increase. My research/googling suggests ext3 supports 16TB volumes if block size is 4096 bytes, but some sites suggest the 32 bit arch means it is restricted to 4TB no matter what block size I use. So, does ext3 (and relevant utilities, particularly resize2fs and e2fsck) on 32 bit i386 arch support 16TB volumes?

I intend to use mdadm to build / run the array. If an unrecoverable read error (a bad block that the on-disk circuitry can't resolve) is discovered on a disk, then how does mdadm handle this? It appears the possibilities are:
1) the disk gets marked as failed in the array - ext3 does not get notified of a bad block
2) mdadm uses free space to construct a new stripe (from remaining raid data) to replace the bad one - ext3 does not get notified of a bad block
3) mdadm passes the requested data (again reconstructed from remaining good blocks) up to ext3 and then tells ext3 that all those blocks (from the single stripe) are now bad, and you deal with it (ext3 can mark and reallocate storage locations if it is told of bad blocks too).

I would really like to hear it is either 2 or 3, as I would prefer not to have an entire disk immediately marked bad due to one unrecoverable read error - I would prefer to be notified instead so I can still have RAID 6 protecting "most" of the data until the disk gets replaced.

Regards, Tim.

--
Archive: http://lists.debian.org/6f1df414f4329ee27ada8e9b63a0c56d.squir...@192.168.1.100