Re: RAID-5 and database servers
Quoting "John G. Heim":

> Well, this is actually the problem... I am about as sure as I can be that a "spam bomb" is not a once every five year event. We get flooded with spam pretty regularly. It's probably not a million messages a day but more like 50,000 in two hours and then little or nothing for the next 22 hours.

Yeah, annoying, huh? This is best handled via the MTA limits, not via SpamAssassin et al... You want to stop this before it ever hits your RAID disks, if possible. In other words, this is more of a mail issue than a database issue. But it is off topic, and better discussed off-list or on another list.

> what I want though. If I want to set up two RAIDs, one for the operating system and one for the database files, do I need two PERCs? Can a single PERC put 2 disks in a RAID-1 array and 3 others in a RAID-5 array?

One controller can do both... If you can get a dual channel controller and split backplanes (or dual bays) all the better. But even a bare bones single controller and single backplane will work...

--
Eric Rostetter
The Department of Physics
The University of Texas at Austin

Go Longhorns!
___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
Re: RAID-5 and database servers
On 2010-03-12 17:45, Craig White wrote:
> On Fri, 2010-03-12 at 15:57 +, Jefferson Ogata wrote:
>> On 2010-03-12 15:39, Craig White wrote:
>>> I don't think I understand your 'odds' model. I interpret the first example as RAID 50 having 5 times more likelihood of loss than RAID 10 and I presume that isn't what you were after
>> Yes, it is 5 times higher. But it is not 100%; it's actually less than 50%. And the probability for RAID 10 is not 50% as you said it was. I was just correcting your analysis. I'm still not sure what RAID structure you had in mind where a second failure on a RAID 10 has a 50% probability of loss.
>
> sorry I wasn't clear but I thought you would figure it out.
>
> Say you have a 4 disk RAID 10 array. If you lose 2 disks, your chances are 50% that the RAID 10 array is unrecoverable. If you lose both elements of one stripe or both elements of one mirror. That's my understanding anyway.

The odds of losing both elements of one mirror on a 4-disk RAID 10 are 1/3. After the first disk fails, there are three remaining; only one of these will kill your RAID if it dies.

Another way to look at it: number the disks 0-3, and say your RAID 10 is a stripe of a 0-1 mirror and a 2-3 mirror. Here are all the ways two disks can fail:

0 1 *
0 2
0 3
1 2
1 3
2 3 *

The * are the ones that cause loss, 2 out of 6 cases.

> I admit I am far from the most knowledgeable person on this topic and I sat on the sidelines for both of the discussions but felt that the article from enterprisestorage needed to be linked because clearly there are sufficient issues with the typical high density, large SATA drives and RAID 5. I have yet to see anything that would change my mind from thinking that the only reason to use RAID 5 is to maximize storage per dollar which may very well come with performance and reliability issues that should not be unspoken.

There's no dispute that RAID 10 is on average more reliable by itself. But there's a lot more to keeping a data center running than the odds of losing data on a RAID. If having twice as much disk chews up your budget so you can't afford that UPS upgrade, then the next power hit may kill your whole data center. If the heat from those extra disks reduces your runtime without HVAC from 30 minutes to 20 minutes, you might get there 10 minutes too late to shut things down without taking 1000 hours off the life of all of those disks.

People who tell other people to just go out and buy RAID 10 without knowing anything about the workload or the impact on the rest of the data center are *not* helping them.

I'm not even sure why these people stop at RAID 10. How about network RAID with geographic redundancy? How about RAID 10 with 3 replicas per mirror?
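The 1/3 figure for the 4-disk RAID 10 can be checked by enumerating every two-disk failure, as in the table in the message above. This is only a minimal sketch added for illustration; the function name `fatal_pair_fraction` and the way mirrors are represented are assumptions, not anything from the thread.

```python
from fractions import Fraction
from itertools import combinations

def fatal_pair_fraction(mirrors):
    """Fraction of two-disk failure combinations that lose a RAID 10.

    mirrors: list of tuples, each tuple holding the disks of one mirror.
    The array is lost when every disk of some mirror has failed.
    """
    disks = [disk for mirror in mirrors for disk in mirror]
    pairs = list(combinations(disks, 2))
    fatal = [p for p in pairs if any(set(m) <= set(p) for m in mirrors)]
    return Fraction(len(fatal), len(pairs))

# Stripe of a 0-1 mirror and a 2-3 mirror, as numbered in the message.
print(fatal_pair_fraction([(0, 1), (2, 3)]))  # 1/3
```

The same function applied to six two-disk mirrors gives 1/11, which matches the 12-disk RAID 10 figure quoted elsewhere in the thread.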
Re: RAID-5 and database servers
John G. Heim wrote:
> From: "Jefferson Ogata"
>
>> *Again*, this is why if you have particular performance requirements, you should consult with your database vendor to determine what bandwidth and IOPS you need, and benchmark your gear using different RAID configs. You may find that RAID 5 is just fine performance-wise, and you can get around 1.7 times the storage capacity with the same rack space, heat, and power load over RAID 10. Asking here you're just going to get people parroting Oracle's stale recommendations and speculating wildly without knowing anything about your workload.
>
> Well, it's not really practical to suggest that I consult with my vendor. My whole budget is $6000. This is just the Math Department at the University of Wisconsin. I mentioned in my original message that our databases consist primarily of spamassassin bayesian rules and horde3/imp web mail. Those do a lot of updates -- well, a lot by our standards. Every time a spam message comes in, it is added to the bayesian rule set for the user. I'm going to say that typically each user gets 100 spam messages a day and there are 200 users. But each new rule consists of several table updates. Even so, it's not like we're ebay.
>
> Anyway, speed of updates is critical because we can't have the mail system getting bogged down by database updates. I put the bayesian rules in a mysql DB in the first place because it was getting bogged down saving bayesian data to bbm files on the mail server.
>
> I just want to make sure that I'm not setting myself up for a disaster.

If writes are an issue and the DB can fit in RAM and you don't mind losing a few writes, then you might try mounting the DB or bbm files in a tmpfs filesystem (aka ramdisk) with a sync to disk every 5 minutes or so. I read an article about someone doing that for ganglia data because the number of transactions was killing them.

Jason
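A rough sketch of the "sync to disk every 5 minutes" idea follows. It is an illustration only: the function names, the file paths a caller would pass in, and the 300-second default are all assumptions. Copying a live database file this way does not guarantee a consistent copy, so in practice the database would have to be quiesced or dumped before each snapshot.

```python
import os
import shutil
import tempfile
import time

def snapshot_file(src, dst):
    """Copy src (e.g. a file on tmpfs) to dst on persistent storage.

    The copy goes to a temporary file in dst's directory and is then
    renamed over dst, so a crash mid-copy never leaves a truncated dst.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir)
    try:
        with os.fdopen(fd, "wb") as tmp, open(src, "rb") as source:
            shutil.copyfileobj(source, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, dst)   # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def sync_forever(src, dst, interval_seconds=300):
    """Snapshot src to dst every interval_seconds (5 minutes by default)."""
    while True:
        snapshot_file(src, dst)
        time.sleep(interval_seconds)
```

Anything written between two snapshots is lost on a crash or power failure, which is the "losing a few writes" trade-off mentioned above.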
Re: RAID-5 and database servers
On 2010-03-12 22:10, John G. Heim wrote:
> I really think my boss is nearly out of patience with me. I think I know what I want though. If I want to set up two RAIDs, one for the operating system and one for the database files, do I need two PERCs? Can a single PERC put 2 disks in a RAID-1 array and 3 others in a RAID-5 array?

Yes, no problem. You'll have /dev/sda and /dev/sdb.
Re: RAID-5 and database servers
Hello John,

I don't know what the other profs will say, but I would take at least a dual-channel RAID controller and put half of the disks on one channel and the other half on the other channel, with a config like:

channel a: disk a, b, c (half of 1st array)
channel b: disk a', b', c' (second half of 1st array)
channel a: disk d, e, f (half of 2nd array)
channel b: disk d', e', f' (second half of 2nd array)

Then I guess you get quite OK data transfer rates, and I think it's unlikely that scenarios like the one posted before, where a RAID fails because of a freezing channel, will break your arrays.

Probably I don't need to write this, but please don't put your tape drive on the same controller or something like that.

So, I'm curious what the experts say about this config, but I think this is quite foolproof.

Regards,
Arno

On Friday 12-03-2010 at 16:10 [timezone -0600], John G. Heim wrote:
> From: "Eric Rostetter"
>> The other trap would be what happens if you get "spam bombed" and get say a couple million spams sent to you in an hour or so... Do you expect to survive this without slowdown, or is it okay that it slows down until the spam bomb dies down? You might only get a spam bomb like this once every 5 years, but if it does happen, what are your expectations? (Here spam bomb could also be a joe-job, a virus outbreak, or other unexpected mail event... Pick your favorite...)
>
> Well, this is actually the problem... I am about as sure as I can be that a "spam bomb" is not a once every five year event. We get flooded with spam pretty regularly. It's probably not a million messages a day but more like 50,000 in two hours and then little or nothing for the next 22 hours.
>
> Yeah, we can survive that. It's not like classes would be cancelled if some prof can't get his mail or the response time on the web server is so slow as to drive people away. But if it can be avoided, it would be very nice.
>
> I really think my boss is nearly out of patience with me. I think I know what I want though. If I want to set up two RAIDs, one for the operating system and one for the database files, do I need two PERCs? Can a single PERC put 2 disks in a RAID-1 array and 3 others in a RAID-5 array?

--
Arno van der Veen
+386 31 629 556
.technologist.si
Re: RAID-5 and database servers
From: "Eric Rostetter"
> The other trap would be what happens if you get "spam bombed" and get say a couple million spams sent to you in an hour or so... Do you expect to survive this without slowdown, or is it okay that it slows down until the spam bomb dies down? You might only get a spam bomb like this once every 5 years, but if it does happen, what are your expectations? (Here spam bomb could also be a joe-job, a virus outbreak, or other unexpected mail event... Pick your favorite...)

Well, this is actually the problem... I am about as sure as I can be that a "spam bomb" is not a once every five year event. We get flooded with spam pretty regularly. It's probably not a million messages a day but more like 50,000 in two hours and then little or nothing for the next 22 hours.

Yeah, we can survive that. It's not like classes would be cancelled if some prof can't get his mail or the response time on the web server is so slow as to drive people away. But if it can be avoided, it would be very nice.

I really think my boss is nearly out of patience with me. I think I know what I want though. If I want to set up two RAIDs, one for the operating system and one for the database files, do I need two PERCs? Can a single PERC put 2 disks in a RAID-1 array and 3 others in a RAID-5 array?
Re: RAID-5 and database servers
Quoting "John G. Heim":

> We have mysql databases for spamassassin bayesian rules, horde3/imp web mail, and drupal. We also have a small departmental database updated via my own web apps.

Drupal and horde are _probably_ going to be mostly read heavy, so they probably don't matter. You're going to have two concerns:

1) The SA bayesian rules might be very write heavy, if you get a lot of mail, and depending on if you do autowhitelist or other such things, and how often you expire them, etc.

2) You have concurrent DB access (multiple databases), so we need to know how many hits each gets... Since your mail user base is small, the horde DB isn't really an issue. So we need to know how much drupal is hit, and how many SA hits it will take...

My guess is that, with only about 200 users, raid-5 would be fine. But I'd take that back if you are doing something like SA auto-whitelisting also... Or if for some reason your drupal site is very well visited...

The biggest trap, as I see it, is if you say "this is fine because we only have 200 users" and then a couple years down the line you've somehow grown to 1000 or 2000 users instead... If you plan to stay small, then I'd say raid-5 would probably be fine. If you forecast growth, then maybe not.

The other trap would be what happens if you get "spam bombed" and get say a couple million spams sent to you in an hour or so... Do you expect to survive this without slowdown, or is it okay that it slows down until the spam bomb dies down? You might only get a spam bomb like this once every 5 years, but if it does happen, what are your expectations? (Here spam bomb could also be a joe-job, a virus outbreak, or other unexpected mail event... Pick your favorite...)

--
Eric Rostetter
The Department of Physics
The University of Texas at Austin

Go Longhorns!
Re: RAID-5 and database servers
On Fri, 2010-03-12 at 15:57 +, Jefferson Ogata wrote:
> On 2010-03-12 15:39, Craig White wrote:
>> On Fri, 2010-03-12 at 07:06 +, Jefferson Ogata wrote:
>>> On 2010-03-12 04:26, Craig White wrote:
>>>> On Fri, 2010-03-12 at 02:23 +, Jefferson Ogata wrote:
>>>>> On 2010-03-11 22:23, Matthew Geier wrote:
>>>>>> I've had a disk fail in such a way on a SCSI array that all disks on that SCSI bus became unavailable simultaneously. When half the disks dropped of the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover.
>>>>> I also should point out (in case it isn't obvious), that that sort of failure would take out the typical RAID 10 as well.
>>>>
>>>> ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50% fatal on RAID 10, I suppose that would be true.
>>> The poster wrote that all of the disks on a bus failed, not just a second one. Depending on the RAID structure, this could take out a RAID 10 100% of the time.
>>
>> actually, this is what he wrote...
>>
>> "When half the disks dropped of the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover."
>>
>> Half != all
>
> Read it again: "I've had a disk fail in such a way on a SCSI array that all disks on that SCSI bus became unavailable simultaneously."

of course I read that but the very next sentence expounds... when half the disks dropped out of the array at the same time, it corrupted the RAID 5 metadata... a loss of 2 RAID 5 devices is always catastrophic.

>> I had a 5 disk RAID 5 array fail the wrong disk and thus had 2 drives go offline and had a catastrophic failure and thus had to re-install and recover from backup once (PERC 3/di & SCSI disks). Not something I wish to do again.
>
> PERC 5 and PERC 6 are worlds different from the PERC 3/di.

agreed

>> I don't think I understand your 'odds' model. I interpret the first example as RAID 50 having 5 times more likelihood of loss than RAID 10 and I presume that isn't what you were after
>
> Yes, it is 5 times higher. But it is not 100%; it's actually less than 50%. And the probability for RAID 10 is not 50% as you said it was. I was just correcting your analysis. I'm still not sure what RAID structure you had in mind where a second failure on a RAID 10 has a 50% probability of loss.

sorry I wasn't clear but I thought you would figure it out.

Say you have a 4 disk RAID 10 array. If you lose 2 disks, your chances are 50% that the RAID 10 array is unrecoverable. If you lose both elements of one stripe or both elements of one mirror. That's my understanding anyway.

I admit I am far from the most knowledgeable person on this topic and I sat on the sidelines for both of the discussions but felt that the article from enterprisestorage needed to be linked because clearly there are sufficient issues with the typical high density, large SATA drives and RAID 5. I have yet to see anything that would change my mind from thinking that the only reason to use RAID 5 is to maximize storage per dollar which may very well come with performance and reliability issues that should not be unspoken.

Craig
Re: RAID-5 and database servers
> I mentioned in my original message that our databases consist primarily of spamassassin bayesian rules and horde3/imp web mail.

Minor correction... I posted the info about my database uses in another thread on this list, not this thread. I did neglect to reiterate what uses I had for the DB server when I started this thread to ask for specifics on RAID-5 vs RAID-10. Sorry 'bout that.

We have mysql databases for spamassassin bayesian rules, horde3/imp web mail, and drupal. We also have a small departmental database updated via my own web apps.

PS: Before you suggest it, I also asked about this on the mysql list. But I think the vast majority of people on that list are primarily DB experts and don't know much about hardware. I'd rather not have to become an expert on mysql optimization but I guess I'll have to.
Re: RAID-5 and database servers
On 2010-03-12 15:45, John G. Heim wrote:
> Well, it's not really practical to suggest that I consult with my vendor. My whole budget is $6000. This is just the Math Department at the University of Wisconsin. I mentioned in my original message that our databases consist primarily of spamassassin bayesian rules and horde3/imp web mail. Those do a lot of updates -- well, a lot by our standards. Every time a spam message comes in, it is added to the bayesian rule set for the user. I'm going to say that typically each user gets 100 spam messages a day and there are 200 users. But each new rule consists of several table updates. Even so, it's not like we're ebay.
>
> Anyway, speed of updates is critical because we can't have the mail system getting bogged down by database updates. I put the bayesian rules in a mysql DB in the first place because it was getting bogged down saving bayesian data to bbm files on the mail server.
>
> I just want to make sure that I'm not setting myself up for a disaster.

Can you estimate the number of transactions per second you need? Is the current mysql implementation keeping up with the mail? If so, run

    iostat -kthx 60

under peak load, wait a minute, and post the last report, indicating which block device has the mysql database on it.

It doesn't sound like it would be a disaster if your database filesystem crashed; you'd just drop the spam filtering while you reconstruct it.

Is your $6000 just for storage or do you have to buy a PowerEdge to go along with it?
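As a back-of-the-envelope answer to the transactions-per-second question, the numbers given in the thread can be turned into update rates. This is only a sketch: the user and message counts are the ones John quotes (200 users, about 100 spam messages each per day, bursts of about 50,000 messages in two hours), while `UPDATES_PER_MESSAGE = 5` is a made-up stand-in for "several table updates" and the function name is hypothetical.

```python
def updates_per_second(messages, period_seconds, updates_per_message):
    """Average database updates per second over the given period."""
    return messages * updates_per_message / period_seconds

UPDATES_PER_MESSAGE = 5  # assumption: the thread only says "several"

# Steady state from the thread: 200 users x 100 spam messages per day.
daily = updates_per_second(200 * 100, 24 * 3600, UPDATES_PER_MESSAGE)

# Burst from the thread: about 50,000 messages in two hours.
burst = updates_per_second(50_000, 2 * 3600, UPDATES_PER_MESSAGE)

print(f"average: {daily:.2f} updates/s, burst: {burst:.2f} updates/s")
# average: 1.16 updates/s, burst: 34.72 updates/s
```

Under these assumptions the burst rate, not the daily average, is the figure to compare against what iostat reports under peak load.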
Re: RAID-5 and database servers
On 2010-03-12 15:39, Craig White wrote:
> On Fri, 2010-03-12 at 07:06 +, Jefferson Ogata wrote:
>> On 2010-03-12 04:26, Craig White wrote:
>>> On Fri, 2010-03-12 at 02:23 +, Jefferson Ogata wrote:
>>>> On 2010-03-11 22:23, Matthew Geier wrote:
>>>>> I've had a disk fail in such a way on a SCSI array that all disks on that SCSI bus became unavailable simultaneously. When half the disks dropped of the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover.
>>>> I also should point out (in case it isn't obvious), that that sort of failure would take out the typical RAID 10 as well.
>>>
>>> ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50% fatal on RAID 10, I suppose that would be true.
>> The poster wrote that all of the disks on a bus failed, not just a second one. Depending on the RAID structure, this could take out a RAID 10 100% of the time.
>
> actually, this is what he wrote...
>
> "When half the disks dropped of the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover."
>
> Half != all

Read it again: "I've had a disk fail in such a way on a SCSI array that all disks on that SCSI bus became unavailable simultaneously."

Unless you have a disk on a separate bus for every mirror in the RAID 10, this will kill your RAID 10 100% of the time. While that configuration is more bulletproof, it also may not perform as well on a saturated RAID 10 since every write has to be queued to two separate buses instead of one.

The original poster's failure was a recoverable one, anyway. He just didn't know the technique for recovery.

> I had a 5 disk RAID 5 array fail the wrong disk and thus had 2 drives go offline and had a catastrophic failure and thus had to re-install and recover from backup once (PERC 3/di & SCSI disks). Not something I wish to do again.

PERC 5 and PERC 6 are worlds different from the PERC 3/di.

> I don't think I understand your 'odds' model. I interpret the first example as RAID 50 having 5 times more likelihood of loss than RAID 10 and I presume that isn't what you were after

Yes, it is 5 times higher. But it is not 100%; it's actually less than 50%. And the probability for RAID 10 is not 50% as you said it was. I was just correcting your analysis. I'm still not sure what RAID structure you had in mind where a second failure on a RAID 10 has a 50% probability of loss.

>> In the alternative fair comparison, RAID 5 vs. RAID 1, the second failure kills both RAIDs 100% of the time.
>
> actually, I didn't raise the RAID 5 vs RAID 10 comparison, I only amplified with my experiences

You wrote: "ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50% fatal on RAID 10, I suppose that would be true." That was you comparing RAID 5 with RAID 10.

> the last time I bought an MD-1000, Dell would only sell me the PERC-5e, I don't know why.

Currently you can buy an MD1000 with or without a PERC 6. (If I could recommend an enclosure from a different manufacturer at this point, I would, but I haven't evaluated any others since I started buying MD1000s some years ago.)

--
Jefferson Ogata : Internetworker, Antibozo
http://www.antibozo.net/ogata/
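The point about buses can be made concrete: a RAID 10 survives the loss of a whole bus only if every mirror keeps at least one disk on some other bus. Below is a small sketch of that check; the disk names, bus numbers and the function name `survives_bus_failure` are made up for illustration.

```python
def survives_bus_failure(mirrors, bus_of, failed_bus):
    """True if a RAID 10 survives losing every disk on failed_bus.

    mirrors: list of tuples of disk names, one tuple per mirror.
    bus_of:  dict mapping each disk name to the bus it is attached to.
    """
    return all(
        any(bus_of[disk] != failed_bus for disk in mirror)
        for mirror in mirrors
    )

mirrors = [("a", "a2"), ("b", "b2"), ("c", "c2")]

# Every mirror split across two buses.
split = {"a": 0, "b": 0, "c": 0, "a2": 1, "b2": 1, "c2": 1}
# Mirror a/a2 has both halves on bus 0.
same_bus = {"a": 0, "a2": 0, "b": 0, "b2": 1, "c": 1, "c2": 1}

print(survives_bus_failure(mirrors, split, failed_bus=0))     # True
print(survives_bus_failure(mirrors, same_bus, failed_bus=0))  # False
```

The sketch only models which disks disappear; it says nothing about the write-queueing cost of splitting mirrors across buses that the message also mentions.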
Re: RAID-5 and database servers
From: "Jefferson Ogata"
> *Again*, this is why if you have particular performance requirements, you should consult with your database vendor to determine what bandwidth and IOPS you need, and benchmark your gear using different RAID configs. You may find that RAID 5 is just fine performance-wise, and you can get around 1.7 times the storage capacity with the same rack space, heat, and power load over RAID 10. Asking here you're just going to get people parroting Oracle's stale recommendations and speculating wildly without knowing anything about your workload.

Well, it's not really practical to suggest that I consult with my vendor. My whole budget is $6000. This is just the Math Department at the University of Wisconsin. I mentioned in my original message that our databases consist primarily of spamassassin bayesian rules and horde3/imp web mail. Those do a lot of updates -- well, a lot by our standards. Every time a spam message comes in, it is added to the bayesian rule set for the user. I'm going to say that typically each user gets 100 spam messages a day and there are 200 users. But each new rule consists of several table updates. Even so, it's not like we're ebay.

Anyway, speed of updates is critical because we can't have the mail system getting bogged down by database updates. I put the bayesian rules in a mysql DB in the first place because it was getting bogged down saving bayesian data to bbm files on the mail server.

I just want to make sure that I'm not setting myself up for a disaster.
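The "around 1.7 times the storage capacity" figure in the quoted text can be reproduced by counting usable disks. The sketch below assumes equal-size disks, a single RAID 5 group (one disk's worth of parity), two-way mirrors for RAID 10 and no hot spares; the function name is hypothetical.

```python
from fractions import Fraction

def raid5_over_raid10_capacity(n_disks):
    """Usable capacity of one RAID 5 group relative to RAID 10.

    RAID 5 keeps n - 1 disks of data, RAID 10 keeps n / 2,
    so the ratio is 2 * (n - 1) / n.
    """
    if n_disks < 4 or n_disks % 2:
        raise ValueError("need an even number of disks, at least 4")
    return Fraction(n_disks - 1, n_disks // 2)

for n in (4, 6, 8, 14):
    ratio = raid5_over_raid10_capacity(n)
    print(f"{n:2d} disks: RAID 5 holds {float(ratio):.2f}x what RAID 10 holds")
```

For 6 disks the ratio is 5/3 (about 1.67) and for 8 disks it is 7/4 (1.75), which brackets the quoted figure; the ratio approaches 2 as the disk count grows.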
Re: RAID-5 and database servers
On Fri, 2010-03-12 at 07:06 +, Jefferson Ogata wrote:
> On 2010-03-12 04:26, Craig White wrote:
>> On Fri, 2010-03-12 at 02:23 +, Jefferson Ogata wrote:
>>> On 2010-03-11 22:23, Matthew Geier wrote:
>>>> I've had a disk fail in such a way on a SCSI array that all disks on that SCSI bus became unavailable simultaneously. When half the disks dropped of the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover.
>>> I also should point out (in case it isn't obvious), that that sort of failure would take out the typical RAID 10 as well.
>>
>> ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50% fatal on RAID 10, I suppose that would be true.
>
> The poster wrote that all of the disks on a bus failed, not just a second one. Depending on the RAID structure, this could take out a RAID 10 100% of the time.

actually, this is what he wrote...

"When half the disks dropped of the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover."

Half != all

I had a 5 disk RAID 5 array fail the wrong disk and thus had 2 drives go offline and had a catastrophic failure and thus had to re-install and recover from backup once (PERC 3/di & SCSI disks). Not something I wish to do again.

> In your "second disk" scenario, comparing RAID 5 with RAID 10 in terms of failure likelihood isn't fair; you need to compare RAID 50 with RAID 10. And the odds depend on the number of disks and the RAID structure.
>
> Suppose you have 12 disks arranged as a 6x2 RAID 10, and the same number of disks as a 2x6 RAID 50. When the second disk fails the odds of loss are:
>
> - RAID 50: 5/11.
> - RAID 10: 1/11.
>
> If instead we have the 12 disks as a 3x4 RAID 50, then the odds of loss when the second disk fails are:
>
> - RAID 50: 3/11.
> - RAID 10: 1/11.
>
> We can now tolerate a third disk failure with our RAID 50 with the odds of loss:
>
> - RAID 50: 6/10.
> - RAID 10: 2/10.
>
> How often does this happen? It hasn't happened to me, and it hasn't happened to anyone I know.

I don't think I understand your 'odds' model. I interpret the first example as RAID 50 having 5 times more likelihood of loss than RAID 10 and I presume that isn't what you were after

> In the alternative fair comparison, RAID 5 vs. RAID 1, the second failure kills both RAIDs 100% of the time.

actually, I didn't raise the RAID 5 vs RAID 10 comparison, I only amplified with my experiences

> It's pretty clear you don't speak from any recent experience as far as RAID 5 performance goes, and you yourself say as much when you say you "had already forsaken RAID 5". Like Oracle, you're living in the past. You should do some of your own benchmarks.

I'd agree with that assessment... I gave up on RAID 5 a few years ago. In addition, reading the previously linked article in enterprisestorage.com tells me that when I use SATA drives, I should avoid RAID 5... good enough for me.

> In any case, the argument in that article applies to RAID 10 as well; it gives you better probabilities but eventually it will take too long to rebuild mirrors and failure will be just as inevitable as with RAID 5. Error rates will have to drop to prevent this, and no doubt they will, sufficiently that the article's argument is moot. Eventually they will drop to the point where we will be using RAID 0.
>
>> On top of that, it seems to me that RAID 10 smokes RAID 5 on every performance characteristic my clients are likely to use (and yes, that means databases). RAID 5 primarily satisfies the needs for maximum storage for the least amount of money and that was rarely what I need in a storage system for a server.
>
> For a lot of access patterns, RAID 5 yields much better write bandwidth than RAID 10. I don't know why you think RAID 10 "smokes" RAID 5. You should grab a PERC 6 and a couple of MD1000s and try some different configurations. I don't think you'll see any smoke in the margins, even over the oddly limited gamut of access patterns your clients use.

the last time I bought an MD-1000, Dell would only sell me the PERC-5e, I don't know why.

I could see possibly using RAID 50 but RAID 5 is just not a path I want to venture any more.

Craig
RE: RAID-5 and database servers
Plenty of good advice in this thread. I might add that you'd want to use the maximum number of disks you can pack into your box while meeting your storage requirements, rather than just a few big ones, and to favor RAID-6 over RAID-5. Striping a busy DB over more individual disks will usually yield added performance benefits; smaller disks also rebuild quicker in case of failure and are cheaper to replace. The increase in disks does increase the risk of RAID failure as well, but using RAID-6 plus hot spares helps minimize that.

Rebuild is the most critical moment in RAID-5, especially with those newer huge disks that take ages to rebuild. Rebuilding a busy 500GB SATA RAID-5 took about 8 hours, but YMMV. During rebuild the load is increased on all the remaining disks, and as they usually are from the same batch as the failed one, they may fail for the same reason the first one did. I had a big RAID-5 die on me during the rebuild phase, and that is not a happy memory. Thankfully we had some spare drives to rebuild a new array, and full backups.

If money is really no object, go buy a couple of RamSans (http://www.ramsan.com).
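To put the rebuild-time remark in numbers, the 8-hour figure can be turned into an effective rebuild rate and extrapolated. This is a rough sketch under loud assumptions: that the 500 in the message is the amount of data rebuilt, in decimal gigabytes, and that rebuild time scales linearly with capacity at the same rate. The 2000 GB disk is a hypothetical example, not a figure from the thread.

```python
def rebuild_rate_mb_per_s(capacity_gb, hours):
    """Effective rebuild rate in MB/s (decimal units)."""
    return capacity_gb * 1000 / (hours * 3600)

def rebuild_hours(capacity_gb, rate_mb_per_s):
    """Hours needed to rebuild capacity_gb at the given rate."""
    return capacity_gb * 1000 / rate_mb_per_s / 3600

rate = rebuild_rate_mb_per_s(500, 8)
print(f"implied rebuild rate: {rate:.1f} MB/s")  # 17.4 MB/s
print(f"2000 GB at that rate: {rebuild_hours(2000, rate):.0f} hours")  # 32 hours
```

The longer the rebuild window, the longer the remaining disks run degraded under extra load, which is the risk the message describes.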
Re: RAID-5 and database servers
On 2010-03-12 04:26, Craig White wrote: > On Fri, 2010-03-12 at 02:23 +, Jefferson Ogata wrote: >> On 2010-03-11 22:23, Matthew Geier wrote: >>> I've had a disk fail in such a way on a SCSI array that all disks on >>> that SCSI bus became unavailable simultaneously. When half the disks >>> dropped of the array at the same time, it gave up and corrupted the RAID >>> 5 meta data so that even after removing the offending drive, the array >>> didn't recover. >> I also should point out (in case it isn't obvious), that that sort of >> failure would take out the typical RAID 10 as well. > > ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50% > fatal on RAID 10, I suppose that would be true. The poster wrote that all of the disks on a bus failed, not just a second one. Depending on the RAID structure, this could take out a RAID 10 100% of the time. In your "second disk" scenario, comparing RAID 5 with RAID 10 in terms of failure likelihood isn't fair; you need to compare RAID 50 with RAID 10. And the odd depend on the number of disks and the RAID structure. Suppose you have 12 disks arranged as a 6x2 RAID 10, and the same number of disks as a 2x6 RAID 50. When the second disk fails the odds of loss are: - RAID 50: 5/11. - RAID 10: 1/11. If instead we have the 12 disks as a 3x4 RAID 50, then the odds of loss when the second disk fails are: - RAID 50: 3/11. - RAID 10: 1/11. We can now tolerate a third disk failure with our RAID 50 with the odds of loss: - RAID 50: 6/10. - RAID 10: 2/10. How often does this happen? It hasn't happened to me, and it hasn't happened to anyone I know. In the alternative fair comparison, RAID 5 vs. RAID 1, the second failure kills both RAIDs 100% of the time. And there's always RAID 6. 
> So if Dell is selling a high quality hard drive with more than average > durability and the anticipation that it is going to last longer under > 24/7 usage, its entirely reasonable to have to pay more than the > cheapest dirt SATA drive you can find online. Of course you will have to > live with the consequences if you go with the dirt cheap drive. > Personally, I put a lot of value on my time and my customers data. I have hundreds of Dell disks online. They fail regularly. Often they fail during system burn-in. For the kind of markup Dell is charging on these drives I don't think I should be finding dead ones after only 24 hours of operation. And a one-year warranty is just ridiculous. > I read this article last year... > > http://www.enterprisestorageforum.com/technology/features/article.php/3839636 > > and I had already forsaken RAID 5 but it pretty much confirmed what my > experiences had been... that when I considered the life cycle of the > installation, the time lost in waiting for file transfer, etc. on RAID > 5, etc. that it was foolish for me to recommend RAID 5 to anyone. It's pretty clear you don't speak from any recent experience as far as RAID 5 performance goes, and you yourself say as much when you say you "had already forsaken RAID 5". Like Oracle, you're living in the past. You should do some of your own benchmarks. In any case, the argument in that article applies to RAID 10 as well; it gives you better probabilities but eventually it will take too long to rebuild mirrors and failure will be just as inevitable as with RAID 5. Error rates will have to drop to prevent this, and no doubt they will, sufficiently that the article's argument is moot. Eventually they will drop to the point where we will be using RAID 0. > On top of that, > it seems to me that RAID 10 smokes RAID 5 on every performance > characteristic my clients are likely to use (and yes, that means > databases). 
> RAID 5 primarily satisfies the needs for maximum storage for
> the least amount of money and that was rarely what I need in a storage
> system for a server.

For a lot of access patterns, RAID 5 yields much better write bandwidth than RAID 10. I don't know why you think RAID 10 "smokes" RAID 5. You should grab a PERC 6 and a couple of MD1000s and try some different configurations. I don't think you'll see any smoke in the margins, even over the oddly limited gamut of access patterns your clients use.
Re: RAID-5 and database servers
On Fri, 2010-03-12 at 02:23 +, Jefferson Ogata wrote:
> On 2010-03-11 22:23, Matthew Geier wrote:
> > I've had a disk fail in such a way on a SCSI array that all disks on
> > that SCSI bus became unavailable simultaneously. When half the disks
> > dropped off the array at the same time, it gave up and corrupted the RAID
> > 5 meta data so that even after removing the offending drive, the array
> > didn't recover.
>
> I also should point out (in case it isn't obvious), that that sort of
> failure would take out the typical RAID 10 as well.

ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50% fatal on RAID 10, I suppose that would be true.

I've been reading this thread and the thread about Dell's pricing of SATA disks pretty much in silence and have been wondering about some of the massive generalizations and limited-scope opinions that have been expressed on this list, and figure that it's probably time for me to pipe in with my underinformed view.

RAID is a great tool and traditionally servers have been sold with high grade hardware (controllers & hard drives), but of course the pressure is always on to get the maximum amount of storage for a minimum amount of cost, so it seems that we cannot find RAID controller hardware that is cheap enough or hard drives cheap enough.

The truth is that the SATA controllers are fairly marginal and some of the SATA drives are really not suitable for putting into a server from which you expect some durability and stability over time. Not that that is going to stop people from buying them anyway.

So if Dell is selling a high quality hard drive with more than average durability and the anticipation that it is going to last longer under 24/7 usage, it's entirely reasonable to have to pay more than the cheapest dirt SATA drive you can find online. Of course you will have to live with the consequences if you go with the dirt cheap drive. Personally, I put a lot of value on my time and my customers' data.
I read this article last year...

http://www.enterprisestorageforum.com/technology/features/article.php/3839636

and I had already forsaken RAID 5 but it pretty much confirmed what my experiences had been... that when I considered the life cycle of the installation, the time lost in waiting for file transfer, etc. on RAID 5, etc. that it was foolish for me to recommend RAID 5 to anyone.

It's not that RAID 5 doesn't work... it does. It's not that it is prone to failure, it's not (well this article is suggesting that the more drives you have in a RAID 5 array, the more likely you are going to suffer from catastrophic loss when rebuilding the array). It's just that I am more prone to use cheaper hard drives, cheaper controllers and at some point I have to have the extra margin for safety. On top of that, it seems to me that RAID 10 smokes RAID 5 on every performance characteristic my clients are likely to use (and yes, that means databases). RAID 5 primarily satisfies the needs for maximum storage for the least amount of money and that was rarely what I need in a storage system for a server.

Craig
Re: RAID-5 and database servers
On 2010-03-11 22:23, Matthew Geier wrote:
> I've had a disk fail in such a way on a SCSI array that all disks on
> that SCSI bus became unavailable simultaneously. When half the disks
> dropped off the array at the same time, it gave up and corrupted the RAID
> 5 meta data so that even after removing the offending drive, the array
> didn't recover.

I also should point out (in case it isn't obvious), that that sort of failure would take out the typical RAID 10 as well.
Re: RAID-5 and database servers
On 2010-03-11 22:23, Matthew Geier wrote:
> I've had a disk fail in such a way on a SCSI array that all disks on
> that SCSI bus became unavailable simultaneously. When half the disks
> dropped off the array at the same time, it gave up and corrupted the RAID
> 5 meta data so that even after removing the offending drive, the array
> didn't recover.

In that scenario you *should* be able to recover by reconfiguring the RAID as it originally was before the SCSI crash, and *not* initializing the logical drives.

I have my systems all mail me a nightly report of RAID configuration in case I ever need to do this. While I might be able to remember how I configured RAIDs at install time, the config may change over time, e.g. after a hot spare is brought online.

If you are using LSI-based RAID controllers, you might be able to save the current controller config with MegaCLI using the -cfgsave option periodically, and recover the config after a crash using -cfgrestore.
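A periodic save along those lines might be scripted roughly as follows. This is only a sketch under several assumptions: that the binary is installed as `MegaCli64` and is on the PATH (name and location vary by install), that the save and restore options take the form `-CfgSave -f <file> -aN` and `-CfgRestore -f <file> -aN` (check the tool's own help output for your version), and that `/var/backups` is a sensible place for the files. The helper names are hypothetical.

```python
import subprocess
from datetime import date

MEGACLI = "MegaCli64"  # assumption: binary name and location vary by install


def cfg_command(action, filename, adapter=0):
    """Build the MegaCLI command line for saving or restoring a config."""
    if action not in ("save", "restore"):
        raise ValueError("action must be 'save' or 'restore'")
    flag = "-CfgSave" if action == "save" else "-CfgRestore"
    return [MEGACLI, flag, "-f", filename, f"-a{adapter}"]


def save_config(directory="/var/backups", adapter=0):
    """Save the controller config to a dated file and return its path."""
    filename = f"{directory}/raid-a{adapter}-{date.today():%Y%m%d}.cfg"
    subprocess.run(cfg_command("save", filename, adapter), check=True)
    return filename
```

Calling `save_config()` from a nightly cron job, and copying the resulting file off the machine, would give you a record that also reflects changes such as a hot spare having been brought online.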
Re: RAID-5 and database servers
Jefferson Ogata wrote:
>
> That's not what I mean by a full RAID failure. I've had plenty of disks
> fail and subsequent successful rebuilds. I'm saying on one occasion
> (because of an oversight) I ended up with an unrecoverable RAID 5
> because of disk failures.
>
> Of course, this wasn't a serious problem because I also had backups.
>

I've had a disk fail in such a way on a SCSI array that all disks on that SCSI bus became unavailable simultaneously. When half the disks dropped off the array at the same time, it gave up and corrupted the RAID 5 meta data so that even after removing the offending drive, the array didn't recover.

The restore from backup tape took nearly 48 hours, as it was near the end of our monthly backup cycle and 28 'incremental' tapes had to be loaded. It was a mail spool as well, so the incrementals were reasonably large. We changed the backup schedule after that to do full dumps more often so fewer tapes would be required to restore it :-)
Re: RAID-5 and database servers
Quoting Jefferson Ogata :

> That's not what I mean by a full RAID failure.

My mistake; I just glossed right over the word "full" as if it wasn't there... Sorry about that... Brain fart I guess

-- Eric Rostetter The Department of Physics The University of Texas at Austin Go Longhorns!
Re: RAID-5 and database servers
On 2010-03-11 19:48, Eric Rostetter wrote:
> Quoting Jefferson Ogata :
>> I've got several hundred disks running on RAID 5 and I've had one actual
>> full RAID failure in 10 years, and that was my fault.
>
> You've been lucky! :)
>
> In 10 years, I think I've had 3 RAID 5 failures (all rebuilt without
> problems).

That's not what I mean by a full RAID failure. I've had plenty of disks fail and subsequent successful rebuilds. I'm saying on one occasion (because of an oversight) I ended up with an unrecoverable RAID 5 because of disk failures.

Of course, this wasn't a serious problem because I also had backups.

-- Jefferson Ogata : Internetworker, Antibozo http://www.antibozo.net/ogata/
Re: RAID-5 and database servers
Quoting Jefferson Ogata :

> I've got several hundred disks running on RAID 5 and I've had one actual
> full RAID failure in 10 years, and that was my fault.

You've been lucky! :)

In 10 years, I think I've had 3 RAID 5 failures (all rebuilt without problems).

> In terms of performance, depending on the workload, RAID 5 can
> outperform RAID 10.

Very true.

> Furthermore Oracle's recommendations are based on
> what appears to be 5-10-year-old data

I agree, it appears outdated to me also.

> Bear in mind
> also that now that Oracle is a hardware company, they'd just love you to
> buy almost twice as much disk (from them).

I doubt that is a driving factor here...

> *Again*, this is why if you have particular performance requirements,
> you should consult with your database vendor to determine what bandwidth
> and IOPS you need, and benchmark your gear using different RAID configs.

Or at a minimum, you need to define what your performance requirements are. If you can't quantify your performance requirements, you're just guessing and "taking a shot in the dark".

> You may find that RAID 5 is just fine performance-wise, and you can get
> around 1.7 times the storage capacity with the same rack space, heat,
> and power load over RAID 10. Asking here you're just going to get people
> parroting Oracle's stale recommendations and speculating wildly without
> knowing anything about your workload.

Well, the advice has been slightly better than that, but yes, we're all speculating without knowing anything about the workload. And I at least have stated that in my posts/replies... If a serious answer is needed, the OP needs to post the workload and performance expectations at a minimum...

-- Eric Rostetter The Department of Physics The University of Texas at Austin Go Longhorns!
Re: RAID-5 and database servers
Quoting "J. Epperson" :

> On Thu, March 11, 2010 11:17, Dan Pritts wrote:
>> On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
>>> Has anyone configured a database server with RAID-5? Is it really a bad
>>> idea
>>
>> http://www.orafaq.com/wiki/RAID
>>
>
> Which says that unless money is no object, go with RAID 5.

I'd say the page is somewhat outdated. If your disks are large, and most disks are today, RAID 5 should be replaced by RAID 6 or better. RAID 5 is risky if your disks are large... The larger the disk, the better the chance of a second failure during a RAID 5 rebuild (causing a total loss of data).

Also, while it does indeed say go with RAID 5 if you can't afford RAID 10, it also says:

> use where availability is important, AND 'read' will be the majority of I/O's

If your database is mostly write, RAID 5 would not be a great idea... Fortunately most databases are either mostly read, or mixed read-write. But there are some mostly-write databases, and these would be a bad fit for RAID 5 (or RAID 6).

Again, it depends on your environment and your needs... It is possible RAID 5 is perfect for your needs, but terrible for my needs... If you don't need fast access, then it doesn't matter... Some people have databases, and it takes many hours to generate a report, and they are okay with that. Others can't bear it if the report takes more than 30 seconds... If your database use is interactive and response time is important, you likely need a different setup than if your database is mostly batch oriented and response time isn't as important...

-- Eric Rostetter The Department of Physics The University of Texas at Austin Go Longhorns!
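One way to put a number on the "larger disk, higher risk" point is to estimate the chance of hitting at least one unrecoverable read error (URE) while reading every surviving disk during a RAID 5 rebuild. The sketch below is illustrative only: the rate of one error per 10^14 bits is an assumption (a figure often quoted on consumer SATA spec sheets, not something stated in this thread), errors are treated as independent, and whether a single URE actually kills the rebuild depends on the controller. It also models only this one mechanism, not whole-disk failures.

```python
import math


def p_read_error_during_rebuild(disk_bytes, n_disks, ure_per_bit=1e-14):
    """Chance of at least one URE while reading the n_disks-1 survivors.

    ure_per_bit is an assumed spec-sheet style figure, not a measurement.
    """
    bits_read = (n_disks - 1) * disk_bytes * 8
    # 1 - (1 - p)**bits, computed in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


for size_gb in (146, 500, 1000, 2000):
    p = p_read_error_during_rebuild(size_gb * 10**9, n_disks=4)
    print(f"4 x {size_gb} GB RAID 5: {p:.1%} chance of a URE during rebuild")
```

With the same assumed error rate, the estimate rises with both disk size and disk count, which is the argument for RAID 6 on large drives.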
Re: RAID-5 and database servers
On Thu, March 11, 2010 13:09, Preston Hagar wrote:
> On Thu, Mar 11, 2010 at 11:26 AM, J. Epperson wrote:
>> On Thu, March 11, 2010 11:17, Dan Pritts wrote:
>>> On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
>>>> Has anyone configured a database server with RAID-5? Is it really a bad
>>>> idea
>>>
>>> http://www.orafaq.com/wiki/RAID
>>>
>>
>> Which says that unless money is no object, go with RAID 5.
>>
>
> Actually it says if money is no object, go with RAID 10:
>

And that if RAID 10 is too expensive, go with RAID 5.
Re: RAID-5 and database servers
Hi,

I think if you use RAID 1 with at least 1 hot spare, you're pretty secure and still get a high data transfer rate. If one disk fails, the hot spare takes its place and gives you time to replace the broken disk.

Recently I moved an old server from RAID 5 to RAID 1 because of PostgreSQL; they recommended RAID 1 or RAID 10 for performance. That said, I never had serious problems with RAID 5 (I always used a combination of at least 3 disks and a minimum of 1 hot spare): no hardware problems, nor performance issues. But I mostly work with quite small workgroups (max 50 workstations).

regards, Arno

On Thursday 11-03-2010 at 12:09 [timezone -0600], Preston Hagar wrote:
> On Thu, Mar 11, 2010 at 11:26 AM, J. Epperson wrote:
> > On Thu, March 11, 2010 11:17, Dan Pritts wrote:
> >> On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
> >>> Has anyone configured a database server with RAID-5? Is it really a bad
> >>> idea
> >>
> >> http://www.orafaq.com/wiki/RAID
> >>
> >
> > Which says that unless money is no object, go with RAID 5.
> >
>
> Actually it says if money is no object, go with RAID 10:
>
> http://www.orafaq.com/wiki/RAID#RAID_10
>
> RAID 10 is the ideal RAID level in terms of performance and
> availability, but it can be expensive as it requires at least twice
> the amount of disk space. If money is no objective, always choose RAID
> 10!
>
> I would agree with the RAID 10 recommendation. I at one time did a
> lot of RAID 5 to try to compromise price vs performance, but had
> several array failures resulting in having to restore from backup.
> Now, I put anything important on either RAID 1, or RAID 10. Basically
> I use RAID 1 if it needs to be reliable and RAID 10 if it needs to be
> reliable and fast.
>
> Preston
Re: RAID-5 and database servers
On 2010-03-11 18:09, Preston Hagar wrote: > Actually it says if money is no object, go with RAID 10: > > http://www.orafaq.com/wiki/RAID#RAID_10 > > RAID 10 is the ideal RAID level in terms of performance and > availability, but it can be expensive as it requires at least twice > the amount of disk space. If money is no objective, always choose RAID > 10! > > I would agree with the RAID 10 recommendation. I at one time did a > lot of RAID 5 to try to comprimise price vs performance, but had > several array failures resulting in having to restore from backup. > Now, I put anything important on either RAID 1, or RAID 10. Basically > I use RAID 1 if it needs to be reliable and RAID 10 if it needs to be > reliable and fast. I've got several hundred disks running on RAID 5 and I've had one actual full RAID failure in 10 years, and that was my fault. In terms of performance, depending on the workload, RAID 5 can outperform RAID 10. Furthermore Oracle's recommendations are based on what appears to be 5-10-year-old data, back when mid-level RAID controllers weren't capable of pushing ~700 MB/s onto a RAID 5. Nowadays, they can do that, and achieve pretty stellar IOPS as well. The difference in performance between RAID 5 (or better yet, RAID 50, striped using LVM), and RAID 10 is not what it used to be. Bear in mind also that now that Oracle is a hardware company, they'd just love you to buy almost twice as much disk (from them). *Again*, this is why if you have particular performance requirements, you should consult with your database vendor to determine what bandwidth and IOPS you need, and benchmark your gear using different RAID configs. You may find that RAID 5 is just fine performance-wise, and you can get around 1.7 times the storage capacity with the same rack space, heat, and power load over RAID 10. Asking here you're just going to get people parroting Oracle's stale recommendations and speculating wildly without knowing anything about your workload. 
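The "around 1.7 times" figure follows from simple disk counting. A minimal sketch, assuming identical disks and an even disk count, and counting only usable data disks (no hot spares); the function name is illustrative:

```python
from fractions import Fraction


def capacity_ratio(n_disks, raid5_groups=1):
    """Usable capacity of RAID 5/50 divided by that of RAID 10.

    RAID 5/50 gives up one disk per RAID 5 group to parity;
    RAID 10 gives up half the disks to mirroring.
    """
    if n_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return Fraction(n_disks - raid5_groups, n_disks // 2)


for n, groups in ((6, 1), (8, 1), (12, 2), (12, 3)):
    ratio = capacity_ratio(n, groups)
    print(f"{n} disks, {groups} RAID 5 group(s): {float(ratio):.2f}x RAID 10")
```

A single 6- to 8-disk RAID 5, or a 12-disk 2x6 RAID 50, lands between roughly 1.67x and 1.75x, while splitting into more groups (3x4) drops the advantage to 1.5x.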
Re: RAID-5 and database servers
On Thu, Mar 11, 2010 at 11:26 AM, J. Epperson wrote:
> On Thu, March 11, 2010 11:17, Dan Pritts wrote:
>> On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
>>> Has anyone configured a database server with RAID-5? Is it really a bad
>>> idea
>>
>> http://www.orafaq.com/wiki/RAID
>>
>
> Which says that unless money is no object, go with RAID 5.
>

Actually it says if money is no object, go with RAID 10:

http://www.orafaq.com/wiki/RAID#RAID_10

RAID 10 is the ideal RAID level in terms of performance and availability, but it can be expensive as it requires at least twice the amount of disk space. If money is no objective, always choose RAID 10!

I would agree with the RAID 10 recommendation. I at one time did a lot of RAID 5 to try to compromise price vs performance, but had several array failures resulting in having to restore from backup. Now, I put anything important on either RAID 1, or RAID 10. Basically I use RAID 1 if it needs to be reliable and RAID 10 if it needs to be reliable and fast.

Preston
Re: RAID-5 and database servers
On Thu, March 11, 2010 11:17, Dan Pritts wrote:
> On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
>> Has anyone configured a database server with RAID-5? Is it really a bad
>> idea
>
> http://www.orafaq.com/wiki/RAID
>

Which says that unless money is no object, go with RAID 5.
Re: RAID-5 and database servers
On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
> Has anyone configured a database server with RAID-5? Is it really a bad idea

One other thought - don't use RAID5 for anything you really care about. Use RAID6.

For a great understanding of why, read the articles on http://blogs.sun.com/relling/ regarding "ZFS Raid Recommendations." For the purposes of reliability calculations, RAID-Z is equivalent to RAID5 and RAID-Z2 is equivalent to RAID6.

danno
-- Dan Pritts, Sr. Systems Engineer Internet2 office: +1-734-352-4953 | mobile: +1-734-834-7224 Internet2 Spring Member Meeting April 26-28, 2010 - Arlington, Virginia http://events.internet2.edu/2010/spring-mm/
Re: RAID-5 and database servers
On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
> Has anyone configured a database server with RAID-5? Is it really a bad idea

http://www.orafaq.com/wiki/RAID

"stripe and mirror everything"

ie, RAID10.

danno
-- Dan Pritts, Sr. Systems Engineer Internet2 office: +1-734-352-4953 | mobile: +1-734-834-7224 Internet2 Spring Member Meeting April 26-28, 2010 - Arlington, Virginia http://events.internet2.edu/2010/spring-mm/
Re: RAID-5 and database servers
Quoting "John G. Heim" :

> Has anyone configured a database server with RAID-5?

Sure... Most people don't, but some workloads might benefit from it.

> Is it really a bad idea
> to do so?

Depends on your workload... If it is a mostly read-intensive database, it would be fine. If it is a mostly write-intensive database, it would most likely be very bad, unless you have a very light load.

> But is it better to do RAID-1 or RAID-5. I can't

I recommend RAID-10 for the database files. You can do multiple raid levels for different disks (system on raid-1, DB on raid-10, etc).

> figure out why RAID-1 would be better than RAID-5. I understand that with
> RAID-5, a single database write might translate into writing 2 blocks (a
> data block and a parity block). But doesn't RAID-1 *always* do an extra
> write for every data block written?

RAID-1 always does a write to both disks, so it is slow writing. But it can then read from either or both disks, so it is up to twice as fast as a single disk.

RAID-5 is slow because it has to split the data into N pieces, calculate the parity, then write out N+1 writes (1 each to N+1 disks). But because it writes the N+1 in parallel, and reads the N in parallel, it is rather fast especially at reads...

RAID-10 does a combination (split and stripe across disks similar to RAID-5, but at the same time mirror it like RAID-1 across stripes). It is the most robust version (as far as disk loss goes), and is often the fastest, though as always that depends on your workload and your setup.

You could use any of the RAID levels 1, 5, 6, or 10... Which is _best_ depends on your budget and your workload...

To properly set this up, you need to know your workload... How much data? Mostly read or mostly write or a good read/write mix? Large data requests or small data requests? Stuff like that can have a big impact on which disk layout is best...

-- Eric Rostetter The Department of Physics The University of Texas at Austin Go Longhorns!
Re: RAID-5 and database servers
Sorry, I just replied to one message and not to the group. Anyway, for you Matt: mostly they recommend RAID 1+0 or RAID 1 plus hot spares, or RAID 5 with more than 4 disks (does that have to do with smaller data blocks per stripe, or with the higher throughput?); otherwise performance can be very bad. At least that was in the recommendations for my Postgres server.

Cheers.

On Tuesday 09-03-2010 at 23:23 [timezone +], Jefferson Ogata wrote:
> On 2010-03-09 23:12, Matt Domsch wrote:
> > On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
> >> Has anyone configured a database server with RAID-5? Is it really a bad
> >> idea
> >> to do so? I asked last month for tips on configuring a DB server. I have
> >> around $6K to spend. I am pretty much settled on getting 2 quad-core CPUs
> >> and 32 Gb of RAM. But I'm still ignorant in terms of what to get for disk.
> >> 1500 RPM, I know that. But is it better to do RAID-1 or RAID-5. I can't
> >> figure out why RAID-1 would be better than RAID-5. I understand that with
> >> RAID-5, a single database write might translate into writing 2 blocks (a
> >> data block and a parity block). But doesn't RAID-1 *always* do an extra
> >> write for every data block written?
> >
> > RAID 5's problem isn't the extra write. It's that to write a hunk
> > that's not a whole stripe width (64k * (num_drives - 1)) it has to
> > first read a whole stripe (num_drives-1), calculate the parity, and
> > then write to 2 disks.
>
> Not really. It can recalculate parity for a single block using the
> parity block, the new block, and the block it is about to overwrite,
> regardless of how many disks are in the stripe.
>
> In any case, RAID 5 (and even RAID 6) implementations are extremely fast
> nowadays. What you should do is ask your database vendor what numbers
> they expect in terms of read and write bandwidth and IOPS in order to
> achieve your performance objective, and then use iozone or similar tools
> to benchmark the RAID configurations you are considering.
>
> Note that using direct I/O and/or asynchronous I/O may have a large
> impact on performance, as well as available memory. The RAID level may
> be essentially insignificant compared to these factors. Just try to make
> your RAID block size equal to the database block size. And align your
> partitions to the block size as well, or don't use partitions at all.
> See this essay for further info on the latter:
>
> http://insights.oetiker.ch/linux/raidoptimization/
Re: RAID-5 and database servers
Hi, On Tue, 9 Mar 2010, Matt Domsch wrote: > On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote: >> Has anyone configured a database server with RAID-5? Is it really a bad idea >> to do so? I asked last month for tips on configuring a DB server. I have >> around $6K to spend. I am pretty much settled on getting 2 quad-core CPUs >> and 32 Gb of RAM. But I'm still ignorant in terms of what to get for disk. >> 1500 RPM, I know that. But is it better to do RAID-1 or RAID-5. I can't >> figure out why RAID-1 would be better than RAID-5. I understand that with >> RAID-5, a single database write might translate into writing 2 blocks (a >> data block and a parity block). But doesn't RAID-1 *always* do an extra >> write for every data block written? > > RAID 5's problem isn't the extra write. It's that to write a hunk > that's not a whole stripe width (64k * (num_drives - 1)) it has to > first read a whole stripe (num_drives-1), calculate the parity, and > then write to 2 disks. It is not this bad. Raid5 needs to read the old block and the parity block, to "calculate out" the old block from the parity, to "calculate in" the new block into parity, write the new block, write the parity block. So a raid5 write does 2 reads from different disks and two writes to different disks, plus calculation. A raid1 write just does 2 writes to different disks. Raid1 can have a double throughput advantage during read - just using disk1 AND disk2 - if the controller supports it. Viele Gruesse Eberhard Moenkeberg (emoe...@gwdg.de, e...@kki.org) -- Eberhard Moenkeberg Arbeitsgruppe IT-Infrastruktur E-Mail: emoe...@gwdg.de Tel.: +49 (0)551 201-1551 - Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG) Am Fassberg 11, 37077 Goettingen URL:http://www.gwdg.de E-Mail: g...@gwdg.de Tel.: +49 (0)551 201-1510Fax:+49 (0)551 201-2150 Geschaeftsfuehrer: Prof. Dr. Bernhard Neumair Aufsichtsratsvorsitzender: Dipl.-Kfm. 
Markus Hoppe Sitz der Gesellschaft: Goettingen Registergericht: Goettingen Handelsregister-Nr. B 598
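The "calculate out, calculate in" step described in this message is plain XOR arithmetic. Here is a small sketch of it on a toy 4-disk stripe; the block contents and the helper name are made up, and a real controller works on much larger blocks in hardware or firmware.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data blocks plus one parity block: a 4-disk RAID 5 stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(*data)

# Small write to block 1: read the old data and the old parity (2 reads) ...
old_block, new_block = data[1], b"bbbb"
# ... "calculate out" the old block and "calculate in" the new one ...
new_parity = xor_blocks(parity, old_block, new_block)
# ... then write the new data and the new parity (2 writes).
data[1] = new_block

# The shortcut gives the same parity as recomputing over the whole stripe.
assert new_parity == xor_blocks(*data)

# The same XOR rebuilds a lost block from the survivors plus parity.
assert xor_blocks(data[0], data[2], new_parity) == data[1]
```

The untouched blocks (0 and 2 here) are never read during the update, which is why the cost does not grow with the number of disks in the stripe.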
Re: RAID-5 and database servers
On 2010-03-09 23:12, Matt Domsch wrote: > On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote: >> Has anyone configured a database server with RAID-5? Is it really a bad idea >> to do so? I asked last month for tips on configuring a DB server. I have >> around $6K to spend. I am pretty much settled on getting 2 quad-core CPUs >> and 32 Gb of RAM. But I'm still ignorant in terms of what to get for disk. >> 1500 RPM, I know that. But is it better to do RAID-1 or RAID-5. I can't >> figure out why RAID-1 would be better than RAID-5. I understand that with >> RAID-5, a single database write might translate into writing 2 blocks (a >> data block and a parity block). But doesn't RAID-1 *always* do an extra >> write for every data block written? > > RAID 5's problem isn't the extra write. It's that to write a hunk > that's not a whole stripe width (64k * (num_drives - 1)) it has to > first read a whole stripe (num_drives-1), calculate the parity, and > then write to 2 disks. Not really. It can recalculate parity for a single block using the parity block, the new block, and the block it is about to overwrite, regardless of how many disks are in the stripe. In any case, RAID 5 (and even RAID 6) implementations are extremely fast nowadays. What you should do is ask your database vendor what numbers they expect in terms of read and write bandwidth and IOPS in order to achieve your performance objective, and then use iozone or similar tools to benchmark the RAID configurations you are considering. Note that using direct I/O and/or asynchronous I/O may have a large impact on performance, as well as available memory. The RAID level may be essentially insignificant compared to these factors. Just try to make your RAID block size equal to the database block size. And align your partitions to the block size as well, or don't use partitions at all. 
See this essay for further info on the latter: http://insights.oetiker.ch/linux/raidoptimization/
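Whether a partition is aligned to the RAID block size is easy to check from its starting sector. A minimal sketch, assuming 512-byte sectors and the 64k stripe element mentioned earlier in the thread (both are parameters you should replace with your own values), and assuming the stripe size is a multiple of the sector size:

```python
def is_aligned(start_sector, stripe_bytes, sector_bytes=512):
    """True if the partition's first byte falls on a stripe boundary."""
    return (start_sector * sector_bytes) % stripe_bytes == 0


def next_aligned_sector(start_sector, stripe_bytes, sector_bytes=512):
    """First sector at or after start_sector that sits on a boundary."""
    sectors_per_stripe = stripe_bytes // sector_bytes
    # Ceiling division, then back to a sector number.
    return -(-start_sector // sectors_per_stripe) * sectors_per_stripe


STRIPE = 64 * 1024  # assumed 64 KiB stripe element

for start in (63, 128, 2048):
    print(start, "aligned" if is_aligned(start, STRIPE) else "misaligned")
print("move sector 63 up to", next_aligned_sector(63, STRIPE))
```

Sector 63, the traditional DOS partition start, is misaligned for any power-of-two stripe size, so a single database block can straddle two stripe elements.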
Re: RAID-5 and database servers
On Tue, Mar 09, 2010 at 04:54:44PM -0600, John G. Heim wrote:
> Has anyone configured a database server with RAID-5? Is it really a bad idea
> to do so? I asked last month for tips on configuring a DB server. I have
> around $6K to spend. I am pretty much settled on getting 2 quad-core CPUs
> and 32 Gb of RAM. But I'm still ignorant in terms of what to get for disk.
> 1500 RPM, I know that. But is it better to do RAID-1 or RAID-5. I can't
> figure out why RAID-1 would be better than RAID-5. I understand that with
> RAID-5, a single database write might translate into writing 2 blocks (a
> data block and a parity block). But doesn't RAID-1 *always* do an extra
> write for every data block written?

RAID 5's problem isn't the extra write. It's that to write a hunk that's not a whole stripe width (64k * (num_drives - 1)) it has to first read a whole stripe (num_drives-1), calculate the parity, and then write to 2 disks.

--
Matt Domsch
Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
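To put rough numbers on the partial-stripe write described above, the sketch below counts disk operations for a one-chunk write. The 64k chunk size comes from the message; everything else is an assumption for illustration: the function names are hypothetical, the model ignores controller cache, and the "reconstruct" case assumes the chunk being overwritten does not need to be read, so it counts num_drives - 2 reads rather than the whole stripe. A read-modify-write strategy and a two-disk RAID 1 mirror are included for comparison.

```python
CHUNK_KIB = 64  # per-disk chunk size quoted in the message


def stripe_width_kib(num_drives: int, chunk_kib: int = CHUNK_KIB) -> int:
    """Data held by one RAID 5 stripe: one chunk per data disk."""
    return chunk_kib * (num_drives - 1)


def reconstruct_write_ios(num_drives: int) -> dict:
    """Read the untouched data chunks, recompute parity from scratch,
    then write the new data chunk and the new parity chunk."""
    return {"reads": num_drives - 2, "writes": 2}


def read_modify_write_ios(num_drives: int) -> dict:
    """Read old data and old parity, then write new data and new parity.
    The count does not depend on num_drives."""
    return {"reads": 2, "writes": 2}


def raid1_write_ios() -> dict:
    """A two-disk mirror writes the same chunk to both disks, no reads."""
    return {"reads": 0, "writes": 2}


for n in (3, 5, 8):
    print(
        f"{n} drives: stripe width {stripe_width_kib(n)} KiB, "
        f"reconstruct {reconstruct_write_ios(n)}, "
        f"read-modify-write {read_modify_write_ios(n)}, "
        f"RAID 1 {raid1_write_ios()}"
    )
```

Under this simple model both RAID levels issue two writes for a small update; the difference is the reads RAID 5 must do first, and how that read count grows (or not) with the number of drives depends on the strategy the implementation picks.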
RAID-5 and database servers
Has anyone configured a database server with RAID-5? Is it really a bad idea
to do so? I asked last month for tips on configuring a DB server. I have
around $6K to spend. I am pretty much settled on getting 2 quad-core CPUs
and 32 Gb of RAM. But I'm still ignorant in terms of what to get for disk.
1500 RPM, I know that. But is it better to do RAID-1 or RAID-5. I can't
figure out why RAID-1 would be better than RAID-5. I understand that with
RAID-5, a single database write might translate into writing 2 blocks (a
data block and a parity block). But doesn't RAID-1 *always* do an extra
write for every data block written?