Re: [PERFORM] hardware advice
- Original Message -
From: David Boreham david_l...@boreham.org
To: pgsql-performance@postgresql.org
Sent: Tuesday, 2 October 2012, 16:14
Subject: Re: [PERFORM] hardware advice

On 10/2/2012 2:20 AM, Glyn Astill wrote:
newer R910s recently all of a sudden went dead to the world; no prior symptoms showing in our hardware and software monitoring, no errors in the OS logs, nothing in the Dell DRAC logs. After a hard reset it's back up as if nothing happened, and I'm none the wiser as to the cause. Not good for peace of mind.

This could be an OS bug rather than a hardware problem.

Yeah, actually I'm leaning towards this being a specific bug in the Linux kernel. Everything else I said still stands though.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
From: M. D. li...@turnkey.bz To: pgsql-performance@postgresql.org Sent: Friday, 28 September 2012, 18:33 Subject: Re: [PERFORM] hardware advice On 09/28/2012 09:57 AM, David Boreham wrote: On 9/28/2012 9:46 AM, Craig James wrote: Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf. This is what I did for years, but after taking my old parts collection to the landfill a few times, realized I may as well just buy N+1 machines and keep zero spares on the shelf. That way I get a spare machine available for use immediately, and I know the parts are working (parts on the shelf may be defective). If something breaks, I use the spare machine until the replacement parts arrive. Note in addition that a warranty can be extremely useful in certain organizations as a vehicle of blame avoidance (this may be its primary purpose in fact). If I buy a bunch of machines that turn out to have buggy NICs, well that's my fault and I can kick myself since I own the company, stay up late into the night reading kernel code, and buy new NICs. If I have an evil Dilbertian boss, then well...I'd be seriously thinking about buying Dell boxes in order to blame Dell rather than myself, and be able to say everything is warrantied if badness goes down. Just saying... I'm kinda in the latter shoes. Dell is the only thing that is trusted in my organisation. If I would build my own, I would be fully blamed for anything going wrong in the next 3 years. Thanks everyone for your input. Now my final choice will be if my budget allows for the latest and fastest, else I'm going for the x5690. I don't have hundreds of users, so I think the x5690 should do a pretty good job handling the load. 

Having plenty of experience with Dell, I'd urge you to reconsider. All the Dell servers we've had have arrived hideously misconfigured, and tech support gets you nowhere. Once we've rejigged the hardware ourselves, maybe replacing a part or two, they've performed okay.

Reliability has been okay; however, one of our newer R910s recently all of a sudden went dead to the world; no prior symptoms showing in our hardware and software monitoring, no errors in the OS logs, nothing in the Dell DRAC logs. After a hard reset it's back up as if nothing happened, and I'm none the wiser as to the cause. Not good for peace of mind.

Look around and find another vendor, even if your company has to pay more for you to have that blame avoidance.
Re: [PERFORM] hardware advice - opinions about HP?
From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Glyn Astill Sent: Tuesday, October 02, 2012 4:21 AM To: M. D.; pgsql-performance@postgresql.org Subject: Re: [PERFORM] hardware advice From: M. D. li...@turnkey.bz To: pgsql-performance@postgresql.org Sent: Friday, 28 September 2012, 18:33 Subject: Re: [PERFORM] hardware advice On 09/28/2012 09:57 AM, David Boreham wrote: On 9/28/2012 9:46 AM, Craig James wrote: Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf. This is what I did for years, but after taking my old parts collection to the landfill a few times, realized I may as well just buy N+1 machines and keep zero spares on the shelf. That way I get a spare machine available for use immediately, and I know the parts are working (parts on the shelf may be defective). If something breaks, I use the spare machine until the replacement parts arrive. Note in addition that a warranty can be extremely useful in certain organizations as a vehicle of blame avoidance (this may be its primary purpose in fact). If I buy a bunch of machines that turn out to have buggy NICs, well that's my fault and I can kick myself since I own the company, stay up late into the night reading kernel code, and buy new NICs. If I have an evil Dilbertian boss, then well...I'd be seriously thinking about buying Dell boxes in order to blame Dell rather than myself, and be able to say everything is warrantied if badness goes down. Just saying... I'm kinda in the latter shoes. Dell is the only thing that is trusted in my organisation. If I would build my own, I would be fully blamed for anything going wrong in the next 3 years. 
Thanks everyone for your input. Now my final choice will be if my budget allows for the latest and fastest, else I'm going for the x5690. I don't have hundreds of users, so I think the x5690 should do a pretty good job handling the load.

Having plenty of experience with Dell, I'd urge you to reconsider. All the Dell servers we've had have arrived hideously misconfigured, and tech support gets you nowhere. Once we've rejigged the hardware ourselves, maybe replacing a part or two, they've performed okay. Reliability has been okay; however, one of our newer R910s recently all of a sudden went dead to the world; no prior symptoms showing in our hardware and software monitoring, no errors in the OS logs, nothing in the Dell DRAC logs. After a hard reset it's back up as if nothing happened, and I'm none the wiser as to the cause. Not good for peace of mind. Look around and find another vendor, even if your company has to pay more for you to have that blame avoidance.

We're currently using Dell and have had enough problems to think about switching. What about HP?

Dan Franklin
Re: [PERFORM] hardware advice - opinions about HP?
On Tue, Oct 2, 2012 at 10:51:46AM -0400, Franklin, Dan (FEN) wrote:
Look around and find another vendor, even if your company has to pay more for you to have that blame avoidance.

We're currently using Dell and have had enough problems to think about switching. What about HP?

If you need a big vendor, I think HP is a good choice.

--
Bruce Momjian br...@momjian.us http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Re: [PERFORM] hardware advice
On 10/2/2012 2:20 AM, Glyn Astill wrote:
newer R910s recently all of a sudden went dead to the world; no prior symptoms showing in our hardware and software monitoring, no errors in the OS logs, nothing in the Dell DRAC logs. After a hard reset it's back up as if nothing happened, and I'm none the wiser as to the cause. Not good for peace of mind.

This could be an OS bug rather than a hardware problem.
Re: [PERFORM] hardware advice - opinions about HP?
On Tue, Oct 2, 2012 at 9:14 AM, Bruce Momjian br...@momjian.us wrote:
On Tue, Oct 2, 2012 at 10:51:46AM -0400, Franklin, Dan (FEN) wrote:
We're currently using Dell and have had enough problems to think about switching. What about HP?

If you need a big vendor, I think HP is a good choice.

This brings up a point I make sometimes to folks. Big companies can get great treatment from big vendors. When you work somewhere that orders servers by the truckload, you need a vendor who can fill trucks with servers on a day's notice, and send you a hundred different replacement parts the next.

Conversely, if you are a smaller company that orders a dozen or so servers a year, then often a big vendor is not the best match. You're just a drop in the ocean to them. A small vendor is often a much better match here. They can carefully test those two 48-core Opteron servers with 100 drives over a week's time to make sure they work the way you need them to. It might take them four weeks to build a big specialty box, but it will usually get built right and for a decent price. Also the sales people will usually be more knowledgeable about the machines they sell.

Recent job: 20 or fewer servers ordered a year, boutique shop for them (aberdeeninc in this case). Other recent job: 20 or more servers a week. Big reseller (not at liberty to release the name).
Re: [PERFORM] hardware advice
On 09/27/2012 10:22 PM, M. D. wrote:
On 09/27/2012 02:55 PM, Scott Marlowe wrote:
On Thu, Sep 27, 2012 at 2:46 PM, M. D. li...@turnkey.bz wrote:

select item.item_id, item_plu.number, item.description,
  (select number from account where asset_acct = account_id),
  (select number from account where expense_acct = account_id),
  (select number from account where income_acct = account_id),
  (select dept.name from dept where dept.dept_id = item.dept_id) as dept,
  (select subdept.name from subdept where subdept.subdept_id = item.subdept_id) as subdept,
  (select sum(on_hand) from item_change where item_change.item_id = item.item_id) as on_hand,
  (select sum(on_order) from item_change where item_change.item_id = item.item_id) as on_order,
  (select sum(total_cost) from item_change where item_change.item_id = item.item_id) as total_cost
from item
join item_plu on item.item_id = item_plu.item_id and item_plu.seq_num = 0
where item.inactive_on is null
  and exists (select item_num.number from item_num where item_num.item_id = item.item_id)
  and exists (select stocked from item_store where stocked = 'Y' and inactive_on is null and item_store.item_id = item.item_id)

Have you tried re-writing this query first? Is there a reason to have a bunch of subselects instead of joining the tables? What pg version are you running btw? A newer version of pg might help too.

This query is inside an application (Quasar Accounting) written in Qt and I don't have access to the source code.

Is there any prospect of the planner/executor being taught to merge each of those groups of three index scans, to aid this sort of poor query?

--
Jeremy
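For readers wondering what the rewrite Scott suggests might look like if the query could be changed (M. D. says it cannot, since it lives inside the application), here is a minimal sketch. It uses Python's built-in sqlite3 module and a tiny made-up item / item_change data set rather than the real Quasar schema or PostgreSQL, and it only covers the three sum() subselects, which are collapsed into one aggregate joined back to item. The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE item_change (item_id INTEGER, on_hand REAL, on_order REAL, total_cost REAL);
    -- invented sample data, not from the thread
    INSERT INTO item VALUES (1, 'widget'), (2, 'gadget'), (3, 'never stocked');
    INSERT INTO item_change VALUES
        (1, 5, 0, 50.0), (1, -2, 10, -20.0),
        (2, 7, 3, 14.0);
""")

# Shape used by the application: one correlated subselect per summed column.
subselects = """
    SELECT item.item_id,
           (SELECT sum(on_hand) FROM item_change WHERE item_change.item_id = item.item_id),
           (SELECT sum(on_order) FROM item_change WHERE item_change.item_id = item.item_id),
           (SELECT sum(total_cost) FROM item_change WHERE item_change.item_id = item.item_id)
    FROM item
    ORDER BY item.item_id
"""

# Rewritten shape: aggregate item_change once, then LEFT JOIN so that items
# without any item_change rows still appear, with NULL sums as before.
joined = """
    SELECT item.item_id, t.on_hand, t.on_order, t.total_cost
    FROM item
    LEFT JOIN (SELECT item_id,
                      sum(on_hand) AS on_hand,
                      sum(on_order) AS on_order,
                      sum(total_cost) AS total_cost
               FROM item_change
               GROUP BY item_id) AS t ON t.item_id = item.item_id
    ORDER BY item.item_id
"""

rows_subselects = conn.execute(subselects).fetchall()
rows_joined = conn.execute(joined).fetchall()
print(rows_joined)
```

Both statements return the same rows on this toy data. Whether the joined form is actually faster on the real tables would have to be checked with EXPLAIN ANALYZE on PostgreSQL.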
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 03:50:33PM -0500, Shaun Thomas wrote:
On 09/27/2012 03:44 PM, Scott Marlowe wrote:
This 100x this. We used to buy our boxes from aberdeeninc.com and got a 5 year replacement parts warranty included. We spent ~$10k on a server that was right around $18k from Dell for the same numbers and a 3 year warranty.

Whatever you do, go for the Intel ethernet adaptor option. We've had so many headaches with integrated Broadcom NICs. :(

+++1 Sigh.

Ken
Re: [PERFORM] hardware advice
On 9/27/2012 1:56 PM, M. D. wrote:
I'm in Belize, so what I'm considering is from ebay, where it's unlikely that I'll get the warranty. Should I consider some other brand rather? To build my own or buy custom might be an option too, but I would not get any warranty.

Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf.

A warranty is useless if you can't use it in a timely fashion. And you could easily get better reliability by spending the money on spare parts. I'd bet that for the price of a warranty you can buy a spare motherboard, a few spare disks, a memory stick or two, a spare power supply, and maybe even a spare 3WARE RAID controller.

Craig
Re: [PERFORM] hardware advice
On 9/28/2012 9:46 AM, Craig James wrote:
Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf.

This is what I did for years, but after taking my old parts collection to the landfill a few times, I realized I may as well just buy N+1 machines and keep zero spares on the shelf. That way I get a spare machine available for use immediately, and I know the parts are working (parts on the shelf may be defective). If something breaks, I use the spare machine until the replacement parts arrive.

Note in addition that a warranty can be extremely useful in certain organizations as a vehicle of blame avoidance (this may be its primary purpose in fact). If I buy a bunch of machines that turn out to have buggy NICs, well that's my fault and I can kick myself since I own the company, stay up late into the night reading kernel code, and buy new NICs. If I have an evil Dilbertian boss, then well...I'd be seriously thinking about buying Dell boxes in order to blame Dell rather than myself, and be able to say everything is warrantied if badness goes down. Just saying...
Re: [PERFORM] hardware advice
On 09/28/2012 09:57 AM, David Boreham wrote:
On 9/28/2012 9:46 AM, Craig James wrote:
Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf.

This is what I did for years, but after taking my old parts collection to the landfill a few times, realized I may as well just buy N+1 machines and keep zero spares on the shelf. That way I get a spare machine available for use immediately, and I know the parts are working (parts on the shelf may be defective). If something breaks, I use the spare machine until the replacement parts arrive. Note in addition that a warranty can be extremely useful in certain organizations as a vehicle of blame avoidance (this may be its primary purpose in fact). If I buy a bunch of machines that turn out to have buggy NICs, well that's my fault and I can kick myself since I own the company, stay up late into the night reading kernel code, and buy new NICs. If I have an evil Dilbertian boss, then well...I'd be seriously thinking about buying Dell boxes in order to blame Dell rather than myself, and be able to say everything is warrantied if badness goes down. Just saying...

I'm kinda in the latter shoes. Dell is the only thing that is trusted in my organisation. If I would build my own, I would be fully blamed for anything going wrong in the next 3 years.

Thanks everyone for your input. Now my final choice will be if my budget allows for the latest and fastest, else I'm going for the x5690. I don't have hundreds of users, so I think the x5690 should do a pretty good job handling the load.
Re: [PERFORM] hardware advice
On Fri, Sep 28, 2012 at 11:33 AM, M. D. li...@turnkey.bz wrote: On 09/28/2012 09:57 AM, David Boreham wrote: On 9/28/2012 9:46 AM, Craig James wrote: Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf. This is what I did for years, but after taking my old parts collection to the landfill a few times, realized I may as well just buy N+1 machines and keep zero spares on the shelf. That way I get a spare machine available for use immediately, and I know the parts are working (parts on the shelf may be defective). If something breaks, I use the spare machine until the replacement parts arrive. Note in addition that a warranty can be extremely useful in certain organizations as a vehicle of blame avoidance (this may be its primary purpose in fact). If I buy a bunch of machines that turn out to have buggy NICs, well that's my fault and I can kick myself since I own the company, stay up late into the night reading kernel code, and buy new NICs. If I have an evil Dilbertian boss, then well...I'd be seriously thinking about buying Dell boxes in order to blame Dell rather than myself, and be able to say everything is warrantied if badness goes down. Just saying... I'm kinda in the latter shoes. Dell is the only thing that is trusted in my organisation. If I would build my own, I would be fully blamed for anything going wrong in the next 3 years. Thanks everyone for your input. Now my final choice will be if my budget allows for the latest and fastest, else I'm going for the x5690. I don't have hundreds of users, so I think the x5690 should do a pretty good job handling the load. If people in your organization trust Dell, they just haven't dealt with them enough. 
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 4:11 PM, M. D. li...@turnkey.bz wrote:
At this point I'm dealing with a fairly small database of 8 to 9 GB. ... The on_hand lookup table currently has 3 million rows after 4 years of data. ... For both servers I'd have at least 32GB Ram and 4 Hard Drives in raid 10.

For a 9GB database, that amount of RAM seems like overkill to me. Unless you expect to grow a lot faster than you've been growing, or perhaps your middle tier consumes a lot of those 32GB, I don't see the point there.
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 12:11 PM, M. D. li...@turnkey.bz wrote:
Hi everyone, I want to buy a new server, and am contemplating a Dell R710 or the newer R720. The R710 has the x5600 series CPU, while the R720 has the newer E5-2600 series CPU. At this point I'm dealing with a fairly small database of 8 to 9 GB. The server will be dedicated to Postgres and a C++ based middle tier. The longest operations right now are loading the item list (80,000 items) and checking On Hand for an item. The item list does a sum for each item to get OH. The database design is out of my control. The on_hand lookup table currently has 3 million rows after 4 years of data.

My main question is: Will an E5-2660 perform faster than an X5690? I'm leaning to clock speeds because I know doing the sum of those rows is CPU intensive, but have not done extensive research to see if the newer CPUs will outperform the x5690 per clock cycle. Overall the current CPU is hardly busy (after 1 min) - load average: 0.81, 0.46, 0.30, with % never exceeding 50%, but the speed increase is something I'm ready to pay for if it will actually be noticeably faster. I'm comparing the E5-2660 rather than the 2690 because of price. For both servers I'd have at least 32GB Ram and 4 Hard Drives in raid 10.

I don't think you've supplied enough information for anyone to give you a meaningful answer. What's your current configuration? Are you I/O bound, CPU bound, memory limited, or some other problem? You need to do a specific analysis of the queries that are causing you problems (i.e. why do you need to upgrade at all?)

Regarding Dell ... we were disappointed by Dell. They're expensive, they try to lock you in to their service contracts, and (when I bought two) they lock you in to their replacement parts, which cost 2-3x what you can buy from anyone else.

If you're planning to use a RAID 10 configuration, then a BBU cache will make more difference than almost anything else you can do.
I've heard that Dell's current RAID controller is pretty good, but in the past they've re-branded other controllers as Perc XYZ and you couldn't figure out what was really under the covers. RAID controllers are wildly different in performance, and you really want to get only the best.

We use a white box vendor (ASA Computers), and have been very happy with the results. They build exactly what I ask for and deliver it in about a week. They offer on-site service and warranties, but don't pressure me to buy them. I'm not locked in to anything. Their prices are good.

My current configuration is a dual 4-core Intel Xeon 2.13 GHz system with 12GB memory and 12x500GB 7200RPM SATA disks, controlled by a 3WARE RAID controller with a BBU cache. The OS and WAL are on a RAID1 pair, and the Postgres database is on an 8-disk RAID10 array. That leaves two hot spare disks. I get about 7,000 TPS for pg_bench. The chassis has dual hot-swappable power supplies and dual networks for failover. It's in the neighborhood of $5,000.

Craig

Best regards,
Mark
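As a rough aid to the "are you CPU bound?" question above, the sketch below divides the load averages M. D. reported (0.81, 0.46, 0.30) by a core count. The core count of 4 is an assumption, since the post does not state it, and both function names are made up for this illustration. On Linux the load average also counts processes waiting on disk, so this is a crude heuristic and no substitute for looking at the slow queries themselves.

```python
def load_per_core(load_averages, cores):
    """Return each load average divided by the number of cores."""
    return [load / cores for load in load_averages]


def looks_cpu_bound(load_averages, cores, threshold=1.0):
    """Crude heuristic: every load average is at or above `threshold` per core."""
    return all(load / cores >= threshold for load in load_averages)


reported = (0.81, 0.46, 0.30)  # 1, 5 and 15 minute figures from the original post
assumed_cores = 4              # assumption: the core count was not stated in the post

print(load_per_core(reported, assumed_cores))
print(looks_cpu_bound(reported, assumed_cores))
```

With these inputs the machine is far from saturated overall, which fits the remark that the CPU is "hardly busy". A single long query can still be limited by the speed of one core even when the box as a whole is mostly idle.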
Re: [PERFORM] hardware advice
On 09/27/2012 01:22 PM, Claudio Freire wrote:
On Thu, Sep 27, 2012 at 4:11 PM, M. D. li...@turnkey.bz wrote:
At this point I'm dealing with a fairly small database of 8 to 9 GB. ... The on_hand lookup table currently has 3 million rows after 4 years of data. ... For both servers I'd have at least 32GB Ram and 4 Hard Drives in raid 10.

For a 9GB database, that amount of RAM seems like overkill to me. Unless you expect to grow a lot faster than you've been growing, or perhaps your middle tier consumes a lot of those 32GB, I don't see the point there.

The middle tier does caching and can easily take up to 10GB of RAM, therefore I'm buying more.
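The sizing argument above is simple arithmetic, sketched here with the figures from the thread: a database of roughly 9 GB, a middle tier that can take up to 10 GB, and either the planned 32 GB or the 16 GB of the current server. The function name is hypothetical, and the calculation deliberately ignores the OS, per-connection memory and future growth, so treat it as a back-of-envelope check only.

```python
def memory_headroom_gb(total_ram_gb, database_gb, middle_tier_gb):
    """RAM left over if the whole database and the middle tier cache sit in memory."""
    return total_ram_gb - database_gb - middle_tier_gb


planned = memory_headroom_gb(32, 9, 10)  # proposed new server
current = memory_headroom_gb(16, 9, 10)  # current 16 GB server

print(planned)  # 13
print(current)  # -3
```

On these numbers the current machine cannot hold both the database and the middle tier cache in RAM, while 32 GB leaves some room to grow.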
Re: [PERFORM] hardware advice
On 9/27/2012 1:11 PM, M. D. wrote:
I want to buy a new server, and am contemplating a Dell R710 or the newer R720. The R710 has the x5600 series CPU, while the R720 has the newer E5-2600 series CPU.

For this the best data I've found (excepting actually running tests on the physical hardware) is to use the SpecIntRate2006 numbers, which can be found for both machines on the spec.org web site. I think the newer CPU is the clear winner with a specintrate performance of 589 vs 432. It also has a significantly larger cache. Comparing single-threaded performance, the older CPU is slightly faster (50 vs 48). That wouldn't be a big enough difference to make me pick it. The Sandy Bridge-based machine will likely use less power.

http://www.spec.org/cpu2006/results/res2012q2/cpu2006-20120604-22697.html
http://www.spec.org/cpu2006/results/res2012q1/cpu2006-20111219-19272.html

To find more results use this page: http://www.spec.org/cgi-bin/osgresults?conf=cpu2006;op=form (enter R710 or R720 in the system field).
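To put the quoted SPEC figures in relative terms, here is a small worked calculation. It uses only the four numbers given above (589 vs 432 for throughput, 48 vs 50 single-threaded); the variable names are made up for this sketch.

```python
rate_new, rate_old = 589, 432    # SPECint rate figures quoted above (newer vs older CPU)
single_new, single_old = 48, 50  # single-threaded figures quoted above

throughput_gain = rate_new / rate_old - 1
single_thread_change = single_new / single_old - 1

print(f"throughput: {throughput_gain:+.1%}")
print(f"single thread: {single_thread_change:+.1%}")
```

So the newer box offers roughly a third more aggregate throughput, while a single query running on one core would see about a 4% drop, which matters for a workload dominated by one long-running query.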
Re: [PERFORM] hardware advice
On 9/27/2012 1:37 PM, Craig James wrote:
We use a white box vendor (ASA Computers), and have been very happy with the results. They build exactly what I ask for and deliver it in about a week. They offer on-site service and warranties, but don't pressure me to buy them. I'm not locked in to anything. Their prices are good.

I'll second that: we build our own machines from white-label parts for typically less than 1/2 the Dell list price. However, Dell does provide value to some people: for example you can point a third-party software vendor at a Dell box and demand they make their application work properly whereas they may turn their nose up at a white label box. Same goes for Operating Systems: we have spent much time debugging Linux kernel issues on white box hardware. On Dell hardware we would most likely have not hit those bugs because Red Hat tests on Dell. So YMMV...
Re: [PERFORM] hardware advice
On 09/27/2012 01:47 PM, David Boreham wrote:
On 9/27/2012 1:37 PM, Craig James wrote:
We use a white box vendor (ASA Computers), and have been very happy with the results. They build exactly what I ask for and deliver it in about a week. They offer on-site service and warranties, but don't pressure me to buy them. I'm not locked in to anything. Their prices are good.

I'll second that: we build our own machines from white-label parts for typically less than 1/2 the Dell list price. However, Dell does provide value to some people: for example you can point a third-party software vendor at a Dell box and demand they make their application work properly whereas they may turn their nose up at a white label box. Same goes for Operating Systems: we have spent much time debugging Linux kernel issues on white box hardware. On Dell hardware we would most likely have not hit those bugs because Red Hat tests on Dell. So YMMV...

I'm in Belize, so what I'm considering is from ebay, where it's unlikely that I'll get the warranty. Should I consider some other brand rather? To build my own or buy custom might be an option too, but I would not get any warranty.

Dell does sales directly to Belize, but the price is so much higher than US prices that it's hardly worth the support/warranty.
Re: [PERFORM] hardware advice
On 9/27/2012 1:56 PM, M. D. wrote:
I'm in Belize, so what I'm considering is from ebay, where it's unlikely that I'll get the warranty. Should I consider some other brand rather? To build my own or buy custom might be an option too, but I would not get any warranty.

I don't have any recent experience with white label system vendors, but I suspect they are assembling machines from supermicro, asus, intel or tyan motherboards and enclosures, which is what we do. You can buy the hardware from suppliers such as newegg.com. It takes some time to read the manufacturer's documentation, figure out what kind of memory to buy and so on, which is basically what you're paying a white label box seller to do for you.

For example here's a similar barebones system to the R720 I found with a couple minutes searching on newegg.com: http://www.newegg.com/Product/Product.aspx?Item=N82E16816117259 You could order that SKU, plus the two CPU devices, however many memory sticks you need, and drives. If you need less RAM (the Dell box allows up to 24 sticks) there are probably cheaper options.

The equivalent Supermicro box looks to be somewhat less expensive: http://www.newegg.com/Product/Product.aspx?Item=N82E16816101693

When you consider downtime and the cost to ship equipment back to the supplier, a warranty doesn't have much value to me but it may be useful in your situation.
Re: [PERFORM] hardware advice
On Thursday, September 27, 2012 02:13:01 PM David Boreham wrote:
The equivalent Supermicro box looks to be somewhat less expensive: http://www.newegg.com/Product/Product.aspx?Item=N82E16816101693 When you consider downtime and the cost to ship equipment back to the supplier, a warranty doesn't have much value to me but it may be useful in your situation.

And you can probably buy 2 Supermicros for the cost of the Dell. 100% spares.
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 2:31 PM, Alan Hodgson ahodg...@simkin.ca wrote:
On Thursday, September 27, 2012 02:13:01 PM David Boreham wrote:
The equivalent Supermicro box looks to be somewhat less expensive: http://www.newegg.com/Product/Product.aspx?Item=N82E16816101693 When you consider downtime and the cost to ship equipment back to the supplier, a warranty doesn't have much value to me but it may be useful in your situation.

And you can probably buy 2 Supermicros for the cost of the Dell. 100% spares.

This 100x this. We used to buy our boxes from aberdeeninc.com and got a 5 year replacement parts warranty included. We spent ~$10k on a server that was right around $18k from Dell for the same numbers and a 3 year warranty.
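One way to read the price comparison above is as cost per year of warranty coverage. The sketch below uses the approximate figures quoted (about $10k with 5 years, about $18k with 3 years); the function name is made up here, and the result says nothing about service quality or response times.

```python
def cost_per_warranty_year(price_usd, warranty_years):
    """Purchase price spread evenly over the years of included warranty."""
    return price_usd / warranty_years


aberdeen = cost_per_warranty_year(10_000, 5)  # approximate figures from the post
dell = cost_per_warranty_year(18_000, 3)

print(aberdeen, dell, dell / aberdeen)  # 2000.0 6000.0 3.0
```

On those rough numbers the Dell box costs about three times as much per covered year.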
Re: [PERFORM] hardware advice
On 09/27/2012 01:37 PM, Craig James wrote:
I don't think you've supplied enough information for anyone to give you a meaningful answer. What's your current configuration? Are you I/O bound, CPU bound, memory limited, or some other problem? You need to do a specific analysis of the queries that are causing you problems (i.e. why do you need to upgrade at all?)

My current configuration is a Dell PE 1900, E5335, 16GB Ram, 2 250GB Raid 0. I'm buying a new server mostly because the current one is a bit slow and I need a new gateway server, so to get faster database responses, I want to upgrade this and use the old one for gateway. The current system is limited to 16GB Ram, so it is basically maxed out.

A query that takes 89 seconds right now is run on a regular basis (82,000 rows):

select item.item_id, item_plu.number, item.description,
  (select number from account where asset_acct = account_id),
  (select number from account where expense_acct = account_id),
  (select number from account where income_acct = account_id),
  (select dept.name from dept where dept.dept_id = item.dept_id) as dept,
  (select subdept.name from subdept where subdept.subdept_id = item.subdept_id) as subdept,
  (select sum(on_hand) from item_change where item_change.item_id = item.item_id) as on_hand,
  (select sum(on_order) from item_change where item_change.item_id = item.item_id) as on_order,
  (select sum(total_cost) from item_change where item_change.item_id = item.item_id) as total_cost
from item
join item_plu on item.item_id = item_plu.item_id and item_plu.seq_num = 0
where item.inactive_on is null
  and exists (select item_num.number from item_num where item_num.item_id = item.item_id)
  and exists (select stocked from item_store where stocked = 'Y' and inactive_on is null and item_store.item_id = item.item_id)

Explain analyse: http://explain.depesz.com/s/sGq
Re: [PERFORM] hardware advice
On 09/27/2012 03:44 PM, Scott Marlowe wrote: This 100x this. We used to buy our boxes from aberdeeninc.com and got a 5 year replacement parts warranty included. We spent ~$10k on a server that was right around $18k from dell for the same numbers and a 3 year warranty. Whatever you do, go for the Intel ethernet adaptor option. We've had so many headaches with integrated broadcom NICs. :( -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 09/27/2012 02:40 PM, David Boreham wrote: I think the newer CPU is the clear winner with a specintrate performance of 589 vs 432. The comparisons you linked to had 24 absolute threads pitted against 32, since the newer CPUs have a higher maximum cores per CPU. That said, you're right that it has a fairly large cache. And from my experience, Intel CPU generations have been scaling incredibly well lately. (Opteron, we hardly knew ye!) We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography. Me? I'm waiting for Haswell, the next tock in Intel's Tick-Tock strategy. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 2:46 PM, M. D. li...@turnkey.bz wrote: select item.item_id,item_plu.number,item.description, (select number from account where asset_acct = account_id), (select number from account where expense_acct = account_id), (select number from account where income_acct = account_id), (select dept.name from dept where dept.dept_id = item.dept_id) as dept, (select subdept.name from subdept where subdept.subdept_id = item.subdept_id) as subdept, (select sum(on_hand) from item_change where item_change.item_id = item.item_id) as on_hand, (select sum(on_order) from item_change where item_change.item_id = item.item_id) as on_order, (select sum(total_cost) from item_change where item_change.item_id = item.item_id) as total_cost from item join item_plu on item.item_id = item_plu.item_id and item_plu.seq_num = 0 where item.inactive_on is null and exists (select item_num.number from item_num where item_num.item_id = item.item_id) and exists (select stocked from item_store where stocked = 'Y' and inactive_on is null and item_store.item_id = item.item_id) Have you tried re-writing this query first? Is there a reason to have a bunch of subselects instead of joining the tables? What pg version are you running btw? A newer version of pg might help too. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 09/27/2012 03:55 PM, Scott Marlowe wrote: Have you tried re-writing this query first? Is there a reason to have a bunch of subselects instead of joining the tables? What pg version are you running btw? A newer version of pg might help too. Wow, yeah. I was just about to say something about that. I even pasted it into a notepad and started cutting it apart, but I wasn't sure about enough of the column sources in all those subqueries. It looks like it'd be a very, very good candidate for a window function or two, and maybe a few CASE statements. But I'm about 80% certain it's not very efficient as is. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
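To make the "joins instead of subselects" suggestion concrete, here is a minimal sketch of the idea. Everything in it is an assumption for illustration: it uses Python's built-in sqlite3 module rather than PostgreSQL, a trimmed-down version of the tables named in the query (item, dept, item_change), and made-up rows. It only shows that the correlated subselects and a join against a pre-aggregated item_change return the same rows; it says nothing about which form the PostgreSQL planner would run faster.

```python
import sqlite3

# Hypothetical, trimmed-down schema and data; not the real Quasar tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, description TEXT, dept_id INTEGER);
    CREATE TABLE item_change (item_id INTEGER, on_hand REAL, on_order REAL, total_cost REAL);
    INSERT INTO dept VALUES (1, 'Grocery'), (2, 'Hardware');
    INSERT INTO item VALUES (10, 'Flour', 1), (11, 'Hammer', 2), (12, 'Nails', 2);
    INSERT INTO item_change VALUES (10, 5, 0, 12.5), (10, -2, 10, -5.0), (11, 3, 1, 30.0);
""")

# Shape of the original query: one correlated subselect per output column.
SUBSELECT_FORM = """
    SELECT item.item_id, item.description,
           (SELECT dept.name FROM dept WHERE dept.dept_id = item.dept_id) AS dept,
           (SELECT sum(on_hand) FROM item_change
             WHERE item_change.item_id = item.item_id) AS on_hand,
           (SELECT sum(on_order) FROM item_change
             WHERE item_change.item_id = item.item_id) AS on_order,
           (SELECT sum(total_cost) FROM item_change
             WHERE item_change.item_id = item.item_id) AS total_cost
    FROM item
    ORDER BY item.item_id
"""

# Join form: item_change is aggregated once per item_id, then joined.
# LEFT JOINs keep items with no dept or no changes, matching the NULLs
# the subselects produce. This relies on dept_id being unique in dept.
JOIN_FORM = """
    SELECT item.item_id, item.description, dept.name AS dept,
           chg.on_hand, chg.on_order, chg.total_cost
    FROM item
    LEFT JOIN dept ON dept.dept_id = item.dept_id
    LEFT JOIN (SELECT item_id,
                      sum(on_hand) AS on_hand,
                      sum(on_order) AS on_order,
                      sum(total_cost) AS total_cost
               FROM item_change
               GROUP BY item_id) AS chg ON chg.item_id = item.item_id
    ORDER BY item.item_id
"""

subselect_rows = conn.execute(SUBSELECT_FORM).fetchall()
join_rows = conn.execute(JOIN_FORM).fetchall()

for row in join_rows:
    print(row)
print("same result:", subselect_rows == join_rows)
```

The join form reads item_change once instead of three times per item, which is the reason such a rewrite is usually worth testing. Whether it helps in practice has to be checked with EXPLAIN ANALYZE on the real data, and in this thread the query text is generated by the application, so a rewrite may not be possible at all.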
Re: [PERFORM] hardware advice
On 9/27/2012 2:55 PM, Scott Marlowe wrote: Whatever you do, go for the Intel ethernet adaptor option. We've had so many headaches with integrated broadcom NICs.:( Sound advice, but not a get out of jail card unfortunately : we had a horrible problem with the Intel e1000 driver in RHEL for several releases. Finally diagnosed it just as RH shipped a fixed driver. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 9/27/2012 2:47 PM, Shaun Thomas wrote: On 09/27/2012 02:40 PM, David Boreham wrote: I think the newer CPU is the clear winner with a specintrate performance of 589 vs 432. The comparisons you linked to had 24 absolute threads pitted against 32, since the newer CPUs have a higher maximum cores per CPU. That said, you're right that it has a fairly large cache. And from my experience, Intel CPU generations have been scaling incredibly well lately. (Opteron, we hardly knew ye!) Yes, the rate spec test uses all the available cores. I'm assuming a concurrent workload, but since the single-thread performance isn't that much different between the two I think the higher number of cores, larger cache, newer design CPU is the best choice. We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography. We use Opteron on a price/performance basis. Intel always seems to come up with some way to make their low-cost processors useless (such as limiting the amount of memory they can address). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
Hello, from benchmarking on my read-only in-memory database, I can tell that 9.1 on an X5650 is faster than 9.2 on an E5-2440. I do not have an X5690, but I have a lightly loaded E5-2660. If you can give me a dump and some queries, I can bench them. Nevertheless, the X5690 seems more efficient on a single-threaded workload than the E5-2660, unless you have many clients.
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 6:08 PM, David Boreham david_l...@boreham.org wrote: We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography. We use Opteron on a price/performance basis. Intel always seems to come up with some way to make their low-cost processors useless (such as limiting the amount of memory they can address). Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So, great if you have a high concurrent workload, quite bad if you don't. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 2:50 PM, Shaun Thomas stho...@optionshouse.com wrote: On 09/27/2012 03:44 PM, Scott Marlowe wrote: This 100x this. We used to buy our boxes from aberdeeninc.com and got a 5 year replacement parts warranty included. We spent ~$10k on a server that was right around $18k from dell for the same numbers and a 3 year warranty. Whatever you do, go for the Intel ethernet adaptor option. We've had so many headaches with integrated broadcom NICs. :( I too have had problems with broadcom, as well as with nvidia nics and most other built in nics on servers. The Intel PCI dual nic cards have been my savior in the past. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Thursday, September 27, 2012 03:04:51 PM David Boreham wrote: On 9/27/2012 2:55 PM, Scott Marlowe wrote: Whatever you do, go for the Intel ethernet adaptor option. We've had so many headaches with integrated broadcom NICs.:( Sound advice, but not a get out of jail card unfortunately : we had a horrible problem with the Intel e1000 driver in RHEL for several releases. Finally diagnosed it just as RH shipped a fixed driver. Yeah I've been compiling a newer one on each kernel release for a couple of years. But the hardware rocks. The Supermicro boxes also mostly have Intel network onboard, so not a problem there. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 09/27/2012 04:08 PM, Evgeny Shishkin wrote: from benchmarking on my r/o in memory database, i can tell that 9.1 on x5650 is faster than 9.2 on e2440. How did you run those benchmarks? I find that incredibly hard to believe. Not only does 9.2 scale *much* better than 9.1, but the E5-2440 is a 15MB cache Sandy Bridge, as opposed to a 12MB cache Nehalem. Despite the slightly lower clock speed, you should have much better performance with 9.2 on the 2440. I know one thing you might want to check is to make sure both servers have turbo mode enabled, and power savings turned off for all CPUs. Check the BIOS for the CPU settings, because some motherboards and vendors have different defaults. I know we got inconsistent and much worse performance until we made those two changes on our HP systems. We use pgbench for benchmarking, so there's not anything I can really send you. :) -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 09/27/2012 02:55 PM, Scott Marlowe wrote: On Thu, Sep 27, 2012 at 2:46 PM, M. D. li...@turnkey.bz wrote: select item.item_id,item_plu.number,item.description, (select number from account where asset_acct = account_id), (select number from account where expense_acct = account_id), (select number from account where income_acct = account_id), (select dept.name from dept where dept.dept_id = item.dept_id) as dept, (select subdept.name from subdept where subdept.subdept_id = item.subdept_id) as subdept, (select sum(on_hand) from item_change where item_change.item_id = item.item_id) as on_hand, (select sum(on_order) from item_change where item_change.item_id = item.item_id) as on_order, (select sum(total_cost) from item_change where item_change.item_id = item.item_id) as total_cost from item join item_plu on item.item_id = item_plu.item_id and item_plu.seq_num = 0 where item.inactive_on is null and exists (select item_num.number from item_num where item_num.item_id = item.item_id) and exists (select stocked from item_store where stocked = 'Y' and inactive_on is null and item_store.item_id = item.item_id) Have you tried re-writing this query first? Is there a reason to have a bunch of subselects instead of joining the tables? What pg version are you running btw? A newer version of pg might help too. This query is inside an application (Quasar Accounting) written in Qt and I don't have access to the source code. The query is cross database, so it's likely that's why it's written the way it is. The form this query is on also allows the user to add/remove columns, so it makes it a LOT easier from the application point of view to do columns as they are here. I had at one point tried to make this same query a table join, but did not notice any performance difference in pg 8.x - been a while so don't remember exactly what version. I'm currently on 9.0. I will upgrade to 9.2 once I get a new server. 
As noted above, I need to buy a new server anyway, so I'm going for this one and using the current as a VM server for several VMs and also a backup database server. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 9/27/2012 3:16 PM, Claudio Freire wrote: Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So, great if you have a high concurrent workload, quite bad if you don't. Actually we test memory bandwidth with John McCalpin's stream program. Unfortunately it is hard to find stream test results for recent machines so it can be hard to compare two boxes unless you own examples, so I didn't mention it as a useful option. But if you can find results for the machines, or ask a friend to run it for you...definitely useful information. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
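When no STREAM results are available for a machine, a very crude single-threaded copy timing can at least show whether two boxes are in the same ballpark. The sketch below is not STREAM and is not a substitute for it: the function name and defaults are made up for illustration, it times one bulk copy from a single thread, and each copy includes the cost of allocating the destination buffer, so the figure is best read as a rough lower bound.

```python
import time


def copy_bandwidth_gb_s(size_bytes=64 * 1024 * 1024, repeats=5):
    """Crude single-threaded copy bandwidth estimate in GB/s (not STREAM)."""
    # Non-zero fill so every page of the source buffer is really allocated.
    src = bytearray(b"\x01") * size_bytes
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one bulk copy: reads size_bytes, writes size_bytes
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    # Count bytes read plus bytes written, as STREAM's Copy kernel does.
    return 2 * size_bytes / best / 1e9


if __name__ == "__main__":
    print(f"approx. single-threaded copy bandwidth: {copy_bandwidth_gb_s():.1f} GB/s")
```

Use a buffer much larger than the CPU cache (the 64 MB default is an arbitrary choice) and run it on an otherwise idle machine. For numbers that can be compared with published results, run the real STREAM program.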
Re: [PERFORM] hardware advice
Please don't take responses off list, someone else may have an insight I'd miss. On Thu, Sep 27, 2012 at 3:20 PM, M. D. li...@turnkey.bz wrote: On 09/27/2012 02:55 PM, Scott Marlowe wrote: On Thu, Sep 27, 2012 at 2:46 PM, M. D. li...@turnkey.bz wrote: select item.item_id,item_plu.number,item.description, (select number from account where asset_acct = account_id), (select number from account where expense_acct = account_id), (select number from account where income_acct = account_id), (select dept.name from dept where dept.dept_id = item.dept_id) as dept, (select subdept.name from subdept where subdept.subdept_id = item.subdept_id) as subdept, (select sum(on_hand) from item_change where item_change.item_id = item.item_id) as on_hand, (select sum(on_order) from item_change where item_change.item_id = item.item_id) as on_order, (select sum(total_cost) from item_change where item_change.item_id = item.item_id) as total_cost from item join item_plu on item.item_id = item_plu.item_id and item_plu.seq_num = 0 where item.inactive_on is null and exists (select item_num.number from item_num where item_num.item_id = item.item_id) and exists (select stocked from item_store where stocked = 'Y' and inactive_on is null and item_store.item_id = item.item_id) Have you tried re-writing this query first? Is there a reason to have a bunch of subselects instead of joining the tables? What pg version are you running btw? A newer version of pg might help too. This query is inside an application (Quasar Accounting) written in Qt and I don't have access to the source code. The query is cross database, so it's likely that's why it's written the way it is. The form this query is on also allows the user to add/remove columns, so it makes it a LOT easier from the application point of view to do columns as they are here. I had at one point tried to make this same query a table join, but did not notice any performance difference in pg 8.x - been a while so don't remember exactly what version. 
Have you tried cranking up work_mem to see if it helps this query at least avoid a nested loop on 80k rows? If they'd fit in memory and use bitmap hashes it should be MUCH faster than a nested loop.

I'm currently on 9.0. I will upgrade to 9.2 once I get a new server. As noted above, I need to buy a new server anyway, so I'm going for this one and using the current as a VM server for several VMs and also a backup database server.

Well, being on 9.0 should make a big diff from 8.2. But again, without enough work_mem for the query to use a bitmap hash or something more efficient than a nested loop it's gonna be slow.
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 3:16 PM, Claudio Freire klaussfre...@gmail.com wrote: On Thu, Sep 27, 2012 at 6:08 PM, David Boreham david_l...@boreham.org wrote: We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography. We use Opteron on a price/performance basis. Intel always seems to come up with some way to make their low-cost processors useless (such as limiting the amount of memory they can address). Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So, great if you have a high concurrent workload, quite bad if you don't. Conversely, we often got MUCH better parallel performance from our quad 12 core opteron servers than I could get on a dual 8 core xeon at the time. The newest quad 10 core Intels are about as fast as the quad 12 core opteron from 3 years ago. So for parallel operation, do remember to look at the opteron. It was much cheaper to get highly parallel operation on the opterons than the xeons at the time we got the quad 12 core machine at my last job. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Sep 28, 2012, at 1:20 AM, Shaun Thomas stho...@optionshouse.com wrote: On 09/27/2012 04:08 PM, Evgeny Shishkin wrote: from benchmarking on my r/o in memory database, i can tell that 9.1 on x5650 is faster than 9.2 on e2440. How did you run those benchmarks? I find that incredibly hard to believe. Not only does 9.2 scale *much* better than 9.1, but the E5-2440 is a 15MB cache Sandy Bridge, as opposed to a 12MB cache Nehalem. Despite the slightly lower clock speed, you should have much better performance with 9.2 on the 2440. I know one thing you might want to check is to make sure both servers have turbo mode enabled, and power savings turned off for all CPUs. Check the BIOS for the CPU settings, because some motherboards and vendors have different defaults. I know we got inconsistent and much worse performance until we made those two changes on our HP systems. We use pgbench for benchmarking, so there's not anything I can really send you. :)

Yes, on pgbench, utilising the CPU to 80-90%, the E5-2660 is better; it goes to 140k read-only tps, so scalability is very, very good. But I am talking about a real OLTP read-only query, single-threaded, and there the CPU clock was the real winner.
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 3:36 PM, Scott Marlowe scott.marl...@gmail.com wrote: Conversely, we often got MUCH better parallel performance from our quad 12 core opteron servers than I could get on a dual 8 core xeon at the time. Clarification that the two base machines were about the same price. 48 opteron cores (2.2GHz) or 16 xeon cores at ~2.6GHz. It's been a few years, I'm not gonna testify to the exact numbers in court. But the performance to 32 to 100 threads was WAY better on the 48 core opteron machine, never really breaking down even to 120+ threads. The Intel machine hit a very real knee of performance and dropped off really badly after about 40 threads (they were hyperthreaded). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On 09/27/2012 04:39 PM, Scott Marlowe wrote: Clarification that the two base machines were about the same price. 48 opteron cores (2.2GHz) or 16 xeon cores at ~2.6GHz. It's been a few years, I'm not gonna testify to the exact numbers in court. Same here. We got really good performance on Opteron a few years ago too. :) But some more anecdotes... with the 4x8 E7450 Dunnington, our performance was OK. With the 2x6x2 X5675 Nehalem, it was ridiculous. Half the cores, 2.5x the speed, so far as pgbench was concerned. On every workload, on every level of concurrency I tried. Like you said, the 7450 dropped off at higher concurrency, but the 5675 kept on trucking. That's why I qualified my statement about Intel CPUs as lately. They really seem to have cleaned up their server architecture. -- Shaun Thomas OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604 312-444-8534 stho...@optionshouse.com __ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Sep 28, 2012, at 1:36 AM, Scott Marlowe scott.marl...@gmail.com wrote: On Thu, Sep 27, 2012 at 3:16 PM, Claudio Freire klaussfre...@gmail.com wrote: On Thu, Sep 27, 2012 at 6:08 PM, David Boreham david_l...@boreham.org wrote: We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography. We use Opteron on a price/performance basis. Intel always seems to come up with some way to make their low-cost processors useless (such as limiting the amount of memory they can address). Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So, great if you have a high concurrent workload, quite bad if you don't. Conversely, we often got MUCH better parallel performance from our quad 12 core opteron servers than I could get on a dual 8 core xeon at the time. The newest quad 10 core Intels are about as fast as the quad 12 core opteron from 3 years ago. So for parallel operation, do remember to look at the opteron. It was much cheaper to get highly parallel operation on the opterons than the xeons at the time we got the quad 12 core machine at my last job. But what about latency, not throughput? -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 3:40 PM, Evgeny Shishkin itparan...@gmail.com wrote: On Sep 28, 2012, at 1:36 AM, Scott Marlowe scott.marl...@gmail.com wrote: On Thu, Sep 27, 2012 at 3:16 PM, Claudio Freire klaussfre...@gmail.com wrote: On Thu, Sep 27, 2012 at 6:08 PM, David Boreham david_l...@boreham.org wrote: We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography. We use Opteron on a price/performance basis. Intel always seems to come up with some way to make their low-cost processors useless (such as limiting the amount of memory they can address). Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So, great if you have a high concurrent workload, quite bad if you don't. Conversely, we often got MUCH better parallel performance from our quad 12 core opteron servers than I could get on a dual 8 core xeon at the time. The newest quad 10 core Intels are about as fast as the quad 12 core opteron from 3 years ago. So for parallel operation, do remember to look at the opteron. It was much cheaper to get highly parallel operation on the opterons than the xeons at the time we got the quad 12 core machine at my last job. But what about latency, not throughput? It means little when you're building a server to handle literally thousands of queries per seconds from hundreds of active connections. The intel box would have simply fallen over under the load we were handling on the 48 core opteron at the time. Note that under maximum load we saw load factors in the 20 to 100 on that opteron box and still got very good response times (average latency on most queries was still in the single digits of milliseconds). 
For single threaded or only a few threads, yeah, the intel was slightly faster, but as soon as the real load of our web site hit the machine it wasn't even close. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 3:44 PM, Shaun Thomas stho...@optionshouse.com wrote: On 09/27/2012 04:39 PM, Scott Marlowe wrote: Clarification that the two base machines were about the same price. 48 opteron cores (2.2GHz) or 16 xeon cores at ~2.6GHz. It's been a few years, I'm not gonna testify to the exact numbers in court. Same here. We got really good performance on Opteron a few years ago too. :) But some more anecdotes... with the 4x8 E7450 Dunnington, our performance was OK. With the 2x6x2 X5675 Nehalem, it was ridiculous. Half the cores, 2.5x the speed, so far as pgbench was concerned. On every workload, on every level of concurrency I tried. Like you said, the 7450 dropped off at higher concurrency, but the 5675 kept on trucking. That's why I qualified my statement about Intel CPUs as lately. They really seem to have cleaned up their server architecture.

Yeah, Intel's made a lot of headway on multi-core architecture since then. But the 5620 etc series of the time were still pretty meh at high concurrency compared to the opteron. The latest ones, which I've tested now (40 hyperthreaded cores, i.e. 80 virtual cores) are definitely faster than the now 4 year old 48 core opterons. But at a much higher cost for a pretty moderate (20 to 30%) increase in performance. OTOH, they don't break down past 40 to 100 connections any more, so that's the big improvement to me. What the curve looks like heading to 60+ threads is mildly interesting, but how the server performs as you go past it was what worried me before. Now both architectures seem to behave much better in such overload scenarios.
Re: [PERFORM] hardware advice
On Thu, Sep 27, 2012 at 3:28 PM, David Boreham david_l...@boreham.org wrote: On 9/27/2012 3:16 PM, Claudio Freire wrote: Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So, great if you have a high concurrent workload, quite bad if you don't. Actually we test memory bandwidth with John McCalpin's stream program. Unfortunately it is hard to find stream test results for recent machines so it can be hard to compare two boxes unless you own examples, so I didn't mention it as a useful option. But if you can find results for the machines, or ask a friend to run it for you...definitely useful information. IIRC the most recent tests from Greg Smith show the latest model Intels winning by a fair bit over the opterons. Before that though the 48 core opteron servers were winning. It tends to go back and forth. Dollar for dollar, the Opterons are usually the better value now, while the Intels give the absolute best performance money can buy. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Hardware advice for scalable warehouse db
Hi Chris, A couple comments on the NetApp SAN. We use NetApp, primarily with Fiber connectivity and FC drives. All of the Postgres files are located on the SAN and this configuration works well. We have tried iSCSI, but performance is horrible. Same with SATA drives. The SAN will definitely be more costly than local drives. It really depends on what your needs are. The biggest benefit for me in using SAN is using the special features that it offers. We use snapshots and flex clones, which are a great way to back up and clone large databases. Cheers, Terry

On Thu, Jul 14, 2011 at 11:34 PM, chris chri...@gmx.net wrote: Hi list, My employer will be donated a NetApp FAS 3040 SAN [1] and we want to run our warehouse DB on it. The pg9.0 DB currently comprises ~1.5TB of tables, 200GB of indexes, and grows ~5%/month. The DB is not update critical, but undergoes larger read and insert operations frequently. My employer is a university with little funds and we have to find a cheap way to scale for the next 3 years, so the SAN seems a good chance to us. We are now looking for the remaining server parts to maximize DB performance with costs = $4000. I dug up the following configuration with the discount we receive from Dell:

1 x Intel Xeon X5670, 6C, 2.93GHz, 12M Cache
16 GB (4x4GB) Low Volt DDR3 1066Mhz
PERC H700 SAS RAID controller
4 x 300 GB 10k SAS 6Gbps 2.5" in RAID 10

I was thinking to put the WAL and the indexes on the local disks, and the rest on the SAN. If funds allow, we might downgrade the disks to SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible). Any comments on the configuration? Any experiences with iSCSI vs. Fibre Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a cheap alternative how to connect as many as 16 x 2TB disks as DAS? Thanks so much! Best, Chris

[1]: http://www.b2net.co.uk/netapp/fas3000.pdf
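Chris's sizing figures (about 1.5TB of tables plus 200GB of indexes, growing roughly 5% per month, with a three-year horizon) can be turned into a rough capacity projection. The sketch below assumes the 5% compounds monthly and applies equally to tables and indexes; neither is stated in the message, and the function name is made up for illustration.

```python
def projected_size_gb(current_gb, monthly_growth, months):
    """Size after compounding `monthly_growth` (0.05 = 5%) for `months` months."""
    return current_gb * (1 + monthly_growth) ** months


# Figures from the message: ~1.5TB of tables plus 200GB of indexes.
current_gb = 1500 + 200

for months in (12, 24, 36):
    size_tb = projected_size_gb(current_gb, 0.05, months) / 1000
    print(f"after {months} months: about {size_tb:.1f} TB")
```

Under these assumptions the database reaches roughly 9.8TB after 36 months, almost six times its current size, which is worth keeping in mind when weighing the SAN against the 16 x 2TB DAS alternative. That DAS option has 32TB raw, but much less once RAID 10 halves it.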
Re: [PERFORM] Hardware advice for scalable warehouse db
chris wrote: My employer is a university with little funds and we have to find a cheap way to scale for the next 3 years, so the SAN seems a good chance to us.

A SAN is rarely ever the cheapest way to scale anything; you're paying extra for reliability instead.

I was thinking to put the WAL and the indexes on the local disks, and the rest on the SAN. If funds allow, we might downgrade the disks to SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).

If you want to keep the bulk of the data on the SAN, this is a reasonable way to go, performance-wise. But be aware that losing the WAL means your database is likely corrupted. That means that much of the reliability benefit of the SAN is lost in this configuration.

Any experiences with iSCSI vs. Fibre Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a cheap alternative how to connect as many as 16 x 2TB disks as DAS?

I've never heard anyone recommend iSCSI if you care at all about performance, while FC works fine for this sort of job. The physical dimensions of 3.5" drives make getting 16 of them in one reasonably sized enclosure normally just out of reach. But a Dell PowerVault MD1000 will give you 15 x 2TB as inexpensively as possible in a single 3U space (well, as cheaply as you want to go--you might build your own giant box cheaper but I wouldn't recommend it). I've tested MD1000, MD1200, and MD1220 arrays before, and always gotten seriously good performance relative to the dollars spent with that series. The only one of these Dell storage arrays I've heard disappointing results about (two reports, though I haven't tested it directly yet) is the MD3220.

-- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
Re: [PERFORM] Hardware advice for scalable warehouse db
> 1 x Intel Xeon X5670, 6C, 2.93GHz, 12M Cache
> 16 GB (4x4GB) Low Volt DDR3 1066MHz
> PERC H700 SAS RAID controller
> 4 x 300 GB 10k SAS 6Gbps 2.5" in RAID 10

Apart from Greg's excellent recommendations, I would strongly suggest more memory. 16GB in 2011 is really on the low side. PG uses memory (either shared_buffers or the OS cache) to keep frequently accessed data in. Good recommendations are hard without knowledge of the data and access patterns, but 64, 128 and 256GB systems are quite frequent when you have data that can't all be in memory at once.

SANs are nice, but I think you can buy a good DAS thing each year for just the support cost of a NetApp; then again, you might have gotten a really good deal there too. You do get a huge amount of advanced configuration features and potential ways of sharing and so on (just see the specs), and if you need those the SAN is a good way to go, but they do come with a huge price tag.

Jesper
Re: [PERFORM] Hardware advice for scalable warehouse db
On 7/15/2011 2:10 AM, Greg Smith wrote:
> [...]
> I've never heard anyone recommend iSCSI if you care at all about performance, while FC works fine for this sort of job. The physical dimensions of 3.5" drives make getting 16 of them in one reasonably sized enclosure normally just out of reach. But a Dell PowerVault MD1000 will give you 15 x 2TB as inexpensively as possible in a single 3U space (well, as cheaply as you want to go--you might build your own giant box cheaper but I wouldn't recommend that).

I'm curious what people think of these: http://www.pc-pitstop.com/sas_cables_enclosures/scsase166g.asp

I currently have my database on two of these and for my purpose they seem to be fine and are quite a bit less expensive than the Dell MD1000. I actually have three more of the 3G versions with expanders for mass storage arrays (RAID0) and haven't had any issues with them in the three years I've had them.

Bob
Re: [PERFORM] Hardware advice for scalable warehouse db
On Fri, Jul 15, 2011 at 12:34 AM, chris chri...@gmx.net wrote:
> I was thinking to put the WAL and the indexes on the local disks, and the rest on the SAN. If funds allow, we might downgrade the disks to SATA and add a 50 GB SATA SSD for the WAL (SAS/SATA mixup not possible).

Just to add to the conversation, there's no real advantage to putting WAL on SSD. Indexes can benefit from them, but WAL is mostly sequential throughput, and for that a pair of SATA 1TB drives at 7200RPM works just fine for most folks.

For example, in one big server we're running we have 24 drives in a RAID-10 for the /data/base dir with 4 drives in a RAID-10 for pg_xlog, and those 4 drives tend to have the same I/O util % under iostat as the 24 drives under normal usage. It takes a special kind of load (lots of inserts happening in large transactions quickly) for the 4-drive RAID-10 to ever exceed 50% util.
Re: [PERFORM] Hardware advice for scalable warehouse db
On Fri, Jul 15, 2011 at 10:39 AM, Robert Schnabel schnab...@missouri.edu wrote:
> I'm curious what people think of these: http://www.pc-pitstop.com/sas_cables_enclosures/scsase166g.asp
> I currently have my database on two of these and for my purpose they seem to be fine and are quite a bit less expensive than the Dell MD1000. I actually have three more of the 3G versions with expanders for mass storage arrays (RAID0) and haven't had any issues with them in the three years I've had them.

I have a co-worker who's familiar with them, and they seem a lot like the 16-drive units we use from Aberdeen, which fully outfitted with 15k SAS drives run $5k to $8k depending on the drives etc.
Re: [PERFORM] Hardware advice for scalable warehouse db
> Just to add to the conversation, there's no real advantage to putting WAL on SSD. Indexes can benefit from them, but WAL is mostly sequential throughput, and for that a pair of SATA 1TB drives at 7200RPM works just fine for most folks.

Actually, there's a strong disadvantage to putting WAL on SSD. SSD is very prone to fragmentation if you're doing a lot of deleting and replacing files. I've implemented data warehouses where the database was on SSD but WAL was still on HDD.

--
Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [PERFORM] Hardware advice for scalable warehouse db
Hi list,

Thanks a lot for your very helpful feedback!

> I've tested MD1000, MD1200, and MD1220 arrays before, and always gotten seriously good performance relative to the dollars spent

Great hint, but I'm afraid that's too expensive for us. But it's a great way to scale over the years, I'll keep that in mind. I had a look at other server vendors who offer 4U servers with slots for 16 disks for 4k in total (w/o disks); maybe that's an even cheaper/better solution for us.

If you had the choice between 16 x 2TB SATA vs. a server with some SSDs for WAL/indexes and a SAN (with SATA disk) for data, what would you choose performance-wise?

Again, thanks so much for your help.

Best,
Chris
Re: [PERFORM] Hardware advice for scalable warehouse db
On Fri, Jul 15, 2011 at 11:49 AM, chris r. chri...@gmx.net wrote:
> [...]
> If you had the choice between 16 x 2TB SATA vs. a server with some SSDs for WAL/indexes and a SAN (with SATA disk) for data, what would you choose performance-wise?

SATA drives can easily flip bits, and Postgres does not checksum data, so it will not automatically detect corruption for you. I would steer well clear of SATA unless you are going to be using a filesystem like ZFS, which checksums data. I would hope that a SAN would detect this for you, but I have no idea.

--
Rob Wultsch wult...@gmail.com
Re: [PERFORM] Hardware advice for scalable warehouse db
On 7/14/11 11:34 PM, chris wrote:
> Any comments on the configuration? Any experiences with iSCSI vs. Fibre Channel for SANs and PostgreSQL? If the SAN setup sucks, do you see a cheap alternative how to connect as many as 16 x 2TB disks as DAS?

Here's the problem with iSCSI: on gigabit ethernet, your maximum possible throughput is about 100 MB/s, which means that your likely maximum database throughput (for a seq scan or vacuum, for example) is 30 MB/s. That's about a third of what you can get with good internal RAID.

While multichannel iSCSI is possible, it's hard to configure, and doesn't really allow you to spread a *single* request across multiple channels. So: go with Fibre Channel if you're using a SAN.

iSCSI also has horrible lag times, but you don't care about that so much for DW.

--
Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
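The throughput figures in Josh's message can be put into perspective with a little arithmetic. The sketch below is only illustrative: the 100 MB/s and 30 MB/s rates come from the message, while the 1 TB table size is a made-up assumption, and the function names are hypothetical.

```python
def link_mb_per_s(gbit: float) -> float:
    """Raw line rate of a network link in MB/s (decimal units), before any protocol overhead."""
    return gbit * 1000 / 8


def scan_hours(table_gb: float, mb_per_s: float) -> float:
    """Hours needed to read table_gb gigabytes sequentially at a sustained rate."""
    return table_gb * 1000 / mb_per_s / 3600


# Gigabit ethernet tops out at 125 MB/s on the wire; about 100 MB/s is what
# is left after protocol overhead, which matches the figure quoted above.
print(link_mb_per_s(1.0))  # 125.0

# Assumption for illustration only: a 1 TB (1000 GB) fact table.
TABLE_GB = 1000
for label, rate in (("iSCSI, likely DB throughput", 30), ("iSCSI, link maximum", 100)):
    print(f"{label}: {scan_hours(TABLE_GB, rate):.1f} h for a full scan")
```

For a warehouse workload dominated by large sequential scans, that difference shows up directly in query and vacuum run times.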
Re: [PERFORM] Hardware advice
Hi Alex,

Please check out http://www.powerpostgresql.com/PerfList before you use RAID 5 for PostgreSQL.

Anyhow, at a larger scale you end up depending on the response time of the I/O system for a read or write. In modern RAID and SAN environments, reads are the part to focus on when you want to tune your system, because most RAID and SAN systems can buffer writes. PostgreSQL uses the Linux file system cache for reading, which is normally much larger than the RAID or SAN cache. This means that whenever a PostgreSQL read goes to the RAID or SAN subsystem, the response time of the hard disk becomes interesting. I guess you can imagine that multiple reads to the same spindles cause a delay in the response time.

Alexandru,

You should have two Xeons, whatever your core count is. This would use the full benefit of the memory architecture. You know, two FSBs and two memory channels.

Cheers
Sven

Alex Turner wrote:
> The test that I did - which was somewhat limited - showed no benefit from splitting disks into separate partitions for large bulk loads. The program read from one very large file and wrote the input out to two other large files. The total throughput on a single partition was close to the theoretical maximum for that logical drive, even though the process was reading and writing to three separate places on the disk. I don't know what this means for PostgreSQL setups directly, but I would postulate that the benefit from splitting pg_xlog onto a separate spindle is not as great as it might once have been for large bulk transactions. I am therefore going with a single 6-drive RAID 5 for my data warehouse application because I want the read speed to be available. I can benefit from fast reads when I want to do large data scans, at the expense of slightly slower insert speed.
> [...]
Re: [PERFORM] Hardware advice
If your data is valuable I'd recommend against RAID5 ... see http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt

Performance aside, I'd advise against RAID5 in almost all circumstances. Why take chances?

Greg Williamson
DBA
GlobeXplorer LLC

----- Original Message -----
From: [EMAIL PROTECTED] on behalf of Sven Geisler
Sent: Wed 12/6/2006 1:09 AM
To: Alex Turner
Cc: Alexandru Coseru; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Hardware advice

> Hi Alex, Please check out http://www.powerpostgresql.com/PerfList before you use RAID 5 for PostgreSQL.
> [...]
Re: [PERFORM] Hardware advice
Hi Alexandru,

Alexandru Coseru wrote:
> [...]
> Question 1: The RAID layout should be:
> a) 2 hdd in raid 1 for system and pg_xlog and 6 hdd in raid10 for data ?
> b) 8 hdd in raid10 for all ?
> c) 2 hdd in raid1 for system , 2 hdd in raid1 for pg_xlog , 4 hdd in raid10 for data ?
> Obs: I'm going for setup a) , but i want to hear your thoughts as well.

This depends on your data size. I think options a and c are good. The potential bottleneck may be the RAID 1 for pg_xlog if you have a huge amount of updates and inserts. What about another setup:

4 hdd in RAID 10 for system and pg_xlog - system partitions are normally not in heavy use and pg_xlog should be fast for writing.
4 hdd in RAID 10 for data.

> Question 2: (Don't want to start a flame here. but here is goes) What filesystem should i run for data ? ext3 or xfs ? The tables have ~ 15.000 rel_pages each. The biggest table has now over 30.000 pages.

We have a database running with 60,000+ tables. The table size is between a few kByte for the small tables and up to 30 GB for the largest one. We had no issue with ext3 in the past.

> Question 3: The block size in postgresql is 8kb. The strip size in the raid ctrl is 64k. Should i increase the pgsql block size to 16 or 32 or even 64k ?

You should keep in mind that the file system also has a block size. Ext3 has a maximum of 4k. I would set up the partitions aligned to the stripe size to prevent unaligned reads. I guess you can imagine that a larger PostgreSQL block size may also end up in unaligned reads because the file system has a smaller block size.

RAID volume and file system setup

1. Make all partitions aligned to the RAID stripe size. The first partition should start at 128 kByte. You can do this with fdisk: after you have created the partition, switch to expert mode (type x) and modify the beginning of the partition (type b). You should change this value to 128 (default is 63). All other partitions should also start on a multiple of 128 kByte.

2. Give the file system a hint that you work with larger block sizes.
   Ext3: mke2fs -b 4096 -j -R stride=2 /dev/sda1 -L LABEL
   I made an I/O test with PostgreSQL on a RAID system with a stripe size of 64 kByte and a block size of 8 kByte in the RAID system. Stride=2 was the best value.

PS: You should have a second Xeon in your budget plan.

Sven.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
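The alignment and stride advice in Sven's two steps can be sanity-checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions, not a description of what fdisk or mke2fs do internally: it assumes the fdisk expert-mode value is counted in 512-byte sectors, and it reads "stride" as the number of file system blocks per RAID block (8 kByte / 4 kByte in Sven's test). The function names are hypothetical.

```python
SECTOR_BYTES = 512  # assumption: classic 512-byte sectors


def start_is_aligned(start_sector: int, stripe_kib: int) -> bool:
    """True if a partition starting at start_sector begins on a stripe boundary."""
    return (start_sector * SECTOR_BYTES) % (stripe_kib * 1024) == 0


def stride_blocks(raid_block_kib: int, fs_block_kib: int) -> int:
    """Number of file system blocks that fit into one RAID block."""
    if raid_block_kib % fs_block_kib != 0:
        raise ValueError("RAID block size must be a multiple of the fs block size")
    return raid_block_kib // fs_block_kib


# The old fdisk default of 63 sectors does not land on a 64 KiB stripe boundary,
# while a start of 128 sectors (64 KiB under the 512-byte assumption) does.
print(start_is_aligned(63, 64))   # False
print(start_is_aligned(128, 64))  # True

# 8 kByte RAID block with a 4 kByte ext3 block gives the stride used above.
print(stride_blocks(8, 4))        # 2
```

Note that under the 512-byte-sector assumption a value of 128 corresponds to 64 kByte rather than 128 kByte; either offset is a multiple of the 64 kByte stripe size, so both are aligned.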
Re: [PERFORM] Hardware advice
Hello..

Thanks for the advice. Actually, I'm waiting for the Clovertown to show up on the market...

Regards
Alex

----- Original Message -----
From: Sven Geisler [EMAIL PROTECTED]
To: Alexandru Coseru [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Tuesday, December 05, 2006 11:57 AM
Subject: Re: [PERFORM] Hardware advice

> [...]
> PS: You should have a second XEON in your budget plan.
Re: [PERFORM] Hardware advice
The test that I did - which was somewhat limited - showed no benefit from splitting disks into separate partitions for large bulk loads. The program read from one very large file and wrote the input out to two other large files. The total throughput on a single partition was close to the theoretical maximum for that logical drive, even though the process was reading and writing to three separate places on the disk.

I don't know what this means for PostgreSQL setups directly, but I would postulate that the benefit from splitting pg_xlog onto a separate spindle is not as great as it might once have been for large bulk transactions. I am therefore going with a single 6-drive RAID 5 for my data warehouse application because I want the read speed to be available. I can benefit from fast reads when I want to do large data scans, at the expense of slightly slower insert speed.

Alex.

On 12/5/06, Alexandru Coseru [EMAIL PROTECTED] wrote:
> Hello.. Thanks for the advices.. Actually , i'm waiting for the clovertown to show up on the market... Regards Alex
> [...]
Re: [PERFORM] Hardware advice
Alexandru,

> The server will have kernel 2.1.19 and it will be use only as a postgresql

Assuming you're talking Linux, I think you mean 2.6.19?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [PERFORM] Hardware advice
Hello..

Yes, sorry for the mistype..

Regards
Alex

----- Original Message -----
From: Josh Berkus josh@agliodbs.com
To: pgsql-performance@postgresql.org
Cc: Alexandru Coseru [EMAIL PROTECTED]
Sent: Sunday, December 03, 2006 10:11 PM
Subject: Re: [PERFORM] Hardware advice

> Assuming you're talking Linux, I think you mean 2.6.19?
Re: [PERFORM] Hardware advice
On 30/5/03 6:17 pm, scott.marlowe [EMAIL PROTECTED] wrote:
> On Fri, 30 May 2003, Adam Witney wrote:
> > Hi scott, Thanks for the info
> > > You might wanna do something like go to all 146 gig drives, put a mirror set on the first 20 or so gigs for the OS, and then use the remainder (5x120gig or so ) to make your RAID5. The more drives in a RAID5 the better, generally, up to about 8 or 12 as the optimal for most setups.
> > I am not quite sure I understand what you mean here... Do you mean take 20Gb from each of the 5 drives to setup a 20Gb RAID 1 device? Or just from the first 2 drives?
>
> You could do it either way, since the Linux kernel supports more than 2 drives in a mirror. But this costs on writes, so don't do it for things like /var or the pg_xlog directory.
>
> There are a few ways you could arrange 5 146 gig drives. One might be to make the first 20 gig on each drive part of a mirror set where the first two drives are the live mirror, and the next three are hot spares. Then you could set up your RAID5 to have 4 live drives and 1 hot spare.
>
> Hot spares are nice to have because they provide for the shortest period of time during which your machine is running with a degraded RAID array.
>
> Note that in Linux you can set the kernel parameters dev.raid.speed_limit_max and dev.raid.speed_limit_min to control the rebuild bandwidth used, so that when a disk dies you can set a compromise between fast rebuilds and lowering the demands on the I/O subsystem during a rebuild. The max limit default is 100k / second, which is quite slow. On a machine with Ultra320 gear, you could set that to 10 or 20 megs a second and still not saturate your SCSI bus.
>
> Now that I think of it, you could probably set it up so that you have a mirror set for the OS, one for pg_xlog, and then use the rest of the drives as RAID5. Then grab space on the fifth drive to make a hot spare for both the pg_xlog and the OS drive.
>
> Drive 0 [OS RAID1 20 Gig D0][big data drive RAID5 106 Gig D0]
> Drive 1 [OS RAID1 20 Gig D1][big data drive RAID5 106 Gig D1]
> Drive 2 [pg_xlog RAID1 20 gig D0][big data drive RAID5 106 Gig D2]
> Drive 3 [pg_xlog RAID1 20 gig D1][big data drive RAID5 106 Gig D3]
> Drive 4 [OS hot spare 20 gig][pg_xlog hot spare 20 gig][big data drive RAID5 106 Gig hot spare]
>
> That would give you ~ 300 gigs storage.
>
> Of course, there will likely be slightly less performance than you might get from dedicated RAID arrays for each RAID1/RAID5 set, but my guess is that by having 4 (or 5 if you don't want a hot spare) drives in the RAID5 it'll still be faster than a dedicated 3 drive RAID array.

Hi Scott,

Just following up a post from a few months back... I have now purchased the hardware. Do you have a recommended/preferred Linux distro that is easy to configure for software RAID?

Thanks again
Adam
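The capacity and rebuild-rate figures in Scott's quoted layout can be checked with quick arithmetic. This is a rough sketch only: the drive and slice sizes and the 100k/s and 20 megs/s rates are taken from the message, the function names are hypothetical, and the rebuild estimate assumes a constant resync rate over one 106 GB slice, which a real array under load will not achieve.

```python
def raid5_usable_gb(live_members: int, member_gb: float) -> float:
    """Usable capacity of a RAID5 set: one member's worth of space holds parity."""
    if live_members < 3:
        raise ValueError("RAID5 needs at least 3 live members")
    return (live_members - 1) * member_gb


def rebuild_hours(member_gb: float, rate_kb_per_s: float) -> float:
    """Hours to resync one member at a constant rate (decimal units)."""
    return member_gb * 1_000_000 / rate_kb_per_s / 3600


DRIVE_GB = 146
SMALL_SLICE_GB = 20
# Drive 4 carries two 20 GB spare slices, which caps the RAID5 slice at 106 GB.
raid5_slice_gb = DRIVE_GB - 2 * SMALL_SLICE_GB

print(raid5_usable_gb(4, raid5_slice_gb))  # 318, the "~ 300 gigs" with a hot spare
print(raid5_usable_gb(5, raid5_slice_gb))  # 424, if the fifth slice is made live

# Resync time for one slice at the quoted 100k/s versus 20 megs/s.
for rate in (100, 20_000):
    print(f"{rate} KB/s: {rebuild_hours(raid5_slice_gb, rate):.1f} h")
```

At 100 KB/s a single slice would take on the order of twelve days to resync, against roughly an hour and a half at 20 MB/s, which is why raising the limit matters when a hot spare kicks in.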