Re: [PERFORM] NFS, file system cache and shared_buffers
On 5/27/2014 9:09 AM, Shaun Thomas wrote:
> On 05/27/2014 10:00 AM, Albe Laurenz wrote:
>> I know that Oracle recommends it - they even built an NFS client into their database server to make the most of it.
>
> That's odd. Every time the subject of NFS comes up, it's almost immediately shot down with explicit advice to Never Do That(tm). It can be kinda safe-ish if mounted in sync mode with caching disabled, but I'd never use it on any of our systems.

It has been a long time since I was in the weeds of this issue, but the crux is that it was (still is?) hard to be sure that the filesystem's behavior was exactly as expected. My recollection of the Oracle story was that they had to verify the end-to-end behavior, and essentially certify its correctness, to guarantee database ACID properties. So you needed to be running a very specific version of the NFS code, configured in a very specific way. This isn't entirely inconsistent with the reference above that they "built an NFS client" - that's something you might need to do in order to be sure it behaves in the way you expect.

Possibly the NFS implementations deployed today are more consistent and correct than was the case in the past. I wouldn't use a network filesystem for any kind of database storage myself, though.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
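For anyone who does experiment with NFS-backed PostgreSQL storage despite the caveats above, the conservative starting point is a hard, synchronous mount with attribute caching disabled. A sketch of such an /etc/fstab entry follows; the server name, export path, and rsize/wsize values are placeholders to adjust for your environment:

```
# Hypothetical example - verify every option against your NFS client's documentation
nfsserver:/export/pgdata  /var/lib/pgsql  nfs  rw,hard,nointr,tcp,vers=3,rsize=32768,wsize=32768,noac,sync  0  0
```

Note that, per the discussion above, options like these narrow the risk but don't by themselves certify the end-to-end write behavior.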
Re: [PERFORM] Profiling PostgreSQL
On 5/22/2014 7:27 AM, Dimitris Karampinas wrote:
> Is there any way to get the call stack of a function when profiling PostgreSQL with perf? I configured with --enable-debug, I ran a benchmark against the system and I'm able to identify a bottleneck. 40% of the time is spent on a spinlock yet I cannot find out the codepath that gets me there. Using --call-graph with perf record didn't seem to help. Any ideas?

Can you arrange to run 'pstack' a few times on the target process (either manually or with a shell script)? If the probability of the process being in the spinning state is high, then this approach should snag you at least one call stack.
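A minimal sketch of the pstack idea - sample the backend's stack repeatedly and eyeball which codepath dominates. The PID, sample count, and interval are placeholders; on systems without pstack, gstack (shipped with gdb) usually works the same way:

```shell
#!/bin/sh
# Repeatedly dump a process's call stack; if ~40% of the time is spent
# spinning, a handful of samples should catch the offending codepath.
sample_stacks() {
    pid=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        pstack "$pid" 2>/dev/null || true  # tolerate a missing pstack or an exited pid
        sleep 0.2
        i=$((i + 1))
    done
}
```

Usage: `sample_stacks <backend_pid> 20`, then look for the stack that shows up most often. It may also be worth trying `perf record --call-graph dwarf`, which reconstructs call graphs from DWARF unwind info rather than relying on frame pointers.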
Re: [PERFORM] Postgresql in a Virtual Machine
On 11/26/2013 7:26 AM, Craig James wrote:
> For those of us with small installations (a few to a dozen servers), we'd like to get out of server maintenance completely. Can anyone with experience on a cloud VM solution comment? Do the VM solutions provided by the major hosting companies have the same good performance as the VMs that several have described here? Obviously there's Amazon's new Postgres solution available. What else is out there in the way of "instant on" solutions with Linux/Postgres/Apache preconfigured systems? Has anyone used them in production?

I've done some work with Heroku and the MySQL flavor of AWS service. They work, and are convenient, but there are a couple of issues:

1. Random odd (and bad) things can happen from a performance perspective that you just need to cope with. For example, I/O will become vastly slower for periods of tens of seconds, once or twice a day. If you don't like the idea of phenomena like this in your system, beware.

2. Your inability to connect with the bare metal may turn out to be a significant hassle when trying to understand some performance issue in the future. Tricks that we're used to, such as looking at "iostat" (or even "top") output, are no longer usable because the hosting company will not give you a login on the host VM. This limitation extends to many techniques that have been commonly used in the past, and can become a major headache, to the point where you need to reproduce the system on physical hardware just to understand what's going on with it (been there, done that...).

For the reasons above I would caution against deploying a production service (today) on a "SaaS" database service like Heroku or Amazon RDS. Running your own database inside a stock VM might be better, but it can be hard to get the right kind of I/O for that deployment scenario. In the case of self-hosted VMware or KVM you obviously have much more control and observability.
Heroku had (at least when I last used it, a year ago or so) an additional issue in that they host on AWS VMs so if something goes wrong you are talking to one company that is using another company's virtual machine service. Not a recipe for clarity, good service and hair retention...
Re: [PERFORM] AMD vs Intel
On 9/4/2013 3:01 AM, Johan Loubser wrote:
> I am tasked with getting specs for a postgres database server for the core purpose of running moodle at our university. The main question at the moment is 12-core AMD or 6/8-core (E series) Intel. What would be the most important metric in planning an enterprise-level server for moodle?

The only way to be sure is to test your workload on the two different machines. That said, we moved from AMD to Intel E series Xeon a year ago. We were using 8-core AMD devices (not 12), and switched to 6-core Intel. The Intel devices were more attractive due to their (much) higher single-core performance. Another factor was a history of hardware bugs in the support chips used on AMD systems that we haven't generally seen with Intel. So, all other things being equal, Intel brings less hassle, in my experience.
Re: [PERFORM] Reliability with RAID 10 SSD and Streaming Replication
On 5/22/2013 8:18 AM, Greg Smith wrote:
> They can easily hit that number. Or they can do this:
>
> Device:  r/s      w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await   svctm  %util
> sdd      2702.80  19.40  19.67  0.16   14.91     273.68    71.74   0.37   100.00
> sdd      2707.60  13.00  19.53  0.10   14.78     276.61    90.34   0.37   100.00
>
> That's an Intel 710 being crushed by a random read database server workload, unable to deliver even 3000 IOPS / 20MB/s. I have hours of data like this from several servers.

This is interesting. Do you know what it is about the workload that leads to the unusually low rps?
Re: [PERFORM] New server setup
On 3/20/2013 6:44 PM, David Rees wrote:
> On Thu, Mar 14, 2013 at 4:37 PM, David Boreham wrote:
>> You might want to evaluate the performance you can achieve with a single SSD (use several for capacity by all means) before considering a RAID card + SSD solution. Again I bet it depends on the application but our experience with the older Intel 710 series is that their performance out-runs the CPU, at least under our PG workload.
>
> How many people are using a single enterprise-grade SSD for production without RAID? I've had a few consumer-grade SSDs brick themselves - but are the enterprise-grade SSDs, like the new Intel S3700 which you can get in sizes up to 800GB, reliable enough to run as a single drive without RAID1? The performance of one is definitely good enough for most medium-sized workloads without the complexity of a BBU RAID and multiple spinning disks...

You're replying to my post, but I'll raise my hand again :)

We run a bunch of single-socket 1U, short-depth machines (Supermicro chassis) using 1x Intel 710 drives (we'd use the S3700 in new deployments today). The most recent of these have 128G RAM and an E5-2620 hex-core CPU, and dissipate less than 150W at full load. Couldn't be happier with the setup. We have 18 months' uptime with no drive failures, running at several hundred wps 24x7. We also write tens of GB of log files every day that are rotated, so the drives are getting beaten up on bulk data overwrites too.

There is of course a non-zero probability of some unpleasant firmware bug afflicting the drives (as with regular spinning drives), and initially we deployed a "spare" 10k HD in the chassis, spun down, that would allow us to re-jigger the machines without SSD remotely (the data center is 1000 miles away). We never had to do that, and later deployments omitted the HD spare. We've also considered mixing SSDs from two vendors for firmware-bug diversity, but so far we only have one approved vendor (Intel).
Re: [PERFORM] New server setup
On 3/14/2013 3:37 PM, Mark Kirkwood wrote:
> I'm not convinced about the need for BBU with SSD - you *can* use them without one, you just need to make sure about suitable longevity and also the presence of (proven) power-off protection (as discussed previously). It is worth noting that using unproven SSDs, or SSDs known to be lacking power-off protection, with a BBU will *not* save you from massive corruption (or device failure) upon unexpected power loss.

I think it probably depends on the specifics of the deployment, but for us the fact that the BBU isn't required in order to achieve high write tps with SSDs is one of the key benefits - the power, cooling and space savings over even a few servers are significant. In our case we only have one or two drives per server, so no need for fancy drive string arrangements.

Also, in terms of performance, the faster PCIe SSDs do about as well by themselves as connected to a RAID card with BBU. In fact they will do better in some cases (the faster SSDs can get close to the max IOPS many RAID cards can handle... so more than a couple of them plugged into one card will be throttled by its limitations).

You might want to evaluate the performance you can achieve with a single SSD (use several for capacity by all means) before considering a RAID card + SSD solution. Again, I bet it depends on the application, but our experience with the older Intel 710 series is that their performance out-runs the CPU, at least under our PG workload.
Re: [PERFORM] New server setup
On 3/13/2013 9:29 PM, Mark Kirkwood wrote:
> Just going through this now with a vendor. They initially assured us that the drives had "end to end protection" so we did not need to worry. I had to post stripdown pictures from Intel's S3700, showing obvious capacitors attached to the board, before I was taken seriously and actually meaningful specifications were revealed. So now I'm demanding to know:
> - chipset (and version)
> - original manufacturer (for re-badged ones)
> - power-off protection *explicitly* mentioned
> - show me the circuit board (and where the capacitors are)

In addition to the above, I only use drives where I've seen compelling evidence that plug-pull tests have been done and passed (e.g. done by someone on this list or in-house here). I also like to have a high level of confidence in the firmware development group. This results in a very small set of acceptable products :(
Re: [PERFORM] New server setup
On 3/13/2013 1:23 PM, Steve Crawford wrote:
> What concerns me more than wear is this:
> InfoWorld article: http://www.infoworld.com/t/solid-state-drives/test-your-ssds-or-risk-massive-data-loss-researchers-warn-213715
> Referenced research paper: https://www.usenix.org/conference/fast13/understanding-robustness-ssds-under-power-fault
> Kind of messes with the "D" in ACID.

It is somewhat surprising to discover that many SSD products are not durable under sudden power loss (what were they thinking!?, and... why doesn't anyone care??). However, there is a set of SSD types known to be designed to address power-loss events, and these have been tested by contributors to this list. Use only those devices and you won't see this problem.

SSDs do have a wear-out mechanism, but wear can be monitored and devices replaced in advance of failure. In practice longevity is such that most machines will be in the dumpster long before the SSD wears out. We've had machines running at several hundred wps constantly for 18 months using Intel 710 drives, and the wear-level SMART value is still zero.

In addition, like any electronics module (CPU, memory, NIC), an SSD can fail, so you do need to arrange for valuable data to be replicated. As with old-school disk drives, firmware bugs are a concern, so you might want to consider what would happen if all the drives of a particular type decided to quit working at the same second (I've only seen this happen myself with magnetic drives, but in theory it could happen with SSDs).
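The wear monitoring mentioned above can be scripted with smartmontools. A sketch follows; the attribute name is what Intel 320/710-class drives report, and other vendors use different attribute names and numbers, so verify it against your own hardware:

```shell
# Print the normalized wear-level value for an Intel SSD. On these drives
# Media_Wearout_Indicator starts at 100 and counts down as the flash wears;
# alerting well above the floor leaves plenty of time to swap drives.
check_wear() {
    dev=$1
    smartctl -A "$dev" 2>/dev/null |
        awk '/Media_Wearout_Indicator/ { print $4 }'
}
```

Feeding the output into whatever monitoring system you already run makes the replace-before-failure policy automatic.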
Re: [PERFORM] Do I have a hardware or a software problem?
On 12/11/2012 8:11 PM, Evgeny Shishkin wrote:
> Quoting http://www.storagereview.com/intel_ssd_dc_s3700_series_enterprise_ssd_review

Heh. A fine example of the kind of hand-waving of which I spoke ;) Higher performance is certainly a benefit, although at present we can't saturate even a single 710 series drive (the application, CPU, OS, etc. is the bottleneck). Similarly, while dual-porting certainly has its uses, it is not something I need.
Re: [PERFORM] Do I have a hardware or a software problem?
On 12/11/2012 7:49 PM, Evgeny Shishkin wrote:
> Yeah, the S3700 looks promising, but the SATA interface is a limiting factor for this drive. I'm looking towards the SMART SSD: http://www.storagereview.com/smart_storage_systems_optimus_sas_enterprise_ssd_review

What don't you like about SATA? I prefer to avoid SAS drives if possible due to the price premium for dubious benefits besides vague hand-waving "enterprise-ness" promises.
Re: [PERFORM] Do I have a hardware or a software problem?
On 12/11/2012 7:38 PM, Evgeny Shishkin wrote:
> Which drives would you recommend, besides the Intel 320 and 710?

Those are the only drive types we have deployed in servers at present (almost all 710, but we have some 320s for less mission-critical machines). The new DC S3700 series looks nice too, but isn't yet in the sales channel: http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-dc-s3700-series.html
Re: [PERFORM] Do I have a hardware or a software problem?
On 12/11/2012 7:20 PM, Evgeny Shishkin wrote:
> Oh, there is no 100% safe system.

In this case we're discussing specifically "safety in the event of power loss shortly after the drive indicates to the controller that it has committed a write operation". Some drives do provide 100% safety against this event, and they don't cost much more than those that don't.
Re: [PERFORM] Do I have a hardware or a software problem?
On 12/11/2012 7:13 PM, Evgeny Shishkin wrote:
> Yes, I am aware of this issue. I've never experienced it on either the Intel 520 or the OCZ Vertex 3. Have you heard of failures on this list?

People have done plug-pull tests and reported the results on the list (sometime in the past couple of years). But you don't need to do the test to know these drives are not safe. They're unsafe by design.
Re: [PERFORM] HT on or off for E5-26xx ?
Here are the SELECT-only pgbench test results from my E5-2620 machine, with HT on and off:

HT off:

bash-4.1$ /usr/pgsql-9.2/bin/pgbench -T 600 -j 48 -c 48 -S
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 48
number of threads: 48
duration: 600 s
number of transactions actually processed: 25969680
tps = 43281.392273 (including connections establishing)
tps = 43282.476955 (excluding connections establishing)

All 6 cores saturated:

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          81.42   0.00    18.21     0.00    0.00   0.37

HT on:

bash-4.1$ /usr/pgsql-9.2/bin/pgbench -T 600 -j 48 -c 48 -S
starting vacuum...end.
transaction type: SELECT only
scaling factor: 100
query mode: simple
number of clients: 48
number of threads: 48
duration: 600 s
number of transactions actually processed: 29934601
tps = 49888.697225 (including connections establishing)
tps = 49889.570754 (excluding connections establishing)

12% of CPU showing as idle (whether that's true or not I'm not sure):

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          71.09   0.00    16.99     0.00    0.00  11.92

So for this particular test HT gives us the equivalent of about one extra core. It does not reduce performance; rather it increases performance slightly.
Re: [PERFORM] HT on or off for E5-26xx ?
On 11/8/2012 6:58 AM, Shaun Thomas wrote:
> On 11/07/2012 09:16 PM, David Boreham wrote:
>> bash-4.1$ /usr/pgsql-9.2/bin/pgbench -T 600 -j 48 -c 48
> Unfortunately without -S, you're not really testing the processors. A regular pgbench run can fluctuate more than that due to writing and checkpoints.

Hmm... my goal was to test with a workload close to our application's (which is heavy OLTP, small transactions, and hence sensitive to I/O commit rate). The hypothesis I was testing was that enabling HT positively degrades performance (which in my case it does not). I wasn't, to be honest, really testing the additional benefit from HT, rather observing that it is non-negative :)

If I have time I can run the select-only test for you and post the results. The DB fits into memory so it will be a good CPU test.
Re: [PERFORM] HT on or off for E5-26xx ?
Well, the results are in, and at least in this particular case conventional wisdom is overturned. Not a huge benefit, but throughput is definitely higher with HT enabled and nthreads >> ncores:

HT off:

bash-4.1$ /usr/pgsql-9.2/bin/pgbench -T 600 -j 48 -c 48
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 48
number of threads: 48
duration: 600 s
number of transactions actually processed: 2435711
tps = 4058.667332 (including connections establishing)
tps = 4058.796309 (excluding connections establishing)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          52.50   0.00    14.79     5.07    0.00  27.64

Device:  rrqm/s  wrqm/s   r/s    w/s       rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda      0.00    5700.30  0.10   13843.50  0.00   74.78  11.06     48.46     3.50   0.05   65.21

HT on:

bash-4.1$ /usr/pgsql-9.2/bin/pgbench -T 600 -j 48 -c 48
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 48
number of threads: 48
duration: 600 s
number of transactions actually processed: 2832463
tps = 4720.668984 (including connections establishing)
tps = 4720.750477 (excluding connections establishing)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          40.61   0.00    12.71     3.09    0.00  43.59

Device:  rrqm/s  wrqm/s   r/s    w/s       rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda      0.00    6197.10  14.80  16389.50  0.14   86.53  10.82     54.11     3.30   0.05   82.35

System details: E5-2620 (6-core + HT, 15MB LLC), 64G RAM (4 channels with 16G 1333 modules), Intel 710 300G (which is faster than the smaller drives, note), Supermicro X9SRi-F motherboard. CentOS 6.3 64-bit, PG 9.2.1 from the PGDG RPM repository. pgbench running locally on the server.
Re: [PERFORM] HT on or off for E5-26xx ?
On 11/7/2012 6:37 AM, Devrim GÜNDÜZ wrote:
> HT should be good for file servers, or say many of the app servers, or small web/mail servers. PostgreSQL relies on CPU power, and since the HT CPUs don't have the same power as the original CPU, when the OS submits a job to that particular HTed CPU, the query will run significantly slower. To avoid issues, I would suggest you turn HT off on all PostgreSQL servers. If you can throw some more money at it, another 6-core CPU would give more benefit.

I realize this is the "received knowledge", but it is not supported by the evidence before me (which is that I get nearly 2x the throughput from pgbench using nthreads == nhtcores vs nthreads == nfullcores). Intel's latest HT implementation seems to suffer less from the kinds of resource-sharing contention issues seen in older generations. Once I have the machine's full memory installed I'll run pgbench with HT disabled in the BIOS and post the results.
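As an aside, rebooting into the BIOS isn't the only way to run this comparison on Linux: logical CPUs can be taken offline through sysfs. The sketch below only lists the HT sibling threads (it assumes the sysfs topology layout of reasonably recent kernels); writing 0 to each listed CPU's online file, as root, would disable them:

```shell
# Print the CPU numbers of hyperthread siblings, i.e. logical CPUs that
# are not the first thread of their physical core.
list_ht_siblings() {
    for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
        siblings="$cpu/topology/thread_siblings_list"
        [ -r "$siblings" ] || continue
        first=$(sed 's/[,-].*//' "$siblings")  # first thread of this core
        n=${cpu##*cpu}
        [ "$n" = "$first" ] || echo "$n"
    done
}
```

Then, as root, something like `for n in $(list_ht_siblings); do echo 0 > /sys/devices/system/cpu/cpu$n/online; done` (and echo 1 to bring them back) lets you compare HT-on vs HT-off pgbench runs without a reboot.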
Re: [PERFORM] HT on or off for E5-26xx ?
On 11/6/2012 9:16 PM, Mark Kirkwood wrote:
> I've been benchmarking an E5-4640 (4 socket) and hyperthreading off gave much better scaling behaviour in pgbench (gentle rise and flatten off), whereas with hyperthreading on there was a dramatic falloff after approx number of clients = number of (hyperthreaded) cpus. The box is intended to be a pure db server, so we are running with hyperthreading off.

It looks like this syndrome is not observed on my box, likely due to the much lower number of cores system-wide (12). I see pgbench tps increase nicely until #threads/clients == #cores, then plateau. I tested up to 96 threads, btw. We're waiting on more memory modules to arrive; I'll post some test results once we have all 4 memory banks populated.
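A client-count sweep like the one described (tps rising to a plateau around #clients == #cores) is easy to script. Here is a sketch, assuming an already-initialized pgbench database (`pgbench -i -s 100`) reachable with default connection settings:

```shell
# Sweep pgbench client counts and report the tps at each step, to find
# where throughput plateaus relative to the core count.
sweep_pgbench() {
    dur=${1:-60}  # seconds per step; pass a small value for a smoke test
    for c in 1 2 4 8 12 16 24 48 96; do
        tps=$(pgbench -S -T "$dur" -j "$c" -c "$c" 2>/dev/null |
              awk '/excluding connections/ { print $3 }')
        echo "clients=$c tps=$tps"
    done
}
```

Plotting clients against tps makes the knee at the physical (or hyperthreaded) core count easy to see.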
[PERFORM] HT on or off for E5-26xx ?
I'm bringing up a new type of server using the Intel E5-2620 (single socket), which was selected for good SpecIntRate performance vs cost/power (201, for $410 and 95W). I was assuming it was 6-core, but I just noticed it has HT, which is currently enabled, since I see 12 cores in /proc/cpuinfo.

Question for the performance experts: is it better to have HT enabled or disabled for this generation of Xeon? Workload will be moderately concurrent, small OLTP-type transactions. We'll also run a few low-load VMs (using KVM) and a big Java application.

Any thoughts welcome. Thanks.
Re: [PERFORM] hardware advice
On 10/2/2012 2:20 AM, Glyn Astill wrote:
> ...newer R910s recently all of a sudden went dead to the world; no prior symptoms showing in our hardware and software monitoring, no errors in the OS logs, nothing in the Dell DRAC logs. After a hard reset it's back up as if nothing happened, and it's an issue I'm none the wiser to the cause of. Not good for peace of mind.

This could be an OS bug rather than a hardware problem.
Re: [PERFORM] hardware advice
On 9/28/2012 9:46 AM, Craig James wrote:
> Your best warranty would be to have the confidence to do your own repairs, and to have the parts on hand. I'd seriously consider putting your own system together. Maybe go to a few sites with pre-configured machines and see what parts they use. Order those, screw the thing together yourself, and put a spare of each critical part on your shelf.

This is what I did for years, but after taking my old parts collection to the landfill a few times, I realized I may as well just buy N+1 machines and keep zero spares on the shelf. That way I get a spare machine available for use immediately, and I know the parts are working (parts on the shelf may be defective). If something breaks, I use the spare machine until the replacement parts arrive.

Note in addition that a warranty can be extremely useful in certain organizations as a vehicle of blame avoidance (this may be its primary purpose, in fact). If I buy a bunch of machines that turn out to have buggy NICs, well, that's my fault, and I can kick myself since I own the company, stay up late into the night reading kernel code, and buy new NICs. If I had an evil Dilbertian boss, then well... I'd be seriously thinking about buying Dell boxes in order to blame Dell rather than myself, and be able to say "everything is warrantied" if badness goes down. Just saying...
Re: [PERFORM] hardware advice
On 9/27/2012 3:16 PM, Claudio Freire wrote:
> Careful with AMD, since many (I'm not sure about the latest ones) cannot saturate the memory bus when running single-threaded. So: great if you have a highly concurrent workload, quite bad if you don't.

Actually we test memory bandwidth with John McCalpin's "stream" program. Unfortunately it is hard to find stream test results for recent machines, so it can be difficult to compare two boxes unless you own examples of both, which is why I didn't mention it as a useful option. But if you can find results for the machines, or ask a friend to run it for you... definitely useful information.
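For reference, running stream on a box you do own is quick. A sketch, assuming you've fetched McCalpin's stream.c yourself; the array size is an assumption, and should be several times your last-level cache so you measure memory rather than cache bandwidth:

```shell
# Build and run the STREAM memory-bandwidth benchmark from stream.c.
run_stream() {
    [ -f stream.c ] || { echo "fetch stream.c from the STREAM site first" >&2; return 1; }
    gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=40000000 stream.c -o stream || return 1
    OMP_NUM_THREADS=1 ./stream           # single-threaded bandwidth
    OMP_NUM_THREADS="$(nproc)" ./stream  # all cores hitting the memory bus
}
```

Comparing the single-threaded Triad figure against the all-cores one shows exactly the saturation behavior being discussed.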
Re: [PERFORM] hardware advice
On 9/27/2012 2:47 PM, Shaun Thomas wrote:
> On 09/27/2012 02:40 PM, David Boreham wrote:
>> I think the newer CPU is the clear winner with a specintrate performance of 589 vs 432.
> The comparisons you linked to had 24 absolute threads pitted against 32, since the newer CPUs have a higher maximum cores per CPU. That said, you're right that it has a fairly large cache. And from my experience, Intel CPU generations have been scaling incredibly well lately. (Opteron, we hardly knew ye!)

Yes, the "rate" spec test uses all the available cores. I'm assuming a concurrent workload, but since the single-thread performance isn't that much different between the two, I think the higher number of cores, larger cache, newer-design CPU is the best choice.

> We went from Dunnington to Nehalem, and it was stunning how much better the X5675 was compared to the E7450. Sandy Bridge isn't quite that much of a jump though, so if you don't need that kind of bleeding-edge, you might be able to save some cash. This is especially true since the E5-2600 series has the same TDP profile and both use 32nm lithography.

We use Opteron on a price/performance basis. Intel always seems to come up with some way to make their low-cost processors useless (such as limiting the amount of memory they can address).
Re: [PERFORM] hardware advice
On 9/27/2012 2:55 PM, Scott Marlowe wrote:
> Whatever you do, go for the Intel ethernet adaptor option. We've had so many headaches with integrated Broadcom NICs. :(

Sound advice, but not a get-out-of-jail card unfortunately: we had a horrible problem with the Intel e1000 driver in RHEL for several releases. It was finally diagnosed just as RH shipped a fixed driver.
Re: [PERFORM] hardware advice
On 9/27/2012 1:56 PM, M. D. wrote:
> I'm in Belize, so what I'm considering is from eBay, where it's unlikely that I'll get the warranty. Should I consider some other brand instead? To build my own or buy custom might be an option too, but I would not get any warranty.

I don't have any recent experience with white-label system vendors, but I suspect they are assembling machines from Supermicro, Asus, Intel or Tyan motherboards and enclosures, which is what we do. You can buy the hardware from suppliers such as newegg.com. It takes some time to read the manufacturer's documentation, figure out what kind of memory to buy and so on, which is basically what you're paying a white-label box seller to do for you.

For example, here's a similar barebones system to the R720 I found with a couple of minutes' searching on newegg.com: http://www.newegg.com/Product/Product.aspx?Item=N82E16816117259

You could order that SKU, plus the two CPU devices, however many memory sticks you need, and drives. If you need less RAM (the Dell box allows up to 24 sticks) there are probably cheaper options. The equivalent Supermicro box looks to be somewhat less expensive: http://www.newegg.com/Product/Product.aspx?Item=N82E16816101693

When you consider downtime and the cost to ship equipment back to the supplier, a warranty doesn't have much value to me, but it may be useful in your situation.
Re: [PERFORM] hardware advice
On 9/27/2012 1:37 PM, Craig James wrote:
> We use a "white box" vendor (ASA Computers), and have been very happy with the results. They build exactly what I ask for and deliver it in about a week. They offer on-site service and warranties, but don't pressure me to buy them. I'm not locked in to anything. Their prices are good.

I'll second that: we build our own machines from white-label parts for typically less than half the Dell list price. However, Dell does provide value to some people: for example, you can point a third-party software vendor at a Dell box and demand they make their application work properly, whereas they may turn their nose up at a white-label box. The same goes for operating systems: we have spent much time debugging Linux kernel issues on white-box hardware. On Dell hardware we would most likely not have hit those bugs, because Red Hat tests on Dell. So YMMV...
Re: [PERFORM] hardware advice
On 9/27/2012 1:11 PM, M. D. wrote:
> I want to buy a new server, and am contemplating a Dell R710 or the newer R720. The R710 has the X5600 series CPU, while the R720 has the newer E5-2600 series CPU.

For this, the best data I've found (excepting actually running tests on the physical hardware) is the SpecIntRate2006 numbers, which can be found for both machines on the spec.org web site. I think the newer CPU is the clear winner, with a specintrate performance of 589 vs 432. It also has a significantly larger cache. Comparing single-threaded performance, the older CPU is slightly faster (50 vs 48), but that wouldn't be a big enough difference to make me pick it. The Sandy Bridge-based machine will also likely use less power.

http://www.spec.org/cpu2006/results/res2012q2/cpu2006-20120604-22697.html
http://www.spec.org/cpu2006/results/res2012q1/cpu2006-20111219-19272.html

To find more results use this page: http://www.spec.org/cgi-bin/osgresults?conf=cpu2006;op=form (enter R710 or R720 in the "system" field).
Re: [PERFORM] SSD selection
On 5/16/2012 11:01 AM, Merlin Moncure wrote: Although your assertion is 100% supported by Intel's marketing numbers, there are some contradicting numbers out there that show the drives offering pretty similar performance. For example, look here: http://www.anandtech.com/show/4902/intel-ssd-710-200gb-review/4 and you can see that 4k aligned writes are giving quite similar results (14k iops) even though the 710 is only rated for 2700 iops while the 320 is rated for 21000 IOPS. Other benchmarks also show similar results. Actually I said the same thing you're saying: that the two series will deliver similar performance. The spec numbers however would be for worst case conditions (in the case of the 710). I'm not convinced that those tests were exercising the worst-case part of the envelope.
Re: [PERFORM] SSD selection
On 5/15/2012 12:16 PM, Rosser Schwarz wrote: As the other posters in this thread have said, your best bet is probably the Intel 710 series drives, though I'd still expect some 320-series drives in a RAID configuration to be pretty stupendously fast. One thing to mention is that the 710 is not faster than the 320 series (unless in your definition of fast you count potential GC pauses of course). The 710's primary claim to fame is that it has endurance and GC characteristics designed for server and database use (constant load, heavy write load). So 320 drives will be just as fast, if not faster, but they will wear out much more quickly (possibly not a concern for the OP in his deployment) and may suffer from unwelcome GC pauses.
Re: [PERFORM] SSD selection
On 5/15/2012 9:21 AM, Віталій Тимчишин wrote: We've reached the point where we would like to try SSDs. We've got a central DB currently 414 GB in size and increasing. The working set does not fit into our 96GB RAM server anymore. So, the main question is what to take. Here's what we've got: 1) Intel 320. Good, but slower than current generation SandForce drives. 2) Intel 330. Looks like a cheap 520 without a capacitor. 3) Intel 520. Faster than the 320; no capacitor. 4) OCZ Vertex 3 Pro - not available, even on the OCZ site. 5) OCZ Deneva - can't find in my country :) Is the 710 series too costly for your deployment? I ask because that would be the obvious choice for a database (much better write endurance than any of the drives above, and less likely to suffer from firmware bugs or unpleasant GC behavior). We've been running them in production for a few months with zero problems and great performance. The price on the 710's tends to vary on whether they're in stock: NewEgg is currently showing $1100 for the 300G drive, but no stock.
Re: [PERFORM] Intel 710 pgbench write latencies
So the Intel 710 kind of sucks latency wise. Is it because it is also heavily reading, and maybe WAL should not be put on it? A couple quick thoughts: 1. There are a lot of moving parts in the system besides the SSDs. It will take some detailed analysis to determine the cause for the outlying high latency transactions. The cause may not be as simple as one SSD processes I/O operations less quickly than another. For example the system may be subject to some sort of starvation issue in PG or the OS that is affected by quite small differences in underlying storage performance. 2. What are your expectations for maximum transaction latency ? In my experience it is not possible to guarantee sub-second (or even sub-multi-second) latencies overall in a system built with general purpose OS and database software. (put another way : a few outlying 1 second and even several-second transactions would be pretty much what I'd expect to see on a database under sustained saturation load as experienced under a pgbench test). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] WAL in RAM
On 10/28/2011 12:26 PM, Tomas Vondra wrote: For example the Intel 710 SSD has a sequential write speed of 210MB/s, while a simple SATA 7.2k drive can write about 50-100 MB/s for less than 1/10 of the 710 price. Bulk data transfer rates mean almost nothing in the context of a database (unless you're for example backing it up by copying the files to another machine...) The key factor typically is small block writes/s (for WAL) and random small block reads/s (for data). 710 or similar performance SSDs will deliver on the order of 20-50x the performance of a traditional hard drive in these areas. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Choosing between Intel 320, Intel 510 or OCZ Vertex 3 SSD for db server
On 10/28/2011 12:40 AM, Amitabh Kant wrote: Sadly, 710 is not that easily available around here at the moment. All three sizes are in stock at newegg.com, if you have a way to export from the US to your location.
Re: [PERFORM] Choosing between Intel 320, Intel 510 or OCZ Vertex 3 SSD for db server
On 10/25/2011 8:55 AM, Claudio Freire wrote: But what about unexpected failures. Faulty electronics, stuff like that? I really don't think a production server can work without at least raid-1. Same approach : a server either works or it does not. The transition between working and not working may be expected or not expected. The solution is the same : use another machine to perform the work the now not working machine was doing. The big benefit of this approach is that you now do not need to worry about specific kinds of failure or individual components, including storage. If it helps, think of this architecture as raid-1, but with the whole machine being the "drive" rather than individual storage devices. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Choosing between Intel 320, Intel 510 or OCZ Vertex 3 SSD for db server
On 10/24/2011 4:47 PM, Claudio Freire wrote: What about redundancy? How do you swap an about-to-die SSD? Software RAID-1? The approach we take is that we use 710 series devices which have predicted reliability similar to all the other components in the machine, therefore the unit of replacement is the entire machine. We don't use trays for example (which saves quite a bit on data center space). If I were running short endurance devices such as 320 series I would be interested in replacing the drives before the machine itself is likely to fail, but I'd do so by migrating the data and load to another machine for the replacement to be done offline. Note that there are other operations procedures that need to be done and can not be done without downtime (e.g. OS upgrade), so some kind of plan to deliver service while a single machine is down for a while will be needed regardless of the storage device situation. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Choosing between Intel 320, Intel 510 or OCZ Vertex 3 SSD for db server
On 10/24/2011 3:31 PM, Merlin Moncure wrote: 4. Consider using Intel 710 series rather than 320 (pay for them with the money saved from #3 above). Those devices have much, much higher specified endurance than the 320s and since your DB is quite small you only need to buy one of them. 710 is a good idea if and only if you are worried about write durability (in which case it's a great idea). I disagree with this (that durability is the only reason to select the 710 series). The write endurance (specified at least) is orders of magnitude higher. Doing 100's of TPS constantly, it is possible to burn through the 320's endurance lifetime in a year or two.
Re: [PERFORM] Choosing between Intel 320, Intel 510 or OCZ Vertex 3 SSD for db server
A few quick thoughts: 1. 320 would be the only SSD I'd trust from your short-list. It's the only one with proper protection from unexpected power loss. 2. Multiple RAID'ed SSDs sounds like (vast) overkill for your workload. A single SSD should be sufficient (will get you several thousand TPS on pgbench for your DB size). 3. Consider not using the magnetic disks at all (saves on space, power and the cost of the RAID controller for them). 4. Consider using Intel 710 series rather than 320 (pay for them with the money saved from #3 above). Those devices have much, much higher specified endurance than the 320s and since your DB is quite small you only need to buy one of them. On 10/24/2011 8:09 AM, Amitabh Kant wrote: Hello I need to choose between Intel 320 , Intel 510 and OCZ Vertex 3 SSD's for my database server. From recent reading in the list and other places, I have come to understand that OCZ Vertex 3 should not be used, Intel 510 uses a Marvel controller while Intel 320 had a nasty bug which has been rectified. So the list narrows down to only 510 and 320, unless I have understood the OCZ Vertex reviews incorrectly. The server would itself be built along these lines: Dual CPU Xeon 5620, 32 or 48 GB RAM, 2 SAS 10K disk in RAID 1 for OS, 2 SAS 10K disk in RAID 1 for pg_xlog and 4 SSD in RAID 10 for data directory (overkill??). OS would be FreeBSD 8.2 (I would be tuning the sysctl variables). PG version would be 9.1 with replication set to another machine (Dual CPU Xeon 54xx, 32 GB RAM, 6 15K SAS 146 GB: 4 in RAID 10 for data and 2 in RAID 1 for OS + pg_xlog). The second machine hosts my current db , and there is not much of an issue with the performance. We need better redundancy now(current was to take a dump/backup every 12 hours), so the new machine. My database itself is not very big, approx 40 GB as of now, and would not grow beyond 80 GB in the next year or two. There are some tables where insert & updates are fairly frequent. 
From what I could gather, we are not doing more than 300-400 tps at the moment, and the growth should not be very high in the short term. Hope someone can give some pointers on which SSD I should go for at the moment. Amitabh
[PERFORM] Intel 710 Endurance Test Results
I ran a test using Intel's timed workload wear indication feature on a 100G 710 series SSD. The test works like this : you reset the wear indication counters, then start running some workload (in my case pgbench at scale 1000 for 4 hours). During the test run a wear indication attribute can be read via smartctl. This contains the drive's estimate for the total drive wear lifetime used during the test run time. Provided we trust the drive's estimate, this in turn provides an answer to the interesting question "how long before my drive wears out in production?". That is, provided the test workload can be reasonably accurately related to production workload. Executive summary : the drive says that we used 0.025% of its wear life during this 4h test. The test performed 47 million transactions on a roughly 17G database. We think that this pgbench workload performs transactions at roughly 10X the rate we expect from our application in production under heavy load. So the drive says you could run this pgbench test for about two years before wearing out the flash devices. Or 20 years under our expected production workload.
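The lifetime projection above comes from simple arithmetic, which can be sketched as follows (the 0.025% wear figure and 4-hour duration are from this test run; the 10X pgbench-to-production ratio is our own rough estimate, not a measurement):

```python
# Back-of-envelope wear-life projection from the timed workload test.
wear_fraction_used = 0.00025      # 0.025% of wear life consumed in the test
test_hours = 4.0

lifetime_hours = test_hours / wear_fraction_used
lifetime_years = lifetime_hours / (24 * 365)
production_ratio = 10             # assumed pgbench tps / production tps

print(f"at pgbench load:    ~{lifetime_years:.1f} years")
print(f"at production load: ~{lifetime_years * production_ratio:.0f} years")
```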
Smart attributes read just after test completion:

[root@server1 9.1]# smartctl -A /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-2.6.32-71.29.1.el6.x86_64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 5
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0020 100   100   000    Old_age  Offline -           0
  4 Start_Stop_Count        0x0030 100   100   000    Old_age  Offline -           0
  5 Reallocated_Sector_Ct   0x0032 100   100   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 100   100   000    Old_age  Always  -           90
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           8
170 Reserve_Block_Count     0x0033 100   100   010    Pre-fail Always  -           0
171 Program_Fail_Count      0x0032 100   100   000    Old_age  Always  -           0
172 Erase_Fail_Count        0x0032 100   100   000    Old_age  Always  -           0
174 Unknown_Attribute       0x0032 100   100   000    Old_age  Always  -           4
183 SATA_Downshift_Count    0x0030 100   100   000    Old_age  Offline -           0
184 End-to-End_Error        0x0032 100   100   090    Old_age  Always  -           0
187 Reported_Uncorrect      0x0032 100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0032 073   070   000    Old_age  Always  -           27 (Min/Max 17/30)
192 Unsafe_Shutdown_Count   0x0032 100   100   000    Old_age  Always  -           4
194 Temperature_Celsius     0x0032 100   100   000    Old_age  Always  -           32
199 UDMA_CRC_Error_Count    0x0030 100   100   000    Old_age  Offline -           0
225 Host_Writes_32MiB       0x0032 100   100   000    Old_age  Always  -           51362
226 Workld_Media_Wear_Indic 0x0032 100   100   000    Old_age  Always  -           26
227 Workld_Host_Reads_Perc  0x0032 100   100   000    Old_age  Always  -           1
228 Workload_Minutes        0x0032 100   100   000    Old_age  Always  -           242
232 Available_Reservd_Space 0x0033 100   100   010    Pre-fail Always  -           0
233 Media_Wearout_Indicator 0x0032 100   100   000    Old_age  Always  -           0
241 Host_Writes_32MiB       0x0032 100   100   000    Old_age  Always  -           51362
242 Host_Reads_32MiB        0x0032 100   100   000    Old_age  Always  -           822

pgbench output from the test run:

bash-4.1$
/usr/pgsql-9.1/bin/pgbench -T 14400 -j 8 -c 64
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1000
query mode: simple
number of clients: 64
number of threads: 8
duration: 14400 s
number of transactions actually processed: 47205109
tps = 3278.127690 (including connections establishing)
tps = 3278.135396 (excluding connections establishing)
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/2/2011 11:55 PM, Gregory Gerard wrote: Which repo did you get them from? http://yum.postgresql.org/9.1/redhat/rhel-$releasever-$basearch
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/2/2011 10:49 PM, David Boreham wrote: There are some OS differences between the old and new servers : old is running CentOS 5.7 while the new is running 6.0. Old server has atime enabled while new has relatime mount option specified. Both are running PG 9.1.1 from the yum repo. Also the old server is using ext3 while the new has ext4 (with discard/trim enabled).
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/2/2011 10:35 PM, Greg Smith wrote: That sounds about the same performance as the 320 drive I tested earlier this year then. You might try duplicating some of the benchmarks I ran on that: http://archives.postgresql.org/message-id/4d9d1fc3.4020...@2ndquadrant.com Thanks. Actually I had been looking for that email, which my brain remembered but my computer and Google could not find ;) Make sure to reference the capacity of the drive though. The 320 units do scale their performance based on that, presumably there's some of that with the 710s as well. This is a 100G drive. The performance specs vary curiously vs capacity for the 710 series : write tps actually goes down as the size increases, but bulk write data rate is higher for the larger drives. I ran some pgbench earlier this evening, before reading your old email above, so the parameters are different. This is the new server with 100G 710 (AMD 6128 with 64G):

bash-4.1$ /usr/pgsql-9.1/bin/pgbench -T 600 -j 8 -c 64
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 64
number of threads: 8
duration: 600 s
number of transactions actually processed: 2182014
tps = 3636.520405 (including connections establishing)
tps = 3636.738290 (excluding connections establishing)

This is the output from iostat while the test runs:

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 4851.00 0.00 14082.00 0.00 73.76 10.73 45.72 3.25 0.05 71.60

This is our current production server type (Dual AMD 2346HE 32G 10K 300G 'raptor) with disk write cache turned off and with data and wal on the same drive:

bash-3.2$ /usr/pgsql-9.1/bin/pgbench -T 600 -j 8 -c 64
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 64
number of threads: 8
duration: 600 s
number of transactions actually processed: 66426
tps = 108.925653 (including connections establishing)
tps = 108.941509 (excluding connections establishing)

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 438.00 0.00 201.00 0.00 2.60 26.47 55.95 286.09 4.98 100.00

Same server with disk write cache turned on:

bash-3.2$ /usr/pgsql-9.1/bin/pgbench -T 600 -j 8 -c 64
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 100
query mode: simple
number of clients: 64
number of threads: 8
duration: 600 s
number of transactions actually processed: 184724
tps = 305.654008 (including connections establishing)
tps = 305.694317 (excluding connections establishing)

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 668.00 0.00 530.00 0.00 4.55 17.57 142.99 277.28 1.89 100.10

There are some OS differences between the old and new servers: the old is running CentOS 5.7 while the new is running 6.0. The old server has atime enabled while the new has the relatime mount option specified. Both are running PG 9.1.1 from the yum repo. One very nice measurement is the power consumption on the new server: peak dissipation is 135W under the pgbench load (measured on the AC input to the PSU). Idle draw is around 90W.
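For quick reference, the speedups implied by the tps numbers in these runs work out as follows (a sketch; the values are copied from the pgbench output in this thread):

```python
# Ratios between the pgbench results quoted in this thread
# (tps excluding connection establishment).
ssd_710_tps = 3636.74        # new server, 100G Intel 710
raptor_nocache_tps = 108.94  # old server, 'raptor, write cache off
raptor_cache_tps = 305.69    # old server, 'raptor, write cache on

print(f"710 vs raptor (cache off): {ssd_710_tps / raptor_nocache_tps:.0f}x")
print(f"710 vs raptor (cache on):  {ssd_710_tps / raptor_cache_tps:.0f}x")
print(f"drive write cache alone:   {raptor_cache_tps / raptor_nocache_tps:.1f}x")
```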
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/2/2011 6:26 PM, Gregory Gerard wrote: If I may ask what were your top three candidates before choosing the intel? All the other options considered viable were using traditional rotational disks. I personally don't have any confidence in the other SSD vendors today, except perhaps for FusionIO (where a couple of old friends work, and I can certainly vouch for their competence) but their products are too costly for our application at present. Also why not just plan a graceful switch to a replicated server? At some point you have to detect the drive is about to go (or it just goes without warning). Presumably that point will be in a while and be coordinated with an upgrade like 9.2 in a year. Sure, we have this capability but once you walk through what has to happen if you are burning through SSDs every few months, the 710 value proposition is more attractive for us. For example our data center is 1200 miles from our HQ and it takes a very long road trip or a plane flight to get hands-on with the boxes. We spent some considerable time planning for the 320 style deployment to be honest -- figuring out how to predict when the drive would wear out, building replication mechanisms that would cope gracefully and so on. But given the option of the 710 where wear out can essentially be crossed off the list of things to worry about, that's the way we decided to go. Finally why not the pci based cards? Few reasons: 1) Today Intel doesn't make them (that will change soon), 2) a desire to maintain backwards compatibility at least for this generation, on a system architecture level with traditional disk drives, 3) concerns about mechanical integrity and system airflow issues with the PCI card and connector in 1U enclosures. The SSD fits into the same location as a traditional disk but can be velcro'ed down rather than bolted for easier field replacement. 
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/2/2011 2:33 AM, Arjen van der Meijden wrote: Given the fact that you can get two 320's for the price of one 710, its probably always a bit difficult to actually make the choice (unless you want a fixed amount of disks and the best endurance possible for that). One thing I'd add to this is that the price/bit is more like 4X ($2k for the 300G 710 vs $540 for the 300G 320). The largest 710 drive is 300G whereas the largest 320 is 600G which may imply that the 710's are twice as over-provisioned as the 320. It may be that at present we're paying 2x for the relative over-provisioning and another 2x to enjoy the better silicon and firmware. This hopefully implies that prices will fall in the future provided a credible competitor emerges (Hitachi??). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/1/2011 9:22 PM, Andy wrote: Do you have an Intel 320? I'd love to see tests comparing 710 to 320 and see if it's worth the price premium. Good question. I don't have a 320 drive today, but will probably get one for testing soon. However, my conclusion based on the Intel spec documents is that the 710 and 320 will have similar performance. We elected to use 710 devices not for performance reasons vs the 320 but for the specified lifetime endurance. The 710 series is specified at around 4k complete drive overwrites where the 320 is specified at only around 100. So I see the 710 series as "like the 320 but much less likely to wear out". It may also have better behavior under constant load (the white paper makes vague mention of different background GC vs the non-enterprise drives). So for our uses, the 320 series looks great except for concerns that: a) they may wear out quite quickly, leading to extra cost to enter the data center and pull drives, etc., and the need to maintain a long-term test rig to determine if and when they wear out before it happens in production; b) the GC may behave badly under constant load (leading for example to unexpected periods of relatively very poor performance). The value proposition for the 710 vs. 320 for me is not performance but the avoidance of these two worries.
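A hedged sketch of what those endurance figures imply: the ~4000 vs ~100 drive-overwrite numbers are the ones quoted above, while the 100 GB capacity and the 30 GB/day sustained write volume are illustrative assumptions, not figures from the spec sheets.

```python
# Specified endurance as total bytes written, and as calendar time under
# an assumed write load. Capacity and daily volume are made-up examples.
capacity_gb = 100
daily_writes_gb = 30   # hypothetical sustained write volume

results = {}
for name, overwrites in [("710", 4000), ("320", 100)]:
    endurance_tb = overwrites * capacity_gb / 1000
    years = overwrites * capacity_gb / daily_writes_gb / 365
    results[name] = years
    print(f"{name}: ~{endurance_tb:.0f} TB endurance, "
          f"~{years:.1f} years at {daily_writes_gb} GB/day")
```

Under these assumptions the 320 wears out in under a year while the 710 lasts decades, which is the shape of the trade-off described in the post.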
Re: [PERFORM] Suggestions for Intel 710 SSD test
On 10/1/2011 10:00 PM, Gregory Gerard wrote: How does this same benchmark compare on similar (or same) hardware but with magnetic media? I don't have that data at present :( So far I've been comparing performance with our current production machines, which are older. Those machines use 'raptors with no BBU/WBC, so their tps performance is very low (100 per second or so). Since our application tends to be limited by the transaction commit rate, the SSD-based machine is clearly going to be at least an order of magnitude faster. But I don't know its relative performance vs. the traditional BBU/WBC controller + string of drives approach (we rejected that as an option already for power consumption and space reasons). I could install a raptor drive in the new server and test that, but it wouldn't have a BBU/WBC controller. I guess I could enable the drive's write cache which might produce comparable results to a controller's WBC. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] Suggestions for Intel 710 SSD test
I have a 710 (Lyndonville) SSD in a test server. Ultimately we'll run capacity tests using our application (which in turn uses PG), but it'll take a while to get those set up. In the meantime, I'd be happy to entertain running whatever tests folks here would like to suggest, spare time-permitting. I've already tried bonnie++, sysbench and a simple WAL emulation test program I wrote more than 10 years ago. The drive tests at around 160Mbyte/s on bulk data and 4k tps for commit rate writing small blocks.
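In the same spirit as the small-block commit-rate test mentioned above, here is a minimal sketch of such a WAL-emulation microbenchmark (the file name, block size, and iteration count are arbitrary choices for illustration, not the original program):

```python
# Minimal fsync'ed small-block write loop: each iteration emulates one
# WAL commit (write a block, force it to stable storage), then reports
# synchronous writes per second.
import os
import time

def commit_rate(path="fsync_test.dat", block=8192, iterations=200):
    buf = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.time()
        for _ in range(iterations):
            os.write(fd, buf)
            os.fsync(fd)   # the "commit": block must reach stable storage
        elapsed = max(time.time() - start, 1e-9)
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed

if __name__ == "__main__":
    print(f"~{commit_rate():.0f} synchronous 8K writes/s")
```

On a drive (or controller) with a volatile write cache enabled this will report unrealistically high numbers, since the fsync returns before the data is durable.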
Re: [PERFORM] Reports from SSD purgatory
On 8/24/2011 1:32 PM, Tomas Vondra wrote: Why is that important? It's simply a failure of electronics and it has nothing to do with the wear limits. It simply fails without prior warning from the SMART. In the cited article (actually in all articles I've read on this subject), the failures were not properly analyzed*. Therefore the conclusion that the failures were of electronics components is invalid. In the most recent article, people have pointed to it as confirming electronics failures but the article actually states that the majority of failures were suspected to be firmware-related. We know that a) there have been failures, but b) not the cause. We don't even know for sure that the cause was not cell wear. That's because all we know is that the drives did not report wear before failing. The wear reporting mechanism could be broken for all we know. -- *A "proper" analysis would involve either the original manufacturer's FA lab, or a qualified independent analysis lab. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Intel 320 SSD info
On 8/24/2011 11:41 AM, Greg Smith wrote: I've measured the performance of this drive from a couple of directions now, and it always comes out the same. For PostgreSQL, reading or writing 8K blocks, I'm seeing completely random workloads hit a worst-case of 20MB/s; that's just over 2500 IOPS. It's quite possible that number can go lower under pressure of things like internal drive garbage collection however, which I believe is going into the 600 IOPS figure. I haven't tried to force that yet--drive is too useful to me to try and burn it out doing tests like that at the moment. I hope someone from Intel is reading -- it would be well worth their while to just send you a few drives, since you are set up to perform the right test, and can provide impartial results.
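The conversion Greg is doing there (20MB/s of completely random 8K blocks into IOPS) is just:

```python
# 20 MB/s worst-case random throughput at PostgreSQL's 8K block size.
mb_per_sec = 20
block_kb = 8
iops = mb_per_sec * 1024 / block_kb
print(f"{iops:.0f} IOPS")
```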
Re: [PERFORM] Intel 320 SSD info
On 8/24/2011 11:23 AM, Andy wrote: According to the specs for database storage: "Random 4KB writes: Up to 600 IOPS" Is that for real? 600 IOPS is *atrociously terrible* for an SSD. Not much faster than mechanical disks. The underlying (Flash block) write rate really is terrible (and slower than most rotating disks). The trick with SSDs is that the firmware performs all kinds of stunts to make the performance seen by the OS much higher (most of the time!). This is akin to write-back caching in a RAID controller, for example, where much higher write rates than the physical drives support are achievable.
Re: [PERFORM] Intel 320 SSD info
On 8/24/2011 11:17 AM, Merlin Moncure wrote: hm, I think they need to reconcile those numbers with the ones on this page: http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-320-series.html 600 write IOPS vs 3.7k/23k. They do provide an explanation (and what I find interesting about this document is that they are basically "coming clean" about the real worst-case performance, which I personally find refreshing and encouraging). The difference is that the high number is achieved if the drive does not need to perform a block erase to process the write (this is true most of the time since the capacity is over-provisioned and there is an expectation that GC will have generated free blocks in the background). The low number is the performance under worst-case conditions where the drive is a) full, b) no blocks have been trimmed, and c) GC wasn't able to run yet. I suspect that in production use it will be possible to predict in advance when the drive is approaching the point where it will run out of free blocks, and hence perform poorly. Whether or not this is possible is a big question for us in planning our transition to SSDs in production. Anyone using SSDs should be aware of how they work and the possible worst case performance. This article helps with that!
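The fast-path/slow-path distinction described above can be illustrated with a toy model. All of the timings here are invented for illustration (they are not Intel figures); the point is only that effective IOPS collapses once every write has to pay for a block erase.

```python
# Toy model: writes are cheap while pre-erased free blocks remain
# (over-provisioning / background GC keeps the pool topped up), and
# fall back to a slow erase-then-write path once the pool is exhausted.
def effective_iops(writes, free_blocks, fast_us=50, slow_us=1700):
    total_us = 0
    for _ in range(writes):
        if free_blocks > 0:
            free_blocks -= 1
            total_us += fast_us   # pre-erased block available
        else:
            total_us += slow_us   # must erase before writing
    return writes / (total_us / 1e6)

print(f"fresh/trimmed drive: ~{effective_iops(10000, 10000):.0f} IOPS")
print(f"full, GC starved:    ~{effective_iops(10000, 0):.0f} IOPS")
```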
[PERFORM] Intel 320 SSD info
Apologies if this has already been posted here (I hadn't seen it before today, and can't find a previous post). This will be of interest to anyone looking at using SSDs for database storage : http://www.intel.com/content/www/us/en/solid-state-drives/ssd-320-enterprise-server-storage-application-specification-addendum.html
Re: [PERFORM] RAID Controllers
On 8/22/2011 10:55 PM, Scott Marlowe wrote: If you're running linux and thus stuck with the command line on the LSI, I'd recommend anything else. MegaRAID is the hardest RAID control software to use I've ever seen. If you can spring for the money, get the Areca 1680: http://www.newegg.com/Product/Product.aspx?Item=N82E16816151023 Be sure and get the battery unit for it. You can configure it from an external ethernet connector very easily, and the performance is outstandingly good. Thanks. I took a look at Areca. The fan on the controller board is a big warning signal for me (those fans are in my experience the single most unreliable component ever used in computers). Can you say a bit more about the likely problems with the CLI ? I'm thinking that I configure the card once, and copy the config to all the other boxes, so even if it's as obscure as Cisco IOS, how bad can it be ? Is the concern more with things like a rebuild; monitoring for drive failures -- that kind of constant management task ? How about Adaptec on Linux ? The supercapacitor and NAND flash idea looks like a good one, provided the firmware doesn't have bugs (true with any write back controller though). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] RAID Controllers
On 8/23/2011 5:14 AM, Robert Schnabel wrote: I'm by no means an expert but it seems to me if you're going to choose between two 6 GB/s cards you may as well put SAS2 drives in. I have two Adaptec 6445 cards in one of my boxes and several other Adaptec series 5 controllers in others. They suit my needs and I haven't had any problems with them. I think it has been mentioned previously but they do tend to run hot so plenty of airflow would be good. Thanks. Good point about airflow. By SAS I meant 6Gbit SAS drives. But we have many servers already with 10k raptors and it is tempting to use those since we would be able to use a common pool of spare drives across all servers. 15K rpm is tempting though. I'm not sure if the DB transaction commit rate scales up linearly when BBU is used (it would without BBU). -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
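The without-BBU ceiling mentioned there comes straight from rotation speed: a synchronous commit waits for the WAL block to reach the platter, which at best happens once per revolution. A quick sketch of that arithmetic:

```python
# Per-session commit ceiling for a drive with no write-back cache:
# at most one synchronous WAL flush per disk revolution.
ceilings = {rpm: rpm / 60 for rpm in (10000, 15000)}
for rpm, commits in ceilings.items():
    print(f"{rpm} rpm: ~{commits:.0f} commits/s (single session, no BBU)")
```

With a BBU the commit completes as soon as the controller's cache accepts it, so the rotation-speed ceiling no longer applies, which is why the linear-scaling question is harder to answer in that case.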
[PERFORM] RAID Controllers
I'm buying a bunch of new machines (all will run an application that heavily writes to PG). These machines will have 2 spindle groups in a RAID-1 config. Drives will be either 15K SAS, or 10K SATA (I haven't decided if it is better to buy the faster drives, or drives that are identical to the ones we are already running in our production servers, thus achieving commonality in spares across all machines). Controller choice looks to be between Adaptec 6405, with the supercapacitor unit; or LSI 9260-4i with its BBU. Price is roughly the same. Would be grateful for any thoughts on this choice. Thanks. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Reports from SSD purgatory
This comment by the author I think tends to support my theory that most of the failures seen are firmware related (and not due to actual hardware failures, which as I mentioned in the previous thread are very rare and should occur roughly equally often in hard drives as SSDs) : /As we explained in the article, write endurance is a spec'ed failure. That won't happen in the first year, even at enterprise level use. That has nothing to do with our data. We're interested in random failures. The stuff people have been complaining about... BSODs with OCZ drives, LPM stuff with m4s, the SSD 320 problem that makes capacity disappear... etc... Mostly "soft" errors. Any hard error that occurs is subject to the "defective parts per million" problem that any electrical component also suffers from./ and from the main article body: /Firmware is the most significant, and we see its impact in play almost every time an SSD problem is reported. / (Hard drives also suffer from firmware bugs of course) I think I'm generally encouraged by this article because it suggests that once the firmware bugs are fixed (or if you buy from a vendor less likely to ship with bugs in the first place), then SSD reliability will be much better than it is perceived to be today.
Re: [PERFORM] Postgres refusing to use >1 core
On 5/11/2011 9:17 PM, Aren Cambre wrote: So here's what's going on. If I were doing this, considering the small size of the data set, I'd read all the data into memory. Process it entirely in memory (with threads to saturate all the processors you have). Then write the results to the DB. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
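The suggested approach can be sketched as a tiny in-memory ETL loop. This is a sketch only: `transform` and `write_batch` are hypothetical stand-ins for the real per-row work and for whatever batched write your DB layer provides (e.g. a multi-row INSERT or COPY).

```python
from concurrent.futures import ThreadPoolExecutor

def etl_in_memory(rows, transform, write_batch, workers=8):
    """Read everything into memory, transform it in parallel, then hand
    the results to the database in one batched write instead of one
    round trip per row."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transform, rows))  # order-preserving
    write_batch(results)
    return len(results)
```

Note that in CPython, threads only saturate multiple cores when the transform releases the GIL (C extensions, I/O); for pure-Python CPU work a process pool would be the equivalent sketch.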
Re: [PERFORM] Benchmarking a large server
On 5/9/2011 6:32 PM, Craig James wrote: Maybe this is a dumb question, but why do you care? If you have 1TB RAM and just a little more actual disk space, it seems like your database will always be cached in memory anyway. If you "eliminate the cache effect," won't the benchmark actually give you the wrong real-life results? The time it takes to populate the cache from a cold start might be important. Also, if it were me, I'd be wanting to check for weird performance behavior at this memory scale. I've seen cases in the past where the VM subsystem went bananas because the designers and testers of its algorithms never considered the physical memory size we deployed. How many times was the kernel tested with this much memory, for example ? (never??) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Benchmarking a large server
On 5/9/2011 3:11 PM, Merlin Moncure wrote: The problem with bonnie++ is that the results aren't valid, especially the read tests. I think it refuses to even run unless you set special switches. I only care about writes ;) But definitely, be careful with the tools. I tend to prefer small programs written in house myself, and of course simply running your application under a synthesized load. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Benchmarking a large server
hm, if it was me, I'd write a small C program that just jumped around directly on the device and did random writes, assuming it wasn't formatted. For sequential read, just flush caches and dd the device to /dev/null. Probably someone will suggest better tools though. I have a program I wrote years ago for a purpose like this. One of the things it can do is write to the filesystem at the same time as dirtying pages in a large shared or non-shared memory region. The idea was to emulate the behavior of a database reasonably accurately. Something like bonnie++ would probably be a good starting point these days though. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
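A minimal sketch of the random-write side of such a test (in Python rather than C, and against an ordinary file rather than a raw device; the block size, file size and op count are made-up parameters, not recommendations):

```python
import os
import random
import time

def random_write_test(path, file_size, block_size, n_ops, seed=0):
    """Time synchronous random-offset writes -- the access pattern the
    suggestion above describes. Each write is followed by fsync so we
    measure the storage, not the page cache."""
    rng = random.Random(seed)
    block = b"x" * block_size
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, file_size)
        start = time.perf_counter()
        for _ in range(n_ops):
            offset = rng.randrange(0, file_size - block_size)
            os.pwrite(fd, block, offset)
            os.fsync(fd)
        elapsed = max(time.perf_counter() - start, 1e-9)
    finally:
        os.close(fd)
    return n_ops / elapsed  # synchronous random-write IOPS
```

The sequential-read half really is just `dd if=/dev/sdX of=/dev/null bs=1M` after dropping caches, as the quoted post says.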
Re: [PERFORM] Intel SSDs that may not suck
On 4/6/2011 9:19 PM, gnuo...@rcn.com wrote: SSDs have been around for quite some time. The first that I've found is Texas Memory. Not quite 1977, but not flash either, although they've been doing so for a couple of years. Well, I built my first ram disk (which of course I thought I had invented, at the time) in 1982. But today we're seeing solid state storage seriously challenging rotating media across all applications, except at the TB and beyond scale. That's what's new. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Intel SSDs that may not suck
Had to say a quick thanks to Greg and the others who have posted detailed test results on SSDs here. For those of us watching for the inflection point where we can begin the transition from mechanical to solid state storage, this data and experience is invaluable. Thanks for sharing it. A short story while I'm posting : my Dad taught electronics engineering and would often visit the local factories with groups of students. I remember in particular after a visit to a disk drive manufacturer (Burroughs), in 1977 he came home telling me that he'd asked the plant manager what their plan was once solid state storage made their products obsolete. The manager looked at him like he was from another planet... So I've been waiting patiently 34 years for this hopefully soon-to-arrive moment ;) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] GPU Accelerated Sorting
On 8/30/2010 3:18 PM, Chris Browne wrote: ... As long as you're willing to rewrite PostgreSQL in Occam 2... Just re-write it in Google's new language 'Go' : it's close enough to Occam and they'd probably fund the project.. ;) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] GPU Accelerated Sorting
Feels like I fell through a worm hole in space/time, back to inmos in 1987, and a guy from marketing has just walked in the office going on about there's a customer who wants to use our massively parallel hardware to speed up databases... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Testing Sandforce SSD
Do you guys have any more ideas to properly 'feel this disk at its teeth' ? While an 'end-to-end' test using PG is fine, I think it would be easier to determine if the drive is behaving correctly by using a simple test program that emulates the storage semantics the WAL expects. Have it write a constant stream of records, fsync'ing after each write. Record the highest record number flushed so far in some place that won't be lost with the drive under test (e.g. send it over the network to another machine). Kill the power, bring the system back up again and examine what's at the tail end of that file. I think this will give you the worst case test with the easiest result discrimination. If you want to, you could add concurrent random writes to another file for extra realism. Someone here may already have a suitable test program. I know I've written several over the years in order to test I/O performance, prove the existence of kernel bugs, and so on. I doubt it matters much how long the power is turned off. A second should be plenty of time to flush pending writes if the drive is going to do so. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
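A sketch of the writer/checker pair described above (Python for brevity; in the real test `last_flushed` would be reported to another machine before the power is cut, and the checker run after the power-cycle):

```python
import os

def write_records(path, n):
    """Append sequential record numbers, fsync'ing after each write, and
    return the highest record number known to be durable. After fsync
    returns, the drive is claiming the record is on stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    last_flushed = -1
    try:
        for i in range(n):
            os.write(fd, ("%d\n" % i).encode())
            os.fsync(fd)      # in the real test, report i remotely here
            last_flushed = i
    finally:
        os.close(fd)
    return last_flushed

def tail_record(path):
    """After the power-cycle, see what actually survived at the tail."""
    with open(path) as f:
        fields = f.read().split()
    return int(fields[-1]) if fields else -1
```

If `tail_record()` ever comes back smaller than the last remotely-reported number, the drive acknowledged a flush it had not actually performed.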
Re: [PERFORM] performance on new linux box
On 7/8/2010 3:18 PM, timothy.noo...@emc.com wrote: How does the linux machine know that there is a BBU installed and to change its behavior or change the behavior of Postgres? I am experiencing performance issues, not with searching but more with IO. It doesn't change its behavior at all. It's in the business of writing stuff to a file and waiting until that stuff has been put on the disk (it wants a durable write). What the write buffer/cache does is to inform the OS, and hence PG, that the write has been done when in fact it hasn't (yet). So the change in behavior is only to the extent that the application doesn't spend as much time waiting. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
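To see where the waiting happens, here is a toy timing harness contrasting durable (fsync-per-write, the WAL pattern) and buffered writes. A sketch only: on real hardware the durable mode is bounded by the disk or by the BBU write cache, which is exactly the gap being discussed.

```python
import os
import time

def timed_writes(path, n, durable):
    """Write n small records; if durable, fsync after each one the way
    the WAL writer does. The difference between the two modes is time
    spent waiting on storage -- the wait a BBU write-back cache hides."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    start = time.perf_counter()
    try:
        for _ in range(n):
            os.write(fd, b"record\n")
            if durable:
                os.fsync(fd)  # don't return until storage acknowledges
    finally:
        os.close(fd)
    return time.perf_counter() - start
```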
Re: [PERFORM] performance on new linux box
On 7/8/2010 1:47 PM, Ryan Wexler wrote: Thanks for the explanations that makes things clearer. It still amazes me that it would account for a 5x change in IO. The buffering allows decoupling of the write rate from the disk rotation speed. Disks don't spin that fast, at least not relative to the speed the CPU is running at. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Linux/PostgreSQL scalability issue - problem with 8 cores
James Mansion wrote: Jakub Ouhrabka wrote: How can we diagnose what is happening during the peaks? Can you try forcing a core from a bunch of the busy processes? (Hmm - does Linux have an equivalent to the useful Solaris pstacks?) There's a 'pstack' for Linux, shipped at least in Red Hat distributions (and possibly others, I'm not sure). It's a shell script wrapper around gdb, so easily ported to any Linux. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] User concurrency thresholding: where do I look?
Tom Lane wrote: Having malloc/free use an internal mutex is necessary in multi-threaded programs, but the backend isn't multi-threaded. Hmm...confused. I'm not following why then there is contention for the mutex. Surely this has to be some other mutex that is in contention, not a heap lock ? It'd be handy to see the call stack for the wait state -- if the thing is spending a significant proportion of its time in contention it should be easy to get that with a simple tool such as pstack or a debugger. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance Tuning and Disk Cache
Barry Moore wrote: I have a very slow query that I'm trying to tune. I think my performance tuning is being complicated by the system's page cache. If I run the query after the system has been busy with other tasks for quite a long time then the query can take up to 8-10 minutes to complete. If I then rerun the same query it will complete in a couple of seconds. Does anyone know how I can repeatedly run the same query in the "worst case scenario" of no postgres data in the disk cache (e.g., clear the page cache or force it to be ignored)? In my experience the only 100% reliable way to do this is to reboot the machine. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Low throughput of binary inserts from windows to linux
Tom Lane wrote: In case I was mistaken, this explanation makes perfect sense to me. But then again it would indicate a 'bug' in libpq, in the sense that it (apparently) sets TCP_NODELAY on linux but not on windows. No, it would mean a bug in Windows in that it fails to honor TCP_NODELAY. Last time I did battle with nagle/delayed ack interaction in windows (the other end has to be another stack implementation -- windows to itself I don't think has the problem), it _did_ honor TCP_NODELAY. That was a while ago (1997) but I'd be surprised if things have changed much since then. Basically nagle has to be turned off for protocols like this (request/response interaction over TCP) otherwise you'll sometimes end up with stalls waiting for the delayed ack before sending, which in turn results in very low throughput, per connection. As I remember Windows client talking to Solaris server had the problem, but various other permutations of client and server stack implementation did not.
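For reference, disabling Nagle is a one-line socket option in the standard BSD sockets API; this sketch (shown in Python, same constants as in C) is the flag the discussion is about:

```python
import socket

def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled, as a
    request/response protocol like libpq's should. Without it, small
    writes can stall waiting for the peer's delayed ACK."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s
```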
Re: [PERFORM] New to PostgreSQL, performance considerations
Alexander Staubo wrote: No, fsync=on. The tps values are similarly unstable with fsync=off, though -- I'm seeing bursts of high tps values followed by low-tps valleys, a kind of staccato flow indicative of a write caching being filled up and flushed. Databases with checkpointing typically exhibit this cyclical throughput syndrome. (put another way : this is to be expected and you are correct that it indicates buffered data being flushed to disk periodically). ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: RES: [PERFORM] Bad iostat numbers
Carlos H. Reimer wrote: I've taken a look in the /var/log/messages and found some temperature messages about the disk drives: Nov 30 11:08:07 totall smartd[1620]: Device: /dev/sda, Temperature changed 2 Celsius to 51 Celsius since last report Can this temperature influence the performance? It can influence 'working-ness', which I guess in turn affects performance ;) But I'm not sure if 50C is too high for a disk drive, it might be ok. If you are able to, I'd say just replace the drives and see if that improves things. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Bad iostat numbers
Carlos H. Reimer wrote:

avg-cpu:  %user  %nice  %system  %iowait  %idle
          50.40   0.00     0.50     1.10  48.00

Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  rkB/s  wkB/s  avgrq-sz   avgqu-sz  await   svctm   %util
sda        0.00    7.80  0.40   6.40   41.60  113.60  20.80  56.80     22.82  570697.50  10.59  147.06  100.00
sdb        0.20    7.80  0.60   6.40   40.00  113.60  20.00  56.80     21.94  570697.50   9.83  142.86  100.00
md1        0.00    0.00  1.20  13.40   81.60  107.20  40.80  53.60     12.93       0.00   0.00    0.00    0.00
md0        0.00    0.00  0.00   0.00    0.00    0.00   0.00   0.00      0.00       0.00   0.00    0.00    0.00

Are they not saturated? What kind of parameters should I pay attention to when comparing SCSI controllers and disks? I would like to discover how much cache is present in the controller; how can I find this value from Linux?

These numbers look a bit strange. I am wondering if there is a hardware problem on one of the drives or on the controller. Check in syslog for messages about disk timeouts etc. 100% util but 6 writes/s is just wrong (unless the drive is a 1980's vintage floppy). ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
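As a sanity check on those numbers: iostat's %util is roughly IOPS times per-I/O service time, and sda's figures only reach 100% because the service time is absurd. A hedged back-of-envelope helper (not an iostat reimplementation):

```python
def expected_util(r_per_s, w_per_s, svctm_ms):
    """%util ~= (r/s + w/s) * svctm / 1000 * 100.
    For sda above: ~6.8 IOPS at 147 ms per I/O -> pegged at 100%.
    A healthy drive serving the same tiny load at ~8 ms would sit
    nearly idle."""
    return (r_per_s + w_per_s) * svctm_ms / 1000.0 * 100.0
```

147 ms to service one small write is an order of magnitude beyond even a slow spindle's seek time, which supports the hardware-problem diagnosis in the reply.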
Re: [PERFORM] measuring shared memory usage on Windows
Unfortunately, operating system virtual memory and filesystem caching code often does exactly the opposite of what a database application would like. For some reason the kernel guys don't see it that way ;) Over the years there have been various kernel features added with the overall goal of solving problems in this area : O_DIRECT, ckrm, flags to mmap() and so on. So far I'm not sure any of them has really succeeded. Hence the real-world wisdom to 'let the filesystem cache do its thing' and configure a small shared memory cache in the application. Ideally one would want to use O_DIRECT or equivalent to bypass the OS's cleverness and manage the filesystem caching oneself. However it turns out that enabling O_DIRECT makes things much worse not better (YMMV obviously). It's hard to achieve the level of concurrency that the kernel can get for disk I/O, from user mode. Another approach is to make the application cache size dynamic, with the goal that it can grow and shrink to reach the size that provides the best overall performance. I've seen attempts to drive the sizing using memory access latency measurements done from user mode inside the application. However I'm not sure that anyone has taken this approach beyond the science project stage. So AFAIK this is still a generally unsolved problem. NT (Windows) is particularly interesting because it drives the filesystem cache sizing with a signal that it measures from the VM pages-evicted-per-second counter. In order to keep its feedback loop stable, the OS wants to see a non-zero value for this signal at all times. So you will see that even under ideal conditions the system will still page a little. (Unless that code has changed in Win2003 -- it's been a while since I checked). So don't drive yourself crazy trying to get it to stop paging ;) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] measuring shared memory usage on Windows
really makes me think that that area is just a comfortable way to access files on disk as memory areas; with the hope of probably better caching than not-memory-mapped files. No, absolutely not. CreateFileMapping() does much the same thing as mmap() in Unix. That would explain my disturbing impressions of performance of PostgreSQL on win32 rising when lowering shared_memory... I don't know what your disturbing impressions are, but no it doesn't explain them. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] measuring shared memory usage on Windows
I learned the hard way that just rising it can lead to a hard performance loss :) I looked back in the list archives to try to find your post on the underlying problem, but could only find this rather terse sentence. If you have more detailed information please post or point me at it. But...my first thought is that you have configured the shared memory region so large that the system as a whole can not fit all the working set sizes for all running processes in to physical memory. This is a common pitfall for databases with caches implemented as mapped shared user space regions (which is basically all databases). For example, if you have 1G of RAM on the box, you can't configure a cache of 900 meg and expect things to work well. This is because the OS and associated other stuff running on the box will use ~300megs. The system will page as a result. The only sure fire way I know of to find the absolute maximum cache size that can be safely configured is to experiment with larger and larger sizes until paging occurs, then back off a bit. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Performance penalty for remote access of postgresql
Guoping Zhang wrote: a) SERVER A to SERVER B: 0.35ms SERVER A to itself (Local host): 0.022ms 0.35ms seems rather slow. You might try investigating what's in the path. For comparison, between two machines here (three GigE switches in the path), I see 0.10ms RTT. Between two machines on the same switch I get 0.08ms. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
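When investigating what's in the path, application-level round-trip time can also be measured directly rather than inferred from ping. A rough sketch (loopback-testable; host and port are whatever pair you want to probe):

```python
import socket
import time

def measure_rtt(host, port, rounds=100):
    """Average TCP round-trip time using 1-byte ping/pong messages
    against an echo service -- a rough stand-in for the RTT numbers
    quoted above. Returns milliseconds per round trip."""
    s = socket.create_connection((host, port))
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(rounds):
        s.sendall(b"x")
        s.recv(1)
    s.close()
    return (time.perf_counter() - start) / rounds * 1000.0
```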
Re: [PERFORM] Which processor runs better for Postgresql?
My suggestion is to look at something like this: http://www.abmx.com/1u-supermicro-amd-opteron-rackmount-server-p-210.html 1U rackmount opteron from Supermicro that can have two dual core opterons and 4 drives and up to 16 gigs of ram. Supermicro server motherboards have always treated me well and performed well too. I've had good experience with similar machines from Tyan : http://www.tyan.com/products/html/gt24b2891.html ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] 64-bit vs 32-bit performance ... backwards?
Anthony Presley wrote: I had an interesting discussion today w/ an Enterprise DB developer and sales person, and was told, twice, that the 64-bit linux version of Enterprise DB (which is based on the 64-bit version of PostgreSQL 8.1) is SIGNIFICANTLY SLOWER than the 32-bit version. Since the guys of EDB are PostgreSQL . has anyone seen that the 64-bit is slower than the 32-bit version? I was told that the added 32-bits puts a "strain" and extra "overhead" on the processor / etc which actually slows down the pointers and necessary back-end "stuff" on the database. I'm curious if anyone can back this up or debunk it. It's about the polar opposite of everything I've heard from every other database vendor for the past several years, and would be quite an eye-opener for me. What they are saying is strictly true : 64-bit pointers tend to increase the working set size of an application vs. 32-bit pointers. This means that any caches will have somewhat lower hit ratio. Also the bytes/s between the CPU and memory will be higher due to moving those larger pointers. In the case of a 32-bit OS this also applies to the kernel so the effect will be system-wide. However, an application that needs to work on > around 2G of data will in the end be much faster 64-bit due to reduced I/O (it can keep more of the data in memory). I worked on porting a large database application from 32-bit to 64-bit. One of our customers required us to retain the 32-bit version because of this phenomenon. In measurements I conducted on that application, the performance difference wasn't great (10% or so), but it was measurable. This was with Sun Sparc hardware. It is possible that more modern CPU designs have more efficient 64-bit implementation than 32-bit, so the opposite might be seen too. Whether or not PG would show the same thing I can't say for sure. Probably it would though. 
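The working-set arithmetic behind the 64-bit penalty is easy to demonstrate. An illustration only (not a measurement of PostgreSQL itself), using fixed-width arrays as stand-ins for tables of 32-bit offsets vs. 64-bit pointers:

```python
import array

# a table of in-memory references: 4-byte entries vs 8-byte entries
narrow = array.array("i", range(1000))  # like 32-bit offsets
wide   = array.array("q", range(1000))  # like 64-bit pointers

def index_megabytes(n_entries, entry_size):
    """Working-set size of a table of n references. Doubling the entry
    size doubles the cache footprint and the memory bandwidth spent
    moving the references around -- the effect described above."""
    return n_entries * entry_size / 2.0 ** 20
```

A million-entry pointer table is 4 MB as 32-bit offsets and 8 MB as 64-bit pointers; the latter evicts twice as much of everything else from the CPU caches, until the data set outgrows 32-bit addressing and the trade reverses.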
Re: [PERFORM] scaling up postgres
Tom Lane wrote: [EMAIL PROTECTED] writes: I'm using httperf/autobench for measurements and the best result I can get is that my system can handle a traffic of almost 1600 New con/sec. As per PFC's comment, if connections/sec is a bottleneck for you then the answer is to use persistent connections. Launching a new backend is a fairly heavyweight operation in Postgres. I thought the OP was talking about HTTP connections/s. He didn't say if he was using persistent database connections or not (obviously better if so). If it were the case that his setup is new backend launch rate-limited, then wouldn't the machine show CPU saturation ? (he said it didn't).
Re: [PERFORM] scaling up postgres
I cannot scale beyond that value and the funny thing is that none of the servers is swapping or heavily loaded; neither postgres nor apache is refusing connections. Hearing a story like this (throughput hits a hard limit, but hardware doesn't appear to be 100% utilized), I'd suspect insufficient concurrency to account for the network latency between the two servers. Also check that your disks aren't saturating somewhere (with iostat or something similar). You could run pstack against both processes and see what they're doing while the system is under load. That might give a clue (e.g. you might see the apache processes waiting on a response from PG, and the PG processes waiting on a new query to process). Since you've proved that your test client and apache can handle a much higher throughput, the problem must lie somewhere else (in postgresql or the interface between the web server and postgresql). ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Memory and/or cache issues?
2006-05-04 18:04:58 EDT USER=postgres DB=FIX1 [12427] PORT = [local] ERROR: invalid memory alloc request size 18446744073709551613 Perhaps I'm off beam here, but any time I've seen an app try to allocate a gazillion bytes, it's due to some code incorrectly calculating the size of something (or more commonly, using an uninitialized variable as the basis for said calculation). Note that 18446744073709551613 is 2^64 - 3: a small negative value reinterpreted as an unsigned 64-bit size.
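The arithmetic behind that diagnosis (a demonstration of the reinterpretation, not PostgreSQL's actual code path):

```python
import struct

def as_unsigned_64(n):
    """Reinterpret a signed 64-bit value as unsigned, the way a size_t
    sees a negative length that leaked into an allocation request."""
    return struct.unpack("<Q", struct.pack("<q", n))[0]

# -3 as a signed 64-bit integer has the same bit pattern as the
# "gazillion byte" request in the log line above.
```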
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
>While in general there may not be that much of a % difference between the 2 chips, >there's a huge gap in Postgres. For whatever reason, Postgres likes Opterons. >Way more than Intel P4-architecture chips. It isn't only Postgres. I work on a number of other server applications that also run much faster on Opterons than the published benchmark figures would suggest they should. They're all compiled with gcc4, so possibly there's a compiler issue. I don't run Windows on any of our Opteron boxes so I can't easily compare using the MS compiler. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
The reasons AMD has held off from supporting DDR2 until now are: 1. DDR is EOL. JEDEC is not ratifying any DDR faster than 200x2 while DDR2 standards as fast as 333x4 are likely to be ratified (note that Intel pretty much avoided DDR, leaving it to AMD, while DDR2 is Intel's main RAM technology. Guess who has more pull with JEDEC?) DDR2 is to RDRAM as C# is to Java ;) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
My personal favorite pg platform at this time is one based on a 2 socket, dual core ready mainboard with 16 DIMM slots combined with dual core AMD Kx's. Right. We've been buying Tyan bare-bones boxes like this. It's better to go with bare-bones than building boxes from bare metal because the cooling issues are addressed correctly. Note that if you need a large number of machines, then Intel Core Duo may give the best overall price/performance because they're cheaper to run and cool. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Large (8M) cache vs. dual-core CPUs
Actually, that was from an article from this last month that compared the dual core intel to the amd. for every dollar spent on the intel, you got about half the performance of the amd. Not bigotry. fact. But don't believe me or the other people who've seen the difference. Go buy the Intel box. No skin off my back. I've been doing plenty of performance evaluation on a parallel application we're developing here : on Dual Core Opterons, P4, P4D. I can say that the Opterons open up a can of wupass on the Intel processors. Almost 2x the performance on our application vs. what the SpecCPU numbers would suggest.
Re: [PERFORM] Hardware/OS recommendations for large databases (
Brendan Duddridge wrote: Thanks for your reply. So how is that different than something like Slony2 or pgcluster with multi-master replication? Is it similar technology? We're currently looking for a good clustering solution that will work on our Apple Xserves and Xserve RAIDs. I think you need to be more specific about what you're trying to do. 'clustering' encompasses so many things that it means almost nothing by itself. slony provides facilities for replicating data. Its primary purpose is to improve reliability. MPP distributes both data and queries. Its primary purpose is to improve performance for a subset of all query types. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alan Stange wrote: Not sure I get your point. We would want the lighter one, all things being equal, right ? (lower shipping costs, less likely to break when dropped on the floor) Why would the lighter one be less likely to break when dropped on the floor? They'd have less kinetic energy upon impact. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Hardware/OS recommendations for large databases (
Alex Turner wrote: Just pick up a SCSI drive and a consumer ATA drive. Feel their weight. Not sure I get your point. We would want the lighter one, all things being equal, right ? (lower shipping costs, less likely to break when dropped on the floor) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Hardware/OS recommendations for large databases (
I suggest you read this on the difference between enterprise/SCSI and desktop/IDE drives: http://www.seagate.com/content/docs/pdf/whitepaper/D2c_More_than_Interface_ATA_vs_SCSI_042003.pdf This is exactly the kind of vendor propaganda I was talking about and it proves my point quite well : that there's nothing specific relating to reliability that is different between SCSI and SATA drives cited in that paper. It does have a bunch of FUD such as 'oh yeah we do a lot more drive characterization during manufacturing'.
Re: [PERFORM] Hardware/OS recommendations for large databases (
>Spend a fortune on dual core CPUs and then buy crappy disks... I bet >for most applications this system will be IO bound, and you will see a >nice lot of drive failures in the first year of operation with >consumer grade drives. I guess I've never bought into the vendor story that there are two reliability grades. Why would they bother making two different kinds of bearing, motor etc ? Seems like it's more likely an excuse to justify higher prices. In my experience the expensive SCSI drives I own break frequently while the cheapo desktop drives just keep chunking along (modulo certain products that have a specific known reliability problem). I'd expect that a larger number of hotter drives will give a less reliable system than a smaller number of cooler ones. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Hardware/OS recommendations for large databases (
Steve Wampler wrote: Joshua D. Drake wrote: The reason you want the dual core cpus is that PostgreSQL can only execute 1 query per cpu at a time,... Is that true? I knew that PG only used one cpu per query, but how does PG know how many CPUs there are to limit the number of queries? He means only one query can be executing on each cpu at any particular instant. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Postgres recovery time
Piccarello, James (James) wrote: Postgres recovery time Does anyone know what factors affect the recovery time of postgres if it does not shutdown cleanly? With the same size database I've seen times from a few seconds to a few minutes. The longest time was 33 minutes. The 33 minutes was after a complete system crash and reboot so there are a lot of other things going on as well. 125 seconds was the longest time I could reproduce by just doing a kill -9 on postmaster. Is it the size of the transaction log? The dead space in files? I don't know much about postgresql, but typically WAL mechanisms will exhibit recovery times that are bounded by the amount of log record data written since the last checkpoint. The 'worst' case will be where you have continuous writes to the database and a long checkpoint interval. In that case many log records must be replayed into the data files upon recovery. The 'best' case would be zero write transactions since the last checkpoint. In that case recovery would be swift since there are no live records to recover. In your tests you are probably exercising this 'best' or near best case.
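The checkpoint-bounded recovery model described above can be sketched in a few lines. A toy redo loop only: real WAL replay is page-oriented and more involved, but the bound on recovery work is the same tail of the log.

```python
def recover(wal, last_checkpoint):
    """Toy model of WAL recovery: only records written after the last
    checkpoint must be replayed, so recovery time is bounded by the
    length of that tail. wal is a list of (key, value) logged writes;
    last_checkpoint is the index up to which data files are known good."""
    tail = wal[last_checkpoint:]
    state = {}
    for key, value in tail:
        state[key] = value  # idempotent redo of each logged write
    return state, len(tail)
```

The 'worst' case is a long log tail (continuous writes, long checkpoint interval); the 'best' case, as with the kill -9 tests above, is an empty tail and a near-instant recovery.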