Re: [PERFORM] CPU bound at 99%
Bryan Buecking wrote: On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote: Are you referring to PHP's persistent connections? Do not use those. Here's a thread that details the issues with why not: http://archives.postgresql.org/pgsql-general/2007-08/msg00660.php . Thanks for that article, very informative and persuasive enough that I've turned off persistent connections. Note that it's not always true - current recommended practice for PHP is to run it in FastCGI, in which case even though there are hundreds of Apache processes, there are only a few PHP processes with their persistent database connections (and unused PHP FastCGI servers get killed off routinely), so you get almost "proper" pooling without the overhead.
Re: [PERFORM] CPU bound at 99%
about 2300 connections in idle (ps auxwww | grep postgres | grep idle) [...] The server that connects to the db is an apache server using persistent connections. MaxClients is 2048 thus the high number of connections needed. Application was written in PHP using the Pear DB class. This is pretty classical. When your number of threads gets out of control, everything gets slower, so more requests pile up, spawning more threads; this is positive feedback, and in seconds all hell breaks loose. That's why I call it imploding, as if it collapses under its own weight. There is a threshold effect: it goes from working well to a crawl rather quickly once you pass the threshold, as you experienced. Note that the same applies to Apache and PHP as well as Postgres: there is a "sweet spot" in the number of threads, for optimum efficiency, depending on how many cores you have. Too few threads, and it will be waiting for IO or waiting for the database. Too many threads, and CPU cache utilization becomes suboptimal and context switches eat your performance. This sweet spot is certainly not at 500 connections per core, either for Postgres or for PHP. It is much lower, about 5-20 depending on your load. I will copy-paste here an email I wrote to another person with the exact same problem, and the exact same solution. Please read this carefully: Basically there are three classes of websites in my book. 1- Low traffic (ie a few hits/s on dynamic pages), when performance doesn't matter. 2- High traffic (ie 10-100 hits/s on dynamic pages), when you must read the rest of this email. 3- Monster traffic (lots more than that), when you need to give some of your cash to Akamai, get some load balancers, replicate your databases, use lots of caching, etc. This is yahoo, flickr, meetic, etc. Usually people whose web sites are imploding under load think they are in class 3, but really most of them are in class 2, just using inadequate technical solutions like MySQL, etc. I had a website with 200K members that ran on a Celeron 1200 with 512 MB RAM, perfectly fine, and lighttpd wasn't even visible in the top. The good news for you is that the solution to your problem is pretty easy. You should be able to solve that in about 4 hours. Suppose you have some web servers for static content; obviously you are using lighttpd on that since it can service an "unlimited" (up to the OS limit, something like 64K sockets) number of concurrent connections. You could also use nginx or Zeus. I think Akamai uses Zeus. But lighttpd is perfectly fine (and free). For your static content servers you will want to use lots of RAM for caching. If you serve images, put the small files like thumbnails, css, javascript, and html pages on a separate server so that they are all served from RAM; use a cheap CPU, since a Pentium-M with lighttpd will happily push 10K http hits/s if it doesn't wait for IO. Large files should be on the second static server to avoid cache thrashing on the server which has all the frequently accessed small files. Then you have some web servers for generating your dynamic content. Let's suppose you have N CPU cores total. With your N cores, the ideal number of threads would be N. However, those will also wait for IO and database operations, so you want to fill those wait times with useful work, so maybe you will use something like 2...10 threads per core. This can only be determined by experimentation; it depends on the type and length of your SQL queries, so there is no "one size fits all" answer. Example:
You have pages that take 20 ms to generate, and you have 100 requests for those coming up. Let's suppose you have one CPU core. (Note: if your pages take longer than 10 ms, you have a problem. On the previously mentioned website, now running on the cheapest Core 2 we could find since the torrent tracker eats lots of CPU, pages take about 2-5 ms to generate, even the forum pages with 30 posts on them. We use PHP with compiled code caching and SQL is properly optimized). And, yes, it uses MySQL. Once I wrote (as an experiment) an extremely simple forum which did 1400 pages/second (which is huge) with a desktop Core2 as the Postgres 8.2 server. - You could use Apache in the old-fashioned way, have 100 threads, so all your pages will take 20 ms x 100 = 2 seconds. But the CPU cache utilization will suck because of all those context switches, you'll have 100 processes eating your RAM (count 8MB for a PHP process), 100 database connections, 100 postgres processes, the locks will stay on longer, transactions will last longer, you'll get more dead rows to vacuum, etc. And actually, since Apache will not buffer the output of your scripts, the PHP or Perl interpreter will stay in memory (and hog a database connection) until the client at the other end has finished receiving the page.
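To put numbers on the 20 ms x 100 example, here is a small back-of-the-envelope sketch in Python. It is not from the original thread, and the fair time-slicing model is an idealized assumption that ignores context-switch cost (which only flatters the 100-thread case): total throughput on one core is the same either way, but a small worker pool with a queue finishes most pages far sooner, before even counting the RAM, connection, and vacuum costs described above.

# Model of the 20 ms x 100 example: one CPU core, 100 requests each needing
# 20 ms of CPU, served either by 100 concurrent time-sliced workers or by a
# small worker pool in front of a queue. Idealized: no context-switch cost.

def avg_completion_ms(n_requests=100, cpu_ms_per_page=20, workers=100):
    """Average completion time, assuming fair time slicing among the
    requests that are in flight at any moment."""
    completions = []
    for i in range(n_requests):
        if workers >= n_requests:
            # Everyone shares the core at once: all pages finish together.
            completions.append(n_requests * cpu_ms_per_page)
        else:
            # Small pool: requests are queued and run `workers` at a time.
            batch = i // workers   # how many full batches ran before this one
            completions.append((batch + 1) * workers * cpu_ms_per_page)
    return sum(completions) / n_requests

print(avg_completion_ms(workers=100))  # ~2000 ms: every page takes 2 seconds
print(avg_completion_ms(workers=10))   # ~1100 ms average; first pages done in 200 ms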
Re: [PERFORM] CPU bound at 99%
Erik Jones wrote: max_connections = 2400 That is WAY too high. Get a real pooler, such as pgpool, and drop that down to 1000 and test from there. I see you mentioned 500 concurrent connections. Are each of those connections actually doing something? My guess is that once you cut down on the number of actual connections you'll find that each connection can get its work done faster and you'll see that number drop significantly. It's not an issue for me - I'm expecting *never* to top 100 concurrent connections, and many of those will be idle, with the usual load being closer to 30 connections. Big stuff ;-) However, I'm curious about what an idle backend really costs. On my system each backend has an RSS of about 3.8MB, and a psql process tends to be about 3.0MB. However, much of that will be shared library bindings etc. The real cost per psql instance and associated backend appears to be 1.5MB (measured from the change in free system RAM with 10 connections). If I use a little Python program to generate 50 connections, free system RAM drops by ~45MB and rises by the same amount when the Python process exits and the backends die, so the backends presumably use less than 1MB each of real unshared RAM. Presumably the backends will grow if they perform some significant queries and are then left idle. I haven't checked that. At 1MB of RAM per backend that's not a trivial cost, but it's far from earth shattering, especially allowing for the OS swapping out backends that're idle for extended periods. So ... what else does an idle backend cost? Is it reducing the amount of shared memory available for use on complex queries? Are there some lists PostgreSQL must scan for queries that get more expensive to examine as the number of backends rises? Are there locking costs? -- Craig Ringer
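The "little Python program" Craig mentions is not included in the thread; a minimal sketch of that kind of measurement could look like the following, assuming psycopg2 as the driver and a hypothetical DSN (both assumptions, not details from the post). Compare free system RAM before, during, and after the sleep window.

# Sketch of an idle-connection memory measurement: open N connections, leave
# them idle, and hold them while you compare free RAM. psycopg2 and the DSN
# below are assumptions; the thread does not say which driver was used.
import time
import psycopg2

N_CONNECTIONS = 50
DSN = "dbname=test user=postgres host=localhost"   # hypothetical DSN

def hold_idle_connections(n, dsn, seconds=60):
    conns = [psycopg2.connect(dsn) for _ in range(n)]
    print("%d idle connections open; check free system RAM now" % n)
    time.sleep(seconds)      # window in which to read free RAM
    for conn in conns:
        conn.close()         # backends exit; compare free RAM again

if __name__ == "__main__":
    hold_idle_connections(N_CONNECTIONS, DSN)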
Re: [PERFORM] CPU bound at 99%
Bryan Buecking <[EMAIL PROTECTED]> writes: > On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote: >> That is WAY too high. Get a real pooler, such as pgpool, and drop >> that down to 1000 and test from there. > I agree, but the number of idle connections doesn't seem to affect > performance, only memory usage. I doubt that's true (and your CPU load suggests the contrary as well). There are common operations that have to scan the whole PGPROC array, which has one entry per open connection. What's worse, some of them require exclusive lock on the array. 8.3 has some improvements in this area that will probably let it scale to more connections than previous releases, but in any case connection pooling is a good thing. > I'm trying to lessen the load of > connection setup. But sounds like this tax is minimal? Not really. You're better off reusing a connection over a large number of queries. regards, tom lane
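As a rough way to see the connection-setup tax Tom is describing, one can time the same trivial query issued over a fresh connection each time versus a single reused connection. This is only an illustrative sketch; psycopg2 and the DSN are assumptions, and the absolute numbers depend entirely on the machine and authentication method.

# Compare N trivial queries over fresh connections vs. one reused connection.
import time
import psycopg2

DSN = "dbname=test user=postgres host=localhost"   # hypothetical DSN
N = 200

def reconnect_every_query():
    start = time.time()
    for _ in range(N):
        conn = psycopg2.connect(DSN)     # full backend startup + auth each time
        cur = conn.cursor()
        cur.execute("SELECT 1")
        cur.fetchone()
        conn.close()
    return time.time() - start

def reuse_one_connection():
    start = time.time()
    conn = psycopg2.connect(DSN)          # pay the setup cost once
    cur = conn.cursor()
    for _ in range(N):
        cur.execute("SELECT 1")
        cur.fetchone()
    conn.close()
    return time.time() - start

print("reconnect each time:  %.2fs" % reconnect_every_query())
print("reuse one connection: %.2fs" % reuse_one_connection())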
Re: [PERFORM] CPU bound at 99%
On Tue, Apr 22, 2008 at 10:10 AM, Bryan Buecking <[EMAIL PROTECTED]> wrote: > > I agree, but the number of idle connections doesn't seem to affect > performance, only memory usage. I'm trying to lessen the load of > connection setup. But sounds like this tax is minimal? Not entirely true. There are certain things that happen that require one backend to notify ALL OTHER backends. When this happens a lot, the system will slow to a crawl.
Re: [PERFORM] CPU bound at 99%
On Tue, Apr 22, 2008 at 01:21:03PM -0300, Rodrigo Gonzalez wrote: > Are tables vacuumed often? How often is often? Right now the db is vacuumed once a day. -- Bryan Buecking http://www.starling-software.com
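If the tables with heavy update/delete churn are only vacuumed once a day, dead rows accumulate between runs; on 8.2 the usual answer is to enable and tune autovacuum, but a simple interim measure is a cron-style script that vacuums the busiest tables more often. A hedged sketch follows: psycopg2, the DSN, and the hourly schedule are assumptions, and the table names are taken from the original post.

# Vacuum the busiest tables more often than once a day, e.g. hourly from cron.
# psycopg2 and the DSN are assumptions; tuning autovacuum is the usual fix.
import psycopg2
import psycopg2.extensions

DSN = "dbname=production user=postgres host=localhost"   # hypothetical DSN
BUSY_TABLES = ["media", "category"]    # table names from the original post

def vacuum_busy_tables():
    conn = psycopg2.connect(DSN)
    # VACUUM cannot run inside a transaction block, so switch to autocommit.
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    for table in BUSY_TABLES:
        # Table names are hard-coded above, not user input.
        cur.execute("VACUUM ANALYZE %s" % table)
    conn.close()

if __name__ == "__main__":
    vacuum_busy_tables()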
Re: [PERFORM] CPU bound at 99%
Are tables vacuumed often? Bryan Buecking wrote: On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote: On Apr 22, 2008, at 10:31 AM, Bryan Buecking wrote: max_connections = 2400 That is WAY too high. Get a real pooler, such as pgpool, and drop that down to 1000 and test from there. I agree, but the number of idle connections doesn't seem to affect performance, only memory usage. I'm trying to lessen the load of connection setup. But sounds like this tax is minimal? When these issues started happening, max_connections was set to 1000 and I was not using persistent connections. I see you mentioned 500 concurrent connections. Are each of those connections actually doing something? Yes, out of the 2400 odd connections, 500 are either in SELECT or RESET. My guess is that once you cut down on the number of actual connections you'll find that each connection can get its work done faster and you'll see that number drop significantly. I agree, but not in this case. I will look at using pooling.
Re: [PERFORM] CPU bound at 99%
On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote: > > Are you referring to PHP's persistent connections? Do not use those. > Here's a thread that details the issues with why not: > http://archives.postgresql.org/pgsql-general/2007-08/msg00660.php . Thanks for that article, very informative and persuasive enough that I've turned off persistent connections. -- Bryan Buecking http://www.starling-software.com
Re: [PERFORM] CPU bound at 99%
Bryan, > > about 2300 connections in idle > > > (ps auxwww | grep postgres | grep idle) that is about 2300 processes being task-scheduled by your kernel, each of them using > 1 MB of RAM and some other resources. Are you sure that this is what you want? Usual recommended design for a web application: start request, rent a connection from the connection pool, do query, put connection back, finish request, wait for next request. So to get 500 connections in parallel, you would have the outside situation of 500 browsers submitting requests within the time needed to fulfill one request. Harald -- GHUM Harald Massa persuadere et programmare Harald Armin Massa Spielberger Straße 49 70435 Stuttgart 0173/9409607 fx 01212-5-13695179 - EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!
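Harald's rent-a-connection-per-request pattern, shown as a hedged sketch. psycopg2's ThreadedConnectionPool, the pool sizes, and the column names are assumptions (the thread's application is PHP, where the equivalent would be an external pooler such as pgpool in front of the existing code); the point is only the shape: get a connection, run the query, put it back before waiting for the next request.

# Rent a connection from a small bounded pool per request, then return it.
# Pool sizes, DSN, and column names are illustrative assumptions; the "media"
# table comes from the original post.
from psycopg2.pool import ThreadedConnectionPool

DSN = "dbname=production user=web host=localhost"   # hypothetical DSN
# A handful of connections per core, not one per Apache client.
pool = ThreadedConnectionPool(2, 16, DSN)

def handle_request(media_id):
    conn = pool.getconn()          # rent a connection for this request only
    try:
        cur = conn.cursor()
        cur.execute("SELECT title FROM media WHERE media_id = %s", (media_id,))
        return cur.fetchone()
    finally:
        pool.putconn(conn)         # return it before waiting for the next request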
Re: [PERFORM] CPU bound at 99%
On Tue, Apr 22, 2008 at 10:55:19AM -0500, Erik Jones wrote: > On Apr 22, 2008, at 10:31 AM, Bryan Buecking wrote: > > >max_connections = 2400 > > That is WAY too high. Get a real pooler, such as pgpool, and drop > that down to 1000 and test from there. I agree, but the number of idle connections doesn't seem to affect performance, only memory usage. I'm trying to lessen the load of connection setup. But sounds like this tax is minimal? When these issues started happening, max_connections was set to 1000 and I was not using persistent connections. > I see you mentioned 500 concurrent connections. Are each of those > connections actually doing something? Yes, out of the 2400 odd connections, 500 are either in SELECT or RESET. > My guess is that once you cut down on the number of actual connections > you'll find that each connection can get its work done faster > and you'll see that number drop significantly. I agree, but not in this case. I will look at using pooling. -- Bryan Buecking http://www.starling-software.com
Re: [PERFORM] CPU bound at 99%
On Tue, Apr 22, 2008 at 08:41:09AM -0700, Joshua D. Drake wrote: > On Wed, 23 Apr 2008 00:31:01 +0900 > Bryan Buecking <[EMAIL PROTECTED]> wrote: > > > at any given time there are about 5-6 postgres in startup > > (ps auxwww | grep postgres | grep startup | wc -l) > > > > about 2300 connections in idle > > (ps auxwww | grep postgres | grep idle) > > > > and loads of "FATAL: sorry, too many clients already" being logged. > > > > The server that connects to the db is an apache server using > > persistent connections. MaxClients is 2048 thus the high number of > > connections needed. Application was written in PHP using the Pear DB > > class. > > Sounds like your pooler isn't reusing connections properly. The persistent connections are working properly. The idle connections are expected given that the Apache child processes are not closing them (a la non-persistent). The connections do go away after 1000 requests (MaxRequestsPerChild). I decided to move towards persistent connections since prior to persistent connections the idle vs. startup counts were reversed. -- Bryan Buecking http://www.starling-software.com
Re: [PERFORM] CPU bound at 99%
On Apr 22, 2008, at 10:31 AM, Bryan Buecking wrote: Hi, I'm running into a performance problem where a Postgres db is running at 99% CPU (4 cores) with about 500 concurrent connections doing various queries from a web application. This problem started about a week ago, and has been steadily going downhill. I have been tweaking the config a bit, mainly shared_memory, but have seen no noticeable improvements. at any given time there are about 5-6 postgres in startup (ps auxwww | grep postgres | grep startup | wc -l) about 2300 connections in idle (ps auxwww | grep postgres | grep idle) and loads of "FATAL: sorry, too many clients already" being logged. The server that connects to the db is an apache server using persistent connections. MaxClients is 2048 thus the high number of connections needed. Application was written in PHP using the Pear DB class. Are you referring to PHP's persistent connections? Do not use those. Here's a thread that details the issues with why not: http://archives.postgresql.org/pgsql-general/2007-08/msg00660.php . Basically, PHP's persistent connections are NOT a pooling solution. Use pgpool or some such. max_connections = 2400 That is WAY too high. Get a real pooler, such as pgpool, and drop that down to 1000 and test from there. I see you mentioned 500 concurrent connections. Are each of those connections actually doing something? My guess is that once you cut down on the number of actual connections you'll find that each connection can get its work done faster and you'll see that number drop significantly. For example, our application does anywhere from 200 - 600 transactions per second, dependent on the time of day/week, and we never need more than 150 to 200 connections (although we do have the max_connections set to 500). Erik Jones DBA | Emma® [EMAIL PROTECTED] 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com
Re: [PERFORM] CPU bound at 99%
On Wed, 23 Apr 2008 00:31:01 +0900 Bryan Buecking <[EMAIL PROTECTED]> wrote: > at any given time there are about 5-6 postgres in startup > (ps auxwww | grep postgres | grep startup | wc -l) > > about 2300 connections in idle > (ps auxwww | grep postgres | grep idle) > > and loads of "FATAL: sorry, too many clients already" being logged. > > The server that connects to the db is an apache server using > persistent connections. MaxClients is 2048 thus the high number of > connections needed. Application was written in PHP using the Pear DB > class. Sounds like your pooler isn't reusing connections properly. Sincerely, Joshua D. Drake -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
[PERFORM] CPU bound at 99%
Hi, I'm running into a performance problem where a Postgres db is running at 99% CPU (4 cores) with about 500 concurrent connections doing various queries from a web application. This problem started about a week ago, and has been steadily going downhill. I have been tweaking the config a bit, mainly shared_memory, but have seen no noticeable improvements.

at any given time there are about 5-6 postgres in startup
(ps auxwww | grep postgres | grep startup | wc -l)

about 2300 connections in idle
(ps auxwww | grep postgres | grep idle)

and loads of "FATAL: sorry, too many clients already" being logged.

The server that connects to the db is an apache server using persistent connections. MaxClients is 2048 thus the high number of connections needed. Application was written in PHP using the Pear DB class.

Here are some typical queries taking place (table media has about 40,000 records and category about 40):

LOG: duration: 66141.530 ms statement: SELECT COUNT(*) AS CNT FROM media m JOIN category ca USING(category_id) WHERE CATEGORY_ROOT(m.category_id) = '-1' AND m.deleted_on IS NULL
LOG: duration: 57828.983 ms statement: SELECT COUNT(*) AS CNT FROM media m JOIN category ca USING(category_id) WHERE CATEGORY_ROOT(m.category_id) = '-1' AND m.deleted_on IS NULL AND m.POSTED_ON + interval '7 day'

System
==
cpu: Xeon(R) CPU 5160 @ 3.00GHz stepping 06 x 4, L1, L2 = 32K, 4096K
mem: 8GB
dbms: postgresql-server 8.2.4
disks: scsi0 : LSI Logic SAS based MegaRAID driver
SCSI device sda: 142082048 512-byte hdwr sectors (72746 MB)
SCSI device sda: 142082048 512-byte hdwr sectors (72746 MB)

Stats
==
top - 00:28:40 up 12:43, 1 user, load average: 46.88, 36.55, 37.65
Tasks: 2184 total, 63 running, 2119 sleeping, 1 stopped, 1 zombie
Cpu0: 99.3% us, 0.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.2% si
Cpu1: 98.3% us, 1.4% sy, 0.0% ni, 0.2% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2: 99.5% us, 0.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu3: 99.5% us, 0.5% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 8166004k total, 6400368k used, 1765636k free, 112080k buffers
Swap: 1020088k total, 0k used, 1020088k free, 3558764k cached

$ vmstat 3
procs -memory-- ---swap-- -io --system-- cpu
r b swpd free buff cache si so bi bo in cs us sy id wa
4 00 559428 109440 3558684001127 31 117 96 2 2 0
5 00 558996 109452 355867200 041 1171 835 93 1 7 0
4 00 558996 109452 355874000 038 1172 497 98 1 1 0
11 00 554516 109452 355874000 019 1236 610 97 1 2 0
25 00 549860 109452 355874000 032 1228 332 99 1 0 0
12 00 555412 109452 355874000 0 4 1148 284 99 1 0 0
15 00 555476 109452 355874000 023 1202 290 99 1 0 0
15 00 555476 109452 355874000 0 1 1125 260 99 1 0 0
16 00 555460 109452 355874000 012 1214 278 99 1 0 0

# -
# PostgreSQL configuration file
# -
#data_directory = 'ConfigDir'          # use data in another directory
                                       # (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf'    # host-based authentication file
                                       # (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
                                       # (change requires restart)
# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '(none)'          # write an extra PID file
                                       # (change requires restart)

#---
# CONNECTIONS AND AUTHENTICATION
#---

# - Connection Settings -

listen_addresses = 'localhost'         # what IP address(es) to listen on;
                                       # comma-separated list of addresses;
                                       # defaults to 'localhost', '*' = all
                                       # (change requires restart)
port = 5432                            # (change requires restart)
max_connections = 2400                 # (change requires restart)
# Note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction). You
# might also need to raise shared_buffers to support more connections.
superuser_reserved_connections = 3     # (change requires restart)
#unix_socket_directory = ''            # (change requires restart)
#unix_socket_group = ''                # (change requires restart)
#uni
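Rather than counting backends with ps and grep as in the post above, the same breakdown can be read from the database itself. A hedged sketch follows: psycopg2 and the DSN are assumptions; on 8.2 idle backends show current_query = '<IDLE>' in pg_stat_activity, and seeing other users' query text generally requires a superuser connection.

# Count backends by state from pg_stat_activity instead of grepping ps output.
# psycopg2 and the DSN are assumptions; '<IDLE>' is how 8.2 marks idle backends.
import psycopg2

DSN = "dbname=production user=postgres host=localhost"   # hypothetical DSN

def connection_summary():
    conn = psycopg2.connect(DSN)
    cur = conn.cursor()
    cur.execute("""
        SELECT CASE WHEN current_query = '<IDLE>' THEN 'idle' ELSE 'active' END,
               count(*)
        FROM pg_stat_activity
        GROUP BY 1
    """)
    rows = cur.fetchall()
    conn.close()
    return dict(rows)

if __name__ == "__main__":
    print(connection_summary())   # e.g. {'idle': 2300, 'active': 500}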