Re: [PERFORM] Query plan for heavy SELECT with lite sub-SELECTs
Alvaro Herrera wrote: Performance analysis of strange queries is useful, but the input queries have to be meaningful as well. Otherwise you end up optimizing bizarre and useless cases. I had a similar one a few weeks ago. I did some batch-processing over a bunch of documents and discovered postgresql was faster if I let it process just 1000 documents, in stead of all 45000 at the same time. But with 1000 it was faster than 1000x one document. So I started with a query like: SELECT docid, (SELECT work to be done for each document) FROM documents ORDER BY docid LIMIT 1000 OFFSET ? And I noticed the 44th iteration was much slower than the first. Rewriting it to something like this made the last iteration about as fast as the first: SELECT docid, (SELECT work to be done for each document) FROM documents WHERE docid IN (SELECT docid FROM documents ORDER BY docid LIMIT 1000 OFFSET ? ) I know something like that isn't very set-based thinking, but then again the query's structure did come from a iterative algoritm, but turned out to be faster (less query-overhead) and easier to scale in PostgreSQL. I've tried a few more set-like structures, but those were all slower than this aproach probably because they would be were a little more complex. Some of them took more than 10x the amount of time... Another real-life example would be to display the amount of replies to a topic in a topic listing of a forum or the name of the author of the last message. You probably don't want to count all the replies for each topic if you're only going to display headings 100 - 200. And there are a few more examples to think of where a join+group by isn't going to work, but a subquery in the selectlist just does what you want. Of course most of the time you won't be using a OFFSET then. Best regards, Arjen ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PERFORM] profiling PL/pgSQL?
I have 700 lines of non-performant pgSQL code that I'd like to profile to see what's going on. What's the best way to profile stored procedures? Thanks, Drew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] profiling PL/pgSQL?
am Fri, dem 03.11.2006, um 3:12:14 -0800 mailte Drew Wilson folgendes: I have 700 lines of non-performant pgSQL code that I'd like to profile to see what's going on. What's the best way to profile stored procedures? RAISE NOTICE, you can raise the aktual time within a transaction with timeofday() Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47215, D1: 0160/7141639 (mehr: - Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] profiling PL/pgSQL?
A. Kretschmer wrote: am Fri, dem 03.11.2006, um 3:12:14 -0800 mailte Drew Wilson folgendes: I have 700 lines of non-performant pgSQL code that I'd like to profile to see what's going on. What's the best way to profile stored procedures? RAISE NOTICE, you can raise the aktual time within a transaction with timeofday() Of course you only have very small values of best available with plpgsql debugging. There's a GUI debugger from EnterpriseDB I believe, but I've no idea how good it is. Any users/company bods care to let us know? -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Context switch storm
Hi, We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 hours) the server start doing a lot of context switching and all transactions become very slow. The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. CPU and memory usage are ok. What is producing this context switching storms? It is a box with 16GB RAM and 4 XEONprocessors running RedHat Enterprise Linux AS. Should I disable Hyperthreading? Thank you in advance! Reimer
Re: [PERFORM] Context switch storm
Based on what other people have posted, hyperthreading seems not to be beneficial for postgres -- try searching through the archives of this list. (And then turn it off and see if it helps.) You might also post a few details: config settings (shared_buffers, work_mem, maintenance_work_mem, wal and checkpoint settings, etc.) are you using autovacuum ? all tables are vacuumed and analyzed regularly ? How big are they ? Do they and indexes fit in RAM ? any particular queries that running and might be related (explain analyze results of them would be useful) disk configuration Other processes on this box ? # of connections to it (I've seen this alone push servers over the edge) HTH, Greg Williamson DBA GlobeXplorer LLC -Original Message- From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED] Sent: Fri 11/3/2006 2:32 AM To: pgsql-performance@postgresql.org Cc: Subject:[PERFORM] Context switch storm Hi, We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 hours) the server start doing a lot of context switching and all transactions become very slow. The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. CPU and memory usage are ok. What is producing this context switching storms? It is a box with 16GB RAM and 4 XEON processors running RedHat Enterprise Linux AS. Should I disable Hyperthreading? Thank you in advance! Reimer --- Click link below if it is SPAM [EMAIL PROTECTED] https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=454b34ac206028992556831[EMAIL PROTECTED]retrain=spamtemplate=historyhistory_page=1 !DSPAM:454b34ac206028992556831! --- ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Context switch storm
[EMAIL PROTECTED] wrote: Hi, We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 hours) the server start doing a lot of context switching and all transactions become very slow. The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. CPU and memory usage are ok. What is producing this context switching storms? It is a box with 16GB RAM and 4 XEON processors running RedHat Enterprise Linux AS. It's memory bandwidth issues on the older Xeons. If you search the archives you'll see a lot of discussion of this. I'd have thought 8.1 would be better than 7.4 though. You'll tend to see it when you have multiple clients and most queries can use RAM rather than disk I/O. My understanding of what happens is that PG requests data from RAM - it's not in cache so the process gets suspended to wait. The next process does the same, with the same result. You end up with lots of processes all fighting over what data is in the cache and no-one gets much work done. Should I disable Hyperthreading? I seem to remember that helps, but do check the mailing list archives for discussion on this. If you can keep your numbers of clients down below the critical level, you should find the overall workload is fine. The problem is of course that as the context-switching increases, each query takes longer which means more clients connect, which increases the context-swtching, which means... HTH -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Context switch storm
Cosimo Streppone wrote: Richard Huxton wrote: [EMAIL PROTECTED] wrote: The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. You'll tend to see it when you have multiple clients and most queries can use RAM rather than disk I/O. My understanding of what happens is that PG requests data from RAM - it's not in cache so the process gets suspended to wait. The next process does the same, with the same result. You end up with lots of processes all fighting over what data is in the cache and no-one gets much work done. Does this happen also with 8.0, or is specific to 8.1 ? All versions suffer to a degree - they just push the old Xeon in the wrong way. However, more recent versions *should* be better than older versions. I believe some work was put in to prevent contention on various locks which should reduce context-switching across the board. I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server upgrade from 8.0.1 - the most recent is 8.0.9 iirc when I raise `shared_buffers' from 8192 to 4. I would expect an increase in tps/concurrent clients, but I see an average performance below a certain threshold of users, and when concurrent users get above that level, performance starts to drop, no matter what I do. Are you seeing a jump in context-switching in top? You'll know when you do - it's a *large* jump. That's the key diagnosis. Otherwise it might simply be your configuration settings aren't ideal for that workload. Server logs and io/vm statistics seem to indicate that there is little or no disk activity but machine loads increases to 7.0/8.0. After some minutes, the problem goes away, and performance returns to acceptable levels. That sounds like it. Query time increases across the board as all the clients fail to get any data back. When the load increases, *random* database queries show this slowness, even if they are perfectly planned and indexed. Is there anything we can do? Well, the client I saw it with just bought a dual-opteron server and used their quad-Xeon for something else. However, I do remember that 8.1 seemed better than 7.4 before they switched. Part of that might just have been better query-planning and other efficiences though. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Setting nice values
Am Donnerstag, den 02.11.2006, 09:41 -0600 schrieb Scott Marlowe: Sometimes it's the simple solutions that work best. :) Welcome to the world of pgsql, btw... OTOH, there are also non-simple solutions to this, which might make sense anyway: Install slony, and run your queries against a readonly replica of your data. Andreas signature.asc Description: Dies ist ein digital signierter Nachrichtenteil
Re: [PERFORM] Context switch storm
The solution for us has been twofold: upgrade to the newest PG version available at the time while we waited for our new Opteron-based DB hardware to arrive. Andreas Am Freitag, den 03.11.2006, 13:29 + schrieb Richard Huxton: Cosimo Streppone wrote: Richard Huxton wrote: [EMAIL PROTECTED] wrote: The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. You'll tend to see it when you have multiple clients and most queries can use RAM rather than disk I/O. My understanding of what happens is that PG requests data from RAM - it's not in cache so the process gets suspended to wait. The next process does the same, with the same result. You end up with lots of processes all fighting over what data is in the cache and no-one gets much work done. Does this happen also with 8.0, or is specific to 8.1 ? All versions suffer to a degree - they just push the old Xeon in the wrong way. However, more recent versions *should* be better than older versions. I believe some work was put in to prevent contention on various locks which should reduce context-switching across the board. I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server upgrade from 8.0.1 - the most recent is 8.0.9 iirc when I raise `shared_buffers' from 8192 to 4. I would expect an increase in tps/concurrent clients, but I see an average performance below a certain threshold of users, and when concurrent users get above that level, performance starts to drop, no matter what I do. Are you seeing a jump in context-switching in top? You'll know when you do - it's a *large* jump. That's the key diagnosis. Otherwise it might simply be your configuration settings aren't ideal for that workload. Server logs and io/vm statistics seem to indicate that there is little or no disk activity but machine loads increases to 7.0/8.0. After some minutes, the problem goes away, and performance returns to acceptable levels. That sounds like it. Query time increases across the board as all the clients fail to get any data back. When the load increases, *random* database queries show this slowness, even if they are perfectly planned and indexed. Is there anything we can do? Well, the client I saw it with just bought a dual-opteron server and used their quad-Xeon for something else. However, I do remember that 8.1 seemed better than 7.4 before they switched. Part of that might just have been better query-planning and other efficiences though. signature.asc Description: Dies ist ein digital signierter Nachrichtenteil
Re: [PERFORM] Context switch storm
On Fri, 3 Nov 2006, Richard Huxton wrote: It's memory bandwidth issues on the older Xeons. If you search the archives you'll see a lot of discussion of this. I'd have thought 8.1 would be better than 7.4 though. Hmmm... I just checked; one of our production systems is a multi-cpu Xeon based system of uncertain age (nobody remember 'zactly). While we haven't seen this problem yet, it's scheduled to take over demo-duty shortly and it would be an embarassment if we had this trouble during a demo... Is there any easy way to tell if you're at risk? Thanks, Richard -- Richard Troy, Chief Scientist Science Tools Corporation 510-924-1363 or 202-747-1263 [EMAIL PROTECTED], http://ScienceTools.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Context switch storm
Richard Troy wrote: On Fri, 3 Nov 2006, Richard Huxton wrote: It's memory bandwidth issues on the older Xeons. If you search the archives you'll see a lot of discussion of this. I'd have thought 8.1 would be better than 7.4 though. Hmmm... I just checked; one of our production systems is a multi-cpu Xeon based system of uncertain age (nobody remember 'zactly). While we haven't seen this problem yet, it's scheduled to take over demo-duty shortly and it would be an embarassment if we had this trouble during a demo... Is there any easy way to tell if you're at risk? Try: - multiple clients - query doing sorts that fit into work_mem -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Context switch storm
Richard Huxton wrote: [EMAIL PROTECTED] wrote: Hi, We've migrated one of our servers from pg 7.4 to 8.1 and from times to times (4 hours) the server start doing a lot of context switching and all transactions become very slow. The average context switching for this server as vmstat shows is 1 but when the problem occurs it goes to 25. You'll tend to see it when you have multiple clients and most queries can use RAM rather than disk I/O. My understanding of what happens is that PG requests data from RAM - it's not in cache so the process gets suspended to wait. The next process does the same, with the same result. You end up with lots of processes all fighting over what data is in the cache and no-one gets much work done. Does this happen also with 8.0, or is specific to 8.1 ? I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server when I raise `shared_buffers' from 8192 to 4. I would expect an increase in tps/concurrent clients, but I see an average performance below a certain threshold of users, and when concurrent users get above that level, performance starts to drop, no matter what I do. Server logs and io/vm statistics seem to indicate that there is little or no disk activity but machine loads increases to 7.0/8.0. After some minutes, the problem goes away, and performance returns to acceptable levels. When the load increases, *random* database queries show this slowness, even if they are perfectly planned and indexed. Is there anything we can do? -- Cosimo ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Context switch storm
If you can keep your numbers of clients down below the critical level, you should find the overall workload is fine. We have at about 600 connections. Is this a case to use a connection pool (pg_pool) system? And why this happens only with 8.0 and 8.1 and not with the 7.4?
Re: [PERFORM] Context switch storm
[EMAIL PROTECTED] wrote: If you can keep your numbers of clients down below the critical level, you should find the overall workload is fine. We have at about 600 connections. Is this a case to use a connection pool (pg_pool) system? Possibly - that should help. I'm assuming that most of your queries are very short, so you could probably get that figure down a lot lower. You'll keep the same amount of queries running through the system, just queue them up. And why this happens only with 8.0 and 8.1 and not with the 7.4? Not sure. Maybe 8.x is making more intensive use of your memory, possibly with a change in your plans. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Query plan for heavy SELECT with lite sub-SELECTs
Arjen van der Meijden [EMAIL PROTECTED] writes: ... Rewriting it to something like this made the last iteration about as fast as the first: SELECT docid, (SELECT work to be done for each document) FROM documents WHERE docid IN (SELECT docid FROM documents ORDER BY docid LIMIT 1000 OFFSET ? ) The reason for this, of course, is that the LIMIT/OFFSET filter is the last step in a query plan --- it comes *after* computation of the SELECT output list. (So does ORDER BY, if an explicit sort step is needed.) So if you have an expensive-to-compute output list, a trick like Arjen's will help. I don't think you can use an IN though, at least not if you want to preserve the sort ordering in the final result. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] profiling PL/pgSQL?
On 11/3/06, Richard Huxton dev@archonet.com wrote: There's a GUI debugger from EnterpriseDB I believe, but I've no idea how good it is. Any users/company bods care to let us know? If you visit: http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/edb-debugger/#dirlist We have both a PL/pgSQL profiler and tracer available. -- Jonah H. Harris, Software Architect | phone: 732.331.1300 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Context switch storm
[EMAIL PROTECTED] writes: And why this happens only with 8.0 and 8.1 and not with the 7.4? 8.0 and 8.1 are vulnerable to this behavior because of conflicts for access to pg_subtrans (which didn't exist in 7.4). The problem occurs when you have old open transactions, causing the window over which pg_subtrans must be accessed to become much wider than normal. 8.2 should eliminate or at least alleviate the issue, but in the meantime see if you can get your applications to not sit on open transactions. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Context switch storm
Am Freitag, den 03.11.2006, 14:38 + schrieb Richard Huxton: [EMAIL PROTECTED] wrote: If you can keep your numbers of clients down below the critical level, you should find the overall workload is fine. We have at about 600 connections. Is this a case to use a connection pool (pg_pool) system? Possibly - that should help. I'm assuming that most of your queries are very short, so you could probably get that figure down a lot lower. You'll keep the same amount of queries running through the system, just queue them up. that have Ah, yes, now that you mention, avoid running many queries with a similiar timing behaviour, PG8 seems to have a lock design that's very bad for the memory architecture of the Xeons. So running SELECT * FROM table WHERE id=1234567890; from 600 clients in parallel can be quite bad than say a complicated 6-way join :( Andreas And why this happens only with 8.0 and 8.1 and not with the 7.4? Not sure. Maybe 8.x is making more intensive use of your memory, possibly with a change in your plans. signature.asc Description: Dies ist ein digital signierter Nachrichtenteil
[PERFORM] EXISTS optimization
To support migration of existing queries, it would be nice not to have to rewrite EXISTS clauses as IN clauses. Here is one example of a query which optimizes poorly: DELETE FROM CaseDispo WHERE EXISTS ( SELECT * FROM Consolidation C WHERE C.caseNo = '2006CM000123' AND C.xrefOrConsol = 'C' AND C.countyNo = 30 AND CaseDispo.caseNo = C.crossRefCase AND CaseDispo.countyNo = C.countyNo AND CaseDispo.dispoDate = DATE '2005-10-31' ); Seq Scan on CaseDispo (cost=0.00..1227660.52 rows=176084 width=6) (actual time=501.557..501.557 rows=0 loops=1) Filter: (subplan) SubPlan - Result (cost=0.00..3.46 rows=1 width=48) (actual time=0.000..0.000 rows=0 loops=352167) One-Time Filter: (($2)::date = '2005-10-31'::date) - Index Scan using Consolidation_pkey on Consolidation C (cost=0.00..3.46 rows=1 width=48) (actual time=0.008..0.008 rows=0 loops=84) Index Cond: (((caseNo)::bpchar = '2006CM000123'::bpchar) AND (($0)::bpchar = (crossRefCase)::bpchar) AND ((countyNo)::smallint = 30) AND (($1)::smallint = (countyNo)::smallint)) Filter: (xrefOrConsol = 'C'::bpchar) Total runtime: 501.631 ms (9 rows) To most programmers, it would be obvious that this is an exact logical equivalent to: DELETE FROM CaseDispo WHERE countyNo = 30 AND dispoDate = DATE '2005-10-31' AND caseNo IN ( SELECT crossRefCase FROM Consolidation C WHERE C.caseNo = '2006CM000123' AND C.xrefOrConsol = 'C' AND C.countyNo = 30 ); Nested Loop (cost=7.02..10.50 rows=1 width=6) (actual time=0.036..0.036 rows=0 loops=1) - HashAggregate (cost=7.02..7.03 rows=1 width=18) (actual time=0.034..0.034 rows=0 loops=1) - Index Scan using Consolidation_pkey on Consolidation C (cost=0.00..7.02 rows=1 width=18) (actual time=0.032..0.032 rows=0 loops=1) Index Cond: (((caseNo)::bpchar = '2006CM000123'::bpchar) AND ((countyNo)::smallint = 30)) Filter: (xrefOrConsol = 'C'::bpchar) - Index Scan using CaseDispo_pkey on CaseDispo (cost=0.00..3.46 rows=1 width=24) (never executed) Index Cond: (((CaseDispo.caseNo)::bpchar = (outer.crossRefCase)::bpchar) AND ((CaseDispo.dispoDate)::date = '2005-10-31'::date) AND ((CaseDispo.countyNo)::smallint = 30)) Total runtime: 0.109 ms (8 rows) On this particular query, three orders of magnitude only gets you up to half a second, but the same thing happens on longer running queries. And even that half a second is significant when a user has to sit there and wait for the hourglass to clear on a regular basis. Clearly, the problem is not in the costing -- it recognizes the high cost of the EXISTS form. The problem is that it doesn't recognize that these are logically equivalent. Is there any work in progress to expand the set of plans examined for an EXISTS clause? If not, can we add such an enhancement to the TODO list? Do we need a good write-up on what optimizations are legal for EXISTS? -Kevin ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate