Re: [PERFORM] Performance problems with prepared statements
Theo Kramer wrote:
> On Wed, 2007-10-10 at 17:00 +0200, Cédric Villemain wrote:
> > [snip] Reading the manual, you can learn that a prepared statement may not follow the same plan as a direct query: the plan is made before PostgreSQL knows the values of the parameters. See 'Notes' at http://www.postgresql.org/docs/8.2/interactive/sql-prepare.html
>
> Thanks, had missed that. However, I am afraid I fail to see how preparing a query using PQprepare() and then executing it using PQexecPrepared() can be 8 thousand times slower than executing it directly (403386.583 ms / 50.0 ms = 8067). When doing a 'manual' prepare and explain analyze I get the following:
>
> rascal=# prepare cq (char(12), smallint, integer) as
>   SELECT oid, calllog_mainteng, calllog_phase, calllog_self
>   FROM calllog
>   WHERE calllog_mainteng = $1 AND calllog_phase = $2 AND calllog_self < $3
>      OR calllog_mainteng = $1 AND calllog_phase < $2
>   ORDER BY calllog_mainteng DESC, calllog_phase DESC, calllog_self DESC
>   LIMIT 25;
> PREPARE
>
> rascal=# explain analyze execute cq ('124 ', 8, 366942);
>                               QUERY PLAN
>  Limit  (cost=0.00..232.73 rows=25 width=26) (actual time=2.992..3.178 rows=25 loops=1)
>    ->  Index Scan Backward using calllog_rmc_idx on calllog  (cost=0.00..38651.38 rows=4152 width=26) (actual time=2.986..3.116 rows=25 loops=1)
>          Index Cond: (calllog_mainteng = $1)
>          Filter: (((calllog_phase = $2) AND (calllog_self < $3)) OR (calllog_phase < $2))
>  Total runtime: 3.272 ms
>
> So I suspect that there is something more fundamental here...

My two cents: perhaps ... please check that with your C code, and be sure you are not including application time in the measurement. If you have a lot of data and/or lag on your LAN, that could account for so big a difference between psql and C.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
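A quick way to see whether the generic prepared plan differs from a value-aware plan is to run the same statement with the literals inlined. A sketch, assuming the comparison operators are '<' (the direction is an assumption inferred from the DESC ordering):

```sql
-- Plan for the prepared (generic) form:
EXPLAIN ANALYZE EXECUTE cq ('124 ', 8, 366942);

-- Plan when the planner can see the actual values:
EXPLAIN ANALYZE
SELECT oid, calllog_mainteng, calllog_phase, calllog_self
FROM calllog
WHERE calllog_mainteng = '124 ' AND calllog_phase = 8 AND calllog_self < 366942
   OR calllog_mainteng = '124 ' AND calllog_phase < 8
ORDER BY calllog_mainteng DESC, calllog_phase DESC, calllog_self DESC
LIMIT 25;
```

If the two plans differ substantially, the slowdown is a planning issue rather than a libpq issue.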
Re: [PERFORM] building a performance test suite
Hi,

On Thursday 11 October 2007, Kevin Kempter wrote:
> I'm preparing to create a test suite of very complex queries that can be profiled in terms of load and performance. The ultimate goal is to define a load/performance profile during a run of the old application code base and then again with changes to the application code base.

You may want to consider using pgfouine and Tsung: the former to create Tsung sessions from PostgreSQL logs, and the latter to replay them, simulating any number of concurrent users. Tsung can also operate as a PostgreSQL proxy recorder: you point your application at it, and it forwards the queries and records a session file for you. The replay process can mix several sessions and use several phases with different load behaviours. A little helper named tsung-ploter can then be useful for drawing several Tsung results on the same charts for comparison. Some links:

http://pgfouine.projects.postgresql.org/tsung.html
http://tsung.erlang-projects.org/
http://debian.dalibo.org/sid/tsung-ploter_0.1-1_all.deb
http://debian.dalibo.org/sid/tsung-ploter_0.1-1.tar.gz

Hope this helps, regards,
-- dim
Re: [PERFORM] Performance problems with prepared statements
Theo Kramer wrote:
> Thanks, had missed that. However, I am afraid I fail to see how preparing a query using PQprepare() and then executing it using PQexecPrepared() can be 8 thousand times slower than executing it directly (403386.583 ms / 50.0 ms = 8067). When doing a 'manual' prepare and explain analyze I get the following:
> rascal=# explain analyze execute cq ('124 ', 8, 366942);
> Total runtime: 3.272 ms
> So I suspect that there is something more fundamental here...

OK, so there must be something different between the two scenarios. It can only be one of:
1. Query
2. DB environment (user, locale, settings)
3. Network environment (server/client/network activity etc.)

Are you sure you have the parameter types correct in your long-running query? Try setting log_min_duration_statement=9000 or so to capture long-running queries. Make sure the user and any custom settings are the same. Compare SHOW ALL for both ways. You've said elsewhere you've ruled out the network environment, so there's no point worrying about that further.

-- Richard Huxton
Archonet Ltd
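Richard's logging suggestion can be sketched as follows (9000 is in milliseconds, i.e. 9 seconds; the value is illustrative):

```sql
-- log any statement that runs for 9 seconds or longer
SET log_min_duration_statement = 9000;

-- then compare the effective settings between the fast and slow environments
SHOW ALL;
```

Setting it in postgresql.conf (and reloading) captures all sessions rather than just the current one.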
Re: [PERFORM] Performance problems with prepared statements
On Thu, 2007-10-11 at 10:12 +0100, Richard Huxton wrote:
> Theo Kramer wrote:
> > So I suspect that there is something more fundamental here...
> OK, so there must be something different between the two scenarios. It can only be one of:
> 1. Query
> 2. DB environment (user, locale, settings)
> 3. Network environment (server/client/network activity etc.)

I suspect that it could also be in the way libpq's PQprepare() and PQexecPrepared() are handled, as opposed to the way PREPARE and EXECUTE are handled.

> Are you sure you have the parameter types correct in your long-running query?

Yes - the problem surfaced during a go-live session on an 80-user system, and we had to roll back to the previous system in a hurry. This was a part of the application that had missed testing, but I have had reports from some of my other systems where this appears to be a problem, though not of the magnitude of this one. In any case I have managed to reproduce it in my test environment with the same configuration settings.

> Try setting log_min_duration_statement=9000 or so to capture long-running queries.

Thanks - will give that a try.

> Make sure the user and any custom settings are the same. Compare SHOW ALL for both ways. You've said elsewhere you've ruled out the network environment, so there's no point worrying about that further.

It is definitely not a network problem - i.e. the postgresql server load goes way up when this query is run.

-- Regards
Theo
[PERFORM] Huge amount of memory consumed during transaction
Hi, I'm running into a problem with PostgreSQL 8.2.4 (running on 32-bit Debian Etch / 2x dual core C2D / 8GB mem). The thing is that I have a huge transaction that does 2 things: 1) delete about 300.000 rows from a table with about 15 million rows and 2) do some (heavy) calculations and re-insert a little more than 300.000 new rows. My problem is that this consumes huge amounts of memory. The transaction runs for about 20 minutes and during that transaction memory usage peaks at about 2GB. Over time, the more rows that are involved in this transaction, the higher the peak memory requirements. Lately we increased our shared_buffers to 1.5GB, and during this transaction we reached the process memory limit, causing an out of memory error and a rollback of the transaction:

BEGIN
DELETE 299980
ERROR: out of memory
DETAIL: Failed on request of size 4194304.
ROLLBACK
DROP SEQUENCE

real 19m45.797s
user  0m0.024s
sys   0m0.000s

On my development machine, which has less than 2GB of memory, I cannot even finish the transaction. Is there a way to tell PG to start swapping to disk instead of using RAM during such a transaction? Thanks in advance for all help.
Re: [PERFORM] Huge amount of memory consumed during transaction
henk de wit wrote:
> Hi, I'm running into a problem with PostgreSQL 8.2.4 (running on 32-bit Debian Etch / 2x dual core C2D / 8GB mem). The thing is that I have a huge transaction that does 2 things: 1) delete about 300.000 rows from a table with about 15 million rows and 2) do some (heavy) calculations and re-insert a little more than 300.000 new rows. My problem is that this consumes huge amounts of memory.

What exactly consumes all your memory? I'm assuming it's not just straight SQL.

-- Richard Huxton
Archonet Ltd
Re: [PERFORM] Huge amount of memory consumed during transaction
On 10/11/07, henk de wit [EMAIL PROTECTED] wrote:
> Hi, I'm running into a problem with PostgreSQL 8.2.4 (running on 32-bit Debian Etch / 2x dual core C2D / 8GB mem). The thing is that I have a huge transaction that does 2 things: 1) delete about 300.000 rows from a table with about 15 million rows and 2) do some (heavy) calculations and re-insert a little more than 300.000 new rows. My problem is that this consumes huge amounts of memory. The transaction runs for about 20 minutes and during that transaction memory usage peaks to about 2GB. Over time, the more rows that are involved in this transaction, the higher the peak memory requirements.

How is the memory consumed? How are you measuring it? I assume you mean the postgres process that is running the query uses the memory. If so, which tool(s) are you using and what's the output that shows it being used? I believe that large transactions with foreign keys are known to cause this problem.

> Lately we increased our shared_buffers to 1.5GB, and during this transaction we reached the process memory limit, causing an out of memory and a rollback of the transaction:

How much memory does this machine have? You do realize that shared_buffers is not a generic postgresql memory pool, but is explicitly used to hold data from the discs. If you need to sort and materialize data, that is done with memory allocated from the heap. If you've given all your memory to shared_buffers, there might not be any left. How much swap have you got configured? Lastly, what does EXPLAIN say for your query?
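To illustrate the distinction being drawn here, a postgresql.conf sketch (the values are illustrative for an 8GB box, not recommendations):

```
shared_buffers = 1GB          # shared disk-page cache only; not a general memory pool
work_mem = 16MB               # per sort/hash operation, per backend, allocated from heap
maintenance_work_mem = 256MB  # per VACUUM / CREATE INDEX operation
```

The point is that heap allocations for sorts, hashes, and trigger queues come on top of shared_buffers, so on a 32-bit build the per-process address space can be exhausted even with plenty of physical RAM free.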
Re: [PERFORM] [GENERAL] Slow TSearch2 performance for table with 1 million documents.
It appears that the ORDER BY rank operation is the slowing factor. If I remove it then the query is pretty fast. Is there another way to perform the ORDER BY such that it does not do a sort?

Benjamin

On Oct 5, 2007, at 3:57 PM, Benjamin Arai wrote:
> On Oct 5, 2007, at 8:32 AM, Oleg Bartunov wrote:
> > On Fri, 5 Oct 2007, Tom Lane wrote:
> > > Benjamin Arai [EMAIL PROTECTED] writes:
> > > > # explain analyze select * FROM fulltext_article, to_tsquery('simple','dog') AS q WHERE idxfti @@ q ORDER BY rank(idxfti, q) DESC;
> > > >                               QUERY PLAN
> > > >  Sort  (cost=6576.74..6579.07 rows=933 width=774) (actual time=12969.237..12970.490 rows=5119 loops=1)
> > > >    Sort Key: rank(fulltext_article.idxfti, q.q)
> > > >    ->  Nested Loop  (cost=3069.79..6530.71 rows=933 width=774) (actual time=209.513..12955.498 rows=5119 loops=1)
> > > >          ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.005..0.006 rows=1 loops=1)
> > > >          ->  Bitmap Heap Scan on fulltext_article  (cost=3069.79..6516.70 rows=933 width=742) (actual time=209.322..234.390 rows=5119 loops=1)
> > > >                Recheck Cond: (fulltext_article.idxfti @@ q.q)
> > > >                ->  Bitmap Index Scan on fulltext_article_idxfti_idx  (cost=0.00..3069.56 rows=933 width=0) (actual time=208.373..208.373 rows=5119 loops=1)
> > > >                      Index Cond: (fulltext_article.idxfti @@ q.q)
> > > >  Total runtime: 12973.035 ms
> > > > (9 rows)
> > > The time seems all spent at the join step, which is odd because it really hasn't got much to do. AFAICS all it has to do is compute the rank() values that the sort step will use. Is it possible that rank() is really slow?
> > can you try rank_cd() instead?
>
> Using rank:
>
> -# ('simple','dog') AS q WHERE idxfti @@ q ORDER BY rank(idxfti, q) DESC;
>                               QUERY PLAN
>  Sort  (cost=6576.74..6579.07 rows=933 width=774) (actual time=98083.081..98084.351 rows=5119 loops=1)
>    Sort Key: rank(fulltext_article.idxfti, q.q)
>    ->  Nested Loop  (cost=3069.79..6530.71 rows=933 width=774) (actual time=479.122..98067.594 rows=5119 loops=1)
>          ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.003..0.004 rows=1 loops=1)
>          ->  Bitmap Heap Scan on fulltext_article  (cost=3069.79..6516.70 rows=933 width=742) (actual time=341.739..37112.110 rows=5119 loops=1)
>                Recheck Cond: (fulltext_article.idxfti @@ q.q)
>                ->  Bitmap Index Scan on fulltext_article_idxfti_idx  (cost=0.00..3069.56 rows=933 width=0) (actual time=321.443..321.443 rows=5119 loops=1)
>                      Index Cond: (fulltext_article.idxfti @@ q.q)
>  Total runtime: 98087.575 ms
> (9 rows)
>
> Using rank_cd:
>
> # explain analyze select * FROM fulltext_article, to_tsquery('simple','cat') AS q WHERE idxfti @@ q ORDER BY rank_cd(idxfti, q) DESC;
>                               QUERY PLAN
>  Sort  (cost=6576.74..6579.07 rows=933 width=774) (actual time=199316.648..199324.631 rows=26054 loops=1)
>    Sort Key: rank_cd(fulltext_article.idxfti, q.q)
>    ->  Nested Loop  (cost=3069.79..6530.71 rows=933 width=774) (actual time=871.428..199244.330 rows=26054 loops=1)
>          ->  Function Scan on q  (cost=0.00..0.01 rows=1 width=32) (actual time=0.006..0.007 rows=1 loops=1)
>          ->  Bitmap Heap Scan on fulltext_article  (cost=3069.79..6516.70 rows=933 width=742) (actual time=850.674..50146.477 rows=26054 loops=1)
>                Recheck Cond: (fulltext_article.idxfti @@ q.q)
>                ->  Bitmap Index Scan on fulltext_article_idxfti_idx  (cost=0.00..3069.56 rows=933 width=0) (actual time=838.120..838.120 rows=26054 loops=1)
>                      Index Cond: (fulltext_article.idxfti @@ q.q)
>  Total runtime: 199338.297 ms
> (9 rows)
>
> regards, tom lane

Regards, Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
Re: [PERFORM] Huge amount of memory consumed during transaction
henk de wit [EMAIL PROTECTED] writes:
> ERROR: out of memory
> DETAIL: Failed on request of size 4194304.

This error should have produced a map of per-context memory use in the postmaster log. Please show us that.

regards, tom lane
Re: [PERFORM] [GENERAL] Slow TSearch2 performance for table with 1 million documents.
Benjamin Arai [EMAIL PROTECTED] writes:
> It appears that the ORDER BY rank operation is the slowing factor. If I remove it then the query is pretty fast. Is there another way to perform ORDER BY such that it does not do a sort?

I think you misunderstood: it's not the sort that's slow, it's the computation of the rank() values that are inputs to the sort.

regards, tom lane
Re: [PERFORM] [GENERAL] Slow TSearch2 performance for table with 1 million documents.
Oh, I see. I didn't look carefully at the EXPLAIN ANALYZE I posted. So, is there a solution to the rank problem?

Benjamin

On Oct 11, 2007, at 8:53 AM, Tom Lane wrote:
> Benjamin Arai [EMAIL PROTECTED] writes:
> > It appears that the ORDER BY rank operation is the slowing factor. If I remove it then the query is pretty fast. Is there another way to perform ORDER BY such that it does not do a sort?
> I think you misunderstood: it's not the sort that's slow, it's the computation of the rank() values that are inputs to the sort.
> regards, tom lane
Re: [PERFORM] Huge amount of memory consumed during transaction
On Oct 11, 2007, at 9:51 AM, Tom Lane wrote:
> henk de wit [EMAIL PROTECTED] writes:
> > ERROR: out of memory
> > DETAIL: Failed on request of size 4194304.
> This error should have produced a map of per-context memory use in the postmaster log. Please show us that.
> regards, tom lane

Tom, are there any docs anywhere that explain how to interpret those per-context memory dumps? For example, when I see an autovacuum context listed, is it safe to assume that the error came from an autovac operation, etc.?

Erik Jones
Software Developer | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)
Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
Re: [PERFORM] Performance problems with prepared statements
On 10/11/07, Theo Kramer [EMAIL PROTECTED] wrote:
> On Thu, 2007-10-11 at 10:12 +0100, Richard Huxton wrote:
> > Theo Kramer wrote:
> > > So I suspect that there is something more fundamental here...
> > OK, so there must be something different between the two scenarios. It can only be one of:
> > 1. Query
> > 2. DB environment (user, locale, settings)
> > 3. Network environment (server/client/network activity etc.)
> I suspect that it could also be in the way libpq's PQprepare() and PQexecPrepared() are handled, as opposed to the way PREPARE and EXECUTE are handled.

PQexecPrepared is generally the fastest way to run queries from a C app, as long as you get the right plan. Some suggestions:

* you can explain/explain analyze prepared statements from the psql shell... try that and see if you can reproduce the results
* at worst, you can drop to PQexecParams, which is faster (and better) than PQexec at least
* if the problem is plan related, you can always disable certain plan types (seqscan), prepare, and then re-enable those plan types
* do as Jonah suggested: the first step is to try to reproduce the problem from psql

merlin
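The third suggestion can be sketched like this, using the cq statement from earlier in the thread (enable_seqscan is just one of the enable_* planner settings, and the '<' operators are assumed):

```sql
SET enable_seqscan = off;  -- heavily penalize sequential scans during planning

PREPARE cq (char(12), smallint, integer) AS
  SELECT oid, calllog_mainteng, calllog_phase, calllog_self
  FROM calllog
  WHERE calllog_mainteng = $1 AND calllog_phase = $2 AND calllog_self < $3
     OR calllog_mainteng = $1 AND calllog_phase < $2
  ORDER BY calllog_mainteng DESC, calllog_phase DESC, calllog_self DESC
  LIMIT 25;

EXPLAIN ANALYZE EXECUTE cq ('124 ', 8, 366942);

SET enable_seqscan = on;   -- restore the default
```

Because the plan is chosen at PREPARE time, the setting only needs to be off while the statement is prepared.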
Re: [PERFORM] Huge amount of memory consumed during transaction
Erik Jones [EMAIL PROTECTED] writes:
> Tom, are there any docs anywhere that explain how to interpret those per-context memory dumps?

No, not really. What you have to do is grovel around in the code and see where contexts with particular names might get created.

> For example, when I see an autovacuum context listed is it safe to assume that the error came from an autovac operation, etc.?

Probably, but I haven't looked.

regards, tom lane
Re: [PERFORM] Performance problems with prepared statements
On 2007-10-10, Theo Kramer [EMAIL PROTECTED] wrote:
> When doing a 'manual' prepare and explain analyze I get the following:
> rascal=# prepare cq (char(12), smallint, integer) as
>   SELECT oid, calllog_mainteng, calllog_phase, calllog_self
>   FROM calllog
>   WHERE calllog_mainteng = $1 AND calllog_phase = $2 AND calllog_self < $3
>      OR calllog_mainteng = $1 AND calllog_phase < $2
>   ORDER BY calllog_mainteng DESC, calllog_phase DESC, calllog_self DESC
>   LIMIT 25;
> PREPARE

When you do this from the application, are you passing it 3 parameters, or 5? The plan is clearly taking advantage of the fact that the two occurrences of $1 and $2 are known to be the same value; if your app is using some interface that uses ? placeholders rather than numbered parameters, then the planner will not be able to make this assumption. Also, from the application, is the LIMIT 25 passed as a constant or is that also a parameter?

-- Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
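Andrew's point can be made concrete: an interface built on '?' placeholders typically emits one numbered parameter per placeholder, so the repeated column tests become independent parameters. A sketch (the '<' operators are assumed):

```sql
-- 3 numbered parameters: the planner knows both uses of $1
-- (and both uses of $2) refer to the same value.
PREPARE cq3 (char(12), smallint, integer) AS
  SELECT oid FROM calllog
  WHERE calllog_mainteng = $1 AND calllog_phase = $2 AND calllog_self < $3
     OR calllog_mainteng = $1 AND calllog_phase < $2;

-- 5 parameters, as a '?'-placeholder interface would send them:
-- the planner cannot assume $1 = $4 or $2 = $5, so the plan that
-- scans a single index range on $1 is no longer available.
PREPARE cq5 (char(12), smallint, integer, char(12), smallint) AS
  SELECT oid FROM calllog
  WHERE calllog_mainteng = $1 AND calllog_phase = $2 AND calllog_self < $3
     OR calllog_mainteng = $4 AND calllog_phase < $5;
```

Comparing EXPLAIN EXECUTE output for the two forms would show whether this is the source of the regression.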
Re: [PERFORM] Performance problems with prepared statements
On 10/11/07, Andrew - Supernews [EMAIL PROTECTED] wrote:
> On 2007-10-10, Theo Kramer [EMAIL PROTECTED] wrote:
> > When doing a 'manual' prepare and explain analyze I get the following:
> > rascal=# prepare cq (char(12), smallint, integer) as
> >   SELECT oid, calllog_mainteng, calllog_phase, calllog_self
> >   FROM calllog
> >   WHERE calllog_mainteng = $1 AND calllog_phase = $2 AND calllog_self < $3
> >      OR calllog_mainteng = $1 AND calllog_phase < $2
> >   ORDER BY calllog_mainteng DESC, calllog_phase DESC, calllog_self DESC
> >   LIMIT 25;
> > PREPARE
> When you do this from the application, are you passing it 3 parameters, or 5? The plan is clearly taking advantage of the fact that the two occurrences of $1 and $2 are known to be the same value; if your app is using some interface that uses ? placeholders rather than numbered parameters, then the planner will not be able to make this assumption. Also, from the application, is the LIMIT 25 passed as a constant or is that also a parameter?

Also, this looks a bit like a drilldown query, which is ordering the table on 2+ fields. If that's the case, row-wise comparison is a better and faster approach. Is this a converted COBOL app?

merlin
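The row-wise comparison mentioned here collapses the OR into a single row-value test, which PostgreSQL 8.2 can evaluate as one index condition. A sketch, assuming the stripped operators are '<' (the direction is inferred from the DESC ordering):

```sql
-- Equivalent to:
--   calllog_mainteng = $1 AND
--   ((calllog_phase = $2 AND calllog_self < $3) OR calllog_phase < $2)
PREPARE cq_row (char(12), smallint, integer) AS
  SELECT oid, calllog_mainteng, calllog_phase, calllog_self
  FROM calllog
  WHERE calllog_mainteng = $1
    AND (calllog_phase, calllog_self) < ($2, $3)
  ORDER BY calllog_mainteng DESC, calllog_phase DESC, calllog_self DESC
  LIMIT 25;
```

With an index on (calllog_mainteng, calllog_phase, calllog_self), this form expresses the "next page" boundary directly instead of making the planner reason about the OR.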
Re: [PERFORM] Huge amount of memory consumed during transaction
> How is the memory consumed? How are you measuring it? I assume you mean the postgres process that is running the query uses the memory. If so, which tool(s) are you using and what's the output that shows it being used?

It's periodically measured and recorded by a script, of which the relevant parts are:

GET_VSZ=`ps aux | grep $REQ_GREP | grep -v grep | grep -v $$ | awk '{print \$5}' | sort -n | tail -n1`;
GET_RSS=`ps aux | grep $REQ_GREP | grep -v grep | grep -v $$ | awk '{print \$6}' | sort -n | tail -n1`;

From this I draw graphs using Cacti. I just checked a recent transaction; during this transaction, which involved about 900.000 rows, VSZ peaked at 2.36GB, with RSS then peaking at 2.27GB. This memory usage is on top of shared_buffers being set back to 320MB. Shortly after the transaction finished, memory usage indeed dropped back to a nearly steady 320MB. (Btw, I mistyped the rows involved in the original post; the 2GB memory usage is for 900.000 rows, not 300.000.)

After some more digging, I found out that the immense increase in memory usage started fairly recently (but before the increase of my shared_buffers; that just caused the out of memory exception). E.g. for a transaction with 300.000 rows involved a few weeks back, memory usage stayed at a rather moderate 546MB/408MB (including 320MB for shared_buffers), and for some 800.000 rows memory usage peaked at 'only' 631/598. When I draw a graph of rows involved vs memory usage there is a direct relation; apart from a few exceptions, it's clear that the more rows are involved, the more memory is consumed. I'll have to check what exactly was changed in the PG installation recently, but nevertheless, even with the more moderate memory consumption it becomes clear that PG eventually runs out of memory when more and more rows are involved.

> I believe that large transactions with foreign keys are known to cause this problem.

As far as I can see there are no, or nearly no, foreign keys involved in the transaction I'm having problems with.

> How much memory does this machine have?

It's in the original post: 8GB ;)

> If you've given all your memory to shared_buffers, there might not be any left.

I have of course not given all memory to shared_buffers. I tried to apply the rule of thumb of setting it to 1/4 of total memory; to be a little conservative, even a little less than that. 1/4 of 8GB is 2GB, so I tried with 1.5GB to start. All other queries and small transactions run fine (we're talking about thousands upon thousands of queries and 100's of different ones). It's this huge transaction that occupies so much memory.

> Lastly, what does explain your query here say?

I can't really test that easily now and it'll be a huge explain anyway (the query is almost 500 lines :X). I'll try to get one though.
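The measurement the script performs can be checked in isolation by feeding the same pipeline canned `ps aux` output instead of live process data (a sketch; the sample lines and numbers below are made up, and the real script greps for the postgres processes first):

```shell
# ps aux field 5 is VSZ and field 6 is RSS, both in KB; the script takes
# the per-field maximum across matching processes via sort -n | tail -n1.
sample='postgres  101  0.5  4.0   340000  330000 ?  Ss  10:00  0:01 postgres: writer process
postgres  102  9.9 28.0  2476032 2327552 ?  Rs  10:00  9:59 postgres: henk db [local] SELECT'
peak_vsz=$(printf '%s\n' "$sample" | awk '{print $5}' | sort -n | tail -n1)
peak_rss=$(printf '%s\n' "$sample" | awk '{print $6}' | sort -n | tail -n1)
echo "peak VSZ=${peak_vsz}KB RSS=${peak_rss}KB"
```

One caveat of this approach: VSZ includes the shared_buffers segment mapped into every backend, so per-query heap growth is the peak minus that shared baseline, as noted above.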
Re: [PERFORM] Huge amount of memory consumed during transaction
> This error should have produced a map of per-context memory use in the postmaster log. Please show us that.

I'm not exactly sure what to look for in the log. I'll do my best though and see what I can come up with.
Re: [PERFORM] Huge amount of memory consumed during transaction
henk de wit [EMAIL PROTECTED] writes:
> I'm not exactly sure what to look for in the log. I'll do my best though and see what I can come up with.

It'll be a bunch of lines like

TopMemoryContext: 49832 total in 6 blocks; 8528 free (6 chunks); 41304 used

immediately in front of the out-of-memory ERROR report.

regards, tom lane