[PERFORM] size of cache
With my application, it seems that the size of the cache has a great effect: going from 512 KB to 1 MB of L2 cache boosts performance by a factor of 3, and going from 1 MB to 2 MB gains another 20%. I don't understand why a 512 KB L2 cache is too small to fit the data.

Is there a tool to trace processor activity and confirm that the processor is waiting for memory?

Is there a tool to snapshot Postgres activity and understand where we spend time, so we can potentially avoid the bottleneck?

Thanks for your tips.

-- 
Jean-Max Reymond
CKR Solutions Open Source
Nice, France
http://www.ckr-solutions.com

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
[PERFORM] large table vs multiple small tables
Hello,

I have a large database with 4 large tables (each containing at least 200,000 rows, perhaps even 1 or 2 million), and I wonder whether it would be better to split them into small tables (e.g. tables of 2,000 rows) to speed up access to and updates of those tables (considering that I will have few updates but a lot of reads). Do you think that would be efficient?

Nicolas, wondering if he hasn't been too greedy

-- 
« soyez ce que vous voudriez avoir l'air d'être » ("Be what you would seem to be") -- Lewis Carroll
Re: [PERFORM] General DB Tuning
On Wed, Jul 13, 2005 at 09:52:20AM +0800, Christopher Kings-Lynne wrote:
> The 8.0.2 jdbc driver uses real prepared statements instead of faked
> ones. The problem is the new protocol (that the 8.0.2 driver uses)
> has a bug where protocol-prepared queries don't get logged properly.
> I don't know if it's been fixed...

It's not in 8.0.3, but I was having the same problems with DBD::Pg, so I backported some of it and also changed the code so that it lists the values of the bind parameters. You get something like:

LOG:  statement: SELECT sr.name, sr.seq_region_id, sr.length, 1 FROM seq_region sr WHERE sr.name = $1 AND sr.coord_system_id = $2
LOG:  binding: "dbdpg_2" with 2 parameters
LOG:  bind "dbdpg_2" $1 = "20"
LOG:  bind "dbdpg_2" $2 = "1"
LOG:  statement: EXECUTE [PREPARE: SELECT sr.name, sr.seq_region_id, sr.length, 1 FROM seq_region sr WHERE sr.name = $1 AND sr.coord_system_id = $2]
LOG:  duration: 0.164 ms

I've attached a patch in case anyone finds it useful.

-Mark

*** postgresql-8.0.3/src/backend/tcop/postgres.c	2005-07-13 09:42:04.997669193 +0100
--- postgresql-8.0.3/src/backend/tcop/postgres.c	2005-07-13 09:34:24.618195580 +0100
***************
*** 1370,1375 ****
--- 1370,1378 ----
  	else
  		portal = CreatePortal(portal_name, false, false);
  
+ 	if (log_statement == LOGSTMT_ALL)
+ 		ereport(LOG, (errmsg("binding: \"%s\" with %d parameters", stmt_name, numParams)));
+ 
  	/*
  	 * Fetch parameters, if any, and store in the portal's memory context.
  	 */
***************
*** 1428,1433 ****
--- 1431,1439 ----
  				 * grotty but is a big win when dealing with very
  				 * large parameter strings.
  				 */
+ 				if (log_statement == LOGSTMT_ALL)
+ 					ereport(LOG, (errmsg("bind \"%s\" $%d = \"%s\"", stmt_name, i + 1, pvalue)));
+ 
  				pbuf.data = (char *) pvalue;
  				pbuf.maxlen = plength + 1;
  				pbuf.len = plength;
***************
*** 1578,1583 ****
--- 1584,1593 ----
  	bool		is_trans_exit = false;
  	bool		completed;
  	char		completionTag[COMPLETION_TAG_BUFSIZE];
+ 	struct timeval start_t, stop_t;
+ 	bool		save_log_duration = log_duration;
+ 	int			save_log_min_duration_statement = log_min_duration_statement;
+ 	bool		save_log_statement_stats = log_statement_stats;
  
  	/* Adjust destination to tell printtup.c what to do */
  	dest = whereToSendOutput;
***************
*** 1614,1619 ****
--- 1624,1647 ----
  	set_ps_display(portal->commandTag);
  
+ 	/*
+ 	 * We use save_log_* so SET log_duration = true and SET
+ 	 * log_min_duration_statement = true don't report incorrect time
+ 	 * because gettimeofday() wasn't called. Similarly,
+ 	 * log_statement_stats has to be captured once.
+ 	 */
+ 	if (save_log_duration || save_log_min_duration_statement != -1)
+ 		gettimeofday(&start_t, NULL);
+ 
+ 	if (save_log_statement_stats)
+ 		ResetUsage();
+ 
+ 	if (log_statement == LOGSTMT_ALL)
+ 		/* We have the portal, so output the source query. */
+ 		ereport(LOG,
+ 				(errmsg("statement: EXECUTE %s [PREPARE: %s]", portal_name,
+ 						portal->sourceText ? portal->sourceText : "")));
+ 
  	BeginCommand(portal->commandTag, dest);
  
  	/* Check for transaction-control commands */
***************
*** 1708,1713 ****
--- 1736,1785 ----
  		pq_putemptymessage('s');
  	}
  
+ 	/*
+ 	 * Combine processing here as we need to calculate the query duration
+ 	 * in both instances.
+ 	 */
+ 	if (save_log_duration || save_log_min_duration_statement != -1)
+ 	{
+ 		long		usecs;
+ 
+ 		gettimeofday(&stop_t, NULL);
+ 		if (stop_t.tv_usec < start_t.tv_usec)
+ 		{
+ 			stop_t.tv_sec--;
+ 			stop_t.tv_usec += 1000000;
+ 		}
+ 		usecs = (long) (stop_t.tv_sec - start_t.tv_sec) * 1000000 +
+ 				(long) (stop_t.tv_usec - start_t.tv_usec);
+ 
+ 		/* Only print duration if we previously printed the statement. */
+ 		if (log_statement == LOGSTMT_ALL && save_log_duration)
+ 			ereport(LOG,
+ 					(errmsg("duration: %ld.%03ld ms",
+ 							(long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 +
+ 									(stop_t.tv_usec - start_t.tv_usec) / 1000),
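The duration bookkeeping in the patch above, subtracting two struct timeval values with a manual borrow from tv_sec when tv_usec would go negative, can be sketched as a standalone helper. This is an illustration only, not PostgreSQL code; the function name elapsed_usecs is invented here:

```c
#include <sys/time.h>

/* Return the elapsed time between start and stop in microseconds.
 * When stop's tv_usec is smaller than start's, borrow one second
 * (1000000 us) from tv_sec, as the patch does before printing
 * "duration: %ld.%03ld ms". */
long elapsed_usecs(struct timeval start, struct timeval stop)
{
    if (stop.tv_usec < start.tv_usec)
    {
        stop.tv_sec--;
        stop.tv_usec += 1000000;
    }
    return (long) (stop.tv_sec - start.tv_sec) * 1000000 +
           (long) (stop.tv_usec - start.tv_usec);
}
```

For example, 1.9 s to 2.1 s yields 200000 microseconds; without the borrow, the naive subtraction would misreport the interval.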
[PERFORM] (no subject)
> Nicolas,
>
> These sizes would not be considered large. I would leave them as
> single tables.
>
> Ken

OK, I thought it was large, but I must confess I'm relatively new to the database world. Thank you for the answer.

Just another question: what is the maximum number of rows that a cursor can contain?

Nicolas, having a lot of things to learn

-- 
« soyez ce que vous voudriez avoir l'air d'être » ("Be what you would seem to be") -- Lewis Carroll
Re: [PERFORM] cost-based vacuum
On Tue, 2005-07-12 at 13:50 -0400, Ian Westmacott wrote:
> It appears not to matter whether it is one of the tables being
> written to that is ANALYZEd. I can ANALYZE an old, quiescent table,
> or a system table, and see this effect.

Can you confirm that this effect is still seen even when the ANALYZE doesn't touch *any* of the tables being accessed?

> - this is a dual Xeon.

Is that Xeon MP then?

> - Looking at oprofile reports for 10-minute runs of a database-wide
> VACUUM with vacuum_cost_delay=0 and 1000, shows the latter spending a
> lot of time in LWLockAcquire and LWLockRelease (20% each vs. 2%).

Is this associated with high context switching also?

Best Regards, Simon Riggs
Re: [PERFORM] size of cache
On Wed, 2005-07-13 at 10:20 +0200, Jean-Max Reymond wrote:
> with my application, it seems that the size of the cache has a great
> effect: going from 512 KB to 1 MB of L2 cache boosts performance by a
> factor of 3, and going from 1 MB to 2 MB gains another 20%.

Memory request time is the main bottleneck in well-tuned database systems, so your results could be reasonable.

> I don't understand why a 512 KB L2 cache is too small to fit the
> data. Is there a tool to trace processor activity and confirm that
> the processor is waiting for memory?

You have both data and instruction cache on the CPU. It is likely the instruction cache that is too small to fit all of the code required for your application's workload mix. Use Intel VTune or similar to show the results you seek.

Best Regards, Simon Riggs
Re: [PERFORM] General DB Tuning
On Wed, 2005-07-13 at 09:52 +0800, Christopher Kings-Lynne wrote:
>> Is there a different kind of 'prepared' statements that we should be
>> using in the driver to get logging to work properly? What is the
>> 'new' protocol?
>
> The 8.0.2 jdbc driver uses real prepared statements instead of faked
> ones. The problem is the new protocol (that the 8.0.2 driver uses)
> has a bug where protocol-prepared queries don't get logged properly.
> I don't know if it's been fixed...

Yes, there is a fix for this in 8.1. Brent has been sent the details.

Best Regards, Simon Riggs
Re: [PERFORM] cost-based vacuum
On Wed, 2005-07-13 at 11:55, Simon Riggs wrote:
> On Tue, 2005-07-12 at 13:50 -0400, Ian Westmacott wrote:
>> It appears not to matter whether it is one of the tables being
>> written to that is ANALYZEd. I can ANALYZE an old, quiescent table,
>> or a system table, and see this effect.
>
> Can you confirm that this effect is still seen even when the ANALYZE
> doesn't touch *any* of the tables being accessed?

Yes.

>> - this is a dual Xeon.
>
> Is that Xeon MP then?

Yes.

>> - Looking at oprofile reports for 10-minute runs of a database-wide
>> VACUUM with vacuum_cost_delay=0 and 1000, shows the latter spending
>> a lot of time in LWLockAcquire and LWLockRelease (20% each vs. 2%).
>
> Is this associated with high context switching also?

Yes, it appears that context switches increase up to 4-5x during cost-based ANALYZE.

--Ian
[PERFORM] Quad Opteron stuck in the mud
Gurus,

A table in one of my databases has just crossed the 30 million row mark and has begun to feel very sluggish for just about anything I do with it. I keep the entire database vacuumed regularly. And, as long as I'm not doing a sequential scan, things seem reasonably quick most of the time. I'm now thinking that my problem is I/O, because anything that involves heavy I/O (like a seq scan) seems to slow to a crawl. Even if I am using indexed fields to grab a few thousand rows, once it goes to sequential scans it gets very, very slow.

I have also had occurrences where queries will not finish for days (I eventually have to kill them). I was hoping to provide an EXPLAIN ANALYZE for them, but if they never finish... even the EXPLAIN never finishes when I try that.

For example, as I'm writing this, I am running an UPDATE statement that will affect a small part of the table, and is querying on an indexed boolean field. I have been waiting for over an hour and a half as I write this and it still hasn't finished. I'm thinking, "I bet Tom, Simon or Josh wouldn't put up with this kind of wait time," so I thought I would see if anyone here had some pointers. Maybe I have a really stupid setting in my conf file that is causing this. I really can't believe I am at the limits of this hardware, however.
The query:

update eventactivity set ftindex = false where ftindex = true;
(added the where clause because I don't want to alter where ftindex is null)

The table:

    Column    |            Type             | Modifiers
-------------+-----------------------------+-----------
 entrydate   | timestamp without time zone |
 incidentid  | character varying(40)       |
 statustype  | character varying(20)       |
 unitid      | character varying(20)       |
 recordtext  | character varying(255)      |
 recordtext2 | character varying(255)      |
 insertdate  | timestamp without time zone |
 ftindex     | boolean                     |
Indexes: eventactivity1 btree (incidentid),
         eventactivity_entrydate_idx btree (entrydate),
         eventactivity_ftindex_idx btree (ftindex),
         eventactivity_oid_idx btree (oid)

The hardware:

4 x 2.2GHz Opterons
12 GB of RAM
4x10k 73GB Ultra320 SCSI drives in RAID 0+1
1GB hardware cache memory on the RAID controller

The OS: Fedora, kernel 2.6.6-1.435.2.3smp (redhat stock kernel); filesystem is mounted as ext2.

# vmstat output (as I am waiting for this to finish):
procs -----------memory---------- ---swap-- -----io----- --system-- ----cpu----
 r  b  swpd    free   buff   cache  si  so    bi   bo   in   cs  us sy id wa
 0  1  5436 2823908  26140 9183704   0   1  2211  540  694  336   9  2 76 13

# iostat output (as I am waiting for this to finish):
avg-cpu:  %user  %nice  %sys  %iowait  %idle
           9.19   0.00  2.19    13.08  75.53

Device:      tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cciss/c0d0  329.26    17686.03      4317.57  161788630   39496378

# This is a dedicated postgresql server, so maybe some of these settings are more liberal than they should be?

Relevant (I hope) postgresql.conf options are:

shared_buffers = 5
effective_cache_size = 1348000
random_page_cost = 3
work_mem = 512000
max_fsm_pages = 8
log_min_duration_statement = 6
fsync = true  (not sure if I'm daring enough to run without this)
wal_buffers = 1000
checkpoint_segments = 64
checkpoint_timeout = 3000
# FOR PG_AUTOVACUUM
stats_command_string = true
stats_row_level = true

Thanks in advance,
Dan
Re: [PERFORM] cost-based vacuum
Ian Westmacott <[EMAIL PROTECTED]> writes:
> On Wed, 2005-07-13 at 11:55, Simon Riggs wrote:
>> Can you confirm that this effect is still seen even when the ANALYZE
>> doesn't touch *any* of the tables being accessed?
>
> Yes.

This really isn't making any sense at all. I took another look through the vacuum_delay_point() calls, and I can see a couple that are questionably placed:

* the one in count_nondeletable_pages() is done while we are holding exclusive lock on the table; we might be better off not to delay there, so as not to block non-VACUUM processes longer than we have to.

* the ones in hashbulkdelete and rtbulkdelete are done while holding various forms of exclusive locks on the index (this was formerly true of gistbulkdelete as well). Again it might be better not to delay.

However, these certainly do not explain Ian's problem, because (a) these only apply to VACUUM, not ANALYZE; (b) they would only lock the table being VACUUMed, not other ones; (c) if these locks were to block the reader or writer thread, it'd manifest as blocking on a semaphore, not as a surge in LWLock thrashing.

>> Is that Xeon MP then?
>
> Yes.

The LWLock activity is certainly suggestive of prior reports of excessive buffer manager lock contention, but it makes *no* sense that that would be higher with vacuum cost delay than without. I'd have expected the other way around. I'd really like to see a test case for this...

regards, tom lane
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:
> I might be wrong, but there may be something much more substantially
> wrong than slow i/o.
>
> John

Yes, I'm afraid of that too. I just don't know what tools I should use to figure that out. I have some 20 other databases on this system, same schema but varying sizes, and the small ones perform very well. It feels like there is an O(n) increase in wait time that has recently become very noticeable on the largest of them.

-Dan
Re: [PERFORM] Quad Opteron stuck in the mud
* Dan Harris ([EMAIL PROTECTED]) wrote:
> On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:
>> I might be wrong, but there may be something much more substantially
>> wrong than slow i/o.
>
> Yes, I'm afraid of that too. I just don't know what tools I should
> use to figure that out. I have some 20 other databases on this
> system, same schema but varying sizes, and the small ones perform
> very well. It feels like there is an O(n) increase in wait time that
> has recently become very noticeable on the largest of them.

Could you come up w/ a test case that others could reproduce where explain isn't returning? I think that would be very useful towards solving at least that issue...

Thanks,
Stephen
Re: [PERFORM] Quad Opteron stuck in the mud
On Wed, Jul 13, 2005 at 01:16:25PM -0600, Dan Harris wrote:
> On Jul 13, 2005, at 1:11 PM, John A Meinel wrote:
>> I might be wrong, but there may be something much more substantially
>> wrong than slow i/o.
>
> Yes, I'm afraid of that too. I just don't know what tools I should
> use to figure that out. I have some 20 other databases on this
> system, same schema but varying sizes, and the small ones perform
> very well. It feels like there is an O(n) increase in wait time that
> has recently become very noticeable on the largest of them.

I'd guess it's stuck on some lock. Try that EXPLAIN, and when it blocks, watch the pg_locks view for locks not granted to the process executing the EXPLAIN. Then check what else is holding the locks.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
"La rebeldía es la virtud original del hombre" ("Rebellion is man's original virtue") (Arthur Schopenhauer)
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 2:17 PM, Stephen Frost wrote:
> Could you come up w/ a test case that others could reproduce where
> explain isn't returning?

This was simply due to my n00bness :) I had always been doing EXPLAIN ANALYZE, instead of just EXPLAIN. Next time one of these queries comes up, I will be sure to do the EXPLAIN without ANALYZE.

FYI, that UPDATE query I mentioned in the initial thread just finished, after updating 8.3 million rows.

-Dan
Re: [PERFORM] cost-based vacuum
On Wed, 2005-07-13 at 14:58 -0400, Tom Lane wrote:
> Ian Westmacott <[EMAIL PROTECTED]> writes:
>> On Wed, 2005-07-13 at 11:55, Simon Riggs wrote:
>>> Can you confirm that this effect is still seen even when the
>>> ANALYZE doesn't touch *any* of the tables being accessed?
>
>> Yes.
>
> This really isn't making any sense at all.

Agreed. I think all of this indicates that some weirdness (technical term) is happening at a different level in the computing stack. I think all of this points fairly strongly to it *not* being a PostgreSQL algorithm problem; i.e., if the code was executed by an idealised Knuth-like CPU then we would not get this problem. Plus, I have faith that if it was a problem in that plane then you or another would have uncovered it by now.

> However, these certainly do not explain Ian's problem, because (a)
> these only apply to VACUUM, not ANALYZE; (b) they would only lock the
> table being VACUUMed, not other ones; (c) if these locks were to
> block the reader or writer thread, it'd manifest as blocking on a
> semaphore, not as a surge in LWLock thrashing.

I've seen enough circumstantial evidence to connect the time spent inside LWLockAcquire/Release to the semaphore ops within them, not the other aspects of the code.

Months ago we discussed the problem of false sharing on closely packed arrays of shared variables because of the large cache line size of the Xeon MP. When last we touched on that thought, I focused on the idea that the LWLock array was too tightly packed for the predefined locks. What we didn't discuss (because I was too focused on the other array) was that the PGPROC shared array is equally tightly packed, which could give problems on the semaphores in LWLock. Intel says fairly clearly that this would be an issue.

>> Is that Xeon MP then?
>
>> Yes.
>
> The LWLock activity is certainly suggestive of prior reports of
> excessive buffer manager lock contention, but it makes *no* sense
> that that would be higher with vacuum cost delay than without. I'd
> have expected the other way around. I'd really like to see a test
> case for this...

My feeling is that a micro-architecture test would be more likely to reveal some interesting information.

Best Regards, Simon Riggs
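The false-sharing problem described here, adjacent elements of a tightly packed shared array landing on the same cache line, so two CPUs invalidate each other's copies even though they touch different slots, can be sketched with a padded-array layout. This is an illustration only, not PostgreSQL's actual LWLock or PGPROC layout; the 128-byte line size is an assumption matching the Xeon MP era discussed in this thread:

```c
#include <stddef.h>

#define CACHE_LINE_SIZE 128  /* assumed effective line size for a Xeon MP-era CPU */

/* Tightly packed: neighboring counters share one cache line, so
 * writes by different CPUs ping-pong the line between sockets. */
struct packed_slot
{
    volatile int value;
};

/* Padded: each slot fills a whole cache line, so neighboring slots
 * can be updated by different CPUs without false sharing. */
struct padded_slot
{
    volatile int value;
    char pad[CACHE_LINE_SIZE - sizeof(int)];
};
```

The trade-off is memory: padding a 4-byte counter to 128 bytes costs a 32x space overhead, which is why tightly packed arrays are tempting in the first place.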
Re: [PERFORM] Quad Opteron stuck in the mud
On Jul 13, 2005, at 2:54 PM, Dan Harris wrote:
> 4 x 2.2GHz Opterons
> 12 GB of RAM
> 4x10k 73GB Ultra320 SCSI drives in RAID 0+1
> 1GB hardware cache memory on the RAID controller

If it is taking that long to update about 25% of your table, then you must be I/O bound. Check I/O while you're running a big query.

Also, what RAID controller are you running? Be sure you have the latest BIOS and drivers for it.

On a pair of dual Opterons, I can do large operations on tables with 100 million rows much faster than you seem to be able to. I have MegaRAID 320-2x controllers with 15kRPM drives.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
Re: [PERFORM] cost-based vacuum
I can at least report that the problem does not seem to occur with Postgres 8.0.1 running on a dual Opteron.

--Ian

On Wed, 2005-07-13 at 16:39, Simon Riggs wrote:
> My feeling is that a micro-architecture test would be more likely to
> reveal some interesting information.
>
> Best Regards, Simon Riggs
Re: [PERFORM] Quad Opteron stuck in the mud
On Wed, 2005-07-13 at 12:54 -0600, Dan Harris wrote:
> For example, as I'm writing this, I am running an UPDATE statement
> that will affect a small part of the table, and is querying on an
> indexed boolean field.

An indexed boolean field? Hopefully, ftindex is false for very few rows of the table?

Try changing ftindex to be a partial index, so you only index the false values. Or don't index it at all.

Split the table up into smaller pieces.

Don't use an UPDATE statement. Keep a second table, and insert records into it when you would have updated previously. If a row is not found, you know that it has ftindex=true. That way, you'll never have row versions building up in the main table, which you'll still get even if you VACUUM.

Best Regards, Simon Riggs
[PERFORM] Slow Query
Hi,

I'm having a problem with a query that performs a sequential scan on a table when it should be performing an index scan. The interesting thing is, when we dumped the database on another server, it performed an index scan on that server. The systems are running the same version of Postgres (7.4.8), and the problem persists after running an ANALYZE VERBOSE and after a REINDEX TABLE sq_ast FORCE. The only differences that I can see are that the postgresql.conf files differ slightly, and the hardware is different. Note that the system performing the sequential scan is a Dual 2.8GHz Xeon, 4GB RAM, 300GB HDD, and the system performing the index scan is not as powerful.

A copy of the postgresql.conf for the system performing the index scan can be found at http://beta.squiz.net/~mmcintyre/postgresql_squiz_uk.conf
A copy of the postgresql.conf for the system performing the sequential scan can be found at http://beta.squiz.net/~mmcintyre/postgresql_future.conf

The query:

SELECT a.assetid, a.short_name, a.type_code, a.status, l.linkid, l.link_type, l.sort_order, lt.num_kids, u.url, ap.path,
       CASE u.http WHEN '1' THEN 'http' WHEN '0' THEN 'https' END AS protocol
FROM ((sq_ast a LEFT JOIN sq_ast_url u ON a.assetid = u.assetid)
      LEFT JOIN sq_ast_path ap ON a.assetid = ap.assetid),
     sq_ast_lnk l, sq_ast_lnk_tree lt
WHERE a.assetid = l.minorid
  AND l.linkid = lt.linkid
  AND l.majorid = '2'
  AND l.link_type = 2
ORDER BY sort_order

The EXPLAIN ANALYZE from the system performing a sequential scan:

Sort  (cost=30079.79..30079.89 rows=42 width=113) (actual time=39889.989..39890.346 rows=260 loops=1)
  Sort Key: l.sort_order
  ->  Nested Loop  (cost=25638.02..30078.65 rows=42 width=113) (actual time=9056.336..39888.557 rows=260 loops=1)
        ->  Merge Join  (cost=25638.02..29736.01 rows=25 width=109) (actual time=9056.246..39389.359 rows=260 loops=1)
              Merge Cond: (("outer".assetid)::text = "inner"."?column5?")
              ->  Merge Left Join  (cost=25410.50..29132.82 rows=150816 width=97) (actual time=8378.176..38742.111 rows=150567 loops=1)
                    Merge Cond: (("outer".assetid)::text = ("inner".assetid)::text)
                    ->  Merge Left Join  (cost=25410.50..26165.14 rows=150816 width=83) (actual time=8378.130..9656.413 rows=150489 loops=1)
                          Merge Cond: ("outer"."?column5?" = "inner"."?column4?")
                          ->  Sort  (cost=25408.17..25785.21 rows=150816 width=48) (actual time=8377.733..8609.218 rows=150486 loops=1)
                                Sort Key: (a.assetid)::text
                                ->  Seq Scan on sq_ast a  (cost=0.00..12436.16 rows=150816 width=48) (actual time=0.011..5578.231 rows=151378 loops=1)
                          ->  Sort  (cost=2.33..2.43 rows=37 width=43) (actual time=0.364..0.428 rows=37 loops=1)
                                Sort Key: (u.assetid)::text
                                ->  Seq Scan on sq_ast_url u  (cost=0.00..1.37 rows=37 width=43) (actual time=0.023..0.161 rows=37 loops=1)
                    ->  Index Scan using sq_ast_path_ast on sq_ast_path ap  (cost=0.00..2016.98 rows=45893 width=23) (actual time=0.024..14041.571 rows=45812 loops=1)
              ->  Sort  (cost=227.52..227.58 rows=25 width=21) (actual time=131.838..132.314 rows=260 loops=1)
                    Sort Key: (l.minorid)::text
                    ->  Index Scan using sq_ast_lnk_majorid on sq_ast_lnk l  (cost=0.00..226.94 rows=25 width=21) (actual time=0.169..126.201 rows=260 loops=1)
                          Index Cond: ((majorid)::text = '2'::text)
                          Filter: (link_type = 2)
        ->  Index Scan using sq_ast_lnk_tree_linkid on sq_ast_lnk_tree lt  (cost=0.00..13.66 rows=3 width=8) (actual time=1.539..1.900 rows=1 loops=260)
              Index Cond: ("outer".linkid = lt.linkid)
Total runtime: 39930.395 ms

The EXPLAIN ANALYZE from the system performing an index scan:

Sort  (cost=16873.64..16873.74 rows=40 width=113) (actual time=2169.905..2169.912 rows=13 loops=1)
  Sort Key: l.sort_order
  ->  Nested Loop  (cost=251.39..16872.58 rows=40 width=113) (actual time=45.724..2169.780 rows=13 loops=1)
        ->  Merge Join  (cost=251.39..16506.42 rows=32 width=109) (actual time=45.561..2169.012 rows=13 loops=1)
              Merge Cond: (("outer".assetid)::text = "inner"."?column5?")
              ->  Merge Left Join  (cost=2.33..15881.92 rows=149982 width=97) (actual time=0.530..1948.718 rows=138569 loops=1)
                    Merge Cond: (("outer".assetid)::text = ("inner".assetid)::text)
                    ->  Merge Left Join  (cost=2.33..13056.04 rows=149982 width=83) (actual time=0.406..953.781 rows=138491 loops=1)
                          Merge Cond: (("outer".assetid)::text = "inner"."?column4?")
                          ->  Index Scan using sq_ast_pkey on sq_ast a  (cost=0.00..14952.78 rows=149982 width=48) (actual time=0.154..388.872 rows=138488
Re: [PERFORM] performance problems ... 100% CPU utilization
Dennis <[EMAIL PROTECTED]> writes:
> checking the status of connections at this point (ps -eaf | grep
> postgres:) where the CPU is maxed out I saw this:
> 127 idle, 12 bind, 38 parse, 34 select

Are you sure 100% CPU usage is solely contributed by PostgreSQL? Also, from the ps status you list, I can hardly see that there's a problem caused by the issue you mentioned below.

> I know there has been discussion about problems on Xeon MP systems.
> Is this what we are running into? Or is something else going on? Is
> there other information I can provide that might help determine what
> is going on?

Here is a talk about the Xeon-SMP spinlock contention problem:
http://archives.postgresql.org/pgsql-performance/2005-05/msg00441.php

Regards, Qingqing
Re: [PERFORM] performance problems ... 100% CPU utilization
Qingqing Zhou wrote:
> Are you sure 100% CPU usage is solely contributed by Postgresql?
> Also, from the ps status you list, I can hardly see that's a problem
> because of the issue you mentioned below.

The PostgreSQL processes are what is taking up all the CPU. There aren't any other major applications on the machine. It's a dedicated database server, only for this application.

It doesn't seem to make sense that PostgreSQL would be maxed out at this point. I think given the size of the box, it could do quite a bit better. So, what is going on? I don't know.

Dennis
Re: [PERFORM] performance problems ... 100% CPU utilization
What is the load average on this machine? Do you do many updates? If you do a lot of updates, perhaps you haven't vacuumed recently. We were seeing similar symptoms when we started load testing our stuff, and it turned out we were vacuuming too infrequently.

David

Dennis wrote:
> The PostgreSQL processes are what is taking up all the CPU. There
> aren't any other major applications on the machine. It's a dedicated
> database server, only for this application.
>
> It doesn't seem to make sense that PostgreSQL would be maxed out at
> this point. I think given the size of the box, it could do quite a
> bit better. So, what is going on? I don't know.

-- 
David Mitchell
Software Engineer
Telogis
[PERFORM] lots of updates on small table
Hi,

Our application requires a number of processes to select and update rows from a very small (10 rows) Postgres table on a regular and frequent basis. These processes often run for weeks at a time, but over the space of a few days we find that updates start getting painfully slow. We are running a full vacuum/analyze and reindex on the table every day, but the updates keep getting slower and slower until the processes are restarted. Restarting the processes isn't really a viable option in our 24/7 production environment, so we're trying to figure out what's causing the slow updates.

The environment is as follows:

Red Hat 9, kernel 2.4.20-8
PostgreSQL 7.3.2
ecpg 2.10.0

The processes are all compiled C programs accessing the database using ECPG.

Does anyone have any thoughts on what might be happening here?

Thanks,
Alison