Re: [PERFORM] Delete performance again
Hi,

Maybe you can try this syntax. I'm not sure, but it may perform better:

  DELETE FROM company_alias USING comprm
   WHERE company_alias.company_id = comprm.id;

Cheers,
Marc
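For comparison, the sub-select form that DELETE ... USING typically replaces is sketched below (hypothetical, since the original query wasn't quoted in this thread); the USING clause lets the planner run a join instead of re-checking a subquery per row:

  -- Hypothetical slower original (not shown in the thread):
  DELETE FROM company_alias
   WHERE company_alias.company_id IN (SELECT id FROM comprm);

  -- Join-style rewrite suggested above:
  DELETE FROM company_alias USING comprm
   WHERE company_alias.company_id = comprm.id;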
Re: [PERFORM] 7.4 - basic tuning question
Simon Waters wrote:

The best advice is to "upgrade at your earliest convenience" with
performance questions and 7.4 - you're missing a *lot* of improvements.
You say you're planning to anyway, and I'd recommend putting effort into
the upgrade rather than wasting effort on tuning a system you're leaving.

> I assume that the histogram_bounds for strings are alphabetical in
> order, so that "DEMOSTART" falls between "DELETE" and "IDEMAIL". Even
> on a worst case of including both these common values, the planner
> ought to have assumed that less than 10% of records were likely
> covered by the value selected, so it seems unlikely to me that not
> using the index would be a good idea.

Well, the real question is how many blocks need to be read to find those
DEMOSTART rows. At some point around 5-10% of the table it's easier just
to read the whole table than go back and forth between index and table.
The precise point will depend on how much RAM you have, disk speeds etc.

> => SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
> (...lots of time passes...)
>  count
> -------
>   1432
> (1 row)

OK, not many. The crucial bit is below though. These are the 10 values
it will hold stats on, and all it knows is that DEMOSTART has fewer than
~57000 entries. OK, it's more complicated than that, but basically there
are values it tracks and everything else. So it assumes that all other
values have the same chance of occurring.

> => SELECT COUNT(*), event FROM log GROUP BY event ORDER BY count;
>
>  count  | event
> --------+-----------
[snip]
>   57022 | NEWUSR
>   64907 | PUBREC0
>   65449 | UNPUBLISH
>   92843 | LOGOUT
>   99018 | KILLSESS
>  128900 | UPLOAD
>  134994 | LOGIN
>  137608 | NEWPAGE
>  447556 | PUBREC1
>  489572 | PUBLISH

Which is why it guesses 20436 rows below. If you'd done "SET
enable_seqscan = off" and then run the explain again, it should have
estimated a cost for the index scan that was more than 54317.14.

> => EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
>                          QUERY PLAN
> -------------------------------------------------------------
>  Seq Scan on log  (cost=0.00..54317.14 rows=20436 width=93)
>    Filter: (event = 'DEMOSTART'::text)
> (2 rows)
>
> => ALTER TABLE log ALTER COLUMN event SET STATISTICS 50; ANALYSE log(event);
> ALTER TABLE
> ANALYZE
>
> => EXPLAIN SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
>                                    QUERY PLAN
> ---------------------------------------------------------------------------------
>  Aggregate  (cost=5101.43..5101.43 rows=1 width=0)
>    ->  Index Scan using log_event on log  (cost=0.00..5098.15 rows=1310 width=0)
>          Index Cond: (event = 'DEMOSTART'::text)
> (3 rows)

Not bad - now it knows how many rows it will find, and it sees that the
index is cheaper. It's not completely accurate - it uses statistical
sampling (and of course it's out of date as soon as you update the
table).

HTH

--
  Richard Huxton
  Archonet Ltd
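To verify that costing claim, one can temporarily discourage the seq
scan and compare the two estimates (a sketch; the actual cost numbers
will differ on your system):

  SET enable_seqscan = off;  -- make the planner avoid seq scans if possible
  EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
  SET enable_seqscan = on;   -- restore the default for the session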
[PERFORM] 7.4 - basic tuning question
Hi,

we have a log table on one server with 1.9 million records. One column
"event" (type text) in that table is a string that (currently) takes a
small number of distinct values (~43) (hmm, that could have been
normalised better).

We noted on querying for events of a specific type that the queries were
slower than expected. It simply wasn't using the index (btree, default
settings) on this column on this server (the test server, with fewer
records, was fine). Using "ALTER TABLE ... SET STATISTICS" to increase
the number of buckets to 50 resolved the issue; we went pretty much
straight there on discovering there are no "HINTS".

However we aren't quite sure why this case was pathological, and I don't
quite grok the documentation.

I assume that the histogram_bounds for strings are alphabetical in
order, so that "DEMOSTART" falls between "DELETE" and "IDEMAIL". Even on
a worst case of including both these common values, the planner ought to
have assumed that less than 10% of records were likely covered by the
value selected, so it seems unlikely to me that not using the index
would be a good idea.

What am I missing? (And yes, there is a plan to upgrade!)

=> SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
(...lots of time passes...)
 count
-------
  1432
(1 row)

=> SELECT COUNT(*), event FROM log GROUP BY event ORDER BY count;
 count  | event
--------+-----------
      6 | DNRFAIL
     14 | ADMDNR
     14 | UPGRADE
     18 | FOCRENEW
     21 | AUTOCN
     25 | ADMCC
     27 | TEMPIN
     31 | DNRCANCEL
     43 | EXPIRED
    128 | DIRECTBUY
    130 | CANCEL
    130 | CANCELQ
    154 | FOCBUY
    173 | EXPCCWARN
    179 | OFFER
    209 | DNROK
    214 | TEMPRE
    356 | CCWARN
    429 | ADMLOGIN
    719 | SUBSCRIBE
    787 | CCSUCCESS
    988 | CCFAILURE
   1217 | TEMPNEW
   1298 | PAYPAL
   1431 | DEMOSTART
   1776 | CCREQUEST
   2474 | ACCTUPD
  15169 | SYSMAINT
  42251 | IDEMAIL
  46964 | DELETE
  50764 | RELOGIN
  57022 | NEWUSR
  64907 | PUBREC0
  65449 | UNPUBLISH
  92843 | LOGOUT
  99018 | KILLSESS
 128900 | UPLOAD
 134994 | LOGIN
 137608 | NEWPAGE
 447556 | PUBREC1
 489572 | PUBLISH

=> EXPLAIN SELECT * FROM log WHERE event='DEMOSTART';
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on log  (cost=0.00..54317.14 rows=20436 width=93)
   Filter: (event = 'DEMOSTART'::text)
(2 rows)

=> ALTER TABLE log ALTER COLUMN event SET STATISTICS 50; ANALYSE log(event);
ALTER TABLE
ANALYZE

=> EXPLAIN SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Aggregate  (cost=5101.43..5101.43 rows=1 width=0)
   ->  Index Scan using log_event on log  (cost=0.00..5098.15 rows=1310 width=0)
         Index Cond: (event = 'DEMOSTART'::text)
(3 rows)

=> SELECT COUNT(*) FROM log WHERE event='DEMOSTART';
(...almost no time passes...)
 count
-------
  1432
(1 row)

BEFORE

pajax=> select * from pg_stats where tablename = 'log' and attname='event';
 schemaname | tablename | attname | null_frac | avg_width | n_distinct |                    most_common_vals                     |                     most_common_freqs                      |                                      histogram_bounds                                       | correlation
------------+-----------+---------+-----------+-----------+------------+---------------------------------------------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------+-------------
 public     | log       | event   |         0 |        10 |         25 | {PUBLISH,PUBREC1,NEWPAGE,UPLOAD,LOGIN,KILLSESS,LOGOUT}  | {0.257333,0.248333,0.072,0.0696667,0.061,0.054,0.0506667} | {ACCTUPD,DELETE,IDEMAIL,NEWUSR,NEWUSR,PUBREC0,PUBREC0,RELOGIN,SYSMAINT,UNPUBLISH,UNPUBLISH} | 0.120881
(1 row)

AFTER

pajax=> select * from pg_stats where tablename='log' and attname='event';
 schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-----------+---------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
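As a cross-check, the planner's rows=20436 estimate above can be roughly
reproduced from the BEFORE statistics (a sketch of the selectivity
arithmetic for a value not in most_common_vals; the exact planner
formula may differ slightly in 7.4):

  -- non-MCV selectivity = (1 - sum(most_common_freqs)) / (n_distinct - MCV count)
  --                     = (1 - 0.813) / (25 - 7) ~= 0.0104
  -- 0.0104 * ~1.9 million rows ~= 20k rows, matching the Seq Scan estimate.
  SELECT (1 - 0.813) / (25 - 7) AS non_mcv_selectivity;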
Re: [PERFORM] Slow Inserts on large tables
"Peter Childs" <[EMAIL PROTECTED]> writes: > 2008/10/3 Peter Eisentraut <[EMAIL PROTECTED]>: >> Then show us your checkpointing-related parameters. > I've currently got them set to > checkpoint_segments = 3 > checkpoint_timeout = 180s > checkpoint_completion_target = 0.5 > after reading that doing more smaller checkpoints might make each > checkpoint work quicker and hence less of a performance hit when they > actually happen. That concept is actually pretty obsolete in 8.3: with spread-out checkpoints it basically shouldn't hurt to increase the checkpoint interval, and could actually help because the bgwriter doesn't have such a tight deadline to finish the checkpoint. In any case you *definitely* need to increase checkpoint_segments --- the value you've got could be forcing a checkpoint every few seconds not every few minutes. What I would suggest is turning on log_checkpoints and then seeing if there's any correlation between your slow insert commands and the checkpoints. I'm suspicious that the problem is somewhere else. (For instance, have you got anything that might take a lock on the table? Maybe enabling log_lock_waits would be a good idea too.) regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] dedicated server & postgresql 8.1 conf tunning
On 2. Oct, 2008, at 10:00, <[EMAIL PROTECTED]> wrote:

> Unfortunately, I can't update pgsql to 8.3 since it's not in Debian
> stable.

Did you consider using backport packages (http://www.backports.org) for
Debian Etch? They are providing postgresql 8.3.3 packages for Debian
Etch.

Cheers.

PS: We are also running backported postgresql packages on Debian Etch on
our production servers without any problems.
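For reference, enabling backports on Etch was typically a one-line
sources change (a sketch from memory; check backports.org for the
authoritative instructions and the archive signing key):

  # /etc/apt/sources.list
  deb http://www.backports.org/debian etch-backports main contrib non-free
  # then: apt-get update && apt-get -t etch-backports install postgresql-8.3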
Re: [PERFORM] Slow Inserts on large tables
Peter,

(please take this with a pinch of salt as I am no expert)

Here is a possible scenario: each of your checkpoints takes 90 seconds
or more to complete (you told it so with checkpoint_completion_target:
180s * 0.5 = 90s). If your insert fills 3 checkpoint segments (48 MB) in
less than 90 seconds, then a new checkpoint request is issued. And maybe
a third one, and so on. I imagine that this can flood the disk cache
with write requests at some point, although I can't explain how.

Have a look at the log, see the interval between the checkpoint requests
and try to make this (a lot) larger than the checkpoint duration. Start
by increasing your checkpoint_segments (to, say, 16). If this doesn't
work, maybe the timeout is too short, or the 90-second target too
generous.

Regards,
Iulian

--- On Fri, 10/3/08, Peter Childs <[EMAIL PROTECTED]> wrote:

> From: Peter Childs <[EMAIL PROTECTED]>
> Subject: Re: [PERFORM] Slow Inserts on large tables
> To:
> Cc: "Postgresql Performance"
> Date: Friday, October 3, 2008, 9:47 AM
>
> 2008/10/3 Peter Eisentraut <[EMAIL PROTECTED]>:
> > Peter Childs wrote:
> >>
> >> I have a problem whereby an insert on a "large" table will
> >> sometimes take longer than usual.
> >>
> >> I think the problem might have something to do with checkpoints,
> >
> > Then show us your checkpointing-related parameters. Or try to set
> > them to a lot higher values so checkpoints happen more rarely and
> > see if that makes a difference.
>
> More often or less often? I've currently got them set to
>
> checkpoint_segments = 3
> checkpoint_timeout = 180s
> checkpoint_completion_target = 0.5
>
> after reading that doing more, smaller checkpoints might make each
> checkpoint work quicker and hence less of a performance hit when they
> actually happen.
>
> Regards
>
> Peter
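Spelling out the arithmetic in that scenario (a rough sketch, assuming
the default 16 MB WAL segment size):

  # checkpoint spread time  = checkpoint_timeout * checkpoint_completion_target
  #                         = 180 s * 0.5 = 90 s
  # WAL between checkpoints = checkpoint_segments * 16 MB = 3 * 16 MB = 48 MB
  # So sustained writes above ~48 MB / 90 s (~0.5 MB/s) request the next
  # checkpoint before the previous one has finished spreading its writes.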
Re: [PERFORM] Slow Inserts on large tables
2008/10/3 Peter Eisentraut <[EMAIL PROTECTED]>:
> Peter Childs wrote:
>>
>> I have a problem whereby an insert on a "large" table will sometimes
>> take longer than usual.
>>
>> I think the problem might have something to do with checkpoints,
>
> Then show us your checkpointing-related parameters. Or try to set them
> to a lot higher values so checkpoints happen more rarely and see if
> that makes a difference.

More often or less often? I've currently got them set to

checkpoint_segments = 3
checkpoint_timeout = 180s
checkpoint_completion_target = 0.5

after reading that doing more, smaller checkpoints might make each
checkpoint work quicker and hence less of a performance hit when they
actually happen.

Regards

Peter
Re: [PERFORM] Slow Inserts on large tables
Peter Childs wrote:
> I have a problem whereby an insert on a "large" table will sometimes
> take longer than usual.
>
> I think the problem might have something to do with checkpoints,

Then show us your checkpointing-related parameters. Or try to set them
to a lot higher values so checkpoints happen more rarely and see if that
makes a difference.
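One quick way to list them from psql (a sketch; the unit column of
pg_settings needs 8.2 or later, which the poster's 8.3.1 has):

  SELECT name, setting, unit
    FROM pg_settings
   WHERE name LIKE 'checkpoint%' OR name LIKE 'bgwriter%';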
[PERFORM] Slow Inserts on large tables
I have a problem whereby an insert on a "large" table will sometimes
take longer than usual. Usually the inserts are quick, but from time to
time they will take a long time, sometimes as much as 10 seconds or
longer. (They are usually under 500 ms, which is the threshold at which
I start logging them.)

The inserts are slow drip-fed, so bulk loading really is not an option;
it's logging data, used in analysis and for historical purposes mostly.

I think the problem might have something to do with checkpoints. I'm
relatively sure it's not when the table expands, as I've run a VACUUM
VERBOSE straight away after a longer insert and not found loads of space
in the FSM.

I'm using 8.3.1 (I thought I'd upgraded to 8.3.3 but it does not look
like the upgrade worked). I'm more than happy to upgrade, I just have to
find the downtime (even a few seconds can be difficult).

Any help would be appreciated.

Regards

Peter Childs
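For the record, the 500 ms logging threshold described above is the sort
of thing usually configured with (a sketch; the setting exists in 8.3
and is expressed in milliseconds):

  # postgresql.conf
  log_min_duration_statement = 500   # log statements taking longer than 500 ms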