Re: [PERFORM] VERY slow queries at random
On 07.06.2007, at 22:42, Greg Smith wrote:

> On Thu, 7 Jun 2007, Gunther Mayer wrote:
>
>> wal checkpoint config is on pg defaults everywhere, all relevant config
>> options are commented out. I'm no expert in wal stuff, but I don't see
>> how that could cause the problem?
>
> Checkpoints are very resource intensive and can cause other processes
> (including your selects) to hang for a considerable period of time while
> they are processing. With the default parameters, they can happen very
> frequently. Normally checkpoint_segments and checkpoint_timeout are
> increased in order to keep this from happening.
>
> This would normally be an issue only if you're writing a substantial
> amount of data to your tables. If there are a lot of writes going on,
> you might get some improvement by adjusting those parameters upward;
> the defaults are pretty low. Make sure you read
> http://www.postgresql.org/docs/8.2/static/wal-configuration.html first
> so you know what you're playing with; there are some recovery
> implications involved.

I remember us having problems with the 8.0 background writer; you might
want to try turning it off. Not sure if it behaves as badly in 8.2.
Increasing wal_buffers might be a good idea also.

Kristo
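To make the tuning advice above concrete, here is a minimal sketch for
8.2. The SHOW commands are standard; the commented postgresql.conf values
are illustrative assumptions for a write-heavy box, not tested
recommendations for this particular workload:

  -- Inspect the current checkpoint/WAL settings:
  SHOW checkpoint_segments;   -- 8.2 default is 3
  SHOW checkpoint_timeout;    -- 8.2 default is 5min
  SHOW wal_buffers;           -- 8.2 default is 8 buffers of 8kB (64kB)

  -- Illustrative postgresql.conf changes (assumed values; edit the file
  -- and reload/restart, since these cannot be set per-session):
  --   checkpoint_segments = 16     -- fewer, larger checkpoints
  --   checkpoint_timeout  = 15min  -- checkpoint less often when idle
  --   wal_buffers         = 1MB    -- more headroom between WAL flushes
  --   bgwriter_lru_maxpages = 0    -- together these two effectively
  --   bgwriter_all_maxpages = 0    -- disable the 8.2 background writer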
Re: [PERFORM] VERY slow queries at random
could be that the checkpoints are done too seldom. what is your wal
checkpoint config?

Kristo

On 07.06.2007, at 0:27, Scott Marlowe wrote:

> Gunther Mayer wrote:
>> Hi there,
>>
>> We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
>> and 200+ users. Authentication happens via UAM/hotspot, and I see a
>> lot of authorisation and accounting packets that are handled via
>> PL/pgSQL functions directly in the database.
>>
>> Everything seems to work 100%, except that a few times a day I see
>>
>>   Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
>>   duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)
>>
>> in my logs. I'm logging slow queries with log_min_duration_statement
>> = 500 in my postgresql.conf. Sometimes another query runs equally
>> slow or even slower (I've seen 139 seconds!!!) a few minutes before
>> or after as well, but then everything is back to normal.
>>
>> Even though I haven't yet indexed my data, I know that the system is
>> performant, because my largest table (the accounting one) only has
>> 5000+ rows, the entire database is only a few MB's, and I have plenty
>> of memory (2GB), shared_buffers = 100MB and max_fsm_pages = 179200.
>> Also, from briefly enabling
>>
>>   log_parser_stats = on
>>   log_planner_stats = on
>>   log_executor_stats = on
>>
>> I saw that most queries are 100% satisfied from cache, so the disk
>> doesn't even get hit. Finally, the problem seems unrelated to load,
>> because it happens at 4am just as likely as at peak traffic time.
>>
>> What the heck could cause such erratic behaviour? I suspect some type
>> of resource problem, but what, and how could I dig deeper?
>
> Maybe your hard drive is set to spin down after a certain period of
> idle, and since almost all your data is coming from memory, it might
> be that on the rare occasion when it needs to hit the drive, the drive
> is not spun up anymore.
>
> Maybe some other process is cranking up (cron jobs???) that is chewing
> up all your I/O bandwidth? Hard to say. Anything in the system logs
> that would give you a hint? Try correlating them by the time of the
> slow pgsql queries.
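For reference, one way to answer the "what is your wal checkpoint
config?" question from a live 8.2 server is to query the standard
pg_settings view rather than digging through postgresql.conf:

  -- List every checkpoint- and WAL-related setting with its live value:
  SELECT name, setting
    FROM pg_settings
   WHERE name LIKE 'checkpoint%'
      OR name LIKE 'wal%'
   ORDER BY name;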
Re: My quick and dirty solution (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))
you should first cluster the table on its primary key. The table is
probably already bloated from the 3 hr delay it had before. First:

  CLUSTER <primary key index name> ON group_fin_account_tst;

Then vacuum it every 3 minutes.

NB! Clustering takes an ACCESS EXCLUSIVE lock on the table.

Kristo

On 25.05.2007, at 15:30, Bastian Voigt wrote:

> No, this did not help. The vacuum process is still running far too
> long and makes everything slow. It is even worse than before, because
> now the system is slow almost all the time, whereas when vacuuming
> only every 3 hours it was only slow once every three hours.
>
> I now did the following. Well, no comment.
>
> Shellscript A:
>
>   while true
>   do
>       psql -U $user -d $database -c "vacuum analyze verbose binary_cache"
>       echo Going to sleep
>       sleep 60
>   done
>
> Shellscript B:
>
>   while true
>   do
>       # snapshot the process list, then count stuck updates
>       ps aux > $tempfile
>       numwaiting=`grep UPDATE.waiting $tempfile | grep -c -v grep`
>       echo Number of waiting updates: $numwaiting
>
>       vacuumpid=`grep VACUUM $tempfile | grep -v grep | awk '{print $2}'`
>       echo PID of vacuum process: $vacuumpid
>
>       # if too many updates are blocked behind the vacuum, cancel it
>       if [ $numwaiting -gt 5 ]
>       then
>           echo Too many waiting transactions, killing vacuum process $vacuumpid...
>           kill $vacuumpid
>       fi
>       echo Sleeping 30 Seconds
>       sleep 30
>   done
>
> --
> Bastian Voigt
> Neumünstersche Straße 4
> 20251 Hamburg
> telefon +49 - 40 - 67957171
> mobil   +49 - 179 - 4826359
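Tying the CLUSTER suggestion to the table the script above actually
vacuums (binary_cache), a minimal sketch; the index name
binary_cache_pkey is a guess, so check the real name with \d
binary_cache in psql first:

  -- Pre-8.3 syntax is CLUSTER indexname ON tablename. This rewrites the
  -- table in index order and discards the dead-row bloat, but holds an
  -- ACCESS EXCLUSIVE lock for the whole rewrite.
  CLUSTER binary_cache_pkey ON binary_cache;

  -- Afterwards, frequent plain vacuums keep the bloat from returning:
  VACUUM ANALYZE binary_cache;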
Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3
Lo! referrer_paths seems to have totally wrong stats. Try a full
ANALYZE on it. How many records in total do you have in referrer_paths
on the 8.2.4 server?

Might it just be a problem of the usage pattern changing from the old
system to the new one (1 row vs. 121399 rows)? It does not seem to be
just a plan problem, as the data itself seems to be quite different.

Kristo

On 25.05.2007, at 21:08, Dave Pirotte wrote:

> Greetings,
>
> We have two servers running pgsql -- an older server running 8.2.3,
> and a newer (far superior) one running 8.2.4. One of our reporting
> queries is running painfully slowly on 8.2.4, but it executes in a
> reasonable timeframe on 8.2.3. Below, I've included a contrived,
> stripped-down query which still exhibits the same unintuitively poor
> performance, as well as its explain analyze output from both servers.
> In particular, 8.2.4 opts for filters in a couple of places where we
> would expect index conds. Also, we've noticed that the 8.2.4 box (in
> other similar queries) consistently underestimates costs, whereas the
> 8.2.3 box consistently overestimates.
>
> All columns involved in this query are indexed (btrees), and there is
> a functional index on mm_date_trunc('day', created_at), where
> mm_date_trunc is simply an immutable version of date_trunc (fine for
> our purposes). The only configuration differences between the servers
> are various memory settings: work_mem and temp_buffers are 8MB / 16MB,
> and shared_buffers 128MB / 512MB, on the 8.2.3 and 8.2.4 servers,
> respectively. Stats targets are 10 on both, for consistency... but it
> is worth mentioning that performance was still abysmal under 8.2.4
> with 250 as the target.
>
> Any insight would be most appreciated, as we're a bit stumped. Thanks!
>
> Cheers,
>
> Dave Pirotte
> Director of Technology
> Media Matters for America
>
> ===
>
> select h.day,
>        h.c as total,
>        (select count(*) as c
>           from hits h2
>           join uri_qstrings uq on (h2.uri_qstring_id = uq.id)
>           join referrer_paths rp on (h2.referrer_path_id = rp.id)
>           join referrer_domains rd on (rp.referrer_domain_id = rd.id)
>          where mm_date_trunc('day', created_at) = h.day
>            and site_id = 3
>            and uq.qstring = '?f=h_top'
>            and rd.domain = 'mediamatters.org'
>        ) as h_top
>   from (
>         select mm_date_trunc('day', h.created_at) as day, count(*) as c
>           from hits h
>          where created_at > date_trunc('day', now() - interval '2 days')
>          group by mm_date_trunc('day', h.created_at)
>        ) h
>  order by h.day asc;
>
> QUERY PLAN (8.2.4)
> ----------------------------------------------------------------------
>  Sort  (cost=204012.65..204012.66 rows=3 width=16) (actual time=83012.885..83012.885 rows=3 loops=1)
>    Sort Key: day
>    ->  Subquery Scan h  (cost=149811.02..204012.62 rows=3 width=16) (actual time=28875.251..83012.868 rows=3 loops=1)
>          ->  HashAggregate  (cost=149811.02..149811.06 rows=3 width=8) (actual time=1602.787..1602.794 rows=3 loops=1)
>                ->  Bitmap Heap Scan on hits h  (cost=6485.90..148079.18 rows=346368 width=8) (actual time=48.222..1358.196 rows=391026 loops=1)
>                      Recheck Cond: (created_at > date_trunc('day'::text, (now() - '2 days'::interval)))
>                      ->  Bitmap Index Scan on hits_created_idx  (cost=0.00..6399.31 rows=346368 width=0) (actual time=47.293..47.293 rows=391027 loops=1)
>                            Index Cond: (created_at > date_trunc('day'::text, (now() - '2 days'::interval)))
>          SubPlan
>            ->  Aggregate  (cost=18067.17..18067.18 rows=1 width=0) (actual time=27136.681..27136.681 rows=1 loops=3)
>                  ->  Nested Loop  (cost=40.66..18067.16 rows=1 width=0) (actual time=1105.396..27135.496 rows=3394 loops=3)
>                        ->  Nested Loop  (cost=40.66..18063.56 rows=9 width=8) (actual time=32.132..26837.394 rows=50537 loops=3)
>                              ->  Nested Loop  (cost=40.66..5869.35 rows=47 width=8) (actual time=20.482..276.889 rows=121399 loops=3)
>                                    ->  Index Scan using referrer_domains_domains_idx on referrer_domains rd  (cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.026 rows=1 loops=3)
>                                          Index Cond: ((domain)::text = 'mediamatters.org'::text)
>                                    ->  Bitmap Heap Scan on referrer_paths rp  (cost=40.66..5834.77 rows=2105 width=16) (actual time=20.402..210.440 rows=121399 loops=3)
>                                          Recheck Cond: (rp.referrer_domain_id = rd.id)
>                                          ->  Bitmap Index Scan on referrer_paths_domains_idx
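Following up on the "totally wrong stats" diagnosis, a minimal sketch of
the usual remedies on 8.2; the column referrer_domain_id is picked from
the join in the query above, and the statistics target of 100 is an
illustrative value, not a tested recommendation:

  -- Recollect statistics for the misestimated table:
  ANALYZE VERBOSE referrer_paths;

  -- Compare the planner's idea of the table with reality
  -- (the plan estimated 2105 rows where 121399 came back):
  SELECT reltuples, relpages FROM pg_class WHERE relname = 'referrer_paths';
  SELECT count(*) FROM referrer_paths;

  -- If estimates are still off, raise the per-column statistics target
  -- and analyze again:
  ALTER TABLE referrer_paths ALTER COLUMN referrer_domain_id SET STATISTICS 100;
  ANALYZE referrer_paths;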