Re: [PERFORM] VERY slow queries at random

2007-06-08 Thread Kristo Kaiv


On 07.06.2007, at 22:42, Greg Smith wrote:


On Thu, 7 Jun 2007, Gunther Mayer wrote:

wal checkpoint config is on pg defaults everywhere, all relevant config options are commented out. I'm no expert in wal stuff but I don't see how that could cause the problem?


Checkpoints are very resource intensive and can cause other processes (including your selects) to hang for a considerable period of time while they are processing.  With the default parameters, they can happen very frequently.  Normally checkpoint_segments and checkpoint_timeout are increased in order to keep this from happening.


This would normally be an issue only if you're writing a substantial amount of data to your tables.  If there are a lot of writes going on, you might get some improvement by adjusting those parameters upward; the defaults are pretty low.  Make sure you read http://www.postgresql.org/docs/8.2/static/wal-configuration.html first so you know what you're playing with; there are some recovery implications involved.
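For reference, the relevant postgresql.conf knobs on 8.2 look like this (the values are only an illustrative starting point, not a recommendation for this workload):

```
# Stretch checkpoints out so they happen less often.
checkpoint_segments = 16      # default is 3
checkpoint_timeout = 15min    # default is 5min
checkpoint_warning = 300s     # warn in the log if checkpoints come faster than this
```

checkpoint_warning is handy here: if checkpoints really are the culprit, warnings will show up in the server log near the slow queries.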


I remember us having problems with the 8.0 background writer; you might want to try turning it off. Not sure if it behaves as badly in 8.2.

Increasing wal_buffers might be a good idea also.
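A sketch of those two tweaks in postgresql.conf (on 8.2 the background writer is effectively disabled by zeroing its page limits; values are illustrative):

```
wal_buffers = 1MB             # default is 64kB; helps bursts of writes
bgwriter_lru_maxpages = 0     # effectively turns the background writer off
bgwriter_all_maxpages = 0
```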

Kristo


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Kristo Kaiv

could be that the checkpoints are done too seldom.
what is your wal checkpoint config?

Kristo
On 07.06.2007, at 0:27, Scott Marlowe wrote:


Gunther Mayer wrote:

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend and 200+ users. Authentication happens via UAM/hotspot and I see a lot of authorisation and accounting packets that are handled via PL/PGSQL functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with log_min_duration_statement = 500 in my postgresql.conf. Sometimes another query runs equally slow or even slower (I've seen 139 seconds!!!) a few minutes before or after as well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is performant because my largest table (the accounting one) only has 5000+ rows, the entire database is only a few MB's and I have plenty of memory (2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk doesn't even get hit. Finally, the problem seems unrelated to load because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some type of resource problem but what and how could I dig deeper?


Maybe your hard drive is set to spin down after a certain period of idle, and since almost all your data is coming from memory, it might be that on the rare occasion when it needs to hit the drive it's not spun up anymore.

Maybe some other process is cranking up (cron jobs???) that is chewing up all your I/O bandwidth?

Hard to say.  Anything in the system logs that would give you a hint?  Try correlating them by the time of the slow pgsql queries.
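A rough sketch of that correlation (the log path and syslog timestamp format are assumptions; adjust to your setup). For the sake of a runnable example, the system log is simulated inline:

```shell
# Take the timestamp of a slow-query log line (down to the minute) and
# look for system-log activity in the same minute.
slowline='Jun  6 10:41:31 caligula postgres[57347]: duration: 19929.291 ms'
ts=$(echo "$slowline" | cut -c1-12)   # yields "Jun  6 10:41"

# On a real system: grep "^$ts" /var/log/messages
printf '%s\n' \
  'Jun  6 10:41:02 caligula kernel: ad0: spin-up' \
  'Jun  6 11:00:00 caligula cron[123]: periodic job' |
  grep "^$ts"
```

Anything the kernel or cron logged in the same minute as the slow query is a candidate cause.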





Re: My quick and dirty solution (Re: [PERFORM] Performance Problem with Vacuum of bytea table (PG 8.0.13))

2007-05-25 Thread Kristo Kaiv

You should first cluster the table on the primary key.
The table is probably already bloated from the 3 hr delay it had before.
First:
CLUSTER <primary key index name> ON group_fin_account_tst;
Then vacuum it every 3 minutes.
NB! CLUSTER takes an ACCESS EXCLUSIVE lock on the table.
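As a concrete sketch (the index name here is hypothetical; substitute the actual primary key index of group_fin_account_tst):

```
-- One-off: rewrite the table in index order, reclaiming the bloat.
-- Takes an ACCESS EXCLUSIVE lock for the duration.
CLUSTER group_fin_account_tst_pkey ON group_fin_account_tst;

-- Afterwards, keep it trim with frequent vacuums:
VACUUM ANALYZE group_fin_account_tst;
```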

Kristo

On 25.05.2007, at 15:30, Bastian Voigt wrote:

No, this did not help. The vacuum process is still running far too long and makes everything slow. It is even worse than before, because now the system is slow almost all the time, whereas when vacuuming only every 3 hours it was only slow once every three hours.



I now did the following. Well, no comment.


Shellscript A:

while true
do
   psql -U "$user" -d "$database" -c "vacuum analyze verbose binary_cache"
   echo "Going to sleep"
   sleep 60
done


Shellscript B:

while true
do
   ps aux > $tempfile
   numwaiting=`grep "UPDATE.*waiting" $tempfile | grep -c -v grep`
   echo "Number of waiting updates: $numwaiting"

   vacuumpid=`grep VACUUM $tempfile | grep -v grep | awk '{print $2}'`
   echo "PID of vacuum process: $vacuumpid"

   if [ $numwaiting -gt 5 ]
   then
       echo "Too many waiting transactions, killing vacuum process $vacuumpid..."
       kill $vacuumpid
   fi
   echo "Sleeping 30 seconds"
   sleep 30
done

--
Bastian Voigt
Neumünstersche Straße 4
20251 Hamburg
telefon +49 - 40  - 67957171
mobil   +49 - 179 - 4826359





Re: [PERFORM] Performance problem on 8.2.4, but not 8.2.3

2007-05-25 Thread Kristo Kaiv

Lo!

referrer_paths seems to have totally wrong stats; try a full analyze on it.

How many records in total do you have in referrer_paths on the 8.2.4 server?
It might be just a usage-pattern change from the old system to the new one (1 row vs. 121399 rows)?
It does not seem to be just a plan problem, as the data itself seems to be quite different.


Kristo

On 25.05.2007, at 21:08, Dave Pirotte wrote:


Greetings,

We have two servers running pgsql -- an older server running 8.2.3, and a newer (far superior) one running 8.2.4.  One of our reporting queries is running painfully slowly on 8.2.4, but it executes in a reasonable timeframe on 8.2.3.  Below, I've included a contrived, stripped-down query which still exhibits the same unintuitively poor performance, as well as its explain analyze output from both servers.  In particular, 8.2.4 opts for filters in a couple of places where we would expect index conds.  Also, we've noticed that the 8.2.4 box (in other similar queries) consistently underestimates costs, whereas the 8.2.3 box consistently overestimates.

All columns involved in this query are indexed (btrees), and there is a functional index on mm_date_trunc('day', created_at)... where mm_date_trunc is simply an immutable version of date_trunc (fine for our purposes).  The only configuration differences between the servers are various memory settings: work_mem and temp_buffers are 8MB / 16MB, and shared_buffers 128MB / 512MB on the 8.2.3 and 8.2.4 servers, respectively.  Stats targets are 10 on both, for consistency... but it is worth mentioning that performance was still abysmal under 8.2.4 with 250 as the target.

Any insight would be most appreciated, as we're a bit stumped.  Thanks!
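(For readers wanting to reproduce the setup: the functional index described above would look roughly like the following. The function body and index name are guesses, since only the behaviour -- an immutable wrapper around date_trunc -- is described; date_trunc on timestamptz is not immutable by itself, hence the wrapper.)

```
CREATE FUNCTION mm_date_trunc(text, timestamptz) RETURNS timestamp AS
  'SELECT date_trunc($1, $2)::timestamp' LANGUAGE sql IMMUTABLE;

CREATE INDEX hits_day_idx ON hits (mm_date_trunc('day', created_at));
```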


Cheers,

Dave Pirotte
Director of Technology
Media Matters for America

===

select h.day, h.c as total,
(select count(*) as c
from hits h2
join uri_qstrings uq on (h2.uri_qstring_id = uq.id)
join referrer_paths rp on (h2.referrer_path_id = rp.id)
join referrer_domains rd on (rp.referrer_domain_id = rd.id)
where mm_date_trunc('day', created_at) = h.day
and site_id = 3
and uq.qstring = '?f=h_top'
and rd.domain = 'mediamatters.org'
) as h_top
from (
select mm_date_trunc('day', h.created_at) as day,
count(*) as c
from hits h
where created_at > date_trunc('day', now() - interval '2 days')
group by mm_date_trunc('day', h.created_at)
) h
order by h.day asc;

                                   QUERY PLAN (8.2.4)
----------------------------------------------------------------------------------------

Sort  (cost=204012.65..204012.66 rows=3 width=16) (actual time=83012.885..83012.885 rows=3 loops=1)
  Sort Key: day
  ->  Subquery Scan h  (cost=149811.02..204012.62 rows=3 width=16) (actual time=28875.251..83012.868 rows=3 loops=1)
        ->  HashAggregate  (cost=149811.02..149811.06 rows=3 width=8) (actual time=1602.787..1602.794 rows=3 loops=1)
              ->  Bitmap Heap Scan on hits h  (cost=6485.90..148079.18 rows=346368 width=8) (actual time=48.222..1358.196 rows=391026 loops=1)
                    Recheck Cond: (created_at > date_trunc('day'::text, (now() - '2 days'::interval)))
                    ->  Bitmap Index Scan on hits_created_idx  (cost=0.00..6399.31 rows=346368 width=0) (actual time=47.293..47.293 rows=391027 loops=1)
                          Index Cond: (created_at > date_trunc('day'::text, (now() - '2 days'::interval)))
        SubPlan
          ->  Aggregate  (cost=18067.17..18067.18 rows=1 width=0) (actual time=27136.681..27136.681 rows=1 loops=3)
                ->  Nested Loop  (cost=40.66..18067.16 rows=1 width=0) (actual time=1105.396..27135.496 rows=3394 loops=3)
                      ->  Nested Loop  (cost=40.66..18063.56 rows=9 width=8) (actual time=32.132..26837.394 rows=50537 loops=3)
                            ->  Nested Loop  (cost=40.66..5869.35 rows=47 width=8) (actual time=20.482..276.889 rows=121399 loops=3)
                                  ->  Index Scan using referrer_domains_domains_idx on referrer_domains rd  (cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.026 rows=1 loops=3)
                                        Index Cond: ((domain)::text = 'mediamatters.org'::text)
                                  ->  Bitmap Heap Scan on referrer_paths rp  (cost=40.66..5834.77 rows=2105 width=16) (actual time=20.402..210.440 rows=121399 loops=3)
                                        Recheck Cond: (rp.referrer_domain_id = rd.id)
                                        ->  Bitmap Index Scan on referrer_paths_domains_idx