Re: [PERFORM] Getting Slow

2007-06-07 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Also if autovacuum is eating all your I/O you may want to look into
> > throttling it back a bit by setting autovacuum_vacuum_cost_delay to a
> > non-zero value.
> 
> BTW, why is it that autovacuum_cost_delay isn't enabled by default?
> I can hardly believe that anyone will want to run it without that.
> *Especially* not with multiple workers configured by default.

Just because we haven't agreed on a value.  Default autovacuum parameters
are something we should definitely discuss for 8.3.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Getting Slow

2007-06-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Also if autovacuum is eating all your I/O you may want to look into
> throttling it back a bit by setting autovacuum_vacuum_cost_delay to a
> non-zero value.

BTW, why is it that autovacuum_cost_delay isn't enabled by default?
I can hardly believe that anyone will want to run it without that.
*Especially* not with multiple workers configured by default.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-06-07 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I was hoping that the auto plan invalidation code in CVS HEAD would get
>> it out of this problem, but it seems not to for the problem-as-given.
>> The trouble is that it won't change plans until autovacuum analyzes the
>> tables, and that won't happen until the transaction commits and sends
>> off its I-inserted-lotsa-rows report to the stats collector.

> I think there is something we can do about this -- drop the default
> value for analyze threshold.

Maybe worth doing, but it doesn't help for Steve's example.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-06-07 Thread Alvaro Herrera
Tom Lane wrote:

> I was hoping that the auto plan invalidation code in CVS HEAD would get
> it out of this problem, but it seems not to for the problem-as-given.
> The trouble is that it won't change plans until autovacuum analyzes the
> tables, and that won't happen until the transaction commits and sends
> off its I-inserted-lotsa-rows report to the stats collector.  So any
> given large transaction is stuck with the plans it first forms.  There's
> probably nothing we can do about that in time for 8.3, but it's
> something to think about for future releases ...

I think there is something we can do about this -- drop the default
value for analyze threshold.  We even discussed way back that we could
drop the concept of thresholds altogether, and nobody came up with an
argument for defending them.

> it won't change plans until autovacuum analyzes the
> tables, and that won't happen until the transaction commits and sends
> off its I-inserted-lotsa-rows report to the stats collector.  So any
> given large transaction is stuck with the plans it first forms.  There's
> probably nothing we can do about that in time for 8.3, but it's
> something to think about for future releases ...

Ah, *within* a single large transaction :-(  Yeah that's probably not
very solvable for the moment.

-- 
Alvaro Herrera   Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Ninguna manada de bestias tiene una voz tan horrible como la humana" (Orual)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-06-07 Thread Tom Lane
"Steven Flatt" <[EMAIL PROTECTED]> writes:
> One instance of our problem goes like this, and I have included a
> self-contained example with which you can reproduce the problem.

This is fairly interesting, because if you run the query by hand after
the function finishes, it's pretty fast.  What I think is happening is
that the plpgsql function caches a plan for the catalog query that is
predicated on pg_constraint and pg_inherits being small, and after
you've inserted a few thousand rows in them, that's not true anymore.

In CVS 8.2 (and HEAD), the core of the query seems to be
planned like this initially:

   ->  Hash Join  (cost=1.24..8.70 rows=1 width=76)
         Hash Cond: (c.conrelid = i.inhparent)
         ->  Seq Scan on pg_constraint c  (cost=0.00..7.35 rows=27 width=76)
               Filter: (confrelid <> 0::oid)
         ->  Hash  (cost=1.23..1.23 rows=1 width=8)
               ->  Seq Scan on pg_inherits i  (cost=0.00..1.23 rows=1 width=8)
                     Filter: (inhrelid = 42154::oid)

With a thousand or so rows inserted in each catalog, it likes
this plan better:

   ->  Nested Loop  (cost=0.00..16.55 rows=1 width=76)
         ->  Index Scan using pg_inherits_relid_seqno_index on pg_inherits i  (cost=0.00..8.27 rows=1 width=8)
               Index Cond: (inhrelid = 42154::oid)
         ->  Index Scan using pg_constraint_conrelid_index on pg_constraint c  (cost=0.00..8.27 rows=1 width=76)
               Index Cond: (c.conrelid = i.inhparent)
               Filter: (c.confrelid <> 0::oid)

and indeed that plan is a lot better as the catalogs grow.
But the plpgsql function cached the other plan at start.

I'm not entirely sure why 8.1 doesn't fall into the same trap ---
perhaps it's because it's unable to rearrange outer joins.
It's certainly not being any smarter than 8.2.

Anyway, it seems that you could either try to get some pg_constraint and
pg_inherits rows created before you start this function, or you could
change it to use an EXECUTE to force replanning of the inner query.
Or just start a new session after the first few hundred table creations.
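
A minimal sketch of the EXECUTE variant (illustrative only -- the function
name is made up and this isn't the exact catalog query from the function):
plpgsql plans a string run via EXECUTE afresh on each execution, so the plan
can follow catalog growth instead of staying frozen at the first call.

CREATE OR REPLACE FUNCTION show_parent_fks(child oid) RETURNS void AS '
DECLARE
    rec record;
BEGIN
    FOR rec IN EXECUTE
        ''SELECT c.conname, pg_get_constraintdef(c.oid) AS def ''
        || ''FROM pg_catalog.pg_constraint c ''
        || ''JOIN pg_catalog.pg_inherits i ON (c.conrelid = i.inhparent) ''
        || ''WHERE i.inhrelid = '' || child::text || '' AND c.confrelid <> 0''
    LOOP
        -- re-planned on each call, unlike a static plpgsql query
        RAISE NOTICE ''%: %'', rec.conname, rec.def;
    END LOOP;
END;
' LANGUAGE plpgsql;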

I was hoping that the auto plan invalidation code in CVS HEAD would get
it out of this problem, but it seems not to for the problem-as-given.
The trouble is that it won't change plans until autovacuum analyzes the
tables, and that won't happen until the transaction commits and sends
off its I-inserted-lotsa-rows report to the stats collector.  So any
given large transaction is stuck with the plans it first forms.  There's
probably nothing we can do about that in time for 8.3, but it's
something to think about for future releases ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Joshua D. Drake

Mark Kirkwood wrote:

Kurt Overberg wrote:
work_mem = 100MB    # when I ran the original query, this was set to 1MB,
                    # increased on Mark Kirkwood's advice, seemed to help
                    # a bit but not really




For future reference, be careful with this parameter, as *every* 
connection will use this much memory for each sort or hash (i.e it's not 
shared and can be allocated several times by each connection!)...yeah, I 
know I suggested increasing it to see what effect it would have :-).


This is however a parameter that can be set on the fly for the specific 
query.
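
For example (illustrative value only):

SET work_mem = '100MB';           -- per-session override
-- or, scoped to a single transaction and reset automatically at commit/abort:
BEGIN;
SET LOCAL work_mem = '100MB';
-- ... run the memory-hungry query here ...
COMMIT;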


Joshua D. Drake




And I'd agree with Steinar and others, looks like caching effects are
the cause of the timing difference between production and the mac!


Cheers

Mark








--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Mark Kirkwood

Kurt Overberg wrote:
work_mem = 100MB    # when I ran the original query, this was set to 1MB,
                    # increased on Mark Kirkwood's advice, seemed to help
                    # a bit but not really




For future reference, be careful with this parameter, as *every* 
connection will use this much memory for each sort or hash (i.e it's not 
shared and can be allocated several times by each connection!)...yeah, I 
know I suggested increasing it to see what effect it would have :-).
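
(To put rough numbers on that warning -- illustrative, not measured from this
thread: 50 backends each doing one 100MB sort could together ask for
50 x 100MB = 5GB, and a single complex query may allocate its work_mem
allowance several times over.)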


And I'd agree with Steinar and others, looks like caching effects are
the cause of the timing difference between production and the mac!


Cheers

Mark




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] Best way to delete unreferenced rows?

2007-06-07 Thread Craig James

Tyrrill, Ed wrote:


I have a table, let's call it A, whose primary key, a_id, is referenced
in a second table, let's call it B.  For each unique A.a_id there are
generally many rows in B with the same a_id.  My problem is that I want
to delete a row in A when the last row in B that references it is
deleted.  Right now I just query for rows in A that aren't referenced by
B, and that worked great when the tables were small, but it takes over
an hour now that the tables have grown larger (over 200 million rows in
B and 14 million in A).  The delete has to do a sequential scan of both
tables since I'm looking for what's not in the indexes.

I was going to try creating a trigger after delete on B for each row to
check for more rows in B with the same a_id, and delete the row in A if
none found.  In general I will be deleting 10's of millions of rows from
B and 100's of thousands of rows from A on a daily basis.  What do you
think?  Does anyone have any other suggestions on different ways to
approach this?


Essentially what you're doing is taking the one-hour job and spreading it out
in little chunks over thousands of queries.  If you have 10^7 rows in B and 10^5
rows in A, then on average you have 100 references from B to A.  That means 
that 99% of the time, your trigger will scan B and find that there's nothing to 
do.  This could add a lot of overhead to your ordinary transactions, costing a 
lot more in the long run than just doing the once-a-day big cleanout.

You didn't send the specifics of the query you're using, along with an EXPLAIN
ANALYZE of it in operation.  It may also be that your SQL is not optimal, and
that somebody could suggest a more efficient query.
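
For what it's worth, one common formulation of the once-a-day cleanout -- a
sketch only, assuming the join column is named a_id in both tables:

DELETE FROM a
 WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = a.a_id);

Whether that beats a NOT IN (SELECT a_id FROM b) variant depends on your
version and work_mem, so compare EXPLAIN output for both.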

It's also possible that it's not the sequential scans that are the problem, but 
rather that it just takes a long time to delete 100,000 rows from table A 
because you have a lot of indexes. Or it could be a combination of performance 
problems.

You haven't given us enough information to really analyze your problem.  Send 
more details!

Craig

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Best way to delete unreferenced rows?

2007-06-07 Thread Tyrrill, Ed
Hey All,

I have a table, let's call it A, whose primary key, a_id, is referenced
in a second table, let's call it B.  For each unique A.a_id there are
generally many rows in B with the same a_id.  My problem is that I want
to delete a row in A when the last row in B that references it is
deleted.  Right now I just query for rows in A that aren't referenced by
B, and that worked great when the tables were small, but it takes over
an hour now that the tables have grown larger (over 200 million rows in
B and 14 million in A).  The delete has to do a sequential scan of both
tables since I'm looking for what's not in the indexes.

I was going to try creating a trigger after delete on B for each row to
check for more rows in B with the same a_id, and delete the row in A if
none found.  In general I will be deleting 10's of millions of rows from
B and 100's of thousands of rows from A on a daily basis.  What do you
think?  Does anyone have any other suggestions on different ways to
approach this?
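
Here, roughly, is the trigger I have in mind -- a sketch with hypothetical
names, and it assumes an index on b(a_id) so the existence check is cheap:

CREATE FUNCTION drop_orphaned_a() RETURNS trigger AS '
BEGIN
    -- the deleted row is already gone from b, so any remaining
    -- reference to the same a_id blocks the delete from a
    DELETE FROM a
     WHERE a.a_id = OLD.a_id
       AND NOT EXISTS (SELECT 1 FROM b WHERE b.a_id = OLD.a_id);
    RETURN OLD;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER b_delete_cleanup AFTER DELETE ON b
    FOR EACH ROW EXECUTE PROCEDURE drop_orphaned_a();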

Thanks,
Ed

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Greg Smith

On Thu, 7 Jun 2007, Gunther Mayer wrote:

> wal checkpoint config is on pg defaults everywhere, all relevant config
> options are commented out. I'm no expert in wal stuff but I don't see how
> that could cause the problem?


Checkpoints are very resource intensive and can cause other processes 
(including your selects) to hang for a considerable period of time while 
they are processing.  With the default parameters, they can happen very 
frequently.  Normally checkpoint_segments and checkpoint_timeout are 
increased in order to keep this from happening.


This would normally be an issue only if you're writing a substantial 
amount of data to your tables.  If there are a lot of writes going on, you 
might get some improvement by adjusting those parameters upward; the 
defaults are pretty low.  Make sure you read 
http://www.postgresql.org/docs/8.2/static/wal-configuration.html first so 
you know what you're playing with, there are some recovery implications 
involved.
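
As a rough sketch only -- starting points to experiment with, not tuned
values for your workload:

checkpoint_segments = 16      # default 3; more WAL between checkpoints
checkpoint_timeout = 15min    # default 5min; 8.2 accepts unit suffixes
checkpoint_warning = 300s     # log a complaint if they come faster than this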


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] Getting Slow

2007-06-07 Thread Greg Smith

On Thu, 7 Jun 2007, Joe Lester wrote:


> Memory: 4GB RAM
>
> shared_buffers = 10000
> work_mem = 2048
> effective_cache_size = 30000


With these parameters, your server has 80MB dedicated to its internal 
caching, is making query decisions assuming the operating system only has 
240MB of memory available for its caching, and is only allowing individual 
clients to have a tiny amount of memory to work with before they have to 
swap things to disk.  You're not giving it anywhere close to enough memory 
to effectively work with a 5GB database, and your later reports show 
you're barely using 1/2 the RAM in this system usefully.


Multiply all these parameters by 10X, restart your server, and then you'll 
be in the right ballpark for a system with 4GB of RAM.  There might be 
some other tuning work left after that, but these values are so far off 
that until you fix them it's hard to say what else needs to be done.  See 
http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm for more 
information on this topic.
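
In concrete terms, the 10x bump would look something like this (8.1 takes
these in 8KB pages and KB respectively; ballpark values, not a tuned config):

shared_buffers = 100000           # ~800MB
work_mem = 20480                  # 20MB per sort/hash
effective_cache_size = 300000     # ~2.4GB, most of the otherwise-idle RAM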


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] Getting Slow

2007-06-07 Thread Alvaro Herrera
Joe Lester wrote:

> max_fsm_pages = 150000

This may be a bit too low -- it's just a little more than 1 GB, which
means it might fail to keep track of all your tables (or it may not, if
you don't have many updates).

> autovacuum_naptime = 60
> autovacuum_vacuum_threshold = 150
> autovacuum_vacuum_scale_factor = 0.0001
> autovacuum_analyze_scale_factor = 0.0001

The scale factors seem awfully low.  How about 0.01 instead and see if
you avoid vacuuming all your tables with every iteration ... have you
noticed how much work autovacuum is really doing?  It may be too much.

Also if autovacuum is eating all your I/O you may want to look into
throttling it back a bit by setting autovacuum_vacuum_cost_delay to a
non-zero value.
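
Something like this in postgresql.conf, purely as an illustration (common
starting points, not an agreed-upon default):

autovacuum_vacuum_cost_delay = 20    # ms to sleep each time the cost limit is hit
autovacuum_vacuum_cost_limit = 200   # I/O cost accumulated before each sleep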

-- 
Alvaro Herrera                       http://www.advogato.org/person/alvherre
"Sadness is a wall between two gardens" (Khalil Gibran)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Getting Slow

2007-06-07 Thread Steinar H. Gunderson
On Thu, Jun 07, 2007 at 01:48:43PM -0400, Joe Lester wrote:
> - The server log shows frequent "archived transaction log file"  
> entries. Usually once every 10 minutes or so, but sometimes 2 or 3  
> per minute.

Sounds like you've got a lot of writes going. You might want more power in
your I/O?

> Operating System: Mac OS 10.4.7 Client

Is there a particular reason for this? It's not known to be the best server
OS around -- it's hard to say that an OS change would do anything for your
problem, but it looks like an unusual choice.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Getting Slow

2007-06-07 Thread Andrew Sullivan
On Thu, Jun 07, 2007 at 01:48:43PM -0400, Joe Lester wrote:
> of a table). Running the same query 4 times in a row would yield  
> dramatically different results... 1.001 seconds, 5 seconds, 22  
> seconds, 0.01 seconds, to complete.

> - When queries are especially slow, the server shows a big spike in  
> read/write activity.

My bet is that you're maxing your disk subsystem somehow.  The
problem with being I/O bound is that it doesn't matter how great you
do on average: if you have too much I/O traffic, it looks like you're
stopped.  Softraid can be expensive -- first thing I'd look at is to
see whether you are in fact hitting 100% of your I/O capacity and, if
so, what your options are for getting more room there.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


[PERFORM] Getting Slow

2007-06-07 Thread Joe Lester
About six months ago, our normally fast postgres server started  
having performance issues. Queries that should have been instant were  
taking up to 20 seconds to complete (like selects on the primary key  
of a table). Running the same query 4 times in a row would yield  
dramatically different results... 1.001 seconds, 5 seconds, 22  
seconds, 0.01 seconds, to complete.


At the time we upgraded the hardware and the performance problems  
went away. But I did not feel like we had solved the underlying problem.


Now, six months later, the same thing is happening... and I'm kind of  
glad because now, I'd like to find out what the real issue is. I'm  
just starting to diagnose it so I don't know a lot yet, but what I do  
know, I'll share with you here in the hopes of starting off on the  
right track.


I've already described the main symptom. Here are some other random  
observations:
- The server log shows frequent "archived transaction log file"  
entries. Usually once every 10 minutes or so, but sometimes 2 or 3  
per minute.
- The server box seems otherwise to be responsive. CPU sits at about  
90% idle.
- When queries are especially slow, the server shows a big spike in  
read/write activity.
- This morning I did a VACUUM ANALYZE. It seemed to help for 30  
minutes or so, but then it was back to being slowish. I'd hate to  
schedule these because it feels more like a band-aid. For a long time  
we've been doing just fine with autovacuum, so why start scheduling  
vacuums now?


Here's info about our configuration. Any advise/pointers would be  
much appreciated. Thanks!


Computer: Mac Pro Dual Core Intel
Operating System: Mac OS 10.4.7 Client
Memory: 4GB RAM
Data Drives: 3 drives in a software RAID (internal)
Log/Backup Drive: 1 (the startup disk, internal)

Postgres Version: 8.1.4
Data Size: 5.1 GB
# of Tables: 60
Size of Tables: Most are under 100,000 records. A few are in the  
millions. Largest is 7058497.

Average Number of Simultaneous Client Connections: 250

max_connections = 500
shared_buffers = 10000
work_mem = 2048
max_stack_depth = 6000
effective_cache_size = 30000
fsync = on
wal_sync_method = fsync
archive_command = 'cp -i %p /Users/postgres/officelink/wal_archive/%f'

max_fsm_pages = 150000
stats_start_collector = on
stats_row_level = on
log_min_duration_statement = 2000
log_line_prefix = '%t %h '
superuser_reserved_connections = 3
autovacuum = on
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 150
autovacuum_vacuum_scale_factor = 0.0001
autovacuum_analyze_scale_factor = 0.0001

sudo pico /etc/rc
sysctl -w kern.sysv.shmmax=4294967296
sysctl -w kern.sysv.shmall=1048576

sudo pico /etc/sysctl.conf
kern.maxproc=2048
kern.maxprocperuid=800
kern.maxfiles=4
kern.maxfilesperproc=3

Processes:  470 total, 2 running, 4 stuck, 464 sleeping... 587 threads   13:34:50
Load Avg:  0.45, 0.34, 0.33    CPU usage:  5.1% user, 5.1% sys, 89.7% idle
SharedLibs: num =  157, resident = 26.9M code, 3.29M data, 5.44M LinkEdit

MemRegions: num = 15307, resident =  555M + 25.5M private,  282M shared
PhysMem:   938M wired,  934M active, 2.13G inactive, 3.96G used, 43.1M free

VM:  116G + 90.1M   1213436(0) pageins, 263418(0) pageouts

  PID COMMAND  %CPU   TIME    #TH #PRTS #MREGS RPRVT  RSHRD  RSIZE  VSIZE
29804 postgres 0.0%  0:03.24   1    9    27    1.27M  245M   175M   276M
29720 postgres 0.0%  0:01.89   1    9    27    1.25M  245M   125M   276M
29714 postgres 0.0%  0:03.70   1   10    27    1.30M  245M   215M   276M
29711 postgres 0.0%  0:01.38   1   10    27    1.21M  245M   107M   276M
29707 postgres 0.0%  0:01.27   1    9    27    1.16M  245M   78.2M  276M
29578 postgres 0.0%  0:01.33   1    9    27    1.16M  245M   67.8M  276M
29556 postgres 0.0%  0:00.39   1    9    27    1.09M  245M   91.8M  276M
29494 postgres 0.0%  0:00.19   1    9    27    1.05M  245M   26.5M  276M
29464 postgres 0.0%  0:01.98   1    9    27    1.16M  245M   88.8M  276M
29425 postgres 0.0%  0:01.61   1    9    27    1.17M  245M   112M   276M
29406 postgres 0.0%  0:01.42   1    9    27    1.15M  245M   118M   276M
29405 postgres 0.0%  0:00.13   1    9    26    924K   245M   17.9M  276M
29401 postgres 0.0%  0:00.98   1   10    27    1.13M  245M   84.4M  276M
29400 postgres 0.0%  0:00.90   1   10    27    1.14M  245M   78.4M  276M
29394 postgres 0.0%  0:01.56   1   10    27    1.17M  245M   111M   276M

Re: [PERFORM] performance drop on 8.2.4, reverting to 8.1.4

2007-06-07 Thread Steven Flatt

On 6/5/07, Tom Lane <[EMAIL PROTECTED]> wrote:


If you're feeling desperate you could revert this patch in your local
copy:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php

   regards, tom lane



Reverting that patch has not appeared to solve our problem.  Perhaps I
didn't provide enough information, because I feel like there's more going
on here.

One instance of our problem goes like this, and I have included a
self-contained example with which you can reproduce the problem.  We make
heavy use of partitioned tables, so during our schema install, we create a
lot of inherited tables (on the order of 2000) to which we also want to add
the FK constraints that exist on the parent table.  The PLpgSQL function
below does this.  It queries for all FK constraints that are on the parent
table but not on the child, then generates the sql to add them to
the child.  (The function has been modified from the original but the main
query is the same.)

Note the "this is slow" section and the "replace with this which is fast"
section.  Both queries are fast on 8.1.4 (entire function completes in 2
minutes), but not on 8.2.4.  If you notice the "ELAPSED TIME"s written to
the console, the query times start equally fast but grow painfully slow
rather quickly with the "slow" version on 8.2.4.

Sorry for not providing explain analyze output, but I found it hard to tie
the output into the execution of the function.  When I did stand-alone
explain analyzes, the actual times reported were similar on 8.1.4 and 8.2.4.
I think the degradation has more to do with doing many such queries in a
single transaction or something like that.

Plus, correct me if I'm wrong, but the degrading query is executed against
pg_catalog tables only, which are in general smallish, so I have a hard time
believing that even a sub-optimal query plan results in this level of
degradation.

Any help is much appreciated, thanks.
Steve


CREATE OR REPLACE FUNCTION inherit_fks_test()
   RETURNS interval
   VOLATILE
   LANGUAGE PLpgSQL
   AS '
   DECLARE
   childtbl varchar;
   childoid oid;
   rec      record;
   start    timestamptz;
   finish   timestamptz;
   time1    timestamptz;
   time2    timestamptz;
   elapsed  interval;
   BEGIN
   start := timeofday();

   EXECUTE ''SET LOCAL log_min_messages TO NOTICE'';
   EXECUTE ''CREATE TABLE foo(a INT UNIQUE)'';
   EXECUTE ''CREATE TABLE bar(b INT REFERENCES foo(a))'';

   FOR count IN 1 .. 2000
   LOOP
   childtbl := ''bar_'' || count;
   EXECUTE ''CREATE TABLE '' || childtbl || ''() INHERITS
(bar)'';

   childoid := childtbl::regclass::oid;

   time1 := timeofday();
   FOR rec IN
   SELECT ''ALTER TABLE ''
  || quote_ident(n.nspname) || ''.''
  || quote_ident(cl.relname)
  || '' ADD CONSTRAINT ''
  || quote_ident(parent_const.conname) || '' ''
  || parent_const.def AS cmd
   FROM pg_catalog.pg_class cl
   JOIN pg_catalog.pg_namespace n
   ON (n.oid = cl.relnamespace)
   JOIN pg_catalog.pg_inherits i
   ON (i.inhrelid = cl.oid)
   JOIN (
   SELECT c.conname,
  c.conrelid,
  c.confrelid,
  pg_get_constraintdef(c.oid) AS def
   FROM pg_catalog.pg_constraint c
   WHERE c.confrelid <> 0
   ) AS parent_const
   ON (parent_const.conrelid = i.inhparent)

-- This is slow
---
   LEFT OUTER JOIN (
   SELECT c2.conname,
  c2.conrelid,
  c2.confrelid,
  pg_get_constraintdef(c2.oid) AS def
   FROM pg_catalog.pg_constraint c2
   WHERE c2.confrelid <> 0
   ) AS child_const
   ON (child_const.conrelid = cl.oid
   AND child_const.conname =
   parent_const.conname
   AND child_const.confrelid =
   parent_const.confrelid
   AND child_const.def = parent_const.def)
   WHERE child_const.conname IS NULL
---

-- Replace with this which is fast
--

[PERFORM] How Are The Variables Related?

2007-06-07 Thread Y Sidhu

On a FreeBSD system, is the page size for the shared_buffers calculation 8K? And
is the page size for the shmall calculation 4K? The documentation hints at these
values. Anyone know?
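
If those page sizes are right (treat this as a sketch, not an authoritative
answer), the arithmetic would be: shared_buffers = 10000 means 10000 x 8KB =
80MB of shared memory, which costs 80MB / 4KB = 20480 pages of the
kern.ipc.shmall budget.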

--
Yudhvir Singh Sidhu
408 375 3134 cell


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Alvaro Herrera
Gunther Mayer wrote:

> On another note, autovacuum couldn't cause such issues, could it? I do 
> have autovacuum enabled (autovacuum=on as well as 
> stats_start_collector=on, stats_block_level = on and stats_row_level = 
> on), is there any possibility that autovacuum is not as resource 
> friendly as advertised?

Hmm.  I am not sure where you read that, but I don't think it has
ever been stated that autovacuum is resource friendly in the default
configuration (I, for one, have never tried, intended or wanted to state
that).  I suggest tuning the autovacuum_vacuum_cost_delay parameters if
you want it to interfere less with your regular operation.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Andrew Sullivan
On Thu, Jun 07, 2007 at 04:22:47PM +0200, Gunther Mayer wrote:
> There are a whole bunch of update queries that fire all the time but 
> afaik none of them ever lock the entire table. To the best of my 
> knowledge UPDATE ... WHERE ... only locks those rows that it actually 
> operates on, in my case this is always a single row.

Well that shouldn't be biting you, then (you're not in SERIALIZABLE
mode, right?).  The other obvious bit would be checkpoint storms. 
What's your bgwriter config like?

> Question is, how do I find out about locks at the time when I only get 
> told about the slow query *after* it has completed and postgres has told 
> me so by logging a slow query entry in my logs?

You can't :(
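
(What you *can* do is poll for lock waits while the problem is live -- a
sketch, using the 8.2-era column names:

SELECT l.pid, l.relation::regclass, l.mode, a.current_query
  FROM pg_locks l
  JOIN pg_stat_activity a ON a.procpid = l.pid
 WHERE NOT l.granted;

run from cron or a watch loop, that at least catches blockers in the act.)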

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Gunther Mayer

Kristo Kaiv wrote:

> could be that the checkpoints are done too seldom.
> what is your wal checkpoint config?

wal checkpoint config is on pg defaults everywhere, all relevant config 
options are commented out. I'm no expert in wal stuff but I don't see 
how that could cause the problem?


Gunther

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Gunther Mayer

Scott Marlowe wrote:

Gunther Mayer wrote:

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
and 200+ users. Authentication happens via UAM/hotspot and I see a lot
of authorisation and accounting packets that are handled via PL/PGSQL
functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with log_min_duration_statement =
500 in my postgresql.conf. Sometimes another query runs equally slow or
even slower (I've seen 139 seconds!!!) a few minutes before or after as
well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is
performant because my largest table (the accounting one) only has 5000+
rows, the entire database is only a few MB's and I have plenty of memory
(2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from
briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk
doesn't even get hit. Finally, the problem seems unrelated to load
because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some type of
resource problem but what and how could I dig deeper? 


> Maybe your hard drive is set to spin down after a certain period of
> idle, and since most all your data is coming from memory, then it
> might be that on the rare occasion when it needs to hit the drive it's
> not spun up anymore.

I doubt that as a serious amount of logging is taking place on the box
all the time which goes straight to disk. Also, no disk in the world
would take more than a minute to spin up...

> Maybe some other process is cranking up (cron jobs???) that are
> chewing up all your I/O bandwidth?

Hmm, I investigated that too but if that was the case the queries would
run slow always at the same time of the day.

> Hard to say.  Anything in the system logs that would give you a hint?
> Try correlating them by the time of the slow pgsql queries.
Nothing relevant in the system logs at the time of the slow query 
appearing. I have in the meantime tweaked syslog-ng.conf such that as
soon as it detects a "duration: ... ms" slow-query log message it
spawns top and top -m io and redirects the output to file. At least in 
that way I can check what's keeping the system busy immediately *after* 
a slow query has occured. Of course now Murphy's law has it that since 
I've done that (30 hours ago) not a single slow query has fired, but 
hey, I'll look at the results once I have them.


On another note, autovacuum couldn't cause such issues, could it? I do 
have autovacuum enabled (autovacuum=on as well as 
stats_start_collector=on, stats_block_level = on and stats_row_level = 
on), is there any possibility that autovacuum is not as resource 
friendly as advertised?


Gunther

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Gunther Mayer

Andrew Sullivan wrote:

> On Wed, Jun 06, 2007 at 09:20:54PM +0200, Gunther Mayer wrote:
>> What the heck could cause such erratic behaviour? I suspect some type of
>> resource problem but what and how could I dig deeper?
>
> Is something (perhaps implicitly) locking the table?  That will cause
> this.
There are a whole bunch of update queries that fire all the time but 
afaik none of them ever lock the entire table. To the best of my 
knowledge UPDATE ... WHERE ... only locks those rows that it actually 
operates on, in my case this is always a single row. No explicit locking 
is done anywhere, but perhaps you're right and it is a locking issue. 
Question is, how do I find out about locks at the time when I only get 
told about the slow query *after* it has completed and postgres has told 
me so by logging a slow query entry in my logs?


Gunther

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Dimitri Fontaine
Le jeudi 07 juin 2007, Kurt Overberg a écrit :
> Is there a primer somewhere on how to read EXPLAIN output?

Those Robert Treat slides are great reading:
  http://www.postgresql.org/communityfiles/13.sxi

Regards,
-- 
dim

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Tom Lane
Kurt Overberg <[EMAIL PROTECTED]> writes:
> ... Turning off bitmap scans definitely seems  
> to help things,

I really seriously doubt that.  On queries like this, where each inner
scan is fetching a couple hundred rows, the small extra overhead of a
bitmap scan should easily pay for itself.  I think you're looking
entirely at caching effects that allow a re-read of the same data to
go faster.

You might try running the same query plan several times in a row and
noting the lowest time, then repeat for the other query plan.  This will
get you comparable fully-cached times, which I bet will be very close
to the same.
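
In psql that's just something like (a sketch, reusing the query from earlier
in the thread):

\timing
SET enable_bitmapscan = off;
SELECT id FROM taskinstance WHERE taskid IN (SELECT id FROM task WHERE campaignid = 76);
-- repeat several times, keep the lowest time; then:
SET enable_bitmapscan = on;
SELECT id FROM taskinstance WHERE taskid IN (SELECT id FROM task WHERE campaignid = 76);
-- repeat and compare the minimums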

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Steinar H. Gunderson
On Thu, Jun 07, 2007 at 07:18:22AM -0400, Kurt Overberg wrote:
> - My production environment is running RedHat 2.6.9.ELsmp on a server  
> with 16GB of memory

Seriously, this (the RAM amount) _is_ all the difference. (You don't say how
much RAM is in your Mac, but something tells me it's not 16GB.) If you install
8.2.4 on your server, there's no reason why the query you pasted shouldn't be
at least as fast as on 8.0.

> Is there a primer somewhere on how to read EXPLAIN output?

Yes, the documentation contains one.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Kurt Overberg
Thank you everyone for the replies.  I'll try to answer everyone's  
questions in one post.


* Regarding production/mac memory and cache usage.  This query HAS
been running on 8.0 on my Mac; I just got that particular query
explain from our production system because I had to nuke my local 8.0
database before installing 8.2.4 due to disk space limitations.  The
query that this sample query is part of ran in under 5 seconds when I
was running 8.0 locally on my mac, and it did a bunch of aggregations
based on task instance.


* work_mem is set to 1 megabyte (the default) on both 8.0 and 8.2.4.

* setting enable_bitmapscan = false on 8.2.4

0605=# explain analyze select id from taskinstance where taskid in
(select id from task where campaignid = 76);

                                  QUERY PLAN
------------------------------------------------------------------------------
 Nested Loop  (cost=16.94..15484.61 rows=2309 width=4) (actual time=44.751..8498.689 rows=1117 loops=1)
   ->  HashAggregate  (cost=16.94..17.01 rows=7 width=4) (actual time=0.144..0.194 rows=10 loops=1)
         ->  Index Scan using "Task_campaignId_key" on task  (cost=0.00..16.93 rows=7 width=4) (actual time=0.069..0.116 rows=10 loops=1)
               Index Cond: (campaignid = 51)
   ->  Index Scan using taskid_taskinstance_key on taskinstance  (cost=0.00..2202.73 rows=554 width=8) (actual time=20.305..849.640 rows=112 loops=10)
         Index Cond: (taskinstance.taskid = task.id)
 Total runtime: 8499.599 ms

...FWIW, this query returns about 900 rows.  TaskInstance is a fairly  
large table in width (20 columns, about 15 are varchar, 3 timestamps  
and a few ints)
and height (650,000) rows.  I can't really run the same query  
multiple times due to caching, so I change up "campaignid".  Is there  
a way to flush that cache?  Turning off bitmap scans definitely seems  
to help things, but I'm concerned that when/if I flip my production  
machine, I'm going to run into who-knows-what.  I don't really have a  
set of SQL acceptance tests to test jumping from rev to rev (I know I  
should- BAD DEVELOPER, BAD!).


* Configuration

- My production environment is running RedHat 2.6.9.ELsmp on a server  
with 16GB of memory


-  My old 8.0 database on my mac only had this modified from default:

shared_buffers =  100
work_mem = 1024

- 8.2.4 database seemed to go through some sort of auto-config when I  
installed it, settings I think are different are as follows:


shared_buffers = 128MB    # min 128kB or max_connections*16kB
work_mem = 100MB          # when I ran the original query, this was set to
                          # 1MB, increased on Mark Kirkwood's advice, seemed
                          # to help a bit but not really


8.2.4 Database size- 25 GB (from du -sh on the directory 'base')

* Richard Huxton

Thanks for the kind words- I'm glad I was able to 'ask a good  
question'.  I'm very new to this mailing list, but I'm on many Java/ 
Struts/Perl mailing lists and have seen enough poorly worded/spelled/ 
asked questions to last a lifetime.  My situation is: I'm the senior  
(read: first) developer at a small but growing startup.  Everything I  
know about PostgreSQL I've learned over the past 4 years in which our
tiny little DB grew from one database with 100 users to a 4-node
Slony setup with over 300,000 users.  Somehow, I'm not sure why, but I find
myself in the awkward position of being the 'go-to guy' for all  
database related stuff at my company.  What I don't know could fill  
volumes, but I've been able to keep the durn database running for  
over 4 years (which is mostly a testament to how awesome PostgreSQL  
is)- so when I hit something that makes no sense, I KNOW that if I  
have any hope of getting one of ye postgresql gods to help me with an  
obscure, non-sensical problem such as this one, I'd better include as  
much context as possible.  :-)   FWIW- we're looking to hire a  
PostgreSQL hired gun to help me with this and many other things.   
Ideally, that person would be in Boston, MA, USA and be able to come  
into the office, but we'd consider remote people too.  If you're  
interested, drop me a line.


Thanks again for the replies, gang.  Have there been many reported  
performance related problems regarding people upgrading from 8.0->8.2?


Is there a primer somewhere on how to read EXPLAIN output?

Thanks again for helping me with this...

/kurt



On Jun 7, 2007, at 5:23 AM, Richard Huxton wrote:


Mark Kirkwood wrote:
8.2 is deciding to use a bitmap index scan on  
taskid_taskinstance_key, which seems to be slower (!) than a plain  
old index scan that 8.0 is using. A dirty work around is to  
disable bitmap scans via:


I'm having difficulty figuring out why it's doing this at all.  
There's only one index involved, and it's over the primary-key to boot!

Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Mark Kirkwood

Steinar H. Gunderson wrote:

On Thu, Jun 07, 2007 at 11:35:27AM +0200, Steinar H. Gunderson wrote:

If that table
doesn't fit in the cache on your Mac, you pretty much lose. From the EXPLAIN
output, it looks like it fits very nicely in cache on your server. Thus, I
don't think the difference is between 8.0 and 8.2, but rather your production
server and your test machine.



That's a good point, however it's not immediately obvious that the
production server is *not* running MacOSX Tiger (or has any more
memory)... Kurt, can you post the relevant specs for the 8.0 and 8.2
boxes?


Cheers

Mark


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Steinar H. Gunderson
On Thu, Jun 07, 2007 at 11:35:27AM +0200, Steinar H. Gunderson wrote:
> How big did you say these tables were?

Sorry, you already said that -- 650k rows for one of them. If that table
doesn't fit in the cache on your Mac, you pretty much lose. From the EXPLAIN
output, it looks like it fits very nicely in cache on your server. Thus, I
don't think the difference is between 8.0 and 8.2, but rather your production
server and your test machine.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] copy from performance on large tables with indexes

2007-06-07 Thread Marc Mamin

Hello,


Postgres: 8.2
os: Linux 4CPU, 4 GB RAM, Raid 1, 32 bit system
work_mem: 600 Mb


I have some tables which may become quite large (currently up to 6 Gb).
I initially fill them using copy from (files).

The import is fast enough as I only have a primary key on the table:
about 18 minutes
(over 300 Mb/minute)

Then I need 5 additional indexes on it. Creation time: 30 minutes


subsequently I compute some aggregations which need 4 hours and 30
minutes additional time


And now the problem:

If I get additional data for the table, the import becomes much slower
due to the indexes (about 30 times slower!):

The performance degradation is probably due to the fact that all
indexes are too large to be kept in memory.
Moreover I guess that the indexes' fill factors are too high (90%)

During this second import, I have about 20% iowait time.



The usual solution is to drop the indexes before the second import and
rebuild them afterwards, but I feel unconfident doing this as I don't
know how the system will react if some SELECT statements occur when
the indexes are missing. I can hardly avoid this.


So my idea for the second import process:


1) make a copy of the table:

   create table B as select * from A;
   alter table B add constraint B_pk primary key (id);


2) import the new data in table B

   copy B from file;

3) create the required indexes on B

   create index Bix_1 on B..
   create index Bix_2 on B..
   create index Bix_3 on B..
   create index Bix_4 on B..
   create index Bix_5 on B..

4) replace table A with table B

   alter table A rename to A_trash;
   alter table B rename to A;
   drop table A_trash;

 (and rename the indexes to get the  original state)
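
For step 4, a sketch of doing the swap atomically (my guess, not tested
advice): wrapping both renames in one transaction makes concurrent sessions
block briefly on the lock instead of catching the table mid-rename, though
plans already cached against the old table can still misbehave, as noted
below.

   begin;
   alter table A rename to A_trash;
   alter table B rename to A;
   commit;
   drop table A_trash;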
 
 
 
 
 
 This seems to work but with side effects:
 
 The only objects that refer to the tables are functions and indexes.
 
If a function is called within the same session before and after the table
renaming, the second attempt fails (or uses the table A_trash if it still
exists). So I should close the session and start a new one before
further processing. Errors in other live sessions are acceptable, but
maybe you know a way to avoid them?



And now a few questions :-)

- do you see any issue that prevent this workflow to work?

- is there any other side effect to take care of ?

- what is the maximum acceptable value for the parameter work_mem for my
configuration 
  (see the complete configuration below)
  
- has anybody built a similar workflow ?  

- could this be a feature request to extend the capabilities of copy
from ?



Thanks for your time and attention,

Marc Mamin

 


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Steinar H. Gunderson
On Wed, Jun 06, 2007 at 07:27:27PM -0400, Kurt Overberg wrote:
> This query runs great on production under 8.0 (27ms), but under 8.2.4  
> (on my mac) I'm seeing times in excess of 50,000ms.  Note that on  
> 8.2.4, if I run the query again, it gets successively faster  
> (50,000ms->6000ms->27ms).  Is this normal?

Your production server probably has all the data in your cache, and your Mac
has not. Furthermore, they seem to be running on different data sets, judging
from your EXPLAIN ANALYZE.

How big did you say these tables were?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Weird 8.2.4 performance

2007-06-07 Thread Richard Huxton

Mark Kirkwood wrote:


8.2 is deciding to use a bitmap index scan on taskid_taskinstance_key, 
which seems to be slower (!) than a plain old index scan that 8.0 is 
using. A dirty work around is to disable bitmap scans via:


I'm having difficulty figuring out why it's doing this at all. There's 
only one index involved, and it's over the primary-key to boot!


An EXPLAIN ANALYSE with enable_bitmapscan off should say why PG thinks 
the costs are cheaper than they actually are.
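
Concretely, something along these lines in one session (the setting reverts
when the session ends):

SET enable_bitmapscan = off;
EXPLAIN ANALYSE
SELECT id FROM taskinstance
 WHERE taskid IN (SELECT id FROM task WHERE campaignid = 76);
SET enable_bitmapscan = on;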


PS - well worded question Kurt. All the relevant information neatly laid 
out, explain analyse on both platforms - you should be charging to let 
people help ;-)


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] VERY slow queries at random

2007-06-07 Thread Kristo Kaiv

could be that the checkpoints are done too seldom.
what is your wal checkpoint config?

Kristo
On 07.06.2007, at 0:27, Scott Marlowe wrote:


Gunther Mayer wrote:

Hi there,

We run a small ISP with a FreeBSD/freeradius/postgresql 8.2.4 backend
and 200+ users. Authentication happens via UAM/hotspot and I see a  
lot

of authorisation and accounting packets that are handled via PL/PGSQL
functions directly in the database.

Everything seems to work 100% except that a few times a day I see

Jun  6 10:41:31 caligula postgres[57347]: [4-1] radiususer: LOG:
duration: 19929.291 ms  statement: SELECT fn_accounting_start(...)

in my logs. I'm logging slow queries with  
log_min_duration_statement =
500 in my postgresql.conf. Sometimes another query runs equally  
slow or
even slower (I've seen 139 seconds!!!) a few minutes before or  
after as

well, but then everything is back to normal.

Even though I haven't yet indexed my data I know that the system is
performant because my largest table (the accounting one) only has  
5000+
rows, the entire database is only a few MB's and I have plenty of  
memory

(2GB), shared_buffers = 100MB and max_fsm_pages = 179200.  Also from
briefly enabling

log_parser_stats = on
log_planner_stats = on
log_executor_stats = on

I saw that most queries are 100% satisfied from cache so the disk
doesn't even get hit. Finally, the problem seems unrelated to load
because it happens at 4am just as likely as at peak traffic time.

What the heck could cause such erratic behaviour? I suspect some  
type of

resource problem but what and how could I dig deeper?


Maybe your hard drive is set to spin down after a certain period of  
idle, and since most all your data is coming from memory, then it  
might be that on the rare occasion when it needs to hit the drive  
it's not spun up anymore.


Maybe some other process is cranking up (cron jobs???) that are  
chewing up all your I/O bandwidth?


Hard to say.  Anything in the system logs that would give you a  
hint?  Try correlating them by the time of the slow pgsql queries.






---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate