Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Christopher Kings-Lynne
> There is clear benefit from forcing them to be the same. In logical data
> terms, they *should* be the same. I don't check fruit.apple_grade
> against fruit_type.orange_grade. When would I want to make a check of
> that nature? If there is a reason, that's great, let's keep the status quo
> then.
>
> I respect the effort and thought that has already gone into the
> implementation; I seek only to offer a very minor improvement based upon
> recent list issues.

The main problem would be people getting errors when upgrading their
databases, or restoring from a backup, say.

Chris


Re: [PERFORM] How to improve db performance with $7K?

2005-03-25 Thread Steve Poe
Tom,
From what I understand, the vendor used ProIV for development. When they
attempted to use 7.4.3, they had ODBC issues and something else I honestly
don't know the details of, but I was told that data was not coming through
properly. They're somewhat at the mercy of the ProIV people to give them
the stamp of approval, and then the vendor will tell us what they support.

Thanks.
Steve Poe
Tom Lane wrote:
Steve Poe <[EMAIL PROTECTED]> writes:
> Situation:  A 24/7 animal hospital (100 employees) runs their business
> on CentOS 3.3 (RHEL 3) and Postgres 7.4.2 (because they have to)

[ itch... ]  Surely they could at least move to 7.4.7 without pain.
There are serious data-loss bugs known in 7.4.2.

regards, tom lane




Re: [PERFORM] How to improve db performance with $7K?

2005-03-25 Thread Joshua D. Drake
Steve Poe wrote:
> Situation:  A 24/7 animal hospital (100 employees) runs their business
> on CentOS 3.3 (RHEL 3) and Postgres 7.4.2 (because they have to) off a
> 2-CPU Xeon 2.8GHz, 4GB of RAM, and (3) SCSI disks in RAID 0 (zcav value
> 35MB per sec). The database is 11GB, comprising over 100 tables and
> indexes from 1MB to 2GB in size.
>
> I recently told the hospital management team the worst-case scenario:
> they need to get the database on its own drive array, since the RAID 0
> is a disaster waiting to happen. I said a new dual AMD server with a
> 6/7-disk configuration would be ideal for safety and performance, but
> they don't have $15K.  I said a separate drive array offers the best
> balance of safety and performance.
>
> I have been given a budget of $7K to accomplish a safer/faster database
> through hardware upgrades.  The objective is to get a drive array, but
> I can use the budget any way I see fit to accomplish the goal.

You could build a dual Opteron with 4GB of RAM and twelve 10k Raptor SATA
drives with a battery-backed cache for about $7k or less.

Or, if they are not CPU bound but just I/O bound, you could easily add an
external 12-drive array (even SCSI) for less than $7k.
Sincerely,
Joshua D. Drake

> Since I am a DBA novice (I did not physically build this server, nor
> did I write the application the hospital runs on) but I have the
> opportunity to make it better, I thought I should seek some advice
> from those who have been down this road before.  Suggestions/ideas,
> anyone?
>
> Thanks.
> Steve Poe

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL




Re: [PERFORM] How to improve db performance with $7K?

2005-03-25 Thread Will LaShell
You can purchase a whole new dual Opteron 740, with 6 gigs of RAM, a
case to match, and six 74GB Ultra320 SCA drives for about $7k.

I know because that's what I bought one for two weeks ago, using Tyan's
dual board.

If you need some details and are willing to go that route,  let me know 
and I'll get you the information.

Sincerely,
Will LaShell
Steve Poe wrote:
> Situation:  A 24/7 animal hospital (100 employees) runs their business
> on CentOS 3.3 (RHEL 3) and Postgres 7.4.2 (because they have to) off a
> 2-CPU Xeon 2.8GHz, 4GB of RAM, and (3) SCSI disks in RAID 0 (zcav value
> 35MB per sec). The database is 11GB, comprising over 100 tables and
> indexes from 1MB to 2GB in size.
>
> I recently told the hospital management team the worst-case scenario:
> they need to get the database on its own drive array, since the RAID 0
> is a disaster waiting to happen. I said a new dual AMD server with a
> 6/7-disk configuration would be ideal for safety and performance, but
> they don't have $15K.  I said a separate drive array offers the best
> balance of safety and performance.
>
> I have been given a budget of $7K to accomplish a safer/faster database
> through hardware upgrades.  The objective is to get a drive array, but
> I can use the budget any way I see fit to accomplish the goal.
>
> Since I am a DBA novice (I did not physically build this server, nor
> did I write the application the hospital runs on) but I have the
> opportunity to make it better, I thought I should seek some advice
> from those who have been down this road before.  Suggestions/ideas,
> anyone?
>
> Thanks.
> Steve Poe



Re: [PERFORM] How to improve db performance with $7K?

2005-03-25 Thread Tom Lane
Steve Poe <[EMAIL PROTECTED]> writes:
> Situation:  A 24/7 animal hospital (100 employees) runs their business
> on CentOS 3.3 (RHEL 3) and Postgres 7.4.2 (because they have to)

[ itch... ]  Surely they could at least move to 7.4.7 without pain.
There are serious data-loss bugs known in 7.4.2.

regards, tom lane



[PERFORM] How to improve db performance with $7K?

2005-03-25 Thread Steve Poe
Situation:  A 24/7 animal hospital (100 employees) runs their business
on CentOS 3.3 (RHEL 3) and Postgres 7.4.2 (because they have to) off a
2-CPU Xeon 2.8GHz, 4GB of RAM, and (3) SCSI disks in RAID 0 (zcav value
35MB per sec). The database is 11GB, comprising over 100 tables and
indexes from 1MB to 2GB in size.

I recently told the hospital management team the worst-case scenario:
they need to get the database on its own drive array, since the RAID 0
is a disaster waiting to happen. I said a new dual AMD server with a
6/7-disk configuration would be ideal for safety and performance, but
they don't have $15K.  I said a separate drive array offers the best
balance of safety and performance.

I have been given a budget of $7K to accomplish a safer/faster database
through hardware upgrades.  The objective is to get a drive array, but I
can use the budget any way I see fit to accomplish the goal.

Since I am a DBA novice (I did not physically build this server, nor did
I write the application the hospital runs on) but I have the opportunity
to make it better, I thought I should seek some advice from those who
have been down this road before.  Suggestions/ideas, anyone?

Thanks.
Steve Poe


Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Otto Blomqvist
OK, thanks a lot for your time, guys! I guess my table is pretty unusual,
and that's why this problem has not surfaced until now. Better late than
never ;) I'll cron a "manual" VACUUM FULL on the table.



"Tom Lane" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> > Otto Blomqvist wrote:
> >> This table is basically a queue full of records waiting to get transferred
> >> over from our 68030 system to the PG database. The records are then moved
> >> into folders (using a trigger) like file_92_myy depending on what month the
> >> record was created on the 68030. During normal operations there should not
> >> be more than 10 records at a time in the table, although during the course
> >> of a day a normal system will get about 50k records. I create 5 records
> >> to simulate incoming traffic, since we don't have much traffic in the test
> >> lab.
>
> Really the right way to do housekeeping for a table like that is to
> VACUUM FULL (or better yet, TRUNCATE, if possible) immediately after
> discarding a batch of records.  The VACUUM FULL will take very little
> time if it only has to repack <10 records.  Plain VACUUM is likely to
> leave the table nearly empty but physically sizable, which is bad news
> from a statistical point of view: as the table fills up again, it won't
> get physically larger, thereby giving the planner no clue that it
> doesn't still have <10 records.  This means the queries that process
> the 50K-record batch are going to get horrible plans :-(
>
> I'm not sure if autovacuum could be taught to do that --- it could
> perhaps launch a vacuum as soon as it notices a large fraction of the
> table got deleted, but do we really want to authorize it to launch
> VACUUM FULL?  It'd be better to issue the vacuum synchronously
> as part of the batch updating script, I feel.
>
> regards, tom lane
>





Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian  writes:
> > Tom Lane wrote:
> >> I'm not sure if autovacuum could be taught to do that --- it could
> >> perhaps launch a vacuum as soon as it notices a large fraction of the
> >> table got deleted, but do we really want to authorize it to launch
> >> VACUUM FULL?
> 
> > One problem with VACUUM FULL would be autovacuum waiting for an
> > exclusive lock on the table.  Anyway, it is documented now as a possible
> > issue.
> 
> I don't care too much about autovacuum waiting awhile to get a lock.
> I do care about other processes getting queued up behind it, though.
> 
> Perhaps it would be possible to alter the normal lock queuing semantics
> for this case, so that autovacuum's request doesn't block later
> arrivals, and it can only get the lock when no one is interested in the
> table.  Of course, that might never happen, or by the time it does
> there's no point in VACUUM FULL anymore :-(

Can we issue a LOCK TABLE with a statement_timeout, and only do the
VACUUM FULL if we can get a lock quickly?  That seems like a plan.

The only problem is that you can't VACUUM FULL in a transaction:

test=> create table test (x int);
CREATE TABLE
test=> insert into test values (1);
INSERT 0 1
test=> begin;
BEGIN
test=> lock table test;
LOCK TABLE
test=> vacuum full;
ERROR:  VACUUM cannot run inside a transaction block
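Just to sketch what that would look like (not a working recipe: as shown
above, the VACUUM FULL itself can't live inside the transaction, so the lock
taken here is released at COMMIT and would have to be re-acquired):

test=> SET statement_timeout = 2000;   -- milliseconds: give up if we can't lock quickly
SET
test=> BEGIN;
BEGIN
test=> LOCK TABLE test IN ACCESS EXCLUSIVE MODE;
LOCK TABLE
test=> COMMIT;                         -- lock is released here ...
COMMIT
test=> VACUUM FULL test;               -- ... so this has to grab it again
VACUUM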

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Bruce Momjian
Tom Lane wrote:
> > Otto Blomqvist wrote:
> >> This table is basically a queue full of records waiting to get transferred
> >> over from our 68030 system to the PG database. The records are then moved
> >> into folders (using a trigger) like file_92_myy depending on what month the
> >> record was created on the 68030. During normal operations there should not
> >> be more than 10 records at a time in the table, although during the course
> >> of a day a normal system will get about 50k records. I create 5 records
> >> to simulate incoming traffic, since we don't have much traffic in the test
> >> lab.
> 
> Really the right way to do housekeeping for a table like that is to
> VACUUM FULL (or better yet, TRUNCATE, if possible) immediately after
> discarding a batch of records.  The VACUUM FULL will take very little
> time if it only has to repack <10 records.  Plain VACUUM is likely to
> leave the table nearly empty but physically sizable, which is bad news
> from a statistical point of view: as the table fills up again, it won't
> get physically larger, thereby giving the planner no clue that it
> doesn't still have <10 records.  This means the queries that process
> the 50K-record batch are going to get horrible plans :-(
> 
> I'm not sure if autovacuum could be taught to do that --- it could
> perhaps launch a vacuum as soon as it notices a large fraction of the
> table got deleted, but do we really want to authorize it to launch
> VACUUM FULL?  It'd be better to issue the vacuum synchronously
> as part of the batch updating script, I feel.

I added this to the TODO section for autovacuum:

o Do VACUUM FULL if table is nearly empty?

I don't think autovacuum is ever going to be smart enough to recycle
during the delete, especially since the rows can't be reused until the
transaction completes.

One problem with VACUUM FULL would be autovacuum waiting for an
exclusive lock on the table.  Anyway, it is documented now as a possible
issue.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Tom Lane
Bruce Momjian  writes:
> Tom Lane wrote:
>> I'm not sure if autovacuum could be taught to do that --- it could
>> perhaps launch a vacuum as soon as it notices a large fraction of the
>> table got deleted, but do we really want to authorize it to launch
>> VACUUM FULL?

> One problem with VACUUM FULL would be autovacuum waiting for an
> exclusive lock on the table.  Anyway, it is documented now as a possible
> issue.

I don't care too much about autovacuum waiting awhile to get a lock.
I do care about other processes getting queued up behind it, though.

Perhaps it would be possible to alter the normal lock queuing semantics
for this case, so that autovacuum's request doesn't block later
arrivals, and it can only get the lock when no one is interested in the
table.  Of course, that might never happen, or by the time it does
there's no point in VACUUM FULL anymore :-(

regards, tom lane



Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Joshua D. Drake

> > I'm not sure if autovacuum could be taught to do that --- it could
> > perhaps launch a vacuum as soon as it notices a large fraction of the
> > table got deleted, but do we really want to authorize it to launch
> > VACUUM FULL?  It'd be better to issue the vacuum synchronously
> > as part of the batch updating script, I feel.
> 
> I added this to the TODO section for autovacuum:
> 
> o Do VACUUM FULL if table is nearly empty?

We should never automatically launch a vacuum full. That seems like a
really bad idea.

Sincerely,

Joshua D. Drake


> 
> I don't think autovacuum is every going to be smart enough to recycle
> during the delete, especially since the rows can't be reused until the
> transaction completes.
> 
> One problem with VACUUM FULL would be autovacuum waiting for an
> exclusive lock on the table.  Anyway, it is documented now as a possible
> issue.
> 
-- 
Command Prompt, Inc., Your PostgreSQL solutions company. 503-667-4564
Custom programming, 24x7 support, managed services, and hosting
Open Source Authors: plPHP, pgManage, Co-Authors: plPerlNG
Reliable replication, Mammoth Replicator - http://www.commandprompt.com/




Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Tom Lane
> Otto Blomqvist wrote:
>> This table is basically a queue full of records waiting to get transferred
>> over from our 68030 system to the PG database. The records are then moved
>> into folders (using a trigger) like file_92_myy depending on what month the
>> record was created on the 68030. During normal operations there should not
>> be more than 10 records at a time in the table, although during the course
>> of a day a normal system will get about 50k records. I create 5 records
>> to simulate incoming traffic, since we don't have much traffic in the test
>> lab.

Really the right way to do housekeeping for a table like that is to
VACUUM FULL (or better yet, TRUNCATE, if possible) immediately after
discarding a batch of records.  The VACUUM FULL will take very little
time if it only has to repack <10 records.  Plain VACUUM is likely to
leave the table nearly empty but physically sizable, which is bad news
from a statistical point of view: as the table fills up again, it won't
get physically larger, thereby giving the planner no clue that it
doesn't still have <10 records.  This means the queries that process
the 50K-record batch are going to get horrible plans :-(

I'm not sure if autovacuum could be taught to do that --- it could
perhaps launch a vacuum as soon as it notices a large fraction of the
table got deleted, but do we really want to authorize it to launch
VACUUM FULL?  It'd be better to issue the vacuum synchronously
as part of the batch updating script, I feel.
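In the simplest case the tail of such a batch script is just one extra
statement (table name borrowed from earlier in this thread; TRUNCATE is only
safe if the queue really is empty at that point):

-- right after the batch has moved its records out of the queue table:
VACUUM FULL ANALYZE file_92;
-- or, when every row is known to have been processed:
TRUNCATE TABLE file_92;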

regards, tom lane



Re: lazy_update_relstats considered harmful (was Re: [PERFORM] pg_autovacuum not having enough suction ?)

2005-03-25 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Fri, 2005-03-25 at 15:22 -0500, Tom Lane wrote:
>> 2. Dead tuples don't have that much influence on scan costs either, at
>> least not once they are marked as known-dead.  Certainly they shouldn't
>> be charged at full freight.

> Yes, minor additional CPU time, but the main issue is when the dead
> tuples force additional I/O.

I/O costs are mostly estimated off relpages, though, not reltuples.
The only time you really pay through the nose for a dead tuple is when
an indexscan visits it, but with the known-dead marking we now do in
btree indexes, I'm pretty sure that path is seldom taken.

>> It's possible that there'd be some value in adding a column to pg_class
>> to record dead tuple count, but given what we have now, the calculation
>> in lazy_update_relstats is totally wrong.

> Yes, that's the way. We can record the (averaged?) dead tuple count, but
> also record the actual row count in reltuples.

What I'd be inclined to record is the actual number of dead rows removed
by the most recent VACUUM.  Any math on that is best done in the
planner, since we can change the logic more easily than the database
contents.  It'd probably be reasonable to take half of that number as
the estimate of the average number of dead tuples.
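(Spelled out with a made-up column name, since nothing like this exists in
any release:)

-- hypothetical: VACUUM records the number of dead rows it just removed,
-- say in a new pg_class column "relvacdeleted"; between vacuums the planner
-- would then size its scans as roughly
--     reltuples + relvacdeleted / 2
-- heap tuples.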

But in any case, that's for the future; we can't have it in 8.0.*, and
right at the moment I'm focusing on what to push out for 8.0.2.

> We definitely need to record the physical and logical tuple counts,
> since each of them has a different contribution to run times.

There isn't any difference, if you are talking about fully dead tuples.
It would be possible for VACUUM to also count the number of
not-committed-but-not-removable tuples (ie, new from still-open
transactions, plus dead-but-still-visible-to-somebody), but I'm not sure
that it would be useful to do so, because that sort of count is hugely
transient.  The stat would be irrelevant moments after it was taken.

regards, tom lane



Re: lazy_update_relstats considered harmful (was Re: [PERFORM]

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 15:22 -0500, Tom Lane wrote:
> 2. Dead tuples don't have that much influence on scan costs either, at
> least not once they are marked as known-dead.  Certainly they shouldn't
> be charged at full freight.

Yes, minor additional CPU time, but the main issue is when the dead
tuples force additional I/O.

> It's possible that there'd be some value in adding a column to pg_class
> to record dead tuple count, but given what we have now, the calculation
> in lazy_update_relstats is totally wrong.

Yes, that's the way. We can record the (averaged?) dead tuple count, but
also record the actual row count in reltuples.

We definitely need to record the physical and logical tuple counts,
since each of them has a different contribution to run times.

For comparing seq scan vs. index scan, we need to look at the physical
tuple count * avg row size, whereas when we calculate the number of rows
returned we should look at fractions of the logical row count.

> The idea I was trying to capture is that the tuple density is at a
> minimum right after VACUUM, and will increase as free space is filled
> in until the next VACUUM, so that recording the exact tuple count
> underestimates the number of tuples that will be seen on-the-average.
> But I'm not sure that idea really holds water.  The only way that a
> table can be at "steady state" over a long period is if the number of
> live tuples remains roughly constant (ie, inserts balance deletes).
> What actually increases and decreases over a VACUUM cycle is the density
> of *dead* tuples ... but per the above arguments this isn't something
> we should adjust reltuples for.
> 
> So I'm thinking lazy_update_relstats should be ripped out and we should
> go back to recording just the actual stats.
> 
> Sound reasonable?  Or was I right the first time and suffering brain
> fade today?

Well, I think the original idea had some validity, but clearly
lazy_update_relstats isn't the way to do it even though we thought so at
the time.

Best Regards, Simon Riggs





Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 16:25 -0500, Tom Lane wrote:
> Stephan Szabo <[EMAIL PROTECTED]> writes:
> > On Fri, 25 Mar 2005, Simon Riggs wrote:
> >> Could it be that because PostgreSQL has a very highly developed sense of
> >> datatype comparison that we might be taking this to extremes? Would any
> >> other RDBMS consider two different datatypes to be comparable?
> 
> > We do have a broader comparable than the spec.
> 
> However, the set of comparisons that we can presently support *with
> indexes* is narrower than the spec, so rejecting nonindexable cases
> would be a problem.

OK. Can we have a TODO item then?

* Ensure that all SQL:2003 comparable datatypes are also indexable when
compared

...or something like that

> It's worth noting also that the test being discussed checks whether the
> PK index is usable for testing the RI constraint.  In the problem that
> started this thread, the difficulty is lack of a usable index on the FK
> column, not the PK (because that's the table that has to be searched to
> do a delete in the PK table).  We cannot enforce that there be a usable
> index on the FK column (since indexes on the FK table may not have been
> built yet when the constraint is declared), and shouldn't anyway because
> there are reasonable usage patterns where you don't need one.

Yes, I agree for CASCADE we wouldn't always want an index.

Alright then, time to leave it there.

I want to write up some additional comments for the performance tips:
- Consider defining RI constraints after tables have been loaded
- Remember to add an index on the referencing table if the constraint is
defined as CASCADEing
(a rough sketch of both tips is below)
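Something like this, for the CASCADE case (purely illustrative table names;
the bulk load happens wherever the comment is):

-- 1. create the tables and bulk-load them with no FK constraint declared yet
CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child  (parent_id integer NOT NULL);
-- ... COPY/INSERT the data for both tables here ...

-- 2. index the referencing column, then add the CASCADEing constraint
CREATE INDEX child_parent_id_idx ON child (parent_id);
ALTER TABLE child
    ADD CONSTRAINT child_parent_id_fkey
    FOREIGN KEY (parent_id) REFERENCES parent (id)
    ON DELETE CASCADE;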

Have a good Easter, all, wherever you are and whatever you believe in.

Best Regards, Simon Riggs




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Karim Nassar
On Fri, 2005-03-25 at 15:10 +, Simon Riggs wrote:
> Karim: Did this happen? If not, can you drop and re-create and confirm
> that you get the WARNING? If not, we have problems.

No. Nor do I think that I should. SERIAL is a shortcut for INTEGER, no? I
think there is some other (TBD) problem causing my big seq scan.
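(As I understand it, a SERIAL column is just an integer with a
sequence-backed default, so there is no datatype mismatch hiding there.
Roughly:)

-- these two definitions give the id column the same type, integer:
CREATE TABLE t1 (id SERIAL PRIMARY KEY);

CREATE SEQUENCE t2_id_seq;
CREATE TABLE t2 (id integer PRIMARY KEY DEFAULT nextval('t2_id_seq'));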

orfs=# ALTER TABLE measurement DROP CONSTRAINT 
measurement_id_int_sensor_meas_type_fkey;
ALTER TABLE
orfs=# ALTER TABLE ONLY measurement ADD CONSTRAINT 
measurement_id_int_sensor_meas_type_fkey
orfs-#  FOREIGN KEY (id_int_sensor_meas_type) REFERENCES 
int_sensor_meas_type(id_int_sensor_meas_type);
ALTER TABLE
orfs=#

The add constraint statement comes directly from a pg_dump.

For clarity, the table/indexes were created as such:

CREATE TABLE int_sensor_meas_type( 
id_int_sensor_meas_type SERIAL PRIMARY KEY,
id_sensor integer NOT NULL REFERENCES sensor,
id_meas_type integer NOT NULL REFERENCES meas_type UNIQUE);

CREATE TABLE measurement (
id_measurement SERIAL PRIMARY KEY,
id_int_sensor_meas_type integer NOT NULL REFERENCES int_sensor_meas_type,
datetime timestamp WITH TIME ZONE NOT NULL,
value numeric(15,5) NOT NULL,
created timestamp with time zone NOT NULL DEFAULT now(),
created_by TEXT NOT NULL REFERENCES public.person(id_person));

CREATE INDEX measurement__id_int_sensor_meas_type_idx ON 
measurement(id_int_sensor_meas_type);
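One way to check whether that index actually gets picked up for the RI check
is to run, by hand, roughly the kind of query the FK trigger issues (the
constant is arbitrary):

EXPLAIN ANALYZE
  SELECT 1 FROM ONLY measurement
   WHERE id_int_sensor_meas_type = 1 FOR UPDATE;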

Regards,
-- 
Karim Nassar
Department of Computer Science
Box 15600, College of Engineering and Natural Sciences
Northern Arizona University,  Flagstaff, Arizona 86011
Office: (928) 523-5868 -=- Mobile: (928) 699-9221




Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Tom Lane
Stephan Szabo <[EMAIL PROTECTED]> writes:
> On Fri, 25 Mar 2005, Simon Riggs wrote:
>> Could it be that because PostgreSQL has a very highly developed sense of
>> datatype comparison that we might be taking this to extremes? Would any
>> other RDBMS consider two different datatypes to be comparable?

> We do have a broader comparable than the spec.

However, the set of comparisons that we can presently support *with
indexes* is narrower than the spec, so rejecting nonindexable cases
would be a problem.

It's worth noting also that the test being discussed checks whether the
PK index is usable for testing the RI constraint.  In the problem that
started this thread, the difficulty is lack of a usable index on the FK
column, not the PK (because that's the table that has to be searched to
do a delete in the PK table).  We cannot enforce that there be a usable
index on the FK column (since indexes on the FK table may not have been
built yet when the constraint is declared), and shouldn't anyway because
there are reasonable usage patterns where you don't need one.

regards, tom lane



Re: [HACKERS] lazy_update_relstats considered harmful (was Re: [PERFORM]

2005-03-25 Thread Matthew T. O'Connor
Tom Lane wrote:
I wrote:
 

One thing that is possibly relevant here is that in 8.0 a plain VACUUM
doesn't set reltuples to the exactly correct number, but to an
interpolated value that reflects our estimate of the "steady state"
average between vacuums.  I wonder if that code is wrong, or if it's
operating as designed but is confusing autovac.
   

Now that I think it over, I'm thinking that I must have been suffering
severe brain fade the day I wrote lazy_update_relstats() (see
vacuumlazy.c).  The numbers that that routine is averaging are the pre-
and post-vacuum physical tuple counts.  But the difference between them
consists of known-dead tuples, and we shouldn't be factoring dead tuples
into reltuples.  The planner has always considered reltuples to count
only live tuples, and I think this is correct on two grounds:
1. The numbers of tuples estimated to be returned by scans certainly
shouldn't count dead ones.
2. Dead tuples don't have that much influence on scan costs either, at
least not once they are marked as known-dead.  Certainly they shouldn't
be charged at full freight.
It's possible that there'd be some value in adding a column to pg_class
to record dead tuple count, but given what we have now, the calculation
in lazy_update_relstats is totally wrong.
The idea I was trying to capture is that the tuple density is at a
minimum right after VACUUM, and will increase as free space is filled
in until the next VACUUM, so that recording the exact tuple count
underestimates the number of tuples that will be seen on-the-average.
But I'm not sure that idea really holds water.  The only way that a
table can be at "steady state" over a long period is if the number of
live tuples remains roughly constant (ie, inserts balance deletes).
What actually increases and decreases over a VACUUM cycle is the density
of *dead* tuples ... but per the above arguments this isn't something
we should adjust reltuples for.
So I'm thinking lazy_update_relstats should be ripped out and we should
go back to recording just the actual stats.
Sound reasonable?  Or was I right the first time and suffering brain
fade today?



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Stephan Szabo

On Fri, 25 Mar 2005, Simon Riggs wrote:

> On Fri, 2005-03-25 at 13:47 -0500, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> > >>> Other than spec compliance, you mean?  SQL99 says
> > >>>
> > >>> ... The declared type of each referencing column shall be
> > >>> comparable to the declared type of the corresponding referenced
> > >>> column.
> >
> > > Tom had said SQL99 required this; I have pointed out SQL:2003, which
> > > supercedes the SQL99 standard, does not require this.
> >
> > You're reading the wrong part of SQL:2003.  11.8 <referential constraint
> > definition> syntax rule 9 still has the text I quoted.
>
> So, we have this from SQL:2003 section 11.8 p.550
> - 3a) requires us to have an index
> - 9) requires the data types to be "comparable"
>
> In the name of spec-compliance we wish to accept an interpretation of
> the word "comparable" that means we will accept two datatypes that are
> not actually the same.
>
> So we are happy to enforce having the index, but not happy to ensure the
> index is actually usable for the task?

The indexes "usability" only applies to the purpose of guaranteeing
uniqueness which doesn't depend on the referencing type AFAICS.

> > > Leading us back to my original point - what is the benefit of continuing
> > > with having a WARNING when that leads people into trouble later?
> >
> > Accepting spec-compliant schemas.
>
> I definitely want this too - as you know I have worked on documenting
> compliance previously.
>
> Is the word "comparable" defined elsewhere in the standard?

Yes.  And at least in SQL99, there's a bunch of statements in 4.* about
what are comparable.

> Currently, datatypes with similar type categories are comparable and yet
> (in 8.0) will now use the index. So, we are taking comparable to include
> fairly radically different datatypes?

Not entirely. I believe a referenced column of int and a referencing
column of numeric currently display that warning, but appear to be
allowed by the spec (as the numeric types are considered mutually
comparable).
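A tiny illustration of that case (hypothetical tables, borrowing Simon's
fruit example):

CREATE TABLE fruit_type (grade integer PRIMARY KEY);
CREATE TABLE fruit (apple_grade numeric REFERENCES fruit_type (grade));
-- accepted, but 8.0 warns that the key columns are of different types,
-- so the RI checks may not be able to use an index.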

> Could it be that because PostgreSQL has a very highly developed sense of
> datatype comparison that we might be taking this to extremes? Would any
> other RDBMS consider two different datatypes to be comparable?

We do have a broader comparable than the spec. However, if we were to
limit it to the spec, then many of the implicit casts and cross-type
comparison operators we have would be invalid as well, since the
comparison between those types would have to fail unless we treat the
comparable used by  differently.



Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Matthew T. O'Connor
Tom Lane wrote:
> "Matthew T. O'Connor"  writes:
>> Hmm, the value in reltuples should be accurate after a vacuum (or
>> vacuum analyze); if it's not, it's a vacuum bug or something is going on
>> that isn't understood.  If you or pg_autovacuum are running plain
>> analyze commands, that could explain the invalid reltuples numbers.
>>
>> Was reltuples = 113082 correct right after the vacuum?
>
> Another thing to check is whether the reltuples (and relpages!) that
> autovacuum is reporting are the same as what's actually in the pg_class
> row for the relation.  I'm wondering if this could be a similar issue
> to the old autovac bug where it wasn't reading the value correctly.

I don't think so, as he did some manual selects from pg_class and
pg_stat_all in one of the emails he sent that were showing similar
numbers to what autovac was reporting.

> If they are the same then it seems like it must be a backend issue.
>
> One thing that is possibly relevant here is that in 8.0 a plain VACUUM
> doesn't set reltuples to the exactly correct number, but to an
> interpolated value that reflects our estimate of the "steady state"
> average between vacuums.  I wonder if that code is wrong, or if it's
> operating as designed but is confusing autovac.

Ahh...  Now that you mention it, I do remember the discussion during
8.0 development.  This sounds very much like the cause of the problem.
Autovac is not vacuuming often enough for this table because reltuples
is telling autovac that there are a lot more tuples in this table than
there really are.

Really this is just another case of the more general problem with
autovac as it stands now.  That is, you can't set vacuum thresholds on a
per-table basis, and databases like this can't survive with a
one-size-fits-all threshold.  I would suggest that Otto perform regular
cron-based vacuums of this one table in addition to autovac; that is what
several people I have heard from in the field are doing.

Come hell or high water I'm gonna get autovac integrated into 8.1, at
which point per-table thresholds would be easy to do.

> Can autovac be told to run the vacuums in VERBOSE mode?  It would be
> useful to compare what VERBOSE has to say to the changes in
> reltuples/relpages.

Not as it stands now.  That would be an interesting feature for
debugging purposes though.



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 13:47 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> >>> Other than spec compliance, you mean?  SQL99 says
> >>> 
> >>> ... The declared type of each referencing column shall be
> >>> comparable to the declared type of the corresponding referenced
> >>> column.
> 
> > Tom had said SQL99 required this; I have pointed out SQL:2003, which
> > supercedes the SQL99 standard, does not require this.
> 
> You're reading the wrong part of SQL:2003.  11.8 <referential constraint
> definition> syntax rule 9 still has the text I quoted.

So, we have this from SQL:2003 section 11.8 p.550
- 3a) requires us to have an index
- 9) requires the data types to be "comparable"

In the name of spec-compliance we wish to accept an interpretation of
the word "comparable" that means we will accept two datatypes that are
not actually the same. 

So we are happy to enforce having the index, but not happy to ensure the
index is actually usable for the task?

> > Leading us back to my original point - what is the benefit of continuing
> > with having a WARNING when that leads people into trouble later?
> 
> Accepting spec-compliant schemas.

I definitely want this too - as you know I have worked on documenting
compliance previously.

Is the word "comparable" defined elsewhere in the standard?

Currently, datatypes with similar type categories are comparable and yet
(in 8.0) will now use the index. So, we are taking comparable to include
fairly radically different datatypes?

Could it be that because PostgreSQL has a very highly developed sense of
datatype comparison that we might be taking this to extremes? Would any
other RDBMS consider two different datatypes to be comparable?

Please consider this. 

Best Regards, Simon Riggs





Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Otto Blomqvist

> Another thing to check is whether the reltuples (and relpages!) that
> autovacuum is reporting are the same as what's actually in the pg_class
> row for the relation.  I'm wondering if this could be a similar issue
> to the old autovac bug where it wasn't reading the value correctly.

These values where extracted at roughly the same time.

-[ RECORD 1 ]--+---------
relname        | file_92
relnamespace   | 2200
reltype        | 9384220
relowner       | 100
relam          | 0
relfilenode    | 9384219
reltablespace  | 0
relpages       | 6624
reltuples      | 113082
reltoastrelid  | 0
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relkind        | r
relnatts       | 23
relchecks      | 0
reltriggers    | 1
relukeys       | 0
relfkeys       | 0
relrefs        | 0
relhasoids     | t
relhaspkey     | f
relhasrules    | f
relhassubclass | f
relacl         |

secom=# select count(*) from file_92;
 count
---
 17579
(1 row)

[2005-03-25 12:16:32 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 12:16:32 EST] INFO: relid: 9384219;   relisshared: 0
[2005-03-25 12:16:32 EST] INFO: reltuples: 113082.00;  relpages:
6624
[2005-03-25 12:16:32 EST] INFO: curr_analyze_count: 993780;
curr_vacuum_count: 732470
[2005-03-25 12:16:32 EST] INFO: last_analyze_count: 923820;
last_vacuum_count: 662699
[2005-03-25 12:16:32 EST] INFO: analyze_threshold: 113582;
vacuum_threshold: 227164


Hope this helps, if there is anything else I can do please let me know.


> If they are the same then it seems like it must be a backend issue.
>
> One thing that is possibly relevant here is that in 8.0 a plain VACUUM
> doesn't set reltuples to the exactly correct number, but to an
> interpolated value that reflects our estimate of the "steady state"
> average between vacuums.  I wonder if that code is wrong, or if it's
> operating as designed but is confusing autovac.


This average steady-state value might be hard to interpolate in this case,
since this is only a temporary holding place for the records...? Normally the
table has < 10 records in it at the same time. In the lab we create
"lump traffic" by sending over 5 Records. It takes about 20 hours to
transfer over all of the 50k records.







lazy_update_relstats considered harmful (was Re: [PERFORM] pg_autovacuum not having enough suction ?)

2005-03-25 Thread Tom Lane
I wrote:
> One thing that is possibly relevant here is that in 8.0 a plain VACUUM
> doesn't set reltuples to the exactly correct number, but to an
> interpolated value that reflects our estimate of the "steady state"
> average between vacuums.  I wonder if that code is wrong, or if it's
> operating as designed but is confusing autovac.

Now that I think it over, I'm thinking that I must have been suffering
severe brain fade the day I wrote lazy_update_relstats() (see
vacuumlazy.c).  The numbers that that routine is averaging are the pre-
and post-vacuum physical tuple counts.  But the difference between them
consists of known-dead tuples, and we shouldn't be factoring dead tuples
into reltuples.  The planner has always considered reltuples to count
only live tuples, and I think this is correct on two grounds:

1. The numbers of tuples estimated to be returned by scans certainly
shouldn't count dead ones.

2. Dead tuples don't have that much influence on scan costs either, at
least not once they are marked as known-dead.  Certainly they shouldn't
be charged at full freight.

It's possible that there'd be some value in adding a column to pg_class
to record dead tuple count, but given what we have now, the calculation
in lazy_update_relstats is totally wrong.

The idea I was trying to capture is that the tuple density is at a
minimum right after VACUUM, and will increase as free space is filled
in until the next VACUUM, so that recording the exact tuple count
underestimates the number of tuples that will be seen on-the-average.
But I'm not sure that idea really holds water.  The only way that a
table can be at "steady state" over a long period is if the number of
live tuples remains roughly constant (ie, inserts balance deletes).
What actually increases and decreases over a VACUUM cycle is the density
of *dead* tuples ... but per the above arguments this isn't something
we should adjust reltuples for.

So I'm thinking lazy_update_relstats should be ripped out and we should
go back to recording just the actual stats.

Sound reasonable?  Or was I right the first time and suffering brain
fade today?

regards, tom lane



Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Otto Blomqvist
> Was reltuples = 113082 correct right after the vacuum?

No, there were about 31000 rows after the vacuum. I'm no expert, but tuples
= rows, right?

This is not a "normal" table though, in the sence that it is only a
temporary holding ground as I explained earlier. I create 5 records and
these get sent over from our custom 68030 system, to tables like
file_92_myy, depending on the date of the record.  A pl/pgsql script is used
as a trigger to move the records after they get data from the 68030. Don't
know if that is of interest or not. I could post the trigger if you'd like.
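(The actual trigger was never posted; the following is only a guess at its
general shape, with invented column names, and it assumes the monthly folder
tables already exist:)

-- queue table and one monthly "folder" table (hypothetical columns)
CREATE TABLE file_92     (id serial, myy text, payload text, filled boolean DEFAULT false);
CREATE TABLE file_92_305 (id integer, myy text, payload text, filled boolean);

CREATE FUNCTION file_92_move() RETURNS trigger AS '
BEGIN
    IF NEW.filled THEN
        -- route the completed record into its monthly folder table
        EXECUTE ''INSERT INTO file_92_'' || NEW.myy || '' VALUES (''
                || NEW.id || '', '' || quote_literal(NEW.myy) || '', ''
                || quote_literal(NEW.payload) || '', true)'';
        DELETE FROM file_92 WHERE id = NEW.id;
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER file_92_move AFTER UPDATE ON file_92
    FOR EACH ROW EXECUTE PROCEDURE file_92_move();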


""Matthew T. O'Connor""  wrote in message
news:[EMAIL PROTECTED]
> Hmm, the value in reltuples should be accurate after a vacuum (or
> vacuum analyze); if it's not, it's a vacuum bug or something is going on
> that isn't understood.  If you or pg_autovacuum are running plain
> analyze commands, that could explain the invalid reltuples numbers.
>
> Was reltuples = 113082 correct right after the vacuum?
>
> Matthew
>
>
> Otto Blomqvist wrote:
>
> >It looks like the reltuples values are screwed up. Even though rows are
> >constantly being removed from the table the reltuples keep going up. If I
> >understand correctly that also makes the vacuum threshold go up and we end
> >up in a vicious circle. Right after pg_autovacuum performed a vacuum analyze
> >on the table it actually had 31000 records, but reltuples reports over 100k.
> >I'm not sure if this means anything, but I thought I would pass it along.
> >
> >PG version 8.0.0, 31MB tarred DB.
> >
> >[2005-03-25 09:16:14 EST] INFO:dbname: testing
> >[2005-03-25 09:16:14 EST] INFO:  oid: 9383816
> >[2005-03-25 09:16:14 EST] INFO:  username: (null)
> >[2005-03-25 09:16:14 EST] INFO:  password: (null)
> >[2005-03-25 09:16:14 EST] INFO:  conn is null, (not connected)
> >[2005-03-25 09:16:14 EST] INFO:  default_analyze_threshold: 1000
> >[2005-03-25 09:16:14 EST] INFO:  default_vacuum_threshold: 500
> >
> >
> >[2005-03-25 09:05:12 EST] INFO:  table name: secom."public"."file_92"
> >[2005-03-25 09:05:12 EST] INFO: relid: 9384219;   relisshared: 0
> >[2005-03-25 09:05:12 EST] INFO: reltuples: 49185.00;  relpages: 8423
> >[2005-03-25 09:05:12 EST] INFO: curr_analyze_count: 919274;
> >curr_vacuum_count: 658176
> >[2005-03-25 09:05:12 EST] INFO: last_analyze_count: 899272;
> >last_vacuum_count: 560541
> >[2005-03-25 09:05:12 EST] INFO: analyze_threshold: 49685;
> >vacuum_threshold: 100674
> >
> >
> >[2005-03-25 09:10:12 EST] DEBUG:   Performing: VACUUM ANALYZE
> >"public"."file_92"
> >[2005-03-25 09:10:33 EST] INFO:  table name: secom."public"."file_92"
> >[2005-03-25 09:10:33 EST] INFO: relid: 9384219;   relisshared: 0
> >[2005-03-25 09:10:33 EST] INFO: reltuples: 113082.00;  relpages: 6624
> >[2005-03-25 09:10:33 EST] INFO: curr_analyze_count: 923820;
> >curr_vacuum_count: 662699
> >[2005-03-25 09:10:33 EST] INFO: last_analyze_count: 923820;
> >last_vacuum_count: 662699
> >[2005-03-25 09:10:33 EST] INFO: analyze_threshold: 113582;
> >vacuum_threshold: 227164
> >
> >
> >[2005-03-25 09:16:14 EST] INFO:  table name: secom."public"."file_92"
> >[2005-03-25 09:16:14 EST] INFO: relid: 9384219;   relisshared: 0
> >[2005-03-25 09:16:14 EST] INFO: reltuples: 113082.00;  relpages: 6624  <-- Actually has 31k rows
> >[2005-03-25 09:16:14 EST] INFO: curr_analyze_count: 923820;
> >curr_vacuum_count: 662699
> >[2005-03-25 09:16:14 EST] INFO: last_analyze_count: 923820;
> >last_vacuum_count: 662699
> >[2005-03-25 09:16:14 EST] INFO: analyze_threshold: 113582;
> >vacuum_threshold: 227164
> >
> >DETAIL:  Allocated FSM size: 1000 relations + 200 pages = 11784 kB
> >shared memory.
> >
> >
> >
> >
> >- Original Message -
> >From: "Matthew T. O'Connor" 
> >To: "Otto Blomqvist" <[EMAIL PROTECTED]>;
> >
> >Sent: Thursday, March 24, 2005 3:58 PM
> >Subject: Re: [PERFORM] pg_autovacuum not having enough suction ?
> >
> >
> >
> >
> >>I would rather keep this on list since other people can chime in.
> >>
> >>Otto Blomqvist wrote:
> >>
> >>
> >>
> >>>It does not seem to be a Stats collector problem.
> >>>
> >>   oid   | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
> >> ---------+---------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
> >>  9384219 | file_92 |         2200 |     8423 | f           |     49837 | public     |    158176 |    318527 |    158176
> >>(1 row)
> >>>
> >>>I insert 5 records
> >>>
> >>secom=# select createfile_92records(1, 5);   <--- this is a pg script
> >>that inserts records 1 through 5.
> >> createfile_92records
> >>----------------------
> >>                    0
> >>
> >>  oid   | relname | r

Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Tom Lane
"Matthew T. O'Connor"  writes:
> Hmm, the value in reltuples should be accurate after a vacuum (or
> vacuum analyze); if it's not, it's a vacuum bug or something is going on
> that isn't understood.  If you or pg_autovacuum are running plain
> analyze commands, that could explain the invalid reltuples numbers.

> Was reltuples = 113082 correct right after the vacuum?

Another thing to check is whether the reltuples (and relpages!) that
autovacuum is reporting are the same as what's actually in the pg_class
row for the relation.  I'm wondering if this could be a similar issue
to the old autovac bug where it wasn't reading the value correctly.
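In other words, something along these lines, using the table from this
thread:

SELECT relname, relpages, reltuples
  FROM pg_class
 WHERE relname = 'file_92';

-- and, for comparison, the actual row count:
SELECT count(*) FROM file_92;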

If they are the same then it seems like it must be a backend issue.

One thing that is possibly relevant here is that in 8.0 a plain VACUUM
doesn't set reltuples to the exactly correct number, but to an
interpolated value that reflects our estimate of the "steady state"
average between vacuums.  I wonder if that code is wrong, or if it's
operating as designed but is confusing autovac.

Can autovac be told to run the vacuums in VERBOSE mode?  It would be
useful to compare what VERBOSE has to say to the changes in
reltuples/relpages.
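In the meantime the comparison can be made by hand, e.g.:

-- run manually and compare its page/tuple report with what autovac logs:
VACUUM VERBOSE ANALYZE public.file_92;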

regards, tom lane



Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Matthew T. O'Connor
Hmm, the value in reltuples should be accurate after a vacuum (or
vacuum analyze); if it's not, it's a vacuum bug or something is going on
that isn't understood.  If you or pg_autovacuum are running plain
analyze commands, that could explain the invalid reltuples numbers.

Was reltuples = 113082 correct right after the vacuum?

Matthew
Otto Blomqvist wrote:
It looks like the reltuples values are screwed up. Even though rows are
constantly being removed from the table the reltuples keep going up. If I
understand correctly that also makes the vacuum threshold go up and we end
up in a vicious circle. Right after pg_autovacuum performed a vacuum analyze
on the table it actually had 31000 records, but reltuples reports over 100k.
I'm not sure if this means anything, but I thought I would pass it along.
PG version 8.0.0, 31MB tarred DB.
[2005-03-25 09:16:14 EST] INFO:dbname: testing
[2005-03-25 09:16:14 EST] INFO:  oid: 9383816
[2005-03-25 09:16:14 EST] INFO:  username: (null)
[2005-03-25 09:16:14 EST] INFO:  password: (null)
[2005-03-25 09:16:14 EST] INFO:  conn is null, (not connected)
[2005-03-25 09:16:14 EST] INFO:  default_analyze_threshold: 1000
[2005-03-25 09:16:14 EST] INFO:  default_vacuum_threshold: 500
[2005-03-25 09:05:12 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 09:05:12 EST] INFO: relid: 9384219;   relisshared: 0
[2005-03-25 09:05:12 EST] INFO: reltuples: 49185.00;  relpages:
8423
[2005-03-25 09:05:12 EST] INFO: curr_analyze_count: 919274;
curr_vacuum_count: 658176
[2005-03-25 09:05:12 EST] INFO: last_analyze_count: 899272;
last_vacuum_count: 560541
[2005-03-25 09:05:12 EST] INFO: analyze_threshold: 49685;
vacuum_threshold: 100674
[2005-03-25 09:10:12 EST] DEBUG:   Performing: VACUUM ANALYZE
"public"."file_92"
[2005-03-25 09:10:33 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 09:10:33 EST] INFO: relid: 9384219;   relisshared: 0
[2005-03-25 09:10:33 EST] INFO: reltuples: 113082.00;  relpages:
6624
[2005-03-25 09:10:33 EST] INFO: curr_analyze_count: 923820;
curr_vacuum_count: 662699
[2005-03-25 09:10:33 EST] INFO: last_analyze_count: 923820;
last_vacuum_count: 662699
[2005-03-25 09:10:33 EST] INFO: analyze_threshold: 113582;
vacuum_threshold: 227164
[2005-03-25 09:16:14 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 09:16:14 EST] INFO: relid: 9384219;   relisshared: 0
[2005-03-25 09:16:14 EST] INFO: reltuples: 113082.00;  relpages:
6624  <-- Actually has 31k rows
[2005-03-25 09:16:14 EST] INFO: curr_analyze_count: 923820;
curr_vacuum_count: 662699
[2005-03-25 09:16:14 EST] INFO: last_analyze_count: 923820;
last_vacuum_count: 662699
[2005-03-25 09:16:14 EST] INFO: analyze_threshold: 113582;
vacuum_threshold: 227164
DETAIL:  Allocated FSM size: 1000 relations + 200 pages = 11784 kB
shared memory.

- Original Message -
From: "Matthew T. O'Connor" 
To: "Otto Blomqvist" <[EMAIL PROTECTED]>;

Sent: Thursday, March 24, 2005 3:58 PM
Subject: Re: [PERFORM] pg_autovacuum not having enough suction ?
 

I would rather keep this on list since other people can chime in.
Otto Blomqvist wrote:
   

It does not seem to be a Stats collector problem.
  oid    | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
---------+---------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
 9384219 | file_92 |         2200 |     8423 | f           |     49837 | public     |    158176 |    318527 |    158176
(1 row)
I insert 5 records

secom=# select createfile_92records(1, 5);   <--- this is a pg script that
                                                  inserts records 1 through 5.
 createfile_92records
----------------------
                    0
  oid    | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
---------+---------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
 9384219 | file_92 |         2200 |     8423 | f           |     49837 | public     |    208179 |    318932 |    158377
(1 row)
reltuples does not change? Hmm. n_tup_ins looks fine.

That is expected, reltuples only gets updated by a vacuum or an analyze.

This table is basically a queue full of records waiting to get transferred
over from our 68030 system to the PG database. The records are then moved
into folders (using a trigger) like file_92_myy depending on what month the
record was created on the 68030. During normal operations there should not
be more than 10 records at a time in the table, although during the course
of a day a normal system will get about 50k records. I create 5 records
to simulate incoming traffic, since we don't have much traffic in

Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
>>> Other than spec compliance, you mean?  SQL99 says
>>> 
>>> ... The declared type of each referencing column shall be
>>> comparable to the declared type of the corresponding referenced
>>> column.

> Tom had said SQL99 required this; I have pointed out SQL:2003, which
> supercedes the SQL99 standard, does not require this.

You're reading the wrong part of SQL:2003.  11.8 <referential constraint
definition> syntax rule 9 still has the text I quoted.

> Leading us back to my original point - what is the benefit of continuing
> with having a WARNING when that leads people into trouble later?

Accepting spec-compliant schemas.

regards, tom lane



Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 08:23 -0800, Stephan Szabo wrote:
> On Fri, 25 Mar 2005, Simon Riggs wrote:
> 
> > On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> > > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > > I vote to make this an ERROR in 8.1 - I see little benefit in allowing
> > > > this situation to continue.
> > >
> > > Other than spec compliance, you mean?  SQL99 says
> > >
> > > ... The declared type of each referencing column shall be
> > > comparable to the declared type of the corresponding 
> > > referenced
> > > column.
> > >
> > > It doesn't say that it has to be indexable, and most definitely not that
> > > there has to be an index.
> >
> > specs at dawn, eh?
> >
> > Well, SQL:2003 Foundation, p.550 clause 3a) states that the the
> >  in the referencing table must match a unique
> > constraint on the referenced table, or the PRIMARY KEY if the columns
> > are not specified. Either way, the referenced columns are a unique
> > constraint (which makes perfect sense from a logical data perspective).
> >
> > We implement unique constraints via an index, so for PostgreSQL the
> > clause implies that it must refer to an index.
> 
> IMHO, that reference is irrelevant.

Tom had said SQL99 required this; I have pointed out SQL:2003, which
supercedes the SQL99 standard, does not require this.

Leading us back to my original point - what is the benefit of continuing
with having a WARNING when that leads people into trouble later?

> Yes, there must be an index due to
> our implementation, however that doesn't imply that the types must be the
> same

No, it doesn't imply it, but what benefit do you see from the
interpretation that they are allowed to differ? That interpretation
currently leads to many mistakes that result in poor performance.

There is clear benefit from forcing them to be the same. In logical data
terms, they *should* be the same. I don't check fruit.apple_grade
against fruit_type.orange_grade. When would I want to make a check of
that nature? If there is a reason, that's great, let's keep the status quo
then.

I respect the effort and thought that has already gone into the
implementation; I seek only to offer a very minor improvement based upon
recent list issues.

> nor even that the index must be usable for the cross table
> comparison.

Thats a separate discussion, possibly the next one.

Best Regards, Simon Riggs






Re: [PERFORM] pg_autovacuum not having enough suction ?

2005-03-25 Thread Otto Blomqvist
It looks like the reltuples values are screwed up. Even though rows are
constantly being removed from the table the reltuples keep going up. If I
understand correctly that also makes the vacuum threshold go up and we end
up in a vicious circle. Right after pg_autovacuum performed a vacuum analyze
on the table it actually had 31000 records, but reltuples reports over 100k.
I'm not sure if this means anything, but I thought I would pass it along.

PG version 8.0.0, 31MB tarred DB.
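
Since pg_autovacuum's thresholds scale with reltuples (visible in the log
below), an inflated estimate drags the thresholds up with it. A minimal
sketch for comparing the planner's estimate with an actual count -- bearing
in mind that reltuples and relpages are only as fresh as the last VACUUM or
ANALYZE:

SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = 'file_92';

SELECT count(*) FROM file_92;   -- compare with reltuples above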

[2005-03-25 09:16:14 EST] INFO:dbname: testing
[2005-03-25 09:16:14 EST] INFO:  oid: 9383816
[2005-03-25 09:16:14 EST] INFO:  username: (null)
[2005-03-25 09:16:14 EST] INFO:  password: (null)
[2005-03-25 09:16:14 EST] INFO:  conn is null, (not connected)
[2005-03-25 09:16:14 EST] INFO:  default_analyze_threshold: 1000
[2005-03-25 09:16:14 EST] INFO:  default_vacuum_threshold: 500


[2005-03-25 09:05:12 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 09:05:12 EST] INFO:     relid: 9384219;   relisshared: 0
[2005-03-25 09:05:12 EST] INFO:     reltuples: 49185.00;  relpages: 8423
[2005-03-25 09:05:12 EST] INFO:     curr_analyze_count: 919274;  curr_vacuum_count: 658176
[2005-03-25 09:05:12 EST] INFO:     last_analyze_count: 899272;  last_vacuum_count: 560541
[2005-03-25 09:05:12 EST] INFO:     analyze_threshold: 49685;  vacuum_threshold: 100674


[2005-03-25 09:10:12 EST] DEBUG:   Performing: VACUUM ANALYZE "public"."file_92"
[2005-03-25 09:10:33 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 09:10:33 EST] INFO:     relid: 9384219;   relisshared: 0
[2005-03-25 09:10:33 EST] INFO:     reltuples: 113082.00;  relpages: 6624
[2005-03-25 09:10:33 EST] INFO:     curr_analyze_count: 923820;  curr_vacuum_count: 662699
[2005-03-25 09:10:33 EST] INFO:     last_analyze_count: 923820;  last_vacuum_count: 662699
[2005-03-25 09:10:33 EST] INFO:     analyze_threshold: 113582;  vacuum_threshold: 227164


[2005-03-25 09:16:14 EST] INFO:  table name: secom."public"."file_92"
[2005-03-25 09:16:14 EST] INFO:     relid: 9384219;   relisshared: 0
[2005-03-25 09:16:14 EST] INFO:     reltuples: 113082.00;  relpages: 6624   <-- Actually has 31k rows
[2005-03-25 09:16:14 EST] INFO:     curr_analyze_count: 923820;  curr_vacuum_count: 662699
[2005-03-25 09:16:14 EST] INFO:     last_analyze_count: 923820;  last_vacuum_count: 662699
[2005-03-25 09:16:14 EST] INFO:     analyze_threshold: 113582;  vacuum_threshold: 227164

DETAIL:  Allocated FSM size: 1000 relations + 200 pages = 11784 kB shared memory.




- Original Message -
From: "Matthew T. O'Connor" 
To: "Otto Blomqvist" <[EMAIL PROTECTED]>;

Sent: Thursday, March 24, 2005 3:58 PM
Subject: Re: [PERFORM] pg_autovacuum not having enough suction ?


> I would rather keep this on list since other people can chime in.
>
> Otto Blomqvist wrote:
>
> >It does not seem to be a Stats collector problem.
> >
> >   oid   | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
> >---------+---------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
> > 9384219 | file_92 |         2200 |     8423 | f           |     49837 | public     |    158176 |    318527 |    158176
> >(1 row)
> >
> >I insert 5 records
> >
> >secom=# select createfile_92records(1, 5);   <--- this is a pg script
> >that inserts records 1 through 5.
> > createfile_92records
> >--
> >0
> >
> >
> >   oid   | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
> >---------+---------+--------------+----------+-------------+-----------+------------+-----------+-----------+-----------
> > 9384219 | file_92 |         2200 |     8423 | f           |     49837 | public     |    208179 |    318932 |    158377
> >(1 row)
> >
> >reltuples does not change ? Hmm. n_tup_ins looks fine.
> >
> >
>
> That is expected, reltuples only gets updated by a vacuum or an analyze.
>
> >This table is basically a queue full of records waiting to get transferred
> >over from our 68030 system to the PG database. The records are then moved
> >into folders (using a trigger) like file_92_myy depending on what month the
> >record was created on the 68030. During normal operations there should not
> >be more than 10 records at a time in the table, although during the course
> >of a day a normal system will get about 50k records. I create 5 records
> >to simulate incoming traffic, since we don't have much traffic in the test
> >lab.
> >
> >After a few hours we have
> >
> >secom=# select count(*) from file_92;
> > count
> >---
> > 42072
> >
> >So we have sent over approx 8000 Records.
> >
> >   oid   | relname | relnamespace | relpages | relisshared | reltuples | schemaname | n_tup_ins | n_tup_upd | n_tup_del
>

Re: [PERFORM] Script for getting a table of response-time breakdown

2005-03-25 Thread Jack Xue
Josh,

The description of the Oracle script is:

This script can be used to focus tuning attention on the most important
issues. It reports a breakdown of total foreground response time into
four major categories: CPU usage, disk I/O, resource waits, and routine
latencies. These categories are broken down further into sub-categories,
and the component wait events are shown. 

The 8.1.6 version of the script uses the ratio_to_report analytic
function to calculate percentages. The 8.1.5 version can be used if
percentages are not required. The 8.1.5 version of the script should
work on Oracle8 also, but has not yet been tested.

The printout of this script is:
SQL> @response_time_breakdown

MAJOR    MINOR       WAIT_EVENT                       SECONDS     PCT
-------- ----------- ------------------------------- -------- -------
CPU time parsing     n/a                                  497    .57%
         reloads     n/a                                   13    .01%
         execution   n/a                                52618  59.99%

disk I/O normal I/O  db file sequential read            21980  25.06%
         full scans  db file scattered read              9192  10.48%
         direct I/O  direct path read                    1484   1.69%
                     direct path write                    354    .40%
         other I/O   log file sequential read               9    .01%
                     db file parallel read                  0    .00%
                     control file sequential read           0    .00%
                     control file parallel write            0    .00%

waits    DBWn writes rdbms ipc reply                      143    .16%
                     free buffer waits                     36    .04%
                     checkpoint completed                  31    .04%
         LGWR writes log file switch completion           698    .80%
         other locks latch free                           496    .57%
                     sort segment request                 108    .12%

latency  commits     log file sync                          6    .01%
         network     SQL*Net break/reset to client         18    .02%
                     SQL*Net more data to client            8    .01%
                     SQL*Net message to client              7    .01%
                     SQL*Net more data from client          3    .00%
         file ops    file open                              4    .01%
                     file identify                          1    .00%
         misc        instance state change                  0    .00%


The script is pretty long:

---
--
-- Script:  response_time_breakdown.sql
-- Purpose: to report the components of foreground response time in % terms
-- For: 8.0 to 8.1.5
--
-- Copyright:   (c) Ixora Pty Ltd
-- Author:  Steve Adams
--
---
@save_sqlplus_settings

column major      format a8
column wait_event format a40 trunc
column seconds    format 999
column pct        justify right
break on major skip 1 on minor

select
  substr(n_major, 3)  major,
  substr(n_minor, 3)  minor,
  wait_event,
  round(time/100)  seconds
from
  (
select /*+ ordered use_hash(b) */
  '1 CPU time'  n_major,
  decode(t.ksusdnam,
'redo size', '2 reloads',
'parse time cpu', '1 parsing',
'3 execution'
  )  n_minor,
  'n/a'  wait_event,
  decode(t.ksusdnam,
'redo size', nvl(r.time, 0),
'parse time cpu', t.ksusgstv - nvl(b.time, 0),
t.ksusgstv - nvl(b.time, 0) - nvl(r.time, 0)
  )  time
from
  sys.x_$ksusgsta  t,
  (
    select /*+ ordered use_nl(s) */   -- star query: few rows from d and b
  s.ksusestn,   -- statistic#
  sum(s.ksusestv)  time -- time used by backgrounds
from
  sys.x_$ksusd  d,  -- statname
  sys.x_$ksuse  b,  -- session
  sys.x_$ksbdp  p,  -- background process
  sys.x_$ksusesta  s-- sesstat
where
  d.ksusdnam in (
'parse time cpu',
'CPU used when call started') and
  b.ksspaown = p.ksbdppro and
  s.ksusestn = d.indx and
  s.indx = b.indx
  

Re: [PERFORM] Script for getting a table of response-time breakdown

2005-03-25 Thread Josh Berkus
Jack,

> This script can be used to focus tuning attention on the most important
> issues. It reports a breakdown of total foreground response time into
> four major categories: CPU usage, disk I/O, resource waits, and routine
> latencies. These categories are broken down further into sub-categories,
> and the component wait events are shown.

This would be very nice.  And very, very hard to build.

No, we don't have anything similar.  You can, of course, use profiling tools.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Script for getting a table of response-time breakdown

2005-03-25 Thread Josh Berkus
Jack,

> I am thinking about how to continuously monitor the performance of a
> PostgreSQL 8 database. I am interested in two things: (1) the growth of
> tables with TOAST and indexes;

This is queryable from the system tables, if you don't mind an approximation.  
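
A rough sketch of such a query (sizes come out in pages, assuming the
default 8 kB block size, and are only as current as the last VACUUM or
ANALYZE; file_92 is just a stand-in table name):

SELECT c.relname, c.relkind, c.relpages, c.relpages * 8 AS approx_kb
FROM pg_class c
WHERE c.oid = 'file_92'::regclass                      -- the table itself
   OR c.oid = (SELECT reltoastrelid FROM pg_class
               WHERE oid = 'file_92'::regclass)        -- its TOAST table
   OR c.oid IN (SELECT indexrelid FROM pg_index
                WHERE indrelid = 'file_92'::regclass); -- its indexes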

> and (2) the response time breakdown for a 
> query.

The what?  You mean EXPLAIN ANALYZE?
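
For a single query, EXPLAIN ANALYZE is the closest thing to a built-in
response-time breakdown: it reports the actual time spent in each plan node.
A trivial sketch, using the file_92 table from elsewhere in this digest as a
stand-in:

EXPLAIN ANALYZE SELECT count(*) FROM file_92;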

> In Chapters 23 and 24 of the big manual, I found enough materials to
> teach me how to do the 1st job. But I have difficulty with the 2nd one.
> I found some script for Oracle
> (http://www.ixora.com.au/scripts/waits.htm).

Life's too short for reading Oracle docs.   Can you just explain, in 
step-by-step detail, what you want?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Josh Berkus
Karim,

> I guess I was wondering if there is other general tuning advice for such
> large table indexes such as increasing statistics, etc.

Well, your index use problem is being explained by Tom, Stephan and Simon; 
basically your FKed data types are incompatible for index use purposes so the 
system *can't* use an index while loading.

If you're going with the drop/load/recreate option, then I'd suggest 
increasing work_mem for the duration.  Hmmm ... or maintenance_work_mem?  
What gets used for FK checks?  Simon?
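
A generic sketch of that route, with made-up table, column, and constraint
names, which also fixes the type mismatch so the recreated FK can use the
referenced PK index:

BEGIN;
ALTER TABLE child DROP CONSTRAINT child_parent_id_fkey;
-- ... bulk DELETE / reload here ...
ALTER TABLE child ALTER COLUMN parent_id TYPE integer;   -- match parent.id's type
ALTER TABLE child ADD CONSTRAINT child_parent_id_fkey
  FOREIGN KEY (parent_id) REFERENCES parent (id);
COMMIT;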

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Stephan Szabo

On Fri, 25 Mar 2005, Simon Riggs wrote:

> On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> > Simon Riggs <[EMAIL PROTECTED]> writes:
> > > I vote to make this an ERROR in 8.1 - I see little benefit in allowing
> > > this situation to continue.
> >
> > Other than spec compliance, you mean?  SQL99 says
> >
> > ... The declared type of each referencing column shall be
> > comparable to the declared type of the corresponding referenced
> > column.
> >
> > It doesn't say that it has to be indexable, and most definitely not that
> > there has to be an index.
>
> specs at dawn, eh?
>
> Well, SQL:2003 Foundation, p.550 clause 3a) states that the referenced
> columns named in the referencing table's FOREIGN KEY clause must match a
> unique constraint on the referenced table, or the PRIMARY KEY if the columns
> are not specified. Either way, the referenced columns are a unique
> constraint (which makes perfect sense from a logical data perspective).
>
> We implement unique constraints via an index, so for PostgreSQL the
> clause implies that it must refer to an index.

IMHO, that reference is irrelevant.  Yes, there must be an index due to
our implementation, however that doesn't imply that the types must be the
same, nor even that the index must be usable for the cross table
comparison.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 10:17 -0500, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I vote to make this an ERROR in 8.1 - I see little benefit in allowing
> > this situation to continue.
> 
> Other than spec compliance, you mean?  SQL99 says
> 
> ... The declared type of each referencing column shall be
> comparable to the declared type of the corresponding referenced
> column.
> 
> It doesn't say that it has to be indexable, and most definitely not that
> there has to be an index.

specs at dawn, eh?

Well, SQL:2003 Foundation, p.550 clause 3a) states that the referenced
columns named in the referencing table's FOREIGN KEY clause must match a
unique constraint on the referenced table, or the PRIMARY KEY if the columns
are not specified. Either way, the referenced columns are a unique
constraint (which makes perfect sense from a logical data perspective).

We implement unique constraints via an index, so for PostgreSQL the
clause implies that it must refer to an index. 

touche, Monsieur Lane and Happy Easter :-)

But even without that, there is little benefit in allowing it...

WARNING -> ERROR, please.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] [BUGS] BUG #1552: massive performance hit

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 03:50 -0700, Karim Nassar wrote:
> On Fri, 2005-03-25 at 10:18 +, Simon Riggs wrote:
> > > When running with PostgreSQL 7.4 on a dual-CPU Athlon MP2400+ machine 
> > > with a gigabyte of RAM, running Debian Linux version 2.6.8.1, we were 
> > > able to insert all this data in 5-7 minutes.  It's taken a while to 
> > > install Postgres 8.0.1 on the same machine, but now I have, and it's 
> > > taking 40-45 minutes to run the same insert script.  
> 
> 
> 
> > OK. Not-a-bug.
> >
> > Your situation is covered in the manual with some sage advice
> > http://www.postgresql.org/docs/8.0/static/populate.html
> > It doesn't go into great lengths about all the reasons why those
> > recommendations are good ones - but they are clear.

> Simon, this begs the question: what changed from 7.4->8.0 to require he
> modify his script?

Good question. Clearly, some combination of stats-plus-index-selection
code changed but I suspect this is a case of more, not less accuracy,
affecting us here.

The FK code literally generates SQL statements, then prepares them.
AFAICS it should be possible to add more code to 
src/backend/utils/adt/ri_triggers.c to force the prepare of the FK code to
avoid seq scans by executing "SET enable_seqscan = off;"
I'll have a play.
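
One way to see roughly what the FK trigger's prepared plan looks like is to
prepare a similar parameterized query by hand and EXPLAIN it (a sketch;
parent/id are illustrative names, and the real RI query is generated
internally rather than typed by hand):

PREPARE fk_check (integer) AS
  SELECT 1 FROM ONLY parent x WHERE x.id = $1 FOR UPDATE OF x;
EXPLAIN EXECUTE fk_check(42);
DEALLOCATE fk_check;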

But, the wider point raised by this is whether Prepare should be more
conservative in the plan it generates. When we Execute a single query,
it is perfectly OK to go for the "best" plan, since it is being executed
only this once and we can tell, right now, which one is the "best".

With a Prepared query, it is clearly going to be executed many times and
so we should consider that the optimal plan may change over time. 

Index access has more overhead for small tables, but its cost grows (I
think) only as log N as the number of rows in the table, N, increases.
Sequential scan cost grows linearly with N. Thus, as N increases from zero,
at first a seq scan is the best plan - though only marginally better than
index access - then at some value of N that changes, and after that index
access is the best plan. As N increases further, seq scan cost clearly
diverges badly from indexed access. 

The conservative choice for an unknown or varying N would be index access,
rather than the best plan available when the query is prepared.

I propose a more general TODO item:

* Make Prepared queries always use indexed access, if it is available

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> I vote to make this an ERROR in 8.1 - I see little benefit in allowing
> this situation to continue.

Other than spec compliance, you mean?  SQL99 says

... The declared type of each referencing column shall be
comparable to the declared type of the corresponding referenced
column.

It doesn't say that it has to be indexable, and most definitely not that
there has to be an index.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] Script for getting a table of response-time breakdown

2005-03-25 Thread Jack Xue
Hi,

I am thinking about how to continuously monitor the performance of a
PostgreSQL 8 database. I am interested in two things: (1) the growth of
tables with TOAST and indexes; and (2) the response time breakdown for a
query.

In Chapters 23 and 24 of the big manual, I found enough materials to
teach me how to do the 1st job. But I have difficulty with the 2nd one.
I found some script for Oracle
(http://www.ixora.com.au/scripts/waits.htm).

Do we have something that can do the same job for PostgreSQL 8? 

Thanks.

-Jack


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Simon Riggs
On Fri, 2005-03-25 at 01:58 -0500, Tom Lane wrote:
> Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> >> In that case there's a datatype mismatch between the referencing and
> >> referenced columns, which prevents the index from being used for the
> >> FK check.
> 
> > Is creating such a foreign key a WARNING yet?
> 
> I believe so as of 8.0.  It's a bit tricky since 8.0 does allow some
> cross-type cases to be indexed, but IIRC we have a test that understands
> about that...

src/backend/commands/tablecmds.c, line 3966 in CVSTIP
/*
 * Check that the found operator is compatible with the PK index,
 * and generate a warning if not, since otherwise costly seqscans
 * will be incurred to check FK validity.
 */
if (!op_in_opclass(oprid(o), opclasses[i]))
    ereport(WARNING,
            (errmsg("foreign key constraint \"%s\" "
                    "will require costly sequential scans",
                    fkconstraint->constr_name),
             errdetail("Key columns \"%s\" and \"%s\" "
                       "are of different types: %s and %s.",
                       strVal(list_nth(fkconstraint->fk_attrs, i)),
                       strVal(list_nth(fkconstraint->pk_attrs, i)),
                       format_type_be(fktypoid[i]),
                       format_type_be(pktypoid[i]))));

So, yes to the WARNING. Not sure about the cross-type cases...

Karim: Did this happen? If not, can you drop and re-create and confirm
that you get the WARNING? If not, we have problems.
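
Something along these lines (a sketch with illustrative names; the WARNING,
if it fires, appears at ADD CONSTRAINT time, and the ROLLBACK throws the
test away again):

BEGIN;
ALTER TABLE child DROP CONSTRAINT child_parent_id_fkey;
ALTER TABLE child ADD CONSTRAINT child_parent_id_fkey
  FOREIGN KEY (parent_id) REFERENCES parent (id);
-- expect: WARNING:  foreign key constraint "child_parent_id_fkey"
--                   will require costly sequential scans
ROLLBACK;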

I vote to make this an ERROR in 8.1 - I see little benefit in allowing
this situation to continue. If users do create a FK like this, it just
becomes another performance problem on list...

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] [BUGS] BUG #1552: massive performance hit between 7.4 and 8.0.1

2005-03-25 Thread Tom Lane
Simon Riggs <[EMAIL PROTECTED]> writes:
> I think we should spawn a TODO item from this:
> * Coerce FK lookups to always use an available index

No, we aren't doing that.

The correct TODO item is "Replan cached plans when table size has
changed a lot" which of course depends on having a framework to do
replanning at all.  I intend to take a look at that once Neil has
created such a framework ...

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] [BUGS] BUG #1552: massive performance hit between 7.4 and 8.0.1

2005-03-25 Thread Simon Riggs
On Wed, 2005-03-23 at 14:22 -0500, Keith Browne wrote:
> Simon Riggs wrote:
> 
> > The EXPLAINs you've enclosed are for SELECTs, yet your bug report
> > describes INSERTs as being the things that are slow.
> > [You may find better performance from using COPY]

> We're starting with an empty database, creating four tables, and 
> populating those tables with a total of 180,000-200,000 rows.  Each 
> table has a primary key, and several of the tables reference foreign 
> keys in other tables.  We've written a Python script, using psycopg, 
> which executes all the queries to create the tables and insert the rows. 
> The database is running on the same machine where the script runs.
> 
> I've seen similar performance when issuing a COMMIT after each 
> insertion, and also after batching insertions in blocks of 250 per 
> COMMIT, so batching the commits is not helping much.  I've looked at the 
> possibility of using COPY, but in our production environment it will be 
> prohibitive to build a flat file with all this data.  I'd rather 
> generate it on the fly, as we've been able to do with PostgreSQL 7.4.
> 
> > Also, your tests have compared two systems, so it might be that the
> > hardware or configuration of one system is different from the other. 
> 
> When running with PostgreSQL 7.4 on a dual-CPU Athlon MP2400+ machine 
> with a gigabyte of RAM, running Debian Linux version 2.6.8.1, we were 
> able to insert all this data in 5-7 minutes.  It's taken a while to 
> install Postgres 8.0.1 on the same machine, but now I have, and it's 
> taking 40-45 minutes to run the same insert script.  This is similar to 
> the performance we saw on another machine, a fast single-CPU AMD64 box 
> running Gentoo.
> 
> I don't think it's a hardware issue.  I dug around a bit, and found 
> suggestions that this sort of problem could be worked around by breaking 
> the database connection and restarting it after the tables had been 
> partially filled.  I modified our script to break and re-establish the 
> database connection when each table first has 4,000 records inserted, 
> and the performance is greatly improved; it now takes only about 3.5 
> minutes to insert 180,000+ rows.
> 
> I've since modified this script to build and populate a fifth table with 
> over 1.3 million rows.  The fifth table has no primary key, but lists a 
> foreign key into one of the first four tables.  With the above 
> modification (break and re-build the DB connection after 4,000 rows have 
> been inserted), the whole database can be populated in about 15 minutes. 
>   I wouldn't have dared try to build a one-million-plus-row table until 
> I found this speed-up.
> 
> > If you could repeat the test on one single system, then this would
> > assist in the diagnosis of this bug report. Also, if you could describe
> > the workload that is giving you a problem more exactly, that would help.
> > Specifically, can you confirm that you have run ANALYZE on the tables,
> > and also give us some idea of numbers of rows in each table at the time
> > you first run your programs.
> 
> Just to see if it would help, I tried modifying the script to run an 
> ANALYZE against each table after 4,000 insertions, instead of breaking 
> and re-establishing the DB connection.  I still saw ~45-minute times to 
> insert 180,000 rows.  I then tried running ANALYZE against each table 
> after *each* 4,000 rows inserted, and again, it took about 45 minutes to 
> run the insert.
> 
> Each table is empty when I first run the program.  I am dropping and 
> re-creating the database for each test run.
> 
> > There is clearly a problem, but it is not yet clearly a bug. If it is a
> > bug, we're interested in solving it as much as you.
> 
> I'd be happy to run further tests or provide more details, if they'll 
> help.  We now have a workaround which is allowing us to proceed with our 
> project, but I'd like to know if there's another way to do this.  While 
> I understand that large or complex databases require careful tuning, I 
> was surprised to see a six- or seven-fold increase in run times between 
> PostgreSQL 7.4 and 8.0.1 on the same hardware, on an operation which 
> seems fairly straightforward: populating an empty table.
> 
> One other thing which puzzled me: as a test, I tried modifying our 
> script to spit out raw SQL statements instead of connecting to the 
> database and performing the inserts itself.  Normally, our script 
> populates two tables in one pass, and then populates the third and 
> fourth tables in a second pass.  I massaged the SQL by hand to group the 
> inserts together by table, so that the first table would be entirely 
> populated, then the second, etc.  When I ran this SQL script by piping 
> it straight into psql, it finished in about four minutes.  This is 
> comparable to the time it takes to run my modified script which breaks 
> and re-establishes the connection to the database.

OK. Not-a-bug.

Your situation is covered in the manual