Re: [GENERAL] Temporary tables and disk activity

2004-12-13 Thread Tom Lane
Phil Endecott <[EMAIL PROTECTED]> writes:
> What would happen if I were to rollback at the end of the transaction, 
> rather than committing (having made no changes)?  Would that eliminate 
> some or all of the catalog writes?

It would avoid fsync'ing the changes at commit time, but not really
reduce the write volume per se.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [GENERAL] Temporary tables and disk activity

2004-12-13 Thread Phil Endecott
Tom Lane wrote:
Phil Endecott <[EMAIL PROTECTED]> writes:
Does this make sense?  I imagine that the temporary table is being added 
to these tables and then removed again.

Yes, a temp table has the same catalog infrastructure as a regular
table, so creation and deletion of a temp table will cause some activity
in those catalogs.  I thought you were concerned about the data within
the temp table, though.
I'm concerned about unnecessary disk activity, whatever its cause. I 
guessed that it was the temp table contents.

I do have quite a large number of tables in the database; I have one 
schema per user and of the order of 20 tables per user and 200 users.  I 
can imagine that in a system with fewer tables this would be 
insignificant, yet in my case it seems to be writing of the order of a 
megabyte in each 5-second update.

That seems like a lot.  How often do you create/delete temp tables?
Only once or twice per 5-sec update period.  I agree that it sounds like 
a lot which makes me think this could all be a red herring;  I suspect 
that there is something else going on as well as this temp table stuff 
(possibly nothing to do with postgresql).  But FYI this is treefic.com, 
a family tree website.  Have a look at, for example, 
http://treefic.com/treefic/royal92?a=tree_page&root_id=10286&direction=up
The first step in building that diagram is to find the ancestors of the 
root individual.  I have a pl_pgsql function that itteratively finds all 
of the ancestors, progressively adding them to a temporary table.  So it 
will create, populate, read and then drop one table for each page that 
it generates.  This is reasonably fast; overall speed is not limited by 
postgres.

What would happen if I were to rollback at the end of the transaction, 
rather than committing (having made no changes)?  Would that eliminate 
some or all of the catalog writes?

Many thanks for helping me understand this.
Regards,
Phil.
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [GENERAL] Temporary tables and disk activity

2004-12-13 Thread Tom Lane
Phil Endecott <[EMAIL PROTECTED]> writes:
> Does this make sense?  I imagine that the temporary table is being added 
> to these tables and then removed again.

Yes, a temp table has the same catalog infrastructure as a regular
table, so creation and deletion of a temp table will cause some activity
in those catalogs.  I thought you were concerned about the data within
the temp table, though.

> I do have quite a large number of tables in the database; I have one 
> schema per user and of the order of 20 tables per user and 200 users.  I 
> can imagine that in a system with fewer tables this would be 
> insignificant, yet in my case it seems to be writing of the order of a 
> megabyte in each 5-second update.

That seems like a lot.  How often do you create/delete temp tables?

> I should mention that I ANALYSE the temporary table after creating it 
> and before using it for anything;  I'm not sure if this does any good 
> but I put it in as it "couldn't do any harm".

This is a good idea (if you analyze after filling the table) ... but it
will cause catalog traffic too, because again the pg_statistic rows go
into the regular pg_statistic catalog.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Temporary tables and disk activity

2004-12-13 Thread Phil Endecott
Hi Tom,
I thought of a quicker way to investiage this than strace and did an ls 
-lt in the data directory and looked up the tables that seem to change 
on every transaction in pg_class.  They are the catalog tables:

# ls -lt /var/lib/postgres/data/base/17142/
total 530108
-rw---1 postgres postgres  6488064 Dec 13 18:44 1259
-rw---1 postgres postgres  3670016 Dec 13 18:44 1247
-rw---1 postgres postgres 38715392 Dec 13 18:44 1249
-rw---1 postgres postgres  3317760 Dec 13 18:44 16390
-rw---1 postgres postgres 13467648 Dec 13 18:44 16599
-rw---1 postgres postgres 16957440 Dec 13 18:44 16610
-rw---1 postgres postgres  4808704 Dec 13 18:44 16613
-rw---1 postgres postgres 17072128 Dec 13 18:44 16624
-rw---1 postgres postgres 14352384 Dec 13 18:44 16625
-rw---1 postgres postgres   483328 Dec 13 18:44 16630
-rw---1 postgres postgres  2228224 Dec 13 18:44 16652
-rw---1 postgres postgres  5742592 Dec 13 18:44 16653
-rw---1 postgres postgres 63578112 Dec 13 18:44 16609
-rw---1 postgres postgres 13787136 Dec 13 18:44 16614
-rw---1 postgres postgres   483328 Dec 13 18:44 16629
=> select pc.relfilenode, pc.relname, pn.nspname from pg_class pc join 
pg_namespace pn on (pc.relnamespace=pn.oid) where pc.relfilenode in 
('1259','1247','1249','16390','16599','16610','16613','16624','16625','16630','16652','16653','16609','16614','16629');

 relfilenode | relname |  nspname
-+-+
   16599 | pg_depend   | pg_catalog
   16390 | pg_index| pg_catalog
1259 | pg_class| pg_catalog
1249 | pg_attribute| pg_catalog
1247 | pg_type | pg_catalog
   16653 | pg_type_typname_nsp_index   | pg_catalog
   16652 | pg_type_oid_index   | pg_catalog
   16630 | pg_index_indexrelid_index   | pg_catalog
   16629 | pg_index_indrelid_index | pg_catalog
   16625 | pg_depend_reference_index   | pg_catalog
   16624 | pg_depend_depender_index| pg_catalog
   16614 | pg_class_relname_nsp_index  | pg_catalog
   16613 | pg_class_oid_index  | pg_catalog
   16610 | pg_attribute_relid_attnum_index | pg_catalog
   16609 | pg_attribute_relid_attnam_index | pg_catalog
(15 rows)
Does this make sense?  I imagine that the temporary table is being added 
to these tables and then removed again.

I do have quite a large number of tables in the database; I have one 
schema per user and of the order of 20 tables per user and 200 users.  I 
can imagine that in a system with fewer tables this would be 
insignificant, yet in my case it seems to be writing of the order of a 
megabyte in each 5-second update.

I should mention that I ANALYSE the temporary table after creating it 
and before using it for anything;  I'm not sure if this does any good 
but I put it in as it "couldn't do any harm".

Any thoughts?
Regards,
Phil.
Tom Lane wrote:
Phil Endecott <[EMAIL PROTECTED]> writes:
Tom Lane wrote:
In principle, therefore, the kernel could hold temp table data in its
own disk buffers and never write it out to disk until the file is
deleted.  In practice, of course, the kernel doesn't know the data is
transient and will probably push it out whenever it has nothing else to
do.

That makes sense.  I suspect that I am seeing writes every 5 seconds, 
which looks like bdflush / update.

But my connections normally only last for a second at most.  In this 
case, surely the table would normally have been deleted before the 
kernel decided to write anything.

That does seem a bit odd, then.  Can you strace a typical backend
session and see if it's doing anything to force a disk write?
(I'm too lazy to go check right now whether 7.4 handled temp tables
exactly the same as CVS tip does.  I think it's the same but I might
be wrong.)
regards, tom lane


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] Temporary tables and disk activity

2004-12-13 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> One of the things on the TODO list is making the size of temp-table
> >> buffers user-configurable.  (Temp table buffers are per-backend, they
> >> are not part of the shared buffer arena.)  With a large temp-table arena
> >> we'd never need to write to the kernel in the first place.  Right now
> >> you could manually increase the #define that sets it, but it would not
> >> pay to make it very large because the management algorithms are very
> >> stupid (linear scans).  That has to be fixed first :-(
> 
> > I assume you mean your TODO list because the official one has no mention
> > of this.
> 
> Doesn't it?  We've surely discussed the problem enough times, eg
> http://archives.postgresql.org/pgsql-hackers/2002-08/msg00380.php
> http://archives.postgresql.org/pgsql-hackers/2002-09/msg01368.php
> or for that matter here's Vadim complaining about it seven years ago:
> http://archives.postgresql.org/pgsql-hackers/1997-12/msg00215.php

OK, added:

* Allow the size of the buffer cache used by temporary objects to be
  specified as a GUC variable

  Larger local buffer cache sizes requires more efficient handling of
  local cache lookups.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [GENERAL] Temporary tables and disk activity

2004-12-12 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> One of the things on the TODO list is making the size of temp-table
>> buffers user-configurable.  (Temp table buffers are per-backend, they
>> are not part of the shared buffer arena.)  With a large temp-table arena
>> we'd never need to write to the kernel in the first place.  Right now
>> you could manually increase the #define that sets it, but it would not
>> pay to make it very large because the management algorithms are very
>> stupid (linear scans).  That has to be fixed first :-(

> I assume you mean your TODO list because the official one has no mention
> of this.

Doesn't it?  We've surely discussed the problem enough times, eg
http://archives.postgresql.org/pgsql-hackers/2002-08/msg00380.php
http://archives.postgresql.org/pgsql-hackers/2002-09/msg01368.php
or for that matter here's Vadim complaining about it seven years ago:
http://archives.postgresql.org/pgsql-hackers/1997-12/msg00215.php

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] Temporary tables and disk activity

2004-12-12 Thread Bruce Momjian
Tom Lane wrote:
> Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> > I don't think temporary tables have any special rules regarding disk
> > writes, so I'd expect them ot get written out like everything else.
> 
> They'll be written out from PG's internal buffers, but IIRC they will
> never be fsync'd, and they definitely aren't WAL-logged.  (These
> statements hold true in 8.0, but not sure how far back.)
> 
> In principle, therefore, the kernel could hold temp table data in its
> own disk buffers and never write it out to disk until the file is
> deleted.  In practice, of course, the kernel doesn't know the data is
> transient and will probably push it out whenever it has nothing else to
> do.
> 
> One of the things on the TODO list is making the size of temp-table
> buffers user-configurable.  (Temp table buffers are per-backend, they
> are not part of the shared buffer arena.)  With a large temp-table arena
> we'd never need to write to the kernel in the first place.  Right now
> you could manually increase the #define that sets it, but it would not
> pay to make it very large because the management algorithms are very
> stupid (linear scans).  That has to be fixed first :-(

I assume you mean your TODO list because the official one has no mention
of this.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [GENERAL] Temporary tables and disk activity

2004-12-12 Thread Tom Lane
Phil Endecott <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> In principle, therefore, the kernel could hold temp table data in its
>> own disk buffers and never write it out to disk until the file is
>> deleted.  In practice, of course, the kernel doesn't know the data is
>> transient and will probably push it out whenever it has nothing else to
>> do.

> That makes sense.  I suspect that I am seeing writes every 5 seconds, 
> which looks like bdflush / update.

> But my connections normally only last for a second at most.  In this 
> case, surely the table would normally have been deleted before the 
> kernel decided to write anything.

That does seem a bit odd, then.  Can you strace a typical backend
session and see if it's doing anything to force a disk write?

(I'm too lazy to go check right now whether 7.4 handled temp tables
exactly the same as CVS tip does.  I think it's the same but I might
be wrong.)

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [GENERAL] Temporary tables and disk activity

2004-12-12 Thread Phil Endecott
Tom Lane wrote:
They [temporary tables]
> will be written out from PG's internal buffers, but IIRC they will
never be fsync'd, and they definitely aren't WAL-logged.  (These
statements hold true in 8.0, but not sure how far back.)
In principle, therefore, the kernel could hold temp table data in its
own disk buffers and never write it out to disk until the file is
deleted.  In practice, of course, the kernel doesn't know the data is
transient and will probably push it out whenever it has nothing else to
do.
That makes sense.  I suspect that I am seeing writes every 5 seconds, 
which looks like bdflush / update.

But my connections normally only last for a second at most.  In this 
case, surely the table would normally have been deleted before the 
kernel decided to write anything.  This is with 7.4.2 on linux 2.4.26. 
Does anyone have any experience with this type of situation?  Is there 
any kernel-tweaking I can play with?

Regards,
Phil.
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [GENERAL] Temporary tables and disk activity

2004-12-12 Thread Tom Lane
Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> I don't think temporary tables have any special rules regarding disk
> writes, so I'd expect them ot get written out like everything else.

They'll be written out from PG's internal buffers, but IIRC they will
never be fsync'd, and they definitely aren't WAL-logged.  (These
statements hold true in 8.0, but not sure how far back.)

In principle, therefore, the kernel could hold temp table data in its
own disk buffers and never write it out to disk until the file is
deleted.  In practice, of course, the kernel doesn't know the data is
transient and will probably push it out whenever it has nothing else to
do.

One of the things on the TODO list is making the size of temp-table
buffers user-configurable.  (Temp table buffers are per-backend, they
are not part of the shared buffer arena.)  With a large temp-table arena
we'd never need to write to the kernel in the first place.  Right now
you could manually increase the #define that sets it, but it would not
pay to make it very large because the management algorithms are very
stupid (linear scans).  That has to be fixed first :-(

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [GENERAL] Temporary tables and disk activity

2004-12-12 Thread Martijn van Oosterhout
I don't think temporary tables have any special rules regarding disk
writes, so I'd expect them ot get written out like everything else. The
database doesn't know you're going to delete them later.

Are the tables big?

On Thu, Dec 09, 2004 at 10:10:21PM +, Phil Endecott wrote:
> Dear All,
> 
> I sent a message last weekend asking about temporary tables being 
> written to disk but didn't get any replies.  I'm sure there is someone 
> out there who knows something about this - please help!  Here is the 
> question again:
> 
> 
> Looking at vmstat output on my database server I have been suprised to 
> see lots of disk writes going on while it is doing what should be 
> exclusively read-only transactions. I see almost no disk reads as the 
> database concerned is small enough to fit into the OS disk cache.
> 
> I suspect that it might be something to do with temporary tables. There 
> are a couple of places where I create temporary tables to "optimise" 
> queries by factoring out what would otherwise be duplicate work. The 
> amount of data being written is of the right order of magnitude for this 
> to be the cause. I fear that perhaps Postgresql is flushing these tables 
> to disk, even though they will be dropped before the end of the 
> transaction. Is this a possibility? What issues should I be aware of 
> with temporary tables? Are there any other common causes of lots of disk 
> writes within read-only transactions? Is there any debug output that I 
> can look at to track this down?
> 
> Thanks in advance for any help that you can offer.
> 
> Regards,
> 
> Phil Endecott.
> 
> 
> 
> 
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
> 
>   http://www.postgresql.org/docs/faqs/FAQ.html

-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


pgpn91bIG0wEe.pgp
Description: PGP signature


[GENERAL] Temporary tables and disk activity

2004-12-10 Thread Phil Endecott
Dear All,
I sent a message last weekend asking about temporary tables being 
written to disk but didn't get any replies.  I'm sure there is someone 
out there who knows something about this - please help!  Here is the 
question again:

Looking at vmstat output on my database server I have been suprised to 
see lots of disk writes going on while it is doing what should be 
exclusively read-only transactions. I see almost no disk reads as the 
database concerned is small enough to fit into the OS disk cache.

I suspect that it might be something to do with temporary tables. There 
are a couple of places where I create temporary tables to "optimise" 
queries by factoring out what would otherwise be duplicate work. The 
amount of data being written is of the right order of magnitude for this 
to be the cause. I fear that perhaps Postgresql is flushing these tables 
to disk, even though they will be dropped before the end of the 
transaction. Is this a possibility? What issues should I be aware of 
with temporary tables? Are there any other common causes of lots of disk 
writes within read-only transactions? Is there any debug output that I 
can look at to track this down?

Thanks in advance for any help that you can offer.
Regards,
Phil Endecott.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


[GENERAL] Temporary tables and disk activity

2004-12-04 Thread Phil Endecott
Dear Postgresql experts,
Looking at vmstat output on my database server I have been suprised to 
see lots of disk writes going on while it is doing what should be 
exclusively read-only transactions.  I see almost no disk reads as the 
database concerned is small enough to fit into the OS disk cache.

I suspect that it might be something to do with temporary tables.  There 
are a couple of places where I create temporary tables to "optimise" 
queries by factoring out what would otherwise be duplicate work.  The 
amount of data being written is of the right order of magnitude for this 
to be the cause.  I fear that perhaps Postgresql is flushing these 
tables to disk, even though they will be dropped before the end of the 
transaction.  Is this a possibility?  What issues should I be aware of 
with temporary tables?  Are there any other common causes of lots of 
disk writes within read-only transactions?  Is there any debug output 
that I can look at to track this down?

Thanks in advance for any help that you can offer.
Regards,
Phil Endecott.

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org