Re: [PERFORM] Is DBLINK transactional

2010-03-13 Thread Craig Ringer

On 13/03/2010 5:54 AM, Jeff Davis wrote:

On Fri, 2010-03-12 at 12:07 -0500, Merlin Moncure wrote:

of course.  You can always explicitly open a transaction on the remote
side over dblink, do work, and commit it at the last possible moment.
Your transactions aren't perfectly synchronized... if you crash at the
precise moment between committing the remote and the local transaction,
you can get into trouble. The chances of this are extremely remote, though.
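
For illustration, a minimal sketch of that pattern (the connection
name, connection string, and remote statement here are invented):

  SELECT dblink_connect('remote', 'host=otherhost dbname=mydb');
  SELECT dblink_exec('remote', 'BEGIN');
  SELECT dblink_exec('remote', 'INSERT INTO remote_tbl VALUES (1)');
  -- ... do the local work inside the local transaction ...
  SELECT dblink_exec('remote', 'COMMIT');  -- remote commit, last moment
  COMMIT;  -- local commit; a crash between these two loses atomicity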


If you want a better guarantee than that, consider using 2PC.


Translation in case you don't know: 2PC = two phase commit.

Note that you have to monitor for lost transactions that were prepared 
for commit and then abandoned by the controlling app, and periodically 
get rid of them, or you'll start having issues.
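
A minimal sketch, assuming the remote transaction opened over dblink
as above and max_prepared_transactions set > 0 on the remote server
(the GID 'gid_42' is made up):

  SELECT dblink_exec('remote', 'PREPARE TRANSACTION ''gid_42''');
  COMMIT;  -- commit locally first
  SELECT dblink_exec('remote', 'COMMIT PREPARED ''gid_42''');

  -- periodically look for abandoned prepared transactions:
  SELECT gid, prepared, owner, database
  FROM pg_prepared_xacts
  WHERE prepared < now() - interval '10 minutes';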



The problem with things that are extremely remote possibilities is
that they tend to be less remote than we expect ;)


... and they know just when they can happen despite all the odds to 
maximise the pain and chaos caused.


--
Craig Ringer



[PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-13 Thread fka...@googlemail.com
Hi all,

my posting on 2010-01-14 about the performance of writing bytea to
disk caused a longer discussion. While it still holds that the overall
PostgreSQL write performance is roughly 25% of the serial disk I/O
performance, in my special use case this was compensated for by doing
some other, non-PostgreSQL-related work in parallel.

Now I cannot optimize my processes any further; however, I am facing
another, quite unexpected performance issue: deleting rows from my
simple table (with the bytea column), each holding 16 MB of data,
takes roughly as long as writing them!

A little more detail:

* The table has just 5 unused int columns, a timestamp, OIDs, and the
bytea column, no indices; the bytea storage type is 'extended', and
the 16 MB are compressed to approximately half that size (a rough
sketch of the table follows after this list).

* All the usual optimizations have been done to achieve better write
throughput (pg_xlog on another disk, many tweaks to the server conf
etc.); however, this does not matter here, since what is of interest
is not the absolute performance but the fact that deleting takes
roughly 100% of the writing time.

* I need to write 15 rows of 16 MB each to disk within a maximum of
15 s, which is done here in roughly 10 seconds; however, I am now
facing the problem that keeping my database tidy (deleting rows)
takes another 5-15 s (10 s on average), so my process exceeds the
15 s maximum by about 5 s.

* Right now I am deleting like this:

DELETE FROM table WHERE (CURRENT_TIMESTAMP -
my_timestamp_column) > interval '2 minutes';

while it is planned to have the interval set to 6 hours in the final
version (thus creating a FIFO buffer for the latest 6 hours of
inserted data; the FIFO will then keep approx. 10,000 rows spanning
160-200 GB of data).

* This deletion SQL command was simply executed repeatedly from
pgAdmin while my app kept adding the 16 MB rows.

* Autovacuum is on; I believe I need to keep it on, otherwise the
disk space is never freed, right? If I switch it off, the deletion
time drops from the 10 s average down to 4 s.

* I am using server + libpq version 8.2.4, currently on
WinXP. Will an upgrade to 8.4 help here?
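
For reference, a rough sketch of the table (all names changed here;
the exact shape of the OID column(s) is from memory):

  CREATE TABLE image_buffer (
      a int, b int, c int, d int, e int,  -- the 5 unused int columns
      created timestamp DEFAULT CURRENT_TIMESTAMP,
      lo_ref oid,                         -- the OID column(s)
      data bytea                          -- ~16 MB per row
  );
  ALTER TABLE image_buffer ALTER COLUMN data SET STORAGE EXTENDED;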

Do you have any other ideas to help me out?
Oh, please...

Thank You
 Felix







Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-13 Thread Dave Crooke
Hi there

I'm not an expert on PG's toast system, but a couple of thoughts inline
below.

Cheers
Dave

On Sat, Mar 13, 2010 at 3:17 PM, fka...@googlemail.com wrote:

 Hi all,

 my posting on 2010-01-14 about the performance of writing bytea to
 disk caused a longer discussion. While it still holds that the overall
 PostgreSQL write performance is roughly 25% of the serial disk I/O
 performance, in my special use case this was compensated for by doing
 some other, non-PostgreSQL-related work in parallel.

 Now I cannot optimize my processes any further; however, I am facing
 another, quite unexpected performance issue: deleting rows from my
 simple table (with the bytea column), each holding 16 MB of data,
 takes roughly as long as writing them!

 A little more detail:

 * The table has just 5 unused int columns, a timestamp, OIDs, and the
 bytea column, *no indices*; the bytea storage type is 'extended', and
 the 16 MB are compressed to approximately half that size.


Why no indices?



 * All the usual optimizations have been done to achieve better write
 throughput (pg_xlog on another disk, many tweaks to the server conf
 etc.); however, this does not matter here, since what is of interest
 is not the absolute performance but the fact that deleting takes
 roughly 100% of the writing time.

 * I need to write 15 rows of 16 MB each to disk within a maximum of
 15 s, which is done here in roughly 10 seconds; however, I am now
 facing the problem that keeping my database tidy (deleting rows)
 takes another 5-15 s (10 s on average), so my process exceeds the
 15 s maximum by about 5 s.

 * Right now I am deleting like this:

 DELETE FROM table WHERE (CURRENT_TIMESTAMP -
 my_timestamp_column) > interval '2 minutes';


You *need* an index on my_timestamp_column.
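
For instance (hypothetical index name):

  CREATE INDEX my_table_ts_idx ON my_table (my_timestamp_column);

Without it, every DELETE has to scan the entire table to find the
rows past the cutoff.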



 while it is planned to have the interval set to 6 hours in the final
 version (thus creating a FIFO buffer for the latest 6 hours of
 inserted data; the FIFO will then keep approx. 10,000 rows spanning
 160-200 GB of data).


That's not the way to keep a 6-hour rolling buffer ... what you need to
do is run the delete frequently, with *> interval '6 hours'* in the SQL
acting as the cutoff.
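
Something like this, run every minute or so (table name hypothetical);
comparing the column against a computed cutoff, rather than computing
an age for every row, also lets the planner use the timestamp index:

  DELETE FROM my_table
  WHERE my_timestamp_column < CURRENT_TIMESTAMP - interval '6 hours';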

If you really do want to drop the entire table contents before refilling it,
do a *DROP TABLE* and recreate it.


 * This deletion SQL command was simply executed repeatedly from
 pgAdmin while my app kept adding the 16 MB rows.


Are you sure you are timing the delete, and not pgAdmin re-populating some
kind of buffer?



 * Autovacuum is on; I believe I need to keep it on, otherwise the
 disk space is never freed, right? If I switch it off, the deletion
 time drops from the 10 s average down to 4 s.


You may be running autovacuum too aggressively; it may be interfering
with I/O to the tables.

Postgres vacuuming does not free disk space (in the sense of returning it to
the OS), it removes old versions of rows that have been UPDATEd or DELETEd
and makes that space in the table file available for new writes.
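
If autovacuum turns out to be the culprit, it can be throttled per
table. A sketch, assuming 8.4 or later, where these are storage
parameters (on 8.2 the equivalent per-table knobs live in the
pg_autovacuum system catalog instead):

  ALTER TABLE my_table SET (
      autovacuum_vacuum_cost_delay = 20,    -- sleep (ms) between I/O batches
      autovacuum_vacuum_scale_factor = 0.2  -- vacuum after ~20% dead rows
  );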


 * I am using server + libpq version 8.2.4, currently on
 WinXP. Will an upgrade to 8.4 help here?


8.4 has a lot of performance improvements. It's definitely worth a shot. I'd
also consider switching to another OS where you can use a 64-bit version of
PG and a much bigger buffer cache.


 Do you have any other ideas to help me out?
 Oh, please...

 Thank You
  Felix




