Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread VJK
A quick test:
-
1. create table x1(x int, y bytea);
2. Load some data  say with python:
cp /opt/java/src.zip ~/tmp/a.dat (19MB)
##
import psycopg2
conn = psycopg2.connect(dbname='test' user='*' password=''
host='127.0.0.1');
conn.cursor().execute(INSERT INTO x1 VALUES (1, %s),
(psycopg2.Binary(open(a.dat).read()),))
conn.commit()
##
3. create table x2(x int, y bytea);
4. Copy table x1 100 times to x2 (1.9GB) and monitor/measure IO:
insert into x2 select a x, y from generate_series(1,100) a, x1;

Results:
---
On Linux 2.6.32 with an ext3 file system on one 15K rpm disk, we saw with
SystemTap that the source 1.9GB  (19MB x 100) resulted in 5GB of actual disk
IO and took 61 seconds (52 CPU + 9 sleep/wait for IO)

Deletion (delete from x2) took 32 seconds with 12 seconds CPU and 20 sec
sleep + wait for IO. Actual disk IO was about 4GB.

Since Pg does not use the concept of rollback segments,  it is unclear why
deletion produces so much disk IO (4GB).

VJ



On Sat, Mar 13, 2010 at 5:17 PM, fka...@googlemail.com 
fka...@googlemail.com wrote:

 Hi all,

 my posting on 2010-01-14 about the performance when writing
 bytea to disk caused a longer discussion. While the fact
 still holds that the overall postgresql write performance is
 roughly 25% of the serial I/O disk performance this was
 compensated for my special use case here by doing some other
 non-postgresql related things in parallel.

 Now I cannot optimize my processes any further, however, now
 I am facing another quite unexpected performance issue:
 Deleting rows from my simple table (with the bytea column)
 having 16 MB data each, takes roughly as long as writing
 them!

 Little more detail:

 * The table just has 5 unused int columns, a timestamp,
 OIDs, and the bytea column, no indices; the bytea storage
 type is 'extended', the 16 MB are compressed to approx. the
 half.

 * All the usual optimizations are done to reach better
 write through (pg_xlog on another disk, much tweaks to the
 server conf etc), however, this does not matter here, since
 not the absolute performance is of interest here but the
 fact that deleting roughly takes 100% of the writing time.

 * I need to write 15 rows of 16 MB each to disk in a maximum
 time of 15 s, which is performed here in roughly 10 seconds,
 however, now I am facing the problem that keeping my
 database tidy (deleting rows) takes another 5-15 s (10s on
 average), so my process exceeds the maximum time of 15s for
 about 5s.

 * Right now I am deleting like this:

 DELETE FROM table WHERE (CURRENT_TIMESTAMP -
 my_timestamp_column)  interval '2 minutes';

 while it is planned to have the interval set to 6 hours in
 the final version (thus creating a FIFO buffer for the
 latest 6 hours of inserted data; so the FIFO will keep
 approx.  10.000 rows spanning 160-200 GB data).

 * This deletion SQL command was simply repeatedly executed
 by pgAdmin while my app kept adding the 16 MB rows.

 * Autovacuum is on; I believe I need to keep it on,
 otherwise I do not free the disk space, right? If I switch
 it off, the deletion time reduces from the average 10s down
 to 4s.

 * I am using server + libpq version 8.2.4, currently on
 WinXP. Will an upgrade to 8.4 help here?

 Do you have any other ideas to help me out?
 Oh, please...

 Thank You
  Felix





 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Kevin Grittner
fka...@googlemail.com fka...@googlemail.com wrote:
 
 Simply because the test case had just  50 rows (deleting
 all rows older than 2 minues). Later on I would use indices.
 
Running a performance test with 50 rows without indexes and
extrapolating to a much larger data set with indexes won't tell you
much.  The plans chosen by the PostgreSQL optimizer will probably be
completely different, and the behavior of the caches (at all levels)
will be very different.
 
  while it is planned to have the interval set to 6 hours in
  the final version (thus creating a FIFO buffer for the
  latest 6 hours of inserted data; so the FIFO will keep
  approx.  10.000 rows spanning 160-200 GB data).
 
This might lend itself to partitioning.  Dropping a partition
containing data older than six hours would be very fast.  Without
knowing what kinds of queries you want to run on the data, it's hard
to predict the performance impact on your other operations, though.
 
  * Autovacuum is on; I believe I need to keep it on,
  otherwise I do not free the disk space, right? If I switch
  it off, the deletion time reduces from the average 10s down
  to 4s.
 
 
 You may be running autovaccum too aggressively, it may be
 interfering with I/O to the tables.
 
 Hm, so would should I change then? I wonder if it helps to
 run autovacuum less aggressive if there will not be a
 situation were the whole process is stopped for a while. But
 I'd like to understand what to change here.
 
I'd be very careful about this, I've seen performance problems more
often (and more dramatic) from not running it aggressively enough. 
Monitor performance and bloat closely when you adjust this, and make
sure the data and load are modeling what you expect in production,
or you'll tune for the wrong environment and likely make matters
worse for the environment that really matters.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Kevin Grittner
VJK vjkm...@gmail.com wrote:
 
 the source 1.9GB  (19MB x 100) resulted in 5GB of actual disk IO
 
 Deletion (delete from x2) took 32 seconds with 12 seconds CPU and
 20 sec sleep + wait for IO. Actual disk IO was about 4GB.
 
 Since Pg does not use the concept of rollback segments,  it is
 unclear why deletion produces so much disk IO (4GB).
 
One delete would mark the xmax of the tuple, so that transactions
without that transaction ID in their visible set would ignore it. 
The next table scan would set hint bits, which would store
information within the tuple to indicate that the deleting
transaction successfully committed, then the vacuum would later wake
up and rewrite the page with the deleted tuples removed.
 
If you have enough battery backed cache space on a hardware RAID
controller card, and that cache is configured in write-back mode,
many of these writes might be combined -- the original delete, the
hint bit write, and the vacuum might all combine into one physical
write to disk.  What does your disk system look like, exactly?
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Greg Smith

VJK wrote:
Since Pg does not use the concept of rollback segments,  it is unclear 
why deletion produces so much disk IO (4GB).


With PostgreSQL's write-ahead log, MVCC and related commit log, and 
transactional DDL features, there's actually even more overhead that can 
be involved than a simple rollback segment design when you delete things:


http://www.postgresql.org/docs/current/static/wal.html
http://www.postgresql.org/docs/current/static/mvcc-intro.html
http://wiki.postgresql.org/wiki/Hint_Bits
http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis

One fun thing to try here is to increase shared_buffers and 
checkpoint_segments, then see if the total number of writes go down.  
The defaults for both are really low, which makes buffer page writes 
that might otherwise get combined as local memory changes instead get 
pushed constantly to disk.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 VJK wrote:
 Since Pg does not use the concept of rollback segments,  it is unclear 
 why deletion produces so much disk IO (4GB).

 With PostgreSQL's write-ahead log, MVCC and related commit log, and 
 transactional DDL features, there's actually even more overhead that can 
 be involved than a simple rollback segment design when you delete things:

For an example like this one, you have to keep in mind that the
toast-table rows for the large bytea value have to be marked deleted,
too.  Also, since I/O happens in units of pages, the I/O volume to
delete a tuple is just as much as the I/O to create it.  (The WAL
entry for deletion might be smaller, but that's all.)  So it is entirely
unsurprising that DELETE FROM foo is about as expensive as filling the
table initially.

If deleting a whole table is significant for you performance-wise,
you might look into using TRUNCATE instead.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes:
 On Mon, 15 Mar 2010, Tom Lane wrote:
 If deleting a whole table is significant for you performance-wise,
 you might look into using TRUNCATE instead.

 Might you still end up with a normal delete operation 
 on the TOAST table when performing a TRUNCATE on the owner table?

No, you get a TRUNCATE on its toast table too.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread VJK
Inline:

On Mon, Mar 15, 2010 at 10:42 AM, Greg Smith g...@2ndquadrant.com wrote:

 VJK wrote:

 Since Pg does not use the concept of rollback segments,  it is unclear why
 deletion produces so much disk IO (4GB).


 With PostgreSQL's write-ahead log, MVCC and related commit log, and
 transactional DDL features, there's actually even more overhead that can be
 involved than a simple rollback segment design when you delete things:


There does not appear to be much WAL activity.  Here's the insertion of 100
rows as seen by iotop:
   4.39 G  0.00 %  9.78 % postgres: writer process
   5.34 G  0.00 %  5.93 % postgres: postgr~0.5.93(1212) idle
  27.84 M  0.00 %  1.77 % postgres: wal writer process
 144.00 K  0.00 %  0.00 % postgres: stats collector process
   0.00 B  0.00 %  0.00 % postgres: autova~ launcher process
   0.00 B  0.00 %  0.00 % postgres: postgr~0.5.93(4632) idle



.. and the deletion:
288.18 M  0.00 % 37.80 % postgres: writer process
  3.41 G  0.00 % 19.76 % postgres: postgr~0.5.93(1212) DELETE
 27.27 M  0.00 %  3.18 % postgres: wal writer process
 72.00 K  0.00 %  0.03 % postgres: stats collector process
  0.00 B  0.00 %  0.00 % postgres: autova~ launcher process
  0.00 B  0.00 %  0.00 % postgres: postgr~0.5.93(4632) idle

So, the original 1.9 GB of useful data generate about 10GB of IO, 5 of which
end up being written to the disk  The deletion generates about 3.8 GB of IO
all of which results in disk IO.  WAL activity is about 27MB in both cases.



 http://www.postgresql.org/docs/current/static/wal.html
 http://www.postgresql.org/docs/current/static/mvcc-intro.html
 http://wiki.postgresql.org/wiki/Hint_Bits

 http://wiki.postgresql.org/wiki/Transactional_DDL_in_PostgreSQL:_A_Competitive_Analysis


I read all of the above, but it does not really clarify why deletion
generates so much IO.


 One fun thing to try here is to increase shared_buffers and
 checkpoint_segments, then see if the total number of writes go down.  The
 defaults for both are really low, which makes buffer page writes that might
 otherwise get combined as local memory changes instead get pushed constantly
 to disk.

 --
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us




Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-15 Thread Matthew Wakeling

On Mon, 15 Mar 2010, Tom Lane wrote:

For an example like this one, you have to keep in mind that the
toast-table rows for the large bytea value have to be marked deleted,
too.  Also, since I/O happens in units of pages, the I/O volume to
delete a tuple is just as much as the I/O to create it.  (The WAL
entry for deletion might be smaller, but that's all.)  So it is entirely
unsurprising that DELETE FROM foo is about as expensive as filling the
table initially.

If deleting a whole table is significant for you performance-wise,
you might look into using TRUNCATE instead.


What are the implications of using TRUNCATE on a table that has TOASTed 
data? Is TOAST all stored in one single table, or is it split up by owner 
table/column name? Might you still end up with a normal delete operation 
on the TOAST table when performing a TRUNCATE on the owner table?


Matthew

--
sed -e '/^[when][coders]/!d;/^...[discover].$/d;/^..[real].[code]$/!d
' `locate dict/words`

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-14 Thread fka...@googlemail.com
Hi Dave,

thank you for your answers! Here some comments:

Dave Crooke:

  * The table just has 5 unused int columns, a timestamp,
  OIDs, and the bytea column, *no indices*; the bytea storage
  type is 'extended', the 16 MB are compressed to approx. the
  half.
 
 
 Why no indices?

Simply because the test case had just  50 rows (deleting
all rows older than 2 minues). Later on I would use indices.


  while it is planned to have the interval set to 6 hours in
  the final version (thus creating a FIFO buffer for the
  latest 6 hours of inserted data; so the FIFO will keep
  approx.  10.000 rows spanning 160-200 GB data).
 
 
 That's not the way to keep a 6 hour rolling buffer ... what you need to do
 is run the delete frequently, with  *interval '6 hours'* in the SQL acting
 as the cutoff.

In fact the delete was run frequently to cut everything
older than 6 hours *immediately*.


 If you really do want to drop the entire table contents before refilling it,
 do a *DROP TABLE* and recreate it.

No, I do not want to drop the whole table.


  * This deletion SQL command was simply repeatedly executed
  by pgAdmin while my app kept adding the 16 MB rows.
 
 
 Are you sure you are timing the delete, and not pgAdmin re-populating some
 kind of buffer?

Quite sure, yes. Because I launched just the delete command
in pgAdmin while the rest was executed by my application
outside pgAdmin, of course.



  * Autovacuum is on; I believe I need to keep it on,
  otherwise I do not free the disk space, right? If I switch
  it off, the deletion time reduces from the average 10s down
  to 4s.
 
 
 You may be running autovaccum too aggressively, it may be interfering with
 I/O to the tables.

Hm, so would should I change then? I wonder if it helps to
run autovacuum less aggressive if there will not be a
situation were the whole process is stopped for a while. But
I'd like to understand what to change here.


 8.4 has a lot of performance improvements. It's definitely worth a shot. I'd
 also consider switching to another OS where you can use a 64-bit version of
 PG and a much bigger buffer cache.

O.k., I'll give it a try.


Thank You.
 Felix


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-13 Thread fka...@googlemail.com
Hi all,

my posting on 2010-01-14 about the performance when writing
bytea to disk caused a longer discussion. While the fact
still holds that the overall postgresql write performance is
roughly 25% of the serial I/O disk performance this was
compensated for my special use case here by doing some other
non-postgresql related things in parallel.

Now I cannot optimize my processes any further, however, now
I am facing another quite unexpected performance issue:
Deleting rows from my simple table (with the bytea column)
having 16 MB data each, takes roughly as long as writing
them!

Little more detail:

* The table just has 5 unused int columns, a timestamp,
OIDs, and the bytea column, no indices; the bytea storage
type is 'extended', the 16 MB are compressed to approx. the
half.

* All the usual optimizations are done to reach better
write through (pg_xlog on another disk, much tweaks to the
server conf etc), however, this does not matter here, since
not the absolute performance is of interest here but the
fact that deleting roughly takes 100% of the writing time.

* I need to write 15 rows of 16 MB each to disk in a maximum
time of 15 s, which is performed here in roughly 10 seconds,
however, now I am facing the problem that keeping my
database tidy (deleting rows) takes another 5-15 s (10s on
average), so my process exceeds the maximum time of 15s for
about 5s.

* Right now I am deleting like this:

DELETE FROM table WHERE (CURRENT_TIMESTAMP -
my_timestamp_column)  interval '2 minutes';

while it is planned to have the interval set to 6 hours in
the final version (thus creating a FIFO buffer for the
latest 6 hours of inserted data; so the FIFO will keep
approx.  10.000 rows spanning 160-200 GB data).

* This deletion SQL command was simply repeatedly executed
by pgAdmin while my app kept adding the 16 MB rows.

* Autovacuum is on; I believe I need to keep it on,
otherwise I do not free the disk space, right? If I switch
it off, the deletion time reduces from the average 10s down
to 4s.

* I am using server + libpq version 8.2.4, currently on
WinXP. Will an upgrade to 8.4 help here?

Do you have any other ideas to help me out?
Oh, please...

Thank You
 Felix





-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Deleting bytea, autovacuum, and 8.2/8.4 differences

2010-03-13 Thread Dave Crooke
Hi there

I'm not an expert on PG's toast system, but a couple of thoughts inline
below.

Cheers
Dave

On Sat, Mar 13, 2010 at 3:17 PM, fka...@googlemail.com 
fka...@googlemail.com wrote:

 Hi all,

 my posting on 2010-01-14 about the performance when writing
 bytea to disk caused a longer discussion. While the fact
 still holds that the overall postgresql write performance is
 roughly 25% of the serial I/O disk performance this was
 compensated for my special use case here by doing some other
 non-postgresql related things in parallel.

 Now I cannot optimize my processes any further, however, now
 I am facing another quite unexpected performance issue:
 Deleting rows from my simple table (with the bytea column)
 having 16 MB data each, takes roughly as long as writing
 them!

 Little more detail:

 * The table just has 5 unused int columns, a timestamp,
 OIDs, and the bytea column, *no indices*; the bytea storage
 type is 'extended', the 16 MB are compressed to approx. the
 half.


Why no indices?



 * All the usual optimizations are done to reach better
 write through (pg_xlog on another disk, much tweaks to the
 server conf etc), however, this does not matter here, since
 not the absolute performance is of interest here but the
 fact that deleting roughly takes 100% of the writing time.

 * I need to write 15 rows of 16 MB each to disk in a maximum
 time of 15 s, which is performed here in roughly 10 seconds,
 however, now I am facing the problem that keeping my
 database tidy (deleting rows) takes another 5-15 s (10s on
 average), so my process exceeds the maximum time of 15s for
 about 5s.

 * Right now I am deleting like this:

 DELETE FROM table WHERE (CURRENT_TIMESTAMP -
 my_timestamp_column)  interval '2 minutes';


You *need* an index on my_timestamp_column



 while it is planned to have the interval set to 6 hours in
 the final version (thus creating a FIFO buffer for the
 latest 6 hours of inserted data; so the FIFO will keep
 approx.  10.000 rows spanning 160-200 GB data).


That's not the way to keep a 6 hour rolling buffer ... what you need to do
is run the delete frequently, with  *interval '6 hours'* in the SQL acting
as the cutoff.

If you really do want to drop the entire table contents before refilling it,
do a *DROP TABLE* and recreate it.


 * This deletion SQL command was simply repeatedly executed
 by pgAdmin while my app kept adding the 16 MB rows.


Are you sure you are timing the delete, and not pgAdmin re-populating some
kind of buffer?



 * Autovacuum is on; I believe I need to keep it on,
 otherwise I do not free the disk space, right? If I switch
 it off, the deletion time reduces from the average 10s down
 to 4s.


You may be running autovaccum too aggressively, it may be interfering with
I/O to the tables.

Postgres vacuuming does not free disk space (in the sense of returning it to
the OS), it removes old versions of rows that have been UPDATEd or DELETEd
and makes that space in the table file available for new writes.


 * I am using server + libpq version 8.2.4, currently on
 WinXP. Will an upgrade to 8.4 help here?


8.4 has a lot of performance improvements. It's definitely worth a shot. I'd
also consider switching to another OS where you can use a 64-bit version of
PG and a much bigger buffer cache.


 Do you have any other ideas to help me out?
 Oh, please...

 Thank You
  Felix





 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance