Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-28 Thread Ivan Voras
James Mansion wrote:
 Ivan Voras wrote:
  I wish that, when people got the idea to run a simplistic benchmark
  like this, they would at least have the common sense to put the
  database on a RAM drive to avoid problems with different cylinder
  speeds of rotational media and fragmentation from multiple runs.
 Huh?

 It's tough to benchmark anything involving rotational drives :)

 But - how the database organises its IO to maximise the available
 bandwidth, limit avoidable seeks, and limit avoidable flushes is
 absolutely key to realistic performance, especially on modest everyday
 hardware. Not everyone has a usage that justifies 'enterprise' kit -
 but plenty of people can benefit from something a step up from SQLite.

 If you just want to benchmark query processor efficiency then that's
 one scenario where taking physical IO out of the picture might be
 justified, but I don't see a good reason to suggest that it is 'common
 sense' to do so for all testing, and while the hardware involved is
 pretty low end, it's still a valid data point.

You are right, of course, where the goal is ordinary benchmarking, to
see what performance can be expected from some setup in some
circumstances, but not where the intention is to compare products.

You don't have to go the memory drive / SSD route - just make sure the
databases always use the same (small) area of the disk drive.





[PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Thom Brown
Had a quick look at a benchmark someone put together of MySQL vs PostgreSQL,
and while PostgreSQL is generally faster, I noticed the bulk delete was very
slow: http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html

Is this normal?

Thom


Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Ivan Voras

On 01/27/10 14:28, Thom Brown wrote:

Had a quick look at a benchmark someone put together of MySQL vs
PostgreSQL, and while PostgreSQL is generally faster, I noticed the bulk
delete was very slow:
http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html


I wish that, when people got the idea to run a simplistic benchmark like 
this, they would at least have the common sense to put the database on a 
RAM drive to avoid problems with different cylinder speeds of rotational 
media and fragmentation from multiple runs.


Here are some typical results from a desktop SATA drive:

ada0
        512             # sectorsize
        500107862016    # mediasize in bytes (466G)
        976773168       # mediasize in sectors
        969021          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        6QG3Z026        # Disk ident.

Seek times:
        Full stroke:      250 iter in   5.676993 sec =   22.708 msec
        Half stroke:      250 iter in   4.284583 sec =   17.138 msec
        Quarter stroke:   500 iter in   6.805539 sec =   13.611 msec
        Short forward:    400 iter in   2.678447 sec =    6.696 msec
        Short backward:   400 iter in   2.318637 sec =    5.797 msec
        Seq outer:       2048 iter in   0.214292 sec =    0.105 msec
        Seq inner:       2048 iter in   0.203929 sec =    0.100 msec
Transfer rates:
        outside:       102400 kbytes in   1.229694 sec =   83273 kbytes/sec
        middle:        102400 kbytes in   1.446570 sec =   70788 kbytes/sec
        inside:        102400 kbytes in   2.446670 sec =   41853 kbytes/sec

This doesn't explain the 4-orders-of-magnitude difference between MySQL 
and PostgreSQL in bulk_delete() (0.02 vs 577 seconds), but it does 
suggest that some other results, where the performance is close, might 
be bogus.


It's tough to benchmark anything involving rotational drives :)




Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Matthew Wakeling

On Wed, 27 Jan 2010, Thom Brown wrote:

Had a quick look at a benchmark someone put together of MySQL vs PostgreSQL,
and while PostgreSQL is generally faster, I noticed the bulk delete was very
slow: http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html

Is this normal?


On the contrary, TRUNCATE TABLE is really rather fast.

Seriously, the Postgres developers, when designing the system, decided on 
a database layout that was optimised for the most common cases. Bulk 
deletion of data is not really that common an operation, unless you are 
deleting whole categories of data, where setting up partitioning and 
deleting whole partitions would be sensible.
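
A minimal sketch of that approach, using inheritance-based partitioning 
(hypothetical table names, not from the benchmark; constraint_exclusion 
should be on so queries skip irrelevant partitions):

CREATE TABLE events (id int, created date);
CREATE TABLE events_2010_01
    (CHECK (created >= '2010-01-01' AND created < '2010-02-01'))
    INHERITS (events);
-- ... January's rows are loaded into events_2010_01 ...
-- "Bulk deleting" the whole month is then a cheap catalog operation:
DROP TABLE events_2010_01;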


Other complications are that the server has to maintain concurrent 
integrity - that is, another transaction must be able to see either none 
of the changes or all of them. As a consequence of this, Postgres needs to 
do a sequential scan through the table and mark the rows for deletion in 
the transaction, before flipping the transaction committed status and 
cleaning up afterwards.
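
A small illustration of that visibility rule (table name hypothetical):

-- Session 1: delete a batch inside a transaction.
BEGIN;
DELETE FROM history WHERE created < '2009-01-01';

-- Session 2, concurrently: still sees every row, because session 1 has
-- not committed; it must see none of the changes or all of them.
SELECT count(*) FROM history;

-- Session 1: committing flips the transaction's status; the dead row
-- versions become invisible at once, and VACUUM reclaims the space later.
COMMIT;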


I'd be interested in how mysql manages to delete a whole load of rows in 
0.02 seconds. How many rows is that?


(Reading the comments, I saw this: "The slow times for Postgresql Bulk 
Modify/Bulk Delete can be explained by foreign key references to the 
updates table." I'm not sure that fully explains it though, unless there 
are basically zero rows being deleted - it's hardly bulk then, is it?)


Matthew

--
People who love sausages, respect the law, and work with IT standards 
shouldn't watch any of them being made.  -- Peter Gutmann




Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Kevin Grittner
Thom Brown thombr...@gmail.com wrote:
 
 Had a quick look at a benchmark someone put together of MySQL vs
 PostgreSQL, and while PostgreSQL is generally faster, I noticed
 the bulk delete was very slow:
 http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html
 
 Is this normal?
 
It is if you don't have an index on the table which has a foreign
key defined which references the table in which you're doing
deletes.  The author of the benchmark apparently didn't realize that
MySQL automatically adds such an index to the dependent table, while
PostgreSQL leaves it to you to decide whether to add such an index. 
For insert-only tables, it isn't always worth the cost of
maintaining it.
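
A hedged illustration, with hypothetical tables rather than the 
benchmark's actual schema: in MySQL/InnoDB the REFERENCES clause below 
implicitly indexes order_items.order_id, while in PostgreSQL you must 
create that index yourself, or every DELETE on orders scans order_items.

CREATE TABLE orders (id int PRIMARY KEY);
CREATE TABLE order_items (
    order_id int REFERENCES orders (id),
    qty      int
);
-- PostgreSQL: without this, each deleted orders row costs a scan of
-- order_items during the foreign key check.
CREATE INDEX order_items_order_id_idx ON order_items (order_id);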
 
Also, I see that the database was small enough to be fully cached,
yet the costs weren't adjusted to the recommended values for such an
environment, so PostgreSQL should *really* have beaten MySQL by more
than it did.
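
For reference, a sketch of the sort of settings usually recommended for 
a fully cached database (assumed values, not the benchmark's actual 
configuration; tune to the workload):

-- Tell the planner random page reads cost no more than sequential ones:
SET random_page_cost = 1.0;   -- default is 4.0
SET seq_page_cost = 1.0;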
 
-Kevin



Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Nikolas Everett
On Wed, Jan 27, 2010 at 9:54 AM, Kevin Grittner 
kevin.gritt...@wicourts.gov wrote:

 It is if you don't have an index on the table which has a foreign
 key defined which references the table in which you're doing
 deletes.  The author of the benchmark apparently didn't realize that
 MySQL automatically adds such an index to the dependent table, while
 PostgreSQL leaves it to you to decide whether to add such an index.
 For insert-only tables, it isn't always worth the cost of
 maintaining it.


It really gets to me that I have to skip some foreign keys in MySQL
because I can't afford to maintain the index.  I have to write super fun
trigger-based check constraints that look like this:

DELIMITER \\
-- Emulates ON DELETE RESTRICT without the index a real foreign key
-- would force InnoDB to maintain: if any row still references the
-- location, abort by touching a column that does not exist.
CREATE TRIGGER Location_Pre_Delete BEFORE DELETE ON Locations FOR EACH ROW
BEGIN
  DECLARE _id INT;
  -- Probe for a single referencing row; _id stays NULL if none exists.
  SELECT id INTO _id FROM BigHistoryTable WHERE locationId = OLD.id LIMIT 1;
  IF _id IS NOT NULL THEN
    -- Deliberately invalid statement: fails at runtime, killing the DELETE.
    INSERT INTO BigHistoryTable
    (column_that_does_not_exist_but_says_that_you_violated_my_hacked_foreign_key)
    VALUES ('fail');
  END IF;
END\\

Sometimes I can't sleep at night for having written that code.


Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Andres Freund
On Wednesday 27 January 2010 15:49:06 Matthew Wakeling wrote:
 On Wed, 27 Jan 2010, Thom Brown wrote:
  Had a quick look at a benchmark someone put together of MySQL vs
  PostgreSQL, and while PostgreSQL is generally faster, I noticed the bulk
  delete was very slow:
  http://www.randombugs.com/linux/mysql-postgresql-benchmarks.html
  
  Is this normal?
 
 On the contrary, TRUNCATE TABLE is really rather fast.
 
 Seriously, the Postgres developers, when designing the system, decided on
 a database layout that was optimised for the most common cases. Bulk
 deletion of data is not really that common an operation, unless you are
 deleting whole categories of data, where setting up partitioning and
 deleting whole partitions would be sensible.
 
 Other complications are that the server has to maintain concurrent
 integrity - that is, another transaction must be able to see either none
 of the changes or all of them. As a consequence of this, Postgres needs to
 do a sequential scan through the table and mark the rows for deletion in
 the transaction, before flipping the transaction committed status and
 cleaning up afterwards.
 
 I'd be interested in how mysql manages to delete a whole load of rows in
 0.02 seconds. How many rows is that?
Afair mysql detects that case and converts it into some truncate equivalent.
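
In SQL terms (table name hypothetical), the case in question is an 
unqualified delete, which MySQL reportedly rewrites, while PostgreSQL 
executes it row by row unless you ask for the fast path explicitly:

DELETE FROM updates;      -- candidate for MySQL's truncate-like shortcut
TRUNCATE TABLE updates;   -- the explicitly fast equivalent in PostgreSQL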

Andres



Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread James Mansion

Ivan Voras wrote:
 I wish that, when people got the idea to run a simplistic benchmark
 like this, they would at least have the common sense to put the
 database on a RAM drive to avoid problems with different cylinder
 speeds of rotational media and fragmentation from multiple runs.

Huh?

It's tough to benchmark anything involving rotational drives :)

But - how the database organises its IO to maximise the available 
bandwidth, limit avoidable seeks, and limit avoidable flushes is 
absolutely key to realistic performance, especially on modest everyday 
hardware. Not everyone has a usage that justifies 'enterprise' kit - but 
plenty of people can benefit from something a step up from SQLite.

If you just want to benchmark query processor efficiency then that's one 
scenario where taking physical IO out of the picture might be justified, 
but I don't see a good reason to suggest that it is 'common sense' to do 
so for all testing, and while the hardware involved is pretty low end, 
it's still a valid data point.





Re: [PERFORM] Benchmark shows very slow bulk delete

2010-01-27 Thread Greg Smith

Kevin Grittner wrote:

It is if you don't have an index on the table which has a foreign
key defined which references the table in which you're doing
deletes.  The author of the benchmark apparently didn't realize that
MySQL automatically adds such an index to the dependent table, while
PostgreSQL leaves it to you to decide whether to add such an index. 
  


The author there didn't write the PostgreSQL schema; he's just using the 
osdb test kit:  http://osdb.sourceforge.net/


Given that both Peter and Neil Conway have thrown work their way, I know 
there's been some PG specific work done on that project by people who 
know what's going on, but I'm not sure if that included a performance 
check.  A quick glance at 
http://osdb.cvs.sourceforge.net/viewvc/osdb/osdb/src/callable-sql/postgres-ui/osdb-pg-ui.m4?revision=1.4&view=markup 
finds this:


createIndexForeign(char* tName, char* keyName, char* keyCol,
                   char* fTable, char* fFields) {
    snprintf(cmd, CMDBUFLEN,
        "alter table %s add constraint %s foreign key (%s) references %s (%s)",
        tName, keyName, keyCol, fTable, fFields);


But I don't see any obvious spot where the matching index that should go 
along with that is created.  The code is just convoluted enough (due 
to how they abstract away support for multiple databases) that I'm not 
sure yet--maybe they call their createIndexBtree function and fix this 
in a later step.  But the way the function is 
named--createIndexForeign--seems to suggest they believe that this 
operation will create the index, too, which as you point out is just not 
true.
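
A quick hedged check one could run against the loaded osdb database 
(table name assumed, not verified against the kit's schema) to see 
whether any index actually backs the referencing column:

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'updates';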


--
Greg Smith    2ndQuadrant    Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com

