It looks like ZODB's insert performance in your test shows the same O(log n) behaviour as PostgreSQL's checkpoints (the periodic drops in your graph). This should come as no surprise. B-Trees have a theoretical search/insert/delete time complexity proportional to the height of the tree, which is O(log n).
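To make the height argument concrete, here is a rough sketch that counts how many levels a tree needs to hold n keys. The fanout of 50 is a made-up figure for illustration only, not the actual node size of OOBTree or of PostgreSQL's indexes, and the function name is hypothetical.

```python
def btree_height(n_keys, fanout):
    """Smallest number of levels whose capacity (fanout ** levels) covers n_keys.

    Assumes every node is completely full, which real B-Trees are not,
    so treat the result as a lower bound on the real height.
    """
    height = 1
    capacity = fanout
    while capacity < n_keys:
        capacity *= fanout
        height += 1
    return height


# fanout=50 is an assumed value, chosen only for illustration
for n in (10**3, 10**6, 10**7):
    print(n, btree_height(n, fanout=50))
```

The point is only that the height, and so the number of nodes touched per insert, grows slowly but does keep growing as the tree fills up.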

So why is PostgreSQL so much faster? It uses a write-ahead log (WAL) for inserts. Instead of inserting into the (B-Tree based) data files at every transaction commit, it writes a record to the WAL. This does not require traversal of the B-Tree and has O(1) time complexity. The penalty is that read operations become more complex: they must look first in the WAL and overlay those results on the main index. The WAL is never allowed to get too large, or its in-memory index would become too big.
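A toy model of the scheme described above might look like the following. It is a sketch of the idea only, not of how PostgreSQL implements its WAL: the class name, the checkpoint_limit parameter, and the sorted list standing in for the B-Tree are all assumptions made for illustration.

```python
import bisect


class ToyWalStore:
    """Append-only log in front of a sorted main store (illustrative only)."""

    def __init__(self, checkpoint_limit=1000):
        self.keys = []        # sorted keys; stands in for the B-Tree
        self.vals = []        # values, parallel to self.keys
        self.wal = []         # append-only log of (key, value) records
        self.wal_index = {}   # in-memory index over the log
        self.checkpoint_limit = checkpoint_limit

    def insert(self, key, value):
        # Commit path: an O(1) append, with no traversal of the main store.
        self.wal.append((key, value))
        self.wal_index[key] = value
        if len(self.wal) >= self.checkpoint_limit:
            self.checkpoint()

    def get(self, key):
        # Reads look in the log first, then fall back to the main store.
        if key in self.wal_index:
            return self.wal_index[key]
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.vals[i]
        raise KeyError(key)

    def checkpoint(self):
        # The deferred cost: each logged key is placed with an O(log n) search.
        for key, value in self.wal_index.items():
            i = bisect.bisect_left(self.keys, key)
            if i < len(self.keys) and self.keys[i] == key:
                self.vals[i] = value
            else:
                self.keys.insert(i, key)
                self.vals.insert(i, value)
        self.wal.clear()
        self.wal_index.clear()
```

In this model the inserts between checkpoints are cheap and the tree work is paid for in batches, which is the shape of the periodic drops in the graph.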

If you are going to have this number of records -- in a single B-Tree -- then use a relational database. It's what they're optimised for.

Laurence

Roché Compaan wrote:
Well I finally realised that ZODB benchmarks are not going to fall from
the sky so compelled by a project that needs to scale to very large
numbers and a general desire to have real numbers I started to write
some benchmarks.

My first goal was to get a baseline and test performance for the most
basic operations like inserts and lookups. The first test tests BTree
performance (OOBTree to be specific) and inserts instances of a persistent
class into a BTree. Each instance has a single attribute that is 1K in
size. The test tries out different commit intervals - the first
iteration commits every 10 inserts, the second iteration commits every
100 inserts and the last one commits every 1000 inserts. I don't have
results for the second and third iterations since the first iteration
takes a couple of hours to complete and I'm still waiting for the
results on the second and third iteration.

The results so far are worrying in that performance deteriorates
logarithmically. The test kicks off with a bang at close to 750 inserts
per second, but after 1 million objects the insert rate drops to 260
inserts per second and at 10 million objects the rate is not even 60
inserts per second. Why?

In an attempt to determine if this drop in performance is normal I
created a test with Postgres purely to observe transaction rate and not
to compare it with the ZODB. In Postgres the transaction rate hovers
around 2700 inserts per second throughout the test. There are periodic drops but I
guess these are times when Postgres flushes to disc. I was hoping to
have a consistent transaction rate in the ZODB too. See the attached
image for the comparison. I also attach csv files of the data collected
by both tests.

During the last Plone conference I started a project called zodbbench
available here:

https://svn.plone.org/svn/collective/collective.zodbbench

The tests are written as unit tests and are run with a testrunner
script. The project uses buildout to make it easy to get going.
Unfortunately installing it with buildout on some systems seems to lead
to weird import errors that I can't explain so I would appreciate it if
somebody with buildout fu can look at it.
What I would appreciate more though is an explanation of the drop in
performance or alternatively, why the test is insane ;-)




_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
