Simon,
Thank you for the figures. Very useful.
Is this on Linux or Windows?
One of the design points of transactions is when to write back the
changes to the main database. At the moment, write-back is done by any
transaction (reads included :-) that finds the database quiescent when
the transaction clears up. A commit in a writer writes to the log, not
the main DB. The journal is written back later. This will hit some
readers. This is also why your write times are better.
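To make the policy concrete, here is a minimal toy model of it (all class and method names are illustrative, not the TxTDB internals): commits only append to a journal, and whichever transaction ends while no others are active replays the journal into the main store.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative model of quiescence-triggered write-back:
// commits go to a journal; the transaction that ends when the
// database is otherwise idle replays the journal into the main map.
public class QuiescentWriteback {
    private final Map<String, String> mainDb = new HashMap<>();
    private final Queue<Map<String, String>> journal = new ArrayDeque<>();
    private int active = 0;

    public synchronized void begin() { active++; }

    // A writer's commit only appends its changes to the journal.
    public synchronized void commit(Map<String, String> changes) {
        journal.add(new HashMap<>(changes));
    }

    // end() is on every transaction's exit path, readers included:
    // the last one out pays the cost of flushing the journal.
    public synchronized void end() {
        active--;
        if (active == 0) {
            while (!journal.isEmpty()) {
                mainDb.putAll(journal.poll());
            }
        }
    }

    public synchronized String get(String key) { return mainDb.get(key); }
    public synchronized int pendingJournalEntries() { return journal.size(); }
}
```

This is how a read transaction can "hit" the write-back cost: if it happens to be the last one out, it does the flush.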
The advantage of the current write-back policy is that it is
predictable and it is easy to calculate when it will happen. What I
hope to do, once the system is proven reliable, is to have a write-back
thread, taking all database writes off the transaction end code path.
SQLite and similar do this - but you need to be very careful that
write-back does occur and the log doesn't just continue to grow.
An obvious half-way design is to fork the write-back at the end of the
transaction that decides it can do the final changes, but return from
the transaction immediately.
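That half-way design could look roughly like this (again a hand-drawn sketch, not the actual code): the transaction that detects quiescence hands the journal replay to a background single-thread executor and returns immediately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of forking write-back off the transaction end path:
// the ending transaction schedules the journal replay on a
// background thread and returns without waiting for it.
public class ForkedWriteback {
    private final Map<String, String> mainDb = new ConcurrentHashMap<>();
    private final ExecutorService writebackThread = Executors.newSingleThreadExecutor();

    // Called by the transaction that decides write-back can happen.
    public void scheduleWriteback(Map<String, String> journal) {
        // putAll here is a stand-in for the real disk write-back.
        writebackThread.submit(() -> mainDb.putAll(journal));
        // Returns immediately; the flush happens in the background.
    }

    public String get(String key) { return mainDb.get(key); }
    public void shutdown() { writebackThread.shutdown(); }
}
```

The care needed is exactly as above: something must guarantee the background flush actually runs, or the log grows without bound.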
Also, with the change in locking strategy, different ordering may be
happening. In testing, when the system was totally loaded, I did see
some read transactions seemingly scheduled aside when there were writers
(Linux), even accounting for the time when changes were written to the
main database. However, that was in a system tuned to be maxed out,
with transactions that did zero, or near-zero, work.
During my testing, which is for a different workload, query has seemed
the same as 0.8.10 in normal use. The workload is read-dominated.
Writes are infrequent. We're continuing to test.
One of the drivers for transactions was to reduce the stutter effect
of global locking when a writer runs - with MRSW locking, a writer
means the readers are locked out, resulting in high latency for some
reads. The write-back needs further tuning for that, although it is
already better because (1) it waits until the DB is quiet, and (2) it
amalgamates writes and, especially, sync() calls. sync() is the
expensive point.
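Point (2) is where most of the win is: if several committed transactions are written back in one pass, their changes share a single sync(). A toy model counting simulated sync() calls (names are illustrative; syncCalls stands in for fsync()):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model of amalgamating write-back: N journal entries flushed
// one-by-one cost N syncs; flushed in one batch they cost one.
public class BatchedSync {
    private final Map<String, String> mainDb = new HashMap<>();
    int syncCalls = 0;                // stand-in for fsync()/FileChannel.force()

    private void sync() { syncCalls++; }

    // Naive policy: sync after every journal entry.
    public void writeBackEach(Queue<Map<String, String>> journal) {
        while (!journal.isEmpty()) {
            mainDb.putAll(journal.poll());
            sync();
        }
    }

    // Amalgamated policy: apply everything, then sync once.
    public void writeBackBatched(Queue<Map<String, String>> journal) {
        while (!journal.isEmpty()) {
            mainDb.putAll(journal.poll());
        }
        sync();
    }
}
```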
Another driver for the transactions has been to make the data more durable.
Andy
PS Early work-in-progress
https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/
A performance framework for running query (and later update) mixes and
reporting. Unfinished.
On 10/01/12 23:24, Simon Helsen wrote:
Andy,
yes, I'll look into it as soon as I have cycles again. And no, I have
not yet tried with non-transactional API in 2.7.0. I actually want to do
that at some point to have a cleaner baseline.
In the mean time, here is a summary of the results I found:
1) when I run with 1 client, query and store execution is comparable to
each other. I have detailed numbers, but they don't help much
2) things become interesting when I start scaling up the number of
clients (one of the principal motivations to move to TDB Tx). The data
below is for the following scenario:
* 50 clients
* the operations of each client are a mixture of queries and write
operations, where I execute a write operation for every 7th query
* the queries are deterministically taken from a pool of about 35
queries with varying complexity. When run with 1 client, they take
anywhere from a few ms to almost 2 seconds for the most intense query
* between each operation, I wait 2s
* there is plenty of memory/heap available. I use a 64 bit machine with
8 GB of memory, of which 4 GB is used for the Java heap.
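The operation mix above can be sketched as a simple driver loop (illustrative only; it assumes "every 7th operation is a write" as a simplification, and the 2 s think time is left as a comment):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative driver for the mix described above: each client issues
// operations where every 7th one is a write and the rest are queries.
public class ClientMix {
    public static List<String> operations(int count) {
        List<String> ops = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            ops.add(i % 7 == 0 ? "WRITE" : "QUERY");
            // In the real test a client would sleep ~2s between operations.
        }
        return ops;
    }

    public static long writes(int count) {
        return operations(count).stream().filter("WRITE"::equals).count();
    }
}
```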
Note that in TDB we use an exclusive write lock for write operations and
shared read locks for read operations. In TDBTx, I just use transactions
(i.e. we don't lock ourselves):
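The difference between the two concurrency patterns can be seen with a stock JDK read-write lock (a generic illustration, not the TDB lock implementation): under MRSW locking a writer excludes all readers for its whole duration, which is the source of the stutter discussed below; with transactions, readers keep running against the last committed state instead.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrates the MRSW ("multiple readers / single writer") pattern
// with a JDK read-write lock: while a writer holds the lock,
// readers on other threads are locked out.
public class MrswDemo {
    static final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Probe from a fresh thread, since ReentrantReadWriteLock lets the
    // write-lock owner itself also take the read lock (downgrading).
    public static boolean readerCanEnter() {
        final boolean[] ok = {false};
        Thread reader = new Thread(() -> {
            if (lock.readLock().tryLock()) {
                lock.readLock().unlock();
                ok[0] = true;
            }
        });
        reader.start();
        try { reader.join(); } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok[0];
    }
}
```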
A) Here are the numbers for TDB (0.8.7 etc):
- total write time = 1345594ms, so about 1346s
cnt | avg | max | min | dev | tot
======================================================================================================================
DESCRIBE (ms) 402 | 466 | 4,859 | 0 | 609 | 187,609
SELECT (ms) 4,618 | 4,809 | 93,453 | 0 | 9,621 | 22,211,907
----------------------------------------------------------------------------------------------------------------------
PARALLELISM 5,020 | 14 | 41 | 0 | 8 | 79,066
A quick note about parallelism: this indicates effectively how much
parallel activity was going on. For instance, on average, there were 14
queries running at the same time, with a maximum of 41. The total
indicates how heavily query activity was running in parallel.
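One way to compute a parallelism figure like this (a sketch of the metric as described, not necessarily how the test harness does it; [start, end) interval pairs are an assumed input format): sweep the sorted start/end events and track how many operations are in flight.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a parallelism metric: given [start, end) intervals for
// each operation, sweep the events in time order and report the
// maximum number of operations in flight at any instant.
public class ParallelismMetric {
    public static int maxConcurrent(long[][] intervals) {
        List<long[]> events = new ArrayList<>();   // {time, +1 or -1}
        for (long[] iv : intervals) {
            events.add(new long[] { iv[0], +1 });
            events.add(new long[] { iv[1], -1 });
        }
        // Sort by time; ends sort before starts at the same instant.
        events.sort((a, b) -> a[0] != b[0] ? Long.compare(a[0], b[0])
                                           : Long.compare(a[1], b[1]));
        int running = 0, max = 0;
        for (long[] e : events) {
            running += e[1];
            max = Math.max(max, running);
        }
        return max;
    }
}
```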
B) Here are the numbers for TDBTx:
- total write time = 166047ms, so about 166s
cnt | avg | max | min | dev | tot
==================================================================================================================
DESCRIBE (ms) 168 | 2,557 | 9,219 | 31 | 1,769 | 429,645
SELECT (ms) 1,853 | 38,866 | 392,282 | 0 | 74,008 | 72,020,224
-------------------------------------------------------------------------------------------------------------------
PARALLELISM 2,021 | 35 | 49 | 0 | 10 | 71,791
Note that although the test suites run in the same way, the long
query times in TDBTx caused several timeouts, which explains the
substantially smaller number of completed queries. Even so, the total
query time was still almost 4 times higher.
So, it seems that in this multi-client scenario, TDBTx is much better
at avoiding lock contention around write operations, but it behaves
significantly worse for queries. One interesting thing is that TDBTx
has a higher average number of parallel running queries and a higher
maximum. So, perhaps this is an important cause of the slowdown.
Hopefully these are useful. Have any of you done any performance
measurements with transactional TDB?
Simon
From: Andy Seaborne <[email protected]>
To: [email protected]
Date: 01/10/2012 02:04 PM
Subject: Re: TDB: release process
------------------------------------------------------------------------
On 10/01/12 13:45, Andy Seaborne wrote:
> On 09/01/12 15:07, Simon Helsen wrote:
>> Andy, others,
>>
>> I have been testing TxTDB on my end and functionally, things are looking
>> good. I am not able to see any immediate problems anymore. Of course,
>> there may still be more exotic things left, but those can probably be
>> managed in a minor release. However, now that it is getting good on the
>> functional end, I am starting to check the non-functional characteristics,
>> especially speed and scalability (in terms of multiple clients). For this
>> I use a test suite with about 35 different queries and I compare the
>> performance against Jena 2.6.3/ARQ 2.8.5 and TDB 0.8.7 because that is
>> the version we currently use in the release of our product. I am comparing
>> these numbers then with Jena/ARQ 2.7.0 and TDB 0.9.0 (20111229) and the
>> transaction API. I realize this is partially comparing apples to pears, but
>> from our perspective, we need to see how the bottom line changes in terms
>> of query speed when we increase the number of concurrent clients.
>>
>> I have detailed numbers, but before I start sharing these, I want to know
>> if there is anything I could/should do to tune ARQ/TxTDB in terms of
>> performance. For instance, I wonder if there are still a whole range of
>> checks active which I can/should turn off now that we are functionally
>> more sound. For completeness, I should add that we don't use any
>> optimization (i.e. we run with none.opt)
>>
>> thanks
>>
>> Simon
>
> Simon,
>
> Figures would be good. If you use TDB without touching the transaction
> system then it should be the same as before (with the obvious chance of
> unintended changes). Have you run this way?
>
> Just creating a transaction, especially one that allows writes, is a cost,
> and if the granularity is small then it's going to make a big
> difference. (This is one reason there isn't an "autocommit" mode - it
> only seems to end in trouble one way or another). Read transactions are
> cheaper but not free.
>
> In terms of tuning, TDB 0.9 needs more heap as the transaction
> intermediate state is in-RAM, with no proper spill-to-disk yet.
>
> There shouldn't be the internal consistency checking enabled. Hmm -
> better check yet again!
>
> Andy
>
Simon,
Could you profile the tests and pass on the results? Any testing code
left should show as hotspots.
Andy