Re: Different policy for concurrency access in TDB supporting a single writer and multiple readers

Andy Seaborne Fri, 01 Apr 2011 02:28:35 -0700


On 30/03/11 22:46, Stephen Allen wrote:

Andy,

As an aside, I recall you mentioning that you had a BDB version of
TDB, using that would seem to offer a fast, stable way of adding
transactions to your B-trees.  Out of curiosity, were there problems
with using BDB?


https://github.com/afs/TDB-BDB

No problems as such but it just isn't very fast (non-transactionally). There is no bulk loading advantage at all, and query performance was slower but OK. That's before turning on transactions. As the data scaled, the difference between TDB native and TDB-BDB became more pronounced.


BDB-C and BDB-JE are about the same speed.

Given they were already slower, and for TxTDB, I want to retain reader-performance, that doesn't look like a good starting point.

It might be a good place for a version with different goals - less emphasis on scale, more on high-frequency writes (and fewer reads), for example a sensor data hub.

I don't know why they are slower but I speculate that the general purpose design of both BDBs (e.g. fully variable length key and value, node size, overhead in the tree blocks for all sorts of features not used) means it is optimized for something else. BDB is designed for high-write concurrency - RDF datastores are for publishing (read dominant). Sometimes these design objectives pull in different directions.

I used BDB to store the string table as well (lexical forms of nodes). It was better to use a native string file.


Maybe it's a case of not using them to their best advantage.

tdbloader1 simply does the loading work in an order that is better than adding triples one at a time, indexing as you go. It loads the primary index, then builds the secondary indexes by copying from the primary. That applies to BDB but it didn't help.
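
To make that loading order concrete, here is a rough sketch. It is not TDB code: the names `permute` and `load`, the use of sorted Python lists in place of B+Trees, and the choice of SPO as the primary with POS and OSP as the secondaries are all assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for the two-phase loading order.
# Sorted lists play the role of the indexes; none of this is TDB's API.

def permute(triple, order):
    """Reorder an (s, p, o) triple into an index's column order, e.g. "pos"."""
    s, p, o = triple
    slot = {"s": s, "p": p, "o": o}
    return tuple(slot[c] for c in order)


def load(triples, secondary_orders=("pos", "osp")):
    # Phase 1: build only the primary (SPO) index from the incoming data.
    primary = sorted(set(triples))
    indexes = {"spo": primary}
    # Phase 2: build each secondary index by copying from the primary,
    # instead of updating every index for every incoming triple.
    for order in secondary_orders:
        indexes[order] = sorted(permute(t, order) for t in primary)
    return indexes
```

The point of the split is that each index is written in one pass on its own, rather than having all of them compete for cache and disk on every single triple.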

tdbloader2 uses Unix sort(1) to prepare the index data by sorting into the order for each index, then writes the B+Trees directly to disk (from the bottom up and very carefully).
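
As an illustration of the bottom-up idea only, here is a minimal in-memory sketch. It assumes the records arrive already sorted (the job sort(1) does above); `build_bottom_up`, `contains`, the dict-based nodes and the tiny `fanout` are all hypothetical and are not how tdbloader2 lays out blocks on disk. It also ignores the "very carefully" part, such as keeping the last node at each level adequately filled.

```python
import bisect


def build_bottom_up(sorted_records, fanout=4):
    """Pack sorted records into leaves, then build each parent level from the one below."""
    if not sorted_records:
        return {"leaf": True, "keys": []}
    # Leaf level: fill nodes left to right. The input is sorted, so no node ever splits.
    level = [
        {"leaf": True, "keys": sorted_records[i:i + fanout]}
        for i in range(0, len(sorted_records), fanout)
    ]
    # Upper levels: group children and record the smallest key under each child.
    while len(level) > 1:
        level = [
            {
                "leaf": False,
                "keys": [child["keys"][0] for child in level[i:i + fanout]],
                "children": level[i:i + fanout],
            }
            for i in range(0, len(level), fanout)
        ]
    return level[0]


def contains(node, record):
    """Look a record up by descending from the root to a leaf."""
    while not node["leaf"]:
        # Choose the last child whose smallest key is <= record.
        i = bisect.bisect_right(node["keys"], record) - 1
        if i < 0:
            return False
        node = node["children"][i]
    return record in node["keys"]
```

Because every node is written once, in order, there is no random insertion into a partially built tree, which is where the loading time otherwise goes.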


        Andy
