Re: Fuseki log question

Dick & Hannah Thu, 02 May 2013 23:57:05 -0700

Hi. 

We have written a three-phase commit (3PC) and journal extension to TDB, which 
is currently under test. It allows multiple TDB stores to be kept in sync. We 
also journal all read and write operations for statistics, and in theory the 
journal could be used to recreate a store. A ghost dataset writes through to 
the physical stores. As Andy mentioned, there are various design issues. We use 
an internal graph to store 3PC data, such as an atomic sequence number used to 
check the stores on open.
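
To make the shape of this concrete, here is a minimal sketch in Python (not the actual extension, and not TDB or Jena API). It assumes an in-memory stand-in for each physical store, and every name in it (Store, GhostDataset, check_on_open, rebuild) is hypothetical. It shows only the write-through, the journal and the sequence-number check on open; the three-phase commit protocol itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Stand-in for one physical store (hypothetical, not a TDB API)."""
    name: str
    triples: set = field(default_factory=set)
    seq: int = 0  # last journal sequence number applied to this store


class GhostDataset:
    """Writes through to every physical store and journals each operation."""

    def __init__(self, stores):
        self.stores = list(stores)
        self.journal = []  # entries: (seq, op, triple)
        self.check_on_open()
        self.seq = self.stores[0].seq if self.stores else 0

    def check_on_open(self):
        # All stores must report the same sequence number; otherwise one
        # of them missed a write and needs repair before use.
        seqs = {s.name: s.seq for s in self.stores}
        if len(set(seqs.values())) > 1:
            raise RuntimeError(f"stores out of sync: {seqs}")

    def add(self, triple):
        self.seq += 1
        self.journal.append((self.seq, "add", triple))
        for s in self.stores:
            s.triples.add(triple)
            s.seq = self.seq

    def contains(self, triple):
        # Reads are journaled too (for statistics); they do not advance seq.
        self.journal.append((self.seq, "read", triple))
        return triple in self.stores[0].triples


def rebuild(journal, name="rebuilt"):
    """Recreate a store by replaying the write entries of a journal."""
    store = Store(name)
    for seq, op, triple in journal:
        if op == "add":
            store.triples.add(triple)
            store.seq = seq
    return store
```

Opening a GhostDataset over two stores whose sequence numbers differ raises an error, which is the kind of check the sequence number is there for. Calling rebuild on the journal gives a store with the same triples and the same sequence number as the live ones.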


Regards

Dick Murray
Technology Specialist
UNIT4 Business Software
www.unit4.com
+44 1275 377359
+44 7884 111729

On 2 May 2013, at 23:42, Andy Seaborne <a...@apache.org> wrote:

> If I read the intent behind Bill's question as asking if master-slave 
> replication can be done using TDB transaction logs, then the answer is that 
> there is no option to store the logs for replay onto another system.
> 
> In fact, that's the wrong log because the two systems may be slightly 
> different even if they hold the same data (e.g. bNodes).  Disk bytes are not 
> guaranteed to be exactly the same for the same data (that happens to be true, 
> bNodes aside, for identical hardware currently, but there is no guarantee of 
> this).
> 
> Secondly, transaction logs may be collapsed at any moment.  When the 
> write-back to the main database is properly sync'ed to persistent storage, 
> the journal is truncated to zero length.
> 
> 
> A Master-Slave replication scheme has a very big decision to make.  Does it 
> replicate changes before announcing end of request to the client or does it 
> commit changes locally and announce success before replication has occurred?
> 
> The first is safe but has bad latency - a commit now has to synchronously go 
> to another machine.  That machine may be busy and slow.
> 
> The second is unsafe - the client can be told that an update has happened but 
> if the master is lost before replication has successfully completed then the 
> update is lost.
> 
> cf. MongoDB, where even the recent changes are still not fully resilient to 
> node loss.
> 
> I have been looking at RDF-level logging of changes whereby a diff log is 
> generated.  It can be used for change propagation or replay against a backup 
> to bring it quickly up-to-date. Since Fuseki can produce live backups, the 
> combination is quite effective.  Only some proof-of-concept stuff at the 
> moment.
> 
> There are other things you can do about this such as request replication.  
> Ping me if you want more discussion - it's too late here now.
> 
>    Andy
> 
> From the vaults:
> http://www.hpl.hp.com/techreports/98/HPL-98-06.html
> 
> 
> On 02/05/13 21:17, Rob Vesse wrote:
>> Yes
>> 
>> If you use TDB in transactional mode (which I believe Fuseki will do by
>> default - Andy can probably confirm if this is accurate) then TDB uses
>> write-ahead logging to provide ACID guarantees.
> 
> Yes - Fuseki is in transaction mode for TDB.  There is no option to turn it 
> off.
> 
>> 
>> http://jena.apache.org/documentation/tdb/tdb_transactions.html
>> 
>> Rob
>> 
>> 
>> 
>> On 5/2/13 1:12 PM, "Bill Roberts" <b...@swirrl.com> wrote:
>> 
>>> Does Fuseki/TDB have the option of storing transaction logs (in the style
>>> of most SQL databases), to allow a history of updates to be replayed - eg
>>> to recover after a problem, or for synchronising databases?
>>> 
>>> Thanks
>>> 
>>> Bill
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>
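
For the two replication choices Andy describes in the quoted text above, together with his RDF-level diff log, a small Python sketch may help. It is an illustration only, under stated assumptions: the stores are in-memory sets, a diff-log entry is an ("add" or "delete", triple) pair, and the names Node, Master, update and flush are hypothetical rather than anything in Fuseki or TDB.

```python
class Node:
    """In-memory stand-in for a store holding a set of triples."""

    def __init__(self):
        self.triples = set()

    def apply(self, change):
        # A change is one diff-log entry: (op, triple).
        op, triple = change
        if op == "add":
            self.triples.add(triple)
        elif op == "delete":
            self.triples.discard(triple)
        else:
            raise ValueError(f"unknown op: {op}")


class Master(Node):
    def __init__(self, replica, synchronous):
        super().__init__()
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []  # diff-log entries not yet replayed on the replica

    def update(self, change):
        """Commit locally; the return value is the reply to the client."""
        self.apply(change)
        self.pending.append(change)
        if self.synchronous:
            # Replicate before replying: safe, but the client waits
            # for the other machine.
            self.flush()
        # In asynchronous mode the reply goes out with the change still
        # pending, so losing the master now loses the update.
        return "ok"

    def flush(self):
        # Replay the diff log on the replica, oldest entry first.
        while self.pending:
            self.replica.apply(self.pending.pop(0))
```

With synchronous=True the replica already holds the triple when update returns. With synchronous=False the client has its "ok" while the replica is still empty, and the change only arrives on the next flush, which is the window in which an acknowledged update can be lost.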
