Re: Re-use of datasets after committing transactions

Andy Seaborne Wed, 04 Apr 2012 07:49:04 -0700

On 03/04/12 17:25, Bernie Greenberg wrote:

While I still haven't got an answer from this list about whether it was
really true that one has to close (and not reuse) dataset objects after a
committed or aborted "write" transaction, I did get an answer from my code,
as it were, and a surprising one, at that.


I found that at app shutdown time, I had nothing (with respect to Jena) to
do, as I had already, in each thread which had created a dataset in
response to a request for some kind of service, closed that dataset. Since
the datasets were not drawn from an open "master" object representing the
open store, but from a static source, there is no "master" object to close
down.

If the threads all had their hands on persistent, open dataset objects,
which each (according to your documentation and my own experience) can only
be used in that one thread, I would have a difficult problem causing those
threads (which may be asleep in a web or other server) to wake up to close
the pointer (yes, there may be a "thread close time" hook or the like, but
as I have it, I don't need one).

This all seems consistent with what we have transacted here before and
consistent with my understanding of transaction semantics, and seems to
work; please let me know if you think I'm overlooking something.

Thanks
Bernie

When a (Java in-memory object for a) Dataset is used transactionally, it must be used only transactionally. I think you are only using transactions so no issues around here. People using datasets "old world" non-transactionally get old-world semantics - they need to be sync'ed.


There's no harm syncing a transactional dataset (it does not do anything).

With transactions, no cleanup after .end() is needed. (A writer doing .commit()/.abort() doesn't strictly require .end() - it's better style to always call .end() in a "finally{}" though.)

When .commit() happens, the journal is written (append only), with a commit record. The changes are written to the main dataset at some time when it's quiet. It may be when the .commit() happens, it may not - does not matter, the bytes are on-disk and the change is permanent.

Any transaction starting after the .commit() sees the changes, either from the real storage or the unflushed transaction state. The system handles that.

If the app exits before the journal is fully written to the main dataset (strictly - "is known to have been written back"), then on next startup, the journal is flushed and the changes become permanent in the main storage.

If the system crashes during write-back, then the changes are still in the journal - it just writes them again on next recovery. The key point is that the journal contains the new state of the data (as blocks) and not diffs. If it were diffs, then it would have to read the old state to calculate the new state. By recording new state only, it can simply keep trying to write until it succeeds regardless of power cycling and crashes. The journal is a sequence of idempotent changes.
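
To make the idempotency point concrete, here is a toy in-memory model in plain Java. It is only a sketch and is not TDB's code: the class BlockJournalSketch, the Entry type and the replay() method are all hypothetical names made up for illustration. Each journal entry holds the complete new contents of one block, so replaying the journal from the start after a simulated crash ends in the same state as one clean replay.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (hypothetical, not TDB code) of a journal holding whole new blocks, not diffs. */
class BlockJournalSketch {

    /** One journal entry: the complete new contents of one block. */
    static final class Entry {
        final int blockId;
        final byte[] newContents;

        Entry(int blockId, byte[] newContents) {
            this.blockId = blockId;
            this.newContents = newContents.clone();
        }
    }

    /**
     * Replays at most 'limit' entries from the start of the journal.
     * A limit smaller than the journal size simulates a crash part-way through.
     */
    static void replay(List<Entry> journal, byte[][] storage, int limit) {
        int n = Math.min(limit, journal.size());
        for (int i = 0; i < n; i++) {
            Entry e = journal.get(i);
            // Plain overwrite: the old contents of the block are never read.
            storage[e.blockId] = e.newContents.clone();
        }
    }

    /** Three empty two-byte blocks standing in for the main storage. */
    static byte[][] initialStorage() {
        return new byte[][] { {0, 0}, {0, 0}, {0, 0} };
    }

    /** Returns true if "crash, then recover twice" ends in the same state as one clean replay. */
    static boolean crashThenRecoverMatchesCleanRun() {
        List<Entry> journal = new ArrayList<>();
        journal.add(new Entry(0, new byte[] {1, 1}));
        journal.add(new Entry(2, new byte[] {3, 3}));
        journal.add(new Entry(0, new byte[] {9, 9})); // a later state of block 0

        byte[][] clean = initialStorage();
        replay(journal, clean, journal.size());

        byte[][] crashed = initialStorage();
        replay(journal, crashed, 2);              // "crash" after two entries
        replay(journal, crashed, journal.size()); // recovery: replay from the start
        replay(journal, crashed, journal.size()); // replaying again changes nothing

        return Arrays.deepEquals(clean, crashed);
    }
}
```

Calling BlockJournalSketch.crashThenRecoverMatchesCleanRun() returns true. If the entries were diffs against the previous block contents, the partial replay followed by a full replay would apply some diffs twice and the two states could differ.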

TDB uses write-ahead logging. There is nothing to do on abort except forget about it. There are no undo actions, no write-behind logging.


Update of the storage is:
  Write log to storage
  sync the storage
  Truncate log to zero.
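
The three steps above can be sketched with plain java.nio file channels. Again this is an illustration under assumptions, not TDB's implementation: the class WriteBackSketch, the fixed 8-byte block size and the record layout (an int block id followed by the block bytes) are invented for the example. Only the FileChannel calls (write, force, truncate) are real library API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Toy write-back (hypothetical layout, not TDB's on-disk format). */
class WriteBackSketch {
    static final int BLOCK_SIZE = 8;                          // tiny, for the demo only
    static final int RECORD_SIZE = Integer.BYTES + BLOCK_SIZE;

    /** Appends one record (block id + full new block contents) to the journal and syncs it. */
    static void appendToJournal(Path journal, int blockId, byte[] block) throws IOException {
        try (FileChannel ch = FileChannel.open(journal, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
            buf.putInt(blockId).put(block, 0, BLOCK_SIZE).flip();
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true);
        }
    }

    /** Write journal blocks to the main file, sync the main file, then truncate the journal. */
    static void writeBack(Path journal, Path main) throws IOException {
        try (FileChannel j = FileChannel.open(journal,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel m = FileChannel.open(main,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer rec = ByteBuffer.allocate(RECORD_SIZE);
            long pos = 0;
            while (pos + RECORD_SIZE <= j.size()) {
                rec.clear();
                while (rec.hasRemaining()) {
                    if (j.read(rec, pos + rec.position()) < 0) {
                        throw new IOException("unexpected end of journal");
                    }
                }
                rec.flip();
                long target = (long) rec.getInt() * BLOCK_SIZE; // rec now holds only the block
                while (rec.hasRemaining()) {
                    target += m.write(rec, target);             // 1. write log to storage
                }
                pos += RECORD_SIZE;
            }
            m.force(true);  // 2. sync the storage
            j.truncate(0);  // 3. truncate log to zero
            j.force(true);
        }
    }

    /** Runs the sequence on temp files; returns main-file contents and journal size. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("wal-sketch");
            Path journal = dir.resolve("journal.bin");
            Path main = dir.resolve("main.bin");
            appendToJournal(journal, 1, "BBBBBBBB".getBytes(StandardCharsets.US_ASCII));
            appendToJournal(journal, 0, "AAAAAAAA".getBytes(StandardCharsets.US_ASCII));
            appendToJournal(journal, 1, "CCCCCCCC".getBytes(StandardCharsets.US_ASCII));
            writeBack(journal, main);
            writeBack(journal, main); // empty journal: nothing left to do
            String data = new String(Files.readAllBytes(main), StandardCharsets.US_ASCII);
            return data + "|" + Files.size(journal);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a crash anywhere before the truncate leaves the journal intact, so the next writeBack() simply repeats the same overwrites. WriteBackSketch.demo() returns the main file contents and the journal length separated by "|".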

It's the truncate that records the fact all transactions have been flushed back to the real dataset.

Which means the app has no shutdown actions to do. Any running transactions implicitly abort.


        Andy
