This is very, very excellent and answers all the questions I asked, could
have asked, or should have asked. Jena is as truly ACID as orange juice.
All the points in your response should be made in some form, prominently,
in the documentation.  You figure out where.

Many thanks!
Bernie

On Wed, Apr 4, 2012 at 10:48 AM, Andy Seaborne <[email protected]> wrote:

> On 03/04/12 17:25, Bernie Greenberg wrote:
>
>> While I still haven't got an answer from this list about whether it was
>> really true that one has to close (and not reuse) dataset objects after a
>> committed or aborted "write" transaction, I did get an answer from my
>> code,
>> as it were, and a surprising one, at that.
>>
>> I found that at app shutdown time, I had nothing (with respect to Jena) to
>> do, as I had already, in each thread which had created a dataset in
>> response to a request for some kind of service, closed that dataset. Since
>> the datasets were not drawn from an open "master" object representing the
>> open store, but from a static source, there is no "master" object to close
>> down.
>>
>> If the threads all had their hands on persistent, open dataset objects,
>> which each (according to your documentation and my own experience) can
>> only
>> be used in that one thread, I would have a difficult problem causing those
>> threads (which may be asleep in a web or other server) to wake up to close
>> the pointer (yes, there may be a "thread close time" hook or the like, but
>> as I have it, I don't need one).
>>
>> This all seems consistent with what we have transacted here before and
>> consistent with my understanding of transaction semantics, and seems to
>> work; please let me know if you think I'm overlooking something.
>>
>> Thanks
>> Bernie
>>
>>
> When a (Java in-memory object for a) Dataset is used transactionally, it
> must be used only transactionally.  I think you are only using transactions,
> so there are no issues here.  People using datasets "old world"
> non-transactionally get old-world semantics - they need to be sync'ed.
>
> There's no harm syncing a transactional dataset (it does not do anything).
>
> With transactions, no cleanup after .end() is needed. (A writer calling
> .commit() or .abort() doesn't strictly require .end(), but it's better
> style to always call .end() in a "finally{}" block.)
>
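
The "always call .end() in a finally" style can be sketched as below. This is a minimal, self-contained illustration, not Jena code: TxnDataset, RecordingDataset, writeTxn and the boolean argument to begin() are hypothetical stand-ins, named after the .commit()/.abort()/.end() calls mentioned in the message, so that the example runs without any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transactional dataset (NOT Jena's actual class).
interface TxnDataset {
    void begin(boolean write); // assumption: a flag selects a write transaction
    void commit();
    void abort();
    void end();
}

class TxnPatternSketch {
    // Test double that only records the order of the calls it receives.
    static class RecordingDataset implements TxnDataset {
        final List<String> calls = new ArrayList<String>();
        public void begin(boolean write) { calls.add("begin"); }
        public void commit() { calls.add("commit"); }
        public void abort() { calls.add("abort"); }
        public void end() { calls.add("end"); }
    }

    // Runs one write transaction: commit on success, abort on failure,
    // and end() in every case.
    static void writeTxn(TxnDataset ds, Runnable update) {
        ds.begin(true);
        try {
            update.run();
            ds.commit();
        } catch (RuntimeException e) {
            ds.abort();
            throw e;
        } finally {
            ds.end();
        }
    }

    public static void main(String[] args) {
        RecordingDataset ds = new RecordingDataset();
        writeTxn(ds, new Runnable() {
            public void run() { /* updates would go here */ }
        });
        System.out.println(ds.calls);
    }
}
```

With this shape, a thread that fails halfway through an update still leaves the transaction ended, which matches the point that nothing further needs cleaning up afterwards.
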
> When .commit() happens, the journal is written (append only), with a
> commit record.  The changes are written to the main dataset at some time
> when it's quiet.  That may be when the .commit() happens, or it may not; it
> does not matter, because the bytes are on disk and the change is permanent.
>
> Any transaction starting after the .commit sees the changes, either from
> the real storage or the unflushed transaction state.  The system handles
> that.
>
> If the app exits before the journal is fully written to the main dataset
> (strictly - "is known to have been written back"), then on next startup,
> the journal is flushed and the changes become permanent in the main
> storage.
>
> If the system crashes during write-back, then the changes are still in the
> journal - it just writes them again on next recovery.  The key point is
> that the journal contains the new state of the data (as blocks) and not
> diffs.  If it were diffs, then it would have to read the old state to
> calculate the new state.  By recording new state only, it can simply keep
> trying to write until it succeeds regardless of power cycling and crashes.
>  The journal is a sequence of idempotent changes.
>
> TDB uses write-ahead logging.  There is nothing to do on abort except
> forget about it.  There are no undo actions, no write-behind logging.
>
> Update of the storage is:
>  Write log to storage
>  sync the storage
>  Truncate log to zero.
>
> It's the truncate that records the fact all transactions have been flushed
> back to the real dataset.
>
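
The write-back steps and the "new state, not diffs" point can be illustrated with a toy model. This is a sketch under loud assumptions: JournalSketch, Entry, append and writeBack are hypothetical names, both the journal and the main storage are in-memory collections, and a "crash" is simulated by stopping after a given number of block writes. It is not how TDB is implemented; it only shows why replaying a journal of whole-block states is safe to repeat.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JournalSketch {
    // One journal entry: the complete new contents of a block, not a diff.
    static final class Entry {
        final int blockId;
        final String newContents;
        Entry(int blockId, String newContents) {
            this.blockId = blockId;
            this.newContents = newContents;
        }
    }

    final List<Entry> journal = new ArrayList<Entry>();                    // append-only log
    final Map<Integer, String> mainStorage = new HashMap<Integer, String>(); // the "real" dataset

    // Committing only appends the new block state to the journal.
    void append(int blockId, String newContents) {
        journal.add(new Entry(blockId, newContents));
    }

    // Write-back: copy every journaled block to main storage, then truncate.
    // crashAfter simulates a crash after that many block writes; on a
    // simulated crash the journal is left untouched. Returns true only if
    // the write-back completed and the journal was truncated.
    boolean writeBack(int crashAfter) {
        int written = 0;
        for (Entry e : journal) {
            if (written == crashAfter) {
                return false; // simulated crash: no truncate happened
            }
            mainStorage.put(e.blockId, e.newContents); // overwrite, so repeatable
            written++;
        }
        // A real store would sync main storage here, before truncating.
        journal.clear(); // the truncate records that everything was flushed
        return true;
    }

    public static void main(String[] args) {
        JournalSketch s = new JournalSketch();
        s.mainStorage.put(1, "old-1");
        s.mainStorage.put(2, "old-2");
        s.append(1, "new-1");
        s.append(2, "new-2");

        boolean first = s.writeBack(1);                  // crash after one block
        boolean second = s.writeBack(Integer.MAX_VALUE); // recovery replays everything
        System.out.println(first + " " + second + " " + s.mainStorage);
    }
}
```

Because each entry overwrites a block with its full new contents, block 1 being written twice (once before the simulated crash, once during recovery) leaves the same result as writing it once.
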
> Which means the app has no shutdown actions to do.  Any running
> transactions implicitly abort.
>
>        Andy
