Simon Helsen created JENA-327:
---------------------------------

             Summary: TDB Tx transaction lock to permit backups
                 Key: JENA-327
                 URL: https://issues.apache.org/jira/browse/JENA-327
             Project: Apache Jena
          Issue Type: Improvement
          Components: TDB
    Affects Versions: TDB 0.9.4
            Reporter: Simon Helsen


With large repositories, it is important to be able to create backups from time 
to time, because recreating an RDF store with millions of triples can be 
prohibitively expensive. Moreover, it should be possible to take those backups 
while still allowing read activity on the store, since in many cases a complete 
shutdown is not possible. Before the introduction of TDB Tx, it was relatively 
straightforward to provide the right locks on the client side to safely suspend 
all disk activity for long enough to make a backup of the index. 

However, since TDB Tx, things have become more complicated because TDB Tx 
touches the disk at times other than when performing write/sync activities. 
Right now, with some understanding of how TDB Tx is implemented, clients can 
still avoid disk activity while a backup runs, but this dependency on TDB Tx 
implementation details is fragile. Moreover, we anticipate that in the future 
the merging process from the journal into the main index may become entirely 
asynchronous for performance reasons. The moment that happens, clients no 
longer have any control over when the disk is touched.

For this reason, we are requesting the following feature: a "backup" lock (for 
lack of a better name). Its semantics are that while the lock is held, TDB Tx 
guarantees that no disk activity takes place, pausing activities if necessary. 
In other words, no write transaction should be able to complete and read 
transactions will not attempt to merge the journal. Regular read activity can 
still continue. The API could be as simple as this:

// ReadWrite.BACKUP is the proposed new mode; it does not exist yet
dataset.begin(ReadWrite.BACKUP) ;
try {
    <do whatever is necessary to backup the index>
} finally {
    dataset.end() ;
}

As for the implementation, we suspect you already have locks in place that 
could be used to guarantee this behavior. For example, would 
txn.getBaseDataset().getLock().enterCriticalSection(Lock.WRITE) be sufficient?
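
To make the requested semantics concrete, here is a minimal, self-contained 
sketch that models them with a plain JDK lock rather than with TDB itself. 
Everything in it is hypothetical: the class BackupLockSketch, its methods, and 
the idea of a single "disk gate" are illustration only and say nothing about 
how TDB Tx is or should be implemented. Disk-touching work (a write commit or 
a journal merge) takes the shared side of the gate, the backup takes the 
exclusive side, and in-memory reads would not take the gate at all.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.StampedLock;

/** Hypothetical model of a "backup" lock; not part of the TDB API. */
final class BackupLockSketch {
    // Disk-touching operations share the read side of the gate;
    // a backup holds the write side exclusively.
    private final StampedLock diskGate = new StampedLock();
    private final AtomicInteger diskOps = new AtomicInteger();

    /** Waits for in-flight disk activity to finish, then blocks new activity. */
    long beginBackup() {
        return diskGate.writeLock();
    }

    /** Releases the gate; the stamp must be the one returned by beginBackup(). */
    void endBackup(long stamp) {
        diskGate.unlockWrite(stamp);
    }

    /**
     * Stand-in for a write commit or journal merge.
     * Returns false without touching the "disk" if a backup is in progress.
     */
    boolean tryDiskActivity() {
        long stamp = diskGate.tryReadLock();
        if (stamp == 0L) {
            return false; // backup holds the gate; caller should retry later
        }
        try {
            diskOps.incrementAndGet(); // placeholder for the real disk write
            return true;
        } finally {
            diskGate.unlockRead(stamp);
        }
    }

    /** Number of simulated disk operations that were allowed to run. */
    int diskOps() {
        return diskOps.get();
    }

    public static void main(String[] args) {
        BackupLockSketch store = new BackupLockSketch();
        System.out.println("before backup: " + store.tryDiskActivity());
        long stamp = store.beginBackup();
        try {
            // The index files could be copied here; disk activity is refused.
            System.out.println("during backup: " + store.tryDiskActivity());
        } finally {
            store.endBackup(stamp);
        }
        System.out.println("after backup: " + store.tryDiskActivity());
    }
}
```

In this model a refused operation simply returns false; a real implementation 
would more likely block or queue the commit until the backup ends, which is the 
"pauses activities" behavior described above.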

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
