Alexander Dutton created JENA-629:
-------------------------------------

             Summary: Support on-line rebuilds of a TDB store
                 Key: JENA-629
                 URL: https://issues.apache.org/jira/browse/JENA-629
             Project: Apache Jena
          Issue Type: New Feature
          Components: TDB
            Reporter: Alexander Dutton


TDB should occasionally sync its data into a fresh store and then transparently 
swap over to the new store. This would mean that stores with a lot of churn 
don't grow to excessive sizes.

"Occasionally" could be determined by some (configurable?) heuristic, such as 
"every X triples removed", or when initiated by the user.

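As a rough illustration of the "every X triples removed" heuristic, here is a
minimal Python sketch. The class name ChurnTrigger, its methods and the
threshold parameter are all hypothetical; nothing here is part of the TDB or
Jena API.

```python
class ChurnTrigger:
    """Counts removed triples and reports when a rebuild looks worthwhile.

    Hypothetical helper for illustration only; not part of TDB.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # the configurable "X" in "every X triples removed"
        self.removed = 0            # removals seen since the last rebuild

    def record_removals(self, count=1):
        """Record removals; return True once the threshold has been reached."""
        self.removed += count
        return self.removed >= self.threshold

    def reset(self):
        """Call after a rebuild completes so counting starts again from zero."""
        self.removed = 0
```

A user-initiated rebuild would simply bypass the counter and start the rebuild
directly.
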
My understanding of how TDB works is probably rather sketchy, but I suspect 
it'd be possible to enumerate the triples in the store and pipe them to 
tdbloader for a new store. The process could release the read lock periodically 
so as not to block writes for long periods, but record what happens in those 
writes for replaying onto the new store at the end. Eventually (and soon, in 
non-pathological cases) the two stores would be almost identical and TDB could 
stop writes to the old store, finish replaying any queued writes to the new 
store, and make the switch.
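
The copy-then-replay idea above can be sketched roughly as follows. This is a
toy, single-threaded Python model under stated assumptions: stores are plain
in-memory sets, RebuildableStore and all of its members are hypothetical names,
and the between_batches callback stands in for "the read lock is released here
and writers may run". It is not how TDB or tdbloader actually work.

```python
from collections import deque


class RebuildableStore:
    """Toy in-memory stand-in for a TDB store (hypothetical, not the Jena API)."""

    def __init__(self, triples=()):
        self.live = set(triples)  # the store currently serving reads and writes
        self.write_log = None     # deque of (op, triple), present only during a rebuild

    def add(self, triple):
        self.live.add(triple)
        if self.write_log is not None:
            self.write_log.append(("add", triple))  # remember for replay

    def remove(self, triple):
        self.live.discard(triple)
        if self.write_log is not None:
            self.write_log.append(("remove", triple))  # remember for replay

    def rebuild(self, batch_size=1000, between_batches=None):
        """Copy the triples into a fresh store in batches, then replay logged writes."""
        self.write_log = deque()
        snapshot = list(self.live)  # assumption: a consistent snapshot is available
        fresh = set()
        for start in range(0, len(snapshot), batch_size):
            fresh.update(snapshot[start:start + batch_size])
            if between_batches is not None:
                between_batches(self)  # writers get a chance to run between batches
        # From here on no writes are accepted (implicit in this single-threaded
        # sketch): replay the queued writes in order, then switch stores.
        while self.write_log:
            op, triple = self.write_log.popleft()
            if op == "add":
                fresh.add(triple)
            else:
                fresh.discard(triple)
        self.live = fresh
        self.write_log = None
```

Because the log is replayed in the order the writes arrived, and adding or
removing a triple in a set is idempotent, a triple that was copied and later
removed ends up absent from the new store, and a triple added mid-copy ends up
present. A real implementation would additionally need proper locking or
transactions around the final replay and switch.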



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
