On 09/10/16 02:10, Stian Soiland-Reyes (JIRA) wrote:
Stian Soiland-Reyes created COMMONSRDF-45:
---------------------------------------------
Summary: Support longer-running RDF4J transactions?
Key: COMMONSRDF-45
URL: https://issues.apache.org/jira/browse/COMMONSRDF-45
Project: Apache Commons RDF
Issue Type: Wish
Components: rdf4j
Affects Versions: 0.3.0
Reporter: Stian Soiland-Reyes
RDF4J operations like Graph.add() use an internal RepositoryConnection that is
closed on every method call.
c.f. HTTP.
This could cause a performance hit.
This task is to investigate how big that hit is for different backends, and to
propose an alternative that supports longer-running transactions.
For instance, one alternative would be:
{code}
Dataset g = rdf4j.createDataset();
try (TransactionalDataset t = g.begin()) {
    t.add(triple1);
    t.add(triple2);
    t.remove(triple3);
    t.commit();
    // or
    // t.abort();
}
{code}
A Java8 approach:
http://jena.staging.apache.org/documentation/txn/txn.html
It works on anything that is "transactional" rather than being tied to a
dataset (or graph? Actually, enforcing the unit to be a dataset makes sense).
where TransactionalDataset is a subtype of Dataset with a shared connection. Here
modifications in t won't be visible in g before the commit - but I guess we
could expose some of the different transaction isolation levels from RDF4J.
If the goal of CommonsRDF is common functionality across many systems, then
adopting a simple model for transactions would be better than trying to
reconcile all possibilities, both those that exist now and those that may
come along.
This could call abort() if an exception is thrown. Perhaps .commit() would be the default
if everything is OK and so not needed explicitly - but that could cause unnecessary
"empty commits" for read-only transactions (e.g. using .contains() or
.iterate())
Surely that's an implementation issue? - no change => very little work
on commit. Read-only transactions in the implementation have a
faster-path commit (cleanup). In Jena (TIM and TDB), read transactions,
or unpromoted general transactions, have a very low exit cost (often a
ThreadLocal unset). In lock-based systems (TIM and TDB use lock-free
transactions) there might be a bit more work in releasing all the latches.
Txn implicitly commits and copes with explicit abort.
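
To make the "implicit commit, cope with explicit abort" behaviour concrete, here is a minimal sketch of a lambda-style helper. It is not Jena's Txn code and not a Commons RDF API: the Transactional interface, the execute() helper and the toy ListStore are all made-up names for illustration, and the isolation model (copy on begin, swap on commit) is just the simplest thing that works in memory.

```java
import java.util.ArrayList;
import java.util.List;

class TxnSketch {
    /** Hypothetical minimal contract; not a Commons RDF, RDF4J or Jena interface. */
    interface Transactional {
        void begin();
        void commit();
        void abort();
        boolean isInTransaction();
    }

    /** Runs the action in a transaction: abort on exception, otherwise commit. */
    static void execute(Transactional txn, Runnable action) {
        txn.begin();
        try {
            action.run();
        } catch (RuntimeException e) {
            if (txn.isInTransaction()) {
                txn.abort();
            }
            throw e;
        }
        // The action may already have called abort() or commit() itself.
        if (txn.isInTransaction()) {
            txn.commit();
        }
    }

    /** Toy in-memory store: changes go to a working copy until commit. */
    static class ListStore implements Transactional {
        private List<String> committed = new ArrayList<>();
        private List<String> working; // non-null only inside a transaction

        @Override public void begin() { working = new ArrayList<>(committed); }
        @Override public void commit() { committed = working; working = null; }
        @Override public void abort() { working = null; }
        @Override public boolean isInTransaction() { return working != null; }

        void add(String item) { working.add(item); }
        List<String> committed() { return new ArrayList<>(committed); }
    }

    public static void main(String[] args) {
        ListStore store = new ListStore();
        // Implicit commit: no explicit commit() call is needed.
        execute(store, () -> store.add("triple1"));
        // Explicit abort inside the action: the helper must not commit afterwards.
        execute(store, () -> { store.add("triple2"); store.abort(); });
        // Exception: the helper aborts and rethrows.
        try {
            execute(store, () -> {
                store.add("triple3");
                throw new IllegalStateException("failed");
            });
        } catch (IllegalStateException expected) {
            // rolled back by execute()
        }
        System.out.println(store.committed()); // prints [triple1]
    }
}
```

The helper only needs begin/commit/abort plus a way to ask whether a transaction is still open, which is why it can work on anything "transactional" rather than on a dataset type specifically.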
Iteration is fun. It's a great way to leak state out of the transaction
context.
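
A small sketch of what I mean by leaking, using a toy store where reads are only legal inside a transaction. The Store class, calculateRead() and stream() here are hypothetical names for illustration, not an existing Commons RDF or Jena API. A lazy stream handed out of the transaction is only evaluated later, after the transaction has ended; materializing the results inside the transaction avoids that.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterationLeakSketch {
    /** Toy store (hypothetical): reads are only legal inside a transaction. */
    static class Store {
        private final List<String> data = Arrays.asList("triple1", "triple2");
        private boolean inTxn;

        /** Runs the action inside a read transaction and returns its result. */
        <T> T calculateRead(Supplier<T> action) {
            inTxn = true;
            try {
                return action.get();
            } finally {
                inTxn = false;
            }
        }

        /** Lazy stream: each element is checked when it is actually pulled. */
        Stream<String> stream() {
            return data.stream().peek(t -> {
                if (!inTxn) {
                    throw new IllegalStateException("read outside transaction");
                }
            });
        }
    }

    public static void main(String[] args) {
        Store store = new Store();

        // Safe: the results are copied into a list before the transaction ends.
        List<String> copy = store.calculateRead(
                () -> store.stream().collect(Collectors.toList()));
        System.out.println(copy); // prints [triple1, triple2]

        // Leaky: the lazy stream escapes and is consumed after the transaction.
        Stream<String> leaked = store.calculateRead(() -> store.stream());
        try {
            leaked.collect(Collectors.toList());
        } catch (IllegalStateException e) {
            System.out.println("leak detected: " + e.getMessage());
        }
    }
}
```

A real backend may not fail this loudly; it might instead return stale or inconsistent results, which is harder to notice.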
Andy