[jira] [Commented] (JENA-624) Develop a new in-memory RDF Dataset implementation

Andy Seaborne (JIRA) Tue, 01 Dec 2015 06:15:34 -0800

    [ 
https://issues.apache.org/jira/browse/JENA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033730#comment-15033730
 ]


Andy Seaborne commented on JENA-624:
------------------------------------

Re: lock exposure: 

This is more about style. 

I think it would be better to have a private lock for the object's transaction 
control, not a public one.

An application may be using the dataset lock itself for other purposes, so (1) 
the MR+SW policy may be unexpected and (2) a bug in the application may 
over-call {{leaveCriticalSection}}; this would not be caught, but it would 
break the system.  This is analogous to {{synchronized}} methods exposing the 
object's mutex.

One of many discussions of this point:
http://stackoverflow.com/questions/416183/in-java-critical-sections-what-should-i-synchronize-on#416198
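
To illustrate the private-lock style, here is a minimal sketch. It is not Jena 
code: {{PrivateLockStore}} and its methods are hypothetical names, and it uses 
{{java.util.concurrent.locks.ReentrantReadWriteLock}} as a stand-in for an 
MR+SW lock. The point is only that the lock is never handed to callers, so 
application code cannot acquire it for other purposes or release it one time 
too many.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical container, for illustration only; not part of Jena. */
class PrivateLockStore {
    // Private and final: no caller can lock or unlock it directly,
    // so enter/leave calls are always balanced by this class alone.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Set<String> items = new HashSet<>();

    /** Writer path: exclusive access (the "SW" part of MR+SW). */
    boolean add(String item) {
        lock.writeLock().lock();
        try {
            return items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Reader path: may run concurrently with other readers (the "MR" part). */
    boolean contains(String item) {
        lock.readLock().lock();
        try {
            return items.contains(item);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Reader path: number of stored items. */
    int size() {
        lock.readLock().lock();
        try {
            return items.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, the locking policy is an implementation detail: it could be 
changed later without affecting callers, which is not possible once a public 
lock has become part of the API.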

Digressing a bit:

While {{DatasetGraphInMemory}} is one-thread/one-transaction, other systems do 
allow (if you get very complicated) multiple threads per transaction, provided 
that MR+SW applies within the transaction.

This will get hard in Fuseki if/when multi-request transactions get done.  A 
better transaction coordinator will make that easier: e.g. [mantis 
TransactionCoordinator|https://github.com/afs/mantis/blob/master/dboe-transaction/src/main/java/org/seaborne/dboe/transaction/txn/TransactionCoordinator.java]
 but I'm not sure it's all the way there yet.





> Develop a new in-memory RDF Dataset implementation
> --------------------------------------------------
>
>                 Key: JENA-624
>                 URL: https://issues.apache.org/jira/browse/JENA-624
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: A. Soroka
>              Labels: java, linked_data, rdf
>
> The current (Jan 2014) Jena in-memory dataset uses a general purpose 
> container that works for any storage technology for graphs together with 
> in-memory graphs.  
> This project would develop a new implementation design specifically for RDF 
> datasets (triples and quads) and efficient SPARQL execution, for example, 
> using multi-core parallel operations and/or multi-version concurrent 
> datastructures to maximise true parallel operation.
> This is a system project suitable for someone interested in database 
> implementation, datastructure design and implementation, operating systems or 
> distributed systems.
> Note that TDB can operate in-memory using a simulated disk with 
> copy-in/copy-out semantics for disk-level operations.  It is for faithful 
> testing of TDB infrastructure and is not designed for performance, general 
> in-memory use or use at scale.  While lessons may be learnt from that system, TDB 
> in-memory is not the answer here.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
