The jena-on-cassandra solution is quite simple. It is an implementation of the graph layer, so it doesn't do the joins directly but lets the higher level do them. There are 4 copies of the data, stored in different orders: gspo, spog, and 2 others that escape my mind at the moment but start with "o" and "p".
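
To make the layout a bit more concrete, here is a rough Python sketch of keeping the same quads in several orderings and picking the table whose leading columns cover the most known values. It is an illustration only, not the jena-on-cassandra code: "gspo" and "spog" are the orderings named above, while "posg" and "ospg" are assumed stand-ins for the two that start with "p" and "o", and QuadStore, choose_table and bound_prefix_len are hypothetical names. In-memory sets take the place of the Cassandra tables, so the prefix check is a plain scan rather than an index lookup.

```python
# Illustrative sketch only; not taken from jena-on-cassandra.
# "gspo" and "spog" are named in the email. "posg" and "ospg" are ASSUMED
# placeholders for the two orderings that start with "p" and "o".
TABLES = ("gspo", "spog", "posg", "ospg")


def bound_prefix_len(table, pattern):
    """Count the leading columns of `table` that have a known value."""
    count = 0
    for column in table:
        if pattern.get(column) is None:
            break
        count += 1
    return count


def choose_table(pattern):
    """Pick the ordering whose leading columns cover the most known values."""
    return max(TABLES, key=lambda table: bound_prefix_len(table, pattern))


class QuadStore:
    """Hypothetical in-memory stand-in for the four Cassandra tables."""

    def __init__(self):
        self.tables = {name: set() for name in TABLES}

    def insert(self, g, s, p, o):
        # Every insert is written to all tables, one row per ordering.
        quad = {"g": g, "s": s, "p": p, "o": o}
        for name, rows in self.tables.items():
            rows.add(tuple(quad[column] for column in name))

    def find(self, g=None, s=None, p=None, o=None):
        # None means "unknown"; known values drive the table choice.
        pattern = {"g": g, "s": s, "p": p, "o": o}
        name = choose_table(pattern)
        k = bound_prefix_len(name, pattern)
        prefix = tuple(pattern[column] for column in name[:k])
        results = []
        for row in self.tables[name]:
            if row[:k] != prefix:
                continue  # a real store would use the index instead of scanning
            quad = dict(zip(name, row))
            # Filter on any known values that the chosen prefix did not cover.
            if all(v is None or quad[c] == v for c, v in pattern.items()):
                results.append((quad["g"], quad["s"], quad["p"], quad["o"]))
        return sorted(results)
```

For example, a lookup with only the subject known would go to the table that starts with "s", and a lookup with graph and object known would use the "g" table and then filter on the object.
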

The tables are "indexed" by their first segments. The system looks at the known values and finds the table with the best index to solve the query; it then performs the query and any filtering necessary to return the results.

Inserts are written into all the tables (as would be expected). Deletes are done on a separate thread (eventual consistency, after all). It uses the standard model-on-graph to create a model. Much of the work was really to understand how Cassandra does its indexing and how to do deletions.

As a final note, the Object field is stored in several formats (URI, numeric value [if appropriate], string value, and perhaps one other, I forget just now), so when finding a value it uses the proper value index.

All a bit tricky, but it seems to work. I would be glad to spend some time with you going over the design and design decisions if you wish.

Claude

On Mon, Sep 4, 2017 at 12:10 PM, <[email protected]> wrote:

> Little of both? :grin:
>
> Primarily I am interested because of a grant [1] in which the Smithsonian
> Institution (where I work) is participating in a supporting role (partly
> because I convinced us to). That work involves using Cassandra for
> distributed storage, and it will also involve a distributed LDP
> implementation (the Fedora API referred to in that grant description is
> really just a packaging of Memento [2] with LDP [3]), hence my interest in
> jena-on-cassandra.
>
> As I understand the join question, the usual move with Cassandra is to
> denormalize and store the joined data together, but that's obviously
> nontrivial in our situation, where we don't know the potential queries.
> Have you looked at an indexing solution such as was used by CumulusRDF [4]?
>
> ajs6f
>
> [1] https://www.imls.gov/grants/awarded/lg-71-17-0159-17
> [2] http://www.mementoweb.org/guide/quick-intro/
> [3] https://www.w3.org/TR/ldp/
> [4] http://iswc2011.semanticweb.org/fileadmin/iswc/Papers/Workshops/SSWS/Ladwig-et-all-SSWS2011.pdf
>
> Claude Warren wrote on 9/2/17 12:44 PM:
>
>> are you looking to use jena-on-cassandra or do you have ideas? what leads
>> you to ask about it?
>>
>>
>> On Sat, Sep 2, 2017 at 1:21 PM, <[email protected]> wrote:
>>
>>> Hey, Claude--
>>>
>>> Just curious as to where https://github.com/Claudenw/jena-on-cassandra
>>> has ended up. Is that still work-in-progress?
>>>
>>> --
>>>
>>> ajs6f
>>>
>>>
>>
>>

--
I like: Like Like - The likeliest place on the web <http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
