Re: Jena over Cassandra?

Claude Warren Mon, 04 Sep 2017 10:12:03 -0700

Actually, looking at the code, it is a dataset graph that the Cassandra code
is built on.


On Mon, Sep 4, 2017 at 5:17 PM, Claude Warren <cla...@xenei.com> wrote:

> The jena-on-cassandra solution is quite simple.  It is an implementation
> of the graph layer, so it doesn't do the joins directly but lets the higher
> level do them.  There are 4 copies of the data, stored in different orders:
> gspo, spog and 2 others that escape my mind at the moment but start with
> "o" and "p".
>
> The tables are "indexed" by their first segments.  The system looks at the
> known values and finds the table with the best index to solve the query.  It
> then performs the query and any filtering necessary to return the
> results.
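
A rough sketch of the table selection described above, in plain Python
rather than the project's Java. It is not taken from the jena-on-cassandra
code. The orderings "ospg" and "pogs" are placeholders, since the message
only says the other two tables start with "o" and "p"; the function names
are hypothetical as well. The assumption is that the "best" table is the
one whose leading columns are all bound in the query pattern.

```python
from typing import Dict, List, Optional

# Column orderings, one table per ordering. "gspo" and "spog" are named in
# the message; "ospg" and "pogs" are placeholders chosen for illustration.
TABLES = ["gspo", "spog", "ospg", "pogs"]

Pattern = Dict[str, Optional[str]]  # keys "g", "s", "p", "o"; None = wildcard


def bound_prefix_length(table: str, pattern: Pattern) -> int:
    """Count how many leading columns of `table` are bound in `pattern`."""
    count = 0
    for column in table:
        if pattern.get(column) is None:
            break
        count += 1
    return count


def choose_table(pattern: Pattern) -> str:
    """Pick the table with the longest bound prefix (first one wins ties)."""
    return max(TABLES, key=lambda table: bound_prefix_length(table, pattern))


def residual_columns(table: str, pattern: Pattern) -> List[str]:
    """Bound columns the index prefix cannot use; these need filtering."""
    used = bound_prefix_length(table, pattern)
    return [c for c in table[used:] if pattern.get(c) is not None]
```

With subject and predicate known, choose_table picks "spog" and nothing is
left to filter. With only graph and predicate known, it picks "gspo" and
the predicate has to be applied as a filter on the rows returned.
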
>
> Inserts are written into all the tables (as would be expected)
>
> Deletes are done on a separate thread (eventual consistency after all).
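
To make the insert and delete behaviour above concrete, here is a small
in-memory sketch in plain Python. It is an illustration only, not the
project's code: the QuadStore class, its method names and the "ospg" and
"pogs" orderings are all assumptions. Inserts go to every table
synchronously, while deletes are queued and applied by a background
thread, so a deleted quad may remain visible for a short time.

```python
import queue
import threading
from typing import Dict, Set, Tuple

# "gspo" and "spog" are named in the message; the other two are placeholders.
TABLES = ["gspo", "spog", "ospg", "pogs"]


class QuadStore:
    """Hypothetical in-memory stand-in for the four Cassandra tables."""

    def __init__(self) -> None:
        self.tables: Dict[str, Set[Tuple[str, ...]]] = {t: set() for t in TABLES}
        self._lock = threading.Lock()
        self._deletes: "queue.Queue[Dict[str, str]]" = queue.Queue()
        worker = threading.Thread(target=self._drain_deletes, daemon=True)
        worker.start()

    @staticmethod
    def _row(table: str, quad: Dict[str, str]) -> Tuple[str, ...]:
        # Reorder the quad's fields to match the table's column order.
        return tuple(quad[column] for column in table)

    def insert(self, quad: Dict[str, str]) -> None:
        # Write the quad into every table before returning.
        with self._lock:
            for name, rows in self.tables.items():
                rows.add(self._row(name, quad))

    def delete(self, quad: Dict[str, str]) -> None:
        # Only enqueue; the background thread removes the rows later.
        self._deletes.put(quad)

    def _drain_deletes(self) -> None:
        while True:
            quad = self._deletes.get()
            with self._lock:
                for name, rows in self.tables.items():
                    rows.discard(self._row(name, quad))
            self._deletes.task_done()

    def flush(self) -> None:
        # Block until every queued delete has been applied.
        self._deletes.join()
```

The flush method exists only so that a caller (or a test) can wait for the
queue to drain; nothing in the message suggests the real code has one.
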
>
> It uses the standard model-on-graph to create a model.
>
> Much of the work was really to understand how Cassandra does its indexing
> and how to do deletions.
>
> As a final note, the Object field is stored in several formats (URI,
> numeric value [if appropriate], string value and perhaps one other, I
> forget just now).  So when finding a value it uses the proper value index.
> All a bit tricky but it seems to work.
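
One way to picture the multi-format Object storage described above, again
as a plain Python sketch rather than anything from the actual code. The
column names obj_uri, obj_num and obj_str and both function names are made
up for illustration, and the possible fourth format mentioned in the
message is left out because it is not specified.

```python
from typing import Dict, Optional, Union


def _as_number(lexical: str) -> Optional[float]:
    """Return the numeric value of a literal, or None if it is not numeric."""
    try:
        return float(lexical)
    except ValueError:
        return None


def object_columns(lexical: str, is_uri: bool) -> Dict[str, Union[str, float, None]]:
    """Spread one object value across the (hypothetical) typed columns."""
    columns: Dict[str, Union[str, float, None]] = {
        "obj_uri": None,
        "obj_num": None,
        "obj_str": None,
    }
    if is_uri:
        columns["obj_uri"] = lexical
        return columns
    # Literals always keep their string form; numeric ones also get a number.
    columns["obj_str"] = lexical
    columns["obj_num"] = _as_number(lexical)
    return columns


def lookup_column(lexical: str, is_uri: bool) -> str:
    """Choose which typed column a query on this object value should use."""
    if is_uri:
        return "obj_uri"
    if _as_number(lexical) is not None:
        return "obj_num"
    return "obj_str"
```

Storing the numeric form next to the string form is what would let a query
for the value 42 match on the numeric column instead of comparing strings.
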
>
> I would be glad to spend some time with you going over the design and
> design decisions if you wish.
>
> Claude
>
> On Mon, Sep 4, 2017 at 12:10 PM, <aj...@apache.org> wrote:
>
>> Little of both? :grin:
>>
>> Primarily I am interested because of a grant [1] in which the Smithsonian
>> Institution (where I work) is participating in a supporting role (partly
>> because I convinced us to). That work involves using Cassandra for
>> distributed storage, and it will also involve a distributed LDP
>> implementation (the Fedora API referred to in that grant description is
>> really just a packaging of Memento [2] with LDP [3]), hence my interest in
>> jena-on-cassandra.
>>
>> As I understand the join question, the usual move with Cassandra is to
>> denormalize and store the joined data together, but that's obviously
>> nontrivial in our situation, where we don't know the potential queries.
>> Have you looked at an indexing solution such as was used by CumulusRDF [4]?
>>
>> ajs6f
>>
>> [1] https://www.imls.gov/grants/awarded/lg-71-17-0159-17
>> [2] http://www.mementoweb.org/guide/quick-intro/
>> [3] https://www.w3.org/TR/ldp/
>> [4] http://iswc2011.semanticweb.org/fileadmin/iswc/Papers/Workshops/SSWS/Ladwig-et-all-SSWS2011.pdf
>>
>> Claude Warren wrote on 9/2/17 12:44 PM:
>>
>> are you looking to use jena-on-cassandra or do you have ideas?  what leads
>>> you to ask about it?
>>>
>>>
>>> On Sat, Sep 2, 2017 at 1:21 PM, <aj...@apache.org> wrote:
>>>
>>> Hey, Claude--
>>>>
>>>> Just curious as to where https://github.com/Claudenw/jena-on-cassandra
>>>> has ended up. Is that still work-in-progress?
>>>>
>>>> --
>>>>
>>>> ajs6f
>>>>
>>>>
>>>
>>>
>>>
>
>
> --
> I like: Like Like - The likeliest place on the web
> <http://like-like.xenei.com>
> LinkedIn: http://www.linkedin.com/in/claudewarren
>



-- 
I like: Like Like - The likeliest place on the web
<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
