Hi

I think we should investigate whether it would make sense to implement the
Clerezza APIs on top of the "Kiwi" triple store. This would allow any
Clerezza-based application - including Stanbol - to use this triple
store implementation.

WDYT
Rupert

On Tue, Jul 26, 2011 at 5:54 PM, Sebastian Schaffert
<[email protected]> wrote:
> Dear Florent,
>
> On 26.07.2011 at 16:46, florent andré wrote:
>>
>>>
>>> The dependency to Hibernate is mostly for the triple store, not for CMS 
>>> capabilities. And this is something I don't see how to avoid in the near 
>>> future because we need to store additional information about triples for 
>>> reasoning and versioning.
>>>
>>> Versioning is also of triples, not of content. As such it is probably also 
>>> interesting to the Stanbol community.
>>
>> I'm interested in a short explanation of the way you store the version / 
>> history of triples.
>
> We use a purely relational approach actually:
> - a table "KIWINODE" stores RDF nodes (unified table for literals, blank 
> nodes and resources)
> - a table "TRIPLES" stores triples with id, subject, predicate, object, 
> context, marker for deleted, marker for inferred, timestamp, creator 
> (subject, predicate, object, context, creator are references to KIWINODE)
> - a table "VERSION" stores version ID, timestamp, creator
> - join tables "VERSION_ADDEDNODES", "VERSION_REMOVEDNODES", 
> "VERSION_ADDEDTRIPLES", "VERSION_REMOVEDTRIPLES" store references to added 
> and removed nodes and to added and removed triples; for deleted triples and 
> nodes, the boolean marker will be set to true, for added nodes it will be 
> false
>
> Versioning is thus a simple database operation. "Active" (undeleted) triples 
> can be easily filtered using the boolean marker. Undoing simply means 
> reversing the operations (add and remove) on triples and nodes.
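
The relational layout described above can be sketched roughly as follows.
This is only an illustration using SQLite from Python, not the actual KiWi
schema: the column names (kind, value, deleted, inferred, created), the
column types and the helper functions are assumptions, and the two node
join tables are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE KIWINODE (
    id      INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL,               -- 'uri', 'bnode' or 'literal' (assumed)
    value   TEXT NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0   -- boolean marker
);
CREATE TABLE TRIPLES (
    id        INTEGER PRIMARY KEY,
    subject   INTEGER NOT NULL REFERENCES KIWINODE(id),
    predicate INTEGER NOT NULL REFERENCES KIWINODE(id),
    object    INTEGER NOT NULL REFERENCES KIWINODE(id),
    context   INTEGER REFERENCES KIWINODE(id),
    creator   INTEGER REFERENCES KIWINODE(id),
    deleted   INTEGER NOT NULL DEFAULT 0,  -- marker for deleted
    inferred  INTEGER NOT NULL DEFAULT 0,  -- marker for inferred
    created   TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE VERSION (
    id      INTEGER PRIMARY KEY,
    created TEXT DEFAULT CURRENT_TIMESTAMP,
    creator INTEGER REFERENCES KIWINODE(id)
);
CREATE TABLE VERSION_ADDEDTRIPLES (
    version_id INTEGER REFERENCES VERSION(id),
    triple_id  INTEGER REFERENCES TRIPLES(id)
);
CREATE TABLE VERSION_REMOVEDTRIPLES (
    version_id INTEGER REFERENCES VERSION(id),
    triple_id  INTEGER REFERENCES TRIPLES(id)
);
""")


def node(value, kind="uri"):
    """Insert an RDF node and return its id."""
    cur = conn.execute(
        "INSERT INTO KIWINODE (kind, value) VALUES (?, ?)", (kind, value))
    return cur.lastrowid


def new_version():
    """Open a new version and return its id."""
    return conn.execute("INSERT INTO VERSION DEFAULT VALUES").lastrowid


def add_triple(version, s, p, o):
    """Insert a triple and record it as added in the given version."""
    tid = conn.execute(
        "INSERT INTO TRIPLES (subject, predicate, object) VALUES (?, ?, ?)",
        (s, p, o)).lastrowid
    conn.execute("INSERT INTO VERSION_ADDEDTRIPLES VALUES (?, ?)",
                 (version, tid))
    return tid


def remove_triple(version, tid):
    """Set the deleted marker and record the removal in the given version."""
    conn.execute("UPDATE TRIPLES SET deleted = 1 WHERE id = ?", (tid,))
    conn.execute("INSERT INTO VERSION_REMOVEDTRIPLES VALUES (?, ?)",
                 (version, tid))


def undo(version):
    """Reverse a version: hide what it added, restore what it removed."""
    conn.execute(
        "UPDATE TRIPLES SET deleted = 1 WHERE id IN "
        "(SELECT triple_id FROM VERSION_ADDEDTRIPLES WHERE version_id = ?)",
        (version,))
    conn.execute(
        "UPDATE TRIPLES SET deleted = 0 WHERE id IN "
        "(SELECT triple_id FROM VERSION_REMOVEDTRIPLES WHERE version_id = ?)",
        (version,))


def active_triples():
    """Ids of all triples whose deleted marker is not set."""
    rows = conn.execute("SELECT id FROM TRIPLES WHERE deleted = 0 ORDER BY id")
    return [row[0] for row in rows]
```

In this sketch a triple is never physically removed; removing it and
undoing a version only flip the marker, which is why filtering the
"active" triples stays a plain WHERE clause.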
>
>
>>
>> I have started to think about that (only thinking for now :) ), and about 
>> how big tables (e.g. HBase) could help with this...
>>
>> HBase is a (kind of) 3-dimensional database:
>> - 1 is column
>> - 1 is row
>> - 1 is timestamp
>
I think there is currently a lot of work on how to handle graph
structures in this kind of data store. I am definitely interested in
this topic, but currently I do not have the time to investigate it in
more detail.

> I really don't see the point. A relational database is already n-dimensional 
> ;-)
>

As long as you can handle the number of triples on a single machine, it
is for sure more efficient and easier to implement with a relational
database.
I think there is also a new TripleStore implementation around that
uses Solr/Lucene to store triples. Someone mentioned it in Paris,
but I have forgotten the name of the project.

>
>>
>> So, here is my 100-foot idea:
>> - each triple is a row
>> - ?s, ?p, ?o each a column (or a column family)
>>
>> And so, the history of each triple is stored in the 3rd dimension: timestamps.
>>
>> This could lead to a really clean and easy design... as long as no strong 
>> technical/integration restrictions come up...
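
To make the idea above a bit more concrete, here is a small plain-Python
model of a table with timestamped cells. It does not use HBase or any
HBase client API; the class VersionedTable, the row key "t1" and the
helper triple_at are all made up for illustration.

```python
from collections import defaultdict


class VersionedTable:
    """Toy model of a table whose cells keep every timestamped value."""

    def __init__(self):
        # row key -> column -> list of (timestamp, value), sorted by timestamp
        self.cells = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, ts):
        """Store a new version of a cell at timestamp ts."""
        versions = self.cells[row][column]
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0])

    def get(self, row, column, ts=None):
        """Return the latest value at or before ts (latest overall if None)."""
        versions = self.cells.get(row, {}).get(column, [])
        candidates = [v for t, v in versions if ts is None or t <= ts]
        return candidates[-1] if candidates else None


def triple_at(table, row, ts=None):
    """Read the (s, p, o) columns of one row as they were at timestamp ts."""
    return tuple(table.get(row, col, ts) for col in ("s", "p", "o"))


# Each triple is a row; s, p and o are columns; history lives in timestamps.
table = VersionedTable()
table.put("t1", "s", "ex:alice", ts=1)
table.put("t1", "p", "foaf:name", ts=1)
table.put("t1", "o", "Alice", ts=1)
table.put("t1", "o", "Alice Smith", ts=5)  # later change of the object
```

Reading the row with triple_at(table, "t1", ts=3) gives the triple as it
was before the change, and without a timestamp it gives the current one.
Deletions are not modelled here.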
>
> I am not really convinced, but maybe you can offer some more details and 
> convince me. ;-) I am not familiar with these kinds of databases.
>
> My thought is that relational databases are really well suited for the task 
> because this is what they have been designed for (triples are really purely 
> relational data), with one (minor) exception: expensive join operations 
> happen frequently when querying RDF, and there is almost no chance to 
> materialize them in advance. This can be compensated a bit by proper indexing 
> and configuration of the database, however.
>
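
The join problem mentioned above shows up as soon as a query has more
than one triple pattern. The following is a rough sketch using SQLite
from Python; the table layout, the index names and the sample data are
assumptions for illustration, and plain strings stand in for node
references to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT);
-- Composite indexes so each triple pattern can be answered by an index lookup.
CREATE INDEX idx_pos ON triples (predicate, object, subject);
CREATE INDEX idx_spo ON triples (subject, predicate, object);
""")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:stanbol", "rdf:type", "ex:Project"),
    ("ex:stanbol", "foaf:name", "Stanbol"),
])


def person_names():
    """Names of all persons.

    Corresponds to the two-pattern query
        ?x rdf:type foaf:Person . ?x foaf:name ?name
    where each additional pattern adds one more self-join on the table.
    """
    rows = conn.execute("""
        SELECT t2.object
        FROM triples AS t1
        JOIN triples AS t2 ON t2.subject = t1.subject
        WHERE t1.predicate = 'rdf:type'
          AND t1.object = 'foaf:Person'
          AND t2.predicate = 'foaf:name'
        ORDER BY t2.object
    """)
    return [row[0] for row in rows]
```

Since the patterns are only known at query time, the joins cannot be
materialized in advance; the indexes only make each individual pattern
lookup cheap.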

Yago2 uses a special n-tuple model that includes subject, predicate,
object, temporal, spatial and full text. For spatial and full text
they use the corresponding extensions of the relational database. That
way they can greatly reduce the number of joins for queries on
event-like data.

Again, this discussion is closely related to the work of Fabian on the Factstore!

best
Rupert

-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
