I've embedded answers to your questions below.

Susie


Cutler, Roger (RogerCutler) wrote:

No problem.  Getting back to the main subject of the thread, I'm a
little curious whether you've got some Oracle perspective on this issue.
I understand that new Oracle databases are putting RDF into some sort of
triple-store, but I don't know much about the details.  Some questions
that occur to me, but maybe not exactly the right questions:

- Does the RDF just go in as-is or is it compressed in some way?  If
there is a size factor of something like 15 from the data itself, are
these RDF stores tending to be real bulky?
RDF data is compressed: each distinct node and link value is stored only once, and every later occurrence of that value stores only a reference to the copy already held. Nothing in Oracle RDF inflates the size of the data. RDF is stored in the Oracle Database in an object-relational implementation, which lets users manipulate RDF triples as objects.
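
To make the store-once idea concrete, here is a minimal sketch in Python of dictionary-encoding triples. It only illustrates the general technique and is not Oracle's actual storage format; the class name, method names, and example URIs are all hypothetical.

```python
class TripleStore:
    """Toy store: each distinct value is kept once, triples hold integer references."""

    def __init__(self):
        self.value_ids = {}  # value text -> integer id
        self.values = []     # integer id -> value text (each value stored once)
        self.triples = []    # (subject_id, predicate_id, object_id)

    def _intern(self, value):
        # Return the existing id for a value, or assign the next free id.
        vid = self.value_ids.get(value)
        if vid is None:
            vid = len(self.values)
            self.value_ids[value] = vid
            self.values.append(value)
        return vid

    def add(self, s, p, o):
        # Only three small integers are stored per triple.
        self.triples.append((self._intern(s), self._intern(p), self._intern(o)))

    def decode(self):
        # Rebuild the original text triples from the references.
        return [tuple(self.values[i] for i in t) for t in self.triples]


store = TripleStore()
store.add("http://example.org/alice", "http://example.org/knows", "http://example.org/bob")
store.add("http://example.org/bob", "http://example.org/knows", "http://example.org/carol")
store.add("http://example.org/alice", "http://example.org/knows", "http://example.org/carol")
```

Here three triples contain nine value slots, but only four distinct strings are ever stored; the saving grows with the amount of repetition in the data.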

The RDF Data Model can take advantage of the scalability and performance features in the database, such as indexing, parallelization, memory management, and Real Application Clusters (RAC). It can also work with our image and text management capabilities and with the database security features.

Because some parsing is needed when the data is first loaded, loading may be slower than on some other systems. In return, we get fast query performance.

- Is there some sort of indexing and related join-like function?  If so,
what are the performance characteristics?
There are several indexes built on the internal storage structures. We do perform joins, but these are highly optimized, and our performance figures show that the design performs very well. We have also extended SQL with SPARQL-like query capabilities, so the user does not have to be aware that the data is held in different tables internally.
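
As a rough illustration of how indexes and joins over such internal tables can answer a graph question, here is a small sketch that uses SQLite from Python as a stand-in. The table names, index, and helper functions are hypothetical and do not reflect Oracle's actual schema or query function; the point is only that a two-step traversal becomes a self-join that a wrapper can hide from the user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rdf_value (id INTEGER PRIMARY KEY, text TEXT UNIQUE);  -- each value once
CREATE TABLE rdf_link (s INTEGER, p INTEGER, o INTEGER);            -- triples as ids
CREATE INDEX link_pso ON rdf_link (p, s, o);                        -- supports the join below
""")

def value_id(text):
    # Insert the value if it is new, then return its id.
    conn.execute("INSERT OR IGNORE INTO rdf_value (text) VALUES (?)", (text,))
    return conn.execute("SELECT id FROM rdf_value WHERE text = ?", (text,)).fetchone()[0]

def add_triple(s, p, o):
    conn.execute("INSERT INTO rdf_link VALUES (?, ?, ?)",
                 (value_id(s), value_id(p), value_id(o)))

for triple in [("alice", "knows", "bob"),
               ("bob", "knows", "carol"),
               ("carol", "knows", "dave")]:
    add_triple(*triple)

def friends_of_friends(person):
    # Two-hop traversal: the link table is joined to itself on object = subject.
    rows = conn.execute("""
        SELECT v.text
        FROM rdf_link a
        JOIN rdf_link b ON b.s = a.o AND b.p = a.p
        JOIN rdf_value v ON v.id = b.o
        WHERE a.s = (SELECT id FROM rdf_value WHERE text = ?)
          AND a.p = (SELECT id FROM rdf_value WHERE text = ?)
        ORDER BY v.text
    """, (person, "knows")).fetchall()
    return [r[0] for r in rows]
```

A caller only sees `friends_of_friends("alice")`; the value lookup, the self-join, and the index use stay inside the function.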

As I said, I don't have any experience with the RDF stuff, but some
thoughts based on my experience with relational databases:

- Just because you've got your data in an Oracle (or any other) database
doesn't mean you are going to be able to get at it in a performant
manner.  The devil is in the details.

- Operations that initiate a full read of a Gigabyte database are
extremely painful.

- Big joins can also be extremely painful.  Would traversing a big bunch
of RDF look something like an incredibly complex hairball of complex
joins?  If so, is there a potential problem here?
Yes, the devil is certainly in the details, and big joins are indeed painful. However, the user does not have to write these big joins or worry about the details. The RDF query function provided by Oracle gives the user a simple SQL interface to query the internal tables. The internal operations are highly optimized, and where necessary internal Oracle features have been enhanced. Some of these techniques are described in the VLDB paper by Chong et al. at http://www.oracle.com/technology/tech/semantic_technologies/pdf/vldb_2005.pdf


