I'm going to give it a go sometime soon and report back on my non-scientific findings. Your point about the small number of columns is well made, but the research paper cited earlier also mentions this. It reports that, because of column-store optimisations, they still saw a good improvement even when they vertically partitioned their data rather than using a property-table approach. However, again, I'm no column store expert, so perhaps I'm missing some point here :-). Anyway, time to "suck it and see", all in the name of progress of course.

On 03/09/11 16:29, David Jordan wrote:
I have not used a column-oriented database, but I am somewhat familiar with 
them. My understanding of them is that the storage is partitioned on a column 
basis, such that there is no physical clustering together of all the columns 
for a given row. An advantage of this would be in the case where you have 
tables with many columns, but the particular application only needs a small 
subset of columns.

With the SDB representation of triples (3 columns) and quads (4 columns), and 
access typically based on having a specific value for one or two of the 
columns, I am not so sure that a column-based approach would offer any 
advantage.
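
To make the layout difference concrete, here is a minimal in-memory sketch in Java. It is a toy and says nothing about how MonetDB or SDB store data internally; the class and method names are hypothetical. It only shows that a row layout keeps all columns of a row together, while a column layout lets a lookup on one column scan that column's array alone.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not MonetDB or SDB code.
class ColumnLayoutSketch {

    // Row-oriented: each inner array holds all columns of one row,
    // so a scan on one column still walks over whole rows.
    static List<Integer> rowsMatching(long[][] rows, int column, long value) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            if (rows[i][column] == value) {
                hits.add(i);
            }
        }
        return hits;
    }

    // Column-oriented: one array per column, so a scan touches
    // only the values of the column being tested.
    static List<Integer> positionsMatching(long[] columnValues, long value) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < columnValues.length; i++) {
            if (columnValues[i] == value) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Three triples stored as (s, p, o) ids, first row-wise...
        long[][] rows = { {1, 10, 100}, {2, 10, 200}, {3, 20, 100} };
        // ...then the predicate column on its own.
        long[] predicateColumn = { 10, 10, 20 };

        System.out.println(rowsMatching(rows, 1, 10));
        System.out.println(positionsMatching(predicateColumn, 10));
    }
}
```

With only three or four narrow columns, the two scans read almost the same amount of data, which is the doubt raised above.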

But again, I am no expert on these types of databases.

These discussions about alternative datastore representations of RDF/OWL data are 
very useful for gaining a better understanding of which data architectures yield the 
best implementation approach for high performance.

P.S. If MonetDB provides support for JDBC, I would not think much effort is needed 
to support it with SDB.

-----Original Message-----
From: nat lu [mailto:[email protected]]
Sent: Saturday, September 03, 2011 6:59 AM
To: [email protected]
Subject: Re: Adding support for MonetDB

1) Has anyone tried using MonetDB just as a JDBC source for SDB? I suppose the DDL 
Jena uses to create the normalized schema may need adjusting to suit MonetDB's 
SQL flavour, but it should work well enough to try it out and do some 
gap analysis, right?

2) It also seems reasonable that a SPARQL front end (top of its 3-layer
stack) could be created in MAL to augment the existing SQL and XQuery modules. 
I've also seen some talk in newsgroups of an RDF module that is under 
development/experimental at this stage.

I'm interested to see how an SDB column store will perform with my small 
dataset compared to an RDBMS-backed SDB instance.


On 02/09/11 10:11, Paolo Castagna wrote:
Andy Seaborne wrote:
It seems to use a non-normalized table design (as did the CStore paper)
and rely on indexing.  It would be interesting to see how that compares
with a normalized design, which is what most RDF stores use.  When
normalized, the joins for patterns are on fixed size numbers, not string
comparisons, and don't use secondary indexes (and at least one uses
bitmap indexes).

SDB is built around a normalized design, and portable across various
different SQL DBs.  The main triple table (or quad table) is 3 columns
of 4 or 8 bytes (depending on hash vs index allocation of internal ids).
That means the main table is a thin, long table.
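
A minimal sketch of the normalized idea, as a toy in-memory Java class rather than SQL tables. The names are hypothetical and are not SDB's API. Each RDF term is stored once and given a numeric id, and the triple "table" holds only three ids per row, so pattern matching compares numbers instead of strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not the SDB schema or API.
class NormalizedTriples {
    // "Nodes" table: every distinct term appears once, keyed by a numeric id.
    private final Map<String, Long> idByTerm = new HashMap<>();
    private final List<String> termById = new ArrayList<>();

    // "Triples" table: thin and long, three fixed-size ids per row.
    private final List<long[]> triples = new ArrayList<>();

    // Return the id of a term, allocating the next index on first sight.
    long idFor(String term) {
        Long existing = idByTerm.get(term);
        if (existing != null) {
            return existing;
        }
        long id = termById.size();
        idByTerm.put(term, id);
        termById.add(term);
        return id;
    }

    void add(String s, String p, String o) {
        triples.add(new long[] { idFor(s), idFor(p), idFor(o) });
    }

    // Match the pattern (?s, p, o). The strings are resolved to ids once;
    // after that, every row test is a comparison of numbers.
    List<String> subjectsWith(String p, String o) {
        List<String> result = new ArrayList<>();
        Long pid = idByTerm.get(p);
        Long oid = idByTerm.get(o);
        if (pid == null || oid == null) {
            return result; // an unknown term cannot match any row
        }
        long pv = pid;
        long ov = oid;
        for (long[] row : triples) {
            if (row[1] == pv && row[2] == ov) {
                result.add(termById.get((int) row[0]));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NormalizedTriples store = new NormalizedTriples();
        store.add("ex:a", "rdf:type", "ex:Person");
        store.add("ex:b", "rdf:type", "ex:Person");
        store.add("ex:a", "ex:knows", "ex:b");
        System.out.println(store.subjectsWith("rdf:type", "ex:Person"));
    }
}
```

This sketch allocates ids by index; the hash variant mentioned above would derive the id from the term itself instead.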

If you want a deep integration of SPARQL and MonetDB, the first thing to
consider, if pure speed is the objective, is what schema design is best
for MonetDB.

The reported tests are on a dataset of 50 million triples which is not
particularly big - it means RAM caching is going to play a significant
role nowadays (they only used a one-CPU, 4G machine).
      Andy


On 01/09/11 19:49, nat lu wrote:
Speed, I think! At least in some use cases. (But I'm no MonetDB expert.)
Here's an interesting article that tries a few of them out with MonetDB
and CStore.

http://oai.cwi.nl/oai/asset/13806/13806B.pdf


What's missing, I believe, is the SPARQL endpoint integration.

On 01/09/11 08:36, Andy Seaborne wrote:
Tobias,

To turn the question round - what unique features of MonetDB could be
exposed through SDB to yield a better RDF store?

The current design covers the current set of databases supported but
it's not fixed - maybe there is something especially useful in MonetDB
and maybe it needs an extension to the design. The design is based on
templating via all those classes (the instance for each database is a
small class). The SQL generator usually needs some per-DB work because
SQL databases aren't exactly very "standard".
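
A rough sketch of what "templating with a small class per database" can look like in Java. All names here are hypothetical and are not the SDB classes. The two subclasses stand for two styles of row-limit syntax; which real database uses which style is deliberately not asserted.

```java
// Illustrative only: hypothetical names, not the actual SDB class hierarchy.
abstract class SqlDialectSketch {

    // Shared template: the statement shape is the same for every database.
    String createTriplesTable() {
        String t = idColumnType();
        return "CREATE TABLE Triples (s " + t + " NOT NULL, p " + t
                + " NOT NULL, o " + t + " NOT NULL)";
    }

    // Shared template with one database-specific hole for the row limit.
    String selectFirst(String table, int n) {
        return "SELECT * FROM " + table + " " + limitClause(n);
    }

    // The per-database pieces each small subclass has to supply.
    abstract String idColumnType();

    abstract String limitClause(int n);
}

// One small class per dialect: only the differences live here.
class FetchFirstDialect extends SqlDialectSketch {
    String idColumnType() { return "BIGINT"; }
    String limitClause(int n) { return "FETCH FIRST " + n + " ROWS ONLY"; }
}

class LimitDialect extends SqlDialectSketch {
    String idColumnType() { return "BIGINT"; }
    String limitClause(int n) { return "LIMIT " + n; }
}

class SqlDialectDemo {
    public static void main(String[] args) {
        SqlDialectSketch[] dialects = { new FetchFirstDialect(), new LimitDialect() };
        for (SqlDialectSketch d : dialects) {
            System.out.println(d.createTriplesTable());
            System.out.println(d.selectFirst("Triples", 10));
        }
    }
}
```

Under this shape, supporting another database would mostly mean writing one more small subclass, plus whatever the shared templates turn out not to cover.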

Andy

On 25/08/11 21:28, Paolo Castagna wrote:
Hi Tobias,
first of all, welcome to the jena-users mailing list.

Tobias Willig wrote:
Hi everyone,

I would like to add support for MonetDB in SDB and I have two questions
concerning this project:

1. How much effort does it take to add support for a new database type?
It requires some effort. You need to add new Java classes and the
necessary tests.

Have you checked out the SDB sources yet?

If not:

svn co https://svn.apache.org/repos/asf/incubator/jena/Jena2/SDB/trunk/ SDB
cd SDB
mvn test

You can use Eclipse and search for "Derby", which is one of the DBMSs supported
by SDB. This way you'll find the list of Java classes in SDB that support Derby.
Then you can read and study those classes. While you do that, you'll learn the
design of SDB and you will get an idea of what is required to add MonetDB.

2. Are there predefined extension points that allow adding a new database
type easily?
Yes, there are. Look at the superclasses and interfaces from the list of
classes above (i.e. searching for "Derby").

There isn't a "how to add a new database to SDB" guide; however, make sure
you read the general SDB documentation (it does not hurt).

Also, if you want to contribute a "how to add a new database to SDB" guide, you
are more than welcome to do so.

If so, could you give me the names of some classes and config files which
are important to accomplish that task?
You can start from:

GenerateSQLDerby -- extends -->   GenerateSQL
FormatterSimpleDerby -- extends -->   FormatterSimple
StoreSimpleDerby -- extends -->   StoreBase1
FmtLayout2HashDerby -- extends -->   FmtLayout2
StoreTriplesNodesHashDerby -- extends -->   StoreBaseHash
TupleLoaderHashDerby -- extends -->   TupleLoaderHashBase
FmtLayout2IndexDerby -- extends -->   FmtLayout2HashDerby
StoreTriplesNodesIndexDerby -- extends -->   StoreBaseIndex
TupleLoaderIndexDerby -- extends -->   TupleLoaderIndexBase

Also look at:

- JDBC.java
- DatabaseType.java
- StoreFactory.java
- SDB.java

And the existing tests.

May I ask you what motivates you in adding MonetDB?

I've never used it myself and, indeed, I use TDB instead of SDB.
Transactions are coming, hopefully in the 0.9.0 release of TDB.

Last but not least, it's not only about the code. You should be willing
to support the users of your code too. Once you add support for MonetDB,
people will start using it and, as they use your code, they'll find bugs
and they'll eventually ask you for more features. You should be willing
to put some effort into fixing the bugs, at least... and you can always say
"no" politely to new features, until someone else who really needs the
new feature, and is willing to put some effort in, takes over and
pushes the software a step further.

Once you start, you might have more specific questions on SDB design.
You can post your questions here or on the [email protected]
mailing list. The more you demonstrate that you have put in some effort, the more
likely you are to receive helpful answers back from the SDB developers.
If you don't put in enough effort and expect others to do it for you,
I am afraid you risk not getting much back.

To conclude, if you are motivated and you think you'll have fun doing
it (and you need it for your work): go ahead, it's not a terribly huge
task and it could be a nice contribution (in particular for all the
MonetDB users out there).

Paolo

Thanks in advance!

Best Regards
Tobias Willig



