Hi,
Had a quick look - the patch seems to want to delete (in entirety) a
couple of existing SDB classes which I had made minimal modifications
to, so I've re-checked out SDB 1.3.4 and created a new patch which I can
upload later. So the first attachment in Jira shouldn't be used.
Secondly, I haven't recreated all the unit test code that exists for
the other RDBMSs supported by SDB because, by the looks of it, there's
quite a bit of it, and I haven't had time so far.
I've also, as I said before, not used Monet with any data yet, just got
it to the point that SDBConfig doesn't fall over, and the Jira issue
raised is simply to record the desire to support it (and perhaps other
column stores if conceptually they prove to have merit when used as an
RDF repo).
So, personally, I'd prefer (and I assume you all would too) that nothing
happens too quickly with this, until I or someone else gets some time to
do some work and produce the unit tests.
On 11/10/11 08:07, Paolo Castagna wrote:
Hi,
first of all, thank you for your patch.
I had a quick look, but I did not try to apply it (yet).
May I ask how you created your patch?
We added a section on the Getting Involved page on the Jena website:
"Patches should be attached to issues in Jira (click on
More Actions > Attach Files). To create a patch you can simply
use the command:
svn diff > JENA-XYZ.patch
Please, inspect your patch and make sure it includes all (and only)
the relevant changes for a single issue. Don't forget tests! If you
want to test if a patch applies cleanly you can use:
patch -p0 < JENA-XYZ.patch
If you use Eclipse: right click on the project name in Package Explorer,
select Team > Create Patch or Team > Apply Patch."
- http://jena.staging.apache.org/jena/getting_involved/#submit_your_patches
It really helps if a patch contains only the lines you added|removed
and it applies cleanly. It saves a lot of time and speeds up the
review.
Your patch may be perfectly fine, but I wanted to take the opportunity
to send the message across.
I am really curious to run a few benchmarks when it's done to compare
MonetDB with a more traditional SQL system.
By the way, about benchmarks, Andy is (secretly) working on this:
https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/
I have not had time to try it yet, but it seems very interesting. :-)
Thank you again for the new interesting feature.
Paolo
nat lu wrote:
Added, with patch file, at
https://issues.apache.org/jira/browse/JENA-134
I have made no more progress on testing it out other than sdbconfig so
far, hope to soon.
On 09/09/11 16:43, Paolo Castagna wrote:
Hi,
why don't you open a new JIRA issue (as a New Feature) for this?
https://issues.apache.org/jira/browse/JENA
You can then attach a patch to it. This way others can look at what
you have done so far (and maybe help you out).
Thanks for your help,
Paolo
nat lu wrote:
I made a start, and tried to use one of the existing flavours, but ended
up creating one for MonetDB, a combination of the Derby and DB2 ones. It
doesn't like longs or unbounded varchars.
So, I got as far as getting SDBConfig to complete, but haven't done an
sdbload yet.
On 09/09/11 10:37, Andy Seaborne wrote:
On 04/09/11 13:03, nat lu wrote:
I'm going to give it a go sometime soon and report back on my
non-scientific findings. Your point about the small number of columns is
well made, but the research paper cited earlier also mentions this and
reports that, because of column store optimisations, they still saw good
improvement even when they vertically partitioned their data rather than
using a property-table approach. However, again, I'm no column store
expert so perhaps I'm missing some point here :-). Anyway, time to
"suck it and see", all in the name of progress of course.
On 03/09/11 16:29, David Jordan wrote:
I have not used a column-oriented database, but I am somewhat familiar
with them. My understanding of them is that the storage is partitioned
on a column basis, such that there is no physical clustering together
of all the columns for a given row. An advantage of this would be in
the case where you have tables with many columns, but the particular
application only needs a small subset of columns.
With the SDB representation of triples (3 columns) and quads (4
columns), and access typically based on having a specific value for
one or two of the columns, I am not so sure that a column-based
approach would offer any advantage.
But again, I am no expert on these types of databases.
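To make the row-versus-column point above concrete, here is a minimal Python sketch. It is only an illustration: the quad data, the column names (g, s, p, o) and both helper functions are made up, and it says nothing about how MonetDB or SDB actually store data.

```python
# Toy quad table; the values are made up for illustration.
rows = [
    ("g1", "s1", "p1", "o1"),
    ("g1", "s1", "p2", "o2"),
    ("g1", "s2", "p1", "o3"),
]

# Row-oriented layout: the fields of each record are stored together.
row_store = list(rows)

# Column-oriented layout: each column is its own array, and a record
# is reassembled by position.
column_store = {
    name: [r[i] for r in rows]
    for i, name in enumerate(("g", "s", "p", "o"))
}


def objects_for_row_store(store, s, p):
    """Scan whole records, touching all four fields of each one."""
    return [o for (_, rs, rp, o) in store if rs == s and rp == p]


def objects_for_column_store(store, s, p):
    """Scan only the s and p columns, then fetch the matching o values."""
    hits = [
        i
        for i, (cs, cp) in enumerate(zip(store["s"], store["p"]))
        if cs == s and cp == p
    ]
    return [store["o"][i] for i in hits]


print(objects_for_row_store(row_store, "s1", "p1"))
print(objects_for_column_store(column_store, "s1", "p1"))
```

In this toy lookup the column layout avoids reading only the g column, one of four, which is the concern raised above: with three or four columns there is little to skip.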
These discussions about alternative datastore representations of RDF/OWL
data are very useful for gaining a better understanding of which data
architectures yield the best implementation approach for high
performance.
P.S. If Monet provides support for JDBC, I would not think much effort
is needed to support it with SDB.
Shouldn't be too hard :-) SDB targets SQL-92 and there are a few
extension points to cope with the vagaries of different SQL engines.
It's one of the reasons there are ~10 small files to write, to capture
the uniqueness of each SQL syntax.
Andy
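The idea of per-engine extension points can be sketched roughly as follows. This is a hedged illustration in Python, not SDB code: the class names (SqlDialect, EngineA, EngineB), the table and column names, and the chosen SQL types are all hypothetical. It only shows the general pattern of keeping the shared SQL in one place and isolating each engine's quirks, such as the report earlier in the thread that one engine did not accept longs or unbounded varchars.

```python
from abc import ABC, abstractmethod


class SqlDialect(ABC):
    """Hypothetical per-engine hook; this is not an SDB class."""

    @abstractmethod
    def text_type(self) -> str:
        """SQL type used for lexical forms."""

    @abstractmethod
    def hash_type(self) -> str:
        """SQL type used for 64-bit node hashes."""

    def create_nodes_table(self) -> str:
        # Shared DDL; only the column types vary per engine.
        return (
            "CREATE TABLE Nodes ("
            f"hash {self.hash_type()} NOT NULL, "
            f"lex {self.text_type()} NOT NULL, "
            "PRIMARY KEY (hash))"
        )


class EngineA(SqlDialect):
    """Made-up engine that accepts BIGINT and an unbounded text type."""

    def text_type(self) -> str:
        return "TEXT"

    def hash_type(self) -> str:
        return "BIGINT"


class EngineB(SqlDialect):
    """Made-up engine that needs a bounded VARCHAR and a decimal hash."""

    def text_type(self) -> str:
        return "VARCHAR(4000)"

    def hash_type(self) -> str:
        return "DECIMAL(20)"


for dialect in (EngineA(), EngineB()):
    print(dialect.create_nodes_table())
```

Adding another engine under this pattern means writing one more small subclass rather than touching the shared statement builder.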