Re: Adding support for MonetDB

Paolo Castagna Tue, 11 Oct 2011 12:28:43 -0700

nat lu wrote:
> Thanks, I hope the community can help progress this more, and that this
> is just a very small start. I'm sure I'm not the only one interested in
> the possibilities of column stores as RDF repositories.


Some Italians used to say: "If you want something done, do it yourself".

That's true in the open source world: you are part of the community and
interested in MonetDB, so you are in a good position to get it done. ;-)
Others might help you, but don't rely on it.

If you want to help someone else help you, a good way is to list, in a
comment on the JIRA issue, the things left to do and, if you have an
idea of how to do them, add a short paragraph describing the approach.

> "I'll be back...." :-)

I'll wait.

Paolo

> 
> On 11/10/11 12:15, Paolo Castagna wrote:
>> Hi nat lu
>>
>> nat lu wrote:
>>> Hi,
>>>
>>> Had a quick look - the patch seems to want to delete (in entirety) a
>>> couple of existing SDB classes which I had made minimal modifications
>>> to, so I've re-checked out SDB 1.3.4 and created a new patch which I can
>>> upload later. So - the first attachment in Jira shouldn't be used.
>> Great. Thanks for double checking and acting on this.
>> This will make life easier for whoever reviews your patch.
>>
>>> Secondly, I haven't recreated all the unit test code that exists for
>>> other RDBMS supported by SDB because, simply by the looks of it, there's
>>> quite a bit of it, and I haven't had time so far.
>> Ack.
>>
>> A patch without tests is better than no patch at all.
>>
>> However, tests are really important and should be considered part of the
>> process of submitting a patch.
>>
>> They certainly help us a lot and can get a patch into trunk faster.
>>
>>> I've also, as I said before, not used Monet with any data yet, just got
>>> it to the point that SDBConfig doesn't fall over, and the Jira issue
>>> raised is simply to record the desire to support it (and perhaps other
>>> column stores if conceptually they prove to have merit when used as an
>>> RDF repo).
>> Ack.
>>
>> Others wanting to have MonetDB will see your issue and might come and
>> help you with that.
>>
>> I personally don't use SDB much and I've never used MonetDB before.
>>
>>> So, personally, I'd prefer (I assume you-all as well) that nothing
>>> happens too quickly with this, until I or someone else gets some time to
>>> do some work and produce the unit tests.
>> Sure.
>>
>> We tend not to commit untested code. :-)
>>
>> Having low quality or unfinished code committed to trunk is a bad
>> practice and we avoid that.
>>
>> Rules have exceptions though: for example, I recently committed an
>> RDF/JSON writer which isn't complete/fully tested, however it's not
>> going to affect users in any way (see JENA-135). I did that to make
>> Rob's life easier in case he wants to have an RDF/JSON writer as well
>> as a parser (which he contributed).
>>
>> Nat, don't take my last two messages too personally, I am just
>> taking the opportunity to send across a few IMHO important messages:
>>
>>   - we welcome patches and new ideas/features (we really do!)
>>   - proper patches, well tested and which apply cleanly make our life
>>     so much easier (so we appreciate any effort in this direction).
>>     It's very difficult to get it wrong with
>>     svn diff > JENA-XYZ.patch. :-)
>>
>> Thanks again and have fun with SDB + MonetDB!
>>
>> Paolo
>>
>>> On 11/10/11 08:07, Paolo Castagna wrote:
>>>> Hi,
>>>> first of all, thank you for your patch.
>>>> I had a quick look, but I did not try to apply it (yet).
>>>>
>>>> May I ask how you created your patch?
>>>>
>>>> We added a section on the Getting Involved page on the Jena website:
>>>>
>>>>    "Patches should be attached to issues in Jira (click on
>>>>     More Actions > Attach Files). To create a patch you can simply
>>>>     use the command:
>>>>
>>>>       svn diff > JENA-XYZ.patch
>>>>
>>>>     Please, inspect your patch and make sure it includes all (and only)
>>>>     the relevant changes for a single issue. Don't forget tests! If you
>>>>     want to test if a patch applies cleanly you can use:
>>>>
>>>>       patch -p0 < JENA-XYZ.patch
>>>>
>>>>     If you use Eclipse: right click on the project name in Package
>>>>     Explorer, select Team > Create Patch or Team > Apply Patch."
>>>>
>>>> - http://jena.staging.apache.org/jena/getting_involved/#submit_your_patches
>>>>
>>>>
>>>>
>>>> It really helps if a patch contains only the lines you added|removed
>>>> and it applies cleanly. It saves a lot of time and speeds up the
>>>> review.
>>>>
>>>> Your patch may be perfectly fine, but I wanted to take the opportunity
>>>> to send the message across.
>>>>
>>>> I am really curious to run a few benchmarks when it's done to compare
>>>> MonetDB with a more traditional SQL system.
>>>>
>>>> By the way, about benchmarks: Andy is (secretly) working on this:
>>>> https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/
>>>>
>>>> I have not had time to try it yet, but it seems very interesting. :-)
>>>>
>>>> Thank you again for the new interesting feature.
>>>>
>>>> Paolo
>>>>
>>>>
>>>> nat lu wrote:
>>>>> Added, with patch file, at
>>>>>
>>>>> https://issues.apache.org/jira/browse/JENA-134
>>>>>
>>>>>
>>>>> I have made no more progress on testing it out other than sdbconfig so
>>>>> far, hope to soon.
>>>>>
>>>>>
>>>>> On 09/09/11 16:43, Paolo Castagna wrote:
>>>>>> Hi,
>>>>>> why don't you open a new JIRA issue (as a New Feature) for this?
>>>>>> https://issues.apache.org/jira/browse/JENA
>>>>>>
>>>>>> You can then attach a patch to it. This way others can look at what
>>>>>> you have done so far (and maybe help you out).
>>>>>>
>>>>>> Thanks for your help,
>>>>>> Paolo
>>>>>>
>>>>>> nat lu wrote:
>>>>>>> I made a start, and tried to use one of the existing flavours, but
>>>>>>> ended up creating one for MonetDB - a combination of Derby and DB2.
>>>>>>> It doesn't like longs or unbounded varchars.
>>>>>>>
>>>>>>> So, I got as far as getting SDBConfig to complete, but haven't done
>>>>>>> an sdbload yet.
>>>>>>>
>>>>>>>
>>>>>>> On 09/09/11 10:37, Andy Seaborne wrote:
>>>>>>>> On 04/09/11 13:03, nat lu wrote:
>>>>>>>>> I'm going to give it a go sometime soon and report back on my
>>>>>>>>> non-scientific findings. Your point about the small number of
>>>>>>>>> columns is well made, but the research paper cited earlier also
>>>>>>>>> mentions this and reports that, because of column store
>>>>>>>>> optimisations, even when they vertically partitioned their data
>>>>>>>>> rather than using a property-table approach they still saw good
>>>>>>>>> improvement. However, again, I'm no column store expert so
>>>>>>>>> perhaps I'm missing some point here :-). Anyway, time to
>>>>>>>>> "suck it and see", all in the name of progress of course.
>>>>>>>>>
>>>>>>>>> On 03/09/11 16:29, David Jordan wrote:
>>>>>>>>>> I have not used a column-oriented database, but I am somewhat
>>>>>>>>>> familiar with them. My understanding of them is that the
>>>>>>>>>> storage is partitioned on a column basis, such that there is
>>>>>>>>>> no physical clustering together of all the columns for a given
>>>>>>>>>> row. An advantage of this would be in the case where you have
>>>>>>>>>> tables with many columns, but the particular application only
>>>>>>>>>> needs a small subset of columns.
>>>>>>>>>>
>>>>>>>>>> With the SDB representation of triples (3 columns) and quads
>>>>>>>>>> (4 columns), and access typically based on having a specific
>>>>>>>>>> value for one or two of the columns, I am not so sure that a
>>>>>>>>>> column-based approach would offer any advantage.
>>>>>>>>>>
>>>>>>>>>> But again, I am no expert on these types of databases.
>>>>>>>>>>
>>>>>>>>>> These discussions about alternative datastore representations
>>>>>>>>>> of RDF/OWL data are very useful, to gain better understanding
>>>>>>>>>> of which data architectures yield the best implementation
>>>>>>>>>> approach for high performance.
>>>>>>>>>>
>>>>>>>>>> p.s. If Monet provides support for JDBC, I would not think
>>>>>>>>>> much effort is needed to support it with SDB.
>>>>>>>> Shouldn't be too hard :-)  SDB targets SQL-92 and there are a few
>>>>>>>> extension points to cope with the vagaries of different SQL
>>>>>>>> engines. It's one of the reasons there are ~10 small files to
>>>>>>>> write, to capture the uniqueness of each SQL syntax.
>>>>>>>>
>>>>>>>>        Andy
>
