nat lu wrote:
> Thanks, I hope the community can help progress this more, and that this
> is just a very small start. I'm sure I'm not the only one interested in
> the possibilities of column stores as RDF repositories.

Some Italians used to say: "If you want something done, do it yourself".
That's true in the open source world: you are part of the community and
interested in MonetDB, so you are in a good position to get it done. ;-)

Others might help you, but don't rely on it. If you want to help someone
else help you, a good way is to list, in a comment on the JIRA issue, the
things left to do and, if you have an idea of how to do them, add a short
paragraph describing how it should be done.

> "I'll be back...." :-)

I'll wait.

Paolo

>
> On 11/10/11 12:15, Paolo Castagna wrote:
>> Hi nat lu
>>
>> nat lu wrote:
>>> Hi,
>>>
>>> Had a quick look - the patch seems to want to delete (in entirety) a
>>> couple of existing SDB classes which I had made minimal modifications
>>> to, so I've re-checked out SDB 1.3.4 and created a new patch which I can
>>> upload later. So - the first attachment in Jira shouldn't be used.
>> Great. Thanks for double checking and acting on this.
>> This will make life easier for whoever reviews your patch.
>>
>>> Secondly, I haven't recreated all the unit test code that exists for
>>> other RDBMS supported by SDB because, simply by the looks of it, there's
>>> quite a bit of it, and I haven't had time so far.
>> Ack.
>>
>> A patch without tests is better than no patch at all.
>>
>> However, tests are really important and should be considered part of the
>> process of submitting a patch.
>>
>> They certainly help us a lot and can get a patch into trunk faster.
>>
>>> I've also, as I said before, not used Monet with any data yet, just got
>>> it to the point that SDBConfig doesn't fall over, and the Jira issue
>>> raised is simply to record the desire to support it (and perhaps other
>>> column stores if conceptually they prove to have merit when used as an
>>> RDF repo).
>> Ack.
>>
>> Others wanting to have MonetDB will see your issue and might come and
>> help you with that.
>>
>> I personally don't use SDB much and I've never used MonetDB before.
>>
>>> So, personally, I'd prefer (I assume you-all as well) that nothing
>>> happens too quickly with this, until I or someone else gets some time to
>>> do some work and produce the unit tests.
>> Sure.
>>
>> We tend not to commit untested code. :-)
>>
>> Having low quality or unfinished code committed to trunk is a bad
>> practice and we avoid that.
>>
>> Rules have exceptions though: for example, I recently committed an
>> RDF/JSON writer which isn't complete/fully tested, however it's not
>> going to affect users in any way (see JENA-135). I did that to make
>> Rob's life easier in case he wants to have an RDF/JSON writer as well
>> as a parser (which he contributed).
>>
>> Nat, don't take my last two messages too personally, I am just
>> taking the opportunity to send across a few IMHO important messages:
>>
>> - we welcome patches and new ideas/features (we really do!)
>> - proper patches, well tested and which apply cleanly, make our life
>>   so much easier (so we appreciate any effort in this direction).
>>   It's very difficult to get it wrong with svn diff > JENA-XYZ.patch. :-)
>>
>> Thanks again and have fun with SDB + MonetDB!
>>
>> Paolo
>>
>>> On 11/10/11 08:07, Paolo Castagna wrote:
>>>> Hi,
>>>> first of all, thank you for your patch.
>>>> I had a quick look, but I did not try to apply it (yet).
>>>>
>>>> May I ask how you created your patch?
>>>>
>>>> We added a section on the Getting Involved page on the Jena website:
>>>>
>>>> "Patches should be attached to issues in Jira (click on
>>>> More Actions > Attach Files). To create a patch you can simply
>>>> use the command:
>>>>
>>>> svn diff > JENA-XYZ.patch
>>>>
>>>> Please, inspect your patch and make sure it includes all (and only)
>>>> the relevant changes for a single issue. Don't forget tests!
>>>> If you want to test if a patch applies cleanly you can use:
>>>>
>>>> patch -p0 < JENA-XYZ.patch
>>>>
>>>> If you use Eclipse: right click on the project name in Package
>>>> Explorer, select Team > Create Patch or Team > Apply Patch."
>>>>
>>>> - http://jena.staging.apache.org/jena/getting_involved/#submit_your_patches
>>>>
>>>> It really helps if a patch contains only the lines you added|removed
>>>> and it applies cleanly. It saves a lot of time and speeds up reviewing
>>>> it.
>>>>
>>>> Your patch may be perfectly fine, but I wanted to take the opportunity
>>>> to send the message across.
>>>>
>>>> I am really curious to run a few benchmarks when it's done to compare
>>>> MonetDB with a more traditional SQL system.
>>>>
>>>> By the way, about benchmarks, Andy is (secretly) working on this:
>>>> https://svn.apache.org/repos/asf/incubator/jena/Experimental/JenaPerf/trunk/
>>>>
>>>> I have not had time to try it yet, but it seems very interesting. :-)
>>>>
>>>> Thank you again for the new interesting feature.
>>>>
>>>> Paolo
>>>>
>>>>
>>>> nat lu wrote:
>>>>> Added, with patch file, at
>>>>>
>>>>> https://issues.apache.org/jira/browse/JENA-134
>>>>>
>>>>> I have made no more progress on testing it out other than sdbconfig so
>>>>> far, hope to soon.
>>>>>
>>>>>
>>>>> On 09/09/11 16:43, Paolo Castagna wrote:
>>>>>> Hi,
>>>>>> why don't you open a new JIRA issue (as a New Feature) for this?
>>>>>> https://issues.apache.org/jira/browse/JENA
>>>>>>
>>>>>> You can then attach a patch to it. This way others can look at what
>>>>>> you have done so far (and maybe help you out).
>>>>>>
>>>>>> Thanks for your help,
>>>>>> Paolo
>>>>>>
>>>>>> nat lu wrote:
>>>>>>> I made a start, and tried to use one of the existing flavours, but
>>>>>>> ended up creating one for MonetDB - a combination of Derby and DB2.
>>>>>>> It doesn't like longs or unbounded varchars.
>>>>>>>
>>>>>>> So, I got as far as getting SDBConfig to complete, but haven't
>>>>>>> done an sdbload yet.
>>>>>>>
>>>>>>>
>>>>>>> On 09/09/11 10:37, Andy Seaborne wrote:
>>>>>>>> On 04/09/11 13:03, nat lu wrote:
>>>>>>>>> I'm going to give it a go sometime soon and report back on my
>>>>>>>>> non-scientific findings. Your point about the small number of
>>>>>>>>> columns is well made, but the research paper cited earlier also
>>>>>>>>> mentions this and reports that, because of column store
>>>>>>>>> optimisations, even when they vertically partitioned their data
>>>>>>>>> rather than using a property-table approach they still saw good
>>>>>>>>> improvement. However, again, I'm no column store expert so
>>>>>>>>> perhaps I'm missing some point here :-). Anyway, time to
>>>>>>>>> "suck it and see", all in the name of progress of course.
>>>>>>>>>
>>>>>>>>> On 03/09/11 16:29, David Jordan wrote:
>>>>>>>>>> I have not used a column-oriented database, but I am somewhat
>>>>>>>>>> familiar with them. My understanding of them is that the storage
>>>>>>>>>> is partitioned on a column basis, such that there is no physical
>>>>>>>>>> clustering together of all the columns for a given row. An
>>>>>>>>>> advantage of this would be in the case where you have tables
>>>>>>>>>> with many columns, but the particular application only needs a
>>>>>>>>>> small subset of columns.
>>>>>>>>>>
>>>>>>>>>> With the SDB representation of triples (3 columns) and quads (4
>>>>>>>>>> columns), and access typically based on having a specific value
>>>>>>>>>> for one or two of the columns, I am not so sure that a
>>>>>>>>>> column-based approach would offer any advantage.
>>>>>>>>>>
>>>>>>>>>> But again, I am no expert on these types of databases.
>>>>>>>>>>
>>>>>>>>>> These discussions about alternative datastore representations
>>>>>>>>>> for RDF/OWL data are very useful, to gain better understanding of
>>>>>>>>>> which data architectures yield the best implementation approach
>>>>>>>>>> for high performance.
>>>>>>>>>>
>>>>>>>>>> p.s. If Monet provides support for JDBC, I would not think much
>>>>>>>>>> effort is needed to support it with SDB.
>>>>>>>> Shouldn't be too hard :-) SDB targets SQL-92 and there are a few
>>>>>>>> extension points to cope with the vagaries of different SQL
>>>>>>>> engines. It's one of the reasons there are ~10 small files to
>>>>>>>> write, to capture the uniqueness of each SQL syntax.
>>>>>>>>
>>>>>>>> Andy
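
To make the row-versus-column distinction from David's message concrete, here is a minimal sketch in Python. It is only a toy in-memory model and an assumption on my part, not how MonetDB or SDB actually store data; the sample triples and the names `rows`, `columns`, `match_predicate_rows` and `match_predicate_columns` are made up for illustration.

```python
# Toy illustration only: NOT MonetDB's or SDB's actual storage format.
# The sample triples and every name below are hypothetical.

# Row-oriented layout: each triple is kept together as one record.
rows = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:knows", "ex:carol"),
]

# Column-oriented layout: one list per column, aligned by position.
columns = {
    "s": [r[0] for r in rows],
    "p": [r[1] for r in rows],
    "o": [r[2] for r in rows],
}


def match_predicate_rows(rows, predicate):
    """Scan whole records, reading all three fields of every row."""
    return [(s, o) for (s, p, o) in rows if p == predicate]


def match_predicate_columns(columns, predicate):
    """Scan only the predicate column, then fetch s and o by position."""
    hits = [i for i, p in enumerate(columns["p"]) if p == predicate]
    return [(columns["s"][i], columns["o"][i]) for i in hits]


if __name__ == "__main__":
    print(match_predicate_rows(rows, "foaf:knows"))
    print(match_predicate_columns(columns, "foaf:knows"))
```

Both functions return the same subject/object pairs; the only difference is how much data each scan has to read. With just three or four columns the saving from skipping columns is small, which is the doubt David raises. The sketch says nothing about the other column store optimisations nat mentions.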
