Hey All

Just wanted to follow up on my original email with a few more thoughts
from ApacheCon, taking into account the rest of the conference.

As already noted, one theme that came up several times was that of reducing
duplication, both within Jena and between Linked Data projects at Apache,
where appropriate.  The modularisation already discussed separately is
likely a good step on the road towards that goal at least on our part.

Another topic that was raised several times was how the Linked Data
projects at Apache can raise awareness and the profile of linked data
within the ASF to help attract new contributors.  How we go about doing
that is very much an open question, but organising a solid Linked Data
track for ApacheCon EU later in the year is likely a good opportunity for
this.

Re: ALv2 compatible SQL parsers, I talked to several people about this
at ApacheCon and the Phoenix folks told me that what they were doing
was simply using ANTLR to generate a parser off of a standard SQL grammar.
I didn't get a chance to look into this much further to see what other
Apache projects that support SQL parsing use.

One interesting collaboration opportunity I may have found comes from
talking with Reto Gmur of the Clerezza project.  He noted that they
actually package some Jena libraries up into OSGi bundles for their uses
so it is probably worth looking at what exactly those folks do and
possibly adding pointers to our websites informing people who want to use
Jena in OSGi environments to go talk to the Clerezza folks.

A final thought is on the potential for a move to Git in the Jena project.
It seems like a lot more projects are moving to Git throughout the ASF
(SVN April Fools' jokes notwithstanding) and I know we've talked about it
in the past.  As already suggested on the other thread about Jena 3 module
structure
(http://mail-archives.apache.org/mod_mbox/jena-dev/201404.mbox/%3cCF6C7B24.
33961%[email protected]%3e) moving to Git is a gating factor for
being able to work on more substantial refactors like that.  We should
certainly discuss the possibility of this move again and look at trying to
do it at some point later this year.

Rob

On 08/04/2014 08:10, "Andy Seaborne" <[email protected]> wrote:

>On 08/04/14 14:25, Rob Vesse wrote:
>> Hey All
>>
>> Just thought I'd share my thoughts on ApacheCon Day 1 having spent a
>> bunch of time talking to various people about Apache.
>
>Rob - thanks for the notes.  Sounds like it is an exciting atmosphere.
>
>> A common theme that has
>> come up is that it would be beneficial to Jena and the wider ASF if we
>> tried to pursue more cross pollination and collaboration with other
>> projects.  Both in terms of promoting the Jena brand but also in
>> potentially gaining new collaborators.
>>
>> In terms of specific collaboration opportunities I've heard a few
>> different ideas.  I spent a bunch of time talking with Lewis McGibbney
>> who's involved in Any23 about how Jena might make it easier for projects
>> to share common functionality like RDF parsers.  The current module
>> structures are something of a barrier in this regard since we have
>> multiple versions of some readers and they are quite closely coupled into
>> some aspects of our APIs.  Improving modularisation in the future (as I
>> think we hope to do in Jena 3x eventually) would make things like this
>> easier for people.
>
>Agreed : we need something like
>
>IRI
>Non-RDF related common library (Atlas in ARQ currently)
>new core (graph API, datasetgraph)
>RIOT
>API
>SPARQL
>TDB (split into base, file, b+tree, main)
>...
>
>A "maybe" is a module that is just the interfaces for graph, dataset etc
>etc. and have modules build from that but it looks to me like the
>difference between new-core and interface+core+mem is quite small.
>Having the in-memory implementations around is necessary for internal
>working.
>
>And? Java8 so we can sort out iterators.
>
>Let's more actively discuss this.
>
>> Another personal
>> bugbear of Lewis's was the lack of a Fuseki WAR (JENA-201) which I did
>> tell him will be resolved soon with the new Fuseki 2x.
>
>and in the preview release:
>
>http://people.apache.org/~andy/fuseki2/
>
>> The other more speculative things we talked about concerned future
>> directions for Any23, Lewis wants to get to a point where the extracted
>> triples can be more flexibly output including things like writing
>> direct to TDB so there is room for discussion on what the best way of
>> doing that is.  Also whether there is room for any integration with Gora
>> which is a framework for big data persistence so the collaboration there
>> would be to look at adding persistence to triple stores like TDB.
>
>Gora/Any23 can output to StreamRDF and then it should plug in.
>
>Any23 is Sesame-based but the rewrite from Sesame-objects to
>Jena-objects is a single copy so not too bad.
>
>> In another discussion with a Phoenix (currently Incubating) project I
>> realised that these folks must have some sort of SQL parser that is ALv2
>> compatible.
>
>I think both Derby and Tajo have SQL java-based parsers as they execute
>SQL.
>
>>  I know on the list in the past we discussed merging the current
>> Jena JDBC modules which do SPARQL over JDBC with Claude's efforts on
>> GitHub which map RDF to relational and SQL to SPARQL and the main barrier
>> to that was a lack of ALv2 compatible SQL parsing library.  So it would
>> be worth talking to the Phoenix folks about what they use for SQL parsing
>> and seeing whether we can then use that to bring Claude's JDBC approach
>> into Jena JDBC and support both approaches side by side.
>>
>> The other interesting discussion I had was with some folks from the
>> Sqoop project after seeing their talk on Sqoop 2 which is an ETL
>> framework at Apache.  Currently they predominantly just extract from
>> relational databases and write to HBase, flat files on HDFS etc.  It
>> seems like there is an opportunity there to work together to add RDF
>> support both on the input and output sides.
>
>Interesting idea!
>
>>
>> Rob
>>
>       Andy
>
>



