Re: Jena3 : core/arq split

Andy Seaborne Tue, 27 Jan 2015 05:15:02 -0800

On 26/01/15 22:57, Rob Vesse wrote:

Comments inline:


On 26/01/2015 14:12, "Stian Soiland-Reyes" <[email protected]> wrote:

If we move out jena-riot, what is the gain? It relies on jena-core, and the
core kind of needs read/write for everyday use. Core is not abstract like
the Commons RDF API.


Well, the real "core" is: the basic interfaces and classes, i.e. Node,
Triple, Graph, DatasetGraph, Dataset, are fairly self-contained and
relatively abstract.  If we are talking about the Model, Resource,
Ontology API then those are a lot more complex.

It's also perfectly possible to use these APIs without ever needing any IO
(though perhaps unusual).


Could we at least call it jena-io if it goes solo? I know it also does
streaming, but don't make it too hard to find ;-).

Just today there was an email on one of the LOD lists where someone bailed
out of Jena because it needed 4 jena-* JARs to do a remote SPARQL query.
("the whole Jena stack"). How people survive without dependency management
is beyond me, but not everyone is in Maven land :-).


If they think Jena is bad (23 distinct modules) clearly they haven't seen
the list of Sesame artifacts lately (78 distinct modules) ;)


We could produce an uber jar of iri/core/arq/tdb.

Of course we already have an uber jar + dependencies - it's called
Fuseki! "java -cp fusekijar commandline" is so convenient working on
remote servers.

Side note:  This sort of thing makes me both laugh and cry.  Users want a
user friendly domain specific API but then balk as soon as they realise
that it means actually needing more than one library (because apparently
modularisation is bad practice in the minds of end users).  Like you say,
how a serious developer gets by without using any kind of proper
build/package management tool really blows my mind.


Or using classpath "lib/*".

        Andy


I can however see one compelling argument for putting RIOT as a new module
- if we are able to make both Core and ARQ work without it, and it also can
reduce the list of external dependencies for users of those (e.g. avoid
jsonld-java, thrift, httpclient?)


Yes, reducing unnecessary dependencies for those that don't need them is
always valuable.

Rob

On 26 Jan 2015 19:28, "Rob Vesse" <[email protected]> wrote:

Andy

I would prefer proposal two.  Jena 3 will be disruptive regardless (if only
because of the time people spend updating import statements).  A few other
more minor changes to import statements and POM definitions wouldn't be
too big of a deal IMHO.

I would be strongly against leaving old package names with redirects, since
it only encourages people to not bother migrating code properly and to
simply update the version in the POM without being aware that there are
other changes that happened (e.g. RDF 1.1).  A one time disruptive
migration forward to Jena 3 that makes me actually have to consider the
impact of the migration on my existing code is strongly preferable to a
staggered migration.

In that vein I would suggest that the IO components be moved into their
own package (jena-riot I assume?) at the same time; again the principle is
to make people take a single larger disruptive migration rather than
requiring many smaller migrations.  If Core needs to have some way of
wiring in IO automatically then I suggest we do it via the Java 7+
ServiceLoader mechanism.  I'm already using it a little in the Elephas IO
modules and it works pretty nicely, and I would be willing to help get this
set up for Jena 3 IO as necessary.
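
As a rough illustration of the ServiceLoader idea, here is a minimal
sketch of how a core module could discover IO providers at runtime without
a compile-time dependency on the IO module.  GraphReaderProvider and
IoWiring are hypothetical names invented for this example and are not Jena
classes; only java.util.ServiceLoader (part of the standard library since
Java 6) is real.

```java
import java.util.Optional;
import java.util.ServiceLoader;

// Hypothetical SPI that lives in the core module. An IO module would
// supply implementations; this is NOT an actual Jena interface.
interface GraphReaderProvider {
    // True if this provider can parse the named syntax (e.g. "TURTLE").
    boolean handles(String syntaxName);

    // Human-readable description, used only for diagnostics.
    String describe();
}

// Hypothetical helper in the core module that looks up IO providers.
final class IoWiring {
    private IoWiring() {}

    // Returns the first provider that handles the syntax, if any.
    static Optional<GraphReaderProvider> find(
            String syntaxName, Iterable<GraphReaderProvider> providers) {
        for (GraphReaderProvider p : providers) {
            if (p.handles(syntaxName)) {
                return Optional.of(p);
            }
        }
        return Optional.empty();
    }

    // Discovers providers on the classpath via ServiceLoader. The result
    // is empty when no IO module has registered an implementation.
    static Optional<GraphReaderProvider> find(String syntaxName) {
        return find(syntaxName, ServiceLoader.load(GraphReaderProvider.class));
    }
}
```

An IO module would register its implementation by listing the class name
in a file under META-INF/services/ named after the fully qualified
interface name.  Core then only depends on the interface, so the
dependency arrow stays pointing from IO to Core.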

I suppose the IO wiring comes back to the question of whether Model.read()
and Model.write() are still relevant or if we force everyone over to using
RDFDataMgr (which would be my preference), since the IO module has to rely
on Core anyway for the relevant data model APIs, and having Core somehow
rely on IO is an ugly circular dependency (or gets us into the same
problems we have now).  Of course the alternative solution to that is to
have the Resource API also broken out into its own module so that Core
really is only the core low level data structures.

With regards to packaging if people are using higher level POM artifacts
like apache-jena-libs then the module changes should remain fairly
transparent to them.

Rob

On 24/01/2015 10:34, "Andy Seaborne" <[email protected]> wrote:

[[
oaj = org.apache.jena
chhj = com.hp.hpl.jena
]]

One major possible change target is the core/arq split.

Much of this comes down to where quads/datasets go in the package tree.
They started as a SPARQL (1.0) feature but are now RDF 1.1 and parser
related.

The general idea is move dataset/quad support to core, move parsers to
core (separate into their own package later??) and have jena-arq be
SPARQL only.

The question is how much change to go through to achieve that.

Possibility 1 : Less change

Move DatasetGraph* to oaj.dataset.*

API visible:

Migrate Dataset from chhj.query.Dataset to oaj.rdf.dataset (c.f.
oaj.rdf.model)

Move DatasetGraph and Quad to oaj.dataset (c.f. oaj.graph)

Try to leave indirection class in chhj.query.Dataset somehow.


Possibility 2 : More change, more disruption (but one time)

Pull oaj.rdf.model up to oaj.rdf and put Dataset there.  This is the
"RDF API".

Use oaj.graph for DatasetGraph and Quad.

Hmm - actually writing this down, I am tending towards possibility 2 if
that works as cleanly as it sounds.

       Andy
