I personally do not mind how the code in Jena is structured, as long as
I am able to use it :).
I do prefer doing it in a specific way because, IMO, it is modular,
which makes it much easier to maintain in the long run. But again,
it may not be the quickest one.

I have already been given a deadline by the company to have the ES extension
implemented in the next 15 days :). What this means is that I will be
maintaining the ES code extension to Jena Text, at least locally, for the
coming period of time. I would be more than happy to contribute to the Jena
community whatever is required to have a proper Elasticsearch
implementation in place, whether within the jena-text module or as a separate
module. Until Lucene and Solr are upgraded to the latest
version, I will have to maintain a separate module for jena-text-es.

Cheers!
Anuj Kumar


On Wed, Mar 1, 2017 at 3:36 PM, A. Soroka <aj...@virginia.edu> wrote:

> Osma--
>
> The short answer is that yes, given the right tools you _can_ have
> different versions of code accessible in different ways. The longer answer
> is that it's probably not a viable alternative for Jena for this problem,
> at least not without a lot of other change.
>
> You are right to point to the classloader mechanism as being at the heart
> of this question, but I must alter your remark just slightly. From "the
> Java classloader only sees a single, flat package/class namespace and a set
> of compiled classes" to "ANY GIVEN Java classloader only sees a single,
> flat package/class namespace and a set of compiled classes".
>
> This is the fact that OSGi uses to make it possible to maintain strict
> module boundaries (and even dynamic module relationships at run-time). Each
> OSGi bundle sees its own classloader, and the framework is responsible for
> connecting bundles up to ensure that every bundle has what it needs in the
> way of types to function, based on metadata that the bundles provide to the
> framework. It's an incredibly powerful system (I use it every day and enjoy
> it enormously) but it's also very "heavy" and requires a good deal of
> investment to use. In particular, it's probably too large to put _inside_
> Jena. (I frequently put Jena inside an OSGi instance, on the other hand.)
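
A minimal sketch of the "any given classloader" point above, using only plain
JDK classes (no OSGi). The class LoaderIsolationDemo and its nested Library
class are hypothetical stand-ins, not Jena or Lucene code. Two URLClassLoader
instances with no application-level parent load the same class file, and each
ends up with its own distinct Class object under the same name. It assumes the
demo has been compiled to a directory or JAR on disk.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class LoaderIsolationDemo {

    /** Hypothetical stand-in for a library class, e.g. something from Lucene. */
    public static class Library {
    }

    /**
     * Loads Library through two independent loaders and reports whether the
     * results share a name but are different classes.
     */
    public static boolean sameNameDifferentClass() throws Exception {
        // Where this demo's own class files live (a directory or a JAR).
        URL location = LoaderIsolationDemo.class
                .getProtectionDomain().getCodeSource().getLocation();
        String name = Library.class.getName();

        // A null parent means only the bootstrap loader is consulted first,
        // so each loader defines Library itself instead of reusing the
        // application loader's copy.
        try (URLClassLoader first = new URLClassLoader(new URL[] {location}, null);
             URLClassLoader second = new URLClassLoader(new URL[] {location}, null)) {
            Class<?> a = first.loadClass(name);
            Class<?> b = second.loadClass(name);
            return a.getName().equals(b.getName())
                    && a != b
                    && a.getClassLoader() == first
                    && b.getClassLoader() == second;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sameNameDifferentClass());
    }
}
```

In a real setup each loader would point at a different JAR (say, one Lucene
version each) rather than at the same location. Wiring the two sides together
safely is the part that frameworks such as OSGi take care of.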
>
> Java 9 Jigsaw [1] offers some possibility for strong modularization of
> this kind, but it's really meant for the JDK itself, not application
> libraries. In theory, we could "roll our own" classloader management for
> this problem. That sounds like more than a bit of a rabbit hole to me.
> There might be another, more lightweight, toolkit out there for this
> purpose, but I'm not aware of any myself.
>
> Otherwise, yes, you get into shading and the like. We have to do that for
> Guava for now because of HADOOP-10101 (grumble grumble) but it's hardly a
> thing we want to do any more of than needed, I don't think.
>
> ---
> A. Soroka
> The University of Virginia Library
>
> [1] http://openjdk.java.net/projects/jigsaw/
>
> > On Mar 1, 2017, at 9:03 AM, Osma Suominen <osma.suomi...@helsinki.fi>
> > wrote:
> >
> > Hi Anuj!
> >
> > Thanks for the clarification.
> >
> > However, I'm still not sure I understand the situation completely. I
> > know Maven can perform a lot of tricks, but Maven modules are just
> > convenient ways to structure a Java project. Maven cannot change the fact
> > that at runtime, module divisions don't really matter (except that they
> > usually correspond to package sub-namespaces). The Java classloader only
> > sees a single, flat package/class namespace and a set of compiled classes
> > (usually within JARs) on the classpath that it needs to check to find the
> > right classes. If there are two versions of the same library (e.g.
> > Lucene) with overlapping class names, that's going to cause trouble. The
> > only way around that is to shade some of the libraries, i.e. rename them so
> > that they end up in another, non-conflicting namespace. Apparently
> > Elasticsearch also did some of that in the past [1] but nowadays tries to
> > avoid it.
> >
> > Does your assumption 1 ("At a given point in time, only a single
> > Indexing Technology is used") imply that in the assembler configuration,
> > you cannot have ja:loadClass declarations for both Lucene and ES backends?
> > Or how do you run something like Fuseki that contains (in a single big JAR)
> > both the jena-text and jena-text-es modules with all their dependencies,
> > one of which requires the Lucene 4.x classes and the other one the Lucene
> > 6.4.1 classes? How do you ensure that only one of them is used at a time,
> > and that the Java classloader, even though it has access to both versions
> > of Lucene, only loads classes from the single, correct one and not the
> > other? Or do you need to have separate "Fuseki-Lucene" and "Fuseki-ES"
> > packages, so that you don't end up with two Lucene versions within the same
> > Fuseki JAR?
> >
> > -Osma
> >
> > [1] https://www.elastic.co/blog/to-shade-or-not-to-shade
> >
> > 01.03.2017, 11:03, anuj kumar wrote:
> >> Hi Osma,
> >>
> >> I understand what you are saying. There are ways to mitigate risks and
> >> balance the refactoring without affecting the existing modules. But I will
> >> not delve into those now. I am not enough of an expert in Jena to say
> >> convincingly that it is possible without any hiccups. But I can take a
> >> guess and say that it is indeed possible :)
> >>
> >> For the question: "is it even possible to mix modules that depend on
> >> different versions of the Lucene libraries within the same project?"
> >>
> >> I actually do not understand what you mean by mixing modules. I assume you
> >> mean having jena-text and jena-text-es as dependencies in a build without
> >> causing the build to conflict. If that is what you mean, then the answer is
> >> yes, it is possible and quite simple as well. Let me explain how it is
> >> possible. But before that, some assumptions which I want to call out
> >> explicitly.
> >>
> >> *Assumptions:*
> >> 1. At a given point in time, only a single Indexing Technology is used for
> >> text based indexing and searching via Jena. What this means is that we will
> >> either use Lucene Implementation OR Solr Implementation OR ES
> >> Implementation at any given point in time.
> >> 2. Fuseki build does not depend on any Lucene 4.9.1 specific classes but
> >> only on jena-text classes, if at all.
> >>
> >> Based on these assumptions, it is possible to create a build that contains
> >> jena-text based common classes + ES specific classes without any
> >> compatibility issues. And it is in fact quite simple. I did it in the
> >> current jena-text-es module and ran the entire build, which succeeded.
> >> The key is to include the latest Lucene dependencies at the very beginning
> >> in the pom and then include the jena-text dependency. Maven will then
> >> automatically resolve the dependency issues by including the Lucene
> >> libraries that we included in our ES specific pom. Have a look at the pom of
> >> the jena-text-es module here to see how it can be done:
> >> https://github.com/EaseTech/jena/blob/master/jena-text-es/pom.xml
> >>
> >>
> >> Thanks,
> >> Anuj Kumar
> >>
> >>
> >> On Wed, Mar 1, 2017 at 7:27 AM, Osma Suominen <osma.suomi...@helsinki.fi>
> >> wrote:
> >>
> >>> Hi Anuj,
> >>>
> >>> I understand your concerns. However, we also need to balance between the
> >>> needs of individual modules/features and the whole codebase. I'm willing to
> >>> put in the effort to keep the other modules up to date with newer Lucene
> >>> versions. Lucene upgrade requirements are well documented; the only hitches
> >>> seen in JENA-1250 were related to how jena-text (ab)used some Lucene
> >>> features that were dropped from newer versions.
> >>>
> >>> A perhaps stupid question to more experienced Java developers: is it even
> >>> possible to mix modules that depend on different versions of the Lucene
> >>> libraries within the same project? In my (quite limited) understanding of
> >>> Java projects and libraries, this requires special arrangements (e.g.
> >>> shading) as the Java package/class namespace is shared by all the code
> >>> running within the same JVM.
> >>>
> >>> So can you create, say, a Fuseki build that contains the current jena-text
> >>> module (depending on Lucene 4.x) and the new jena-text-es module (depending
> >>> on Lucene 6.4.1) without any compatibility issues?
> >>>
> >>> -Osma
> >>>
> >>>
> >>>
> >>>
> >>> 01.03.2017, 00:47, anuj kumar wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> My 2 Cents :
> >>>>
> >>>> The reason I proposed to have separate modules for Lucene, Solr and ES is
> >>>> exactly to avoid the "All or Nothing" approach we need to take if we
> >>>> club them all together. If they stay together and in the near future I
> >>>> want to upgrade ES to another version, I also need to upgrade Lucene
> >>>> and Solr again, and possibly another implementation that may have been
> >>>> added in the meantime. As we all know, this means weeks of work, if not
> >>>> months, to get the changes released. This will personally de-motivate me
> >>>> from doing anything, and I will probably start maintaining my own version
> >>>> of Jena-Text, as that would be much simpler to do than to upgrade and test
> >>>> and, in the process, own (read: fix bugs in) the upgrade for each and
> >>>> every technology.
> >>>>
> >>>> If they are developed as separate modules, they can evolve independently
> >>>> of each other, and we can avoid situations where we can't upgrade to the
> >>>> latest version of Lucene because we do not know what effect it will have
> >>>> on the Solr implementation.
> >>>>
> >>>> We can start with having a separate module for Jena Text ES and see how
> >>>> things go. If they go well, we could extract Solr and Lucene out of
> >>>> Jena Text.
> >>>>
> >>>> Again, this is just a suggestion based on my limited industry experience.
> >>>>
> >>>> Thanks,
> >>>> Anuj Kumar
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Feb 28, 2017 at 5:23 PM, Osma Suominen <osma.suomi...@helsinki.fi>
> >>>> wrote:
> >>>>
> >>>> 28.02.2017, 17:12, A. Soroka wrote:
> >>>>>
> >>>>>> https://lists.apache.org/thread.html/dce0d502b11891c28e57bbcbb0cdef27d8374d58d9634076b8ef4cd7@1431107516@%3Cdev.jena.apache.org%3E
> >>>>>> ? In other words, might it be better to factor out between -text and
> >>>>>> -spatial and _then_ try to upgrade the Lucene version?
> >>>>>>
> >>>>>>
> >>>>> I certainly wouldn't object to that, but somebody has to volunteer to do
> >>>>> the actual work!
> >>>>>
> >>>>>> I don't use the Solr component now, but I could easily see so doing...
> >>>>>> that's pretty vague, I know, and I'm not in a position to do any work to
> >>>>>> maintain it, so consider that just a very small and blurry data point.
> >>>>>> :)
> >>>>>>
> >>>>>>
> >>>>> Last time I tried it (it was a while ago) I couldn't figure out how to
> >>>>> get it running... If you could just try that with some toy data, then
> >>>>> your data point would be a lot less blurry :) I haven't used Solr for
> >>>>> anything, so I'm not very familiar with how to set it up, and the
> >>>>> jena-text instructions are pretty vague unfortunately.
> >>>>>
> >>>>>
> >>>>> -Osma
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Osma Suominen
> >>>>> D.Sc. (Tech), Information Systems Specialist
> >>>>> National Library of Finland
> >>>>> P.O. Box 26 (Kaikukatu 4)
> >>>>> 00014 HELSINGIN YLIOPISTO
> >>>>> Tel. +358 50 3199529
> >>>>> osma.suomi...@helsinki.fi
> >>>>> http://www.nationallibrary.fi
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
>
>


-- 
*Anuj Kumar*
