Osma--

The short answer is that yes, given the right tools you _can_ have different 
versions of code accessible in different ways. The longer answer is that it's 
probably not a viable alternative for Jena for this problem, at least not 
without a lot of other change.

You are right to point to the classloader mechanism as being at the heart of 
this question, but I must alter your remark just slightly, from "the Java 
classloader only sees a single, flat package/class namespace and a set of 
compiled classes" to "ANY GIVEN Java classloader only sees a single, flat 
package/class namespace and a set of compiled classes".
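
To make the "any given classloader" point concrete, here is a minimal sketch 
(not anything Jena does today). The names LoaderNamespaces, Widget and 
IsolatingLoader are hypothetical, made up for illustration; Widget stands in 
for a library class that exists in two versions. Two loader instances each 
define the same class name, and the JVM treats the results as separate types.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class LoaderNamespaces {

    /** Hypothetical stand-in for a library class that exists in two versions. */
    public static class Widget { }

    /** Defines the target class itself; delegates every other name to its parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                if (!name.equals(target)) {
                    return super.loadClass(name, resolve); // normal parent-first path
                }
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the compiled bytes of the class through the parent loader. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String name = Widget.class.getName();
        ClassLoader app = LoaderNamespaces.class.getClassLoader();
        Class<?> a = new IsolatingLoader(name, app).loadClass(name);
        Class<?> b = new IsolatingLoader(name, app).loadClass(name);
        System.out.println(a.getName().equals(b.getName())); // same name
        System.out.println(a == b);                          // different classes
        System.out.println(a == Widget.class);               // different again
    }
}
```

Both loaders report the same class name, yet the two Class objects are 
distinct, and neither is the one the application loader produced. That is the 
property that per-module classloading relies on.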

This is the fact that OSGi uses to make it possible to maintain strict module 
boundaries (and even dynamic module relationships at run-time). Each OSGi 
bundle gets its own classloader, and the framework is responsible for 
connecting bundles up to ensure that every bundle has what it needs in the way 
of types to function, based on metadata that the bundles provide to the 
framework. It's an incredibly powerful system (I use it every day and enjoy it 
enormously) but it's also very "heavy" and requires a good deal of investment 
to use. In particular, it's probably too large to put _inside_ Jena. (I 
frequently put Jena inside an OSGi instance, on the other hand.)

Java 9 Jigsaw [1] offers some possibility for strong modularization of this 
kind, but it's really meant for the JDK itself, not application libraries. In 
theory, we could "roll our own" classloader management for this problem. That 
sounds like more than a bit of a rabbit hole to me. There might be another, 
more lightweight, toolkit out there for this purpose, but I'm not aware of any 
myself.
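
For a sense of what "rolling our own" would start from, here is a rough 
sketch, assuming each backend's JARs could be handed to a separate 
URLClassLoader. BackendLoaders, isolated and canSee are hypothetical names, 
and the empty JAR lists are placeholders so the example runs on its own. The 
hard parts (sharing the jena-text interfaces between loaders, wiring this into 
the assembler) are exactly what is left out.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class BackendLoaders {

    /**
     * Builds a loader that sees only the given JARs plus the JDK.
     * Its parent is the parent of the system (application) loader, so
     * nothing from the application classpath leaks in.
     */
    static URLClassLoader isolated(URL... jars) {
        ClassLoader jdkOnly = ClassLoader.getSystemClassLoader().getParent();
        return new URLClassLoader(jars, jdkOnly);
    }

    /** Reports whether the loader can resolve the named class. */
    static boolean canSee(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // In real use each call would receive the JARs for one backend,
        // e.g. one set with Lucene 4.x and another with Lucene 6.4.1.
        try (URLClassLoader luceneSide = isolated();
             URLClassLoader esSide = isolated()) {
            // JDK classes are still reachable through the parent.
            System.out.println(canSee(luceneSide, "java.lang.String"));
            // Application classes are not visible from the isolated loader.
            System.out.println(canSee(esSide, BackendLoaders.class.getName()));
        }
    }
}
```

Any type that both sides must exchange would have to come from a loader they 
share, which is where the bookkeeping starts to resemble what OSGi already 
does.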

Otherwise, yes, you get into shading and the like. We have to do that for Guava 
for now because of HADOOP-10101 (grumble grumble), but it's hardly something we 
want to do any more of than we need to.

---
A. Soroka
The University of Virginia Library

[1] http://openjdk.java.net/projects/jigsaw/

> On Mar 1, 2017, at 9:03 AM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:
> 
> Hi Anuj!
> 
> Thanks for the clarification.
> 
> However, I'm still not sure I understand the situation completely. I know 
> Maven can perform a lot of tricks, but Maven modules are just convenient ways 
> to structure a Java project. Maven cannot change the fact that at runtime, 
> module divisions don't really matter (except that they usually correspond to 
> package sub-namespaces) and the Java classloader only sees a single, flat 
> package/class namespace and a set of compiled classes (usually within JARs) 
> in the classpath that it needs to check to find the right classes, and if 
> there are two versions of the same library (eg Lucene) with overlapping class 
> names, that's going to cause trouble. The only way around that is to shade 
> some of the libraries, i.e. rename them so that they end up in another, 
> non-conflicting namespace. Apparently Elasticsearch also did some of that in 
> the past [1] but nowadays tries to avoid it.
> 
> Does your assumption 1 ("At a given point in time, only a single Indexing 
> Technology is used") imply that in the assembler configuration, you cannot 
> have ja:loadClass declarations for both Lucene and ES backends? Or how do you 
> run something like Fuseki that contains (in a single big JAR) both the 
> jena-text and jena-text-es modules with all their dependencies, one of which 
> requires the Lucene 4.x classes and the other one the Lucene 6.4.1 classes? 
> How do you ensure that only one of them is used at a time, and that the Java 
> classloader, even though it has access to both versions of Lucene, only loads 
> classes from the single, correct one and not the other? Or do you need to 
> have separate "Fuseki-Lucene" and "Fuseki-ES" packages, so that you don't end 
> up with two Lucene versions within the same Fuseki JAR?
> 
> -Osma
> 
> [1] https://www.elastic.co/blog/to-shade-or-not-to-shade
> 
> 01.03.2017, 11:03, anuj kumar wrote:
>> Hi Osma,
>> 
>> I understand what you are saying. There are ways to mitigate risks and
>> balance the refactoring without affecting the existing modules. But I will
>> not delve into those now. I am not an expert in Jena to convincingly say
>> that it is possible, without any hiccups. But I can take a guess and say
>> that it is indeed possible :)
>> 
>> For the question: "is it even possible to mix modules that depend on
>> different versions of the Lucene libraries within the same project?"
>> 
>> I actually do not understand what you mean by mixing modules. I assume you
>> mean having jena-text and jena-text-es as dependencies in a build without
>> causing the build to conflict. If that is what you mean then the answer is
>> yes, it is possible and quite simple as well. Let me explain how it is
>> possible. But before that, some assumptions which I want to call out
>> explicitly.
>> 
>> *Assumptions:*
>> 1. At a given point in time, only a single Indexing Technology is used for
>> text based indexing and searching via Jena. What this means is that we will
>> either use Lucene Implementation OR Solr Implementation OR ES
>> Implementation at any given point in time.
>> 2. Fuseki build does not depend on any Lucene 4.9.1 specific classes but
>> only on jena-text classes, if at all.
>> 
>> Based on these assumptions it is possible to create a build that contains
>> jena-text based common classes + ES specific classes without any
>> compatibility issues. And it is in fact quite simple. I did it in the
>> current jena-text-es module and ran the entire build, which succeeded.
>> The key is to include the latest Lucene dependencies at the very beginning
>> in the pom and then include the jena-text dependency. Maven will then
>> automatically resolve the dependency issues by including the Lucene
>> libraries that we included in our ES-specific pom. Have a look at the pom of
>> the jena-text-es module here to see how it can be done:
>> https://github.com/EaseTech/jena/blob/master/jena-text-es/pom.xml
>> 
>> 
>> Thanks,
>> Anuj Kumar
>> 
>> 
>> On Wed, Mar 1, 2017 at 7:27 AM, Osma Suominen <osma.suomi...@helsinki.fi>
>> wrote:
>> 
>>> Hi Anuj,
>>> 
>>> I understand your concerns. However, we also need to balance between the
>>> needs of individual modules/features and the whole codebase. I'm willing to
>>> put in the effort to keep the other modules up to date with newer Lucene
>>> versions. Lucene upgrade requirements are well documented, the only hitches
>>> seen in JENA-1250 were related to how jena-text (ab)used some Lucene
>>> features that were dropped from newer versions.
>>> 
>>> A perhaps stupid question to more experienced Java developers: is it even
>>> possible to mix modules that depend on different versions of the Lucene
>>> libraries within the same project? In my (quite limited) understanding of
>>> Java projects and libraries, this requires special arrangements (e.g.
>>> shading) as the Java package/class namespace is shared by all the code
>>> running within the same JVM.
>>> 
>>> So can you create, say, a Fuseki build that contains the current jena-text
>>> module (depending on Lucene 4.x) and the new jena-text-es module (depending
>>> on Lucene 6.4.1) without any compatibility issues?
>>> 
>>> -Osma
>>> 
>>> 
>>> 
>>> 
>>> 01.03.2017, 00:47, anuj kumar wrote:
>>> 
>>>> Hi,
>>>> 
>>>> My 2 Cents :
>>>> 
>>>> The reason I proposed to have separate modules for Lucene, Solr and ES is
>>>> exactly for avoiding the "All or Nothing" approach we need to take if we
>>>> club them all together. If they stay together and if in the near future I
>>>> want to upgrade ES to another version, I also need to again upgrade Lucene
>>>> and Solr and possibly another implementation that may have been added
>>>> during the time. As we all know, this means weeks of work if not months to
>>>> get the changes released. This will personally de-motivate me to do
>>>> anything and I will probably start maintaining my version of Jena-Text as
>>>> that would be much simpler to do than to upgrade and test and in the
>>>> process own (read: fix bugs) the upgrade for each and every technology.
>>>> 
>>>> If they are developed as separate modules, they can evolve independently
>>>> of each other and we can avoid situations where we can't upgrade to the
>>>> latest version of Lucene because we do not know what effect it will have
>>>> on the Solr implementation.
>>>> 
>>>> We can start with having a separate Module for Jena Text ES and see how
>>>> things go. If they go well, we could extract out Solr and Lucene out of
>>>> Jena Text.
>>>> 
>>>> Again this is just a suggestion based on my limited industry experience.
>>>> 
>>>> Thanks,
>>>> Anuj Kumar
>>>> 
>>>> 
>>>> 
>>>> On Tue, Feb 28, 2017 at 5:23 PM, Osma Suominen <osma.suomi...@helsinki.fi>
>>>> wrote:
>>>> 
>>>> 28.02.2017, 17:12, A. Soroka wrote:
>>>>> 
>>>>>> https://lists.apache.org/thread.html/dce0d502b11891c28e57bbcbb0cdef27d8374d58d9634076b8ef4cd7@1431107516@%3Cdev.jena.apache.org%3E
>>>>>> ? In other words, might it be better to factor out between -text and
>>>>>> -spatial and _then_ try to upgrade the Lucene version?
>>>>>> 
>>>>>> 
>>>>> I certainly wouldn't object to that, but somebody has to volunteer to do
>>>>> the actual work!
>>>>> 
>>>>> I don't use the Solr component now, but I could easily see so doing...
>>>>> 
>>>>>> that's pretty vague, I know, and I'm not in a position to do any work to
>>>>>> maintain it, so consider that just a very small and blurry data point.
>>>>>> :)
>>>>>> 
>>>>>> 
>>>>> Last time I tried it (it was a while ago) I couldn't figure out how to
>>>>> get it running... If you could just try that with some toy data, then
>>>>> your data point would be a lot less blurry :) I haven't used Solr for
>>>>> anything, so I'm not very familiar with how to set it up, and the
>>>>> jena-text instructions are pretty vague unfortunately.
>>>>> 
>>>>> 
>>>>> -Osma
>>>>> 
>>>>> 
>>>>> --
>>>>> Osma Suominen
>>>>> D.Sc. (Tech), Information Systems Specialist
>>>>> National Library of Finland
>>>>> P.O. Box 26 (Kaikukatu 4)
>>>>> 00014 HELSINGIN YLIOPISTO
>>>>> Tel. +358 50 3199529
>>>>> osma.suomi...@helsinki.fi
>>>>> http://www.nationallibrary.fi
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 
