Just to be clear, I'm not suggesting that you or the Solr project should use the module system. I agree that it is not for every project. However, more and more projects and libraries are slowly moving to the module system or making their libraries compatible with it. I'm not suggesting you add a module-info.java to each of your libraries. The changes I've proposed are the minimum required to allow me, and anyone else who chooses to use the module system, to include Lucene libraries without issues.

Regarding your observations, you are right: there are plenty of issues in moving to the module system. We chose to move a large project to the module system while switching to Java 11 last year. We viewed this as paying forward the technical debt we thought we would encounter sometime in the future. Overall, I'm pleased we made the jump. While we do need a more complex start script (--add-modules, --add-opens, --add-exports, etc.), that has been a set-and-forget exercise in the shell script, and it joins the GC, memory, JMX and other config that was there previously. I like that module-info forces us to think about how each sub-library in our system interfaces with the others. We have a way to go in closing off implementation details, but this can be tackled over time.

Maybe surprisingly, this is the first library I've wanted to use that hasn't got a plan to at least be compatible (i.e. make minor changes to add an Automatic-Module-Name and clean up shared package names). Other libraries I use successfully with the module system include Google Guice, Apache Commons, Jackson, Jersey, Jetty, Dropwizard Metrics, etc. I've also been able to make limited use of Lucene for spatial indexing by importing only lucene-core and lucene-spatial. However, anything beyond that requires importing lucene-analyzers-common and lucene-queryparser (which requires lucene-sandbox), and that is where I have package name conflicts.

If the changes I proposed are still viewed as having too many downstream impacts, my fallback position will be to patch the jars. This involves using the Gradle import system to get the jars from Maven, then using a patch script to manually unzip the jars, move the offending classes into other jars which share the same package name, and rezip.
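
To make the patch step concrete, here is a rough Java sketch of the move-and-rezip idea. It is only an illustration, not the actual patch script: JarPatchSketch and its method names are made up, it holds whole jars in memory, and it assumes the jars are unsigned (moving classes would invalidate signatures).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: moves one package from a donor jar into a receiver jar. */
final class JarPatchSketch {

    /** Reads every non-directory entry of a jar into memory, keyed by entry name. */
    public static Map<String, byte[]> read(InputStream jar) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(jar)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!e.isDirectory()) {
                    // readAllBytes stops at the end of the current entry.
                    entries.put(e.getName(), zip.readAllBytes());
                }
            }
        }
        return entries;
    }

    /** Moves entries directly inside packageName from donor to receiver; returns how many moved. */
    public static int movePackage(Map<String, byte[]> donor,
                                  Map<String, byte[]> receiver,
                                  String packageName) {
        String prefix = packageName.replace('.', '/') + "/";
        int moved = 0;
        Iterator<Map.Entry<String, byte[]>> it = donor.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, byte[]> entry = it.next();
            String name = entry.getKey();
            // Sub-packages are separate packages, so skip anything nested deeper.
            boolean inPackage = name.startsWith(prefix)
                    && name.indexOf('/', prefix.length()) < 0;
            if (!inPackage) {
                continue;
            }
            if (receiver.putIfAbsent(name, entry.getValue()) != null) {
                throw new IllegalStateException("Both jars contain " + name);
            }
            it.remove();
            moved++;
        }
        return moved;
    }

    /** Writes the entries back out as a jar (zip) archive. */
    public static void write(Map<String, byte[]> entries, OutputStream out) throws IOException {
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
                zip.putNextEntry(new ZipEntry(entry.getKey()));
                zip.write(entry.getValue());
                zip.closeEntry();
            }
        }
    }
}
```

For the Lucene case this would mean reading lucene-analyzers-common and lucene-core, calling movePackage with org.apache.lucene.analysis.standard, and writing both jars back out.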
So far, I've been required to patch two libraries (https://github.com/jnr/jffi/issues/73 and https://github.com/datastax/native-protocol/issues/31). Of course, I'd rather avoid this where possible.

Regards,
David.

On Fri, Apr 10, 2020 at 11:58 PM Gus Heck <[email protected]> wrote:

> While the module system sounds nice in theory, my experience is
> unfortunately that it is quite difficult to use for any application with
> many existing dependencies, 90% or more of projects don't use it and
> therefore wind up in the default module. I've wasted many hours trying to
> get it to work in fat jar packaging situations (JesterJ uses my fork of
> one-jar, uno-jar) and that is nearly impossible. In order to use it one has
> to be willing to issue a very large number of --add-opens or --add-imports
> directives on the command line, which totally obliterates the nice java
> -jar myJar.jar command-line invocation, and creates a *requirement* for a
> startup script which then also has to maintain that list of directives
> (more cost than gain, but tractable for cases where there is a startup
> script, intractable for fat jars that want to be executable).
>
> There are allegedly manifest attributes, and you can find reference to
> add-exports attribute if you dig long enough to find the archives emails
> that were part of the JCP process... or more recently here:
> https://www.jrebel.com/blog/java-9-modules-cheat-sheet... but they are
> not documented here:
> https://docs.oracle.com/en/java/javase/11/docs/specs/jar/jar.html and
> while add-exports seems to work in the manifest, add-opens did not (at
> least when I tried to use it). I have yet to find either attribute in any
> official documentation despite hours of looking.
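
On the manifest attributes: as far as I can tell they are described in JEP 261 (the module system JEP) rather than in the JAR specification. Add-Exports and Add-Opens each take a space-separated list of module/package pairs, and they are only interpreted in the main jar started with java -jar. Below is a small sketch of generating such a manifest with java.util.jar.Manifest. The class name, method and example values are illustrative only, and I haven't verified the Add-Opens behaviour you ran into.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper: renders a MANIFEST.MF for an executable jar. */
final class LauncherManifestSketch {

    /**
     * addExports and addOpens are space-separated module/package pairs,
     * e.g. "java.base/java.lang java.base/java.util".
     */
    public static String render(String mainClass, String addExports, String addOpens) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest.write emits nothing unless Manifest-Version is set.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        // There are no predefined constants for these two names, so use plain strings.
        main.putValue("Add-Exports", addExports);
        main.putValue("Add-Opens", addOpens);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            // Cannot realistically happen for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return out.toString(StandardCharsets.UTF_8);
    }
}
```

The rendered text would go into META-INF/MANIFEST.MF of the executable jar. Whether it removes the need for a start script depends on the launcher honouring both attributes, which is exactly the part in question.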
>
> My general inclination has become that modules are of negative value for any
> project that uses jars that were not written by the project or a project
> they control, or has zero external dependencies, which is sad because
> that's what it was supposed to help fix. The primary reason for this is the
> inability to see into the default module if you declare a module, and the
> error mentioned at the top of this thread, which leaves one re-packaging
> 3rd party libraries (more cost than gain, possible license implications).
>
> I've generally concluded it's best to avoid modules unless it's forced on
> me for some reason. I've yet to feel a module helped me but many many times
> I found that it got in the way.
>
> YMMV,
> Gus
>
> On Fri, Apr 10, 2020 at 2:30 AM David Ryan <[email protected]> wrote:
>
>>
>> Hi Uwe,
>>
>> As I mentioned in the original post, the main issue I've come across with
>> being able to support Java 11 modules is that Lucene doesn't cleanly
>> separate the use of package names across jar libraries which results in
>> errors like:
>>
>> "The package org.apache.lucene.analysis.standard is accessible from more
>> than one module: lucene.analyzers.common, lucene.core"
>>
>> I ran some analysis to see how extensive the problem is using the code
>> below (pass in the directory of unzipped Lucene 8.5.0 as argument).
>> Actually, the problem is really not that big and could be fixed. If you
>> ignore duplicates in lucene-test-framework-8.5.0.jar,
>> lucene-backward-codecs-8.5.0.jar and lucene-misc-8.5.0, you are left with
>> the following issues:
>>
>> package org.apache.lucene.analysis.standard
>> - lucene-core-8.5.0.jar
>> - lucene-analyzers-common-8.5.0.jar
>>
>> Would it be viable to rename package org.apache.lucene.analysis.standard
>> in lucene-analyzers-common to org.apache.lucene.analysis.classic or move
>> the classes into the core library?
>>
>> package org.apache.lucene.search
>> package org.apache.lucene.document
>> - lucene-core-8.5.0.jar
>> - lucene-sandbox-8.5.0.jar
>>
>> Would it be possible to move the search and document package in
>> lucene-sandbox into a sandbox sub-package? e.g.
>> org.apache.lucene.sandbox.search
>>
>> package org.apache.lucene.collation
>> package org.apache.lucene.collation.tokenattributes
>> - lucene-analyzers-common-8.5.0.jar
>> - lucene-analyzers-icu-8.5.0.jar
>>
>> Would it be possible to move collation into a collation.icu sub-package?
>> e.g. org.apache.lucene.collation.icu
>>
>> As I'm not deeply familiar with the Lucene code base, I don't know what
>> the flow-on effects of these changes would be. However, I'd be happy to
>> raise the JIRA issue(s) and prepare the changes/pull request.
>>
>> I believe with these changes many of the important Lucene libraries could
>> be cleanly brought into an application using the module system. Another
>> nice to have would be to add an Automatic-Module-Name entry to the
>> MANIFEST.MF of each jar to stop automated module names.
>>
>> As you mentioned the fall back is to create a lucene-all-8.5.0.jar and
>> combine all classes into a single large file. Given this is not currently
>> published on Maven, if there's no interest in making the above changes,
>> would it be possible to change the build system to publish a lucene-all
>> package?
>>
>> Regards,
>> David.
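
For context on the automated names mentioned above: when a jar has neither a module descriptor nor an Automatic-Module-Name manifest entry, the module name is derived from the jar's file name. Below is a minimal sketch of that derivation, following the rules described in the ModuleFinder.of javadoc. The class and method names are mine, and this is an illustration rather than the JDK's actual code (it skips the JDK's check that the result is a legal module name).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: mimics how an automatic module name is derived from a jar file name. */
final class AutomaticModuleNameSketch {

    // A hyphen followed by digits and then a dot (or the end of the name) starts the version.
    private static final Pattern VERSION_START = Pattern.compile("-(\\d+(\\.|$))");

    public static String deriveName(String jarFileName) {
        // Drop the ".jar" suffix if present.
        String name = jarFileName.endsWith(".jar")
                ? jarFileName.substring(0, jarFileName.length() - ".jar".length())
                : jarFileName;

        // Everything from the first "-<digits>" onward is treated as the version and dropped.
        Matcher matcher = VERSION_START.matcher(name);
        if (matcher.find()) {
            name = name.substring(0, matcher.start());
        }

        return name.replaceAll("[^A-Za-z0-9]", ".") // non-alphanumerics become dots
                .replaceAll("\\.{2,}", ".")           // collapse runs of dots
                .replaceAll("^\\.|\\.$", "");         // trim a leading or trailing dot
    }
}
```

With these rules lucene-analyzers-common-8.5.0.jar becomes lucene.analyzers.common and lucene-core-8.5.0.jar becomes lucene.core, which matches the module names in the error above. An explicit Automatic-Module-Name entry would take precedence over the derived name, so the name would no longer depend on how the file happens to be called.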
>>
>> -------- Full output of duplicated packages --------
>>
>> package org.apache.lucene.analysis.standard
>> - lucene-core-8.5.0.jar
>> - lucene-analyzers-common-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs.blockterms
>> - lucene-codecs-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs.lucene80
>> - lucene-core-8.5.0.jar
>> - lucene-backward-codecs-8.5.0.jar
>>
>> package org.apache.lucene.search.similarities
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs.lucene70
>> - lucene-core-8.5.0.jar
>> - lucene-backward-codecs-8.5.0.jar
>>
>> package org.apache.lucene.store
>> - lucene-misc-8.5.0.jar
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs.bloom
>> - lucene-codecs-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs.lucene50
>> - lucene-core-8.5.0.jar
>> - lucene-backward-codecs-8.5.0.jar
>>
>> package org.apache.lucene.search.spans
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.index
>> - lucene-misc-8.5.0.jar
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.util.fst
>> - lucene-misc-8.5.0.jar
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.analysis
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.collation
>> - lucene-analyzers-common-8.5.0.jar
>> - lucene-analyzers-icu-8.5.0.jar
>>
>> package org.apache.lucene.codecs.uniformsplit.sharedterms
>> - lucene-codecs-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.geo
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.search
>> - lucene-misc-8.5.0.jar
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>> - lucene-sandbox-8.5.0.jar
>>
>> package org.apache.lucene.collation.tokenattributes
>> - lucene-analyzers-common-8.5.0.jar
>> - lucene-analyzers-icu-8.5.0.jar
>>
>> package org.apache.lucene.document
>> - lucene-misc-8.5.0.jar
>> - lucene-core-8.5.0.jar
>> - lucene-sandbox-8.5.0.jar
>>
>> package org.apache.lucene.codecs.compressing
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs.uniformsplit
>> - lucene-codecs-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.util
>> - lucene-misc-8.5.0.jar
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>> package org.apache.lucene.codecs
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>> - lucene-backward-codecs-8.5.0.jar
>>
>> package org.apache.lucene.util.automaton
>> - lucene-core-8.5.0.jar
>> - lucene-test-framework-8.5.0.jar
>>
>>
>> -------
>>
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.IOException;
>> import java.util.ArrayList;
>> import java.util.HashMap;
>> import java.util.List;
>> import java.util.Map;
>> import java.util.Map.Entry;
>> import java.util.zip.ZipEntry;
>> import java.util.zip.ZipInputStream;
>>
>> public class ReadJars {
>>
>>     public static void main(final String[] arguments) throws IOException {
>>         System.out.println("Find duplicate packages: <path to jar files>");
>>         if (arguments.length < 1) {
>>             return; // No directory given; the line above doubles as usage.
>>         }
>>
>>         Map<String, List<String>> duplicates = new HashMap<>();
>>         List<File> files = new ArrayList<>();
>>
>>         // Find all files in the directory and its sub-directories.
>>         addFiles(files, arguments[0]);
>>
>>         // Find all package duplicates.
>>         findDuplicatePackages(duplicates, files);
>>
>>         // Print each duplicated package with the jars that contain it.
>>         for (Entry<String, List<String>> entry : duplicates.entrySet()) {
>>             System.out.println("\npackage " + entry.getKey());
>>             for (String dup : entry.getValue()) {
>>                 System.out.println(" - " + dup);
>>             }
>>         }
>>     }
>>
>>     private static void findDuplicatePackages(Map<String, List<String>> duplicates,
>>             List<File> files) throws IOException {
>>         // Remembers the first jar in which each package was seen.
>>         Map<String, String> packageToJarMap = new HashMap<>();
>>         for (File f : files) {
>>             if (f.getName().endsWith(".jar")) {
>>
>>                 // Bare file name; getName() avoids assuming '/' as the separator.
>>                 String file = f.getName();
>>
>>                 // try-with-resources closes the stream even if reading fails.
>>                 try (ZipInputStream zip = new ZipInputStream(new FileInputStream(f))) {
>>                     for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
>>                         if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
>>                             String className = entry.getName();
>>                             String packageName = "";
>>                             if (className.lastIndexOf('/') >= 0) {
>>                                 packageName = className.substring(0, className.lastIndexOf('/')).replace('/', '.');
>>                             }
>>                             className = className.replace('/', '.');
>>
>>                             // Only interested in Lucene packages.
>>                             if (className.startsWith("org.apache.lucene")) {
>>
>>                                 String jar = packageToJarMap.get(packageName);
>>                                 if (jar != null && !jar.equals(file)) {
>>                                     // Package already seen in a different jar.
>>                                     List<String> duplicate = duplicates.get(packageName);
>>                                     if (duplicate == null) {
>>                                         duplicate = new ArrayList<>();
>>                                         duplicate.add(jar);
>>                                         duplicate.add(file);
>>                                         duplicates.put(packageName, duplicate);
>>                                     } else if (!duplicate.contains(file)) {
>>                                         duplicate.add(file);
>>                                     }
>>                                 } else {
>>                                     packageToJarMap.put(packageName, file);
>>                                 }
>>                             }
>>                         }
>>                     }
>>                 }
>>             }
>>         }
>>     }
>>
>>     public static void addFiles(List<File> files, String directoryName) {
>>         File directory = new File(directoryName);
>>         File[] children = directory.listFiles();
>>         if (children == null) {
>>             return; // Not a directory, or it could not be read.
>>         }
>>         for (File file : children) {
>>             if (file.isFile()) {
>>                 files.add(file);
>>             } else if (file.isDirectory()) {
>>                 addFiles(files, file.getAbsolutePath());
>>             }
>>         }
>>     }
>> }
>>
>> --------
>>
>>
>> On Tue, Mar 24, 2020 at 9:26 PM Uwe Schindler <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> this has been a known problem for many years and there are no plans to change
>>> this yet. The main reason for Java 11 support is not to introduce the
>>> module system, but instead use the new features of Java 11 and to get rid
>>> of MR-JAR complications to make use of new intrinsics.
>>>
>>> Servers like Solr or Elasticsearch are shipped as an application, so the
>>> module system does not bring any benefit.
>>>
>>> The general recommendation is to combine all required Lucene libraries
>>> into a separate JAR file during the maven / gradle build (e.g. using the
>>> Maven Shade plugin).
>>> Keep in mind that Lucene is also not suitable for use
>>> in other module systems like
>>>
>>> There are currently some preparatory things to move forward with modules,
>>> so although you might be able to actually compile Lucene with module system
>>> (by limiting to a subset of JAR files), it currently won’t work
>>> cross-module due to the way it handles ServiceLoader interfaces
>>> (codecs, postings formats, analyzers, see
>>> https://issues.apache.org/jira/browse/LUCENE-9281). The only way to
>>> make it work at runtime is to put all of Lucene into one module.
>>>
>>> Uwe
>>>
>>> -----
>>> Uwe Schindler
>>> Achterdiek 19, D-28357 Bremen
>>> https://www.thetaphi.de
>>> eMail: [email protected]
>>>
>>> *From:* David Ryan <[email protected]>
>>> *Sent:* Tuesday, March 24, 2020 8:05 AM
>>> *To:* [email protected]
>>> *Subject:* Lucene 9.0 Java module system support
>>>
>>> Hi all,
>>>
>>> I've been investigating the use of Lucene as part of an application that
>>> uses the Java Module System. Initially, I used Gradle to bring in
>>> lucene-core, lucene-spatial and lucene-queries using version 8.5.0. This
>>> works with the automated module naming. However, bringing in
>>> lucene-queryparser (which depends on lucene-sandbox) or
>>> lucene-analyzers-common causes errors such as:
>>>
>>> "The package org.apache.lucene.analysis.standard is accessible from more
>>> than one module: lucene.analyzers.common, lucene.core"
>>>
>>> The Java module system does not handle different jar files using the
>>> same package name which occurs throughout multiple maven artifacts.
>>>
>>> Looking at the LUCENE issues, I found the task of moving to a minimum
>>> version of Java 11, however, this does not mention the ability to be
>>> compatible with the module system. I also checked the git repository and
>>> couldn't find the required changes to support the module system.
>>>
>>> https://issues.apache.org/jira/browse/LUCENE-8738
>>>
>>> I looked through the dev list recent history but could not find anything
>>> related. Are there any plans to support modules? Given, I saw there are a
>>> number of other breaking changes happening with the move to Lucene 9.0,
>>> would it be good to make those changes?
>>>
>>> Thanks,
>>>
>>> David.
>>>
>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>
