Hi,
I am currently on Easter vacation, but just wanted to give some feedback: They wy you are doing it works at the moment, because you don’t have anything to do with service providers. You don’t use custom codecs (outside lucene-core.jar) and you just ignored the META-IF/services folder. Your example works, as Lucene finds everything that is needed in the core.jar, so the “legacy” service loading works (lucene-core.jar find its own services in lucene-core.jar). As soon as you would add lucene-backwards-codecs.jar, it would break unfixable, unless you copy the whole backwards codecs into core. This is completely unfixable in Lucene 8.x, as the new service-loader interface requires Java 9+. But you may not yet have noticed that I already started to do preparatory work and migrated in master (Lucene 9.0) the service provider to use the Java Runtime classes and we have thrown away our own service loader (which cannot cross module boundaries). The fix was merged not long ago: <https://issues.apache.org/jira/browse/LUCENE-9281> https://issues.apache.org/jira/browse/LUCENE-9281 - this cannot be backported to Java 8 / Lucene 8, as the reason for the home-made SPIClassIterator was exactly the missing features (and some classpath ordering issues in 3rd party JVMs like IBM J9). Those issues are fixed in Java 11 (also allowing to instantiate the classes on your own). When you are testing your “hack patch” with Lucene’s master branch you may succeed with also loading classes from the backwards codecs – no guarantees yet!) About StandardAnalyzer: Unfortunately I aggressively complained a while back when Mike McCandless wanted to move standard analyzer out of the analysis package into core (“for convenience”). This was a bad step, and IMHO we should revert that or completely rename the packages and everything. The problem here is: As the analysis services are only part of lucene-analyzers, we had to leave the factory classes there, but move the implementation classes in core. The package has to be the same. The only way around that is to move the analysis factory framework also to core (I would not be against that). This would include all factory base classes and the service loading stuff. Then we can move standard analyzer and some of the filters/tokenizers including their factories to core an that problem would be solved. My plan to move to modules is the following: (1) Do preparatory work: * Retire SPIClassIterator (DONE). * Add some preparatory issues to cleanup class hierarchy: Move Analysis SPI to core / remove StandardAnalyzer and related classes out of core back to anaysis * Fix Codec API to no rely on package-private classes, so we can have a completely public API with abstract classes for codecs, so stuff in backwards-codecs does not need to have access to package private stuff in core. * Cleanup sandbox to prefix all classes there with “sandbox” package and where needed remove package-private access. If it’s needed for internal access, WTF: Just move the stuff to core! We have a new version 9.0, so either retire/delete Sandbox stuff or make it part of Lucene core. (2) Wait for the Gradle Build to be finalized, because including Module stuff into the current Ant build won’t work * Ant build must be retired! (3) Make Lucene real modules * As we are on Java 11 already, add module-info.java everywhere * Fix gradle build to create and test modules (Latest Gradle needed) * Migrate all META-INF/services/* to module-info.java (before doing this, figure out of the META-INF files must stays for non-module usage, or if Java is clever enough to also look into module descriptor to find services). We may need all services at both locations (for module or classpath usage; we need a build helper to check that it’s in-line) I don’t want Lucene work with automodules in Java 11, it should be fully modularized. If you want to help and express interest: Please open an issue for the preparatory work listing the cases of same-package in different jar files. I would split this up as described above: Analysis issues with standardanalyzer, codecs pkg-private apis, sandbox. Uwe ----- Uwe Schindler Achterdiek 19, D-28357 Bremen https://www.thetaphi.de eMail: [email protected] From: David Ryan <[email protected]> Sent: Saturday, April 11, 2020 9:30 AM To: [email protected] Subject: Re: Lucene 9.0 Java module system support Thanks, I had probably missed some of what Uwe was saying in regards to the limitation of what would be possible even if the suggested changes were made. Given your points and the fact that it would take a while to have any of the changes filter into a release version, I decided to develop a patch-lucene.sh script to validate if the changes would allow me to access the required parts of Lucene. While it may not provide the full functionality provided by SPI including analysis chains and codecs, I worked out it will allow the use of the basics. The patch script below adds Automatic module names to the four Lucene libraries I needed (not strictly required, but I was interested to check if it would work). For now, any duplicate packages I identified have been moved into the core library (as identified in suggested changes). Given, I'm not using the SPI functionality, I simply deleted the standard analysers from the services list (required or the module system complains they are missing). Thankfully Gradle doesn't validate jars in the cache so the patch script only needs to be run once. Once the patch script was applied, I was able to use Ngram analysers and the query parser without issue. This will provide the essentials of what we needed for an embedded solution (location indexing, short text indexing with ngrams tokenizer and lower case filter, searching by location with nearest, bounding box and radius search, and partial text searches using the query parser). Of course, I've only scratched the surface of the APIs and right now it isn't clear if/when the SPI functionality might cause further issues or be required. So, I've validated that the suggested changes would allow basic functionality to work under the Java module system. If there's anything else I can do to help progress the suggested changes please let me know. Regards, David. --------- patch-lucene.sh -- works on osx ------ mkdir -p patch-lucene/core mkdir -p patch-lucene/analyzers mkdir -p patch-lucene/sandbox mkdir -p patch-lucene/queryparser cd patch-lucene # copy the jars from the gradle cache. cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar . cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar . cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar . cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar . # Add automatic module name to core. cd core jar -xf ../lucene-core-8.5.0.jar sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.core'$'\n' META-INF/MANIFEST.MF # Add automiatc module name, move standard analysis classes into core. Remove any classes in standard from service lists. cd ../analyzers jar -xf ../lucene-analyzers-common-8.5.0.jar sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.analyzers.common'$'\n' META-INF/MANIFEST.MF sed -i '' -e '/standard/d' META-INF/services/* mv org/apache/lucene/analysis/standard/* ../core/org/apache/lucene/analysis/standard rmdir org/apache/lucene/analysis/standard jar -cfm ../lucene-analyzers-common-8.5.0-fix.jar META-INF/MANIFEST.MF . # Add automatic module name, move search and document packages into core. move Java 9 specific search into core. cd ../sandbox jar -xf ../lucene-sandbox-8.5.0.jar sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.sandbox'$'\n' META-INF/MANIFEST.MF mv org/apache/lucene/search/* ../core/org/apache/lucene/search rmdir org/apache/lucene/search mv org/apache/lucene/document/* ../core/org/apache/lucene/document rmdir org/apache/lucene/document mv META-INF/versions/9/org/apache/lucene/search/* ../core/META-INF/versions/9/org/apache/lucene/search rm -rf META-INF/versions jar -cfm ../lucene-sandbox-8.5.0-fix.jar META-INF/MANIFEST.MF . # Package up core with changes. cd ../core jar -cfm ../lucene-core-8.5.0-fix.jar META-INF/MANIFEST.MF . # Add automatic module name. cd ../queryparser jar -xf ../lucene-queryparser-8.5.0.jar sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.queryparser'$'\n' META-INF/MANIFEST.MF jar -cfm ../lucene-queryparser-8.5.0-fix.jar META-INF/MANIFEST.MF . # Copy the fixed versions back into gradle cache. cd .. cp lucene-core-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar cp lucene-analyzers-common-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar cp lucene-sandbox-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar cp lucene-queryparser-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar ------------ On Sat, Apr 11, 2020 at 11:21 AM Chris Hostetter <[email protected] <mailto:[email protected]> > wrote: : If the changes I proposed are still viewed as having too many downstream : impacts, my fallback position will be to patch the jars. This involves : using the gradle import system to get the jars from Maven, then using a : patch script to manually unzip the jars, move the offending classes into : other jars which share the same package name and rezip. So far, I've been I'm no expert here, but i trust that Uwe is, and i feel like your followup questions/suggests have still avoided his primary point about *why* Lucene/Solr hasn't attempted jigsaw modulariation... : >>> There is currently some preparatory things to move forward with modules, : >>> so although you might be able to actually compile Lucene with module system : >>> (by limiting to a subset of JAR files), it currently won’t work : >>> cross-module due to the way how it handles ServiceLoader interfaces : >>> (codecs, postings formats, analyzers, see : >>> https://issues.apache.org/jira/browse/LUCENE-9281). The only way to : >>> make it work at runtime is to put all of Lucene into one module. ...so, IIUC, even if you patch the *current* jars, any Lucene code you use that depends on SPI (like analysis chains, codecs, etc...) isn't going to work unless follow Uwe's primary suggestion for folks who care about modules... : >>> Th general recommendation is to combine all required Lucene libraries : >>> into a separate JAR file during the maven / gradle build (e.g. using the : >>> Maven Shade plugin). Keep in mind that Lucene is also not suitable for use -Hoss http://www.lucidworks.com/ --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] <mailto:[email protected]> For additional commands, e-mail: [email protected] <mailto:[email protected]>
