Thanks Uwe. On Sat, Apr 11, 2020 at 2:36 PM Uwe Schindler <[email protected]> wrote: > > Hi, > > If you check my plan, that's my intended way to proceed. The preparatory > things like package fixes and api changes can be done before. Like the huge > commit regarding service loader that was done already. > > Uwe > > Am April 11, 2020 12:27:26 PM UTC schrieb Dawid Weiss <[email protected]>: >> >> Can we focus on moving out of ant world on master first? :) It's >> already painful to maintain the gradle build in parallel. >> >> D. >> >> On Sat, Apr 11, 2020 at 1:14 PM Uwe Schindler <[email protected]> wrote: >>> >>> >>> Hi, >>> >>> >>> >>> I am currently on Easter vacation, but just wanted to give some feedback: >>> >>> They wy you are doing it works at the moment, because you don’t have >>> anything to do with service providers. You don’t use custom codecs (outside >>> lucene-core.jar) and you just ignored the META-IF/services folder. Your >>> example works, as Lucene finds everything that is needed in the core.jar, >>> so the “legacy” service loading works (lucene-core.jar find its own >>> services in lucene-core.jar). As soon as you would add >>> lucene-backwards-codecs.jar, it would break unfixable, unless you copy the >>> whole backwards codecs into core. This is completely unfixable in Lucene >>> 8.x, as the new service-loader interface requires Java 9+. >>> >>> >>> >>> But you may not yet have noticed that I already started to do preparatory >>> work and migrated in master (Lucene 9.0) the service provider to use the >>> Java Runtime classes and we have thrown away our own service loader (which >>> cannot cross module boundaries). The fix was merged not long ago: >>> https://issues.apache.org/jira/browse/LUCENE-9281 - this cannot be >>> backported to Java 8 / Lucene 8, as the reason for the home-made >>> SPIClassIterator was exactly the missing features (and some classpath >>> ordering issues in 3rd party JVMs like IBM J9). Those issues are fixed in >>> Java 11 (also allowing to instantiate the classes on your own). When you >>> are testing your “hack patch” with Lucene’s master branch you may succeed >>> with also loading classes from the backwards codecs – no guarantees yet!) >>> >>> >>> >>> About StandardAnalyzer: Unfortunately I aggressively complained a while >>> back when Mike McCandless wanted to move standard analyzer out of the >>> analysis package into core (“for convenience”). This was a bad step, and >>> IMHO we should revert that or completely rename the packages and >>> everything. The problem here is: As the analysis services are only part of >>> lucene-analyzers, we had to leave the factory classes there, but move the >>> implementation classes in core. The package has to be the same. The only >>> way around that is to move the analysis factory framework also to core (I >>> would not be against that). This would include all factory base classes and >>> the service loading stuff. Then we can move standard analyzer and some of >>> the filters/tokenizers including their factories to core an that problem >>> would be solved. >>> >>> >>> >>> My plan to move to modules is the following: >>> >>> >>> >>> (1) Do preparatory work: >>> >>> Retire SPIClassIterator (DONE). >>> Add some preparatory issues to cleanup class hierarchy: Move Analysis SPI >>> to core / remove StandardAnalyzer and related classes out of core back to >>> anaysis >>> Fix Codec API to no rely on package-private classes, so we can have a >>> completely public API with abstract classes for codecs, so stuff in >>> backwards-codecs does not need to have access to package private stuff in >>> core. >>> Cleanup sandbox to prefix all classes there with “sandbox” package and >>> where needed remove package-private access. If it’s needed for internal >>> access, WTF: Just move the stuff to core! We have a new version 9.0, so >>> either retire/delete Sandbox stuff or make it part of Lucene core. >>> >>> (2) Wait for the Gradle Build to be finalized, because including Module >>> stuff into the current Ant build won’t work >>> >>> Ant build must be retired! >>> >>> (3) Make Lucene real modules >>> >>> As we are on Java 11 already, add module-info.java everywhere >>> Fix gradle build to create and test modules (Latest Gradle needed) >>> Migrate all META-INF/services/* to module-info.java (before doing this, >>> figure out of the META-INF files must stays for non-module usage, or if >>> Java is clever enough to also look into module descriptor to find >>> services). We may need all services at both locations (for module or >>> classpath usage; we need a build helper to check that it’s in-line) >>> >>> >>> >>> I don’t want Lucene work with automodules in Java 11, it should be fully >>> modularized. >>> >>> >>> >>> If you want to help and express interest: Please open an issue for the >>> preparatory work listing the cases of same-package in different jar files. >>> I would split this up as described above: Analysis issues with >>> standardanalyzer, codecs pkg-private apis, sandbox. >>> >>> >>> >>> Uwe >>> ________________________________ >>> Uwe Schindler >>> >>> Achterdiek 19, D-28357 Bremen >>> >>> https://www.thetaphi.de >>> >>> eMail: [email protected] >>> >>> >>> >>> From: David Ryan <[email protected]> >>> Sent: Saturday, April 11, 2020 9:30 AM >>> To: [email protected] >>> Subject: Re: Lucene 9.0 Java module system support >>> >>> >>> >>> >>> >>> Thanks, I had probably missed some of what Uwe was saying in regards to >>> the limitation of what would be possible even if the suggested changes were >>> made. Given your points and the fact that it would take a while to have any >>> of the changes filter into a release version, I decided to develop a >>> patch-lucene.sh script to validate if the changes would allow me to access >>> the required parts of Lucene. While it may not provide the full >>> functionality provided by SPI including analysis chains and codecs, I >>> worked out it will allow the use of the basics. >>> >>> >>> >>> The patch script below adds Automatic module names to the four Lucene >>> libraries I needed (not strictly required, but I was interested to check if >>> it would work). For now, any duplicate packages I identified have been >>> moved into the core library (as identified in suggested changes). Given, >>> I'm not using the SPI functionality, I simply deleted the standard >>> analysers from the services list (required or the module system complains >>> they are missing). Thankfully Gradle doesn't validate jars in the cache so >>> the patch script only needs to be run once. >>> >>> >>> >>> Once the patch script was applied, I was able to use Ngram analysers and >>> the query parser without issue. This will provide the essentials of what we >>> needed for an embedded solution (location indexing, short text indexing >>> with ngrams tokenizer and lower case filter, searching by location with >>> nearest, bounding box and radius search, and partial text searches using >>> the query parser). Of course, I've only scratched the surface of the APIs >>> and right now it isn't clear if/when the SPI functionality might cause >>> further issues or be required. >>> >>> >>> >>> So, I've validated that the suggested changes would allow basic >>> functionality to work under the Java module system. If there's anything >>> else I can do to help progress the suggested changes please let me know. >>> >>> >>> >>> Regards, >>> >>> David. >>> >>> >>> >>> >>> >>> --------- patch-lucene.sh -- works on osx ------ >>> >>> mkdir -p patch-lucene/core >>> mkdir -p patch-lucene/analyzers >>> mkdir -p patch-lucene/sandbox >>> mkdir -p patch-lucene/queryparser >>> cd patch-lucene >>> >>> # copy the jars from the gradle cache. >>> cp >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar >>> . >>> cp >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar >>> . >>> cp >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar >>> . >>> cp >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar >>> . >>> >>> # Add automatic module name to core. >>> cd core >>> jar -xf ../lucene-core-8.5.0.jar >>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: >>> org.apache.lucene.core'$'\n' META-INF/MANIFEST.MF >>> >>> # Add automiatc module name, move standard analysis classes into core. >>> Remove any classes in standard from service lists. >>> cd ../analyzers >>> jar -xf ../lucene-analyzers-common-8.5.0.jar >>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: >>> org.apache.lucene.analyzers.common'$'\n' META-INF/MANIFEST.MF >>> sed -i '' -e '/standard/d' META-INF/services/* >>> mv org/apache/lucene/analysis/standard/* >>> ../core/org/apache/lucene/analysis/standard >>> rmdir org/apache/lucene/analysis/standard >>> jar -cfm ../lucene-analyzers-common-8.5.0-fix.jar META-INF/MANIFEST.MF . >>> >>> # Add automatic module name, move search and document packages into core. >>> move Java 9 specific search into core. >>> cd ../sandbox >>> jar -xf ../lucene-sandbox-8.5.0.jar >>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: >>> org.apache.lucene.sandbox'$'\n' META-INF/MANIFEST.MF >>> mv org/apache/lucene/search/* ../core/org/apache/lucene/search >>> rmdir org/apache/lucene/search >>> mv org/apache/lucene/document/* ../core/org/apache/lucene/document >>> rmdir org/apache/lucene/document >>> mv META-INF/versions/9/org/apache/lucene/search/* >>> ../core/META-INF/versions/9/org/apache/lucene/search >>> rm -rf META-INF/versions >>> jar -cfm ../lucene-sandbox-8.5.0-fix.jar META-INF/MANIFEST.MF . >>> >>> # Package up core with changes. >>> cd ../core >>> jar -cfm ../lucene-core-8.5.0-fix.jar META-INF/MANIFEST.MF . >>> >>> # Add automatic module name. >>> cd ../queryparser >>> jar -xf ../lucene-queryparser-8.5.0.jar >>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: >>> org.apache.lucene.queryparser'$'\n' META-INF/MANIFEST.MF >>> jar -cfm ../lucene-queryparser-8.5.0-fix.jar META-INF/MANIFEST.MF . >>> >>> # Copy the fixed versions back into gradle cache. >>> cd .. >>> cp lucene-core-8.5.0-fix.jar >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar >>> cp lucene-analyzers-common-8.5.0-fix.jar >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar >>> cp lucene-sandbox-8.5.0-fix.jar >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar >>> cp lucene-queryparser-8.5.0-fix.jar >>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar >>> ________________________________ >>> On Sat, Apr 11, 2020 at 11:21 AM Chris Hostetter >>> <[email protected]> wrote: >>> >>> >>> : If the changes I proposed are still viewed as having too many downstream >>> : impacts, my fallback position will be to patch the jars. This involves >>> : using the gradle import system to get the jars from Maven, then using a >>> : patch script to manually unzip the jars, move the offending classes into >>> : other jars which share the same package name and rezip. So far, I've been >>> >>> I'm no expert here, but i trust that Uwe is, and i feel like your followup >>> questions/suggests have still avoided his primary point about *why* >>> Lucene/Solr hasn't attempted jigsaw modulariation... >>> >>> : >>> There is currently some preparatory things to move forward with >>> modules, >>> : >>> so although you might be able to actually compile Lucene with module >>> system >>> : >>> (by limiting to a subset of JAR files), it currently won’t work >>> : >>> cross-module due to the way how it handles ServiceLoader interfaces >>> : >>> (codecs, postings formats, analyzers, see >>> : >>> https://issues.apache.org/jira/browse/LUCENE-9281). The only way to >>> : >>> make it work at runtime is to put all of Lucene into one module. >>> >>> ...so, IIUC, even if you patch the *current* jars, any Lucene code you use >>> that depends on SPI (like analysis chains, codecs, etc...) isn't going to >>> work unless follow Uwe's primary suggestion for folks who care about >>> modules... >>> >>> : >>> Th general recommendation is to combine all required Lucene libraries >>> : >>> into a separate JAR file during the maven / gradle build (e.g. using >>> the >>> : >>> Maven Shade plugin). Keep in mind that Lucene is also not suitable >>> for use >>> >>> >>> -Hoss >>> http://www.lucidworks.com/ >>> ________________________________ >>> To unsubscribe, e-mail: [email protected] >>> For additional commands, e-mail: [email protected] >> >> ________________________________ >> To unsubscribe, e-mail: [email protected] >> For additional commands, e-mail: [email protected] >> > > -- > Uwe Schindler > Achterdiek 19, 28357 Bremen > https://www.thetaphi.de
--------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
