Thanks Uwe.

On Sat, Apr 11, 2020 at 2:36 PM Uwe Schindler <[email protected]> wrote:
>
> Hi,
>
> If you check my plan, that's my intended way to proceed. The preparatory 
> things like package fixes and api changes can be done before. Like the huge 
> commit regarding service loader that was done already.
>
> Uwe
>
> Am April 11, 2020 12:27:26 PM UTC schrieb Dawid Weiss <[email protected]>:
>>
>> Can we focus on moving out of ant world on master first? :) It's
>> already painful to maintain the gradle build in parallel.
>>
>> D.
>>
>> On Sat, Apr 11, 2020 at 1:14 PM Uwe Schindler <[email protected]> wrote:
>>>
>>>
>>>  Hi,
>>>
>>>
>>>
>>>  I am currently on Easter vacation, but just wanted to give some feedback:
>>>
>>>  They wy you are doing it works at the moment, because you don’t have 
>>> anything to do with service providers. You don’t use custom codecs (outside 
>>> lucene-core.jar) and you just ignored the META-IF/services folder. Your 
>>> example works, as Lucene finds everything that is needed in the core.jar, 
>>> so the “legacy” service loading works (lucene-core.jar find its own 
>>> services in lucene-core.jar). As soon as you would add 
>>> lucene-backwards-codecs.jar, it would break unfixable, unless you copy the 
>>> whole backwards codecs into core. This is completely unfixable in Lucene 
>>> 8.x, as the new service-loader interface requires Java 9+.
>>>
>>>
>>>
>>>  But you may not yet have noticed that I already started to do preparatory 
>>> work and migrated in master (Lucene 9.0) the service provider to use the 
>>> Java Runtime classes and we have thrown away our own service loader (which 
>>> cannot cross module boundaries). The fix was merged not long ago: 
>>> https://issues.apache.org/jira/browse/LUCENE-9281 - this cannot be 
>>> backported to Java 8 / Lucene 8, as the reason for the home-made 
>>> SPIClassIterator was exactly the missing features (and some classpath 
>>> ordering issues in 3rd party JVMs like IBM J9). Those issues are fixed in 
>>> Java 11 (also allowing to instantiate the classes on your own). When you 
>>> are testing your “hack patch” with Lucene’s master branch you may succeed 
>>> with also loading classes from the backwards codecs – no guarantees yet!)
>>>
>>>
>>>
>>>  About StandardAnalyzer: Unfortunately I aggressively complained a while 
>>> back when Mike McCandless wanted to move standard analyzer out of the 
>>> analysis package into core (“for convenience”). This was a bad step, and 
>>> IMHO we should revert that or completely rename the packages and 
>>> everything. The problem here is: As the analysis services are only part of 
>>> lucene-analyzers, we had to leave the factory classes there, but move the 
>>> implementation classes in core. The package has to be the same. The only 
>>> way around that is to move the analysis factory framework also to core (I 
>>> would not be against that). This would include all factory base classes and 
>>> the service loading stuff. Then we can move standard analyzer and some of 
>>> the filters/tokenizers including their factories to core an that problem 
>>> would be solved.
>>>
>>>
>>>
>>>  My plan to move to modules is the following:
>>>
>>>
>>>
>>>  (1) Do preparatory work:
>>>
>>>  Retire SPIClassIterator (DONE).
>>>  Add some preparatory issues to cleanup class hierarchy: Move Analysis SPI 
>>> to core / remove StandardAnalyzer and related classes out of core back to 
>>> anaysis
>>>  Fix Codec API to no rely on package-private classes, so we can have a 
>>> completely public API with abstract classes for codecs, so stuff in 
>>> backwards-codecs does not need to have access to package private stuff in 
>>> core.
>>>  Cleanup sandbox to prefix all classes there with “sandbox” package and 
>>> where needed remove package-private access. If it’s needed for internal 
>>> access, WTF: Just move the stuff to core! We have a new version 9.0, so 
>>> either retire/delete Sandbox stuff or make it part of Lucene core.
>>>
>>>  (2) Wait for the Gradle Build to be finalized, because including Module 
>>> stuff into the current Ant build won’t work
>>>
>>>  Ant build must be retired!
>>>
>>>  (3) Make Lucene real modules
>>>
>>>  As we are on Java 11 already, add module-info.java everywhere
>>>  Fix gradle build to create and test modules (Latest Gradle needed)
>>>  Migrate all META-INF/services/* to module-info.java (before doing this, 
>>> figure out of the META-INF files must stays for non-module usage, or if 
>>> Java is clever enough to also look into module descriptor to find 
>>> services). We may need all services at both locations (for module or 
>>> classpath usage; we need a build helper to check that it’s in-line)
>>>
>>>
>>>
>>>  I don’t want Lucene work with automodules in Java 11, it should be fully 
>>> modularized.
>>>
>>>
>>>
>>>  If you want to help and express interest: Please open an issue for the 
>>> preparatory work listing the cases of same-package in different jar files. 
>>> I would split this up as described above: Analysis issues with 
>>> standardanalyzer, codecs pkg-private apis, sandbox.
>>>
>>>
>>>
>>>  Uwe
>>> ________________________________
>>>  Uwe Schindler
>>>
>>>  Achterdiek 19, D-28357 Bremen
>>>
>>>  https://www.thetaphi.de
>>>
>>>  eMail: [email protected]
>>>
>>>
>>>
>>>  From: David Ryan <[email protected]>
>>>  Sent: Saturday, April 11, 2020 9:30 AM
>>>  To: [email protected]
>>>  Subject: Re: Lucene 9.0 Java module system support
>>>
>>>
>>>
>>>
>>>
>>>  Thanks, I had probably missed some of what Uwe was saying in regards to 
>>> the limitation of what would be possible even if the suggested changes were 
>>> made. Given your points and the fact that it would take a while to have any 
>>> of the changes filter into a release version, I decided to develop a 
>>> patch-lucene.sh script to validate if the changes would allow me to access 
>>> the required parts of Lucene.  While it may not provide the full 
>>> functionality provided by SPI including analysis chains and codecs, I 
>>> worked out it will allow the use of the basics.
>>>
>>>
>>>
>>>  The patch script below adds Automatic module names to the four Lucene 
>>> libraries I needed (not strictly required, but I was interested to check if 
>>> it would work).  For now, any duplicate packages I identified have been 
>>> moved into the core library (as identified in suggested changes). Given, 
>>> I'm not using the SPI functionality, I simply deleted the standard 
>>> analysers from the services list (required or the module system complains 
>>> they are missing).  Thankfully Gradle doesn't validate jars in the cache so 
>>> the patch script only needs to be run once.
>>>
>>>
>>>
>>>  Once the patch script was applied, I was able to use Ngram analysers and 
>>> the query parser without issue. This will provide the essentials of what we 
>>> needed for an embedded solution (location indexing, short text indexing 
>>> with ngrams tokenizer and lower case filter, searching by location with 
>>> nearest, bounding box and radius search, and partial text searches using 
>>> the query parser). Of course, I've only scratched the surface of the APIs 
>>> and right now it isn't clear if/when the SPI functionality might cause 
>>> further issues or be required.
>>>
>>>
>>>
>>>  So, I've validated that the suggested changes would allow basic 
>>> functionality to work under the Java module system. If there's anything 
>>> else I can do to help progress the suggested changes please let me know.
>>>
>>>
>>>
>>>  Regards,
>>>
>>>  David.
>>>
>>>
>>>
>>>
>>>
>>>  --------- patch-lucene.sh -- works on osx ------
>>>
>>>  mkdir -p patch-lucene/core
>>>  mkdir -p patch-lucene/analyzers
>>>  mkdir -p patch-lucene/sandbox
>>>  mkdir -p patch-lucene/queryparser
>>>  cd patch-lucene
>>>
>>>  # copy the jars from the gradle cache.
>>>  cp 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar
>>>  .
>>>  cp 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar
>>>  .
>>>  cp 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar
>>>  .
>>>  cp 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar
>>>  .
>>>
>>>  # Add automatic module name to core.
>>>  cd core
>>>  jar -xf ../lucene-core-8.5.0.jar
>>>  sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: 
>>> org.apache.lucene.core'$'\n' META-INF/MANIFEST.MF
>>>
>>>  # Add automiatc module name, move standard analysis classes into core. 
>>> Remove any classes in standard from service lists.
>>>  cd ../analyzers
>>>  jar -xf ../lucene-analyzers-common-8.5.0.jar
>>>  sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: 
>>> org.apache.lucene.analyzers.common'$'\n' META-INF/MANIFEST.MF
>>>  sed -i '' -e '/standard/d' META-INF/services/*
>>>  mv org/apache/lucene/analysis/standard/* 
>>> ../core/org/apache/lucene/analysis/standard
>>>  rmdir org/apache/lucene/analysis/standard
>>>  jar -cfm ../lucene-analyzers-common-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>>
>>>  # Add automatic module name, move search and document packages into core. 
>>> move Java 9 specific search into core.
>>>  cd ../sandbox
>>>  jar -xf ../lucene-sandbox-8.5.0.jar
>>>  sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: 
>>> org.apache.lucene.sandbox'$'\n' META-INF/MANIFEST.MF
>>>  mv org/apache/lucene/search/* ../core/org/apache/lucene/search
>>>  rmdir org/apache/lucene/search
>>>  mv org/apache/lucene/document/* ../core/org/apache/lucene/document
>>>  rmdir org/apache/lucene/document
>>>  mv META-INF/versions/9/org/apache/lucene/search/* 
>>> ../core/META-INF/versions/9/org/apache/lucene/search
>>>  rm -rf META-INF/versions
>>>  jar -cfm ../lucene-sandbox-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>>
>>>  # Package up core with changes.
>>>  cd ../core
>>>  jar -cfm ../lucene-core-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>>
>>>  # Add automatic module name.
>>>  cd ../queryparser
>>>  jar -xf ../lucene-queryparser-8.5.0.jar
>>>  sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: 
>>> org.apache.lucene.queryparser'$'\n' META-INF/MANIFEST.MF
>>>  jar -cfm ../lucene-queryparser-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>>
>>>  # Copy the fixed versions back into gradle cache.
>>>  cd ..
>>>  cp lucene-core-8.5.0-fix.jar 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar
>>>  cp lucene-analyzers-common-8.5.0-fix.jar 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar
>>>  cp lucene-sandbox-8.5.0-fix.jar 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar
>>>  cp lucene-queryparser-8.5.0-fix.jar 
>>> ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar
>>> ________________________________
>>>  On Sat, Apr 11, 2020 at 11:21 AM Chris Hostetter 
>>> <[email protected]> wrote:
>>>
>>>
>>>  : If the changes I proposed are still viewed as having too many downstream
>>>  : impacts, my fallback position will be to patch the jars. This involves
>>>  : using the gradle import system to get the jars from Maven, then using a
>>>  : patch script to manually unzip the jars, move the offending classes into
>>>  : other jars which share the same package name and rezip. So far, I've been
>>>
>>>  I'm no expert here, but i trust that Uwe is, and i feel like your followup
>>>  questions/suggests have still avoided his primary point about *why*
>>>  Lucene/Solr hasn't attempted jigsaw modulariation...
>>>
>>>  : >>> There is currently some preparatory things to move forward with 
>>> modules,
>>>  : >>> so although you might be able to actually compile Lucene with module 
>>> system
>>>  : >>> (by limiting to a subset of JAR files), it currently won’t work
>>>  : >>> cross-module due to the way how it handles ServiceLoader interfaces
>>>  : >>> (codecs, postings formats, analyzers, see
>>>  : >>> https://issues.apache.org/jira/browse/LUCENE-9281). The only way to
>>>  : >>> make it work at runtime is to put all of Lucene into one module.
>>>
>>>  ...so, IIUC, even if you patch the *current* jars, any Lucene code you use
>>>  that depends on SPI (like analysis chains, codecs, etc...) isn't going to
>>>  work unless follow Uwe's primary suggestion for folks who care about
>>>  modules...
>>>
>>>  : >>> Th general recommendation is to combine all required Lucene libraries
>>>  : >>> into a separate JAR file during the maven / gradle build (e.g. using 
>>> the
>>>  : >>> Maven Shade plugin). Keep in mind that Lucene is also not suitable 
>>> for use
>>>
>>>
>>>  -Hoss
>>>  http://www.lucidworks.com/
>>> ________________________________
>>>  To unsubscribe, e-mail: [email protected]
>>>  For additional commands, e-mail: [email protected]
>>
>> ________________________________
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to