Hi,
I left comments for your questions about development setup on LUCENE-9317.
Please see them.

Tomoko


On Sun, Apr 12, 2020 at 14:16, David Ryan <[email protected]> wrote:

>
> Hi Uwe,
>
> Thanks for taking the time out of your Easter break to respond. I've
> created the following Jira tickets for preparatory tasks:
>
> LUCENE-9317 <https://issues.apache.org/jira/browse/LUCENE-9317> Resolve
> package name conflicts for StandardAnalyzer to allow Java module system
> support
> LUCENE-9318 <https://issues.apache.org/jira/browse/LUCENE-9318> Fix Codec
> API to not rely on package-private classes as part of changes to support
> java module system
> LUCENE-9319 <https://issues.apache.org/jira/browse/LUCENE-9319> Clean up
> Sandbox project by retiring/delete functionality or move it to Lucene core
>
> I've also created the following for adding module-info.java:
>
> LUCENE-9320 <https://issues.apache.org/jira/browse/LUCENE-9320> Enable
> support the Java module system by adding module-info to projects
>
> I've checked out the master branch and have started setting up the
> development environment using Gradle to see if I can assist with a few
> items. I notice Gradle hasn't been set up for Eclipse yet (maybe something
> I can help with). My first roadblock to getting set up is understanding the
> cyclic dependencies between core, codecs and the test-framework. Given that
> IRC is not really used, from what I can see, and the Slack channel is
> private to people with @apache.org emails, is there any good place to get
> basic help on getting started with the codebase without spamming this email
> list?
>
> Thanks,
> David.
>
> On Sat, Apr 11, 2020 at 9:14 PM Uwe Schindler <[email protected]> wrote:
>
>> Hi,
>>
>>
>>
>> I am currently on Easter vacation, but just wanted to give some feedback:
>>
>> The way you are doing it works at the moment, because you don’t have
>> anything to do with service providers. You don’t use custom codecs (outside
>> lucene-core.jar) and you just ignored the META-INF/services folder. Your
>> example works, as Lucene finds everything that is needed in the core.jar,
>> so the “legacy” service loading works (lucene-core.jar finds its own
>> services in lucene-core.jar). As soon as you add
>> lucene-backwards-codecs.jar, it would break irreparably, unless you copy the
>> whole backwards codecs into core. This is completely unfixable in Lucene
>> 8.x, as the new service-loader interface requires Java 9+.
>>
>>
>>
>> But you may not yet have noticed that I have already started the
>> preparatory work: in master (Lucene 9.0) I migrated the service provider
>> to use the Java runtime classes, and we have thrown away our own service
>> loader (which cannot cross module boundaries). The fix was merged not long
>> ago: https://issues.apache.org/jira/browse/LUCENE-9281 - this cannot be
>> backported to Java 8 / Lucene 8, as the reason for the home-made
>> SPIClassIterator was exactly the missing features (and some classpath
>> ordering issues in 3rd party JVMs like IBM J9). Those issues are fixed
>> in Java 11 (which also allows you to instantiate the classes on your own).
>> When you test your “hack patch” with Lucene’s master branch, you may also
>> succeed in loading classes from the backwards codecs (no guarantees yet!).
>>
>>
>>
>> About StandardAnalyzer: Unfortunately I aggressively complained a while
>> back when Mike McCandless wanted to move standard analyzer out of the
>> analysis package into core (“for convenience”). This was a bad step, and
>> IMHO we should revert that or completely rename the packages and
>> everything. The problem here is: As the analysis services are only part of
>> lucene-analyzers, we had to leave the factory classes there, but move the
>> implementation classes into core. The package has to be the same. The only
>> way around that is to move the analysis factory framework to core as well
>> (I would not be against that). This would include all factory base classes
>> and the service loading stuff. Then we can move standard analyzer and some
>> of the filters/tokenizers including their factories to core and that
>> problem would be solved.
>>
>>
>>
>> My plan to move to modules is the following:
>>
>>
>>
>> (1) Do preparatory work:
>>
>>    - Retire SPIClassIterator (DONE).
>>    - Add some preparatory issues to clean up the class hierarchy: Move
>>    Analysis SPI to core / move StandardAnalyzer and related classes out of
>>    core back to analysis
>>    - Fix Codec API to not rely on package-private classes, so we can have
>>    a completely public API with abstract classes for codecs, so stuff in
>>    backwards-codecs does not need to have access to package-private stuff in
>>    core.
>>    - Clean up sandbox to prefix all classes there with “sandbox” package
>>    and where needed remove package-private access. If it’s needed for
>>    internal access, WTF: Just move the stuff to core! We have a new version
>>    9.0, so either retire/delete Sandbox stuff or make it part of Lucene core.
>>
>> (2) Wait for the Gradle Build to be finalized, because including Module
>> stuff into the current Ant build won’t work
>>
>>    - Ant build must be retired!
>>
>> (3) Make Lucene real modules
>>
>>    - As we are on Java 11 already, add module-info.java everywhere
>>    - Fix gradle build to create and test modules (Latest Gradle needed)
>>    - Migrate all META-INF/services/* to module-info.java (before doing
>>    this, figure out if the META-INF files must stay for non-module usage, or
>>    if Java is clever enough to also look into the module descriptor to find
>>    services). We may need all services at both locations (for module or
>>    classpath usage; we need a build helper to check that they are in sync)
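
A rough sketch of what the migration step in the last bullet could look like for a single service file. It assumes the usual META-INF/services layout (file name is the service interface, one implementation class per line, # starts a comment). The function name services_to_provides is made up for illustration, and this is not an existing Lucene build helper.

```shell
# Hypothetical helper: turn one META-INF/services file into the matching
# "provides ... with ...;" clause for module-info.java.
services_to_provides() {   # services_to_provides <path-to-service-file>
  awk -v service="$(basename "$1")" '
    # Drop comments and all whitespace from each line.
    { sub(/#.*/, ""); gsub(/[ \t\r]/, "") }
    # Remember every non-empty line as an implementation class.
    $0 != "" { impls[n++] = $0 }
    END {
      if (n == 0) exit
      printf "provides %s with", service
      for (i = 0; i < n; i++) printf "%s\n    %s", (i ? "," : ""), impls[i]
      print ";"
    }
  ' "$1"
}
```

For example, running it on META-INF/services/org.apache.lucene.codecs.Codec would print a clause that can be compared against the module descriptor. Comparing that output with module-info.java for every service file is one possible shape for the sync check mentioned above.
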
>>
>>
>>
>> I don’t want Lucene to work with automodules in Java 11; it should be
>> fully modularized.
>>
>>
>>
>> If you want to help and express interest: Please open an issue for the
>> preparatory work listing the cases of same-package in different jar files.
>> I would split this up as described above: Analysis issues with
>> standardanalyzer, codecs pkg-private apis, sandbox.
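
One way to produce the list of same-package cases is sketched below. It assumes unzip is installed (jar tf would work as well) and that multi-release entries under META-INF/ can be ignored for a first pass. The function names jar_packages and split_packages are invented for this sketch.

```shell
# Hypothetical helper: print "package<TAB>jar" for every package that
# contains at least one .class file in the given jar.
jar_packages() {   # jar_packages <jar-file>
  unzip -Z1 "$1" | grep '\.class$' | grep -v '^META-INF/' \
    | sed -e 's|/[^/]*$||' | sort -u \
    | awk -v jar="$(basename "$1")" '{ print $0 "\t" jar }'
}

# Read "package<TAB>jar" lines on stdin and print every package that
# shows up in more than one jar, together with the jars that contain it.
split_packages() {
  sort -u | awk -F '\t' '
    {
      if ($1 in jars) jars[$1] = jars[$1] ", " $2
      else jars[$1] = $2
      count[$1]++
    }
    END { for (p in count) if (count[p] > 1) print p ": " jars[p] }
  ' | sort
}
```

Usage, from a directory holding the Lucene jars of interest:

    for j in lucene-*.jar; do jar_packages "$j"; done | split_packages
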
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> Achterdiek 19, D-28357 Bremen
>>
>> https://www.thetaphi.de
>>
>> eMail: [email protected]
>>
>>
>>
>> *From:* David Ryan <[email protected]>
>> *Sent:* Saturday, April 11, 2020 9:30 AM
>> *To:* [email protected]
>> *Subject:* Re: Lucene 9.0 Java module system support
>>
>>
>>
>>
>>
>> Thanks, I had probably missed some of what Uwe was saying in regards to
>> the limitation of what would be possible even if the suggested changes were
>> made. Given your points and the fact that it would take a while to have any
>> of the changes filter into a release version, I decided to develop a
>> patch-lucene.sh script to validate if the changes would allow me to access
>> the required parts of Lucene.  While it may not provide the full
>> functionality provided by SPI including analysis chains and codecs, I
>> worked out it will allow the use of the basics.
>>
>>
>>
>> The patch script below adds Automatic module names to the four Lucene
>> libraries I needed (not strictly required, but I was interested to check if
>> it would work).  For now, any duplicate packages I identified have been
>> moved into the core library (as identified in suggested changes). Given
>> that I'm not using the SPI functionality, I simply deleted the standard
>> analysers from the services list (required, or the module system complains
>> they are missing).  Thankfully Gradle doesn't validate jars in the cache so
>> the patch script only needs to be run once.
>>
>>
>>
>> Once the patch script was applied, I was able to use Ngram analysers and
>> the query parser without issue. This will provide the essentials of what we
>> needed for an embedded solution (location indexing, short text indexing
>> with ngrams tokenizer and lower case filter, searching by location with
>> nearest, bounding box and radius search, and partial text searches using
>> the query parser). Of course, I've only scratched the surface of the APIs
>> and right now it isn't clear if/when the SPI functionality might cause
>> further issues or be required.
>>
>>
>>
>> So, I've validated that the suggested changes would allow basic
>> functionality to work under the Java module system. If there's anything
>> else I can do to help progress the suggested changes please let me know.
>>
>>
>>
>> Regards,
>>
>> David.
>>
>>
>>
>>
>>
>> --------- patch-lucene.sh -- works on osx ------
>>
>> mkdir -p patch-lucene/core
>> mkdir -p patch-lucene/analyzers
>> mkdir -p patch-lucene/sandbox
>> mkdir -p patch-lucene/queryparser
>> cd patch-lucene
>>
>> # copy the jars from the gradle cache.
>> cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar .
>> cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar .
>> cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar .
>> cp ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar .
>>
>> # Add automatic module name to core.
>> cd core
>> jar -xf ../lucene-core-8.5.0.jar
>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.core'$'\n' META-INF/MANIFEST.MF
>>
>> # Add automatic module name, move standard analysis classes into core.
>> # Remove any classes in standard from service lists.
>> cd ../analyzers
>> jar -xf ../lucene-analyzers-common-8.5.0.jar
>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.analyzers.common'$'\n' META-INF/MANIFEST.MF
>> sed -i '' -e '/standard/d' META-INF/services/*
>> mv org/apache/lucene/analysis/standard/* ../core/org/apache/lucene/analysis/standard
>> rmdir org/apache/lucene/analysis/standard
>> jar -cfm ../lucene-analyzers-common-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>
>> # Add automatic module name, move search and document packages into core.
>> # Move Java 9 specific search into core.
>> cd ../sandbox
>> jar -xf ../lucene-sandbox-8.5.0.jar
>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.sandbox'$'\n' META-INF/MANIFEST.MF
>> mv org/apache/lucene/search/* ../core/org/apache/lucene/search
>> rmdir org/apache/lucene/search
>> mv org/apache/lucene/document/* ../core/org/apache/lucene/document
>> rmdir org/apache/lucene/document
>> mv META-INF/versions/9/org/apache/lucene/search/* ../core/META-INF/versions/9/org/apache/lucene/search
>> rm -rf META-INF/versions
>> jar -cfm ../lucene-sandbox-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>
>> # Package up core with changes.
>> cd ../core
>> jar -cfm ../lucene-core-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>
>> # Add automatic module name.
>> cd ../queryparser
>> jar -xf ../lucene-queryparser-8.5.0.jar
>> sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: org.apache.lucene.queryparser'$'\n' META-INF/MANIFEST.MF
>> jar -cfm ../lucene-queryparser-8.5.0-fix.jar META-INF/MANIFEST.MF .
>>
>> # Copy the fixed versions back into gradle cache.
>> cd ..
>> cp lucene-core-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar
>> cp lucene-analyzers-common-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar
>> cp lucene-sandbox-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar
>> cp lucene-queryparser-8.5.0-fix.jar ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar
>>
>> ------------
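
The sed -i '' calls in the script are specific to BSD/macOS sed. A possibly more portable sketch for the manifest step alone is to let the jar tool merge a small manifest fragment into the existing jar, which avoids unpacking it. This assumes a JDK 9+ jar tool on the PATH; the helper names write_module_manifest and add_module_name are invented here. It does not cover moving classes between jars.

```shell
# Hypothetical helper: write a one-line manifest fragment with the module name.
write_module_manifest() {   # write_module_manifest <module-name> <output-file>
  printf 'Automatic-Module-Name: %s\n' "$1" > "$2"
}

# Hypothetical helper: merge that fragment into the manifest of an existing jar.
add_module_name() {         # add_module_name <jar-file> <module-name>
  fragment=$(mktemp) || return 1
  write_module_manifest "$2" "$fragment"
  # --update with --manifest adds the fragment's attributes to the jar manifest.
  jar --update --file="$1" --manifest="$fragment"
  status=$?
  rm -f "$fragment"
  return "$status"
}
```

Usage would be along the lines of:

    add_module_name lucene-queryparser-8.5.0.jar org.apache.lucene.queryparser
    jar --describe-module --file=lucene-queryparser-8.5.0.jar

where the second command is a quick way to see which module name the jar now reports.
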
>>
>>
>>
>> On Sat, Apr 11, 2020 at 11:21 AM Chris Hostetter <
>> [email protected]> wrote:
>>
>>
>> : If the changes I proposed are still viewed as having too many downstream
>> : impacts, my fallback position will be to patch the jars. This involves
>> : using the gradle import system to get the jars from Maven, then using a
>> : patch script to manually unzip the jars, move the offending classes into
>> : other jars which share the same package name and rezip. So far, I've been
>>
>> I'm no expert here, but i trust that Uwe is, and i feel like your followup
>> questions/suggestions have still avoided his primary point about *why*
>> Lucene/Solr hasn't attempted jigsaw modularization...
>>
>> : >>> There is currently some preparatory things to move forward with
>> : >>> modules, so although you might be able to actually compile Lucene
>> : >>> with module system (by limiting to a subset of JAR files), it
>> : >>> currently won’t work cross-module due to the way how it handles
>> : >>> ServiceLoader interfaces (codecs, postings formats, analyzers, see
>> : >>> https://issues.apache.org/jira/browse/LUCENE-9281). The only way to
>> : >>> make it work at runtime is to put all of Lucene into one module.
>>
>> ...so, IIUC, even if you patch the *current* jars, any Lucene code you use
>> that depends on SPI (like analysis chains, codecs, etc...) isn't going to
>> work unless you follow Uwe's primary suggestion for folks who care about
>> modules...
>>
>> : >>> The general recommendation is to combine all required Lucene libraries
>> : >>> into a separate JAR file during the maven / gradle build (e.g. using
>> : >>> the Maven Shade plugin). Keep in mind that Lucene is also not suitable
>> : >>> for use
>>
>>
>> -Hoss
>> http://www.lucidworks.com/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>>
