contrebande-labs opened a new issue, #12307:
URL: https://github.com/apache/lucene/issues/12307
### Description
We use IntelliJ to build a fat jar that runs on ARM64 cloud VMs (Java 20,
Oracle Linux 8.5) and builds Lucene BM25+HNSW indices.
Our first run failed with the error below, which we worked around by manually
extracting `lucene-core-9.6.0.jar/META-INF/versions/20/` into the classpath:
```
java.lang.LinkageError: MemorySegmentIndexInputProvider is missing in Lucene JAR file
    at org.apache.lucene.store.MMapDirectory.lookupProvider(MMapDirectory.java:437)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:319)
    at org.apache.lucene.store.MMapDirectory.doPrivileged(MMapDirectory.java:395)
    at org.apache.lucene.store.MMapDirectory.<clinit>(MMapDirectory.java:448)
    at com.example.vertx.oci.lucene.LuceneHNSWIndexVerticle.lambda$initIndex$12(LuceneHNSWIndexVerticle.java:325)
    at io.vertx.lang.rx.DelegatingHandler.handle(DelegatingHandler.java:20)
    at io.vertx.core.impl.ContextBase.lambda$null$0(ContextBase.java:137)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:264)
    at io.vertx.core.impl.ContextBase.lambda$executeBlocking$1(ContextBase.java:135)
    at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: java.lang.ClassNotFoundException: org.apache.lucene.store.MemorySegmentIndexInputProvider
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:496)
    at java.base/java.lang.Class.forName(Class.java:475)
    at java.base/java.lang.invoke.MethodHandles$Lookup.findClass(MethodHandles.java:2785)
    at org.apache.lucene.store.MMapDirectory.lookupProvider(MMapDirectory.java:422)
    ... 13 more
```
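For context on the workaround: the JVM only loads classes from `META-INF/versions/<N>/` when the jar's manifest carries `Multi-Release: true`. One assumption worth testing (it is not confirmed anywhere in this report) is that the fat-jar build kept the versioned class files but wrote a new manifest without that attribute. The sketch below checks both conditions using only the JDK; the class name `MultiReleaseCheck` and its helper methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class MultiReleaseCheck {

    // Entry path derived from the class named in the LinkageError and the
    // versions/20 directory mentioned above.
    static final String PROVIDER_ENTRY =
        "META-INF/versions/20/org/apache/lucene/store/MemorySegmentIndexInputProvider.class";

    /** Returns true if the jar's manifest contains "Multi-Release: true". */
    public static boolean declaresMultiRelease(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            Manifest mf = jf.getManifest();
            if (mf == null) {
                return false; // no manifest at all
            }
            String value = mf.getMainAttributes().getValue(Attributes.Name.MULTI_RELEASE);
            return value != null && "true".equalsIgnoreCase(value.trim());
        }
    }

    /** Returns true if the jar contains an entry with exactly this name. */
    public static boolean hasEntry(Path jar, String entryName) throws IOException {
        // JarFile(File) opens the jar without runtime versioning, so entries
        // under META-INF/versions/ are looked up by their full literal name.
        try (JarFile jf = new JarFile(jar.toFile())) {
            return jf.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java MultiReleaseCheck.java <path-to-jar>");
            return;
        }
        Path jar = Path.of(args[0]);
        System.out.println("Multi-Release attribute set: " + declaresMultiRelease(jar));
        System.out.println("Java 20 provider class present: " + hasEntry(jar, PROVIDER_ENTRY));
    }
}
```

It can be run with the single-file launcher, e.g. `java MultiReleaseCheck.java path/to/fat.jar`. If the class entry is present but the attribute is not, the versioned class is in the jar yet invisible to the class loader, which would match the `ClassNotFoundException` above.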
But then, 237 GB of RAM, 35 million documents, and 2 h 20 min later, we got the
following error. We could not work out how to fix it, since
`org.apache.lucene.codecs.lucene90.Lucene90PostingsFormat` is already on the
classpath:
```
Exception in thread "Lucene Merge Thread #0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'Lucene90' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [completion, Completion84, Completion90]
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:735)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:727)
Caused by: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'Lucene90' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [completion, Completion84, Completion90]
    at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:113)
    at org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:111)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:325)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:392)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:118)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:92)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:180)
    at org.apache.lucene.index.ReadersAndUpdates.getReaderForMerge(ReadersAndUpdates.java:788)
    at org.apache.lucene.index.IndexWriter.lambda$mergeMiddle$21(IndexWriter.java:5079)
    at org.apache.lucene.index.MergePolicy$OneMerge.initMergeReaders(MergePolicy.java:444)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5075)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4680)
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6432)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)
com.example.vertx.oci.lucene.exception.IndexDocumentException: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at com.example.vertx.oci.lucene.LuceneHNSWIndexVerticle.lambda$indexDocument$10(LuceneHNSWIndexVerticle.java:277)
    at io.vertx.lang.rx.DelegatingHandler.handle(DelegatingHandler.java:20)
    at io.vertx.core.impl.ContextBase.lambda$null$0(ContextBase.java:137)
    at io.vertx.core.impl.ContextInternal.dispatch(ContextInternal.java:264)
    at io.vertx.core.impl.ContextBase.lambda$executeBlocking$1(ContextBase.java:135)
    at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:908)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:921)
    at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1529)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1817)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1470)
    at com.example.vertx.oci.lucene.LuceneHNSWIndexVerticle.lambda$indexDocument$10(LuceneHNSWIndexVerticle.java:271)
    ... 9 more
Caused by: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'Lucene90' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [completion, Completion84, Completion90]
    at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:113)
    at org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:111)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:325)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:392)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:118)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:92)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:180)
    at org.apache.lucene.index.ReadersAndUpdates.getReaderForMerge(ReadersAndUpdates.java:788)
    at org.apache.lucene.index.IndexWriter.lambda$mergeMiddle$21(IndexWriter.java:5079)
    at org.apache.lucene.index.MergePolicy$OneMerge.initMergeReaders(MergePolicy.java:444)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5075)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4680)
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6432)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)
```
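The error message lists only `completion`, `Completion84` and `Completion90` as available names, so having the `Lucene90PostingsFormat` class file on the classpath is apparently not enough: the lookup goes through SPI registration, which by the standard `ServiceLoader` convention lives in `META-INF/services/org.apache.lucene.codecs.PostingsFormat`. A possible explanation, stated here as an assumption rather than a confirmed diagnosis, is that several jars each ship a file with that name and the fat-jar build kept only one of them. The sketch below prints every copy of a service file visible to the class loader together with the providers it registers; `SpiServiceFileCheck` and its methods are hypothetical names, and only JDK classes are used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.stream.Collectors;

public class SpiServiceFileCheck {

    /** Extracts provider class names from service-file lines, dropping '#' comments and blanks. */
    public static List<String> parseProviders(List<String> lines) {
        List<String> providers = new ArrayList<>();
        for (String raw : lines) {
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (!line.isEmpty()) {
                providers.add(line);
            }
        }
        return providers;
    }

    /** Reads every META-INF/services/<serviceType> resource the class loader can see. */
    public static List<String> providersOnClasspath(ClassLoader cl, String serviceType)
            throws IOException {
        List<String> all = new ArrayList<>();
        Enumeration<URL> urls = cl.getResources("META-INF/services/" + serviceType);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                List<String> providers =
                    parseProviders(reader.lines().collect(Collectors.toList()));
                System.out.println(url + " -> " + providers);
                all.addAll(providers);
            }
        }
        return all;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the SPI type named in the exception above.
        String type = args.length > 0 ? args[0] : "org.apache.lucene.codecs.PostingsFormat";
        List<String> found =
            providersOnClasspath(SpiServiceFileCheck.class.getClassLoader(), type);
        System.out.println(found.size() + " provider(s) registered for " + type);
    }
}
```

Run with the fat jar on the classpath, a single service file that does not list `org.apache.lucene.codecs.lucene90.Lucene90PostingsFormat` would be consistent with the names reported in the exception. The same check applies to other Lucene SPI types by passing a different type name as the first argument.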
Does anyone know how to fix these, especially the second one, for which we
have no workaround? We think the two errors are related but can't say for sure.
Thanks.
### Version and environment details
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]