Juan Ramos created GEODE-7703:
---------------------------------

             Summary: Lucene IndexWriter Creation Failure
                 Key: GEODE-7703
                 URL: https://issues.apache.org/jira/browse/GEODE-7703
             Project: Geode
          Issue Type: Bug
          Components: lucene
            Reporter: Juan Ramos

While computing the index repository, initialization can fail if the {{fileAndChunk}} region is modified while the {{IndexWriter}} is being initialized. The exception stack trace varies from run to run, but it always involves an {{IOException}} raised while reading the index files, with different underlying causes. Some examples are shown below:

{noformat}
Caused by: java.io.FileNotFoundException: segments_1
    at org.apache.geode.cache.lucene.internal.filesystem.FileSystem.getFile(FileSystem.java:101)
    at org.apache.geode.cache.lucene.internal.directory.RegionDirectory.openInput(RegionDirectory.java:115)
    at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:137)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
    at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:165)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:974)
    at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.finishComputingRepository(IndexRepositoryFactory.java:130)
    at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:67)
    at org.apache.geode.cache.lucene.internal.IndexRepositoryFactoryDistributedTest.lambda$testBecomePrimaryWhileIndexing$566b4a0f$5(IndexRepositoryFactoryDistributedTest.java:224)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
    at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{noformat}

{noformat}
Caused by: java.io.EOFException: Read past end of file _3z.si
    at org.apache.geode.cache.lucene.internal.directory.FileIndexInput.readByte(FileIndexInput.java:103)
    at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41)
    at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
    at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:194)
    at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
    at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:93)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
    ... 20 more
    Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: unexpected exception (resource=BufferedChecksumIndexInput(_3z.si))
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:471)
        at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:252)
        ... 22 more
    Caused by: java.io.EOFException: Read past end of file _3z.si
        at org.apache.geode.cache.lucene.internal.directory.FileIndexInput.readBytes(FileIndexInput.java:124)
        at org.apache.lucene.store.BufferedChecksumIndexInput.readBytes(BufferedChecksumIndexInput.java:49)
        at org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
        at org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:458)
        ... 23 more
{noformat}

The issue is extremely hard to reproduce because the time window in which the race can occur is very small. The proposed solution is to return null from the {{IndexRepositoryFactory}} whenever the exception happens and let the caller retry (the internal logic for doing this is already in place).
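
The return-null-and-retry approach described above could look roughly like the following. This is a minimal, self-contained sketch and not Geode's actual code: {{IndexWriterRetrySketch}}, {{WriterOpener}}, {{computeRepository}} and {{getWithRetry}} are hypothetical names chosen for illustration, and the retry loop only stands in for the caller-side logic that the report says already exists.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class IndexWriterRetrySketch {

  /** Hypothetical stand-in for the step that opens the index and may hit the race. */
  interface WriterOpener<T> {
    T open() throws IOException;
  }

  /**
   * Mirrors the proposed fix: if opening fails with an IOException,
   * return null instead of propagating it, so the caller can retry.
   */
  static <T> T computeRepository(WriterOpener<T> opener) {
    try {
      return opener.open();
    } catch (IOException e) {
      // The index files changed while being read; report "not ready yet".
      return null;
    }
  }

  /** Hypothetical caller-side retry loop; returns null after maxAttempts failures. */
  static <T> T getWithRetry(WriterOpener<T> opener, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      T repository = computeRepository(opener);
      if (repository != null) {
        return repository;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Simulated opener: fails twice with the exception from the report, then succeeds.
    WriterOpener<String> flaky = () -> {
      if (calls.incrementAndGet() < 3) {
        throw new FileNotFoundException("segments_1");
      }
      return "repository";
    };
    System.out.println(getWithRetry(flaky, 5) + " after " + calls.get() + " attempts");
  }
}
```

Swallowing the {{IOException}} is only reasonable here because the failure is expected to be transient; a real implementation would probably also log the exception so that a persistent failure is not hidden.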