Hi Michael,
Thank you for your reply. Please find responses to your questions below.
Regards,
Antony
On Sat, 30 Apr 2022 at 18:59, Michael McCandless <[email protected]>
wrote:
> Hi Antony,
>
> Hmm it looks like the root cause is this:
>
> Caused by: java.nio.file.NoSuchFileException: D:\i\202204\_14gb.si
>
> Can you list all the files in the index directory at the time this
> exception happens, and reply here? We need to figure out whether the file
> is really missing or what.
>
The index directory file listing is below. Yes, the file
D:\i\202204\_14gb.si is missing; only _14gb.cfs is present.
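For context, Lucene writes the segment counter in base 36 when it builds segment file names, so the name of the missing file can be decoded to see where it falls relative to its neighbours. A minimal sketch in plain Python; `segment_number` is a hypothetical helper name, not a Lucene or PyLucene API:

```python
import re

def segment_number(filename):
    """Decode the base-36 counter from a segment file name such as '_14gb.si'."""
    match = re.match(r"_([0-9a-z]+)", filename)
    if match is None:
        raise ValueError("not a segment file name: %r" % filename)
    return int(match.group(1), 36)

# The missing segment sits between two segments that are present in the listing.
print(segment_number("_14gb.si"))
print(segment_number("_14g1.cfs") < segment_number("_14gb.si") < segment_number("_14gl.si"))
```

This only reads the file name; it says nothing about why the .si file is absent.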
>
> Do you run any virus scanner / disk file tree utilities / etc.? In the
> distant past sometimes such programs might cause strange transient errors
> if they open a file for read exclusively or so, on windows.
>
There is no virus scanner running.
>
> What is the actual drive you are storing the index on (D:)? Is it a local
> disk or remote SMBFS mount?
>
It's a local disk (D:).
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sat, Apr 30, 2022 at 8:39 AM Antony Joseph <
> [email protected]> wrote:
>
>> Thank you for your reply.
>>
>> *The full stack trace is included:*
>>
>> <super: <class 'JavaError'>, <JavaError object>>
>> Java stacktrace:
>> org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(MMapIndexInput(path="D:\i\202204\segments_10fj")))
>>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:290)
>>     at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:165)
>>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:972)
>> Caused by: java.nio.file.NoSuchFileException: D:\i\202204\_14gb.si
>>     at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
>>     at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
>>     at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
>>     at sun.nio.fs.WindowsFileSystemProvider.newFileChannel(Unknown Source)
>>     at java.nio.channels.FileChannel.open(Unknown Source)
>>     at java.nio.channels.FileChannel.open(Unknown Source)
>>     at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
>>     at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:137)
>>     at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:89)
>>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357)
>>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
>>     ... 2 more
>>
>> Traceback (most recent call last):
>> File "index.py", line 112, in start
>> writer = IndexWriter(index_directory, iconfig)
>> lucene.JavaError: <super: <class 'JavaError'>, <JavaError object>>
>> Java stacktrace:
>> org.apache.lucene.index.CorruptIndexException: Unexpected file read error while reading index. (resource=BufferedChecksumIndexInput(MMapIndexInput(path="D:\i\202204\segments_10fj")))
>>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:290)
>>     at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:165)
>>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:972)
>> Caused by: java.nio.file.NoSuchFileException: D:\i\202204\_14gb.si
>>     at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
>>     at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
>>     at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
>>     at sun.nio.fs.WindowsFileSystemProvider.newFileChannel(Unknown Source)
>>     at java.nio.channels.FileChannel.open(Unknown Source)
>>     at java.nio.channels.FileChannel.open(Unknown Source)
>>     at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:238)
>>     at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:137)
>>     at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:89)
>>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357)
>>     at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
>>     ... 2 more
>>
>>
>> Regards,
>> Antony
>>
>> On Sat, 30 Apr 2022 at 10:59, Robert Muir <[email protected]> wrote:
>>
>> > The most helpful thing would be the full stacktrace of the exception.
>> > This exception should be chaining the original exception and call
>> > site, and maybe tell us more about this error you hit.
>> >
>> > To me, it looks like a windows-specific issue where the filesystem is
>> > returning an unexpected error. So it would be helpful to see exactly
>> > which one that is, and the full trace of where it comes from, to chase
>> > it further
>> >
>> > On Thu, Apr 28, 2022 at 12:10 PM Antony Joseph
>> > <[email protected]> wrote:
>> > >
>> > > Thank you for your reply.
>> > >
>> > > This isn't happening in a single environment. Our application is being
>> > > used by various clients and this has been reported by multiple users -
>> > > all of whom were running the earlier pylucene (v4.10) - without issues.
>> > >
>> > > One thing to mention is that our earlier version used Python 2.7.15
>> > > (with pylucene 4.10) and now we are using Python 3.8.10 with Pylucene
>> > > 6.5.0 - the indexing logic is the same...
>> > >
>> > > One other thing to note is that the issue described has (so far!) only
>> > > occurred on MS Windows - none of our Linux customers have complained
>> > > about this.
>> > >
>> > > Any ideas?
>> > >
>> > > Regards,
>> > > Antony
>> > >
>> > > On Thu, 28 Apr 2022 at 17:00, Adrien Grand <[email protected]> wrote:
>> > >
>> > > > Hi Antony,
>> > > >
>> > > > This isn't something that you should try to fix programmatically,
>> > > > corruptions indicate that something is wrong with the environment,
>> > > > like a broken disk or corrupt RAM. I would suggest running a memtest
>> > > > to check your RAM and looking at system logs in case they have
>> > > > anything to tell about your disks.
>> > > >
>> > > > Can you also share the full stack trace of the exception?
>> > > >
>> > > > On Thu, Apr 28, 2022 at 10:26 AM Antony Joseph
>> > > > <[email protected]> wrote:
>> > > > >
>> > > > > Hello,
>> > > > >
>> > > > > We are facing a strange situation in our application as described
>> > > > > below:
>> > > > >
>> > > > > *Using*:
>> > > > >
>> > > > > - Python 3.8.10
>> > > > > - Pylucene 6.5.0
>> > > > > - Java 8 (1.8.0_181)
>> > > > > - Runs on Linux and Windows (error seen on Windows)
>> > > > >
>> > > > > We suddenly get the following *error*:
>> > > > >
>> > > > > 2022-02-10 09:58:09.253215: ERROR : writer | Failed to get index
>> > > > > (D:\i\202202) writer, Exception:
>> > > > > org.apache.lucene.index.CorruptIndexException: Unexpected file read
>> > > > > error while reading index.
>> > > > > (resource=BufferedChecksumIndexInput(MMapIndexInput(path="D:\i\202202\segments_fo")))
>> > > > >
>> > > > >
>> > > > > After this, no further indexing happens - trying to open the index
>> > > > > for writing throws the above error - and the index writer does not
>> > > > > open.
>> > > > >
>> > > > > FYI, our code contains the following *settings*:
>> > > > >
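>> > > > > from java.nio.file import Paths
>> > > > > from org.apache.lucene.index import IndexWriter, IndexWriterConfig
>> > > > > from org.apache.lucene.store import FSDirectory
>> > > > >
>> > > > > # Raw string: in a plain literal, "\202" would be an octal escape
>> > > > > index_path = r"D:\i\202202"
>> > > > > index_directory = FSDirectory.open(Paths.get(index_path))
>> > > > > iconfig = IndexWriterConfig(wrapper_analyzer)
>> > > > > iconfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND)
>> > > > > iconfig.setRAMBufferSizeMB(16.0)
>> > > > > writer = IndexWriter(index_directory, iconfig)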
>> > > > >
>> > > > >
>> > > > > *Repairing*
>> > > > > We tried 'repairing' the index with the following command / tool:
>> > > > >
>> > > > > java -cp "lucene-core-6.5.0.jar;lucene-backward-codecs-6.5.0.jar"
>> > > > > org.apache.lucene.index.CheckIndex "D:\i\202202" -exorcise
>> > > > >
>> > > > > This however returns saying "No problems found with the index."
>> > > > >
>> > > > >
>> > > > > *Work around*
>> > > > > We have to manually delete the problematic segment file:
>> > > > > D:\i\202202\segments_fo
>> > > > > after which the application starts again... until the next
>> > > > > corruption. We can't spot a specific pattern.
>> > > > >
>> > > > >
>> > > > > *Two questions:*
>> > > > >
>> > > > > 1. Can we handle this situation programmatically, so that no
>> > > > >    manual intervention is needed?
>> > > > > 2. Any reason why we are facing the corruption issue in the
>> > > > >    first place?
>> > > > >
>> > > > >
>> > > > > Before this we were using Pylucene 4.10 and we didn't face this
>> > > > > problem - the application logic is the same.
>> > > > >
>> > > > > Also, while the application runs on both Linux and Windows, so far
>> > > > > we have observed this situation only on various Windows platforms.
>> > > > >
>> > > > > Would really appreciate some assistance. Thanks in advance.
>> > > > >
>> > > > > Regards,
>> > > > > Antony
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Adrien
>> > > >
>> > > >
>> ---------------------------------------------------------------------
>> > > > To unsubscribe, e-mail: [email protected]
>> > > > For additional commands, e-mail: [email protected]
>> > > >
>> > > >
>> >
>> >
>> >
>>
>
Volume in drive D is APP Data
Volume Serial Number is 742A-8BB3
Directory of D:\i\202204
04/23/2022 03:39 PM <DIR> .
04/23/2022 03:39 PM <DIR> ..
04/23/2022 02:33 AM 2,153 segments_10fj
04/23/2022 02:34 AM 1,633 segments_10fk
04/01/2022 02:34 AM 0 write.lock
04/21/2022 06:47 PM 405 _10ya.cfe
04/21/2022 06:47 PM 404,514,656 _10ya.cfs
04/21/2022 06:47 PM 421 _10ya.si
04/22/2022 10:52 AM 405 _12ok.cfe
04/22/2022 10:52 AM 408,755,035 _12ok.cfs
04/22/2022 10:52 AM 421 _12ok.si
04/22/2022 01:10 PM 405 _1313.cfe
04/22/2022 01:10 PM 444,171,534 _1313.cfs
04/22/2022 01:10 PM 421 _1313.si
04/22/2022 04:17 PM 405 _13in.cfe
04/22/2022 04:17 PM 445,896,460 _13in.cfs
04/22/2022 04:17 PM 421 _13in.si
04/22/2022 07:18 PM 405 _13wu.cfe
04/22/2022 07:18 PM 449,649,236 _13wu.cfs
04/22/2022 07:18 PM 421 _13wu.si
04/23/2022 12:01 AM 405 _14br.cfe
04/23/2022 12:01 AM 413,394,102 _14br.cfs
04/23/2022 12:01 AM 421 _14br.si
04/23/2022 12:01 AM 151,733 _14bs.cfs
04/23/2022 12:02 AM 103,020 _14bt.cfs
04/23/2022 12:02 AM 87,781 _14bu.cfs
04/23/2022 12:03 AM 9,957 _14bv.cfs
04/23/2022 12:03 AM 62,878 _14bw.cfs
04/23/2022 12:04 AM 16,847 _14bx.cfs
04/23/2022 12:05 AM 10,764 _14by.cfs
04/23/2022 12:06 AM 27,356 _14bz.cfs
04/23/2022 12:07 AM 405 _14c0.cfe
04/23/2022 12:07 AM 1,895,475 _14c0.cfs
04/23/2022 12:07 AM 383 _14c0.si
04/23/2022 12:08 AM 40,965 _14c1.cfs
04/23/2022 12:09 AM 82,949 _14c2.cfs
04/23/2022 12:09 AM 42,307 _14c3.cfs
04/23/2022 12:10 AM 22,726 _14c4.cfs
04/23/2022 12:10 AM 41,847 _14c5.cfs
04/23/2022 12:11 AM 72,802 _14c6.cfs
04/23/2022 12:11 AM 58,799 _14c7.cfs
04/23/2022 12:12 AM 88,753 _14c8.cfs
04/23/2022 12:13 AM 7,563 _14c9.cfs
04/23/2022 12:15 AM 46,191 _14ce.cfs
04/23/2022 12:15 AM 81,462 _14cf.cfs
04/23/2022 12:16 AM 11,370 _14cg.cfs
04/23/2022 12:17 AM 38,332 _14ch.cfs
04/23/2022 12:17 AM 26,131 _14ci.cfs
04/23/2022 12:18 AM 46,270 _14cj.cfs
04/23/2022 12:19 AM 17,319 _14ck.cfs
04/23/2022 12:20 AM 15,684 _14cl.cfs
04/23/2022 12:21 AM 59,915 _14cm.cfs
04/23/2022 12:22 AM 7,005 _14cn.cfs
04/23/2022 12:23 AM 167,162 _14cp.cfs
04/23/2022 12:25 AM 170,016 _14cr.cfs
04/23/2022 12:27 AM 271,712 _14cz.cfs
04/23/2022 12:30 AM 107,240 _14d2.cfs
04/23/2022 12:36 AM 316,753 _14d9.cfs
04/23/2022 12:40 AM 405 _14dd.cfe
04/23/2022 12:40 AM 447,407 _14dd.cfs
04/23/2022 12:40 AM 383 _14dd.si
04/23/2022 12:44 AM 269,327 _14dj.cfs
04/23/2022 12:51 AM 175,784 _14dt.cfs
04/23/2022 12:55 AM 405 _14dz.cfe
04/23/2022 12:55 AM 415,372 _14dz.cfs
04/23/2022 12:55 AM 383 _14dz.si
04/23/2022 12:56 AM 273,924 _14e3.cfs
04/23/2022 01:01 AM 242,125 _14ed.cfs
04/23/2022 01:07 AM 405 _14en.cfe
04/23/2022 01:07 AM 449,616 _14en.cfs
04/23/2022 01:07 AM 421 _14en.si
04/23/2022 01:12 AM 405 _14ex.cfe
04/23/2022 01:12 AM 441,454 _14ex.cfs
04/23/2022 01:12 AM 421 _14ex.si
04/23/2022 01:13 AM 46,465 _14ey.cfs
04/23/2022 01:20 AM 405 _14f7.cfe
04/23/2022 01:20 AM 450,403 _14f7.cfs
04/23/2022 01:20 AM 421 _14f7.si
04/23/2022 01:28 AM 405 _14fh.cfe
04/23/2022 01:28 AM 467,352 _14fh.cfs
04/23/2022 01:28 AM 421 _14fh.si
04/23/2022 01:34 AM 297,338 _14fr.cfs
04/23/2022 01:42 AM 285,794 _14g1.cfs
04/23/2022 01:49 AM 380,954 _14gb.cfs
04/23/2022 01:57 AM 405 _14gl.cfe
04/23/2022 01:57 AM 470,249 _14gl.cfs
04/23/2022 01:57 AM 421 _14gl.si
04/23/2022 02:02 AM 405 _14gv.cfe
04/23/2022 02:02 AM 501,988 _14gv.cfs
04/23/2022 02:02 AM 421 _14gv.si
04/23/2022 02:06 AM 405 _14h5.cfe
04/23/2022 02:06 AM 472,582 _14h5.cfs
04/23/2022 02:06 AM 421 _14h5.si
04/23/2022 02:10 AM 405 _14hf.cfe
04/23/2022 02:10 AM 516,337 _14hf.cfs
04/23/2022 02:10 AM 421 _14hf.si
04/23/2022 02:11 AM 11,267 _14hg.cfs
04/23/2022 02:11 AM 55,151 _14hh.cfs
04/23/2022 02:12 AM 15,565 _14hi.cfs
04/23/2022 02:12 AM 88,140 _14hj.cfs
04/23/2022 02:13 AM 69,674 _14hk.cfs
04/23/2022 02:14 AM 23,462 _14hl.cfs
04/23/2022 02:15 AM 10,100 _14hm.cfs
04/23/2022 02:15 AM 45,510 _14hn.cfs
04/23/2022 02:17 AM 405 _14hp.cfe
04/23/2022 02:17 AM 547,179 _14hp.cfs
04/23/2022 02:17 AM 421 _14hp.si
04/23/2022 02:17 AM 8,662 _14hq.cfs
04/23/2022 02:25 AM 405 _14hz.cfe
04/23/2022 02:25 AM 403,710 _14hz.cfs
04/23/2022 02:25 AM 421 _14hz.si
04/23/2022 02:26 AM 84,866 _14i0.cfs
04/23/2022 02:27 AM 18,715 _14i1.cfs
04/23/2022 02:28 AM 9,221 _14i2.cfs
04/23/2022 02:30 AM 83,826 _14i3.cfs
04/23/2022 02:31 AM 30,755 _14i4.cfs
04/23/2022 02:31 AM 24,798 _14i5.cfs
04/23/2022 02:32 AM 27,181 _14i6.cfs
04/23/2022 02:33 AM 7,005 _14i7.cfs
04/23/2022 02:34 AM 405 _14i9.cfe
04/23/2022 02:34 AM 558,333 _14i9.cfs
04/23/2022 02:34 AM 421 _14i9.si
04/05/2022 11:29 AM 98 _664.dii
04/05/2022 11:29 AM 3,140,765 _664.dim
04/05/2022 11:28 AM 51,797,110 _664.fdt
04/05/2022 11:28 AM 21,038 _664.fdx
04/05/2022 11:29 AM 4,291 _664.fnm
04/05/2022 11:29 AM 1,515,011 _664.nvd
04/05/2022 11:29 AM 188 _664.nvm
04/05/2022 11:29 AM 583 _664.si
04/05/2022 11:29 AM 92,453,822 _664_Lucene50_0.doc
04/05/2022 11:29 AM 234,246,791 _664_Lucene50_0.pos
04/05/2022 11:29 AM 280,169,598 _664_Lucene50_0.tim
04/05/2022 11:29 AM 2,532,894 _664_Lucene50_0.tip
04/05/2022 11:29 AM 2,935,314 _664_Lucene54_0.dvd
04/05/2022 11:29 AM 363 _664_Lucene54_0.dvm
04/19/2022 06:37 PM 95 _yjj.dii
04/19/2022 06:37 PM 1,897,032 _yjj.dim
04/19/2022 06:36 PM 30,489,864 _yjj.fdt
04/19/2022 06:36 PM 12,808 _yjj.fdx
04/19/2022 06:37 PM 4,455 _yjj.fnm
04/19/2022 06:37 PM 913,151 _yjj.nvd
04/19/2022 06:37 PM 188 _yjj.nvm
04/19/2022 06:37 PM 583 _yjj.si
04/19/2022 06:37 PM 52,854,765 _yjj_Lucene50_0.doc
04/19/2022 06:37 PM 133,018,229 _yjj_Lucene50_0.pos
04/19/2022 06:37 PM 178,174,040 _yjj_Lucene50_0.tim
04/19/2022 06:37 PM 1,662,130 _yjj_Lucene50_0.tip
04/19/2022 06:37 PM 1,854,815 _yjj_Lucene54_0.dvd
04/19/2022 06:37 PM 363 _yjj_Lucene54_0.dvm
04/21/2022 10:58 AM 96 _zs9.dii
04/21/2022 10:58 AM 2,181,856 _zs9.dim
04/21/2022 10:58 AM 34,798,219 _zs9.fdt
04/21/2022 10:58 AM 14,990 _zs9.fdx
04/21/2022 10:58 AM 4,291 _zs9.fnm
04/21/2022 10:58 AM 1,060,415 _zs9.nvd
04/21/2022 10:58 AM 188 _zs9.nvm
04/21/2022 10:58 AM 583 _zs9.si
04/21/2022 10:58 AM 57,928,377 _zs9_Lucene50_0.doc
04/21/2022 10:58 AM 147,299,467 _zs9_Lucene50_0.pos
04/21/2022 10:58 AM 181,792,715 _zs9_Lucene50_0.tim
04/21/2022 10:58 AM 1,683,131 _zs9_Lucene50_0.tip
04/21/2022 10:58 AM 1,921,992 _zs9_Lucene54_0.dvd
04/21/2022 10:58 AM 363 _zs9_Lucene54_0.dvm
04/21/2022 11:48 AM 96 _ztz.dii
04/21/2022 11:48 AM 2,124,160 _ztz.dim
04/21/2022 11:48 AM 38,019,415 _ztz.fdt
04/21/2022 11:48 AM 14,928 _ztz.fdx
04/21/2022 11:48 AM 4,291 _ztz.fnm
04/21/2022 11:48 AM 1,031,495 _ztz.nvd
04/21/2022 11:48 AM 188 _ztz.nvm
04/21/2022 11:48 AM 583 _ztz.si
04/21/2022 11:48 AM 64,060,384 _ztz_Lucene50_0.doc
04/21/2022 11:48 AM 163,985,641 _ztz_Lucene50_0.pos
04/21/2022 11:48 AM 229,803,629 _ztz_Lucene50_0.tim
04/21/2022 11:48 AM 2,146,560 _ztz_Lucene50_0.tip
04/21/2022 11:48 AM 1,912,551 _ztz_Lucene54_0.dvd
04/21/2022 11:48 AM 363 _ztz_Lucene54_0.dvm
176 File(s) 4,580,827,241 bytes
2 Dir(s) 111,429,218,304 bytes free
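
The listing above can also be checked mechanically. The following is a rough sketch in plain Python, assuming every segment that has files on disk should also have a matching .si file; `segments_without_si` is a hypothetical helper, not part of Lucene or PyLucene, and it only looks at file names:

```python
import os
import re
from collections import defaultdict

# Segment files look like _<name>.<ext> or _<name>_<codec suffix>.<ext>
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_[^.]+)?\.([A-Za-z0-9]+)$")

def segments_without_si(filenames):
    """Return sorted segment names that have files present but no .si file."""
    extensions = defaultdict(set)
    for name in filenames:
        match = SEGMENT_FILE.match(name)
        if match:  # skips segments_N, write.lock and anything else
            extensions[match.group(1)].add(match.group(2))
    return sorted(seg for seg, exts in extensions.items() if "si" not in exts)

if __name__ == "__main__":
    index_dir = r"D:\i\202204"  # directory from the listing above
    if os.path.isdir(index_dir):
        for segment in segments_without_si(os.listdir(index_dir)):
            print(segment)
```

On the file names listed above this would report _14gb together with the other segments that only have a .cfs file, such as _14bs. Whether those other files are expected leftovers cannot be told from the names alone.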