[
https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13714053#comment-13714053
]
l0co commented on LUCENE-3422:
------------------------------
Something very unusual is happening that I probably won't be able to debug on
my own. Maybe someone who knows more about Lucene can give me a clue...
The behavior, in simple steps, is as follows (I'm repeating the steps here so
that this comment is precise).
1. I have a new, empty index.
2. I write a document to the index; some files appear (_1.*).
3. I update the same document in the index; more files appear (_1.*, _2.*).
4. After repeating step 3 a few times (e.g. I have _1, _2, _3, _4, _5, _6
files), it decides to do something on commit, and afterwards I have only e.g.
_7.* files in the index.
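To record exactly which segment files exist after each step, a small helper
like the following could be run against the index directory between steps.
This is only a sketch using plain java.nio, not a Lucene API; the class and
method names are hypothetical, and it assumes that segment files start with
"_" followed by an alphanumeric segment name.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SegmentFileLister {

    /**
     * Groups file names by segment name, e.g. "_1.fdt" and "_1.fdx" under "_1".
     * Files that do not start with "_" (segments_N, write.lock, ...) are ignored.
     */
    public static Map<String, Set<String>> groupBySegment(Path indexDir) throws IOException {
        Map<String, Set<String>> bySegment = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                if (!name.startsWith("_")) {
                    continue;
                }
                // Assumption: the segment name is "_" plus letters/digits; it ends
                // at the first "." or "_" (a name such as "_1_1.del" maps to "_1").
                int end = 1;
                while (end < name.length() && Character.isLetterOrDigit(name.charAt(end))) {
                    end++;
                }
                if (end == 1) {
                    continue; // nothing after the leading underscore
                }
                bySegment.computeIfAbsent(name.substring(0, end), k -> new TreeSet<>()).add(name);
            }
        }
        return bySegment;
    }

    public static void main(String[] args) throws IOException {
        // Index directory is passed as the first argument; defaults to the current directory.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        groupBySegment(indexDir).forEach((segment, names) ->
                System.out.println(segment + " -> " + names));
    }
}
```

Comparing the output before and after each commit shows when the older
segments disappear and a new one takes their place.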
A. First question: is the "something" in step 4 a merge operation?
Now, why I probably won't be able to track the problem down easily. I had been
working on my computer almost the whole day, and after 8 hours of work I
started to get the problem. Only one sequence was required to trigger it:
1a. Step 1
2a. Step 2
3a. Step 3
4a. Step 3 again <- always an error (the index files _1.? and _2.? from the
first two steps were merged into _3.? in 3a, and I always got the error in 4a).
The computer was of course heavily loaded after so many hours of work (memory
fragmented etc.); a lot of compile/start/stop Tomcat cycles had been performed.
In the meantime I had to hard-kill the Tomcat process a few times.
Then I took a break for a few hours, and now, in the evening, I'm testing
what is going on. The system is clean and freshly started. Now I can perform:
1b. Step 1
2b. Step 2
3b. Step 3
4b-10b. Step 3
~11b. Step 3 - only now, when I have a lot of files (e.g.
_1.?,_2.?,_3.?,_4.?,_5.?,_6.?,_7.?,_8.?), does it decide to "merge"
everything into a single set of files
12b. Step 3 - no FileNotFoundException (FNFE) (!)
And here are the other questions:
B. What could be the difference between the morning and the evening runs? Why
did it previously decide to merge on the second update, while now it does so
only after 8-10 updates?
C. Any clue why I don't get the FNFE now?
Plus, what I have checked:
1. Another Tomcat instance cannot have been running in the background during
the tests, because the Tomcat port would have been blocked.
2. I double-checked that I always have only a single instance of IndexWriter,
which is reused across all test requests. Moreover, it always uses the same
sequential task executor (all write operations are serialized and performed
one by one in a separate thread).
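A minimal sketch of the kind of setup described in point 2, assuming a
single-thread executor from java.util.concurrent. The class names are
hypothetical, and FakeWriter is only a stand-in that records the order of
operations; it is not the real IndexWriter and not the actual application
code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SerializedIndexWrites {

    /** Hypothetical stand-in for the single shared writer; it only logs calls. */
    static final class FakeWriter {
        private final List<String> log = new ArrayList<>();

        void updateDocument(String id) { log.add("update " + id); }

        void commit() { log.add("commit"); }

        List<String> log() { return log; }
    }

    /** Submits n update+commit tasks to one worker thread and returns the recorded order. */
    public static List<String> run(int n) throws Exception {
        FakeWriter writer = new FakeWriter();                            // one writer, reused
        ExecutorService writeExecutor = Executors.newSingleThreadExecutor();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                pending.add(writeExecutor.submit(() -> {
                    // Tasks run one at a time, in submission order, on the same thread.
                    writer.updateDocument("doc-1");
                    writer.commit();
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion; rethrows a failure from the task
            }
        } finally {
            writeExecutor.shutdown();
        }
        return writer.log();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(3));
    }
}
```

With this arrangement no two write operations overlap inside the process, so
any concurrent access to the index files would have to come from outside this
executor.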
Sorry for the long story, but this looks astonishing.
> IndexWriter.optimize() throws FileNotFoundException and IOException
> -------------------------------------------------------------------
>
> Key: LUCENE-3422
> URL: https://issues.apache.org/jira/browse/LUCENE-3422
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Elizabeth Nisha
> Attachments: delete.log
>
>
> I am using the Lucene 3.0.2 search APIs in my application.
> The indexed data is about 350 MB and indexing takes 25 hours. Search
> indexing and optimization run in two different threads. Optimization runs
> every hour; it does not run while indexing is in progress, and vice
> versa. While optimization is running via IndexWriter.optimize(),
> FileNotFoundException and IOException appear in my log and the index
> gets corrupted. The log says:
> 1. java.io.IOException: No sub-file with id _5r8.fdt found
> [The file name in this message changes over time (_5r8.fdt, _6fa.fdt,
> _6uh.fdt, ..., _emv.fdt) ]
> 2. java.io.FileNotFoundException:
> /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)
> 3. java.io.FileNotFoundException:
> /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
> Stack trace: java.io.IOException: background merge hit exception:
> _hkp:c100->_hkp _hkq:c100->_hkp _hkr:c100->_hkr _hks:c100->_hkr _hxb:c5500
> _hx5:c1000 _hxc:c198
> 84 into _hxd [optimize] [mergeDocStores]
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
> at com.telelogic.cs.search.SearchIndex.doOptimize(SearchIndex.java:130)
> at com.telelogic.cs.search.SearchIndexerThread$1.run(SearchIndexerThread.java:337)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.FileNotFoundException:
> /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
> at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:76)
> at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:97)
> at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:87)
> at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
> at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:67)
> at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:114)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
> at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4309)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3965)
> at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:231)
> at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:288)
>