Hi Team,
I am a new user of Lucene 4.8.1. I have run into a Lucene indexing
performance issue that slows down my application greatly. I tried several
approaches found through Google searches but still couldn't resolve it. Any
suggestions from the experts here would help me a lot.
One of my applications uses a Lucene index for fast data searching. When I
start the application, it indexes all the necessary data from the database,
which produces 88 MB of index data once indexing is done. In this case,
indexing takes less than 4 minutes.
I also have a shell script task that runs every night and sends a JMX call
to my application to re-index all the data. The re-indexing method clears
the current index directory, reads the data from the database, and
recreates the index from scratch. Everything works fine at the
beginning: indexing takes a little more than 3 minutes. But after my
application has been running for a while (a day or two), re-indexing
slows down greatly and now takes more than 22 minutes.
Here is the procedure of my Lucene indexing and re-indexing:
1. If index data exists inside index directory, remove all the index
data.
2. Create IndexWriter with a 200 MB RAM buffer size and
MaxMergesAndThreads set to (6, 6)
3. Process DB result set
- While looping over the result set, I reuse the same Document instance.
- At the end of each iteration, I call indexWriter.addDocument(doc)
4. indexWriter.commit()
5. indexWriter.close()
I profiled the application while it was slow and found that the
indexWriter.addDocument method took most of the time. Then I added some
logging code as below:
long start = System.currentTimeMillis();
indexWriter.addDocument(doc);
totalAddDocTime += (System.currentTimeMillis() - start);
After several tests, when indexing has slowed down, the total time taken by
indexWriter.addDocument(doc) is about 20 minutes.
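To show whether the slowdown is spread evenly over the run or concentrated
in certain stretches, the timing above could be extended to keep per-batch
totals as well as the grand total. This is only a minimal sketch and not my
actual application code: the class name AddDocTimer, its methods, and the
batch size are hypothetical, and it uses System.nanoTime() instead of
System.currentTimeMillis() for finer resolution.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Lucene): accumulates the elapsed time of
 * a repeated call and also records one total per batch of calls.
 */
final class AddDocTimer {
    private final int batchSize;
    private final List<Long> batchTotalsNanos = new ArrayList<>();
    private long totalNanos;
    private long currentBatchNanos;
    private long calls;

    public AddDocTimer(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Records the elapsed time of one call, e.g. one addDocument call. */
    public void record(long elapsedNanos) {
        totalNanos += elapsedNanos;
        currentBatchNanos += elapsedNanos;
        calls++;
        // Close the batch once batchSize calls have been recorded.
        if (calls % batchSize == 0) {
            batchTotalsNanos.add(currentBatchNanos);
            currentBatchNanos = 0;
        }
    }

    public long totalNanos() {
        return totalNanos;
    }

    public long calls() {
        return calls;
    }

    /** Returns a copy of the totals of all completed batches, in order. */
    public List<Long> batchTotalsNanos() {
        return new ArrayList<>(batchTotalsNanos);
    }
}
```

In the indexing loop this would be used by taking System.nanoTime() before
indexWriter.addDocument(doc) and passing the difference to record() right
after the call. Comparing the batch totals from a fast run and a slow run
should show whether every addDocument call gets slower or only some of them.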
During indexing, I also observed CPU usage sometimes going above 100%.
My application has 6 GB of memory assigned to it. While indexing runs, all
other processing modules are suspended, waiting for indexing to finish, and
I don't see any memory leak in my application.
Can you give me some suggestions about my issue?
Thank you,
Jason