These files are normal Lucene segment files (in compound file
format). What's odd is that Lucene is not merging them down to a
smaller set of segments.
Have you done any advanced things, like customize the deletion or
merge policy?
When you close you writer, are you using just close() or close(false)?
If you can set the InfoStream on the writer for one of your
incremental update sessions and post the results, maybe that will shed
some light one what's going on.
Are you sure there are no exceptions being logged somewhere? Lucene
runs merges with background threads (by default), and if those threads
hit unhandled exceptions it's possible they are logged somewhere you
wouldn't normally look?
Mike
John Byrne wrote:
MergeFactor and MergeDocs are left at default values. The indexing
is incremental, i.e. whenever someone adds or modifys a file to in
svn repository, the lucene index is updated, and the writer/reader/
searcher are refreshed (closed and opened again).,
According to the svn logs for the time the files were created, a few
hundred files were added that day.
Overall, the index would have started out with around 150,000 to
200,000 documents, with anything from 100 to 1000 being added per day.
I don't optimize the index at any point, but I've never seen it get
like this before.
Thanks,
John
Erick Erickson wrote:
What are your IndexWriter MergFactor and MergeDocs set to? Also, are
the dates on all these files indicative of all being create during
the same
indexing run?
Finally, how many documents are you indexing?
Best
Erick
On Tue, Feb 3, 2009 at 10:26 AM, John Byrne
<john.by...@propylon.com> wrote:
Hi,
I've got a weird problem with a lucene index, using 2.3.1. The index
contains 6660 files. I don't know how this happened.Maybe somone
can tell me
something about the files themselves? (examples below)
On one day, between 10 and 40 of these files were being created
every
minute. The index updates are triggered by updates to an SVN
repository, but
I can't find any corresponding activity in the SVN logs.
The lucene files all have names like this:
_1qsw.cfs
_1qsx.cfs
_1qsy.cfs
_1qsz.cfs
_1qt0.cfs
and are mostly < 5K in size.
My application uses just one instance each of
IndexReader/IndexWriter/IndexSearcher. From looking at
Can anyone shed any light on these files? I'm not too hopeful
about fixing
this index because we are getting "too many open files", even with
an
unlimited ulimit, but any info/suggestions would be great. Thanks.
-John
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
------------------------------------------------------------------------
No virus found in this incoming message.
Checked by AVG - http://www.avg.com Version: 8.0.233 / Virus
Database: 270.10.17/1933 - Release Date: 2/3/2009 5:48 PM
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org