Re: Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)

Dmitry Serebrennikov Tue, 30 Apr 2002 15:21:08 -0700

Just a couple of clarification points:
- the number of files that Lucene uses depends on the number of segments 
in the index and the number of *stored* fields
- if your fields are not stored but only indexed, they do not require 
separate files. Otherwise, an .fnn file is created for each field.
- if at least one document uses a given field name in an index, that 
index requires the .fnn file for that field
- index segments are created when documents are added to the index. For 
each 10 docs you get a new segment.
- optimizing the index removes all segments are replaces them with one 
new segment that contains all of the documents
- optimization is done periodically as more documents are added 
(controlled by IndexWriter.mergeFactor), but can be done manually 
whenever needed


With all this, I think Lucene does use too many files...
Some additional info: there is a field on IndexWriter called infoStream. 
If this is set to a PrintStream (such as System.out), various diagnostic 
messages about the merging process will be printed to that stream. You 
might find this helpful in tuning the merge parameters.

Hope this helps.
Good luck.

Dmitry.




--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

Re: Homogeneous vs Heterogeneous indexes (was: FileNotFoundException)

Reply via email to