Lucene considers an index with a single .cfx and a single .cfs as optimized.

Also, note that how Lucene stores files in the index is an impl detail
-- it can change from release to release -- so relying on any of these
details is dangerous.

That said, with recent Lucene versions, if you really want to force
these two files to be consolidated, you can make a custom MergePolicy
that returns a merge from the findMergesForOptimize for this case (the
default MergePolicy returns null since it thinks this case is already
optimized).

Or... if you index with two separate IndexWriter sessions, and then
call optimize, that should also merge down to one file.

Mike

2010/5/5 张志田 <zhitian.zh...@dianping.com>:
> Uwe, thank you very much.
>
> What is the mechanizm lucene will merge these two kinds of files? Sometimes I 
> found there was only one .cfs file, but in another time there may be one cfs 
> and cfx. I understand the .cfx is used to store the term vectors etc, but why 
> does the index result not seem to be consistent?
>
> Thanks,
> Garry
> ----- Original Message -----
> From: "Uwe Goetzke" <uwe.goet...@healy-hudson.com>
> To: <java-user@lucene.apache.org>
> Sent: Wednesday, May 05, 2010 3:57 PM
> Subject: AW: How can I merge .cfx and .cfs into a single cfs file?
>
>
> Index all into a directory and determine the size of all files in it.
>
> From http://lucene.apache.org/java/3_0_1/fileformats.html
> Starting with Lucene 2.3, doc store files (stored field values and term 
> vectors) can be shared in a single set of files for more than one segment. 
> When compound file is enabled, these shared files will be added into a single 
> compound file (same format as above) but with the extension .cfx.
>
> In addition to
> Compound File  .cfs  An optional "virtual" file consisting of all the other 
> index files for systems that frequently run out of file handles.
>
> Uwe
>
>
> -----Ursprüngliche Nachricht-----
> Von: 张志田 [mailto:zhitian.zh...@dianping.com]
> Gesendet: Mittwoch, 5. Mai 2010 08:24
> An: java-user@lucene.apache.org
> Betreff: How can I merge .cfx and .cfs into a single cfs file?
>
> Hi all,
>
> I have an index task which will index thousands of records with lucene 3.0.1. 
> My confusion is lucene will always create a .cfx and a .cfs file in the file 
> system, sometimes more, while I thought it should create a single .cfs file 
> if I optimize the index data. Is it by design? If yes, is there any 
> way/configuration I can do to merge all of the index files into a singe one?
>
> By the way, I have a logic to validate the index data, if the size of .cfs 
> increases dramatically comparing to the file generated last time, there may 
> be something wrong, a warning message will be threw. This is the reason that 
> I want to generate a single .cfs file. Any other suggestion about the index 
> validation?
>
> Any body can give me a hand?
>
> Thanks in advance.
>
> Garry
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to