Well ... that heuristic is not quite general enough, because the completion of a merge would also decrease the # files and +1 the generation number (if a commit had occurred).

You could check for *.cfs and if there is only one, declare the index optimized? This still isn't always correct because if there is a _X_N.del file (pending deletes against the segment) then the index is not optimized.

But in general, Lucene's file format can change from release to release (it's not an API), so if something changes in the future you may have to revisit this heuristic.

Mike

Shalin Shekhar Mangar wrote:

Hi Michael,

Thanks for the response.

Looking at the general way the filenames are organized:

IndexCommit.getFileNames() without optimize (after IW.close())
[segments_4, _0.cfs, _1.cfs, _2.cfs]
IndexCommit.getFileNames() after optimize+close [segments_5, _4.cfs]

We can compare the latest commit point's files with the previous
commit point's files and if the number of .cfs files have decreased
(or equal) (with a +1 in generation number), can we reliably say if an
optimize has happened?

On Tue, Jul 1, 2008 at 5:44 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:

You're right IndexCommit doesn't know that it represents an optimized index.

Likewise, IndexCommit doesn't know other "semantic" things about the index, eg, you've just called expungeDeletes, or, you just finished adding batch X of documents to the index, etc.

Also, realize that with autoCommit=false (to be the only choice in 3.0), no commit will be done after an optimize. Ie you have to call commit() or close() explicitly to make it a commit.

I think the simplest general approach to "know" which commit points represent "interesting" times to the application would be to call IW.optimize() then IW.commit() (if you are using trunk) or just IW.close(), then look at the last IndexCommit passed to your deletion policy's onCommit() and record yourself that this commit was the result of an optimize.

Mike

Shalin Shekhar Mangar wrote:

Hi,

I'm implementing a custom IndexDeletionPolicy. An IndexCommit object
does not have any information whether it's index is optimized or not.
How can a IndexDeletionPolicy know which IndexCommit instances
corresponded to optimized indices?

--
Regards,
Shalin Shekhar Mangar.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




--
Regards,
Shalin Shekhar Mangar.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to