[ https://issues.apache.org/jira/browse/LUCENE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145024#comment-13145024 ]
Michael McCandless commented on LUCENE-3454:
--------------------------------------------

How about the name "forceMerge(int)" instead?

Fundamentally, this is a different operation from maybeMerge() because that method only does "natural" merges, i.e. ones that the MP has selected on its own. Whereas forceMerge means you are forcing the MP to do merging that it otherwise would not have naturally chosen to do.

I don't like names like compact/defragment since they still imply this is a sort of necessary periodic maintenance that you are expected / need to call.

The fact is, Lucene has made excellent progress on getting good performance on multi-segment indexes:

  * Query rewriting (e.g. MTQ) and searching is per-segment.

  * TieredMP now targets segments with deletions, and can merge out-of-order, etc.

Reducing the index down to 1 segment is rarely justified given the cost (yes, there are times, like a fully static index, but this is rare).

The goal here is to discourage "typical" users from calling optimize ("expert" users will of course find the method and use it, hopefully in the "right" cases).

The API is badly trappy today; we've seen this over and over now (I just got a private email a few days ago... when I asked why they optimize after every "batch" they said "because it just seemed like the right thing to do"). We've all seen many users fall into this trap.

We can try to debate why this is so... I don't think it's because they are "morons". I think there are many other explanations. E.g., our own FAQs, javadocs, the Lucene in Action book, tutorials, etc., all frequently "suggested" optimize in the past.

I think, also, users often don't realize Lucene has "segments" and that optimize means these segments are "fully rewritten" and that this then implies O(N^2) cost if you call after every doc/batch, etc. These things are obvious to Lucene developers, but not so to users.
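To make the O(N^2) point concrete, here is a rough back-of-the-envelope sketch in plain Java. It does not use Lucene at all; the class and method names are hypothetical. It assumes a simplified model in which every full merge rewrites every document currently in the index, and it ignores the cost of natural merges entirely, so the second number is only a lower bound for comparison.

```java
// Hypothetical cost model, not Lucene code: counts how many documents get
// written to disk under two indexing patterns.
public class OptimizeCostSketch {

    // Assumption: after each batch the whole index is merged down to one
    // segment, so every document indexed so far is rewritten again.
    public static long docsRewrittenWithFullMergeAfterEachBatch(int batches, int docsPerBatch) {
        long total = 0;
        long indexSize = 0;
        for (int i = 0; i < batches; i++) {
            indexSize += docsPerBatch; // the new batch is added
            total += indexSize;        // the full merge rewrites the entire index
        }
        return total;
    }

    // Baseline: each document is written exactly once when its segment is
    // flushed. Natural merges are deliberately not modeled here.
    public static long docsWrittenOnce(int batches, int docsPerBatch) {
        return (long) batches * docsPerBatch;
    }

    public static void main(String[] args) {
        int batches = 100;
        int docsPerBatch = 1000;
        System.out.println("full merge after every batch: "
                + docsRewrittenWithFullMergeAfterEachBatch(batches, docsPerBatch));
        System.out.println("each doc written once:        "
                + docsWrittenOnce(batches, docsPerBatch));
    }
}
```

With 100 batches of 1,000 docs, the first pattern writes 5,050,000 docs against 100,000 for the baseline, and the gap grows with the square of the number of batches.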
> rename optimize to a less cool-sounding name
> --------------------------------------------
>
>                 Key: LUCENE-3454
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3454
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 3.4, 4.0
>            Reporter: Robert Muir
>            Assignee: Michael McCandless
>         Attachments: LUCENE-3454.patch
>
>
> I think users see the name optimize and feel they must do this, because who
> wants a suboptimal system? but this probably just results in wasted time and
> resources.
> maybe rename to collapseSegments or something?