[ 
https://issues.apache.org/jira/browse/LUCENE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer reopened LUCENE-2701:
-------------------------------------


This change, together with LUCENE-2773, changed the semantics of IW#optimize() 
and friends.
The IW#optimize() javadoc says:
{code}
  /**
   * Requests an "optimize" operation on an index, priming the index
   * for the fastest available search. Traditionally this has meant
   * merging all segments into a single segment as is done in the
   * default merge policy, but individual merge policies may implement
   * optimize in different ways.
   *
{code}

This is no longer entirely true, since the default merge policy now uses 

{code}
  /** Default maximum segment size.  A segment of this size
   *  or larger will never be merged.  @see setMaxMergeMB */
  public static final double DEFAULT_MAX_MERGE_MB = 2048;
{code}

This is not what I would expect from optimize(), even if it were documented 
that way. A plain optimize() call should by default result in a single segment, 
IMO. We could instead control this with a flag in LogMergePolicy, maybe something 
like LogMergePolicy#useMaxMergeSizeForOptimize = false as the default?
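
To make the suggestion concrete, here is a minimal sketch of how such a flag could gate the size limit during optimize. It is not Lucene code: the class OptimizeFlagSketch, its methods, and the flag name are hypothetical, and segments are modelled simply as sizes in MB.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the optimize path of a LogMergePolicy-like class; not Lucene code. */
class OptimizeFlagSketch {
  /** Mirrors the DEFAULT_MAX_MERGE_MB value quoted above. */
  static final double DEFAULT_MAX_MERGE_MB = 2048;

  /** Proposed flag (name is an assumption): false keeps the traditional optimize semantics. */
  boolean useMaxMergeSizeForOptimize = false;

  double maxMergeMB = DEFAULT_MAX_MERGE_MB;

  /** Returns the sizes (in MB) of the segments that an optimize would be allowed to merge. */
  List<Double> segmentsToMergeForOptimize(List<Double> segmentSizesMB) {
    List<Double> eligible = new ArrayList<Double>();
    for (double sizeMB : segmentSizesMB) {
      // The size limit only applies to optimize when the flag is switched on.
      boolean tooLarge = useMaxMergeSizeForOptimize && sizeMB >= maxMergeMB;
      if (!tooLarge) {
        eligible.add(sizeMB);
      }
    }
    return eligible;
  }

  /** Number of segments left if all eligible segments are merged into a single one. */
  int segmentCountAfterOptimize(List<Double> segmentSizesMB) {
    int eligible = segmentsToMergeForOptimize(segmentSizesMB).size();
    int skipped = segmentSizesMB.size() - eligible;
    return skipped + (eligible > 0 ? 1 : 0);
  }
}
```

With segments of 10, 500 and 3000 MB, the default (flag off) leaves a single segment, while switching the flag on leaves two, because the 3000 MB segment is skipped.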

> Factor maxMergeSize into findMergesForOptimize in LogMergePolicy
> ----------------------------------------------------------------
>
>                 Key: LUCENE-2701
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2701
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2701.patch, LUCENE-2701.patch, LUCENE-2701.patch
>
>
> LogMergePolicy allows you to specify a maxMergeSize in MB, which is taken 
> into consideration in regular merges, yet ignored by findMergesForOptimize. I 
> think it'd be good if we take that into consideration even when optimizing. 
> This will allow the caller to specify two constraints: maxNumSegments and 
> maxMergeMB. Obviously both may not be satisfied, and therefore we will 
> guarantee that if there is any segment above the threshold, the threshold 
> constraint takes precedence and therefore you may end up w/ <maxNumSegments 
> (if it's not 1) after optimize. Otherwise, maxNumSegments is taken into 
> consideration.
> As part of this change, I plan to change some methods to protected (from 
> private) and members as well. I realized that if one wishes to implement his 
> own LMP extension, he needs to either put it under o.a.l.index or copy some 
> code over to his impl.
> I'll attach a patch shortly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

