[jira] [Commented] (LUCENE-8264) Allow an option to rewrite all segments

Simon Willnauer (JIRA) Tue, 24 Apr 2018 04:32:40 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449666#comment-16449666
 ]


Simon Willnauer commented on LUCENE-8264:
-----------------------------------------

To be absolutely honest, I was surprised by this as well. I think the reasons 
behind this change make sense, but the implications are big. I am not sure 
whether the strictness here comes only from the broken TermVectors offsets, 
but if so, can we discuss relaxing this a bit? This change took a couple of 
committers by surprise (including myself), and I wonder if we can take a step 
back and revisit this decision. There are a bunch of other issues when you 
go, for instance, from 3.x to 7.x, such as your tokenization / analysis chain 
no longer being supported, but there are still valid use cases for upgrading 
your index via background merges that rewrite the index format. Issues like 
unsupported analysis chains should be handled by higher-level apps like Solr 
or ES. There are tons of people who use Lucene as a retrieval engine doing 
very simple whitespace tokenization; for them, a merge from 3.x to 7.x might 
be just fine. I think it would be good to have the conversation again, even 
though the changes were communicated very openly. [~jpountz] [~thetaphi] 
[~rcmuir] [~mikemccand] [~dweiss] WDYT?

> Allow an option to rewrite all segments
> ---------------------------------------
>
>                 Key: LUCENE-8264
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8264
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>
> For the background, see SOLR-12259.
> There are several use-cases that would be much easier, especially during 
> upgrades, if we could specify that all segments get rewritten. 
> One example: Upgrading 5x->6x->7x. When segments are merged, they're 
> rewritten into the current format. However, there's no guarantee that a 
> particular segment _ever_ gets merged, so the 6x->7x upgrade won't 
> necessarily be successful.
> How many merge policies support this is an open question. I propose to start 
> with TMP and raise other JIRAs as necessary for other merge policies.
> So far the usual response has been "re-index from scratch", but that's 
> increasingly difficult as systems get larger.
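
To illustrate the point in the issue description that a particular segment may 
never be picked for a merge, here is a toy simulation. It is only a sketch 
under stated assumptions: the `Segment` class, `pick_merge`, `simulate`, and 
the size-cap rule are all hypothetical and are not Lucene's TieredMergePolicy 
or any Lucene API. The assumed rule is that only segments no larger than half 
the maximum merged size are eligible, and that the smallest eligible segments 
are merged first.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    size_mb: int
    version: int  # major version of the release that wrote this segment


def pick_merge(segments, max_merged_mb=5000, merge_factor=3):
    """Hypothetical rule: merge the `merge_factor` smallest segments among
    those no larger than half of `max_merged_mb`; return [] otherwise."""
    eligible = sorted(
        (s for s in segments if s.size_mb <= max_merged_mb / 2),
        key=lambda s: s.size_mb,
    )
    return eligible[:merge_factor] if len(eligible) >= merge_factor else []


def simulate(segments, current_version, flushes, flush_mb=10):
    """Flush `flushes` new segments, running toy merges after each flush."""
    segments = list(segments)
    merges = 0
    for i in range(flushes):
        # Newly flushed segments are always written in the current format.
        segments.append(Segment(f"flush{i}", flush_mb, current_version))
        chosen = pick_merge(segments)
        while chosen:
            merges += 1
            # A merge rewrites its inputs into the current format.
            merged = Segment(
                f"merge{merges}",
                sum(s.size_mb for s in chosen),
                current_version,
            )
            segments = [s for s in segments if s not in chosen] + [merged]
            chosen = pick_merge(segments)
    return segments


# An index written by version 6, then opened and appended to by version 7.
index = [Segment("old_big", 3000, 6), Segment("old_small", 50, 6)]
for seg in simulate(index, current_version=7, flushes=20):
    print(seg)
```

In this model `old_small` is folded into a merge and rewritten at version 7, 
while `old_big` is never eligible and stays at version 6 no matter how many 
flushes follow. That is the situation an explicit "rewrite all segments" 
option would address.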



