[ https://issues.apache.org/jira/browse/LUCENE-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499876#comment-13499876 ]
Shai Erera commented on LUCENE-4560:
------------------------------------
Thinking about this some more, I really don't think it's a 'gradual' thing that
you do to the index:
* Depending on the state of the index, this migration may never happen for
some segments, typically very large segments that are no longer picked for
merge. So what will happen is that you'll have code in your app that will
never be invoked after some time ... not a good sign to me.
* I won't want to have code in my app that lives there forever. Rather, I'd
like to make a decision to remove field 'foo', run the process which removes it
once, and be done with it, moving the code to some "tools" area that is never
run again.
** With your approach, RemoveFieldReader will not go away unless you can
guarantee it ran on all segments, which is like forcing forceMerge(1) to run
(note: it may not do what you want, depending on MergePolicy settings!), which
is really like addIndexes.
** Worse, today it's RemoveFieldReader, and tomorrow it will turn into
RemoveFieldAndMigrateIndexOptionsReader, because as I wrote above, you cannot
stop running that code if you cannot ensure that all segments have been
migrated.
So I'm beginning to think that this process should not be an
incremental/gradual/online thing, but rather an addIndexes type of process,
that you run once, and know that you're done with it, until the next time where
you need to rewrite the index, w/o actually re-indexing the content.
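To make the difference concrete, here is a toy sketch in plain Java. It does not use any Lucene API: a "segment" is just a list of documents, and the class and method names (OneShotRewriteSketch, filterDuringMerge, rewriteOnce) and the maxMergeDocs cutoff are made up for illustration. It only tracks which segments get filtered and ignores that a real merge combines them. The real thing would presumably wrap the readers and pass them to addIndexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model, not Lucene code: a segment is a list of documents,
// and a document is a map from field name to value.
public class OneShotRewriteSketch {

    // Builds a segment of numDocs documents that all carry the field "foo".
    static List<Map<String, String>> segmentOf(int numDocs) {
        List<Map<String, String>> segment = new ArrayList<>();
        for (int i = 0; i < numDocs; i++) {
            Map<String, String> doc = new HashMap<>();
            doc.put("id", Integer.toString(i));
            doc.put("foo", "obsolete");
            segment.add(doc);
        }
        return segment;
    }

    // Returns a copy of the segment with the field dropped from every document.
    static List<Map<String, String>> removeField(List<Map<String, String>> segment, String field) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> doc : segment) {
            Map<String, String> copy = new HashMap<>(doc);
            copy.remove(field);
            out.add(copy);
        }
        return out;
    }

    // Gradual approach: only segments small enough to be picked for merge
    // (hypothetical cutoff maxMergeDocs) pass through the filter.
    static List<List<Map<String, String>>> filterDuringMerge(
            List<List<Map<String, String>>> index, String field, int maxMergeDocs) {
        List<List<Map<String, String>>> out = new ArrayList<>();
        for (List<Map<String, String>> segment : index) {
            out.add(segment.size() <= maxMergeDocs ? removeField(segment, field) : segment);
        }
        return out;
    }

    // One-shot approach: every segment is rewritten into a new index, once.
    static List<List<Map<String, String>>> rewriteOnce(
            List<List<Map<String, String>>> index, String field) {
        List<List<Map<String, String>>> out = new ArrayList<>();
        for (List<Map<String, String>> segment : index) {
            out.add(removeField(segment, field));
        }
        return out;
    }

    // True if any document in any segment still carries the field.
    static boolean stillHasField(List<List<Map<String, String>>> index, String field) {
        for (List<Map<String, String>> segment : index) {
            for (Map<String, String> doc : segment) {
                if (doc.containsKey(field)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // One very large segment and one small one.
        List<List<Map<String, String>>> index = Arrays.asList(segmentOf(1000), segmentOf(3));

        // The large segment is never picked, so "foo" survives: prints true.
        System.out.println(stillHasField(filterDuringMerge(index, "foo", 100), "foo"));

        // The one-shot rewrite touches every segment: prints false.
        System.out.println(stillHasField(rewriteOnce(index, "foo"), "foo"));
    }
}
```

In the gradual variant you can never tell from the code alone whether the field is gone everywhere, which is why the filter has to stay in the app; in the one-shot variant the check after the rewrite tells you that you are done.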
BTW, did you take a look at LUCENE-2632? It is about adding a FilteringCodec
which filters the data that it writes/reads. Could it help you here? If so, I
think it has a better chance of getting committed than the approach in this
issue (Codecs are already an extension point...).
> Support Filtering Segments During Merge
> ---------------------------------------
>
> Key: LUCENE-4560
> URL: https://issues.apache.org/jira/browse/LUCENE-4560
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Tim Smith
> Attachments: LUCENE-4560.patch
>
>
> Spun off from LUCENE-4557
> It is desirable to be able to filter segments during merge.
> Most often, full reindex of content is not possible.
> Merging segments can sometimes have negative consequences when fields
> have different options (most restrictive option is forced during merge)
> Being able to filter segments during merges will allow gradually migrating
> indexed data to new index settings, support pruning/enhancing existing data
> gradually
> Use Cases:
> * Migrate IndexOptions for fields (See LUCENE-4557)
> * Gradually Remove index fields no longer used
> * Migrate indexed sort fields to DocValues
> * Support converting data types for indexed data
> * and so on
> patch will be forthcoming