[ https://issues.apache.org/jira/browse/LUCENE-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499876#comment-13499876 ]
Shai Erera commented on LUCENE-4560:
------------------------------------

Thinking about this some more, I really don't think it's a 'gradual' thing that you do to the index:

* Depending on the state of the index, this migration may never happen for some segments, typically very large segments that are no longer picked for merge. So you'll end up with code in your app that is never invoked after some time ... not a good sign to me.

* I wouldn't want code in my app that lives there forever. Rather, I'd like to make a decision to remove field 'foo', run the process which removes it once, and be done with it, moving the code to some "tools" area that is never run again.
** With your approach, RemoveFieldReader will not go away unless you can guarantee it ran on all segments, which amounts to forcing forceMerge(1) to run (note that it may not do what you want, depending on MP settings!), and that is really just like addIndexes.
** Worse, today it's RemoveFieldReader, and tomorrow it will turn into RemoveFieldAndMigrateIndexOptionsReader because, as I wrote above, you cannot stop running that code if you cannot ensure that all segments have been migrated.

So I'm beginning to think that this process should not be an incremental/gradual/online thing, but rather an addIndexes type of process that you run once and know you're done with, until the next time you need to rewrite the index without actually re-indexing the content.

BTW, did you take a look at LUCENE-2632? It is about adding a FilteringCodec which filters the data that it writes/reads. Could it help you here? If so, I think it has a better chance of getting committed than the approach in this issue (Codecs are already an extension point...).
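To make the first bullet concrete, here is a toy model in plain Java. It does not use any Lucene API: the `Segment` class, the `mergeWithFilter` and `rewriteAll` methods, and the `maxMergeableDocs` threshold are all hypothetical names made up for illustration, and the "merge policy" is simply assumed to skip any segment above the threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these names are Lucene classes or methods.
class MigrationSketch {

    // Hypothetical stand-in for a segment: its size and whether the old field is gone.
    static final class Segment {
        final int docs;
        final boolean migrated;

        Segment(int docs, boolean migrated) {
            this.docs = docs;
            this.migrated = migrated;
        }
    }

    // Gradual approach: the filter only runs on segments that get picked for merge.
    // Assumption: the merge policy never picks segments larger than maxMergeableDocs.
    static List<Segment> mergeWithFilter(List<Segment> index, int maxMergeableDocs) {
        List<Segment> result = new ArrayList<>();
        int mergedDocs = 0;
        for (Segment s : index) {
            if (s.docs > maxMergeableDocs) {
                result.add(s); // too large to be merged, so it is never filtered
            } else {
                mergedDocs += s.docs;
            }
        }
        if (mergedDocs > 0) {
            result.add(new Segment(mergedDocs, true));
        }
        return result;
    }

    // One-shot approach: every segment is rewritten into a fresh index exactly once.
    static List<Segment> rewriteAll(List<Segment> index) {
        int total = 0;
        for (Segment s : index) {
            total += s.docs;
        }
        List<Segment> result = new ArrayList<>();
        result.add(new Segment(total, true));
        return result;
    }

    // The filtering code can only be retired once this returns true.
    static boolean fullyMigrated(List<Segment> index) {
        for (Segment s : index) {
            if (!s.migrated) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Segment> index = new ArrayList<>();
        index.add(new Segment(5_000_000, false)); // large, old segment
        index.add(new Segment(10_000, false));
        index.add(new Segment(2_000, false));

        System.out.println("gradual:  " + fullyMigrated(mergeWithFilter(index, 1_000_000)));
        System.out.println("one-shot: " + fullyMigrated(rewriteAll(index)));
    }
}
```

In this toy model the gradual pass leaves the 5M-doc segment untouched no matter how many times it runs, so the filtering code can never be retired, while the one-shot rewrite finishes in a single pass.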
> Support Filtering Segments During Merge
> ---------------------------------------
>
>                 Key: LUCENE-4560
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4560
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Tim Smith
>         Attachments: LUCENE-4560.patch
>
>
> Spun off from LUCENE-4557
> It is desirable to be able to filter segments during merge.
> Most often, full reindex of content is not possible.
> Merging segments can sometimes have negative consequences when fields
> have different options (most restrictive option is forced during merge)
> Being able to filter segments during merges will allow gradually migrating
> indexed data to new index settings, support pruning/enhancing existing data
> gradually
> Use Cases:
> * Migrate IndexOptions for fields (See LUCENE-4557)
> * Gradually Remove index fields no longer used
> * Migrate indexed sort fields to DocValues
> * Support converting data types for indexed data
> * and so on
> patch will be forthcoming