[ 
https://issues.apache.org/jira/browse/LUCENE-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499996#comment-13499996
 ] 

Tim Smith commented on LUCENE-4560:
-----------------------------------

My base requirement here is that this be an online process.
As such, the addIndexes approach is not really useful as I see it: it requires 
2x disk space as well as completely new index directories, so it does not play 
well with upgrading a user's existing index.

What I see as needed is the ability to gradually migrate indexes such that any 
individual segment is itself consistent.
Currently, merging of indexes can result in loss of indexed data or otherwise 
break consistency, as in LUCENE-4557.
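
To illustrate the failure mode, here is a toy Java model of the "most 
restrictive option is forced during merge" behavior mentioned in the issue 
description. It is not Lucene code: the Options enum and mergedOptions method 
are made-up names, and the sketch assumes a non-empty list of segments.

```java
import java.util.Arrays;
import java.util.List;

public class MergeOptionsSketch {
    // Hypothetical stand-in for per-field index options,
    // declared from least to most detailed.
    enum Options { DOCS_ONLY, DOCS_AND_FREQS, DOCS_AND_FREQS_AND_POSITIONS }

    // Models "most restrictive option wins": the merged field keeps only
    // the level of detail that every input segment has.
    // Assumes perSegment is non-empty.
    static Options mergedOptions(List<Options> perSegment) {
        Options result = perSegment.get(0);
        for (Options o : perSegment) {
            if (o.compareTo(result) < 0) {
                result = o;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Options> segments = Arrays.asList(
            Options.DOCS_AND_FREQS_AND_POSITIONS, // segment indexed with positions
            Options.DOCS_ONLY);                   // segment indexed without them
        // The positions from the first segment do not survive the merge.
        System.out.println(mergedOptions(segments)); // prints DOCS_ONLY
    }
}
```

In this model a single DOCS_ONLY segment is enough to discard the positions 
held by every other segment in the merge, which is the kind of data loss a 
per-segment filter would be meant to avoid.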

It is 100% OK if not all segments have been processed, as I can identify each 
segment's settings at index open/search time and optionally filter/search/read 
segments differently.

It is true that once you start using this SegmentMergeFilter, you pretty much 
have to keep using it forever.
I don't see this as an issue: when supporting old indexes, you constantly have 
to support migration of data that was indexed using old code. 
For instance, as time goes on, my "MergeSegmentFilter" will do more, migrating 
more and more old index formats/config settings to the latest indexing 
format/settings.

At a quick glance, FilteringCodec looks like it applies to writing new content, 
not reading existing indexes?
That doesn't seem like it would quite do the trick here. I would need some way 
to have the index writer wrap the codec for existing segments in order to 
inject my custom filtering that would apply during merging. That would be 
logically identical to the patch provided, but would potentially result in a 
much more complex patch.
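
As a rough sketch of the idea (again a toy model, not Lucene's API and not the 
attached patch), the filter sits on the read side of the merge: each existing 
segment is passed through it before its contents are written into the merged 
segment. The segment representation (field name to list of terms), merge, and 
dropField are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

public class MergeFilterSketch {

    // Applies the filter to every existing segment as it is read for the
    // merge, then concatenates the filtered views into one new segment.
    static Map<String, List<String>> merge(
            List<Map<String, List<String>>> segments,
            UnaryOperator<Map<String, List<String>>> filter) {
        Map<String, List<String>> merged = new LinkedHashMap<>();
        for (Map<String, List<String>> seg : segments) {
            Map<String, List<String>> view = filter.apply(seg);
            for (Map.Entry<String, List<String>> e : view.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .addAll(e.getValue());
            }
        }
        return merged;
    }

    // Example filter: hide a field that is no longer used, leaving the
    // source segment itself untouched.
    static UnaryOperator<Map<String, List<String>>> dropField(String name) {
        return seg -> {
            Map<String, List<String>> copy = new LinkedHashMap<>(seg);
            copy.remove(name);
            return copy;
        };
    }

    public static void main(String[] args) {
        Map<String, List<String>> oldSeg = new LinkedHashMap<>();
        oldSeg.put("title", Arrays.asList("lucene"));
        oldSeg.put("legacy_sort", Arrays.asList("a"));

        Map<String, List<String>> newSeg = new LinkedHashMap<>();
        newSeg.put("title", Arrays.asList("merge"));

        // prints {title=[lucene, merge]}
        System.out.println(
            merge(Arrays.asList(oldSeg, newSeg), dropField("legacy_sort")));
    }
}
```

Because the filter only changes what the merge reads, segments that have not 
yet been merged stay as they were, which matches the gradual, online migration 
described above.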

> Support Filtering Segments During Merge
> ---------------------------------------
>
>                 Key: LUCENE-4560
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4560
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Tim Smith
>         Attachments: LUCENE-4560.patch
>
>
> Spun off from LUCENE-4557
> It is desirable to be able to filter segments during merge.
> Most often, full reindex of content is not possible.
> Merging segments can sometimes have negative consequences when fields 
> have different options (most restrictive option is forced during merge)
> Being able to filter segments during merges will allow gradually migrating 
> indexed data to new index settings, support pruning/enhancing existing data 
> gradually
> Use Cases:
> * Migrate IndexOptions for fields (See LUCENE-4557)
> * Gradually Remove index fields no longer used
> * Migrate indexed sort fields to DocValues
> * Support converting data types for indexed data
> * and so on
> patch will be forthcoming

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
