Cool indeed! Now I can easily create a full-blown index on the master and search (or replicate) only the subset I need to search.
New use cases this makes possible:

- Today one has to blow up the term dictionary with XXX million unique ids just to support deletions, which is often needed only during indexing. For search-only slaves it is often sufficient to have the uid as a stored field (if at all), so the term dictionary does not get bloated.
- The possibility to simply store the original documents in one index (a kind of key-value store), but to search/distribute a much smaller index.

This enables many new scenarios where Lucene takes storage responsibility (Lucene takes over the database role in many cases).

On Tue, Feb 14, 2012 at 8:45 AM, Uwe Schindler (Commented) (JIRA) <j...@apache.org> wrote:
>
> [
> https://issues.apache.org/jira/browse/LUCENE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207566#comment-13207566
> ]
>
> Uwe Schindler commented on LUCENE-2632:
> ---------------------------------------
>
> Hey cool, sounds like this makes the unmaintainable ParallelReaders obsolete
> by doing the splitting to several directories/parallel fields in the codec -
> so merging automatically works correctly with every MP?
>
>> FilteringCodec, TeeCodec, TeeDirectory
>> --------------------------------------
>>
>> Key: LUCENE-2632
>> URL: https://issues.apache.org/jira/browse/LUCENE-2632
>> Project: Lucene - Java
>> Issue Type: New Feature
>> Components: core/index
>> Affects Versions: 4.0
>> Reporter: Andrzej Bialecki
>> Attachments: LUCENE-2632.patch, LUCENE-2632.patch
>>
>>
>> This issue adds two new Codec implementations:
>> * TeeCodec: there have been attempts in the past to implement parallel
>> writing to multiple indexes so that they are all synchronized. This was
>> however complicated due to the complexity of IndexWriter/SegmentMerger
>> logic. The solution presented here offers similar functionality but
>> works on a different level - as the name suggests, the TeeCodec duplicates
>> index data into multiple output Directories.
>> * TeeDirectory (used also in TeeCodec) is a simple abstraction to perform
>> Directory operations on several directories in parallel (effectively
>> mirroring their data). Optionally it's possible to specify a set of suffixes
>> of files that should be mirrored so that non-matching files are skipped.
>> * FilteringCodec is related in a remote way to the ideas of index pruning
>> presented in LUCENE-1812 and the concept of tiered search. Since we can use
>> TeeCodec to write to multiple output Directories in a synchronized way, we
>> could also filter out or modify some of the data that is being written. The
>> FilteringCodec provides this functionality, so that you can use it like this:
>> {code}
>> IndexWriter --> TeeCodec
>>                  |  |
>>                  |  +--> StandardCodec --> Directory1
>>                  +--> FilteringCodec --> StandardCodec --> Directory2
>> {code}
>> The end result of this chain is two indexes that are kept in sync - one is
>> the full regular index, and the other one is a filtered index.
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators:
> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
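
To make the suffix-based mirroring in the quoted TeeDirectory description a bit more concrete, here is a minimal sketch in plain Java. It does not use Lucene at all and is not the code from the LUCENE-2632 patch: the class name SuffixMirror, its methods, the file names and the ".tim"/".fdt" suffixes are all made up for illustration, and it assumes one "primary" directory that receives every file plus mirrors that only receive files whose names match a suffix.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; NOT the TeeDirectory class from the patch.
public class SuffixMirror {
    private final Path primary;          // always receives every file
    private final List<Path> mirrors;    // receive only files with a matching suffix
    private final Set<String> suffixes;  // empty set means "mirror everything"

    public SuffixMirror(Path primary, List<Path> mirrors, Set<String> suffixes) {
        this.primary = primary;
        this.mirrors = mirrors;
        this.suffixes = suffixes;
    }

    // True if the file should be copied to the mirrors.
    public boolean shouldMirror(String fileName) {
        if (suffixes.isEmpty()) {
            return true;
        }
        for (String suffix : suffixes) {
            if (fileName.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    // Writes to the primary directory, then to each mirror if the suffix matches.
    public void write(String fileName, byte[] data) {
        try {
            Files.write(primary.resolve(fileName), data);
            if (shouldMirror(fileName)) {
                for (Path mirror : mirrors) {
                    Files.write(mirror.resolve(fileName), data);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small demo: returns {full has .fdt, slim has .tim, slim has .fdt}.
    public static boolean[] demo() {
        try {
            Path full = Files.createTempDirectory("full-index");
            Path slim = Files.createTempDirectory("slim-index");
            // Example suffix only; pick whatever the slim copy actually needs.
            SuffixMirror tee = new SuffixMirror(full, List.of(slim), Set.of(".tim"));
            byte[] payload = "dummy".getBytes(StandardCharsets.UTF_8);
            tee.write("_0.tim", payload);
            tee.write("_0.fdt", payload);
            return new boolean[] {
                Files.exists(full.resolve("_0.fdt")),
                Files.exists(slim.resolve("_0.tim")),
                Files.exists(slim.resolve("_0.fdt"))
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("full has _0.fdt: " + r[0]);
        System.out.println("slim has _0.tim: " + r[1]);
        System.out.println("slim has _0.fdt: " + r[2]);
    }
}
```

In this sketch the "slim" directory ends up without the non-matching file, which is the same shape as the use case above: keep everything on the master and ship only the files the search side needs. How the real TeeDirectory handles this may well differ.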