[ https://issues.apache.org/jira/browse/LUCENE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207686#comment-13207686 ]
Andrzej Bialecki commented on LUCENE-2632:
-------------------------------------------

bq. the TODO/etc in term vectors makes me wish our codec consumer APIs for Fields/TermVectors were more consistent...

Also, the handling of segments.gen and compound files, which bypasses the codec, actually forced me to implement TeeDirectory.

Re. synchronization - yes, many of these should be removed. I synced everything for now to narrow down the source of merge problems.

TeeCodec.files() - well spotted, this should be fixed.

> FilteringCodec, TeeCodec, TeeDirectory
> --------------------------------------
>
>                 Key: LUCENE-2632
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2632
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Andrzej Bialecki
>         Attachments: LUCENE-2632.patch, LUCENE-2632.patch
>
>
> This issue adds two new Codec implementations (TeeCodec and FilteringCodec)
> and a supporting TeeDirectory:
> * TeeCodec: there have been attempts in the past to implement parallel
> writing to multiple indexes so that they are all synchronized. This was
> however complicated due to the complexity of IndexWriter/SegmentMerger logic.
> The solution presented here offers similar functionality but works at a
> different level - as the name suggests, the TeeCodec duplicates index data
> into multiple output Directories.
> * TeeDirectory (also used in TeeCodec) is a simple abstraction to perform
> Directory operations on several directories in parallel (effectively
> mirroring their data). Optionally, it's possible to specify a set of suffixes
> of files that should be mirrored, so that non-matching files are skipped.
> * FilteringCodec is remotely related to the ideas of index pruning
> presented in LUCENE-1812 and the concept of tiered search. Since we can use
> TeeCodec to write to multiple output Directories in a synchronized way, we
> could also filter out or modify some of the data that is being written. The
> FilteringCodec provides this functionality, and you can use it like this:
> {code}
> IndexWriter --> TeeCodec
>                  |  |
>                  |  +--> StandardCodec --> Directory1
>                  +--> FilteringCodec --> StandardCodec --> Directory2
> {code}
> The end result of this chain is two indexes that are kept in sync - one is
> the full regular index, and the other one is a filtered index.
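
For readers without the patch at hand, the mirroring-with-suffix-filter idea described for TeeDirectory can be sketched roughly as follows. This is only an illustration and not the code from LUCENE-2632.patch: it does not use Lucene's Directory API, and the names TeeSketch, MemoryDir and Tee are hypothetical. It also assumes that one "main" directory always receives every file while the suffix filter applies only to the mirrors, which the description above does not spell out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: an in-memory stand-in for the tee idea.
// It is NOT the patch's TeeDirectory and does not use Lucene classes.
class TeeSketch {

    /** Hypothetical stand-in for a Directory: file name -> file contents. */
    static final class MemoryDir {
        final Map<String, byte[]> files = new LinkedHashMap<>();
    }

    /** Applies each operation to the main directory and to every mirror. */
    static final class Tee {
        private final MemoryDir main;
        private final List<MemoryDir> mirrors;
        private final Set<String> suffixes; // empty set means "mirror everything"

        Tee(MemoryDir main, List<MemoryDir> mirrors, Set<String> suffixes) {
            this.main = main;
            this.mirrors = mirrors;
            this.suffixes = suffixes;
        }

        /** Writes to the main directory, then to the mirrors if the name matches. */
        void write(String name, byte[] data) {
            main.files.put(name, data.clone());
            if (!shouldMirror(name)) {
                return; // non-matching files are skipped for the mirrors
            }
            for (MemoryDir mirror : mirrors) {
                mirror.files.put(name, data.clone());
            }
        }

        /** Deletes from all directories; removing a missing file is a no-op. */
        void delete(String name) {
            main.files.remove(name);
            for (MemoryDir mirror : mirrors) {
                mirror.files.remove(name);
            }
        }

        private boolean shouldMirror(String name) {
            if (suffixes.isEmpty()) {
                return true;
            }
            for (String suffix : suffixes) {
                if (name.endsWith(suffix)) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        MemoryDir main = new MemoryDir();
        MemoryDir mirror = new MemoryDir();
        Set<String> suffixes = new HashSet<>(Arrays.asList(".gen", ".cfs"));
        Tee tee = new Tee(main, Collections.singletonList(mirror), suffixes);

        tee.write("segments.gen", new byte[] {1, 2});
        tee.write("_0.frq", new byte[] {3});

        System.out.println(main.files.keySet());   // prints [segments.gen, _0.frq]
        System.out.println(mirror.files.keySet()); // prints [segments.gen]
    }
}
```

In this sketch the mirror ends up with only the files whose names match one of the configured suffixes, while the main directory holds everything. A real implementation would have to wrap Lucene's Directory and its input/output streams rather than copy byte arrays.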