[jira] [Commented] (LUCENE-2632) FilteringCodec, TeeCodec, TeeDirectory

Robert Muir (Commented) (JIRA) Tue, 14 Feb 2012 05:13:26 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207690#comment-13207690
 ]


Robert Muir commented on LUCENE-2632:
-------------------------------------

{quote}
Also, the handling of segments.gen and compound files, which bypasses the 
codec, actually forced me to implement TeeDirectory.
{quote}

True, though I don't know of any simple solutions to either of these :)

For CFS, we made some tiny steps in LUCENE-3728, but the codec has only 
limited control here (e.g. it can store certain things outside of CFS; this 
is how the preflex codec reads separate norms). It cannot yet customize the 
CFS filenames or the actual format/packing process.

                
> FilteringCodec, TeeCodec, TeeDirectory
> --------------------------------------
>
>                 Key: LUCENE-2632
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2632
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Andrzej Bialecki 
>         Attachments: LUCENE-2632.patch, LUCENE-2632.patch
>
>
> This issue adds two new Codec implementations and a supporting Directory:
> * TeeCodec: there have been attempts in the past to implement parallel 
> writing to multiple indexes so that they are all synchronized. This was 
> however complicated due to the complexity of IndexWriter/SegmentMerger logic. 
> The solution presented here offers similar functionality but works at a 
> different level: as the name suggests, the TeeCodec duplicates index data 
> into multiple output Directories.
> * TeeDirectory (used also in TeeCodec) is a simple abstraction to perform 
> Directory operations on several directories in parallel (effectively 
> mirroring their data). Optionally it's possible to specify a set of suffixes 
> of files that should be mirrored so that non-matching files are skipped.
> * FilteringCodec is related in a remote way to the ideas of index pruning 
> presented in LUCENE-1812 and the concept of tiered search. Since we can use 
> TeeCodec to write to multiple output Directories in a synchronized way, we 
> could also filter out or modify some of the data that is being written. The 
> FilteringCodec provides this functionality, so that you can use it like this:
> {code}
> IndexWriter --> TeeCodec
>                  |  |
>                  |  +--> StandardCodec --> Directory1
>                  +--> FilteringCodec --> StandardCodec --> Directory2
> {code}
> The end result of this chain is two indexes that are kept in sync - one is 
> the full regular index, and the other one is a filtered index.
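
The mirroring behaviour described for TeeDirectory can be illustrated with a 
minimal, self-contained sketch. It does not use the Lucene API or the code in 
the attached patch: MemDir, TeeOutput and TeeDirectorySketch are hypothetical 
names, and the rule that the main directory receives every file while mirrors 
receive only files matching the suffix set is an assumption about how the 
optional suffix filter might behave.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a Directory: file name -> bytes written.
class MemDir {
    final Map<String, ByteArrayOutputStream> files = new LinkedHashMap<>();

    ByteArrayOutputStream createOutput(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(name, out);
        return out;
    }
}

// Hypothetical output handle that copies every write to all of its targets.
class TeeOutput {
    private final List<ByteArrayOutputStream> targets;

    TeeOutput(List<ByteArrayOutputStream> targets) {
        this.targets = targets;
    }

    void write(byte[] data) {
        for (ByteArrayOutputStream target : targets) {
            target.write(data, 0, data.length);
        }
    }
}

// Sketch of the "tee" idea: one logical directory backed by several real ones.
class TeeDirectorySketch {
    private final MemDir main;
    private final List<MemDir> mirrors;
    private final Set<String> suffixes; // assumption: empty set = mirror all files

    TeeDirectorySketch(MemDir main, List<MemDir> mirrors, Set<String> suffixes) {
        this.main = main;
        this.mirrors = mirrors;
        this.suffixes = suffixes;
    }

    // A file is mirrored if no filter is set or its name ends with a listed suffix.
    private boolean isMirrored(String name) {
        if (suffixes.isEmpty()) {
            return true;
        }
        for (String suffix : suffixes) {
            if (name.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    // The main directory always gets the file; mirrors get it only if it matches.
    TeeOutput createOutput(String name) {
        List<ByteArrayOutputStream> targets = new ArrayList<>();
        targets.add(main.createOutput(name));
        if (isMirrored(name)) {
            for (MemDir mirror : mirrors) {
                targets.add(mirror.createOutput(name));
            }
        }
        return new TeeOutput(targets);
    }
}
```

With a suffix set containing only ".frq", writing "_0.frq" and "_0.fdt" 
through the sketch leaves both files in the main directory and only "_0.frq" 
in each mirror. A real implementation would also have to mirror deletes, 
syncs and the files that bypass the codec (segments.gen, compound files), 
which this sketch leaves out.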
