[
https://issues.apache.org/jira/browse/LUCENE-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karthick Sankarachary updated LUCENE-2425:
------------------------------------------
Attachment: (was: LUCENE-2425.patch)
> An Anti-Merging Multi-Directory Indexing Framework
> --------------------------------------------------
>
> Key: LUCENE-2425
> URL: https://issues.apache.org/jira/browse/LUCENE-2425
> Project: Lucene - Java
> Issue Type: New Feature
> Components: contrib/*, Index
> Affects Versions: 3.0.1
> Reporter: Karthick Sankarachary
>
> By design, a Lucene index tends to merge documents that span multiple
> segments into fewer segments, in order to optimize its directory structure,
> which in turn leads to better search performance. In particular, it relies on
> a merge policy to specify the set of merge operations that should be
> performed when the index is optimized.
> Often, there is a need to do the exact opposite, which is to "split" the
> documents. This calls for a mechanism that facilitates sub-division of
> documents based on a certain (ideally, user-defined) algorithm. By way of
> example, one may wish to sub-divide (or partition) documents based on
> parameters such as time, space, real-timeliness, and so on. Herein, we
> describe an indexing framework that builds on the Lucene index writer and
> reader, to address use cases wherein documents need to diverge rather than
> converge.
> In brief, it associates zero or more sub-directories with the index's
> directory, which serve to complement it in some manner. The sub-directories
> (a.k.a. splits) are managed by a split policy, which is notified of all
> changes made to the index directory (a.k.a. super-directory), thus allowing
> it to modify its sub-directories as it sees fit. To make the index reader and
> writer "observable", we extend Lucene's reader and writer with the goal of
> providing hooks into every method that could potentially change the index.
> This allows for propagation of such changes to the split policy, which
> essentially acts as a listener on the index.
> We refer to each sub-directory (or split) and the super-directory as a
> sub-index of the containing index (a.k.a. the split index). Note that a
> sub-directory need not be co-located with the super-directory.
> Furthermore, the split policy in turn relies on one or more split rules to
> determine when to add or remove sub-directories. This allows for a clear
> separation of the event that triggers a split from the management of those
> splits.
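
For readers who want a concrete picture of the listener arrangement described above, here is a minimal sketch in plain Java. It is not taken from the attached patch and does not use any Lucene classes: the names SplitRule, SplitPolicy, ObservableWriter and SplitIndexDemo are hypothetical, documents are stood in for by strings, and each sub-directory is stood in for by an in-memory list. It also assumes that the policy copies each added document into the current split, which the description leaves open. It only illustrates how a writer hook can notify a policy, and how the policy can defer the "when to split" decision to a rule.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical rule: decides when the policy should open a new split. */
interface SplitRule {
    boolean shouldSplit(int docsInCurrentSplit);
}

/** Hypothetical policy: listens to index changes and manages the splits. */
class SplitPolicy {
    private final SplitRule rule;
    // Each inner list is a stand-in for one sub-directory.
    private final List<List<String>> splits = new ArrayList<>();

    SplitPolicy(SplitRule rule) {
        this.rule = rule;
    }

    /** Hook invoked by the observable writer after a document is added. */
    void onDocumentAdded(String doc) {
        boolean needNewSplit = splits.isEmpty()
                || rule.shouldSplit(splits.get(splits.size() - 1).size());
        if (needNewSplit) {
            splits.add(new ArrayList<>()); // stand-in for creating a sub-directory
        }
        splits.get(splits.size() - 1).add(doc);
    }

    List<List<String>> splits() {
        return splits;
    }
}

/** Stand-in for an extended index writer that exposes a change hook. */
class ObservableWriter {
    // Stand-in for the super-directory.
    private final List<String> superDirectory = new ArrayList<>();
    private final SplitPolicy policy;

    ObservableWriter(SplitPolicy policy) {
        this.policy = policy;
    }

    void addDocument(String doc) {
        superDirectory.add(doc);      // the change to the index itself
        policy.onDocumentAdded(doc);  // propagate the change to the listener
    }

    int docCount() {
        return superDirectory.size();
    }
}

class SplitIndexDemo {
    public static void main(String[] args) {
        // Assumed rule for illustration: start a new split once the current one holds 2 documents.
        SplitPolicy policy = new SplitPolicy(n -> n >= 2);
        ObservableWriter writer = new ObservableWriter(policy);
        for (int i = 1; i <= 5; i++) {
            writer.addDocument("doc" + i);
        }
        System.out.println(policy.splits().size()); // prints 3
    }
}
```

Keeping the rule separate from the policy mirrors the separation mentioned in the description: a time-based or size-based rule could be swapped in without touching the code that creates and tracks the splits.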