And this is also an approach Yonik drafted here for user/tagging design: http://wiki.apache.org/solr/UserTagDesign

        Erik


On Dec 4, 2009, at 1:35 PM, Steven A Rowe wrote:

Hi Grant,

On 12/02/2009 at 2:30 PM, Grant Ingersoll wrote:
I've been noodling around with the notion of a
"layered" field where variants of a primary token are stored at
"sub positions" of the primary token (instead of in separate copy
fields)
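
A minimal sketch of the layered idea, under loud assumptions: this is not Lucene or Solr code, and the names build_layered_index and hypothetical_variants are invented purely for illustration. Each primary token is posted at sub-position 0 of its position, and each variant goes at sub-positions 1..n of that same position rather than into a separate copy field.

```python
from collections import defaultdict


def hypothetical_variants(token):
    """Illustrative variant generator (an assumption): lowercase form plus a naive plural strip."""
    variants = []
    lowered = token.lower()
    if lowered != token:
        variants.append(lowered)
    if lowered.endswith("s") and len(lowered) > 3:
        variants.append(lowered[:-1])
    return variants


def build_layered_index(tokens, variants_for):
    """Map each term to a list of (position, sub_position) pairs.

    The primary token sits at sub-position 0; its variants share the
    same position and take sub-positions 1, 2, ...
    """
    postings = defaultdict(list)
    for position, token in enumerate(tokens):
        postings[token].append((position, 0))
        for sub, variant in enumerate(variants_for(token), start=1):
            postings[variant].append((position, sub))
    return postings


index = build_layered_index(["Solr", "Fields"], hypothetical_variants)
```

Because a variant keeps the position of its primary token, a positional query could in principle match on any layer without a second field.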

The Indri search engine (now part of Lemur) uses a similar idea: fields are implemented as potentially overlapping extents over the (single) stream of document tokens. (Howard Turtle, who is now the CNLP director and has been involved in Indri development, told me about this feature. He says it allows for a natural representation of fields projected onto hierarchical data, e.g. XML.) I wasn't able to find much documentation about this online when I looked just now, but here's a high-level overview of the Indri "repository" (aka index) structure:

http://www.lemurproject.org/docs/index.php/Indri_Repository_Structure

(See the "Field Information Files" section near the bottom.)
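
To make the extent idea concrete, here is a rough sketch. It is not Indri's API or file format; Extent, fields_at and find_in_field are hypothetical names, and the half-open [begin, end) convention is my assumption. Fields are just ranges laid over one token stream, so they can nest or overlap the way XML elements do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A field occurrence covering tokens [begin, end) of the single token stream."""
    field: str
    begin: int  # inclusive token offset
    end: int    # exclusive token offset


def fields_at(extents, position):
    """Return the names of all fields whose extent covers the given token position."""
    return [e.field for e in extents if e.begin <= position < e.end]


def find_in_field(tokens, extents, field, term):
    """Return token positions where term occurs inside any extent of the given field."""
    hits = []
    for extent in extents:
        if extent.field != field:
            continue
        for position in range(extent.begin, extent.end):
            if tokens[position] == term:
                hits.append(position)
    return hits


# One token stream; "title" and "body" are nested inside "doc".
tokens = ["lucene", "in", "action", "search", "with", "lucene"]
extents = [
    Extent("doc", 0, 6),
    Extent("title", 0, 3),
    Extent("body", 3, 6),
]

title_hits = find_in_field(tokens, extents, "title", "lucene")
body_hits = find_in_field(tokens, extents, "body", "lucene")
doc_hits = find_in_field(tokens, extents, "doc", "lucene")
```

The tokens are stored once; restricting a search to a field only changes which extents are scanned.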

Steve

