On Friday 01 July 2005 20:52, McCallie,David wrote:
>
> Couldn't you use SpanQuery for something like this? Put special
> <start-of-sentence> and <end-of-sentence> tokens around each sentence,
> and then search for the specific key words inside of the outer SPAN? Do
> the same for paragraphs, sections, etc.
>
> I tried this once, and it seemed to work. I'm not sure of the
> performance penalty of the SPAN overhead.
>
It should work. SpanNotQuery can also be used to exclude matches that
cross a sentence boundary (see my other post).

Using a separate sentence field, in which each token position is mapped
to its sentence number so that all tokens of a sentence share one
position, would be faster. But that would also require a special version
of PhraseQuery that matches terms at the same position. Paragraphs can be
handled similarly.

The disadvantage of adding a new field over the same data is that the
term index is duplicated. This could be avoided by extending the index
format with index levels: one for normal use, one for sentences, one for
paragraphs, and so on.

Regards,
Paul Elschot
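
To make the two approaches above concrete, here is a rough sketch in plain
Python. It does not use Lucene at all; the `<eos>` marker, the splitting on
periods, and all function names are assumptions made for illustration only.
It compares the boundary-token check (no sentence marker between the two
terms) with the sentence-number check (both terms map to the same sentence
number) on an in-memory token list.

```python
# Plain-Python illustration only; nothing here is Lucene API.
BOUNDARY = "<eos>"  # assumed end-of-sentence marker token


def tokenize(text):
    """Split text into lowercase tokens, adding BOUNDARY after each sentence."""
    tokens = []
    for sentence in text.split("."):  # naive: sentences end at a period
        words = sentence.lower().split()
        if words:
            tokens.extend(words)
            tokens.append(BOUNDARY)
    return tokens


def sentence_numbers(tokens):
    """Map every token position to the number of the sentence containing it."""
    numbers, current = [], 0
    for tok in tokens:
        numbers.append(current)
        if tok == BOUNDARY:
            current += 1
    return numbers


def positions(tokens, term):
    """Return all positions at which term occurs."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def same_sentence_by_boundary(tokens, a, b):
    """Boundary-token check: a and b match if no BOUNDARY lies between them."""
    for i in positions(tokens, a):
        for j in positions(tokens, b):
            lo, hi = min(i, j), max(i, j)
            if BOUNDARY not in tokens[lo:hi]:
                return True
    return False


def same_sentence_by_number(tokens, a, b):
    """Sentence-number check: a and b match if they share a sentence number."""
    numbers = sentence_numbers(tokens)
    in_a = {numbers[i] for i in positions(tokens, a)}
    in_b = {numbers[j] for j in positions(tokens, b)}
    return bool(in_a & in_b)
```

Both checks should give the same answer for any pair of terms. The second
one only compares small sets of sentence numbers instead of scanning the
tokens between two positions, which is the intuition behind expecting the
sentence field to be faster.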