: If you can express each phrase as a SpanNearQuery, the occurrences : of the phrases can be easily obtained by iterating over the result of : getSpans() on SpanNearQuery. : It's not as efficient as a specialized PhraseQuery, though.
I think you are missunderstanding his goal. (Assuming *I* understand it) what he's talking baout, is the ability for his search GUI to display suggested phrase searches you may want to try which consist of the words you just typed in grouped into phrases. For example, if a user types in... Lucene Erik Otis ...he will do a Boolean OR search on those words, and not suggest any phrases, if the user types in... Lucene Erik Otis Hatcher ...he will again do a boolean search on the individual words, but he wants to be able to suggest the more restrictive search consisisting of a phrase he found with the words "Erik" and "Hatcher" "Erik Hatcher" Otis Lucene ...but he doesn't want to suggest searching on phrases like "Erik Otis" or "Otis Lucene" unless those phrases actually appear in some documents. Presumably, if multiple phrases in the source data can be found in the permutations of hte search words, the least common are the ones you'd want to sugggest -- which makes the problem a sort of SIP problem (ie: given an extremely limited set of words, find the Statistically imporbably phrases in the corpus made using only subsets of those words) -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]