[ https://issues.apache.org/jira/browse/LUCENE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570979#action_12570979 ]
Mark Harwood commented on LUCENE-794:
-------------------------------------

>>This may be largely irrelevant, but Solr has a ConstantScorePrefixQuery which
>>has similar issues

No, very relevant. Only yesterday I had a user with exactly the same highlighting problem.

>>it seems we prob shouldn't even keep it as configurable.

Just drop it then? My nightmare scenario is systems where people are using ConstantScoreRangeQuery in their queries to do both latitude and longitude ranges over large areas. That's a lot of terms. I'd at least want the option of NOT loading them all into RAM at once when highlighting.

Maybe we could look at having different highlight "matchers". The existing approach of keeping a big bag of query terms becomes a "TermsMatcher" (it simply looks up tokens in a HashSet of terms). You can imagine a new "PrefixMatcher", which would examine tokens using "startsWith", and a "RangeMatcher", which would examine tokens using just a start and end term.

However, there's a danger we could end up re-implementing a lot of query logic, so maybe the relevant queries/filters could implement a "Matcher" interface. That would let the same logic that is used when scanning a TermEnum at query time be used by the Highlighter when looking at TokenStreams, i.e. something like this:

    interface Matcher {
        boolean matches(String value);
    }

It needs some more thought yet, but it could be an approach.
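To make the three matchers concrete, here is a minimal sketch in Java. The names TermsMatcher, PrefixMatcher and RangeMatcher come from the comment above, but the constructors and method bodies are assumptions made for illustration, as is the inclusive range ordered by String.compareTo. None of this is existing Lucene or Highlighter API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical interface proposed in the comment; not part of Lucene.
interface Matcher {
    boolean matches(String value);
}

// The existing "big bag of terms" approach: exact lookup in a HashSet.
final class TermsMatcher implements Matcher {
    private final Set<String> terms;

    TermsMatcher(Set<String> terms) {
        this.terms = new HashSet<String>(terms); // defensive copy
    }

    public boolean matches(String value) {
        return terms.contains(value);
    }
}

// Matches any token that starts with the prefix; no term list is held in RAM.
final class PrefixMatcher implements Matcher {
    private final String prefix;

    PrefixMatcher(String prefix) {
        this.prefix = prefix;
    }

    public boolean matches(String value) {
        return value.startsWith(prefix);
    }
}

// Matches any token between a start and end term.
// Assumption: both bounds are inclusive and compared with String.compareTo.
final class RangeMatcher implements Matcher {
    private final String lower;
    private final String upper;

    RangeMatcher(String lower, String upper) {
        this.lower = lower;
        this.upper = upper;
    }

    public boolean matches(String value) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }
}
```

With something along these lines, a highlighter could call matches() on each token text from the TokenStream. Only TermsMatcher needs memory proportional to the number of terms; the prefix and range variants keep one or two strings regardless of how many index terms the query expands to.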
> Extend contrib Highlighter to properly support PhraseQuery, SpanQuery, ConstantScoreRangeQuery
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-794
>                 URL: https://issues.apache.org/jira/browse/LUCENE-794
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Other
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: SpanHighlighter-01-26-2008.patch,
>                      SpanHighlighter-01-28-2008.patch, spanhighlighter.patch,
>                      spanhighlighter10.patch, spanhighlighter11.patch, spanhighlighter12.patch,
>                      spanhighlighter2.patch, spanhighlighter3.patch, spanhighlighter5.patch,
>                      spanhighlighter6.patch, spanhighlighter7.patch, spanhighlighter8.patch,
>                      spanhighlighter9.patch, spanhighlighter_24_January_2008.patch,
>                      spanhighlighter_patch_4.zip
>
>
> This patch adds a new Scorer class (SpanQueryScorer) to the Highlighter
> package that scores just like QueryScorer, but scores a 0 for Terms that did
> not cause the Query hit. This gives 'actual' hit highlighting for the range
> of SpanQuerys, PhraseQuery, and ConstantScoreRangeQuery. New Query types are
> easy to add. There is also a new Fragmenter that attempts to fragment without
> breaking up Spans.
> See http://issues.apache.org/jira/browse/LUCENE-403 for some background.
> There is a dependency on MemoryIndex.