[jira] Commented: (LUCENE-1999) Match spotter for all query types

Mark Harwood (JIRA) Wed, 21 Oct 2009 03:58:27 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768173#action_12768173
 ]


Mark Harwood commented on LUCENE-1999:
--------------------------------------

bq. couldn't the wrapper make a separate data structure for tracking which 
clause matched 

I was trying to keep the processing cost super-low with no object allocations 
because this is in a very tight loop. We don't really want to be generating a 
lot of state/processing while we're still evaluating potentially millions of 
candidate matches.
That seems to be the main challenge of doing this instrumentation in-line with 
the query execution.

bq. Also: doesn't highlighter run, separately, on each doc? And so it's OK if 
the scores are affected?

The use case I'm tackling right now involves search forms with lots of optional 
fields (spatial, numeric, "choice" etc.) and I only need a yes/no match flag 
for each field. This approach should give me these answers back immediately 
without impacting query processing speeds significantly.
However, I can see the value in core Lucene capturing a richer data structure 
than a simple boolean when you choose to do a separate "highlight" pass on the 
top N documents. This suggests that you might need two query expressions: 
one for execution and one for adding highlighter instrumentation. I suppose the 
client could instead add the instrumentation requests to the initial query, 
where they stay passive during a Lucene "results-selection" mode and become 
active in "highlight mode".
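
As a rough illustration of the passive/active idea, here is a minimal sketch in 
plain Java. It does not use any Lucene classes: DocMatcher, MatchRecorder and 
InstrumentedClause are hypothetical names made up for this example, and a real 
version would have to wrap Query/Scorer objects instead. The wrapper always 
delegates the match test, but only records a flag when the shared recorder is 
in highlight mode, and it uses a primitive bitmask so nothing is allocated per 
candidate document.

```java
/** Hypothetical stand-in for "does this clause match this doc?" (not a Lucene API). */
interface DocMatcher {
    boolean matches(int docId);
}

/** Shared state: the current mode plus one bit per instrumented clause. */
final class MatchRecorder {
    enum Mode { RESULTS_SELECTION, HIGHLIGHT }

    Mode mode = Mode.RESULTS_SELECTION;
    int flags; // primitive bitmask, so recording a match allocates nothing

    /** Call before evaluating each document in highlight mode. */
    void reset() {
        flags = 0;
    }
}

/** Wraps any clause; passive in results-selection mode, active in highlight mode. */
final class InstrumentedClause implements DocMatcher {
    private final DocMatcher wrapped;
    private final int bit;
    private final MatchRecorder recorder;

    InstrumentedClause(DocMatcher wrapped, int clauseIndex, MatchRecorder recorder) {
        this.wrapped = wrapped;
        this.bit = 1 << clauseIndex; // supports up to 32 clauses in this sketch
        this.recorder = recorder;
    }

    @Override
    public boolean matches(int docId) {
        boolean hit = wrapped.matches(docId);
        // Only pay for the bookkeeping when the caller asked for it.
        if (hit && recorder.mode == MatchRecorder.Mode.HIGHLIGHT) {
            recorder.flags |= bit;
        }
        return hit;
    }
}
```

With something like this the same wrapped query could be run once for results 
selection and then again over the top N documents with the mode switched, 
reading recorder.flags after each document.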



> Match spotter for all query types
> ---------------------------------
>
>                 Key: LUCENE-1999
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1999
>             Project: Lucene - Java
>          Issue Type: New Feature
>    Affects Versions: 2.9
>            Reporter: Mark Harwood
>         Attachments: matchflagger.patch
>
>
> Related to LUCENE-1929 and the current inability to highlight 
> NumericRangeQuery, spatial, cached term filters and other exotica.
> This patch provides the ability to wrap *any* Query objects and record match 
> info as flags encoded in the overall document score.
> Using this approach it would be possible to understand (and therefore 
> highlight) which fields matched clauses in a query.
> The match encoding approach loses some precision in scores as noted here: 
> http://tinyurl.com/ykt8nx7
> Avoiding these precision issues would require a change to Lucene core to 
> record docId, score AND a matchFlag byte in ScoreDoc objects and collector 
> APIs.
> This may be something we should consider.
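
To make the precision trade-off in the description above concrete, here is a 
minimal sketch of one way a flag byte could be stashed in a float score. This 
is an assumption for illustration only and not necessarily how the attached 
patch encodes its flags: the ScoreFlags class and its methods are hypothetical. 
It overwrites the 8 lowest mantissa bits of the score, which leaves 15 of the 
23 mantissa bits for the score itself, and it assumes a positive, finite score.

```java
/** Hypothetical helper: hides up to 8 match flags in the low bits of a float score. */
final class ScoreFlags {
    private static final int FLAG_BITS = 8; // one flag byte
    private static final int FLAG_MASK = (1 << FLAG_BITS) - 1;

    private ScoreFlags() {}

    /** Replaces the low 8 mantissa bits of the score with the given flags. */
    public static float encode(float score, int flags) {
        int bits = Float.floatToRawIntBits(score);
        return Float.intBitsToFloat((bits & ~FLAG_MASK) | (flags & FLAG_MASK));
    }

    /** Reads the flag byte back out of an encoded score. */
    public static int decode(float encoded) {
        return Float.floatToRawIntBits(encoded) & FLAG_MASK;
    }

    /** True if the clause with this index (0 to 7) was flagged as a match. */
    public static boolean matched(float encoded, int clauseIndex) {
        return (decode(encoded) & (1 << clauseIndex)) != 0;
    }
}
```

The encoded value stays within a relative error of about 2^-15 of the original 
score, so ranking is mostly preserved, but documents whose scores differ only 
in those low bits can swap order. Carrying the flags as a separate byte 
alongside docId and score, as suggested above, would avoid that.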

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
