: At a high level, I'm trying to do some more intelligent searching using
: an app that will send multiple queries to Solr. My current issue is
: around multi-valued fields and determining which entry actually
: generated the "hit" for a particular query.

Strictly speaking, this isn't possible with normal queries: the underlying
data structures do not maintain any history about why a doc matches when
executing a Query. SpanQuery is a subclass of Query that can give you this
information, so a custom Solr plugin that used SpanTermQueries and
SpanNearQueries in place of TermQueries and PhraseQueries could generate
this kind of information -- but it comes at a cost (SpanQueries are not as
fast as their traditional counterparts).

The best you can do is use things like score Explanations and hit
highlighting, which mimic the logic used during a query to determine why a
doc (already identified) matched.

: Jane Smith, Bob Smith, Roger Smith, Jane Doe. If the user performs a
: search for Bob Smith, this document is returned. What I want to know is
: that this document was returned because of "Bob Smith", not because of
: Jane or Roger. I've tried using the highlighting settings. They do
: provide some help, as the Jane Doe entry doesn't come back highlighted,
: but both Jane and Roger do. I've tried using hl.requireFieldMatch, but
: that seems to pertain only to fields, not entries within a multi-valued
: field.

FWIW: if you are using q=Bob+Smith then "Jane Smith" and "Roger Smith"
*are* contributing to the result, because each term is matched on its own.

However, even if you are using a phrase search (q="Bob+Smith"), I do seem
to recall that the traditional highlighter highlights all of the terms in
the fields, even if the whole phrase isn't there. Historically that was
considered a feature (for the purpose of snippet generation people
frequently want to see that type of behavior), but I can understand why it
would cause you problems in your current use case.

As mentioned on the wiki, there is an "hl.usePhraseHighlighter" param you
can use to trigger a newer "SpanScorer" based highlighter, which takes
advantage of the previously mentioned SpanQuery logic to determine what to
highlight (even if the queries themselves weren't SpanQueries) ...
this param gets its name because, when dealing with phrase queries, it
only highlights them if the whole phrase is there.

http://wiki.apache.org/solr/HighlightingParameters

Compare the results of these two URLs when using the example
configs/data...

http://localhost:8983/solr/select/?hl.fragsize=0&hl.usePhraseHighlighter=false&df=features&q=%22Solr+Search%22&hl.snippets=1000&hl.requireFieldMatch=true&fl=features&hl=true&hl.fl=features

http://localhost:8983/solr/select/?hl.fragsize=0&hl.usePhraseHighlighter=true&df=features&q=%22Solr+Search%22&hl.snippets=1000&hl.requireFieldMatch=true&fl=features&hl=true&hl.fl=features

I think that may solve your particular problem.

-Hoss
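
For anyone who wants to script the comparison, here is a minimal Python
sketch that rebuilds the two URLs above so that they differ only in
hl.usePhraseHighlighter. The function names (highlight_url,
fetch_highlighting) are hypothetical, and the fetch step assumes that
adding wt=json returns a response with a top-level "highlighting" section;
adjust it if your setup differs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Select handler of the Solr example server, as used in the URLs above.
SOLR_SELECT = "http://localhost:8983/solr/select/"


def highlight_url(use_phrase_highlighter, query='"Solr Search"', field="features"):
    """Build a select URL; only hl.usePhraseHighlighter varies between calls."""
    params = [
        ("hl.fragsize", "0"),        # 0 = return the whole field value as one fragment
        ("hl.usePhraseHighlighter", "true" if use_phrase_highlighter else "false"),
        ("df", field),
        ("q", query),                # the quotes make this a phrase query
        ("hl.snippets", "1000"),     # large enough to cover every value of the field
        ("hl.requireFieldMatch", "true"),
        ("fl", field),
        ("hl", "true"),
        ("hl.fl", field),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def fetch_highlighting(url):
    """Assumption: wt=json yields a dict with a 'highlighting' key.

    Requires a running Solr instance; nothing here calls it automatically.
    """
    with urlopen(url + "&wt=json") as resp:
        return json.load(resp).get("highlighting", {})
```

Calling fetch_highlighting(highlight_url(False)) and
fetch_highlighting(highlight_url(True)) against a running example server
and comparing the two results should show which values are highlighted
only when the whole phrase is present.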