A brief follow-up: On 7/6/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
3) using the example app with this patch, i tried both of these...
...on my port, i get no highlighting for any doc, even though in both cases there are docs in the result that have the simple term "video" in the features field ... am i doing something wrong, or is this a bug?
Looking at this, it seems that the implementation will only highlight the first segment of a multi-valued field. The Highlighter behaves similarly.
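To make the multi-valued case concrete, here is a minimal sketch of visiting every value of a field rather than only the first. It is an illustration only: the class and method names are hypothetical, and the plain regex match stands in for real analysis. It uses neither the patch's code nor the Highlighter API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the patch or of Lucene.
final class MultiValuedHighlightSketch {

    // Wraps each whole-word, case-insensitive occurrence of term in <em> tags.
    // A regex is an assumption here; the real code would go through an Analyzer.
    static String highlightValue(String value, String term) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(value);
        return m.replaceAll("<em>$0</em>");
    }

    // Visits every value of the field instead of stopping at the first one,
    // and keeps only the values in which the term was actually found.
    static List<String> highlightAll(List<String> values, String term) {
        List<String> snippets = new ArrayList<>();
        for (String value : values) {
            String highlighted = highlightValue(value, term);
            if (!highlighted.equals(value)) {
                snippets.add(highlighted);
            }
        }
        return snippets;
    }
}
```

With a features field holding several values, a term such as "video" that appears only in the second or third value would still produce a snippet under this approach.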
4) if you open a LUCENE-XXXX issue for the QueryTermExtractor issue with DisjunctionMaxQueries, and attach a patch with testcases, and no one objects, i'll commit it :) ... but in the meantime i'm curious what it is about QueryTermExtractor that doesn't work -- it looks like its fallback behavior is to use "query.extractTerms" and that should do the same thing as your getTermsFromDisjunctionMaxQuery right?
I see the issue now. Between the first version of the patch and the current Solr trunk, Solr was updated to use a newer Lucene (which contains the extractTerms fallback). The old codebase required the modification.
5) perhaps a better way to pick the "default" fields if highlightFields isn't specified would be using the return fields being used (the "fl" param) rather than the default search field in StandardRequestHandler and the "qf" fields in DisMaxHandler ?
Tempting, but I think that using qf/df provides slightly more natural default behaviour. The returned field list often has many non-analyzable or non-text fields which we don't necessarily want to summarize, while the query fields are guaranteed to at least potentially contain the highlighted term (as we are searching them). At least in my use cases, highlighting is a cue that a search has found a term, hence it only makes sense to use in searched fields.
I'll take a look at supporting multi-valued fields. It's something that I would be keen to work on as a further refinement to the base patch instead of being part of this one, if possible.
cheers,
-Mike
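As a rough illustration of the qf/df default described above, the selection could be sketched as follows. The class and method names are hypothetical, and the assumption that highlightFields is a comma- or whitespace-separated list is mine. Only the "field^boost" form of qf entries is taken from DisMax.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the default field choice; not the patch's code.
final class DefaultHighlightFieldsSketch {

    // An explicit highlightFields value wins; otherwise fall back to the
    // qf fields (DisMax), and finally to the default search field (df).
    static List<String> chooseFields(String highlightFields, String qf, String df) {
        if (highlightFields != null && !highlightFields.trim().isEmpty()) {
            return splitFields(highlightFields);
        }
        if (qf != null && !qf.trim().isEmpty()) {
            return splitFields(qf);
        }
        List<String> fields = new ArrayList<>();
        if (df != null && !df.trim().isEmpty()) {
            fields.add(df.trim());
        }
        return fields;
    }

    // Splits on commas or whitespace and strips any "^boost" suffix,
    // so "name^2.0 features" yields [name, features].
    static List<String> splitFields(String spec) {
        List<String> fields = new ArrayList<>();
        for (String token : spec.trim().split("[\\s,]+")) {
            int caret = token.indexOf('^');
            String name = caret >= 0 ? token.substring(0, caret) : token;
            if (!name.isEmpty()) {
                fields.add(name);
            }
        }
        return fields;
    }
}
```

The fl list never enters the decision here, which is the point of the argument: only fields that were searched are candidates for highlighting.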