Solr version: 6.5.x

Why do hl.fl and df need to be the same for correct highlighting?
Suppose I am highlighting on a field, fieldA, which has a stemming filter in its analysis chain.

Sample doc:

    {"id":"1", "fieldA":"Vacation"}

If I then send this highlighting request:

    "params":{
      "q":"Vacation",
      "hl":"on",
      "indent":"on",
      "hl.fl":"fieldA",
      "wt":"json"}

highlighting doesn't work. The query is analysed against the default field _text_ (type text_general), where "Vacation" remains "Vacation", while in the index for fieldA it is stored as "vacat".

I debugged through the code. At HighlightComponent::169:

    highlightQuery = rb.getQparser().getHighlightQuery();

the highlightQuery that gets passed on is the analysed form of the query as parsed, in this case _text_:Vacation.

Fast-forwarding to WeightedSpanTermExtractor::extractWeightedTerms::366:

    for (final Term queryTerm : nonWeightedTerms) {
      if (fieldNameComparator(queryTerm.field())) {
        WeightedSpanTerm weightedSpanTerm = new WeightedSpanTerm(boost, queryTerm.text());
        terms.put(queryTerm.text(), weightedSpanTerm);
      }
    }

the extracted term is "Vacation".

Jumping to the core highlighting code, Highlighter::getBestTextFragments::213:

    TokenGroup tokenGroup = new TokenGroup(tokenStream);

The tokenStream for fieldA contains the analysed token "vacat", which obviously doesn't match the extracted term "Vacation".

Why do the df and qf values have any bearing on what we pass in hl.fl? Shouldn't the query to be highlighted be analysed by the field passed in hl.fl? But then, multiple fields can be passed in hl.fl.

Just wondering how this is supposed to be done. Any explanation will be fine.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
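
P.S. To make the mismatch described above concrete, here is a minimal toy model of it in plain Java. It is not Solr or Lucene code: the class HighlightMismatchSketch, the ANALYZERS map, and the crude "strip a trailing ion" rule are all hypothetical stand-ins, assumed only for illustration. The point it tries to show is that the terms extracted from the query are analysed by the query's field (df), while the tokens being highlighted are analysed by the hl.fl field, so the two only line up when both go through the same analysis.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class HighlightMismatchSketch {

    // Toy per-field "analyzers". Hypothetical stand-ins, NOT Solr's real analysis chains.
    static final Map<String, Function<String, String>> ANALYZERS = new HashMap<>();
    static {
        // Stand-in for the default field: leaves the token unchanged, as described in the post.
        ANALYZERS.put("_text_", token -> token);
        // Stand-in for fieldA: lowercases and strips a trailing "ion" to mimic a stemmer.
        ANALYZERS.put("fieldA", token -> {
            String lower = token.toLowerCase(Locale.ROOT);
            return lower.endsWith("ion") ? lower.substring(0, lower.length() - 3) : lower;
        });
    }

    // Splits text on whitespace and runs each token through the field's toy analyzer.
    static List<String> analyze(String field, String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(ANALYZERS.get(field).apply(raw));
            }
        }
        return tokens;
    }

    // Query terms are analysed by queryField (the df role); document tokens are
    // analysed by highlightField (the hl.fl role). Only exact term matches count.
    static List<String> highlightedTokens(String queryField, String highlightField,
                                          String query, String storedText) {
        Set<String> queryTerms = new HashSet<>(analyze(queryField, query));
        List<String> hits = new ArrayList<>();
        for (String token : analyze(highlightField, storedText)) {
            if (queryTerms.contains(token)) {
                hits.add(token);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Query analysed by _text_, document tokens by fieldA: no overlap.
        System.out.println(highlightedTokens("_text_", "fieldA", "Vacation", "Vacation")); // prints []
        // Query and document tokens both analysed by fieldA: the terms match.
        System.out.println(highlightedTokens("fieldA", "fieldA", "Vacation", "Vacation")); // prints [vacat]
    }
}
```

In this toy model the first call corresponds to the failing request (no df, so the query goes through _text_), and the second to the case where df and hl.fl name the same field.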