Unified highlighter

Julien Massiera Thu, 12 Jul 2018 01:49:37 -0700

Hi Solr community,

I would like some help with a strange behavior that I observe with the unified highlighter.


Here is the configuration of my highlighter:

<str name="hl">on</str>
<str name="hl.method">unified</str>
<str name="hl.defaultSummary">false</str>
<str name="hl.tag.pre">&lt;span class="em"&gt;</str>
<str name="hl.tag.post">&lt;/span&gt;</str>
<str name="hl.fl">content_fr content_en exactContent</str>
<str name="hl.requireFieldMatch">true</str>
<str name="hl.bs.type">CHARACTER</str>
<str name="hl.encoder">html</str>
<str name="hl.fragsize">200</str>
<str name="hl.maxAnalyzedChars">51200</str>


I indexed some HTML documents from the www.datafari.com website.

The problem is that on some documents (not all), there is not enough "context" wrapping the found search terms.

For example, when searching for "France labs", here is the highlighting obtained for a certain document:

"content_en":["<span class=\"em\">France</span> <span class=\"em\">Labs</span>"]

Now, if I perform the same query but with hl.bs.type set to SENTENCE instead of CHARACTER, I obtain the following highlighting for the same document:

This is much better, but I strongly prefer using the WORD or CHARACTER types because the highlighting can be too long with the SENTENCE or LINE types, depending on the indexed documents.

I tried changing hl.bs.type to WORD, and also increasing hl.fragsize up to 1000, but with any hl.bs.type other than SENTENCE or LINE, the highlighting is limited to the matched words only, which is not enough for what I need.
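
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the variants described above could be sent as per-request overrides. The host, port, and collection name are placeholders (assumptions, not the real setup), and the overrides only take effect if the highlighting parameters are declared as defaults rather than invariants in the request handler.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"

# Highlighting settings copied from the configuration shown earlier.
BASE_PARAMS = {
    "q": "France labs",
    "hl": "on",
    "hl.method": "unified",
    "hl.fl": "content_fr content_en exactContent",
    "hl.requireFieldMatch": "true",
    "hl.fragsize": "200",
}


def build_url(bs_type, fragsize=None):
    """Return a select URL that overrides hl.bs.type (and optionally hl.fragsize)."""
    params = dict(BASE_PARAMS)
    params["hl.bs.type"] = bs_type
    if fragsize is not None:
        params["hl.fragsize"] = str(fragsize)
    return SOLR_SELECT + "?" + urlencode(params)


# The three variants mentioned in this message.
for bs_type, fragsize in [("CHARACTER", 200), ("WORD", 1000), ("SENTENCE", 200)]:
    print(build_url(bs_type, fragsize))
```

Each printed URL can be opened in a browser or passed to curl to compare the "highlighting" section of the responses side by side.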

Is there something I am missing in the configuration? For information, I am using Solr 6.6.4.


Thanks for your help.

Julien
