Re: Exact and non exact highlighting

David Smiley Fri, 22 Jan 2021 12:16:39 -0800

I'm very familiar with using the Unified Highlighter on a project with this
requirement.  The main "trick" we used was to use only one field but analyze
it both ways with a term differentiator (e.g. a leading symbol), coupled
with a custom query parser that knows a phrase query is to be highlighted
using the "exact" analysis as opposed to the stemmed/approximate analysis.
As one can imagine, there was a lot of custom code involved here for many
search requirements; this complexity wasn't just for the highlighting
matter.  Anyway, using one stored field and multiple indexed fields
(ignoring their stored content, if any) is a known feature request:
https://issues.apache.org/jira/browse/SOLR-1105  There's even a patch.  I
would love to help get this feature into Solr if you want to take over
there!  The patch needs some work; I really disagree with touching the Solr
schema.  If you are up for it, comment on that issue to let the original
contributor know you want to help move this forward.  Maybe they do too.
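
To make the single-field idea concrete, here is a minimal Python sketch.  It
is not the project's code and does not use Lucene or Solr APIs; the "="
prefix, the toy stemmer, and all function names are assumptions made up for
illustration.  Each word is indexed twice at the same position (stemmed, and
exact with the prefix), quoted phrases are parsed to exact terms, and bare
words to stemmed terms, so one pass over one field yields one set of tags.

```python
import re

EXACT_PREFIX = "="  # hypothetical differentiator symbol for exact terms


def stem(token):
    # Toy stemmer standing in for a real stemming filter (assumption).
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def analyze(text):
    """Map position -> {"terms": {stemmed, exact}, "span": (start, end)}."""
    by_pos = {}
    for pos, match in enumerate(re.finditer(r"\w+", text)):
        word = match.group().lower()
        by_pos[pos] = {
            "terms": {stem(word), EXACT_PREFIX + word},
            "span": (match.start(), match.end()),
        }
    return by_pos


def parse_query(query):
    """Quoted phrases become exact-term lists; bare words become stemmed terms."""
    phrases, terms = [], []
    for phrase, word in re.findall(r'"([^"]+)"|(\w+)', query):
        if phrase:
            phrases.append([EXACT_PREFIX + w.lower() for w in phrase.split()])
        else:
            terms.append(stem(word.lower()))
    return phrases, terms


def highlight(text, query, pre="<em>", post="</em>"):
    by_pos = analyze(text)
    phrases, terms = parse_query(query)
    hits = set()
    # Stemmed (approximate) matches for bare query words.
    for pos, info in by_pos.items():
        if info["terms"] & set(terms):
            hits.add(pos)
    # Exact matches for phrases: consecutive positions, exact terms only.
    for phrase in phrases:
        for pos in range(len(by_pos) - len(phrase) + 1):
            if all(phrase[i] in by_pos[pos + i]["terms"] for i in range(len(phrase))):
                hits.update(range(pos, pos + len(phrase)))
    # Insert tags back to front so earlier offsets stay valid.
    out = text
    for pos in sorted(hits, reverse=True):
        start, end = by_pos[pos]["span"]
        out = out[:start] + pre + out[start:end] + post + out[end:]
    return out


print(highlight("Hello World tests, hello worlds test", '"Hello World" Test'))
```

In this sketch "hello worlds" is not tagged because the phrase only matches
the exact terms, while "tests" is tagged through the stemmed term for Test.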


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 22, 2021 at 12:46 PM df2832368_...@amberoad.de
df2832368_...@amberoad.de <j...@amberoad.de> wrote:

> Hello folks,
>
> I am currently working on an issue where we need to enable exact
> highlighting on a text field.
>
> The only problem is that it should also be possible for parts of the
> query not to be exact (e.g. "Hello World" Test, so "Hello
> World" needs to be an exact match, but tests would also match test).
>
> We have a text field with our normal analyzer pipeline (stemming, ...) and
> a copy field which has a reduced pipeline (lowercase filter).
>
> For searching this does its job fine and only returns the correct results
> by translating the query to its respective fields (e.g.
> -> text_exact:"Hello World" AND text:Test)
>
> Now the problem: The highlighting is now split into the two text fields
> (which makes sense). So we somehow want to combine those two highlights
> (they have the same stored text) to get appropriate "tags" and also scores.
>
> I haven't found a neat solution to this problem so far and would like to
> ask if someone has done something similar or has a clear idea on what to do.
>
> I have tried to tinker a bit with our custom extension of the unified
> highlighter and tried to somehow merge the passages returned by the
> highlighter. But this is quite tedious and error-prone. The next idea was
> to do a two-step process by first getting the positions of the exact match
> in the text_exact field and afterwards somehow keeping only highlights that
> contain these positions. (But I suppose this idea would still not solve
> the "tag" (<em>/</em>) problem.)
>
> I would be grateful for any help you could offer.
>
> Jan
