Hi David

First of all I wanted to say I'm working off your book!!  Third edition,
and I think it's a bit out of date now. I was just going to try following
the section on the Postings highlighter, but I see that's been absorbed
into the Unified highlighter. I find your book easier to follow than the
official documentation though.

I am going to try to configure the unified highlighter, and I will add that
storeOffsetsWithPositions to the schema (which I saw in your book) and I
will try indexing again from scratch.  Was getting some funny things going
on where I thought I'd turned highlighting off and it was still giving me
highlights.

Actually just re-reading your email again, are you saying that you can't
configure highlighting in solrconfig.xml? That's where I've always configured
the original highlighter, in my dismax search handler. Am I supposed to add
the highlighting parameters to each request instead?

Thanks
Shaun

On Mon, 11 Jan 2021 at 20:57, David Smiley <dsmi...@apache.org> wrote:

> Hello!
>
> I worked on the UnifiedHighlighter a lot and want to help you!
>
> On Mon, Jan 11, 2021 at 9:58 AM Shaun Campbell <campbell.sh...@gmail.com>
> wrote:
>
> > I've been using highlighting for a while, using the original highlighter,
> > and just come across a problem with fields that contain a large amount of
> > text, approx 250k characters. I only have about 2,000 records but each one
> > contains a journal publication to search through.
> >
> > What I noticed is that some records didn't return a highlight even though
> > they matched on the content. I noticed the hl.maxAnalyzedChars parameter
> > and increased it, which allowed some records to be highlighted, but not
> > all, and it then caused memory problems on the server.  Performance is also
> > very poor.
> >
>
> I've been thinking hl.maxAnalyzedChars should maybe default to no limit --
> it's a performance threshold but perhaps better to opt-in to such a limit
> than scratch your head for a long time wondering why a search result isn't
> showing highlights.
>
>
> > To try to fix this I've tried to configure the unified highlighter in my
> > solrconfig.xml instead. It seems to be working but again I'm missing some
> > highlighted records.
> >
>
> There is no configuration of that highlighter in solrconfig.xml; it's
> entirely parameter driven (runtime).
>
>
> > The other thing is I've tried to adjust my unified highlighting settings in
> > solrconfig.xml and they don't seem to be having any effect even after
> > restarting Solr.  I was just wondering whether there is any highlighting
> > information stored at index time. It's taking over 4 hours to index my
> > records so it's not easy to keep reindexing my content.
> >
> > Any ideas on how to handle highlighting of large content would be
> > appreciated.
> >
> > Shaun
> >
>
> Please read the documentation here thoroughly:
>
> https://lucene.apache.org/solr/guide/8_6/highlighting.html#the-unified-highlighter
> (or earlier version as applicable)
> Since you have large bodies of text to highlight, you would strongly
> benefit from putting offsets into the search index (and re-index) --
> storeOffsetsWithPositions.  That's an option on the field/fieldType in your
> schema; it may not be obvious reading the docs.  You have to opt-in to
> that; Solr doesn't normally store any info in the index for highlighting.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
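
The two suggestions in the quoted reply (highlighting driven by request
parameters, and offsets stored in the index via storeOffsetsWithPositions)
could look roughly like the sketch below. It is only an illustration under
assumptions, not a tested setup: the field name "content", the field type
"text_general", the helper function names, and the choice of the Schema API
rather than editing the schema file by hand are all hypothetical. A full
reindex is still needed after the field definition changes.

```python
import json
from urllib.parse import urlencode


def unified_highlight_query(q, field, max_analyzed_chars=None):
    """Build the query string for a select request that asks for
    highlights from the unified highlighter on one field."""
    params = {
        "q": q,
        "hl": "true",            # turn highlighting on for this request
        "hl.method": "unified",  # pick the unified highlighter
        "hl.fl": field,          # field to highlight (assumed name)
    }
    # Only send the limit when the caller explicitly opts in to one.
    if max_analyzed_chars is not None:
        params["hl.maxAnalyzedChars"] = str(max_analyzed_chars)
    return urlencode(params)


def offsets_field_command(name, field_type):
    """Build a Schema API JSON body that redefines a field so that
    offsets are stored alongside positions in the index."""
    return json.dumps({
        "replace-field": {
            "name": name,              # assumed field name
            "type": field_type,        # assumed field type
            "indexed": True,
            "stored": True,
            "storeOffsetsWithPositions": True,
        }
    })


if __name__ == "__main__":
    print(unified_highlight_query("content:heart", "content", 500000))
    print(offsets_field_command("content", "text_general"))
```

The first string would be appended to the collection's select URL, and the
second would be POSTed to the collection's schema endpoint; the collection
name and host are left out here because they are not given in the thread.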
