https://issues.apache.org/jira/browse/SOLR-10321 -- near the end, my opinion is that we should just omit the field if there is no highlight, which would remove the need for your workaround. Glob or no glob. PR welcome!

It's satisfying to see that the Unified Highlighter is so much faster than the original. I aim to make UH the default in 9.0. SOLR-12901 <https://issues.apache.org/jira/browse/SOLR-12901>

It's kinda depressing that the weightMatcher mode is slow when there are many fields, because I was hoping this choice might eventually be permanent in order to obsolete lots of code in the highlighter. I can guess why it's slow -- and I filed an issue -- https://issues.apache.org/jira/browse/LUCENE-9712 -- a tough one! Don't expect anything from me there for the foreseeable future. It'd take either some ugly hack that has some limited qualifications, or a substantial rewrite of much of the UH. At least there's the classic non-weightMatcher mode, which works faithfully, albeit with some of its own gotchas around obscure/custom query compatibility.

You said the original highlighter performs at ~1.5 seconds. For the UH, I suspect your offset source is postings from the index, given the fantastic numbers you get with it; right? For curiosity's sake, can you please set hl.offsetSource=ANALYSIS and tell me what speed you get? Set hl.weightMatches=false as well. My hope is that it's still substantially better than the original highlighter.

Just because hl.requireFieldMatch=false is the default doesn't mean it's the _right_ choice for everyone's app :-). I tend to think Solr should flip this in 9.0 for the sake of both accuracy & performance.

And unset hl.maxAnalyzedChars -- mostly an obsolete safety with the UH being so much faster.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

On Fri, Jan 29, 2021 at 2:46 AM Kerwin <kerwin...@gmail.com> wrote:

> On another note, since response time is in question, I have been using a
> custom highlighter just to override the method encodeSnippets() in the
> UnifiedSolrHighlighter class since Solr 6, because Solr sends back a blank
> array (ZERO_LEN_STR_ARRAY) in the response payload for fields that do not
> match.
> Here is the code before:
>
>     if (snippet == null) {
>       //TODO reuse logic of DefaultSolrHighlighter.alternateField
>       summary.add(field, ZERO_LEN_STR_ARRAY);
>     } ....
>
> So I had removed this clause and made the following change:
>
>     if (snippet != null) {
>       // we used a special snippet separator char and we can now split on it.
>       summary.add(field, snippet.split(SNIPPET_SEPARATOR));
>     }
>
> This has not changed in Solr 8 too, which for 76 fields gives a very large
> payload. So I will keep this custom code for now.
>
> On Fri, Jan 29, 2021 at 12:28 PM Kerwin <kerwin...@gmail.com> wrote:
>
>> Hi David,
>>
>> Thanks so much for your reply.
>> hl.weightMatches was indeed the culprit. After setting it to false, I am
>> now getting the same sub-second response as Solr 6. I am using Solr 8.6.1
>> (<luceneMatchVersion>8.6.1</luceneMatchVersion>)
>>
>> Here are the tests I carried out:
>> hl.requireFieldMatch=true&hl.weightMatches=true (2458 ms)
>> hl.requireFieldMatch=false&hl.weightMatches=true (3964 ms)
>> hl.requireFieldMatch=true&hl.weightMatches=false (158 ms)
>> hl.requireFieldMatch=false&hl.weightMatches=false (169 ms) (CHOSEN since
>> this is consistent with our earlier setting).
>>
>> Thanks again, I will inform our other teams as well doing the Solr
>> upgrade to check the CHANGES.txt doc related to this.
>>
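
For anyone reading this thread later, here is a minimal stand-alone sketch of the difference between the two behaviours discussed above: emitting an empty array for a field with no highlight versus leaving that field out of the summary. It does not use Solr classes. Plain Java collections stand in for the response types, and the class name, the encode() method, the omitEmptyFields flag and the separator value are all assumptions made for illustration, not Solr API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: every name here is hypothetical and none of it is Solr API.
class SnippetEncodingSketch {

    // Assumed separator used to join several snippets of one field into a
    // single string; the value used inside Solr may differ.
    static final String SNIPPET_SEPARATOR = "\u0000";

    // Builds a per-document highlight summary. A missing (null) entry in
    // joinedSnippets means "no highlight for this field".
    static Map<String, String[]> encode(List<String> fields,
                                        Map<String, String> joinedSnippets,
                                        boolean omitEmptyFields) {
        Map<String, String[]> summary = new LinkedHashMap<>();
        for (String field : fields) {
            String snippet = joinedSnippets.get(field);
            if (snippet != null) {
                // Split the joined string back into individual snippets.
                summary.put(field, snippet.split(SNIPPET_SEPARATOR));
            } else if (!omitEmptyFields) {
                // Behaviour described in the thread: an empty array per unmatched field.
                summary.put(field, new String[0]);
            }
            // With omitEmptyFields=true, unmatched fields are simply left out.
        }
        return summary;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("title", "body", "notes");
        Map<String, String> snippets = new HashMap<>();
        snippets.put("title", "a <em>quick</em> fox" + SNIPPET_SEPARATOR + "the <em>quick</em> one");

        System.out.println(encode(fields, snippets, false).keySet());
        System.out.println(encode(fields, snippets, true).keySet());
    }
}
```

With many requested fields and few matches (76 fields in the case above), the second variant returns only the fields that actually have snippets, which is where the payload saving comes from.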