[ 
https://issues.apache.org/jira/browse/SOLR-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301047#comment-17301047
 ] 

David Smiley commented on SOLR-15246:
-------------------------------------

Wow, that's some slow highlighting!

Highlighting all stored fields ({{hl.fl=*}}) is suspicious.  Are you sure you 
want to do that?  It's unclear what fields you are actually searching on; maybe 
you should just be highlighting those.  Consider setting 
{{hl.requireFieldMatch=true}}.
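
As a rough sketch of what such a request could look like, the snippet below builds a select URL that names the highlighted fields explicitly and turns on {{hl.requireFieldMatch}}. The host, port, and the helper name {{build_highlight_url}} are assumptions for illustration; the core name and the field names {{description}} and {{specification}} are taken from the logged requests, and whether those are the fields actually being searched is also an assumption.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host and port are assumed, core name is from the logs.
SOLR_SELECT = "http://localhost:8983/solr/holmes/select"


def build_highlight_url(query, fields, base=SOLR_SELECT):
    """Build a select URL that highlights only the named fields."""
    params = {
        "q": query,
        "hl": "on",
        "hl.method": "unified",
        # Explicit field list instead of hl.fl=*
        "hl.fl": ",".join(fields),
        # Only highlight a field if the query actually matched in it
        "hl.requireFieldMatch": "true",
        "hl.snippets": 2,
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Field names are assumed to be the ones being searched.
    print(build_highlight_url("test", ["description", "specification"]))
```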

It would be interesting to isolate the performance impact of the underlying 
BreakIterator implementation from the rest of the Highlighter's job by choosing 
the most trivial implementation.  If you set 
{{hl.bs.type=SEPARATOR&hl.bs.separator=.}} then I'd be interested to see how 
much of a difference you see.  It's not a realistic setting because I'm sure 
there are more periods than sentences, and I think the highlights won't show 
the final period either... but it's something to compare.
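
One way to set up that comparison is sketched below: two parameter sets that differ only in the break iterator settings, plus a helper that reads {{QTime}} out of a parsed JSON response. The base parameters mirror the first logged request; the helper names are made up for this example, and actually sending the requests is left out.

```python
from urllib.parse import urlencode

# Parameters mirroring the slow request from the issue description.
BASE_PARAMS = {
    "q": "test",
    "hl": "on",
    "hl.method": "unified",
    "hl.fl": "*",
    "hl.snippets": 2,
    "hl.maxAnalyzedChars": 1000000,
    "rows": 10,
}


def variant_params(separator=None):
    """Return the base params, optionally with the SEPARATOR break iterator."""
    params = dict(BASE_PARAMS)
    if separator is not None:
        params["hl.bs.type"] = "SEPARATOR"
        params["hl.bs.separator"] = separator
    return params


def build_url(base, params):
    """Append URL-encoded params to a select endpoint (base is an assumption)."""
    return base + "?" + urlencode(params)


def qtime_ms(response_json):
    """Extract QTime (milliseconds) from a parsed Solr JSON response."""
    return response_json["responseHeader"]["QTime"]
```

Running the same query once with {{variant_params()}} and once with {{variant_params(".")}}, then comparing the two {{QTime}} values, would show how much of the cost comes from the break iterator.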

If you shard your data more, you can do more highlighting in parallel.

> A unified highlighting search under solr 8.8.0/8.8.1 can take over 20 mins to 
> run and eventually times out.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15246
>                 URL: https://issues.apache.org/jira/browse/SOLR-15246
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: highlighter
>    Affects Versions: 8.8, 8.8.1
>         Environment: I was running solr under windows
>            Reporter: Matthew Flowerday
>            Priority: Minor
>
> With solr 8.8.0 a new unified highlighting parameter &hl.fragAlignRatio was 
> implemented which if not set defaults to 0.5. This attempts to improve the 
> highlighting so that highlighted text does not appear right at the left. 
> This works well but if you have a search result with numerous occurrences of 
> the word in question within the record performance goes right down!
> 2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf] 
> o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select 
> params=\{hl.snippets=2&q=test&hl=on&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&start=20&hl.fl=*&rows=10&_=1614405119134}
>  hits=57008 status=0 QTime=1414320
> 2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf] 
> o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we 
> are shutting down => org.eclipse.jetty.io.EofException
>               at 
> org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
> org.eclipse.jetty.io.EofException: null
>               at 
> org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279) 
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>               at 
> org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) 
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>               at 
> org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378) 
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>  
> when I set &hl.fragAlignRatio=0.25 results came back much quicker
> 2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params=\{hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.25&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>  hits=136939 status=0 QTime=87024
> And  &hl.fragAlignRatio=0.1
> 2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params=\{hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.1&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>  hits=136939 status=0 QTime=69033
> And &hl.fragAlignRatio=0.0
> 2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params=\{hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.0&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>  hits=136939 status=0 QTime=2841
> I left our setting at 0.0 – this is presumably how it was in 7.7.1 (fully left 
> aligned).  I am not too sure as to how many times a word has to occur in a 
> record for performance to go right down – but if too many it can have a BIG 
> impact.
> It might be an idea to set the default value to be say 0.25 instead of 0.5 so 
> that people are not caught out.
> I also noticed that setting &timeAllowed=90000 did not break out of the query 
> until it finished. Perhaps because the query finished quickly and what took 
> the time was the highlighting. It might be an idea to get &timeAllowed to 
> also cover any highlighting so that the query does not run until the jetty 
> timeout is hit. The machine ran one core at 100% for about 20 mins!
> I raised this at the request of a member of the user forum.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
