[ 
https://issues.apache.org/jira/browse/SOLR-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400988#comment-17400988
 ] 

Yael Mushinsky commented on SOLR-15586:
---------------------------------------

Tried to remove the code colored in green with small patch:

org.apache.solr.search.Grouping.CommandFunc:

@Override

    @SuppressWarnings(\{"unchecked"})

    protected void finish() throws IOException {

      if (secondPass != null)

{         result = secondPass.getTopGroups(0);

        {color:#00875a}*populateScoresIfNecessary();*  {color}    

}

 

I get the scores and everything seems working ... what is it's purpose?

 

> After upgrading to Solr 8.7 from 5.2.1 seems that asking for 'score' in 'fl' 
> causes the query to take almost twice (or so) the time (regression?)
> -------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15586
>                 URL: https://issues.apache.org/jira/browse/SOLR-15586
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.7
>            Reporter: Yael Mushinsky
>            Priority: Major
>
>  In performance tests conducted on the same system, comparing its performance 
> with Solr 5.2.1 verses 8.7, was found a degradation in the query execution 
> time.
> The system uses a custom query parser, which constructs quite a large query 
> object.
> When removing the ‘score’ parameter from the ‘fl’ list, the times were back 
> as in Solr 5.
> Trying to trouble shoot using jstack (small bash script that take a jstack of 
> the Solr thread every few milliseconds) found that the Solr 8 thread “spends” 
> time in the stack: (was appearing in several jstack snippets) 
> "qtp2038105753-19" #19 prio=5 os_prio=0 cpu=1837.18ms elapsed=10334.56s 
> tid=0x00007f7da0973000 nid=0x6ddd runnable  [0x00007f7d6d5f2000]
>    java.lang.Thread.State: RUNNABLE
>         at 
> sun.nio.ch.FileDispatcherImpl.pread0([java.base@13.0.2/Native|mailto:java.base@13.0.2/Native%3cmailto:java.base@13.0.2/Native]
>  Method)
>         at 
> sun.nio.ch.FileDispatcherImpl.pread([java.base@13.0.2/FileDispatcherImpl.java:54|mailto:java.base@13.0.2/FileDispatcherImpl.java:54%3cmailto:java.base@13.0.2/FileDispatcherImpl.java:54])
>         at 
> sun.nio.ch.IOUtil.readIntoNativeBuffer([java.base@13.0.2/IOUtil.java:274|mailto:java.base@13.0.2/IOUtil.java:274%3cmailto:java.base@13.0.2/IOUtil.java:274])
>         at 
> sun.nio.ch.IOUtil.read([java.base@13.0.2/IOUtil.java:245|mailto:java.base@13.0.2/IOUtil.java:245%3cmailto:java.base@13.0.2/IOUtil.java:245])
>         at 
> sun.nio.ch.FileChannelImpl.readInternal([java.base@13.0.2/FileChannelImpl.java:811|mailto:java.base@13.0.2/FileChannelImpl.java:811%3cmailto:java.base@13.0.2/FileChannelImpl.java:811])
>         at 
> sun.nio.ch.FileChannelImpl.read([java.base@13.0.2/FileChannelImpl.java:796|mailto:java.base@13.0.2/FileChannelImpl.java:796%3cmailto:java.base@13.0.2/FileChannelImpl.java:796])
>         at 
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:170)
>         at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:315)
>         at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:254)
>         at 
> org.apache.lucene.util.fst.ReverseRandomAccessReader.readByte(ReverseRandomAccessReader.java:33)
>         at org.apache.lucene.util.fst.FST.readLabel(FST.java:596)
>         at org.apache.lucene.util.fst.FST.readArc(FST.java:1259)
>         at org.apache.lucene.util.fst.FST.readNextRealArc(FST.java:1247)
>         at 
> org.apache.lucene.util.fst.FST.readFirstRealTargetArc(FST.java:1091)
>         at org.apache.lucene.util.fst.FST.findTargetArc(FST.java:1396)
>         at 
> org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:485)
>         at 
> org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekExact(FilterLeafReader.java:184)
>         at 
> org.apache.lucene.index.TermStates.loadTermsEnum(TermStates.java:124)
>         at org.apache.lucene.index.TermStates.build(TermStates.java:109)
>         at 
> org.apache.lucene.search.PhraseQuery$1.getStats(PhraseQuery.java:447)
>         at org.apache.lucene.search.PhraseWeight.<init>(PhraseWeight.java:38)
>         at org.apache.lucene.search.PhraseQuery$1.<init>(PhraseQuery.java:429)
>         at 
> org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:429)
>         at 
> org.apache.lucene.search.BoostQuery.createWeight(BoostQuery.java:125)
>         at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:726)
>         at 
> org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:63)
>         at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:231)
>         at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:726)
>         at 
> org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:563)
>         at 
> org.apache.solr.search.Grouping$Command.populateScoresIfNecessary(Grouping.java:593)
>         at 
> org.apache.solr.search.Grouping$CommandFunc.finish(Grouping.java:1007)
>         at org.apache.solr.search.Grouping.execute(Grouping.java:408)
>         at 
> org.apache.solr.handler.component.QueryComponent.doProcessGroupedSearch(QueryComponent.java:1486)
>         at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:395)
>         at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:360)
>  
> This doesn’t appear in any jstack snippet taken of the Solr 5 thread, when 
> ‘score’ is also asked for.
> Trying to look a bit into the code, I see that in solr 8 the calculation of 
> score “if needed” is performed as a separate phase, after the document 
> collection flow (might be that its like collecting the documents again in a 
> way? I/O etc. ) :
>  
> org.apache.solr.search.Grouping.execute():
>    if (!collectors.isEmpty()) {
>       Collector secondPhaseCollectors = 
> MultiCollector.wrap(collectors.toArray(new Collector[collectors.size()]));
>       if (collectors.size() > 0) {
>         if (cachedCollector != null) {
>           if (cachedCollector.isCached()) {
>             cachedCollector.replay(secondPhaseCollectors);
>           } else {
>             signalCacheWarning = true;
>             log.warn(String.format(Locale.ROOT, "The grouping cache is 
> active, but not used because it exceeded the max cache limit of %d percent", 
> maxDocsPercentageToCache));
>             log.warn("Please increase cache size or disable group caching.");
>             searchWithTimeLimiter(luceneFilter, secondPhaseCollectors);
>           }
>         } else {
>           if (pf.postFilter != null) {
>             pf.postFilter.setLastDelegate(secondPhaseCollectors);
>             secondPhaseCollectors = pf.postFilter;
>           }
>           searchWithTimeLimiter(luceneFilter, secondPhaseCollectors); 
> //->seems to me this is the main document collection flow
>         }
>         if (secondPhaseCollectors instanceof DelegatingCollector) {
>           ((DelegatingCollector) secondPhaseCollectors).finish();
>         }
>       }
>     }
>  
>     for (@SuppressWarnings(\{"rawtypes"})Command cmd : commands) {
>       cmd.finish();
>     }
>  
> org.apache.solr.search.Grouping.CommandFunc:
> @Override
>     @SuppressWarnings(\{"unchecked"})
>     protected void finish() throws IOException {
>       if (secondPass != null) {
>         result = secondPass.getTopGroups(0);
>         populateScoresIfNecessary();
>       }
>  
>  
> org.apache.solr.search.Grouping.Command<T>:
> protected void populateScoresIfNecessary() throws IOException {
>       if (needScores) {
>         for (GroupDocs<?> groups : result.groups) {
>           TopFieldCollector.populateScores(groups.scoreDocs, searcher, query);
>         }
>       }
>     }
>  
>  
> I know that asking for the score has a cost of performance, but in Solr 5 it 
> didn’t seem to be meaningful and in Solr 8 it seems to be very expansive.
> (In means of code, it seems the code is Solr 5 is different, I didn’t get 
> into finding exactly how and where it calculates the score to add to the 
> document upon request, but in means of timing, it doesn’t seem to have such 
> an effect.)
> Tried asking about this over the users mailing list, but didn't get an answer 
> ...
> Can you please assist in understanding the change in performance?
> Thank you.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to