Ivan Provalov created SOLR-9942:
-----------------------------------

             Summary: MoreLikeThis Performance Degraded With Filtered Query
                 Key: SOLR-9942
                 URL: https://issues.apache.org/jira/browse/SOLR-9942
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: MoreLikeThis
    Affects Versions: 5.5.2
            Reporter: Ivan Provalov


Without any filters, MLT performs normally.  With any filter added, 
performance degrades (2.5-3.0X slower in our case).  The issue goes away after 
upgrading to 6.0.  The hot method is Lucene's DisiPriorityQueue downHeap(), 
which receives 5X more calls in 5.5.2 than in 6.0.  My guess is that some of 
the Solr filter refactoring fixed this for the 6.0 release.
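
For readers unfamiliar with the hot method: downHeap() is the standard 
"sift down" step of a binary heap, and it runs each time the top entry of the 
queue is updated. The sketch below is a generic array-backed min-heap over 
ints, not Lucene's DisiPriorityQueue; the class name, the int payload, and the 
call counter are all assumptions made for illustration. It only shows why a 
higher number of top updates translates directly into more downHeap() calls.

```java
/** Generic min-heap sketch; NOT Lucene's DisiPriorityQueue. */
final class DownHeapSketch {
    private final int[] heap;
    private int size;
    private long downHeapCalls; // hypothetical counter, for illustration only

    DownHeapSketch(int capacity) {
        heap = new int[capacity];
    }

    /** Appends a value and sifts it up to its place. */
    void add(int value) {
        int i = size++;
        heap[i] = value;
        while (i > 0) {
            int parent = (i - 1) >>> 1;
            if (heap[parent] <= heap[i]) {
                break;
            }
            swap(i, parent);
            i = parent;
        }
    }

    int top() {
        return heap[0];
    }

    /** Overwrites the smallest entry, then restores heap order. */
    void updateTop(int newValue) {
        heap[0] = newValue;
        downHeap();
    }

    long getDownHeapCalls() {
        return downHeapCalls;
    }

    /** Sifts the root down until both children are no smaller than it. */
    private void downHeap() {
        downHeapCalls++;
        int i = 0;
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (left < size && heap[left] < heap[smallest]) {
                smallest = left;
            }
            if (right < size && heap[right] < heap[smallest]) {
                smallest = right;
            }
            if (smallest == i) {
                return;
            }
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int a, int b) {
        int tmp = heap[a];
        heap[a] = heap[b];
        heap[b] = tmp;
    }

    public static void main(String[] args) {
        DownHeapSketch queue = new DownHeapSketch(4);
        queue.add(5);
        queue.add(1);
        queue.add(3);
        queue.updateTop(10); // one top update costs one downHeap() call
        System.out.println(queue.top() + " after " + queue.getDownHeapCalls() + " call(s)");
    }
}
```

Each downHeap() call costs O(log n) comparisons for n queue entries, so a 5X 
increase in calls on a large disjunction would plausibly show up in a 
profiler the way the report describes.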

As a work-around, for now I just refactored the custom MLT handler to convert 
the filters into boolean clauses, which takes care of the issue.   
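
The report does not include the refactored handler, so the following is only 
a sketch of the idea at the level of Lucene query syntax, not the reporter's 
code. It folds each filter into the main query as a required (+) clause 
instead of passing it separately. The class name, method name, field 
"category", and the example terms are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

/** Illustrative only: rewrites "query + filters" as one boolean query string. */
final class FilterToClauseSketch {

    /** Wraps the main query and every filter in its own required (+) group. */
    static String foldFilters(String mainQuery, List<String> filterQueries) {
        StringJoiner combined = new StringJoiner(" ");
        combined.add("+(" + mainQuery + ")");
        for (String filterQuery : filterQueries) {
            combined.add("+(" + filterQuery + ")");
        }
        return combined.toString();
    }

    public static void main(String[] args) {
        // Hypothetical MLT terms and filter; real values come from the handler.
        String mltTerms = "some_mlt:solr some_mlt:lucene";
        List<String> filters = Arrays.asList("category:search");
        // prints: +(some_mlt:solr some_mlt:lucene) +(category:search)
        System.out.println(foldFilters(mltTerms, filters));
    }
}
```

One caveat with this form: required clauses take part in scoring, whereas 
filters do not. When the query is built programmatically in Lucene 5.x, 
adding the filter with BooleanClause.Occur.FILTER should keep it out of the 
score.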

Our configuration: 
1. mlt.maxqt=100
2. There is an additional filter passed as a parameter
3. <field name="some_mlt" type="text_en" indexed="true" stored="true" 
multiValued="true" omitNorms="false" termVectors="true"/>
4. text_en is a pretty standard text fieldType.

I have code that populates a test dataset and runs a query to reproduce 
this.



