Hi,

  I'm in the process of transitioning from a conventional master-slave
model to SolrCloud. I'm using Solr 4.4 and have set up 2 shards with 1
replica each, plus a 3-node ZooKeeper ensemble. All the nodes are running on
AWS EC2 instances. The shards are on m1.xlarge instances and share hosts
with the ZooKeeper instances (mounted on a separate volume). 6 GB of memory
is allocated to each Solr instance.

I have around 10 million documents in the index. With the previous
standalone model, queries averaged around 100 ms. SolrCloud query response
times have been abysmal so far: over 1000 ms, often reaching 2000 ms. I
expected some increase due to the additional servers, network latency, etc.,
but this difference is really baffling. The hardware is similar in both
cases, except that a couple of the SolrCloud nodes are also hosting
ZooKeeper. m1.xlarge I/O performance is high, so that shouldn't be a
bottleneck either.

The other difference from the old setup is that I'm using the new
CloudSolrServer class, which is given the 3 ZooKeeper addresses for load
balancing. But I don't think it has any major impact, as queries executed
from the Solr admin query panel show the same slowness.
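
To separate per-core query time from distributed overhead, one comparison I
could run is the same query with and without distrib=false. A minimal Python
sketch of the two URLs follows. The host name solr-node1, the port 8983 and
the collection name collection1 are placeholders, not my actual values, and
wt=json overrides the handler's velocity default so the debug output is easy
to read.

```python
from urllib.parse import urlencode

# Placeholder base URL: host, port and collection name are assumptions.
BASE_URL = "http://solr-node1:8983/solr/collection1"


def build_query_url(q, distrib=True, handler="/adskcloudhelp"):
    """Return a query URL that asks Solr for debug (timing) output.

    distrib=False adds distrib=false, which restricts the query to the
    core that receives the request instead of fanning out to all shards.
    """
    params = {"q": q, "wt": "json", "debugQuery": "true"}
    if not distrib:
        params["distrib"] = "false"
    return BASE_URL + handler + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query_url("line"))                 # distributed query
    print(build_query_url("line", distrib=False))  # single core only
```

If the distrib=false timings are close to the old 100 ms, the extra time is
probably going into the distributed phase rather than into the individual
cores.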

Here are some of my configuration settings:

<autoCommit> 
        <maxTime>30000</maxTime> 
        <openSearcher>false</openSearcher> 
</autoCommit>

<autoSoftCommit> 
        <maxTime>1000</maxTime> 
</autoSoftCommit>


<maxBooleanClauses>1024</maxBooleanClauses>


<filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>

<queryResultCache class="solr.LRUCache" size="16384" initialSize="8192" autowarmCount="4096"/>

<documentCache class="solr.LRUCache" size="32768" initialSize="16384" autowarmCount="0"/>

<fieldValueCache class="solr.FastLRUCache" size="16384" autowarmCount="8192" showItems="4096"/>

<enableLazyFieldLoading>true</enableLazyFieldLoading>

<queryResultWindowSize>200</queryResultWindowSize>

<queryResultMaxDocsCached>400</queryResultMaxDocsCached>



<listener event="newSearcher" class="solr.QuerySenderListener">
        <arr name="queries">
                <lst><str name="q">line</str></lst>
                <lst><str name="q">xref</str></lst>
                <lst><str name="q">draw</str></lst>
        </arr>
</listener>

<listener event="firstSearcher" class="solr.QuerySenderListener">
        <arr name="queries">
                <lst><str name="q">line</str></lst>
                <lst><str name="q">draw</str></lst>
                <lst><str name="q">line</str><str name="fq">language:english</str></lst>
                <lst><str name="q">line</str><str name="fq">Source2:documentation</str></lst>
                <lst><str name="q">line</str><str name="fq">Source2:CloudHelp</str></lst>
                <lst><str name="q">draw</str><str name="fq">language:english</str></lst>
                <lst><str name="q">draw</str><str name="fq">Source2:documentation</str></lst>
                <lst><str name="q">draw</str><str name="fq">Source2:CloudHelp</str></lst>
        </arr>
</listener>

<maxWarmingSearchers>2</maxWarmingSearchers>


The custom request handler:

<requestHandler name="/adskcloudhelp" class="solr.SearchHandler">
                <lst name="defaults">
                        <str name="echoParams">explicit</str>
                        <float name="tie">0.01</float>
                        <str name="wt">velocity</str>
                        <str name="v.template">browse</str>
                        <str name="v.contentType">text/html;charset=UTF-8</str> 
  
                        <str name="v.layout">layout</str>
                        <str name="v.channel">cloudhelp</str>

                        <str name="defType">edismax</str>
                        <str name="q.alt">*:*</str>
                        <str name="rows">15</str>
                        <str name="fl">id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score</str>
                        <str name="qf">text^1.5 title^2 IndexTerm^.9 keywords^1.2 ADSKCommandSrch^2 ADSKContextId^1</str>
                        <str name="bq">Source2:CloudHelp^3 Source2:youtube^0.85</str>
                        <str name="bf">recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0</str>
                        <str name="df">text</str>

                        
                        <str name="facet">on</str>
                        <str name="facet.mincount">1</str>
                        <str name="facet.limit">100</str>
                        <str name="facet.field">language</str>
                        <str name="facet.field">Source2</str>
                        <str name="facet.field">DocumentationBook</str>
                        <str name="facet.field">ADSKProductDisplay</str>
                        <str name="facet.field">audience</str>

                        
                        <str name="hl">true</str>
                        <str name="hl.fl">text title</str>
                        <str name="f.text.hl.fragsize">250</str>
                        <str name="f.text.hl.alternateField">ShortDesc</str>

                        
                        <str name="spellcheck">true</str>
                        <str name="spellcheck.dictionary">default</str>
                        <str name="spellcheck.collate">true</str>
                        <str name="spellcheck.onlyMorePopular">false</str>
                        <str name="spellcheck.extendedResults">false</str>
                        <str name="spellcheck.count">1</str>
                </lst>
                <arr name="last-components">
                        <str>spellcheck</str>
                </arr>
</requestHandler>

One thing I've noticed is that the queryResultCache hit rate is really low,
and I doubt our queries are really that unique. I'm using edismax with the
boost function
<str name="bf">recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0</str>. Could this
be contributing?
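
To make the question concrete: Solr's recip(x,m,a,b) evaluates a/(m*x+b), so
this bf gives a value of 1 to a document published right now and about 0.5
to one that is a year old. My suspicion, which I have not verified, is that
a NOW that changes every millisecond makes each request's boost function
distinct, and that rounding it (for example NOW/HOUR in Solr date math)
would keep it stable within the hour. The Python sketch below only models
that arithmetic. The helper names are made up for illustration and are not
part of Solr.

```python
MS_PER_HOUR = 60 * 60 * 1000
MS_PER_YEAR = 365.25 * 24 * MS_PER_HOUR  # roughly 3.16e10 ms


def recip(x, m, a, b):
    """Same formula as Solr's recip(x,m,a,b) function: a / (m*x + b)."""
    return a / (m * x + b)


def date_boost(now_ms, publish_ms):
    """Model of recip(ms(NOW,PublishDate),3.16e-11,1,1) from the handler."""
    return recip(now_ms - publish_ms, 3.16e-11, 1, 1)


def round_down_to_hour(now_ms):
    """Model of NOW/HOUR: truncate an epoch-ms timestamp to the hour."""
    return now_ms - (now_ms % MS_PER_HOUR)


if __name__ == "__main__":
    now = 1381536000123  # arbitrary example timestamp in epoch ms
    print(date_boost(now, now))                # brand-new document
    print(date_boost(now, now - MS_PER_YEAR))  # one-year-old document
    # Two requests 1 ms apart see different raw NOW values, but the
    # same value once each is rounded down to the hour.
    print(round_down_to_hour(now) == round_down_to_hour(now + 1))
```

The trade-off of rounding would be a coarser recency boost, which seems
acceptable when the function decays over a scale of years.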

Sorry about the long post, but I'm struggling to nail down the issue here,
especially when queries are running fine in a master-slave environment with
similar hardware and network.

Any pointers will be highly appreciated.

Regards,
Shamik



