Hi All, I have been using Solr for some time now, but mostly in standalone mode. My current project uses Solr 6.5.1 hosted on Hadoop. My solrconfig.xml has the following configuration. In the prod environment, query performance seems to be really slow. Can anyone give me a few pointers on how to improve it?
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">${solr.hdfs.home:}</str>
  <bool name="solr.hdfs.blockcache.enabled">${solr.hdfs.blockcache.enabled:true}</bool>
  <int name="solr.hdfs.blockcache.slab.count">${solr.hdfs.blockcache.slab.count:1}</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">${solr.hdfs.blockcache.direct.memory.allocation:false}</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">${solr.hdfs.blockcache.blocksperbank:16384}</int>
  <bool name="solr.hdfs.blockcache.read.enabled">${solr.hdfs.blockcache.read.enabled:true}</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">${solr.hdfs.blockcache.write.enabled:false}</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">${solr.hdfs.nrtcachingdirectory.enable:true}</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">${solr.hdfs.nrtcachingdirectory.maxmergesizemb:16}</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">${solr.hdfs.nrtcachingdirectory.maxcachedmb:192}</int>
</directoryFactory>
<lockType>hdfs</lockType>

It has 6 collections of the following sizes:

Collection 1 --> 6.41 MB
Collection 2 --> 634.51 KB
Collection 3 --> 4.59 MB
Collection 4 --> 1,020.56 MB
Collection 5 --> 607.26 MB
Collection 6 --> 102.4 KB

Each collection has 5 shards. The allocated heap is about 8 GB for the young generation and about 24 GB for the old generation, and GC analysis showed that peak utilisation is really low compared to these values. But queries against Collection 4 and Collection 5 get really slow responses even though we are not using any complex queries. The output of a debug query run with debug=timing is given below for reference. Can anyone suggest a way to improve the performance?
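As a rough sanity check on the block cache settings above, the sketch below estimates how much index data the configured cache could hold and compares it with the collection sizes listed. It assumes 8 KB per cache block, which is the figure given in the Solr reference guide and is not set anywhere in the posted config. It also ignores how the 5 shards per collection are spread over nodes and whether several cores share one cache, since neither is stated here, so treat the result as an order-of-magnitude estimate only.

```python
# Assumption: each HDFS block cache block is 8 KB (Solr reference guide figure);
# this value does not appear in the posted solrconfig.xml.
BLOCK_SIZE_BYTES = 8 * 1024

# Defaults taken from the posted directoryFactory section.
SLAB_COUNT = 1
BLOCKS_PER_BANK = 16384

# Index sizes as listed in the post, converted to MB where needed.
COLLECTION_SIZES_MB = {
    "Collection 1": 6.41,
    "Collection 2": 634.51 / 1024,
    "Collection 3": 4.59,
    "Collection 4": 1020.56,
    "Collection 5": 607.26,
    "Collection 6": 102.4 / 1024,
}


def block_cache_mb(slab_count, blocks_per_bank, block_size_bytes=BLOCK_SIZE_BYTES):
    """Return the block cache capacity in MB for the given settings."""
    return slab_count * blocks_per_bank * block_size_bytes / (1024 * 1024)


cache_mb = block_cache_mb(SLAB_COUNT, BLOCKS_PER_BANK)
total_index_mb = sum(COLLECTION_SIZES_MB.values())

# Collections whose whole index is larger than the estimated cache.
larger_than_cache = [
    name for name, size_mb in COLLECTION_SIZES_MB.items() if size_mb > cache_mb
]

print(f"Estimated block cache capacity: {cache_mb:.0f} MB")
print(f"Total index size across collections: {total_index_mb:.2f} MB")
print("Collections larger than the cache:", ", ".join(larger_than_cache))
```

Under these assumptions the two collections that come out larger than the cache are the same two that are slow to query, which may be worth checking before looking at heap settings.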
Response to query:

<response>
  <lst name="responseHeader">
    <bool name="zkConnected">true</bool>
    <int name="status">0</int>
    <int name="QTime">3962</int>
    <lst name="params">
      <str name="q"> ("hybrid electric powerplant" "hybrid electric powerplants" "Electric" "Electrical" "Electricity" "Engine" "fuel economy" "fuel efficiency" "Hybrid Electric Propulsion" "Power Systems" "Powerplant" "Propulsion" "hybrid" "hybrid electric" "electric powerplant") </str>
      <str name="defType">edismax</str>
      <str name="debug">true</str>
      <str name="indent">on</str>
      <arr name="qf">
        <str>host</str>
        <str>title</str>
        <str>url</str>
        <str>customContent</str>
        <str>contentSpecificSearch</str>
      </arr>
      <arr name="fl">
        <str>id</str>
        <str>contentTagsCount</str>
      </arr>
      <str name="start">0</str>
      <str name="bq.op">OR</str>
      <str name="q.op">OR</str>
      <str name="correlationID">3985d7e2-3e54-48d8-8336-229e85f5d9de</str>
      <str name="rows">600</str>
      <str name="bq"> ("hybrid electric powerplant"^100.0 "hybrid electric powerplants"^100.0 "Electric"^50.0 "Electrical"^50.0 "Electricity"^50.0 "Engine"^50.0 "fuel economy"^50.0 "fuel efficiency"^50.0 "Hybrid Electric Propulsion"^50.0 "Power Systems"^50.0 "Powerplant"^50.0 "Propulsion"^50.0 "hybrid"^15.0 "hybrid electric"^15.0 "electric powerplant"^15.0) </str>
    </lst>
  </lst>
  <result name="response" numFound="205458" start="0" maxScore="1836.806">
  <lst name="timing">
    <double name="time">15374.0</double>
    <lst name="prepare">
      <double name="time">2.0</double>
      <lst name="query"> <double name="time">2.0</double> </lst>
      <lst name="facet"> <double name="time">0.0</double> </lst>
      <lst name="facet_module"> <double name="time">0.0</double> </lst>
      <lst name="mlt"> <double name="time">0.0</double> </lst>
      <lst name="highlight"> <double name="time">0.0</double> </lst>
      <lst name="stats"> <double name="time">0.0</double> </lst>
      <lst name="expand"> <double name="time">0.0</double> </lst>
      <lst name="terms"> <double name="time">0.0</double> </lst>
      <lst name="debug"> <double name="time">0.0</double> </lst>
    </lst>
    <lst name="process">
      <double name="time">15363.0</double>
      <lst name="query"> <double name="time">1313.0</double> </lst>
      <lst name="facet"> <double name="time">0.0</double> </lst>
      <lst name="facet_module"> <double name="time">0.0</double> </lst>
      <lst name="mlt"> <double name="time">0.0</double> </lst>
      <lst name="highlight"> <double name="time">0.0</double> </lst>
      <lst name="stats"> <double name="time">0.0</double> </lst>
      <lst name="expand"> <double name="time">0.0</double> </lst>
      <lst name="terms"> <double name="time">0.0</double> </lst>
      <lst name="debug"> <double name="time">14048.0</double> </lst>
    </lst>
  </lst>

Thanks,
Arun

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html