On 6/25/2016 1:19 AM, Roshan Kamble wrote:
> I am using Solr 6.0.0 in cloud mode (3 physical nodes + one zookeeper)
> and have heavy insert/update/delete operations. I am using
> CloudSolrClient and have tried all batch sizes from 100 to 1000. But it
> has been observed that persisting at the Solr node is very slow. It
> takes around 20 seconds to store 50-100 records. Does anyone know how
> to improve the speed of these operations?
Is that 20 seconds the *index* time or the *commit* time?  If it's the
commit time, then see the "slow commits" section of the link that I
provided below.  You can see how long the last commit took by looking at
the statistics in the admin UI for the searcher object.

If it's the index time, how much data is in those records?  What does
the analysis in your schema do to that data?

If you have no idea which process is taking the time, then you should
decouple indexing from committing, so you can time both separately.

Very slow indexing usually has one or more of these causes:

1) The data is very large and is heavily analyzed.
2) It is only being sent to Solr by a single thread.
3) Your Solr machine does not have enough memory for effective operation.

That last item is a somewhat complex topic.  It is one of the things
discussed here:

https://wiki.apache.org/solr/SolrPerformanceProblems

There could be other problems, but these are the most common.  The
solutions for these issues are, in the same order:

1a) Reduce the amount of data per record.
1b) Change the schema so analysis is not as heavy.
1c) Handle rich document processing in your indexing program, not Solr.
2) Use multiple threads/processes in your indexing program.
3) Add memory to the server, and sometimes increase the max heap size.

Thanks,
Shawn
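To make the "decouple indexing from committing" and "use multiple
threads" advice concrete, here is a minimal sketch of what the indexing
program might look like.  It is not your code -- the record IDs are
made up, and the counter stands in for the real
client.add(List<SolrInputDocument>) calls (SolrJ's CloudSolrClient does
have add() and commit() methods, but this sketch runs without a Solr
server so the batching and timing structure is visible on its own):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BatchIndexer {
    // Split records into fixed-size batches; in a real indexer each
    // batch would become one client.add(List<SolrInputDocument>) call.
    static <T> List<List<T>> partition(List<T> records, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical record IDs standing in for real documents.
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            records.add(i);
        }

        AtomicInteger indexed = new AtomicInteger();
        // Cause #2 above: send batches from several threads, not one.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        long start = System.nanoTime();
        for (List<Integer> batch : partition(records, 100)) {
            pool.submit(() -> {
                // Real indexer: client.add(batchOfDocs) -- and NO commit
                // here, so indexing and committing stay decoupled.
                indexed.addAndGet(batch.size());
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        long indexMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        // Real indexer: client.commit() -- timed separately, so you can
        // tell whether the 20 seconds is index time or commit time.
        long commitMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("indexed=" + indexed.get()
                + " indexMs=" + indexMs + " commitMs=" + commitMs);
    }
}
```

Timing the two phases separately like this is what tells you which of
the causes above you are actually fighting.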