On 6/25/2016 1:19 AM, Roshan Kamble wrote:
> I am using Solr 6.0.0 in SolrCloud mode (3 physical nodes + one
> ZooKeeper) and have heavy insert/update/delete operations. I am using
> CloudSolrClient and have tried batch sizes from 100 to 1000, but it
> has been observed that persisting at the Solr nodes is very slow. It
> takes around 20 seconds to store 50-100 records. Does anyone know how
> to improve the speed of these operations?

Is that 20 seconds the *index* time or the *commit* time?  If it's the
commit time, then see the "slow commits" section of the link below.
You can see how long the last commit took by looking at the statistics
for the searcher object in the admin UI.
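
If scripting is easier than clicking through the UI, the same
statistics are also available over HTTP from the mbeans handler.
Something like this should work (host, port, and core name here are
placeholders for your own values); the searcher entry includes fields
such as warmupTime and openedAt:

    http://localhost:8983/solr/mycollection/admin/mbeans?cat=CORE&stats=true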

If it's the index time, how much data is in those records?  What does
the analysis in your schema do to that data?

If you have no idea which step is taking the time, then you should
decouple indexing from committing, so you can time each one separately.
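
To illustrate, here is a rough SolrJ sketch (untested; the ZooKeeper
address, collection name, and field names are placeholders) that times
the two steps independently:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class TimedIndexing {
        public static void main(String[] args) throws Exception {
            // Placeholder ZooKeeper address and collection name.
            try (CloudSolrClient client = new CloudSolrClient("zkhost:2181")) {
                client.setDefaultCollection("mycollection");

                List<SolrInputDocument> batch = new ArrayList<>();
                for (int i = 0; i < 500; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", "doc-" + i);
                    batch.add(doc);
                }

                // Time the add (index) step by itself ...
                long start = System.nanoTime();
                client.add(batch);
                System.out.println("index:  "
                        + (System.nanoTime() - start) / 1000000 + " ms");

                // ... then time the explicit commit by itself.
                start = System.nanoTime();
                client.commit();
                System.out.println("commit: "
                        + (System.nanoTime() - start) / 1000000 + " ms");
            }
        }
    }

For the commit number to mean anything, make sure an autoCommit or
autoSoftCommit doesn't fire in the middle of the test.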

Very slow indexing usually has one or more of these causes:

1) The data is very large and is heavily analyzed.
2) It is only being sent to Solr by a single thread.
3) Your Solr machine does not have enough memory for effective operation.

That last item is a somewhat complex topic.  It is one of the things
discussed here:

https://wiki.apache.org/solr/SolrPerformanceProblems

There could be other problems, but these are the most common.  The
solutions for these issues are, in the same order:

1a) Reduce the amount of data per record.
1b) Change the schema so the analysis is not as heavy.
1c) Handle rich document processing in your indexing program, not Solr.
2) Use multiple threads/processes in your indexing program (see the
sketch after this list).
3) Add memory to the server, and sometimes increase the max heap size.
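
On number 2: CloudSolrClient is thread-safe, so the simplest approach
is usually to share a single client instance across a small thread
pool.  A rough sketch, again untested and with placeholder names:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class ParallelIndexer {
        public static void main(String[] args) throws Exception {
            // Placeholder ZooKeeper address and collection.  The client
            // is thread-safe, so all workers share this one instance.
            CloudSolrClient client = new CloudSolrClient("zkhost:2181");
            client.setDefaultCollection("mycollection");

            ExecutorService pool = Executors.newFixedThreadPool(8);
            for (int t = 0; t < 8; t++) {
                final int worker = t;
                pool.submit(() -> {
                    List<SolrInputDocument> batch = new ArrayList<>();
                    for (int i = 0; i < 10000; i++) {
                        SolrInputDocument doc = new SolrInputDocument();
                        doc.addField("id", "doc-" + worker + "-" + i);
                        batch.add(doc);
                        if (batch.size() == 500) {  // send in batches
                            client.add(batch);
                            batch.clear();
                        }
                    }
                    if (!batch.isEmpty()) {
                        client.add(batch);
                    }
                    return null;  // Callable, so checked exceptions are OK
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);

            client.commit();  // one commit after all workers finish
            client.close();
        }
    }

Even four to eight threads typically helps a lot, because a single
indexing thread spends most of its time waiting on network round trips
while the rest of the cluster sits idle.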

Thanks,
Shawn
