Doing a standard commit after every document is a Solr anti-pattern.

In recent versions of Solr, commitWithin triggers a “near-realtime” (soft) 
commit, not a standard (hard) commit.

https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
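
To illustrate how a commitWithin window interacts with the rate of adds, here is a minimal sketch in plain Java. It is a toy model, not Solr code: the class CommitWindowModel, the method countCommits, and the 27 ms spacing between adds (roughly 55 seconds / 2000 documents) are all assumptions made for the illustration. The model assumes the first uncommitted add starts the window and later adds inside that window share the same commit.

```java
/**
 * Toy model (NOT Solr code): counts how many commits happen for a series
 * of add() calls when every add carries the same commitWithin window.
 */
final class CommitWindowModel {

    /**
     * @param addTimesMs     arrival time of each add in ms, ascending
     * @param commitWithinMs the commitWithin window sent with every add
     * @return number of commits the model performs
     */
    static int countCommits(long[] addTimesMs, long commitWithinMs) {
        int commits = 0;
        long pendingDeadline = -1; // -1 means no commit is scheduled

        for (long t : addTimesMs) {
            // A scheduled commit whose deadline has passed fires before this add.
            if (pendingDeadline >= 0 && t > pendingDeadline) {
                commits++;
                pendingDeadline = -1;
            }
            // The first uncommitted add starts the window; later adds inside
            // the window ride along with the already scheduled commit.
            if (pendingDeadline < 0) {
                pendingDeadline = t + commitWithinMs;
            }
        }
        // The last scheduled commit still fires after the final add.
        if (pendingDeadline >= 0) {
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        long[] adds = new long[2000];
        for (int i = 0; i < adds.length; i++) {
            adds[i] = i * 27L; // assumed: one add roughly every 27 ms
        }
        System.out.println("commit per add:       " + adds.length);
        System.out.println("commitWithin 1 ms:    " + countCommits(adds, 1));
        System.out.println("commitWithin 1000 ms: " + countCommits(adds, 1000));
    }
}
```

In this model a window only reduces the number of commits when it is longer than the gap between adds. With a 1 ms window and adds arriving tens of milliseconds apart, the commit count stays close to one per add, so the speedup reported below presumably comes from the cheaper commit type rather than from fewer commits. A larger window would let many adds share a single commit.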

- Mark

http://about.me/markrmiller

On Feb 12, 2014, at 9:52 AM, Pisarev, Vitaliy <vitaliy.pisa...@hp.com> wrote:

> I am running a very simple performance experiment where I post 2000 documents 
> to my application, which in turn persists them to a relational DB and sends 
> them to Solr for indexing (synchronously, in the same request).
> I am testing 3 use cases:
> 
>  1.  No indexing at all - ~45 sec to post 2000 documents
>  2.  Indexing included - commit after each add - ~8 minutes (!) to post and 
> index 2000 documents
>  3.  Indexing included - commitWithin 1ms - ~55 seconds (!) to post and index 
> 2000 documents
> The 3rd result does not make any sense: I would expect the behavior to be 
> similar to the one in point 2. At first I thought that the documents were not 
> really committed, but I could actually see them being added by executing some 
> queries during the experiment (via the Solr web UI).
> I am worried that I am missing something very big. The code I use for point 2:
> 
> SolrInputDocument doc = // get doc
> SolrServer solrConnection = // get connection
> solrConnection.add(doc);
> solrConnection.commit();
> 
> Whereas the code for point 3:
> 
> SolrInputDocument doc = // get doc
> SolrServer solrConnection = // get connection
> // According to the API documentation, I understand there is no need to
> // explicitly call commit with this API
> solrConnection.add(doc, 1);
> 
> Is it possible that committing after each add will degrade performance by a 
> factor of 40?
> 
