Doing a standard commit after every document is a Solr anti-pattern. In recent versions of Solr, commitWithin is a “near-realtime” (soft) commit, not a standard (hard) commit, so it is much cheaper than calling commit() after each add.
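To make the difference between the two client patterns concrete, here is a minimal sketch. It does not use SolrJ: `IndexClient`, `CountingClient`, and `CommitStrategyDemo` are hypothetical names invented for illustration, and the interface only mirrors the shape of the `add(doc)`, `add(doc, commitWithinMs)`, and `commit()` calls quoted in this thread. The fake client just counts calls, so it shows how many explicit commits each pattern requests, not how long a real server would take.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the few SolrServer calls used in this thread.
interface IndexClient {
    void add(String doc);                      // add only, no commit requested
    void add(String doc, int commitWithinMs);  // add; the server commits within the window
    void commit();                             // explicit standard commit
}

// Fake client that only counts calls. It never talks to a Solr server.
final class CountingClient implements IndexClient {
    int adds = 0;
    int explicitCommits = 0;

    @Override public void add(String doc) { adds++; }
    @Override public void add(String doc, int commitWithinMs) { adds++; }
    @Override public void commit() { explicitCommits++; }
}

final class CommitStrategyDemo {
    // Use case 2 from the question: one explicit commit per document.
    static void commitPerDocument(IndexClient client, List<String> docs) {
        for (String doc : docs) {
            client.add(doc);
            client.commit();
        }
    }

    // Use case 3 from the question: no explicit commit; the commit
    // deadline travels with each add and the server schedules the commit.
    static void commitWithin(IndexClient client, List<String> docs, int windowMs) {
        for (String doc : docs) {
            client.add(doc, windowMs);
        }
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            docs.add("doc-" + i);
        }

        CountingClient perDoc = new CountingClient();
        commitPerDocument(perDoc, docs);

        CountingClient within = new CountingClient();
        commitWithin(within, docs, 1000);

        System.out.println("commit per document: " + perDoc.explicitCommits + " explicit commits");
        System.out.println("commitWithin:        " + within.explicitCommits + " explicit commits");
    }
}
```

With 2000 documents, the first pattern asks for 2000 explicit commits and the second asks for none, leaving the server free to schedule commits itself. The 1000 ms window is an arbitrary value chosen for the sketch.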
https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching

- Mark

http://about.me/markrmiller

On Feb 12, 2014, at 9:52 AM, Pisarev, Vitaliy <vitaliy.pisa...@hp.com> wrote:

> I am running a very simple performance experiment where I post 2000 documents
> to my application, which in turn persists them to a relational DB and sends
> them to Solr for indexing (synchronously, in the same request).
>
> I am testing 3 use cases:
>
> 1. No indexing at all - ~45 sec to post 2000 documents
> 2. Indexing included - commit after each add. ~8 minutes (!) to post and
>    index 2000 documents
> 3. Indexing included - commitWithin 1ms. ~55 seconds (!) to post and index
>    2000 documents
>
> The 3rd result does not make any sense; I would expect the behavior to be
> similar to the one in point 2. At first I thought that the documents were not
> really committed, but I could actually see them being added by executing some
> queries during the experiment (via the Solr web UI).
>
> I am worried that I am missing something very big. The code I use for point 2:
>
> SolrInputDocument doc = // get doc
> SolrServer solrConnection = // get connection
> solrConnection.add(doc);
> solrConnection.commit();
>
> Whereas the code for point 3:
>
> SolrInputDocument doc = // get doc
> SolrServer solrConnection = // get connection
> solrConnection.add(doc, 1); // According to API documentation I understand
> // there is no need to explicitly call commit with this API
>
> Is it possible that committing after each add will degrade performance by a
> factor of 40?