On Tue, Jan 12, 2010 at 3:48 AM, Smith G <gudumba.sm...@gmail.com> wrote:
> Hello All,
>               I am trying to find a better approach (performance-wise)
> to index documents. The document count is over a million.
> First, I thought of writing multiple threads using
> CommonsHttpSolrServer to submit documents. But later I found out
> StreamingUpdateSolrServer, which says we can forget about batching.
>
> 1) We can pass a thread-count parameter to StreamingUpdateSolrServer.
> Does it serve exactly the same purpose as writing multiple threads using
> CommonsHttpSolrServer?

Not quite. StreamingUpdateSolrServer batches documents on the fly,
so if you have a server with N CPUs, you should only need N threads
to saturate it.  With multiple threads using CommonsHttpSolrServer,
it's still one document per request (unless you do your own batching),
and there is still latency between each request and its response, so it
would take more threads to fill in that latency.
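
To illustrate the difference, here is a minimal Java sketch of the
queue-and-runner idea, using only the standard library. It is a simplified
model and not SolrJ's actual implementation: the class StreamingBatchSketch,
the method indexAll, and the END marker are hypothetical names, and a
"request" is just a counter rather than a real HTTP call. Each runner thread
blocks for one document, then keeps streaming whatever else is already queued
into the same request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class StreamingBatchSketch {
    // Hypothetical end-of-stream marker; a distinct object compared by identity.
    private static final String END = new String("<end>");

    /** Simulates indexing; returns {documents sent, requests made}. */
    static int[] indexAll(int docCount, int queueSize, int threadCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(queueSize);
        AtomicInteger docsSent = new AtomicInteger();
        AtomicInteger requests = new AtomicInteger();

        Runnable runner = () -> {
            try {
                while (true) {
                    // Block until there is at least one document to send.
                    String doc = queue.take();
                    if (doc == END) return;
                    int inRequest = 1;
                    boolean finished = false;
                    String next;
                    // Stream everything already queued into the same request.
                    while ((next = queue.poll()) != null) {
                        if (next == END) { finished = true; break; }
                        inRequest++;
                    }
                    // Stand-in for closing one HTTP update request.
                    requests.incrementAndGet();
                    docsSent.addAndGet(inRequest);
                    if (finished) return;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(runner);
            threads.add(t);
            t.start();
        }
        try {
            // The caller just adds documents; put() blocks when the queue is full.
            for (int i = 0; i < docCount; i++) queue.put("doc-" + i);
            // One END marker per runner, so every runner terminates.
            for (int i = 0; i < threadCount; i++) queue.put(END);
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return new int[] { docsSent.get(), requests.get() };
    }

    public static void main(String[] args) {
        int[] r = indexAll(1000, 20, 4);
        System.out.println("documents sent: " + r[0] + ", requests: " + r[1]);
    }
}
```

The request count varies from run to run, since it depends on how many
documents happen to be queued when a runner wakes up. It can never exceed the
document count, which is the one-document-per-request case you get from
unbatched CommonsHttpSolrServer calls.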

-Yonik
http://www.lucidimagination.com
