At 10,000 documents per post, I was actually finding that embedded Solr was 
providing a significant performance boost. It has been a while since I did any 
comparisons, but it was probably on the order of 40% or so.

----- Original Message ----
From: climbingrose <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Monday, August 27, 2007 12:21:56 AM
Subject: Re: Embedded about 50% faster for indexing

Haven't tried the embedded server, but I think I have to agree with Mike.
We're currently sending batches of 2000 jobs to the Solr server, and the
time required to transfer the documents over HTTP is insignificant compared
with the time required to index them. So unless you are sending documents
one by one, I don't think embedded Solr should give you much of a
performance boost.
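
For reference, here is a minimal sketch of the batching approach discussed in
this thread: many documents per HTTP request, sent over one persistent
connection. It is an illustration under assumptions, not code anyone in the
thread posted. The host, port, and /solr/update path are assumed defaults,
the helper names (build_add_xml, batches, post_batches) are hypothetical, and
the field names in the usage note are made up.

```python
import http.client
from xml.sax.saxutils import escape, quoteattr


def build_add_xml(docs):
    """Build one Solr XML <add> message containing every doc in the batch."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def batches(docs, size):
    """Yield successive lists of at most `size` docs."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_batches(docs, host="localhost", port=8983, path="/solr/update", size=2000):
    """POST each batch over a single persistent HTTP connection, then commit.

    host, port and path are assumptions; adjust them to your installation.
    """
    conn = http.client.HTTPConnection(host, port)
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    try:
        for batch in batches(docs, size):
            conn.request("POST", path, build_add_xml(batch).encode("utf-8"), headers)
            resp = conn.getresponse()
            resp.read()  # drain the response so the connection can be reused
            if resp.status != 200:
                raise RuntimeError("Solr returned HTTP %d" % resp.status)
        conn.request("POST", path, b"<commit/>", headers)
        conn.getresponse().read()
    finally:
        conn.close()
```

Calling post_batches(my_docs, size=2000), where my_docs is an iterable of
dicts such as {"id": 1, "title": "..."}, would send one request per 2000
documents instead of one request per document. Threading, which Mike also
mentions, is left out to keep the sketch short.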

On 8/25/07, Mike Klaas <[EMAIL PROTECTED]> wrote:
>
> On 24-Aug-07, at 2:29 PM, Wu, Daniel wrote:
>
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> >> Of Yonik Seeley
> >> Sent: Friday, August 24, 2007 2:07 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Embedded about 50% faster for indexing
> >>
> >> One thing I'd like to avoid is everyone trying to embed just
> >> for performance gains. If there is really that much
> >> difference, then we need a better way for people to get that
> >> without resorting to Java code.
> >>
> >> -Yonik
> >>
> >
> > Theoretically and practically, an embedded solution will be faster than
> > going through http/xml.
>
> This is only true if the http interface adds significant overhead to
> the cost of indexing a document, and I don't see why this should be
> so, as indexing is relatively heavyweight.  Setting up the connection
> could be expensive, but this can be greatly mitigated by sending more
> than one doc per http request, using persistent connections, and
> threading.
>
> -Mike
>



-- 
Regards,

Cuong Hoang



