I'd buy more servers, set up Tomcat on each, and get Solr replicating from your current machine to the others. Then put them all behind a load balancer so queries are spread across the copies.

You could also post your updates to every machine. Then you don't need to worry about getting replication running.
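
As a rough sketch of the "post to every machine" option, assuming each box accepts XML `<add>` messages at `/solr/update`: the host names, port, and field names below are made up for illustration, and `post_to_all` is a hypothetical helper, not anything that ships with Solr.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Hypothetical hosts; substitute your own Solr machines.
SOLR_HOSTS = ["http://solr1:8080", "http://solr2:8080"]


def build_add_xml(doc):
    """Build a Solr <add> message for one document (dict of field -> value)."""
    fields = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields


def http_post(url, body):
    """POST an XML body and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def post_to_all(doc, hosts=SOLR_HOSTS, send=http_post):
    """Send the same update to every host; return {host: status or exception}."""
    body = build_add_xml(doc)
    results = {}
    for host in hosts:
        try:
            results[host] = send(host + "/solr/update", body)
        except Exception as exc:
            # Record the failure so the caller can retry this host later.
            results[host] = exc
    return results
```

The catch with this approach is that a box that misses an update drifts out of sync with the others, so you'd want to keep the failures that `post_to_all` returns and retry them. Documents also only become searchable after a `<commit/>` is sent to each box.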

+--------------------------------------------------------+
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
+--------------------------------------------------------+


On Oct 9, 2007, at 7:12 AM, David Whalen wrote:

All:

How can I break up my install onto more than one box?  We've
hit a learning curve here and we don't understand how best to
proceed.  Right now we have everything crammed onto one box
because we don't know any better.

So, how would you build it if you could?  Here are the specs:

a) the index needs to hold at least 25 million articles
b) the index is constantly updated at a rate of 10,000 articles
per minute
c) we need to have faceted queries

Again, real-world experience is preferred here over book knowledge.
We've tried to read the docs and it's only made us more confused.

TIA

Dave W


-----Original Message-----
From: Yonik Seeley [mailto:[EMAIL PROTECTED]
Sent: Monday, October 08, 2007 3:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Availability Issues

On 10/8/07, David Whalen <[EMAIL PROTECTED]> wrote:
Do you see any requests that took a really long time to finish?

The requests that take a long time to finish are just simple queries.
And the same queries run at a later time come back much faster.

Our logs contain 99% inserts and 1% queries.  We are constantly adding
documents to the index at a rate of 10,000 per minute, so the logs
show mostly that.

Oh, so you are using the same boxes for updating and querying?
When you insert, are you using multiple threads?  If so, how many?

What is the full URL of those slow query requests?
Do the slow requests start after a commit?

Start with the thread dump.
I bet it's multiple queries piling up around some synchronization
points in lucene (sometimes caused by multiple threads generating
the same big filter that isn't yet cached).

What would be my next steps after that?  I'm not sure I'd understand
enough from the dump to make heads-or-tails of it.  Can I share that
here?

Yes, post it here.  Most likely a majority of the threads
will be blocked somewhere deep in lucene code, and you will
probably need help from people here to figure it out.
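
As a first pass before posting, something like the following can show where the blocked threads are bunching up. It is only a sketch, and it assumes a HotSpot-style dump (from `kill -QUIT <pid>` or `jstack`) in which each thread starts with a quoted name and blocked threads are marked "waiting for monitor entry"; `blocked_frames` is a made-up helper name.

```python
from collections import Counter


def blocked_frames(dump_text):
    """Count the top stack frame of each thread waiting to enter a monitor."""
    counts = Counter()
    waiting = False
    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('"'):
            # Thread header line: note whether this thread is blocked.
            waiting = "waiting for monitor entry" in stripped
        elif waiting and stripped.startswith("at "):
            # First "at ..." line after the header is the top frame.
            counts[stripped[3:]] += 1
            waiting = False  # ignore the deeper frames of this thread
    return counts
```

Calling `blocked_frames(open("dump.txt").read()).most_common(5)` on a saved dump (the file name is just an example) lists the frames where the most threads are stuck. If most of them share one frame, that is the synchronization point to ask about.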

-Yonik



