I am new to this too. But my plan is to use sth like this:

I will use and online and offline index. Offline index will be presented to search engine users and offline index will be updated continuously. Time to time offline index will be written over online index. (When update is considered to be stable enough.)

I will have 40-50 crawlers running continuously. They will get their jobs from a database and save their results to database again. And time to time i will update the index with the info available in database.

Although i coded most of this mechanism, i did not test it with 50+ crawlers. But i think, this is scalable enough for my future uses too. Since for my domain, i don't need to show daily/updated results, as far as i see, this mechanism will work for me.

I cant think how you have so much data updated so quickly. I mean which data is needed to be updated so frequently and needs 30+ servers to be handled. Let me know your thoughts..

Levent.

Biggy wrote:
Well try having say 30 servers try to write in the index at the same time and
10 others to read. You'll get enough locks to make a grown man cry. :)


Scott Sellman wrote:
Sorry if this seems naïve (I am new to Lucene), but why not keep one copy
of the Lucene index on a NAS and have it shared by all servers?
-----Original Message-----
From: Biggy [mailto:[EMAIL PROTECTED] Sent: Wednesday, December 27, 2006 7:57 AM
To: java-user@lucene.apache.org
Subject: Clustering Lucene with 40 Servers


I'm currently investigating the best ways of clustering Lucene.
I've heard of both Solr, Terracotta but do not know how well they scale.
Their examples talk of a 4 node cluster. This is way too small for my
needs.

I have 30x JVMs each handling 3 requests/sec and each having their own
Lucene index. The index changes are propagated to the cluster members
using
JGroups messages. This solution has more than reached its limit as JGroups
has become unstable and a source of many JVMs crashes. Based on current
traffic trends I anticipate needing to upgrade to 40x + JVMS very soon.

Can anybody suggest a way to effectivily cluster / replicate document
changes ?

P.S.: JMS is not a possible solution as this was the prior JGroups solution. We
had too many memory problems/queue full etc crashing the servers
thereafter.

--
View this message in context:
http://www.nabble.com/Clustering-Lucene-with-40-Servers-tf2886546.html#a8064135
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to