In my experience, searching a read-only index mounted via NFS is fine. The NFS-related issues are with locking.
I'd agree with Chris that you should try to avoid very big indexes.

--
Ian.

On 04/08/05, Chris Lu <[EMAIL PROTECTED]> wrote:
> A big index is slow to merge, slow to search, and as you mentioned,
> it's slow to sync. A 1 GB index took me several hours to merge on a
> 1 GHz P3 with 512 MB RAM.
> Mounting Lucene on NFS is also a "No GO".
>
> I feel your choice 2 may be feasible, although complicated. That's your
> job, right. :)
>
> BTW: I guess your app is a web hosted app for different users, is it?
>
> --
> Chris Lu
> ------------
> Lucene Search RAD on Any Database
> http://www.dbsight.net
>
>
> On 8/4/05, Benjamin Reitzammer <[EMAIL PROTECTED]> wrote:
> > Hi,
> > we are in the process of planning a search feature of a product and we
> > are having quite a hard time figuring out the "right" way to do it.
> >
> > The requirements for our app are the following:
> > 1) Large number of indices (at _least_ 10000)
> > 2) The amount of data involved per index is not very high, but because
> > of the number of indices involved the data set will be something about
> > 500 - 1000 GB
> > 3) The searching capabilities must be fail safe, while it's acceptable
> > if deletes/updates can take some time.
> > 4) The majority of operations will be searching the indices.
> >
> > I've followed the mailing list intensively the last month and
> > especially the "Best Practices for Distributing Lucene Indexing and
> > Searching"
> > (http://marc.theaimsgroup.com/?l=lucene-user&m=110971318020691&w=2)
> > and "Real time indexing and distribution to lucene on separate boxes (long)"
> > (http://marc.theaimsgroup.com/?l=lucene-user&m=107900097217474&w=2)
> > threads provided some interesting insight.
> >
> > But still our requirements are a bit different.
> >
> > My thoughts so far on how the above could be handled:
> >
> > 1) Have one *really big* "master" which handles all tasks related to
> > index manipulation.
> > Sync the indices according to Doug's tips
> > http://marc.theaimsgroup.com/?l=lucene-user&m=110973989200204&w=2 out
> > to a cluster of slaves that are responsible for searching.
> > Problem: How to make sure that indices across slaves are in sync.
> > Big Problem: Syncing of this large number of indices will cause a lot
> > of traffic and put quite a load on the slaves (not to mention the
> > master)
> >
> > 1.1) Is it safe to _search_ (only) an index mounted via NFS? If yes,
> > then the search boxes could mount the indices on the master box. But
> > this solution would probably lead to some serious performance issues
> > because of the needed disk I/O on the master.
> > Though I'd love to be proven wrong on this one.
> >
> > 2) Split up index collection into smaller portions and distribute a
> > certain number of indices (~ up to 1000 indices) into smaller
> > autonomous clusters, that are completely responsible for their
> > collection of indices.
> > Problem: How do I keep index distribution dynamic so I don't have to
> > hardcode where to look for a certain index (that's not a real lucene
> > issue, but more one of distributed computing, but nevertheless I
> > thought you guys might know a way to solve it).
> >
> > Any ideas on this?
> > Has anyone ever worked with such a large number of Lucene indices (and
> > the amount of data it involves)?
> >
> > I appreciate your help very much.
> >
> > Cheers
> >
> > Benjamin
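
On the "Problem" under option 2 above (not hardcoding which cluster holds a given index): one common approach, not something proposed in this thread, is consistent hashing. The sketch below is only an illustration under assumptions. The class name IndexRing, the cluster names and the index names are all hypothetical, and it uses nothing from Lucene itself, only the Java standard library. Each cluster is placed at many points on a hash ring, and an index belongs to the first cluster point at or after the hash of its name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: maps an index name to the search cluster that hosts it. */
class IndexRing {
    // Virtual points per cluster; more points spread the indices more evenly.
    private static final int POINTS_PER_CLUSTER = 100;

    // Ring position -> cluster name, kept sorted so we can find the next point.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addCluster(String cluster) {
        for (int i = 0; i < POINTS_PER_CLUSTER; i++) {
            ring.putIfAbsent(hash(cluster + "#" + i), cluster);
        }
    }

    void removeCluster(String cluster) {
        for (int i = 0; i < POINTS_PER_CLUSTER; i++) {
            // Two-argument remove only deletes the point if it belongs to this cluster.
            ring.remove(hash(cluster + "#" + i), cluster);
        }
    }

    /** Returns the cluster owning the first ring point at or after the index's hash. */
    String clusterFor(String indexName) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no clusters registered");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(indexName));
        // Past the last point: wrap around to the start of the ring.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** First 8 bytes of the MD5 digest, used only as a stable, well-mixed hash. */
    private static long hash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        IndexRing ring = new IndexRing();
        ring.addCluster("cluster-a");
        ring.addCluster("cluster-b");
        System.out.println("index-42 -> " + ring.clusterFor("index-42"));
    }
}
```

With this layout, adding or removing a cluster only moves the indices that hash next to that cluster's points, so most of the ~10000 indices stay where they are. An explicit lookup table (index name -> cluster) kept in a small database would be an equally reasonable choice if you want to place indices by hand.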