We're running Ferret and acts_as_ferret in our production environment. We
have multiple mongrels talking to a single index on a separate (virtual)
server over DRb. This is working OK for now, as our index updates are fairly
infrequent, but I'm concerned about the lack of redundancy and scalability
in this layout.
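
For reference, the relevant bits of the current setup look roughly like
this (model name, hostname and port changed for illustration):

    # app/models/article.rb -- :remote => true sends index operations
    # over DRb to the central ferret server rather than a local index
    class Article < ActiveRecord::Base
      acts_as_ferret :fields => [:title, :body], :remote => true
    end

    # config/ferret_server.yml
    production:
      host: search.internal
      port: 9010
      pid_file: log/ferret_server.pid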

Our index won't get too big - maybe 100k indexed objects, each no more than
500 words - but it needs to be highly available, like the rest of our site
(www.caring.com, if you are interested).

One alternative approach I'm considering would be something like this:
- disable the after_save callbacks in acts_as_ferret in production mode,
to stop multiple mongrels writing to the index
- move all index writes to a centralized batch process which interacts
with a 'master' index (first sketch below)
- periodically clone out the master index to slave indexes located
locally to each user-facing Rails instance, not using DRb (second
sketch below)
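
For the batch writer I'm picturing a cron job run through script/runner,
along these lines (untested sketch - the Article model, paths and ten
minute window are made up, and it assumes the master index is keyed on
:id so that re-adding a changed record replaces the old document):

    require 'ferret'

    # open (or create) the master index; :key => :id makes adds behave
    # as upserts, so changed records can simply be re-added
    index = Ferret::Index::Index.new(:path => '/data/ferret/master',
                                     :key  => :id)

    # pick up everything that changed since the last run
    Article.find(:all,
                 :conditions => ['updated_at > ?', 10.minutes.ago]
                ).each do |article|
      index << { :id    => article.id.to_s,
                 :title => article.title,
                 :body  => article.body }
    end

    index.flush  # commit the changes to disk
    index.close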
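For the clone step, something like the following, copying into a
timestamped snapshot directory and then flipping a symlink so searchers
never open a half-copied index (hostnames and paths invented):

    SLAVES = %w( app1.internal app2.internal )
    MASTER = '/data/ferret/master/'

    SLAVES.each do |host|
      stamp  = Time.now.strftime('%Y%m%d%H%M%S')
      target = "/data/ferret/snapshots/#{stamp}"

      # copy the index into a fresh snapshot directory on the slave
      system("rsync -a --delete #{MASTER} #{host}:#{target}/") or
        raise "rsync to #{host} failed"

      # repoint the live symlink; searchers always open
      # /data/ferret/current
      system("ssh #{host} 'ln -sfn #{target} /data/ferret/current'") or
        raise "symlink flip on #{host} failed"
    end

Whether a mongrel holding an open Ferret reader survives that symlink
flip gracefully is really my bonus question below.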

My last company used this approach for a Lucene index, with Lucene running
behind a custom search webapp not that different from Solr, so the
user-facing webservers retrieved search results over HTTP from the search
webapp. We had to write some fairly intricate scripting to stop and start
the search webapps whilst we copied the master index out to the slaves.


Does anyone have any experience with this kind of approach? Is there some
standard way to distribute and run multiple instances of an index?

Bonus question - how upset does a running mongrel get when the Ferret index
it talks to is suddenly replaced by a new set of files?

Thanks for any insights on how best to solve this.

Patrick Wright

