We're running Ferret and acts_as_ferret in our production environment. We have multiple mongrels talking to a single index on a separate (virtual) server over DRb. This is working OK for now, as our index updates are fairly infrequent, but I'm concerned about the lack of redundancy/scalability in this layout.
Our index won't get too big - maybe 100k indexed objects, each no more than 500 words - but it needs to be highly available, like the rest of our site (www.caring.com, if you are interested).

One alternate approach I'm considering would be to do something like this:

- disable the after_save callbacks in acts_as_ferret in production mode, to stop multiple mongrels writing to the index.
- move all index writes to a centralized batch process which interacts with a 'master' index.
- periodically clone out the master index to slave indexes located locally to each user-facing Rails instance (not using DRb).

My last company used this approach for a Lucene index, with Lucene running behind a custom search webapp not that different from Solr, so the user-facing webservers retrieved search results over HTTP from the search webapp. We had to write some fairly intricate scripting to stop and start the search webapps whilst we copied out the master index to the slaves.

Does anyone have any experience with this kind of approach? Is there some standard way to distribute and run multiple instances of an index?

Bonus question - how upset does a running mongrel get when the ferret index it talks to is suddenly replaced by a new set of files?

Thanks for any insights on how best to solve this.

Thanks,
Patrick Wright

_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
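
For illustration, the "clone out the master index" step could be sketched roughly as below. This is only a sketch under assumptions, not something acts_as_ferret provides: the `publish_index` helper, the `index-<version>` directory naming, and the `current` symlink are all hypothetical names, and it assumes a POSIX filesystem where renaming one symlink over another is atomic. It copies the master index into a fresh directory and then repoints `current`, so the live directory is never modified in place. It does not answer the bonus question: a running mongrel would presumably still need to reopen its index to see the new files.

```ruby
require "fileutils"

# Hypothetical helper: copy the master index into a new versioned directory
# under slave_root, then atomically repoint the `current` symlink at it.
# Older versions are left on disk so they can be cleaned up (or rolled back to) later.
def publish_index(master_dir, slave_root, version)
  release = File.join(slave_root, "index-#{version}")
  raise ArgumentError, "#{release} already exists" if File.exist?(release)

  FileUtils.mkdir_p(slave_root)
  FileUtils.cp_r(master_dir, release)   # full copy; the live index is untouched

  # Build the new link under a temporary name, then rename it over `current`.
  # On POSIX systems rename replaces the old link in a single step.
  tmp_link = File.join(slave_root, "current.tmp")
  FileUtils.rm_f(tmp_link)
  File.symlink(release, tmp_link)
  File.rename(tmp_link, File.join(slave_root, "current"))

  release
end
```

A batch job on each web server might call something like `publish_index("/mnt/master/index", "/var/search", Time.now.strftime("%Y%m%d%H%M%S"))` (both paths are made up), with the local Ferret index path pointing at `/var/search/current`.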

