Hi - I'm technical lead at Lingr (http://www.lingr.com), a chatroom-based social networking site. We currently have several million user utterances stored in MySQL, and we're looking to build local search functionality. I've played around with aaf (acts_as_ferret) and I really like it, but I have some questions.

1. Is anyone out there using aaf to index a corpus of this size? If so, how has your scaling experience been?

2. We would be running one central aaf server instance, talking to it over DRb from our many application servers. We add tens of thousands of utterances per day; is anyone out there indexing this many items daily over DRb? If so, how has your experience been in terms of stability? (A config sketch for the setup I have in mind follows after this list.)

3. All of our utterance data is in UTF-8, but we don't know what language a particular utterance is in; it's common to have both Latin and non-Latin text even in the same room. How can I index both kinds of strings effectively within the same model field index? (See the locale/analyzer sketch below.)

4. Any suggestions on how to build the initial index in an offline way? I suspect it will take many hours to build. (A rake-task sketch follows.)

5. I suspect we will have to disable_ferret(:always) on our utterance model and then update the index manually on some periodic basis (cron job, backgroundrb worker, etc.), because we don't want to introduce any delay into storing a new utterance, which happens in realtime during a chat session. Has anyone here done this? (See the last sketch below.)
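
For (2), this is the shape of the setup I have in mind, based on my reading of the aaf DRb docs. The hostname, port, and pid file path are placeholders from my draft config, so treat this as a sketch rather than a known-good setup:

    # config/ferret_server.yml -- per-environment DRb server settings
    production:
      host: search.lingr.com       # placeholder; our dedicated index box
      port: 9010
      pid_file: log/ferret_server.pid

    # then, on the index box:
    #   script/ferret_server -e production start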
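
On (3), the only lead I have so far is that Ferret's tokenizer is locale-sensitive, so I was planning to force a UTF-8 locale and pass an explicit analyzer, roughly as below. Whether StandardAnalyzer then does anything sensible with mixed Latin/CJK text is exactly what I'm unsure about, and the two-hash acts_as_ferret call is just my reading of the aaf source:

    # environment.rb: set a UTF-8 locale before any index is opened,
    # since Ferret's tokenization depends on the locale.
    Ferret.locale = 'en_US.UTF-8'

    class Utterance < ActiveRecord::Base
      # The second hash should be passed through to Ferret::Index::Index.new;
      # :body is our utterance text column.
      acts_as_ferret({ :fields => [:body] },
                     { :analyzer => Ferret::Analysis::StandardAnalyzer.new })
    end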
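
For (4), unless there is a better way, I was going to wrap aaf's rebuild_index in a one-off rake task and run it on a spare box (the namespace and task name are mine):

    # lib/tasks/search.rake
    namespace :lingr do
      desc 'Build the initial Ferret index for utterances'
      task :build_utterance_index => :environment do
        Utterance.rebuild_index   # re-indexes every Utterance from scratch
      end
    end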

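To make (5) concrete, here is roughly the periodic job I have in mind, run from cron via script/runner. The indexed_at column is our own addition for tracking what has already been indexed, and I'm assuming ferret_update can be invoked manually even though the automatic save-time callbacks are disabled; that assumption is one of the things I'd like confirmed:

    # Periodic indexer (sketch). indexed_at is a column we would add
    # ourselves; NULL means the row is not yet in the Ferret index.
    Utterance.find(:all,
                   :conditions => 'indexed_at IS NULL',
                   :limit => 1000).each do |utterance|
      utterance.ferret_update                        # write to the index
      utterance.update_attribute(:indexed_at, Time.now.utc)
    end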

Any advice is appreciated!

Best Regards,
Danny Burkes