Ryan King wrote:
> Yes. I have server models with more than 4M rows, all indexed with AAF.
> My experience has been that AAF is very stable. Most of my challenges
> have been with ferret upgrades breaking index format.
>
> Yes. Rock solid.
>

Great to know, thanks very much!

>> 3. All of our utterance data is in UTF8, but we don't know what
>> language a particular utterance is in. It's common to have both latin
>> and non-latin text even in the same room. How can I index both types of
>> strings effectively within the same model field index?
>
> Why not just use UTF8?
>

Sorry, I should have been clearer: what I was referring to was not
storage, but rather tokenization. My understanding is that many people
use a simple regex-based one-token-per-character tokenizer for non-Latin
languages, but, since our languages are mixed, I wasn't sure what
approach to tokenization would be best. Clearly we can't use that
one-token-per-character analyzer on Latin text, right?

> However, if your search system isn't online (ie, the feature isn't
> enabled in the front end), why would you need anything special? The
> AAF DRb server can serve requests while you're running a rebuild (as
> long as you don't use the current rebuild_index method).
>

Perhaps I'm remembering incorrectly, but my recollection is that the
first time I created a new record for a model that uses aaf, the whole
instance blocked while aaf was creating the index. Did I remember that
wrong? If that is the way it works, then clearly I need to start the
rebuild from outside of the application, before any users can create
new model objects. Further, are you saying that model creations during
the rebuild won't block (I guess they detect that a rebuild is already
happening and just return immediately)?

>> 5. I suspect we will have to disable_ferret(:always) on our utterance
>> model, then update the index manually on some periodic basis (cron job,
>> backgroundrb worker, etc.). The reason for this is that we don't want
>> to introduce any delay into the process of storing a new utterance,
>> which occurs in realtime during a chat session. Anyone have experience
>> doing this?
>
> It's pretty fast.
> The only time you'd see a slowdown is when you
> encounter a lock in the DRb server.
>

And what would cause that? Do normal model creates cause a lock?

Thanks so much for your info so far and for any further advice you can
give me.

Best Regards,
Danny

-- 
Posted via http://www.ruby-forum.com/.
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
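
To make the tokenization question above concrete, here is a minimal sketch of one possible mixed-script approach: emit one token per character for CJK scripts and one token per word for everything else. It is plain Ruby and deliberately does not touch Ferret's analyzer API; wiring it into an analyzer is left out. The names CJK_SCRIPTS, TOKEN and mixed_tokens are hypothetical, and the choice of which scripts count as "one token per character" is an assumption, not something the thread settles.

```ruby
# Scripts treated as "one token per character" (an assumption for this sketch).
CJK_SCRIPTS = '\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}'

# First alternative: exactly one CJK character.
# Second alternative: a run of letters/digits that are NOT in those scripts,
# using Ruby's character-class intersection (&&).
TOKEN = /[#{CJK_SCRIPTS}]|[\p{L}\p{N}&&[^#{CJK_SCRIPTS}]]+/

# Returns an array of lowercased tokens for a UTF-8 string.
# Punctuation and whitespace match neither alternative and are skipped.
def mixed_tokens(text)
  text.scan(TOKEN).map(&:downcase)
end

p mixed_tokens("Hello 世界 foo2bar")
```

With this split, Latin words stay whole while each CJK character becomes its own term, so both kinds of text could in principle share one field. Whether that gives acceptable search quality for a given mix of languages would need to be checked against real utterances.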

