On Mon, Apr 23, 2007 at 04:42:50PM +0200, Danny Burkes wrote:
> > Some heuristics to get an idea about which language you're working on
> > right now might be a good idea to select a proper analyzing algorithm.
> >
>
> I think that's probably the only way to do this effectively, but how can
> I specify a particular analyzer on a per-model-instance basis? I only
> recall seeing that aaf allows analyzer specification on a per-model
> basis.
>
> Or perhaps you were thinking of a single analyzer that does heuristics
> itself and decides how to tokenize based on the input string?

Yeah, that's what I was thinking of. That uber-analyzer could determine
the language (or type of language) used in a document and then delegate
to a specialized analyzer. The same would have to be done for query
analysis. Because query strings are short, it would be good if the
application could supply a hint there (e.g. from the user profile or the
UI language in use).

[..]

> > Unfortunately the DRb server doesn't realize this, yet. As Ryan wrote, I
> > plan to rework the re-indexing stuff in the near future, most likely
> > then there will be some kind of index rotation and a queue remembering
> > model updates that occurred while a rebuild is going on.
> >
>
> So how would you suggest that we ever get the index "caught up"? The first
> rebuild_index will probably take many hours, and, while that's building,
> thousands of new model instances will be created. Since we can't turn
> on automatic indexing (at least until the index is up to date), how do
> we get the index up to date?

I'd just remember the time I started rebuild_index and, after it has
finished, index all records that have been created since then, not with
another rebuild_index but by reading and indexing them one by one.

Or even better, let your background indexer (the one that will later
handle the regular index updates) do this: mark all records older than
the rebuild timestamp as 'already indexed', then start the indexer so it
picks up the records that are not yet indexed.

regards,
Jens

--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[EMAIL PROTECTED] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
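
A minimal sketch of the delegating analyzer idea from the reply above, in
plain Ruby. Everything here is an assumption made for illustration: the
class name DelegatingAnalyzer, the ANALYZERS table, the crude CJK
heuristic and the hint: parameter are all hypothetical, and none of it
is Ferret or acts_as_ferret API. Wiring this into a real Ferret analyzer
is not shown.

```ruby
# Hypothetical sketch: none of these names come from Ferret or acts_as_ferret.
class DelegatingAnalyzer
  # analyzers: Hash mapping a language symbol to a callable that
  # turns a string into an array of tokens.
  def initialize(analyzers, default: :whitespace)
    @analyzers = analyzers
    @default = default
  end

  # Crude heuristic (an assumption): call the text CJK when at least a
  # third of its non-space characters are Han, Hiragana or Katakana.
  def detect(text)
    chars = text.gsub(/\s+/, '')
    return @default if chars.empty?
    cjk = chars.scan(/[\p{Han}\p{Hiragana}\p{Katakana}]/).size
    cjk * 3 >= chars.size ? :cjk : @default
  end

  # hint: optional language supplied by the application (user profile,
  # UI language). Unknown or missing hints fall back to detection.
  def tokenize(text, hint: nil)
    lang = @analyzers.key?(hint) ? hint : detect(text)
    @analyzers.fetch(lang).call(text)
  end
end

ANALYZERS = {
  # Lowercase and split on whitespace.
  whitespace: ->(text) { text.downcase.split(/\s+/).reject(&:empty?) },
  # Overlapping character bigrams for scripts without word separators.
  cjk: lambda { |text|
    chars = text.gsub(/\s+/, '').chars
    chars.size < 2 ? chars : chars.each_cons(2).map(&:join)
  }
}.freeze

analyzer = DelegatingAnalyzer.new(ANALYZERS)
analyzer.tokenize('Hello World')          # detection picks :whitespace
analyzer.tokenize('ab', hint: :cjk)       # the hint overrides detection
```

The hint matters mostly for queries: a two-word query gives the
heuristic very little to work with, so the application's knowledge of
the user is likely the better signal there.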

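The catch-up approach from the reply above (remember the rebuild start
time, then index newer records one by one) could look roughly like the
following. This is a toy, in-memory sketch under stated assumptions:
Record, ToyIndex and rebuild_and_catch_up are hypothetical stand-ins,
timestamps are plain integers, and the block simulates records being
created while the rebuild runs. In a real application the rebuild would
be the model's rebuild_index and the records would come from the
database.

```ruby
# Hypothetical in-memory stand-ins, not Ferret or acts_as_ferret API.
Record = Struct.new(:id, :created_at)

class ToyIndex
  attr_reader :ids

  def initialize
    @ids = []
  end

  # Full rebuild: replace the index contents with the given records.
  def rebuild(records)
    @ids = records.map(&:id)
  end

  # Index a single record, skipping ids that are already present.
  def add(record)
    @ids << record.id unless @ids.include?(record.id)
  end
end

# Rebuild from a snapshot, then index whatever arrived in the meantime.
# The optional block stands in for the long-running rebuild window.
def rebuild_and_catch_up(index, records, now)
  started_at = now           # remember when the rebuild started
  snapshot = records.dup     # the rebuild only sees records existing now
  yield if block_given?      # new records may be created during the rebuild
  index.rebuild(snapshot)
  # Catch up one by one on everything created at or after the start time.
  records.select { |r| r.created_at >= started_at }.each { |r| index.add(r) }
  index
end

records = [Record.new(1, 100), Record.new(2, 200)]
index = rebuild_and_catch_up(ToyIndex.new, records, 300) do
  records << Record.new(3, 350) # created while the rebuild was running
end
p index.ids # prints [1, 2, 3]
```

Using >= for the comparison and making add idempotent means a record
created at the exact moment the rebuild started is indexed at most once
rather than being missed.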
