Hi!

On Wed, Jan 17, 2007 at 01:05:14PM +0700, Stian Haklev wrote:
[..]
> but is there a way I could easily configure the
> "tokenizing" behaviour (let me know if my terminology is wrong) to
> split for example "applications/entries" into two words, searchable by
> themselves?
Your terminology is correct: the tokenizer is responsible for splitting document content into single terms. You can get an idea of how this works at
http://ferret.davebalmain.com/api/classes/Ferret/Analysis.html

If you want to use a custom tokenizer, you'll have to write your own analyzer, which then makes use of that tokenizer. Don't be afraid, this is really easy:

  class MyAnalyzer < Ferret::Analysis::Analyzer
    def token_stream(field, str)
      StemFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)))
    end
  end

(from http://ferret.davebalmain.com/api/classes/Ferret/Analysis/Analyzer.html)

Hope this gets you started.

Cheers,
Jens

--
webit! Gesellschaft für neue Medien mbH         www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer          [EMAIL PROTECTED]
Schnorrstraße 76                                Tel +49 351 46766  0
D-01069 Dresden                                 Fax +49 351 46766 66

_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
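As a quick illustration of the splitting rule such a custom tokenizer would need to implement (this is plain Ruby, not the Ferret API, and `simple_tokenize` is just a made-up name): lowercasing the input and splitting on any run of non-letter characters turns "applications/entries" into two separately searchable terms.

  # Sketch of the desired tokenizing behaviour: downcase, then treat
  # every run of non-letter characters (slashes, hyphens, spaces, ...)
  # as a term boundary. Empty fragments are dropped.
  def simple_tokenize(str)
    str.downcase.split(/[^a-z]+/).reject(&:empty?)
  end

  puts simple_tokenize("applications/entries").inspect
  # => ["applications", "entries"]

A real analyzer would express the same rule with one of Ferret's tokenizer classes inside token_stream, as in the example above.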

