> ... I think the simplest way to make it work is to use a RegexpAnalyzer that
> takes every character for a token.
David's code uses StandardAnalyzer. It's implemented in C, and it is fast and advanced. I don't want to reinvent the wheel for things it already handles (e.g. www.example.com, emails, punctuation, etc.). PerFieldAnalyzer is not a good solution for me either, because I have mixed text. Persian is very similar to English in punctuation (it has some extra marks), word formation, and even stems.

