> ... I think the simplest way to make it work is to use a RegexpAnalyzer that
> takes every character for a token.

David's code uses StandardAnalyzer, which is implemented in C and is fast
and full-featured. I don't want to reinvent the wheel (e.g. handling of
www.example.com, email addresses, punctuation, etc.). PerFieldAnalyzer is
not a good solution for me either, since my text is mixed. Persian is very
similar to English in punctuation (it has some extra marks), word
formation, and even stemming.
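
To illustrate the trade-off, here is a rough plain-Ruby sketch. It does
not call Ferret at all: char_tokens and word_tokens are hypothetical
helpers, and the word pattern is only my assumption of what "keep URLs
and emails together" might look like, not what StandardAnalyzer really
does internally.

```ruby
# Hypothetical helpers in plain Ruby; no Ferret API is used here.

# One token per non-whitespace character, as the quoted suggestion describes.
def char_tokens(text)
  text.scan(/\S/)
end

# Rough word-level split (an assumption, not StandardAnalyzer's real rules):
# runs of Unicode letters, marks and digits, optionally joined by "." or "@"
# so that host names and email-like strings stay in one token.
def word_tokens(text)
  text.scan(/[\p{L}\p{M}\p{N}]+(?:[.@][\p{L}\p{M}\p{N}]+)*/)
end

sample = "see www.example.com now"
p char_tokens(sample).length  # many single-character tokens
p word_tokens(sample)         # far fewer, word-level tokens
```

The per-character version loses word boundaries, host names and so on,
which is the part I would rather not rebuild by hand.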


-- 
Posted via http://www.ruby-forum.com/.
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
