Re: [Ferret-talk] Ferret and non latin characters support

Reza Yeganeh Sun, 22 Apr 2007 16:46:47 -0700

> ... I think the simplest way to make it work is to use a RegexpAnalyzer
> that takes every character for a token.


David's code uses StandardAnalyzer. It's implemented in C, and it is
fast and advanced. I don't want to reinvent the wheel (e.g. handling of
www.example.com, emails, punctuation, etc.). PerFieldAnalyzer is not a
good solution for me either, because I have mixed text. Persian is very
similar to English in punctuation (it has some extra marks), word
formation, and even stems.
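
To make the difference concrete, here is a minimal sketch in plain Ruby.
It does not use Ferret's API at all; `word_tokens` and `char_tokens` are
hypothetical helper names, and the sample string is made up. It only
shows that mixed Persian/English text can be split into words with one
Unicode-aware pattern, whereas a per-character pattern throws the word
boundaries away.

```ruby
# Hypothetical helpers for illustration only; not part of Ferret.

# Word-level tokens: runs of letters, combining marks and digits.
# \p{L}, \p{M} and \p{N} are Unicode properties, so the same pattern
# matches both Latin and Persian words.
def word_tokens(text)
  text.scan(/[\p{L}\p{M}\p{N}]+/)
end

# Character-level tokens, as in the quoted suggestion:
# every non-whitespace character becomes its own token.
def char_tokens(text)
  text.scan(/\S/)
end

mixed = "Ferret و فارسی، search!"

puts word_tokens(mixed).inspect
puts char_tokens(mixed).length
```

With word-level tokens the Persian comma and the exclamation mark are
dropped just like English punctuation would be, which is the behaviour I
would like to keep from StandardAnalyzer.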


-- 
Posted via http://www.ruby-forum.com/.
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
