> So to sum it up, it's not a matter of reinventing the wheel. It's a
> quick hack that will get you imprecise results sometimes, but will
> work with mixed text for sure, since your analyzer doesn't assume any
> "westernisms" to be there when tokenizing text.

I think we're missing the point here. The problem is that David's code
uses StandardAnalyzer, and it works for him but not for me and Phillip.
I have to write my own Analyzer, StemFilter, and StopFilter for Persian.
If StandardAnalyzer works (even if only partially for Persian), I won't
have the extra overhead of using a RegExpAnalyzer for the common
tokenization of Persian and Latin text.
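For illustration, here is a minimal sketch in plain Ruby (no Ferret
dependency) of the kind of regexp-based tokenization a RegExpAnalyzer
performs: splitting on runs of Unicode word characters handles Persian
(Arabic-script) and Latin text alike, without assuming Western
punctuation. The stop-word list is a tiny hypothetical sample, not a
real Persian stop list.

```ruby
# -*- coding: utf-8 -*-

# Hypothetical, incomplete stop-word sample for illustration only.
PERSIAN_SAMPLE_STOPS = ["و", "در", "از"]

def tokenize_mixed(text)
  # \p{L} and \p{N} match letters and digits in any script, so Persian
  # and Latin tokens come out together from the same regexp.
  text.scan(/[\p{L}\p{N}]+/)
end

def filter_stops(tokens)
  tokens.reject { |t| PERSIAN_SAMPLE_STOPS.include?(t) }
end

tokens = tokenize_mixed("این متن Persian و English است.")
p filter_stops(tokens)
```

A real Ferret analyzer would wrap this idea in a `token_stream` method
and chain a stop filter after the tokenizer; the point is only that a
script-agnostic regexp avoids the "westernisms" baked into
StandardAnalyzer's tokenization.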

-- 
Posted via http://www.ruby-forum.com/.
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk