On 2/26/07, neongrau __ <[EMAIL PROTECTED]> wrote:
> hi all!
>
> after hours of trying to find contents with german umlauts i stumbled
> upon a post where someone said ferret won't work with utf-8 on
> windows???
>
> is that really true?
>
> do i really have to iconv everything to iso-8859-15 before indexing and
> do the same with the query to get it working?
The StandardAnalyzer uses your current locale settings to determine
what counts as a letter when tokenizing your data. As far as I was able
to determine, Windows doesn't support UTF-8 locales in C and the
win32 libraries. (I'd love for someone to correct me on this.) What
you can do instead is write a custom analyzer, and UTF-8 should then be
fine. There has been plenty of discussion on creating your own analyzer
in the past:
http://www.ruby-forum.com/search?query=ferret+analyzer&submit=Search
You can also look in the unit tests.
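
To give a rough idea of what a locale-independent tokenizer has to do,
here is a minimal plain-Ruby sketch. It is not Ferret's API:
utf8_tokens is a hypothetical helper name, and the code assumes the
input is a valid UTF-8 string and Ruby 2.4 or later (for Unicode-aware
downcase). Wiring this logic into an actual Ferret analyzer is left out.

```ruby
# Hypothetical helper, not part of Ferret: splits UTF-8 text into
# lowercased tokens made only of Unicode letters. It relies on the
# regexp engine's \p{L} property instead of the C locale, so umlauts
# are treated as letters regardless of the platform's locale support.
def utf8_tokens(text)
  text.scan(/\p{L}+/).map { |word| word.downcase }
end

puts utf8_tokens("Äpfel und Bücher über Größe").inspect
```

Because both the indexed text and the query would go through the same
tokenizer, a search for "bücher" should line up with the token produced
from "Bücher" without converting anything to iso-8859-15 first.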
--
Dave Balmain
http://www.davebalmain.com/