I was wondering about methods for analyzing various languages and that what I understand (please correct me if I wrong):

1. To analyze non English language I need to use specific analyzer.
Link to already available contributions in sandbox http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/analyzers/

2. There are some universal systems in the world such as
   Ispell - http://ficus-www.cs.ucla.edu/geoff/ispell.html
   Snowball - http://snowball.tartarus.org/
   Hunspell - http://sourceforge.net/projects/hunspell
(This info I got from Postgresql documentation chapter about full-text-search which is part of core distribution from version 8.3 (beta 2 for that moment) ) Here is link too: http://www.postgresql.org/docs/8.3/static/textsearch-dictionaries.html

Each of such engines has various dictionaries for various languages.

3. Snowball is already supported by Lucene.
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/snowball

And here goes my question :) ...

Can I use Ispell dictionaries with Lucene? And if no, then why? Are there some juridical issues with it or just no implementation exists jet? If no implementation, then maybe there is some motivation not to support it (maybe it just not worth to do), or because of complexity
or no one ever tried jet?

The problem is that I need to support some languages not listed in snowball supported dictionaries list, but existing in Ispell, so
it is naturally to try to use Ispell in this situation.

P.S. Postgresql full text search has it, so Lucene probably need one too, I think.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to