Hi Marcin,

I've tried to create a Portuguese dictionary for LanguageTool, but
without success. Can you please help me?
Do you use Lametyzator to read the FSA file generated by fsa_build?
(http://www.eti.pg.gda.pl/katedry/kiw/pracownicy/Jan.Daciuk/personal/fsa.html)

How can I create an input file for fsa_build? How should it be formatted?

Thanks!
William

On 3/6/07, Marcin Miłkowski <[EMAIL PROTECTED]> wrote:

[EMAIL PROTECTED] wrote:

> You speak about "your solution".
> What is it? Is it a morphological analysis tool or a grammar checker?

Dictionary-based POS-tagger for LanguageTool, using finite-state
automata format for storing data (one of the most efficient dictionary
formats, in terms of speed and space). Most languages supported by LT
use such dictionaries now.
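
To make the idea concrete, here is a rough sketch of what a dictionary-based tagger does at lookup time: each word form maps to one or more (base form, POS tag) readings. This is only an illustration, not the LanguageTool code. The class name, the Reading type, and the example tags are made up for the sketch, and a plain HashMap stands in for the finite-state automaton, which stores the same mapping far more compactly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not LanguageTool's tagger classes.
public class DictionaryTaggerSketch {

    // One possible reading of a word form: its base form plus a POS tag.
    public static final class Reading {
        public final String lemma;
        public final String posTag;

        public Reading(String lemma, String posTag) {
            this.lemma = lemma;
            this.posTag = posTag;
        }

        @Override
        public String toString() {
            return lemma + "/" + posTag;
        }
    }

    // Assumption: a HashMap replaces the finite-state automaton for brevity.
    private final Map<String, List<Reading>> entries = new HashMap<>();

    // Registers one reading for a word form; a form may have several.
    public void add(String form, String lemma, String posTag) {
        entries.computeIfAbsent(form, k -> new ArrayList<>())
               .add(new Reading(lemma, posTag));
    }

    // Returns all readings of a form, or an empty list for unknown words.
    public List<Reading> tag(String form) {
        return entries.getOrDefault(form, Collections.emptyList());
    }

    public static void main(String[] args) {
        DictionaryTaggerSketch tagger = new DictionaryTaggerSketch();
        tagger.add("walks", "walk", "VBZ"); // example tags, chosen for the sketch
        tagger.add("walks", "walk", "NNS");
        System.out.println(tagger.tag("walks"));
        System.out.println(tagger.tag("xyzzy"));
    }
}
```

An ambiguous form such as "walks" simply gets both readings back; choosing between them is left to later stages.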

I use a combination of scripts that reuse the 12dicts word lists and the
AGID files to get part-of-speech information; the scripts then clean the
data, merge in some entries I added manually, etc. The overall solution is
rather hybrid, but fast and efficient. There are bugs, but that's life.
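
In outline, the merging step could look something like the sketch below. It is not the actual scripts: the class and method names are invented, the inputs are assumed to be already parsed into maps (the real layout of infl.txt and part-of-speech.txt is not reproduced here), and the tab-separated "form, base form, tag" output line is a hypothetical format chosen for the example. It also gives every inflected form the tag of its base form, which is cruder than a real tagger dictionary needs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch of the merge-and-clean step; not the real scripts.
public class DictionaryMergeSketch {

    // Builds sorted, de-duplicated "form<TAB>lemma<TAB>tag" lines.
    public static TreeSet<String> merge(Map<String, List<String>> inflections,
                                        Map<String, String> posByLemma,
                                        List<String> manualEntries) {
        TreeSet<String> lines = new TreeSet<>();
        for (Map.Entry<String, List<String>> e : inflections.entrySet()) {
            String lemma = e.getKey().trim();
            String tag = posByLemma.get(lemma);
            if (tag == null) {
                continue; // cleaning: drop lemmas with no POS information
            }
            lines.add(lemma + "\t" + lemma + "\t" + tag);
            for (String form : e.getValue()) {
                String f = form.trim();
                if (!f.isEmpty()) { // cleaning: drop blank forms
                    lines.add(f + "\t" + lemma + "\t" + tag);
                }
            }
        }
        // Manually maintained entries are appended as ready-made lines.
        for (String manual : manualEntries) {
            if (!manual.trim().isEmpty()) {
                lines.add(manual.trim());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, List<String>> infl =
            Collections.singletonMap("walk", Arrays.asList("walks", "walked"));
        Map<String, String> pos = Collections.singletonMap("walk", "V");
        List<String> manual = Arrays.asList("LT\tLT\tN");
        for (String line : merge(infl, pos, manual)) {
            System.out.println(line);
        }
    }
}
```

Sorting and de-duplicating up front is deliberate: tools that compile a word list into an automaton generally expect sorted input without repeated lines.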

> Only for English or also for other languages? Where is it?

This is part of LanguageTool (Java version). All sources are in CVS
(look in resources/en). Two files must be downloaded separately
(infl.txt and part-of-speech.txt, from 12dicts and AGID), but that should
be noted in the sources.

We could of course release it separately if anyone else needs a nicely
wrapped package instead of dirty CVS ;)

Best,
Marcin


