Hi,
I've successfully extracted a Swedish word list from
apertium-sv-da.sv.dix as follows:

lt-expand apertium-sv-da.sv.dix | cut -f1 -d':' > apertium-sv-da.sv.dix.expanded

Going through the list I found lots of errors. To get a shorter list of
misspelled words, I excluded the words present in the Aspell dictionary.
The result was still quite long, and worse, it consisted mostly of
correctly spelled words that Aspell does not know. Hunspell (used by e.g.
OpenOffice/LibreOffice) knows many more words. Does anyone happen to know
how to extract or obtain Hunspell word lists as text files?
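
The exclusion step described above can be sketched as follows. This is
only an illustration, assuming the reference vocabulary (from Aspell,
Hunspell or anywhere else) is already available as a plain-text file with
one word per line; the function name unknown_words is made up for this
example.

```shell
# Print the words from the candidate list ($1) that do not occur in the
# reference word list ($2). Both files hold one word per line.
# Assumption: the reference list has already been exported as plain text.
unknown_words() {
    # -F: fixed strings, -x: match whole lines only, -v: invert the match
    sort -u "$1" | grep -v -x -F -f "$2"
}
```

It could be called as
"unknown_words apertium-sv-da.sv.dix.expanded known-words.txt", where
known-words.txt is a placeholder name. The comparison is exact and
case-sensitive, so capitalised variants count as unknown.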

Looking at the list of misspellings I realised that many of the "errors"
are variants added for analysis only (r="LR"). Is there an easy way to
expand only the variants that are used for generation? Such a procedure
would produce a much shorter and more correct list.
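
One possible approach, sketched under an assumption that should be checked
against the actual output: lt-expand appears to mark entries restricted
with r="LR" by printing ":>:" between surface form and analysis (and ":<:"
for r="RL"), instead of the plain ":". If that holds, the analysis-only
variants can be dropped before cutting out the surface forms. The function
name generation_forms is made up for this example.

```shell
# Read lt-expand output on stdin and print the unique surface forms that
# are not restricted to analysis.
# Assumption: analysis-only (r="LR") entries contain the marker ":>:".
generation_forms() {
    grep -v ':>:' | cut -d':' -f1 | sort -u
}
```

It could be used as
"lt-expand apertium-sv-da.sv.dix | generation_forms > apertium-sv-da.sv.dix.expanded".
Entries marked ":<:" are kept, since those are the generation-only
variants.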

Anyhow, I continued by checking the list in word-processing programs to
find the real errors, and found quite a lot. I have already corrected
some of them in the sv-da pair. What about the separate language
dictionary? Should I merge my corrections into it somehow? What's the
recommended procedure when improving or adding to an existing language
pair?

By the way, how do I use the separate single-language monodixes? Can they
be used for existing pairs or only when creating new pairs? What's the
recommendation for new pairs? The "Apertium New Language Pair HOWTO"
still assumes that the monodixes are made exclusively for the new
pair.

Yours,
Per Tunedal



_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
