I set the case options like this:

fsa.dict.speller.ignore-all-uppercase=false
I set this to false because of correct words like LSD. For Dutch, it would be great if there were a maximum length for ignored uppercase words: LSD is ignored, AMSTERDAM is not.

fsa.dict.speller.ignore-camel-case=true
These are all proper names, so they can be ignored.

fsa.dict.speller.convert-case=false
I will try setting this to true, but it is unclear from the text what a dictionary entry like LSD will do then.
It appears to work as expected ;-) : when words are uppercased in the dictionary, it will require that; if not, it will suggest the uppercased version.

About the dash at the end of a word: Details tells me the dash is kept to the word in the tokenizer in Dutch:

<S> <S> Een[een/DTe,een/NM,een/NM1,een/NN1d,] afdelings[afdelings/null,] of[of/CJo,] ander[ander/AJn,] uitje[uitje/NN1r,ui/NN1r,] .[</S>,]

In the English version (that I am testing with):

<S> Een[Een/null,B-NP-singular] afdelings[afdelings/null,E-NP-singular] -[-/null,O] of[of/IN,B-PP] ander[Ander/NNP,B-NP-singular] uitje[uitje/null,E-NP-singular] .[./.,</S>,O]

So the mistake is in the mix of languages. I think that makes it necessary to test in the Dutch version, and to touch the code for this.

> On 2014-09-03 20:06, R.J. Baars wrote:
>> I replaced the English dictionary with the newly generated Dutch one.
>>
>> Running the complete list of wrong and correct words through LT works.
>> The output is less structured than I would like though. When there is no
>> suggestion, the entire suggestion line is missing; also the word is not
>> recognizable in the output, just underlined, which is more difficult to
>> process. I will have to build a program around this to get the data I
>> need to judge the suggestions. Task for tomorrow.
>>
>> But it works, with the following conclusions:
>> - there are still a lot of words that should have been accepted (missing
>> compounding parts in Hunspell)
>
> Daniel is working on that for German.
>
>> - numbers as a whole (0123456) should be skipped, but ranking numbers
>> like 100e and mp3, F16 should be checked. As far as I could see, there
>> are no options for that.
>
> Interesting. This is probably a bug, as I don't expect numbers to be
> checked by a spell checker.
>
>> - When a word is completely in upper-case (UPPERCASE) (which is not in
>> the dictionary and set not to be accepted), the alternatives Uppercase
>> and uppercase are not suggested.
>
> This is probably because your dictionary is case-sensitive.
>
>>
>> These are no showstoppers, but a small step back from Hunspell.
>>
>> Maybe some of these are general things, useful to put on the todo-list.
>
> It seems to me that the number checking is a genuine bug. I never had
> this "check words with numbers" option set, so this is why I didn't
> encounter this.
>
> Regards,
> Marcin
>
> ------------------------------------------------------------------------------
> Slashdot TV.
> Video for Nerds. Stuff that matters.
> http://tv.slashdot.org/
> _______________________________________________
> Languagetool-devel mailing list
> Languagetool-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
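
For illustration only, the behaviour wished for in this thread (skip plain numbers, still check tokens like 100e, mp3 and F16, ignore only short all-uppercase words, and offer Uppercase/uppercase for UPPERCASE) could be sketched roughly as follows. This is plain Python, not LanguageTool or morfologik code: the function names and the max_upper_len limit of 3 are assumptions made up for this sketch, and no such option is known to exist in the speller.

```python
def is_plain_number(token):
    """True for tokens made only of ASCII digits, e.g. '0123456'."""
    return token.isascii() and token.isdigit()


def should_skip(token, max_upper_len=3):
    """Decide whether the speller should leave a token alone.

    max_upper_len is a hypothetical setting: all-uppercase words up to
    this length (LSD) are skipped, longer ones (AMSTERDAM) are checked.
    """
    if is_plain_number(token):
        return True
    if token.isalpha() and token.isupper() and len(token) <= max_upper_len:
        return True
    # Mixed tokens such as '100e', 'mp3' or 'F16' fall through and get checked.
    return False


def case_variants(token):
    """For an all-uppercase word, return the two case variants to look up."""
    if token.isalpha() and token.isupper() and len(token) > 1:
        return [token.capitalize(), token.lower()]
    return []


if __name__ == "__main__":
    for t in ["0123456", "100e", "mp3", "F16", "LSD", "AMSTERDAM"]:
        print(t, "skip" if should_skip(t) else "check")
    print(case_variants("UPPERCASE"))
```

The case variants would still have to be looked up in the dictionary before being offered as suggestions; the sketch only generates the candidates.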