On 04/21/2013 03:11 AM, Jaume Ortolà i Font wrote:
> 2013/4/21 Andriy Rysin <[email protected]>
>
>> 1) I would like to treat several apostrophes equally (apostrophes are part of the word in Ukrainian), e.g. in the dictionary and rules I could use ' (0x27), but I would like to be able to parse text that has U+2019 (and potentially U+02BC) the same way. I guess I could do a simple replace in the word tokenizer, but I was wondering if there's a better way.
>
> This is what is done in Catalan. So far I have found no problem.
>
> Jaume

Thanks, will try that.

Another one: what's the recommended way to store knowledge about alternative spellings of a word, e.g. color vs colour? It looks like it would make sense to encode this relation in the dictionary, so that we don't have to introduce a regex for each alternative spelling and repeat it multiple times in the rules. But I looked at the English module, and it looks like this relation is not present in the dictionary but is instead hardcoded in the rules.
Thanks,
Andriy
_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
