Re: equivalent and optional characters in words

Jaume Ortolà i Font Sun, 21 Apr 2013 00:12:02 -0700

2013/4/21 Andriy Rysin <[email protected]>

> 1) I would like to treat several apostrophes equally (apostrophes are
> part of the word in Ukrainian), e.g. in dictionary and rules I could use
> ' (0x27) but I would like to be able to parse text that has U+2019 (and
> potentially U+02BC) the same way, I guess I could do a simple replace in
> word tokenizer but I was wondering if there's a better way
>


A simple replace in the word tokenizer is what is done for Catalan. So far I have found no problems with it.

Jaume


_______________________________________________
Languagetool-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
