On Fri, Jun 18, 2010 at 7:11 PM, Lance Norskog <goks...@gmail.com> wrote:
> Indeed. Also, it should be possible to output multiple synonyms based
> on the mapping: word_with_umlaut should become word_with_u and
> word_with_ue as synonyms. (Ok, maybe this example is wrong, but it
> illustrates the idea.)
>

I don't think we should do this. How many tokens would üüüüüüüüüüüü make?
(Such malformed input exists in the wild, e.g. someone spills beer on their
keyboard and the key gets sticky.)

--
Robert Muir
rcm...@gmail.com
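
To put a rough number on the question above: if every umlaut independently expands to two spellings, the number of emitted variants doubles with each umlaut in the word. The following is only a sketch of that arithmetic, not how any Lucene or Solr filter actually behaves; the `VARIANTS` table and both function names are hypothetical, invented for this illustration.

```python
from itertools import product

# Hypothetical mapping (an assumption for illustration): each umlaut
# expands to two candidate ASCII spellings.
VARIANTS = {"ü": ("u", "ue"), "ö": ("o", "oe"), "ä": ("a", "ae")}


def count_variants(word):
    """Count the spelling combinations without materializing them."""
    count = 1
    for ch in word:
        # Characters outside the table contribute exactly one choice.
        count *= len(VARIANTS.get(ch, (ch,)))
    return count


def expand(word):
    """Build every combination of per-character spellings."""
    choices = [VARIANTS.get(ch, (ch,)) for ch in word]
    return ["".join(parts) for parts in product(*choices)]


print(expand("müller"))          # ['muller', 'mueller']
print(count_variants("ü" * 12))  # 4096
```

Under this assumed two-way mapping, a run of twelve umlauts yields 2^12 = 4096 variants from a single input token, which is the blow-up the objection points at.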