On Sat, Jun 8, 2013 at 4:02 PM, Stephan Stiller <stephan.stil...@gmail.com> wrote:
>> http://www.unicode.org/reports/tr38/ gives a good summary of the
>> possibilities.
>
> Which and where?

Section 3.7.1, "Simplified and Traditional Chinese Variants", talks about converting between Simplified and Traditional Chinese.

>> Trying to "fold" from one locale to another, which is what folding from
>> traditional to simplified would be, is not a good idea. Best practice is to
>> bear in mind the locale being used, and do information retrieval on a
>> locale-by-locale basis.
>
> What do you mean?
>
> Put simply: Either you don't let someone search a TW database with
> simplified characters or you convert either the search terms or the
> searched documents internally for the duration of your search, or some
> combination of these options. It is not at all obvious to me what the
> fastest way in a big data context is. There's gotta be research about this.
>
> Stephan
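
To make the options above concrete, here is a rough sketch of the "convert both sides" variant: the documents are folded to simplified once at index time, and each query is folded the same way before matching. Everything in it is an assumption for illustration: the four-entry `T2S` table, the function names, and the substring match are hypothetical stand-ins, and a real system would load variant data such as the `kSimplifiedVariant` field of the Unihan database rather than a hand-written table.

```python
from __future__ import annotations

# Hypothetical, tiny traditional-to-simplified table for illustration only.
# Note that two distinct traditional characters (發, 髮) share one simplified form.
T2S = {"漢": "汉", "語": "语", "發": "发", "髮": "发"}


def fold_to_simplified(text: str) -> str:
    """Replace each traditional character with its simplified form, if known."""
    return "".join(T2S.get(ch, ch) for ch in text)


def build_index(docs: dict[str, str]) -> dict[str, str]:
    """Fold every document body once, at index time."""
    return {doc_id: fold_to_simplified(body) for doc_id, body in docs.items()}


def search(index: dict[str, str], query: str) -> list[str]:
    """Fold the query the same way, then do a plain substring match."""
    folded = fold_to_simplified(query)
    return sorted(doc_id for doc_id, body in index.items() if folded in body)


docs = {"tw1": "漢語字典", "tw2": "頭髮", "tw3": "發現"}
index = build_index(docs)
```

With this toy index, a search for either 汉语 or 漢語 finds `tw1`, but a search for 发 matches both `tw2` and `tw3`, because the fold erases the 發/髮 distinction. That loss is the concern behind handling each locale separately.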