Thanks. Please tell me more about the tables or software that do the conversion. Really appreciate your help.
--- On Mon, 3/7/11, François Schiettecatte <fschietteca...@gmail.com> wrote:

> From: François Schiettecatte <fschietteca...@gmail.com>
> Subject: Re: How to handle searches across traditional and simplifies Chinese?
> To: solr-user@lucene.apache.org
> Date: Monday, March 7, 2011, 5:24 PM
>
> I did a little research into this for a client a while. The character
> mapping is not one to one which complicates things (TC and SC have
> evolved independently) and if you want to do a perfect job you will
> need a dictionary. However there are tables out there (I can dig one
> up for you) that allow conversion from one to the other. So you would
> pick either TC or SC as your canonical Chinese, and just convert all
> the documents and searches to it.
>
> I will stress that this is very much a brute force approach, the
> mapping is not perfect and the two character sets have evolved (much
> like UK and US English, I was brought up in the UK and live in the US).
>
> Hope this helps.
>
> Cheers
>
> François
>
> On Mar 7, 2011, at 5:02 PM, Andy wrote:
>
> > I have documents that contain both simplified and traditional
> > Chinese characters. Is there any way to search across them? For
> > example, if someone searches for 类 (simplified Chinese), I'd like
> > to be able to recognize that the equivalent character is 類 in
> > traditional Chinese and search for 类 or 類 in the documents.
> >
> > Is that something that Solr, or any related software, can do? Is
> > there a standard approach in dealing with this problem?
> >
> > Thanks.
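
To make sure I understand the canonical-form idea in the quoted message, here is a rough Python sketch of how I picture it: pick simplified Chinese as the canonical form, then run both the documents and the queries through the same character table before comparing. The table `TC_TO_SC` and the helpers `to_canonical` and `matches` are names I made up for illustration, and the five entries are only a sample; a real table would be far larger. This is not how Solr does it, just the brute-force idea outside of any search engine.

```python
# Hypothetical, hand-written sample of a traditional -> simplified table.
# A real conversion table would contain thousands of entries.
TC_TO_SC = {
    "類": "类",
    "漢": "汉",
    "語": "语",
    # Two traditional characters can share one simplified form, which is
    # why the mapping is not one to one.
    "發": "发",
    "髮": "发",
}

# str.translate expects a table keyed by code point; maketrans builds it.
_TABLE = str.maketrans(TC_TO_SC)


def to_canonical(text: str) -> str:
    """Replace each traditional character found in the table with its
    simplified form; all other characters pass through unchanged."""
    return text.translate(_TABLE)


def matches(query: str, document: str) -> bool:
    """Naive substring search after normalizing both sides the same way."""
    return to_canonical(query) in to_canonical(document)


if __name__ == "__main__":
    print(matches("类", "分類"))  # simplified query, traditional document
    print(matches("類", "分类"))  # traditional query, simplified document
```

Going from traditional to simplified looks like the easier direction for a plain table, since several traditional characters can collapse into one simplified character, whereas the reverse would need a dictionary to choose between candidates. The cost is what the quoted message warns about: distinct words can become indistinguishable after conversion.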