> > 3. I have started to construct a variant-equivalence
> > table for Chinese characters. But if I put that into
> > the above tonormalize, it will be a very big table.
> > I have thought of doing the mapping when the input code
> > is converted into Unicode (instead of converting them to
> > different variant-equivalent forms, convert them to
> > a chosen variant form). In that way, we only need to
> > modify the big5, gb, jis to Unicode tables. But I am
> > not very sure whether this hack is good or bad.
>
>
> I think this table should be done not in big5 or gb
> form, but in unicode format. Like toupper/tolower.

So, I will contribute the table. There will be a few levels for the operator
to choose from:

1. Simplified and Traditional variants, taken directly from unihan.txt.
2. Variants identified by CCCII, which can also be extracted from
unihan.txt.
3. Similar-meaning characters, which are not identical variant forms but are
very similar in usage or commonly confused by mistake. These would be useful
for search purposes only. This level is built by manually looking up the
dictionary and the character frequency table.
4. Numeric variants, which map all numeric characters to 1, 2, 3, ...
5. Punctuation and full-width alphabets.
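
To make the idea of selectable levels concrete, here is a rough sketch in
Python (not mnogosearch code). The level names, the build_table() and
normalize() helpers, and the handful of sample entries are all made up for
illustration. A real table would be generated from unihan.txt rather than
typed by hand, and only three of the five levels are shown.

```python
# Sketch only: level names and sample entries are hypothetical.
# Each level maps a variant character to its chosen canonical form.

# Level 1 sample: Simplified -> Traditional (two entries for illustration).
SIMP_TRAD = {
    "\u56fd": "\u570b",  # U+56FD -> U+570B
    "\u5b66": "\u5b78",  # U+5B66 -> U+5B78
}

# Level 4 sample: CJK numerals -> ASCII digits.
NUMERIC = {"\u4e00": "1", "\u4e8c": "2", "\u4e09": "3"}

# Level 5: full-width ASCII forms (U+FF01..U+FF5E) -> plain ASCII.
# These sit at a fixed offset of 0xFEE0 from U+0021..U+007E.
FULLWIDTH = {chr(c): chr(c - 0xFEE0) for c in range(0xFF01, 0xFF5F)}

LEVELS = {
    "simp_trad": SIMP_TRAD,
    "numeric": NUMERIC,
    "fullwidth": FULLWIDTH,
}


def build_table(enabled):
    """Merge the enabled levels into one table usable by str.translate.

    Levels listed later override earlier ones if they share a key.
    """
    table = {}
    for name in enabled:
        for src, dst in LEVELS[name].items():
            table[ord(src)] = dst
    return table


def normalize(text, table):
    """Map every character that has an entry; leave the rest untouched."""
    return text.translate(table)


if __name__ == "__main__":
    table = build_table(["simp_trad", "numeric", "fullwidth"])
    print(normalize("\u56fd\u5b66\uff21\uff22\uff23\u4e00\u4e8c\u4e09", table))
```

The operator's choice then only decides which small tables get merged. The
lookup itself stays a single per-character mapping in Unicode, much like
toupper/tolower.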


> > 4. As mnogosearch is an open source project, I have a
> > little difficulty contributing the code directly: I
> > cannot get permission from my boss even if I write
> > the code on my own time. So, before sending you the
> > patch, I would like to hear from you.
>
> Can you hear me?    :-)
>
>
> By the way, just curious...
>
> Why doesn't your boss allow you to contribute to an open
> source project?

I work in a government agency, and my boss has no idea about code and
software. They fear anything out of the ordinary and any added
responsibility. Our contract prohibits us from doing any part-time work, even
unpaid. It is really difficult to get them to sign the paper required to
release the code freely.

Rgs,

Kent Sin

