On Friday, July 12, 2002, at 02:06, Connie Chan wrote:

[..]

p0: I will defer to your understanding of Chinese
'encryption', since I haven't been in that since
the Wade-Giles v. Pinyin debates....

but if it is a simple mapping as you suggest.....

> b855=cdf2
> a456=d5c9
> a454=c8fd
> a457=c9cf
> a455=cfc2
> ......

where everything on the left-hand side is unique, you are fine.
The problem comes when you have, say,

        ff33 -> 1111
and
        ff33 -> 2222

and which one applies is context dependent on some ordering
of characters before or after it....

So as long as they are always a simple one to one
relationship you should be ok.
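
One way to check that up front, as a rough sketch only: load the
table and complain about any left-hand side that shows up twice with
different targets. The @lines list here is made-up sample data (the
ff33 pair is the illustration from above, not real Big5), and
%big5_to_gb and @conflicts are names I picked for the example.

```perl
use strict;
use warnings;

# Hypothetical sample table; the last two entries deliberately share a key.
my @lines = qw(b855=cdf2 a456=d5c9 a454=c8fd ff33=1111 ff33=2222);

my ( %big5_to_gb, @conflicts );
for my $line (@lines) {
    # Split "big5=gb" into its two halves.
    my ( $big5, $gb ) = split /=/, $line, 2;

    # Same left-hand side seen before with a different target:
    # the mapping is not one to one, so record it and keep the first.
    if ( exists $big5_to_gb{$big5} && $big5_to_gb{$big5} ne $gb ) {
        push @conflicts, $big5;
        next;
    }
    $big5_to_gb{$big5} = $gb;
}

print "not one to one: @conflicts\n" if @conflicts;
```

If @conflicts comes back empty, a plain hash lookup is all you need.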

> LHS is the Big5 Char, and RHS is GB2312.
> So I tried to make a hash say
>
> $ch{'b855'} = 'cdf2' ; # something like that .
>
> Then I can operate it like this
> $checkThis = 'b855';
> $newChar =~ s/($checkThis)/$ch{$1}/eg; # dunno it works or not, not tested yet =)
>
> So... any warning here ? =)

here is where I would argue for

        $new_char = ( exists $big_5{$check_this} )
                  ? $big_5{$check_this}
                  : 'UNK';

since the possibility exists that there is a character
in Big5 that is not in GB2312....
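
Wrapped up as a sub it might look something like the sketch below.
The three hash entries are the ones from your table; the sub name
big5_to_gb and the 'ffff' test code are just assumptions for the
example, standing in for any code that has no mapping.

```perl
use strict;
use warnings;

# A few entries from the quoted table; the real table would be much larger.
my %big_5 = ( b855 => 'cdf2', a456 => 'd5c9', a454 => 'c8fd' );

# Return the GB2312 code for a Big5 code, or 'UNK' when no mapping exists.
sub big5_to_gb {
    my ($check_this) = @_;
    return ( exists $big_5{$check_this} ) ? $big_5{$check_this} : 'UNK';
}

# 'ffff' is a hypothetical code that is not in the table.
my @converted = map { big5_to_gb($_) } qw(b855 ffff a454);
print "@converted\n";    # prints: cdf2 UNK c8fd
```

Using exists rather than just testing the value also keeps the
lookup from quietly handing back undef for a missing key.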


ciao
drieux
