string manip on Japanese characters?

John Mooney Mon, 15 Apr 2002 09:18:42 -0700

Hello,
I was wondering if someone could provide some advice on how to tweak a Perl script to 
deal with multibyte (UTF-8 and Shift-JIS) characters from within Perl. I've read a TPJ 
article by Jeff Friedl. I've also searched CPAN and found many different modules 
(IMAP*, UTF-*, and so on) and am a bit confused about the best way to proceed.


Up till now, our scripts have mostly been doing DB queries and printing the results to 
load files. We've been able to get by with setting a locale variable in our Unix 
environment: Perl did no manipulation on the strings, so our data was preserved on print.

A new script will need to do substitution and transliteration (s///, tr///, etc.) and 
other string manipulation: for example, removing \012 and \015 (LF and CR) characters 
and stripping leading and trailing whitespace.
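
To make the cleanup described above concrete, here is a minimal sketch of one way it might look. It assumes a Perl that ships the Encode module (5.8 or later) and Shift-JIS input; the helper name clean_sjis and the sample record are hypothetical, made up for illustration only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from any existing script): clean one
# Shift-JIS encoded record and return it as Shift-JIS bytes again.
sub clean_sjis {
    my ($bytes) = @_;

    # Decode the raw bytes into a Perl character string so that
    # s/// and tr/// see whole characters instead of single bytes.
    my $text = decode('shiftjis', $bytes);

    $text =~ tr/\012\015//d;   # delete LF and CR
    $text =~ s/^\s+//;         # strip leading whitespace
    $text =~ s/\s+$//;         # strip trailing whitespace
                               # (on decoded text, \s also matches the
                               # ideographic space U+3000)

    # Encode back to bytes before printing to the load file.
    return encode('shiftjis', $text);
}

# Sample record: the word "Nihongo" (three kanji) in Shift-JIS,
# padded with spaces and followed by a CR/LF pair.
my $raw   = "  \x93\xFA\x96\x7B\x8C\xEA \015\012";
my $clean = clean_sjis($raw);
```

The same decode, manipulate, encode pattern should apply to UTF-8 input by passing 'UTF-8' as the encoding name instead of 'shiftjis'.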

In short, is there a simple approach anyone can recommend? Note that our manipulations 
will be (at least for now) pretty much confined to a specific set, so if a hash-based 
character mapping is the simplest/best approach we could get by with that for a while. 
OTOH, if a simple and bug-free module exists, we'd probably prefer that approach, as it 
sounds like a simpler modification to our existing domestic scripts.

Any advice appreciated.
Thanks
- John




