Rami Friedman <[EMAIL PROTECTED]> writes:
>I need to read files written in a variety of charsets (Big5, Arabic,
>Hebrew, etc) and write their contents to an oracle database.  This
>problem is easy to solve in Java where each feed gets converted to a
>ucs-2 string, but, if possible, I need to write the code in perl.  Can
>this be done?  

I am actively working on this at present.
The development track perl is very close to being able to do it.

>The third edition of Programming Perl says locales and
>unicode don't mix well yet.  

Locales are unfortunately rather underspecified.
Knowing the locale name does not tell you what the encoding is.
(If you know of a way please let me know!)

>I guess that means I cannot convert from
>Big5 to utf-8, for instance.  

Work is in progress so that you will be able to say:

  open(my $fh, "<:encoding(big5)", $name) or die "Cannot open $name: $!";

then you can read Unicode characters out of the stream.
If you write them to a stream opened thus:

  open(my $oh, ">:utf8", $outname) or die "Cannot open $outname: $!";

then you will have converted the file.
Alternatively, there will be mechanisms to get UTF-8 encoded data
for storing into a database.
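
For illustration, here is a minimal sketch of both routes, assuming a perl
with PerlIO layers and the Encode module (as shipped from 5.8 onwards). The
file names and the two-character Big5 sample are hypothetical, chosen only
so the script is self-contained; decode() and encode() stand in for the
"mechanisms" mentioned above and are not necessarily what a given database
driver expects.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Temp qw(tempdir);

# Hypothetical file names in a scratch directory, for illustration only.
my $dir     = tempdir(CLEANUP => 1);
my $name    = "$dir/input.big5";
my $outname = "$dir/output.utf8";

# Create a small Big5 sample: bytes A4 A4 A4 E5 are U+4E2D U+6587 in Big5.
my $big5_bytes = "\xA4\xA4\xA4\xE5";
open(my $raw, ">:raw", $name) or die "Cannot write $name: $!";
print {$raw} $big5_bytes;
close($raw) or die "Cannot close $name: $!";

# Route 1: read characters through a decoding layer and write them
# through a UTF-8 layer, which converts the file.
open(my $fh, "<:encoding(big5)", $name) or die "Cannot open $name: $!";
open(my $oh, ">:utf8", $outname)        or die "Cannot open $outname: $!";
while (my $line = <$fh>) {
    print {$oh} $line;
}
close($fh);
close($oh) or die "Cannot close $outname: $!";

# Route 2: convert in memory. decode() turns Big5 octets into a character
# string; encode() turns that string into UTF-8 octets, e.g. for a database.
my $chars      = decode("big5", $big5_bytes);
my $utf8_bytes = encode("UTF-8", $chars);
```

Whether $chars or $utf8_bytes is the right thing to hand to the database
depends on the driver and how the connection is configured, so treat that
last step as an assumption to check.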

>Is that correct?  Could I instead rely on
>the database driver to convert from the foreign charset to unicode?


>
>Thanks in advance to anyone who might be able to help.
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
