Re: UTF-EBCDIC to UTF-8

2000-07-28 Thread Doug Ewell

Jeu George <[EMAIL PROTECTED]> wrote:

> Is there any conversion routine that transforms UTF-EBCDIC
> characters to UTF-8 characters?

UTF-8 is defined in Chapter 3, page 47, definition D36 of The Unicode
Standard, version 3.0.  A table is given showing the conversion process.
If you don't have the book (I'm guessing you don't :-), then check out
the FAQ on the Unicode Web site at
<http://www.unicode.org/unicode/faq/encoding_allocation.html>
and look for the question "What is the definition of UTF-8?"

(What a relief it is finally to be able to point people to the Unicode
Web site for the definition of UTF-8!)

UTF-EBCDIC is defined in Unicode Technical Report #16, available at
<http://www.unicode.org/unicode/reports/tr16/>.

Both of these are well-defined, straightforward specs, and if you are
a programmer (especially in a language like C that allows easy bit
manipulation) you should not have any trouble writing the conversion
routines.  Normally you would decode UTF-EBCDIC to Unicode scalar
values and then encode those in UTF-8, but I suppose it would also be
possible to go directly from UTF-EBCDIC to UTF-8.  I can provide
programming hints if you like.
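
As a first hint, here is a minimal sketch of the second half of that
two-step approach: encoding one Unicode scalar value as UTF-8.  It
assumes a C99 compiler, and the name utf8_encode is only an illustrative
choice of mine, not something taken from either spec.  The first half
(decoding UTF-EBCDIC to scalar values) is left out here because it
depends on the byte mapping tables given in TR16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode scalar value as UTF-8.
 * Writes 1 to 4 bytes into out and returns how many were written.
 * Returns 0 if cp is not a scalar value (a surrogate code point,
 * or anything above 0x10FFFF); out is left untouched in that case.
 * The name utf8_encode is hypothetical, used only for this sketch.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode code space */
}
```

A converter would call this once per scalar value produced by the
UTF-EBCDIC decoder and append the returned bytes to its output, treating
a return value of 0 as an error in the input.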

If you aren't a programmer and need to convert some existing data (where
did you find UTF-EBCDIC data, anyway?), I have written a pair of DOS
conversion utilities, "cp2uni" and "uni2cp" (and a wrapper, "cp2cp")
that will perform these conversions and many others.  If you think you
will need these, please contact me privately (off the list).

-Doug Ewell
 Fullerton, California



UTF-EBCDIC to UTF-8

2000-07-27 Thread Jeu George

Hello,
Is there any conversion routine that transforms UTF-EBCDIC characters
to UTF-8 characters?
Regards
Jeu