If, as Doug suggests, Vadim wants to do something like represent "Hi" as "U+0048 U+0069", then the current version 1.5 of our UniEdit text editor for Windows has a handy "Copy as" feature which automates this conversion and a number of others.
It permits copying any amount of selected Unicode text from a UniEdit edit window to the Windows clipboard in a variety of special formats, in addition to the usual Unicode-text and local code page Windows clipboard formats, which result from the usual "Copy" feature. Here are the various formats:

    Unicode UTF-8 Encoding
    U+nnnn (Unicode Character Literals)
    &#number; (HTML Numeric Character References)
    \unnnn (Java Unicode Escape Sequences)
    0xnnnn, (C/C++ Hexadecimal Integer Constants, Comma-delimited)
    &Hnnnn (Visual Basic Hexadecimal Integer Constants)

The resulting formatted strings can then be pasted directly into source code, a resource string file, documentation, etc. in another text editor or a non-Unicode-aware application (in the case of UTF-8 format).

I'm not sure if we handle properly the formatting of surrogates or anything beyond the BMP...

UniEdit v.1.5 can be downloaded from here:

http://research.humancomp.org/ftp/pub/download/unied32.exe (9825KB)

General UniEdit information is available here:

http://www.humancomp.org/uniintro.htm

(although there isn't much detail there about the new features added in v.1.5).

Best wishes,

Rick Kunst

_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
The Humanities Computing Laboratory
A Nonprofit Education and Research Corporation
301 W. Main St., Suite 400-I
Durham, NC 27701 USA
Tel. (919) 667-9556, (919) 656-5915
Fax: (919) 667-9556
E-mail: [EMAIL PROTECTED]
http://www.humancomp.org
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of [EMAIL PROTECTED]
> Sent: Thursday, October 25, 2001 1:41 AM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: converting Unicode text into Unicode codes
>
> Nobody seems to have touched this one yet...
>
> On 2001-10-22 at 15:35, Vadim Khaskel <[EMAIL PROTECTED]> wrote:
>
> > I have question regarding tools available to convert Unicode
> > text into Unicode codes. We work on enhancement of our current product
> > and one of the new features is "Internationalization". Please let me
> > know if you may heard of such a tool.
>
> As Addison Phillips says in his signature block, "Internationalization
> is an architecture. It is not a feature."
>
> You should clarify what you mean by "convert Unicode text into Unicode
> codes." All computerized text, in Unicode or any other character set, is
> represented as a sequence of codes. If the text is already "Unicode text,"
> then by definition it is already encoded in "Unicode codes."
>
> If you have text in another encoding, such as Latin-1 or Windows CP1252 or
> EBCDIC or whatever, and wish to convert it to Unicode, there is a handy
> tool called "recode" available as free software on the Internet.
>
> If you already have Unicode text and wish to view the Unicode scalar
> values of the text (e.g. you want to display "Hi" as "U+0048 U+0069"),
> somebody could probably whip up a quick Perl script to do this.
>
> But I think you need to explain more clearly what it is you have and what
> you want.
>
> -Doug Ewell
> Fullerton, California
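
For anyone without UniEdit to hand, the conversion Doug describes and a few of the "Copy as" formats listed above can be roughly sketched in a few lines of Python (used here in place of the Perl script Doug mentions). This is only an illustration, not UniEdit's implementation: the function names are made up, and emitting the Java and C/C++ forms per UTF-16 code unit (so that characters beyond the BMP come out as surrogate pairs) is an assumption about what those formats should contain, as is the ", " separator in the C/C++ form.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints.

    'surrogatepass' lets unpaired surrogates through instead of raising.
    """
    data = text.encode("utf-16-be", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def as_code_points(text):
    """U+nnnn form: one entry per code point, at least four hex digits."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def as_html_ncr(text):
    """&#number; form: decimal HTML numeric character references."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def as_java_escapes(text):
    """\\unnnn form: one escape per UTF-16 code unit (assumed)."""
    return "".join(f"\\u{unit:04X}" for unit in utf16_units(text))


def as_c_hex(text):
    """0xnnnn form: comma-delimited, one constant per UTF-16 code unit (assumed)."""
    return ", ".join(f"0x{unit:04X}" for unit in utf16_units(text))


if __name__ == "__main__":
    print(as_code_points("Hi"))   # U+0048 U+0069
    print(as_html_ncr("Hi"))      # &#72;&#105;
    print(as_java_escapes("Hi"))  # \u0048\u0069
    print(as_c_hex("Hi"))         # 0x0048, 0x0069
```

For a character beyond the BMP such as U+1D11E, as_code_points gives a single "U+1D11E", while the Java and C/C++ forms give the surrogate pair D834 DD1E. That difference is exactly the case worth checking in any tool that does this conversion.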