Re: UTF-8 file to ASCII file converter
On Fri, Apr 12, 2002 at 11:11:41AM -0400, [EMAIL PROTECTED] wrote:
> On Fri, 12 Apr 2002, Vasilis Vasaitis wrote:
>
> > Just like in the case of the opposite conversion, this conversion can also
> > be easily achieved with a one-liner. The following seems to be able to do
> > the job:
> >
> > perl -ne 'for (unpack "U*", $_) { printf $_ > 255 ? "U+%04X" : "%c", $_ }'
>
> Unless you regard ISO-8859-1 as a synonym for US-ASCII, '255' has to
> be '127' :-)

Er, right. That's what I meant, actually, but I guess I wasn't thinking much at that moment :^). And since I only tested this with ISO-8859-7 text that had been iconv'ed to UTF-8, I didn't even notice...

Cheers,
--
Vasilis Vasaitis
[EMAIL PROTECTED]
"Don't do drugs. Santa Claus is watching." -- winamp.com
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
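For readers who want to see the corrected threshold spelled out, here is a minimal sketch in Python rather than Perl. It is not the one-liner from this thread; the function name utf8_to_uplus is made up for illustration. It assumes the input is valid UTF-8, passes code points up to 127 through unchanged, and writes everything else as U+XXXX.

```python
def utf8_to_uplus(data: bytes) -> str:
    """Decode UTF-8 bytes and return pure ASCII text.

    Code points 0..127 are copied as-is; anything above 127 becomes
    U+XXXX (at least four uppercase hex digits, more if needed).
    The name utf8_to_uplus is hypothetical, chosen for this sketch.
    """
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid input
    return "".join(
        ch if ord(ch) <= 127 else "U+%04X" % ord(ch)
        for ch in text
    )


# Greek capital epsilon (U+0395) followed by small lambda (U+03BB)
print(utf8_to_uplus("\u0395\u03bb".encode("utf-8")))  # prints U+0395U+03BB
```

One caveat of the U+ notation itself: the escape has no terminator and no fixed width, so an escape followed directly by a hex digit in the text cannot be told apart from a longer code point.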
Re: UTF-8 file to ASCII file converter
> "Bruno" == Bruno Haible <[EMAIL PROTECTED]> writes:

Bruno> And when you use this C/Java syntax, you get the converter for
Bruno> free: it is contained in libiconv. Try "iconv -f UTF-8 -t JAVA".

Cool. But when was that added? iconv (GNU libc) 2.2.4 as included in SuSE 7.3's glibc-2.2.4-64.i386.rpm does not support it. RH7.2 also has 2.2.4, and also lacks JAVA (one never knows what patches vendors add...). I don't see any support for JAVA in cvs either, though I've only browsed the tree.

-JimC
Re: UTF-8 file to ASCII file converter
On Fri, 12 Apr 2002, Vasilis Vasaitis wrote:
> On Thu, Apr 11, 2002 at 12:04:18AM -0700, Pedro Ferreira wrote:
> > Now I would like to do the opposite, convert a utf-8
> > file to an ascii file, each utf-8 character would be
> > encoded back to U+. Many thanks in advance for any
>
> Just like in the case of the opposite conversion, this conversion can also
> be easily achieved with a one-liner. The following seems to be able to do
> the job:
>
> perl -ne 'for (unpack "U*", $_) { printf $_ > 255 ? "U+%04X" : "%c", $_ }'

Unless you regard ISO-8859-1 as a synonym for US-ASCII, '255' has to be '127' :-)
Re: UTF-8 file to ASCII file converter
On Fri, 12 Apr 2002, Bruno Haible wrote:
> H. Peter Anvin writes:
>
> > You'd probably be better off using C-like escape codes \u and
> > \U with \ escaped as \\.
>
> And when you use this C/Java syntax, you get the converter for free:
> it is contained in libiconv. Try "iconv -f UTF-8 -t JAVA".

That's nice to know. BTW, in case somebody wants to 'torture' her/his computer/processor for this simple task doable by a Perl one-liner or iconv, (s)he can run the following:

  native2ascii -encoding UTF-8 file.utf8 file.java
  native2ascii -reverse -encoding UTF-8 file.java file.utf8

native2ascii comes with the JDK.

Jungshik Shin
Re: UTF-8 file to ASCII file converter
On Thu, Apr 11, 2002 at 12:04:18AM -0700, Pedro Ferreira wrote:
> I already have a perl script (thanks to Oyvind A.
> Holm) that converts an ascii file with U+ unicode
> codes to a utf-8 file.
> Now I would like to do the opposite, convert a utf-8
> file to an ascii file, each utf-8 character would be
> encoded back to U+. Many thanks in advance for any
> help!

Just like in the case of the opposite conversion, this conversion can also be easily achieved with a one-liner. The following seems to be able to do the job:

  perl -ne 'for (unpack "U*", $_) { printf $_ > 255 ? "U+%04X" : "%c", $_ }'

--
Vasilis Vasaitis
[EMAIL PROTECTED]
"Don't do drugs. Santa Claus is watching." -- winamp.com
Re: UTF-8 file to ASCII file converter
H. Peter Anvin writes:
> You'd probably be better off using C-like escape codes \u and
> \U with \ escaped as \\.

And when you use this C/Java syntax, you get the converter for free: it is contained in libiconv. Try "iconv -f UTF-8 -t JAVA".

Bruno
Re: UTF-8 file to ASCII file converter
Followup to: <[EMAIL PROTECTED]>
By author: Pedro Ferreira <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
>
> I already have a perl script (thanks to Oyvind A.
> Holm) that converts an ascii file with U+ unicode
> codes to a utf-8 file.
> Now I would like to do the opposite, convert a utf-8
> file to an ascii file, each utf-8 character would be
> encoded back to U+. Many thanks in advance for any
> help!
>

You'd probably be better off using C-like escape codes \u and \U with \ escaped as \\.

-hpa
--
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[EMAIL PROTECTED]>
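The escape scheme suggested above could look roughly like the following Python sketch. The function name escape_c_style is hypothetical, and the exact widths (\u with four hex digits for code points up to U+FFFF, \U with eight hex digits above that) are an assumption based on the usual C convention. This is not a reimplementation of any existing converter, whose output may differ in details.

```python
def escape_c_style(text: str) -> str:
    """Return an ASCII-only rendering of text using C-like escapes.

    Assumed convention for this sketch:
      backslash           -> two backslashes
      U+0080 .. U+FFFF    -> \\uXXXX      (4 hex digits)
      above U+FFFF        -> \\UXXXXXXXX  (8 hex digits)
    ASCII characters other than the backslash are copied unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")          # escape the escape character itself
        elif cp < 0x80:
            out.append(ch)              # plain ASCII passes through
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)  # fixed width, so no ambiguity
        else:
            out.append("\\U%08X" % cp)
    return "".join(out)


print(escape_c_style("C:\\tmp \u00e9"))  # prints C:\\tmp \u00E9
```

Because every escape has a fixed width and the backslash is itself escaped, the output can be converted back without ambiguity, which is the practical advantage over bare U+ codes.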
UTF-8 file to ASCII file converter
I already have a perl script (thanks to Oyvind A. Holm) that converts an ascii file with U+ unicode codes to a utf-8 file.
Now I would like to do the opposite, convert a utf-8 file to an ascii file, each utf-8 character would be encoded back to U+. Many thanks in advance for any help!

--- Pedro Ferreira <[EMAIL PROTECTED]> wrote:
> Works fine, thank you!
>
> --- "Oyvind A. Holm" <[EMAIL PROTECTED]> wrote:
> > On 2002-03-26 06:58-0800 Pedro Ferreira wrote:
> >
> > > Please, what is the best tool to convert an ascii file
> > > with unicode character codes like this:
> > > U+3400
> > > U+3405
> > > to another UTF-8 file with the corresponding unicode
> > > characters?
> >
> > This Perl script should do the job:
> >
> > == CUT HERE ==
> >
> > #!/usr/bin/perl -w
> >

Pedro Ferreira
Grenoble - France
"Everything should be made as simple as possible, but not simpler." - Einstein
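The Perl script quoted above is cut off in the archive, so it is not reproduced here. For reference only, the earlier direction (ASCII text with U+ codes to UTF-8) can be sketched in Python as follows. This is an independent illustration, not Oyvind A. Holm's script. The function name uplus_to_utf8 and the choice to accept four to six hex digits are assumptions.

```python
import re

# Assumed input syntax: "U+" followed by 4 to 6 hex digits.
_UPLUS = re.compile(r"U\+([0-9A-Fa-f]{4,6})")


def uplus_to_utf8(ascii_text: str) -> bytes:
    """Replace each U+XXXX code with its character and encode as UTF-8.

    Codes outside the Unicode range or in the surrogate block
    (U+D800..U+DFFF) cannot be encoded as UTF-8 and are left as text.
    """
    def repl(match):
        cp = int(match.group(1), 16)
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return match.group(0)  # not encodable: keep the literal code
        return chr(cp)

    return _UPLUS.sub(repl, ascii_text).encode("utf-8")


# The two code points from the original question, one per line
data = uplus_to_utf8("U+3400\nU+3405\n")
print(data == "\u3400\n\u3405\n".encode("utf-8"))  # prints True
```

Note that the pattern is greedy: a code followed directly by further hex digits in the text would be read as a longer code point, so this sketch is only reliable when the codes are separated from surrounding hex-like text, as in the one-code-per-line example from the question.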