Re: UTF-8 file to ASCII file converter

2002-04-12 Thread Vasilis Vasaitis

On Fri, Apr 12, 2002 at 11:11:41AM -0400, [EMAIL PROTECTED] wrote:
> On Fri, 12 Apr 2002, Vasilis Vasaitis wrote:
> 
> >   Just like in the case of the opposite conversion, this conversion can also
> > be easily achieved with a one-liner. The following seems to be able to do
> > the job:
> > 
> >   perl -ne 'for (unpack "U*", $_) { printf $_ > 255 ? "U+%04X" : "%c", $_ }'
> 
>  Unless you regard ISO-8859-1 as a synonym to US-ASCII, '255' has to
> be '127' :-) 

  Er, right. That's what I meant, actually, but I guess I wasn't thinking
much at that moment :^). And since I only tested this on ISO-8859-7 text
that had been iconv'ed to UTF-8, I didn't even notice...

Cheers,

-- 
Vasilis Vasaitis
[EMAIL PROTECTED]

"Don't do drugs. Santa Claus is watching."
-- winamp.com


--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/




Re: UTF-8 file to ASCII file converter

2002-04-12 Thread James H. Cloos Jr.

> "Bruno" == Bruno Haible <[EMAIL PROTECTED]> writes:

Bruno> And when you use this C/Java syntax, you get the converter for
Bruno> free: it is contained in libiconv. Try "iconv -f UTF-8 -t JAVA".

Cool.  But when was that added?  iconv (GNU libc) 2.2.4 as included
in SuSE 7.3's glibc-2.2.4-64.i386.rpm does not support it.  RH7.2 also
has 2.2.4, and also lacks JAVA (one never knows what patches vendors
add...).  I don't see any support for JAVA in cvs either, though I've
only browsed the tree.

-JimC





Re: UTF-8 file to ASCII file converter

2002-04-12 Thread jshin

On Fri, 12 Apr 2002, Vasilis Vasaitis wrote:
> On Thu, Apr 11, 2002 at 12:04:18AM -0700, Pedro Ferreira wrote:
> > Now I would like to do the opposite: convert a UTF-8
> > file to an ASCII file, with each UTF-8 character
> > encoded back to U+. Many thanks in advance for any

>   Just like in the case of the opposite conversion, this conversion can also
> be easily achieved with a one-liner. The following seems to be able to do
> the job:
> 
>   perl -ne 'for (unpack "U*", $_) { printf $_ > 255 ? "U+%04X" : "%c", $_ }'

 Unless you regard ISO-8859-1 as a synonym to US-ASCII, '255' has to
be '127' :-) 





Re: UTF-8 file to ASCII file converter

2002-04-12 Thread jshin

On Fri, 12 Apr 2002, Bruno Haible wrote:

> H. Peter Anvin writes:
> 
> > You'd probably be better off using C-like escape codes \u and
> > \U with \ escaped as \\.
> 
> And when you use this C/Java syntax, you get the converter for free:
> it is contained in libiconv. Try "iconv -f UTF-8 -t JAVA".

  That's nice to know. BTW, in case somebody wants to 'torture'
her/his computer/processor for this simple task doable by a Perl one-liner
or iconv, (s)he can run the following:

   native2ascii -encoding UTF-8 file.utf8 file.java
   native2ascii -reverse -encoding UTF-8 file.java file.utf8

native2ascii comes with the JDK.

  Jungshik Shin





Re: UTF-8 file to ASCII file converter

2002-04-12 Thread Vasilis Vasaitis

On Thu, Apr 11, 2002 at 12:04:18AM -0700, Pedro Ferreira wrote:
> I already have a Perl script (thanks to Oyvind A.
> Holm) that converts an ASCII file with U+ Unicode
> codes to a UTF-8 file.
> Now I would like to do the opposite: convert a UTF-8
> file to an ASCII file, with each UTF-8 character
> encoded back to U+. Many thanks in advance for any
> help!

  Just like in the case of the opposite conversion, this conversion can also
be easily achieved with a one-liner. The following seems to be able to do
the job:

  perl -ne 'for (unpack "U*", $_) { printf $_ > 255 ? "U+%04X" : "%c", $_ }'

-- 
Vasilis Vasaitis
[EMAIL PROTECTED]

"Don't do drugs. Santa Claus is watching."
-- winamp.com






Re: UTF-8 file to ASCII file converter

2002-04-12 Thread Bruno Haible

H. Peter Anvin writes:

> You'd probably be better off using C-like escape codes \u and
> \U with \ escaped as \\.

And when you use this C/Java syntax, you get the converter for free:
it is contained in libiconv. Try "iconv -f UTF-8 -t JAVA".

Bruno




Re: UTF-8 file to ASCII file converter

2002-04-11 Thread H. Peter Anvin

Followup to:  <[EMAIL PROTECTED]>
By author:    Pedro Ferreira <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
>
> I already have a Perl script (thanks to Oyvind A.
> Holm) that converts an ASCII file with U+ Unicode
> codes to a UTF-8 file.
> Now I would like to do the opposite: convert a UTF-8
> file to an ASCII file, with each UTF-8 character
> encoded back to U+. Many thanks in advance for any
> help!
> 

You'd probably be better off using C-like escape codes \u and
\U with \ escaped as \\.

-hpa
-- 
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt    <[EMAIL PROTECTED]>




UTF-8 file to ASCII file converter

2002-04-10 Thread Pedro Ferreira

I already have a Perl script (thanks to Oyvind A.
Holm) that converts an ASCII file with U+ Unicode
codes to a UTF-8 file.
Now I would like to do the opposite: convert a UTF-8
file to an ASCII file, with each UTF-8 character
encoded back to U+. Many thanks in advance for any
help!

--- Pedro Ferreira <[EMAIL PROTECTED]> wrote:
> 
> Works fine, thank you!
> 
> 
> --- "Oyvind A. Holm" <[EMAIL PROTECTED]> wrote:
> > On 2002-03-26 06:58-0800 Pedro Ferreira wrote:
> > 
> > > Please, what is the best tool to convert an ascii file
> > > with unicode character codes like this:
> > > U+3400
> > > U+3405
> > > to another UTF-8 file with the corresponding unicode
> > > characters?
> > 
> > This Perl script should do the job:
> > 
> > == CUT HERE ==
> > 
> > #!/usr/bin/perl -w
> > 
> 
> 
> 


=
Pedro Ferreira
Grenoble - France

"Everything should be made as simple as possible, but not simpler." - Einstein
