At 09:33 -0400 2001.07.23, Steve Torrence wrote:
>I would just like to know what is the most common way of handling
>this. It's hard to believe the Perl script authors working on
>scripts for languages that contain these extended characters are
>going through a lot of trouble putting the accented characters in
>their scripts. It seems that a tool like bbedit would be able to
>take a script that was written using the extended characters and
>convert the text to something compatible with perl.

I am not sure what you are asking.  Most people don't use non-ASCII
characters.  When they do, they usually use Latin-1 or Unicode.  If you use
MacRoman or Windows character sets instead, then you need to convert.

So what is it you want to handle?  If it is BBEdit or Terminal showing the
"right" characters, then you need to either do a Latin-1<->MacRoman
character map, or you need to change the behavior of the programs to show
the characters in the character set you want (in BBEdit or Terminal, this
might be as simple as changing the font; I use ProFont, and it has a
companion ProFontIsoLatin1).
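
If you go the character-map route, here is a minimal sketch of a
MacRoman-to-Latin-1 conversion.  It assumes a perl that ships the Encode
module (5.8 or later); the sub name macroman_to_latin1 is just something I
made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Convert a string of MacRoman bytes to Latin-1 bytes and return the result.
# MacRoman characters that have no Latin-1 equivalent are replaced with a
# substitution character rather than raising an error.
sub macroman_to_latin1 {
    my ($bytes) = @_;                            # work on a copy
    from_to($bytes, 'MacRoman', 'iso-8859-1');   # converts the copy in place
    return $bytes;
}

# Byte 0x80 is capital "A" with an umlaut in MacRoman; the same character
# is byte 0xC4 in Latin-1.
my $latin1 = macroman_to_latin1("\x80");
printf "0x%02X\n", ord $latin1;                  # prints 0xC4
```

Plain ASCII bytes pass through unchanged, so it should be safe to run a
whole script or data file through something like this.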

>I know that the few scripts I have found don't use either of the 2
>coding methods Will mentioned. They seem to substitute one extended
>character for another which Perl seems to convert to the correct
>character when sending the html to the browser.

Again, perl does not convert any characters.  It just sends a byte of data.
How that byte is rendered is not determined by perl; it is determined by
the rendering process (BBEdit, Terminal, Internet Explorer, etc.).

For example, if I send the byte 0xC4, Terminal (using MacRoman) will show
"ƒ" (that italicized lower-case "f" used often for folders).  But Internet
Explorer (using the default of Latin-1) will show "Ä" (capital "A" with an
umlaut).  perl doesn't care.  It is just data to perl (unless you are in a
regex, or using isprint(), etc.).  perl does no conversion or translation.
It just sends a single byte that is rendered by the displaying process in
different ways.
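
To make the 0xC4 example concrete, here is a small sketch that asks what
character that one byte stands for under each character set.  Again this
assumes the Encode module is available; codepoint_for is a hypothetical
helper name, not anything standard.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Return the Unicode code point that a single byte represents when it is
# interpreted in the given character set.
sub codepoint_for {
    my ($charset, $byte) = @_;
    return ord decode($charset, $byte);
}

my $byte = "\xC4";    # what perl actually sends: one byte, value 196

for my $charset ('MacRoman', 'iso-8859-1') {
    printf "%-10s 0x%02X -> U+%04X\n",
        $charset, ord $byte, codepoint_for($charset, $byte);
}
# prints:
# MacRoman   0xC4 -> U+0192
# iso-8859-1 0xC4 -> U+00C4
```

U+0192 is the "f" with a hook and U+00C4 is the capital "A" with an
umlaut.  The byte never changes; only the interpretation does.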

So you need to know not how perl will print the data, but how a particular
process will render it.  If you are printing data in Latin-1, then you need
to make sure the rendering process displays it in Latin-1.  For most
applications (web, etc.), this is the default, and therefore not a problem.
You might therefore want to adjust the behavior of your apps to use Latin-1
instead of MacRoman.
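
For the web case you can also tell the browser explicitly rather than rely
on its default.  A rough sketch, assuming a CGI script that prints its own
headers (html_header is a made-up helper name):

```perl
use strict;
use warnings;

# Build the HTTP header block that tells the browser which character set
# to use when rendering the bytes that follow.
sub html_header {
    my ($charset) = @_;
    return "Content-Type: text/html; charset=$charset\r\n\r\n";
}

binmode STDOUT;                     # send the bytes through untouched
print html_header('ISO-8859-1');
print "<p>\xC4pfel</p>\n";          # 0xC4 is rendered as "Ä" under Latin-1
```

perl still just sends the byte 0xC4 here; the header only tells the
rendering process how to interpret it.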

-- 
Chris Nandor                      [EMAIL PROTECTED]    http://pudge.net/
Open Source Development Network    [EMAIL PROTECTED]     http://osdn.com/
