Re: Accented characters in scripts (and whereis iconv.h?)
At 09:33 -0400 2001.07.23, Steve Torrence wrote:
> I would just like to know what is the most common way of handling this. It's hard to believe the Perl script authors working on scripts for languages that contain these extended characters are going through a lot of trouble putting the accented characters in their scripts. It seems that a tool like bbedit would be able to take a script that was written using the extended characters and convert the text to something compatible with perl.

I am not sure what you are asking. Most people don't use non-ASCII characters. When they do, they usually use Latin-1 or Unicode. If you use MacRoman or Windows character sets instead, then you need to convert.

So what is it you want to handle? If it is BBEdit or Terminal showing the right characters, then you need to either do a Latin1-MacRoman character map, or you need to change the behavior of the programs to show the characters in the character set you want (in BBEdit or Terminal, this might be as simple as changing the font; I use ProFont, and it has a companion ProFontIsoLatin1).

> I know that the few scripts I have found don't use either of the 2 coding methods Will mentioned. They seem to substitute one extended character for another which Perl seems to convert to the correct character when sending the html to the browser.

Again, perl does not convert any characters. It just sends a byte of data. How that byte is rendered is not determined by perl, it is determined by the rendering process (BBEdit, Terminal, Internet Explorer, etc.).

For example, if I send the byte 0xC4, Terminal (using MacRoman) will show ƒ (that italicized lower-case f used often for folders). But Internet Explorer (using the default of Latin-1) will show Ä (capital A with an umlaut). perl doesn't care. It is just data to perl (unless you are in a regex, or using isprint(), etc.). perl does no conversion or translation. It just sends a single byte that is rendered by the displaying process in different ways.
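The 0xC4 example can be checked from perl itself. What follows is only a minimal sketch, and it assumes a perl that ships the core Encode module (5.8 or later, which is newer than the perl discussed in this thread). It decodes the same single byte under both character sets and reports which Unicode code point each interpretation gives; the variable names are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);   # assumption: core Encode module, perl 5.8+

# One raw byte; perl itself attaches no character set to it.
my $byte = "\xC4";

# Interpret the same byte under two different character sets.
my $as_macroman = decode('MacRoman',   $byte);
my $as_latin1   = decode('iso-8859-1', $byte);

# Report the Unicode code point each interpretation yields.
printf "MacRoman: U+%04X\n", ord $as_macroman;   # U+0192, the florin sign
printf "Latin-1:  U+%04X\n", ord $as_latin1;     # U+00C4, A with umlaut
```

The byte never changes; only the table used to interpret it does, which is the same thing Terminal and the browser are doing when they render it.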
So you need to know not how perl will print the data, but how a particular process will render it. If you are printing data in Latin-1, then you need to make sure the rendering process displays it in Latin-1. For most applications (web, etc.), this is the default, and therefore not a problem. You might therefore want to adjust the behavior of your apps to use Latin-1 instead of MacRoman.

--
Chris Nandor [EMAIL PROTECTED] http://pudge.net/
Open Source Development Network [EMAIL PROTECTED] http://osdn.com/
Re: Accented characters in scripts (and whereis iconv.h?)
On Monday, July 23, 2001 at 09:33, [EMAIL PROTECTED] (Steve Torrence) wrote:
> I would just like to know what is the most common way of handling this. It's hard to believe the Perl script authors working on scripts for languages that contain these extended characters are going through a lot of trouble putting the accented characters in their scripts. It seems that a tool like bbedit would be able to take a script that was written using the extended characters and convert the text to something compatible with perl.

The way I see it, you have three options. I've listed them in what I would think of as increasing order of desirability.

1) Add a content-type meta tag like this

   <meta http-equiv="content-type" content="text/html; charset=x-mac-roman">

   which will tell the browser to treat the text as MacRoman. This will cause the text to render properly in any browser which can handle the conversion correctly.

2) Save the files in ISO Latin-1 encoding and convert to/from MacRoman as necessary when editing. You'll need something like the Midex plug-in for BBEdit to do the character set conversions.
   http://www.barebones.com/support/bbedit/bbedit-plugins.html

3) Convert the 8bit characters in your files to HTML entity codes. That is, é becomes &eacute; etc. You can use the Translate command in BBEdit to do this. (Markup:Utilities:Translate)

Actually, there is another option. You could write the script in MacRoman and then, before spitting out the HTML, convert the text to ISO Latin-1 on the fly. I guess I'd place this somewhere between 2 and 3 above.

--
Christian Smith | [EMAIL PROTECTED] | http://web.barebones.com
He who dies with the most friends... Is still dead!
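Option 3 can also be done inside the script rather than in BBEdit. The following is a rough sketch under stated assumptions, not BBEdit's Translate behaviour: it assumes the core Encode module (perl 5.8 or later), it assumes the text was typed in MacRoman, and it emits numeric character references (&#233;) instead of named entities (&eacute;), since named entities would need a lookup table or a CPAN module. The helper name macroman_to_entities is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);   # assumption: core Encode module, perl 5.8+

# Hypothetical helper: turn MacRoman bytes into 7-bit-safe HTML text by
# replacing every non-ASCII character with a numeric character reference.
# It does not escape &, < or >, so use it on text, not on markup.
sub macroman_to_entities {
    my ($bytes) = @_;
    my $text = decode('MacRoman', $bytes);   # bytes -> Unicode characters
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

# "caf" followed by byte 0x8E, which is e-acute in MacRoman.
print macroman_to_entities("caf\x8E"), "\n";   # prints caf&#233;
```

Because the output is plain ASCII, it renders the same whatever character set the browser assumes.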
Re: Accented characters in scripts (and whereis iconv.h?)
> I'm new to Perl and am using BBEdit to edit scripts. I just can't figure out how to easily type in characters with accents. For example to get usuário I have to put in usu·rios for Você I have to type in VocÍ

if it's just a question of entering the character, the ascii floater in bbedit will probably be easiest. window menu -> ascii table, double click for the %mumble or apple-doubleclick for the character.

but remember that the upper part of the mac ascii table isn't the same as on other platforms. have a look at perldoc perlport (or http://www.perldoc.com/perl5.6/pod/perlport.html ) for lots of stiff warnings about platform variation and some useful advice.

bbedit also has a similar floater for html entities, which is probably all you need. even better, the translate command will automatically turn your accented characters into the entity equivalents, eg é -> &eacute;

out of curiosity, i did some fiddling about with this:

    #!/usr/bin/perl
    print "Content-type: text/plain\n\n";
    print "ñö háblà espanõl\n";

and found that the only place it works as expected is on the os x command line, either locally or on a remote server. through a browser it always comes up as '-s hýbl espanðl', whatever the native set is on the server or client. it seems that perl on os x uses the mac roman character set until run under cgi, at which point it prints 7bit ascii. you can confirm this by running:

    #!/usr/bin/perl
    print("$_ = " . chr($_) . "\n") for (32..255);

which produces different results in the terminal and the browser.

if this proves to be a problem, and you must have raw accented text inside your script, you can either print through a translation table from mac ascii to the standard/other version, or (more portable) do it all in unicode. the recommended way to do either of these seems to be the Text::Iconv module, but i couldn't get it to make, as os x doesn't provide the iconv.h that it wraps around. does anyone know a fix?

thanks

will

ps. i'd be grateful if people would correct any misconceptions here.

--
pgpkey: http://www.spanner.org/keys/will.txt
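On the missing iconv.h: one possible way to print through a translation table without Text::Iconv is the core Encode module. This is a sketch only, and it assumes perl 5.8 or later (newer than the perl in this thread) and a script saved as MacRoman. The helper name macroman_to_latin1 is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(from_to);   # assumption: core Encode module, perl 5.8+

# Hypothetical helper: re-encode MacRoman bytes as ISO Latin-1 bytes.
sub macroman_to_latin1 {
    my ($bytes) = @_;                            # work on a copy
    from_to($bytes, 'MacRoman', 'iso-8859-1');   # converts in place
    return $bytes;
}

# "h\x87bl\x88" is the byte sequence for "háblà" in MacRoman.
my $out = macroman_to_latin1("h\x87bl\x88");

# Tell the browser which character set the bytes are in, then send them.
print "Content-type: text/html; charset=iso-8859-1\n\n";
print $out, "\n";
```

MacRoman characters that have no Latin-1 equivalent (the florin sign, for example) cannot survive this conversion, so it only suits text that stays within the Latin-1 repertoire.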