One Small Doubt

The only doubt I have about this problem being caused by the base Perl and its 
configuration comes from having the MIME::Lite and MIME::Base64 modules available. 
I would expect both of these to have access to the encode features, but neither is 
used in this code module. They are used in other modules elsewhere on the CGI, but 
those have no connection to the troublesome module.

The Pound Sterling

The pound is definitely odd; from memory the PC originally allowed the £ to replace 
the # and you could have one or the other. Then codepage 850 changed things so you 
could have both and the pound moved to 0xA3 in the range beyond the ASCII defined 
characters. A somewhat checkered history.

Now in my Red Hat environment "LANG=en_GB.UTF-8" is set and I think this is causing 
Perl to render the £ in the two-byte format 0xC2 0xA3, whereas in the source the 
one-byte 0xA3 is used and understood. So the input/source is not encoded but the 
output is; I don't really understand why.
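
To make the two forms concrete, here is a minimal sketch using the Encode module that ships with 5.8. It only illustrates the byte values involved; the helper name hex_bytes is my own invention and none of this comes from the troublesome module.

```perl
use strict;
use warnings;
use Encode qw(encode);

# The pound sign as a single character, code point U+00A3.
my $pound = "\x{A3}";

# Hypothetical helper: hex dump of a byte string, bytes separated by spaces.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Same character, two different encodings.
my $latin1 = hex_bytes(encode('iso-8859-1', $pound));
my $utf8   = hex_bytes(encode('UTF-8', $pound));

print "ISO-8859-1: $latin1\n";
print "UTF-8:      $utf8\n";
```

The ISO-8859-1 form is the single byte I see in the source, and the UTF-8 form is the two-byte sequence I see on output.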

Equally, so far the £ seems to be the only character affected in this way.

However, now that I have the no encoding; pragma in force everything is rendered as 
one-byte characters.

I love Perl but I am not sure that this part is very transparent. I would have 
expected the norm to follow the input/source and only do translation on instruction. 
Equally, as the use bytes; pragma is supposed to force characters to be rendered as 
"almost binary", I expected it to stop the two-byte rendering.
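
One way to get translation "only on instruction" might be to name the output layer explicitly rather than rely on whatever the locale selects. The sketch below is an assumption on my part, not what the CGI currently does: it writes the same character through two different layers into an in-memory buffer and counts the bytes that come out. The function name bytes_written is hypothetical.

```perl
use strict;
use warnings;

# Write $text through the given I/O layer into an in-memory buffer
# and return the raw bytes that would have reached the output.
sub bytes_written {
    my ($layer, $text) = @_;
    my $buffer = '';
    open my $fh, ">$layer", \$buffer or die "open failed: $!";
    print {$fh} $text;
    close $fh or die "close failed: $!";
    return $buffer;
}

my $pound = "\x{A3}";

my $as_latin1 = bytes_written(':encoding(iso-8859-1)', $pound);
my $as_utf8   = bytes_written(':encoding(UTF-8)', $pound);

printf "latin1 layer: %d byte(s)\n", length $as_latin1;
printf "utf8 layer:   %d byte(s)\n", length $as_utf8;
```

For real output the equivalent would presumably be binmode(STDOUT, ':encoding(iso-8859-1)') near the top of the script, so that the layer is stated in the code and does not depend on LANG.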

I think this area of 5.8 whilst better than 5.6 may still need some clarification 
before the average user can understand it easily.

Frank


>>> John Delacour <[EMAIL PROTECTED]> 10/10/03 00:25:07 >>>
At 4:05 pm +0100 9/10/03, Frank Smith wrote:

>  I have now forced Perl to produce uncoded output by the use of:
>
>  no coding;
>
>  which has worked wonders.

no encoding, I presume you mean.  That makes no difference here.


On the other hand if I run this

use encoding "utf8", STDOUT => "MacRoman" ;
print "\x{2022}" ;

I get the one-byte Mac bullet instead of the 
three-byte utf8 character I would get with just

print "\x{2022}" ;

There seems to be something odd about the "£". 
Perl on my machine prints it in one byte whatever 
I do.  Maybe something to do with locale settings.

JD



