Well, not exactly.  In UTF-8 only characters whose value is <= 127 occupy a
single byte, and no \x00 is added to them; that range does not include either
A-grave (the lower-case one is \xe0, the upper-case \xc0).  Anything above 127
uses an increasingly elaborate encoding scheme, which may occupy 2, 3 or 4
bytes.  I'll paraphrase part of it:-

A 2-byte encoding, e.g. \xcf\x83, is interpreted thus:-

Byte 0 (here \xcf) 1 1 0 a a a a a
Byte 1 (here \x83) 1 0 b b b b b b

The whole value is aaaaa bbbbbb, in this case 01111 000011, or \x03c3, in
other words Greek lower-case sigma.  The 3- and 4-byte encodings extend the
same pattern: a longer lead-byte prefix (1110 or 11110) followed by two or
three 10bbbbbb continuation bytes.
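
The 2-byte arithmetic can be checked by hand in Perl.  What follows is only a
minimal sketch: decode_two_byte is a made-up helper name, not anything from
the Perl core, and it covers plain UTF-8 only.  It does not attempt
UTF-EBCDIC, which lays its bytes out differently.

```perl
use strict;
use warnings;

# Hypothetical helper: decode one 2-byte UTF-8 sequence by hand.
# Layout: 110aaaaa 10bbbbbb  ->  code point aaaaabbbbbb
sub decode_two_byte {
    my ($b0, $b1) = @_;
    die "not a 2-byte lead byte\n"  unless ($b0 & 0xE0) == 0xC0;
    die "not a continuation byte\n" unless ($b1 & 0xC0) == 0x80;
    # Keep the low 5 bits of byte 0 and the low 6 bits of byte 1.
    return (($b0 & 0x1F) << 6) | ($b1 & 0x3F);
}

printf "U+%04X\n", decode_two_byte(0xCF, 0x83);   # U+03C3, small sigma
printf "U+%04X\n", decode_two_byte(0xC3, 0xA0);   # U+00E0, small a-grave
```

The second call shows that lower-case A-grave, being above 127, is itself a
2-byte sequence (\xc3\xa0) in UTF-8.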

Rgds, GStC.

  

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 02, 2005 11:59 PM
To: [EMAIL PROTECTED]; beginners@perl.org
Subject: RE: z/OS unicode problem.


Hi,

Maybe this could be useful: Unicode characters have two bytes for each
character. Each character is followed by "0x00". You need to remove the
"0x00" after each character and then do the regular expression matching.

my $temp = chr(0x00);
$line =~ s/$temp//g;   # this removes the 0x00 bytes from the line

Regds
Suresh

-----Original Message-----
From: Rajarshi Das [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 03, 2005 5:33 AM
To: beginners@perl.org
Subject: z/OS unicode problem.

Hi,
I have a question regarding utf-ebcdic issues on z/OS. I tried this on
perl-5.8.6.  If I use a unicode character within a character class and try
matching the same using a regular expression, I get a failure.

e.g. if I write this:

use charnames ':full';

$a = "\N{LATIN SMALL LETTER A WITH GRAVE}";
$b = "\N{LATIN CAPITAL LETTER A WITH GRAVE}";

$a =~ m/[$b]/i;

This fails whereas,

$b =~ m/[$a]/i;
passes.

Does anyone have thoughts on why this might be happening? Alternatively,
could someone let me know who could help?

Thanks in advance,
Rajarshi.



--

To unsubscribe, e-mail: [EMAIL PROTECTED] For additional
commands, e-mail: [EMAIL PROTECTED] <http://learn.perl.org/>
<http://learn.perl.org/first-response>




