Hi Sadahiro

 On 9/11/05, SADAHIRO Tomoyuki <[EMAIL PROTECTED]> wrote:

> 
> On Wed, 31 Aug 2005 19:53:37 +0530, Sastry <[EMAIL PROTECTED]> wrote
> 
> > Hi Sadahiro
> > The patch has resolved four tests that were failing previously, but one
> > more test is still failing (it was failing even before applying the
> > patch).
> > Here is the test case
> >
> > ($a = v300.196.172.302.197.172) =~ tr/\x{12c}-\x{130}/\xc0-\xc4/;
> > is($a, v192.196.172.194.197.172, 'UTF range');
> > # got 'DÐDEÐ'
> > # expected '{DÐBEÐ'
> > Can you suggest some pointers towards fixing this?
> > -Sastry
> 
> This "EBCDIC-specific" problem comes down to how to treat code values
> that include Unicode (\x{12c}-\x{130} is surely Unicode) on an EBCDIC
> platform. Native code values in EBCDIC (for example 'A' == 193) mostly
> differ from the range 0..255 in Unicode (for example 'A' == 65), which
> coincides with ASCII/Latin1.
> 
> Thus the middle part of a character range is generally different
> between EBCDIC and Unicode.
> 
> For example consider a character range \xc0-\xc4. Since the mappings
> \xc0 to '{' (an open curly) and \xc4 to D in EBCDIC are definite,
> the range \xc0-\xc4 is equivalent to {-D on EBCDIC platform.
> 
> In EBCDIC {-D (\xc0-\xc4) can be expanded to \xc0\xc1\xc2\xc3\xc4,
> but in Unicode {-D cannot be expanded, as the Unicode scalar values
> of the endpoints are reversed ('{' = U+007B, D = U+0044).
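
To make the point above concrete, here is a rough sketch in Python (not perl
internals). It assumes Python's cp037 codec as a stand-in for the EBCDIC code
page; the helper name native_range is made up for illustration. It expands
\xc0-\xc4 in native byte order and then checks the Unicode order of the
endpoints.

```python
# Assumption: cp037 stands in for the EBCDIC code page in use.
EBCDIC = "cp037"

def native_range(lo, hi):
    """Expand a byte range in native (EBCDIC) order, decoding each byte."""
    return [bytes([b]).decode(EBCDIC) for b in range(lo, hi + 1)]

# Bytes 0xC0..0xC4 in native order.
chars = native_range(0xC0, 0xC4)
print(chars)  # ['{', 'A', 'B', 'C', 'D']

# The same endpoints compared by Unicode code point.
first, last = chars[0], chars[-1]
unicode_order_ok = ord(first) <= ord(last)
print(hex(ord(first)), hex(ord(last)), unicode_order_ok)  # 0x7b 0x44 False
```

So the range is well formed in native order, but its endpoints are reversed
when compared as Unicode code points.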

  
> Actually the current perl implementation is confused:
> at parse time (see toke.c#scan_const) perl treats the range
> in EBCDIC order and so does not catch it as an "Invalid range,"
> though at compile time (see op.c#pmtrans) and at run time
> (see doop.c#do_trans_simple_utf8 and its friends) perl treats
> the range in Unicode order and so generates a strange result.

For this test, since min > max in scan_const as per their Unicode
values, should we raise a warning? In that case the test case is wrong
on the EBCDIC platform! Am I correct?

  
> In my opinion it is necessary to determine how to expand character
> ranges with Unicode (whether in native EBCDIC order or in Unicode order).
> I'm not sure that using the native encoding (ASCII/Latin1/EBCDIC) every
> time (that is the same as "no Unicode within 0..255") makes people happy.

Do you think that perl-5.8.6 is not expanding character ranges with
Unicode? If so, how is this test case working?
 ($a = "\x{12d}\x{12e}\x{12f}\x{130}") =~ tr/\x{12c}-\x{130}/Y/;
All the characters are translated to Y.
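
For comparison, here is a small Python model (not the perl implementation) of
what both tr/// cases would give if every range is expanded purely in Unicode
code point order. The helper make_table is hypothetical.

```python
def make_table(src_lo, src_hi, dst_lo, dst_hi):
    """Pair two ranges position by position, both in code point order."""
    src = range(src_lo, src_hi + 1)
    dst = range(dst_lo, dst_hi + 1)
    return dict(zip(src, dst))

# Models tr/\x{12c}-\x{130}/\xc0-\xc4/ on v300.196.172.302.197.172
table = make_table(0x12C, 0x130, 0xC0, 0xC4)
s = "".join(map(chr, [300, 196, 172, 302, 197, 172]))
result = [ord(c) for c in s.translate(table)]
print(result)  # [192, 196, 172, 194, 197, 172]

# Models tr/\x{12c}-\x{130}/Y/ (the single replacement covers the whole range)
y_table = {cp: ord("Y") for cp in range(0x12C, 0x131)}
y_result = "\u012d\u012e\u012f\u0130".translate(y_table)
print(y_result)  # YYYY
```

In this model the second case has no range on the replacement side and its
search range lies entirely above 255, so it may simply never reach the
ordering question that \xc0-\xc4 raises. That is my guess, not something I
have confirmed in the source.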
 regards
-Sastry

 
> Regards,
> SADAHIRO Tomoyuki
