The problem, as you correctly pointed out, is that we don't actually have 
separate data types for ASCII & Unicode strings - we only have Unicode strings.

Therefore, when you ask for a Unicode string from an ASCII string, we have to 
do some magic to figure out whether you're using your string as a byte array 
that contains the bytes for a Unicode string, or as a Unicode string, in which 
case we need to hand you back the original string.
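
To make the ambiguity concrete, here is a minimal sketch in Python 3, where 
str is Unicode-only, much like the single string type described above.  It is 
not IronPython's actual implementation: to_unicode is a hypothetical helper, 
and the rule "no encoding means text, an encoding means bytes" is only an 
assumption about one way to tell the two uses apart.

```python
# Hypothetical sketch, not IronPython's real code. It models a runtime whose
# only string type is Unicode, so "byte strings" are stored as text with one
# character per byte.

def to_unicode(value, encoding=None):
    """Return value as Unicode text under one of two interpretations."""
    if encoding is None:
        # No encoding requested: assume the caller holds real text and
        # return the very same object, untouched.
        return value
    # An encoding was requested: assume each character stands for one byte.
    # latin-1 maps code points 0-255 to identical byte values and raises
    # UnicodeEncodeError if the string holds anything above U+00FF.
    raw = value.encode("latin-1")
    return raw.decode(encoding)


sq2 = "\u00b2"                      # SUPERSCRIPT TWO, as in the bug report

# Text interpretation: the original string comes straight back.
assert to_unicode(sq2) is sq2

# Byte interpretation: the characters C3 A9 are the UTF-8 bytes of U+00E9.
assert to_unicode("caf\u00c3\u00a9", "utf-8") == "caf\u00e9"

# Byte interpretation with ASCII: 0xB2 is not an ASCII byte, so decoding
# fails, which resembles the error in the report.
try:
    to_unicode(sq2, "ascii")
except UnicodeDecodeError as exc:
    print("decode failed at index", exc.start)
```

The point of the sketch is only that the same string gives different answers 
depending on which interpretation is picked, so the choice has to be made 
somewhere.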

I agree w/ you that the error message could be better - my guess would be that 
the "specified code page" here is ASCII but that really is just a guess.  And 
if I'm guessing I'm betting most other people won't know what's going on either 
:).



Do you want to help develop Dynamic languages on CLR? 
(http://members.microsoft.com/careers/search/details.aspx?JobID=6D4754DE-11F0-45DF-8B78-DC1B43134038)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of J. Merrill
Sent: Monday, May 01, 2006 9:15 AM
To: Discussion of IronPython
Subject: Re: [IronPython] unicode bug?

(This presumes that IronPython has separate string and unicode types, like 
CPython does.  If that's not the case, well, "never mind...")

Shouldn't it be the case that calling   typename(value)   does as little work 
as possible if the value is already of the specified type?  That is, it would 
be a shame if
    i = 5
    j = int(i)
did a lot of work to ensure that i is a valid int (within range of 32-bit 
integer etc).  So the sample code should not be testing to see if the 
already-unicode-string can be converted to a unicode string -- should it?  
(That doesn't mean that there's no problem that could be demonstrated by 
slightly different code, like   u = unicode (Sq2 + 'hello')   for example.)
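
The "do as little work as possible" idea can be sketched as follows.  This is 
Python 3 and purely illustrative: as_text is a hypothetical helper, not 
anything in IronPython or CPython, and the ASCII default for bytes is an 
assumption made for the example.

```python
def as_text(value):
    """Convert value to text, doing no work if it already is text."""
    if type(value) is str:
        # Already exactly the target type: return the same object.
        return value
    if isinstance(value, bytes):
        # Real bytes need decoding; ASCII is assumed here for illustration.
        return value.decode("ascii")
    # Anything else goes through the normal conversion.
    return str(value)


sq2 = "\u00b2"
assert as_text(sq2) is sq2            # no decoding attempted, same object
assert as_text(b"hello") == "hello"   # bytes are decoded
assert as_text(5) == "5"              # other types are converted
```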

Shouldn't the error message say "from code page XXX to unicode" rather than 
saying "from specified code page to unicode"?  How else to know (without a lot 
of investigation) what code page was "specified"?

At 11:34 AM 5/1/2006, Dino Viehland wrote (in part)
>Thanks for the bug report, I've got it filed in our bug database.
>
>I'm thinking we should be able to get to this one for beta 7 if it doesn't end 
>up being too complex (Unicode can always be trickier than you initially 
>expect).
>
>-----Original Message-----
>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Cheemeng
>Sent: Sunday, April 30, 2006 2:59 AM
>To: [email protected]
>Subject: [IronPython] unicode bug?
>
>hi,
>
>Sq2 = u'\xb2'
>u = unicode(Sq2)
>print u is Sq2
>
>in CPython, the unicode function returns back the same str,
>in IP, an exception is thrown,
>UnicodeDecodeError: Unable to translate bytes [B2] at index 0 from
>specified code page to Unicode.
>
>regards,
>cheemeng


J. Merrill / Analytical Software Corp


_______________________________________________
users mailing list
[email protected]
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com