Re: [IronPython] Django, __unicode__, and #20366

Dino Viehland Thu, 11 Feb 2010 11:11:46 -0800

Oh, I should also add I think Michael's proposed fix makes this code work
correctly.
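
For anyone skimming the thread, here is a minimal sketch of the text/bytes
distinction discussed in the quoted message below. It is written in Python 3
syntax (an assumption, since the sample in the thread is Python 2), and the
variable names are made up for illustration. It only shows that encoding
non-ASCII text with the ASCII codec fails, which is what CPython 2 does
implicitly for str(u), while naming a codec or using a b'' literal makes the
intent explicit. It is not Michael's fix.

```python
# Sketch only: Python 3 syntax, where str is text and bytes is binary data.
text = '1234\u00f6'  # text containing one non-ASCII character

# CPython 2's str(u) implicitly encodes with the default (ASCII) codec.
# Doing the same step explicitly shows where UnicodeEncodeError comes from.
try:
    text.encode('ascii')
    ascii_result = 'ok'
except UnicodeEncodeError as exc:
    ascii_result = type(exc).__name__

# Naming the codec removes the guesswork: text -> bytes -> text.
data = text.encode('utf-8')
round_trip = data.decode('utf-8')

# A bytes literal states the intent directly (the b'' prefix exists since 2.6).
raw = b'1234'

print(ascii_result)
print(type(data).__name__, repr(data))
print(type(round_trip).__name__, round_trip == text)
print(type(raw).__name__, raw.decode('ascii'))
```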


> -----Original Message-----
> From: [email protected] [mailto:users-
> [email protected]] On Behalf Of Dino Viehland
> Sent: Thursday, February 11, 2010 11:07 AM
> To: Discussion of IronPython; Michael Foord
> Subject: Re: [IronPython] Django, __unicode__, and #20366
> 
> Vernon wrote:
> > You need the 'byte' class for Python 3 anyway. Implement it now.
> 
> Done!  Assuming you mean bytes, it's in 2.6 already.  Now if everyone
> would upgrade their code to use b'' :)
> 
> > A small sample...
> >
> > <code x.py>
> > import sys
> > u = u'1234\u00f6'
> > s = '1234'
> > x = str(s)
> > print type(x), repr(x)
> > x = unicode(s)
> > print type(x), repr(x)
> > try:
> >    x = unicode(u)
> >    print type(x), repr(x)
> > except:
> >    print 'Error=',sys.exc_info()[0]
> > try:
> >    x = str(u)
> >    print type(x), repr(x)
> > except:
> >    print 'Error=',sys.exc_info()[0]
> > </code>
> > --------------------
> >
> > The results...
> >
> > >c:\python26\python.exe x.py
> > <type 'str'> '1234'
> > <type 'unicode'> u'1234'
> > <type 'unicode'> u'1234\xf6'
> > Error= <type 'exceptions.UnicodeEncodeError'>
> >
> > >"c:\program files\Ironpython 2.6\ipy.exe" x.py
> > <type 'str'> '1234'
> > <type 'str'> '1234'
> > Error= <type 'exceptions.UnicodeDecodeError'>
> > Error= <type 'exceptions.UnicodeDecodeError'>
> >
> > >copy x.py x3.py
> > >2to3 -w x3.py
> > >c:\python31\python.exe x3.py
> > <class 'str'> '1234'
> > <class 'str'> '1234'
> > <class 'str'> '1234ö'
> > <class 'str'> '1234ö'
> > ------------------------------
> > One would think that IronPython should produce the same output as
> > Python 3 -- since 'str' and 'unicode' are the same thing in both
> > dialects. In particular, the exception when 'converting' unicode to
> > unicode is just plain wrong.
> 
> 
> I'm not going to argue the exception isn't wrong.  But saying
> IronPython should output the same thing as an entirely different script
> isn't right either.  After running 2to3 the script looks like this for
> me:
> 
> import sys
> u = '1234\u00f6'
> s = '1234'
> x = str(s)
> print(type(x), repr(x))
> x = str(s)
> print(type(x), repr(x))
> try:
>     x = str(u)
>     print(type(x), repr(x))
> except:
>     print('Error=',sys.exc_info()[0])
> try:
>     x = str(u)
>     print(type(x), repr(x))
> except:
>     print('Error=',sys.exc_info()[0])
> 
> You can argue whether or not 2to3 did the right thing here - it has
> completely dropped the distinction between str and unicode.  In reality
> if this was a script written for Python 2.5 and above your usage of str
> here is ambiguous.  If this script was written for 2.6 and above then
> it's clear you want strings and not bytes because you'd have used
> bytes/bytearray/b'' to indicate bytes.  The problem is there's still
> lots of code which runs on 2.5+ and won't be using bytes/bytearray/b''
> but really is dealing with bytes and not strings.
> 
> The fact is this is going to be broken unless we were to make str be a
> distinct type from Unicode - then there'd be no ambiguity and we
> wouldn't have to guess.  But that's a massive change which propagates
> through the entire IronPython code base and involves tons of breaking
> changes.  I've looked at doing this before and it spreads everywhere
> and there's lots of new ugliness.  We could look at doing it again but
> it seems like making that massive change and then switching to 3k and
> changing it all back isn't very productive.
> 
> 
> 
> _______________________________________________
> Users mailing list
> [email protected]
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com