Oh, I should also add: I think Michael's proposed fix makes this code work correctly.
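
To make the bytes-versus-text point in the quoted message below a bit more concrete, here is a minimal sketch. It is not Michael's fix and not anything from the IronPython code base: the helper names to_text and to_bytes are made up for illustration, and UTF-8 is an assumed encoding. The idea is only that code which marks its byte strings explicitly with b'' (available from 2.6) never has to guess what a bare str call was meant to do.

```python
# Sketch only: to_text/to_bytes are hypothetical helpers, not IronPython or Django APIs.
# Written to run unchanged on Python 2.6+ and Python 3.3+.

text_type = type(u'')    # unicode on CPython 2, str on Python 3
bytes_type = type(b'')   # str on CPython 2.6+, bytes on Python 3


def to_text(value, encoding='utf-8'):
    """Return value as text, decoding only when it really is a byte string."""
    if isinstance(value, text_type):
        return value                      # already text: no implicit codec involved
    if isinstance(value, bytes_type):
        return value.decode(encoding)     # explicit decode (encoding is an assumption)
    return text_type(value)               # numbers and other objects


def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding only when it is text."""
    if isinstance(value, bytes_type):
        return value
    return to_text(value, encoding).encode(encoding)


u = u'1234\u00f6'
s = b'1234'               # explicitly bytes, so there is nothing to guess

print(repr(to_text(u)))   # text stays text, no exception
print(repr(to_text(s)))   # bytes are decoded explicitly
print(repr(to_bytes(u)))  # text is encoded explicitly
```

As I understand it, on IronPython 2.6 type(u'') is the same type as str while type(b'') is the separate bytes type, so the two isinstance checks stay unambiguous there as well. I have only reasoned that through, not run it on every implementation.
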

> -----Original Message-----
> From: users-boun...@lists.ironpython.com
> [mailto:users-boun...@lists.ironpython.com] On Behalf Of Dino Viehland
> Sent: Thursday, February 11, 2010 11:07 AM
> To: Discussion of IronPython; Michael Foord
> Subject: Re: [IronPython] Django, __unicode__, and #20366
>
> Vernon wrote:
> > You need the 'byte' class for Python 3 anyway. Implement it now.
>
> Done! Assuming you mean bytes it's in 2.6 already. Now if everyone
> would upgrade their code to use b'' :)
>
> > A small sample...
> >
> > <code x.py>
> > import sys
> > u = u'1234\u00f6'
> > s = '1234'
> > x = str(s)
> > print type(x), repr(x)
> > x = unicode(s)
> > print type(x), repr(x)
> > try:
> >     x = unicode(u)
> >     print type(x), repr(x)
> > except:
> >     print 'Error=',sys.exc_info()[0]
> > try:
> >     x = str(u)
> >     print type(x), repr(x)
> > except:
> >     print 'Error=',sys.exc_info()[0]
> > </code>
> > --------------------
> >
> > The results...
> >
> > >c:\python26\python.exe x.py
> > <type 'str'> '1234'
> > <type 'unicode'> u'1234'
> > <type 'unicode'> u'1234\xf6'
> > Error= <type 'exceptions.UnicodeEncodeError'>
> >
> > >"c:\program files\Ironpython 2.6\ipy.exe" x.py
> > <type 'str'> '1234'
> > <type 'str'> '1234'
> > Error= <type 'exceptions.UnicodeDecodeError'>
> > Error= <type 'exceptions.UnicodeDecodeError'>
> >
> > >copy x.py x3.py
> > >2to3 -w x3.py
> > >c:\python31\python.exe x3.py
> > <class 'str'> '1234'
> > <class 'str'> '1234'
> > <class 'str'> '1234ö'
> > <class 'str'> '1234ö'
> > ------------------------------
> > One would think that IronPython should produce the same output as
> > Python 3 -- since 'str' and 'unicode' are the same thing in both
> > dialects. In particular, the exception when 'converting' unicode to
> > unicode is just plain wrong.
> >
>
> I'm not going to argue the exception isn't wrong. But saying
> IronPython should output the same thing as an entirely different script
> isn't right either.
> After running 2to3 the script looks like this for me:
>
> import sys
> u = '1234\u00f6'
> s = '1234'
> x = str(s)
> print(type(x), repr(x))
> x = str(s)
> print(type(x), repr(x))
> try:
>     x = str(u)
>     print(type(x), repr(x))
> except:
>     print('Error=',sys.exc_info()[0])
> try:
>     x = str(u)
>     print(type(x), repr(x))
> except:
>     print('Error=',sys.exc_info()[0])
>
> You can argue whether or not 2to3 did the right thing here - it has
> completely dropped the distinction between str and unicode. In reality,
> if this was a script written for Python 2.5 and above, your usage of str
> here is ambiguous. If this script was written for 2.6 and above then
> it's clear you want strings and not bytes, because you'd have used
> bytes/bytearray/b'' to indicate bytes. The problem is there's still
> lots of code which runs on 2.5+ and won't be using bytes/bytearray/b''
> but really is dealing with bytes and not strings.
>
> The fact is this is going to be broken unless we were to make str a
> distinct type from unicode - then there'd be no ambiguity and we
> wouldn't have to guess. But that's a massive change which propagates
> through the entire IronPython code base and involves tons of breaking
> changes. I've looked at doing this before: it spreads everywhere
> and there's lots of new ugliness. We could look at doing it again, but
> it seems like making that massive change and then switching to 3k and
> changing it all back isn't very productive.
>
> _______________________________________________
> Users mailing list
> Users@lists.ironpython.com
> http://lists.ironpython.com/listinfo.cgi/users-ironpython.com