Holger Joukl wrote:
> [EMAIL PROTECTED] wrote on 13.12.2006 11:02:30:
>
> > Holger Joukl wrote:
> > > Hi there,
> > >
> > > I consider the behaviour of unicode() inconvenient wrt conversion of
> > > non-string arguments.
> > > While you can do:
> > >
> > > >>> unicode(17.3)
> > > u'17.3'
> > >
> > > you cannot do:
> > >
> > > >>> unicode(17.3, 'ISO-8859-1', 'replace')
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in ?
> > > TypeError: coercing to Unicode: need string or buffer, float found
> > > >>>
> > > [...]
> > > Any reason why unicode() with a non-string argument should not allow
> > > the encoding and errors arguments?
> >
> > There is a reason: encoding is a property of bytes, it is not applicable
> > to other objects.
>
> Ok, but I still don't see why these arguments shouldn't simply be silently
> ignored for non-string arguments.

That's a rather bizarre and sloppy approach. Should

    unicode(17.3, 'just-having-fun', 'I-do-not-like-errors')
    unicode(17.3, 'sdlfkj', 'ewrlkj', 'eoirj', 'sdflkj')

work?

> > > Or some good solution to work around my problem?
> >
> > Do not put undecoded bytes in a mixed-type argument list. A rule of
> > thumb working with unicode: decode as soon as possible, encode as late
> > as possible.
>
> It's not always that easy when you deal with a tree data structure with the
> tree elements containing different data types and your user may decide to
> output root.element.subelement.whateverData.
> I have the problems in a logging mechanism, and it would vanish if
> unicode(<non-string>, encoding, errors) would work and just ignore the
> obsolete arguments.

I don't really see from your example what stops you from putting unicode
instead of bytes into your tree, but I can believe some libraries can cause
some extra work. That's a problem with the libraries, not with the builtin
function unicode(). Would you be happy if the floating point value 17.3 were
stored as 8 bytes in your tree? After all, that is how 17.3 is actually
represented in computer memory. Same story with unicode: if some library
gives you raw bytes, *you* have to do the extra work later.

-- Leo

--
http://mail.python.org/mailman/listinfo/python-list
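
For illustration only, here is a minimal sketch of the workaround discussed above: decode raw bytes explicitly and convert everything else without an encoding. It is written for Python 3, where str plays the role of the unicode() builtin from this thread and bytes plays the role of the old 8-bit str. The helper name to_text, its default arguments, and the sample values are assumptions made up for this example; none of them come from the thread or from any library.

```python
def to_text(value, encoding="ISO-8859-1", errors="replace"):
    """Return value as text, decoding only when it is raw bytes.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    if isinstance(value, (bytes, bytearray)):
        # Only bytes carry an encoding, so only bytes are decoded.
        return bytes(value).decode(encoding, errors)
    # Floats, ints, text and other objects are converted without an encoding.
    return str(value)


# Hypothetical mixed-type values, as a logging call might receive them.
values = [17.3, b"caf\xe9", "already text", 42]
line = " | ".join(to_text(v) for v in values)
print(line)
```

The type check stays in the caller's own helper instead of being hidden inside the builtin, so a misspelled codec name still raises an error when bytes are actually decoded. Decoding the bytes when they first enter the tree would make even this helper unnecessary.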