----- Original Message -----
> From: Steven D'Aprano <st...@pearwood.info>
> To: tutor@python.org
> Cc:
> Sent: Sunday, October 6, 2013 4:52 AM
> Subject: Re: [Tutor] class decorator question
>
> On Sat, Oct 05, 2013 at 12:26:14PM -0700, Albert-Jan Roskam wrote:
>
>> >> On http://lucumr.pocoo.org/2013/5/21/porting-to-python-3-redux/ I saw
>> >> a very cool and useful example of a class decorator. It (re)implements
>> >> __str__ and __unicode__ in case Python 2 is used. For Python 3, the
>> >> decorator does nothing. I wanted to generalize this decorator so the
>> >> __str__ method under Python 2 encodes the string to an arbitrary
>> >> encoding. This is what I've created: http://pastebin.com/vghD1bVJ.
>> >>
>> >> It works, but the code is not very easy to understand, I am afraid.
>> >
>> >It's easy to understand, it's just doing it the wrong way. It creates
>> >a subclass of your class, which it shouldn't do.
>>
>> Why not? Because it's an unusual coding pattern? Or is it inefficient?
>
> It is both of those things. (Well, the inefficiency is minor.) My
> main objection is that it is inelegant, like using a screwdriver as
> a chisel instead of using a chisel -- even when it's "good enough",
> it's not something you want other people to see you doing if you
> care about looking like a craftsman :-)

or use a shoe to hammer a nail in the wall... ;-)

> Another issue is to do with naming. In your example, you decorate Test.
> What that means in practice is that you create a new class, Klass(Test),
> throw away Test, and bind Klass to the top-level name Test. So in effect
> you're doing this:
>
> class Test: pass  # The undecorated version.
>
> class Klass(Test): pass  # Subclass it inside the decorator.
>
> Test = Klass  # Throw away the original and re-use the variable name.
>
> But classes, like functions, have *two* names. They have the name they
> are bound to, the variable name (*usually* one of these, but sometimes
> zero or two or more). And they have their own internal name:
>
> Test.__name__
> => returns "Klass"
>
>
> This will make debugging unnecessarily confusing. If you use your
> decorator three times:
>
> @implements_to_string
> class Spam: pass
>
> @implements_to_string
> class Eggs: pass
>
> @implements_to_string
> class Cheese: pass
>
>
> instances of all three of Spam, Eggs and Cheese will claim to be
> instances of "Klass".

That would indeed be *very* confusing.

> Now there is a simple work-around for this: inside the decorator, call
>
> Klass.__name__ = cls.__name__
>
> before returning. But that leads to another issue, where instances of
> the parent, undecorated, class (if any!) and instances of the child,
> decorated, class both claim to be from the same "Test" class. This is
> more of a theoretical concern, since you're unlikely to be instantiating
> the undecorated parent class.
>
>
>> I subclassed because I needed the encoding value in the decorator.
>> But subclassing may indeed have been overkill.
>
> Yes :-)
>
> The encoding value isn't actually defined until long after the decorator
> has finished doing its work, after the class is decorated, and an
> instance is created. So there is no encoding value used in the decorator
> itself.
> The decorator can trivially refer to the encoding value, so long
> as that doesn't actually get executed until after an instance is
> created:
>
> def decorate(cls):
>     def spam(self):
>         print(self.encoding)
>     cls.spam = spam
>     return cls
>
> works fine without subclassing.

waah, why didn't I think of this? I've been making this way more complicated than needed. self.__dict__["encoding"] = self.encoding (see also below) was another way I considered to pass the encoding value from the class to its decorator. I even considered making a class decorator with arguments. All unnecessary.

>
>> >Here's a better
>> >approach: inject the appropriate methods into the class directly. Here's
>> >a version for Python 3:
> [...]
>> >This avoids overwriting __str__ if it is already defined, and likewise
>> >for __bytes__.
>>
>> Doesn't a class always have a __str__ implementation?
>
> No. Where is the __str__ implementation here?
>
> class X:
>     pass
>
> This class defines no methods at all. Its *superclass*, object in Python
> 3, defines methods such as __str__. But you'll notice that I didn't call
>
> hasattr(cls, '__str__')
>
> since that will return True, due to object having a __str__ method. I
> called
>
> '__str__' in cls.__dict__
>
> which only returns True if cls explicitly defines a __str__ method.

aaaaaahh, yes, of course these are not the same, so 'method_name' in cls.__dict__ tests whether method_name is *implemented* in that class. In many/most cases hasattr is all you need because you want to know whether method_name can be *called* in that class.

>> Nice, thanks Steven. I made a couple of versions after reading your
>> advice. The main change is that I still had to somehow retrieve the
>> encoding value from the class to be decorated (decoratee?). I simply
>> stored it in __dict__. Here is the second version that I created:
>> http://pastebin.com/te3Ap50C. I tested it in Python 2 and 3.
>
> Not sufficiently :-) Your test class has problems. See below.
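
To make the hasattr versus cls.__dict__ distinction above concrete, here is a minimal sketch. The class Y is made up for the comparison; X is the empty class from the quoted example, written with an explicit object base so it behaves the same on Python 2.

```python
class X(object):
    pass


class Y(object):
    def __str__(self):
        return "Y instance"


# hasattr follows inheritance, so it also finds object.__str__.
print(hasattr(X, "__str__"), hasattr(Y, "__str__"))

# Membership in cls.__dict__ only sees names defined in the class body itself.
print("__str__" in X.__dict__, "__str__" in Y.__dict__)
```

The first check answers "can this be called?", the second answers "did this class define it?", and only the second is useful for deciding whether a decorator would overwrite something.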
>
>> The Test
>> class contains __str__ and __unicode__ which are renamed and redefined
>> by the decorator if Python 3 (or 4, or..) is used.
>>
>>
>> General question: I am using pastebin now. Is that okay, given that
>> this is not part of the "memory" of the Python Tutor archive? It might
>> be annoying if people search the archives and get 404s if they try to
>> follow these links. Just in case I am also pasting the code below:
>
> In my opinion, no it's not okay, particularly if your code is short
> enough to be posted here.
>
> Just because a person has access to this mailing list doesn't
> necessarily mean they have access to pastebin. It might be blocked. The
> site might be down. They might object to websites that require
> Javascript (pastebin doesn't *require* it, but it's only a matter of
> time...). Or they may simply be too busy/lazy to follow the link.

It's also easy to do both. I always hope code in mails does not get mangled (even if it's plain text). The colour coding of pastebin and similar sites helps other readers understand code more easily. And I agree posting long code is a no-no.

>> from __future__ import print_function
>> import sys
>>
>> def decorate(cls):
>>     print("decorate called")
>>     if sys.version_info[0] > 2:
>>         cls.__dict__["__str__"].__name__ = '__bytes__'
>>         cls.__dict__["__unicode__"].__name__ = '__str__'
>>         cls.__bytes__ = cls.__dict__["__str__"]
>>         cls.__str__ = cls.__dict__["__unicode__"]
>>     return cls
>
> I thought your aim was to write something that was cross-version and
> that added default __str__ and __unicode__ methods to the class if they
> didn't already exist? [looks back at the original code...] Ah no, my
> mistake, I misunderstood.
>
> The above requires the caller to write their classes using the Python 2
> style __str__ and __unicode__ methods. __unicode__ isn't even mandatory
> in Python 2, but your decorator won't work without it!
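
To see that last point in isolation, the sketch below copies only the Python 3 branch of the decorator quoted above (the version check and the print are dropped so it behaves the same on any interpreter) and applies it to a hypothetical class, OnlyStr, that defines __str__ but no __unicode__. The class name and the `failed` flag are made up for the illustration.

```python
def decorate(cls):
    """Python 3 branch of the quoted decorator, without the version check."""
    cls.__dict__["__str__"].__name__ = "__bytes__"
    cls.__dict__["__unicode__"].__name__ = "__str__"  # KeyError if missing
    cls.__bytes__ = cls.__dict__["__str__"]
    cls.__str__ = cls.__dict__["__unicode__"]
    return cls


class OnlyStr(object):  # hypothetical class without a __unicode__ method
    def __str__(self):
        return "only str"


try:
    decorate(OnlyStr)
    failed = False
except KeyError as err:
    failed = True
    print("decorator failed, missing:", err)
```

Note that by the time the KeyError is raised, the first line of the decorator has already renamed the __str__ function, so the class is left half-modified.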
>
> As given, your decorator:
> - does nothing in Python 2, even if the caller didn't define __str__
> or __unicode__ methods;

I *know* that I defined three classes that each contain __str__ and __unicode__, so is it still a good idea to test for their existence? So a meta question: How generally applicable should code, in this case a decorator, be? Should one always strive for code that could readily be re-used in other places? It is cool (and efficient, and intellectually gratifying) if code can be re-used, but isn't a downside that the code is more sophisticated/longer than required for a given context? At what point does refined code turn into "bloated software"? http://c2.com/cgi/wiki?PrematureGeneralization

> - fails in Python 3 if the class doesn't define a __unicode__ method;
>
> - does the wrong thing in Python 3 if the class already has correctly
> working __str__ and __bytes__ methods;
>
> - doesn't help you if you have a Python 3 style class and want to use
> it in Python 2;

A Python 3 style class is a class that inherits from object, right (class Foo(object):...)? I indeed had not considered the possibility that the decorator might fail when used for old-style classes.

> - doesn't work well if the decorated class inherits its __str__ and
> __unicode__ methods from a parent class.
>
>
> Admittedly, that last one is tricky, thanks to everything inheriting
> from object.
>
>
>> @decorate
>> class Test(object):
>>
>>     def __init__(self):
>>         self.__dict__["encoding"] = self.encoding
>
> Why are you doing that? What is the outcome you are hoping for, and why
> do you think it is necessary?

See also above. I should have deleted that.

>>     def __str__(self):
>>         return "str called".encode(self.encoding)
>>
>>     def __unicode__(self):
>>         return "unicode called"
>
> These are wrong! Worse, you have multiple errors that cancel each
> other out -- sometimes, two wrongs do make a right.

aargh, of course.
I should have done (methinks):

def __str__(self):
    return self.__unicode__().encode(self.encoding)

def __unicode__(self):
    return u"unicode called"

> In Python 2: calling encode on a byte-string is permitted, but is the
> wrong thing to do. By accident, it (usually?) works, but you shouldn't
> do it. So there's your first wrong.
>
> When converted to Python 3, the __str__ method becomes __bytes__, and is
> supposed to return bytes. Now the "str called" literal is Unicode, and
> encode will work, returning bytes. But it only works because of the
> first wrong -- if you re-write __str__ to use b"str called", or to call
> "str called".decode, your Python 3 __bytes__ method will fail.
>
> In Python 2, __unicode__ ought to return a unicode string, u"unicode
> called". By accident, if you return a byte string, Python will decode it
> using ASCII, and it seems to work. But it's still wrong, and it's
> particularly likely to go wrong if the __unicode__ method does any,
> well, Unicode stuff.
>
> When converted to __str__ by the decorator, the ex-__unicode__ method
> will work, but only because you used a (Python2) byte-string literal
> "..." inside it. If you wrote a u"Unicode string", it would fail in
> Python 3.1 or 3.2 (but work in 3.3 and better).
>
>
>>     @property
>>     def encoding(self):
>>         """In reality this method extracts the encoding from a file"""
>>         return "utf-8"  # rot13 no longer exists in Python3
>
> Why would you do that?
>
> Why not just supply the encoding when you initialise the instance?

Counter question: why would I ask the caller for information if that information can automatically be retrieved?

> def __init__(self, encoding):
>     self.encoding = encoding
>
>
>> if __name__ == "__main__":
>>     t = Test()
>>     if sys.version_info[0] == 2:
>>         print(unicode(t))
>>     print(str(t))
>
> This is insufficient testing. In Python 2, you need to test both
> unicode(t) and str(t). In Python 3, you need to test both str(t) and
> bytes(t).
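
A rough sketch of what testing both conversions on each version could look like, using the corrected __str__/__unicode__ pair from above. The names PY2 and check_conversions are made up here, the decorator is a trimmed copy of the quoted one, and the class-level encoding attribute is a simplifying assumption standing in for the property that reads it from a file. Because of the u"..." literal this needs Python 2 or Python 3.3 and later.

```python
import sys

PY2 = sys.version_info[0] == 2


def decorate(cls):
    # Trimmed copy of the quoted decorator: on Python 3, expose the
    # Python 2 style methods under their Python 3 names.
    if not PY2:
        cls.__bytes__ = cls.__dict__["__str__"]
        cls.__str__ = cls.__dict__["__unicode__"]
    return cls


@decorate
class Test(object):
    encoding = "utf-8"  # assumption: fixed value instead of read from a file

    def __str__(self):
        return self.__unicode__().encode(self.encoding)

    def __unicode__(self):
        return u"unicode called"


def check_conversions(obj):
    """Run both conversions that exist on the running Python and check types."""
    if PY2:
        text, raw = unicode(obj), str(obj)  # noqa: F821 (Python 2 only)
    else:
        text, raw = str(obj), bytes(obj)
    assert isinstance(raw, bytes), type(raw)
    assert not isinstance(text, bytes), type(text)
    return text, raw


text, raw = check_conversions(Test())
print("%r %r" % (text, raw))
```

Checking the types, and not just that nothing raises, is what catches the "two wrongs make a right" situation: a method that returns the wrong kind of string can still print fine.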
>
> It may turn out that, by accident, all four tests work for the given
> Test class. But that's not going to apply to everything.
>
>
> --
> Steven
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor