On Thu, 11 Jul 2013 04:15:37 +0100, Joshua Landau wrote:

> I have this innocent and simple code:
>
> from collections import deque
> exhaust_iter = deque(maxlen=0).extend

At this point, exhaust_iter is another name for the bound instance method
"extend" of one specific deque instance.

Other implementations may do otherwise[1], but CPython optimizes built-in
methods and functions. E.g. they have no __dict__ so you can't add
attributes to them. When you look up exhaust_iter.__doc__, you are
actually looking up (type(exhaust_iter)).__doc__, which is a descriptor:

py> type(exhaust_iter).__doc__
<attribute '__doc__' of 'builtin_function_or_method' objects>
py> type(type(exhaust_iter).__doc__)
<class 'getset_descriptor'>

Confused yet? Don't worry, you will be...

So, evaluating exhaust_iter.__doc__:

1) looks up '__doc__' on the class "builtin_function_or_method", not the
   instance;

2) which looks up '__doc__' on the class __dict__:

py> type(exhaust_iter).__dict__['__doc__']
<attribute '__doc__' of 'builtin_function_or_method' objects>

3) This is a descriptor with __get__ and __set__ methods. Because the
   actual method is written in C, you can't access its internals except
   via the API: even the class __dict__ is not really a dict, it's a
   wrapper around a dict:

py> type(type(exhaust_iter).__dict__)
<class 'mappingproxy'>

Anyway, we have a descriptor that returns the doc string:

py> descriptor = type(exhaust_iter).__doc__
py> descriptor.__get__(exhaust_iter)
'Extend the right side of the deque with elements from the iterable'

My guess is that it is fetching this from some private C member, which
you can't get to from Python except via the descriptor. And you can't set
it:

py> descriptor.__set__(exhaust_iter, '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: attribute '__doc__' of 'builtin_function_or_method'
objects is not writable

which is probably because if you could write to it, it would change the
docstring for *every* deque. And that would be bad.

If this were a pure-Python method, you could probably bypass the
descriptor, but it's a C-level built-in. I think you're out of luck.

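The interpreter session above can be condensed into a short script. This
is only a sketch, assuming CPython 3; the names descriptor, original_doc,
has_instance_dict and assignment_error are made up for illustration, and
another implementation might raise a different exception type, which is
why both AttributeError and TypeError are caught.

```python
from collections import deque

exhaust_iter = deque(maxlen=0).extend

# The docstring is served by a descriptor stored on the *type* of the
# bound built-in method, not by anything stored on the object itself.
descriptor = type(exhaust_iter).__dict__['__doc__']
original_doc = descriptor.__get__(exhaust_iter)

# Built-in methods carry no per-instance __dict__, so there is nowhere
# to hang a custom attribute.
has_instance_dict = hasattr(exhaust_iter, '__dict__')

# Assigning a new docstring is rejected (AttributeError on CPython).
try:
    exhaust_iter.__doc__ = "Exhaust an iterator efficiently"
except (AttributeError, TypeError) as exc:
    assignment_error = exc
else:
    assignment_error = None

print(repr(original_doc))
print(has_instance_dict)
print(type(assignment_error).__name__)
```

The exact wording of the docstring can differ between Python versions, so
the script prints it rather than comparing it against a fixed string.
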
I think the right solution here is the trivial:

def exhaust(it):
    """Doc string here."""
    deque(maxlen=0).extend(it)

which will be fast enough for all but the tightest inner loops. But if
you really care about optimizing this:

def factory():
    eatit = deque(maxlen=0).extend
    def exhaust_iter(it):
        """Doc string goes here"""
        eatit(it)
    return exhaust_iter

exhaust_it = factory()
del factory

which will be about as efficient as you can get while still having a
custom docstring.

But really, I'm having trouble understanding what sort of application
would have "run an iterator to exhaustion without doing anything with the
values" as the performance bottleneck :-)

> exhaust_iter.__doc__ = "Exhaust an iterator efficiently [...]"
>
> Obviously it does not work.

Even if it did work, it would not do what you hope. Because __doc__ is a
dunder attribute (double leading and trailing underscores), help()
currently looks it up on the class, not the instance:

class Spam:
    "Spam spam spam"

x = Spam()
help(x)
=> displays "Spam spam spam"

x.__doc__ = "Yummy spam"
help(x)
=> still displays "Spam spam spam"

> Is there a way to get it to work simply and
> without creating a new scope (which would be a rather inefficient a way
> to set documentation, and would hamper introspection)?
>
> How about dropping the "simply" requirement?

I don't believe so.

[1] IronPython and Jython both currently do the same thing as CPython, so
even if this is not explicitly language-defined behaviour, it looks like
it may be de facto standard behaviour.

-- 
Steven