[Python-Dev] PEP 0424: A method for exposing a length hint
Hi all,

I've just submitted a PEP proposing making __length_hint__ a public API for
users to define and other VMs to implement:

PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date$
Author: Alex Gaynor
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4

Abstract
========

CPython currently defines an ``__length_hint__`` method on several types, such
as various iterators. This method is then used by various other functions (such
as ``map``) to presize lists based on the estimate returned by
``__length_hint__``. Types can then define ``__length_hint__`` which are not
sized, and thus should not define ``__len__``, but can estimate or compute a
size (such as many iterators).

Proposal
========

This PEP proposes formally documenting ``__length_hint__`` for other
interpreter and non-standard library Python to implement.

``__length_hint__`` must return an integer, and is not required to be accurate.
It may return a value that is either larger or smaller than the actual size of
the container. It may raise a ``TypeError`` if a specific instance cannot have
its length estimated. It may not return a negative value.

Rationale
=========

Being able to pre-allocate lists based on the expected size, as estimated by
``__length_hint__``, can be a significant optimization. CPython has been
observed to run some code faster than PyPy, purely because of this optimization
being present.

Open questions
==============

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in its constructor for supplying a length
  hint.
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8

Alex
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
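To make the proposal concrete, here is a minimal sketch of an iterator that is not sized (it defines no ``__len__``) but can still report how many items it has left. The ``Countdown`` class is hypothetical and only for illustration. The ``operator.length_hint`` call assumes Python 3.4 or later, where a helper of that name is available; it is not part of the draft as posted in this message.

```python
import operator


class Countdown:
    """Hypothetical iterator yielding n, n-1, ..., 1 (illustration only)."""

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value

    def __length_hint__(self):
        # Not a __len__: the object is an iterator, not a sized container,
        # but it does know how many items it has yet to yield.
        return self.remaining


it = Countdown(5)
hint_before = operator.length_hint(it)  # hint before consuming anything
first = next(it)                        # consume one item
hint_after = operator.length_hint(it)   # hint shrinks as items are consumed
items = list(it)                        # list() may use the hint to presize
```

The hint here happens to be exact, but the proposal only requires a non-negative integer estimate.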
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
2012/7/14 Alex Gaynor :
>
> Proposal
> ========
>
> This PEP proposes formally documenting ``__length_hint__`` for other
> interpreter and non-standard library Python to implement.
>
> ``__length_hint__`` must return an integer, and is not required to be
> accurate.
> It may return a value that is either larger or smaller than the actual size of
> the container. It may raise a ``TypeError`` if a specific instance cannot have
> its length estimated. It may not return a negative value.

And what happens if you return a negative value?

>
> Rationale
> =========
>
> Being able to pre-allocate lists based on the expected size, as estimated by
> ``__length_hint__``, can be a significant optimization. CPython has been
> observed to run some code faster than PyPy, purely because of this
> optimization being present.
>
> Open questions
> ==============
>
> There are two open questions for this PEP:
>
> * Should ``list`` expose a kwarg in it's constructor for supplying a length
>   hint.
> * Should a function be added either to ``builtins`` or some other module which
>   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Let's try to keep this as limited as possible for a public API.

--
Regards,
Benjamin
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson wrote:

> 2012/7/14 Alex Gaynor :
> >
> > Proposal
> > ========
> >
> > This PEP proposes formally documenting ``__length_hint__`` for other
> > interpreter and non-standard library Python to implement.
> >
> > ``__length_hint__`` must return an integer, and is not required to be
> > accurate.
> > It may return a value that is either larger or smaller than the actual
> > size of the container. It may raise a ``TypeError`` if a specific instance
> > cannot have its length estimated. It may not return a negative value.
>
> And what happens if you return a negative value?
>

ValueError, the same as with len.

> >
> > Rationale
> > =========
> >
> > Being able to pre-allocate lists based on the expected size, as estimated
> > by ``__length_hint__``, can be a significant optimization. CPython has been
> > observed to run some code faster than PyPy, purely because of this
> > optimization being present.
> >
> > Open questions
> > ==============
> >
> > There are two open questions for this PEP:
> >
> > * Should ``list`` expose a kwarg in it's constructor for supplying a
> >   length hint.
> > * Should a function be added either to ``builtins`` or some other module
> >   which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>
> Let's try to keep this as limited as possible for a public API.
>

Sounds reasonable to me! Should we just go ahead and strip those out now?

> --
> Regards,
> Benjamin
>

Alex

--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sat, Jul 14, 2012 at 4:21 PM, Alex Gaynor wrote:
>
> On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson wrote:
>>
>> 2012/7/14 Alex Gaynor :
>> >
>> > Proposal
>> > ========
>> >
>> > This PEP proposes formally documenting ``__length_hint__`` for other
>> > interpreter and non-standard library Python to implement.
>> >
>> > ``__length_hint__`` must return an integer, and is not required to be
>> > accurate.
>> > It may return a value that is either larger or smaller than the actual
>> > size of the container. It may raise a ``TypeError`` if a specific
>> > instance cannot have its length estimated. It may not return a negative
>> > value.
>>
>> And what happens if you return a negative value?
>>
>
> ValueError, the same as with len.
>
>> >
>> > Rationale
>> > =========
>> >
>> > Being able to pre-allocate lists based on the expected size, as
>> > estimated by ``__length_hint__``, can be a significant optimization.
>> > CPython has been observed to run some code faster than PyPy, purely
>> > because of this optimization being present.
>> >
>> > Open questions
>> > ==============
>> >
>> > There are two open questions for this PEP:
>> >
>> > * Should ``list`` expose a kwarg in it's constructor for supplying a
>> >   length hint.
>> > * Should a function be added either to ``builtins`` or some other module
>> >   which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>>
>> Let's try to keep this as limited as possible for a public API.
>>
>
> Sounds reasonable to me! Should we just go ahead and strip those out now?

I'm +1 on not having a public API for this. Ultimately the contract for a
length hint will depend heavily upon what you need it for. Some applications
would require a length hint to be an "at least" others an "at most" and others
something else entirely. Given that the contract here appears to be >=0, I
don't think the length hint is particularly useful to the public at large.
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
2012/7/14 Alex Gaynor :
>
> On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson wrote:
>>
>> 2012/7/14 Alex Gaynor :
>> >
>> > Proposal
>> > ========
>> >
>> > This PEP proposes formally documenting ``__length_hint__`` for other
>> > interpreter and non-standard library Python to implement.
>> >
>> > ``__length_hint__`` must return an integer, and is not required to be
>> > accurate.
>> > It may return a value that is either larger or smaller than the actual
>> > size of the container. It may raise a ``TypeError`` if a specific
>> > instance cannot have its length estimated. It may not return a negative
>> > value.
>>
>> And what happens if you return a negative value?
>>
>
> ValueError, the same as with len.

CPython will probably have to be updated to not ignore it if you return
"melons".

>
>> >
>> > Rationale
>> > =========
>> >
>> > Being able to pre-allocate lists based on the expected size, as
>> > estimated by ``__length_hint__``, can be a significant optimization.
>> > CPython has been observed to run some code faster than PyPy, purely
>> > because of this optimization being present.
>> >
>> > Open questions
>> > ==============
>> >
>> > There are two open questions for this PEP:
>> >
>> > * Should ``list`` expose a kwarg in it's constructor for supplying a
>> >   length hint.
>> > * Should a function be added either to ``builtins`` or some other module
>> >   which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>>
>> Let's try to keep this as limited as possible for a public API.
>>
>
> Sounds reasonable to me! Should we just go ahead and strip those out now?

Certainly.

--
Regards,
Benjamin
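The two failure modes discussed above (a negative hint and a non-integer hint such as "melons") can be probed with a short sketch. It assumes Python 3.4 or later, where ``operator.length_hint`` is available and rejects both cases; that is later behaviour and not something this thread had settled. The class names and the ``describe`` helper are hypothetical.

```python
import operator


class Melons:
    """Hypothetical object whose hint is not an integer."""

    def __length_hint__(self):
        return "melons"


class Negative:
    """Hypothetical object whose hint is negative."""

    def __length_hint__(self):
        return -1


def describe(obj):
    """Return the hint, or the name of the exception raised while asking."""
    try:
        return operator.length_hint(obj)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


melons_result = describe(Melons())      # non-integer hint is rejected
negative_result = describe(Negative())  # negative hint is rejected
```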
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On 7/14/2012 6:11 PM, Alex Gaynor wrote:
...

Various thoughts:

"This method is then used by various other functions (such as ``map``) to
presize lists" -- map no longer produces lists. This only makes sense in 3.x
if you mean that map can pass along the value of its inputs.

"Types can then define ``__length_hint__`` which are not sized, and thus
should not define ``__len__``," is awkwardly phrased. I think you mean "Types
that are not sized and should not define __len__ can then define
__length_hint__."

What do 'sized' and 'should' mean? Some iterators know exactly how many items
they have yet to yield. The main implication of having a __len__ versus a
__length_hint__ method seems to be its bool() value when empty.

If lists were to get a new keyword arg, so should the other classes based on
one internal array. I see this has been removed.

Generator functions are the nicest way to define iterators in Python.
Generator instances returned from generator functions cannot be given a length
hint. They are not directly helped. However ...

Not addressed in the PEP: do consumers of __length_hint__ look for it (and
__len__) before or after calling iter(input), or both? If before, then the
following should work.

    class gwlh:  # generator with length hint
        def __init__(self, gen, len):
            self.gen = gen
            self.len = len
        def __iter__(self):
            return self.gen
        def __length_hint__(self):
            # return the stored hint, not the len builtin
            return self.len

Do transformation iterators pass through hints from inputs? Does map(f,
iterable) look for len or hint on iterable? Ditto for some itertools, like
chain (add lengths). Any guidelines in the PEP?

--
Terry Jan Reedy
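The before-or-after-iter() question can be explored with a small sketch along the lines of the wrapper above. ``HintedIterable`` and ``squares`` are hypothetical names, and ``operator.length_hint`` is assumed to be available (Python 3.4 or later). The sketch only shows where the hint is visible; it makes no claim about which object any particular consumer actually inspects.

```python
import operator


class HintedIterable:
    """Hypothetical wrapper attaching a caller-supplied hint to an iterable."""

    def __init__(self, iterable, hint):
        self._iterable = iterable
        self._hint = hint

    def __iter__(self):
        return iter(self._iterable)

    def __length_hint__(self):
        return self._hint


def squares(n):
    """Plain generator function; its instances carry no length hint."""
    for i in range(n):
        yield i * i


bare_hint = operator.length_hint(squares(4))   # generator: falls back to default 0

wrapped = HintedIterable(squares(4), 4)
wrapped_hint = operator.length_hint(wrapped)   # hint is visible on the wrapper

# After iter(), the consumer holds the bare generator and the hint is gone.
iterator_hint = operator.length_hint(iter(wrapped))

result = list(wrapped)
```

A consumer that only looks at the object returned by iter() would never see the wrapper's hint, which is why the ordering matters.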
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson wrote:
>> Open questions
>> ==============
>>
>> There are two open questions for this PEP:
>>
>> * Should ``list`` expose a kwarg in it's constructor for supplying a length
>>   hint.
>> * Should a function be added either to ``builtins`` or some other module
>>   which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>
> Let's try to keep this as limited as possible for a public API.

Length hints are very useful for *any* container implementation,
whether those containers are in the standard library or not. Just as
we exposed operator.index when __index__ was added, we should expose
an "operator.length_hint" function with the following semantics:

    def length_hint(obj):
        """Return an estimate of the number of items in obj. This is
        useful for presizing containers when building from an iterable.

        If the object supports len(), the result will be exact.
        Otherwise, it may over or underestimate by an arbitrary amount. The
        result will be an integer >= 0.
        """
        try:
            return len(obj)
        except TypeError:
            try:
                get_hint = obj.__length_hint__
            except AttributeError:
                return 0
            hint = get_hint()
            if not isinstance(hint, int):
                raise TypeError("Length hint must be an integer, not %r" % type(hint))
            if hint < 0:
                raise ValueError("Length hint (%r) must be >= 0" % hint)
            return hint

There's no reason to make pure Python container implementations
reimplement all that for themselves.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
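As a rough illustration of how a pure Python container might use such a function for presizing, consider the sketch below. The ``collect`` function and the ``Hinted`` class are hypothetical, and the sketch calls ``operator.length_hint`` on the assumption that it is available (Python 3.4 or later). Because the hint may be wrong in either direction, the code has to cope with both an overestimate and an underestimate.

```python
import operator


class Hinted:
    """Hypothetical iterable that reports whatever hint it was given."""

    def __init__(self, iterable, hint):
        self._iterable = iterable
        self._hint = hint

    def __iter__(self):
        return iter(self._iterable)

    def __length_hint__(self):
        return self._hint


def collect(iterable):
    """Build a list from iterable, presizing the buffer from its hint."""
    hint = operator.length_hint(iterable)  # 0 when nothing is known
    buffer = [None] * hint                 # allocate the estimated size up front
    count = 0
    for item in iterable:
        if count < len(buffer):
            buffer[count] = item           # fill a preallocated slot
        else:
            buffer.append(item)            # hint was too small: grow as usual
        count += 1
    del buffer[count:]                     # hint was too large: trim the excess
    return buffer


exact = collect(Hinted(range(5), 5))
underestimated = collect(Hinted(range(5), 2))
overestimated = collect(Hinted(range(5), 10))
no_hint = collect(x for x in range(5))
```

All four calls produce the same list; the hint only changes how much is allocated in advance.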
Re: [Python-Dev] PEP 0424: A method for exposing a length hint
On Sat, Jul 14, 2012 at 10:16 PM, Nick Coghlan wrote:

> On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson wrote:
> >> Open questions
> >> ==============
> >>
> >> There are two open questions for this PEP:
> >>
> >> * Should ``list`` expose a kwarg in it's constructor for supplying a
> >>   length hint.
> >> * Should a function be added either to ``builtins`` or some other
> >>   module which calls ``__length_hint__``, like ``builtins.len`` calls
> >>   ``__len__``.
> >
> > Let's try to keep this as limited as possible for a public API.
>
> Length hints are very useful for *any* container implementation,
> whether those containers are in the standard library or not. Just as
> we exposed operator.index when __index__ was added, we should expose
> an "operator.length_hint" function with the following semantics:
>
>     def length_hint(obj):
>         """Return an estimate of the number of items in obj. This is
>         useful for presizing containers when building from an iterable.
>
>         If the object supports len(), the result will be exact.
>         Otherwise, it may over or underestimate by an arbitrary amount. The
>         result will be an integer >= 0.
>         """
>         try:
>             return len(obj)
>         except TypeError:
>             try:
>                 get_hint = obj.__length_hint__
>             except AttributeError:
>                 return 0
>             hint = get_hint()
>             if not isinstance(hint, int):
>                 raise TypeError("Length hint must be an integer, not %r" % type(hint))
>             if hint < 0:
>                 raise ValueError("Length hint (%r) must be >= 0" % hint)
>             return hint
>
> There's no reason to make pure Python container implementations
> reimplement all that for themselves.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
>

Sounds reasonable to me, the only issue with your pseudocode (err... I mean
Python ;)), is that there's no way for the __length_hint__ to specify that
that particular instance can't have a length hint computed. E.g. imagine some
sort of lazy stream that cached itself, and only wanted to offer a length hint
if it had already been evaluated. Without an exception to raise, it has to
return whatever the magic value for length_hint is (in your impl it appears to
be 0, the current _PyObject_LengthHint method in CPython has a required
`default` parameter). The PEP proposes using TypeError for that.

Anyways that code looks good, do you want to add it to the PEP?

Alex
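The lazy-stream case can be sketched as follows. This is one possible reading of the suggestion, not code from the PEP: ``length_hint_or_default`` is a hypothetical variant of the function quoted above that takes a caller-supplied ``default`` and treats a ``TypeError`` raised by ``__length_hint__`` as "no hint available". ``CachedStream`` is likewise hypothetical.

```python
def length_hint_or_default(obj, default):
    """Hypothetical variant: TypeError from __length_hint__ means 'no hint'."""
    try:
        return len(obj)
    except TypeError:
        pass
    try:
        get_hint = type(obj).__length_hint__
    except AttributeError:
        return default
    try:
        hint = get_hint(obj)
    except TypeError:
        # The instance declined to estimate its length.
        return default
    if not isinstance(hint, int):
        raise TypeError("Length hint must be an integer, not %r" % type(hint))
    if hint < 0:
        raise ValueError("Length hint (%r) must be >= 0" % hint)
    return hint


class CachedStream:
    """Hypothetical lazy stream that only knows its size once evaluated."""

    def __init__(self, source):
        self._source = iter(source)
        self._cache = None

    def evaluate(self):
        if self._cache is None:
            self._cache = list(self._source)
        return self._cache

    def __iter__(self):
        return iter(self.evaluate())

    def __length_hint__(self):
        if self._cache is None:
            raise TypeError("length unknown until the stream is evaluated")
        return len(self._cache)


stream = CachedStream(x for x in range(3))
hint_before = length_hint_or_default(stream, 8)  # not evaluated: caller's default
stream.evaluate()
hint_after = length_hint_or_default(stream, 8)   # evaluated: real cached size
```

One trade-off of this design is that a genuine ``TypeError`` bug inside a ``__length_hint__`` implementation would be silently swallowed and reported as "no hint".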