[Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alex Gaynor
Hi all,

I've just submitted a PEP proposing making __length_hint__ a public API for 
users to define and other VMs to implement:

PEP: 424
Title: A method for exposing a length hint
Version: $Revision$
Last-Modified: $Date$
Author: Alex Gaynor 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 14-July-2012
Python-Version: 3.4

Abstract
========

CPython currently defines an ``__length_hint__`` method on several types, such
as various iterators. This method is then used by various other functions (such
as ``map``) to presize lists based on the estimate returned by
``__length_hint__``. Types can then define ``__length_hint__`` which are not
sized, and thus should not define ``__len__``, but can estimate or compute a
size (such as many iterators).

Proposal
========

This PEP proposes formally documenting ``__length_hint__`` for other
interpreters and non-standard-library Python modules to implement.

``__length_hint__`` must return an integer, and is not required to be accurate.
It may return a value that is either larger or smaller than the actual size of
the container. It may raise a ``TypeError`` if a specific instance cannot have
its length estimated. It may not return a negative value.
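
As a minimal sketch of the contract described above (the ``Countdown`` class
is hypothetical and not part of the PEP), an iterator that is not sized can
still report how many items it expects to yield:

```python
class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.remaining = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value

    def __length_hint__(self):
        # An estimate only: must be an integer >= 0. No __len__ is defined,
        # because an iterator is not a sized container.
        return self.remaining


it = Countdown(3)
hint_before = it.__length_hint__()   # 3 items still to come
next(it)
hint_after = it.__length_hint__()    # 2 items still to come
```

Here the hint happens to be exact, but nothing in the contract requires that.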

Rationale
=========

Being able to pre-allocate lists based on the expected size, as estimated by 
``__length_hint__``, can be a significant optimization. CPython has been
observed to run some code faster than PyPy, purely because of this optimization
being present.
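
To illustrate the kind of pre-allocation meant here, the following pure-Python
sketch uses a hint to size a buffer before filling it. The helper name
``build_list`` is hypothetical, and real interpreters do this inside their
list implementation rather than in Python code; the sketch only shows the
shape of the optimization, including what happens when the hint is too small
or too large.

```python
def build_list(iterable, hint):
    """Collect iterable into a list, pre-allocating `hint` slots."""
    buf = [None] * hint          # pre-allocate based on the estimate
    count = 0
    for item in iterable:
        if count < len(buf):
            buf[count] = item    # fill a pre-allocated slot
        else:
            buf.append(item)     # hint was too small: grow as usual
        count += 1
    del buf[count:]              # hint was too large: trim unused slots
    return buf
```

An inaccurate hint costs some extra work but never changes the result.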

Open questions
==============

There are two open questions for this PEP:

* Should ``list`` expose a kwarg in its constructor for supplying a length
  hint?
* Should a function be added either to ``builtins`` or some other module which
  calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``?

Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8




Alex

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Benjamin Peterson
2012/7/14 Alex Gaynor :
>
> Proposal
> 
>
> This PEP proposes formally documenting ``__length_hint__`` for other
> interpreter and non-standard library Python to implement.
>
> ``__length_hint__`` must return an integer, and is not required to be 
> accurate.
> It may return a value that is either larger or smaller than the actual size of
> the container. It may raise a ``TypeError`` if a specific instance cannot have
> its length estimated. It may not return a negative value.

And what happens if you return a negative value?

>
> Rationale
> =
>
> Being able to pre-allocate lists based on the expected size, as estimated by
> ``__length_hint__``, can be a significant optimization. CPython has been
> observed to run some code faster than PyPy, purely because of this 
> optimization
> being present.
>
> Open questions
> ==
>
> There are two open questions for this PEP:
>
> * Should ``list`` expose a kwarg in it's constructor for supplying a length
>   hint.
> * Should a function be added either to ``builtins`` or some other module which
>   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.

Let's try to keep this as limited as possible for a public API.


-- 
Regards,
Benjamin


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alex Gaynor
On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson wrote:

> 2012/7/14 Alex Gaynor :
> >
> > Proposal
> > 
> >
> > This PEP proposes formally documenting ``__length_hint__`` for other
> > interpreter and non-standard library Python to implement.
> >
> > ``__length_hint__`` must return an integer, and is not required to be
> > accurate. It may return a value that is either larger or smaller than
> > the actual size of the container. It may raise a ``TypeError`` if a
> > specific instance cannot have its length estimated. It may not return
> > a negative value.
>
> And what happens if you return a negative value?
>
>
ValueError, the same as with len.
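
For comparison, this is how ``len`` already treats a negative result from
``__len__`` in CPython 3 (the class name ``BadLen`` is made up for the
example):

```python
class BadLen:
    def __len__(self):
        return -1   # invalid: lengths must be >= 0


try:
    len(BadLen())
except ValueError as exc:
    error = exc     # len() rejects the negative value with ValueError
else:
    error = None
```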


> >
> > Rationale
> > =
> >
> > Being able to pre-allocate lists based on the expected size, as
> > estimated by ``__length_hint__``, can be a significant optimization.
> > CPython has been observed to run some code faster than PyPy, purely
> > because of this optimization being present.
> >
> > Open questions
> > ==
> >
> > There are two open questions for this PEP:
> >
> > * Should ``list`` expose a kwarg in its constructor for supplying a
> >   length hint.
> > * Should a function be added either to ``builtins`` or some other module
> >   which calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>
> Let's try to keep this as limited as possible for a public API.
>
>
Sounds reasonable to me!  Should we just go ahead and strip those out now?


>
> --
> Regards,
> Benjamin
>

Alex

-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alexandre Zani
On Sat, Jul 14, 2012 at 4:21 PM, Alex Gaynor  wrote:
>
>
> On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson 
> wrote:
>>
>> 2012/7/14 Alex Gaynor :
>> >
>> > Proposal
>> > 
>> >
>> > This PEP proposes formally documenting ``__length_hint__`` for other
>> > interpreter and non-standard library Python to implement.
>> >
>> > ``__length_hint__`` must return an integer, and is not required to be
>> > accurate.
>> > It may return a value that is either larger or smaller than the actual
>> > size of
>> > the container. It may raise a ``TypeError`` if a specific instance
>> > cannot have
>> > its length estimated. It may not return a negative value.
>>
>> And what happens if you return a negative value?
>>
>
> ValueError, the same as with len.
>
>>
>> >
>> > Rationale
>> > =
>> >
>> > Being able to pre-allocate lists based on the expected size, as
>> > estimated by
>> > ``__length_hint__``, can be a significant optimization. CPython has been
>> > observed to run some code faster than PyPy, purely because of this
>> > optimization
>> > being present.
>> >
>> > Open questions
>> > ==
>> >
>> > There are two open questions for this PEP:
>> >
>> > * Should ``list`` expose a kwarg in it's constructor for supplying a
>> > length
>> >   hint.
>> > * Should a function be added either to ``builtins`` or some other module
>> > which
>> >   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>>
>> Let's try to keep this as limited as possible for a public API.
>>
>
> Sounds reasonable to me!  Should we just go ahead and strip those out now?

I'm +1 on not having a public API for this. Ultimately the contract
for a length hint will depend heavily upon what you need it for. Some
applications would require a length hint to be an "at least" others an
"at most" and others something else entirely. Given that the contract
here appears to be >=0, I don't think the length hint is particularly
useful to the public at large.

>
>>
>>
>> --
>> Regards,
>> Benjamin
>
>
> Alex
>
> --
> "I disapprove of what you say, but I will defend to the death your right to
> say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
> "The people's good is the highest law." -- Cicero
>
>


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Benjamin Peterson
2012/7/14 Alex Gaynor :
>
>
> On Sat, Jul 14, 2012 at 4:18 PM, Benjamin Peterson 
> wrote:
>>
>> 2012/7/14 Alex Gaynor :
>> >
>> > Proposal
>> > 
>> >
>> > This PEP proposes formally documenting ``__length_hint__`` for other
>> > interpreter and non-standard library Python to implement.
>> >
>> > ``__length_hint__`` must return an integer, and is not required to be
>> > accurate.
>> > It may return a value that is either larger or smaller than the actual
>> > size of
>> > the container. It may raise a ``TypeError`` if a specific instance
>> > cannot have
>> > its length estimated. It may not return a negative value.
>>
>> And what happens if you return a negative value?
>>
>
> ValueError, the same as with len.

CPython will probably have to be updated to not ignore it if you return "melons".
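
For reference, in Python 3.4 and later, where this proposal landed as
``operator.length_hint``, a non-integer hint is rejected rather than ignored.
A small sketch (the ``Melons`` class is hypothetical):

```python
import operator


class Melons:
    def __iter__(self):
        return iter(())

    def __length_hint__(self):
        return "melons"   # not an integer


try:
    operator.length_hint(Melons())
except TypeError as exc:
    error = exc           # non-integer hints raise TypeError
else:
    error = None
```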

>
>>
>> >
>> > Rationale
>> > =
>> >
>> > Being able to pre-allocate lists based on the expected size, as
>> > estimated by
>> > ``__length_hint__``, can be a significant optimization. CPython has been
>> > observed to run some code faster than PyPy, purely because of this
>> > optimization
>> > being present.
>> >
>> > Open questions
>> > ==
>> >
>> > There are two open questions for this PEP:
>> >
>> > * Should ``list`` expose a kwarg in it's constructor for supplying a
>> > length
>> >   hint.
>> > * Should a function be added either to ``builtins`` or some other module
>> > which
>> >   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>>
>> Let's try to keep this as limited as possible for a public API.
>>
>
> Sounds reasonable to me!  Should we just go ahead and strip those out now?

Certainly.


-- 
Regards,
Benjamin


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Terry Reedy

On 7/14/2012 6:11 PM, Alex Gaynor wrote:
...

Various thoughts:

"This method is then used by various other functions (such as ``map``)
to presize lists"
 -- map no longer produces lists. This only makes sense in 3.x if you
mean that map can pass along the value of its inputs.


"Types can then define ``__length_hint__`` which are not
sized, and thus should not define ``__len__``,"
is awkwardly phrased. I think you mean
"Types that are not sized and should not define __len__ can then define
__length_hint__."


What do 'sized' and 'should' mean? Some iterators know exactly how many
items they have yet to yield. The main implication of having a __len__
versus a __length_hint__ method seems to be its bool() value when empty.


If lists were to get a new keyword arg, so should the other classes 
based on one internal array. I see this has been removed.


Generator functions are the nicest way to define iterators in Python. 
Generator instances returned from generator functions cannot be given a 
length hint. They are not directly helped. However ...


Not addressed in the PEP: do consumers of __length_hint__ look for it (and
__len__) before or after calling iter(input), or both? If before, then
the following should work.


class gwlh:  # generator with length hint
    def __init__(self, gen, len):
        self.gen = gen
        self.len = len

    def __iter__(self):
        return self.gen

    def __length_hint__(self):
        # Return the stored hint, not the builtin len function.
        return self.len

Do transformation iterators pass through hints from inputs? Does map(f,
iterable) look for len or hint on iterable? Ditto for some itertools,
like chain (add lengths). Any guidelines in the PEP?
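
One possible reading of the questions above, as a sketch only: a wrapper that
carries an explicit hint for a generator, and a chain-like iterable that adds
up the hints of its inputs. Both class names are hypothetical, and
``operator.length_hint`` is assumed to be available (Python 3.4 and later);
the PEP text quoted in this thread does not prescribe either behaviour.

```python
import operator


class HintedGen:
    """Wrap a generator together with a caller-supplied length hint."""

    def __init__(self, gen, hint):
        self.gen = gen
        self.hint = hint

    def __iter__(self):
        return self.gen

    def __length_hint__(self):
        return self.hint


class HintedChain:
    """Chain several iterables; the hint is the sum of the inputs' hints."""

    def __init__(self, *iterables):
        self.iterables = iterables

    def __iter__(self):
        for iterable in self.iterables:
            yield from iterable

    def __length_hint__(self):
        # Inputs without len() or a hint contribute the default of 0.
        return sum(operator.length_hint(it, 0) for it in self.iterables)


squares = HintedGen((n * n for n in range(4)), 4)
chained = HintedChain(squares, [10, 20])
total_hint = operator.length_hint(chained)   # 4 + 2
items = list(chained)
```

Note that the hint is read from the wrapper object itself, so it is only
visible to consumers that inspect the input before (or in addition to)
calling ``iter()`` on it.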


--
Terry Jan Reedy





Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Nick Coghlan
On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson  wrote:
>> Open questions
>> ==
>>
>> There are two open questions for this PEP:
>>
>> * Should ``list`` expose a kwarg in it's constructor for supplying a length
>>   hint.
>> * Should a function be added either to ``builtins`` or some other module 
>> which
>>   calls ``__length_hint__``, like ``builtins.len`` calls ``__len__``.
>
> Let's try to keep this as limited as possible for a public API.

Length hints are very useful for *any* container implementation,
whether those containers are in the standard library or not. Just as
we exposed operator.index when __index__ was added, we should expose
an "operator.length_hint" function with the following semantics:

    def length_hint(obj):
        """Return an estimate of the number of items in obj. This is
        useful for presizing containers when building from an iterable.

        If the object supports len(), the result will be exact.
        Otherwise, it may over or underestimate by an arbitrary amount. The
        result will be an integer >= 0.
        """
        try:
            return len(obj)
        except TypeError:
            try:
                get_hint = obj.__length_hint__
            except AttributeError:
                return 0
            hint = get_hint()
            if not isinstance(hint, int):
                raise TypeError("Length hint must be an integer, not %r" %
                                type(hint))
            if hint < 0:
                raise ValueError("Length hint (%r) must be >= 0" % hint)
            return hint

There's no reason to make pure Python container implementations
reimplement all that for themselves.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 0424: A method for exposing a length hint

2012-07-14 Thread Alex Gaynor
On Sat, Jul 14, 2012 at 10:16 PM, Nick Coghlan  wrote:

> On Sun, Jul 15, 2012 at 9:18 AM, Benjamin Peterson 
> wrote:
> >> Open questions
> >> ==
> >>
> >> There are two open questions for this PEP:
> >>
> >> * Should ``list`` expose a kwarg in its constructor for supplying a
> >>   length hint.
> >> * Should a function be added either to ``builtins`` or some other
> >>   module which calls ``__length_hint__``, like ``builtins.len`` calls
> >>   ``__len__``.
> >
> > Let's try to keep this as limited as possible for a public API.
>
> Length hints are very useful for *any* container implementation,
> whether those containers are in the standard library or not. Just as
> we exposed operator.index when __index__ was added, we should expose
> an "operator.length_hint" function with the following semantics:
>
>     def length_hint(obj):
>         """Return an estimate of the number of items in obj. This is
>         useful for presizing containers when building from an iterable.
>
>         If the object supports len(), the result will be exact.
>         Otherwise, it may over or underestimate by an arbitrary amount. The
>         result will be an integer >= 0.
>         """
>         try:
>             return len(obj)
>         except TypeError:
>             try:
>                 get_hint = obj.__length_hint__
>             except AttributeError:
>                 return 0
>             hint = get_hint()
>             if not isinstance(hint, int):
>                 raise TypeError("Length hint must be an integer, not %r" %
>                                 type(hint))
>             if hint < 0:
>                 raise ValueError("Length hint (%r) must be >= 0" % hint)
>             return hint
>
> There's no reason to make pure Python container implementations
> reimplement all that for themselves.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>

Sounds reasonable to me, the only issue with your pseudocode (err... I mean
Python ;)), is that there's no way for the __length_hint__ to specify that
that particular instance can't have a length hint computed.  e.g. imagine
some sort of lazy stream that cached itself, and only wanted to offer a
length hint if it had already been evaluated.  Without an exception to
raise, it has to return whatever the magic value for length_hint is (in
your impl it appears to be 0, the current _PyObject_LengthHint method in
CPython has a required `default` parameter).  The PEP proposes using
TypeError for that.
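
A sketch of how the helper might accommodate this, assuming a ``default``
parameter like the one on ``_PyObject_LengthHint`` and treating a
``TypeError`` from ``__length_hint__`` as "no estimate available". This is an
illustration of the idea under discussion, not the text of the PEP; the
``LazyStream`` class is hypothetical.

```python
def length_hint(obj, default=0):
    """Return len(obj) if available, else a hint, else `default`."""
    try:
        return len(obj)
    except TypeError:
        pass
    try:
        get_hint = type(obj).__length_hint__
    except AttributeError:
        return default
    try:
        hint = get_hint(obj)
    except TypeError:
        # This particular instance cannot estimate its length.
        return default
    if not isinstance(hint, int):
        raise TypeError("Length hint must be an integer, not %r" % type(hint))
    if hint < 0:
        raise ValueError("Length hint (%r) must be >= 0" % hint)
    return hint


class LazyStream:
    """Only offers a hint once its items have been evaluated and cached."""

    def __init__(self):
        self.cache = None

    def __length_hint__(self):
        if self.cache is None:
            raise TypeError("length not known yet")
        return len(self.cache)


stream = LazyStream()
before = length_hint(stream, 8)   # falls back to the default
stream.cache = [1, 2, 3]
after = length_hint(stream, 8)    # uses the cached size
```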

Anyways that code looks good, do you want to add it to the PEP?

Alex

-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero