On Thu, May 07, 2020 at 11:12:28PM +0900, Stephen J. Turnbull wrote:

[...]
> So of course zip can be said to be used like a class.  Its instances
> are constructed by calling it, the most common (only in practice?) 
> method call is .__iter__, it's invariably called implicitly in a for
> statement, and usually the instance is ephemeral (ie, discarded when
> the for statement is exited).  But it needn't be.  Like any iterator,
> if it's not exhausted in one iteration context, you can continue it
> later.  Or explicitly call next on it.

Agreed.
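
A quick illustration of the "continue it later" point, in case it helps anyone following along. This is only a sketch with made-up variable names; it relies on nothing beyond the builtin zip and next:

```python
pairs = zip("abc", [1, 2, 3])   # a zip instance; nothing is consumed yet

first = next(pairs)             # explicit next(): ('a', 1)

# A for loop that exits early leaves the iterator usable.
seen = []
for pair in pairs:
    seen.append(pair)
    break                       # seen == [('b', 2)]

rest = list(pairs)              # picks up where the loop stopped: [('c', 3)]
```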

In CPython 3.x, zip is a class. In 2.x it was a function returning a 
list. I don't know why it's a class now -- possibly something to do with 
the C implementation? -- but that's not part of the public API, and I 
would not expect that to be a language guarantee. The API is only that 
it returns an iterator; it doesn't have to be any specific class.

If zip were implemented in pure Python, it would probably be a 
generator, something like this:

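    def zip(a, b):
        a, b = iter(a), iter(b)  # accept any iterable, not just iterators
        try:
            while True:
                yield (next(a), next(b))
        except StopIteration:    # must not leak out of a generator (PEP 479)
            return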

only better :-)


[...]
>  > So while yes, alternate constructors are a common pattern, I don't
>  > think they are a common pattern for classes like zip.
> 
> That's a matter of programming style, I think.  There's no real
> difference between
> 
>     zip(a, b, length='checksame')
> 
> and 
> 
>     zip.checksame(a, b)
> 
> They just initialize an internal attribute differently, which may as
> well be an Enum or even a few named class constants.

I agree that it's a matter of programming style, but I disagree with 
your reason. The methods needn't return instances of the same class; 
they could just as well return different classes.

An old example of this from Python 2 was int/long unification:

    py> type(int(1e1))
    <type 'int'>

    py> type(int(1e100))
    <type 'long'>

A current example of this is the open() built-in, which returns a 
different class depending on the arguments given. For example:

    py> open('/tmp/a', 'wb')
    <_io.BufferedWriter name='/tmp/a'>

    py> open('/tmp/a', 'w')
    <_io.TextIOWrapper name='/tmp/a' mode='w' encoding='UTF-8'>


So we might have:

    zip() --> return a zip instance
    zip.checksame() --> return a zip_strict instance

for example. The specific type of iterator is an implementation detail.
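
Here is a rough sketch of what I mean, in pure Python. Everything in it is hypothetical: Zip, ZipStrict and checksame are illustrative names I made up, not a proposed implementation, and the error message is invented too.

```python
class ZipStrict:
    """Hypothetical iterator that raises if the inputs differ in length."""

    def __init__(self, *iterables):
        self._iterators = [iter(it) for it in iterables]

    def __iter__(self):
        return self

    def __next__(self):
        sentinel = object()
        items = [next(it, sentinel) for it in self._iterators]
        exhausted = [item is sentinel for item in items]
        if all(exhausted):      # every input ended together (or no inputs)
            raise StopIteration
        if any(exhausted):      # only some ended: the lengths differ
            raise ValueError("zip arguments have different lengths")
        return tuple(items)


class Zip:
    """Hypothetical stand-in for the builtin, with an alternate constructor."""

    def __init__(self, *iterables):
        self._inner = zip(*iterables)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._inner)

    @staticmethod
    def checksame(*iterables):
        # Returns a *different* class; callers only rely on getting an iterator.
        return ZipStrict(*iterables)
```

Callers of Zip.checksame() never need to know or care that what comes back is not a Zip instance, so long as it iterates.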


> I think (after the initial shock ;-) I like the latter *better*,
> because the semantics of checksame and longest are significantly (to
> me, anyway) different.  checksame is a constraint on correct behavior,
> and explicitly elicits an Exception on invalid input.  longest is a
> variant specification of behavior on a certain class of valid input.
> I'm happier with those being different *functions* rather than values
> of an argument.  YMMV, of course.

If we reach consensus that this functionality is worth having and worth 
being in the builtins, my preferences would go (best to worst):


(1) Namespaces are one honking great idea -- let's do more of those!
    zip is a namespace, let's use that fact:

    zip(*args)           # for backwards compatibility
    zip.strict(*args)

This gives us the best flexibility going into the future. We can add 
new versions of zip without crowding the builtins namespace: there is 
only one top-level name, and the docstring can point the reader at 
dir(zip) to see more. At some point, we might choose to move zip_longest 
into zip as well, leaving the itertools version for backwards 
compatibility.

Maybe some day Soni will even get his desired version of zip that 
exposes any partial results left over after an iterator is exhausted.

This idiom is particularly convenient since zip is a class and can 
easily be given additional methods, so they would show up in help(zip) 
without any extra work.
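
To make that concrete, here is a minimal sketch using a subclass of the builtin as a stand-in. The name zipns and the method bodies are my own assumptions, purely for illustration; a real version would presumably live in C on zip itself.

```python
import itertools


class zipns(zip):
    """Hypothetical subclass standing in for a builtin zip with extra methods."""

    @staticmethod
    def longest(*iterables, fillvalue=None):
        """Like zip, but pad shorter inputs with fillvalue."""
        return itertools.zip_longest(*iterables, fillvalue=fillvalue)

    @staticmethod
    def strict(*iterables):
        """Like zip, but raise ValueError if the inputs differ in length."""
        sentinel = object()  # private marker, cannot occur in the inputs
        for items in itertools.zip_longest(*iterables, fillvalue=sentinel):
            if any(item is sentinel for item in items):
                raise ValueError("zip arguments have different lengths")
            yield items
```

Calling zipns(a, b) behaves exactly as zip(a, b) does, while 'strict' and 'longest' appear in dir(zipns) and their docstrings in help(zipns).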


(2) Separate top-level functions:

    zip, zip_strict


(3) A mode parameter:

    zip(*args, mode='short')  # default
    zip(*args, mode='strict')


and then a long, long, long way down my list of preferred APIs:

(4) A bool flag:

    zip(*args, strict=False)  # default
    zip(*args, strict=True)

which is the least flexible, since it locks us in to only two such zip 
versions without going into API contortions.
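
For comparison, a rough sketch of option (3) shows how a mode parameter leaves room for a third behaviour where a bool cannot. The names zip_mode and _zip_strict, and the 'longest' mode, are assumptions of mine for illustration only:

```python
import itertools


def zip_mode(*iterables, mode="short", fillvalue=None):
    """Hypothetical zip variant that selects its behaviour by name."""
    if mode == "short":
        return zip(*iterables)
    if mode == "longest":
        return itertools.zip_longest(*iterables, fillvalue=fillvalue)
    if mode == "strict":
        return _zip_strict(*iterables)
    raise ValueError(f"unknown mode: {mode!r}")


def _zip_strict(*iterables):
    """Yield tuples like zip, raising ValueError on a length mismatch."""
    sentinel = object()  # private marker, cannot occur in the inputs
    for items in itertools.zip_longest(*iterables, fillvalue=sentinel):
        if any(item is sentinel for item in items):
            raise ValueError("zip arguments have different lengths")
        yield items
```

Adding a fourth behaviour here is one more branch; with strict=True/False it would need a second flag or a change of type.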



-- 
Steven