[issue40230] Itertools.product() Out of Memory Errors

Dennis Sweeney Wed, 08 Apr 2020 18:49:51 -0700


Dennis Sweeney <[email protected]> added the comment:


The trouble is that itertools.product accepts iterators, and there is no 
guaranteed way of "restarting" an arbitrary iterator in Python. Consider:

    >>> a = iter([1,2,3])
    >>> b = iter([4,5,6])
    >>> next(a)
    1
    >>> next(b)
    4
    >>> from itertools import product
    >>> list(product(a, b))
    [(2, 5), (2, 6), (3, 5), (3, 6)]

Since there's no way to get back to items you've already consumed, the current 
approach is to consume all of the iterators to begin with and store their items 
in arrays, then lazily produce tuples of the items at the right indices of 
those arrays.
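
As a rough illustration of the strategy just described, here is a minimal pure-Python sketch. It is not the actual CPython implementation, and the name pooled_product is made up for this example. One difference to keep in mind: because this is a generator, the inputs are consumed on the first next() call, not when the function is called.

```python
def pooled_product(*iterables):
    # Consume every input up front and keep its items in a tuple ("pool").
    pools = [tuple(it) for it in iterables]
    # If any pool is empty, the product is empty.
    if not all(pools):
        return
    # One index per pool, advanced like the digits of an odometer.
    indices = [0] * len(pools)
    while True:
        yield tuple(pool[i] for pool, i in zip(pools, indices))
        # Bump the rightmost index; on overflow reset it and carry left.
        for pos in reversed(range(len(pools))):
            indices[pos] += 1
            if indices[pos] < len(pools[pos]):
                break
            indices[pos] = 0
        else:
            # Every index overflowed, so all combinations were produced.
            return
```

Memory use is proportional to the total number of input items, which is why a very large or infinite input fails before the first tuple is produced.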

Perhaps one could consume lazily from the iterators, say, only filling up the 
pools as they're needed and not storing the contents of the first iterator, but 
this would mean sometimes producing a product iterator that was doomed to cause 
a memory error eventually. If you really need this behavior, you could write it 
yourself in Python:

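    def lazy_product(*iterables):
        # An empty product contains exactly one item: the empty tuple.
        if not iterables:
            yield ()
            return
        it0 = iterables[0]
        # it0 is consumed one item at a time and never stored.
        for x in it0:
            print(f"{x=}")  # trace output; f"{x=}" needs Python 3.8+
            # The remaining iterables are iterated again for every x, so
            # they must be re-iterable (e.g. lists), not one-shot iterators.
            for rest in lazy_product(*iterables[1:]):
                print(f"{rest=}")  # trace output
                yield (x,) + rest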
The above could surely be optimized, as you may be suggesting, but making 
itertools.product behave this way would be a backward-incompatible change, 
since it currently consumes all of its inputs up front.
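
One possible variant, sketched here as an assumption and not as a proposal for itertools, stores only the second and later inputs and reads the first one lazily. That lets one-shot iterators be passed in any position and lets the first input be arbitrarily long. The name lazy_first_product is hypothetical.

```python
from itertools import count, islice, product

def lazy_first_product(*iterables):
    # An empty product contains exactly one item: the empty tuple.
    if not iterables:
        yield ()
        return
    first, *rest = iterables
    # Store every input except the first so it can be replayed.
    pools = [tuple(it) for it in rest]
    # An empty pool means an empty product; stop before touching `first`,
    # which might be infinite.
    if not all(pools):
        return
    # The first input is read one item at a time and never stored.
    for x in first:
        for tail in product(*pools):
            yield (x,) + tail

# The first input may be infinite as long as only a prefix is requested.
print(list(islice(lazy_first_product(count(), iter("ab")), 3)))
```

The later inputs still have to fit in memory, so this only helps when the first input is the large one.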

----------
nosy: +Dennis Sweeney

_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue40230>
_______________________________________