[issue10109] itertools.product with infinite iterator cause MemoryError.

2012-01-18 Thread Sumudu Fernando

Sumudu Fernando added the comment:

>>> tuple(itertools.cycle(enumerate(it)) for it in itertools.count())
  ...
  TypeError: 'int' object is not iterable

That is not what happens in the function, though!  That would correspond to 
doing product(*itertools.count(2010)), but if you try that you won't even get 
past argument expansion (obviously).  Doing product(*xrange(10)) gives the 
error you're talking about, for example.
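To make the distinction concrete, here is a small sketch of the two call forms. It uses Python 3's range in place of xrange; the names message and singles are only for illustration.

```python
import itertools

# product(*iterable) unpacks the iterable into separate arguments, so every
# element must itself be iterable. Integers are not, hence the TypeError.
try:
    itertools.product(*range(10))
except TypeError as exc:
    message = str(exc)

# product(iterable) passes a single argument and yields 1-tuples of its elements.
singles = list(itertools.product(range(3)))
print(message)
print(singles)
```

The first call fails while product is still being constructed, which is the error quoted above; the second is the single-argument form under discussion.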

product(itertools.count(2010)) works perfectly well with the version I posted, 
though it is a bit silly to do it that way, since it produces the same values 
as count itself (which is what a "cartesian product" of one iterable should 
do) while storing extra bookkeeping along the way.

Anyway, I'm pretty new to Python and I don't think this is quite relevant 
enough to warrant opening a new ticket.  I'm happy to leave it here for the 
education of the next neophyte who stumbles across this idiosyncrasy of 
itertools.product.

--

___
Python tracker 
<http://bugs.python.org/issue10109>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10109] itertools.product with infinite iterator cause MemoryError.

2012-01-18 Thread Sumudu Fernando

Sumudu Fernando added the comment:

I don't agree with the response to this.

It is true that as implemented (at least in 2.7, I don't have 3.x handy to 
check) itertools.product requires finite iterables.  However this seems to be 
simply a consequence of the implementation and not part of the "spirit" of the 
function, which as falsetru pointed out is stated to be "equivalent to nested 
for-loops in a generator expression".

Indeed, implementing product in Python (in a recursive way) doesn't have this 
problem.
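A minimal sketch of what such a recursive version could look like follows. This is not the attached product.py; the name lazy_product is hypothetical, the code is written for Python 3, and it assumes every argument after the first can be iterated more than once (a list, string or range, not a one-shot iterator).

```python
import itertools

def lazy_product(*iterables):
    """Nested-for-loop product as a recursive generator (illustrative sketch)."""
    if not iterables:
        yield ()          # the empty product has exactly one element
        return
    first, rest = iterables[0], iterables[1:]
    for x in first:
        # Assumption: the iterables in `rest` can be re-iterated for each x.
        for tail in lazy_product(*rest):
            yield (x,) + tail

# The outermost iterable may be infinite because nothing is consumed up front.
head = list(itertools.islice(lazy_product(itertools.count(), "ab"), 4))
print(head)
```

Because values are only pulled as the loops advance, building the generator costs nothing, whatever the size of the inputs.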

Perhaps a more convincing set of testcases to show why this could be considered 
a problem:

>>> import itertools
>>> itertools.product(xrange(100))
<itertools.product object at 0x...>
>>> itertools.product(xrange(100))
<itertools.product object at 0x...>
>>> itertools.product(xrange(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

Note that I'm not even using an infinite iterable, just a really big one.  The 
issue is that creating the iterator fails with a MemoryError, before I've even 
asked for any values.  Consider the following:

for (i, v) in enumerate(itertools.product(a, b, c)):
    if i < 1000:
        print v
    else:
        break

When a, b, and c are relatively small, finite iterables, this code works fine.  
However, if *any* of them are too large (or infinite), we see a MemoryError 
before the loop even starts, even though only 1000 elements are required.  I 
think it's conceivable that we might want something like "a = 
itertools.cycle(xrange(5))", and even that will break this loop.
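The reason the failure comes before the loop starts is that itertools.product reads each argument to the end when the product object is created. A small sketch that makes this visible (Python 3; the helper noisy and the list consumed are hypothetical names used only for this demonstration):

```python
import itertools

consumed = []

def noisy(n):
    """Yield 0..n-1, recording each value as it is pulled."""
    for i in range(n):
        consumed.append(i)
        yield i

# Only construct the product object; next() has not been called on it yet.
p = itertools.product(noisy(5), "ab")
already_consumed = list(consumed)   # snapshot taken before any iteration
print(already_consumed)
```

All five values have already been pulled from noisy before a single tuple is requested, so an input that never ends (or is merely huge) exhausts memory at this point.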

That said, in all such cases I could think of, we can always either truncate 
big iterators before passing them to product, or use zip/comprehensions to add 
their values into the tuple (or some combination of those).  So maybe it isn't 
a huge deal.
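For illustration, the two workarounds might look like this. It is a sketch in Python 3, where generator expressions are lazy; the variable names are made up for the example.

```python
import itertools

# Workaround 1: truncate the big/infinite iterator before handing it to product.
truncated = list(itertools.product(itertools.islice(itertools.count(), 2), "ab"))

# Workaround 2: keep the infinite iterator in the outer loop of a generator
# expression, so its values are added to each tuple lazily.
nested = ((n, c) for n in itertools.count() for c in "ab")
first_three = list(itertools.islice(nested, 3))

print(truncated)
print(first_three)
```

The first form needs a known upper bound on how many values are wanted from the large input; the second does not, but only works when the large input can sit in the outermost loop.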

I've attached my implementation of product which deals with infinite iterators 
by leveraging enumerate and itertools.cycle, and is pretty much a direct 
translation of the "odometer" idea.  This doesn't support the "repeat" 
parameter (but probably could, using itertools.tee).  One thing that should be 
changed: itertools.cycle doesn't need to be called on infinite iterators, but 
I couldn't figure out how to do that.  (Maybe there is some way to handle it 
in the C implementation?)

In summary: the attached implementation of product can accept any mix of 
infinite / finite iterators, returning a generator intended for partial 
consumption.  The existing itertools.product doesn't work in this case.
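For readers without access to the attachment, here is a rough sketch of how an odometer built on enumerate and itertools.cycle could work. It is not the attached product.py: the name odometer_product and all details are assumptions made for illustration, written for Python 3.

```python
import itertools

def odometer_product(*iterables):
    """Lazy cartesian product using one cycling 'wheel' per input (sketch)."""
    # Each wheel endlessly repeats (index, value) pairs; index 0 marks a wrap.
    wheels = [itertools.cycle(enumerate(it)) for it in iterables]
    try:
        current = [next(w) for w in wheels]
    except StopIteration:
        return  # some input was empty, so the product is empty
    while True:
        yield tuple(value for _, value in current)
        # Advance the rightmost wheel; carry to the left whenever one wraps.
        for pos in reversed(range(len(wheels))):
            current[pos] = next(wheels[pos])
            if current[pos][0] != 0:
                break  # this wheel did not wrap, no carry needed
        else:
            return  # every wheel wrapped: all combinations were produced

# Partial consumption with an infinite rightmost input.
sample = list(itertools.islice(odometer_product("ab", itertools.count()), 3))
print(sample)
```

Note that itertools.cycle keeps a copy of every element it has seen, so an infinite input still accumulates memory as it is consumed; it just does so gradually instead of up front. That is the unnecessary cycle call on infinite iterators mentioned above.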

--
nosy: +Sumudu.Fernando
Added file: http://bugs.python.org/file24270/product.py
