Luis Zarrabeitia wrote:
Hi there.

For most use cases I can think of, the iterator protocol is more than enough. However, in a few cases, I've needed some ugly hacks.

Ex 1:

a = iter([1,2,3,4,5]) # assume you got the iterator from a function and
b = iter([1,2,3])     # these two are just examples.

then,

zip(a,b)

has a different side effect from

zip(b,a)

After the execution, in the first case, iterator a contains just [5]; in the second, it contains [4,5]. I think the second one is correct (the 5 was never used, after all). I tried to implement my 'own' zip, but there is no way to know the length of the iterator (obviously), and there is also no way to 'rewind' a value after calling 'next'.

Interesting observation. Iterators are intended for 'iterate through once and discard' usages. To zip a long sequence with several short sequences, either use itertools.chain(short sequences) or put the short sequences as the first zip arg.
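
The asymmetry is easy to reproduce. The following is a minimal sketch, assuming Python 3, where zip is lazy and has to be driven with list() before the side effect shows up. The helper name leftover_after_zip is made up for this illustration.

```python
def leftover_after_zip(first, second):
    """Zip two sequences through fresh iterators; return what is left in each."""
    a, b = iter(first), iter(second)
    list(zip(a, b))   # zip is lazy in Python 3; list() runs it to completion
    return list(a), list(b)

long_seq, short_seq = [1, 2, 3, 4, 5], [1, 2, 3]

# Long iterator first: zip pulls 4 from it before noticing the short one is done.
print(leftover_after_zip(long_seq, short_seq))   # ([5], [])

# Short iterator first: zip stops as soon as the short one is exhausted.
print(leftover_after_zip(short_seq, long_seq))   # ([], [4, 5])
```

The lost item is the one zip fetched from the first argument just before discovering that a later argument had run out.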

Ex 2:

Will this iterator yield any value? Like with most iterables, a construct

if iterator:
   # do something

would be a very convenient thing to have, instead of wrapping a 'next' call in a try...except and consuming the first item.

To test without consuming, wrap the iterator in a trivial-to-write one_ahead or peek class such as has been posted before.
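
If a full wrapper class is more than needed, one possible shortcut is to pull the first item and chain it back in front of the rest. This is only a sketch under that assumption: the names probe and _MISSING are hypothetical, and the caller has to switch to the returned iterator, because the original one has already given up its first item.

```python
from itertools import chain

_MISSING = object()   # sentinel that no real iterator can yield

def probe(iterator):
    """Return (has_items, iterator_to_use) without losing the first item."""
    first = next(iterator, _MISSING)
    if first is _MISSING:
        return False, iter(())
    # Put the fetched item back in front of the remaining ones.
    return True, chain([first], iterator)

has_items, it = probe(iter([10, 20, 30]))
if has_items:
    print(list(it))   # [10, 20, 30]
```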

Ex 3:

if any(iterator):
   # do something ... but the first true value was already consumed and
   # cannot be reused. "Any" cannot peek inside the iterator without
   # consuming the value.

If you are going to do something with the true value, use a for loop and break. If you just want to peek inside, use a sequence (list(iterator)).
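
A sketch of the for-loop-and-break version, assuming the goal is to keep hold of the first true value. The function name find_first_true is invented for the example.

```python
def find_first_true(iterator):
    """Return (found, value) for the first true item, consuming up to it."""
    for value in iterator:
        if value:
            break          # value stays bound to the first true item
    else:
        return False, None  # loop ended without a break: nothing was true
    return True, value

it = iter([0, '', 7, 8, 9])
found, value = find_first_true(it)
remaining = list(it)
print(found, value, remaining)   # True 7 [8, 9]
```

Unlike any(), the loop leaves the true value in hand, and the iterator still holds everything after it.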

Instead,

i1, i2 = tee(iterator)
if any(i1):
   # do something with i2

This effectively makes two partial lists and tosses one. That may or may not be a better idea.
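
For reference, a runnable form of the tee approach with made-up sample data. any() stops at the first true value, and tee keeps every item that i1 consumed buffered until i2 has read it as well.

```python
from itertools import tee

i1, i2 = tee(iter([0, 0, 3, 4]))
result = None
if any(i1):             # consumes 0, 0, 3 from i1; tee buffers them for i2
    result = list(i2)   # i2 still sees every item from the start
print(result)           # [0, 0, 3, 4]
```

After calling tee, the original iterator should not be advanced directly, or the two copies will miss items.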

Question/Proposal:

Has there been any PEP regarding the problem of 'peeking' inside an iterator?

Iterators are not sequences and, in general, cannot be made to act like them. The iterator protocol is a bare-minimum, least-common-denominator requirement for inter-operability. You can, of course, add methods to iterators that you write for the cases where one-ahead or random access *is* possible.

Knowing if the iteration will end or not, and/or accessing the next value, without consuming it? Is there any (simple, elegant) way around it?

That much is trivial. As suggested above, write a wrapper with the exact behavior you want. A sample (untested)

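class one_ahead:
  "Wrap an iterator; self.peek is the next item, or undefined when exhausted"
  def __init__(self, iterator):
    self._it = iterator
    try:
      self.peek = next(iterator)
    except StopIteration:
      pass
  def __iter__(self): # lets the wrapper be used in for loops
    return self
  def __bool__(self): # true while at least one item remains
    return hasattr(self, 'peek')
  def __next__(self): # 3.0, 2.6?
    try:
      item = self.peek # not named 'next', which would shadow the builtin
    except AttributeError:
      raise StopIteration
    try:
      self.peek = next(self._it)
    except StopIteration:
      del self.peek
    return item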

Terry Jan Reedy

