Help me understand this iterator

2006-10-31 Thread LaundroMat
Hi,

I've found this script over at effbot
(http://effbot.org/librarybook/os-path.htm), and I can't get my head
around its inner workings. Here's the script:

import os

class DirectoryWalker:
    # a forward iterator that traverses a directory tree

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                return fullname

for file in DirectoryWalker("."):
    print file

Now, if I look at this script step by step, I don't understand:
- what is being iterated over (what is being called by file in
DirectoryWalker()?);
- where it gets the index value from;
- where the while 1:-loop is quitted.

Thanks in advance,

Mathieu

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Help me understand this iterator

2006-10-31 Thread Fredrik Lundh
LaundroMat wrote:

> Now, if I look at this script step by step, I don't understand:
> - what is being iterated over (what is being called by file in
> DirectoryWalker()?);

as explained in the text above the script, this class emulates a
sequence.  it does this by implementing the __getitem__ method:

    http://effbot.org/pyref/__getitem__

> - where it gets the index value from;

from the call to __getitem__ done by the for-in loop.

> - where the while 1:-loop is quitted.

the loop stops when the stack is empty, and pop raises an IndexError 
exception.

note that this is an old example; code written for newer versions of 
Python would probably use a recursing generator instead (see the source 
code for os.walk in the standard library for an example).
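
For comparison, here is a minimal sketch of what such a recursing generator could look like. It is written for Python 3 (unlike the Python 2 script in this thread), and the name walk_tree and the throwaway directory tree are assumptions made up for illustration; this is not the actual os.walk source.

```python
import os
import shutil
import tempfile

def walk_tree(directory):
    """Yield the full name of every entry below directory (hypothetical helper)."""
    for name in sorted(os.listdir(directory)):
        fullname = os.path.join(directory, name)
        yield fullname
        if os.path.isdir(fullname) and not os.path.islink(fullname):
            # recurse: pass on everything the nested generator yields
            for nested in walk_tree(fullname):
                yield nested

# Build a tiny throwaway tree so the example does not depend on
# whatever happens to be in the current directory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
open(os.path.join(root, "a.txt"), "w").close()
open(os.path.join(root, "sub", "b.txt"), "w").close()

found = [os.path.relpath(path, root) for path in walk_tree(root)]
print(found)

shutil.rmtree(root)  # clean up the temporary tree
```

On a POSIX system this prints ['a.txt', 'sub', 'sub/b.txt']. The generator keeps its position in the suspended function frames, so there is no explicit stack, file list, or index to manage.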

/F



Re: Help me understand this iterator

2006-10-31 Thread Peter Otten
LaundroMat wrote:

> Now, if I look at this script step by step, I don't understand:
> - what is being iterated over (what is being called by file in
> DirectoryWalker()?);
> - where it gets the index value from;
> - where the while 1:-loop is quitted.

With

    dw = DirectoryWalker(".")

the for loop is equivalent to

    index = 0 # internal variable, not visible from Python
    while True:
        try:
            file = dw[index] # invokes dw.__getitem__(index)
        except IndexError:
            break
        print file
        index += 1

This is an old way of iterating over a sequence which is only used when the
iterator-based approach

    dwi = iter(dw) # invokes dw.__iter__()
    while True:
        try:
            file = dwi.next()
        except StopIteration:
            break
        print file

fails.
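
The index-based fallback can be watched in action with a small class that records every index it receives. This is only a sketch in Python 3 syntax; the class name Countdown and its attribute seen are invented for the illustration.

```python
class Countdown:
    """Old-style sequence: defines __getitem__ but no __iter__."""

    def __init__(self, start):
        self.start = start
        self.seen = []  # every index the for loop passed in

    def __getitem__(self, index):
        self.seen.append(index)
        if index >= self.start:
            raise IndexError(index)  # tells the for loop to stop
        return self.start - index

c = Countdown(3)
values = []
for v in c:  # no __iter__, so Python calls c[0], c[1], c[2], ...
    values.append(v)

print(values)   # [3, 2, 1]
print(c.seen)   # [0, 1, 2, 3]
```

The last recorded index, 3, is the call that raised IndexError and ended the loop.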

Peter


Re: Help me understand this iterator

2006-10-31 Thread Steven D'Aprano
On Tue, 31 Oct 2006 03:36:08 -0800, LaundroMat wrote:

> Hi,
>
> I've found this script over at effbot
> (http://effbot.org/librarybook/os-path.htm), and I can't get my head
> around its inner workings.

[snip code]

> Now, if I look at this script step by step, I don't understand:
> - what is being iterated over (what is being called by file in
> DirectoryWalker()?);

What is being iterated over is the list of files in the current directory.
In Unix land (and probably DOS/Windows as well) the directory "." means
"this directory, right here".


> - where it gets the index value from;

When Python sees a line like "for x in obj:" it does some special
magic. First it looks to see if obj has an __iter__ method; if it does,
Python calls it and then calls next() on the result repeatedly. That's
not the case here -- DirectoryWalker is an old-style iterator, not one of
the fancy new ones.

Instead, Python tries calling obj[index] starting at 0 and keeps going
until an IndexError exception is raised, then it halts the for loop.

So, think of it like this: pretend that Python expands the following code:

    for x in obj:
        block

into something like this:

    index = 0
    while True: # loop forever
        try:
            x = obj[index]
            block # can use x in block
        except IndexError:
            # catch the exception and escape the while loop
            break
        index = index + 1
    # and now we're done, continue the rest of the program

That's not exactly what Python does, of course, it is much more efficient,
but that's a good picture of what happens.


> - where the while 1:-loop is quitted.


The while 1 loop is escaped when the function hits the return statement.



-- 
Steven.



Re: Help me understand this iterator

2006-10-31 Thread Peter Otten
LaundroMat wrote:

[me hitting send too soon]

> Now, if I look at this script step by step, I don't understand:

> - where the while 1:-loop is quitted.

> class DirectoryWalker:
>     # a forward iterator that traverses a directory tree
>
>     def __init__(self, directory):
>         self.stack = [directory]
>         self.files = []
>         self.index = 0
>
>     def __getitem__(self, index):
>         while 1:
>             try:
>                 file = self.files[self.index]
>                 self.index = self.index + 1
>             except IndexError:
>                 # pop next directory from stack
>                 self.directory = self.stack.pop()

If self.stack is empty, pop() will raise an IndexError which terminates both
the 'while 1' loop in __getitem__() and the enclosing 'for file in ...'
loop.

>                 self.files = os.listdir(self.directory)
>                 self.index = 0
>             else:
>                 # got a filename
>                 fullname = os.path.join(self.directory, file)
>                 if os.path.isdir(fullname) and not
>                 os.path.islink(fullname):
>                     self.stack.append(fullname)
>                 return fullname

The return statement feeds the next file to the for loop.
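
Both exits can be checked on a throwaway directory tree. The sketch below repeats the class in Python 3 syntax (the thread's script is Python 2); the temporary tree and the variable names walker and names are assumptions for the demonstration.

```python
import os
import shutil
import tempfile

class DirectoryWalker:
    # same logic as the quoted class, Python 3 spelling
    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while True:
            try:
                file = self.files[self.index]
                self.index += 1
            except IndexError:
                # pop() raises IndexError itself once the stack is empty
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                return fullname  # leaves the while loop for this call

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
open(os.path.join(root, "sub", "b.txt"), "w").close()

walker = DirectoryWalker(root)
names = [os.path.relpath(path, root) for path in walker]
print(names)

# The stack is empty now, so one more call lets pop()'s IndexError escape.
try:
    walker[0]
except IndexError as exc:
    print("IndexError:", exc)

shutil.rmtree(root)  # clean up the temporary tree
```

Note that the index argument is never used by the method; the walker tracks its position in self.index and self.stack instead.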

Peter



Re: Help me understand this iterator

2006-10-31 Thread LaundroMat
Ack, I get it now. It's not the variable's name (index) that is
hard-coded, it's just that the for...in... loop sends an argument by
default. That's a lot more comforting.



Re: Help me understand this iterator

2006-10-31 Thread LaundroMat
Thanks all, those were some great explanations. It seems I still have
a long way to go before I grasp the intricacies of this
language.

That 'magic index' variable bugs me a little however. It gives me the
same feeling as when I see hard-coded variables. I suppose the
generator class has taken care of this with its next() method (although
- I should have a look - __next__() probably takes self and index as
its arguments). Although I'm very fond of the language (as a
non-formally trained hobbyist developer), that magic bit is a tad
disturbing.

Still, thanks for the quick and complete replies!
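
On the guess about the arguments: in the iterator protocol the method takes only self, and the object remembers its own position. A minimal sketch in Python 3 syntax, where the method is spelled __next__() (it is next() in the Python 2 of this thread); the class name CountdownIterator is made up for the example.

```python
class CountdownIterator:
    """Iterator that keeps its position in its own state."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):  # no index argument, only self
        if self.current <= 0:
            raise StopIteration  # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value

for number in CountdownIterator(3):
    print(number)  # prints 3, then 2, then 1
```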



Re: Help me understand this iterator

2006-10-31 Thread Fredrik Lundh
LaundroMat wrote:

> That 'magic index' variable bugs me a little however. It gives me the
> same feeling as when I see hard-coded variables.

what magic index?  the variable named index is an argument to the 
method it's used in.

/F



Re: Help me understand this iterator

2006-10-31 Thread LaundroMat
On Oct 31, 3:53 pm, Fredrik Lundh [EMAIL PROTECTED] wrote:
> LaundroMat wrote:
> > That 'magic index' variable bugs me a little however. It gives me the
> > same feeling as when I see hard-coded variables.
>
> what magic index?  the variable named index is an argument to the
> method it's used in.

Yes, I reacted too quickly. Sorry.
