Help me understand this iterator
Hi,

I've found this script over at effbot (http://effbot.org/librarybook/os-path.htm), and I can't get my head around its inner workings. Here's the script:

    import os

    class DirectoryWalker:
        # a forward iterator that traverses a directory tree

        def __init__(self, directory):
            self.stack = [directory]
            self.files = []
            self.index = 0

        def __getitem__(self, index):
            while 1:
                try:
                    file = self.files[self.index]
                    self.index = self.index + 1
                except IndexError:
                    # pop next directory from stack
                    self.directory = self.stack.pop()
                    self.files = os.listdir(self.directory)
                    self.index = 0
                else:
                    # got a filename
                    fullname = os.path.join(self.directory, file)
                    if os.path.isdir(fullname) and not os.path.islink(fullname):
                        self.stack.append(fullname)
                    return fullname

    for file in DirectoryWalker("."):
        print file

Now, if I look at this script step by step, I don't understand:

- what is being iterated over (what is being called by file in DirectoryWalker()?);
- where it gets the index value from;
- where the while 1:-loop is quitted.

Thanks in advance,

Mathieu

-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Help me understand this iterator
LaundroMat wrote:

> Now, if I look at this script step by step, I don't understand:
> - what is being iterated over (what is being called by file in DirectoryWalker()?);

as explained in the text above the script, this class emulates a sequence. it does this by implementing the __getitem__ method:

    http://effbot.org/pyref/__getitem__

> - where it gets the index value from;

from the call to __getitem__ done by the for-in loop.

> - where the while 1:-loop is quitted.

the loop stops when the stack is empty, and pop raises an IndexError exception.

note that this is an old example; code written for newer versions of Python would probably use a recursive generator instead (see the source code for os.walk in the standard library for an example).

/F
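For readers wondering what the recursive generator mentioned above might look like, here is a minimal sketch in Python 3 syntax. It is not the effbot code and not how os.walk is implemented; walk_files and demo are hypothetical names made up for this illustration. Note that it descends into a subdirectory as soon as it meets it, so the order of results can differ from DirectoryWalker, which finishes the current directory first.

```python
import os
import tempfile


def walk_files(directory):
    """Yield the full path of every entry below `directory` (hypothetical helper)."""
    for name in sorted(os.listdir(directory)):
        fullname = os.path.join(directory, name)
        yield fullname
        # Same rule as DirectoryWalker: descend into directories, but not symlinks.
        if os.path.isdir(fullname) and not os.path.islink(fullname):
            yield from walk_files(fullname)


def demo():
    """Build a tiny throwaway tree and return the walked paths relative to its root."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        for rel in ("a.txt", os.path.join("sub", "b.txt")):
            with open(os.path.join(root, rel), "w") as handle:
                handle.write("x")
        return [os.path.relpath(path, root) for path in walk_files(root)]


if __name__ == "__main__":
    for path in demo():
        print(path)
```

The generator keeps its position in each directory implicitly in the suspended for loop, so there is no need for the explicit stack, files list and index that the class has to maintain by hand.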
Re: Help me understand this iterator
LaundroMat wrote:

> Hi, I've found this script over at effbot (http://effbot.org/librarybook/os-path.htm), and I can't get my head around its inner workings.
>
> [snip code]
>
> Now, if I look at this script step by step, I don't understand:
> - what is being iterated over (what is being called by file in DirectoryWalker()?);
> - where it gets the index value from;
> - where the while 1:-loop is quitted.

With

    dw = DirectoryWalker(".")

the for loop is equivalent to

    index = 0 # internal variable, not visible from Python
    while True:
        try:
            file = dw[index] # invokes dw.__getitem__(index)
        except IndexError:
            break
        print file
        index += 1

This is an old way of iterating over a sequence, which is only used when the iterator-based approach

    dwi = iter(dw) # invokes dw.__iter__()
    while True:
        try:
            file = dwi.next()
        except StopIteration:
            break
        print file

fails.

Peter
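To see the indices that the for loop supplies, one can record them. The following is a small sketch in Python 3 syntax; Recorder is a hypothetical class invented for this illustration, and the parameter is deliberately called i rather than index to show that the name carries no special meaning.

```python
class Recorder:
    """Old-style sequence: defines __getitem__ only, no __iter__ (hypothetical example class)."""

    def __init__(self, items):
        self.items = items
        self.seen = []  # every index the for loop passed in

    def __getitem__(self, i):  # the parameter name is arbitrary
        self.seen.append(i)
        return self.items[i]  # raises IndexError past the end, which ends the loop


rec = Recorder(["a", "b", "c"])
collected = []
for item in rec:
    collected.append(item)

print(collected)  # ['a', 'b', 'c']
print(rec.seen)   # [0, 1, 2, 3]
```

The final index 3 is the call that raises IndexError; the for loop swallows that exception and stops.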
Re: Help me understand this iterator
On Tue, 31 Oct 2006 03:36:08 -0800, LaundroMat wrote:

> Hi, I've found this script over at effbot (http://effbot.org/librarybook/os-path.htm), and I can't get my head around its inner workings.

[snip code]

> Now, if I look at this script step by step, I don't understand:
> - what is being iterated over (what is being called by file in DirectoryWalker()?);

What is being iterated over is the files in the current directory and, as the walker descends, in its subdirectories. In Unix land (and probably DOS/Windows as well) the directory "." means "this directory, right here".

> - where it gets the index value from;

When Python sees a line like

    for x in obj:

it does some special magic. First it looks to see if obj has an __iter__ method; if so, it calls next() repeatedly on the iterator that method returns. That's not the case here: DirectoryWalker is an old-style iterator, not one of the fancy new ones. Instead, Python tries calling obj[index] starting at 0 and keeps going until an IndexError exception is raised, then it halts the for loop.

So, think of it like this: pretend that Python expands the following code:

    for x in obj:
        block

into something like this:

    index = 0
    while True: # loop forever
        try:
            x = obj[index]
            block # can use x in block
        except IndexError:
            # catch the exception and escape the while loop
            break
        index = index + 1
    # and now we're done, continue the rest of the program

That's not exactly what Python does, of course; it is much more efficient. But that's a good picture of what happens.

> - where the while 1:-loop is quitted.

The while 1 loop is escaped when the function hits the return statement.

-- 
Steven.
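The order of the two lookups described above can be checked with a small experiment. This is only a sketch in Python 3 syntax; Both and OnlyGetitem are hypothetical classes written for this illustration. The first defines both methods and logs which one the for loop actually uses; the second defines only __getitem__ and relies on the fallback.

```python
class Both:
    """Defines both protocols so we can see which one the for loop picks."""

    def __init__(self):
        self.calls = []

    def __iter__(self):
        self.calls.append("__iter__")
        return iter(["from iter"])

    def __getitem__(self, index):
        self.calls.append("__getitem__")
        if index >= 1:
            raise IndexError(index)
        return "from getitem"


class OnlyGetitem:
    """No __iter__ at all, so the for loop falls back to obj[0], obj[1], ..."""

    def __getitem__(self, index):
        if index >= 2:
            raise IndexError(index)
        return index * 10


both = Both()
picked = [x for x in both]
fallback = [x for x in OnlyGetitem()]

print(picked)      # ['from iter']
print(both.calls)  # ['__iter__']
print(fallback)    # [0, 10]
```

When __iter__ exists, __getitem__ is never consulted by the loop; the index-based protocol is purely a fallback.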
Re: Help me understand this iterator
LaundroMat wrote:

[me hitting send too soon]

> Now, if I look at this script step by step, I don't understand:
> - where the while 1:-loop is quitted.

> class DirectoryWalker:
>     # a forward iterator that traverses a directory tree
>
>     def __init__(self, directory):
>         self.stack = [directory]
>         self.files = []
>         self.index = 0
>
>     def __getitem__(self, index):
>         while 1:
>             try:
>                 file = self.files[self.index]
>                 self.index = self.index + 1
>             except IndexError:
>                 # pop next directory from stack
>                 self.directory = self.stack.pop()

If self.stack is empty, pop() will raise an IndexError which terminates both the 'while 1' loop in __getitem__() and the enclosing 'for file in ...' loop.

>                 self.files = os.listdir(self.directory)
>                 self.index = 0
>             else:
>                 # got a filename
>                 fullname = os.path.join(self.directory, file)
>                 if os.path.isdir(fullname) and not os.path.islink(fullname):
>                     self.stack.append(fullname)
>                 return fullname

The return statement feeds the next file to the for loop.

Peter
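The termination mechanism described above can be isolated from the directory handling. The sketch below (Python 3 syntax) uses a hypothetical class Drain that, like DirectoryWalker, ignores the index it is given and simply pops from a stack; the IndexError raised by pop() on the empty list is what ends the for loop.

```python
class Drain:
    """Ignores the index, like DirectoryWalker, and hands out stack items until none are left."""

    def __init__(self, items):
        self.stack = list(items)

    def __getitem__(self, index):
        # list.pop() on an empty list raises IndexError, which the for loop
        # treats as "end of sequence".
        return self.stack.pop()


popped = []
for value in Drain([1, 2, 3]):
    popped.append(value)

# Confirm which exception an empty list raises on pop().
try:
    [].pop()
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__

print(popped)      # [3, 2, 1]
print(error_name)  # IndexError
```

Nothing in Drain raises IndexError explicitly; the exception escapes from pop(), propagates out of __getitem__, and is caught by the loop machinery.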
Re: Help me understand this iterator
Ack, I get it now. It's not the variable's name (index) that is hard-coded, it's just that the for...in... loop sends an argument by default. That's a lot more comforting.
Re: Help me understand this iterator
Thanks all, those were some great explanations. It seems I still have a long way to go before I grasp the intricacies of this language.

That 'magic index' variable bugs me a little however. It gives me the same feeling as when I see hard-coded variables. I suppose the generator class has taken care of this with its next() method (although - I should have a look - __next__() probably takes self and index as its arguments). Although I'm very fond of the language (as a non-formally trained hobbyist developer), that magic bit is a tad disturbing.

Still, thanks for the quick and complete replies!
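On the guess about the arguments of __next__(): in the iterator protocol the method receives only self, and the position is kept as state on the object rather than passed in. A minimal sketch in Python 3 syntax follows; Countdown is a hypothetical class written for this illustration (in the Python 2 of this thread the method would be spelled next()).

```python
class Countdown:
    """Iterator-protocol class: state lives on the instance, not in an index argument."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):  # takes only self; Python 2 spelled this method next()
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


values = list(Countdown(3))
print(values)  # [3, 2, 1]
```

Here the loop signals exhaustion with StopIteration instead of IndexError, and no index is ever handed to the object.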
Re: Help me understand this iterator
LaundroMat wrote:

> That 'magic index' variable bugs me a little however. It gives me the same feeling as when I see hard-coded variables.

what magic index? the variable named index is an argument to the method it's used in.

/F
Re: Help me understand this iterator
On Oct 31, 3:53 pm, Fredrik Lundh wrote:

> LaundroMat wrote:
>
> > That 'magic index' variable bugs me a little however. It gives me the same feeling as when I see hard-coded variables.
>
> what magic index? the variable named index is an argument to the method it's used in.

Yes, I reacted too quickly. Sorry.