Re: [Tutor] a quick Q: how to use for loop to read a series of files with .doc end

Dave Angel Fri, 07 Oct 2011 03:24:01 -0700

On 10/07/2011 04:08 AM, lina wrote:

<snip>
I thought some loop problem might be making it output the results twice, so
I adjusted the indentation, and now it shows:
$ python3 counter-vertically-v2.py
{'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}
{'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}
[1, 0, 1, 0, 1, 0]
Traceback (most recent call last):
   File "counter-vertically-v2.py", line 48, in <module>
     dofiles(".")
   File "counter-vertically-v2.py", line 13, in dofiles
     processfile(filename)
   File "counter-vertically-v2.py", line 31, in processfile
     for a,b in zip(results['E'],results['B']):
KeyError: 'E'


Still two results, but the summary is correct. There is also a KeyError,
which I don't know how to fix.

#!/bin/python3

import os.path


TOKENS="BE"
LINESTOSKIP=0
INFILEEXT=".xpm"
OUTFILEEXT=".txt"

def dofiles(topdirectory):
     for filename in os.listdir(topdirectory):
         processfile(filename)

def processfile(infilename):
     results={}
     base, ext =os.path.splitext(infilename)
     if ext == INFILEEXT:
         text = fetchonefiledata(infilename)
         numcolumns=len(text[0])
         for ch in TOKENS:
             results[ch] = [0]*numcolumns
         for line in text:
             line = line.strip()
         for col, ch in enumerate(line):
             if ch in TOKENS:
                 results[ch][col]+=1
     for k,v in results.items():
         print(results)

That'll print the whole map for each item in it. Since you apparently have two items, "E" and "B", you get the whole thing printed out twice.


I have no idea what you really wanted to print, but it probably was k and v.
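
A minimal sketch of what that loop might look like, assuming the intent was to show each token next to its per-column counts. The sample dictionary is copied from the output quoted above; the helper name format_results() is made up for illustration.

```python
# Sample data copied from the output shown earlier in this thread.
results = {'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}

def format_results(results):
    # Hypothetical helper: build one line per token, rather than
    # printing the whole dictionary once per key.
    return ["{}: {}".format(k, v) for k, v in sorted(results.items())]

for line in format_results(results):
    print(line)
```

Sorting the items only makes the order predictable; it is not required.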

     summary=[]
     for a,b in zip(results['E'],results['B']):
         summary.append(a+b)
     print(summary)
     writeonefiledata(base+OUTFILEEXT,summary)

def fetchonefiledata(inname):
     infile = open(inname)
     text = infile.readlines()
     return text[LINESTOSKIP:]

def writeonefiledata(outname,summary):
     outfile = open(outname,"w")
     for elem in summary:
         outfile.write(str(summary))


if __name__=="__main__":
     dofiles(".")

Thanks all for your time,

As for the reason you got the exception, it probably was because the NEXT file had no E's in it.
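
In the code as posted there is a second way to hit the same KeyError, which may be worth ruling out: processfile() only fills results when the extension is ".xpm", but the summary loop runs for every file that os.listdir() returns. A rough sketch of that situation follows. collect() is a made-up stand-in for the first half of processfile(), and the guard at the end is just one option, not necessarily the right fix for your data.

```python
import os.path

INFILEEXT = ".xpm"
TOKENS = "BE"

def collect(infilename, text):
    # Hypothetical stand-in for the first half of processfile():
    # results is only filled in when the extension matches.
    results = {}
    base, ext = os.path.splitext(infilename)
    if ext == INFILEEXT:
        numcolumns = len(text[0])
        for ch in TOKENS:
            results[ch] = [0] * numcolumns
    return results

# The script itself also lives in "." and is not an .xpm file.
results = collect("counter-vertically-v2.py", ["BE"])
print(results)   # an empty dict, so results['E'] would raise KeyError

# One possible guard: only build the summary when there is something in it.
if results:
    print(results['E'])
```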

One of the reasons to break this stuff into separate functions is so you can test them separately. You probably should be calling processfile() directly in your top-level code, till it all comes out correctly. Or at least add a print of the filename it's working on.
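
Roughly what I mean, as a sketch only. The processfile() body here is a placeholder rather than your counting code, and "sample.xpm" is a made-up name standing in for one file you know is good.

```python
import os

def processfile(infilename):
    # Placeholder for illustration only; the real function does the counting.
    return "processed " + infilename

def dofiles(topdirectory):
    for filename in sorted(os.listdir(topdirectory)):
        print("working on", filename)   # shows which file triggers an error
        processfile(filename)

if __name__ == "__main__":
    # While debugging, call processfile() on one known file
    # instead of walking the whole directory with dofiles(".").
    print(processfile("sample.xpm"))    # "sample.xpm" is a made-up name
```

Once the single-file case behaves, switch the last line back to dofiles(".").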

Anyway, it's probably a mistake to ever reference "E" and "B" explicitly, but instead loop through the TOKENS. That way it'll still work when you add more or different tokens. Further, if it's considered valid for an input file not to have samples of all the tokens, then you have to loop through the ones you actually have. That might mean looping through the keys of results. Or, for the particular use case in that line, there's undoubtedly a method of results that will give you all the values in a list. That list would make an even better argument to zip(). Once again, I remind you of the dir() function, to see available methods.
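
In case it helps, here is one way that could look, as a sketch rather than the only answer. It assumes every list in results has the same length; summarize() is a name I made up, and the sample dictionary is copied from your output above.

```python
# Sample data copied from the output quoted earlier in this thread.
results = {'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}

def summarize(results):
    # results.values() yields every per-token list; zip(*...) then walks
    # them column by column, so nothing here depends on the token names.
    return [sum(column) for column in zip(*results.values())]

print(summarize(results))          # [1, 0, 1, 0, 1, 0]
print('values' in dir(results))    # True: dir() lists the dict methods
```

With an empty results, zip() gets no arguments and the summary is simply an empty list, so the KeyError cannot occur on that line.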


--

DaveA

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
