On 10/05/2011 08:46 AM, lina wrote:
On Wed, Oct 5, 2011 at 8:21 PM, Dave Angel<d...@davea.name>  wrote:


# these are capitalized because they're intended to be constants
TOKENS = "BE"
LINESTOSKIP = 43
INFILEEXT = ".xpm"
OUTFILEEXT = ".txt"

def dofiles(topdirectory):
    for filename in os.listdr(topdirectory):

Here your typo is listdir not listdr,
        processfile(filename)
def processfile(infilename):
    base, ext =os.path.splitext(fileName)

Here I changed the fileName to infilename
    if ext == INFILEEXT:
        text = fetchonefiledata(infilename)
        numcolumns = len(text[0])
        results = {}
        for ch in TOKENS:

            results[ch] = [0] * numcolumns
        for line in text:
            line = line.strip()

            for col, ch in enumerate(line):
                if ch in tokens:

Here I changed the tokens to TOKENS
                    results[ch][col] += 1
        writeonefiledata(base+OUTFILEEXT, results)


def fetchonefiledata(inname):
    infile = open(inname)
    text = infile.readlines()
    return text[LINESTOSKIP:]

def writeonefiledata(outname, results):
    outfile = open(outname, "w")
    # ...process the results as appropriate...
    # ....(since you didn't tell us how multiple tokens were to be displayed)

if __name__ == "__main__":
    dofiles(".")     # or get the top directory from the sys.argv variable,
                     # which is set from the command line.


You dissect the former one you suggested before into 4 functions.

a little question, why choose .ext? why the splitext is also ext here?



  Try the following, perhaps in the interpreter:
mytuple = ("one thing", "Another thing")
base, extension = mytuple

Now look and see what base and extension have for values.

Previously we just needed the second element of the splitext return value.
This time we'll need both, so we might as well put them in variables that
have useful names.
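
A minimal sketch of that unpacking applied to splitext itself. The file name "frame01.xpm" is made up for illustration; only os.path.splitext comes from the code above.

```python
import os.path

# os.path.splitext returns a 2-tuple: (name without extension, extension)
pair = os.path.splitext("frame01.xpm")   # hypothetical file name

# Tuple unpacking gives each element its own name in a single statement
base, ext = pair

print(repr(pair))    # the whole tuple
print(repr(base))    # first element
print(repr(ext))     # second element, including the leading dot
```

So "ext" is just a variable name chosen for the second element; any other name would work equally well.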
Yes, thanks for reminding, I understand now.



import os.path


TOKENS="E"
LINESTOSKIP=0
INFILEEXT=".xpm"
OUTFILEEXT=".txt"

def dofiles(topdirectory):
     for filename in os.listdir(topdirectory):
         processfile(filename)

def processfile(infilename):
     base, ext =os.path.splitext(infilename)
     if ext == INFILEEXT:
         text = fetchonefiledata(infilename)
         numcolumns=len(text[0])
         results={}
         for ch in TOKENS:

             results[ch] = [0]*numcolumns
         for line in text:
             line = line.strip()

             for col, ch in enumerate(line):
                 if ch in TOKENS:
                     results[ch][col]+=1
         writeonefiledata(base+OUTFILEEXT,results)

def fetchonefiledata(inname):
     infile = open(inname)
     text = infile.readlines()
     return text[LINESTOSKIP:]

def writeonefiledata(outname,results):
     outfile = open(outname,"w")
     for item in results:
         return outfile.write(item)


if __name__=="__main__":
     dofiles(".")

just the results is a bit unexpected.

  $ more try.txt
E

I might have made a mistake in the part of writeonefiledata that you left for me.

  I'd be amazed if there weren't at least a couple of typos in my message.
  But this is where you sprinkle a couple of prints.  What did results look
like when you print it out?

Yes, you did keep some typos there.
The result is kind of weird? only E there.

I ask again: what did results look like when you printed it out? I'm referring to the argument to writeonefiledata().
def writeonefiledata(outname,results):
     # put the lines here:
     print("results is: ", results)
     print("repr is:", repr(results))

     outfile = open(outname,"w")
     for item in results:
         return outfile.write(item)

Did I make some mistakes in this final part?

Yes, you're iterating over the keys of a dictionary. Since it only has the key "E", that's what you get. Try printing dir(results) to see what methods might return something other than the key. Make the language work for you.
I hope you'll find that results is a dictionary, you might not want to just
write() its keys.  You probably want to write() its values instead, perhaps
with a heading showing what key you're printing.
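
One possible shape for that, as a sketch only: the output layout (a "token X" heading followed by space-separated counts) is an assumption, since the thread never settles on a format, and the sample dictionary and temporary file are made up. Note that it has no return inside the loop; a return there would end the function after the first item.

```python
import os
import tempfile

def writeonefiledata(outname, results):
    # results maps each token to a list of per-column counts
    with open(outname, "w") as outfile:
        # items() yields (key, value) pairs; iterating the dict alone gives only keys
        for token, counts in sorted(results.items()):
            outfile.write("token %s\n" % token)                      # heading: which key
            outfile.write(" ".join(str(n) for n in counts) + "\n")   # its values

# Hypothetical data, written to a throwaway directory
demo_path = os.path.join(tempfile.mkdtemp(), "try.txt")
writeonefiledata(demo_path, {"E": [0, 2, 1], "B": [1, 0, 0]})

with open(demo_path) as check:
    written = check.read()
print(written)
```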
Later I wish to get the value of B+E, the two tokens, so the final result
for each column is enough. I will use this data to proceed further in the
future.

The code to handle multiple keys is already there. The only reason you're getting only E is that you specified just one token. Try changing it to

TOKENS = "EA"
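
As a sketch of what two tokens give you, here is the counting loop from processfile() pulled into its own function and run on made-up lines, with the B+E per-column total added at the end. The name counttokens, the sample lines, and taking the column count after strip() are assumptions for illustration, not part of the code above.

```python
TOKENS = "BE"

def counttokens(lines, tokens=TOKENS):
    # one list of per-column counters for each token
    numcolumns = len(lines[0].strip())
    results = {ch: [0] * numcolumns for ch in tokens}
    for line in lines:
        for col, ch in enumerate(line.strip()):
            if ch in tokens:
                results[ch][col] += 1
    return results

# Hypothetical input: three rows, four columns
sample = ["BEAB\n", "EEBA\n", "ABEB\n"]
results = counttokens(sample)
print(results)       # one entry per token

# B+E per column: add the counters of all tokens column by column
combined = [sum(column) for column in zip(*results.values())]
print(combined)
```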

But it gives you a simple refactoring that splits the logic so each piece can be
visualized (and tested) independently.  I'd also split up processfile(),
once I realized how big it was.

There are many shortcuts that can be applied. Some of them probably use
language features you're not comfortable with, like perhaps generators. And
if efficiency is important, there are optimizations to do, like using
islice directly on the infile object.  That one would eliminate having to
have the whole file stored in memory at one time.
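
A rough sketch of the islice idea, under some assumptions: this variant of fetchonefiledata takes an already-open file object rather than a name, LINESTOSKIP is set to 2 to keep the demo short, and io.StringIO stands in for a real file.

```python
import io
from itertools import islice

LINESTOSKIP = 2   # small value for the demo only

def fetchonefiledata(infile, skip=LINESTOSKIP):
    # islice(iterable, start, None) drops the first `skip` lines and then
    # yields the rest one at a time, so the file is never held in memory whole
    return islice(infile, skip, None)

# io.StringIO behaves like an open text file (hypothetical contents)
fake = io.StringIO("header 1\nheader 2\nEEDC\nAAAC\n")
rows = [line.strip() for line in fetchonefiledata(fake)]
print(rows)
```

The caller would then have to loop over the result once instead of indexing it, so len(text[0]) in processfile() would need rethinking.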

Likewise there are further things that could be done to decouple the
functions even more.

But there's nothing in the above code which uses very advanced topics, so
you should be able to understand it and fix whatever typos I've undoubtedly
got.

What are you using for debugging aids?  Besides this group, I mean.  print
statements?  An IDE?  Which one?

Debugging aids?
I just run python3 script.py
and it pops up some hints;
in the middle, I probably try print.

Once the code is refactored into small enough independent functions, you
can do things like write multiple versions of a given function, for
debugging purposes.  For example, you could have another function called
fetchonefiledata(), and have it return a list of strings.  It might be

def fetchonefiledata(dummy):
    buf = """EEDC
AAAC
F145
CCCA
"""
    return buf.split()

and then you wouldn't be dependent on an actual file being available.

Naturally, at that point, your top-level code would call processfile()
instead of dofiles().

And remember the repr() and type() functions when trying to see just what
type of thing something is.

I have not figured out how to use repr() and type() yet.
So try them. repr() shows you a lot more information about an object than str() does, and the latter is what you're getting when you print something directly.

And type() shows you the type of something.

And dir() shows you the attributes of something. Usually what you're interested in is the list of methods. Anyway, once you find an interesting one you can do help() on it. For example, try help( {}.items ) (that's the Python 3 name; on Python 2 it would be {}.iteritems).
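
A small interpreter-style sketch of those four tools, assuming Python 3. The sample values are made up.

```python
results = {"E": [0, 2, 1]}   # hypothetical counts
text = "E\n"

print(text)             # str(): the newline is printed as a line break
print(repr(text))       # repr(): quotes and the \n escape are visible
print(type(results))    # what kind of object this is

# dir() lists attribute names; filter out the underscore ones to see the methods
methods = [name for name in dir(results) if not name.startswith("_")]
print(methods)

# One of those methods, items(), gives keys together with their values
for key, value in results.items():
    print(key, value)

# help(results.items) would then print the documentation for that method
```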

Another question: in Linux, pressing TAB can automatically complete what you
are typing.
In python3, is there some way to get intelligent hints or have the rest
filled in?

Sure, that's the job of the IDE. If you just want auto-indentation, emacs can do that with the python macros. But if you want full method expansion and such, look into one of a dozen IDEs. I happen to use Komodo, but there are others, free and non-free. And there's IPython. And I think there's something included in CPython, but I never looked into it.
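
For what it's worth, one candidate for the "something included in CPython" is the standard library's rlcompleter module, which the interactive interpreter can use for TAB completion where readline is available. That this is the feature meant above is an assumption. The sketch below calls the completer directly, without a terminal, to show what it would offer; the helper name completions is made up.

```python
import os
import rlcompleter

def completions(text, namespace):
    # Ask the completer for matches one at a time until it returns None,
    # which is how readline itself drives it when TAB is pressed
    completer = rlcompleter.Completer(namespace)
    found = []
    state = 0
    while True:
        match = completer.complete(text, state)
        if match is None:
            break
        found.append(match)
        state += 1
    return found

# What TAB would offer after typing "os.path.splite"
offers = completions("os.path.splite", {"os": os})
print(offers)
```

The exact strings vary a little between Python versions (callables may get an opening parenthesis appended), so treat the printed list as illustrative.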

--

DaveA

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
