Re: [Tutor] Review and criticism of python project
Tiger12506 wrote: Ouch. Usually in OOP, one never puts any user interaction into a class. That seems a bit strongly put to me. Generally user interaction should be separated from functional classes but you might have a class to help with command line interaction (e.g. the cmd module in the std lib) and GUI frameworks are usually built with classes. Sorry... made that a little too general. I just have personal experiences with putting a whole bunch of raw_inputs in a class once and learned never to make that mistake again when I wanted to use the code for something else. ;-) uh, tf = line.split(',') if uh in self.uhdata: f=self.uhdata[uh] if tf not in f: f.append(tf) else: self.uhdata[uh]=[tf] This can read try: f = self.uhdata[uh] except KeyError: self.uhdata[uh] = [] finally: self.uhdata[uh].append(tf) These are not equivalent - the original code avoids duplicates in self.uhdata[uh]. Possibly self.uhdata[uh] could be a set instead of a list. Then this could be written very nicely using defaultdict: from collections import defaultdict ... self.uhdata = defaultdict(set) ... self.uhdata[uh].add(tf) for uh, Sections in self.uhdata.items(): Sections.sort() This will not work. In the documentation, it says that dictionary object return a *copy* of the (key,value) pairs. You are sorting those *copies*. No, the original code is fine. The docs say that dict.items() returns a copy of a's list of (key, value) pairs. This is a little misleading, perhaps. A dict does not actually contain a list of key, value pairs, it is implemented with a hash table. dict.items() returns a new list containing the key, value pairs. But the keys and values in the new list are references to the same keys and values in the dict. So mutating the values in the returned list, via sort(), will also mutate the values in the dict. Ah hah! Yes knew about hashes... It was the references that got me. I need to learn to use id() at the python prompt a little more often. ;-) missingpgcounts={} fmissingpgcounts=[] for x in self.uhdata: for f in self.uhdata[x]: if f not in fmissingpgcounts: fmissingpgcounts.append(f) fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in fmissingpgcounts] Ahhh... This looks like a set. fmissingpgcounts = set(self.uhdata.itervalues()) Again, this is not quite the same thing. The original code builds a set of the contents of the values. You are building a set from the values (which are already lists). You still need one loop: fmissingpgcounts = set() for v in self.uhdate.itervalues(): fmissingpgcounts.update(v) self.pgcounts = dict((x,0) for x in fmissingpgcounts) or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0) That's all I have energy for... Kent Hmm... Apparently I didn't have enough energy. I must remember to eat breakfast more often, it's affecting how I think... You're completely right of course, Kent! What would all of us do without you? ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
There are no errors per se - the script is doing what it needs to I guess I just want to check it for compliance - for some reason I think itis a mess and should be much cleaner. I am only concerned with one particular area of the complete project - it is 229 lines in total - would this be too much to post? I do not have a website to post code to - just don't want to post too much for the group and annoy anyone. Thanks for your comments and let me know. GTXY20 On Jan 3, 2008 6:08 PM, wesley chun [EMAIL PROTECTED] wrote: Is there a forum or group where I can upload my python project for review? I am new at Python and at this point my program is doing what it needs to I just can't help but feeling I have some errors or improper coding going on inside. if it's not too huge, feel free to post it here for feedback! let us know a bit about what it does, your intentions and/or perhaps how you built it, and any specific areas of concern. also post any error msgs if applicable. cheers, -- wesley - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Core Python Programming, Prentice Hall, (c)2007,2001 http://corepython.com wesley.j.chun :: wescpy-at-gmail.com python training and technical consulting cyberweb.consulting : silicon valley, ca http://cyberwebconsulting.com ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
GTXY20 wrote: There are no errors per se - the script is doing what it needs to I guess I just want to check it for compliance - for some reason I think itis a mess and should be much cleaner. I am only concerned with one particular area of the complete project - it is 229 lines in total - would this be too much to post? Did you get my invitation to post programs in that size range? Please just do it. The suspense is killing me! I do not have a website to post code to - just don't want to post too much for the group and annoy anyone. Dialogging about it is more painful than just reading the code. Thanks for your comments and let me know. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
Hi there. What this section area does is takes a data file that is comma separated and imports - there is a unique ID in the first field and a code in the second field that corresponds to a certain section of information. What I need from this is for the process to role up against unique ID all section holdings withot duplicates, report on section combinations, and overal section counts. In addtion I need the ability to assigna value for page count to these sections and have the ability to uploada translation file just in case a section is identiifed by multiple values that needs to be normalized to a single unique value. Sorry for the lengthly code response - all commenst are appreciated - as mentioned I am quite new with Python - it is doing what I need it to do but I think that it is a mess and needs to be cleaned up a little. Thanks for any comments. GTXY20 import sys import os class __analysis: def __init__(self): print '***Analysis Tool***' datafile=raw_input('data file name:') self.datafile=datafile self.parsefile() # script to import unitID section data and section page count reference and create a sorted dictionary # where in uhdata{} key=unitID and value=unitID section holdings # where in pgcnt{} key=Section and value=page count def parsefile(self): try: uhdatafile = open(self.datafile, 'r') records = uhdatafile.read() uhdatafile.close() lines = records.split() self.uhdata={} for line in lines: uh, tf = line.split(',') if uh in self.uhdata: f=self.uhdata[uh] if tf not in f: f.append(tf) else: self.uhdata[uh]=[tf] for uh, Sections in self.uhdata.items(): Sections.sort() except IOError: print 'file not found check file name' analysis() ftranslateok=raw_input('would you like to translate section codes? (y/n):') if ftranslateok == 'y': self.transFn() else: pass pgcountok=raw_input('would you like to assign section page counts? (y/n):') if pgcountok == 'y': self.setPageCounts() else: missingpgcounts={} fmissingpgcounts=[] for x in self.uhdata: for f in self.uhdata[x]: if f not in fmissingpgcounts: fmissingpgcounts.append(f) for x in fmissingpgcounts: missingpgcounts[x]=0 self.pgcounts = missingpgcounts fdistmodel=raw_input('would you like to define max section distribution cut off? (y/n):') if fdistmodel == 'y': self.fdistmax=raw_input('what is the max distributions before a full book?:') self.fdistmax=int(self.fdistmax) self.Sectiondistmax() else: self.fdistmax=10 self.Sectiondistmax() sys.exit(1) # function to determine number of uniqueID for each section def Sectionqty(self): Sectionqtyoutfile = open('Sectionqty.txt', 'w+') Sectionqtyoutfile.write ('Section\tQTY\n') from collections import defaultdict fcounts=defaultdict(int) flst=[] flst2=[] if self.fdistmax == 10: for v in self.uhdata.values(): for item in v: fcounts[item]+=1 for k,v in sorted(fcounts.items()): Section=k fqty=v Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty)) else: for k,v in self.uhdata.items(): if len(v)=self.fdistmax: flst.append(self.uhdata[k]) for i in flst: for x in i: flst2.append(x) for Sections in flst2: fcounts[Sections]+=1 for k,v in sorted(fcounts.items()): Section= k fqty= v Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty)) Sectionqtyoutfile.close() self.SectionCombqty() # function to determine number of uniqueID section combinations and associated section page counts def SectionCombqty(self): SectionCombqtyoutfile = open('SectionCombqty.txt', 'w+') SectionCombqtyoutfile.write('Combination Qty\tNumber of Sections\tCombination\tCombinationPageCount\tTotalPages\n') fullbook = 'Full Book' fgreater=[] fcheck=0 from collections import defaultdict fcomb=defaultdict(int) for uh in self.uhdata.keys(): fcomblst=self.uhdata[uh] fcomb[tuple(fcomblst)]+=1 if self.fdistmax == 10: for count, items in sorted( ((v,k) for k,v in fcomb.items ()),reverse=True): fpgcounts =
Re: [Tutor] Review and criticism of python project
GTXY20 [EMAIL PROTECTED] wrote Sorry for the lengthly code response - all commenst are appreciated - as mentioned I am quite new with Python - it is doing what I need it to do but I think that it is a mess and needs to be cleaned up a little. I've only skimmed it very quickly and a few things popped out at me. 1) Separate the UI from the logic. In particular dont put print statements or raw_input statements in init methods - you will never be able to use the class in a GUI context! Instead pass the values into the class when you instantiate it. 2) If you must use import inside methods put it at the top - thats where most readers will look for it. Its more conventional to put all imports in a file at the top of the file. 3) Double underscore before a name in Python usually means its a magic Python feature or a private attribute of a class. Using it for a class name is very unusual. A single underscore would indicate the class was private and not intended for general use and might be more appropriate.. 4) This section self.uhdata={} for line in lines: uh, tf = line.split(',') if uh in self.uhdata: f=self.uhdata[uh] if tf not in f: f.append(tf) else: self.uhdata[uh]=[tf] Looks like it should be cleaner. I don;t thionk you shouldneed all those if/in checks using a dictionary. I can't be sure without spending more time reading it than I've got right now but it looks wrong somehow. 5) What is the po8int of this: for k,v in sorted(fcounts.items()): Section=k fqty=v Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty)) Why not just for k,v in sorted(fcounts.items()): Sectionqtyoutfile.write ('%s\t%s\n' % (k,v)) The extra variables dont add anything and if you really prefer the names then do this: for Section,fqty in sorted(fcounts.items()): Sectionqtyoutfile.write ('%s\t%s\n' % (Section,fqty)) Thanks for any comments. Thats as far as I got, HTH, -- Alan Gauld Author of the Learn to Program web site http://www.freenetpages.co.uk/hp/alan.gauld ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
Cool! Code! (gullom voice - we must look at code, yesss) Hi there. What this section area does is takes a data file that is comma separated and imports - there is a unique ID in the first field and a code in the second field that corresponds to a certain section of information. What I need from this is for the process to role up against unique ID all section holdings withot duplicates, report on section combinations, and overal section counts. In addtion I need the ability to assigna value for page count to these sections and have the ability to uploada translation file just in case a section is identiifed by multiple values that needs to be normalized to a single unique value. Sorry for the lengthly code response - all commenst are appreciated - as mentioned I am quite new with Python - it is doing what I need it to do but I think that it is a mess and needs to be cleaned up a little. Thanks for any comments. GTXY20 import sys import os class __analysis: def __init__(self): print '***Analysis Tool***' datafile=raw_input('data file name:') self.datafile=datafile self.parsefile() Ouch. Usually in OOP, one never puts any user interaction into a class. Anytime you want to make an __analysis object, code execution is halted unto the user responds Yuck. If necessary, then make the datafile a parameter you pass in and do the raw_input outside of the class definition before you instantiate an object. As an aside... setting the filename to datafile and then datafile directly to self.datafile is a little wasteful when you can set it directly to self.datafile # script to import unitID section data and section page count reference and create a sorted dictionary # where in uhdata{} key=unitID and value=unitID section holdings # where in pgcnt{} key=Section and value=page count def parsefile(self): try: uhdatafile = open(self.datafile, 'r') records = uhdatafile.read() uhdatafile.close() lines = records.split() Ummm this will split on every space also. Using the name 'lines' as an indicator I think you were expecting this to actually split on newline characters, so instead you should use uhdatafile.readlines() self.uhdata={} for line in lines: Here you iterate over a created list. It is easier and more efficient to iterate over the file itself. for line in uhdatafile: do_stuff() uhdatafile.close() uh, tf = line.split(',') if uh in self.uhdata: f=self.uhdata[uh] if tf not in f: f.append(tf) else: self.uhdata[uh]=[tf] This can read try: f = self.uhdata[uh] except KeyError: self.uhdata[uh] = [] finally: self.uhdata[uh].append(tf) Also note that there is no error checking in the algorithm anywhere here. The file must be exactly formatted or it will create an error. for uh, Sections in self.uhdata.items(): Sections.sort() This will not work. In the documentation, it says that dictionary object return a *copy* of the (key,value) pairs. You are sorting those *copies*. Also sinces uh is not used here itervalues() is sufficient. for Sections in self.uhdata.itervalues(): Sections.sort() except IOError: print 'file not found check file name' analysis() ftranslateok=raw_input('would you like to translate section codes? (y/n):') if ftranslateok == 'y': self.transFn() else: pass This is unnecessary, code execution will continue whether you use else: pass or not. pgcountok=raw_input('would you like to assign section page counts? (y/n):') if pgcountok == 'y': self.setPageCounts() else: missingpgcounts={} fmissingpgcounts=[] for x in self.uhdata: for f in self.uhdata[x]: if f not in fmissingpgcounts: fmissingpgcounts.append(f) fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in fmissingpgcounts] Ahhh... This looks like a set. fmissingpgcounts = set(self.uhdata.itervalues()) for x in fmissingpgcounts: missingpgcounts[x]=0 self.pgcounts = missingpgcounts Oh my. So we have a dictionary, create a unique list of the values and then create another dictionary from that? The intermediate variable missingpgcounts is not necessary. self.pgcounts = dict((x,0) for x in fmissingpgcounts) fdistmodel=raw_input('would you like to define max section distribution cut off? (y/n):') if fdistmodel == 'y': self.fdistmax=raw_input('what is the max distributions before a full book?:') self.fdistmax=int(self.fdistmax) self.Sectiondistmax() else:
Re: [Tutor] Review and criticism of python project
Tiger12506 wrote: Ouch. Usually in OOP, one never puts any user interaction into a class. That seems a bit strongly put to me. Generally user interaction should be separated from functional classes but you might have a class to help with command line interaction (e.g. the cmd module in the std lib) and GUI frameworks are usually built with classes. uh, tf = line.split(',') if uh in self.uhdata: f=self.uhdata[uh] if tf not in f: f.append(tf) else: self.uhdata[uh]=[tf] This can read try: f = self.uhdata[uh] except KeyError: self.uhdata[uh] = [] finally: self.uhdata[uh].append(tf) These are not equivalent - the original code avoids duplicates in self.uhdata[uh]. Possibly self.uhdata[uh] could be a set instead of a list. Then this could be written very nicely using defaultdict: from collections import defaultdict ... self.uhdata = defaultdict(set) ... self.uhdata[uh].add(tf) for uh, Sections in self.uhdata.items(): Sections.sort() This will not work. In the documentation, it says that dictionary object return a *copy* of the (key,value) pairs. You are sorting those *copies*. No, the original code is fine. The docs say that dict.items() returns a copy of a's list of (key, value) pairs. This is a little misleading, perhaps. A dict does not actually contain a list of key, value pairs, it is implemented with a hash table. dict.items() returns a new list containing the key, value pairs. But the keys and values in the new list are references to the same keys and values in the dict. So mutating the values in the returned list, via sort(), will also mutate the values in the dict. missingpgcounts={} fmissingpgcounts=[] for x in self.uhdata: for f in self.uhdata[x]: if f not in fmissingpgcounts: fmissingpgcounts.append(f) fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in fmissingpgcounts] Ahhh... This looks like a set. fmissingpgcounts = set(self.uhdata.itervalues()) Again, this is not quite the same thing. The original code builds a set of the contents of the values. You are building a set from the values (which are already lists). You still need one loop: fmissingpgcounts = set() for v in self.uhdate.itervalues(): fmissingpgcounts.update(v) self.pgcounts = dict((x,0) for x in fmissingpgcounts) or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0) That's all I have energy for... Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
thanks for the feedback - i will go through your comments line by line adjust and test and will re-post when complete. GTXY20 On Jan 4, 2008 9:29 PM, Kent Johnson [EMAIL PROTECTED] wrote: Tiger12506 wrote: Ouch. Usually in OOP, one never puts any user interaction into a class. That seems a bit strongly put to me. Generally user interaction should be separated from functional classes but you might have a class to help with command line interaction (e.g. the cmd module in the std lib) and GUI frameworks are usually built with classes. uh, tf = line.split(',') if uh in self.uhdata: f=self.uhdata[uh] if tf not in f: f.append(tf) else: self.uhdata[uh]=[tf] This can read try: f = self.uhdata[uh] except KeyError: self.uhdata[uh] = [] finally: self.uhdata[uh].append(tf) These are not equivalent - the original code avoids duplicates in self.uhdata[uh]. Possibly self.uhdata[uh] could be a set instead of a list. Then this could be written very nicely using defaultdict: from collections import defaultdict ... self.uhdata = defaultdict(set) ... self.uhdata[uh].add(tf) for uh, Sections in self.uhdata.items(): Sections.sort() This will not work. In the documentation, it says that dictionary object return a *copy* of the (key,value) pairs. You are sorting those *copies*. No, the original code is fine. The docs say that dict.items() returns a copy of a's list of (key, value) pairs. This is a little misleading, perhaps. A dict does not actually contain a list of key, value pairs, it is implemented with a hash table. dict.items() returns a new list containing the key, value pairs. But the keys and values in the new list are references to the same keys and values in the dict. So mutating the values in the returned list, via sort(), will also mutate the values in the dict. missingpgcounts={} fmissingpgcounts=[] for x in self.uhdata: for f in self.uhdata[x]: if f not in fmissingpgcounts: fmissingpgcounts.append(f) fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in fmissingpgcounts] Ahhh... This looks like a set. fmissingpgcounts = set(self.uhdata.itervalues()) Again, this is not quite the same thing. The original code builds a set of the contents of the values. You are building a set from the values (which are already lists). You still need one loop: fmissingpgcounts = set() for v in self.uhdate.itervalues(): fmissingpgcounts.update(v) self.pgcounts = dict((x,0) for x in fmissingpgcounts) or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0) That's all I have energy for... Kent ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
If you have a web page, you can upload the code to your web page, then post here with a link to the code and a request for reviews. That's one way to do it. -- b h a a l u u at g m a i l dot c o m On Jan 3, 2008 5:00 PM, GTXY20 [EMAIL PROTECTED] wrote: Hello all, Is there a forum or group where I can upload my python project for review? I am new at Python and at this point my program is doing what it needs to I just can't help but feeling I have some errors or improper coding going on inside. Any advice is very much appreciated. Thanks. GTXY20 ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Review and criticism of python project
GTXY20 wrote: Hello all, Is there a forum or group where I can upload my python project for review? Others on this list can guide you to web sites for uploading code. If the program is relatively small you can just email it to this list as body text. How small? For me I'd rather wade thru 200 or so lines in an email. I am new at Python and at this point my program is doing what it needs to I just can't help but feeling I have some errors Errors? Do you mean bugs that have not yet surfaced but might under some conditions? or improper coding Are you concerned about performance, readability maintainability, easier / better ways to do things, conformance to Python coding guidelines, etc? If so this is the right place to get suggestions. ___ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor