Re: [Tutor] Review and criticism of python project

2008-01-07 Thread Tiger12506
 Tiger12506 wrote:

 Ouch. Usually in OOP, one never puts any user interaction into a class.

 That seems a bit strongly put to me. Generally user interaction should be
 separated from functional classes but you might have a class to help with
 command line interaction (e.g. the cmd module in the std lib) and GUI
 frameworks are usually built with classes.

Sorry... made that a little too general. I just have personal experiences
with putting a whole bunch of raw_inputs in a class once and learned never
to make that mistake again when I wanted to use the code for something else.
;-)

uh, tf = line.split(',')
if uh in self.uhdata:
f=self.uhdata[uh]
if tf not in f:
f.append(tf)
else:
self.uhdata[uh]=[tf]

 This can read

 try:
   f = self.uhdata[uh]
 except KeyError:
   self.uhdata[uh] = []
 finally:
   self.uhdata[uh].append(tf)

 These are not equivalent - the original code avoids duplicates in
 self.uhdata[uh].

 Possibly self.uhdata[uh] could be a set instead of a list. Then this could
 be written very nicely using defaultdict:

 from collections import defaultdict
   ...
   self.uhdata = defaultdict(set)
   ...
   self.uhdata[uh].add(tf)

for uh, Sections in self.uhdata.items():
Sections.sort()

 This will not work. In the documentation, it says that dictionary object
 return a *copy* of the (key,value) pairs.
 You are sorting those *copies*.

 No, the original code is fine. The docs say that dict.items() returns a
 copy of a's list of (key, value) pairs. This is a little misleading,
 perhaps. A dict does not actually contain a list of key, value pairs, it
 is implemented with a hash table. dict.items() returns a new list
 containing the key, value pairs.

 But the keys and values in the new list are references to the same keys
 and values in the dict. So mutating the values in the returned list, via
 sort(), will also mutate the values in the dict.

Ah hah! Yes knew about hashes... It was the references that got me. I need
to learn to use id() at the python prompt a little more often. ;-)

missingpgcounts={}
fmissingpgcounts=[]
for x in self.uhdata:
for f in self.uhdata[x]:
if f not in fmissingpgcounts:
fmissingpgcounts.append(f)

 fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in
 fmissingpgcounts]
 Ahhh... This looks like a set.
 fmissingpgcounts = set(self.uhdata.itervalues())

 Again, this is not quite the same thing. The original code builds a set of
 the contents of the values. You are building a set from the values (which
 are already lists). You still need one loop:
 fmissingpgcounts = set()
 for v in self.uhdate.itervalues():
   fmissingpgcounts.update(v)

 self.pgcounts = dict((x,0) for x in fmissingpgcounts)

 or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0)

 That's all I have energy for...
 Kent


Hmm... Apparently I didn't have enough energy. I must remember to eat
breakfast more often, it's affecting how I think... You're completely right
of course, Kent! What would all of us do without you?

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread GTXY20
There are no errors per se - the script is doing what it needs to I guess I
just want to check it for compliance - for some reason I think itis a mess
and should be much cleaner.

I am only concerned with one particular area of the complete project - it is
229 lines in total - would this be too much to post? I do not have a website
to post code to - just don't want to post too much for the group and annoy
anyone.

Thanks for your comments and let me know.

GTXY20

On Jan 3, 2008 6:08 PM, wesley chun [EMAIL PROTECTED] wrote:

  Is there a forum or group where I can upload my python project for
 review?
 
  I am new at Python and at this point my program is doing what it needs
 to I
  just can't help but feeling I have some errors or improper coding going
 on
  inside.


 if it's not too huge, feel free to post it here for feedback!  let us
 know a bit about what it does, your intentions and/or perhaps how you
 built it, and any specific areas of concern. also post any error msgs
 if applicable.

 cheers,
 -- wesley
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 Core Python Programming, Prentice Hall, (c)2007,2001
http://corepython.com

 wesley.j.chun :: wescpy-at-gmail.com
 python training and technical consulting
 cyberweb.consulting : silicon valley, ca
 http://cyberwebconsulting.com

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread bob gailer
GTXY20 wrote:
  
 There are no errors per se - the script is doing what it needs to I 
 guess I just want to check it for compliance - for some reason I think 
 itis a mess and should be much cleaner.
  
 I am only concerned with one particular area of the complete project - 
 it is 229 lines in total - would this be too much to post?

Did you get my invitation to post programs in that size range? Please 
just do it. The suspense is killing me!
 I do not have a website to post code to - just don't want to post too 
 much for the group and annoy anyone.
Dialogging about it is more painful than just reading the code.
  
 Thanks for your comments and let me know.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread GTXY20
Hi there.

What this section area does is takes a data file that is comma separated and
imports - there is a unique ID in the first field and a code in the second
field that corresponds to a certain section of information. What I need from
this is for the process to role up against unique ID all section holdings
withot duplicates, report on section combinations, and overal section
counts. In addtion I need the ability to assigna value for page count to
these sections and have the ability to uploada translation file just in case
a section is identiifed by multiple values that needs to be normalized to a
single unique value.

Sorry for the lengthly code response - all commenst are appreciated - as
mentioned I am quite new with Python - it is doing what I need it to do but
I think that it is a mess and needs to be cleaned up a little.

Thanks for any comments.

GTXY20

import sys
import os
class __analysis:
def __init__(self):
print '***Analysis Tool***'
datafile=raw_input('data file name:')
self.datafile=datafile
self.parsefile()

# script to import unitID section data and section page count reference and
create a sorted dictionary
# where in uhdata{} key=unitID and value=unitID section holdings
# where in pgcnt{} key=Section and value=page count

def parsefile(self):
try:
uhdatafile = open(self.datafile, 'r')
records = uhdatafile.read()
uhdatafile.close()
lines = records.split()
self.uhdata={}
for line in lines:
uh, tf = line.split(',')
if uh in self.uhdata:
f=self.uhdata[uh]
if tf not in f:
f.append(tf)
else:
self.uhdata[uh]=[tf]

for uh, Sections in self.uhdata.items():
Sections.sort()
except IOError:
print 'file not found check file name'
analysis()

ftranslateok=raw_input('would you like to translate section codes?
(y/n):')
if ftranslateok == 'y':
self.transFn()
else:
pass
pgcountok=raw_input('would you like to assign section page counts?
(y/n):')
if pgcountok == 'y':
self.setPageCounts()
else:
missingpgcounts={}
fmissingpgcounts=[]
for x in self.uhdata:
for f in self.uhdata[x]:
if f not in fmissingpgcounts:
fmissingpgcounts.append(f)
for x in fmissingpgcounts:
missingpgcounts[x]=0
self.pgcounts = missingpgcounts
fdistmodel=raw_input('would you like to define max section
distribution cut off? (y/n):')
if fdistmodel == 'y':
self.fdistmax=raw_input('what is the max distributions before a
full book?:')
self.fdistmax=int(self.fdistmax)
self.Sectiondistmax()
else:
self.fdistmax=10
self.Sectiondistmax()
sys.exit(1)

# function to determine number of uniqueID for each section
def Sectionqty(self):
Sectionqtyoutfile = open('Sectionqty.txt', 'w+')
Sectionqtyoutfile.write ('Section\tQTY\n')
from collections import defaultdict
fcounts=defaultdict(int)
flst=[]
flst2=[]
if self.fdistmax == 10:
for v in self.uhdata.values():
for item in v:
fcounts[item]+=1

for k,v in sorted(fcounts.items()):
Section=k
fqty=v
Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty))

else:
for k,v in self.uhdata.items():
if len(v)=self.fdistmax:
flst.append(self.uhdata[k])
for i in flst:
for x in i:
flst2.append(x)
for Sections in flst2:
fcounts[Sections]+=1
for k,v in sorted(fcounts.items()):
Section= k
fqty= v
Sectionqtyoutfile.write ('%s\t%s\n' % (Section, fqty))

Sectionqtyoutfile.close()
self.SectionCombqty()

# function to determine number of uniqueID section combinations and
associated section page counts
def SectionCombqty(self):
SectionCombqtyoutfile = open('SectionCombqty.txt', 'w+')
SectionCombqtyoutfile.write('Combination Qty\tNumber of
Sections\tCombination\tCombinationPageCount\tTotalPages\n')
fullbook = 'Full Book'
fgreater=[]
fcheck=0
from collections import defaultdict
fcomb=defaultdict(int)
for uh in self.uhdata.keys():
fcomblst=self.uhdata[uh]
fcomb[tuple(fcomblst)]+=1
if self.fdistmax == 10:
for count, items in sorted( ((v,k) for k,v in fcomb.items
()),reverse=True):
fpgcounts = 

Re: [Tutor] Review and criticism of python project

2008-01-04 Thread Alan Gauld

GTXY20 [EMAIL PROTECTED] wrote

 Sorry for the lengthly code response - all commenst are 
 appreciated - as
 mentioned I am quite new with Python - it is doing what I need it to 
 do but
 I think that it is a mess and needs to be cleaned up a little.

I've only skimmed it very quickly and a few things popped out at me.

1) Separate the UI from the logic. In particular dont put print 
statements
or raw_input statements in init methods - you will never be able to 
use
the class in a GUI context! Instead pass the values into the class
when you instantiate it.

2) If you must use import inside methods put it at the top - thats 
where
most readers will look for it. Its more conventional to put all 
imports in
a file at the top of the file.

3) Double underscore before a name in Python usually means
its a magic Python feature or a private attribute of a class. Using it 
for a
class name is very unusual. A single underscore would indicate the 
class
was private and not intended for general use and might be more
appropriate..

4) This section
self.uhdata={}
for line in lines:
uh, tf = line.split(',')
if uh in self.uhdata:
f=self.uhdata[uh]
if tf not in f:
f.append(tf)
else:
self.uhdata[uh]=[tf]

Looks like it should be cleaner. I don;t thionk you shouldneed all
those if/in checks using a dictionary. I can't be sure without 
spending
more time reading it than I've got right now but it looks wrong 
somehow.

5) What is the po8int of this:
for k,v in sorted(fcounts.items()):
Section=k
fqty=v
Sectionqtyoutfile.write ('%s\t%s\n' % (Section, 
 fqty))


Why not just
for k,v in sorted(fcounts.items()):
Sectionqtyoutfile.write ('%s\t%s\n' % (k,v))

The extra variables dont add anything and if you really prefer the
names then do this:
for Section,fqty in sorted(fcounts.items()):
Sectionqtyoutfile.write ('%s\t%s\n' % (Section,fqty))

 Thanks for any comments.

Thats as far as I got,

HTH,

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld



___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread Tiger12506
Cool! Code! (gullom voice - we must look at code, yesss)

 Hi there.

 What this section area does is takes a data file that is comma separated 
 and
 imports - there is a unique ID in the first field and a code in the second
 field that corresponds to a certain section of information. What I need 
 from
 this is for the process to role up against unique ID all section holdings
 withot duplicates, report on section combinations, and overal section
 counts. In addtion I need the ability to assigna value for page count to
 these sections and have the ability to uploada translation file just in 
 case
 a section is identiifed by multiple values that needs to be normalized to 
 a
 single unique value.

 Sorry for the lengthly code response - all commenst are appreciated - as
 mentioned I am quite new with Python - it is doing what I need it to do 
 but
 I think that it is a mess and needs to be cleaned up a little.

 Thanks for any comments.

 GTXY20

 import sys
 import os
 class __analysis:
def __init__(self):
print '***Analysis Tool***'
datafile=raw_input('data file name:')
self.datafile=datafile
self.parsefile()

Ouch. Usually in OOP, one never puts any user interaction into a class. 
Anytime you want to make an __analysis object, code execution is halted unto 
the user responds Yuck. If necessary, then make the datafile a parameter 
you pass in and do the raw_input outside of the class definition before you 
instantiate an object.
As an aside... setting the filename to datafile and then datafile directly 
to self.datafile is a little wasteful when you can set it directly to 
self.datafile

 # script to import unitID section data and section page count reference 
 and
 create a sorted dictionary
 # where in uhdata{} key=unitID and value=unitID section holdings
 # where in pgcnt{} key=Section and value=page count

def parsefile(self):
try:
uhdatafile = open(self.datafile, 'r')
records = uhdatafile.read()
uhdatafile.close()
lines = records.split()

Ummm this will split on every space also. Using the name 'lines' as an 
indicator I think you were expecting this to actually split on newline 
characters, so instead you should use uhdatafile.readlines()

self.uhdata={}
for line in lines:

Here you iterate over a created list. It is easier and more efficient to 
iterate over the file itself.
for line in uhdatafile:
   do_stuff()
uhdatafile.close()

uh, tf = line.split(',')
if uh in self.uhdata:
f=self.uhdata[uh]
if tf not in f:
f.append(tf)
else:
self.uhdata[uh]=[tf]

This can read

try:
  f = self.uhdata[uh]
except KeyError:
  self.uhdata[uh] = []
finally:
  self.uhdata[uh].append(tf)


Also note that there is no error checking in the algorithm anywhere here. 
The file must be exactly formatted or it will create an error.

for uh, Sections in self.uhdata.items():
Sections.sort()

This will not work. In the documentation, it says that dictionary object 
return a *copy* of the (key,value) pairs.
You are sorting those *copies*.  Also sinces uh is not used here 
itervalues() is sufficient.

for Sections in self.uhdata.itervalues():
  Sections.sort()


except IOError:
print 'file not found check file name'
analysis()

ftranslateok=raw_input('would you like to translate section codes?
 (y/n):')
if ftranslateok == 'y':
self.transFn()
else:
pass

This is unnecessary, code execution will continue whether you use else: pass 
or not.

pgcountok=raw_input('would you like to assign section page counts?
 (y/n):')
if pgcountok == 'y':
self.setPageCounts()
else:
missingpgcounts={}
fmissingpgcounts=[]
for x in self.uhdata:
for f in self.uhdata[x]:
if f not in fmissingpgcounts:
fmissingpgcounts.append(f)

fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in 
fmissingpgcounts]
Ahhh... This looks like a set.
fmissingpgcounts = set(self.uhdata.itervalues())

for x in fmissingpgcounts:
missingpgcounts[x]=0
self.pgcounts = missingpgcounts

Oh my. So we have a dictionary, create a unique list of the values and then 
create another dictionary from that?
The intermediate variable missingpgcounts is not necessary.

self.pgcounts = dict((x,0) for x in fmissingpgcounts)

fdistmodel=raw_input('would you like to define max section
 distribution cut off? (y/n):')
if fdistmodel == 'y':
self.fdistmax=raw_input('what is the max distributions before a
 full book?:')
self.fdistmax=int(self.fdistmax)
self.Sectiondistmax()
else:

Re: [Tutor] Review and criticism of python project

2008-01-04 Thread Kent Johnson
Tiger12506 wrote:

 Ouch. Usually in OOP, one never puts any user interaction into a class. 

That seems a bit strongly put to me. Generally user interaction should 
be separated from functional classes but you might have a class to help 
with command line interaction (e.g. the cmd module in the std lib) and 
GUI frameworks are usually built with classes.


uh, tf = line.split(',')
if uh in self.uhdata:
f=self.uhdata[uh]
if tf not in f:
f.append(tf)
else:
self.uhdata[uh]=[tf]
 
 This can read
 
 try:
   f = self.uhdata[uh]
 except KeyError:
   self.uhdata[uh] = []
 finally:
   self.uhdata[uh].append(tf)

These are not equivalent - the original code avoids duplicates in 
self.uhdata[uh].

Possibly self.uhdata[uh] could be a set instead of a list. Then this 
could be written very nicely using defaultdict:

from collections import defaultdict
   ...
   self.uhdata = defaultdict(set)
   ...
   self.uhdata[uh].add(tf)

for uh, Sections in self.uhdata.items():
Sections.sort()
 
 This will not work. In the documentation, it says that dictionary object 
 return a *copy* of the (key,value) pairs.
 You are sorting those *copies*.  

No, the original code is fine. The docs say that dict.items() returns a 
copy of a's list of (key, value) pairs. This is a little misleading, 
perhaps. A dict does not actually contain a list of key, value pairs, it 
is implemented with a hash table. dict.items() returns a new list 
containing the key, value pairs.

But the keys and values in the new list are references to the same keys 
and values in the dict. So mutating the values in the returned list, via 
sort(), will also mutate the values in the dict.


missingpgcounts={}
fmissingpgcounts=[]
for x in self.uhdata:
for f in self.uhdata[x]:
if f not in fmissingpgcounts:
fmissingpgcounts.append(f)
 
 fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in 
 fmissingpgcounts]
 Ahhh... This looks like a set.
 fmissingpgcounts = set(self.uhdata.itervalues())

Again, this is not quite the same thing. The original code builds a set 
of the contents of the values. You are building a set from the values 
(which are already lists). You still need one loop:
fmissingpgcounts = set()
for v in self.uhdate.itervalues():
   fmissingpgcounts.update(v)

 self.pgcounts = dict((x,0) for x in fmissingpgcounts)

or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0)

That's all I have energy for...
Kent

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-04 Thread GTXY20
thanks for the feedback - i will go through your comments line by line
adjust and test and will re-post when complete.

GTXY20

On Jan 4, 2008 9:29 PM, Kent Johnson [EMAIL PROTECTED] wrote:

 Tiger12506 wrote:

  Ouch. Usually in OOP, one never puts any user interaction into a class.

 That seems a bit strongly put to me. Generally user interaction should
 be separated from functional classes but you might have a class to help
 with command line interaction (e.g. the cmd module in the std lib) and
 GUI frameworks are usually built with classes.


 uh, tf = line.split(',')
 if uh in self.uhdata:
 f=self.uhdata[uh]
 if tf not in f:
 f.append(tf)
 else:
 self.uhdata[uh]=[tf]
 
  This can read
 
  try:
f = self.uhdata[uh]
  except KeyError:
self.uhdata[uh] = []
  finally:
self.uhdata[uh].append(tf)

 These are not equivalent - the original code avoids duplicates in
 self.uhdata[uh].

 Possibly self.uhdata[uh] could be a set instead of a list. Then this
 could be written very nicely using defaultdict:

 from collections import defaultdict
   ...
   self.uhdata = defaultdict(set)
   ...
   self.uhdata[uh].add(tf)

 for uh, Sections in self.uhdata.items():
 Sections.sort()
 
  This will not work. In the documentation, it says that dictionary object
  return a *copy* of the (key,value) pairs.
  You are sorting those *copies*.

 No, the original code is fine. The docs say that dict.items() returns a
 copy of a's list of (key, value) pairs. This is a little misleading,
 perhaps. A dict does not actually contain a list of key, value pairs, it
 is implemented with a hash table. dict.items() returns a new list
 containing the key, value pairs.

 But the keys and values in the new list are references to the same keys
 and values in the dict. So mutating the values in the returned list, via
 sort(), will also mutate the values in the dict.


 missingpgcounts={}
 fmissingpgcounts=[]
 for x in self.uhdata:
 for f in self.uhdata[x]:
 if f not in fmissingpgcounts:
 fmissingpgcounts.append(f)
 
  fmissingpgcounts = [f for f in self.uhdata.itervalues() if f not in
  fmissingpgcounts]
  Ahhh... This looks like a set.
  fmissingpgcounts = set(self.uhdata.itervalues())

 Again, this is not quite the same thing. The original code builds a set
 of the contents of the values. You are building a set from the values
 (which are already lists). You still need one loop:
 fmissingpgcounts = set()
 for v in self.uhdate.itervalues():
   fmissingpgcounts.update(v)

  self.pgcounts = dict((x,0) for x in fmissingpgcounts)

 or, self.pgcounts = dict.fromkeys(fmissingpgcounts, 0)

 That's all I have energy for...
 Kent


___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-03 Thread bhaaluu
If you have a web page, you can upload the code to your web page,
then post here with a link to the code and a request for reviews.
That's one way to do it.
-- 
b h a a l u u at g m a i l dot c o m

On Jan 3, 2008 5:00 PM, GTXY20 [EMAIL PROTECTED] wrote:

 Hello all,

 Is there a forum or group where I can upload my python project for review?

 I am new at Python and at this point my program is doing what it needs to I
 just can't help but feeling I have some errors or improper coding going on
 inside.

 Any advice is very much appreciated.

 Thanks.

 GTXY20

 ___
 Tutor maillist  -  Tutor@python.org
 http://mail.python.org/mailman/listinfo/tutor


___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Review and criticism of python project

2008-01-03 Thread bob gailer
GTXY20 wrote:

 Hello all,

 Is there a forum or group where I can upload my python project for review?
Others on this list can guide you to web sites for uploading code.

If the program is relatively small you can just email it to this list as 
body text. How small? For me I'd rather wade thru 200 or so lines in an 
email.

 I am new at Python and at this point my program is doing what it needs 
 to I just can't help but feeling I have some errors
Errors? Do you mean bugs that have not yet surfaced but might under some 
conditions?
 or improper coding
Are you concerned about performance, readability  maintainability, 
easier / better ways to do things, conformance to Python coding 
guidelines, etc?

If so this is the right place to get suggestions.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor