Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
On Sat, 2005-12-31 at 09:33 -0500, Kent Johnson wrote:
> Could be
>     self.results[key] = [0*24]

    [0]*24

Excellent points and advice, just noting a typo.

--
Lloyd Kvam
Venix Corp

___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
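[Editor's note: the two spellings behave very differently, and the aliasing warning in Kent's message is easy to reproduce. The sketch below is only an illustration in present-day Python; the names typo, buckets, shared, aliased and fresh are made up for it and do not come from the thread.]

```python
# [0*24] multiplies first and then wraps the result: a one-element list.
typo = [0 * 24]
# [0]*24 repeats the one-element list 24 times.
buckets = [0] * 24

# Why the default list should not be built just once: every key ends up
# pointing at the same list object.
shared = [0] * 24
aliased = {}
aliased.setdefault("a", shared)[0] += 5
aliased.setdefault("b", shared)[0] += 1   # also changes aliased["a"]

# Building a fresh list on each call keeps the entries independent.
fresh = {}
fresh.setdefault("a", [0] * 24)[0] += 5
fresh.setdefault("b", [0] * 24)[0] += 1
```

The cost of the fresh-list version is one throwaway list per call when the key already exists, which is the performance caveat Kent mentions.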
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
Paul Kraus wrote:
> So now for all my reports all I have to write are little 5 or 6 line scripts
> that take a text file split the fields and format them before basing them off
> into my custom object. Very slick and this is my first python project. Its
> cluttered and messy but for about 1 hours worth of work on a brand new
> language I am impressed with the usability of this language.

I'm impressed with how fast you have learned it!

> Current Script - Critique away! :)

A number of comments below...

> =-=-=-=-=--=-=
> #!/usr/bin/python
> import string
> import re
>
> class Tbred_24Months:
>     def __init__(self,currentyear,currentmonth): ### Takes Ending Year and Ending Month Inits List
>         guide = []
>         self.results = {}
>         self.guide = {}
>         self.end_month = currentmonth
>         self.end_year = currentyear

You never use these two attributes ^^

>         self.start_month = currentmonth
>         self.start_year = currentyear

start_month and start_year can be local variables, you don't use them
outside this method.

>         for count in range(24):
>             guide.append((self.start_month,self.start_year))
>             self.start_month -= 1
>             if self.start_month < 1:
>                 self.start_month = 12
>                 self.start_year -= 1
>         guide.reverse()
>         count = 0
>         for key in guide:
>             self.guide[key[1],key[0]]=count
>             count += 1

You can unpack key in the for statement, giving names to the elements:

    for start_month, start_year in guide:
        self.guide[start_year, start_month] = count

You can generate count automatically with enumerate():

    for count, (start_month, start_year) in enumerate(guide):
        self.guide[start_year, start_month] = count

If you built guide with tuples in the order you actually use them (why
not?) you could build self.guide as easily as

    self.guide = dict( (key, count) for count, key in enumerate(guide) )

For that matter, why not just build self.guide instead of guide, in the
first loop?

    for index in range(23, -1, -1):
        self.guide[start_year, start_month] = index
        start_month -= 1
        if start_month < 1:
            start_month = 12
            start_year -= 1

I found the use of both guide and self.guide confusing but it seems you
don't really need guide at all.

>         self.sortedkeys = self.guide.keys()
>         self.sortedkeys.sort()

You don't use sortedkeys

>
>     def insert(self,key,year,month,number):
>         if self.guide.has_key((year,month)):
>             if self.results.has_key(key):
>                 seq = self.guide[(year,month)]
>                 self.results[key][seq] += number
>             else:
>                 self.results[key] = []
>                 for x in range(24):self.results[key].append(0)

Could be

    self.results[key] = [0*24]

Is there a bug here? You don't add in number when the key is not there.

I like to use dict.setdefault() in cases like this:

    if self.guide.has_key((year,month)):
        seq = self.guide[(year,month)]
        self.results.setdefault(key, [0*24])[seq] += number

though if performance is a critical issue your way might be better - it
doesn't have to build the default list every time. (Don't try to build
the default list just once, you will end up with every entry aliased to
the same list!) (I hesitate to include this note at all - you will only
notice a difference in performance if you are doing this operation many
many times!)

>
> def splitfields(record):
>     fields = []
>     datestring=''
>     ### Regular Expr.
>     re_negnum = re.compile('(\d?)\.(\d+)(-)')
>     re_date = re.compile('(\d\d)/(\d\d)/(\d\d)')
>     for element in record.split('|'):
>         element=element.strip() # remove leading/trailing whitespace
>         ### Move Neg Sign from right to left of number
>         negnum_match = re_negnum.search( element )
>         if negnum_match:
>             if negnum_match.group(1):element = "%s%d.%02d" %(negnum_match.group(3),int(negnum_match.group(1)),int(negnum_match.group(2)))
>             else:element = "%s0.%02d" %(negnum_match.group(3),int(negnum_match.group(2)))

This is a lot of work just to move a - sign. How about

    if element.endswith('-'):
        element = '-' + element[:-1]

>         ### Format Date
>         date_match = re_date.search(element)
>         if date_match:
>             (month,day,year) = (date_match.group(1),date_match.group(2),date_match.group(3))
>             ### Convert 2 year date to 4 year
>             if int(year) > 80:year = "19%02d" %int(year)
>             else:year = "20%02d" %int(year)
>             element = (year,month,day)

You go to a lot of trouble to turn your year into a string, then you
convert it back to an int later. Why not just keep it as an int? Do you
know that Python data structures (lists, dicts, tuples) are
heterogeneous? You can have a list that
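[Editor's note: pulling the review comments above together, the rolling-window class might end up looking roughly like the sketch below. It is written for present-day Python and is not code from the thread: the names RollingMonths and move_trailing_minus, the months parameter, and the sample vendor code 'V100' are all invented for illustration. It keys the guide by (year, month), which is what insert() looks up, and it adds the number even when the key is new.]

```python
class RollingMonths:
    """Accumulate amounts into the `months` months ending at (end_year, end_month)."""

    def __init__(self, end_year, end_month, months=24):
        self.months = months
        self.guide = {}      # (year, month) -> bucket index, oldest month is 0
        self.results = {}    # key -> list of `months` running totals
        year, month = end_year, end_month
        # Walk backwards from the ending month, filling indexes high to low.
        for index in range(months - 1, -1, -1):
            self.guide[year, month] = index
            month -= 1
            if month < 1:
                month = 12
                year -= 1

    def insert(self, key, year, month, number):
        """Add number to key's bucket; dates outside the window are ignored."""
        seq = self.guide.get((year, month))
        if seq is None:
            return
        # A fresh default list per call avoids aliasing between keys.
        self.results.setdefault(key, [0] * self.months)[seq] += number


def move_trailing_minus(element):
    """Turn '12.50-' into '-12.50'; other strings pass through unchanged."""
    if element.endswith('-'):
        return '-' + element[:-1]
    return element


sales = RollingMonths(2005, 11)
sales.insert('V100', 2005, 11, 10.0)   # newest bucket, index 23
sales.insert('V100', 2003, 12, 2.5)    # oldest bucket, index 0
sales.insert('V100', 2003, 11, 99.0)   # outside the window, ignored
```

Silently ignoring out-of-window dates mirrors the has_key() guard in the original insert(); raising an error instead would be an equally reasonable choice.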
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
[snip]
Paul Kraus wrote:
> Now I have to find a way to take the output at the end and pipe it
> out to an external Perl program that creates an excel spreadsheet (
> no real clean easy way to do this in python but hey each tool has its
> usefulness). I wish I could hide this in the object though so that I
> could call a "dump" method that would then create the spreadsheet. I
> will have to play with this later.
[snip]

Hi Paul,

For the Excel portion of things, you may want to take a look at
pyExcelerator.

http://sourceforge.net/projects/pyexcelerator

I had never tried it (until I read your post), so I thought I'd give it
a try. There are others on the list who have used it (John Fouhy & Bob
Gailer, I believe, maybe others??).

It seems very good at writing Excel files. It does not require COM or
anything such as that, but you will need Python 2.4. It is supposed to
run on both Windows and *nix (I've only tested it on Windows).

I've no idea what your output data looks like, but here's a small
example which uses the fields (from your previous code) and writes them
to an Excel file.

    from pyExcelerator.Workbook import *

    wb = Workbook()
    ws0 = wb.add_sheet('Vendor_Sales')

    fields = \
        'vendor,otype,oreturn,discountable,discperc,amount,date'.split(',')

    row = 0
    col = -1
    for field in fields:
        col += 1
        ws0.write(row, col, field)

    wb.save('Vendor_Sales.xls')

It's a small download (approx. 260 KB) and easy to install (python
setup.py install). It seems a little 'light' in the documentation area,
but there are various examples in the zip file.

HTH,

Bill
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
> That is the approach Paul took originally (see the other fork of this
> thread). He is accumulating a sparse 3d matrix where the keys are year,
> field6 and month. (He hasn't said what field6 represents.) The problem
> is that he wants to print out counts corresponding to all the existing
> year and field6 values and every possible month value. To do this I
> think a two-level data structure is appropriate, such as the dict[
> (year, field6) ][month] approach you outlined.

Field6 is just an arbitrary field that represents some generic key. So
for clarity let's say in this instance it represents a customer code.

So dict[(2005, 12130)][0..11] would hold sales by month for customer
number 12130 in 2005.

This problem has evolved since it started and I have created a class
that lets me build rolling 24 month data structures. I need to create a
bunch of reports that will be run monthly that will show a rolling 24
month total for different things. Sales by customer, sales by vendor,
purchases by vendor.

So by making a class that on construction takes the current year and
month it will build the structure I need. I then have a method that lets
me fill the monthly buckets. All I do is pass it the arbitrary key
(customercode), year, month, and amount and it will increment that
bucket.

So now for all my reports all I have to write are little 5 or 6 line
scripts that take a text file split the fields and format them before
basing them off into my custom object. Very slick and this is my first
python project. It's cluttered and messy but for about 1 hour's worth of
work on a brand new language I am impressed with the usability of this
language.

Now I have to find a way to take the output at the end and pipe it out
to an external Perl program that creates an excel spreadsheet ( no real
clean easy way to do this in python but hey each tool has its
usefulness). I wish I could hide this in the object though so that I
could call a "dump" method that would then create the spreadsheet. I
will have to play with this later.

Current Script - Critique away! :)

=-=-=-=-=--=-=
#!/usr/bin/python
import string
import re

class Tbred_24Months:
    def __init__(self,currentyear,currentmonth): ### Takes Ending Year and Ending Month Inits List
        guide = []
        self.results = {}
        self.guide = {}
        self.end_month = currentmonth
        self.end_year = currentyear
        self.start_month = currentmonth
        self.start_year = currentyear
        for count in range(24):
            guide.append((self.start_month,self.start_year))
            self.start_month -= 1
            if self.start_month < 1:
                self.start_month = 12
                self.start_year -= 1
        guide.reverse()
        count = 0
        for key in guide:
            self.guide[key[1],key[0]]=count
            count += 1
        self.sortedkeys = self.guide.keys()
        self.sortedkeys.sort()

    def insert(self,key,year,month,number):
        if self.guide.has_key((year,month)):
            if self.results.has_key(key):
                seq = self.guide[(year,month)]
                self.results[key][seq] += number
            else:
                self.results[key] = []
                for x in range(24):self.results[key].append(0)

def splitfields(record):
    fields = []
    datestring=''
    ### Regular Expr.
    re_negnum = re.compile('(\d?)\.(\d+)(-)')
    re_date = re.compile('(\d\d)/(\d\d)/(\d\d)')
    for element in record.split('|'):
        element=element.strip() # remove leading/trailing whitespace
        ### Move Neg Sign from right to left of number
        negnum_match = re_negnum.search( element )
        if negnum_match:
            if negnum_match.group(1):element = "%s%d.%02d" %(negnum_match.group(3),int(negnum_match.group(1)),int(negnum_match.group(2)))
            else:element = "%s0.%02d" %(negnum_match.group(3),int(negnum_match.group(2)))
        ### Format Date
        date_match = re_date.search(element)
        if date_match:
            (month,day,year) = (date_match.group(1),date_match.group(2),date_match.group(3))
            ### Convert 2 year date to 4 year
            if int(year) > 80:year = "19%02d" %int(year)
            else:year = "20%02d" %int(year)
            element = (year,month,day)
        if element == '.000': element = 0.00
        fields.append( element )
    return fields

### Build Vendor Sales
sales = Tbred_24Months(2005,11)
vendorsales = open('vendorsales.txt','r')
for line in vendorsales:
    fields = splitfields( line )
    if len(fields) == 7:
        (vendor,otype,oreturn,discountable,discperc,amount,date) = fields
        amount = float(amount);discperc = float(discperc)
        #if discperc and discountable == 'Y': amount = amount - ( amount * (discperc/100) )
        if otype == 'C' or oreturn == 'Y':amount = amount * -1
        sales.insert(vendor,int(date[0]),int(date[1]),amount)
result = ''
for key in sales.results:
    sum =
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
Danny Yoo wrote:
> This being said, what data are you modeling? It almost sounds like you're
> implementing some kind of 3d matrix, even a sparse one. More information
> about the data you're modelling might lead to a different representation.
>
> For example, would using a dictionary whose keys are three-tuples be
> appropriate?

That is the approach Paul took originally (see the other fork of this
thread). He is accumulating a sparse 3d matrix where the keys are year,
field6 and month. (He hasn't said what field6 represents.) The problem
is that he wants to print out counts corresponding to all the existing
year and field6 values and every possible month value. To do this I
think a two-level data structure is appropriate, such as the dict[
(year, field6) ][month] approach you outlined.

Kent
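[Editor's note: the two-level structure Kent describes, a dict keyed by (year, field6) whose values are twelve-slot lists, could be sketched as follows. This is an illustration in present-day Python under assumed names: add_sale, report_lines and the customer code '12130' used as field6 are hypothetical, and months are taken to be numbered 1 to 12.]

```python
def add_sale(totals, year, code, month, amount):
    """month is 1-12; totals maps (year, code) -> list of 12 monthly sums."""
    months = totals.setdefault((year, code), [0] * 12)
    months[month - 1] += amount


def report_lines(totals):
    """One pipe-delimited line per (year, code), always with 12 month columns."""
    lines = []
    for (year, code), months in sorted(totals.items()):
        cells = [str(year), str(code)] + [str(m) for m in months]
        lines.append('|'.join(cells))
    return lines


totals = {}
add_sale(totals, 2005, '12130', 1, 5)
add_sale(totals, 2005, '12130', 1, 2)
add_sale(totals, 2005, '12130', 12, 4)
for line in report_lines(totals):
    print(line)
```

Because every (year, code) entry owns a full list, months with no sales print as 0 without any extra lookups, and each pair appears on exactly one line.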
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
> ok so assuming I had a dictionary with 1 key that contained a list like
> so... dictionary[key1][0]
>
> How would I increment it or assign it if it didn't exist. I assumed like
> this. dict[key1][0] = dictionary.get(key1[0],0) + X

Hi Paul,

Given a dictionary d and some arbitrary key k, let's assume two
possibilities:

    1. k is a key in d. If so, no problem, and we'll assume that d[k]
       is associated to some twelve-element list.

    2. k isn't yet a key in d. This is the situation we want to work
       on.

We can go the uncomplicated route and just say something like this:

    if k not in d:
        d[k] = [0] * 12

in which case we change Possibility 2 to Possibility 1: the end result
is that we make sure that d[k] points to a twelve-element list of zeros.
Once we guarantee this, we can go on our merry way. For example:

    if k not in d:
        d[k] = [0] * 12
    d[k][0] += 1

There is a one-liner way of doing this: we can use the setdefault()
method of dictionaries.

    d.setdefault(k, [0] * 12)[0] += 1

This is a bit dense. There's no shame in not using setdefault() if you
don't want to. *grin* I actually prefer the longer approach unless I'm
really in a hurry.

This being said, what data are you modeling? It almost sounds like
you're implementing some kind of 3d matrix, even a sparse one. More
information about the data you're modelling might lead to a different
representation.

For example, would using a dictionary whose keys are three-tuples be
appropriate?

##
def increment(d, x, y, z):
    """Given a dictionary whose keys are 3-tuples, increments at the
    position (x, y, z)."""
    d[(x, y, z)] = d.get((x, y, z), 0) + 1
##

Here's this code in action:

###
>>> map = {}
>>> increment(map, 3, 1, 4)
>>> map
{(3, 1, 4): 1}
>>> increment(map, 2, 7, 1)
>>> map
{(2, 7, 1): 1, (3, 1, 4): 1}
>>> increment(map, 2, 7, 1)
>>> map
{(2, 7, 1): 2, (3, 1, 4): 1}
###

Best of wishes to you!
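[Editor's note: Python versions released after this thread (2.5 onward) also offer collections.defaultdict, which packages the "create the list if the key is missing" step. The sketch below is an alternative to the setdefault() one-liner, not something proposed in the thread; the keys 'k1' and 'k2' are placeholders.]

```python
from collections import defaultdict

# Each missing key gets its own fresh twelve-element list on first access.
d = defaultdict(lambda: [0] * 12)

d['k1'][0] += 1      # 'k1' is created on the fly, then incremented
d['k1'][0] += 1
d['k2'][11] += 5     # a separate list, so 'k1' is unaffected
```

One thing to keep in mind with this approach: merely reading d[some_key] creates the entry, so membership tests should use "some_key in d" rather than indexing.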
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
On Wednesday 28 December 2005 11:30 am, Kent Johnson wrote:
> Python lists don't create new elements on assignment (I think Perl lists
> do this?) so for example
> dictionary[(key1, key2)][10] = X

ok so assuming I had a dictionary with 1 key that contained a list like
so... dictionary[key1][0]

How would I increment it or assign it if it didn't exist. I assumed like
this. dict[key1][0] = dictionary.get(key1[0],0) + X

--
Paul Kraus
=-=-=-=-=-=-=-=-=-=-=
PEL Supply Company
Network Administrator
216.267.5775 Voice
216.267.6176 Fax
www.pelsupply.com
=-=-=-=-=-=-=-=-=-=-=
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
Paul Kraus wrote:
> Here is the code that I used. Its functional and it works but there has got
> to be some better ways to do a lot of this. Transversing the data structure
> still seems like I have to be doing it the hard way.
>
> The input data file has fixed width fields that are delimited by pipe.
> So depending on the justification for each field it will either have leading
> or ending whitespace.
> #
> import re
> import string
> results = {}
> def format_date(datestring):
>     (period,day,year) = map(int,datestring.split('/') )
>     period += 2
>     if period == 13: period = 1; year += 1
>     if period == 14: period = 2; year += 1

    if period > 12: period -= 12; year += 1

>     if year > 80:
>         year = '19%02d' % year
>     else:
>         year = '20%02d' % year
>     return (year,period)
>
> def format_qty(qty,credit,oreturn):
>     qty = float(qty)
>     if credit == 'C' or oreturn == 'Y':
>         return qty * -1
>     else:
>         return qty
>
> textfile = open('orders.txt','r')
> for line in textfile:
>     fields = map( string.strip, line.split( '|' ) )
>     fields[4] = format_qty(fields[ 4 ],fields[ 1 ], fields[ 2 ] )

    qty = format_qty(fields[ 4 ],fields[ 1 ], fields[ 2 ] )

would be clearer in subsequent code.

>     (year, period) = format_date( fields[7] )
>     for count in range(12):
>         if count == period:
>             if results.get( ( year, fields[6], count), 0):
>                 results[ year,fields[6], count] += fields[4]
>             else:
>                 results[ year,fields[6],count] = fields[4]

The loop on count is not doing anything, you can use period directly.
And the test on results.get() is not needed, it is safe to always add:

    key = (year, fields[6], period)
    results[key] = results.get(key, 0) + qty

>
> sortedkeys = results.keys()
> sortedkeys.sort()
>
> for keys in sortedkeys:
>     res_string = keys[0]+'|'+keys[1]
>     for count in range(12):
>         if results.get((keys[0],keys[1],count),0):
>             res_string += '|'+str(results[keys[0],keys[1],count])
>         else:
>             res_string += '|0'
>     print res_string

This will give you duplicate outputs if you ever have more than one
period for a given year and field[6] (whatever that is...). OTOH if you
just show the existing keys you will not have entries for the 0 keys. So
maybe you should go back to your original idea of using a 12-element
list for the counts.

Anyway in the above code the test on results.get() is not needed since
you just use the default value in the else:

    res_string += '|' + str(results.get((keys[0],keys[1],count),0))
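[Editor's note: a compact version of the date handling and accumulation discussed above might look like the sketch below. It is an illustration in present-day Python, not the thread's code. The sample dates and the vendor codes 'V1' and 'V2' are invented, accumulate() is a hypothetical helper name, and the sketch expands the century before rolling the period forward and keeps the year as an int, which is a choice made here and differs from the quoted script for two-digit years that roll past 99.]

```python
def format_date(datestring):
    """'MM/DD/YY' -> (year, period) with the period shifted two months ahead."""
    month, day, year = (int(part) for part in datestring.split('/'))
    year += 1900 if year > 80 else 2000   # same pivot as the quoted script
    period = month + 2
    if period > 12:                       # one test covers 13 and 14
        period -= 12
        year += 1
    return year, period


def accumulate(results, key, qty):
    """Add qty under key, starting from 0 when the key is new."""
    results[key] = results.get(key, 0) + qty


results = {}
for date, vendor, qty in [('11/15/05', 'V1', 3.0),
                          ('11/20/05', 'V1', 2.0),
                          ('12/01/99', 'V2', 1.0)]:
    year, period = format_date(date)
    accumulate(results, (year, vendor, period), qty)
```

Using results.get(key, 0) also avoids a subtle trap in the quoted test: a stored total of exactly 0 is falsy, so the if/else version would overwrite it instead of adding to it (harmless here, but easy to trip over).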
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
Paul Kraus wrote:
> I am trying to build a data structure that would be a dictionary of a
> dictionary of a list.
>
> In Perl I would build the structure like so $dictionary{key1}{key2}[0] = X
> I would iterate like so ...
> foreach my $key1 ( sort keys %dictionary ) {
>     foreach my $key2 ( sort keys %{$dictionary{$key1}} ) {
>         foreach my $element ( @{$dictionary{$key1}{$key2} } ) {
>             print "$key1 => $key2 => $element\n";
>         }
>     }
> }
>
> Sorry for the Perl reference but its the language I am coming from. I use
> data structures like this all the time. I don't always iterate them like
> this but If i can learn to build these and move through them in python then
> a good portion of the Perl apps I am using can be ported.
>
> Playing around I have come up with this but have no clue how to iterate over
> it or if its the best way. It seems "clunky" but it is most likely my lack of
> understanding.
>
> dictionary[(key1,key2)]=[ a,b,c,d,e,f,g ... ]
>
> This i think gives me a dictionary with two keys ( not sure how to do
> anything useful with it though) and a list.

This gives you a dict whose keys are the tuple (key1, key2). Since
tuples sort in lexicographical order you could print this out sorted by
key1, then key2 with

    for (key1, key2), value in sorted(dictionary.iteritems()):
        for element in value:
            print key1, '=>', key2, '=>', element

(Wow, Python code that is shorter than the equivalent Perl? There must
be some mistake! ;)

>
> Now I am not sure how I can assign one element at a time to the list.

Assuming the list already has an element 0, use

    dictionary[(key1, key2)][0] = X

Python lists don't create new elements on assignment (I think Perl lists
do this?) so for example

    dictionary[(key1, key2)][10] = X

will fail if the list doesn't already have 11 elements or more. You can
use list.append() or pre-populate the list with default values depending
on your application.

Kent
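[Editor's note: the loop above is written for the Python of 2005. In present-day Python the same idea uses items() and print(); the sketch below also demonstrates the failing index assignment Kent describes. The sample keys and values are made up for illustration.]

```python
dictionary = {
    ('b', 'x'): [1, 2],
    ('a', 'y'): [3],
    ('a', 'x'): [4, 5],
}

# Tuples compare element by element, so sorting the items orders by
# key1 first and key2 second.
lines = []
for (key1, key2), value in sorted(dictionary.items()):
    for element in value:
        lines.append('%s => %s => %s' % (key1, key2, element))
print('\n'.join(lines))

# Assigning past the end of a list raises IndexError instead of growing it.
try:
    dictionary[('a', 'y')][10] = 99
    grew = True
except IndexError:
    grew = False

# Pre-populating the list makes index assignment safe.
dictionary[('a', 'y')] = [0] * 12
dictionary[('a', 'y')][10] = 99
```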
Re: [Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
On Wednesday 28 December 2005 10:18 am, Paul Kraus wrote:
> I am trying to build a data structure that would be a dictionary of a
> dictionary of a list.
>
> In Perl I would build the structure like so $dictionary{key1}{key2}[0] = X
> I would iterate like so ...
> foreach my $key1 ( sort keys %dictionary ) {
>     foreach my $key2 ( sort keys %{$dictionary{$key1}} ) {
>         foreach my $element ( @{$dictionary{$key1}{$key2} } ) {
>             print "$key1 => $key2 => $element\n";
>         }
>     }
> }
>

Here is the code that I used. It's functional and it works but there has
got to be some better ways to do a lot of this. Traversing the data
structure still seems like I have to be doing it the hard way.

The input data file has fixed width fields that are delimited by pipe.
So depending on the justification for each field it will either have
leading or ending whitespace.

TIA,
Paul

#!/usr/bin/python
#
## Paul D. Kraus - 2005-12-27
## parse.py - Parse Text File
## Pipe delimited '|'
#
## Fields: CustCode    [0]
##       : OrdrType    [1]
##       : OrdrReturn  [2]
##       : State       [3]
##       : QtyShipped  [4]
##       : VendCode    [5]
##       : InvoiceDate [7]
#
import re
import string
results = {}
def format_date(datestring):
    (period,day,year) = map(int,datestring.split('/') )
    period += 2
    if period == 13: period = 1; year += 1
    if period == 14: period = 2; year += 1
    if year > 80:
        year = '19%02d' % year
    else:
        year = '20%02d' % year
    return (year,period)

def format_qty(qty,credit,oreturn):
    qty = float(qty)
    if credit == 'C' or oreturn == 'Y':
        return qty * -1
    else:
        return qty

textfile = open('orders.txt','r')
for line in textfile:
    fields = map( string.strip, line.split( '|' ) )
    fields[4] = format_qty(fields[ 4 ],fields[ 1 ], fields[ 2 ] )
    (year, period) = format_date( fields[7] )
    for count in range(12):
        if count == period:
            if results.get( ( year, fields[6], count), 0):
                results[ year,fields[6], count] += fields[4]
            else:
                results[ year,fields[6],count] = fields[4]

sortedkeys = results.keys()
sortedkeys.sort()

for keys in sortedkeys:
    res_string = keys[0]+'|'+keys[1]
    for count in range(12):
        if results.get((keys[0],keys[1],count),0):
            res_string += '|'+str(results[keys[0],keys[1],count])
        else:
            res_string += '|0'
    print res_string
[Tutor] Multi-Dimensional Dictionary that contains a 12 element list.
I am trying to build a data structure that would be a dictionary of a
dictionary of a list.

In Perl I would build the structure like so $dictionary{key1}{key2}[0] = X
I would iterate like so ...
foreach my $key1 ( sort keys %dictionary ) {
    foreach my $key2 ( sort keys %{$dictionary{$key1}} ) {
        foreach my $element ( @{$dictionary{$key1}{$key2} } ) {
            print "$key1 => $key2 => $element\n";
        }
    }
}

Sorry for the Perl reference but it's the language I am coming from. I
use data structures like this all the time. I don't always iterate them
like this but if I can learn to build these and move through them in
Python then a good portion of the Perl apps I am using can be ported.

Playing around I have come up with this but have no clue how to iterate
over it or if it's the best way. It seems "clunky" but it is most likely
my lack of understanding.

dictionary[(key1,key2)]=[ a,b,c,d,e,f,g ... ]

This I think gives me a dictionary with two keys ( not sure how to do
anything useful with it though) and a list.

Now I am not sure how I can assign one element at a time to the list.

Here is the pseudo code.

read text file.
split line from text file into list of fields.
One of the fields contains the date. Split the date into two fields Year
and Month/Period.
Build data structure that is a dictionary based on year, based on
period, based on item code then store/increment the units sold based on
period.

dictionary[(year,period)] = [ jan, feb, mar, apr, may, jun, july, aug, sep,
oct, nov ,dec]

I would prefer to have the months just be an array index 0 through 11
and when it reads the file it increments the number contained there.

TIA,
--
Paul Kraus
=-=-=-=-=-=-=-=-=-=-=
PEL Supply Company
Network Administrator
216.267.5775 Voice
216.267.6176 Fax
www.pelsupply.com
=-=-=-=-=-=-=-=-=-=-=
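[Editor's note: for readers arriving from Perl, a literal dictionary of a dictionary of a list is also possible in Python; the inner containers just have to be created explicitly. The sketch below is one way to do that in present-day Python. The helper name store, its size parameter and the sample keys are invented for illustration and do not appear in the thread.]

```python
dictionary = {}


def store(dictionary, key1, key2, index, value, size=12):
    """Rough equivalent of Perl's $dictionary{key1}{key2}[index] = value."""
    inner = dictionary.setdefault(key1, {})        # create the inner dict if needed
    slots = inner.setdefault(key2, [0] * size)     # create the list if needed
    slots[index] = value


store(dictionary, 2005, 'widget', 0, 7)
store(dictionary, 2005, 'gadget', 11, 3)
store(dictionary, 2004, 'widget', 5, 1)

# Same three nested loops as the Perl version, with sorted keys.
lines = []
for key1 in sorted(dictionary):
    for key2 in sorted(dictionary[key1]):
        for element in dictionary[key1][key2]:
            lines.append('%s => %s => %s' % (key1, key2, element))
```

Whether this nesting or the tuple-keyed dict discussed elsewhere in the thread reads better depends on how the data is consumed; the tuple form needs fewer helper steps, while the nested form makes per-key1 lookups direct.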