Re: [Tutor] Help with strings and lists.
> Date: Fri, 14 Jul 2006 16:43:32 +1200
> From: "John Fouhy" <[EMAIL PROTECTED]>
>
> Let me attempt to be the first to say:
>
> String substitutions!!!
>
> The docs are here: http://docs.python.org/lib/typesseq-strings.html

Thanks John, that looks way better. I'll also check out the page.

Alan.
___
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Help with strings and lists
> Date: Fri, 14 Jul 2006 12:12:42 +0200
> From: János Juhász <[EMAIL PROTECTED]>
> Subject: [Tutor] Help with strings and lists.
> To: <[EMAIL PROTECTED]>, tutor@python.org
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset="iso-8859-2"
>
> Dear Alan,
>
> You will probably be interested in list comprehensions and zip(), as they
> can simplify all similar tasks.

Hi Janos,

What a sexy piece of code! Oh man, I have a lot to learn.

The code didn't justify the fields as required, so I combined your code with something from Kent. The code is now:

    def format(value, width):
        if value.isdigit():
            return value.rjust(width) + ' '
        else:
            return value.ljust(width)

    s = ('Monday 7373 3663657 2272 547757699 reached 100%',
         'Tuesday 7726347 552 766463 2253 under-achieved 0%',
         'Wednesday 9899898 8488947 6472 77449 reached 100%',
         'Thursday 636648 553 22344 5699 under-achieved 0%',
         'Friday 997 3647757 78736632 357599 over-achieved 200%')

    table = [line.split() for line in s]
    transposed = zip(*(table))
    maxWidths = [max([len(str(cell)) for cell in column]) for column in transposed]
    print '\n'.join([' '.join([format(field, w) for (field, w) in zip(line, maxWidths)])
                     for line in table])

I noticed that if one line has fewer items than the rest (i.e. one fewer column), that column is ignored for all lines. I'll work around that somehow.

Thanks, I will stare in wonder at that code for years to come.
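On the short-row problem mentioned above: zip() stops at its shortest argument, so one row with a missing field drops that column from the transposed table. Here is a minimal sketch of one possible workaround, not necessarily the fix that was eventually used. It assumes Python 3 (print() as a function, itertools.zip_longest; Python 2 calls it izip_longest) and pads missing cells with empty strings. The names format_cell and render and the two sample rows are hypothetical, chosen only for illustration.

```python
from itertools import zip_longest


def format_cell(value, width):
    # Digits are right-justified, everything else left-justified.
    if value.isdigit():
        return value.rjust(width)
    return value.ljust(width)


def render(lines):
    table = [line.split() for line in lines]
    # zip_longest fills short rows with '' instead of truncating
    # every column to the length of the shortest row.
    transposed = list(zip_longest(*table, fillvalue=''))
    widths = [max(len(cell) for cell in column) for column in transposed]
    # Transposing back gives rows that all have the same number of cells.
    padded_rows = zip(*transposed)
    return '\n'.join(
        ' '.join(format_cell(cell, w) for cell, w in zip(row, widths)).rstrip()
        for row in padded_rows
    )


# Hypothetical sample: the second row is missing its last field.
sample = ('Monday 7373 reached 100%',
          'Tuesday 552 under-achieved')
print(render(sample))
```

The rstrip() only removes the padding that the empty filler cell would otherwise leave at the end of a short row.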
Re: [Tutor] Help with strings and lists.
Alan Collins wrote:
> Hi,
>
> I do a fair bit of data manipulation and decided to try one of my
> favourite utilities in Python. I'd really appreciate some optimization
> of the script. I'm sure that I've missed many tricks in even this short
> script.
>
> Let's say you have a file with this data:
>
> Monday 7373 3663657 2272 547757699 reached 100%
> Tuesday 7726347 552 766463 2253 under-achieved 0%
> Wednesday 9899898 8488947 6472 77449 reached 100%
> Thursday 636648 553 22344 5699 under-achieved 0%
> Friday 997 3647757 78736632 357599 over-achieved 200%
>
> You now want columns 1, 5, and 7 printed and aligned (much like a
> spreadsheet). For example:
>
> Monday    547757699 100%
> Wednesday     77449 100%
> ...
>
> This script does the job, but I reckon there are better ways. In the
> interests of brevity, I have dropped the command-line argument handling
> and hard-coded the columns for the test and I hard-coded the input file
> name.
>

You might like to see how it is done in this recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/267662

> ---
> """
> PrintColumns
>
> Print specified columns, alignment based on data type.
>
> The script works by parsing the input file twice. The first pass gets
> the maximum length of all values on the columns. This value is used to
> pad the column on the second pass.
> """
> import sys
>
> columns = [0]   # hard-code the columns to be printed.
> colwidth = [0]  # list into which the maximum field lengths will be stored.
>
> """
> This part is clunky. Can't think of another way to do it without making
> the script somewhat longer and slower. What it does is that if the user
> specifies column 0, all columns will be printed. This bit builds up the
> list of columns, from 1 to 100.
> """
>
> if columns[0] == 0:
>     columns = [1]
>     while len(columns) < 100:
>         columns.append(len(columns)+1)
>

columns = range(1, 101)

> """
> First pass. Read all lines and determine the maximum width of each
> selected column.
> """
> infile = file("mylist", "r")
> indata = infile.readlines()
> for myline in indata:
>     mycolumns = myline.split()
>     colindex = 0
>     for column in columns:
>         if column <= len(mycolumns):
>             if len(colwidth)-1 < colindex:
>                 colwidth.append(len(mycolumns[column-1]))
>             else:
>                 if colwidth[colindex] < len(mycolumns[column-1]):
>                     colwidth[colindex] = len(mycolumns[column-1])
>         colindex += 1
> infile.close()
>
> """
> Second pass. Read all lines and print the selected columns. Text values
> are left justified, while numeric values are right justified.
> """
> infile = file("mylist", "r")
> indata = infile.readlines()
>

No need to read the file again, you still have indata.

> for myline in indata:
>     mycolumns = myline.split()
>     colindex = 0
>     for column in columns:
>         if column <= len(mycolumns):
>             if mycolumns[column-1].isdigit():
>                 x = mycolumns[column-1].rjust(colwidth[colindex]) + ' '
>             else:
>                 x = mycolumns[column-1].ljust(colwidth[colindex]+1)
>             print x,
>         colindex += 1
>     print ""
> infile.close()
>

Hmm...you really should make columns be the correct length. If you use a list comp to make colwidth then you can just make columns the same length as colwidth.

Then if you make a helper function for the formatting

    def format(value, width):
        if value.isdigit():
            return value.rjust(width) + ' '
        else:
            return value.ljust(width)

Now the formatting becomes

    values = [ format(column[i], colwidth[i]) for i in columns ]

which you print with

    print ''.join(values)

Kent

> ---
>
> Any help greatly appreciated.
> Regards,
> Alan.
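To make the suggestions above concrete, here is a minimal sketch of how they might fit together: read the data once, build the widths with a list comprehension, and format through a helper. It is an illustration under assumptions, not Kent's or Alan's actual code. It assumes Python 3 (print() as a function) and that every line contains all the requested columns. The names format_cell and select_and_align are hypothetical (the helper is renamed so it does not shadow the built-in format()), and the data is given inline rather than read from the "mylist" file.

```python
def format_cell(value, width):
    # Numeric fields are right-justified, text fields left-justified.
    if value.isdigit():
        return value.rjust(width)
    return value.ljust(width)


def select_and_align(lines, wanted):
    """Return the wanted (1-based) columns of each line, aligned."""
    # Single pass over the input: split once and keep only the wanted fields.
    rows = [[fields[i - 1] for i in wanted]
            for fields in (line.split() for line in lines)]
    # One width per selected column, so widths and columns stay in step.
    widths = [max(len(row[k]) for row in rows) for k in range(len(wanted))]
    return [' '.join(format_cell(v, w) for v, w in zip(row, widths)).rstrip()
            for row in rows]


data = [
    'Monday 7373 3663657 2272 547757699 reached 100%',
    'Wednesday 9899898 8488947 6472 77449 reached 100%',
]
for line in select_and_align(data, [1, 5, 7]):
    print(line)
```

Because the selected fields are kept in memory, there is no second read of the file, and the widths list always has exactly one entry per requested column.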
[Tutor] Help with strings and lists.
Dear Alan,

You will probably be interested in list comprehensions and zip(), as they
can simplify all similar tasks.

>>> s = ('Monday 7373 3663657 2272 547757699 reached 100%',
...      'Tuesday 7726347 552 766463 2253 under-achieved 0%',
...      'Wednesday 9899898 8488947 6472 77449 reached 100%',
...      'Thursday 636648 553 22344 5699 under-achieved 0%',
...      'Friday 997 3647757 78736632 357599 over-achieved 200%')
>>> # I made a table from your list
>>> table = [line.split() for line in s]
>>> # it is possible to transpose the table
>>> transposed = zip(*(table))
>>> transposed
[('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'),
 ('7373', '7726347', '9899898', '636648', '997'),
 ('3663657', '552', '8488947', '553', '3647757'),
 ('2272', '766463', '6472', '22344', '78736632'),
 ('547757699', '2253', '77449', '5699', '357599'),
 ('reached', 'under-achieved', 'reached', 'under-achieved', 'over-achieved'),
 ('100%', '0%', '100%', '0%', '200%')]
>>> # calc the max(len(str(cell))) for each row of the transposed table,
>>> # that means for each column in the original table
>>> maxWidths = [max([len(str(cell)) for cell in column]) for column in transposed]
>>> maxWidths
[9, 7, 7, 8, 9, 14, 4]
>>> # format it
>>> [ str.ljust(str(field),w) for (field, w) in zip(table[0], maxWidths)]
['Monday   ', '7373   ', '3663657', '2272    ', '547757699', 'reached       ', '100%']
>>> # join it to give a line
>>> '|'.join([ str.ljust(str(field),w) for (field, w) in zip(table[0], maxWidths)])
'Monday   |7373   |3663657|2272    |547757699|reached       |100%'
>>> # it can be made for all of the lines
>>> '\n'.join(['|'.join([ str.ljust(str(field),w) for (field, w) in zip(line, maxWidths)]) for line in table])
'Monday   |7373   |3663657|2272    |547757699|reached       |100%\nTuesday  |7726347|552    |766463  |2253     |under-achieved|0%  \nWednesday|9899898|8488947|6472    |77449    |reached       |100%\nThursday |636648 |553    |22344   |5699     |under-achieved|0%  \nFriday   |997    |3647757|78736632|357599   |over-achieved |200%'
>>> # and can be printed in this form
>>> print '\n'.join(['|'.join([ str.ljust(str(field),w) for (field, w) in zip(line, maxWidths)]) for line in table])
Monday   |7373   |3663657|2272    |547757699|reached       |100%
Tuesday  |7726347|552    |766463  |2253     |under-achieved|0%
Wednesday|9899898|8488947|6472    |77449    |reached       |100%
Thursday |636648 |553    |22344   |5699     |under-achieved|0%
Friday   |997    |3647757|78736632|357599   |over-achieved |200%
>>>

I know it is a different way of thinking, but it works well with Python.
It is the functional way instead of your procedural one.

> Hi,
> I do a fair bit of data manipulation and decided to try one of my
> favourite utilities in Python. I'd really appreciate some optimization
> of the script. I'm sure that I've missed many tricks in even this short
> script.
> Let's say you have a file with this data:
> Monday 7373 3663657 2272 547757699 reached 100%
> Tuesday 7726347 552 766463 2253 under-achieved 0%
> Wednesday 9899898 8488947 6472 77449 reached 100%
> Thursday 636648 553 22344 5699 under-achieved 0%
> Friday 997 3647757 78736632 357599 over-achieved 200%
> You now want columns 1, 5, and 7 printed and aligned (much like a
> spreadsheet). For example:
> Monday    547757699 100%
> Wednesday     77449 100%
> ...
> This script does the job, but I reckon there are better ways. In the
> interests of brevity, I have dropped the command-line argument handling
> and hard-coded the columns for the test and I hard-coded the input file
> name.
> Any help greatly appreciated.
> Regards,
> Alan.

Best Regards,
János Juhász
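For readers on a newer interpreter, the same transpose-and-measure idea can be sketched as follows. This is a hedged adaptation, not part of the original session: it assumes Python 3, where zip() returns a lazy iterator (hence the list() call) and print is a function. The two-row sample and the names rows, max_widths and lines are shortened, hypothetical stand-ins for the data above.

```python
# Shortened, hypothetical sample with three columns.
rows = ['Monday 7373 reached', 'Tuesday 7726347 under-achieved']

table = [row.split() for row in rows]

# zip(*table) turns rows into columns; list() materialises it because
# zip is lazy in Python 3 and could otherwise be consumed only once.
transposed = list(zip(*table))

# The widest cell in each column decides that column's width.
max_widths = [max(len(cell) for cell in column) for column in transposed]

# Pad every cell to its column width and join the cells with '|'.
lines = ['|'.join(cell.ljust(w) for cell, w in zip(row, max_widths))
         for row in table]
print('\n'.join(lines))
```

Calling cell.ljust(w) on the string directly is equivalent to the str.ljust(str(field), w) form used in the session, since every cell is already a string after split().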
Re: [Tutor] Help with strings and lists.
On 14/07/06, Alan Collins <[EMAIL PROTECTED]> wrote:
> You now want columns 1, 5, and 7 printed and aligned (much like a
> spreadsheet). For example:
>
> Monday    547757699 100%
> Wednesday     77449 100%

Let me attempt to be the first to say:

String substitutions!!!

The docs are here: http://docs.python.org/lib/typesseq-strings.html

At their simplest, you can do things like:

    s = '%s || %s || %s' % (columns[0], columns[4], columns[6])

For the first line of your data, this will give:

    Monday || 547757699 || 100%

Next, you can specify padding as well:

    s = '%12s || %12s || %5s' % (columns[0], columns[4], columns[6])

and the first string will be padded with spaces to 12 characters, the second to 12, and the last to 5.

You can change which side it pads on by specifying negative field widths --- eg, %-12s instead of %12s. (%12s pads on the left and produces "      Monday"; %-12s pads on the right and produces "Monday      ".)

Next, you can tell python to read the field width from a variable:

    s = '%*s || %*s || %*s' % (12, columns[0], 12, columns[4], 5, columns[6])

(ie: width first)

So all you need to do is find the maximum field width (have a look at list comprehensions and the max() function), then use string formatting operators to lay everything out :-)

--
John.
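The padding rules above can be checked with a short sketch. It assumes Python 3 (print() as a function), though the % operator behaves the same way in Python 2. The names fields, right, left, widths and row are hypothetical and chosen only for this illustration.

```python
fields = 'Monday 7373 3663657 2272 547757699 reached 100%'.split()

# A positive width pads on the left, i.e. right-justifies the value.
right = '%12s' % fields[0]
# A negative width pads on the right, i.e. left-justifies the value.
left = '%-12s' % fields[0]

# '*' reads the width from the argument tuple: width first, then the value.
# The '-' flag can be combined with '*' to left-justify a text column.
widths = (9, 9, 4)
row = '%-*s %*s %*s' % (widths[0], fields[0],
                        widths[1], fields[4],
                        widths[2], fields[6])

print(repr(right))
print(repr(left))
print(row)
```

Printing the repr() makes the padding spaces visible, which is an easy way to see which side each format pads on.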
[Tutor] Help with strings and lists.
Hi,

I do a fair bit of data manipulation and decided to try one of my favourite utilities in Python. I'd really appreciate some optimization of the script. I'm sure that I've missed many tricks in even this short script.

Let's say you have a file with this data:

Monday 7373 3663657 2272 547757699 reached 100%
Tuesday 7726347 552 766463 2253 under-achieved 0%
Wednesday 9899898 8488947 6472 77449 reached 100%
Thursday 636648 553 22344 5699 under-achieved 0%
Friday 997 3647757 78736632 357599 over-achieved 200%

You now want columns 1, 5, and 7 printed and aligned (much like a spreadsheet). For example:

Monday    547757699 100%
Wednesday     77449 100%
...

This script does the job, but I reckon there are better ways. In the interests of brevity, I have dropped the command-line argument handling and hard-coded the columns for the test and I hard-coded the input file name.

---
"""
PrintColumns

Print specified columns, alignment based on data type.

The script works by parsing the input file twice. The first pass gets
the maximum length of all values on the columns. This value is used to
pad the column on the second pass.
"""
import sys

columns = [0]   # hard-code the columns to be printed.
colwidth = [0]  # list into which the maximum field lengths will be stored.

"""
This part is clunky. Can't think of another way to do it without making
the script somewhat longer and slower. What it does is that if the user
specifies column 0, all columns will be printed. This bit builds up the
list of columns, from 1 to 100.
"""

if columns[0] == 0:
    columns = [1]
    while len(columns) < 100:
        columns.append(len(columns)+1)

"""
First pass. Read all lines and determine the maximum width of each
selected column.
"""
infile = file("mylist", "r")
indata = infile.readlines()
for myline in indata:
    mycolumns = myline.split()
    colindex = 0
    for column in columns:
        if column <= len(mycolumns):
            if len(colwidth)-1 < colindex:
                colwidth.append(len(mycolumns[column-1]))
            else:
                if colwidth[colindex] < len(mycolumns[column-1]):
                    colwidth[colindex] = len(mycolumns[column-1])
        colindex += 1
infile.close()

"""
Second pass. Read all lines and print the selected columns. Text values
are left justified, while numeric values are right justified.
"""
infile = file("mylist", "r")
indata = infile.readlines()
for myline in indata:
    mycolumns = myline.split()
    colindex = 0
    for column in columns:
        if column <= len(mycolumns):
            if mycolumns[column-1].isdigit():
                x = mycolumns[column-1].rjust(colwidth[colindex]) + ' '
            else:
                x = mycolumns[column-1].ljust(colwidth[colindex]+1)
            print x,
        colindex += 1
    print ""
infile.close()
---

Any help greatly appreciated.
Regards,
Alan.