Lipska the Kat wrote:

> Greetings Pythoners
> 
> A short while back I posted a message that described a task I had set
> myself. I wanted to implement the following bash shell script in Python
> 
> Here's the script
> 
> sort -nr $1 | head -${2:-10}
> 
> this script takes a filename and an optional number of lines to display
> and sorts the lines in numerical order, printing them to standard out.
> if no optional number of lines are input the script prints 10 lines
> 
> Here's the file.
> 
> 50    Parrots
> 12    Storage Jars
> 6     Lemon Currys
> 2     Pythons
> 14    Spam Fritters
> 23    Flying Circuses
> 1     Meaning Of Life
> 123   Holy Grails
> 76    Secret Policemans Balls
> 8     Something Completely Differents
> 12    Lives of Brian
> 49    Spatulas
> 
> 
> ... and here's my very first attempt at a Python program
> I'd be interested to know what you think, you can't hurt my feelings
> just be brutal (but fair). There is very little error checking as you
> can see and I'm sure you can crash the program easily.
> 'Better' implementations most welcome

> #! /usr/bin/env python3.2
> 
> import fileinput
> from sys import argv
> from operator import itemgetter
> 
> l=[]
> t = tuple
> filename=argv[1]
> lineCount=10
> 
> with fileinput.input(files=(filename)) as f:

Note that (filename) is not a tuple, just a string surrounded by superfluous 
parens. 

>>> filename = "foo.bar"
>>> (filename)
'foo.bar'
>>> (filename,)
('foo.bar',)
>>> filename,
('foo.bar',)

You are lucky that FileInput() tests if its files argument is just a single 
string.
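Here is a rough sketch of why that check matters. normalize_files() is a hypothetical helper written for illustration, not fileinput's actual code. It wraps a lone string in a tuple, because code that simply loops over its files argument would otherwise see one character at a time.

```python
def normalize_files(files):
    """Hypothetical helper: always return a tuple of filenames."""
    if isinstance(files, str):
        # A lone string is treated as a single filename.
        return (files,)
    return tuple(files)


filename = "foo.bar"
print(normalize_files((filename)))   # parens alone change nothing
print(normalize_files((filename,)))  # a real one-element tuple
print(list(filename)[:3])            # what naive iteration over a string yields
```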

>         for line in f:
>                 t=(line.split('\t'))
>                 t[0]=int(t[0])
>                 l.append(t)
>         l=sorted(l, key=itemgetter(0))
> 
>         try:    
>                 inCount = int(argv[2])
>                 lineCount = inCount
>         except IndexError:
>                 #just catch the error and continue              
>                 None
> 
>         for c in range(lineCount):
>                 t=l[c]
>                 print(t[0], t[1], sep='\t', end='')
> 

I prefer a more structured approach even for such a tiny program:

- process all commandline args
- read data
- sort
- clip extra lines
- write data

I'd break it into these functions:

def get_commandline_args():
    """Recommended library: argparse.
       Its FileType can deal with stdin/stdout.
    """

def get_quantity(line):
    return int(line.split("\t", 1)[0])

def sorted_by_quantity(lines):
    """Leaves the lines intact, so you don't 
       have to reassemble them later on."""
    return sorted(lines, key=get_quantity)

def head(lines, count):
    """Have a look at itertools.islice() for a more
       general approach"""
    return lines[:count]

if __name__ == "__main__":
    # protecting the script body allows you to import
    # the script as a library into other programs
    # and reuse its functions and classes.
    # Also: play nice with pydoc. Try
    # $ python -m pydoc -w ./yourscript.py

    args = get_commandline_args()
    with args.infile as f:
        lines = sorted_by_quantity(f)
    with args.outfile as f:
        f.writelines(head(lines, args.line_count))
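The one function left empty above is get_commandline_args(). A minimal sketch of how it might look with argparse follows. The attribute names infile, outfile and line_count are taken from the script body; the -o/--outfile option and the optional argv parameter are my own assumptions, so adapt them to taste.

```python
import argparse
import sys


def get_commandline_args(argv=None):
    """Sketch only: parse a filename and an optional line count."""
    parser = argparse.ArgumentParser(
        description="Print the lines with the smallest quantities.")
    # FileType maps "-" to stdin (mode "r") or stdout (mode "w").
    parser.add_argument("infile", type=argparse.FileType("r"),
                        help="input file, or - for stdin")
    # Optional positional, defaults to 10 like the shell script.
    parser.add_argument("line_count", nargs="?", type=int, default=10,
                        help="number of lines to print (default: 10)")
    # Assumed option, not in the original script.
    parser.add_argument("-o", "--outfile", type=argparse.FileType("w"),
                        default=sys.stdout,
                        help="output file (default: stdout)")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

Passing an explicit list, as in get_commandline_args(["-", "3"]), makes the function easy to try out from the interactive interpreter.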

Note that if you want to handle large files gracefully you need to recombine 
sorted_by_quantity() and head() (have a look at heapq.nsmallest() which was 
already mentioned in the other thread).
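A minimal sketch of that recombination, assuming tab-separated lines as in your program. smallest_by_quantity() is a name I made up; heapq.nsmallest() accepts any iterable plus a key function, so the file never has to be held in memory as one big sorted list.

```python
import heapq


def get_quantity(line):
    # The quantity is the first tab-separated field.
    return int(line.split("\t", 1)[0])


def smallest_by_quantity(lines, count):
    """Return the count lines with the smallest quantities, in ascending order."""
    return heapq.nsmallest(count, lines, key=get_quantity)


if __name__ == "__main__":
    sample = ["50\tParrots\n", "12\tStorage Jars\n",
              "6\tLemon Currys\n", "2\tPythons\n"]
    print("".join(smallest_by_quantity(sample, 2)), end="")
```

If you want the behaviour of sort -nr (largest first), heapq.nlargest() takes the same arguments.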

