Rob Andrews wrote:
> I'm trying to think of the best way to go about this one, as the files
> I have to sort are *big*.
>
> They're ASCII files with each row consisting of a series of
> fixed-length fields, each of which has a corresponding format file.
> (To be specific, these files are FirstLogic compatible.)
>
> I'm looking to sort files such that I can produce the 50,000 records
> with the highest "score" in a certain field.
>
> A grossly over-simplified example is:
>
> "JohnDoe   3.14123 Anywhere St."
> "MarySmith11.03One Jackson Pl. "
>
> ------------------------------------------------------------
>>>> for x in people: # substituting 'people' for a file of records
>     print x[9:14]
>
>  3.14
> 11.03
> ------------------------------------------------------------
>
> With this in mind, I'm trying to sort the file by the value of the
> number in the field represented by x[9:14] in the example here.

If the files fit in memory, you can define a function that returns the key value and use it for the sort. If lines is a list of strings in the above format:

    def myKey(line):
        return float(line[9:14])

    lines.sort(key=myKey)

The sort is ascending, so the 50,000 highest scores end up as the last 50,000 items, lines[-50000:].

Or you can use John's suggestion of splitting the lines, but that may not be needed in this case.

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
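
If the files do not fit in memory, one possible variation (not something suggested in the thread, just a sketch) is to let heapq.nlargest pick the top records while reading the file a line at a time, so only the best n lines are held at once. The helper names score and top_records are made up for illustration, the sample rows assume the score sits in columns 9 through 13 as in the example above, and the code is written to run on a current Python rather than the one used in the quoted session.

```python
import heapq

def score(line):
    # Same key as myKey above: assumes the score occupies line[9:14].
    return float(line[9:14])

def top_records(lines, n=50000):
    # heapq.nlargest consumes any iterable (a list or an open file),
    # keeps at most n items in memory, and returns them sorted from
    # the highest score to the lowest.
    return heapq.nlargest(n, lines, key=score)

# Two sample rows padded to the fixed-width layout assumed above.
sample = [
    "JohnDoe   3.14123 Anywhere St.",
    "MarySmith11.03One Jackson Pl. ",
]
best = top_records(sample, n=1)
```

With a real file, the open file object can be passed in place of sample, since iterating over it yields one line at a time; each returned record then keeps its trailing newline.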