On May 4, 2006, at 12:12 AM, [EMAIL PROTECTED] wrote:

> hi
> I have a file with columns delimited by '~' like this:
>
> 1SOME STRING ~ABC~12311232432D~20060401~00000000
> 2SOME STRING ~DEF~13534534543C~20060401~00000000
> 3SOME STRING ~ACD~14353453554G~20060401~00000000
>
> .....
>
> What is the pythonic way to sort this type of structured text file?
> Say i want to sort by 2nd column , ie ABC, ACD,DEF ? so that it becomes
>
> 1SOME STRING ~ABC~12311232432D~20060401~00000000
> 3SOME STRING ~ACD~14353453554G~20060401~00000000
> 2SOME STRING ~DEF~13534534543C~20060401~00000000
> ?
> I know for a start, that i have to split on '~', then append all the
> second columns into a list, then sort the list using sort(), but i am
> stuck with how to get the rest of the corresponding columns after the
> sort....
>
> thanks...
>

A couple of ways. Assume that you have the lines in a list called 'lines',
as follows:

    lines = [
        "1SOME STRING ~ABC~12311232432D~20060401~00000000",
        "3SOME STRING ~ACD~14353453554G~20060401~00000000",
        "2SOME STRING ~DEF~13534534543C~20060401~00000000"]

The more traditional way would be to define your own comparison function:

    def my_cmp(x, y):
        return cmp(x.split("~")[1], y.split("~")[1])

    lines.sort(cmp=my_cmp)

The newer, faster way would be to define your own key function:

    def my_key(x):
        return x.split("~")[1]

    lines.sort(key=my_key)

The key function is faster because you only have to do the split("~")[1]
once for each line, whereas it will be done many times for each line if you
use a comparison function.

Jay P.
-- 
http://mail.python.org/mailman/listinfo/python-list
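
For anyone running this on Python 3, where cmp() and the cmp= argument to sort() no longer exist, here is a minimal sketch of both approaches. The helper names column_key and compare_columns are made up for illustration, and the sample lines are the three from the original question. functools.cmp_to_key wraps an old-style comparison function so that it can be passed as key=.

```python
from functools import cmp_to_key

# Sample data copied from the original question (unsorted order).
lines = [
    "1SOME STRING ~ABC~12311232432D~20060401~00000000",
    "2SOME STRING ~DEF~13534534543C~20060401~00000000",
    "3SOME STRING ~ACD~14353453554G~20060401~00000000",
]


def column_key(line, index=1, sep="~"):
    """Return the field at position `index` of a `sep`-delimited line."""
    return line.split(sep)[index]


# Key-function version: column_key is called once per line.
by_key = sorted(lines, key=column_key)


def compare_columns(x, y):
    """Old-style comparison: negative, zero, or positive, like cmp() did."""
    a, b = column_key(x), column_key(y)
    return (a > b) - (a < b)


# Comparison-function version: compare_columns runs on every comparison,
# so each line ends up being split many times during the sort.
by_cmp = sorted(lines, key=cmp_to_key(compare_columns))

for line in by_key:
    print(line)
```

Both results put the lines in ABC, ACD, DEF order. Whole lines are being sorted, so the other columns travel along with the second column, which is the part the original poster was stuck on.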