Dear tutors: I have two files. I want to take the coordinates of a row in fileA and check whether they fall within the range of coordinates in fileB. If they do, I want to map them; otherwise, pass. Thanks, Kumar
file a:

name  loc  x         y
a     4    40811596  40811620
b     4    40811619  40811643
c     4    40811649  40811673
d     4    40811734  40811758
e     4    40811797  40811821
f     4    40811817  40811841
g     4    40811895  40811919
h     4    40811938  40811962

file b:

             zx        zy
z1  4  +  40810323  40812000
z2  4  +  40810323  40812000
z3  4  +  40810323  40812000
z4  4  +  40810323  40812000
z5  4  +  40810323  40812000
z6  4  +  40810323  40812000
z7  4  +  40810323  40812000
z8  4  +  40810323  40812000

I want to take coordinates x and y from each row in file a and check whether they are in the range zx to zy. If they are in range, I want to write both matched rows as a single tab-delimited row.

my code:

    f1 = open('fileA','r')
    f2 = open('fileB','r')
    da = f1.read().split('\n')
    dat = da[:-1]
    ba = f2.read().split('\n')
    bat = ba[:-1]
    for m in dat:
        col = m.split('\t')
        for j in bat:
            cols = j.split('\t')
            # compare the loc column of both rows
            if col[1] == cols[1]:
                # file b rows are: name, loc, strand, zx, zy
                xc = int(cols[3])
                yc = int(cols[4])
                if int(col[2]) in xrange(xc,yc):
                    if int(col[3]) in xrange(xc,yc):
                        print m+'\t'+j

output:

a 4 40811596 40811620 z1 4 + 40810323 40812000

This code is too slow. Could you experts help me make the script a lot faster? Each file has over 50K rows and the script runs very slowly. Please help.

thanks
Kumar
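
A possible direction for the speed-up, offered only as a sketch and not as a definitive answer. Two things in the loop above look expensive: every row of file b is split and converted again for every row of file a, and in Python 2 a test like `n in xrange(xc, yc)` walks the range one value at a time instead of comparing against the two endpoints. The sketch below parses file b once, groups its rows by loc, and uses plain comparisons. The function names `parse_b` and `match_rows` are made up for illustration, and the code assumes tab-delimited lines with no header row and no blank lines, with file b laid out as name, loc, strand, zx, zy. It keeps the same bounds as `xrange(xc, yc)`, so zy itself is excluded.

```python
from collections import defaultdict


def parse_b(b_lines):
    """Group file b rows by loc: {loc: [(zx, zy, original_line), ...]}.

    Assumed column layout (hypothetical, taken from the sample rows):
    name, loc, strand, zx, zy.
    """
    by_loc = defaultdict(list)
    for line in b_lines:
        cols = line.split('\t')
        by_loc[cols[1]].append((int(cols[3]), int(cols[4]), line))
    return by_loc


def match_rows(a_lines, b_lines):
    """Return tab-joined 'a row + b row' strings for every contained pair."""
    by_loc = parse_b(b_lines)  # file b is parsed only once
    matched = []
    for line in a_lines:
        cols = line.split('\t')
        x, y = int(cols[2]), int(cols[3])
        # Only look at file b rows that share this row's loc.
        for zx, zy, b_line in by_loc.get(cols[1], ()):
            # Same bounds as xrange(zx, zy): zx included, zy excluded.
            if zx <= x < zy and zx <= y < zy:
                matched.append(line + '\t' + b_line)
    return matched
```

To use it, read each file into a list of lines with the trailing newline removed (for example `[l.rstrip('\n') for l in open('fileA')]`) and print or write the strings that `match_rows` returns. If many file b rows share the same loc, the inner loop is still proportional to that count, so sorting the intervals per loc and searching with the `bisect` module might help further; that is not shown here.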