[Tutor] (no subject)

kumar s Thu, 07 Jan 2010 14:12:02 -0800

dear tutors:
I have two files. I want to take the coordinates of a row in fileA and find out whether they 
fall within the range of coordinates in fileB. If they do, I want to map the two rows together; 
otherwise, skip the row. 
thanks
kumar


file a:
name     loc          x       y
a       4       40811596        40811620
b       4       40811619        40811643
c       4       40811649        40811673
d       4       40811734        40811758
e       4       40811797        40811821
f       4       40811817        40811841
g       4       40811895        40811919
h       4       40811938        40811962



file b:

                        zx              zy
z1      4       +       40810323        40812000
z2      4       +       40810323        40812000
z3      4       +       40810323        40812000
z4      4       +       40810323        40812000
z5      4       +       40810323        40812000
z6      4       +       40810323        40812000
z7      4       +       40810323        40812000
z8      4       +       40810323        40812000




I want to take coordinates x and y from each row in file a and check whether they 
are within the range of zx and zy. If they are, I want to write 
both matched rows as a single tab-delimited row. 
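
The check described above can be sketched as a small function. This is only an illustration under assumptions: the names `contains` and `join_rows` are made up here, the columns are assumed to be tab-separated in the order shown in the samples, x is assumed to be no larger than y, and the end value zy is treated as excluded, matching the posted code.

```python
def contains(zx, zy, x, y):
    """Return True if x and y both lie in the half-open range [zx, zy).

    Assumes x <= y, so checking the two outer bounds is enough.
    """
    return zx <= x and y < zy


def join_rows(row_a, row_b):
    """Return both rows as one tab-delimited row if row_a falls inside row_b.

    Assumed layouts (taken from the samples, not confirmed):
      row_a: name, loc, x, y
      row_b: name, loc, +, zx, zy
    Returns None when the loc differs or the coordinates are out of range.
    """
    a = row_a.split('\t')
    b = row_b.split('\t')
    if a[1] != b[1]:
        return None
    if contains(int(b[3]), int(b[4]), int(a[2]), int(a[3])):
        return row_a + '\t' + row_b
    return None


row_a = 'a\t4\t40811596\t40811620'
row_b = 'z1\t4\t+\t40810323\t40812000'
print(join_rows(row_a, row_b))
```

Comparing the integers directly avoids building or scanning a range for every test.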


my code:

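# Read both files and drop empty lines (such as the one after the final newline).
with open('fileA', 'r') as f1:
        dat = [line for line in f1.read().split('\n') if line]
with open('fileB', 'r') as f2:
        bat = [line for line in f2.read().split('\n') if line]


for m in dat:
        col = m.split('\t')
        # Convert the fileA coordinates once per row, not once per comparison.
        x = int(col[2])
        y = int(col[3])
        for j in bat:
                cols = j.split('\t')
                if col[1] == cols[1]:
                        # In the fileB sample, column 2 is '+', so zx and zy
                        # are taken from columns 3 and 4.
                        xc = int(cols[3])
                        yc = int(cols[4])
                        # Same test as 'in xrange(xc, yc)', which scans the
                        # whole range in Python 2, but done in constant time.
                        if xc <= x < yc and xc <= y < yc:
                                print(m + '\t' + j)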

output:
a       4       40811596        40811620    z1 4 +  40810323     40812000



This code is too slow. Could you experts help me make the script run a lot faster? 
Each file has over 50K rows, and the script runs very slowly. 

Please help. 

thanks
Kumar
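
One possible direction for the speed-up asked about above, offered as a sketch and not a definitive answer: parse fileB once, group its rows by the loc column, sort each group by zx, and use `bisect` to skip rows whose zx is already larger than x. The function names `build_index` and `find_matches` are hypothetical, the column positions are assumed from the samples, and zy is treated as excluded, as in the posted code. If many fileB rows overlap a given position, the inner loop still visits each of them, so the gain depends on the data.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(b_rows):
    """Group fileB rows by loc; each group is sorted by zx.

    Returns (index, starts): index[loc] is a list of (zx, zy, row) tuples,
    and starts[loc] is the matching sorted list of zx values for bisect.
    """
    index = defaultdict(list)
    for row in b_rows:
        cols = row.split('\t')
        index[cols[1]].append((int(cols[3]), int(cols[4]), row))
    starts = {}
    for loc in index:
        index[loc].sort()
        starts[loc] = [entry[0] for entry in index[loc]]
    return index, starts


def find_matches(a_rows, b_rows):
    """Return tab-joined rows for every fileA row contained in a fileB row."""
    index, starts = build_index(b_rows)
    out = []
    for row in a_rows:
        col = row.split('\t')
        loc, x, y = col[1], int(col[2]), int(col[3])
        if loc not in index:
            continue
        # Only entries with zx <= x can contain x; bisect finds that prefix.
        hi = bisect_right(starts[loc], x)
        for zx, zy, b_row in index[loc][:hi]:
            if y < zy:  # assumes x <= y, so x < zy follows
                out.append(row + '\t' + b_row)
    return out


a_rows = ['a\t4\t40811596\t40811620', 'r\t4\t40810000\t40810400']
b_rows = ['z1\t4\t+\t40810323\t40812000', 'z2\t4\t+\t40811600\t40812000']
for line in find_matches(a_rows, b_rows):
    print(line)
```

With the two small lists above, only row a paired with z1 is printed: z2 starts after a's x, and r starts before either range.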


      

