On 4/5/2010 1:15 AM, TGW wrote:
Sorry - my mistake - try:

infile = open("filex")
match_zips = open("zippys")
result = [line for line in infile if line in match_zips]
print result
When I apply readlines to the original file, it takes a lot longer to process and the outfile still remains blank. Any suggestions?

OK - you handled the problem regarding reading to end-of-file. Yes, it takes a lot longer, because now you are actually iterating through the match_zips list for each line of the input file.
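
To illustrate the end-of-file point, here is a minimal sketch (Python 3 syntax). It assumes the earlier version tested membership directly against the open file object; the in-memory io.StringIO and the sample zip codes are stand-ins I made up, not the real zips.txt.

```python
import io

# Hypothetical stand-in for zips.txt, held in memory so the sketch runs anywhere.
zips = io.StringIO("11111\n22222\n33333\n")

# "in" on a file-like object reads forward until it finds a match or hits end-of-file.
first = "33333\n" in zips    # reads all three lines, matching on the last one
second = "11111\n" in zips   # nothing is left to read, so this is False

# A list built with readlines() can be searched as often as needed.
zips.seek(0)
match_zips = zips.readlines()
third = "33333\n" in match_zips
fourth = "11111\n" in match_zips

print(first, second, third, fourth)
```

The price of the list version is that every test may walk the whole list, which is why the run time went up.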

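How large are these files? Consider creating a set from match_zips. A set membership test takes roughly constant time on average, while a list membership test compares against the elements one by one, so the set pulls further ahead as the list gets longer.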
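
A rough sketch of the difference (Python 3 syntax). The 50,000 generated zip strings and the names zip_list and zip_set are made up for illustration; actual timings depend on your machine and data.

```python
import timeit

# Hypothetical data: 50,000 five-character zip strings.
zip_list = ["%05d" % n for n in range(50000)]
zip_set = set(zip_list)

target = "49999"  # last element, the worst case for the list scan

in_list = target in zip_list  # compares against the elements one by one
in_set = target in zip_set    # a hash lookup, independent of size on average

# Time 100 lookups each way; the numbers are only indicative.
list_time = timeit.timeit(lambda: target in zip_list, number=100)
set_time = timeit.timeit(lambda: target in zip_set, number=100)
print("list: %.6f s   set: %.6f s" % (list_time, set_time))
```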

If the outfile is empty that means that line[149:154] is never in match_zips.

I suggest you take a look at match_zips. You will find a list of strings of length 6 (presumably the 5-digit zip plus the trailing newline that readlines() keeps), which cannot match line[149:154], a string of length 5.
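
A small sketch of the mismatch (Python 3 syntax), assuming zips.txt holds one 5-digit zip per line. The sample zips and the padded record are invented for illustration.

```python
import io

# Hypothetical content of zips.txt: one zip code per line.
zips = io.StringIO("87101\n87102\n")
match_zips = zips.readlines()   # each entry still ends with "\n"

# Hypothetical fixed-width record with the zip code in columns 149-153.
record = "x" * 149 + "87101" + " rest of record\n"
field = record[149:154]

found_raw = field in match_zips              # 5 characters vs. 6: no match
cleaned = [z.strip() for z in match_zips]    # drop the newline
found_clean = field in cleaned

print(repr(match_zips[0]), len(match_zips[0]), repr(field), len(field))
print(found_raw, found_clean)
```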


#!/usr/bin/env python
# Find records that match zipcodes in zips.txt

import os
import sys

def main():
    infile = open("/Users/tgw/NM_2010/NM_APR.txt", "r")
    outfile = open("zip_match_apr_2010.txt", "w")
    zips = open("zips.txt", "r")
    match_zips = zips.readlines()
    lines = [ line for line in infile if line[149:154] in match_zips ]

    outfile.write(''.join(lines))
#    print line[149:154]
    print lines
    infile.close()
    outfile.close()
main()
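
One way the script above might be reworked along the lines suggested earlier in this reply: strip the newline from each zip and keep the zips in a set. This is a sketch in Python 3 syntax, not a tested replacement; the function name filter_by_zip and its parameters are my own, and the default slice 149:154 is taken from the script.

```python
def filter_by_zip(in_path, zips_path, out_path, start=149, end=154):
    """Copy records whose zip field is listed in zips_path; return the count."""
    # Strip the trailing newline so each entry is the bare zip code,
    # and use a set so each membership test is fast.
    with open(zips_path) as zips:
        match_zips = set(line.strip() for line in zips if line.strip())

    count = 0
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            if line[start:end] in match_zips:
                outfile.write(line)
                count += 1
    return count
```

It could be called with the file names from the script, for example filter_by_zip("/Users/tgw/NM_2010/NM_APR.txt", "zips.txt", "zip_match_apr_2010.txt"). The with blocks close all three files, including zips, which the original leaves open.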





--
Bob Gailer
919-636-4239
Chapel Hill NC

