["sf" <[EMAIL PROTECTED]>]
>> I have files A, and B each containing say 100,000 lines (each
>> line=one string without any space)
>>
>> I want to do
>>
>>     " A - (A intersection B) "
>>
>> Essentially, want to do efficient grep, i.e. from A remove those
>> lines which are also present in file B.

[Fredrik Lundh]
> that's an unusual definition of "grep", but the following seems to
> do what you want:
>
>     afile = "a.txt"
>     bfile = "b.txt"
>
>     bdict = dict.fromkeys(open(bfile).readlines())
>
>     for line in open(afile):
>         if line not in bdict:
>             print line,
>
> </F>

Note that an open file is an iterable object, yielding the lines in the
file.  The "for" loop exploited that above, but fromkeys() can also exploit
it.  That is,

    bdict = dict.fromkeys(open(bfile))

is good enough (there's no need for the .readlines()).
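
For what it's worth, here is a minimal sketch of the same idea in Python 3 syntax.  It makes two assumptions that are not in the posts above: it uses a set in place of the dict built by fromkeys() (a swapped-in technique with the same kind of hashed membership test), and it wraps the loop in a hypothetical helper named lines_not_in.  The demonstration uses in-memory files so it runs without a.txt and b.txt on disk.

```python
import io


def lines_not_in(a_lines, b_lines):
    """Yield each line of a_lines that does not occur in b_lines.

    Hypothetical helper, not from the thread.  Both arguments are
    iterables of lines, for example open file objects.
    """
    # Assumption: a set stands in for the dict; only membership is needed.
    b_set = set(b_lines)
    for line in a_lines:
        if line not in b_set:
            yield line


# In-memory stand-ins for the two files; real use would pass
# open("a.txt") and open("b.txt") instead.
a = io.StringIO("apple\nbanana\ncherry\n")
b = io.StringIO("banana\ndate\n")

result = list(lines_not_in(a, b))
print("".join(result), end="")
```

Lines are compared whole, trailing newline included, so a last line that lacks a newline will not match an otherwise identical line that has one.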