vox wrote:
Hi,
I'm constructing a simple compare-script and thought I would use set
([]) to generate the difference output. But I'm obviously doing
something wrong.

file1 contains 410 rows.
file2 contains 386 rows.
I want to know what rows are in file1 but not in file2.

This is my script:
s1 = set(open("file1"))
s2 = set(open("file2"))
s3 = set([])
s1temp = set([])
s2temp = set([])

s1temp = set(i.strip() for i in s1)
s2temp = set(i.strip() for i in s2)
s3 = s1temp-s2temp

print len(s3)

Output is 119. AFAIK 410-386=24. What am I doing wrong here?

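That arithmetic only holds if every line in s2 is also in s1. If there are lines in s2 that are not in s1, then the number of lines in s1 but not in s2 will be larger than 24. s1 - s2 removes only the intersection of s1 and s2 from s1, so len(s1 - s2) equals len(s1) - len(s1 & s2), not len(s1) - len(s2). Note also that a set drops duplicate lines, so the sets may hold fewer than 410 and 386 elements to begin with.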
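
A small sketch of the point, using made-up data rather than your files (the set contents below are hypothetical, and it is written for Python 3, so print is a function):

```python
# Hypothetical stand-ins for the stripped lines of file1 and file2.
s1 = {"a", "b", "c", "d", "e"}   # 5 distinct lines
s2 = {"d", "e", "x", "y"}        # 4 distinct lines, two of them not in s1

only_in_s1 = s1 - s2   # lines in s1 but not in s2
common = s1 & s2       # lines present in both
only_in_s2 = s2 - s1   # lines in s2 but not in s1

# The difference removes only the shared lines from s1.
assert len(only_in_s1) == len(s1) - len(common)

print(len(s1) - len(s2))   # naive count: 1
print(len(only_in_s1))     # actual count: 3
print(len(only_in_s2))     # lines that inflate the result: 2
```

Printing len(s2temp - s1temp) and len(s1temp & s2temp) for your real files should show where the 119 comes from.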

--
http://mail.python.org/mailman/listinfo/python-list
