I have a million-line text file with 100 characters per line, and simply need to determine how many of the lines are distinct.
On my PC, this little program just goes to never-never land:

    def number_distinct(fn):
        f = file(fn)
        x = f.readline().strip()
        L = []
        while x<>'':
            if x not in L:
                L = L + [x]
            x = f.readline().strip()
        return len(L)

Would anyone care to point out improvements? Is there a better algorithm for doing this?
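One likely culprit is the list: `x not in L` scans the whole list for every line, and `L = L + [x]` copies the list each time a new line is found, so the work grows roughly quadratically with the number of distinct lines. A minimal sketch of one alternative, assuming the distinct lines fit in memory (about 100 MB of raw text in the worst case, plus per-string overhead), is to collect the lines in a set, where membership tests take constant time on average. The code below is written for current Python 3 rather than the Python 2 of the original, and the parameter name `path` is only illustrative.

```python
def number_distinct(path):
    """Count distinct lines in a text file, ignoring surrounding whitespace."""
    seen = set()
    with open(path) as f:
        # Iterating over the file object yields one line at a time,
        # so the whole file is never held in memory as a single string.
        for line in f:
            # A set stores each distinct value once; adding a duplicate is a no-op.
            seen.add(line.strip())
    return len(seen)
```

Note one behavioral difference: the original loop stops at the first line that is empty after stripping, whereas this sketch reads to the end of the file and counts an empty line as one more distinct value.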