On Mar 19, 11:55 am, Shane Geiger <[EMAIL PROTECTED]> wrote:
> In the unix world, 'fc' would be like diff.
>
> """
> Python example of checksumming files with the MD5 module.
>
> In Python 2.5, the hashlib module would be preferable/more elegant.
> """
>
> import md5
>
> import string, os
> r = lambda f: open(f, "r").read()
> def readfile(f,strip=False): return (strip and stripper(r(f))) or r(f)
> def writefile(f, data, perms=750): open(f, "w").write(data) and os.chmod(f, perms)
>
> def get_md5(fname):
>     hash = md5.new()
>     contents = readfile(fname)
>     hash.update(contents)
>     value = hash.digest()
>     return (fname, hash.hexdigest())
>
> import glob
>
> for f in glob.glob('*'):
>     print get_md5(f)
>
> > A crude way to check if two files are the same on Windows is to look
> > at the output of the "fc" function of cmd.exe, for example
> >
> > def files_same(f1,f2):
> >     cmnd = "fc " + f1 + " " + f2
> >     return ("no differences" in popen(cmnd).read())
> >
> > This is needlessly slow, because one can stop comparing two files
> > after the first difference is detected. How should one check that
> > files are the same in Python? The files are plain text.
>
> --
> Shane Geiger
> IT Director
> National Council on Economic Education
> [EMAIL PROTECTED] | 402-438-8958 | http://www.ncee.net
>
> Leading the Campaign for Economic and Financial Literacy
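For reference, here is a rough sketch of what the hashlib version mentioned in the quoted docstring might look like. It assumes Python 2.6 or later (for the "with" statement); the function name file_md5 and the 64 KB chunk size are my own choices for illustration, not something from Shane's post.

```python
import hashlib

def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`.

    The file is opened in binary mode and read in chunks, so large
    files are never held in memory all at once. `chunk_size` is an
    arbitrary illustrative default.
    """
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty read means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Note that comparing digests still means reading both files all the way through, so it does not help with stopping at the first difference.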
You can also use Python's file "read" method to read a block of each file in a loop, in binary mode. Something like:

    file1 = open(path1, 'rb')
    file2 = open(path2, 'rb')

    bytes1 = file1.read(blocksize)
    bytes2 = file2.read(blocksize)

And then just compare the two blocks to see if there is a difference. If so, break out of the loop. I saw this concept in the book Programming Python, 3rd Ed., by Lutz.

Have fun!

Mike

--
http://mail.python.org/mailman/listinfo/python-list
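Filling in the loop, a minimal sketch of the block-by-block comparison could look like the following. It assumes Python 2.7 or later (for the two-file "with" statement) and ordinary files on disk; the default blocksize of 64 KB is an arbitrary choice of mine, and the function reuses the name files_same from the original question.

```python
def files_same(path1, path2, blocksize=65536):
    """Return True if the two files have identical contents.

    Both files are read in binary mode, one block at a time, and the
    function returns as soon as a pair of blocks differs.
    """
    with open(path1, 'rb') as file1, open(path2, 'rb') as file2:
        while True:
            bytes1 = file1.read(blocksize)
            bytes2 = file2.read(blocksize)
            if bytes1 != bytes2:
                return False  # first difference found, stop reading
            if not bytes1:
                return True   # both files ended at the same point
```

Because the files are opened in binary mode, two text files that differ only in their line endings will compare as different. The standard library's filecmp.cmp(path1, path2, shallow=False) performs a similar content comparison if you would rather not write the loop yourself.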