On Thu, Jul 19, 2012 at 5:32 PM, Jordan <wolfrage8...@gmail.com> wrote:
>
> I am not sure how to answer that question because all files are binary,
> but the files that I will parse have an encoding that allows them to be
> read in a non-binary output. But my program will not use them in a
> non-binary way, that is why I plan to open them with the 'b' mode to
> open them as binary with no encoding assumed by python. I just have not
> tested this new technique that you gave me on a binary file yet as I was
> still implementing it for strings.

Reading from a file in binary mode returns a bytes object in Python 3.
Since iterating over bytes returns ints, you can cycle the key over the
plain text using zip and compute the XOR without having to convert the
entire message into a single big number in memory. Here's my example
from before, adapted for files:

    >>> from itertools import cycle
    >>> key = b'1234'
    >>> kit = cycle(key)
    >>> with open('temp.txt', 'rb') as f, open('cipher.txt', 'wb') as fo:
    ...     fit = iter(lambda: f.read(512), b'')
    ...     for text in fit:
    ...         fo.write(bytes(x^y for x,y in zip(text, kit)))

Since the input file could be arbitrarily large and lack newlines, I'm
using the two-argument form of "iter" to create a special iterator that
reads 512-byte chunks. The iterator stops when "read" returns an empty
bytes object (i.e. b''). You could use a while loop instead. Note that
"kit" is created once, outside the loop, so the position in the key
carries over from one chunk to the next.

I assume here that the key is possibly shorter than the message (e.g.
encrypting 1 megabyte of text with a 128-byte key). If you're making a
one-time pad, I think the key is the same length as the message. In that
case you wouldn't have to worry about cycling it.

Anyway, I'm not particularly interested in cryptography. I'm just trying
to help with the operations.
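
For the while-loop alternative mentioned above, here is a rough sketch of how it might look when wrapped in a function. The name "xor_file" and the "chunk_size" parameter are my own inventions for illustration, and the empty-key check is an extra assumption on my part; treat this as one possible way to write it, not the only one.

```python
from itertools import cycle

def xor_file(src_path, dst_path, key, chunk_size=512):
    """XOR every byte of src_path with a repeating key and write to dst_path.

    xor_file and chunk_size are hypothetical names used for this sketch.
    """
    if not key:
        # cycle(b'') yields nothing, so zip would silently write an empty file
        raise ValueError("key must not be empty")
    # One iterator shared by all chunks, so the key position carries over
    kit = cycle(key)
    with open(src_path, 'rb') as f, open(dst_path, 'wb') as fo:
        while True:
            text = f.read(chunk_size)
            if not text:  # read() returns b'' at end of file
                break
            # text comes first in zip, so zip stops on the exhausted chunk
            # before pulling an extra byte from the key iterator
            fo.write(bytes(x ^ y for x, y in zip(text, kit)))
```

Since XOR is its own inverse, running the function a second time on the output with the same key should give back the original bytes, e.g. xor_file('temp.txt', 'cipher.txt', b'1234') followed by xor_file('cipher.txt', 'plain.txt', b'1234').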