It seems unlikely that there is a bug in Python's base64 encode/decode functions, but it is possible.
For me, a long string of the same length you mention survives an encode/decode
pair (a round-trip):
Python 2.3.3 (#1, May 7 2004, 10:31:40)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> b = "x" * 1728512
>>> b.encode("base64").decode("base64") == b
True
>>> import base64
>>> b == base64.decodestring(base64.encodestring(b))
True
I also ran a program which tests this round-trip property for random data
of all lengths from 0 to 4095, and it ran until I stopped it, rather than
printing an AssertionError:
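import os, base64
f = open("/dev/urandom")
while 1:
    # one pass: random data of every length from 0 to 4095
    for l in range(4096):
        b = f.read(l)
        assert b == base64.decodestring(base64.encodestring(b))
    # progress dot on stderr after each complete pass
    os.write(2, ".")
# not reached while the loop above keeps running
print "!"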
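
For anyone on Python 3, here is a rough equivalent of the loop above. It is a sketch under a few assumptions: base64.encodestring and base64.decodestring no longer exist (they were removed in Python 3.9), so it uses their successors base64.encodebytes and base64.decodebytes; it reads random data from os.urandom rather than /dev/urandom; and it stops after a fixed number of passes instead of running forever. The names roundtrip_ok and check_lengths are made up for this example.

```python
import base64
import os


def roundtrip_ok(data: bytes) -> bool:
    """Return True if data survives an encode/decode pair unchanged."""
    return base64.decodebytes(base64.encodebytes(data)) == data


def check_lengths(max_len: int = 4096, passes: int = 1) -> int:
    """Round-trip random data of every length below max_len.

    Raises AssertionError on the first failure; otherwise returns
    the number of inputs checked.
    """
    checked = 0
    for _ in range(passes):
        for n in range(max_len):
            data = os.urandom(n)
            assert roundtrip_ok(data), "round-trip failed at length %d" % n
            checked += 1
    return checked


if __name__ == "__main__":
    print(check_lengths(), "random inputs round-tripped")
    # The same length as the string in the original report.
    print(roundtrip_ok(b"x" * 1728512))
```

Bounding the number of passes makes this usable as an ordinary test; raise passes if you want it to run longer.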
Of course, Python itself has a test-suite for base64 which tests strings of
several lengths against the values they should give after encoding or decoding.
$ python /usr/lib/python2.3/test/test_base64.py
test_decodestring (__main__.Base64TestCase) ... ok
test_encodestring (__main__.Base64TestCase) ... ok
----------------------------------------------------------------------
Ran 2 tests in 0.001s
OK
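
A known-answer test of the same kind is easy to write yourself. The sketch below is not the contents of test_base64.py; it checks the standard test vectors from RFC 4648 (section 10) in both directions, in Python 3 syntax. The names RFC4648_VECTORS and check_known_answers are made up for this example.

```python
import base64

# (plain, encoded) pairs from RFC 4648, section 10.
RFC4648_VECTORS = [
    (b"", b""),
    (b"f", b"Zg=="),
    (b"fo", b"Zm8="),
    (b"foo", b"Zm9v"),
    (b"foob", b"Zm9vYg=="),
    (b"fooba", b"Zm9vYmE="),
    (b"foobar", b"Zm9vYmFy"),
]


def check_known_answers(vectors=RFC4648_VECTORS) -> int:
    """Check every pair in both directions; return how many were checked."""
    for plain, encoded in vectors:
        assert base64.b64encode(plain) == encoded, plain
        assert base64.b64decode(encoded) == plain, encoded
    return len(vectors)


if __name__ == "__main__":
    print(check_known_answers(), "vectors ok")
```

Unlike a round-trip test, this would also catch an encoder and decoder that are wrong in matching ways.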
If your 1728512-character string doesn't pass the simple round-trip test, then
you've uncovered a latent Python bug that my tests didn't demonstrate. If it
does pass, then the problem lies somewhere else in your code, possibly in the
part that transmits the encoded message.
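
One way to narrow down where the data changes is to record a fingerprint of the encoded message on the sending side and compare it with what arrives. This is only a sketch in Python 3 syntax, and I don't know what your transmission code looks like: the helpers fingerprint and first_difference are made up for this example, and the variable received is a stand-in for whatever your receiver actually gets.

```python
import base64
import hashlib


def fingerprint(data: bytes) -> str:
    """Summarise data as its length plus a SHA-256 digest."""
    return "%d bytes, sha256 %s" % (len(data), hashlib.sha256(data).hexdigest())


def first_difference(a: bytes, b: bytes) -> int:
    """Return the index of the first differing byte, or -1 if a == b.

    If one input is a prefix of the other, the index is the length
    of the shorter one.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) == len(b):
        return -1
    return min(len(a), len(b))


if __name__ == "__main__":
    original = b"x" * 1728512
    encoded = base64.encodebytes(original)
    received = encoded  # stand-in: replace with the bytes the receiver got
    print("sent:    ", fingerprint(encoded))
    print("received:", fingerprint(received))
    print("first difference at:", first_difference(encoded, received))
```

If the two fingerprints differ, the encoded text was altered in transit, and the index of the first difference may hint at how (truncation, changed line endings, and so on).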
Jeff
