MRAB wrote:
webcomm wrote:
On Jan 8, 8:39 pm, "James Mills" <prolo...@shortcircuit.net.au> wrote:
Send us a sample of this file in question...

Here's a sample with some dummy data from the web service:
http://webcomm.webfactional.com/htdocs/data.zip

That's the zip created in this line of my code...
f = open('data.zip', 'wb')

If I open the file it contains as unicode in my text editor (EditPlus)
on Windows XP, there is ostensibly nothing wrong with it.  It looks
like valid XML.  But if I return it to my browser with python+django,
every other character is a bad character.

If I unzip it like this...
popen("unzip data.zip")
...then the bad characters are 'FFFD' characters as described and
pictured here...
http://groups.google.com/group/comp.lang.python/browse_thread/thread/...

If I unzip it like this...
getzip('data.zip', ignoreable=30000)
...using Scott's function at...
http://groups.google.com/group/comp.lang.python/msg/c2008e48368c6543
...then the bad characters are \x00 characters.

I can unzip it in Windows XP. The file within it (called "data") is XML encoded as UTF-16LE (2 bytes per character, low byte first), but without the initial byte order mark. Python's zipfile module says "BadZipfile: File is not a zip file".
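
To illustrate why the zero bytes show up, here is a minimal sketch. The sample string is made up for the demonstration and is not the content of the real "data" file. It shows that ASCII text encoded as UTF-16LE carries a \x00 after every character, and that naming the codec explicitly recovers the text without needing a byte order mark:

```python
# Hypothetical sample text; the real file content is not reproduced here.
text = '<?xml version="1.0"?><data/>'

# UTF-16LE stores each ASCII character as its own byte followed by \x00.
raw = text.encode('utf-16-le')
print(repr(raw[:8]))

# Treating the bytes as a single-byte encoding keeps every \x00 in the string,
# which is what appears as a "bad character" after each real character.
wrong = raw.decode('latin-1')

# The 'utf-16-le' codec fixes the byte order itself, so the missing BOM
# does not matter once the encoding is named explicitly.
right = raw.decode('utf-16-le')
print(right)
```

The same decode call should apply to the bytes read from the "data" member once the archive can be opened.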

If I strip off all but the last 4 zero-bytes then the zipfile module can open it:

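import base64
import zipfile

# datum holds the base64-encoded zip data (defined earlier in the thread)
decoded = base64.b64decode(datum)
# Drop trailing zero bytes one at a time until only four remain
five_zeros = b'\x00' * 5
while decoded.endswith(five_zeros):
    decoded = decoded[:-1]
with open('data.zip', 'wb') as f:
    f.write(decoded)
x = zipfile.ZipFile('data.zip', 'r')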
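
The same trimming can be sketched without a loop. The helper name strip_trailing_zeros and its keep parameter are made up for this example. It assumes, as above, that keeping four zero bytes is right for this particular file; an archive could legitimately end in a different number of zero bytes, so treat the count as a heuristic rather than a rule:

```python
def strip_trailing_zeros(data, keep=4):
    """Cut the run of trailing zero bytes in data down to at most `keep`."""
    stripped = data.rstrip(b'\x00')
    n_zeros = len(data) - len(stripped)
    # Put back no more zero bytes than were originally present.
    return stripped + b'\x00' * min(n_zeros, keep)

# Made-up bytes standing in for the decoded web-service payload.
padded = b'PK\x05\x06' + b'\x00' * 50
trimmed = strip_trailing_zeros(padded)
print(len(padded), len(trimmed))
```

Unlike the loop, this makes a single pass over the padding, which may matter if the payload carries a long run of zeros.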
