On Fri, Jan 9, 2009 at 3:08 PM, Chris Mellon <arka...@gmail.com> wrote:
> On Fri, Jan 9, 2009 at 2:32 PM, webcomm <rya...@gmail.com> wrote:
>> On Jan 9, 3:15 pm, Steve Holden <st...@holdenweb.com> wrote:
>>> webcomm wrote:
>>> > Hi,
>>> > In python, is there a distinction between unzipping bytes and
>>> > unzipping a binary file to which those bytes have been written?
>>>
>>> > The following code is, I think, an example of writing bytes to a file
>>> > and then unzipping...
>>>
>>> > decoded = base64.b64decode(datum)
>>> > #datum is a base64 encoded string of data downloaded from a web
>>> > service
>>> > f = open('data.zip', 'wb')
>>> > f.write(decoded)
>>> > f.close()
>>> > x = zipfile.ZipFile('data.zip', 'r')
>>>
>>> > After looking at the preceding code, the provider of the web service
>>> > gave me this advice...
>>> > "Instead of trying to create a file, take the unzipped bytes and get a
>>> > Unicode string of text from it."
>>>
>>> Not terribly useful advice, but one presumes he she or it was trying to
>>> be helpful.
>>>
>>> > If so, I'm not sure how to do what he's suggesting, or if it's really
>>> > different from what I've done.
>>>
>>> Well, what you have done appears pretty wrong to me, but let's take a
>>> look. What's datum? You appear to be treating it as base64-encoded data;
>>> is that correct? Have you examined it?
>>
>> It's data that has been compressed then base64 encoded by the web
>> service. I'm supposed to download it, then decode, then unzip. They
>> provide a C# example of how to do this on page 13 of
>> http://forums.regonline.com/forums/docs/RegOnlineWebServices.pdf
>>
>> If you have a minute, see also this thread...
>> http://groups.google.com/group/comp.lang.python/browse_thread/thread/d72d883409764559/5b9eceeee3e77dd4?hl=en&lnk=gst&q=webcomm#5b9eceeee3e77dd4
>>
>
> When they say "zip", they're talking about a zlib compressed stream of
> bytes, not a zip archive.
>
> You want to base64 decode the data, then zlib decompress it, then
> finally interpret it as (I think) UTF-16, as that's what Windows
> usually means when it says "Unicode".
>
> decoded = base64.b64decode(datum)
> decompressed = zlib.decompress(decoded)
> result = decompressed.decode('utf-16')
>

And of course, as *soon* as I write that, I read the appendix of the documentation in full and turn out to be wrong. Ignore me *sigh*.

It would really help if you could post a sample file somewhere.
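
Until a sample is available, one way to settle the zip-archive versus zlib-stream question is to look at the first bytes of the decoded payload, and to open the archive from memory rather than from a file on disk. The following is only a rough sketch written for Python 3: the helper names sniff_container and unpack are made up for illustration, the magic-byte checks are the generic ones for zip, gzip and zlib rather than anything taken from the web service's documentation, and the text encoding of the result is deliberately left undecided.

```python
import base64
import io
import zipfile
import zlib


def sniff_container(raw: bytes) -> str:
    """Guess the container format from the leading bytes (hypothetical helper)."""
    if raw[:4] == b"PK\x03\x04":
        # Local file header of a non-empty zip archive.
        return "zip"
    if raw[:2] == b"\x1f\x8b":
        return "gzip"
    # zlib stream: compression method 8 (deflate) in the low nibble of the
    # first byte, and the two header bytes form a multiple of 31.
    if len(raw) >= 2 and raw[0] & 0x0F == 8 and ((raw[0] << 8) | raw[1]) % 31 == 0:
        return "zlib"
    return "unknown"


def unpack(datum: str) -> dict:
    """Base64-decode datum and decompress it without touching the disk.

    Returns a mapping of member name to raw bytes. A bare zlib stream has
    no member names, so it is stored under the empty string.
    """
    raw = base64.b64decode(datum)
    kind = sniff_container(raw)
    if kind == "zip":
        # ZipFile accepts any seekable file-like object, so BytesIO
        # replaces the temporary data.zip file.
        with zipfile.ZipFile(io.BytesIO(raw)) as archive:
            return {name: archive.read(name) for name in archive.namelist()}
    if kind == "zlib":
        return {"": zlib.decompress(raw)}
    raise ValueError("unrecognised payload, first bytes: %r" % raw[:8])
```

Calling unpack(datum) on the downloaded string would then show which case applies. Turning the resulting bytes into text still needs the right codec; 'utf-16' is only the guess from the quoted message, so printing the first few bytes of a member (and checking for a byte-order mark) is worth doing before decoding.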