On Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano <[EMAIL PROTECTED]> wrote:

> I'm using urllib.urlretrieve() to download HTML pages, and I've hit a
> snag with URLs containing ampersands:
>
> http://www.example.com/parrot.php?x=1&y=2
>
> Somewhere in the process, URLs like the above are escaped to:
>
> http://www.example.com/parrot.php?x=1&amp;y=2
>
> which naturally fails to exist.
>
> I could just do a string replace, but is there a "right" way to escape
> and unescape URLs? I've looked through the standard lib, but I can't find
> anything helpful.

This works fine for me:

py> import urllib
py> fn = urllib.urlretrieve("http://c7.amazingcounters.com/counter.php?i=1516903&c=4551022")[0]
py> open(fn,"rb").read()
'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00...

So it's not urlretrieve escaping the URL, but something else in your code...
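
One possible source of the "&amp;" form, offered only as a guess since the code that builds the URLs isn't shown: if the URLs are copied out of HTML source (for instance from href attributes), the ampersand is stored there as an HTML entity and has to be decoded before the URL is requested. A minimal sketch of that decoding step, assuming a modern Python 3 where html.unescape() is available; the helper name unescape_href is made up for illustration:

```python
import html
from urllib.parse import urlsplit, parse_qs


def unescape_href(raw_href):
    """Decode HTML entities (e.g. '&amp;' -> '&') in text taken from HTML source.

    Hypothetical helper: it assumes the string really came from HTML markup.
    """
    return html.unescape(raw_href)


# The escaped form from the question, as it would appear inside HTML source.
escaped = "http://www.example.com/parrot.php?x=1&amp;y=2"
url = unescape_href(escaped)
print(url)

# Sanity check: the query string now splits into two separate parameters.
print(parse_qs(urlsplit(url).query))
```

This should only be applied to text that actually came from HTML markup. It is not URL percent-encoding; for that, urllib.parse.quote() and urllib.parse.unquote() are the relevant functions.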

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list
