On 08/05/2014 04:46, Steven D'Aprano wrote:
On Wed, 07 May 2014 11:42:24 +0100, Robin Becker wrote:

I have an outstanding request for ReportLab to allow images to be opened
using the data: scheme. That used to be supported in Python 2.7 using
urllib, but in Python 3.3, where urllib2 has been folded into urllib, at
least the default opener doesn't support data:


It looks like you intended to show an example, but left it out.

Is there a way to use the residual legacy of the old urllib code that's
now in urllib.request.URLopener to open unusual schemes? I know it can be
used directly, e.g.

urllib.request.URLopener().open('data:.........')

but that seems to leave the splitting & testing logic up to me, when it
logically belongs in some central place, i.e. urllib.request.urlopen.

You may need to explain in a little more detail. When you say "splitting
and testing", what are you splitting and testing? It may also help if you
show some Python 2.7 code that works, and what happens in 3.3.


OK, I'm not sure about 3.4, but in 3.3 the urllib module cannot open a request like this:

C:\code-trunk\hg-repos\reportlab\tests>\python33\python.exe
Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> urllib.request.urlopen('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\python33\lib\urllib\request.py", line 156, in urlopen
    return opener.open(url, data, timeout)
  File "C:\python33\lib\urllib\request.py", line 469, in open
    response = self._open(req, data)
  File "C:\python33\lib\urllib\request.py", line 492, in _open
    'unknown_open', req)
  File "C:\python33\lib\urllib\request.py", line 447, in _call_chain
    result = func(*args)
  File "C:\python33\lib\urllib\request.py", line 1310, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: data>
>>>

In Python 2.7 one can do

C:\tmp>python
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib
>>> data=urllib.urlopen('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
>>> len(data)
35
>>>

and, as indicated by Ian Kelly, it also works in 3.4:
C:\tmp>\python34\python.exe
Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:24:06) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib.request
>>> data=urllib.request.urlopen('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
>>> len(data)
35



In 3.3 we still have the old URLopener class. However, when I use that I see this:

C:\code-trunk\hg-repos\reportlab\tests>\python33\python.exe
Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.request import URLopener
>>> data = URLopener().open('data:image/gif;base64,R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=').read()
>>> len(data)
115
>>> data
'Date: Thu, 08 May 2014 10:21:45 GMT\nContent-type: image/gif\nContent-Length: 35\n\nGIF87a\x01\x00\x01\x00\x80\x00\x00ÿÿÿÿÿÿ,\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00;'
>>>

So I now seem to be getting some headers as well as the real data, all in one string (115 characters rather than the expected 35 bytes). I think this is different from what is expected, but that code is labelled as old/deprecated and may be going away.
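To illustrate what "splitting off the headers" would involve, here is a minimal sketch that works on the string shown in the transcript above. The function name split_legacy_response is hypothetical, and the sketch assumes that the 3.3 URLopener decodes the payload to text as Latin-1, so that encoding it back with latin-1 recovers the original bytes; that assumption fits the output above but is not confirmed here.

```python
from email import message_from_string

# The string returned by URLopener().open(...).read() in the 3.3 session above,
# with the six 0xFF characters written as escapes.
LEGACY_TEXT = (
    'Date: Thu, 08 May 2014 10:21:45 GMT\n'
    'Content-type: image/gif\n'
    'Content-Length: 35\n'
    '\n'
    'GIF87a\x01\x00\x01\x00\x80\x00\x00\xff\xff\xff\xff\xff\xff,'
    '\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00;'
)


def split_legacy_response(text):
    """Split 'headers + blank line + payload' text into (headers, bytes).

    Hypothetical helper. Assumes the payload was decoded as Latin-1.
    """
    head, sep, body = text.partition('\n\n')  # first blank line ends the headers
    if not sep:
        raise ValueError('no blank line separating headers from data')
    headers = message_from_string(head + '\n')
    return headers, body.encode('latin-1')


headers, payload = split_legacy_response(LEGACY_TEXT)
print(headers['Content-type'], len(payload))
```

The header block accounts for the extra 80 characters, which matches the reported length of 115.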

Since urllib doesn't always work as expected in 3.3, I've had to write a small stub for the special data: case. Splitting off the headers seems harder than just handling the special case directly.
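A stub of the kind described might look roughly like the sketch below. It is not the code actually used in ReportLab; the names open_data_url and read_url are hypothetical, and only the common forms of the data: scheme (optional media type, optional ;base64 marker, comma, payload) are handled.

```python
import base64
from urllib.parse import unquote_to_bytes
from urllib.request import urlopen

GIF_URL = ('data:image/gif;base64,'
           'R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=')


def open_data_url(url):
    """Return the payload bytes of a data: URL (hypothetical helper)."""
    scheme, _, rest = url.partition(':')
    if scheme.lower() != 'data':
        raise ValueError('not a data: URL')
    header, sep, payload = rest.partition(',')
    if not sep:
        raise ValueError('malformed data: URL (no comma)')
    raw = unquote_to_bytes(payload)          # undo any %XX escapes
    if header.lower().endswith(';base64'):
        return base64.b64decode(raw)
    return raw


def read_url(url):
    """Special-case data: URLs; hand everything else to urlopen."""
    if url[:5].lower() == 'data:':
        return open_data_url(url)
    with urlopen(url) as f:
        return f.read()


data = read_url(GIF_URL)
print(len(data))
```

This returns only the data bytes, so the caller never sees the media type; that is the trade-off compared with a full handler.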

However, there are a lot of these 'schemes', so should I be doing this sort of thing at all? Apparently it has taken four versions of Python 3 for urllib to support data: again in 3.4, so it's not clear to me whether all schemes are supposed to hang off urllib.request.urlopen. Instead of special-casing data: for 3.3, perhaps I should have written a handler for it and injected that into my opener (or possibly the default opener). Writing the handler means I do have to deal with the headers, whereas my stub just returns the data bytes.
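For comparison, the handler route could be sketched as follows. The class name DataURLHandler is hypothetical; the sketch relies on the documented convention that an opener dispatches to a handler method named <scheme>_open, and it builds the headers by hand, which is the extra work mentioned above. Python 3.4 and later already ship urllib.request.DataHandler, so something like this would only matter on 3.3.

```python
import base64
import email
import io
from urllib.parse import unquote_to_bytes
from urllib.request import BaseHandler, build_opener
from urllib.response import addinfourl

GIF_URL = ('data:image/gif;base64,'
           'R0lGODdhAQABAIAAAP///////ywAAAAAAQABAAACAkQBADs=')


class DataURLHandler(BaseHandler):
    """Hypothetical handler; the opener finds it through the data_open name."""

    def data_open(self, req):
        url = req.full_url
        mediatype, sep, payload = url.split(':', 1)[1].partition(',')
        if not sep:
            raise ValueError('malformed data: URL (no comma)')
        data = unquote_to_bytes(payload)
        if mediatype.lower().endswith(';base64'):
            data = base64.b64decode(data)
            mediatype = mediatype[:-len(';base64')]
        if not mediatype:
            mediatype = 'text/plain;charset=US-ASCII'
        # Unlike the stub, a handler has to supply response headers itself.
        headers = email.message_from_string(
            'Content-type: %s\nContent-length: %d\n' % (mediatype, len(data)))
        return addinfourl(io.BytesIO(data), headers, url)


opener = build_opener(DataURLHandler)
response = opener.open(GIF_URL)
data = response.read()
print(response.info().get_content_type(), len(data))
```

Calling urllib.request.install_opener(opener) would make plain urlopen use it as well, which is the "default opener" option; whether that is appropriate for a library is a separate question.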
--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list
