New submission from brent s. <brent.sa...@gmail.com>:

Currently, a parsed urlparse() object looks (roughly) like this:

urlparse('http://example.com/foo;key1=value1?key2=value2#key3=value3#key4=value4')

returns:

ParseResult(scheme='http', netloc='example.com', path='/foo', params='key1=value1', query='key2=value2', fragment='key3=value3#key4=value4')

However, I recommend a couple of things:

0.) That ParseResult objects support dict emulation, e.g. one can run:

dict(parseresult_obj)

and get (using the example string above, with the classification corrected for RFC 3986 compliance and common usage):

{'fragment': [('key4', 'value4')],
 'netloc': 'example.com',
 'params': [('key2', 'value2')],
 'path': '/foo',
 'query': [('key3', 'value3')],
 'scheme': 'http'}

Obviously, fragment, params, and query could instead be serialized into a nested dict. I'm not sure which is preferred in the pythonic sense.

1.) Better RFC 3986 compliance. Per RFC 3986 § 3 (https://tools.ietf.org/html/rfc3986#section-3), the URL can be further split into separate components. For instance, while considered deprecated, should "userinfo" (e.g. "http://user:password@...") be parsed? At the very least, the port should be parsed out into a component separate from the netloc (or userinfo parsed out separately from the netloc). This would assist in parsing host:port combinations in netlocs that contain both userinfo and a specified port, and would allow the port to be given as an int, making it easier to use in e.g. the socket lib.

2.) If a component is not present, I suggest it be a None object instead of an empty string.
e.g.:

urlparse('http://example.com/foo')

would return:

ParseResult(scheme='http', netloc='example.com', path='/foo', params=None, query=None, fragment=None)

instead of:

ParseResult(scheme='http', netloc='example.com', path='/foo', params='', query='', fragment='')

----------
components: Library (Lib)
messages: 316454
nosy: bsaner
priority: normal
severity: normal
status: open
title: Improvement suggestions for urllib.parse.urlparser
type: enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33480>
_______________________________________
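
For reference, part of what the three suggestions describe can be approximated with the existing urllib.parse API. The following is only a rough sketch under stated assumptions, not the proposed implementation: parse_result_to_dict is a hypothetical helper name, the choice to expand params, query, and fragment with parse_qsl() is an assumption about the desired output, and the example URL with userinfo and a port is made up for illustration. It relies on the namedtuple method _asdict() and on the username, password, hostname, and port attributes that urlparse() results already expose.

```python
from urllib.parse import urlparse, parse_qsl


def parse_result_to_dict(url):
    """Hypothetical helper: return the parsed URL as a plain dict.

    Key=value style components become lists of (key, value) pairs,
    and components that are absent become None instead of ''.
    """
    parsed = urlparse(url)

    # ParseResult is a namedtuple subclass, so _asdict() yields the six
    # standard fields: scheme, netloc, path, params, query, fragment.
    result = dict(parsed._asdict())

    # Assumption: these three components carry key=value data.
    # parse_qsl() drops entries without a value, so a bare fragment
    # such as '#top' would come back as an empty list.
    for name in ("params", "query", "fragment"):
        value = result[name]
        result[name] = parse_qsl(value) if value else None

    # These attributes already exist on the result object; port is
    # returned as an int (or None when the URL does not specify one).
    result["username"] = parsed.username
    result["password"] = parsed.password
    result["hostname"] = parsed.hostname
    result["port"] = parsed.port
    return result


if __name__ == "__main__":
    example = ("http://user:secret@example.com:8080"
               "/foo;key1=value1?key2=value2#key3=value3")
    for key, value in parse_result_to_dict(example).items():
        print(key, "=", repr(value))
```

Because hostname and port are read from attributes rather than from the netloc string, the int port can be passed directly to something like socket.create_connection((d['hostname'], d['port'])). Changing what urlparse() itself returns (None instead of '') would be a separate, backward-incompatible decision that this sketch does not address.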