"metaperl" <[EMAIL PROTECTED]> writes: > The urlparse with Python 2.4.3 includes the user and pass in the site > aspect of its parse: > > >>> scheme, site, path, parms, query, fid = > >>> urlparse.urlparse("http://bill:[EMAIL > >>> PROTECTED]/lib/module-urlparse.html") > > >>> site > 'bill:[EMAIL PROTECTED]' > > > I personally would prefer that it be broken down a bit further. What > are existing opinions on this?
Module urlparse should be deprecated in Python 2.6, to be replaced
with a new module (or modules) that implements the relevant parts of
RFC 3986 and 3987 (read the python-dev archives for discussion and
several people's first cuts at implementation).

Splitting "userinfo" (the bit before the '@' in user:[EMAIL PROTECTED])
should be a separate function, mostly because RFC 3986 talks a lot
about 5-tuples into which ANY URL can be split, and that splitting
process doesn't involve splitting out userinfo. So it makes sense to
have one function do the splitting into RFC 3986 5-tuples, and another
split out the userinfo.

Also, though, the userinfo syntax is deprecated, because people use it
for semantic spoofing attacks: people don't understand (or don't
notice) that

http://microsoft.com&rhubarb=custard&[EMAIL PROTECTED]/more/stuff.htm

is not a microsoft.com URL. Note that userinfo has always been illegal
in HTTP URLs, and is no longer supported by some newer browsers. So
relegating it to a separate function is a good thing, IMO.

John
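
To make the two-function split described above concrete, here is a
minimal sketch. It is an illustration only, not the proposed
replacement module. It assumes Python 3, where the old urlparse module
lives on as urllib.parse and urlsplit() already returns a 5-tuple.
The helper split_userinfo(), the host www.example.org and the password
"secret" are all hypothetical stand-ins, not taken from the thread.

```python
from urllib.parse import urlsplit  # Python 3 home of the old urlparse module


def split_userinfo(authority):
    """Split an authority string into (userinfo, hostport).

    Hypothetical helper, not part of the standard library.
    userinfo is None when the authority contains no '@'.
    """
    # Split at the last '@' so that everything before it counts as userinfo.
    userinfo, sep, hostport = authority.rpartition("@")
    if not sep:
        return None, authority
    return userinfo, hostport


# Step 1: the generic five-way split (scheme, authority, path, query, fragment).
# The URL below uses made-up credentials and a made-up host.
scheme, authority, path, query, fragment = urlsplit(
    "http://bill:secret@www.example.org/lib/module-urlparse.html"
)

# Step 2: userinfo is split out separately, only if the caller asks for it.
userinfo, hostport = split_userinfo(authority)

# The spoofing case: everything before the '@' is userinfo, not the host.
spoof = urlsplit(
    "http://microsoft.com&rhubarb=custard@www.example.org/more/stuff.htm"
)
spoof_userinfo, spoof_host = split_userinfo(spoof.netloc)
print(spoof_host)  # www.example.org, not microsoft.com
```

Keeping the second step separate means that code which only needs the
generic five components never has to think about userinfo at all, and
code that does care about the real host has one obvious place to get it.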