Feature Requests item #1643370, was opened at 2007-01-24 10:23
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1643370&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.6
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: anatoly techtonik (techtonik)
Assigned to: Nobody/Anonymous (nobody)
Summary: recursive urlparse

Initial Comment:
urlparse module is incomplete. there is no convenient high-level function to 
parse url down into atomic chunks, urldecode query and bring it to array (or 
dictionary for that case), so that you can modify that dictionary and 
reassemble it into query again using nothing more than simple array 
manipulations.

This kind of function is universal and flexible in the same way that low-level 
API, but in comparison it allows to considerably speed up development process 
if the speech is about urls

I propose urlparseex(urlstring) function that will dissect the URL into 
dictionary of appropriate dictionaries or strings and decode all % entities

scheme  0       string
netloc  1       dictionary
        username 1.1 string or whatever
        password 1.2 string or whatever
        server  1.3     hostname string
        port    1.4     port integer
path    2       string
params  3       ordered dictionary of path components for the sake of 
reassembling them later (sorry, I have little pythons in my head to replace 
"ordered dictionary" with something more appropriate) where respective path 
part entry is also dictionary of parameters
query   4       dictionary
fragment        5       string


there must be also counterpart urlunparseex(dictionary) to reassemble url and 
reencode entities


Reasons behind the decision:
- 90% of time you need to decode % entities - this must be made by default 
(whoever need to let them encoded are in minor and may use other functions)
- atomic recursion format is needed to be able to easily change any url 
component and reassemble it back
- get simple swiss-army knife for high-level (read - logical) url operations in 
one module

http://docs.python.org/lib/module-urlparse.html

There is also this proposal below. It is a little bit different, but shows that 
after four years url  handling problems are still actual. 

http://sourceforge.net/tracker/index.php?func=detail&aid=600362&group_id=5470&atid=355470


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1643370&group_id=5470
_______________________________________________
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to