Re: [Python-Dev] [RFC] urlparse - parse query facility
* Fred L. Drake, Jr. [EMAIL PROTECTED] [2007-06-16 01:06:59]:
> > * Coding question: Without retyping the bunch of code again in
> >   BaseResult, would it be possible to call the parse_qs/parse_qsl
> >   functions on self.query and provide the result? Basically, what
> >   would be a good way of doing it?
>
> That's what I was thinking. Just add something like this to BaseResult
> (untested):
>
>     def parsedQuery(self, keep_blank_values=False, strict_parsing=False):
>         return parse_qs(
>             self.query,
>             keep_blank_values=keep_blank_values,
>             strict_parsing=strict_parsing)
>
>     def parsedQueryList(self, keep_blank_values=False, strict_parsing=False):
>         return parse_qsl(
>             self.query,
>             keep_blank_values=keep_blank_values,
>             strict_parsing=strict_parsing)

Thanks Fred. That really helped. :-)

I have updated the urlparse.py and cgi.py modules, and added tests to test_urlparse.py to cover the new functionality. The test run passed for all the valid queries except these:

    #("=", {}),
    #("=&=", {}),
    #("=;=", {}),

The test cases are basically taken from the test_cgi.py module, which carries a comment questioning whether these three query values are valid.

Updating the documentation is still pending.

I am keeping all the files temporarily at:
http://cvs.sarovar.org/cgi-bin/cvsweb.cgi/python/?cvsroot=uthcode

I requested commit access to the Summer of Code branch in my previous mail, but I guess it has not been noticed yet. I shall update the files later, or send them in as patches to be applied.

> Whether there's a real win with this is unclear. I generally prefer
> having an object that represents the URL and lets me get what I want
> from it, rather than having to pass the bits around to separate parsing
> functions.

I agree. This is really convenient once one comes to know about it.

Thanks,
Senthil

--
O.R.Senthil Kumaran
http://uthcode.sarovar.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
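As a rough illustration of the two methods discussed in this message, here is a minimal sketch written against the Python 3 urllib.parse module rather than the 2007 urlparse module. The class name QuerySplitResult and the helper urlsplit_with_query are hypothetical names chosen for this example; they are not part of the standard library or of the patch described here.

```python
from urllib.parse import SplitResult, parse_qs, parse_qsl, urlsplit


class QuerySplitResult(SplitResult):
    """Hypothetical result class; adds query-parsing methods to SplitResult."""

    __slots__ = ()

    def parsedQuery(self, keep_blank_values=False, strict_parsing=False):
        # Delegate to parse_qs so the parsing code lives in one place.
        return parse_qs(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)

    def parsedQueryList(self, keep_blank_values=False, strict_parsing=False):
        # Same idea, but keep the (name, value) pairs in their original order.
        return parse_qsl(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)


def urlsplit_with_query(url):
    """Hypothetical helper: split a URL and wrap the five parts."""
    return QuerySplitResult(*urlsplit(url))


result = urlsplit_with_query("http://example.com/search?hl=en&lr=&q=cape+town")
print(result.parsedQuery())
print(result.parsedQueryList(keep_blank_values=True))
```

The methods only forward their arguments, so nothing from the query-parsing code needs to be retyped in the result class.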
Re: [Python-Dev] [RFC] urlparse - parse query facility
* Fred L. Drake, Jr. [EMAIL PROTECTED] [2007-06-13 22:42:21]:
> I see no reason to incorporate the URL splitting into the function; the
> existing function signatures for cgi.parse_qs and cgi.parse_qsl are
> sufficient.

Thanks for the comments, Fred. I understand that keeping the existing parse_qs and parse_qsl signatures in the urlparse module, and invoking those functions from the cgi module, is the right approach.

urlparse will contain parse_qs and parse_qsl, which take the query string (not the URL) along with the optional arguments keep_blank_values and strict_parsing (same as cgi).

http://deadbeefbabe.org/paste/5154

> It may be convenient to add methods to the urlparse.BaseResult class
> providing access to the parsed version of the query on the instance.

This is where I spent a little time, and I have been unable to work out conclusively how it can be done. Could someone on the list please help me?

* parse_qs or parse_qsl will be invoked on the query component separately by the user.
* If the parsed query needs to be available on the instance as a convenience, then we will have to assume values for keep_blank_values and strict_parsing.
* Coding question: Without retyping the bunch of code again in BaseResult, would it be possible to call the parse_qs/parse_qsl functions on self.query and provide the result? Basically, what would be a good way of doing it?

Thanks,
Senthil

--
O.R.Senthil Kumaran
http://uthcode.sarovar.org
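The first bullet above, where the user invokes the parser on the query component separately, can be sketched as follows. This assumes the Python 3 urllib.parse module, where parse_qs and parse_qsl live today; the example URL is made up.

```python
from urllib.parse import parse_qs, parse_qsl, urlsplit

url = "http://example.com/search?tag=a&tag=b&empty="

# Step 1: split the URL; the query string is just one component of it.
query = urlsplit(url).query

# Step 2: hand only the query string to the parser.
# parse_qs groups repeated names into lists and drops blank values by default.
as_dict = parse_qs(query)

# parse_qsl keeps every pair in order; blank values are kept only on request.
as_pairs = parse_qsl(query, keep_blank_values=True)

print(as_dict)
print(as_pairs)
```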
Re: [Python-Dev] [RFC] urlparse - parse query facility
On Saturday 16 June 2007, O.R.Senthil Kumaran wrote:
> urlparse will contain parse_qs and parse_qsl, which take the query
> string (not the URL) along with the optional arguments
> keep_blank_values and strict_parsing (same as cgi).
>
> http://deadbeefbabe.org/paste/5154

Looks good.

> > It may be convenient to add methods to the urlparse.BaseResult class
> > providing access to the parsed version of the query on the instance.
...
> * parse_qs or parse_qsl will be invoked on the query component
>   separately by the user.

Yes; this doesn't change, really. Methods would still need to be invoked separately, but the query string doesn't need to be passed in; it's part of the data object.

> * If the parsed query needs to be available on the instance as a
>   convenience, then we will have to assume values for
>   keep_blank_values and strict_parsing.

If it were a property, yes, but I think a method on the result object makes more sense because we don't want to assume values for these arguments.

> * Coding question: Without retyping the bunch of code again in
>   BaseResult, would it be possible to call the parse_qs/parse_qsl
>   functions on self.query and provide the result? Basically, what
>   would be a good way of doing it?

That's what I was thinking. Just add something like this to BaseResult (untested):

    def parsedQuery(self, keep_blank_values=False, strict_parsing=False):
        return parse_qs(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)

    def parsedQueryList(self, keep_blank_values=False, strict_parsing=False):
        return parse_qsl(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)

Whether there's a real win with this is unclear. I generally prefer having an object that represents the URL and lets me get what I want from it, rather than having to pass the bits around to separate parsing functions. The result objects were added in 2.5, though, and I've no real idea how widely they've been adopted.

  -Fred

--
Fred L. Drake, Jr.    fdrake at acm.org
Chaos is the score upon which reality is written. --Henry Miller
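The property-versus-method point can be made concrete with a small sketch. PropertyStyle and MethodStyle are hypothetical classes invented for this comparison, and the code assumes the Python 3 urllib.parse module rather than the 2007 urlparse module.

```python
from urllib.parse import parse_qs, urlsplit


class PropertyStyle:
    """Hypothetical: exposes the parsed query as a property."""

    def __init__(self, url):
        self.query = urlsplit(url).query

    @property
    def parsed_query(self):
        # A property takes no arguments, so the defaults are baked in:
        # blank values are always dropped and parsing is never strict.
        return parse_qs(self.query)


class MethodStyle:
    """Hypothetical: exposes the parsed query as a method."""

    def __init__(self, url):
        self.query = urlsplit(url).query

    def parsedQuery(self, keep_blank_values=False, strict_parsing=False):
        # The caller decides; nothing is assumed on their behalf.
        return parse_qs(
            self.query,
            keep_blank_values=keep_blank_values,
            strict_parsing=strict_parsing)


url = "http://example.com/?a=1&b="
print(PropertyStyle(url).parsed_query)
print(MethodStyle(url).parsedQuery(keep_blank_values=True))
```

With the property, a caller who needs the blank value for b has to fall back to calling parse_qs directly; with the method, it is one keyword argument away.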
[Python-Dev] [RFC] urlparse - parse query facility
> a) import cgi and call the cgi module's parse_qs.
> [circular imports]
>
> or
>
> b) Implement a standalone query parsing facility in urlparse *AS IN*
>    the cgi module.

Assuming (b), please remove the (code for the) parsing from the cgi module, and just import it back from urlparse (or urllib). Since cgi already imports urllib (which imports urlparse), this isn't adding any dependencies -- but it keeps the code in a single location.

-jJ
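The "single location" arrangement suggested here can be sketched with two stand-in module objects. The names urlparse_sketch and cgi_sketch are hypothetical, and the parse_qsl below is deliberately simplified; it is not the real implementation from either module.

```python
import types

# Stand-in for urlparse: the one place where the parsing code lives.
urlparse_sketch = types.ModuleType("urlparse_sketch")


def parse_qsl(qs):
    """Simplified parser: split 'a=1&b=2' into (name, value) pairs."""
    return [tuple(pair.split("=", 1)) for pair in qs.split("&") if "=" in pair]


urlparse_sketch.parse_qsl = parse_qsl

# Stand-in for cgi: it carries no parsing code of its own and only
# re-exports the function, as "from urlparse import parse_qsl" would.
cgi_sketch = types.ModuleType("cgi_sketch")
cgi_sketch.parse_qsl = urlparse_sketch.parse_qsl

# Existing callers of the cgi name keep working, and both names refer to
# the very same function object.
print(cgi_sketch.parse_qsl("a=1&b=2"))
print(cgi_sketch.parse_qsl is urlparse_sketch.parse_qsl)
```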
Re: [Python-Dev] [RFC] urlparse - parse query facility
* Jim Jewett [EMAIL PROTECTED] [2007-06-13 19:27:24]:
> > a) import cgi and call the cgi module's parse_qs.
> > [circular imports]
> >
> > or
> >
> > b) Implement a standalone query parsing facility in urlparse *AS IN*
> >    the cgi module.
>
> Assuming (b), please remove the (code for the) parsing from the cgi
> module, and just import it back from urlparse (or urllib). Since cgi
> already imports urllib (which imports urlparse), this isn't adding any
> dependencies -- but it keeps the code in a single location.

Sure, that's a good idea as I see it. It won't break anything either.

Thanks,

--
O.R.Senthil Kumaran
http://uthcode.sarovar.org
Re: [Python-Dev] [RFC] urlparse - parse query facility
On Tuesday 12 June 2007, Senthil Kumaran wrote:
> This mail is a request for comments on changes to the urlparse module.
> We understand that urlparse returns the 'complete query' value as the
> query component and does not provide facilities to separate the query
> components. Users have to use the cgi module (cgi.parse_qs) to get the
> query parsed.

I agree with the comments Jim provided.

> The method below implements urlparse_qs(url, keep_blank_values,
> strict_parsing), which helps in parsing the query component of the URL.
> It behaves the same as cgi.parse_qs.

Except that it takes a URL, not only a query string.

> def urlparse_qs(url, keep_blank_values=0, strict_parsing=0):
...
>     scheme, netloc, url, params, querystring, fragment = urlparse(url)

I see no reason to incorporate the URL splitting into the function; the existing function signatures for cgi.parse_qs and cgi.parse_qsl are sufficient.

It may be convenient to add methods to the urlparse.BaseResult class providing access to the parsed version of the query on the instance.

  -Fred

--
Fred L. Drake, Jr.    fdrake at acm.org
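To illustrate the distinction between a function that takes a URL and one that takes only a query string, here is a short sketch using the Python 3 urllib.parse module (an assumption; the thread concerns the 2007 urlparse and cgi modules). It shows what happens when a whole URL is handed to parse_qs, whose signature expects the query string alone. The example URL is made up.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://example.com/search?hl=en&q=cape+town"

# Passing the whole URL: everything before the first '=' is taken to be
# part of the first field name, so the first key is mangled.
from_whole_url = parse_qs(url)

# Splitting first, then parsing only the query component.
from_query_only = parse_qs(urlsplit(url).query)

print(from_whole_url)
print(from_query_only)
```

Keeping the splitting step outside the parser means the same parse_qs signature serves both callers who start from a URL and callers who already hold a bare query string.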
[Python-Dev] [RFC] urlparse - parse query facility
Hi all,

This mail is a request for comments on changes to the urlparse module. We understand that urlparse returns the 'complete query' value as the query component and does not provide facilities to separate the query components. Users have to use the cgi module (cgi.parse_qs) to get the query parsed.

There has been a discussion in the past about making a method for parsing the query string available from the urlparse module itself. [1]

To implement the query parsing feature in the urlparse module, we can:

a) import cgi and call the cgi module's parse_qs.

   This approach will have problems, as:
   i) it imports cgi into the urlparse module.
   ii) the cgi module in turn imports urllib and urlparse.

b) Implement a standalone query parsing facility in urlparse *AS IN* the cgi module.

The method below implements urlparse_qs(url, keep_blank_values, strict_parsing), which helps in parsing the query component of the URL. It behaves the same as cgi.parse_qs.

Please let me know your comments on the code below.

--

    def unquote(s):
        """unquote('abc%20def') -> 'abc def'."""
        res = s.split('%')
        for i in xrange(1, len(res)):
            item = res[i]
            try:
                res[i] = _hextochr[item[:2]] + item[2:]
            except KeyError:
                res[i] = '%' + item
            except UnicodeDecodeError:
                res[i] = unichr(int(item[:2], 16)) + item[2:]
        return "".join(res)

    def urlparse_qs(url, keep_blank_values=0, strict_parsing=0):
        """Parse a URL query string and return the components as a dictionary.

        Based on the cgi.parse_qs method. This is a utility function
        provided with urlparse so that users need not use the cgi module
        for parsing the URL query string.

        Arguments:

        url: URL with query string to be parsed

        keep_blank_values: flag indicating whether blank values in
            URL encoded queries should be treated as blank strings.
            A true value indicates that blanks should be retained as
            blank strings. The default false value indicates that
            blank values are to be ignored and treated as if they were
            not included.

        strict_parsing: flag indicating what to do with parsing errors.
            If false (the default), errors are silently ignored.
            If true, errors raise a ValueError exception.
        """
        scheme, netloc, url, params, querystring, fragment = urlparse(url)

        pairs = [s2 for s1 in querystring.split('&') for s2 in s1.split(';')]
        query = []
        for name_value in pairs:
            if not name_value and not strict_parsing:
                continue
            nv = name_value.split('=', 1)
            if len(nv) != 2:
                if strict_parsing:
                    raise ValueError, "bad query field: %r" % (name_value,)
                # Handle case of a control-name with no equal sign
                if keep_blank_values:
                    nv.append('')
                else:
                    continue
            if len(nv[1]) or keep_blank_values:
                name = unquote(nv[0].replace('+', ' '))
                value = unquote(nv[1].replace('+', ' '))
                query.append((name, value))

        dict = {}
        for name, value in query:
            if name in dict:
                dict[name].append(value)
            else:
                dict[name] = [value]
        return dict

--

Testing:

    $ python
    Python 2.6a0 (trunk, Jun 10 2007, 12:04:03)
    [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urlparse
    >>> dir(urlparse)
    ['BaseResult', 'MAX_CACHE_SIZE', 'ParseResult', 'SplitResult', '__all__',
    '__builtins__', '__doc__', '__file__', '__name__', '_parse_cache',
    '_splitnetloc', '_splitparams', 'clear_cache', 'non_hierarchical',
    'scheme_chars', 'test', 'test_input', 'unquote', 'urldefrag', 'urljoin',
    'urlparse', 'urlparse_qs', 'urlsplit', 'urlunparse', 'urlunsplit',
    'uses_fragment', 'uses_netloc', 'uses_params', 'uses_query',
    'uses_relative']
    >>> URL = 'http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=south+africa+travel+cape+town'
    >>> print urlparse.urlparse_qs(URL)
    {'q': ['south africa travel cape town'], 'oe': ['utf-8'], 'ie': ['UTF-8'], 'hl': ['en']}
    >>> print urlparse.urlparse_qs(URL,keep_blank_values=1)
    {'q': ['south africa travel cape town'], 'ie': ['UTF-8'], 'oe': ['utf-8'], 'lr': [''], 'hl': ['en']}

Thanks,
Senthil

[1] http://mail.python.org/pipermail/tutor/2002-August/016823.html

--
O.R.Senthil Kumaran
http://phoe6.livejournal.com
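For readers on a current interpreter, the function proposed in this message can be approximated in Python 3 as below. This is a sketch under stated assumptions, not the code that was posted or committed: it uses urllib.parse.urlparse and urllib.parse.unquote in place of the 2007 helpers, and it folds the two loops into one with dict.setdefault.

```python
from urllib.parse import unquote, urlparse


def urlparse_qs(url, keep_blank_values=False, strict_parsing=False):
    """Python 3 sketch of the proposed function; takes a full URL."""
    querystring = urlparse(url).query
    # Fields may be separated by '&' or ';', as in the posted code.
    pairs = [s2 for s1 in querystring.split('&') for s2 in s1.split(';')]
    result = {}
    for name_value in pairs:
        if not name_value and not strict_parsing:
            continue
        nv = name_value.split('=', 1)
        if len(nv) != 2:
            if strict_parsing:
                raise ValueError("bad query field: %r" % (name_value,))
            # A control name with no equal sign.
            if keep_blank_values:
                nv.append('')
            else:
                continue
        if len(nv[1]) or keep_blank_values:
            name = unquote(nv[0].replace('+', ' '))
            value = unquote(nv[1].replace('+', ' '))
            # Repeated names accumulate into a list of values.
            result.setdefault(name, []).append(value)
    return result


URL = ('http://www.google.com/search'
       '?hl=en&lr=&ie=UTF-8&oe=utf-8&q=south+africa+travel+cape+town')
print(urlparse_qs(URL))
print(urlparse_qs(URL, keep_blank_values=True))
```

The only difference between the two calls is the blank lr field, which is dropped by default and kept as an empty string when keep_blank_values is true.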