Re: [jira] Commented: (MODPYTHON-93) Improve util.FieldStorage efficiency
Gregory (Grisha) Trubetskoy wrote: Having looked at the FieldStorage code, I'm guessing the idea was that you parse fields as they come in and append them to a list. This preserves the original order of fields, in case it is needed. I'm not sure that maintaining a dictionary alongside the list is the right thing to do. It might be, but there are some difficult questions to answer - e.g. how costly is a sequential search, and is the code complexity (and fieldstorage code is no picnic to read as it is) worth the speedup? Also while it would speed up retrieval, it will slow down the write operation - when a field is added to fieldstorage you now need to append it to the list, AND check whether it exists in the dictionary, then add it there as well. How often do developers access form fields via __getitem__? I noticed the publisher does not use it - it iterates the list, so nothing would be gained there. We do it a lot but we copy it into a different dictionary first to get exactly the setup we want. But dictionary-style access is a very obvious, pythonic way to do it. I have a simple 70-line ordereddict implementation which is derived from dict and remembers the keys in the order that they were assigned when iterating through the list; this may be a way to go for this. It just uses a list of keys internally to remember the order, and otherwise is a dictionary... Also, something else to consider - is there a simple programmatic solution that could be documented, e.g. something like my_fs = util.FieldStorage(req); dict_fs = {}; dict_fs.update(my_fs) [have no idea whether this will work :-)] It may work but still has the potential performance problem since it loops through the keys and then does a getitem on each key which loops through them again. Not likely to cause problems for a small number of arguments but not ideal :-) and voila - you've got a dictionary based fieldstorage? Anyway, just a few cents from me. Grisha
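For readers following along, the kind of ordered dictionary described above (derived from dict, with an internal list of keys) might look roughly like the sketch below. The 70-line implementation itself was not posted, so the class name KeyOrderDict and every detail here are assumptions for illustration only; methods such as pop() and update() are deliberately left out.

```python
class KeyOrderDict(dict):
    """Hypothetical sketch: a dict that remembers first-assignment key order."""

    def __init__(self, pairs=()):
        super().__init__()
        self._order = []              # keys, in the order first assigned
        for key, value in pairs:
            self[key] = value         # route through __setitem__ below

    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)   # only new keys extend the order list
        super().__setitem__(key, value)

    def __delitem__(self, key):
        super().__delitem__(key)      # raises KeyError if the key is absent
        self._order.remove(key)

    def __iter__(self):
        return iter(self._order)

    def keys(self):
        return list(self._order)

    def values(self):
        return [self[key] for key in self._order]

    def items(self):
        return [(key, self[key]) for key in self._order]


fields = KeyOrderDict()
fields["spouse"] = "Ann"
fields["age"] = "42"
fields["spouse"] = "Bea"              # reassignment keeps the original position
print(fields.keys())                  # ['spouse', 'age']
```

Lookups stay ordinary dict lookups; the extra cost is the key list that every write and delete has to maintain, which is the write-side overhead raised in the quoted message.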
Re: [SPAM] [mod_python] [SPAM] ANNOUNCE: Mod_python 3.2.5 Beta
I don't know how to do that, and it doesn't bother me that much :-) Grisha On Mon, 28 Nov 2005, David Fraser wrote: Gregory (Grisha) Trubetskoy wrote: The Apache Software Foundation and The Apache HTTP Server Project are pleased to announce the 3.2.5 Beta release of mod_python. Can we make sure the final release doesn't come out as SPAM on the announce list? :-) David
[jira] Updated: (MODPYTHON-94) Calling APR optional functions provided by mod_ssl
[ http://issues.apache.org/jira/browse/MODPYTHON-94?page=all ] Deron Meranda updated MODPYTHON-94: --- Attachment: modpython4.tex.patch This is a documentation patch which goes with the previously attached code patch. Made against 3.2.5b. Calling APR optional functions provided by mod_ssl -- Key: MODPYTHON-94 URL: http://issues.apache.org/jira/browse/MODPYTHON-94 Project: mod_python Type: New Feature Components: core Versions: 3.2 Environment: Apache 2 Reporter: Deron Meranda Attachments: modpython4.tex.patch, requestobject.c.patch mod_python is not able to invoke APR Optional Functions. There are some cases, however, where this could be of great benefit. For example, consider writing an authentication or authorization handler which needs to determine SSL properties (even if just to answer the simple question: is the connection SSL encrypted?). The normal way of looking in the subprocess_env for SSL_* variables does not work in those early handler phases because those variables are not set until the fixup phase. The mod_ssl module, though, does provide both ssl_is_https() and ssl_var_lookup() optional functions which can be used in earlier phases. For example, look at how mod_rewrite calls those, using the APR_DECLARE_OPTIONAL_FN and APR_RETRIEVE_OPTIONAL_FN macros. I can see how it might be very hard to support optional functions in general because of the C type linkage issue, but perhaps a select few could be coded directly into mod_python. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [mod_python] ANNOUNCE: Mod_python 3.2.5 Beta
Grisha, Speaking of 3.2.5 beta, how long do we wait before it becomes final? Jim
Re: [jira] Commented: (MODPYTHON-93) Improve util.FieldStorage efficiency
Gregory (Grisha) Trubetskoy wrote: Having looked at the FieldStorage code, I'm guessing the idea was that you parse fields as they come in and append them to a list. This preserves the original order of fields, in case it is needed. I assumed that as well, but I'm not sure getting the fields in a particular order is the most common use case - not for me anyway. Plus, I'm not suggesting getting rid of access to FieldStorage.list. For the dict-like methods such as fs.keys(), I don't think people should expect keys in any particular order. If they really want that feature then they can subclass or write their own ordered dict class. I'm not sure that maintaining a dictionary alongside the list is the right thing to do. It might be, but there are some difficult questions to answer - e.g. how costly is a sequential search, and is the code complexity (and fieldstorage code is no picnic to read as it is) worth the speedup? The current code is a little hairy - it helps to drink a cup of strong coffee before reading it. We can always hide the complexity in a separate add_field method. As to the performance tradeoffs, I guess we benchmark and see? I love doing benchmarks. The 2 things that bring joy to my heart are benchmarks and unit tests. And good documentation. The *3* things that bring joy to my heart are... well you get the idea. Also while it would speed up retrieval, it will slow down the write operation - when a field is added to fieldstorage you now need to append it to the list, AND check whether it exists in the dictionary, then add it there as well. How often do developers access form fields via __getitem__? I noticed the publisher does not use it - it iterates the list, so nothing would be gained there. For myself, I use it (almost?) exclusively as a dict. As for the use in publisher, maybe that implementation needs to be examined as well. ;) Also, something else to consider - is there a simple programmatic solution that could be documented, e.g.
something like my_fs = util.FieldStorage(req); dict_fs = {}; dict_fs.update(my_fs) [have no idea whether this will work :-)] Nope. If you have multiple fields with the same name you'll lose all but the last field. (e.g. the checkbox example on the mod_python list that got me started on this in the first place). and voila - you've got a dictionary based fieldstorage? Except that FieldStorage is already supposed to be dict-like, so why would I want to duplicate the effort in my code? For example, 7 out of 10 of the fs methods are there to support dict-like behaviour and the other 3 are initialization helpers which will never be called by user code anyway. To me, that makes it a dictionary. I'm not talking about adding new functionality to FieldStorage, just examining the current implementation with respect to performance. Anyway, just a few cents from me. I don't want you to think I'm hung up on this issue. It just seems to me that the goal of mod_python moving forward should be stability, speed, and efficiency while keeping feature creep to a minimum. I think it's worthwhile to examine existing code as we go along to see if we are meeting these goals. We still need to have *some* code to chew on every now and then after all. :) Jim
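The duplicate-name concern can be illustrated without mod_python at all. The sketch below assumes the submitted fields are available as plain (name, value) pairs; the sample data and the helper name flatten are made up for illustration, and the pairs are only a stand-in for a real util.FieldStorage.

```python
def flatten(pairs):
    """Copy (name, value) pairs into a plain dict, one value per name."""
    flat = {}
    for name, value in pairs:
        flat[name] = value        # a later value silently replaces an earlier one
    return flat


# Hypothetical form submission: two checkboxes share the name "colour".
pairs = [("colour", "red"), ("colour", "green"), ("name", "Jim")]

flat = flatten(pairs)
print(flat)                       # {'colour': 'green', 'name': 'Jim'}
print(len(pairs) - len(flat))     # 1 value was dropped
```

Any copy that stores a single value per key has this property, which is why the container has to keep a list of values for repeated names.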
Re: [jira] Commented: (MODPYTHON-93) Improve util.FieldStorage efficiency
If you provide say FieldStorage.make_dict that returns a dictionary, then I don't see why the order of the keys is important when the original list is still available. Nick Nicolas Lehuen wrote: Hi, Speaking of ordered dictionary : http://www.voidspace.org.uk/python/weblog/arch_d7_2005_11_19.shtml#e140 Why is the ordering so important ? I do understand we need to support multiple values per field name, but I don't see why ordering is needed. Regards, Nicolas
Re: [mod_python] ANNOUNCE: Mod_python 3.2.5 Beta
Gregory (Grisha) Trubetskoy wrote: A couple of weeks perhaps? I don't think the final can happen before ApacheCon without feeling rushed anyway, so we could target second half of December? Sounds good. But are you not at least a little bit tempted to have it ready to go so that you can make the official announcement at ApacheCon? Jim On Mon, 28 Nov 2005, Jim Gallacher wrote: Grisha, Speaking of 3.2.5 beta, how long do we wait before it becomes final? Jim
Re: [jira] Commented: (MODPYTHON-93) Improve util.FieldStorage efficiency
Gregory (Grisha) Trubetskoy wrote: On Mon, 28 Nov 2005, Nicolas Lehuen wrote: Why is the ordering so important ? I do understand we need to support multiple values per field name, but I don't see why ordering is needed. I think that it may be dictated by some RFC (the stdlib does it this way too), I'm not sure, but it's a good question though, it'd be great to have it researched and answered so that we don't have to go over this point again. Grisha Ordering is not defined according to my interpretation. But at the same time we shouldn't mess with the ordering. Gotta love those RFCs. :) http://www.ietf.org/rfc/rfc2388.txt?number=2388 Returning Values from Forms: multipart/form-data 5.5 Ordered fields and duplicated field names The relationship of the ordering of fields within a form and the ordering of returned values within multipart/form-data is not defined by this specification, nor is the handling of the case where a form has multiple fields with the same name. While HTML-based forms may send back results in the order received, and intermediaries should not reorder the results, there are some systems which might not define a natural order for form fields. Jim
[jira] Commented: (MODPYTHON-94) Calling APR optional functions provided by mod_ssl
[ http://issues.apache.org/jira/browse/MODPYTHON-94?page=comments#action_12358754 ] David Fraser commented on MODPYTHON-94: --- I wonder whether it would be possible to use a module like ctypes to connect to Apache functions. That way the linking issue is not a problem... Calling APR optional functions provided by mod_ssl -- Key: MODPYTHON-94 URL: http://issues.apache.org/jira/browse/MODPYTHON-94
Re: [jira] Commented: (MODPYTHON-93) Improve util.FieldStorage efficiency
2005/11/29, Nicolas Lehuen: 2005/11/29, Mike Looijmans: Nicolas Lehuen wrote: Why is the ordering so important ? I do understand we need to support multiple values per field name, but I don't see why ordering is needed. Because there are applications out there that will break if you change it. Besides that, think about a form with a few text entry boxes (all the same name, e.g. spouse). It would be very confusing for a user of that page to see the text re-ordered every time he clicks one of the buttons on the page. (I'm perfectly aware of at least 4 alternatives, but that is not my point here). From the page code I've written and seen so far, the order of differently named fields is not important. I haven't seen a case where a form expecting a=1&b=2 would fail if you pass it b=2&a=1. But I have seen cases where a=1&x=2&x=3 is not the same as a=1&x=3&x=2. The simple dictionary implementation as proposed would not break that code. -- Mike Looijmans Philips Natlab / Topic Automation Hi Mike, As Jim pointed out, even if using a simple dict structure would not enable us to preserve the *key* ordering, it would still allow us to preserve the *value* ordering for fields with the same key, since they would be added to lists in the same order the browser would send them. So my guess is that preserving key order is not required, but preserving value order for a given key is. In that case a simple dict with list values is sufficient, easy to implement and efficient. Your examples are easily handled with this solution. I think we should check how this problem has been solved in other programming environments. I'll check how this was done in the Java servlet API. Well, the Java Servlets 2.4 API doesn't say anything about field order (see javax.servlet.ServletRequest.getParameterNames() or page 35 in servlet-2_4-fr-spec.pdf).
It turns out this problem was raised in the antique JServ project : http://archive.apache.org/gnats/5211 One proposal was to use an OrderedHashtable. As you can see, the reply was quite definite : No. There is nothing in the spec that says that these parameters should be in any sort of order. CGI scripts that expect them to be in order are coded incorrectly IMHO. Regards, Nicolas
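The "simple dict with list values" idea can be sketched in a few lines using only the standard library. This is an illustration under stated assumptions rather than a proposed patch: group_fields is a hypothetical helper name, and the query strings are the three-field examples from the discussion above, parsed with urllib.parse.parse_qsl instead of mod_python.

```python
from urllib.parse import parse_qsl


def group_fields(query):
    """Map each field name to the list of its values, in the order received."""
    grouped = {}
    for name, value in parse_qsl(query):
        # setdefault creates the list on first sight of a name;
        # append keeps same-named values in arrival order.
        grouped.setdefault(name, []).append(value)
    return grouped


print(group_fields("a=1&x=2&x=3"))    # {'a': ['1'], 'x': ['2', '3']}
print(group_fields("a=1&x=3&x=2"))    # {'a': ['1'], 'x': ['3', '2']}

# Key order does not affect equality, value order within a key does.
print(group_fields("a=1&b=2") == group_fields("b=2&a=1"))          # True
print(group_fields("a=1&x=2&x=3") == group_fields("a=1&x=3&x=2"))  # False
```

Differently named fields compare equal regardless of arrival order, while repeated names keep their relative order, which matches the distinction drawn between key order and value order.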
Re: [jira] Commented: (MODPYTHON-93) Improve util.FieldStorage efficiency
Nicolas Lehuen wrote: One proposal was to use an OrderedHashtable. As you can see, the reply was quite definite : No. There is nothing in the spec that says that these parameters should be in any sort of order. CGI scripts that expect them to be in order are coded incorrectly IMHO. Standing on a soap box preaching like that may work in the Java world, but we Python programmers have a more realistic view of the world. If we don't want to be crushed by a stampede of angry web developers, I guess we should, at least, provide the means to: - Iterate through the arguments as listed. - Get multi-fields into an array in the order as listed. - Provide fast, dictionary-like access. Hey! this is Python. We can do all that, without losing anything. Let them poor Java folks preach about hell and doom, while we just get on with our lives and build something that meets everybody's expectations. How about we make the first call to get or __getitem__ create the dictionary? We could put code in __getattr__ to create it when it's referenced. Patch is on its way... -- Mike Looijmans Philips Natlab / Topic Automation
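Until the patch arrives, here is a rough sketch of the lazy approach, building the dictionary the first time it is referenced via __getattr__. Everything in it is an assumption for illustration: LazyFieldStorage, Field, add_field and the index attribute are hypothetical names, not mod_python's util.FieldStorage, and __getitem__ always returns a list of values to keep the example short.

```python
from collections import namedtuple

# Minimal stand-in for a parsed form field.
Field = namedtuple("Field", ["name", "value"])


class LazyFieldStorage:
    """Hypothetical sketch: ordered list plus a dict index built on demand."""

    def __init__(self):
        self.list = []                        # fields in the order received

    def add_field(self, name, value):
        self.list.append(Field(name, value))
        self.__dict__.pop("index", None)      # discard a stale index, if any

    def __getattr__(self, attr):
        # Only called when normal lookup fails, i.e. the first time
        # self.index is referenced after creation or invalidation.
        if attr != "index":
            raise AttributeError(attr)
        index = {}
        for field in self.list:
            index.setdefault(field.name, []).append(field)
        self.index = index                    # cache for later lookups
        return index

    def __getitem__(self, name):
        return [field.value for field in self.index[name]]

    def get(self, name, default=None):
        try:
            return self[name]
        except KeyError:
            return default


fs = LazyFieldStorage()
for name, value in [("a", "1"), ("x", "2"), ("x", "3")]:
    fs.add_field(name, value)

print("index" in vars(fs))            # False: nothing built yet
print(fs["x"])                        # ['2', '3']
print("index" in vars(fs))            # True: built on first access
print([f.name for f in fs.list])      # ['a', 'x', 'x']
```

Code that only iterates fs.list never pays for the index, and appending a field costs one list append plus a cheap invalidation, so the write path stays close to what it is today.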