[jira] Closed: (MODPYTHON-97) mod_python.publisher iterables and content_type broken

Graham Dumpleton (JIRA) Sat, 04 Mar 2006 21:57:01 -0800

     [ http://issues.apache.org/jira/browse/MODPYTHON-97?page=all ]
     
Graham Dumpleton closed MODPYTHON-97:
-------------------------------------



> mod_python.publisher iterables and content_type broken
> ------------------------------------------------------
>
>          Key: MODPYTHON-97
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-97
>      Project: mod_python
>         Type: Bug
>   Components: publisher
>     Versions: 3.2.7
>     Reporter: Graham Dumpleton
>     Assignee: Nicolas Lehuen
>      Fix For: 3.2.7

>
> In 3.2, mod_python.publisher was modified so that if it encountered an 
> iterable it would recursively iterate over the items and publish each, with 
> the results being concatenated.
> FWIW, I personally didn't like this as I saw it potentially changing the 
> behaviour of existing code, although perhaps in contrived cases or for test 
> code only. I saw that this sort of behaviour should have been managed by the 
> user by explicit use of a wrapper class instead, rather than it being magic. 
> End of ramble. :-)
> Regardless of my concerns, the behaviour that was added is broken. 
> Specifically, mod_python.publisher is setting the content type based on the 
> content of the first item returned from the iterable. For example, consider 
> the following:
> index = [
>   '<html><body><p>',
>   1000 * "X",
>   '</p></body></html>',
> ]
> When published, this is resulting in the content type being set to 
> 'text/plain' and not 'text/html'. In part this is perhaps caused by the fact 
> that the content type check is now performed by looking for a trailing 
> '</html>' in the content whereas previously it would look for a leading 
> '<html>'. This was changed because all the HTML prologue that can appear 
> before '<html>' would often throw out this check, with the HTML not being 
> automatically detected. Thus at the time it was thought that looking 
> for the trailing '</html>' would be more reliable. It ain't going to help to 
> go back to using a leading '<html>' check though as the first item may only 
> contain the prologue and not '<html>'.
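
To make the failure mode concrete, here is a small sketch that applies both heuristics to the first item only. The regular expressions and the names guess_leading and guess_trailing are illustrative assumptions, not the checks mod_python.publisher actually uses.

```python
import re

# Hypothetical stand-ins for the two heuristics described above; these are
# NOT the real mod_python.publisher checks.
_LEADING_HTML = re.compile(r"^\s*<html", re.IGNORECASE)
_TRAILING_HTML = re.compile(r"</html\s*>\s*$", re.IGNORECASE)

def guess_leading(text):
    """Guess the content type by looking for a leading '<html>'."""
    return "text/html" if _LEADING_HTML.search(text) else "text/plain"

def guess_trailing(text):
    """Guess the content type by looking for a trailing '</html>'."""
    return "text/html" if _TRAILING_HTML.search(text) else "text/plain"

# The example from the report.
items = ['<html><body><p>', 1000 * "X", '</p></body></html>']

# A variant whose first item holds only the prologue.
prologue_first = [
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">\n',
    '<html><body><p>',
    'X',
    '</p></body></html>',
]

# Trailing check on the first item alone: no '</html>' there yet.
trailing_on_first = guess_trailing(items[0])

# Leading check rescues this case, but not the prologue-first variant.
leading_on_first = guess_leading(items[0])
leading_on_prologue = guess_leading(prologue_first[0])

# Only the complete output is classified reliably by the trailing check.
trailing_on_whole = guess_trailing("".join(items))
```

Under these assumptions, neither check is reliable when it only sees the first item; the trailing check succeeds only once the whole output is available.
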
> These checks are only going to work for iterables if the results of 
> publishing of each item were added to the end of a list of strings, rather 
> than being written back immediately using req.write(). Once all that has been 
> returned by the iterable is obtained, this can all be joined back together 
> and then the HTML check done.
> Joining all the separate items returned from the iterable back together 
> defeats the purpose of what this feature was about in the first place and may 
> result in huge in memory objects needing to be created to hold the combined 
> result just so the HTML check can be done.
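
A rough sketch of the buffering approach described above, and of what it costs, might look like this. FakeRequest is a hypothetical stand-in for the mod_python request object, and publish_buffered and its trailing '</html>' test are assumptions made for illustration only.

```python
class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.content_type = None
        self.body = []

    def write(self, data):
        self.body.append(data)

def publish_buffered(req, iterable):
    """Collect every item first, then run the HTML check on the joined result.

    Returns the size of the combined string, which is the extra memory the
    check forces us to hold before anything is written.
    """
    parts = [str(item) for item in iterable]
    result = "".join(parts)  # the whole response now lives in memory
    if result.rstrip().lower().endswith("</html>"):
        req.content_type = "text/html"
    else:
        req.content_type = "text/plain"
    req.write(result)
    return len(result)

items = ['<html><body><p>', 1000 * "X", '</p></body></html>']
req = FakeRequest()
buffered_size = publish_buffered(req, items)
```

The content type comes out right, but only because nothing was written until the entire result had been assembled, which is exactly the cost the paragraph above objects to.
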
> The only way to avoid the problem is for the content type to be set 
> explicitly by the user before the iterable is processed. This is a bit tricky 
> as it is mod_python.publisher which is automagically doing this. The best you 
> can do is something like:
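> class SetContentType:
>   # Callable placed first in the iterable: sets the content type and
>   # contributes an empty string to the published output.
>   def __init__(self,content_type):
>     self.__content_type = content_type
>   def __call__(self,req):
>     req.content_type = self.__content_type
>     return ""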
> index = [
>   SetContentType('text/html'),
>   '<html><body><p>',
>   1000 * "X",
>   '</p></body></html>',
> ]
> Once you start doing this, the user may as well have provided their own 
> published function in the first place that set the content type, manually 
> iterated over the items and wrote each one out using req.write(). This could 
> also be managed by a user specified wrapper class, which is how I saw this as 
> preferably being done in the first place. I.e.,
> class PublishIterable:
>   def __init__(self,value,content_type):
>     self.__value = value
>     self.__content_type = content_type
>   def __call__(self,req):
>     req.content_type = self.__content_type
>     for item in self.__value:
>       req.write(item)
> _values = [
>   '<html><body><p>',
>   1000 * "X",
>   '</p></body></html>',
> ]
> index = PublishIterable(_values,'text/html')
> Personally I believe this automagic publishing of iterables should be removed 
> from mod_python.publisher. You might still provide a special function/wrapper 
> that works like PublishIterable but handles recursive structures and 
> callable objects in the process, but I feel it should be up to the user to 
> make a conscious effort to use it and mod_python.publisher shouldn't assume 
> that it should process any iterable in this way automatically.
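
As a minimal sketch of what such an opt-in wrapper could look like: the class name PublishRecursive, the helper _emit and the FakeRequest stub are all hypothetical, and the rules for callables and non-string values are assumptions rather than anything mod_python defines.

```python
class FakeRequest:
    """Hypothetical stand-in for the mod_python request object."""

    def __init__(self):
        self.content_type = None
        self.body = []

    def write(self, data):
        self.body.append(data)

class PublishRecursive:
    """Explicit wrapper: the user chooses it and states the content type."""

    def __init__(self, value, content_type):
        self._value = value
        self._content_type = content_type

    def __call__(self, req):
        # Content type is set up front, so no guessing from the output.
        req.content_type = self._content_type
        self._emit(req, self._value)
        return ""

    def _emit(self, req, obj):
        if callable(obj):
            # Assumption: callables take the request and return more content.
            self._emit(req, obj(req))
        elif obj is None:
            return
        elif isinstance(obj, str):
            req.write(obj)
        else:
            try:
                children = iter(obj)
            except TypeError:
                # Not iterable: write its string form.
                req.write(str(obj))
                return
            for child in children:
                self._emit(req, child)

_values = [
    '<html><body>',
    ['<p>', lambda req: 'count: ', 3, ('</p>',)],
    '</body></html>',
]
index = PublishRecursive(_values, 'text/html')

req = FakeRequest()
index(req)
output = "".join(req.body)
```

Each piece is written as soon as it is reached, so nothing has to be joined in memory, and the behaviour applies only to objects the user has deliberately wrapped.
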
