Hi there,
I've spent the last few days trying to get an input filter to work in
mod_python, in the context of requests that are being reverse proxied to
an app-server by mod_proxy. I've also tried mod_rewrite.
The input filter works fine if mod_proxy or mod_rewrite are *not* in
play and things are just posted into the void. When posting to an
appserver, the symptoms for mod_proxy and mod_rewrite appear to be very
similar.
My apologies in advance for the information dump... I hope someone can
at least give me a clue as to where this problem may lie or what
further actions to pursue. My 'fix' to mod_python might also be worth
considering, though I have no idea why it works.
I ran into a host of problems that I think have something to do with the
underlying Apache, but may also have something to do with mod_python itself.
Symptoms on Apache 2.0.54, using mod_python 3.1.3
-------------------------------------------------
* Input filters appear to be unreliable when they change the data (that
  is, write out something other than what came in): Apache hangs, or the
  filter gets called endlessly within a single request. Subrequests do
  not appear to be in play, though; at least I can't detect any using
  req.main.
* Even when the data is not changed, filtering sometimes hangs.
* The endless calls can be suppressed by calling .disable() on the
  filter once it is done, but then the system hangs instead.
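
For reference, a filter of the kind described above might look roughly
like the sketch below. It is not the actual test code: the string
replacement is just an arbitrary example of changing the data, and
FakeFilter is a hypothetical stand-in so the function can be run without
Apache. It assumes the filter object's read(), write(), close() and
disable() methods behave as in the mod_python documentation, with read()
returning None at the end of the stream.

```python
class FakeFilter(object):
    """Hypothetical test double mimicking part of the mod_python filter API."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.written = []
        self.closed = False
        self.disabled = False

    def read(self):
        # Assumption: like mod_python, return data, or None at end of stream.
        if self._chunks:
            return self._chunks.pop(0)
        return None

    def write(self, data):
        self.written.append(data)

    def close(self):
        self.closed = True

    def disable(self):
        self.disabled = True


def inputfilter(filter):
    """Rewrite the request body, then close and disable the filter."""
    data = filter.read()
    while data:
        # Write out something other than what came in (length changes too).
        filter.write(data.replace("foo", "foobar"))
        data = filter.read()
    if data is None:
        # End of stream: close, then disable so the filter is not re-entered.
        filter.close()
        filter.disable()
```

Under mod_python such a function would be registered with
PythonInputFilter and receive a real filter object instead of
FakeFilter.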
Apache 2.0.54, latest svn mod_python
------------------------------------
* same story as before.
Latest Apache 2.0.x svn as well as svn mod_python
-------------------------------------------------
* Infinite calls still occur, and
* hangs still occur, but
* both disappear and everything works as expected (apart from a message
  in the error log) if an exception is raised inside the filter code!
I tracked this down to lib/python/mod_python/apache.py, where
FilterDispatch() contains the following section:
    if object:
        # call the object
        if config.has_key("PythonEnablePdb"):
            pdb.runcall(object, filter)
        else:
            object(filter)
        # always flush the filter. without a FLUSH or EOS bucket,
        # the content is never written to the network.
        # XXX an alternative is to tell the user to flush() always
        filter.flush()
The hangs and the repeated calls seem to be triggered when
filter.flush() is called. If instead I put in a line:
    return OK
so that 'filter.flush()' is never reached (as is the case when the
exception is raised), everything appears to work. Unfortunately the same
trick doesn't work on Apache 2.0.54 (even if I use the disable() trick).
Does this mean that filter.flush() is buggy when mod_proxy or
mod_rewrite are in effect? I don't know, but I thought I'd report it
here. Perhaps it's also a problem to do with input filters in
particular? The comment talks about needing to flush to make sure things
are sent to the network, but that comment makes more sense for output
filters than input filters (even though mod_proxy in turn sends stuff to
the network again).
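
To make that question concrete, here is a minimal sketch of a dispatch
step that only flushes output filters. This is not the mod_python code
and not the change described above (which simply returns before the
flush in all cases); it is one possible variant, untested against
Apache. dispatch() and StubFilter are hypothetical names, OK is a local
stand-in for apache.OK, and the only real API assumed is the filter
object's is_input attribute and flush() method.

```python
OK = 0  # local stand-in for mod_python's apache.OK


class StubFilter(object):
    """Hypothetical stand-in that records whether flush() was called."""

    def __init__(self, is_input):
        self.is_input = is_input
        self.flushed = False

    def flush(self):
        self.flushed = True


def dispatch(handler, filter):
    """Call the filter handler; flush afterwards only for output filters."""
    handler(filter)
    if filter.is_input:
        # Assumption under test: input filters do not need the final flush.
        return OK
    # Output filters still get the unconditional flush from the original code.
    filter.flush()
    return OK
```

With this variant an input filter returns without flush() being called,
while output filters keep the existing behaviour.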
Apache change hunt
------------------
Trying to figure out what in Apache itself might have changed, I found
that the bleeding-edge Apache 2.0.x branch contains a patch that seems
to have something to do with this. More on this Apache patch is here:
http://svn.apache.org/viewcvs.cgi/httpd/httpd/branches/2.0.x/CHANGES?rev=233302&view=markup
in particular:
    *) proxy HTTP: Rework the handling of request bodies to handle
       chunked input and input filters which modify content length, and
       avoid spooling arbitrary-sized request bodies in memory.
       PR 15859. [Jeff Trawick]
Things also seem to work when mod_rewrite is used instead of mod_proxy,
though, so perhaps this patch isn't the cause and it's something deeper
inside Apache that changed...
...
Feel free to ask me more questions; I can do more testing if you like. I
can also post the test code if people are interested. Of course ideally
I'd make all of this work on a released version of Apache with a
released version of mod_python, but I'll take any hint I can get. :)
Regards,
Martijn