At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:
> Suppose I have an HTML template file, and I want to use a sub request.
> 
> ...
> ${subrequest('/header/')}
> ...
> 
> The problem with this code is that, since Mako will buffer all generated
> content, the resulting response body will contain incorrect data.
> 
> It will first contain the response body generated by the sub request,
> then the content generated from the Mako template (XXX I have not
> checked this, but I think that is how it works).

Okay, I'm confused even more now. It seems to me like what you've just described is something that's fundamentally broken, even if you're not using WSGI at all.


> So, when executing a sub request, it is necessary to flush (that is,
> send to Nginx, in my case) the content generated from the template
> before the sub request is done.

This only makes sense if you're saying that the subrequest *has to* send its output directly to the client, rather than to the parent request. If the subrequest sends its output to the parent request (as a sane implementation would), then there is no problem. Likewise if the subrequest's output goes to a buffer that's then inserted into the parent invocation.
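To make the "sane implementation" concrete, here is a minimal sketch of a subrequest whose output is captured into a buffer and handed back to the parent, rather than written to the client. The names `subrequest` and `header_app` are hypothetical, and it assumes the `(status, headers, body)` app signature discussed in this thread:

```python
from io import BytesIO

def header_app(environ):
    # A hypothetical sub-application: returns (status, headers, body).
    return "200 OK", [("Content-Type", "text/html")], [b"<h1>Header</h1>"]

def subrequest(app, path):
    # Run the sub-application and capture its body into a buffer,
    # instead of letting it write straight to the client.
    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    status, headers, body = app(environ)
    buf = BytesIO()
    for chunk in body:
        buf.write(chunk)
    return buf.getvalue()

# The parent request inserts the captured output into its own response:
page = b"<html>" + subrequest(header_app, "/header/") + b"<p>body</p></html>"
```

With this structure there is nothing to flush early: the parent owns the one output channel, and the subrequest never touches it.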

Anything else seems utterly insane to me, unless you're basically taking a bunch of legacy CGI code using 'print' statements and hacking it into something else. (Which is still insane, just differently. ;-) )


> Ah, you are right, sorry.
> But this is not required for the Mako example (I was focusing on that
> example).

As far as I can tell, that example is horribly wrong.  ;-)


> But when using the greenlet middleware, and when using the function for
> flushing the Mako buffer, some data will be yielded *before* the
> application returns and the status and headers are passed to Nginx.

And that's probably because sharing a single output channel between the parent and child requests is a bad idea. ;-)

(Specifically, it's an increase in "temporal coupling", I believe. I know it's some kind of coupling between functions that's considered bad, I just don't remember if that's the correct name for it.)


> > This is also a good time for people to learn that generators are usually
> > a *very bad* way to write WSGI apps

> It's the only way to be able to suspend execution, when the WSGI
> implementation is embedded in an async web server not written in Python.

It's true that dropping start_response() means you can't yield empty strings prior to determining your headers, yes.


> > - yielding is for server push or
> > sending blocks of large files, not tiny strings.

> Again, consider the use of sub requests.
> Yielding a "not large" block is the only choice you have.

No, it isn't. You can buffer your output and yield empty strings until you're ready to flush.
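A minimal sketch of that buffer-and-flush pattern under WSGI 1: the app accumulates its output and yields empty strings (which a compliant server passes through without sending anything) until it is ready to flush the whole thing as one block. The template pieces here are just placeholders:

```python
def buffered_app(environ, start_response):
    buffer = []

    def render():
        # Imagine a template engine appending pieces here as it runs.
        for piece in ("<html>", "<body>", "hello", "</body>", "</html>"):
            buffer.append(piece.encode("utf-8"))
            yield b""  # not ready to flush yet: yield an empty string

    start_response("200 OK", [("Content-Type", "text/html")])
    for empty in render():
        yield empty
    # Now flush the accumulated output as a single block.
    yield b"".join(buffer)
```

The empty yields keep control returning to the server (so an async server can service other requests) without committing any body bytes prematurely.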



> Unless, of course, you implement sub request support in pure Python (or
> using SSI - Server Side Include).

I don't see why it has to be "pure", actually. It's just that the subrequest needs to send data to the invoker rather than sending it straight to the client.

That's the bit that's crazy in your example -- it's not a scenario that WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do it to be a bug, not a feature. ;-)

That being said, I can see that removing start_response() closes a loophole that allows async apps to *potentially* exist under WSGI 1 (as long as you were able to tolerate the resulting crappy API).

However, to fix that crappy API requires greenlets or threads, at which point you might as well just use WSGI 2. In the Nginx case, you can either do WSGI 1 in C and then use an adapter to provide WSGI 2, or you can expose your C API to Python and write a small greenlets-using Python wrapper to support suspending. It would look something like:

    def gateway(request_info, app):
        # set up environ
        run(greenlet(lambda: Finished(app(environ))))

    def run(child):
        while not child.dead:
            data = child.switch()
            if isinstance(data, Finished):
                send_status(data.status)
                send_headers(data.headers)
                send_response(data.response)
            else:
                perform_appropriate_action_on(data)
                if data.suspend:
                    # arrange for run(child) to be re-called later, then...
                    return

Suspension now works by switching back to the parent greenlet with command objects (like Finished()) to tell the run() loop what to do. The run() loop is not stateful, so when the task is unsuspended, you simply call run(child) again.

A similar structure would exist for send_response() - i.e., it's a loop over the response, can break out of the loop if it needs to suspend, and arranges for itself to be re-called at the appropriate time.
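Such a resumable loop can be sketched without greenlets at all, since the iterator itself carries all the state. In this stdlib-only sketch, `ready_to_send` and `schedule` are stand-ins for the server's C-side machinery (a writability check and an event-loop callback registry); suspending is just `return`, and resuming is calling the function again with the same iterator:

```python
def send_response(body_iter, out, ready_to_send, schedule, pending=None):
    # Loop over the body iterator, sending each chunk; `pending` holds a
    # chunk already pulled from the iterator but not yet sent.
    while True:
        if pending is None:
            pending = next(body_iter, None)
            if pending is None:
                return  # body exhausted: response complete
        if not ready_to_send():
            # Would block: arrange for ourselves to be re-called later
            # with the same iterator and the unsent chunk, then suspend.
            schedule(lambda p=pending: send_response(
                body_iter, out, ready_to_send, schedule, p))
            return
        out.append(pending)
        pending = None
```

Note that the unsent chunk is carried across the suspension explicitly, so nothing is lost when the socket would block mid-response.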

Voila - you now have asynchronous WSGI 2 support.

Now, whether you actually *want* to do that is a separate question, but as (I hope) you can see, you definitely *can* do it, and without needing any greenlet-using code to be in C. From C, you just call back into one of the Python top-level loops (run() and send_response()), which then does the appropriate task switching.


> Another use case is when you have a very large page, and you want to
> return some data as soon as possible, to keep the user from aborting
> the request if it takes some time.

That's the server push case -- but of course that's not a problem even in WSGI 2, since the "response" can still be a generator.
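Under the `app(environ) -> (status, headers, body)` signature assumed in this thread, that just means the body is a generator that yields each block as it becomes available, so the server can flush the first blocks while the rest are still being produced. A sketch:

```python
def push_app(environ):
    def body():
        # Each yielded block can be flushed to the client immediately,
        # so the user sees output before the page is complete.
        yield b"<html><body>"
        for i in range(3):
            yield ("<p>chunk %d</p>" % i).encode("utf-8")
        yield b"</body></html>"
    return "200 OK", [("Content-Type", "text/html")], body()
```

The status and headers are known up front; only the body trickles out incrementally.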


> Also, note that with Nginx (as with Apache, if I'm not wrong), even if
> the application yields small strings, the server can still do some
> buffering in order to increase performance.

In which case, it's in violation of the WSGI spec. The spec requires separately-yielded strings to be flushed to OS-level buffering.


> What do you mean by absence of generator support?
> WSGI 2 applications can still return a generator.

Yes - but they can't *be* a generator - previously they could, due to the separate start_response callable.
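The distinction can be shown side by side. Under WSGI 1 the application callable itself can be a generator, because the status and headers travel through the separate start_response callable rather than the return value; under the `(status, headers, body)` form discussed in this thread, they travel in the return value, so the app must be an ordinary function, though its returned body may still be a generator:

```python
# WSGI 1: the application *is* a generator; headers go via start_response,
# which is called when the generator is first iterated.
def wsgi1_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello, "
    yield b"world"

# WSGI 2 (as discussed in this thread): the app is a plain function that
# *returns* a generator as the body, alongside the status and headers.
def wsgi2_app(environ):
    def body():
        yield b"hello, "
        yield b"world"
    return "200 OK", [("Content-Type", "text/plain")], body()
```

The second form cannot delay the status/headers decision past the point of returning, which is exactly the loophole being closed.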


_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
