Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Graham Dumpleton
On 09/10/2007, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 08:23 AM 10/9/2007 +1000, Graham Dumpleton wrote:
> >On 09/10/2007, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
> > > > >Phillip J. Eby wrote:
> > > > > [...]
> > > > >
> > > > > I don't think there's any point to having a WSGI extension for If-*
> > > > > header support.
> > > >
> > > >I have just found that the WSGI spec says:
> > > >"""...it should be clear that a server may handle cache validation via
> > > >the If-None-Match and If-Modified-Since request headers and the
> > > >Last-Modified and ETag response headers."""
> > > >
> > > >
> > > >So a WSGI implementation is *allowed* to perform cache validation, but
> > > >it is not clear *how* this should be done.
> > > >
> > > >As an example, without the need of an extension, the start_response
> > > >callable may check if Last-Modified or ETag is in the headers.
> > > >In this case, it may perform a cache validation, and if the client
> > > >representation is fresh, it may omit to send the body.
> > > >
> > > >However there are two problems here:
> > > >1) It is not clear if WSGI explicitly allows an implementation to skip
> > > > the iteration over the app_iter object, for optimization purpose
> > > >2) For a WSGI implementation embedded in an existing webserver, the
> > > > most convenient method to perform cache validation is to let the
> > > > server do it; however this requires to send the headers as soon as
> > > > start_response is called, and this is not allowed.
> > >
> > > The only time that the headers can be changed is if there is an error
> > > during the generation of the body content.  So, the fact that
> > > send_headers() is called with a matching ETag or Last-Modified, is
> > > sufficient to determine that the request may be handled using a cache.
> > >
> > > You are correct that the PEP does not explicitly allow the iteration
> > > to be skipped.  My thought is that it should indeed allow it, as long
> > > as the close() method (if any) is still called, and so long as the
> > > request method was a GET.
> >
> >Why only a GET?
> >
> >Just showing my ignorance here and would like it explained. :-)
>
> Since GET is supposed to be side effect-free, skipping the
> calculation of the response body (by not iterating over it) is less
> likely to cause a problem than with another request method.  I guess
> HEAD would be safe, too.

Except that, given the way people use query strings on a GET instead
of a POST with form data in the body, the fact that a GET can
technically also have a content body, and how people in general abuse
the method type, that probably often isn't the case. This is why I was
querying the distinction: I'm not sure one can really say GET is
different from other methods unless the HTTP specifications say as
much, at least in relation to caching. Caching is an area I have never
really looked at, so I don't know what the specifications say, and this
could all be irrelevant. :-)

Graham

> If we were just now defining WSGI 1.0, I would let it be any method
> and explicitly document that servers doing cache validation or
> processing a HEAD method can skip iteration of the body, so that you
> can get better performance.
>
> However, if we put this language into WSGI 1.0, I'm wary of breaking
> stuff that exists in the field; indeed it might be better just to say
> that it's up to the user to add middleware to do this, rather than
> trying to get a common standard for how servers should handle it.
>
>
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Phillip J. Eby
At 08:23 AM 10/9/2007 +1000, Graham Dumpleton wrote:
>On 09/10/2007, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
> > > >Phillip J. Eby wrote:
> > > > [...]
> > > >
> > > > I don't think there's any point to having a WSGI extension for If-*
> > > > header support.
> > >
> > >I have just found that the WSGI spec says:
> > >"""...it should be clear that a server may handle cache validation via
> > >the If-None-Match and If-Modified-Since request headers and the
> > >Last-Modified and ETag response headers."""
> > >
> > >
> > >So a WSGI implementation is *allowed* to perform cache validation, but
> > >it is not clear *how* this should be done.
> > >
> > >As an example, without the need of an extension, the start_response
> > >callable may check if Last-Modified or ETag is in the headers.
> > >In this case, it may perform a cache validation, and if the client
> > >representation is fresh, it may omit to send the body.
> > >
> > >However there are two problems here:
> > >1) It is not clear if WSGI explicitly allows an implementation to skip
> > > the iteration over the app_iter object, for optimization purpose
> > >2) For a WSGI implementation embedded in an existing webserver, the
> > > most convenient method to perform cache validation is to let the
> > > server do it; however this requires to send the headers as soon as
> > > start_response is called, and this is not allowed.
> >
> > The only time that the headers can be changed is if there is an error
> > during the generation of the body content.  So, the fact that
> > send_headers() is called with a matching ETag or Last-Modified, is
> > sufficient to determine that the request may be handled using a cache.
> >
> > You are correct that the PEP does not explicitly allow the iteration
> > to be skipped.  My thought is that it should indeed allow it, as long
> > as the close() method (if any) is still called, and so long as the
> > request method was a GET.
>
>Why only a GET?
>
>Just showing my ignorance here and would like it explained. :-)

Since GET is supposed to be side effect-free, skipping the 
calculation of the response body (by not iterating over it) is less 
likely to cause a problem than with another request method.  I guess 
HEAD would be safe, too.

If we were just now defining WSGI 1.0, I would let it be any method 
and explicitly document that servers doing cache validation or 
processing a HEAD method can skip iteration of the body, so that you 
can get better performance.

However, if we put this language into WSGI 1.0, I'm wary of breaking 
stuff that exists in the field; indeed it might be better just to say 
that it's up to the user to add middleware to do this, rather than 
trying to get a common standard for how servers should handle it.


Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Graham Dumpleton
On 09/10/2007, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
> > >Phillip J. Eby wrote:
> > > [...]
> > >
> > > I don't think there's any point to having a WSGI extension for If-*
> > > header support.
> >
> >I have just found that the WSGI spec says:
> >"""...it should be clear that a server may handle cache validation via
> >the If-None-Match and If-Modified-Since request headers and the
> >Last-Modified and ETag response headers."""
> >
> >
> >So a WSGI implementation is *allowed* to perform cache validation, but
> >it is not clear *how* this should be done.
> >
> >As an example, without the need of an extension, the start_response
> >callable may check if Last-Modified or ETag is in the headers.
> >In this case, it may perform a cache validation, and if the client
> >representation is fresh, it may omit to send the body.
> >
> >However there are two problems here:
> >1) It is not clear if WSGI explicitly allows an implementation to skip
> > the iteration over the app_iter object, for optimization purpose
> >2) For a WSGI implementation embedded in an existing webserver, the
> > most convenient method to perform cache validation is to let the
> > server do it; however this requires to send the headers as soon as
> > start_response is called, and this is not allowed.
>
> The only time that the headers can be changed is if there is an error
> during the generation of the body content.  So, the fact that
> send_headers() is called with a matching ETag or Last-Modified, is
> sufficient to determine that the request may be handled using a cache.
>
> You are correct that the PEP does not explicitly allow the iteration
> to be skipped.  My thought is that it should indeed allow it, as long
> as the close() method (if any) is still called, and so long as the
> request method was a GET.

Why only a GET?

Just showing my ignorance here and would like it explained. :-)

Graham

> With that clarification added to the existing spec, I think it should
> be possible to implement cache validation in a server.
>
> Hopefully, if anybody knows of a reason why this clarification should
> *not* be added to the spec, they will speak up now.  :)


Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Phillip J. Eby
At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby wrote:
> > [...]
> >
> > I don't think there's any point to having a WSGI extension for If-*
> > header support.
>
>I have just found that the WSGI spec says:
>"""...it should be clear that a server may handle cache validation via
>the If-None-Match and If-Modified-Since request headers and the
>Last-Modified and ETag response headers."""
>
>
>So a WSGI implementation is *allowed* to perform cache validation, but
>it is not clear *how* this should be done.
>
>As an example, without the need of an extension, the start_response
>callable may check if Last-Modified or ETag is in the headers.
>In this case, it may perform a cache validation, and if the client
>representation is fresh, it may omit to send the body.
>
>However there are two problems here:
>1) It is not clear if WSGI explicitly allows an implementation to skip
> the iteration over the app_iter object, for optimization purpose
>2) For a WSGI implementation embedded in an existing webserver, the
> most convenient method to perform cache validation is to let the
> server do it; however this requires to send the headers as soon as
> start_response is called, and this is not allowed.

The only time that the headers can be changed is if there is an error 
during the generation of the body content.  So, the fact that 
send_headers() is called with a matching ETag or Last-Modified, is 
sufficient to determine that the request may be handled using a cache.

You are correct that the PEP does not explicitly allow the iteration 
to be skipped.  My thought is that it should indeed allow it, as long 
as the close() method (if any) is still called, and so long as the 
request method was a GET.

With that clarification added to the existing spec, I think it should 
be possible to implement cache validation in a server.

Hopefully, if anybody knows of a reason why this clarification should 
*not* be added to the spec, they will speak up now.  :)
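
A minimal sketch of how that clarification could look from the server's side. The helper name `finish_response`, the `skip_body` flag (set by the server once cache validation has succeeded) and the toy `Body` class are assumptions made for illustration; none of them come from the PEP. The body is not iterated, but close() is still called if the iterable provides it:

```python
def finish_response(app_iter, write, skip_body):
    """Send the body unless skip_body is true; always call close()."""
    try:
        if not skip_body:
            for chunk in app_iter:
                if chunk:
                    write(chunk)
    finally:
        # close() must run even when the iteration was skipped.
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()


class Body:
    """Toy iterable that records whether it was iterated and closed."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.iterated = False
        self.closed = False

    def __iter__(self):
        self.iterated = True
        return iter(self.chunks)

    def close(self):
        self.closed = True
```

With `skip_body=True` the application's iterable is never started, so any work it does lazily is saved, while resources tied to close() are still released.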


Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Manlio Perillo
Thomas Broyer wrote:
> 2007/10/8, Manlio Perillo:
>> However there are two problems here:
>> 1) It is not clear if WSGI explicitly allows an implementation to skip
>>the iteration over the app_iter object, for optimization purpose
>> 2) For a WSGI implementation embedded in an existing webserver, the
>>most convenient method to perform cache validation is to let the
>>server do it; however this requires to send the headers as soon as
>>start_response is called, and this is not allowed.
> 
> Oops, sorry, hadn't correctly understood what you were saying. Of
> course you're right here.
> 

A clarification: this is only an optimization.

Nginx will always do the cache validation (if the appropriate header 
filter is enabled) and will discard the body if the client has a fresh copy.

The same applies to If-Range, but in this case it is not possible to 
optimize the WSGI application execution.



Regards  Manlio Perillo

Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Thomas Broyer
2007/10/8, Manlio Perillo:
> However there are two problems here:
> 1) It is not clear if WSGI explicitly allows an implementation to skip
>the iteration over the app_iter object, for optimization purpose
> 2) For a WSGI implementation embedded in an existing webserver, the
>most convenient method to perform cache validation is to let the
>server do it; however this requires to send the headers as soon as
>start_response is called, and this is not allowed.

Oops, sorry, hadn't correctly understood what you were saying. Of
course you're right here.

-- 
Thomas Broyer

Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Thomas Broyer
2007/10/8, Manlio Perillo:
> Phillip J. Eby wrote:
> > [...]
> >
> > I don't think there's any point to having a WSGI extension for If-*
> > header support.
>
> I have just found that the WSGI spec says:
> """...it should be clear that a server may handle cache validation via
> the If-None-Match and If-Modified-Since request headers and the
> Last-Modified and ETag response headers."""
>
>
> So a WSGI implementation is *allowed* to perform cache validation, but
> it is not clear *how* this should be done.
>
> As an example, without the need of an extension, the start_response
> callable may check if Last-Modified or ETag is in the headers.
> In this case, it may perform a cache validation, and if the client
> representation is fresh, it may omit to send the body.
>
> However there are two problems here:
> 1) It is not clear if WSGI explicitly allows an implementation to skip
>the iteration over the app_iter object, for optimization purpose
> 2) For a WSGI implementation embedded in an existing webserver, the
>most convenient method to perform cache validation is to let the
>server do it; however this requires to send the headers as soon as
>start_response is called, and this is not allowed.

How about (not tested, and simplified to require the app to return an
iterable, and without support for If-Range):

def has_preconditions(environ):
    # True if the request carries any conditional (If-*) header.
    return ("HTTP_IF_MATCH" in environ or
            "HTTP_IF_NONE_MATCH" in environ or
            "HTTP_IF_MODIFIED_SINCE" in environ or
            "HTTP_IF_UNMODIFIED_SINCE" in environ)

def matches_preconditions(environ, headers):
    # TODO: compare the If-* request headers against the ETag and
    # Last-Modified response headers.
    raise NotImplementedError

def notmodified_middleware(application):
    def middleware(environ, start_response):
        notmodified = [False]

        def write_not_supported(data):
            # A lambda cannot contain a raise statement, hence a function.
            raise NotImplementedError("The write callback is deprecated")

        def sr(status, headers, exc_info=None):
            if status[0] == "2" and matches_preconditions(environ, headers):
                start_response("304 Not Modified", headers, exc_info)
                notmodified[0] = True
                return write_not_supported
            else:
                notmodified[0] = False
                return start_response(status, headers, exc_info)

        # Only wrap start_response for conditional GET requests.
        if environ["REQUEST_METHOD"] == "GET" and has_preconditions(environ):
            app_iter = application(environ, sr)
        else:
            app_iter = application(environ, start_response)
        if notmodified[0]:
            return ("", )
        else:
            return app_iter
    return middleware


We're still waiting for the app to complete (and return its app_iter)
before sending anything to the client, but this doesn't prevent us from
checking preconditions and, when they match, replacing the status with
a 304 Not Modified and an empty body (ignoring the app_iter altogether;
but maybe we should iterate it to allow the wrapped application to
*really* complete its execution).
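
One possible way to fill in matches_preconditions, as a hedged sketch only: it handles If-None-Match against the ETag response header and nothing else (no If-Modified-Since, If-Match or If-Unmodified-Since). The comma split is a simplification that assumes entity tags contain no commas, and the weak comparison (ignoring a leading W/) is what I understand HTTP to require for If-None-Match:

```python
def matches_preconditions(environ, headers):
    """Return True if the client's copy is still fresh (ETag check only)."""
    if_none_match = environ.get("HTTP_IF_NONE_MATCH")
    if if_none_match is None:
        return False

    # Find the ETag among the (name, value) response header pairs.
    etag = None
    for name, value in headers:
        if name.lower() == "etag":
            etag = value.strip()
    if etag is None:
        return False

    # "*" matches any current representation.
    if if_none_match.strip() == "*":
        return True

    def opaque(tag):
        # Weak comparison: drop any W/ prefix before comparing.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    candidates = [opaque(t) for t in if_none_match.split(",")]
    return opaque(etag) in candidates
```

Returning False whenever the information is missing keeps the middleware conservative: the full response is sent unless a match is certain.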

-- 
Thomas Broyer

Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-08 Thread Manlio Perillo
Phillip J. Eby wrote:
> [...]
> 
> I don't think there's any point to having a WSGI extension for If-* 
> header support.  

I have just found that the WSGI spec says:
"""...it should be clear that a server may handle cache validation via 
the If-None-Match and If-Modified-Since request headers and the 
Last-Modified and ETag response headers."""


So a WSGI implementation is *allowed* to perform cache validation, but 
it is not clear *how* this should be done.

For example, without needing an extension, the start_response 
callable may check whether Last-Modified or ETag is in the headers.
If so, it may perform cache validation and, if the client's 
representation is fresh, omit the body.

However there are two problems here:
1) It is not clear whether WSGI explicitly allows an implementation to
   skip the iteration over the app_iter object as an optimization.
2) For a WSGI implementation embedded in an existing web server, the
   most convenient way to perform cache validation is to let the
   server do it; however, this requires sending the headers as soon as
   start_response is called, and this is not allowed.



Regards  Manlio Perillo

Re: [Web-SIG] WSGI 2.0

2007-10-08 Thread Manlio Perillo
Graham Dumpleton wrote:
> On 08/10/2007, Manlio Perillo <[EMAIL PROTECTED]> wrote:
>> Phillip J. Eby wrote:
>>> At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
>>>> Supporting "legacy" and "huge" WSGI applications is not really a
>>>> priority for me.
>>> Then you should really make it clear to your users that your Nginx
>>> module does not support WSGI.  The entire point of WSGI is to allow
>>> "legacy" (i.e. already-written applications) to be portable across
>>> servers.  Something that doesn't run existing WSGI apps is clearly not
>>> WSGI.
>>>
>> [Here I respond to the latest post of Graham, too.]
>>
>> Right, but actually nginx mod_wsgi *can* execute every WSGI application
>> in a *conforming* way (I'm completing full support for WSGI 2.0, and
>> after this I will implement WSGI 1.0).
>>
>> Of course some classes of WSGI applications runs *better* if they don't
>> block the nginx process loop too much, so that nginx can serve multiple
>> requests at the same time.
>>
>> It is simply a matter of optimized execution.
> 
> Do note that there only exists WSGI 1.0. There is no such thing as
> WSGI 2.0 as yet and you shouldn't really assume that the list of
> proposed ideas for discussion will actually end up producing anything
> that looks like what is described. All you can really do at present is
> implement WSGI 1.0, anything else is not WSGI and certainly not WSGI
> 2.0.
> 

Right, and in the nginx mod_wsgi README I explicitly state that the 
current version implements the WSGI *draft*.

The reason I'm implementing the WSGI 2.0 draft is that it allows a 
simpler code flow.

> Graham
> 



Regards  Manlio Perillo

Re: [Web-SIG] WSGI 2.0

2007-10-08 Thread Graham Dumpleton
On 08/10/2007, Manlio Perillo <[EMAIL PROTECTED]> wrote:
> Phillip J. Eby wrote:
> > At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
> >> Supporting "legacy" and "huge" WSGI applications is not really a
> >> priority for me.
> >
> > Then you should really make it clear to your users that your Nginx
> > module does not support WSGI.  The entire point of WSGI is to allow
> > "legacy" (i.e. already-written applications) to be portable across
> > servers.  Something that doesn't run existing WSGI apps is clearly not
> > WSGI.
> >
>
> [Here I respond to the latest post of Graham, too.]
>
> Right, but actually nginx mod_wsgi *can* execute every WSGI application
> in a *conforming* way (I'm completing full support for WSGI 2.0, and
> after this I will implement WSGI 1.0).
>
> Of course some classes of WSGI applications runs *better* if they don't
> block the nginx process loop too much, so that nginx can serve multiple
> requests at the same time.
>
> It is simply a matter of optimized execution.

Do note that there only exists WSGI 1.0. There is no such thing as
WSGI 2.0 as yet and you shouldn't really assume that the list of
proposed ideas for discussion will actually end up producing anything
that looks like what is described. All you can really do at present is
implement WSGI 1.0, anything else is not WSGI and certainly not WSGI
2.0.

Graham

Re: [Web-SIG] WSGI 2.0

2007-10-08 Thread Manlio Perillo
Phillip J. Eby wrote:
> At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
>> Supporting "legacy" and "huge" WSGI applications is not really a
>> priority for me.
> 
> Then you should really make it clear to your users that your Nginx 
> module does not support WSGI.  The entire point of WSGI is to allow 
> "legacy" (i.e. already-written applications) to be portable across 
> servers.  Something that doesn't run existing WSGI apps is clearly not 
> WSGI.
> 

[Here I respond to the latest post of Graham, too.]

Right, but actually nginx mod_wsgi *can* execute every WSGI application 
in a *conforming* way (I'm completing full support for WSGI 2.0, and 
after this I will implement WSGI 1.0).

Of course some classes of WSGI applications run *better* if they don't 
block the nginx process loop too much, so that nginx can serve multiple 
requests at the same time.

It is simply a matter of optimized execution.


Regards  Manlio Perillo

Re: [Web-SIG] WSGI 2.0

2007-10-08 Thread Phillip J. Eby
At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
>Supporting "legacy" and "huge" WSGI applications is not really a
>priority for me.

Then you should really make it clear to your users that your Nginx 
module does not support WSGI.  The entire point of WSGI is to allow 
"legacy" (i.e. already-written applications) to be portable across 
servers.  Something that doesn't run existing WSGI apps is clearly not WSGI.


Re: [Web-SIG] WSGI 2.0

2007-10-08 Thread Manlio Perillo
Ian Bicking wrote:
> Manlio Perillo wrote:
>> Phillip J. Eby wrote:
>>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>>>> As an example, the WSGI write callable cannot be implemented in a
>>>> conforming way in Nginx.
>>> ...unless you use a separate thread or process.
>>>
>>
>> I'm insisting about asynchronous support in WSGI because
>> Nginx *does not supports threads*.
>>
>> It has some thread support but it is *broken*.
>> Even if in future the problems are solved, the threading model of 
>> Nginx *will break* many existing WSGI applications, since the WSGI 
>> iteration can be resumed in different threads.
> 
> Just so you are aware -- almost all current WSGI applications block, and 
> can't be run in asynchronous environments.  

Not every WSGI application "blocks" the request processing for a 
"significant" amount of time.

A streaming application, as an example, can "block" without problems, 
since nginx mod_wsgi will pause the execution as soon as the application 
output cannot be written at once to the client.

Moreover, as I have already written, using wsgi.pause_output, it 
should be possible to write a WSGI "component" that runs the entire WSGI 
application in a separate thread (but, in this case, it MUST buffer the 
entire output, since nginx is not thread-safe).
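
A rough sketch of the buffering half of such a component. It does not use wsgi.pause_output, whose interface is not described in this thread; the name `run_in_thread` is made up for illustration, and the join() call simply blocks where an asynchronous server would instead pause and resume the request. Error handling is omitted: if the application raises inside the thread, no status is captured.

```python
import threading


def run_in_thread(application):
    """Run the wrapped app in a worker thread and buffer its whole output."""
    def middleware(environ, start_response):
        captured = {}
        chunks = []

        def capture_start_response(status, headers, exc_info=None):
            # Record the call instead of touching the server from the thread.
            captured["status"] = status
            captured["headers"] = headers
            return chunks.append  # stands in for the write() callable

        def worker():
            app_iter = application(environ, capture_start_response)
            try:
                chunks.extend(app_iter)
            finally:
                close = getattr(app_iter, "close", None)
                if close is not None:
                    close()

        thread = threading.Thread(target=worker)
        thread.start()
        # Placeholder for "pause until the worker is done".
        thread.join()

        # Back in the server's own thread: replay the buffered response.
        start_response(captured["status"], captured["headers"])
        return [b"".join(chunks)]
    return middleware
```

Only the worker thread runs application code, and only the calling thread talks to the server, which is the reason the whole output has to be buffered.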

Nginx can also use several worker processes, so it can still (somehow) 
serve "blocking" applications.

> So if you are writing WSGI 
> support that doesn't support applications that block, well, it won't 
> really be able to do much with current WSGI code.
> 

Supporting "legacy" and "huge" WSGI applications is not really a 
priority for me.

I want some support for adding extensions that can be used by other WSGI 
implementations that want to support asynchronous applications in 
an asynchronous server.

I can add "proprietary" extensions, but Python is already full of 
non-interoperable web solutions.



P.S.
Since, as I can see, many people on this mailing list are not interested 
in asynchronous support for WSGI, we can stop this thread (and further 
discussions) here.





Regards  Manlio Perillo