from:"Graham Dumpleton"

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-20 Thread Graham Dumpleton

I am still confused as to why you keep talking as if you seemingly are trying 
to force extensions to the existing WSGI specification into the core WSGI 
specification when an alternative has already been cited.

Extensions and the process for describing them and getting them accepted, plus 
the appropriate WSGI environ prefix, as has been mentioned before, is what is 
covered by:

http://wsgi.readthedocs.org/en/latest/specifications.html

Being an extension means it is entirely optional for a WSGI server to try and 
implement it, thus allowing WSGI servers/adapters that cannot implement 
something to skip them. They stand as separate documents and would never become 
part of the core WSGI PEP.

Is there an issue with doing extensions per the process, and in the WSGI 
environ namespace, that was outlined in that URL? You seem to be suggesting a 
completely new way of handling extensions and ignoring what was laid down 
before.

So no one is saying you can’t have extensions, and that separate process gives 
you all the scope you need to do it.

In drafting your specification, just fit it to what is described at that URL, 
using keys with the ‘x-wsgiorg.’ prefix.
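
As a rough sketch of what that looks like from the application side, an optional extension is just a key that may or may not be present in the WSGI environ. The key name `x-wsgiorg.example_resource` below is made up for illustration and is not an accepted specification:

```python
# Hypothetical extension key; not part of any accepted wsgi.org specification.
EXTENSION_KEY = 'x-wsgiorg.example_resource'


def application(environ, start_response):
    # An extension is optional, so the application must cope with its absence.
    resource = environ.get(EXTENSION_KEY)
    if resource is None:
        body = b'extension not available\n'
    else:
        body = b'extension available\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

A server that cannot implement the extension simply never sets the key, and the application keeps working.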

Graham

> On 21 Jan 2016, at 4:13 PM, Benoit Chesneau  wrote:
> 
> I am not speaking about websockets.  You could use it for SSE, or some apps 
> could use the Upgrade header to upgrade from http to their own protocol 
> etc... The only discussion i saw about websockets are about the addition of 
> an async api or an external api. I am not describing that. I am speaking 
> about providing a low level abstraction like wsgi.input but adding to it the 
> support of output. (I was referring to wsgi.multithread...). This low level 
> interface would allow anyone to provide its own implementation(server) or 
> usage (application) still acting as a *gateway* .
> 
> Also who are "we"? I am starting to think the discussion is already done and 
> only obscure details like the content_length or headers encoding should be 
> discussed. The RAW_SOCKET have been added on demand of the gunicorn users. 
> Such thing also exist in things like cherrypy if I remember. A lot of code 
> around have been created over it. So before deciding it's unworkable or 
> whatever I strongly invite you to consider it as an addition to the environ. 
> And since some servers need to pass the data differently I then suggest a 
> Resource object on which you can read and write and eventually poll. This is 
> not a websocket but more a proxy resource to the client connection. I will 
> come back asap with a small spec.
> 
> I also propose a second addition to the protocol that formalize the addition 
> of extensions to the protocol by the servers if they want to. Having for 
> example something like `wsgi.extensions`. Such an addition would help anyone to 
> experiment changes over the wsgi before making such changes in the 
> specification by itself possibly.
> 
> I think we have a good opportunity to extend the WSGI specification to allow 
> the users to take over the new challenges on the web without forcing them to 
> use a concurrency mode or skip completely the WSGI spec. The interest I see 
> in WSGI is its simplicity and low level interface allowing users to build 
> whatever they want over it. The different workers and their support of 
> different concurrency models and framework  in gunicorn let me think it is 
> possible. Are the participants of this thread ready to discuss it? 
> 
> - benoît
> 
> On Wed, 20 Jan 2016 at 23:37, Graham Dumpleton wrote:
>> On 21 Jan 2016, at 9:27 AM, Benoit Chesneau wrote:
>> 
>> again. any server can do such implementation if we create a new Resource 
>> abstraction. This abstraction would expose a common api to read and write. 
>> The implementation would be specific to the server.
> 
> If you mean not exposing the raw socket and having a separate high level API 
> for implementing something like WebSocket this was already talked about. The 
> suggestion was that it should not be a part of WSGI. Develop that API 
> independently with no link to WSGI. The idea of upgrading from WSGI to a 
> different API isn’t practical for various WSGI servers as it isn’t possible 
> to unwind the state of the connection path created to get to point of 
> handling the WSGI application. The better scenario is that the switch to an 
> alternate WebSocket API is handled completely within the web server however 
> it needs to handle it, when it needs to handle it, and not be reliant on 
> going into a WSGI application which then says, oh, I actually need that to be 
> WebSocket.
> 
>> Now like w

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-20 Thread Graham Dumpleton


> On 21 Jan 2016, at 9:27 AM, Benoit Chesneau  wrote:
> 
> again. any server can do such implementation if we create a new Resource 
> abstraction. This abstraction would expose a common api to read and write. 
> The implementation would be specific to the server.

If you mean not exposing the raw socket and having a separate high level API 
for implementing something like WebSocket this was already talked about. The 
suggestion was that it should not be a part of WSGI. Develop that API 
independently with no link to WSGI. The idea of upgrading from WSGI to a 
different API isn’t practical for various WSGI servers as it isn’t possible to 
unwind the state of the connection path created to get to point of handling the 
WSGI application. The better scenario is that the switch to an alternate 
WebSocket API is handled completely within the web server however it needs to 
handle it, when it needs to handle it, and not be reliant on going into a WSGI 
application which then says, oh, I actually need that to be WebSocket.

> Now like we have wsgi.thread I would instead suggest to add a system of 
> capability or extension  like in smtp, imap, ... so the servers that 
> implement a specific extension can legally published it. Would it work for 
> you?

Since there is nothing in WSGI environ called wsgi.thread now I have no idea 
what you are really suggesting here.

Graham

> benoit
> 
> On Wed, 20 Jan 2016 at 21:28, Graham Dumpleton wrote:
> 
>> On 21 Jan 2016, at 2:48 AM, Benoit Chesneau wrote:
>> 
>> 
>> 
>> On Wed, Jan 20, 2016 at 1:57 AM Robert Collins wrote:
>> On 20 January 2016 at 12:04, Benoit Chesneau wrote:
>> 
>> >
>> > not at all. But I made the assumption that the wsgi server maintained a
>> > thread directly or not where the python application is running .
>> >
>> > In any case there is some sort of wrapping done in the same thread/process
>> > where the python application is running. And then nothing stop to give the
>> > socket away to the application and tell to the server to stop to 
>> > communicate
>> > with it.
>> 
>> What socket?
>> 
>> Data could be being passed by shm, for instance.
>> 
>> -Rob
>> 
>> 
>> While shared memory would be quite a bad idea, then why not. I still don't 
>> see why having a way to upgrade the connection can't be done.
>> 
>> Call it I/O resource or Socket, the issue is the same. At the end nothing 
>> stop the server to pass the control to the app. If we forget the socket 
>> (which is btw the simplest design) then the server could stop to control the 
>> I/O resource when the application ask it to do it. At some point either a 
>> garbage collection or a basic resource return/claim flow could be used to 
>> definitely free the resource.
>> 
>> The thing behind that is that it would allow the WSGI spec to only focus on 
>> providing a strict gateway workflow without forcing the application to adopt 
>> a concurrency model aync or not.
> 
> No one has said you cannot do it. But because it can only be implemented in 
> a subset of WSGI servers/adapters, it doesn’t seem appropriate that it be a 
> part of the core WSGI specification.
> 
> This is the role of a WSGI extension as found at:
> 
> http://wsgi.readthedocs.org/en/latest/specifications.html
> 
> So go talk to the authors of uWSGI, and the other couple of packages 
> available for trying to plug these into some of the pure Python based WSGI 
> servers and come to an agreement between yourselves as to a standard way of 
> doing it and the extension specification can be added to the wsgi.org site.
> 
> Graham
> 

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
https://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-20 Thread Graham Dumpleton


> On 21 Jan 2016, at 2:48 AM, Benoit Chesneau  wrote:
> 
> 
> 
> On Wed, Jan 20, 2016 at 1:57 AM Robert Collins wrote:
> On 20 January 2016 at 12:04, Benoit Chesneau wrote:
> 
> >
> > not at all. But I made the assumption that the wsgi server maintained a
> > thread directly or not where the python application is running .
> >
> > In any case there is some sort of wrapping done in the same thread/process
> > where the python application is running. And then nothing stop to give the
> > socket away to the application and tell to the server to stop to communicate
> > with it.
> 
> What socket?
> 
> Data could be being passed by shm, for instance.
> 
> -Rob
> 
> 
> While shared memory would be quite a bad idea, then why not. I still don't 
> see why having a way to upgrade the connection can't be done.
> 
> Call it I/O resource or Socket, the issue is the same. At the end nothing 
> stop the server to pass the control to the app. If we forget the socket 
> (which is btw the simplest design) then the server could stop to control the 
> I/O resource when the application ask it to do it. At some point either a 
> garbage collection or a basic resource return/claim flow could be used to 
> definitely free the resource.
> 
> The thing behind that is that it would allow the WSGI spec to only focus on 
> providing a strict gateway workflow without forcing the application to adopt 
> a concurrency model aync or not.

No one has said you cannot do it. But because it can only be implemented in a 
subset of WSGI servers/adapters, it doesn’t seem appropriate that it be a part 
of the core WSGI specification.

This is the role of a WSGI extension as found at:

http://wsgi.readthedocs.org/en/latest/specifications.html 


So go talk to the authors of uWSGI, and the other couple of packages available 
for trying to plug these into some of the pure Python based WSGI servers and 
come to an agreement between yourselves as to a standard way of doing it and 
the extension specification can be added to the wsgi.org site.

Graham


Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-20 Thread Graham Dumpleton

> On 20 Jan 2016, at 11:24 PM, André Malo  wrote:
> 
> * Graham Dumpleton wrote:
> 
>>> On 20 Jan 2016, at 10:25 PM, André Malo  wrote:
> 
>>> Regarding chunked requests - in my own WSGI implementation I went the
>>> most pragmatic way and simply provided a CONTENT_LENGTH of -1 for unknown
>>> request sizes (it maps very well to file.read(size)). Something like this
>>> would be my suggestion for a future WSGI spec.
>> 
>> I am assuming here you mean that -1 means return whatever you have
>> available, or block until you have something.
>> 
>> Problem with that is that some implementations will use -1 as a default
>> value to mean no argument supplied and so read all input.
> 
> That was actually the idea. It has the same semantics (as in 
> file.read(int(environ['CONTENT_LENGTH'])). Since -1 is not covered by RFC 
> 3875, it should not break much as well (*cough*).
> 
>> 
>> So that could well conflict with some implementations.
>> 
>> Also, if it is going to block, how is it really different to reading with a
>> block size.
> 
> It's not. It's a signal, that the gateway has no idea about the size of the 
> request body and you (as the application) should not make any assumptions. 
> You wouldn't read(-1) a file of unknown size either.

Okay. Didn’t read properly everything you said. I thought you were trying to 
give read() with -1 a special meaning only. Not that you are also suggesting 
CONTENT_LENGTH in WSGI environ would be -1.

I can’t remember the details but I recollect when this was discussed once 
before that setting CONTENT_LENGTH to -1 was determined as not being a good 
idea. Would need to go back through the archives to find the reasons brought up.

Anyway, it becomes unnecessary if you simply change things such that you should 
read, in chunks as necessary, until you get back an empty byte string. You don’t 
need a special CONTENT_LENGTH value to indicate unknown length at that point.

If you want to deal with partial reads of unknown length, it is still better to 
have ASYNC, as then you can avoid blocking and be doing something else at the 
same time.

Graham

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-20 Thread Graham Dumpleton

> On 20 Jan 2016, at 10:25 PM, André Malo  wrote:
> 
> * Cory Benfield wrote:
> 
>>> On 20 Jan 2016, at 06:04, Graham Dumpleton 
>>> wrote:
>>> 
>>> For response content, if a WSGI application currently doesn’t set a
>>> Content-Length on response, A HTTP/1.1 web server is at liberty to chunk
>>> the response.
>>> 
>>> So I am not sure what is missing.
>> 
>> My specific concern is the distinction between “at liberty to” and
>> “required to”. Certain behaviours that make sense with chunked transfer
>> encoding do not make sense without it: for example, streaming API endpoints
>> that return events as they arrive.

Bidirectional HTTP is effectively a no go.

CGI, SCGI and FASTCGI implementations, mod_wsgi daemon mode and many HTTP 
proxies will often not actually start reading a response until they have 
managed to send the full content of the request. There are also various issues 
around buffering, especially with intermediaries.

So if your expectation is that you can send a bit of a request, have 
client wait for a response for that bit, then send more request, wait for more 
response for that part of the request and so on, it isn’t going to work for 
most implementations.

This problem/issue and the lack of support for both way streaming has been the 
subject of some RFCs. I can’t remember if this is the exact one I remember 
seeing before, but does appear relevant:

   https://www.ietf.org/rfc/rfc6202.txt

Anyway, the end result as I saw it was no one could be bothered supporting 
proper bidirectional HTTP as getting proxies/caches changed was going to be too 
hard.

The solution was to give up on it for HTTP/1.X and instead do it in HTTP/2, as 
upgrading a HTTP connection to something else generally had the effect of 
bypassing any intermediary behaviour which would cause issues, as the 
connection would then switch to streaming properly in both directions. So I 
think it is a lost cause to try and do it in HTTP/1.X and WSGI.

>> Sending this kind of response with a
>> HTTP/1.0-style content-length absent response (framed by connection
>> termination) is utterly confusing, especially as some APIs consider the
>> chunk framing to be semantic.
> 
> Those APIs are just broken then. The HTTP RFCs state very clearly [1], that 
> any hop may modify the transfer encoding. In other words: the transfer 
> encoding is transparent to the representation layer.

Yep. If framing was required you could never rely on the HTTP chunking. Framing 
had to be done in the actual data.

>> This can and does bite people, because while all major production WSGI
>> servers use chunked transfer encoding in this situation, not all WSGI
>> implementations do: in fact, wsgiref does not. This means that if an
>> application has a production design requirement to use chunked transfer
>> encoding in its responses it cannot rely on the server actually providing
>> it.
>> 
>> I see two solutions to this problem: we could mandate that HTTP/1.1
>> responses that have no content length must be chunked, rather than falling
>> back to HTTP/1.0 style connection-termination-framed responses, or we could
>> have servers stuff something in the environ dictionary that can be checked
>> by applications. Or, I suppose, we can conclude that this problem is not
>> large enough, and that it’s “caveat developer”.
> 
> WSGI is a gateway working with the representation layer. I think, it should 
> not concern itself with underlying transport issues that much.
> 
> Regarding chunked requests - in my own WSGI implementation I went the most 
> pragmatic way and simply provided a CONTENT_LENGTH of -1 for unknown request 
> sizes (it maps very well to file.read(size)). Something like this would be my 
> suggestion for a future WSGI spec.

I am assuming here you mean that -1 means return whatever you have available, 
or block until you have something.

Problem with that is that some implementations will use -1 as a default value 
to mean no argument supplied and so read all input.

So that could well conflict with some implementations.

Also, if it is going to block, how is it really different to reading with a 
block size? The whole thing with chunking, as noted above, is that 
intermediaries can change the chunking, and so using framing of your own in the 
data, where you know the size of each message at the application layer, is the 
only way to reliably do it. So I can’t see any benefit of -1 meaning give me 
whatever you have.
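
To illustrate what framing in the data itself can look like, here is a minimal sketch that prefixes every message with its length. The 4-byte big-endian prefix and the function names are choices made for this example only; nothing in WSGI or HTTP prescribes them:

```python
import io
import struct


def write_message(stream, payload):
    # Prefix each message with its length as a 4-byte big-endian integer.
    stream.write(struct.pack('>I', len(payload)) + payload)


def read_exactly(stream, size):
    # Keep reading until 'size' bytes arrive; reads may return short blocks.
    parts = []
    remaining = size
    while remaining:
        data = stream.read(remaining)
        if not data:
            raise EOFError('stream ended in the middle of a message')
        parts.append(data)
        remaining -= len(data)
    return b''.join(parts)


def read_messages(stream):
    # Yield complete messages regardless of how the bytes were chunked in transit.
    while True:
        header = stream.read(4)
        if not header:
            return
        if len(header) < 4:
            header += read_exactly(stream, 4 - len(header))
        (length,) = struct.unpack('>I', header)
        yield read_exactly(stream, length)


# An in-memory buffer stands in for the request or response body.
buffer = io.BytesIO()
write_message(buffer, b'first event')
write_message(buffer, b'second')
buffer.seek(0)
messages = list(read_messages(buffer))
```

Because the message boundaries live in the payload, it does not matter whether an intermediary re-chunks or de-chunks the transfer encoding along the way.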

In general this is where you would be better to have a proper ASYNC API.

> Cheers,
> nd
> 
> [1] https://tools.ietf.org/html/rfc7230#section-3.3.1
> -- 
> If G

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-19 Thread Graham Dumpleton

> On 20 Jan 2016, at 8:56 AM, Robert Collins  wrote:
> 
>> REQUEST_URI environ variable
>> 
>> 
>> Multiple contributors expressed an interest in bringing this environment 
>> variable into WSGI directly, making it a required part of the environ 
>> dictionary. An alternative name for this was RAW_URI.
> 
> If its reasonable available to e.g. Apache modules, I could see doing
> this. That said, why have two? Why not require that URI be the 'RAW'
> URI? I don't see the benefit in having two separate variables.

The history on this one was that Apache and anything that copied what Apache 
did always provided this as REQUEST_URI. It therefore has some de facto 
standing, as that meant it was also always present in many CGI, SCGI and 
FASTCGI environments.

When Gunicorn decided to add the equivalent, they chose not to use what Apache 
has always used and chose a different name.

It isn’t therefore an issue of allowing both; it makes more sense to note only 
the use of REQUEST_URI, as it has longer standing. If adopted, Gunicorn would 
need to provide REQUEST_URI, and only Gunicorn would have to continue to provide 
RAW_URI to support people who wrote WSGI applications which only looked for what 
Gunicorn used and didn’t know there was another convention for it.
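
In the meantime, an application that wants the raw URI on as many servers as possible could check both names. A small sketch (the helper name `get_raw_uri` is invented here):

```python
def get_raw_uri(environ):
    # Prefer the long-standing Apache name, then fall back to Gunicorn's name.
    for key in ('REQUEST_URI', 'RAW_URI'):
        value = environ.get(key)
        if value is not None:
            return value
    # Neither is required by the current WSGI specification, so both may be absent.
    return None
```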

As to the comment:

Why not require that URI be the ‘RAW’ URI?

I am not sure what ‘URI’ you are talking about, if you are talking about some 
other existing variable in the WSGI specification.

Graham

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-19 Thread Graham Dumpleton

> On 20 Jan 2016, at 10:57 AM, Robert Collins  wrote:
> 
> On 20 January 2016 at 12:04, Benoit Chesneau  wrote:
> 
>> 
>> not at all. But I made the assumption that the wsgi server maintained a
>> thread directly or not where the python application is running .
>> 
>> In any case there is some sort of wrapping done in the same thread/process
>> where the python application is running. And then nothing stop to give the
>> socket away to the application and tell to the server to stop to communicate
>> with it.
> 
> What socket?
> 
> Data could be being passed by shm, for instance.

Exactly. You aren’t guaranteed that the path from the HTTP client to the WSGI 
server consists only of socket connections with HTTP running over them.

Intermediary hops could use non-socket communication mechanisms, or instead 
of using HTTP on a proxy connection, use an alternate protocol such as CGI, 
SCGI, FASTCGI or some internal WSGI server specific protocol.

There are enough of these already that a requirement in the WSGI specification 
to provide a socket which had the original unmodified HTTP request coming over 
it is not possible. It has to be optional and being optional it doesn’t really 
even need to be in the WSGI specification PEP but can be a separate WSGI 
extension specification.

Graham

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-19 Thread Graham Dumpleton

> On 20 Jan 2016, at 8:56 AM, Robert Collins  wrote:
> 
>> 
>> Chunked Transfer Encoding
>> ~~~~~~~~~~~~~~~~~~~~~~~~~
>> 
>> It would be nice to formalise chunked transfer encoding in WSGI. Currently 
>> there is no way to signal to applications that chunked transfer encoding is 
>> in use by the client, or for the application to request it from the server.
>> 
>> This seemed to be a low priority work item, but if we can make this 
>> enhancement easily then it's worth considering.
> 
> I'm very much against this. I think its an abstraction violation. It
> makes as much sense as exposing the guts of HTTP/2 framing to an
> application. A way of doing Trailers would make sense.

I would agree in as much that what is stated here is confusing.

There are two concerns. Chunked request content, and use of chunking on 
response content.

For chunked request content it can’t currently be done by PEP . This is 
what the separate issue about reading more than CONTENT_LENGTH is about. For 
chunked request content, a WSGI application would never see raw chunked stream 
as it would be de-chunked by the web server.

For response content, if a WSGI application currently doesn’t set a 
Content-Length on the response, a HTTP/1.1 web server is at liberty to chunk the 
response.
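
As a small illustration of that case, the application below streams its body from a generator and sets no Content-Length. How the body is framed on the wire (chunked or otherwise) is then left entirely to the server; the application cannot see or control it:

```python
def application(environ, start_response):
    # No Content-Length is set, so the server decides how to frame the body.
    start_response('200 OK', [('Content-Type', 'text/plain')])

    def events():
        for number in range(3):
            yield ('event %d\n' % number).encode('ascii')

    return events()
```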

So I am not sure what is missing.

BTW, this reminds me of another area of the WSGI PEP which is broken which I 
have talked about before in:

   http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html 

There might be another post about it.

Basically the PEP is wrong in saying a WSGI server is allowed to construct a 
Content-Length header where it identifies the response as a list containing one 
string. Doing so can break the supposed equivalence in headers between GET and 
HEAD.

So another issue that would be cleaned up in any 1.1 final version of the 
specification.

Graham

Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-19 Thread Graham Dumpleton


> On 20 Jan 2016, at 8:29 AM, Benoit Chesneau  wrote:
> 
> 
> 
> On Tue, Jan 19, 2016 at 10:49 PM Graham Dumpleton wrote:
> 
>> On 20 Jan 2016, at 7:43 AM, Benoit Chesneau wrote:
>> 
>> I will make a more complete answer soon. But about:
>> 
>> 
>> 
>> Socket Escape Hatch
>> ~~~~~~~~~~~~~~~~~~~
>> 
>> Aside from Benoit, server operators were unanimously dismissive of the idea 
>> of a socket 'escape hatch'. In general it seems like servers would not be 
>> capable of achieving this. I think, therefore, this idea is unworkable.
>> 
>> 
>> Well it does work. This is how websockets works in gunicorn.  Escape is not 
>> the right term anyway. Think it as a socket upgrade. And then you would 
>> wonder why it would be unworkable. After all this is how SSL sockets works, 
>> so is the protocol negotiation in http2 ...
>> 
>> There is nothing magic there until you try to over engineer the stuff. 
>> Upgrading a sockets means that you tell to the server to forget it. This is 
>> how most concurrent servers work today.
> 
> The problem was that it would only work in a WSGI server where the original 
> request was accepted on a socket in the same process as the WSGI application 
> is running. It cannot work where the WSGI application is behind a 
> bridging protocol. 
> 
> So it can’t work for CGI, SCGI, FASTCGI, mod_wsgi daemon mode and possibly 
> other implementations.
> 
> uh? But we don't care about bridging protocols. In WSGI (the gateway), the 
> server accept the socket and anyway pass it to the application via actually a 
> wrapper. Then expect a response from the application.
> 
> Upgrading a socket would simply mean  that the server will forget it (and 
> then consider its job done) once it got an appropriate response from the 
> application. How this is unworkable?

Bridging protocols such as FASTCGI do not provide an ability to upgrade the 
connection end to end.

That is, yes you could pass the raw socket to the WSGI application when behind 
FASTCGI, but you are passing it a socket from the same process, where the data 
being received (and expected to be sent) uses FASTCGI message frames. It is not 
a raw HTTP socket connection.

There is no way to send a message back to the front end side of the bridged 
connection where the raw HTTP socket is, to tell the client side of the FASTCGI 
implementation to stop treating it as a FASTCGI connection to backend process 
and then suddenly start acting as a raw socket pass through.

Graham


Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-19 Thread Graham Dumpleton

> On 20 Jan 2016, at 7:43 AM, Benoit Chesneau  wrote:
> 
> I will make a more complete answer soon. But about:
> 
> 
> 
> Socket Escape Hatch
> ~~~~~~~~~~~~~~~~~~~
> 
> Aside from Benoit, server operators were unanimously dismissive of the idea 
> of a socket 'escape hatch'. In general it seems like servers would not be 
> capable of achieving this. I think, therefore, this idea is unworkable.
> 
> 
> Well it does work. This is how websockets works in gunicorn.  Escape is not 
> the right term anyway. Think it as a socket upgrade. And then you would 
> wonder why it would be unworkable. After all this is how SSL sockets works, 
> so is the protocol negotiation in http2 ...
> 
> There is nothing magic there until you try to over engineer the stuff. 
> Upgrading a sockets means that you tell to the server to forget it. This is 
> how most concurrent servers work today.

The problem was that it would only work in a WSGI server where the original 
request was accepted on a socket in the same process as the WSGI application is 
running. It cannot work where the WSGI application is behind a bridging 
protocol.

So it can’t work for CGI, SCGI, FASTCGI, mod_wsgi daemon mode and possibly 
other implementations.

So the ‘unworkable’ comes from the fact that you couldn’t universally implement 
it across all current WSGI implementations. For that reason, having it as part 
of core WSGI is debatable, as it would have to be marked as optional. At that 
point it is better as a separate WSGI extension outside of the WSGI PEP, if you 
did at least want to standardise such an approach across those WSGI servers that 
may be able to support it.

Graham


Re: [Web-SIG] Collating follow-up on the future of WSGI

2016-01-19 Thread Graham Dumpleton

> On 20 Jan 2016, at 2:55 AM, Cory Benfield  wrote:
> 
> Content Lengths
> ~~~~~~~~~~~~~~~
> 
> We should clarify in the new specification that an application that reads 
> beyond the logical length of the request as given by CONTENT_LENGTH will have 
> its reads return immediately with the empty string. Servers are required to 
> police that logic. This is codifying existing practice, and would also make 
> CONTENT_LENGTH purely advisory.

Clarification on this point as maybe I didn’t explain well enough what I was 
suggesting.

It isn’t that I want an empty string to be returned immediately if an attempt is 
made to read more than CONTENT_LENGTH, but that it is permissible that a WSGI 
server CAN actually return more than what CONTENT_LENGTH states.

This would occur for example with chunked request content where CONTENT_LENGTH 
would actually be 0 (not present). Or, with compressed request content which is 
decompressed by the underlying web server. Thus the actual amount of data 
available to read would be greater in length than the original non zero 
CONTENT_LENGTH.

A WSGI application wishing to support these situations would, instead of 
reading up to CONTENT_LENGTH, read in data until it is returned an empty 
string, indicating end of input.

The complication comes in that PEP 333 simply said you can’t read more than 
CONTENT_LENGTH. It didn’t really provide a guarantee that if you did you got an 
empty string back.

In PEP , a guarantee was added that when you had read all input you would 
get an empty string. There was no version change for WSGI in PEP , ie., 
wsgi.version, so you don’t really have a proper way of knowing for sure that a 
WSGI server will work that way so WSGI applications still tend to be written to 
only read up to CONTENT_LENGTH.

An updated WSGI 1.1, so wsgi.version would be updated, would provide a means of 
being able to know if empty string is guaranteed, but also the additional new 
change that you can read past CONTENT_LENGTH and still get data, with input 
eventually terminated by empty string.
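
A sketch of how an application might use that, assuming a hypothetical future server that reports `wsgi.version` as `(1, 1)` or later to signal the guarantee (no such version exists today, and the function name is invented for this example):

```python
import io


def read_input(environ, block_size=8192):
    stream = environ['wsgi.input']
    # Assumption: a future WSGI 1.1 server would report wsgi.version >= (1, 1).
    if tuple(environ.get('wsgi.version', (1, 0))) >= (1, 1):
        # Read until the empty string marks end of input, ignoring CONTENT_LENGTH.
        chunks = []
        while True:
            data = stream.read(block_size)
            if not data:
                break
            chunks.append(data)
        return b''.join(chunks)
    # Older servers: never read past CONTENT_LENGTH.
    try:
        remaining = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        remaining = 0
    chunks = []
    while remaining > 0:
        data = stream.read(min(block_size, remaining))
        if not data:
            break
        chunks.append(data)
        remaining -= len(data)
    return b''.join(chunks)


# In-memory streams stand in for server-provided wsgi.input objects.
old_style = {'wsgi.version': (1, 0), 'CONTENT_LENGTH': '5',
             'wsgi.input': io.BytesIO(b'hello world')}
new_style = {'wsgi.version': (1, 1), 'CONTENT_LENGTH': '5',
             'wsgi.input': io.BytesIO(b'hello world')}
```

With the old-style environ only the first five bytes are read; with the new-style one the whole stream is consumed, which is what chunked or decompressed request content needs.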

I have at least one blog post about this somewhere so I will try and find it. 
Travelling this morning though so no more time to try and find it right now.

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-06 Thread Graham Dumpleton

> On 6 Jan 2016, at 10:19 PM, Cory Benfield  wrote:
> 
> 
>> On 6 Jan 2016, at 09:48, Graham Dumpleton  wrote:
>> 
>> If this does solve the push issue, what is there in HTTP/2 then that one 
>> couldn’t do via the existing WSGI interface?
> 
> Well, plenty, but none that we’d *want* to expose via WSGI with the possible 
> exception of long-running bi-directional communications channels like 
> Websockets, which you’ve already expressed a desire to expose in a different 
> API. =)

Can you elaborate more on the ‘plenty’ part.

This was the issue in the past. People appeared to want access to everything, 
hence the belief that you may as well give them a separate API purpose built 
for it. Maybe people are becoming more realistic now about what they really 
need for a typical web application, as HTTP/2 is better understood.

> Pushing via Link headers is not ideal because it delays the push until after 
> the headers are ready to go, and there’s a tricky ordering concern here (RFC 
> 7540 points out that any PUSH_PROMISE frames should be sent before the 
> response headers and body are sent, which means that we temporarily block the 
> response that is ready to go from WSGI. This is a minor concern, but worth 
> noting.)
> 
> However, I’m happy to say that Pushing via Link headers is the way to go if 
> we’d rather not specify a WSGI-specific API for it.

It appears that Link could at least be a fallback.

The idea of a separate callable to push resources in WSGI environ could still 
be dealt with as an extension specification and so not need changes to the WSGI 
specification itself. Worst case is all that callable does is add Link headers 
to the response. This would likely have to be the case in Apache with mod_h2, as 
I wouldn’t expect handlers to have direct access to an API to do it any other 
way. Plus, in mod_wsgi daemon mode the application is in the wrong process to 
access any such API.

With the possibility that Link header would be a mechanism for use by such a 
callable, then any calling arguments for the callable should perhaps be 
modelled on what is possible via the Link header. I could be talking nonsense 
on that point as I have no idea how server push works in HTTP/2 protocol.
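
As a rough sketch of that worst case, a middleware could offer a push callable 
that does nothing more than add Link headers to the response. Everything named 
here is hypothetical: the 'x-wsgiorg.push' key, the push(path, as_type) 
signature and the middleware itself are assumptions for illustration, since no 
such extension has been specified.

```python
def link_push_middleware(app):
    """Offer a push callable that degrades to Link preload headers."""

    def wrapper(environ, start_response):
        pushed = []

        def push(path, as_type=None):
            # Hypothetical signature, modelled on Link header parameters.
            value = "<%s>; rel=preload" % path
            if as_type:
                value += "; as=%s" % as_type
            pushed.append(("Link", value))

        # Hypothetical environ key; not part of any accepted extension.
        environ["x-wsgiorg.push"] = push

        def push_start_response(status, headers, exc_info=None):
            # Append one Link header per pushed resource.
            return start_response(status, list(headers) + pushed, exc_info)

        return app(environ, push_start_response)

    return wrapper


def demo_app(environ, start_response):
    environ["x-wsgiorg.push"]("/static/site.css", as_type="style")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers


result = link_push_middleware(demo_app)({}, fake_start_response)
```

Note the pushes only take effect if the application calls the callable before 
it calls start_response.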

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-06 Thread Graham Dumpleton

> On 5 Jan 2016, at 10:31 PM, Graham Dumpleton  
> wrote:
> 
>>> For example, mod_wsgi already supports HTTP/2 by virtue of the fact that 
>>> the mod_h2 module in Apache exists. The existing internal APIs of Apache 
>>> and how mod_wsgi uses those means that HTTP/2 bridges into the WSGI world 
>>> with no code changes to mod_wsgi.
>> 
>> Agreed. If all we want is to keep the request/response cycle intact, then 
>> WSGI supports H2 already. One possibility that has already been suggested 
>> here would be to define a HTTP/2 extension to WSGI, advertised in the 
>> environ dict, that allows the application to signal pushes to the server. 
>> This would be a fairly simple extension to write and implement.
> 
> Sorry to be cynical. Many people have said that changes in the past related 
> to WSGI will 'be a fairly simple extension to write and implement’. Dig 
> deeper and it never turns out to be the case. :-)
> 
> Such an extension presumes you actually have a tightly integrated HTTP/2 
> server which itself can maintain a map of resources to push when 
> getting certain requests and also maintain what may have already been sent 
> for a session. Even getting to that point is going to be non trivial, even if 
> an extension may be simple for somehow notifying what the additional 
> resources to supply should be.
> 
> Right now I would say that with mod_h2 in Apache it would be plain impossible 
> as I believe it doesn’t even support the idea of pushing resources at this 
> point. Even then it would most likely be a huge undertaking to get it to work 
> for mod_wsgi daemon mode as the web application runs in a separate process to 
> where HTTP/2 is handled.
> 
> If you believe though it is as simple as an extra item in the environ 
> dictionary, then it can be handled as a separate extension specification per 
> the URL above.

A side discussion on Twitter has noted that this exists:

https://w3c.github.io/preload/

This is already implemented by mod_h2, nghttp2 and H2O at least.

This is not my area so I don’t know for sure whether this fits the bill as to 
what is being described as ‘pushing’.

As I understand it, this allows a WSGI application to return a Link header, and 
mod_h2 in Apache then uses HTTP/2 push to deliver those resources straight 
away.

If I am misunderstanding this, please let me know.

The only problem I do see with this right now if it does what is required, is 
that in mod_wsgi daemon mode, except for select headers such as Set-Cookie and 
WWW-Authenticate, response headers of the same name will be joined together. 
What I don’t know is if mod_h2/nghttp2 will handle where the value of a Link 
header is joined. It is probably going to be safer if I modify mod_wsgi and add 
Link to the white list of headers which aren’t joined together.

If this does solve the push issue, what is there in HTTP/2 then that one 
couldn’t do via the existing WSGI interface?

Graham


Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-06 Thread Graham Dumpleton

> On 6 Jan 2016, at 12:13 AM, Benoit Chesneau  wrote:
> 
> So for me what should be WSGI 2? WSGI 2 should add against WSGI 1 the 
> following:
> 
> - tell to the application it is actually an HTTP2 request (maybe populating a 
> wsgi.http2 true env)

In CGI implementations you would for HTTP/1.1 already get:

SERVER_PROTOCOL: 'HTTP/1.1'

Under HTTP/2, when I tested some time back, I recollect it came through as one 
would expect:

SERVER_PROTOCOL: 'HTTP/2'

Is there any reason that this existing CGI variable wouldn’t be sufficient for 
this purpose?
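
For what it is worth, a check along those lines could look like the sketch 
below. The helper name is_http2 is hypothetical, and accepting both 'HTTP/2' 
and 'HTTP/2.0' is an assumption on my part about what different servers may 
report.

```python
def is_http2(environ):
    """Guess whether the request arrived over HTTP/2 from SERVER_PROTOCOL."""
    protocol = environ.get("SERVER_PROTOCOL", "")
    # Assumption: a server may report either 'HTTP/2' or 'HTTP/2.0'.
    return protocol.upper().startswith("HTTP/2")


checks = [
    is_http2({"SERVER_PROTOCOL": "HTTP/2"}),
    is_http2({"SERVER_PROTOCOL": "HTTP/1.1"}),
]
```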

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-05 Thread Graham Dumpleton


> On 6 Jan 2016, at 9:27 AM, chris.d...@gmail.com wrote:
> 
> On Wed, 6 Jan 2016, Graham Dumpleton wrote:
> 
>> 
>>> On 6 Jan 2016, at 12:09 AM, chris.d...@gmail.com wrote:
>>> 
>>> As someone who writes their WSGI applications as functions that take
>>> `start_response` and `environ` and doesn't bother with much
>>> framework the things I would like to see in a minor revision to WSGI
>>> are:
>>> 
>>> * A consistent way to access the raw un-decoded request URI. This is
>>> so I can reconstruct a realistic `PATH_INFO` that has not been
>>> subjected to destructive handling by the server (e.g. apache
>>> messing with `%2F`) before continuing on to a route dispatcher.
>> 
>> This is already available in some servers by way of the REQUEST_URI value.
> 
> Yes, and in others (as mentioned by Benoit) as RAW_URI. One
> ("consistent") way would be better.
> 
> [Lots of good information about the challenges associated with using
> that information to do anything useful, deleted.]
> 
> What I've done in one app is this:
> https://github.com/tiddlyweb/tiddlyweb/blob/cc6b67d2855ea4d8d908f1a3e58db0dce7e8d138/tiddlyweb/web/serve.py#L119
> 
> Despite the fact that that is not strictly correct, it does mostly work
> for the situation described in the comment and the context of that
> app. One of the things I want from a light rev of WSGI is not to have
> to jump through those hoops.
> 
> It may be that's not feasible but I reckon we're at the wishing
> stage of the discussion.

Yeah, that code would have problems.

One other thing just remembered is that technically it is allowed that the path 
part of the request line can actually be a URI.

GET http://hostname/a/b/c HTTP/1.0

This would yield:

REQUEST_URI: 'http://hostname/a/b/c'
SCRIPT_NAME: ''
PATH_INFO: '/a/b/c'

I didn’t even mention the % encoding issues in the SCRIPT_NAME part, as you are 
obviously aware of those being an issue in PATH_INFO at least.
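
A small sketch of coping with the absolute URI case when pulling the path out 
of REQUEST_URI follows. The helper name raw_path is hypothetical, and it 
assumes the server passes the request target through untouched.

```python
from urllib.parse import urlsplit


def raw_path(request_uri):
    """Return the undecoded path part of a request target."""
    if request_uri.startswith("/"):
        # Origin form. Avoid urlsplit here, as it would read a leading
        # '//a/b' as a host name followed by a path.
        return request_uri.split("?", 1)[0]
    # Absolute form, e.g. 'http://hostname/a/b/c'.
    return urlsplit(request_uri).path


paths = [
    raw_path("http://hostname/a/b/c"),
    raw_path("/a/b%2Fc?x=1"),
    raw_path("//a/b"),
]
```

This only recovers the raw path. It does nothing about working out where 
SCRIPT_NAME ends and PATH_INFO starts.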

Lots of fun.

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-05 Thread Graham Dumpleton


> On 6 Jan 2016, at 9:19 AM, Graham Dumpleton  
> wrote:
> 
>> On 6 Jan 2016, at 12:09 AM, chris.d...@gmail.com wrote:
>> 
>> As someone who writes their WSGI applications as functions that take
>> `start_response` and `environ` and doesn't bother with much
>> framework the things I would like to see in a minor revision to WSGI
>> are:
>> 
>> * A consistent way to access the raw un-decoded request URI. This is
>>  so I can reconstruct a realistic `PATH_INFO` that has not been
>>  subjected to destructive handling by the server (e.g. apache
>>  messing with `%2F`) before continuing on to a route dispatcher.
> 
> This is already available in some servers by way of the REQUEST_URI value.
> 
> This is the original first line of any HTTP request and can be split apart to 
> get the path.

Whoops. My foggy memory. REQUEST_URI is only the raw path part, not the whole 
request line with method, protocol and path.

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-05 Thread Graham Dumpleton

> On 6 Jan 2016, at 12:09 AM, chris.d...@gmail.com wrote:
> 
> As someone who writes their WSGI applications as functions that take
> `start_response` and `environ` and doesn't bother with much
> framework the things I would like to see in a minor revision to WSGI
> are:
> 
> * A consistent way to access the raw un-decoded request URI. This is
>  so I can reconstruct a realistic `PATH_INFO` that has not been
>  subjected to destructive handling by the server (e.g. apache
>  messing with `%2F`) before continuing on to a route dispatcher.

This is already available in some servers by way of the REQUEST_URI value.

This is the original first line of any HTTP request and can be split apart to 
get the path.

The problem is that you cannot easily use it unless you want to replicate 
normalisations that the underlying server may do.

The key problem is working out where SCRIPT_NAME ends and PATH_INFO starts with 
the original path given in REQUEST_URI.

Sure if you only deal with a web application mounted at the root of the host it 
is easier because SCRIPT_NAME would be empty, but when mounted at a sub URL it 
gets trickier.

This is because a web server will eliminate things like repeating slashes in 
the part of the path that may match the mount point (sub url) for the web 
application. The sub url here could be dictated by what is defined in a 
configuration file, or could instead be due to matching against a file system 
path.

Further, the web server will eliminate attempts at relative directory traversal 
using ‘..’ and ‘.’.

So an original path may be something like:

/a/b//c/../d/.//e/../f/g/h

If the mount point was ‘/a/b/d’, then that is what gets passed through 
SCRIPT_NAME.

Now if you instead go to the raw path, you would need to replicate all the 
normalisations. Only then could you, based on the length of SCRIPT_NAME, the 
number of component parts, or the actual components in the path, try and 
calculate where SCRIPT_NAME ended and PATH_INFO started in the raw path.

This will still all fail if a web server does internal rewrites though, as the 
final SCRIPT_NAME may not even match the raw path, although at that point URL 
reconstruction can be a problem as well if what the application is given by way 
of the rewrite isn’t a public path.

I have only looked at SCRIPT_NAME. Servers will apply the same sort of 
normalisations to PATH_INFO as well.

So even this isn’t so simple to do properly if you want to go back and do it 
yourself using the raw path.

I have never seen anyone trying to extract repeating slashes intact out of a 
raw path even attempt to do it properly. They tend to assume that the raw path 
is pure and doesn’t have stuff in it which needs to be normalised and that 
rewrites aren’t occurring. As a result they assume that they can just strip a 
number of characters off the raw path based on the length of SCRIPT_NAME passed 
through. This will be fragile though if the raw path isn’t pure.
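
To make the fragility concrete, here is a sketch using the example path from 
earlier in this message. It uses posixpath.normpath as a stand-in for the 
server's normalisation, which is an assumption: real servers have their own 
rules, and normpath does no % decoding.

```python
import posixpath

raw = "/a/b//c/../d/.//e/../f/g/h"
script_name = "/a/b/d"

# The naive approach: strip len(SCRIPT_NAME) characters off the raw path.
naive_path_info = raw[len(script_name):]

# Approximate the server's normalisation first, then split.
normalised = posixpath.normpath(raw)
if normalised.startswith(script_name):
    path_info = normalised[len(script_name):]
else:
    # For example after an internal rewrite, nothing lines up any more.
    path_info = None
```

The naive slice yields 'c/../d/.//e/../f/g/h', which is nonsense, while the 
normalised split yields '/f/g/h'. Of course normalising also throws away any 
repeating slashes in the PATH_INFO part, which is exactly what people going to 
the raw path were trying to preserve.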

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-05 Thread Graham Dumpleton

> On 5 Jan 2016, at 10:26 PM, Cory Benfield  wrote:
> 
> Forwarding this message from the django-developers list.
> 
> Hi Cory,
> 
> I’m not subscribed to web-sig but I read the discussion there. Feel free to 
> forward my answer to the group if you think it’s useful.
> 
> I have roughly the same convictions as Graham Dumpleton. If you want to 
> support HTTP/2 and WebSockets, don’t start with design decisions anchored in 
> CGI. Figure out what a simple and flexible API for these new protocols would 
> be, specify it, implement it, and make sure it degrades gracefully to HTTP/1. 
> You may be able to channel most of the communication through a single 
> generator, but it’s unclear to me that this will be the most convenient 
> design.
> 
> If you want to improve WSGI, here’s a list of mistakes or shortcomings in PEP 
>  that you can take a stab at. There’s a general theme: for a 
> specification that looks at the future, I believe that making modern 
> PaaS-based deployments secure by default matters more than not implementing 
> anything beyond what’s available in legacy CGI-based deployments.
> 
> 1. WSGI is prone to header injection vulnerabilities by design due to 
> the conversion of HTTP headers to CGI-style environment variables: if the 
> server doesn’t specifically prevent it, X-Foo and X_Foo both become 
> HTTP_X_FOO. I don’t believe it’s a good choice to destructively encode 
> headers, expect applications to undo the damage somehow, and introduce 
> security vulnerabilities in the process. If mimicking CGI is still considered 
> a must-have — 1% of current Python web programmers may have heard about it, 
> most of them from PEP  — then that burden should be pushed onto the 
> server, not the application.

FWIW, Apache 2.4 will discard headers which would use underscore, as well as 
many other characters. Basically it probably only accepts alphanumeric and ‘-’ 
in the original name.
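
A minimal sketch of that kind of filtering when building a CGI like 
environment is below. It is loosely modelled on the behaviour described above 
and is not Apache's or mod_wsgi's actual code; the function name 
headers_to_environ is hypothetical.

```python
import re

# Assumption: only alphanumerics and '-' are acceptable in a header name.
_SAFE_NAME = re.compile(r"[A-Za-z0-9-]+")


def headers_to_environ(headers):
    """Convert (name, value) header pairs to CGI style HTTP_* variables."""
    environ = {}
    for name, value in headers:
        if not _SAFE_NAME.fullmatch(name):
            # Drop names such as 'X_Foo' that would collide with 'X-Foo'.
            continue
        key = "HTTP_" + name.upper().replace("-", "_")
        # Simplification: a repeated header overwrites the earlier value
        # rather than being joined.
        environ[key] = value
    return environ


converted = headers_to_environ([("X-Foo", "real"), ("X_Foo", "spoofed")])
```

With the filter in place only the value from X-Foo reaches the application as 
HTTP_X_FOO.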

In mod_wsgi, it does the same thing, even for Apache 2.2 where it wasn’t done.

So with mod_wsgi at least you are safe. Or at least if not still using some 
ancient mod_wsgi version. (Death to LTS Linux versions and out of date 
packages) :-)

The nginx server, if used as a front end and where it is populating CGI like 
variables for passing to a builtin module such as uWSGI, will also I believe 
discard headers which don’t match that requirement.

I can’t remember if gunicorn was updated to do something similar, or whether 
uWSGI does it when it isn’t used behind nginx via its uwsgi protocol but instead 
listens publicly via HTTP.

> 2. More generally, I fail to see how mixing HTTP headers, server-related 
> inputs, and environment variables in a dict adds values. It prevents 
> iterating on each collection separately. It only makes sense if not offering 
> more features than CGI is a design goal; in that case, this discussion 
> doesn’t serve a purpose anyway. It would be nicer and possibly more secure if 
> the application received separately:
> 
> a. Configuration information, which servers could read from environment 
> variables by default for backwards compatibility, but could also get through 
> more secure channels and restrict to what the application needs in order to 
> better isolate it from the entire OS.

I have always had a bit of a beef with the way that the use of environment 
variables for configuration was promoted by the 12 factor manifesto. It grew 
out of how a specific hosting service did things and ignored that various web 
servers used configuration files instead or did things in other ways. Of course 
the hosting service made it difficult to impossible to use some of those 
traditional web servers, so they were safe in their narrow view of things.

Anyway, if environment variables were used where appropriate and with an 
intermediate mapping layer within Python web applications that would have been 
fine. The problem was that you started to see direct lookup of environment 
variables deep in code bases. So people wedded themselves to use of environment 
variables.

The more sensible thing to do would have been to use an intermediate Python 
module/package providing an abstraction layer for getting configuration. Code 
would then use that. The configuration layer could then look up environment 
variables or use other means to get configuration, such as from more 
traditional configuration files, or pulling it down from configuration servers.
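
As a sketch of what such an intermediate layer might look like, consider the 
following. All of the names (Settings, env_source, ini_source, the APP_ prefix 
and the [app] section) are hypothetical and only there to show the shape of 
the idea.

```python
import configparser
import os


class Settings:
    """Look a setting up in each source in turn; first answer wins."""

    def __init__(self, sources):
        self._sources = sources

    def get(self, name, default=None):
        for source in self._sources:
            value = source(name)
            if value is not None:
                return value
        return default


def env_source(prefix="APP_"):
    """Source backed by environment variables such as APP_DEBUG."""
    def lookup(name):
        return os.environ.get(prefix + name.upper())
    return lookup


def ini_source(text, section="app"):
    """Source backed by the text of a traditional INI style file."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    def lookup(name):
        return parser.get(section, name, fallback=None)
    return lookup


os.environ["APP_DEBUG"] = "1"
settings = Settings([
    env_source(),
    ini_source("[app]\ndebug = 0\ndatabase_url = sqlite:///dev.db\n"),
])
```

Application code then only ever calls settings.get('debug') and never touches 
os.environ directly, so where the value comes from can change without 
touching the code base.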

As far as I know there is no good Python package out there which serves as such 
an intermediary configuration system which could be plugged into any application 
and which doesn’t carry a huge amount of baggage. Would love to hear about one 
if it exists.

> b. Server APIs mandated by the spec, per request.
> c. HTTP headers, per request.
> 
> 3. Stop pretending that HTTP is a u

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-05 Thread Graham Dumpleton


> On 5 Jan 2016, at 10:57 PM, Graham Dumpleton  
> wrote:
> 
> 
>> On 5 Jan 2016, at 10:26 PM, Cory Benfield wrote:
>> 
>> Forwarding this message from the django-developers list.
>> 
>> Hi Cory,
>> 
>> I’m not subscribed to web-sig but I read the discussion there. Feel free to 
>> forward my answer to the group if you think it’s useful.
>> 
>> I have roughly the same convictions as Graham Dumpleton. If you want to 
>> support HTTP/2 and WebSockets, don’t start with design decisions anchored in 
>> CGI. Figure out what a simple and flexible API for these new protocols would 
>> be, specify it, implement it, and make sure it degrades gracefully to 
>> HTTP/1. You may be able to channel most of the communication through a 
>> single generator, but it’s unclear to me that this will be the most 
>> convenient design.
>> 
>> If you want to improve WSGI, here’s a list of mistakes or shortcomings in 
>> PEP  that you can take a stab at. There’s a general theme: for a 
>> specification that looks at the future, I believe that making modern 
>> PaaS-based deployments secure by default matters more than not implementing 
>> anything beyond what’s available in legacy CGI-based deployments.
>> 
>> 1. WSGI is prone to header injection vulnerabilities by design due to 
>> the conversion of HTTP headers to CGI-style environment variables: if the 
>> server doesn’t specifically prevent it, X-Foo and X_Foo both become 
>> HTTP_X_FOO. I don’t believe it’s a good choice to destructively encode 
>> headers, expect applications to undo the damage somehow, and introduce 
>> security vulnerabilities in the process. If mimicking CGI is still 
>> considered a must-have — 1% of current Python web programmers may have heard 
>> about it, most of them from PEP  — then that burden should be pushed 
>> onto the server, not the application.
> 
> FWIW, Apache 2.4 will discard headers which would use underscore, as well as 
> many other characters. Basically it probably only accepts alphanumeric and 
> ‘-’ in original name.
> 
> In mod_wsgi, it does the same thing, even for Apache 2.2 where it wasn’t done.
> 
> So with mod_wsgi at least you are safe. Or at least if not still using some 
> ancient mod_wsgi version. (Death to LTS Linux versions and out of date 
> packages) :-)
> 
> The nginx server if used as a front end and where it is populating CGI like 
> variables for passing to a builtin module such as uWSGI will also I believe 
> discard headers which don’t match that requirement as well.
> 
> I can’t remember if gunicorn was updated to do something similar, or whether 
> when uWSGI isn’t used behind nginx via its uwsgi protocol, but instead 
> listens publicly via HTTP whether it does it either. 


I should clarify a point here. Apache 2.4 will discard the headers at the point 
of converting them to a CGI like environment when a handler asks for a CGI like 
set of variables. Raw headers will always be passed through as they were.

Graham

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-05 Thread Graham Dumpleton


> On 5 Jan 2016, at 8:40 PM, Cory Benfield  wrote:
> 
> 
>> On 5 Jan 2016, at 00:12, Graham Dumpleton wrote:
>> 
>> 
>>> On 4 Jan 2016, at 11:27 PM, Cory Benfield wrote:
>>> 
>>> All,
>>> 
>>> **TL;DR: What do you believe WSGI 2.0 should and should not do? Should we 
>>> do it at all?**
>>> 
>>> It’s a new year, and that means it’s time for another attempt to get WSGI 
>>> 2.0 off the ground. Many of you may remember that we attempted to do this 
>>> last year with Rob Collins leading the charge, but unfortunately personal 
>>> commitments made it impossible for Rob to keep pushing that attempt forward.
>> 
>> Although you call this round 2, it isn’t really. Robert’s effort was not the 
>> first time someone has pushed a WSGI 2.0 variant. So this is more like being 
>> about round 5 or 6.
>> 
>> In part because of those repeated attempts by people to propose something 
>> and label it as WSGI 2.0, I am very cool on reusing the WSGI 2.0 moniker. 
>> You will find little or no mention of ‘WSGI 2.0’ as a label in:
>> 
>> https://github.com/python-web-sig/wsgi-ng
>> 
>> That is probably somewhat due to my grumbling about the use of ‘WSGI 2.0’ 
>> back then.
>> 
>> Time has moved on and so the bad feelings and memories associated with the 
>> ‘WSGI 2.0’ label due to early failed efforts have faded, but I would still 
>> suggest avoiding the label ‘WSGI 2.0’ if at all possible.
> 
> Thanks for that feedback. Consider WSGI 2.0 a catch-all name for the purposes 
> of this specific discussion (the “what do we want WSGI to be going forward” 
> one). As you’ve suggested here, it’s entirely possible that the result of 
> this discussion will be several PEPs/APIs, or none at all, and it’s entirely 
> possible that none of them would be called WSGI 2.0.
> 
>> My general feeling is that if any proposed changes to the existing WSGI (PEP 
>> ) specification cannot be technically implemented on all existing WSGI 
>> server/adapter implementations that any new specification should not still 
>> be called WSGI.
>> 
>> In other words, even if many of these implementations may not be used much 
>> any more, it must be able to work, without needing to mark things as 
>> optional, on CGI, FASTCGI, SCGI, mod_wsgi, gunicorn, uWSGI, Waitress, etc 
>> etc.
>> 
>> This is purely to avoid the confusion whereby implementations cannot or 
>> choose not to implement any new specification. The last thing any WSGI 
>> server author wants is having to deal with a constant stream of questions 
>> and bug reports about not supporting an updated specification where 
>> technically it was never going to be possible. We have some obligation not 
>> to inflict this on what are, in nearly all cases, volunteers in the Open 
>> Source world who work on these things in their spare time and who are not 
>> doing it as part of their paid employment.
> 
> Can I clarify this requirement a bit? Are you wanting to say that any future 
> version of WSGI must be entirely compatible with PEP : that is, may not 
> introduce optional features or change existing behaviour, only clarify? 
> Please don’t mistake this for me challenging the idea: I’m wanting to get a 
> good understanding of what you’re suggesting with this, not agreeing or 
> disagreeing at this stage.

I am saying that any update to the WSGI specification should still be able to 
be implemented using any of the existing technologies that can already 
implement WSGI.

I would see it as just causing problems to bring out an updated WSGI 
specification which couldn’t be implemented on top of CGI, FASTCGI, SCGI or 
even mod_wsgi.

Further, it does really still need to be compatible with the existing 
specifications/applications. Changes I am talking about are clarifications or 
suggesting better ways of doing stuff like wsgi.file_wrapper to avoid known 
problems or to eliminate the use of assumptions about how something works.

If a framework or application is made dependent on some new aspect of the WSGI 
specification which has no fallback because the specification was changed to 
not really be compatible with prior versions in some way then it is me as the 
author of a WSGI server who would have to endure the constant questions of why 
that framework or application doesn’t now work on mod_wsgi if the changes 
couldn’t be supported.

People will not care what version of WSGI the framework or application adhered 
to. Their attitude will be that it supports WSGI

Re: [Web-SIG] WSGI 2.0 Round 2: requirements and call for interest

2016-01-04 Thread Graham Dumpleton

> On 4 Jan 2016, at 11:27 PM, Cory Benfield  wrote:
> 
> All,
> 
> **TL;DR: What do you believe WSGI 2.0 should and should not do? Should we do 
> it at all?**
> 
> It’s a new year, and that means it’s time for another attempt to get WSGI 2.0 
> off the ground. Many of you may remember that we attempted to do this last 
> year with Rob Collins leading the charge, but unfortunately personal 
> commitments made it impossible for Rob to keep pushing that attempt forward.

Although you call this round 2, it isn’t really. Robert’s effort was not the 
first time someone has pushed a WSGI 2.0 variant. So this is more like being 
about round 5 or 6.

In part because of those repeated attempts by people to propose something and 
label it as WSGI 2.0, I am very cool on reusing the WSGI 2.0 moniker. You will 
find little or no mention of ‘WSGI 2.0’ as a label in:

https://github.com/python-web-sig/wsgi-ng 

That is probably somewhat due to my grumbling about the use of ‘WSGI 2.0’ back 
then.

Time has moved on and so the bad feelings and memories associated with the 
‘WSGI 2.0’ label due to early failed efforts have faded, but I would still 
suggest avoiding the label ‘WSGI 2.0’ if at all possible.

My general feeling is that if any proposed changes to the existing WSGI (PEP 
) specification cannot be technically implemented on all existing WSGI 
server/adapter implementations that any new specification should not still be 
called WSGI.

In other words, even if many of these implementations may not be used much any 
more, it must be able to work, without needing to mark things as optional, on 
CGI, FASTCGI, SCGI, mod_wsgi, gunicorn, uWSGI, Waitress, etc etc.

This is purely to avoid the confusion whereby implementations cannot or choose 
not to implement any new specification. The last thing any WSGI server author 
wants is having to deal with a constant stream of questions and bug reports 
about not supporting an updated specification where technically it was never 
going to be possible. We have some obligation not to inflict this on what are, 
in nearly all cases, volunteers in the Open Source world who work on these 
things in their spare time and who are not doing it as part of their paid 
employment.

> Since then, the need for a revision of WSGI has become even more apparent. 
> Casual discussion on the web has indicated that application developers are 
> uncomfortable with the limitations of WSGI. These limitations are providing 
> an incentive for both application developers and server developers to take an 
> end-run around WSGI in an attempt to get a framework that is more suitable 
> for the modern web. A great example of the result of WSGI’s deficiencies is 
> Andrew Godwin’s channels work[0] for Django, which represents a paradigm 
> shift in application development that takes it far away from what WSGI is 
> today.
> 
> For this reason, I think we need to try again to get WSGI 2.0 off the ground. 
> But I don’t believe we can do this without getting broad consensus from the 
> developer community that a revision to WSGI is needed, and without 
> understanding what developers need from a new revision of WSGI. This should 
> take into account the prior discussions we’d had on this thread: however, I’m 
> also going to actively solicit feedback from some of the more notable WSGI 
> implementers, to ensure that whatever comes out of this SIG is something that 
> they would actually use.
> 
> This WG already had a list of requirements, which are as follows:
> 
> - Support servers speaking HTTP/1.x, HTTP/2 and Websockets (potentially all 
> on a single port).

Any support for implementing WebSockets should though be seen as a separate 
requirement to implementing HTTP/2.

A specific WSGI server implementation may be able to support HTTP/2, but not 
support WebSockets, or it could support WebSockets via HTTP/1.x already. In 
fact basic request/response functionality of HTTP/2 maps into the existing WSGI 
API specification and doesn’t really require any changes be made to the WSGI 
specification.

For example, mod_wsgi already supports HTTP/2 by virtue of the fact that the 
mod_h2 module in Apache exists. The existing internal APIs of Apache and how 
mod_wsgi uses those means that HTTP/2 bridges into the WSGI world with no code 
changes to mod_wsgi.

To support WebSockets is a much bigger problem and is not achievable with CGI, 
FASTCGI, SCGI.

It may be able to be supported within the Apache/mod_wsgi implementation, but 
the major re-architecting required in the mod_wsgi code, and the fact that it 
couldn’t be done by simply exposing a socket, but by requiring a new high level 
abstract API be developed which doesn’t expose the actual socket object, means 
you are really talking about a whole new API.

To me the WebSocket requirement and the need for a completely new API rules out 
ever doing this as part of an updated WSGI specification. It should really be

Re: [Web-SIG] REMOTE_ADDR and proxys

2014-10-13 Thread Graham Dumpleton

On 13/10/2014, at 11:26 PM, Benoit Chesneau  wrote:

> 
> 
> On Sun, Oct 12, 2014 at 11:38 PM, Robert Collins  
> wrote:
> On 30 September 2014 11:47, Alan Kennedy  wrote:
> 
> > [Robert]
> >> So it sounds like it should be the responsibility of a middleware to
> >> renormalize the environment?
> >
> > In order for that to be the case, you have strictly define what
> > "normalization" means.
> 
> For a given deployment it's well defined. I agree that in general it's not.
> 
> > I believe that it is not possible to fully specify "normalization", and that
> > any attempt to do so is futile.
> >
> > If you want to attempt it for the specific scenarios that your particular
> > application has to deal with, then by all means code your version of
> > "normalization" into your application. Or write some middleware to do it.
> >
> > But trying to make "normalization" a part of a WSGI-style specification is
> > impossible.
> 
> I don't recall proposing that it should be in a WSGI-style spec.
> 
> -Rob
> 
> --
> Robert Collins 
> Distinguished Technologist
> HP Converged Cloud
> 
> 
> This whole issue looks like the problem raised (and not yet solved) recently in 
> Gunicorn when REMOTE_ADDR started being handled more strictly and we removed 
> all the X-Forwarded-* headers handling:
> 
> https://github.com/benoitc/gunicorn/issues/797
> 
> There is another case to take in consideration, when your server is answering 
> on unix sockets, so you don't have any TCP address to present. For now we 
> answer with an empty field. 
> 
> Also some application frameworks recently removed the middleware handling 
> X-Forward-* headers. I wonder why.
> 
> 
> There is an RFC for forward headers: http://tools.ietf.org/html/rfc7239 . For 
> me instead of trying to change the strict behaviour of REMOTE_ADDR I wonder 
> if we shouldn't rather add a new field to the environ. Thoughts?

My prior thinking on this was that REMOTE_ADDR should be left alone.

If front end proxies support RFC-7239 and pass them through you are all good.

If you are in a situation where a front end proxy doesn't support RFC-7239 but 
uses the prior convention of X-Forwarded-* headers, then one could take the 
older headers and construct the new RFC-7239 headers and drop the old 
X-Forwarded-* headers.

In other words, converge on the new convention set by RFC-7239 by translating 
the old way of doing things to the new. This way a WSGI application can be 
coded up just to check for the new header and not have to deal with both.

The actual translation from old headers to new could be done by a WSGI 
middleware or an optionally enabled WSGI server feature. Either way it doesn't 
need to be part of the WSGI specification.
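As a rough sketch of the middleware option, and not something taken from any existing server, the translation could look like the following. The names `legacy_to_forwarded` and `ForwardedTranslator` are made up for illustration, only X-Forwarded-For is handled, and the choice to leave an existing Forwarded header alone is an assumption rather than a recommendation.

```python
import ipaddress


def legacy_to_forwarded(value):
    """Translate an X-Forwarded-For value into an RFC 7239 Forwarded value."""
    elements = []
    for hop in value.split(","):
        hop = hop.strip()
        if not hop:
            continue
        try:
            address = ipaddress.ip_address(hop)
        except ValueError:
            # Not a bare IP address; RFC 7239 allows the token "unknown".
            elements.append("for=unknown")
            continue
        if address.version == 6:
            # RFC 7239 requires IPv6 addresses to be bracketed and quoted.
            elements.append('for="[%s]"' % hop)
        else:
            elements.append("for=%s" % hop)
    return ", ".join(elements)


class ForwardedTranslator:
    """Hypothetical middleware: converge on Forwarded, drop X-Forwarded-For."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        # Always remove the old header so the application only sees the new one.
        legacy = environ.pop("HTTP_X_FORWARDED_FOR", None)
        # Assumption: an existing Forwarded header is left untouched.
        if legacy and "HTTP_FORWARDED" not in environ:
            translated = legacy_to_forwarded(legacy)
            if translated:
                environ["HTTP_FORWARDED"] = translated
        return self.application(environ, start_response)
```

Wrapping an application as `ForwardedTranslator(application)` would then let the application check only `HTTP_FORWARDED`. Whether the values can be trusted is a separate question.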

As noted by others, the issue though is how much you trust the information 
passed in by the headers and does it capture entirely the existence of multiple 
hops.

In the case of REMOTE_ADDR it is added by the web server based on actual socket 
information and so there is no way a client can supersede it.

The X-Forwarded-* and Forwarded headers have the problem that a client can set 
them itself.

Now that there are multiple ways of denoting it, which takes precedence and 
which do you trust? If your proxies use X-Forwarded-* but an HTTP client sets 
Forwarded, what do you do?

Ultimately, whether you use a WSGI middleware or a WSGI server which provides a 
built-in function for the typical case (optionally enabled), it has to be 
configurable to the point of an administrator being able to say what are the 
trusted headers. You may also want to be able to say what the IPs of proxies 
are that you want to trust if practical. This must be something an 
administrator can do and not be dependent on developers embedding it within 
an application, which is why a builtin mechanism with a WSGI server may be 
preferred.

Anyway, this way a system administrator can say whether it is expected that a 
proxy only sets X-Forwarded-* and not Forwarded or vice versa and who to trust. 
You likely can't just have a default strategy if you want to be safe.
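To make the administrator-controlled part concrete, here is a minimal sketch, assuming the trusted proxy networks and the trusted header keys are supplied from configuration. `TrustedProxyFilter` and `FORWARDING_KEYS` are hypothetical names, and the policy of discarding anything not explicitly trusted is only one possible choice.

```python
import ipaddress

# WSGI environ keys for the forwarding headers discussed above.
FORWARDING_KEYS = (
    "HTTP_FORWARDED",
    "HTTP_X_FORWARDED_FOR",
    "HTTP_X_FORWARDED_PROTO",
    "HTTP_X_FORWARDED_HOST",
)


class TrustedProxyFilter:
    """Hypothetical middleware: keep forwarding headers only when trusted."""

    def __init__(self, application, trusted_networks, trusted_keys):
        self.application = application
        self.trusted_networks = [
            ipaddress.ip_network(network) for network in trusted_networks
        ]
        self.trusted_keys = frozenset(trusted_keys)

    def _peer_is_trusted(self, environ):
        try:
            peer = ipaddress.ip_address(environ.get("REMOTE_ADDR", ""))
        except ValueError:
            # For example an empty REMOTE_ADDR when serving on a UNIX socket.
            return False
        return any(peer in network for network in self.trusted_networks)

    def __call__(self, environ, start_response):
        trusted = self._peer_is_trusted(environ)
        for key in FORWARDING_KEYS:
            # Drop headers from untrusted peers and headers not opted in to.
            if not trusted or key not in self.trusted_keys:
                environ.pop(key, None)
        return self.application(environ, start_response)
```

An administrator whose proxies sit in 10.0.0.0/8 and only set Forwarded might configure this as `TrustedProxyFilter(application, ["10.0.0.0/8"], ["HTTP_FORWARDED"])`.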

Another issue to consider is header spoofing, which not all WSGI servers 
protect against at the moment.

The spoofing problem is because of the CGI rule around how header names are 
converted. That is:

   Meta-variables with names beginning with "HTTP_" contain values read
   from the client request header fields, if the protocol used is HTTP.
   The HTTP header field name is converted to upper case, has all
   occurrences of "-" replaced with "_" and has "HTTP_" prepended to
   give the meta-variable name.  The header data can be presented as
   sent by the client, or can be rewritten in ways which do not change
   its semantics.  If multiple header fields with the same field-name
   are received then

Re: [Web-SIG] Draft 2: WSGI Response Upgrade Bridging

2014-10-10 Thread Graham Dumpleton

On 07/10/2014, at 7:15 AM, PJ Eby  wrote:

> As before, you can find a "living" HTML version of the draft in progress at:
> 
>   https://gist.github.com/pjeby/62e3892cd75257518eb0
> 
> (In addition to nice formatting, it also has a clickable table of contents.)
> 
> After the next round of feedback, I plan to convert this to reST and
> get a PEP number assigned -- assuming nobody comes up with a killer
> problem that sends me back to the drawing board, of course.  ;-)

For those who were not aware, I personally haven't commented as yet on this 
discussion because I have been on a holiday for the last few weeks and I wasn't 
going to allow a discussion about this to ruin my holiday. 

I haven't caught up yet on all the discussion, but it is sad to say that it has 
headed down a direction exactly as I warned Robert Collins in private 
discussions would likely happen, with certain people trying to rush things to 
push their own specific idea for how things should be done, with the risk that 
that will dominate the agenda and so push Robert out of the way as far as 
trying to coordinate this as a community effort where anyone could feel 
confident about providing input with the result then also being a community 
effort.

So PJE, please step back and do not go rushing out to create a PEP. That is the 
worst thing you could do at this point and will only serve to deter people from 
the community contributing and so stifle proper discussion about this whole 
topic. You have no more experience or mandate to be specifying a standard for 
this than anyone else. By creating a PEP though that gets perceived by many as 
meaning the discussion is over. This is exactly what you did for PEP  and 
which caused previous discussion about improving WSGI to get shut down. The 
result was that the only thing that really got addressed in PEP  was Python 
3 compatibility and a lot of the other bits of the WSGI specification which are 
poorly defined, contradictory or restrictive and which cause WSGI server and 
application developers pain never got addressed. If that prior discussion 
hadn't been shut down in that way, we could have been using a better defined 
and improved WSGI years ago already.

Robert has stuck his neck out to try and bring various parties together to work 
on this where anyone who has an opinion or idea can raise them so we as a 
community can all together come up with something which is workable for both 
server implementers and web application developers.

Robert even set up a GitHub repo specifically as a place to bring together all 
those ideas and described how people can add stuff there. For whatever personal 
reason you have decided to ignore that repo Robert set up and decided to go 
alone. If you have an issue with the way the repo was structured which didn't 
make it easy for you to contribute your work into it, then work with Robert to 
address that. Right now, that you have created your own separate space for 
writing up a specification which you are now trying to rush into a PEP comes 
across as you not really wanting to co-ordinate with Robert on this as a 
community effort with it instead appearing that you think you know better than 
anyone else and nothing anyone else says will be of value. In the face of that, 
it is hardly surprising that no one has really responded to what you have 
proposed.

So slow down. This is not a race to see who can be the first to come out with a 
PEP and so dominate the discussion, it is meant to be a community effort.

Robert. What I would suggest you do is reboot this whole effort.

Go back and perhaps look at how the GitHub repo you set up is structured and 
make it more obvious how anyone can add their work into it in separate areas of 
it as need be and not just as issues, if that isn't already clear enough. 
Document exactly what you want people to do as far as adding anything there. 
Find people who will work with you on making all this clearer and defining any 
process.

The next step is to make a more definite statement about the timeline for this 
whole discussion.

Specifically, give notice of a formal request for comment period and publicise 
it through any Python blogs of the PSF that might be able to be used, as well 
as through the different Python web communities. Also get prominent individuals 
in the Python WSGI and web community to also publicise the comment period.

Set a specific date for the end of that comment period. There should be no rush 
on this and people should be given adequate time to respond. Most interested 
parties would only do this in their spare time and employers aren't going to 
allow them to waste their work time on it. So make the comment period something 
like 2 months from the date of announcing it.

What can people comment on?

They may want to comment on the process itself of how we get to the various 
specifications that may come out of this.

They may want to comment on what should even be addressed in any

Re: [Web-SIG] WSGI for HTTP/2.0 ?

2014-09-19 Thread Graham Dumpleton

On 20/09/2014, at 3:49 PM, Roberto De Ioris  wrote:

> I can help a bit (i am the uWSGI lead developer and a nginx and Cherokee
> contributor, and i have already implemented a spdy3 server last year)
> 
> I honestly think that WSGI by itself needs a complete rewrite/rethink to
> be adapted to modern (ok someone could say 'fashioned') patterns (that are
> somewhat more 'urgent' than HTTP/2), but i agree that starting thinking
> about HTTP/2 could be a good thing.

I agree.

Overhauling WSGI has more relevance because an underlying web server updating 
itself to support HTTP 2.0 will in the main have little relevance at the 
application layer as the web server is more than likely to have an adapter 
layer which makes things look the same to existing modules/protocol adapters.

In other words, Apache adding support for HTTP 2.0 isn't going to result in 
some sort of wholesale change of the Apache module interface. It would stay the 
same whether or not HTTP 2.0 is used, especially when it is just an alternate 
way of doing the same thing as HTTP 1.1. In that respect, since no HTTP 2.0 
specific functionality is going to be made visible through existing 
interfaces, Apache modules or adapters for FASTCGI/SCGI etc or even mod_wsgi 
are simply not going to change.

So, overhaul WSGI as the primary aim, but within that factor in things to allow 
for HTTP 2.0 functionality.

The problem with trying to overhaul WSGI is that if it is done in an open forum 
like the Web-SIG it will die of a thousand cuts, as past efforts to update it 
in even minor ways have suffered.

The only way that WSGI itself will ever see an overhaul is through the strong 
willed determination of a few people off list, out of sight, to allow it to 
be fully fleshed out, with input coming from direct consultation with and 
review by other related parties who have a vested interest or significant 
experience in the area.

I may be up for such an off list effort, but be warned I may want to run 
roughshod over it and exert quite a lot of influence over the process and 
outcome. :-)

Graham
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
https://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] [python-tulip] Re: [Python-Dev] wsgi validator with asynchronous handlers/servers

2013-04-26 Thread Graham Dumpleton

I described a different way of doing WSGI which would better cope with post 
response hooks at the Python Web Summit at PyCon in 2012. It made use of the 
context manager abstraction so it wouldn't screw with the returned iterable.

http://www.slideshare.net/GrahamDumpleton/pycon-us-2012-state-of-wsgi-2-14808297

Graham

On 27/04/2013, at 2:36 PM, est  wrote:

> Hi,
> 
> Newbie opinion here.
> 
> Since we are talking about Tulip and PEP 3156, I think it's high time we 
> address some of the design flaws in WSGI 1.0
> 
> One major problem with WSGI is that it can not handle true post-response 
> hooks.
> 
> The closest hack I found is this:
> https://modwsgi.readthedocs.org/en/latest/developer-guides/registering-cleanup-code.html
> 
> 
> As discussed by Graham Dumpleton here
> https://groups.google.com/group/modwsgi/msg/d699a09b3b11b313
> 
> Although the response was returned to the client, It will still hold the http 
> connection open until __callback finishes.
> 
> While it's pretty common design pattern for a post-response hook in modern 
> Web world. I can think a few usage:
> 
>  - User uploads file, return HTML says Upload OK, then Web worker continue to 
> transfer file to Amazon S3, which is slow and takes some time.
>  - After a series of user interaction on a web page, using the existing db 
> connection to write OLAP logs of later analysis.
>  - notify the http request to another ZMQ/XMPP connection
> 
> Currently, Celery is extremely popular (at least in Django or other non-async 
> web frameworks). But IMHO it's too heavy weight and copying python data & 
> objects from a cluster of Web workers to another cluster of task queue 
> workers is not worth it.
> 
> Another problem is the good old CGI environ design. I can't help to ask? Why?
> 
> Every HTTP header is transferred via environ, and capitalized with a HTTP_ 
> prefix e.g. HTTP_HOST. There's some serious information loss here.
> 
> 1. Actual header string case 
> 2. header order
> 
> Since WSGI is higher level framework, I think it's time for us to deliver the 
> original header status in a SortedDict.
> 
> Again, as a newbie advice, we should take this chance of integrating PEP 3156 
> with a deadly simple WSGI 3.0 design:
> 
> def application(request):
>     ip = request.remote_ip
>     length = request.headers["Content-Length"]
>     request.write("done.")
>     request.close()
>     db.log(length) # some post-response actions.
> 
> 
> 
> On Mon, Mar 25, 2013 at 9:08 AM, Guido van Rossum  wrote:
> Hi Luca,
> 
> Unfortunately I haven't thought yet about the interactions between WSGI and 
> Tulip or PEP 3156. While I am pretty familiar with WSGI, I have never used 
> its async features, so I can't be much of a help. My best guess is that we 
> won't make any changes to WSGI to support PEP 3156 in Python 3.4, but that 
> once that is out, some folks will come up with an improved design for WSGI 
> that supports interoperability with standard async event loops. OTOH, maybe 
> you can read up on the PEP and check out the Tulip implementation 
> (http://code.google.com/p/tulip/) and maybe you can come up with a suitable 
> design for integrating PEP 3156 into WSGI? Though it may have to be named 
> WSGI 2.0 to emphasize that it is backwards incompatible.
> 
> --Guido
> 
> 
> 
> On Sun, Mar 24, 2013 at 2:18 PM, Luca Sbardella  
> wrote:
> Hello,
> 
> first time here, I'm Luca and I write lots of python of the asynchronous 
> variety.
> This question is about wsgi and the way pulsar 
> http://quantmind.github.com/pulsar/ handles asynchronous wsgi responses.
> 
> Yesterday I sent a message to the python-dev mailing list regarding 
> wsgiref.validator, this is the original message
> 
> I have an asynchronous wsgi application handler which yields empty bytes 
> before it is ready to yield the response body and, importantly, to call 
> start_response.
> 
> Something like this:
> 
> def wsgi_handler(environ, start_response):
>     body = generate_body(environ)
>     body = maybe_async(body)
>     while is_async(body):
>         yield b''
>     start_response(...)
>     ...
> 
> I started using wsgiref.validator recently, nice little gem in the standard 
> lib, and I discovered that the above handler does not validate! Disaster.
> Reading pep 
>  
> "the application must invoke the start_response() callable before the 
> iterable yields its first body bytestring, so that the server can send the 
> headers before any body content. However, this invocation may be performed by 
> the iterable's first iterat

Re: [Web-SIG] [modwsgi] Hop-by-hop headers

2012-08-09 Thread Graham Dumpleton

Probably better off asking on the Python WEB-SIG. I have cc'd this there.

http://www.python.org/community/sigs/current/web-sig/

Someone has probably felt that wsgiref implementation should somehow
be checking for things which aren't notionally allowed but which go
beyond just API usage checks. Checking for hop by hop headers should
possibly have been the job of the wsgiref.validator and not the server
in wsgiref.

I know of no other server which will outright error when a hop by hop
header is returned by an application, and as you note, there are
times when it is useful to pass back Connection to ensure that
the web server/client drops the current connection and doesn't try and
maintain a keep alive connection.
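For anyone wanting to see which of their response headers wsgiref would object to, the standard library exposes the check as `wsgiref.util.is_hop_by_hop()`. A small sketch, where the helper name `hop_by_hop_headers` is just made up for illustration:

```python
from wsgiref.util import is_hop_by_hop


def hop_by_hop_headers(response_headers):
    """Return the names of any hop-by-hop headers in a WSGI header list."""
    return [name for name, _value in response_headers if is_hop_by_hop(name)]


headers = [("Content-Type", "text/plain"), ("Connection", "close")]
offending = hop_by_hop_headers(headers)
```

With the headers above, `offending` contains only `Connection`.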

Graham

On 10 August 2012 02:43, Ron Garret  wrote:
> I'm not sure this is the right place to ask this question because it's not 
> really about modwsgi but this is the best place I know to find expertise 
> about WSGI in general.
>
> Yesterday I fired up some old code using the wsgiref server and got the 
> following error:
>
> "Hop-by-hop headers not allowed"
>
> This turned out to be caused by my code including a "Connection: close" 
> header in order to work around an old Safari bug.  Trick is, the last time I 
> ran this code under wsgiref it worked, and it hasn't changed.  And when I run 
> it under modwsgi it works.
>
> So my question is: does anyone here know why wsgiref doesn't allow connection 
> headers?  And did this change recently?  Looking at the wsgiref code it seems 
> to reject Connection headers at least as far back as Python 2.6.  My code is 
> old, but it's not that old (less than two years).  I'm pretty sure it has run 
> successfully under Python2.6 if not 2.7.  It contains WITH statements, so the 
> last time I ran it could not have been under anything earlier than 2.6.
>
> I'm baffled.  Can anyone here shed any light on this?
>
> Thanks,
> rg
>
> --
> You received this message because you are subscribed to the Google Groups 
> "modwsgi" group.
> To post to this group, send email to modw...@googlegroups.com.
> To unsubscribe from this group, send email to 
> modwsgi+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/modwsgi?hl=en.
>
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] question about connection pool, task queue in WSGI

2012-07-13 Thread Graham Dumpleton

> On 13 July 2012 07:18, est  wrote:
>> Thanks for the answer. That's very helpful info.
>>
>>>  Only by changing the Django code base from memory. Better off asking
>> on the Django users list.
>>
>> Is my idea was good or bad? (make wsgi handle connection pools, instead of
>> wsgi apps)
>>
>> I read Tarek Ziadé last month's experiement of re-use tcp port by specify
>> socket FDs. It's awesome idea and code btw. I have couple of questions about
>> it:
>>
>> 1. In theory, I presume it's also possible with db connections? (After wsgi
>> hosting worker ended, handle the db connection FD to the next wsgi worker)

Unlikely. HTTP connections are stateless, open database connections
are highly unlikely to be stateless, with the client likely caching
certain session information.

>> 2. Is the socket FD the same mechanism like nginx? If you upgrade nginx
>> binary, restart nginx, the existing http connection won't break.

I would be very surprised if you could upgrade nginx, perform a
restart and preserve the HTTP listener socket. If you are talking
about some other socket I don't know what you are talking about.

As you can with Apache, you can likely enact a configuration file
change and perform a restart or trigger rereading of the configuration
and it would maintain the HTTP listener socket across the
configuration restart, but an upgrade implies changing the binary and
I know no way that you could easily persist a HTTP listener socket
across to an invocation of a new web server instance using a new
executable. In Apache you certainly cannot do it, and unless nginx has
some magic where the existing nginx execs the new nginx version and
somehow communicates through open socket connections to the new
process, I very much doubt it would as it would be rather messy to do
so.

>> 3. Is my following understanding of wsgi model right?
>>
>> A wsgi worker process runs the wsgi app (like django), multiple requests are
>> handled by the same process, the django views process these requests and
>> returns responses within the same process (possible in fork or threaded way,
>> or even both?). After a defined number of requests the wsgi worker
>> terminates and spawns the next wsgi worker process.

Different WSGI servers would behave differently, especially around
process control, but your model of understanding is close enough.

>> Before hacking into a task queue based on pure wsgi code, I want to make
>> sure my view of wsgi is correct. :)

Would still suggest you just use an existing solution.

Graham

>> Please advise! Thanks in advance!
>>
>>
>> On Fri, Jul 13, 2012 at 11:31 AM, Graham Dumpleton
>>  wrote:
>>>
>>> On 12 July 2012 19:50, est  wrote:
>>> > Hi list,
>>> >
>>> > I am running a site with django + uwsgi, I have few questions about how
>>> > WSGI
>>> > works.
>>> >
>>> > 1. Is db connection open/close handled by Django? If it's open/closed
>>> > per
>>> > request,
>>>
>>> Yes it is.
>>>
>>> > can we make a connection pool in wsgi level, then multiple django
>>> > views can share it?
>>>
>>> Only by changing the Django code base from memory. Better off asking
>>> on the Django users list.
>>>
>>> > 2. As a general design consideration, can we execute some task *after*
>>> > the
>>> > response has returned to client? I have some heavy data processing need
>>> > to
>>> > be done after return HttpResponse() in django, the standard way to do
>>> > this
>>> > seems like Celery or other task queue with a broker. It's just too
>>> > heavyweight. Is it possible to do some simple background task in WSGI
>>> > directly?
>>>
>>> Read:
>>>
>>> http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode
>>>
>>> In doing this though, it ties up the request thread and so it would
>>> not be able to handle other requests until your task has finished.
>>>
>>> Creating background threads at the end of a request is not a good idea
>>> unless you do it using a pooling mechanism such that you limit the
>>> number of worker threads for your tasks. Because the process can crash
>>> or be shut down, you lose the job as it is only in memory and thus not
>>> persistent.
>>>
>>> Better to use Celery, or if you think that is too heavy weight, have a
>>> look at Redis Queue (RQ) instead.
>>>
>>> Graham
>>
>>

Re: [Web-SIG] question about connection pool, task queue in WSGI

2012-07-13 Thread Graham Dumpleton

Please keep replies in the mailing list.

Graham

On 13 July 2012 07:18, est  wrote:
> Thanks for the answer. That's very helpful info.
>
>>  Only by changing the Django code base from memory. Better off asking
> on the Django users list.
>
> Is my idea was good or bad? (make wsgi handle connection pools, instead of
> wsgi apps)
>
> I read Tarek Ziadé last month's experiement of re-use tcp port by specify
> socket FDs. It's awesome idea and code btw. I have couple of questions about
> it:
>
> 1. In theory, I presume it's also possible with db connections? (After wsgi
> hosting worker ended, handle the db connection FD to the next wsgi worker)
>
> 2. Is the socket FD the same mechanism like nginx? If you upgrade nginx
> binary, restart nginx, the existing http connection won't break.
>
> 3. Is my following understanding of wsgi model right?
>
> A wsgi worker process runs the wsgi app (like django), multiple requests are
> handled by the same process, the django views process these requests and
> returns responses within the same process (possible in fork or threaded way,
> or even both?). After a defined number of requests the wsgi worker
> terminates and spawns the next wsgi worker process.
>
> Before hacking into a task queue based on pure wsgi code, I want to make
> sure my view of wsgi is correct. :)
>
> Please advise! Thanks in advance!
>
>
> On Fri, Jul 13, 2012 at 11:31 AM, Graham Dumpleton
>  wrote:
>>
>> On 12 July 2012 19:50, est  wrote:
>> > Hi list,
>> >
>> > I am running a site with django + uwsgi, I have few questions about how
>> > WSGI
>> > works.
>> >
>> > 1. Is db connection open/close handled by Django? If it's open/closed
>> > per
>> > request,
>>
>> Yes it is.
>>
>> > can we make a connection pool in wsgi level, then multiple django
>> > views can share it?
>>
>> Only by changing the Django code base from memory. Better off asking
>> on the Django users list.
>>
>> > 2. As a general design consideration, can we execute some task *after*
>> > the
>> > response has returned to client? I have some heavy data processing need
>> > to
>> > be done after return HttpResponse() in django, the standard way to do
>> > this
>> > seems like Celery or other task queue with a broker. It's just too
>> > heavyweight. Is it possible to do some simple background task in WSGI
>> > directly?
>>
>> Read:
>>
>> http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode
>>
>> In doing this though, it ties up the request thread and so it would
>> not be able to handle other requests until your task has finished.
>>
>> Creating background threads at the end of a request is not a good idea
>> unless you do it using a pooling mechanism such that you limit the
>> number of worker threads for your tasks. Because the process can crash
>> or be shut down, you lose the job as it is only in memory and thus not
>> persistent.
>>
>> Better to use Celery, or if you think that is too heavy weight, have a
>> look at Redis Queue (RQ) instead.
>>
>> Graham
>
>

Re: [Web-SIG] question about connection pool, task queue in WSGI

2012-07-12 Thread Graham Dumpleton

On 12 July 2012 19:50, est  wrote:
> Hi list,
>
> I am running a site with django + uwsgi, I have few questions about how WSGI
> works.
>
> 1. Is db connection open/close handled by Django? If it's open/closed per
> request,

Yes it is.

> can we make a connection pool in wsgi level, then multiple django
> views can share it?

Only by changing the Django code base from memory. Better off asking
on the Django users list.

> 2. As a general design consideration, can we execute some task *after* the
> response has returned to client? I have some heavy data processing need to
> be done after return HttpResponse() in django, the standard way to do this
> seems like Celery or other task queue with a broker. It's just too
> heavyweight. Is it possible to do some simple background task in WSGI
> directly?

Read:

http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode

In doing this though, it ties up the request thread and so it would
not be able to handle other requests until your task has finished.
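The general idea, sketched very loosely and not copied from the linked page, is to wrap the iterable returned by the application so that the server's call to `close()` at the end of the request also runs your callback. The class name `RunAfterResponse` is hypothetical.

```python
class RunAfterResponse:
    """Wrap a WSGI response iterable and run a callback when it is closed."""

    def __init__(self, iterable, callback):
        self._iterable = iterable
        self._callback = callback

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            # Honour close() on the wrapped iterable, as WSGI requires.
            close = getattr(self._iterable, "close", None)
            if close is not None:
                close()
        finally:
            # Runs in the request thread, so the thread stays busy until done.
            self._callback()
```

An application would return `RunAfterResponse(result, cleanup)` in place of `result`, where `cleanup` is whatever function does the deferred work.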

Creating background threads at the end of a request is not a good idea
unless you do it using a pooling mechanism such that you limit the
number of worker threads for your tasks. Because the process can crash
or be shut down, you lose the job as it is only in memory and thus not
persistent.
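If you do go down the pooled route anyway, a minimal sketch using the standard library might look like this. The pool size of 2 and the name `run_later` are arbitrary assumptions, and queued jobs are still only held in memory.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical module-level pool; max_workers caps concurrent background jobs.
_background = ThreadPoolExecutor(max_workers=2)


def run_later(task, *args):
    """Queue a task on the bounded pool and return its Future."""
    return _background.submit(task, *args)
```

A request handler would call `run_later(some_function, argument)` and return its response straight away, leaving the job to a pool thread.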

Better to use Celery, or if you think that is too heavy weight, have a
look at Redis Queue (RQ) instead.

Graham

Re: [Web-SIG] Large, fixed latency on every wsgiref response

2012-06-07 Thread Graham Dumpleton

If on Windows then try using 127.0.0.1 instead of localhost.

There are known issues with Windows whereby localhost actually
resolves to an IPv6 address of ::1 rather than an IPv4 address of
127.0.0.1. For reasons I can't remember, this causes an initial delay
in connections.
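A quick way to see what a name resolves to on a given machine, and in what order, is to ask the resolver directly. This is only a diagnostic sketch; the function name `resolved_addresses` is made up and the output depends entirely on the local system.

```python
import socket


def resolved_addresses(host, port=80):
    """Return the addresses the system resolver gives for host, in order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [info[4][0] for info in infos]


print(resolved_addresses("localhost"))
```

If `::1` is listed ahead of `127.0.0.1` and the server is only listening on IPv4, that would be consistent with the delay described above.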

Graham

On 8 June 2012 11:28, Matt Chaput  wrote:
>> Are you using an IP address or DNS name?
>>
>> http://appletoolbox.com/2010/09/fix-safari-slowness-stalled-page-loads-by-disabling-dns-prefetching/
>> http://support.apple.com/kb/TS2296
>
> "localhost", and this is on Windows, not Mac OS X. Also, as mentioned, this 
> problem shows up to different degrees in Safari, Chrome, and Firefox. At 
> least for me.
>
> At first glance the CherryPy server seems better and Waitress seems not to 
> have the problem, but I haven't devoted too much time to testing them yet.
>
> Matt
>

Re: [Web-SIG] Large, fixed latency on every wsgiref response

2012-06-07 Thread Graham Dumpleton

Are you using an IP address or DNS name?

http://appletoolbox.com/2010/09/fix-safari-slowness-stalled-page-loads-by-disabling-dns-prefetching/
http://support.apple.com/kb/TS2296

Graham

On 8 June 2012 07:09, Matt Chaput  wrote:
> I'm using Paste script to configure a wsgiref server on Windows. And I'm
> seeing some weird stuff.
>
> On Safari, every request gets almost exactly 1 second of latency tacked on
> (the amount listed in the network diagnostics pane varies per request:
> 1.03s, 1.09s, 1.08s, 1.12s...). Every request. Even when the actual response
> takes practically no time (e.g. a 304), the connection latency is huge.
>
> On Chrome, the latency is smaller (around 300ms) and not on every request.
> Hovering over a request with the latency in Chrome's network pane shows the
> following information:
>
>  DNS Lookup: 1ms
>  Connecting: 302ms
>  Sending   : 0
>  Waiting   : 15ms
>  Receiving : 27ms
>
> Firefox also shows a large (1s) "connecting" time for some requests and no
> delay on other requests in the Firebug net pane.
>
> The only reason page load is barely tolerable is because at least with
> threading some of the delays are in parallel, but it's still slow.
>
> I have no idea what's going on here. Any ideas?
>
> Thanks,
>
> Matt

Re: [Web-SIG] Fwd: Can writing to stderr be evil for web apps?

2012-05-19 Thread Graham Dumpleton

On 19 May 2012 22:36, anatoly techtonik  wrote:
> Hi,
>
> Martin expressed concerns that using logging module with stderr output
> can break web applications, such as PyPI. I couldn't find any proof or
> denials for this fact, and it became a showstopper for me for further
> contributions to PyPI, because clearly I can't write good code without
> the sense of confidence.
>
> Here is the commit in question:
> https://bitbucket.org/techtonik/pypi-techtonik/changeset/5396f8c60d49
>
> I was redirected here from python-dev. So can anybody tell where are
> those stdout/stderr fears coming from and how to dispell them?
> (include in WSGI notes)

Part of this is likely going to be due to my deliberate action in
early versions of mod_wsgi to prohibit reading from stdin and writing
to stdout. This was specifically done to highlight that portable WSGI
applications shouldn't be working with stdin/stdout because in doing so
they would break under a CGI/WSGI bridge written to conform to the
example in the WSGI PEP.

People don't like to change though and so a lot of people would say
that mod_wsgi was broken and/or that one couldn't use stdout under
Apache even though there was a configuration option there to remove
the artificial limitation. Over time I saw this grow in to a bit of a
myth that one couldn't even use stderr and started to see stupid
things like people remapping both stdout and stderr to their own files
or even /dev/null.

This artificial limitation was removed some time ago, and it can now be
optionally enabled if you want to test WSGI application portability.
However, certain LTS Linux distros still ship ancient mod_wsgi
versions where the limitation is the default.

So I tried to enforce something to make people do the right thing, but
people prefer to write crap code I guess.

An old blog post where I bemoaned all this can be found at:

http://blog.dscpl.com.au/2009/04/wsgi-and-printing-to-standard-output.html

If the CGI/WSGI bridge example in the PEP had simply saved away the
original stdin/stdout and replaced them with an empty StringIO and
stderr respectively, this possibly would never have become the issue it
was. My own CGI/WSGI bridge does exactly that now so as to allow people
writing crap code to still run it on CGI/WSGI if they really need to:

https://github.com/GrahamDumpleton/cgi2wsgi
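
The save-and-replace idea can be sketched roughly as follows. This is
only an illustration of the approach described above, not the actual
cgi2wsgi code; the helper name protected_stdio is hypothetical.

```python
import io
import sys
from contextlib import contextmanager


@contextmanager
def protected_stdio():
    """Hide the real stdin/stdout from application code for the duration."""
    saved_stdin, saved_stdout = sys.stdin, sys.stdout
    sys.stdin = io.StringIO()   # reads by the application see an empty stream
    sys.stdout = sys.stderr     # stray print() output ends up on stderr
    try:
        # The bridge itself keeps using the saved streams for the CGI
        # request body and the response.
        yield saved_stdin, saved_stdout
    finally:
        sys.stdin, sys.stdout = saved_stdin, saved_stdout
```

A bridge written this way would read the request from the saved stdin
and write the response to the saved stdout, while anything the
application prints lands on stderr instead of corrupting the response.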

Graham

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2012-04-12 Thread Graham Dumpleton

Christian

The wsgi.org domain has reverted back to point to 'DZUG-Sprints' site
rather than to www.wsgi.org or wsgi.readthedocs.org.

Can you see what is going on?

Thanks.

Graham



On 20 September 2011 07:42, Graham Dumpleton  wrote:
> Christian. The DNS entry is actually wrong. Got this from Eric:
>
>  GrahamDumpleton: wanted to let you know they changed the DNS for wsgi.org,
>  but they pointed it at wsgi.readthedocs.org, so I made this project
> in its place
>  so the CNAME would resolve: http://readthedocs.org/projects/wsgi
>
>  that isn't yours: http://readthedocs.org/projects/wsgiorg/
>
>  I can either rename the slug on yours, or you can get them to change the DNS
>
> Don't change the DNS though as I reckon it may be better that we claim:
>
>  wsgi.readthedocs.org
>
> Will stop someone else claiming generic WSGI for some project.
>
> Eric, yes, please change the slug.
>
> Thanks.
>
> Graham
>
> On 20 September 2011 06:42, Graham Dumpleton  
> wrote:
>> Thanks.
>>
>> One thing we should do now is create a page with instructions on how
>> you can contribute changes back via github project.
>>
>> Graham
>>
>> On 19 September 2011 23:30, Christian Theune  wrote:
>>> Hi,
>>>
>>> On 09/19/2011 11:33 AM, Christian Theune wrote:
>>>>
>>>> OK, I updated our database. The nameservers should start propagating
>>>> this in an hour or so.
>>>
>>> After some messing around with CNAMES and such I added a redirect from
>>> wsgi.org -> www.wsgi.org and a CNAME of www.wsgi.org to readthedocs.
>>>
>>> I also added a placeholder page while the DNS updates are in progress
>>> including a link to the direct readthedocs.org address.
>>>
>>> Hope this helps,
>>> Christian
>>>
>>>
>>> --
>>> Christian Theune · c...@gocept.com
>>> gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
>>> http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
>>> Zope and Plone consulting, development, hosting, operations
>>>
>>

Re: [Web-SIG] A more useful command-line wsgiref.simple_server?

2012-04-01 Thread Graham Dumpleton

On 2 April 2012 15:08, Graham Dumpleton  wrote:
> On 2 April 2012 14:54, Sasha Hart  wrote:
>> I would personally not +x a module file just to serve an app with wsgiref
>> from the hashbang line; it's clever but I can't come up with any real
>> benefit. A case where I'm serving with wsgiref already has to be pretty
>> trivial and I'm not going to couple to it *from inside the module itself*
>> when it is so darned easy to just run the module from several nice python
>> test servers (also portable and I can use autoreload, etc.) But if this is
>> desired by many others, I'd agree it's a good factor to consider.
>
> When using CGI or FASTCGI, with a hosting system where an executable
> script needs to be supplied, it is beneficial to be able to say
> something like:
>
>  #!/usr/bin/env python -m cgi2wsgi
>
>  #!/usr/bin/env python -m fcgi2wsgi
>
> where the rest of the script is just the WSGI application.
>
> I have implemented this for CGI as an example at:
>
>  https://github.com/GrahamDumpleton/cgi2wsgi
>
> I have done it for FASTCGI using flup as well before but that isn't
> available anywhere.

I should probably add though that this is not the best way it could be
done for FASTCGI. For FASTCGI you are better off making use of the
FASTCGI implementation's wrapper mechanism as an intermediary, with it
handling the loading. This is the approach that PHP under FASTCGI uses
and why it is so easy for users, namely because system admins set it up
with wrapper support. You don't see such niceties for Python, where a
system admin sets things up so that a .wsgi script file would be
understood to be a Python WSGI application with no extra magic needing
to be added to it by the user, even though that is not difficult in
principle. That is why users need to resort to a #! line and a low
level FASTCGI script in the first place.

Graham

Re: [Web-SIG] A more useful command-line wsgiref.simple_server?

2012-04-01 Thread Graham Dumpleton

On 2 April 2012 14:54, Sasha Hart  wrote:
> I would personally not +x a module file just to serve an app with wsgiref
> from the hashbang line; it's clever but I can't come up with any real
> benefit. A case where I'm serving with wsgiref already has to be pretty
> trivial and I'm not going to couple to it *from inside the module itself*
> when it is so darned easy to just run the module from several nice python
> test servers (also portable and I can use autoreload, etc.) But if this is
> desired by many others, I'd agree it's a good factor to consider.

When using CGI or FASTCGI, with a hosting system where an executable
script needs to be supplied, it is beneficial to be able to say
something like:

  #!/usr/bin/env python -m cgi2wsgi

  #!/usr/bin/env python -m fcgi2wsgi

where the rest of the script is just the WSGI application.

I have implemented this for CGI as an example at:

  https://github.com/GrahamDumpleton/cgi2wsgi

I have done it for FASTCGI using flup as well before but that isn't
available anywhere.
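
For comparison, a script can also call a CGI bridge itself. The sketch
below uses wsgiref.handlers.CGIHandler from the standard library; it is
not how cgi2wsgi is implemented, and run_under_cgi is a hypothetical
helper name.

```python
from wsgiref.handlers import CGIHandler


def application(environ, start_response):
    body = b"Hello from WSGI under CGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def run_under_cgi(app):
    # CGIHandler takes the request from os.environ and stdin and writes
    # the response to stdout, which is what a CGI host expects.
    CGIHandler().run(app)
```

A CGI script written this way would end with a call to
run_under_cgi(application). With the #! approach above that trailing
call is not needed, so the file contains only the WSGI application.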

Graham

Re: [Web-SIG] A more useful command-line wsgiref.simple_server?

2012-03-30 Thread Graham Dumpleton

On 31 March 2012 14:36, PJ Eby  wrote:
> On Fri, Mar 30, 2012 at 5:12 PM, Graham Dumpleton
>  wrote:
>>
>> Now when doing mod_wsgi, a similar method of loading each file
>>
>> separately with a __name__ based on file system path was used to
>> ensure each was distinct when same file name used in different
>> directories.
>
>
> Why give them a __name__ at all?  Aren't they scripts, rather than modules?
>  ISTM that not having a __name__ would also let things like pickles fail
> faster.  That is, code that expected a module rather than a script would
> break right away.

Because not having a __name__ attribute at all would make:

  if __name__ == '__main__':
      ...

fail straight away and people quite often had that in scripts so they
could run it directly as well with a pure WSGI server.

>> FWIW, in the past when pushing the idea of a WSGI script file being
>> the lowest common denominator, part of the reason I found I couldn't
>> get it accepted is that some people simply didn't understand how in
>> Python to load an arbitrary file by path name and construct a module
>> for it in memory, with magic __name__. They seemed to think that the
>> only way to import a code file was for it to have a .py extension and
>> for the directory to be in sys.path. So, due to ignorance of the
>> solution as to how to do it meant I got a push back from some people.
>
> Who were you trying to get acceptance from?  Web-SIG or Python-Dev?
>  Framework devs or end-users?  Is  there a PEP?

I brought it up on the WEB-SIG. It may have been bad timing amongst
all the other discussions that went around in circles at the time on
the WEB-SIG. Also mentioned it in passing to some WSGI server
developers and other people when discussing web stuff at meet ups or
otherwise.

Graham

Re: [Web-SIG] A more useful command-line wsgiref.simple_server?

2012-03-30 Thread Graham Dumpleton

On 31 March 2012 06:58, Geoffrey Spear  wrote:
> On Fri, Mar 30, 2012 at 3:20 PM, Masklinn  wrote:
>> 2. You seem to have asserted from the start that the default should be
>>   mounting modules, but I have seen no evidence or argument in favor of
>>   that so far.
>>
>>   Defaulting to scripts not only works with both local modules and
>>   arbitrary files and follow cpython's (and most tools's) own behavior,
>>   but would also allows using -mwsgiref.simple_server as a shebang
>>   line. I find this to have quite a lot of value.
>
> I may be dense, but is there actually a use case for using a WSGI
> application from a script? Presumably a script that defines a WSGI
> application would also run it.

Some history for you.

Seeing the file containing a WSGI application entry point as a file
rather than a module derives from how Apache works.

Take for example CGI under Apache, one can say for a directory context:

  AddHandler cgi-script .py

What this means is that any files with a .py extension are executed as
CGI scripts. Thus, each would have to be an executable file and have an
appropriate #! line which can resolve the Python interpreter to use.
In this case the extension used is actually irrelevant.

When mod_python came along it allowed one instead to say:

  AddHandler python-script .py

The way mod_python then originally worked was that when it resolved a
URL to a directory containing the target .py file, it would add that
directory to sys.path and import the module based on the basename of
the target file. It would then execute the entry point callable within
the loaded module.

No #! line was needed, nor did the file need to be executable. The
first wasn't needed because mod_python dictated what Python version was
used.

The problem with what mod_python did was that the AddHandler can span
multiple directories. As target files in each directory were accessed,
each directory would get added into sys.path to be able to import it.

Because these are normal file system directories and treated as
separate module directories and not part of an overall package
structure, there was nothing to stop you having a file with the same
name in each directory. It was common for example to have:

  DirectoryIndex index.py

This means that if the directory itself was the target, it would use
the index.py in the directory as means of generating the directory
index.

If more than one directory was added to sys.path containing an
index.py file, you can only have one loaded as a module, not both.

Thus you ended up with the in-memory instance of the 'index' module
that was loaded first being used rather than the second one
encountered, or depending on sys.path ordering, you could import the
'index' module from the wrong directory. Basically, things were a bit
unpredictable if you ever used the same file name more than once.
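
The collision can be reproduced with plain Python. The sketch below
only mimics the scheme as described above (add the directory to
sys.path, import by basename); it is not mod_python code, and the
helper names are hypothetical.

```python
import importlib
import os
import sys


def make_site(root, name):
    """Create root/name/index.py recording which directory it lives in."""
    directory = os.path.join(root, name)
    os.makedirs(directory)
    with open(os.path.join(directory, "index.py"), "w") as handle:
        handle.write("WHERE = %r\n" % name)
    return directory


def import_index_from(directories):
    """Put each directory on sys.path in turn and import 'index'."""
    loaded = []
    for directory in directories:
        sys.path.insert(0, directory)
        # After the first import, 'index' is cached in sys.modules, so
        # later directories get the same module object back.
        loaded.append(importlib.import_module("index"))
    return loaded
```

Calling import_index_from() on two such directories returns the same
module object twice, so the second index.py is never executed.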

There were various other things that could go wrong as well.

In later versions of mod_python the whole module importing system was
rewritten to avoid adding directories into sys.path. Instead a custom
module importer was used with special lookup rules to find modules in
directories itself.

Further, when modules were loaded, the __name__ of the module was not
just the basename of the file, but a magic string taking into account
the full file path name. By doing this, even though index.py may occur
in separate places, they would be distinct modules in memory.

The complexity of still allowing relative module imports from the same
directory to simulate things as if directory was in sys.path was
frightening though. Add to that that mod_python had a reloading
mechanism which could look not just at the immediate file, but all sub
modules imported from the directories managed by the mod_python custom
module importer and also trigger a reload when one of the used modules
was changed and not just the top level one.

Now when doing mod_wsgi, a similar method of loading each file
separately with a __name__ based on file system path was used to
ensure each was distinct when the same file name was used in different
directories.
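
Loading a file by path under a path-derived __name__ needs very little
code. The following is a sketch of the general technique only; the
naming scheme and the function name load_wsgi_script are assumptions,
not what mod_wsgi actually does internally.

```python
import hashlib
import os
import sys
import types


def load_wsgi_script(path):
    """Load a script file as a module named after its full path."""
    path = os.path.abspath(path)
    # Hypothetical naming scheme: a prefix plus a hash of the path, so
    # the same file name in two directories yields two module names.
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    name = "_wsgi_script_" + digest
    module = types.ModuleType(name)
    module.__file__ = path
    with open(path, "rb") as handle:
        code = compile(handle.read(), path, "exec")
    sys.modules[name] = module
    # __name__ is set but is not '__main__', so a guarded
    # "if __name__ == '__main__':" block in the script is skipped.
    exec(code, module.__dict__)
    return module
```

No .py extension is required and the directory never has to appear in
sys.path, which is the point that was apparently not widely understood.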

What mod_wsgi didn't do though was replicate the custom module
importer that mod_python had as that really was a nightmare.

This meant that relative module imports from the same directory would not
work. If someone really wanted that, they would need to add the
directory to sys.path themselves.

Once they did that though, because the target file as loaded by
mod_wsgi had a __name__ which didn't match the basename of the file,
if someone tried to import that module file back into something else,
you would end up with two copies in memory. The first being the magic
one mod_wsgi loaded as a file and the other loaded as a module.

To make it more obvious that they were treated a bit differently, and
to avoid people making this mistake, it was promoted to use a .wsgi
extension for the WSGI script file rather than .py. That way people
would not go inadvertently importing it a second time.

Further, because of the way that the .wsgi script file was loade

Re: [Web-SIG] A more useful command-line wsgiref.simple_server?

2012-03-29 Thread Graham Dumpleton

On 29 March 2012 21:02, Masklinn  wrote:
> Moving here as suggested by Terry Reedy as this list may be more
> interested than -ideas (note: some feedback already used to revise
> the original proposal, and a very basic patch — with no tests — is
> provided for the current CPython default branch)
>
> Currently, calling wsgiref.simple_server simply mounts the (bundled)
> demo app.
>
> I think that's a bit of a lost opportunity: the community seems to have
> mostly standardized on a wsgi script providing an application callable
> in its global namespace (though details may differ, mod_wsgi does not
> care for the script's name and mandates an `application` callable while

Apache/mod_wsgi only defaults to 'application', it is configurable.

As for the rest of the proposal, I tried to push the same idea a few
years back with intent that all WSGI servers would provide a similar
mechanism to that you describe as a lowest common denominator, but I
didn't get anywhere at the time.

Although people now perhaps appreciate more that a single approach
would be better, it has ballooned now into a much bigger goal with the
discussions on a common deployment mechanism. For example, a WARP file
(Python WAR file equivalent) being the latest idea. This was touched
on again at Web Summit at PyCon this year.

Graham

> e.g. gunicorn wants a Python module and the callable name must be
> configured), and it would be nice if simple_server could take such a
> script and mount the application provided:
>
> * This would allow testing that the script has no error without having
>  to go through mounting it in e.g. mod_wsgi
> * It would make trivial/test applications (e.g. dynamic responders to
>  local JS) simpler to bootstrap as there would be no need for the
>  half-dozen lines of wsgiref.simple_server bootstrapping and "hard"
>  dependency on wsgiref,
>
>   from wsgiref.simple_server import make_server
>
>   def application(environ, start_response):
>       'code'
>
>   if __name__ == '__main__':
>       httpd = make_server('', 8000, application)
>       httpd.serve_forever()
>
>  could become:
>
>   def application(environ, start_response):
>       'code'
>
> Since wsgiref already supports `python -mwsgiref.simple_server`, the
> changes would be pretty simple:
>
> * an optional positional argument of the form `script[:app]`, the script
>  is exec'd, the application (called "application" by default) is
>  extracted and then mounted in simple_server. If no script is specified,
>  just mount `demo_app` as before
> * Add -H/--host -p/--port options to, respectively, the hostname and the
>  port to bind the server to.
> * The current -msimple_server uses `handle_request` and only replies once,
>  to increase the usability of the CLI tool use `serve_forever` *when and
>  only when the mounted application is not demo_app*. It also avoids
>  opening a hardcoded example URL on launch.
>
> This way the current sanity test/"PHPInfo" demo app works as it did before,
> but it becomes possible to very easily serve a WSGI script with almost no
> overhead in the script itself.
>
> Attachment: patch performing the above-specified alterations, using
> argparse for arguments parsing and generation of help.
>
>
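
The `script[:app]` handling in the quoted proposal might look roughly
like this. It is a sketch under stated assumptions and not the attached
patch: the function names are hypothetical, and runpy.run_path is used
here in place of whatever exec mechanism the patch uses.

```python
import runpy
from wsgiref.simple_server import demo_app, make_server


def parse_target(target, default_app="application"):
    """Split 'script[:app]' into (script_path, callable_name)."""
    # Splitting on the first ':' is naive; it would mis-handle Windows
    # drive letters such as C:\app.wsgi.
    script, _, name = target.partition(":")
    return script, name or default_app


def load_application(target=None):
    if target is None:
        return demo_app  # no script given: mount the bundled demo app
    script, name = parse_target(target)
    # run_path executes the file and returns its globals. The run_name
    # is not '__main__', so a guarded main block in the script is skipped.
    namespace = runpy.run_path(script, run_name="wsgi_script")
    return namespace[name]


def serve(target=None, host="", port=8000):
    httpd = make_server(host, port, load_application(target))
    httpd.serve_forever()
```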

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-21 Thread Graham Dumpleton

If you want to be able to control a thread like that from an atexit
callback, you need to create the thread as daemonised, i.e. call
setDaemon(True) on the thread.

By default a thread will actually inherit the daemon flag from the
parent. For a command line Python where the thread is created from the
main thread, it will not be daemonised, which is why the thread will be
waited upon at shutdown prior to atexit callbacks being called.

If you ran the same code in mod_wsgi, my memory is that the thread
will actually inherit as being daemonised because the request handler
threads in mod_wsgi, from which the import is triggered, are notionally
daemonised.

Thus the code should work in mod_wsgi. Even so, to be portable, if you
want to manipulate the thread from atexit, make it daemonised.

Example of background threads in mod_wsgi at:

http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode#Monitoring_For_Code_Changes

shows use of setDaemon().

Graham

On 22 February 2012 00:46, Tarek Ziadé  wrote:
>
>
> On Tue, Feb 21, 2012 at 1:43 PM, Antoine Pitrou  wrote:
>>
>> Tarek Ziadé  writes:
>> >
>> >
>> > On Tue, Feb 21, 2012 at 10:24 AM, Graham Dumpleton
>>  wrote:
>> > ...
>> > > But I don't think you can guarantee that everything is still up in
>> > > memory by
>> > > the time atexit gets called,
>> > > so you can't really call cleanup code there.
>> > The only thing which is done prior to atexit callbacks being called is
>> > waiting on threads which weren't marked as daemonised.
>> >
>> >
>> > which can lead to completely lock the shutdown if a lib or the program
>> > has a
>> > thread with a loop that waits for a condition.which it is not the case
>> > with
>> > signals, since you get a chance to properly stop everything beforehand.
>>
>> That's a buggy lib or program. This has nothing to do with WSGI really.
>
>
> No, that has to do with : please let me clean my program before you try to
> kill it because I can't use signals :)
>
>
>>
>> The
>> snippet Graham showed is run at any interpreter shutdown, even when you
>> simply
>> run "python" in your shell.
>
>
> here's a very simple demo: http://tarek.pastebin.mozilla.org/1489505
>
> Run it with plain python, and try to ctrl-C it. You won't reach atexit and
> will get locked.
>
> (here: python 2.7 / mac os)
>
> If you use signals instead of atexit, you'll have it working.
>
> And this pattern (a thread in the background) is pretty common -- unless I
> am missing something here
>
>
> Cheers
> Tarek
>
>>
>>
>> Regards
>>
>> Antoine.
>>
>>
>
>
>
>
> --
> Tarek Ziadé | http://ziade.org
>

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-21 Thread Graham Dumpleton

On 21 February 2012 21:07, Simon Sapin  wrote:
> Le 21/02/2012 10:31, Graham Dumpleton a écrit :
>
>> You do realise you are just reinventing context managers?
>>
>> With this 'application' do requests.
>
> Indeed. I didn’t want to go too far from the initial "shutdown function"
> proposal, but actual context managers would be better.

FWIW, I have been playing with context managers in other ways to solve
per request resource cleanup issues as well. I will cover some of
what I have been doing with that in my State of WSGI 2 talk at PyCon
web summit.

I sort of wish this whole discussion could perhaps wait until the web
summit where after my talk I can perhaps discuss with interested
parties all the stuff I have been playing with around improving WSGI
rather than taking shots at little bits now when there is a lot more
to consider than just this.

Graham

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-21 Thread Graham Dumpleton

On 21 February 2012 21:41, Tarek Ziadé  wrote:
>
>
> On Tue, Feb 21, 2012 at 10:24 AM, Graham Dumpleton
>  wrote:
>>
>> ...
>>
>> > But I don't think you can guarantee that everything is still up in
>> > memory by
>> > the time atexit gets called,
>> > so you can't really call cleanup code there.
>>
>> The only thing which is done prior to atexit callbacks being called is
>> waiting on threads which weren't marked as daemonised.
>
> which can lead to completely lock the shutdown if a lib or the program has a
> thread with a loop that waits for a condition.

In mod_wsgi at least there are fail safes such that background C
threads will force kill the process if such a lockup occurs on
shutdown.

> which it is not the case with signals, since you get a chance to properly
> stop everything beforehand.

Yes and no. For a signal handler to even be able to be triggered,
there must be Python code executing in the main thread that originally
created the main interpreter.

In an embedded system such as mod_wsgi, the main thread is never used
to handle requests and actually runs in C code blocked waiting for an
internal notification that process is being shutdown.

>> what do you mean by bypassing its destruction ?
>
>> Non catchable signal from within process or from a distinct monitoring
>> process.
>> One of the things I pointed out is being missed.
>> That is, a WSGI adapter may be running on top of another layer of
>> abstraction, such as FASTCGI for example, where the lower layer isn't
>> going to have any callback mechanism of its own to even notify the
>> WSGI layer to trigger registered cleanup callbacks.
>> This is why the only mechanism one can universally rely on is the
>> Python interpreters own atexit mechanism.
>
> I see.. but what I don't understand is the following: when the whole stack
> is shut down, the python process is being killed by *someone*.
>
> And that someone, as far as I understand, is also able to send requests to
> the WSGI application.
>
> So what makes it impossible to send a shutdown signal prior to killing the
> process ?

It is not impossible and in mod_wsgi at least a signal is used to
initiate shutdown, this coming either from itself in some cases, or
from the Apache parent process in others. The signal handler then uses
a socketpair pipe to wake up the blocked main thread to begin shutdown
steps. Either way it is handled at C code level because one can't rely
on Python level signal handlers to actually run.

To further complicate things, in a process with multiple sub
interpreters where would the Python signal handler even run? There is
no main thread running waiting to exit. It also can't just cause the
main Python interpreter to be exited. Simply exiting a main thread
even if it did exist wouldn't allow you to cleanup sub interpreters.

In short, embedded systems are going to be quite different to what you
are used to with pure WSGI servers. It is because it is doing all this
to ensure that reliable shutdown can occur that mod_wsgi ignores all
Python signal handler registrations by default.

That all said, technically mod_wsgi could, on reception of its signal
and if there was a registry of known WSGI applications, tell them the
process is being shut down. The presence of sub interpreters makes
that a lot of fun, but it would be doable.

Right now without such a registry of applications with enter/exit
methods as being discussed in this thread, the only way in mod_wsgi is
to rely on atexit.

Graham

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-21 Thread Graham Dumpleton

On 21 February 2012 20:26, Simon Sapin  wrote:
> Le 21/02/2012 09:23, Tarek Ziadé a écrit :
>
>>    Instead of having to provide two or three objects separately to a
>>    server, how about making the callbacks attributes of the application
>>    callable?
>>
>>
>> can you show us an example ?
>
>
> Proposal:
>
> Function-based:
>
>    def startup():
>        return open_resource(something)
>
>    def shutdown(resource):
>        resource.close()
>
>    def application(environ, start_response):
>        # ...
>        return response_body
>
>    application.startup = startup
>    application.shutdown = shutdown
>
> Class-based:
>
>    class App(object):
>        def startup(self):
>            return open_resource(something)
>
>        def shutdown(self, resource):
>            resource.close()
>
>        def __call__(self, environ, start_response):
>            # ...
>            return response_body
>
>    application = App()
>
> The return value of startup() can be any python object and is opaque to the
> server. It is passed as-is to shutdown()
>
> startup() could take more parameters. Maybe the application (though can we
> already have it as self for class-based or in a closure for function-based)

You do realise you are just reinventing context managers?

With this 'application' do requests.

But then it was sort of suggested that this was a bit too radical an
idea when I have mentioned viewing it that way before. :-(
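
Viewed as a context manager, the startup/shutdown pair might be
sketched like this. Resource, application_context and serve are
hypothetical names, and no existing WSGI server is claimed to work
this way.

```python
from contextlib import contextmanager


class Resource:
    """Stand-in for whatever startup() would open."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def application_context():
    resource = Resource()  # the 'startup' half

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"closed" if resource.closed else b"open"]

    try:
        yield application
    finally:
        resource.close()  # the 'shutdown' half


def serve(requests):
    """Toy server loop: with this application, do requests."""
    responses = []
    with application_context() as application:
        for environ in requests:
            responses.append(application(environ, lambda status, headers: None))
    return responses
```

The resource is created once on entry and released once on exit, with
no separate callbacks for the server to keep track of.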

Graham

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-21 Thread Graham Dumpleton

On 21 February 2012 18:53, Tarek Ziadé  wrote:
>
>
> On Tue, Feb 21, 2012 at 2:39 AM, Graham Dumpleton
>  wrote:
>>
>> ...
>>
>> Overall the best chance of being able to do anything is relying on atexit.
>>
>> You are though at the mercy of the WSGI hosting mechanism shutting
>> down the process and so the interpreter, in an orderly manner such
>> that atexit callbacks get called.
>>
>> In Apache/mod_wsgi you get this guarantee, even in sub interpreters
>> where atexit callbacks wouldn't normally be called when they are
>> destroyed.
>>
>> For uWSGI, atexit callbacks will not be called at the moment, but
>> Robert is making changes to it so you get a guarantee there as well.
>> It is possible he is only doing this though for case where main
>> interpreter is being used, as doing it for sub interpreters is a bit
>> fiddly.
>>
>
> But I don't think you can guarantee that everything is still up in memory by
> the time atexit gets called,
> so you can't really call cleanup code there.

The only thing which is done prior to atexit callbacks being called is
waiting on threads which weren't marked as daemonised.

void
Py_Finalize(void)
{
    PyInterpreterState *interp;
    PyThreadState *tstate;

    if (!initialized)
        return;

    wait_for_thread_shutdown();

    /* The interpreter is still entirely intact at this point, and the
     * exit funcs may be relying on that.  In particular, if some thread
     * or exit func is still waiting to do an import, the import machinery
     * expects Py_IsInitialized() to return true.  So don't say the
     * interpreter is uninitialized until after the exit funcs have run.
     * Note that Threading.py uses an exit func to do a join on all the
     * threads created thru it, so this also protects pending imports in
     * the threads created via Threading.
     */
    call_sys_exitfunc();

    ...

>> Any pure Python WSGI servers shouldn't have issues so long as they
>> aren't force exiting the whole process and bypassing normal
>> interpreter destruction.
>
>
> what do you mean by bypassing its destruction ?

Non catchable signal from within process or from a distinct monitoring process.

One of the things I pointed out is being missed.

That is, a WSGI adapter may be running on top of another layer of
abstraction, such as FASTCGI for example, where the lower layer isn't
going to have any callback mechanism of its own to even notify the
WSGI layer to trigger registered cleanup callbacks.

This is why the only mechanism one can universally rely on is the
Python interpreters own atexit mechanism.

Graham

Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-20 Thread Graham Dumpleton

On 21 February 2012 12:03, Simon Sapin  wrote:
> Le 21/02/2012 01:18, Chris McDonough a écrit :
>
>> On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:
>>>
>>> >  The standard way to do this would be to define an "optional server
>>> >  extension" API supplied in the environ; for example, a
>>> >  'x-wsgiorg.register_shutdown' function.
>>
>> Unlikely, AFAICT, as shutdown may happen when no request is active.
>> Even if this somehow happened to not be the case, asking the application
>> to put it in the environ is not useful, as the environ can't really be
>> relied on to retain values "up" the call stack.
>
>
> Hi,
>
> I like environ['x-wsgiorg.register_shutdown']. It would work without changes
> to WSGI itself.
>
> I think that the idea is not to put your shutdown function in the
> environment and hope it stays there "up" the stack, but to register it by
> calling register_shutdown:
>
> @environ.get('x-wsgiorg.register_shutdown', lambda f: f)
> def do_cleanup():
>    pass
>
> Also, a shutdown function would be used to clean up something that was set
> up in a request. So if the server shuts down without having ever served a
> request, there probably is nothing to clean up.

Using environ is not going to work as it is supplied on a per request basis.

You would typically want an application scope cleanup handler to only
be registered once.

In this scheme you are relying on it being registered from within a
request scope.

To ensure that it is only registered once, the caller would need to
use a flag protected by a thread mutex to know whether it should call
a second time, which is cumbersome.
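
For illustration, the guard would have to look something like the
following. The 'x-wsgiorg.register_shutdown' key is the extension
proposed earlier in this thread, not something servers are known to
provide; the other names are hypothetical.

```python
import threading

_registered = False
_register_lock = threading.Lock()


def do_cleanup():
    pass  # application scope cleanup would go here


def application(environ, start_response):
    global _registered
    register = environ.get("x-wsgiorg.register_shutdown")
    if register is not None:
        # Every request pays for the lock just to find out that the
        # callback was registered long ago.
        with _register_lock:
            if not _registered:
                register(do_cleanup)
                _registered = True
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]
```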

If you don't do that you could end up registering a separate callback
for every single request that occurs and memory usage alone would blow
out just from recording them all.

Alternatively, you would have to require the underlying WSGI
server/adapter to weed out duplicates, but even if you do that, you
still waste the time of the per request scope registering it all the
time.
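
For the server side alternative, a minimal sketch of what weeding out
duplicates might look like. The ShutdownRegistry class and its method
names are hypothetical and exist only for this example.

```python
import threading

class ShutdownRegistry(object):
    """Hypothetical server side store for shutdown callbacks."""

    def __init__(self):
        self._callbacks = []
        self._lock = threading.Lock()

    def register(self, func):
        # In this scheme the application calls this on every request.
        with self._lock:
            if func not in self._callbacks:
                self._callbacks.append(func)
        return func  # returning func lets it be used as a decorator

    def run_all(self):
        # Something in the server still has to decide when to call this.
        with self._lock:
            callbacks, self._callbacks = self._callbacks, []
        for func in callbacks:
            func()
```

Note that this only removes duplicates when the same function object is
passed each time. A function defined inside the request handler is a new
object on every request, so it would not be recognised as a duplicate.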

Even if you have a registration mechanism, especially with a WSGI
adapter riding on top of something else, how is the WSGI adapter going
to get notified to call them?

All you have therefore done is shift the problem of how it is
triggered somewhere else.

Overall the best chance of being able to do anything is relying on atexit.

You are though at the mercy of the WSGI hosting mechanism shutting
down the process, and so the interpreter, in an orderly manner such
that atexit callbacks get called.

In Apache/mod_wsgi you get this guarantee, even in sub interpreters
where atexit callbacks wouldn't normally be called when they are
destroyed.

For uWSGI, atexit callbacks will not be called at the moment, but
Robert is making changes to it so you get a guarantee there as well.
It is possible he is only doing this though for the case where the main
interpreter is being used, as doing it for sub interpreters is a bit
fiddly.

Any pure Python WSGI servers shouldn't have issues so long as they
aren't force exiting the whole process and bypassing normal
interpreter destruction.

Graham
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-19 Thread Graham Dumpleton

Christian. The DNS entry is actually wrong. Got this from Eric:

  GrahamDumpleton: wanted to let you know they changed the DNS for wsgi.org,
  but they pointed it at wsgi.readthedocs.org, so I made this project
  in its place
  so the CNAME would resolve: http://readthedocs.org/projects/wsgi

  that isn't yours: http://readthedocs.org/projects/wsgiorg/

  I can either rename the slug on yours, or you can get them to change the DNS

Don't change the DNS though as I reckon it may be better that we claim:

  wsgi.readthedocs.org

Will stop someone else claiming generic WSGI for some project.

Eric, yes, please change the slug.

Thanks.

Graham

On 20 September 2011 06:42, Graham Dumpleton  wrote:
> Thanks.
>
> One thing we should do now is create a page with instructions on how
> you can contribute changes back via github project.
>
> Graham
>
> On 19 September 2011 23:30, Christian Theune  wrote:
>> Hi,
>>
>> On 09/19/2011 11:33 AM, Christian Theune wrote:
>>>
>>> OK, I updated our database. The nameservers should start propagating
>>> this in an hour or so.
>>
>> After some messing around with CNAMES and such I added a redirect from
>> wsgi.org -> www.wsgi.org and a CNAME of www.wsgi.org to readthedocs.
>>
>> I also added a placeholder page while the DNS updates are in progress
>> including a link to the direct readthedocs.org address.
>>
>> Hope this helps,
>> Christian
>>
>>
>> --
>> Christian Theune · c...@gocept.com
>> gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
>> http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
>> Zope and Plone consulting, development, hosting, operations
>>
>

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-19 Thread Graham Dumpleton

Thanks.

One thing we should do now is create a page with instructions on how
you can contribute changes back via github project.

Graham

On 19 September 2011 23:30, Christian Theune  wrote:
> Hi,
>
> On 09/19/2011 11:33 AM, Christian Theune wrote:
>>
>> OK, I updated our database. The nameservers should start propagating
>> this in an hour or so.
>
> After some messing around with CNAMES and such I added a redirect from
> wsgi.org -> www.wsgi.org and a CNAME of www.wsgi.org to readthedocs.
>
> I also added a placeholder page while the DNS updates are in progress
> including a link to the direct readthedocs.org address.
>
> Hope this helps,
> Christian
>
>
> --
> Christian Theune · c...@gocept.com
> gocept gmbh & co. kg · forsterstraße 29 · 06112 halle (saale) · germany
> http://gocept.com · tel +49 345 1229889 0 · fax +49 345 1229889 1
> Zope and Plone consulting, development, hosting, operations
>

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-16 Thread Graham Dumpleton

On 17 September 2011 03:05, Masklinn  wrote:
> On 2011-09-16, at 12:20 , Graham Dumpleton wrote:
>> On 16 September 2011 19:46, Masklinn  wrote:
>>> On 2011-09-10, at 22:18 , Graham Dumpleton wrote:
>>>> We haven't actually done a push up to ReadTheDocs yet.
>>>>
>>>> Sorry, just been too busy with work and trip to US. I expect things to
>>>> calm down once get home this week.
>>> Apart from your chronic lack of time, is there anything left blocking
>>> the push on RTD I can help with?
>>
>> http://wsgiorg.rtfd.org/
>>
> Cool. So Christian just needs to update the DNS records now?

Don't see any reason why not. Is looking clean and slick now.

I have added you as a collaborator on the wsgiorg project on github. I
presume that means you can now push changes back in without waiting
on me. I have never had to do that before. If anyone else wants to
step up as a maintainer of the site, I can add them as well.

I'll also try and setup the hooks so updates automatically get applied
to ReadTheDocs.

Graham

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-16 Thread Graham Dumpleton

On 16 September 2011 19:46, Masklinn  wrote:
> On 2011-09-10, at 22:18 , Graham Dumpleton wrote:
>> We haven't actually done a push up to ReadTheDocs yet.
>>
>> Sorry, just been too busy with work and trip to US. I expect things to
>> calm down once get home this week.
> Apart from your chronic lack of time, is there anything left blocking
> the push on RTD I can help with?

http://wsgiorg.rtfd.org/

Someone needs to work on the template to replace generic Sphinx type
header/footers with more appropriate branding. Having it say version
X.Y of documentation, trying to attribute copyright to a specific
person etc doesn't make sense.

Graham

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-10 Thread Graham Dumpleton

Just change:

:Title: Waiting for File Descriptor Events
:Author: Christopher Stawarz 
:Discussions-To: Python Web-SIG 
:Status: Proposed
:Created: 11-May-2008

to:

Waiting for File Descriptor Events
==================================

:Author: Christopher Stawarz 
:Discussions-To: Python Web-SIG 
:Status: Proposed
:Created: 11-May-2008

and it should be fine.

The table of information is only special if at the very start of the
document. So, move the title out of the table and put it first as a
normal title, underlined in the usual way so it is known to be the title.

Graham



On 10 September 2011 06:36, Masklinn  wrote:
> On 2011-09-10, at 11:45 , Stephan Diehl wrote:
>>
>> How far are we in getting things ready at the ReadTheDocs end? I'd say, the 
>> earlier we can switch the DNS entry, the better.
> Everything was ported (as of August 28 anyway)[0], except for the 
> specifications: from what I can tell, Sphinx does not support PEP-RST[1] (it 
> does not understand the PEP header directives, so they are not displayed in 
> the output), I did not get any answer when I asked about it on the pocoo IRC 
> channel and google searches have failed to yield any information.
>
> Way forward I'd see would be adding a target to Sphinx's makefile to use 
> Docutils directly to compile the PEPs, and linking to them as if they were 
> static HTML documents, but I do not know if RTD supports that. You'd have to 
> ask someone more knowledgeable (Graham has already ported the mod_wsgi docs 
> so he might know). From RTD's documentation, it seems accounts can be 
> whitelisted for code execution[2], Graham probably wouldn't have any issue 
> getting flagged for a pair of building commands (and barring that, the 
> compiled PEPs could always be committed to the repo).
>
> In the long run, having a pep sphinx extension might be nice.
>
> [0] https://github.com/GrahamDumpleton/wsgiorg
> [1] https://github.com/GrahamDumpleton/wsgiorg/issues/11
> [2] http://read-the-docs.readthedocs.org/en/latest/faq.html

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-10 Thread Graham Dumpleton

We haven't actually done a push up to ReadTheDocs yet.

Sorry, just been too busy with work and trip to US. I expect things to
calm down once get home this week.

FWIW, have no issues with adding other people to have direct commit
rights to wsgiorg project so I am not a bottleneck.

Graham

On 10 September 2011 12:49, Stephan Diehl  wrote:
> On 10.09.2011 21:29, Masklinn wrote:
> [...]
>>>
>>> We can't just point wsgi.org to ReadTheDocs.org, right?
>>
>> That's exactly it actually:
>> http://read-the-docs.readthedocs.org/en/latest/alternate_domains.html
>>
>
> Ahh, excellent. Didn't see that.
> I guess we should change the CNAME as soon as possible.
>
> Cheers, Stephan
>
>
>
>

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-09-10 Thread Graham Dumpleton

I'll see if I can find someone at the DjangoCon sprints who might be
able to give suggestions of what to do. The sprints are at the offices
of Eric from ReadTheDocs, so one would think I might be able to get an
answer.

On 10 September 2011 06:36, Masklinn  wrote:
> On 2011-09-10, at 11:45 , Stephan Diehl wrote:
>>
>> How far are we in getting things ready at the ReadTheDocs end? I'd say, the 
>> earlier we can switch the DNS entry, the better.
> Everything was ported (as of August 28 anyway)[0], except for the 
> specifications: from what I can tell, Sphinx does not support PEP-RST[1] (it 
> does not understand the PEP header directives, so they are not displayed in 
> the output), I did not get any answer when I asked about it on the pocoo IRC 
> channel and google searches have failed to yield any information.
>
> Way forward I'd see would be adding a target to Sphinx's makefile to use 
> Docutils directly to compile the PEPs, and linking to them as if they were 
> static HTML documents, but I do not know if RTD supports that. You'd have to 
> ask someone more knowledgeable (Graham has already ported the mod_wsgi docs 
> so he might know). From RTD's documentation, it seems accounts can be 
> whitelisted for code execution[2], Graham probably wouldn't have any issue 
> getting flagged for a pair of building commands (and barring that, the 
> compiled PEPs could always be committed to the repo).
>
> In the long run, having a pep sphinx extension might be nice.
>
> [0] https://github.com/GrahamDumpleton/wsgiorg
> [1] https://github.com/GrahamDumpleton/wsgiorg/issues/11
> [2] http://read-the-docs.readthedocs.org/en/latest/faq.html

Re: [Web-SIG] Move www.wsgi.org to Read The Docs

2011-08-28 Thread Graham Dumpleton

Who then is the specific person who could switch the DNS for
www.wsgi.org to direct towards ReadTheDocs when we are ready,
Christian? This presumes of course people are happy about this being
done.

I have been a bit busy myself, but masklinn has been doing a great job
at moving the pages into Sphinx format.

We are hitting up to a dozen spam messages a day on the existing wiki
at the moment. :-(

Graham

On 19 August 2011 17:51, Stephan Diehl  wrote:
> Sorry about the confusion.
>
> wsgi.org was owned by me for the last couple of years. Recently, we've
> moved ownership over to DZUG e.V. (German Zope User Group) which is the
> only legal official python related entity in Germany at the moment.
> Obviously, I should have announced that here on this list, but didn't.
> The hosting was moved to GoCept which is a company owned by Christian
> Theune, who's also very active in the Python community.
>
> I guess that moving the content over to github is an excellent idea.
>
> Cheers, Stephan

Re: [Web-SIG] Move www.wsgi.org to Read The Docs.

2011-08-18 Thread Graham Dumpleton

Thanks to additional help of masklinn and SvenBerkvensMatthijsse, the
spam has all been cleaned up now.

The number of pages isn't that great so manual conversion is probably
practical and no need to look at actual tools to convert to ReST.

Since I reckon this just has to be done, I am not waiting to see who owns
up to controlling the site and have created a github repository at:

  https://github.com/GrahamDumpleton/wsgiorg

I'll start converting over some pages when get a chance over the
weekend when at PyCon AU.

If there is anyone who is a Sphinx/Read The Docs expert and wants to
set up some better styling for the site then go for it.

We can run with this for a bit and see how it goes. If can't find
anyone who controls www.wsgi.org domain or they don't want to change
how it is managed, then I guess the repo I have created will just get
thrown away, but at least worth a try. The site really needs some
love.

Graham

On 19 August 2011 07:23, Masklinn  wrote:
> On 2011-08-18, at 23:14 , Graham Dumpleton wrote:
>> Who owns and manages www.wsgi.org wiki?
>>
>> The amount of spam the wiki gets now is becoming ridiculous.
>>
>> If we care about the wiki, it is time to take the content in it and
>> dump it in github as a project which can then be loaded up to Read The
>> Docs, with www.wsgi.org directing to that.
> Would require converting from moinmoin to rst would it not?
>
>> In the mean time, can anyone else help clean up the spam. I am usually
>> the only one who does it, but this time there is too much and becomes
>> a waste of my time. I only have so many phone meetings where I can
>> secretly be cleaning up the spam at the same time. So, many hands make
>> light work. :-)
> While not involved in Web-SIG or any of that, I'd be glad to help on
> cleaning up the spam on wsgi.org.
>
>

[Web-SIG] Move www.wsgi.org to Read The Docs.

2011-08-18 Thread Graham Dumpleton

Who owns and manages www.wsgi.org wiki?

The amount of spam the wiki gets now is becoming ridiculous.

If we care about the wiki, it is time to take the content in it and
dump it in github as a project which can then be loaded up to Read The
Docs, with www.wsgi.org directing to that.

In the mean time, can anyone else help clean up the spam. I am usually
the only one who does it, but this time there is too much and becomes
a waste of my time. I only have so many phone meetings where I can
secretly be cleaning up the spam at the same time. So, many hands make
light work. :-)

Overall I reckon moving to github and Read The Docs may also encourage
greater participation as far as putting some useful content in it.
Personally I find wikis a pain for that sort of content and so can't
be bothered to work on the actual content. If it was on github and
Read The Docs I am more likely myself to help build out the content
with actual decent useful content, moving some of the stuff I have
blogged about or put elsewhere there instead.

Graham

Re: [Web-SIG] A Python Web Application Package and Format

2011-04-14 Thread Graham Dumpleton

On 14 April 2011 18:22, Alice Bevan–McGregor  wrote:
> Howdy!
>
> I suspect you're thinking a little too low-level.

Exactly, I am trying to walk before running. Things always fall down
here because people try and take too large a leap rather than an
incremental approach, solving one small problem at a time.

Thus please don't think that because I am replying to your message
that I am specifically commenting about your plans. See this as a side
comment and don't try and evaluate it only in the context of your
ideas.

> On 2011-04-14 00:53:09 -0700, Graham Dumpleton said:
>
>> On 14 April 2011 16:57, Alice Bevan–McGregor  wrote:
>>>>
>>>> 3. Define how to get the WSGI app.  This is WSGI specific, but (1) is
>>>> *not* WSGI specific (it's only Python specific, and would apply well to
>>>> other platforms)
>>>
>>> I could imagine there would be multiple "application types":
>>>
>>> :: WSGI application.  Define a package dot-notation entry point to a WSGI
>>> application factory.
>>
>> Why can't it be a path to a WSGI script file?
>
> No reason it couldn't be.
>
> app.type = wsgi
> app.target = /myapp.wsgi:application
>
> (Paths relative to the folder the application is installed into, and dots
> after a slash are filename parts, not module separators.)
>
> But then, how do you configure it?  Using a factory (which is passed the
> from-appserver configuration) makes a lot of sense.
>
>> This actually works more universally as it works for servers which map
>> URLs to file based
>> resources as well.
>
> First, .wsgi files (after a few quick Google searches) are only used by
> mod_wsgi.  I wouldn't call that "universal", unless you can point out the
> other major web servers that support that format.

The WSGI module for nginx used them, as does uWSGI, and either
Phusion Passenger or the new Mongrel WSGI support relies on a script file.

You also have CGI, FASTCGI, SCGI and AJP also using script files.

Don't get hung up on the extension of .wsgi, it is the concept of a
script file which is stored in the file system in an arbitrary
location to which a URL maps.

> You'll have to describe the "map URLs to file based resources" issue, since
> every web server I've ever encountered (Apache, Nginx, Lighttpd, etc.) works
> that way.

Which supports what I am saying, but you for some reason decided to
focus on '.wsgi' as an extension which wasn't the point.

> Only if someone is willing to get really hokey with the system
> described thus far would any application-scope web servers be running.

Forget for a moment trying to tie this to your larger designs and see
it as more of a basic underlying concept. Ie., the baby step before
you try and run.

>> Also allows alternate extensions than .py and also allows basename of file
>> name to be arbitrarily named, both of which help with those same servers
>> which map URLs to file based resources.
>
> Again, you'll have to elaborate or at least point to some existing
> documentation on this.
>
> I've never encountered a problem with that, nor do any of my scripts end in
> .py.

Lack of an extension is fine if you have configured Apache with a
dedicated cgi-bin or fastcgi-bin directory where an extension is
irrelevant because you have:

  SetHandler cgi-script

But many Apache server configurations use:

  AddHandler cgi-script .py

Ie., handler dispatch is based off extension, the .py extension quite
often being associated with CGI script execution.

You often see:

  AddHandler fcgid-script .fcgid

Which says certain resource is to be started up as FASTCGI process.

For both these it expects those scripts to be self contained programs
which fire up the mechanics of interfacing with CGI or FASTCGI
protocols.

This means that you usually have to stick that boilerplate at the end
of the script.
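
For the CGI case, that boilerplate could look something like the sketch
below, using the CGI adapter from the standard library wsgiref package.
The application and the run_as_cgi() name are just for illustration.

```python
from wsgiref.handlers import CGIHandler

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello\n']

def run_as_cgi():
    # wsgiref's CGI adapter reads os.environ and stdin, writes to stdout.
    CGIHandler().run(application)

# A CGI script would end by calling run_as_cgi(). A FASTCGI script needs
# different boilerplate again, from a third party FASTCGI adapter.
```

The point is that the script has to know which protocol it is being
hosted under, so the same file cannot be moved between hosting
mechanisms without editing it.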

This is where though FASTCGI deployment usually sucks bad. This is
because it is put on the user to get the boilerplate and remainder of
WSGI script perfect from the outset. If you don't, because FASTCGI
technically doesn't allow for stdout/stderr at point of startup, if
there is an error on import it is lost and the user has no idea. So many
times you see people whinging about setting up stuff on the likes of
DreamHost because of FASTCGI being a pain like this.

In the PHP world they don't have to deal with this boilerplate
nonsense. Instead there is a PHP launcher script associated with
FASTCGI module. So you have:

  AddHandler fcgid-script .php

but also a mapping in FASTCGI module configuration that says rather
than execute .php script it runs the launcher script instead. That way
the launcher script can get everything setup pro

Re: [Web-SIG] A Python Web Application Package and Format

2011-04-14 Thread Graham Dumpleton

On 14 April 2011 16:57, Alice Bevan–McGregor  wrote:
>> 3. Define how to get the WSGI app.  This is WSGI specific, but (1) is
>> *not* WSGI specific (it's only Python specific, and would apply well to
>> other platforms)
>
> I could imagine there would be multiple "application types":
>
> :: WSGI application.  Define a package dot-notation entry point to a WSGI
> application factory.

Why can't it be a path to a WSGI script file? This actually works more
universally as it works for servers which map URLs to file based
resources as well. Also allows alternate extensions than .py and also
allows basename of file name to be arbitrarily named, both of which
help with those same servers which map URLs to file based resources. It
also allows same name WSGI script file to exist in multiple locations
managed by same server without having to create an overarching package
structure with __init__.py files everywhere.

For WSGI servers which currently require a dotted path, eg gunicorn:

  gunicorn myapp

Then it changes to also allow:

  gunicorn --script myapp.wsgi

The server just has to construct a new Python module with a __name__
which relates to the absolute file system path and exec code within
that context to create the module itself. Nothing too difficult.

Because the WSGI script file is identified by explicit filesystem
path, you don't have to worry about what current working directory is
or otherwise set sys.path to allow it to be imported initially. The
WSGI script file then can itself even be responsible for further setup
of sys.path as appropriate and so be more self contained and not
dependent on an external launch system.
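
A minimal sketch of how a server might do that loading. This is not how
any particular server implements it; the load_wsgi_script() name and the
module naming scheme are assumptions made for the example.

```python
import os
import types

def load_wsgi_script(path, callable_name='application'):
    """Load a WSGI script file by filesystem path, bypassing sys.path."""
    path = os.path.abspath(path)
    # Derive a module name from the absolute path so that same named
    # scripts in different directories do not clash.
    module = types.ModuleType('_wsgi_script_%s' % path.replace(os.sep, '_'))
    module.__file__ = path
    with open(path, 'rb') as f:
        code = compile(f.read(), path, 'exec')
    # Execute the script in the context of the new module.
    exec(code, module.__dict__)
    return getattr(module, callable_name)
```

Usage would be along the lines of load_wsgi_script('/some/path/myapp.wsgi'),
with the extension and basename being whatever the deployment wants.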

I have also always seen it as a PITA that for various of the WSGI
servers you always had to do:

  python myapp.py

and in the end of myapp.py add boilerplate like:

  from wsgiref.simple_server import make_server

  # 'application' is the WSGI callable defined earlier in myapp.py
  httpd = make_server('', 8000, application)
  print("Serving on port 8000...")
  httpd.serve_forever()

Use a different server which required such boilerplate and you had to change it.

Even where WSGI servers allowed you to specify a Python module as a
command line argument, the options all differed and you also needed to
know where the WSGI server was installed to run it.

Using a WSGI script file as the lowest common denominator, it would
also be nice to be able to do something like:

  python -m gunicorn.server myapp.wsgi
  python -m wsgiref.server myapp.wsgi

Ie., use the '-m' option for Python command line to have the installed
module act as the processor for the WSGI script file, thereby avoiding
the need to modify the script. This lowest common denominator option
could handle a few common options which all servers would need to
accept such as listener host, port and perhaps even concepts of
processes/threads.
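
To make the idea concrete, here is a sketch of what such a runner could
look like on top of wsgiref. Neither 'gunicorn.server' nor
'wsgiref.server' exists as a runnable module; build_server() and the
--host/--port options are assumptions for the example.

```python
import argparse
from wsgiref.simple_server import make_server

def load_application(script_path):
    # Execute the WSGI script file and pull out its 'application' object.
    namespace = {'__file__': script_path, '__name__': '__wsgi_script__'}
    with open(script_path, 'rb') as f:
        exec(compile(f.read(), script_path, 'exec'), namespace)
    return namespace['application']

def build_server(argv):
    # The small set of common options every server would accept.
    parser = argparse.ArgumentParser(description='Run a WSGI script file')
    parser.add_argument('script')
    parser.add_argument('--host', default='localhost')
    parser.add_argument('--port', type=int, default=8000)
    args = parser.parse_args(argv)
    return make_server(args.host, args.port, load_application(args.script))

# A module run via 'python -m' would finish with something like:
#   build_server(sys.argv[1:]).serve_forever()
```

The WSGI script file itself stays free of any server specific code; only
the module named after '-m' changes when switching servers.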

If you really wanted to tie the script to a particular method, but
still make it easy to use something else instead, then do it with a #!
line.

  #!/usr/bin/env python -m gunicorn -- --host localhost --port 8000

with the rest of the file being the normal WSGI script file contents,
without any special __main__ section as that is handled by the #!
line.

FWIW, I did bring this up a couple of years back, but then there was
little interest back then in trying to standardise deployment setup so
there was some measure of commonality between WSGI servers.

Graham

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-12 Thread Graham Dumpleton

On 13 January 2011 12:02, P.J. Eby  wrote:
> At 02:52 PM 1/12/2011 -0800, Guido van Rossum wrote:
>>
>> On Wed, Jan 12, 2011 at 2:34 PM, Alice BevanMcGregor
>>  wrote:
>> > On 2011-01-10 13:12:57 -0800, Guido van Rossum said:
>> >>
>> >> Ok, now that we've had a week of back and forth about this, let me
>> >> repeat
>> >> my "threat". Unless more concerns are brought up in the next 24 hours,
>> >> can
>> >> PEP 3333 be accepted? It seems a lot of people are waiting for a
>> >> decision
>> >> that enables implementers to go ahead and claim PEP 333[3]
>> >> compatibility.
>> >> PEP 444 can take longer.
>> >
>> > With the lack of responses, can I assume this has been or will be
>> > shortly
>> > marked as "accepted"?
>>
>> Yep. Phillip, can you do the honors?
>
> Apparently not -- I went to check it in and found Raymond had already marked
> it "Final".  ;-)
>
> (I'm not clear on whether there's a difference between "Final" and
> "Accepted" here, but I assume that if we find some sort of actual
> error we can still fix it.)

You can partly blame me for that. They were talking about WSGI and
Python 3 on #python-dev and I mentioned that Guido had just blessed it
and made the mistake of mentioning the word 'final' in the sentence, not
knowing anything about what the next status would be. Anyway, Raymond
decided to take it on himself to update it even though I said to leave
it to you. His response since is 'IIRC, informational peps go
straight to final upon acceptance.'. So, he seems to think that the
final status is correct.

Graham

Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea

2011-01-08 Thread Graham Dumpleton

On 9 January 2011 12:16, Alice Bevan–McGregor  wrote:
> On 2011-01-08 09:00:18 -0800, P.J. Eby said:
>
>> (The next interesting challenge would be to integrate this with Graham's
>> proposal for adding cleanup handlers...)
>
> class MyApplication(object):
>   def __init__(self):
>       pass # process startup code
>
>   def __call__(self, environ):
>       yield None # must be a generator
>       pass # request code
>
>   def __enter__(self):
>       pass # request startup code
>
>   def __exit__(self, exc_type, exc_val, exc_tb):
>       pass # request shutdown code -- regardless of exceptions
>
> We could mandate context managers!  :D  (Which means you can still wrap a
> simple function in @contextmanager.)

Context managers don't solve the problem I am trying to address. The
'with' statement doesn't apply context managers to WSGI application
objects in a way that is desirable, and using a decorator to achieve the
same means having to replace close(), which is what I am trying to avoid
because of the extra complexity it causes for WSGI middleware just to
make sure wsgi.file_wrapper works. We want a world where it should
never be necessary for WSGI middleware, or proxy decorators, to have
to fudge up a generator and override the close() chain to add
cleanups.
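
For reference, this is roughly the kind of wrapping middleware has to do
today to attach a cleanup. It is a sketch of the pattern being
complained about, not a recommendation; CleanupIterable and
with_cleanup() are names invented for the example.

```python
class CleanupIterable(object):
    """Wrap a WSGI response iterable so close() also runs a cleanup."""

    def __init__(self, iterable, cleanup):
        self._iterable = iterable
        self._cleanup = cleanup

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            # Preserve the close() chain of the wrapped iterable.
            close = getattr(self._iterable, 'close', None)
            if close is not None:
                close()
        finally:
            self._cleanup()

def with_cleanup(application, cleanup):
    def wrapper(environ, start_response):
        return CleanupIterable(application(environ, start_response), cleanup)
    return wrapper
```

Once the result is wrapped like this the server no longer sees the
original iterable, so if the application returned a wsgi.file_wrapper
instance the server cannot recognise and optimise it without extra
work in the middleware.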

Graham

>        - Alice.
>
>

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-07 Thread Graham Dumpleton

On 8 January 2011 02:55, P.J. Eby  wrote:
> At 05:27 PM 1/7/2011 +1100, Graham Dumpleton wrote:
>>
>> Another thing though. For output changed to sys.stdout.buffer. For
>> input should we be using sys.stdin.buffer as well if want bytes?
>
> %&$*()&%!!!  Sorry, still getting used to this whole Python 3 thing.
>  (Honestly, I don't even use Python 2.6 for anything real yet.)
>
>
>> Good thing I tried running this. Did we all assume that someone else
>> was actually running it to check it? :-)
>
> Well, I only recently started changing the examples to actual Python 3, vs
> being the old Python 2 examples.  Though, I'm not sure anybody ever ran the
> Python 2 ones.  ;-)

Latest CGI/WSGI bridge example extract from PEP 3333 seems to work
okay for my simple test.

So, if no more technical problems (vs cosmetic) that anyone else sees,
that is probably it and we can toss this baby out the door.

Graham

Re: [Web-SIG] Python 3 / PEP 3333 (was: PEP 444 / WSGI 2 Async)

2011-01-06 Thread Graham Dumpleton

On 7 January 2011 18:23, Alice Bevan–McGregor  wrote:
> On 2011-01-06 21:35:24 -0800, Jacob Kaplan-Moss said:
> Other than mod_wsgi, are there any PEP 3333-compliant (or near-compliant)
> components in the wild?  Enough to bring a framework to life in Python 3?
>  What I see is the chicken-and-egg problem endemic with Python 3: developers
> wait on upstream to port before they do, and upstream developers are either
> waiting themselves or don't see the demand to port.

There is also uWSGI and CherryPy WSGI server. I recollect that Benoit
may have started looking it over for gunicorn.

Graham

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread Graham Dumpleton

On 7 January 2011 17:19, Alice Bevan–McGregor  wrote:
>> -                    raise exc_info[0], exc_info[1], exc_info[2]
>> +                    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
>
> The exception raising syntax has changed; you can not re-raise an exception
> using tuple notation any more.  The new syntax is far clearer, but I'm
> unsure of back-compatibility or even if it is possible to emulate it
> completely as a polyglot (2.x and 3.x w/ same code).

PJE already said that the intent is that the PEP will only have Python 3
compatible code in it. We are not attempting to have examples that work
for both Python 2 and Python 3.

That sounds to me then that we should be using what 2to3 changed it to. Ie.,

  if headers_sent:
      # Re-raise original exception if headers sent
      raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
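
As a standalone sketch of that Python 3 form (the reraise() helper name
is made up for the example):

```python
def reraise(exc_info):
    exc_type, exc_value, exc_tb = exc_info
    # This is the form 2to3 produces. It builds a new exception instance
    # from the original one and attaches the original traceback.
    # Raising exc_value.with_traceback(exc_tb) would instead reuse the
    # original exception instance.
    raise exc_type(exc_value).with_traceback(exc_tb)
```

It only works under Python 3, since with_traceback() does not exist on
Python 2 exceptions.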

Graham



Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread Graham Dumpleton

On 7 January 2011 17:22, Graham Dumpleton  wrote:
> On 7 January 2011 17:13, Graham Dumpleton  wrote:
>> The version at:
>>
>> http://svn.python.org/projects/peps/trunk/pep-.txt
>>
>> still shows:
>>
>>        elif not headers_sent:
>>             # Before the first output, send the stored headers
>>             status, response_headers = headers_sent[:] = headers_set
>>             sys.stdout.write('Status: %s\r\n' % status)
>>             for header in response_headers:
>>                 sys.stdout.write('%s: %s\r\n' % header)
>>             sys.stdout.write('\r\n')
>>
>> so not using buffer there and also not converting strings written for
>> headers to bytes.
>
> So:
>
>        elif not headers_sent:
>             # Before the first output, send the stored headers
>             status, response_headers = headers_sent[:] = headers_set
>             sys.stdout.buffer.write(wsgi_header('Status: %s\r\n' % status))
>             for header in response_headers:
>                 sys.stdout.buffer.write(wsgi_header('%s: %s\r\n' % header))
>             sys.stdout.buffer.write(wsgi_header('\r\n'))
>
> where define up start of file:
>
> def wsgi_header(u):
>    return u.encode('iso-8859-1')
>
> I am still seeing some issue with CRLF but is in my body and with
> conversion of some StringIO in my test.

Solved my CRLF issue. It was caused by what 2to3 did to my code.

Another thing though. Output was changed to use sys.stdout.buffer. For
input, should we be using sys.stdin.buffer as well if we want bytes?
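
To illustrate the question, here is a small sketch under stated
assumptions: read_request_body() is a hypothetical helper, and a
TextIOWrapper over a BytesIO stands in for sys.stdin. Reading through
the .buffer attribute returns the raw bytes without any decoding step.

```python
import io

def read_request_body(stdin, content_length):
    """Read content_length bytes from the binary layer under a text stream."""
    # Text streams such as sys.stdin expose their binary layer as .buffer;
    # fall back to the object itself if it is already a binary stream.
    raw = getattr(stdin, "buffer", stdin)
    return raw.read(content_length)

# Stand-in for sys.stdin: the body contains a byte that is not valid UTF-8,
# so decoding it as text would fail, while the binary read succeeds.
fake_stdin = io.TextIOWrapper(io.BytesIO(b"name=caf\xe9"), encoding="utf-8")
body = read_request_body(fake_stdin, 9)
```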

Good thing I tried running this. Did we all assume that someone else
was actually running it to check it? :-)

Graham

>> Graham
>>
>> On 7 January 2011 17:00, Graham Dumpleton  wrote:
>>> Stupid question first. When running 2to3 on the example CGI code, why
>>> would it throw back the following. Is this indicative of anything else
>>> that needs to be changed to satisfy some Python 3 thing. The list()
>>> bit seems redundant, but I don't know what the other stuff is about.
>>>
>>> --- xx.py (original)
>>> +++ xx.py (refactored)
>>> @@ -9,7 +9,7 @@
>>>     return u.encode(enc, esc).decode('iso-8859-1')
>>>
>>>  def run_with_cgi(application):
>>> -    environ = {k: wsgi_string(v) for k,v in os.environ.items()}
>>> +    environ = {k: wsgi_string(v) for k,v in list(os.environ.items())}
>>>     environ['wsgi.input']        = sys.stdin
>>>     environ['wsgi.errors']       = sys.stderr
>>>     environ['wsgi.version']      = (1, 0)
>>> @@ -45,7 +45,7 @@
>>>             try:
>>>                 if headers_sent:
>>>                     # Re-raise original exception if headers sent
>>> -                    raise exc_info[0], exc_info[1], exc_info[2]
>>> +                    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
>>>             finally:
>>>                 exc_info = None     # avoid dangling circular ref
>>>         elif headers_set:
>>>
>>>
>>>
>>>
>>> On 7 January 2011 16:58, Guido van Rossum  wrote:
>>>> On Thu, Jan 6, 2011 at 9:47 PM, James Y Knight  wrote:
>>>>> On Jan 6, 2011, at 11:30 PM, P.J. Eby wrote:
>>>>>> At 09:51 AM 1/7/2011 +1100, Graham Dumpleton wrote:
>>>>>>> Is that the last thing or do I need to go spend some time and write my
>>>>>>> own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have
>>>>>>> lying around and just do some final validation checks with a parallel
>>>>>>> implementation as a sanity check to make sure we got everything? This
>>>>>>> might be a good idea anyway.
>>>>>>
>>>>>> It would.  In the meantime, though, I've checked in the two-line change 
>>>>>> to add .buffer in.  ;-)
>>>>>
>>>>> So does that mean PEP  can be accepted now?
>>>>
>>>> TBH I've totally lost track. Hopefully PJE and Graham can tell you...
>>>>
>>>> --
>>>> --Guido van Rossum (python.org/~guido)
>>>>
>>>
>>
>

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread Graham Dumpleton

On 7 January 2011 17:13, Graham Dumpleton  wrote:
> The version at:
>
> http://svn.python.org/projects/peps/trunk/pep-.txt
>
> still shows:
>
>        elif not headers_sent:
>             # Before the first output, send the stored headers
>             status, response_headers = headers_sent[:] = headers_set
>             sys.stdout.write('Status: %s\r\n' % status)
>             for header in response_headers:
>                 sys.stdout.write('%s: %s\r\n' % header)
>             sys.stdout.write('\r\n')
>
> so not using buffer there and also not converting strings written for
> headers to bytes.

So:

    elif not headers_sent:
        # Before the first output, send the stored headers
        status, response_headers = headers_sent[:] = headers_set
        sys.stdout.buffer.write(wsgi_header('Status: %s\r\n' % status))
        for header in response_headers:
            sys.stdout.buffer.write(wsgi_header('%s: %s\r\n' % header))
        sys.stdout.buffer.write(wsgi_header('\r\n'))

where, defined at the start of the file:

def wsgi_header(u):
    return u.encode('iso-8859-1')

I am still seeing some issue with CRLF, but it is in my response body
and in the conversion of some StringIO in my test.
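
A self-contained sketch of the header-writing step, assuming a
hypothetical write_headers() helper and using io.BytesIO in place of
sys.stdout.buffer so the result can be inspected:

```python
import io

def wsgi_header(u):
    # Header text is encoded to bytes as ISO-8859-1 before being written.
    return u.encode('iso-8859-1')

def write_headers(out, status, response_headers):
    """Write the CGI status line and headers as bytes to a binary stream."""
    out.write(wsgi_header('Status: %s\r\n' % status))
    for header in response_headers:
        # Each header is a (name, value) tuple.
        out.write(wsgi_header('%s: %s\r\n' % header))
    out.write(wsgi_header('\r\n'))

out = io.BytesIO()
write_headers(out, '200 OK', [('Content-Type', 'text/plain')])
written = out.getvalue()
```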

> Graham
>
> On 7 January 2011 17:00, Graham Dumpleton  wrote:
>> Stupid question first. When running 2to3 on the example CGI code, why
>> would it throw back the following. Is this indicative of anything else
>> that needs to be changed to satisfy some Python 3 thing. The list()
>> bit seems redundant, but I don't know what the other stuff is about.
>>
>> --- xx.py (original)
>> +++ xx.py (refactored)
>> @@ -9,7 +9,7 @@
>>     return u.encode(enc, esc).decode('iso-8859-1')
>>
>>  def run_with_cgi(application):
>> -    environ = {k: wsgi_string(v) for k,v in os.environ.items()}
>> +    environ = {k: wsgi_string(v) for k,v in list(os.environ.items())}
>>     environ['wsgi.input']        = sys.stdin
>>     environ['wsgi.errors']       = sys.stderr
>>     environ['wsgi.version']      = (1, 0)
>> @@ -45,7 +45,7 @@
>>             try:
>>                 if headers_sent:
>>                     # Re-raise original exception if headers sent
>> -                    raise exc_info[0], exc_info[1], exc_info[2]
>> +                    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
>>             finally:
>>                 exc_info = None     # avoid dangling circular ref
>>         elif headers_set:
>>
>>
>>
>>
>> On 7 January 2011 16:58, Guido van Rossum  wrote:
>>> On Thu, Jan 6, 2011 at 9:47 PM, James Y Knight  wrote:
>>>> On Jan 6, 2011, at 11:30 PM, P.J. Eby wrote:
>>>>> At 09:51 AM 1/7/2011 +1100, Graham Dumpleton wrote:
>>>>>> Is that the last thing or do I need to go spend some time and write my
>>>>>> own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have
>>>>>> lying around and just do some final validation checks with a parallel
>>>>>> implementation as a sanity check to make sure we got everything? This
>>>>>> might be a good idea anyway.
>>>>>
>>>>> It would.  In the meantime, though, I've checked in the two-line change 
>>>>> to add .buffer in.  ;-)
>>>>
>>>> So does that mean PEP  can be accepted now?
>>>
>>> TBH I've totally lost track. Hopefully PJE and Graham can tell you...
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>>
>>
>

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread Graham Dumpleton

The version at:

http://svn.python.org/projects/peps/trunk/pep-.txt

still shows:

    elif not headers_sent:
        # Before the first output, send the stored headers
        status, response_headers = headers_sent[:] = headers_set
        sys.stdout.write('Status: %s\r\n' % status)
        for header in response_headers:
            sys.stdout.write('%s: %s\r\n' % header)
        sys.stdout.write('\r\n')

so it is not using buffer there, and is also not converting the strings
written for headers to bytes.

Graham

On 7 January 2011 17:00, Graham Dumpleton  wrote:
> Stupid question first. When running 2to3 on the example CGI code, why
> would it throw back the following. Is this indicative of anything else
> that needs to be changed to satisfy some Python 3 thing. The list()
> bit seems redundant, but I don't know what the other stuff is about.
>
> --- xx.py (original)
> +++ xx.py (refactored)
> @@ -9,7 +9,7 @@
>     return u.encode(enc, esc).decode('iso-8859-1')
>
>  def run_with_cgi(application):
> -    environ = {k: wsgi_string(v) for k,v in os.environ.items()}
> +    environ = {k: wsgi_string(v) for k,v in list(os.environ.items())}
>     environ['wsgi.input']        = sys.stdin
>     environ['wsgi.errors']       = sys.stderr
>     environ['wsgi.version']      = (1, 0)
> @@ -45,7 +45,7 @@
>             try:
>                 if headers_sent:
>                     # Re-raise original exception if headers sent
> -                    raise exc_info[0], exc_info[1], exc_info[2]
> +                    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
>             finally:
>                 exc_info = None     # avoid dangling circular ref
>         elif headers_set:
>
>
>
>
> On 7 January 2011 16:58, Guido van Rossum  wrote:
>> On Thu, Jan 6, 2011 at 9:47 PM, James Y Knight  wrote:
>>> On Jan 6, 2011, at 11:30 PM, P.J. Eby wrote:
>>>> At 09:51 AM 1/7/2011 +1100, Graham Dumpleton wrote:
>>>>> Is that the last thing or do I need to go spend some time and write my
>>>>> own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have
>>>>> lying around and just do some final validation checks with a parallel
>>>>> implementation as a sanity check to make sure we got everything? This
>>>>> might be a good idea anyway.
>>>>
>>>> It would.  In the meantime, though, I've checked in the two-line change to 
>>>> add .buffer in.  ;-)
>>>
>>> So does that mean PEP  can be accepted now?
>>
>> TBH I've totally lost track. Hopefully PJE and Graham can tell you...
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>>
>

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread Graham Dumpleton

Stupid question first. When running 2to3 on the example CGI code, why
would it throw back the following? Is this indicative of anything else
that needs to be changed to satisfy some Python 3 thing? The list()
bit seems redundant, but I don't know what the other stuff is about.

--- xx.py (original)
+++ xx.py (refactored)
@@ -9,7 +9,7 @@
     return u.encode(enc, esc).decode('iso-8859-1')

 def run_with_cgi(application):
-    environ = {k: wsgi_string(v) for k,v in os.environ.items()}
+    environ = {k: wsgi_string(v) for k,v in list(os.environ.items())}
     environ['wsgi.input']        = sys.stdin
     environ['wsgi.errors']       = sys.stderr
     environ['wsgi.version']      = (1, 0)
@@ -45,7 +45,7 @@
             try:
                 if headers_sent:
                     # Re-raise original exception if headers sent
-                    raise exc_info[0], exc_info[1], exc_info[2]
+                    raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
             finally:
                 exc_info = None     # avoid dangling circular ref
         elif headers_set:




On 7 January 2011 16:58, Guido van Rossum  wrote:
> On Thu, Jan 6, 2011 at 9:47 PM, James Y Knight  wrote:
>> On Jan 6, 2011, at 11:30 PM, P.J. Eby wrote:
>>> At 09:51 AM 1/7/2011 +1100, Graham Dumpleton wrote:
>>>> Is that the last thing or do I need to go spend some time and write my
>>>> own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have
>>>> lying around and just do some final validation checks with a parallel
>>>> implementation as a sanity check to make sure we got everything? This
>>>> might be a good idea anyway.
>>>
>>> It would.  In the meantime, though, I've checked in the two-line change to 
>>> add .buffer in.  ;-)
>>
>> So does that mean PEP  can be accepted now?
>
> TBH I've totally lost track. Hopefully PJE and Graham can tell you...
>
> --
> --Guido van Rossum (python.org/~guido)
>

Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread Graham Dumpleton

2011/1/7 James Y Knight :
>
> On Jan 6, 2011, at 7:46 PM, Alex Grönholm wrote:
>
> The WebDAV spec, on the other hand, says
> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>
> The 102 (Processing) status code is an interim response used to inform the
> client that the server has accepted the complete request, but has not yet
> completed it. This status code SHOULD only be sent when the server has a
> reasonable expectation that the request will take significant time to
> complete. As guidance, if a method is taking longer than 20 seconds (a
> reasonable, but arbitrary value) to process the server SHOULD return a 102
> (Processing) response. The server MUST send a final response after the
> request has been completed.
>
> Again, I don't care how this is done as long as it's possible.
>
> This pretty much has to be generated by the server implementation. One thing
> that could be done in WSGI is a callback function inserted into the environ
> to suggest to the server that it generate a certain 1xx response. That is,
> something like:
>   if 'wsgi.intermediate_response' in environ:
>     environ['wsgi.intermediate_response'](102, {'Random-Header': 'Whatever'})
> If a server implements this, it should probably ignore any requests from the
> app to send a 100 or 101 response. The server should be free to ignore the
> request, or not implement it. Given that the only actual use case (WebDAV)
> is rather rare and marks it as a SHOULD, I don't see any real practical
> issues with it being optional.
> The other thing that could be done is simply have a server-side
> configuration to allow sending 102 after *any* request takes > 20 seconds to
> process. That wouldn't require any changes to WSGI.
> I'd note that HTTP/1.1 clients are *required* to be able to handle any
> number of 1xx responses followed by a final response, so it's supposed to be
> perfectly safe for a server to always send a 102 as a response to any
> request, no matter what the app is, or what client user-agent is (so long as
> it advertised HTTP/1.1), or even whether the resource has anything to do
> with WebDAV. Of course, I'm willing to bet that's patently false back here
> in the Real World -- no doubt plenty of "HTTP/1.1" clients incorrectly barf
> on 1xx responses.

FWIW, Apache provides ap_send_interim_response() to allow interim status.

This is used by mod_proxy, but nowhere else in the Apache core code. So,
you would be fine if proxying to a pure Python HTTP/WSGI server which
could generate interim responses, but would be out of luck with
FASTCGI, SCGI, AJP, CGI and any modules which do custom proxying using
their own protocol, such as uWSGI or mod_wsgi daemon mode.

In all of the latter, the wire protocol used for the proxy connection
would itself need to be modified, as well as the module implementation,
which isn't going to happen for any of those which are generic protocols.
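
The optional callback suggested in the quoted text could be sketched as
follows. This is only an illustration: 'wsgi.intermediate_response' is
the hypothetical environ key from that suggestion, not part of any WSGI
specification, and RecordingServer is an invented stand-in that records
what a real server would put on the wire. A server unable to send
interim responses simply leaves the key out and the application carries
on unchanged.

```python
def app(environ, start_response):
    # Ask for an interim 102 only if the (hypothetical) hook is offered.
    send_interim = environ.get('wsgi.intermediate_response')
    if send_interim is not None:
        send_interim(102, {'Random-Header': 'Whatever'})
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'done\n']

class RecordingServer:
    """Toy server that records interim and final responses."""

    def __init__(self, supports_interim):
        self.supports_interim = supports_interim
        self.interim = []
        self.final = None

    def _interim(self, status, headers):
        # Ignore application requests for 100 or 101, as suggested above.
        if status in (100, 101):
            return
        self.interim.append((status, dict(headers)))

    def _start_response(self, status, headers, exc_info=None):
        self.final = (status, headers)

    def run(self, application):
        environ = {'REQUEST_METHOD': 'GET'}
        if self.supports_interim:
            environ['wsgi.intermediate_response'] = self._interim
        return b''.join(application(environ, self._start_response))

with_hook = RecordingServer(supports_interim=True)
body_with_hook = with_hook.run(app)
without_hook = RecordingServer(supports_interim=False)
body_without_hook = without_hook.run(app)
```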

Graham

Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread Graham Dumpleton

2011/1/7 Alex Grönholm :
> 07.01.2011 04:09, Graham Dumpleton kirjoitti:
>>
>> 2011/1/7 Graham Dumpleton:
>>>
>>> 2011/1/7 Alex Grönholm:
>>>>
>>>> 07.01.2011 01:14, Graham Dumpleton kirjoitti:
>>>>
>>>> One other comment about HTTP/1.1 features.
>>>>
>>>> You will always be battling to have some HTTP/1.1 features work in a
>>>> controllable way. This is because WSGI gateways/adapters aren't often
>>>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>>>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>>>> the modules implementing those protocols do, or even are hamstrung by
>>>> how those protocols work.
>>>>
>>>> The classic example is 100-continue processing. This simply cannot
>>>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>>>> mechanisms where proxying is performed as the protocol being used
>>>> doesn't implement a notion of end to end signalling in respect of
>>>> 100-continue.
>>>>
>>>> I think we need some concrete examples to figure out what is and isn't
>>>> possible with WSGI 1.0.1.
>>>> My motivation for participating in this discussion can be summed up in
>>>> that
>>>> I want the following two applications to work properly:
>>>>
>>>> - PlasmaDS (Flex Messaging implementation)
>>>> - WebDAV
>>>>
>>>> The PlasmaDS project is the planned Python counterpart to Adobe's
>>>> BlazeDS.
>>>> Interoperability with the existing implementation requires that both the
>>>> request and response use chunked transfer encoding, to achieve
>>>> bidirectional
>>>> streaming. I don't really care how this happens, I just want to make
>>>> sure
>>>> that there is nothing preventing it.
>>>
>>> That can only be done by changing the rules around wsgi.input is used.
>>> I'll try and find a reference to where I have posted information about
>>> this before, otherwise I'll write something up again about it.
>>
>> BTW, even if WSGI specification were changed to allow handling of
>> chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
>> mod_wsgi daemon mode. Also not likely to work on uWSGI either.
>>
>> This is because all of these work on the expectation that the complete
>> request body can be written across to the separate application process
>> before actually reading the response from the application.
>>
>> In other words, both way streaming is not possible.
>>
>> The only solution which would allow this with Apache is mod_wsgi
>> embedded mode, which in mod_wsgi 3.X already has an optional feature
>> which can be enabled so as to allow you to step out of current bounds
>> of the WSGI specification and use wsgi.input as I will explain, to do
>> this both way streaming.
>>
>> Pure Python HTTP/WSGI servers which are a front facing server could
>> also be modified to handle this is WSGI specification were changed,
>> but whether those same will work if put behind a web proxy will depend
>> on how the front end web proxy works.
>
> Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?

Huh! Not sure you understand what I am saying. Even if you changed the
WSGI specification to allow for it, the bulk of implementations
wouldn't be able to support it. The WSGI specification has no
influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI, or
over proxy implementations, and so can't be used to force them to change.

So, as much as I would like to see the WSGI specification changed to
allow it, others may not, on the basis that there is no point if few
implementations could support it.

Graham

>> Graham
>>
>>>> The WebDAV spec, on the other hand, says
>>>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>>>
>>>> The 102 (Processing) status code is an interim response used to inform
>>>> the
>>>> client that the server has accepted the complete request, but has not
>>>> yet
>>>> completed it. This status code SHOULD only be sent when the server has a
>>>> reasonable expectation that the request will take significant time to
>>>> complete. As guidance, if a method is taking longer than 20 seconds (a
>>>> reasonable, but arbitrary value) to process the server SHOULD return a
>>>> 102

Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread Graham Dumpleton

2011/1/7 Graham Dumpleton :
> 2011/1/7 Alex Grönholm :
>> 07.01.2011 01:14, Graham Dumpleton kirjoitti:
>>
>> One other comment about HTTP/1.1 features.
>>
>> You will always be battling to have some HTTP/1.1 features work in a
>> controllable way. This is because WSGI gateways/adapters aren't often
>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>> the modules implementing those protocols do, or even are hamstrung by
>> how those protocols work.
>>
>> The classic example is 100-continue processing. This simply cannot
>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>> mechanisms where proxying is performed as the protocol being used
>> doesn't implement a notion of end to end signalling in respect of
>> 100-continue.
>>
>> I think we need some concrete examples to figure out what is and isn't
>> possible with WSGI 1.0.1.
>> My motivation for participating in this discussion can be summed up in that
>> I want the following two applications to work properly:
>>
>> - PlasmaDS (Flex Messaging implementation)
>> - WebDAV
>>
>> The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS.
>> Interoperability with the existing implementation requires that both the
>> request and response use chunked transfer encoding, to achieve bidirectional
>> streaming. I don't really care how this happens, I just want to make sure
>> that there is nothing preventing it.
>
> That can only be done by changing the rules around wsgi.input is used.
> I'll try and find a reference to where I have posted information about
> this before, otherwise I'll write something up again about it.

BTW, even if the WSGI specification were changed to allow handling of
chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
mod_wsgi daemon mode. It is not likely to work on uWSGI either.

This is because all of these work on the expectation that the complete
request body can be written across to the separate application process
before actually reading the response from the application.

In other words, two-way streaming is not possible.

The only solution which would allow this with Apache is mod_wsgi
embedded mode, which in mod_wsgi 3.X already has an optional feature
which can be enabled so as to allow you to step outside the current
bounds of the WSGI specification and use wsgi.input, as I will explain,
to do this two-way streaming.

Pure Python HTTP/WSGI servers which are front-facing servers could
also be modified to handle this if the WSGI specification were changed,
but whether those same servers will work if put behind a web proxy will
depend on how the front end web proxy works.
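
What two-way streaming would look like from the application side can be
sketched as below. This is an illustration under assumptions, not a
description of any server's behaviour: it assumes the server hands
request data to wsgi.input as it arrives, signals the end of input with
an empty read, and sends each yielded block before the next read. That
is exactly what the hosting mechanisms listed above cannot do. The
names echo_app and fake_start_response are hypothetical.

```python
import io

def echo_app(environ, start_response):
    """Read the request body piece by piece, responding as data arrives."""
    start_response('200 OK', [('Content-Type', 'application/octet-stream')])
    stream = environ['wsgi.input']

    def generate():
        while True:
            chunk = stream.read(4)   # small reads to show the interleaving
            if not chunk:            # assumed end-of-input signal
                break
            yield chunk.upper()      # response data produced per request chunk

    return generate()

def fake_start_response(status, headers, exc_info=None):
    # Stand-in for the server's start_response callable.
    pass

environ = {'wsgi.input': io.BytesIO(b'abcdefghij')}
chunks = list(echo_app(environ, fake_start_response))
```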

Graham

>> The WebDAV spec, on the other hand, says
>> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>>
>> The 102 (Processing) status code is an interim response used to inform the
>> client that the server has accepted the complete request, but has not yet
>> completed it. This status code SHOULD only be sent when the server has a
>> reasonable expectation that the request will take significant time to
>> complete. As guidance, if a method is taking longer than 20 seconds (a
>> reasonable, but arbitrary value) to process the server SHOULD return a 102
>> (Processing) response. The server MUST send a final response after the
>> request has been completed.
>
> That I don't offhand see a way of being able to do as protocols like
> SCGI and CGI definitely don't allow interim status. I am suspecting
> that FASTCGI and AJP don't allow it either.
>
> I'll have to even do some digging as to how you would even handle that
> in Apache with a normal Apache handler.
>
> Graham
>
>> Again, I don't care how this is done as long as it's possible.
>>
>> The current WSGI specification acknowledges that by saying:
>>
>> """
>> Servers and gateways that implement HTTP 1.1 must provide transparent
>> support for HTTP 1.1's "expect/continue" mechanism. This may be done
>> in any of several ways:
>>
>> * Respond to requests containing an Expect: 100-continue request with
>> an immediate "100 Continue" response, and proceed normally.
>> * Proceed with the request normally, but provide the application with
>> a wsgi.input stream that will send the "100 Continue" response if/when
>> the application first attempts to read from the input stream. The read
>> request must then remain blocked until the client responds.
>> * Wait until the client decides that th

Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread Graham Dumpleton

2011/1/7 Alex Grönholm :
> 07.01.2011 01:14, Graham Dumpleton kirjoitti:
>
> One other comment about HTTP/1.1 features.
>
> You will always be battling to have some HTTP/1.1 features work in a
> controllable way. This is because WSGI gateways/adapters aren't often
> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
> AJP, CGI etc. In this sort of situation you are at the mercy of what
> the modules implementing those protocols do, or even are hamstrung by
> how those protocols work.
>
> The classic example is 100-continue processing. This simply cannot
> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
> mechanisms where proxying is performed as the protocol being used
> doesn't implement a notion of end to end signalling in respect of
> 100-continue.
>
> I think we need some concrete examples to figure out what is and isn't
> possible with WSGI 1.0.1.
> My motivation for participating in this discussion can be summed up in that
> I want the following two applications to work properly:
>
> - PlasmaDS (Flex Messaging implementation)
> - WebDAV
>
> The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS.
> Interoperability with the existing implementation requires that both the
> request and response use chunked transfer encoding, to achieve bidirectional
> streaming. I don't really care how this happens, I just want to make sure
> that there is nothing preventing it.

That can only be done by changing the rules around how wsgi.input is
used. I'll try and find a reference to where I have posted information
about this before, otherwise I'll write something up again about it.

> The WebDAV spec, on the other hand, says
> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>
> The 102 (Processing) status code is an interim response used to inform the
> client that the server has accepted the complete request, but has not yet
> completed it. This status code SHOULD only be sent when the server has a
> reasonable expectation that the request will take significant time to
> complete. As guidance, if a method is taking longer than 20 seconds (a
> reasonable, but arbitrary value) to process the server SHOULD return a 102
> (Processing) response. The server MUST send a final response after the
> request has been completed.

Offhand I don't see a way of doing that, as protocols like SCGI and CGI
definitely don't allow interim status. I suspect that FASTCGI and AJP
don't allow it either.

I'll even have to do some digging as to how you would handle that
in Apache with a normal Apache handler.

Graham

> Again, I don't care how this is done as long as it's possible.
>
> The current WSGI specification acknowledges that by saying:
>
> """
> Servers and gateways that implement HTTP 1.1 must provide transparent
> support for HTTP 1.1's "expect/continue" mechanism. This may be done
> in any of several ways:
>
> * Respond to requests containing an Expect: 100-continue request with
> an immediate "100 Continue" response, and proceed normally.
> * Proceed with the request normally, but provide the application with
> a wsgi.input stream that will send the "100 Continue" response if/when
> the application first attempts to read from the input stream. The read
> request must then remain blocked until the client responds.
> * Wait until the client decides that the server does not support
> expect/continue, and sends the request body on its own. (This is
> suboptimal, and is not recommended.)
> """
>
> If you are going to try and push for full visibility of HTTP/1.1 and
> an ability to control it at the application level then you will fail
> with 100-continue to start with.
>
> So, although option 2 above would be the most ideal and is giving the
> application control, specifically the ability to send an error
> response based on request headers alone, and with reading the response
> and triggering the 100-continue, it isn't practical to require it, as
> the majority of hosting mechanisms for WSGI wouldn't even be able to
> implement it that way.
>
> The same goes for any other feature, there is no point mandating a
> feature that can only be realistically implementing on a minority of
> implementations. This would be even worse where dependence on such a
> feature would mean that the WSGI application would no longer be
> portable to another WSGI server and destroys the notion that WSGI
> provides a portable interface.
>
> This isn't just restricted to HTTP/1.1 features either, but also
> applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
> that are di

Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread Graham Dumpleton

One other comment about HTTP/1.1 features.

You will always be battling to have some HTTP/1.1 features work in a
controllable way. This is because WSGI gateways/adapters aren't often
directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
AJP, CGI etc. In this sort of situation you are at the mercy of what
the modules implementing those protocols do, or even are hamstrung by
how those protocols work.

The classic example is 100-continue processing. This simply cannot
work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
mechanisms where proxying is performed as the protocol being used
doesn't implement a notion of end to end signalling in respect of
100-continue.

The current WSGI specification acknowledges that by saying:

"""
Servers and gateways that implement HTTP 1.1 must provide transparent
support for HTTP 1.1's "expect/continue" mechanism. This may be done
in any of several ways:

* Respond to requests containing an Expect: 100-continue request with
an immediate "100 Continue" response, and proceed normally.
* Proceed with the request normally, but provide the application with
a wsgi.input stream that will send the "100 Continue" response if/when
the application first attempts to read from the input stream. The read
request must then remain blocked until the client responds.
* Wait until the client decides that the server does not support
expect/continue, and sends the request body on its own. (This is
suboptimal, and is not recommended.)
"""

If you are going to try and push for full visibility of HTTP/1.1 and
an ability to control it at the application level, then you will fail
with 100-continue to start with.

So, although option 2 above would be the ideal, as it gives the
application control, specifically the ability to send an error
response based on request headers alone, without reading the request
body and triggering the 100-continue, it isn't practical to require it,
as the majority of hosting mechanisms for WSGI wouldn't even be able to
implement it that way.
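
For a server that does own the raw HTTP connection, option 2 can be
sketched as a wrapper around the request body stream. This is a minimal
sketch under assumptions: ContinueOnRead and the send_raw callable are
hypothetical names, only read() and readline() are shown, and a list
stands in for the client socket.

```python
import io

class ContinueOnRead:
    """wsgi.input wrapper that sends '100 Continue' before the first read."""

    def __init__(self, body, send_raw, expects_continue):
        self._body = body              # underlying binary request stream
        self._send_raw = send_raw      # callable writing bytes to the client
        self._pending = expects_continue

    def _maybe_continue(self):
        # Send the interim response once, and only if the client asked.
        if self._pending:
            self._pending = False
            self._send_raw(b'HTTP/1.1 100 Continue\r\n\r\n')

    def read(self, size=-1):
        self._maybe_continue()
        return self._body.read(size)

    def readline(self, size=-1):
        self._maybe_continue()
        return self._body.readline(size)

sent = []  # stands in for bytes written to the client connection
stream = ContinueOnRead(io.BytesIO(b'payload'), sent.append, True)
sent_before_read = list(sent)   # nothing sent until the application reads
first = stream.read(3)
rest = stream.read()
```

An application that rejects the request from the headers alone never
calls read(), so no "100 Continue" is sent and the client need not
transmit the body.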

The same goes for any other feature: there is no point mandating a
feature that can realistically be implemented on only a minority of
implementations. This would be even worse where dependence on such a
feature would mean that the WSGI application would no longer be
portable to another WSGI server, destroying the notion that WSGI
provides a portable interface.

This isn't just restricted to HTTP/1.1 features either, but also
applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers
that are directly hooked into the URL parsing of the base HTTP server
can provide that information, which basically means that only pure
Python HTTP/WSGI servers are likely to be able to provide it without
guessing, and such servers are usually used with the WSGI application
mounted at the root anyway.

Graham

On 7 January 2011 09:29, Graham Dumpleton  wrote:
> On 7 January 2011 08:56, Alice Bevan–McGregor  wrote:
>> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>>
>>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>>
>>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as
>>>> per the HTTP 1.1 RFC.
>>>
>>> Requirements on the HTTP compliance of the server don't really have any
>>> place in the WSGI spec. You should be able to be WSGI compliant even if you
>>> don't use the HTTP transport at all (e.g. maybe you just send around
>>> requests via SCGI).
>>> The original spec got this right: chunking etc are something which is not
>>> relevant to the wsgi application code -- it is up to the server to implement
>>> the HTTP transport according to the HTTP spec, if it's purporting to be an
>>> HTTP server.
>>
>> Chunking is actually quite relevant to the specification, as WSGI and PEP
>> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for
>> chunked bodies regardless of higher-level support for chunking.  The body
>> iterator.  Previously you /had/ to define a length, with chunked encoding at
>> the server level, you don't.
>>
>> I agree, however, that not all gateways will be able to implement the
>> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google search,
>> seems to support it as well. I should re-word it as:
>>
>> "For those servers capable of HTTP/1.1 features the implementation of such
>> features is required."
>
> I would question whether FASTCGI, SCGI or AJP support the concept of
> chunking of

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-06 Thread Graham Dumpleton

Can we not let the PEP 444 discussion side track getting PEP 
sorted out? This is exactly what has happened numerous times before
when we have been trying to sort out core issues of WSGI on Python 3.
And people wonder why I get grumpy now every time this happens. :-(

So, where are we at? It seems the only real issue that needs to be
resolved is the correctness or otherwise of the CGI/WSGI bridge
example. Everything else could be left as is, even if dubious.

For the CGI/WSGI bridge I take it the issue is that bytes need to be
written to the underlying buffer (sys.stdout.buffer) directly. I don't
see a need to flush the top level stdout, because nothing should be
writing to stdout besides the WSGI bridge, so the bridge just needs to
be consistent in how it writes.
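
As a minimal sketch of writing bytes this way on Python 3, following
the flush/write/flush sequence from the quoted discussion: the function
name and the optional stdout parameter are assumptions added for
illustration (the parameter only exists so the sketch can be exercised
without touching the real sys.stdout).

```python
import io
import sys


def write_response_bytes(data, stdout=None):
    """Write bytes to the binary layer underneath a text stdout."""
    stdout = sys.stdout if stdout is None else stdout
    # Push out any text already written through the text layer, so it
    # cannot end up after the bytes written below.
    stdout.flush()
    stdout.buffer.write(data)
    stdout.buffer.flush()


# Exercise it against an in-memory stand-in for sys.stdout.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="ascii", newline="")
fake_stdout.write("Status: 200 OK\r\n\r\n")
write_response_bytes(b"abc", fake_stdout)
```

If the bridge only ever writes through the buffer, the first flush()
is redundant but harmless.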

Is that the last thing or do I need to go spend some time and write my
own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have
lying around and just do some final validation checks with a parallel
implementation as a sanity check to make sure we got everything? This
might be a good idea anyway.

Graham

On 5 January 2011 05:33, Guido van Rossum  wrote:
> On Tue, Jan 4, 2011 at 7:48 AM, P.J. Eby  wrote:
>> At 06:30 PM 1/3/2011 -0800, Guido van Rossum wrote:
>>>
>>> Would
>>>
>>>  sys.stdout.buffer.write(b'abc')
>>>
>>> do?
>>>
>>> (If you mix this with writing strings to sys.stdout directly, you may
>>> have to call sys.stdout.flush() first.)
>>
>> The current code is:
>>
>>            sys.stdout.write(data)  # TODO: this needs to be binary on Py3
>>            sys.stdout.flush()
>>
>> Should I be using sys.stdout.buffer for both, or just the write?
>
> For both.
>
> But the flush() I was referring to is actually *before* either of
> these, suggesting
>
> sys.stdout.flush()
> sys.stdout.buffer.write(data)
> sys.stdout.buffer.flush()
>
> However the first flush() is only necessary if there's a possibility
> that previously something had been written to sys.stdout (not to
> sys.stdout.buffer).
>
>> For the CGI example in the PEP, I don't want to bother trying to make it
>> fully production-usable; that's what we have wsgiref in the stdlib for.  So
>> I won't worry about mixing strings and regular output in the example, even
>> if perhaps wsgiref should add the StringIO's proposed by Graham.
>
> Not sure what you mean. Surely copying and pasting the examples should
> work? What are the details you'd like to leave out?
>
> --
> --Guido van Rossum (python.org/~guido)
> ___
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: 
> http://mail.python.org/mailman/options/web-sig/graham.dumpleton%40gmail.com
>
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] PEP 444 Goals

2011-01-06 Thread Graham Dumpleton

On 7 January 2011 08:56, Alice Bevan–McGregor  wrote:
> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>
>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>>
>>> :: Making optional (and thus rarely-implemented) features non-optional.
>>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>>> applications to 1.1 servers.  Thus pipelining, chunked encoding, et. al. as
>>> per the HTTP 1.1 RFC.
>>
>> Requirements on the HTTP compliance of the server don't really have any
>> place in the WSGI spec. You should be able to be WSGI compliant even if you
>> don't use the HTTP transport at all (e.g. maybe you just send around
>> requests via SCGI).
>> The original spec got this right: chunking etc are something which is not
>> relevant to the wsgi application code -- it is up to the server to implement
>> the HTTP transport according to the HTTP spec, if it's purporting to be an
>> HTTP server.
>
> Chunking is actually quite relevant to the specification, as WSGI and PEP
> 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for
> chunked bodies regardless of higher-level support for chunking.  The body
> iterator.  Previously you /had/ to define a length, with chunked encoding at
> the server level, you don't.
>
> I agree, however, that not all gateways will be able to implement the
> relevant HTTP/1.1 features.  FastCGI does, SCGI after a quick Google search,
> seems to support it as well. I should re-word it as:
>
> "For those servers capable of HTTP/1.1 features the implementation of such
> features is required."

I would question whether FASTCGI, SCGI or AJP support the concept of
chunking of responses to the extent that the application can prepare
the final content including chunks as required by the HTTP
specification. Further, in Apache at least, the output from a web
application served via those protocols is still pushed through the
Apache output filter chain so as to allow the filters to modify the
response, eg., apply compression using mod_deflate. As a consequence,
the standard HTTP 'CHUNK' output filter is still a part of the output
filter stack. This means that were a web application to try and do
chunking itself, then Apache would rechunk such that the original
chunking became part of the content, rather than the transfer
encoding.

So, in order to be able to achieve what I think you want, with a web
application being able to do chunking itself, you would need to modify
the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
likely also mod_cgi and mod_cgid of Apache.

The only WSGI implementation I know of for Apache where you might even
be able to do what you want is uWSGI. This is because I believe from
memory it uses a mode in Apache by default called assbackwards. What
this allows is for the output from the web application to bypass the
Apache output filter stack and directly control the raw HTTP output.
This gives uWSGI a little bit less overhead in Apache, but at the loss
of the ability to actually use Apache output filters and for Apache to
fix up response headers in any way. There is a flag in uWSGI which can
optionally be set to make it use the more traditional mode and not use
assbackwards.

Thus, I believe you would be fighting against server implementations
such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
chunking to be supported at the level of the web application.

About all you can do is ensure that the WSGI specification doesn't
include anything in it which would prevent a web application
harnessing indirectly such a feature as chunking where the web server
supports it.

As it is, chunked responses aren't even the problem, because if the
underlying web server supports chunking for responses, all you need
to do is not set the content length.
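
For example, a minimal sketch of an application that streams its body
and simply leaves out Content-Length, so that a server which supports
chunking is free to apply it. The application and the stand-in
start_response below are illustrative only.

```python
def application(environ, start_response):
    # No Content-Length header: the server decides how to frame the
    # response (chunked transfer encoding, or closing the connection).
    start_response("200 OK", [("Content-Type", "text/plain")])

    def body():
        for part in (b"first\n", b"second\n"):
            yield part

    return body()


# Stand-in for the server side, just to drive the application.
captured = {}


def start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers


result = b"".join(application({}, start_response))
```

The application never produces chunk framing itself; that remains the
job of the web server.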

The problem area with chunking is the request content, as the way the
WSGI specification is written prevents chunked request content from
being handled. I have described the issue previously and made
suggestions about an alternate way that wsgi.input could be used.
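
A rough sketch of why this is a problem: portable code reads the
request body bounded by CONTENT_LENGTH, and a chunked request has no
such value. The helper name below is hypothetical.

```python
import io


def read_request_body(environ):
    """Read the request body the way portable WSGI code has to."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # Never read more than CONTENT_LENGTH bytes from wsgi.input.
    return environ["wsgi.input"].read(length)


with_length = read_request_body(
    {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"hello world")}
)
# A chunked request carries no Content-Length, so CONTENT_LENGTH is
# absent and code written this way sees an empty body.
chunked = read_request_body({"wsgi.input": io.BytesIO(b"hello world")})
```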

Graham

> +1
>
>        - Alice.
>
>

Re: [Web-SIG] CGI in PEP 444

2011-01-04 Thread Graham Dumpleton

On 5 January 2011 07:04, James Y Knight  wrote:
> Back to the subject of this thread: A simple CGI server is useful because 
> it's simple enough that you can include it in the spec, to demonstrate how to 
> handle various bits of WSGI. And anyone writing a webserver understands CGI, 
> and can understand that. A complete HTTP implementation would not be simple 
> enough to write into the spec.

+1

And this is the crux of the issue. It doesn't matter whether people
use CGI or not, CGI provides a good basis for showing the mechanics of
how a WSGI server/adapter should process stuff. If not that, what are
you going to do, try and use pseudo code, include a much larger socket
based web server solution?

Graham

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-04 Thread Graham Dumpleton

Add another point. FWIW, these are coming up because of questions
being asked on python-dev IRC channel about PEP .

The issue, as it came down to, was that the PEP may not be clear enough
in explaining that, where str() is unicode, something like PATH_INFO,
although unicode, is actually bytes decoded as ISO-8859-1, and so
needs to be re-encoded and then decoded again to get it back to
Unicode in the required charset before use.

They were thinking that because it was unicode already they could use
it as is and not need to do anything. I.e., they didn't realise that
they need to do:

  path_info = environ.get('PATH_INFO', '')
  path_info = path_info.encode('ISO-8859-1').decode('UTF-8')

for example to get it interpreted as UTF-8 first. They were simply
looking at concatenating new URL bits to the ISO-8859-1 variant from
other unicode strings that weren't bytes represented as ISO-8859-1.

In Python 2.X it was obvious that since it wasn't unicode that you had
to decode it, but confusion may arise for Python 3.X if this
requirement is not explicitly spelled out with a code example like
above.
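
Wrapped up as a small helper, a sketch of the same round trip might
look like this. The helper name is hypothetical, and UTF-8 is only an
assumed charset; the application has to know what its URLs actually
use.

```python
def decode_wsgi_str(value, charset="UTF-8"):
    """Turn an ISO-8859-1 'bytes in unicode' value into real text."""
    return value.encode("ISO-8859-1").decode(charset)


# Simulate what a Python 3 WSGI server hands over for the path /caf\u00e9.
raw = "/caf\u00e9".encode("UTF-8")            # bytes sent by the client
environ = {"PATH_INFO": raw.decode("ISO-8859-1")}

path_info = decode_wsgi_str(environ.get("PATH_INFO", ""))
```

Only after this step is it safe to concatenate the value with other
unicode strings.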

We all may see it as obvious, and yes perhaps it could be covered in
separate articles or commentaries by people, but given this person was
new to it, maybe it is deserving of more explanation in the PEP itself
if they were confused.

It could also be that the PEP covers it adequately already. I am too
tired to read through it again right now.

Graham

On 4 January 2011 20:53, Graham Dumpleton  wrote:
> BTW, to what extent are the examples in the PEP meant to be able to
> work on both Python 2.X and Python 3.X as is.
>
> Does it need to be clarified where examples will only work on Python
> 3.X, in particular the CGI gateway.
>
> Graham
>
> On 4 January 2011 16:49, Graham Dumpleton  wrote:
>> On 4 January 2011 16:39, Guido van Rossum  wrote:
>>> On Mon, Jan 3, 2011 at 7:39 PM, Graham Dumpleton
>>>  wrote:
>>>> I note one issue which I have expressed concern over previously. In
>>>> section 'Handling the Content-Length Header' it says:
>>>>
>>>> """
>>>> Under some circumstances, however, the server or gateway may be able
>>>> to either generate a Content-Length header, or at least avoid the need
>>>> to close the client connection. If the application does not call the
>>>> write() callable, and returns an iterable whose len() is 1, then the
>>>> server can automatically determine Content-Length by taking the length
>>>> of the first bytestring yielded by the iterable.
>>>> """
>>>
>>> That is copied exactly from PEP 333, i.e. WSGI 1.0. I didn't mean to
>>> solicit objections to parts of PEP  that are the same as PEP 333;
>>> PEP  is intended only to specify how WSGI 1.0 compliance is
>>> supposed to work in Python 3. Some clarifications to the original WSGI
>>> 1.0 wordings were actually added to PEP 333 around the same time that
>>> PEP  was spun off; AFAIK the changes to PEP 333 were
>>> noncontroversial and merely clarifications of how WSGI already works.
>>> I don't think you can change the above bit of specification (no matter
>>> how bad it is) and still call the resulting spec WSGI 1.0(.x) -- we
>>> don't want to rule out WSGI 1.0 compliance of apps or frameworks that
>>> would be considered compliant under the original 1.0 spec.
>>
>> I don't believe this really causes a compliance issue as it is a
>> requirement on the WSGI server, not the WSGI application and doesn't
>> cause any existing WSGI applications to break.
>>
>> It also says 'can' and not 'must' so technically WSGI servers are not
>> currently obligated to do it as I read it and certainly mod_wsgi
>> doesn't do it any more because it was causing problems for people.
>>
>> But then, since it does say 'can' and not 'must' any WSGI server
>> implementers who know better can just ignore it anyway if it is left
>> in, and it can be dealt with in any new major revision.
>>
>> Graham
>>
>

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-04 Thread Graham Dumpleton

BTW, to what extent are the examples in the PEP meant to be able to
work on both Python 2.X and Python 3.X as is.

Does it need to be clarified where examples will only work on Python
3.X, in particular the CGI gateway.

Graham

On 4 January 2011 16:49, Graham Dumpleton  wrote:
> On 4 January 2011 16:39, Guido van Rossum  wrote:
>> On Mon, Jan 3, 2011 at 7:39 PM, Graham Dumpleton
>>  wrote:
>>> I note one issue which I have expressed concern over previously. In
>>> section 'Handling the Content-Length Header' it says:
>>>
>>> """
>>> Under some circumstances, however, the server or gateway may be able
>>> to either generate a Content-Length header, or at least avoid the need
>>> to close the client connection. If the application does not call the
>>> write() callable, and returns an iterable whose len() is 1, then the
>>> server can automatically determine Content-Length by taking the length
>>> of the first bytestring yielded by the iterable.
>>> """
>>
>> That is copied exactly from PEP 333, i.e. WSGI 1.0. I didn't mean to
>> solicit objections to parts of PEP  that are the same as PEP 333;
>> PEP  is intended only to specify how WSGI 1.0 compliance is
>> supposed to work in Python 3. Some clarifications to the original WSGI
>> 1.0 wordings were actually added to PEP 333 around the same time that
>> PEP  was spun off; AFAIK the changes to PEP 333 were
>> noncontroversial and merely clarifications of how WSGI already works.
>> I don't think you can change the above bit of specification (no matter
>> how bad it is) and still call the resulting spec WSGI 1.0(.x) -- we
>> don't want to rule out WSGI 1.0 compliance of apps or frameworks that
>> would be considered compliant under the original 1.0 spec.
>
> I don't believe this really causes a compliance issue as it is a
> requirement on the WSGI server, not the WSGI application and doesn't
> cause any existing WSGI applications to break.
>
> It also says 'can' and not 'must' so technically WSGI servers are not
> currently obligated to do it as I read it and certainly mod_wsgi
> doesn't do it any more because it was causing problems for people.
>
> But then, since it does say 'can' and not 'must' any WSGI server
> implementers who know better can just ignore it anyway if it is left
> in, and it can be dealt with in any new major revision.
>
> Graham
>

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-03 Thread Graham Dumpleton

On 4 January 2011 16:39, Guido van Rossum  wrote:
> On Mon, Jan 3, 2011 at 7:39 PM, Graham Dumpleton
>  wrote:
>> I note one issue which I have expressed concern over previously. In
>> section 'Handling the Content-Length Header' it says:
>>
>> """
>> Under some circumstances, however, the server or gateway may be able
>> to either generate a Content-Length header, or at least avoid the need
>> to close the client connection. If the application does not call the
>> write() callable, and returns an iterable whose len() is 1, then the
>> server can automatically determine Content-Length by taking the length
>> of the first bytestring yielded by the iterable.
>> """
>
> That is copied exactly from PEP 333, i.e. WSGI 1.0. I didn't mean to
> solicit objections to parts of PEP  that are the same as PEP 333;
> PEP  is intended only to specify how WSGI 1.0 compliance is
> supposed to work in Python 3. Some clarifications to the original WSGI
> 1.0 wordings were actually added to PEP 333 around the same time that
> PEP  was spun off; AFAIK the changes to PEP 333 were
> noncontroversial and merely clarifications of how WSGI already works.
> I don't think you can change the above bit of specification (no matter
> how bad it is) and still call the resulting spec WSGI 1.0(.x) -- we
> don't want to rule out WSGI 1.0 compliance of apps or frameworks that
> would be considered compliant under the original 1.0 spec.

I don't believe this really causes a compliance issue as it is a
requirement on the WSGI server, not the WSGI application and doesn't
cause any existing WSGI applications to break.

It also says 'can' and not 'must' so technically WSGI servers are not
currently obligated to do it as I read it and certainly mod_wsgi
doesn't do it any more because it was causing problems for people.

But then, since it does say 'can' and not 'must' any WSGI server
implementers who know better can just ignore it anyway if it is left
in, and it can be dealt with in any new major revision.

Graham

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-03 Thread Graham Dumpleton

On 4 January 2011 15:43, James Y Knight  wrote:
>
> On Jan 3, 2011, at 10:39 PM, Graham Dumpleton wrote:
>
>> As documented in:
>>
>>  http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html
>>
>> the automatic addition of a Content-Length response header where
>> len(iterable) is 1, can cause wrong output for cases where WSGI
>> application believes that it can itself decide not to return any
>> actual content for a HEAD response, ignoring the fact that there could
>> be output filters which rely on headers or content being exactly the
>> same as for GET.
>>
>> Do we therefore still want to promote the idea that the optimisation
>> is a good idea or even allowed?
>
> I think it would be nice if it was allowed -- it makes simple apps easier. 
> Just because some WSGI applications may be broken w.r.t. HEAD, that doesn't 
> make this optimization undesirable.
>
> However, the current description does leave things a bit ambiguous. Why, for 
> example, does it suggest only adding Content-Length if the length of the 
> iterable is 1? Surely "if type(iterable) in (list, tuple)", the server could 
> also set the Content-Length header to "sum(len(s) for s in iterable)". Is 
> that forbidden, or just not explicitly spelled out as allowed?

It just doesn't mention it. From memory it also doesn't mention what
to do when len(iterable) is 0, which presumably, if such an
optimisation was allowed, would let you set Content-Length to 0.
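
To make the cases concrete, here is a sketch of what a server applying
the broader form of the optimisation might do. The PEP itself only
describes the len() == 1 case; the summing over a list or tuple and the
zero-length case are the extensions being speculated about in this
thread, and the function name is hypothetical.

```python
def guess_content_length(result):
    """Return a Content-Length the server could infer, or None."""
    # Only inspect concrete sequences; consuming a generator to
    # measure it would defeat streaming.
    if isinstance(result, (list, tuple)):
        return sum(len(chunk) for chunk in result)
    return None


one = guess_content_length([b"abc"])           # the case the PEP covers
empty = guess_content_length([])               # would yield 0
many = guess_content_length((b"ab", b"cd"))    # the suggested extension
stream = guess_content_length(iter([b"abc"]))  # unknown length
```

This is shown only to clarify the discussion, not to suggest that
servers should do it.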

> If your app *wants* to special-case HEAD handling so as to avoid generating 
> the body when it doesn't need to, how can it do that correctly/reliably? If 
> you normally return content with hard-to-determine length, and you want the 
> HEAD processing to thus also omit Content-Length (and not, say, have the 
> server decide it should return Content-Length: 0), what do you have to return 
> to ensure this happens?

Which is further support for the WSGI server not making decisions
itself to set Content-Length. But then, if an application isn't going
to generate Content-Length for HEAD, then by rights it can't do it for
the equivalent GET request either, else the response headers would
differ.

Graham

Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)

2011-01-03 Thread Graham Dumpleton

On 4 January 2011 11:43, Guido van Rossum  wrote:
> On Mon, Jan 3, 2011 at 3:13 PM, Jacob Kaplan-Moss  wrote:
>> On Sun, Jan 2, 2011 at 9:21 AM, Guido van Rossum  wrote:
>>> Although [PEP ] is still marked as draft, I personally think of it
>>> as accepted; [...]
>>
>> What does it take to get PEP  formally marked as accepted? Is
>> there anything I can do to push that process forward?
>>
>> The lack of a WSGI answer on Py3 is the main thing that's keeping me,
>> personally, from feeling excited about the platform. Once that's done
>> I can feel comfortable coding to it -- and browbeating those who don't
>> support it.
>>
>> I understand that PEP 444/Web3/WSGI 2/whatever might be a better
>> answer, but it's clearly got some way to go. In the meantime, what's
>> next to get PEP  officially endorsed and accepted?
>
> I haven't heard anyone speak up against it, ever, since it was
> submitted. If no-one speaks up in the next 24 hours consider it
> accepted (and after that delay, anyone with SVN privileges can mark it
> thus).

I note one issue which I have expressed concern over previously. In
section 'Handling the Content-Length Header' it says:

"""
Under some circumstances, however, the server or gateway may be able
to either generate a Content-Length header, or at least avoid the need
to close the client connection. If the application does not call the
write() callable, and returns an iterable whose len() is 1, then the
server can automatically determine Content-Length by taking the length
of the first bytestring yielded by the iterable.
"""

As documented in:

  http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html

the automatic addition of a Content-Length response header where
len(iterable) is 1, can cause wrong output for cases where WSGI
application believes that it can itself decide not to return any
actual content for a HEAD response, ignoring the fact that there could
be output filters which rely on headers or content being exactly the
same as for GET.

Do we therefore still want to promote the idea that the optimisation
is a good idea or even allowed?

Next thing is the reference to Jython in section 'Supporting Older
(<2.2) Versions of Python', which is quite out of date with respect to
the version of Python it supports. Should that be updated? Should that
whole section be removed now?

Finally, I'd still like to see the CGI gateway example be updated to
properly protect stdin/stdout so that people using print() without
redirecting it to stderr don't stuff themselves up. For example:

# Keep a reference to the original stdin. We then replace
# stdin with an empty stream. This is to protect against
# code from accessing sys.stdin directly and consuming the
# request content.

stdin = sys.stdin

sys.stdin = cStringIO.StringIO('')

# Keep a reference to the original stdout. We then replace
# stdout with stderr. This is to protect against code that
# wants to use 'print' to output debugging. If stdout wasn't
# protected, then anything output using 'print' would end up
# being sent as part of the response itself and interfere
# with the operation of the CGI protocol.

stdout = sys.stdout

sys.stdout = sys.stderr

The adapter would then use stdin/stdout local variables and not
sys.stdin/sys.stdout.
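
A Python 3 flavoured sketch of the same idea follows, since cStringIO
does not exist there. The context manager name is hypothetical, and
restoring the streams on exit is an addition made so the sketch can be
run safely; a real CGI script would normally just exit.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def protected_stdio():
    """Hide the real stdin/stdout from application code."""
    stdin, stdout = sys.stdin, sys.stdout
    # Code reading sys.stdin directly now sees an empty stream and
    # cannot consume the request content.
    sys.stdin = io.StringIO("")
    # Stray print() calls now go to stderr rather than into the
    # CGI response.
    sys.stdout = sys.stderr
    try:
        # The adapter uses these saved streams for the real I/O.
        yield stdin, stdout
    finally:
        sys.stdin, sys.stdout = stdin, stdout
```

An adapter would run the application inside "with protected_stdio() as
(stdin, stdout):" and, on Python 3, read the request from stdin.buffer
and write the response to stdout.buffer.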

The CGI adapter in wsgiref should also really be updated to do
something similar. This would solve the portability problem where code
is written for a non-CGI hosting environment and 'print' statements
are left in, which breaks on CGI. Even if people who have previously
copied the CGI gateway from the PEP don't update their
implementations, we would at least have set what is best practice for
the future.

BTW, I am not across how stdin/stdout works on Windows, but is there
an issue with that CGI example working on Windows because of text
stream vs byte stream issues and CRLF translation?

Graham

Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-01 Thread Graham Dumpleton

On 2 January 2011 16:28, Guido van Rossum  wrote:
> On Sat, Jan 1, 2011 at 5:02 PM, Graham Dumpleton
>  wrote:
>> Can we please clear up a matter.
>>
>> GothAlice (don't know off hand their real name), keeps going around
>> and claiming:
>>
>> """
>> After some discussion on the Web-SIG mailing list, PEP 444 is now
>> "officially" WSGI 2, and PEP  is WSGI 1.1
>> """
> [...]
>
> From past posts here, that's Alice Bevan–McGregor
> , added to the thread.
>
> On Sat, Jan 1, 2011 at 8:34 PM, Ian Bicking  wrote:
>> Until the PEP is approved, it's just a suggestion.  So for it to "really" be
>> WSGI 2 it will have to go through at least some approval process; which is
>> kind of ad hoc, but not so ad hoc as just to implicitly happen.  For WSGI 2
>> to happen, someone has to write something up and propose it.  Alice has
>> agreed to do that, working from PEP 444 which several other people have
>> participated in.  Calling it "WSGI 2" instead of "Web 3" was brought up on
>> this list, and the general consensus seemed to be that it made sense -- some
>> people felt a little funny about it, but ultimately it seemed to be
>> something everyone was okay with (with some people like myself feeling
>> strongly it should be "WSGI 2").
>>
>> I'm not sure why you are so stressed out about this?  If you think it's
>> really an issue, perhaps 2 could be replaced with "2alpha" until such time
>> as it is approved?
>
> I'm guessing that Graham is concerned that Alice's assertion implies
> that the PEP is approved.

I certainly don't take it as being approved because I know that PEP
444 is quite incomplete at this time. It is the perceptions others get
when they are being told that PEP 444 is WSGI 2.0 that I am worried
about.

> IOW that the *future* WSGI 2.0 is equal to
> the *current* PEP 444. While we don't know for sure, this is likely
> wrong (at least in some details).
>
> OTOH I agree with Ian that it seems correct to say that PEP 444 (which
> is still under development) is striving to arrive at a consensus for
> what will be named WSGI 2.0. This is not unusual in the world of
> standards -- future standards usually are given names and document IDs
> long before there is agreement on the contents of the standard. In
> this sense Alice's use of "officially" is not incorrect, although out
> of context it could be misunderstood to imply PEP approval. I would
> recommend adding caution about this whenever the equivalency between
> PEP 444 and WSGI 2.0 is mentioned -- perhaps it is enough to state
> that "PEP 444 is the draft for WSGI 2.0".

I'd suggest that that is even too strong. You are giving the
impression that no one else can separately come up with another
proposal, especially one that is significantly different.

> Often people or companies draw premature conclusions from draft
> standards and prepare implementations that comply with the draft
> standard in the hope that the draft won't change before it is set in
> stone, and sometimes such implementations are incorrectly billed as
> compliant with the standard (rather than with a specific version of
> the draft). I don't know if that is what Alice is doing -- an equally
> likely theory is that she's just excited.
>
> I haven't followed the development of PEP  much, so I can't
> comment on how much agreement there is on the draft; Ian's use of
> "alpha" suggests that there's some way to go still.
>
> One clearly factual error in Alice's (quoted) post: PEP  is WSGI
> 1.0.1, not WSGI 1.1. AFAIK there's no such thing as WSGI 1.1 now.
>
> Alice, since you have in the past posted here suggesting you are
> interested in carrying PEP 444 / WSGI 2.0 forward, please acknowledge
> that you understand the concerns raised in this thread.
>
> Graham, I suggest that you don't worry about this issue, and instead
> focus on helping the draft turn into a standard by providing feedback
> on PEP 444.

That may be too late. Alice and I haven't exactly hit it off well.
Using the #webcore IRC channel as the forum to work on it, as Alice
wants, is also totally inadequate. IRC is just not a suitable forum for
discussing this. It is just not possible to fit into a few lines what
takes pages to explain to a level that people can understand. As much
as this mailing list has caused seemingly never-ending discussions in
the past that go nowhere, it is still the appropriate forum for any
discussion.

> Unless there's part of the story you're not telling here.

It should b

Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-01 Thread Graham Dumpleton

On 2 January 2011 15:34, Ian Bicking  wrote:
> Until the PEP is approved, it's just a suggestion.  So for it to "really" be
> WSGI 2 it will have to go through at least some approval process; which is
> kind of ad hoc, but not so ad hoc as just to implicitly happen.  For WSGI 2
> to happen, someone has to write something up and propose it.  Alice has
> agreed to do that, working from PEP 444 which several other people have
> participated in.  Calling it "WSGI 2" instead of "Web 3" was brought up on
> this list, and the general consensus seemed to be that it made sense

Only about 2 or 3 people commented directly on it from what I
recollect. I would hardly say that is consensus. For myself it came at
a really bad time, coming off the back of a one month trip and
finishing up in a job after 8 1/2 years. If I had known that this
would lead to Alice going around and making comments like 'official'
on what is far from being  I would have objected quite strongly at the
time rather than just treating it like yet another one of the
distractions that has kept coming up over the years when dealing with
the WSGI on Python 3 issue.

> -- some
> people felt a little funny about it, but ultimately it seemed to be
> something everyone was okay with (with some people like myself feeling
> strongly it should be "WSGI 2").
>
> I'm not sure why you are so stressed out about this?

You say I am stressed and Alice in private email likes to think I am
bitter. What I am is passionate. The whole WSGI stuff over the last
few years has been handled so badly by the Python web community it is
embarrassing. We can continue this shambles or try and bring some
sanity with this process. As is, what Alice is now working on will
effectively be the 3rd or 4th variation of what WSGI 2.0 should be
depending on how you count it. First off there was PJE's basic
suggestion of dropping start_response and leaving everything else.
Others then pitched in with their wish lists for that. Armin worked on
a proposal for a while and then there was PEP 444 which has mutated
again with what Alice is doing. Some of these have been seen by
outsiders as being what WSGI 2.0 will be and there have at times been
hosting mechanisms or frameworks claiming to support what these
proposals described and calling themselves WSGI 2.0. Luckily those
third party packages claiming to support some form of WSGI 2.0 have
never taken off, but either way the risk of confusion is still there.

> If you think it's
> really an issue, perhaps 2 could be replaced with "2alpha" until such time
> as it is approved?

If it is going to be done under the guise of a continually changing
PEP 444, then refer to it as PEP 444, including the environ tag
prefixes being pep444.

By allowing it to claim in some way that it is going to be WSGI 2.0
you are effectively locking yourself into a course where that can be the
only successor for WSGI 1.0. So, if someone comes up with a much
better solution, they will be forced to call it something completely
different because you are hardly going to be able to go back and say,
sorry, we made a mistake and all you people who genuinely thought you
were coding to what was going to be a WSGI 2.0 are screwed.

Graham

> On Sat, Jan 1, 2011 at 8:02 PM, Graham Dumpleton
>  wrote:
>>
>> Can we please clear up a matter.
>>
>> GothAlice (don't know off hand their real name), keeps going around
>> and claiming:
>>
>> """
>> After some discussion on the Web-SIG mailing list, PEP 444 is now
>> "officially" WSGI 2, and PEP 3333 is WSGI 1.1
>> """
>>
>> In this instance on web.py forum on Google Groups.
>>
>> I have pointed out a couple of times to them that there is no way that
>> PEP 444 has been blessed as being the official WSGI 2.0 but they are
>> not listening and are still repeating this claim. They also can't get
>> right that PEP 3333 clearly says it is still WSGI 1.0 and not WSGI
>> 1.1.
>>
>> If the people here whose opinion matters are quite happy for GothAlice
>> to hijack the WSGI 2.0 moniker for PEP 444 I will shut up. But if that
>> happens, I will voice my objections by simply not having anything to
>> do with WSGI 2.0 any more.
>>
>> Graham
>> ___
>> Web-SIG mailing list
>> Web-SIG@python.org
>> Web SIG: http://www.python.org/sigs/web-sig
>> Unsubscribe:
>> http://mail.python.org/mailman/options/web-sig/ianb%40colorstudy.com
>
>
>
> --
> Ian Bicking  |  http://blog.ianbicking.org
>
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com

Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-01 Thread Graham Dumpleton

On 2 January 2011 12:09, Jonas Galvez  wrote:
> Graham Dumpleton wrote:
>> If the people here whose opinion matters are quite happy for GothAlice
>> to hijack the WSGI 2.0 moniker for PEP 444 I will shut up. But if that
>> happens, I will voice my objections by simply not having anything to
>> do with WSGI 2.0 any more.
>
> Hi Graham, I'm interested in learning what is your motivation for
> objecting that PEP 444 be WSGI 2.0. I'm assuming you must have already
> voiced your criticism on a technical level somewhere by this point.
> Can you point me to it?

Because for all we know right now what will be WSGI 2.0 may look a lot
different to what PEP 444 is now. Already they have taken the original
PEP 444 that was put out by Chris, and which had never actually been
updated based on feedback on the Python WEB-SIG list to address
perceived shortcomings, and started injecting their own ideas on top of
it without any real consultation with those on the WEB-SIG who have
had a lot of experience with all this stuff.

Thus what they are working on is a very fluid specification that keeps
changing, i.e., it is a work in progress, yet the way they talk
about it is as if it already is the official WSGI 2.0 specification when
it is still no more than a bunch of ideas of what could be done. I am
thus mainly objecting at this point on a matter of process and how they
are portraying what PEP 444 is. The PEP 444 by rights should have been
completely withdrawn and marked as rejected. If they want to carry on
and take PEP 444 and turn it into something else, then give it some
other working name, but one where it still isn't labelled as WSGI 2.0.
Once they have fleshed it out sufficiently, it has passed review on
the Python WEB-SIG, and it has been put up as a PEP with some
blessing, only then should it notionally be anointed as WSGI 2.0
if the community wants that. Don't do this and all you do is cause
ongoing confusion in the community as to what WSGI 2.0 is given that
the definition of what it may be keeps changing.

I also have various technical issues with the original WSGI
specification and they aren't being addressed in PEP 444 from what I
have seen so far, as well as having issues with new things in PEP 444.
I have blogged and posted on the WEB-SIG list about a number of them
and am now starting to get back into documenting what some of those
other issues are. Overall though, I believe a big step needs to be
taken back and a fresh look at this stuff needs to be made. It needs to
be cast into the greater context of how we deploy this stuff as well,
otherwise deployment is going to continue to be a PITA with all the
systems using different ways when there could be a better level of
compatibility across them all to make deployment easier.

Graham

[Web-SIG] PEP 444 != WSGI 2.0

2011-01-01 Thread Graham Dumpleton

Can we please clear up a matter.

GothAlice (don't know off hand their real name), keeps going around
and claiming:

"""
After some discussion on the Web-SIG mailing list, PEP 444 is now
"officially" WSGI 2, and PEP 3333 is WSGI 1.1
"""

In this instance on web.py forum on Google Groups.

I have pointed out a couple of times to them that there is no way that
PEP 444 has been blessed as being the official WSGI 2.0 but they are
not listening and are still repeating this claim. They also can't get
right that PEP 3333 clearly says it is still WSGI 1.0 and not WSGI
1.1.

If the people here whose opinion matters are quite happy for GothAlice
to hijack the WSGI 2.0 moniker for PEP 444 I will shut up. But if that
happens, I will voice my objections by simply not having anything to
do with WSGI 2.0 any more.

Graham

Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-21 Thread Graham Dumpleton

On 22 October 2010 11:16, P.J. Eby  wrote:
> At 10:35 AM 10/22/2010 +1100, Graham Dumpleton wrote:
>>
>> Any one care to comment on my blog post?
>>
>>
>>
>> http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html
>>
>> As far as web framework developers commenting, Armin at:
>>
>>
>>
>> http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/
>>
>> has said:
>>
>>  """Hopefully not. WSGI could do better and there is a proposal for
>> that (444)."""
>>
>> So, looks like he is very cool on the idea.
>>
>> No other developers of actual web frameworks have commented at all on
>> PEP 3333 from what I can see.
>>
>> Graham
>
> Just out of curiosity, Graham, isn't PEP 3333 basically only a slight
> modification to what you yourself proposed and implemented in mod_wsgi for
> Python 3?

Correct, it is a bit more strict and changes wsgi.version back to 1.0.
So, except for wsgi.version, Apache/mod_wsgi already technically
conforms to it. I will, however, be stepping a bit outside of the
specification and having non CGI variables in the environment that come
from Apache configuration be treated as UTF-8 + surrogateescape, with a
means to override the encoding. This though is going to be necessary
because of capabilities of Apache and is not an issue with the WSGI
specification.
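
As a rough illustration of what UTF-8 plus surrogateescape means for a
value taken from server configuration, here is a minimal sketch. The
helper names to_native and to_bytes and the sample values are
hypothetical, and this is not mod_wsgi's actual code; it only shows
Python 3's standard surrogateescape error handler doing a lossless
round trip:

```python
# Hypothetical helpers; they only illustrate the codec behaviour.

def to_native(raw, encoding="utf-8"):
    """Decode server-supplied bytes, keeping undecodable bytes as lone surrogates."""
    return raw.decode(encoding, "surrogateescape")


def to_bytes(text, encoding="utf-8"):
    """Reverse to_native(), restoring the original bytes exactly."""
    return text.encode(encoding, "surrogateescape")


valid = "café".encode("utf-8")   # well-formed UTF-8
broken = b"caf\xe9"              # a Latin-1 byte, not valid UTF-8

assert to_native(valid) == "café"
# The bad byte survives as a surrogate code point and can be recovered.
assert to_bytes(to_native(broken)) == broken
```

A value that is not valid UTF-8 therefore does not raise at decode
time, and code that knows better can get the original bytes back and
decode them with a different encoding.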

> My guess is that there's been no comment because there's really not much to
> say about it.  The most controversial thing about it was Python-Dev's
> objection to modifying PEP 333 in place -- and that's the *only* reason why
> it's a new PEP at all.

I am not sure that it is that simple and that it is a done deal. Some
people have been quite passionate about having bytes used in more
places and the near silence from those people has me concerned that
all that is going to happen is that they will ignore PEP 3333 and another
discussion will just erupt again later. It will be quite disappointing
if Armin especially takes that stance and will not support PEP 3333 in
Werkzeug, instead skipping it and seeking out an alternative.

As I say in my blog post, ultimately it may not matter as PEP 3333
continues the WSGI name and so if they don't like it then what they
come up with will have to be under a different name else it will be
too confusing. At this point I can't see a future version based on PEP
444 being called WSGI 2.0, it is just going to be better to be clearly
distinct in name after all that has gone on before.

As such, Apache/mod_wsgi can be made to align with PEP 3333 for Python
3 and if anything else comes along later under a different name, then
Apache/mod_wsgi will simply not implement it since it is specific to
WSGI. If people want to somehow use Apache/mod_wsgi for it, they will
have to use an adapter, if possible. Otherwise people will have to
rely on other hosting mechanisms, be those other existing ones where
the author is prepared to support any new interface and is not so fussy
about naming inconsistencies, or anything new that may come along.

Thus I am just after some acknowledgement from involved parties that
it is accepted that PEP 3333 is the way forward for WSGI for Python 3
and that it isn't just going to be ignored. Without that, as far as I
am concerned we are still in limbo with only a little bit more mandate
than before because of you having created the update as new PEP 3333,
yet with no one pledging to accept it and use it going forward for
major web frameworks.

FWIW, what is the normal process for getting a PEP accepted as being
the way of doing things so people all agree? Note that I haven't been
reading python-dev list, so maybe there was some sort of agreement
already expressed there which means PEP 3333 already has some sort of
mandate and people just have to accept it, don't know.

Graham

[Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-21 Thread Graham Dumpleton

Any one care to comment on my blog post?

  http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html

As far as web framework developers commenting, Armin at:

  
http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/

has said:

  """Hopefully not. WSGI could do better and there is a proposal for
that (444)."""

So, looks like he is very cool on the idea.

No other developers of actual web frameworks have commented at all on
PEP 3333 from what I can see.

Graham

Re: [Web-SIG] WSGI for Python 3

2010-08-29 Thread Graham Dumpleton

On 30 August 2010 13:07, P.J. Eby  wrote:
> At 11:16 AM 8/30/2010 +1000, Graham Dumpleton wrote:
>>
>> Although I almost begged that if we are going to discuss bytes,
>> compared to text/unicode, that agreement at least first be made about
>> the definition of the bytes leaning option, that request has pretty
>> well fallen on deaf ears.
>
> Did you not see my reply?  I (thought I) answered your question, and I
> actually also suggested that a variation of your unicode proposal might
> work, too.  See:
>
> http://mail.python.org/pipermail/web-sig/2010-August/004545.html

I was purely asking about bytes, what that means to people who want to
push that, and set aside the unicode one for the moment.

There have been others as well in the past who have pushed bytes, but
they haven't said anything about what it means and I really wanted
more input given that in the past the discussions had over the unicode
leaning proposals between us core people have been in part derailed by
these people who sit mostly on the sidelines and start shouting 'I
want bytes instead'. So, I want to give those critics their chance to
confirm what they mean by bytes, else we will keep having them pop up
time and time again when we are trying to discuss other stuff. So it
is the lack of response beyond the usual suspects that I am grumpy
about.

Even in what you mention about bytes you are a bit fuzzy. Having value
of wsgi.url_scheme be bytes is reasonable and have no issue with that
given that other URL components will be bytes as well, but when you
yourself mention keys, you are a bit unsure because of the 'b' plague.
So, still no clarity on that point and if people are going to keep
raising bytes, would like that better definition of what they are
talking about.

The only other person who has said anything about bytes is Armin but
all that he really said was 'all bytes only'. This isn't much clearer
than when people have in the past said 'bytes everywhere', but in some
cases didn't actually mean keys. This is why I asked that people cut
and paste the definition I gave and change it to exactly what they
meant, so not having to second guess. FWIW, from separate discussion
understand Armin does mean bytes for keys.

So, was really after that clarity so we can say without confusion that
our starting point from now is that have two overall proposals and
that they be A and B as defined, with possibly even a C and D if need
be, not even using the labels bytes and unicode. We can then discuss
each in isolation as to whether as defined they would work or not.
From that one or more might die, or might mutate further and actually
become closer to the other option but where all are still valid
options. Either way, people up till now have had stuck in their heads
this bytes vs unicode divide when strictly speaking it isn't
necessarily pure bytes vs pure unicode, but merely a number of
different proposals with certain bits in one case using unicode
instead of bytes.

Given that we have dedicated most time to the unicode leaning
solution, would like to go and look properly at the bytes leaning
solutions now. That way we have the definitions and also have done the
analysis and when people come along later and say 'bytes everywhere',
we have something proper to refer back to about it.

Anyway, rather than keep arguing the point, and so as to move forward,
let us perhaps start now with the following definitions and new names to
identify them. We can even go a bit stupid and give each its own code
name so they are in part more memorable. Any next option based on your
suggestions about changing the WHEAT option can be called MAIZE. And
if you are thinking I am going stark raving mad and should be put in a
white jacket and locked up, you could well be right. I am not a happy
camper right now, but that is because of many things besides this WSGI
stuff. :-)

And yes I know about the page that has been just recently put up at:

  http://www.wsgi.org/wsgi/Python_3

From memory when I first read it I wasn't sure that it was
completely accurate, but at least it doesn't now mention mod_python
instead of mod_wsgi which was mighty confusing. We can perhaps merge
the following into that page, i.e., expand the table, and talk more
about the abstract definitions rather than linking it to specific
implementations at this point. We can perhaps then start capturing the
pros and cons against each option in the page rather than losing them
in the email chain.

OPTION : BARLEY

1. The application is passed an instance of a Python dictionary
containing what is referred to as the WSGI environment. All keys in
this dictionary are byte strings.

2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
environment, the value of the variable should be a byte string.

3. For the CGI variables contained in the WSGI environment, th

Re: [Web-SIG] WSGI for Python 3

2010-08-29 Thread Graham Dumpleton

On 30 August 2010 11:02, Ian Bicking  wrote:
> Ugh... why are we back at bytes again?

Because no official decision, by way of a vote or even consensus, has
ever been made, the bytes option never goes away.

The problem with bytes, before one even tries to compare it to
text/unicode option, is that there is no clear description of what is
meant by the bytes option. For all I can see, there are potentially
multiple interpretations of what is meant by bytes.

Although I almost begged that if we are going to discuss bytes,
compared to text/unicode, that agreement at least first be made about
the definition of the bytes leaning option, that request has pretty
well fallen on deaf ears. Thus the discussion yet again is going in the
direction of just dithering with a lot of navel gazing and not much
else.

As I brought up almost two years ago, if we are going to make any
progress on this, we are probably going to have to have a core group of
people nominated who can officially make the decision of what is done
based on a proper vote. This will be the only way there is going to be any
sort of acceptance of a decision. This idea that we can reach a
consensus just isn't working.

Graham

> I don't know of any concrete
> problems with using Latin1 (basically how mod_wsgi works).  It would be nice
> to try out some tricky cases -- cookie parsing, HTTP proxies,
> output-modifying middleware, a few other cases.  But I don't see a reason to
> expect they won't work.  It also doesn't feel particularly *wrong*.  The
> parsed portions of the request and response are mostly ASCII anyway, and the
> exceptions generally require wonky code anyway so a little transcoding isn't
> so bad.
>
> --
> Ian Bicking  |  http://blog.ianbicking.org
>

Re: [Web-SIG] WSGI for Python 3

2010-08-26 Thread Graham Dumpleton

On 27 August 2010 13:45, P.J. Eby  wrote:
> At 01:37 AM 8/27/2010 +0200, Armin Ronacher wrote:
>>
>> Hi,
>>
>> Is there a status update on that now I missed?  Did something decide on
>> bytes for the environment values or are we still unsure about that?
>
> To the extent we're "unsure", I think the holdup is simply that nobody has
> tried doing an all-bytes WSGI implementation -- unless of course you count
> all our Python 2.x experience as experience with an all-bytes
> implementation.  ;-)
>
> (Of course, that experience won't help us with Python 3 stdlib issues.)
>
>
>> At that point I don't care at all about what is decided on as long as
>> something is decided.  Can someone please stand up and just do that? :)
>
> Essentially the problem right now is that unless such a choice is made,
> there's little hope of getting the stdlib issues to be resolved, because we
> can't exactly file bug reports against the stdlib if we don't know what we
> want it to do.  ;-)
>
> My personal inclination is to define WSGI 2 as a bytes-oriented protocol,
> and then encourage people to port to WSGI 2 before moving to Python 3.

Since the major stumbling block, irrespective of other changes, to any
sort of agreement is still bytes vs unicode, and since we have a
reasonably clear definition of what the unicode suggestion is, can we
please as a first step get a definition of what bytes actually implies
so everyone knows what we are talking about. I specifically ask this,
as it isn't clear because people don't explain in detail what they
mean when they are saying 'bytes'.

Going back to my definition #2 in my blog post from a year ago, I had:

1. The application is passed an instance of a Python dictionary
containing what is referred to as the WSGI environment. All keys in
this dictionary are native strings. For CGI variables, all names are
going to be ISO-8859-1 and so where native strings are unicode
strings, that encoding is used for the names of CGI variables.

2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
environment, the value of the variable should be a native string.

3. For the CGI variables contained in the WSGI environment, the values
of the variables are byte strings.

4. The WSGI input stream 'wsgi.input' contained in the WSGI
environment and from which request content is read, should yield byte
strings.

5. The status line specified by the WSGI application must be a byte string.

6. The list of response headers specified by the WSGI application must
contain tuples consisting of two values, where each value is a byte
string.

7. The iterable returned by the application and from which response
content is derived, must yield byte strings.
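
To make the seven points above concrete, here is a minimal sketch of
an application written against that bytes leaning definition. It is
only an illustration under stated assumptions: the call() function is
a toy stand-in for a server, the sample environ is made up, and no
real server is claimed to pass values in exactly this form:

```python
import io


def application(environ, start_response):
    # (1)/(2): keys and wsgi.url_scheme are native strings.
    assert isinstance(environ["wsgi.url_scheme"], str)
    # (3): CGI variable values are byte strings.
    path = environ["PATH_INFO"]
    assert isinstance(path, bytes)
    # (4): wsgi.input yields byte strings.
    body = environ["wsgi.input"].read()
    # (5)/(6): status line and header names/values are byte strings.
    start_response(b"200 OK", [(b"Content-Type", b"text/plain")])
    # (7): the response iterable yields byte strings.
    return [b"path=" + path + b" body=" + body]


def call(app, environ):
    """Toy stand-in for a server, only to exercise the callable."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    captured["body"] = b"".join(app(environ, start_response))
    return captured


environ = {
    "wsgi.url_scheme": "http",
    "REQUEST_METHOD": b"GET",
    "PATH_INFO": b"/caf\xc3\xa9",
    "wsgi.input": io.BytesIO(b"hello"),
}
result = call(application, environ)
```

Under the stricter reading some people appear to want, the keys and
the 'wsgi.url_scheme' value in the environ above would be bytes too.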

The points of disagreement I have seen about this are as follows.

For (1), the keys should also be bytes, including names of 'wsgi.' special keys.

For (2), the value of 'wsgi.url_scheme' should be bytes.

So, do you really want bytes absolutely everywhere, or are keys still
going to be unicode taken as ISO-8859-1?

Note that we are not agreeing to the final solution here, just what
bytes means in contrast to the unicode option, so we know that we are
comparing only two options and not many options because people have
different interpretations of what bytes means.

In contrast, what we generally mean by the unicode option is
definition #3 from my blog post. That being:

1. The application is passed an instance of a Python dictionary
containing what is referred to as the WSGI environment. All keys in
this dictionary are native strings. For CGI variables, all names are
going to be ISO-8859-1 and so where native strings are unicode
strings, that encoding is used for the names of CGI variables.

2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
environment, the value of the variable should be a native string.

3. For the CGI variables contained in the WSGI environment, the values
of the variables are native strings. Where native strings are unicode
strings, ISO-8859-1 encoding would be used such that the original
character data is preserved and as necessary the unicode string can be
converted back to bytes and thence decoded to unicode again using a
different encoding.

4. The WSGI input stream 'wsgi.input' contained in the WSGI
environment and from which request content is read, should yield byte
strings.

5. The status line specified by the WSGI application should be a byte
string. Where native strings are unicode strings, the native string
type can also be returned in which case it would be encoded as
ISO-8859-1.

6. The list of response headers specified by the WSGI application
should contain tuples consisting of two values, where each value is a
byte string. Where native strings are unicode strings, the native
string type can also be returned in which case it would be encoded as
ISO-8859-1.

7. The iterable returned by the application and from which response
content is derived, should yield byte strings. Where native strings
are unicode strings, t

Re: [Web-SIG] WSGI for Python 3

2010-07-20 Thread Graham Dumpleton

On Tuesday, July 20, 2010, Etienne Robillard  wrote:
> Sorry to disagree. I dont think I've misunderstood any comments in this
> thread.
> At least some (encoding) issues seems from happening in Windows.

Can you please then point out which specific issue you are talking about?

The only Windows reference in this discussion that I recollect is my
own reference to it as part of an extended example about the fact that
the server is what ultimately dictates how any characters, including %
encodings, in the SCRIPT_NAME are represented. This is because the
server derives that part of the URL and not the WSGI application. That
underlying issue isn't Windows specific however.
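
The point about the server deriving SCRIPT_NAME can be sketched as
follows. This is a hedged illustration only: both function names are
hypothetical, and the ISO-8859-1 step is an assumption borrowed from
the unicode leaning proposals in this discussion, not something any
specification had settled at this point:

```python
from urllib.parse import unquote_to_bytes


def server_side_script_name(raw_path):
    """Hypothetical server step: percent-decode, then expose a native string."""
    return unquote_to_bytes(raw_path).decode("iso-8859-1")


def app_side_recover(script_name, encoding="utf-8"):
    """What the application can still do: back to bytes, then its own encoding."""
    return script_name.encode("iso-8859-1").decode(encoding)


# The client sent /caf%C3%A9; the application never sees that form.
script_name = server_side_script_name(b"/caf%C3%A9")
recovered = app_side_recover(script_name)
```

The application only ever receives what the server produced, so it
cannot tell, for example, whether a character arrived literally or as
a % escape. All it can do is reverse the final decoding step.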

Graham

> The
> point I
> attempted to made was that WSGI 2 could fix the chicken and egg
> problem. Python 3
> is not a solution but part of the problem, that is why a script could
> be written to
> port WSGI 1 apps to WSGI 2, assuming such a spec exists and stipulates
> how to parse
> http headers in Python 3...
>
> Regards,
>
> Etienne

Re: [Web-SIG] WSGI for Python 3

2010-07-20 Thread Graham Dumpleton

On Tuesday, July 20, 2010, Etienne Robillard  wrote:
> AFAICT, the main difference is that under a
> bytes-only regime, the changes should be more consistent/mechanical, i.e.,
> able to be performed by relatively superficial code inspection.
>
>
>
> The problem in all these discussions is that practically no one has
> been prepared to actually sit down and attempt to migrate any
> significant code over to any of these proposals and Python 3.0.
>
> The only notable attempt is the work Robert Brewer did with CherryPy.
> Ultimately though I don't think the CherryPy case tells us much as it
> simply translates the interface into an internal way of doing things.
> The true litmus test will be the conversion of any framework which
> keeps the WSGI interface exposed, with it being used as a means of
> composing together components to make a stack.
>
> Until someone has done that we have absolutely no evidence one way or
> the other as to what proposal is easier or even viable given potential
> shortcomings, or otherwise, in the Python language and standard
> libraries.
>
> It is a chicken and egg problem though in that I would say practically
> everyone doesn't want to do anything until the WSGI specification has
> been updated as they don't want to waste their time. You can't though
> update the specification without truly knowing whether a particular
> approach will work and to do that you have no choice but to actually
> try it.
>
>
> Hi Graham et al,
>
> One could maybe write a migration app for porting
> WSGI 1 apps to WSGI 2, in the same way 2to3.py was written.
>
> That's how at least I hoped to migrate notmm to Python 3. A switch
> could be used
> also to enable/disable bytes or text-mode only for HTTP headers
> parsing...
>
> Is there no such tools yet ready to slowly start moving ahead with
> WSGI 2 ? I recognize it's a chicken and egg problem but I don't think
> its necessary for framework authors to migrate to Python 3 in an
> attempt to solve mistery encoding
> errors affecting Windows platforms...

The issues are not Windows specific. You are misunderstanding past
comments if you believe that.

The purpose of actually trying it is to work out how viable bytes
everywhere and/or users dealing with % encoding is. If dealing with
bytes everywhere proves to be easy then great, going that way may be
the best idea. If it is a PITA, as some have said dealing with bytes is
in Python 3.0, then we will know rather than it being speculation at
this point.
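
For a feel of what 'bytes everywhere' involves in Python 3, here is a
small sketch of query string handling that never leaves bytes. The
function name parse_query and the sample input are made up for
illustration; it is not taken from any framework, and on its own it
says nothing either way about how painful a real port would be:

```python
def parse_query(raw):
    """Split a raw query string into a dict, staying in bytes throughout."""
    pairs = {}
    for field in raw.split(b"&"):
        if not field:
            continue
        name, _, value = field.partition(b"=")
        pairs[name] = value
    return pairs


query = parse_query(b"name=alice&lang=en")

# Literals need the b prefix, otherwise lookups quietly miss.
assert query[b"name"] == b"alice"
assert query.get("name") is None

# Indexing bytes gives an int, not a one-character string.
assert b"en"[0] == 101

# Mixing the two types fails rather than coercing.
try:
    b"lang=" + "en"
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

These are the kinds of mechanical details that an actual migration of
real code would surface in bulk.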

Graham

> A  easy-to-follow roadmap to WSGI
> 2  and writing
> related development tools should be a more effective way to port
> frameworks (to WSGI 2) and stick with Python 2 if they want so! ;-)
>
> my 2 cents,
>
> E
> --
> Etienne Robillard
> Green Tea Hackers Club
>
> E-mail: e...@gthcfoundation.org
> Work phone: 1 (514) 962-7703
> Website:https://gthc.org/
>
> During times of universal deceit, telling the truth becomes a revolutionary 
> act. -- George Orwell
>
>
>
>

Re: [Web-SIG] WSGI for Python 3

2010-07-19 Thread Graham Dumpleton

On 19 July 2010 03:19, P.J. Eby  wrote:
> At 01:01 PM 7/18/2010 +1000, Graham Dumpleton wrote:
>>
>> This is on the basis that if people are going to have to rewrite their
>> code
>> a fair bit to handle bytes everywhere,
>
> What you mean by "rewrite their code a fair bit", and who is it that you
> think will have to do this?
> Or, more precisely, how is that any different from the text or
> text-and-bytes proposals?

My comments are based on the mood I have got from listening to
discussions here on this list and discussions in other forums and irc
channels. To me there appears to be a tendency towards people thinking
that having bytes everywhere will be harder to deal with than the text
proposal.

> AFAICT, the main difference is that under a
> bytes-only regime, the changes should be more consistent/mechanical, i.e.,
> able to be performed by relatively superficial code inspection.

The problem in all these discussions is that practically no one has
been prepared to actually sit down and attempt to migrate any
significant code over to any of these proposals and Python 3.0.

The only notable attempt is the work Robert Brewer did with CherryPy.
Ultimately though I don't think the CherryPy case tells us much as it
simply translates the interface into an internal way of doing things.
The true litmus test will be the conversion of any framework which
keeps the WSGI interface exposed, with it being used as a means of
composing together components to make a stack.

Until someone has done that we have absolutely no evidence one way or
the other as to which proposal is easier or even viable given potential
shortcomings, or otherwise, in the Python language and standard
libraries.

It is a chicken and egg problem though in that I would say practically
everyone doesn't want to do anything until the WSGI specification has
been updated as they don't want to waste their time. You can't, though,
update the specification without truly knowing whether a particular
approach will work and to do that you have no choice but to actually
try it.

And before you argue that the hosting mechanisms haven't been there to
do that I will point out that mod_wsgi specifically implemented a way
of being able to selectively say whether bytes or text was passed
through. That code for bytes support sat there for six months or more
and there was zero interest expressed to me by anyone in using it as a
basis of some actual attempts at migrating existing code as a test. In
the end it got thrown out due to that lack of interest and due to it
holding up a new release of mod_wsgi.

Distinct from mod_wsgi, it also wouldn't be that hard for interested
people to modify wsgiref to implement the different proposals. I
stress again that no one seems prepared to do that and again even if
it was done, who is then going to try and use it.

Thus we all just sit here on the fence waiting for others to do
something, pushing our particular ideas and occasionally flip flopping
between those ideas as well.

Finally and for the record, I will not be modifying mod_wsgi to change
it in any way now until I see a separate proof of concept WSGI server
and a decent sized framework ported to it. So yes I am going to sit on
the fence as well, but that is because I have been burned in the past
in putting in effort on this only to see it go nowhere. I am not going
to waste my time again like that.

Graham

Re: [Web-SIG] WSGI for Python 3

2010-07-19 Thread Graham Dumpleton

Go back through my blog and read some of the posts there so you have
some of the history. Recent discussions build on some of the stuff
there and I don't think anyone has the time to keep explaining all
this to every new person who comes along.

Graham

On Monday, July 19, 2010, Aaron Watters  wrote:
> I'm still in denial regarding Python 3 generally speaking,
> but it looks like something important is going on here.  Could
> someone summarize the main points (intelligible to a Python 2
> troglodyte)?
>
> thanks in advance,  -- Aaron Watters
>
> ===
> % man less
> less is more.
>

Re: [Web-SIG] WSGI for Python 3

2010-07-17 Thread Graham Dumpleton

On 17 July 2010 22:30,   wrote:
> On Fri, 16 Jul 2010, P.J. Eby wrote:
>
>> At 02:28 PM 7/16/2010 -0500, Ian Bicking wrote:
>> There should be one, and preferably *only* one, obvious way to do it.
>>
>> And given that HTTP is inherently a bunch of bytes, bytes is the one
>> obvious way.
>
> I think this makes sense. The thing which is assembling the WSGI
> environment should do bytes and things further down the stack can
> deal with it as they like. This aligns well with how I like to think
> about such stuff: bytes on the outside, unicode on the inside.
>
> Given that app and frameworks developers can throw whatever keys
> they like back into the environment, they can cope as they like.[1]
>
> What would be horrible is if there need to be multiple coping
> strategies. Better to be able to say, "Oh it doesn't work? Try this
> way to cope: remember it is bytes."
>
> However, unless I'm misreading the thread, the bytes issue isn't
> really the bone of contention.

Actually it still is. There are still two competing camps. Some want
text, some want bytes. The whole discussion started purely on the
basis of progressing the text based proposal. As usual, those wanting
bytes step up and we get two interwoven discussions which, if you don't
know the history, can be hard to follow.

My personal opinion is that if you are going to go bytes everywhere,
then you may as well throw out the complete WSGI specification as it
stands now and fix all the other problems with the specification. This
is on the basis that if people are going to have to rewrite their code
a fair bit to handle bytes everywhere, you may as well structurally
change the WSGI interface API as well to address other problems.

Anyway, it seems to be moot at this point, as some believe that, with
the Python language as it stands plus the state of the stdlib, using
bytes everywhere would be rather unmanageable, which is where ebytes
comes in. Thus bytes everywhere doesn't sound like a short-term
solution and requires changes in Python itself to make it viable.

Graham

> People seem okay with bytes as long
> as specific points of pain are addressed, such as:
>
> * What's my PATH_INFO and SCRIPT_NAME?
> * This server, which hosts, but is not, the WSGI environment builder
>  doesn't play well with this model.
> * Some others I can't remember now.
>
> It seems then that perhaps a way forward is to say: Okay, it's gonna
> be bytes. Now, given that, how do we deal with these other issues,
> which perhaps can be recast and encapsulated to be considered
> orthogonal to the bytes/not-bytes debate.
>
> Because we _know_ that any choice is going to come with costs, but
> as things have dragged on, the lack of choice thus far is starting
> to have as much of a cost as the costs that are wanting to be
> resolved.
>
> [1] I'm not expecting or hoping for porting/migrating to Python 3 to
> be simple/automatic/easy, but perhaps I'm cruel.
> --
> Chris Dent                      http://burningchrome.com/~cdent/
>                              [...]

Re: [Web-SIG] decoding environ

2010-07-17 Thread Graham Dumpleton

On Saturday, July 17, 2010, Antoine Pitrou  wrote:
> Ian Bicking  writes:
>>
>> So... there's been some discussion of WSGI on Python 3 lately.
>> I'm not feeling as pessimistic as some people, I feel like we were close
>> but just didn't *quite* get there.
>> Here's my thoughts:
>> * Everyone agrees keys in the environ should be native strings
>
> I don't know how that relates to WSGI but it should be noted that Python 3.2
> comes with two synchronized views of the environment: os.environ (str -> str
> mapping) and os.environb (bytes -> bytes mapping).
>
> See http://docs.python.org/dev/library/os.html#os.environ
>
> Also, the way os.environ is decoded from bytes values involves the
> "surrogateescape" error handler, which ensures that non-decodeable bytes get
> their own unicode escape sequences, and can get re-encoded losslessly:
>
> http://www.python.org/dev/peps/pep-0383/

Only relevant to the extent it is needed for implementing a CGI/WSGI
bridge. Not relevant to WSGI itself.
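
For anyone writing such a CGI/WSGI bridge, the PEP 383 behaviour quoted
above can be sketched as follows. This is only an illustration: it
assumes UTF-8 as the decoding codec (the real os.environ uses the
platform's filesystem encoding), and the helper names are hypothetical.

```python
def environ_bytes_to_text(raw: bytes) -> str:
    """Decode an environment value, smuggling undecodable bytes
    through as lone surrogate code points (PEP 383)."""
    return raw.decode("utf-8", "surrogateescape")


def environ_text_to_bytes(text: str) -> bytes:
    """Reverse the decoding above, restoring the original bytes."""
    return text.encode("utf-8", "surrogateescape")


# A Latin-1 encoded path is not valid UTF-8, yet it round-trips.
original = b"/caf\xe9"
as_text = environ_bytes_to_text(original)
restored = environ_text_to_bytes(as_text)
```

The bridge can then re-encode the value to bytes and decode it again in
whatever way the WSGI side requires, without having lost information.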

Graham

Re: [Web-SIG] WSGI for Python 3

2010-07-17 Thread Graham Dumpleton

On Saturday, July 17, 2010, Ian Bicking  wrote:
> On Sat, Jul 17, 2010 at 12:38 AM, Graham Dumpleton 
>  wrote:
>
>
> On Friday, July 16, 2010, And Clover  wrote:
>> On 07/14/2010 06:43 AM, Ian Bicking wrote:
>>
>>
>> There's only a couple tricky keys: SCRIPT_NAME, PATH_INFO,
>> and HTTP_COOKIE.
>>
>>
>> (And of those, PATH_INFO is the only one that really matters, in that no-one 
>> really uses non-ASCII script filenames,
>
> FWIW, I had to go to a lot of trouble to allow non ASCII in final
> SCRIPT_NAME in mod_wsgi. Specifically using AddHandler directive in
> Apache means a file system path can make up part of SCRIPT_NAME. I had
> someone who was specifically using Russian in a WSGI script file name
> and because with AddHandler that becomes part of SCRIPT_NAME you had
> to cater for it. Anyway this was more of a Windows issue in having to
> use special file system functions to deal with fact that on Windows
> filesystem paths aren't UTF-8 but something else.
>
> What this does highlight though is that although one can talk about
> passing raw script name through to application, that isn't necessarily
> right as it isn't the application that dictates what encoding may be
> used but the web server which is performing the mapping of that part
> of the original URL path to a potential filesystem resource, or
> alternatively, where file-based configuration is used for the mount point, the
> encoding of the web server configuration file.
>
> This is an Apache-specific issue.  It definitely doesn't apply to 
> paste.httpserver, I doubt CherryPy or wsgiref.  I don't really know how Nginx 
> or other servers work.

The only reason it doesn't apply to paste.httpserver is because it
doesn't have a URL mapping system of its own. That is, you host a
single WSGI application at the root of the server. Any server which
allows hosting of a WSGI application at a sub URL will have such
issues. Specifically, the details of the sub URL are worked out by the
server, be it by mapping a URL to the file system or through matching
to a configuration parameter in a server configuration file.
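
To make the point concrete, here is a minimal sketch of a server-side
mount point. It is not how mod_wsgi or any particular server does it;
the function name mount and its parameters are assumptions for
illustration. The thing to notice is that the prefix string, and hence
the value that ends up in SCRIPT_NAME, comes from server configuration
rather than from the application.

```python
def mount(prefix, app, fallback):
    """Return a WSGI app that hosts `app` under the sub URL `prefix`.

    `prefix` stands in for a value read from server configuration,
    so its encoding is whatever the configuration file used.
    """
    prefix = prefix.rstrip("/")

    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == prefix or path.startswith(prefix + "/"):
            environ = dict(environ)  # do not mutate the caller's dict
            # The server, not the application, decides SCRIPT_NAME.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
            environ["PATH_INFO"] = path[len(prefix):]
            return app(environ, start_response)
        return fallback(environ, start_response)

    return dispatcher
```

A server with no mapping layer at all never runs code like this, which
is why the question does not come up for it.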


Graham

Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Graham Dumpleton

On Saturday, July 17, 2010, Graham Dumpleton  wrote:
> On Saturday, July 17, 2010, Ian Bicking  wrote:
>> On Fri, Jul 16, 2010 at 1:40 PM, P.J. Eby  wrote:
>>
>>
>> At 11:07 AM 7/16/2010 -0500, Ian Bicking wrote:
>>
>> And this doesn't help with Python 3: either we have byte values of 
>> SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I think 
>> bytes will be more awkward to port to than text, and inconsistent with other 
>> WSGI values.
>>
>>
>> OTOH, it has the tremendous advantage of pushing the encoding question onto 
>> the app (or framework) developer...  who's really the only one who can make 
>> the right decision for their particular application.  And personally, I'd 
>> rather have clear boundaries between text and bytes, such that porting (even 
>> if tedious or awkward) is *consistent*, and clear as to when you're 
>> finished, not, "oh, did I check to make sure I converted SCRIPT_NAME and 
>> PATH_INFO...  not just in my app code, but in all the library code I call 
>> *from* my app?"
>>
>> IOW, the bytes/string discussion on Python-dev has kind of led me to realize 
>> that we might just as well make the *entire* stack bytes (incoming and 
>> outgoing headers *and* streams), and rewrite that bit in PEP 333 about using 
>> str on "Python 3000" to say we go with bytes on Python 3+ for everything 
>> that's a str in today's WSGI.
>>
>> This was my first intuition too, until I started thinking in more detail 
>> about the particular values involved.  Some obviously are textish, like 
>> environ['SERVER_NAME'].  Not a very useful value, but definitely text.
>>
>> Basically all the internal strings are textish, so we're left with:
>>
>> wsgi.url_scheme
>> SCRIPT_NAME/PATH_INFO
>> QUERY_STRING
>> HTTP_*, CONTENT_TYPE, CONTENT_LENGTH (headers)
>> response status
>> response headers (name and value)
>>
>> And there's a few things like REMOTE_USER that are kind of in the middle.  
>> Everyone is in agreement that bodies should be bytes.
>>
>> One initial problem is that the Python 3 stdlib handles bytes poorly, so for 
>> instance there's no good way to reconstruct the URL using the stdlib.  That 
>> explains certain tensions, but I think we should ignore that, and in fact 
>> that's what Python-Dev seemed to say pretty clearly.
>>
>> Now, the other keys:
>>
>> wsgi.url_scheme: clearly ASCII
>>
>> SCRIPT_NAME/PATH_INFO: often UTF-8, could be no encoding, could be some old 
>> legacy encoding.
>> raw request path: should be ASCII (non-ASCII should be URL-encoded).  URL 
>> encoding happens at the byte layer, so a server could reasonably URL encode 
>> any non-ASCII characters without imposing any  encoding.
>>
>> QUERY_STRING: should be ASCII, same as raw request path
>>
>> headers: Most are ASCII.  Latin1 is a reasonable fallback and suggested by 
>> the specification.  The spec also implies you have to use the RFC2047 inline 
>> encoding (like =?iso-8859-1?q?some=20text?=), but nothing supports this and 
>> supporting it would probably be a bad idea for security reasons.  The 
>> Atompub spec (reasonably modern) specifically says Title headers should be 
>> encoded with RFC2047 (if they are not ISO-8859-1): 
>> http://tools.ietf.org/html/draft-ietf-atompub-protocol-08#page-17 -- 
>> decoding this kind of encoding at the application layer seems reasonable to 
>> me.
>>
>> cookie header: this specific header can easily have multiple encodings, as 
>> the browser encodes data then treats it as opaque bytes, so a cookie can be 
>> set via UTF-8 one place, Latin1 another, and those coexist in one header.  
>> That is, there is no real encoding and this should be treated as bytes.  
>> (Latin1 is an approximation of bytes... a spotty way to treat bytes, but 
>> entirely workable.)
>>
>> response status: I believe the spec says this must be Latin1/ISO-8859-1.  In 
>> practice it is almost always ASCII, and since it is not user-visible it's 
>> not something that really needs localization.
>>
>> response headers: the spec implies Latin1, in practice the Set-Cookie header 
>> is bytes (since interoperation with wonky legacy systems is not uncommon).  
>> I'm not sure of any other exceptions?
>>
>>
>> So... to me it seems pretty reasonable for HTTP specifically that text can 
>> work.  And it feels weird that, say, environ['SERVER_NAME'] be text and 
>> environ['HTTP_HO

Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Graham Dumpleton

On Saturday, July 17, 2010, Ian Bicking  wrote:
> On Fri, Jul 16, 2010 at 1:40 PM, P.J. Eby  wrote:
>
>
> At 11:07 AM 7/16/2010 -0500, Ian Bicking wrote:
>
> And this doesn't help with Python 3: either we have byte values of 
> SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I think 
> bytes will be more awkward to port to than text, and inconsistent with other 
> WSGI values.
>
>
> OTOH, it has the tremendous advantage of pushing the encoding question onto 
> the app (or framework) developer...  who's really the only one who can make 
> the right decision for their particular application.  And personally, I'd 
> rather have clear boundaries between text and bytes, such that porting (even 
> if tedious or awkward) is *consistent*, and clear as to when you're finished, 
> not, "oh, did I check to make sure I converted SCRIPT_NAME and PATH_INFO...  
> not just in my app code, but in all the library code I call *from* my app?"
>
> IOW, the bytes/string discussion on Python-dev has kind of led me to realize 
> that we might just as well make the *entire* stack bytes (incoming and 
> outgoing headers *and* streams), and rewrite that bit in PEP 333 about using 
> str on "Python 3000" to say we go with bytes on Python 3+ for everything 
> that's a str in today's WSGI.
>
> This was my first intuition too, until I started thinking in more detail 
> about the particular values involved.  Some obviously are textish, like 
> environ['SERVER_NAME'].  Not a very useful value, but definitely text.
>
> Basically all the internal strings are textish, so we're left with:
>
> wsgi.url_scheme
> SCRIPT_NAME/PATH_INFO
> QUERY_STRING
> HTTP_*, CONTENT_TYPE, CONTENT_LENGTH (headers)
> response status
> response headers (name and value)
>
> And there's a few things like REMOTE_USER that are kind of in the middle.  
> Everyone is in agreement that bodies should be bytes.
>
> One initial problem is that the Python 3 stdlib handles bytes poorly, so for 
> instance there's no good way to reconstruct the URL using the stdlib.  That 
> explains certain tensions, but I think we should ignore that, and in fact 
> that's what Python-Dev seemed to say pretty clearly.
>
> Now, the other keys:
>
> wsgi.url_scheme: clearly ASCII
>
> SCRIPT_NAME/PATH_INFO: often UTF-8, could be no encoding, could be some old 
> legacy encoding.
> raw request path: should be ASCII (non-ASCII should be URL-encoded).  URL 
> encoding happens at the byte layer, so a server could reasonably URL encode 
> any non-ASCII characters without imposing any  encoding.
>
> QUERY_STRING: should be ASCII, same as raw request path
>
> headers: Most are ASCII.  Latin1 is a reasonable fallback and suggested by 
> the specification.  The spec also implies you have to use the RFC2047 inline 
> encoding (like =?iso-8859-1?q?some=20text?=), but nothing supports this and 
> supporting it would probably be a bad idea for security reasons.  The Atompub 
> spec (reasonably modern) specifically says Title headers should be encoded 
> with RFC2047 (if they are not ISO-8859-1): 
> http://tools.ietf.org/html/draft-ietf-atompub-protocol-08#page-17 -- decoding 
> this kind of encoding at the application layer seems reasonable to me.
>
> cookie header: this specific header can easily have multiple encodings, as 
> the browser encodes data then treats it as opaque bytes, so a cookie can be 
> set via UTF-8 one place, Latin1 another, and those coexist in one header.  
> That is, there is no real encoding and this should be treated as bytes.  
> (Latin1 is an approximation of bytes... a spotty way to treat bytes, but 
> entirely workable.)
>
> response status: I believe the spec says this must be Latin1/ISO-8859-1.  In 
> practice it is almost always ASCII, and since it is not user-visible it's not 
> something that really needs localization.
>
> response headers: the spec implies Latin1, in practice the Set-Cookie header 
> is bytes (since interoperation with wonky legacy systems is not uncommon).  
> I'm not sure of any other exceptions?
>
>
> So... to me it seems pretty reasonable for HTTP specifically that text can 
> work.  And it feels weird that, say, environ['SERVER_NAME'] be text and 
> environ['HTTP_HOST'] not, and I don't know what environ['REMOTE_ADDR'] should 
> be in that mode.  And it would also be weird if environ['SERVER_NAME'] was 
> bytes.
>
> In the past when we've gotten down to specifics, the only holdup has been 
> SCRIPT_NAME/PATH_INFO, hence my suggestion to eliminate those.

There were a few other weird ones which are though server specific.
For example PATH_TRANSLATED (??). These are ones where again the
server or operating system dictates the encoding due to them having
bits in them deriving from things like filesystem paths and server
configuration files. I laboriously went through all these in an email
last year or earlier.

Same reason why SCRIPT_NAME is really dictated by server and raw value
perhaps should be going through to application.

Graham

Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Graham Dumpleton

On Saturday, July 17, 2010, Gustavo Narea  wrote:
> Hello,
>
> Ian said:
>> Having two ways of expressing the same information will lead to bugs
>> related to which data is canonical.  If an application is using
>> SCRIPT_NAME/PATH_INFO and then updates those values in any way, and
>> wsgi.raw_script_name/wsgi.raw_path_info are present, then there will be
>> weird bugs and code will disagree about which one is correct.  Since %2f
>> can exist in the raw versions, there isn't even a way to chunk the two
>> variables in the same way.
>
> I can't agree more.
>
> I would propose the following, and excuse me in advance if this has already
> been proposed and discarded -- I've tried to follow this topic on the mailing
> list over the past few months, until it becomes an endless discussion.
>
> I think only the raw values should be available. Even if a middleware changes
> them, it must put them with raw values. And because you cannot change those
> values without knowing what encoding the request uses, the character encoding
> *must* be present.
>
> I know that sounds easy but it's not, because browsers don't specify the
> charset in the Content-Type and instead they generate a new request using the
> charset from the previous response. So the charset is unknown to the
> server/gateway and the middleware stack.
>
> So, what we could do is introduce a mandatory variable called, say,
> wsgi.charset, and would be used as follows:

Something like this was proposed before, but it only applied to the
keys that mattered, specifically PATH_INFO and maybe QUERY_STRING,
(the latter of which this discussion has been ignoring and I can't
remember how we worked out before it should be treated). It didn't
cover SCRIPT_NAME because, as I indicated before, the encoding of that is
really dictated by the server and not the application, for the initial
value at least.

The idea was that the server would pass them as Latin 1 and set the
encoding key. If a consumer of it didn't like the encoding it was in,
it would convert it back to bytes and then to what it wants and update
the encoding key to what it used. Thus you had a hint available to
allow reliable transcoding. This proposal didn't get acceptance
either.
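
For readers who did not see that earlier proposal, the transcoding idea
can be sketched like this. The environ key name is hypothetical (the
original proposal's key is not given here), and falling back to the
untouched value on a decode error is a choice made for this example.

```python
# Hypothetical key name; the earlier proposal's actual key is not
# reproduced in this thread.
ENCODING_KEY = "x-example.path_info_encoding"


def transcode_path_info(environ, wanted="utf-8"):
    """Re-decode PATH_INFO from the server's encoding to `wanted`.

    The server is assumed to have decoded the raw bytes as Latin-1
    and recorded that in ENCODING_KEY, so the bytes are recoverable.
    """
    current = environ.get(ENCODING_KEY, "latin-1")
    if current.lower() == wanted.lower():
        return environ["PATH_INFO"]
    raw = environ["PATH_INFO"].encode(current)  # back to original bytes
    try:
        text = raw.decode(wanted)
    except UnicodeDecodeError:
        # Not valid in the wanted encoding: leave value and hint alone.
        return environ["PATH_INFO"]
    environ["PATH_INFO"] = text
    environ[ENCODING_KEY] = wanted  # tell later consumers what was used
    return text
```

Because the hint is updated along with the value, a component further
down the stack can always get back to the original bytes.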

Graham

>  - It MUST be set by the server or gateway on every request.
>  - Every middleware or application that reads or writes these values MUST use
> the charset specified in wsgi.charset.
>  - If a server, gateway, middleware or application wants to change the charset
> and it is possible*, it MUST convert the *entire* request into that charset
> and update wsgi.charset accordingly.
>  - When the charset is not specified in the HTTP request, UTF-8 MUST be
> assumed by the server/gateway. Unless another default charset has been
> specified by the user.
>
> I think/hope that will solve all the problems.
>
> What happens when a WSGI application is actually made up two WSGI applications
> and they send the responses in different charsets? If it's not possible to
> configure them so that they both use the same charsets, then one of them would
> have to be wrapped by a middleware which:
>  - On egress, converts the responses using the charset used by the other
> application.
>  - On ingress, if the charset is not specified in the request, it will assume
> it's the one used by the other application, and thus it will convert the
> request using the charset supported by the wrapped application.
>
> It would look like this:
> ===
> def application(environ, start_response):
>     if environ.get("PATH_INFO", "").startswith("/trac/"):
>         # Say Trac only supports Latin-1 and we want responses to use UTF-8:
>         app = trac.web.main.dispatch_request
>         app = CharsetNormalizer(app, response="latin-1", request="utf8")
>     else:
>         # myapp uses UTF-8
>         app = myapp
>     return app(environ, start_response)
> ===
>
> Then there's the string vs bytes issue. Bytes would be the natural choice to
> represent these raw values, but it would probably cause more trouble than they
> solve. So, I think they should be strings that contain the ASCII raw
> encoded values (i.e., str on both versions of Python).
>
> What do you think about this? Again, sorry if this has been discarded before!
> :)
>
> * For example, you can always convert Latin-1 to UTF-8, but not every UTF-8
> string can be converted to Latin-1.
> --
> Gustavo Narea .
> | Tech blog: =Gustavo/(+blog)/tech  ~  About me: =Gustavo/about |

Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Graham Dumpleton

On Saturday, July 17, 2010, Ian Bicking  wrote:
> On Fri, Jul 16, 2010 at 12:28 PM, Chris McDonough  wrote:
>
>
> On Fri, 2010-07-16 at 11:07 -0500, Ian Bicking wrote:
>
>> And this doesn't help with Python 3: either we have byte values of
>> SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I
>> think bytes will be more awkward to port to than text, and
>> inconsistent with other WSGI values.  If we have text then we have to
>> choose an encoding.  Latin1 will work, but it will be the exact wrong
>> encoding most of the time as UTF-8 is the typical  (unlike other
>> headers, where Latin1 will mostly be an okay encoding, or as good a
>> guess as we have).  If we firmly remove these keys then we can avoid
>> this choice entirely... and we conveniently also get a better
>> representation of the request.
>
> My $.02: I'd rather lobby the core folks for a string ABC (which we can
> hook with a stringlike bytes type) and consider all 3.X releases made so
> far "dead to WSGI" than to have to tunnel arbitrary bytes through some
> misleading Unicode encoding.
>
> While I think it would be generally useful, it's also a long way off at best, 
> with serious performance dangers that could torpedo the whole thing.  But... 
> I'm also unsure how it would help here, except perhaps we could incrementally 
> annotate bytes with an encoding?  Well, I don't really know.  Treating the 
> raw request path as text is easy enough, as it should always be ASCII 
> anyway.  We don't have to worry what is "right" or "wrong" in this case.
>
> We could make everything bytes and be done with it, but it would make it much 
> harder to port Python 2 WSGI code to Python

FWIW, I see the whole ebytes discussion as relevant only if you were to
make absolutely everything bytes. We don't really need it otherwise.

Graham

Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Graham Dumpleton

On Saturday, July 17, 2010, Ian Bicking  wrote:
> On Fri, Jul 16, 2010 at 4:33 AM, And Clover  wrote:
>
>
> On 07/14/2010 06:43 AM, Ian Bicking wrote:
>
>
> There's only a couple tricky keys: SCRIPT_NAME, PATH_INFO,
> and HTTP_COOKIE.
>
>
>
> (And of those, PATH_INFO is the only one that really matters, in that no-one 
> really uses non-ASCII script filenames, and non-ASCII characters in 
> Cookie/Set-Cookie are still handled so differently/brokenly across browsers 
> that you can't rely on them at all.)
>
>
>
>
> * I (re)propose we eliminate SCRIPT_NAME and PATH_INFO and replace them
> exclusively with encoded versions
>
>
>
> For compatibility with existing apps, how about keeping the existing 
> SCRIPT_NAME and PATH_INFO as-is (with all their problems), and specifying 
> that the new 'raw' versions (whatever they are called) are added only if they 
> really are raw, not reconstructed.
>
> Having two ways of expressing the same information will lead to bugs related 
> to which data is canonical.  If an application is using SCRIPT_NAME/PATH_INFO 
> and then updates those values in any way, and 
> wsgi.raw_script_name/wsgi.raw_path_info are present, then there will be weird 
> bugs and code will disagree about which one is correct.  Since %2f can exist 
> in the raw versions, there isn't even a way to chunk the two variables in the 
> same way.
>
>
> Then existing scripts that don't care about non-ASCII and slashes can carry 
> on as before, and for apps that do care about them, they'll be able to be 
> *sure* the input is correct. Or they can fall back to PATH_INFO when not 
> present, and avoid producing these kind of URLs in response.
>
> I don't think it works to imagine you can just not care about non-ASCII.  
> Requests come in.  WSGI should represent those requests.  If a request comes 
> in with non-ASCII bytes then WSGI needs to do *something* with it.  I don't 
> want to have to configure servers with application policy; servers should 
> just work.
>
> And this doesn't help with Python 3: either we have byte values of 
> SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I think bytes 
> will be more awkward to port to than text, and inconsistent with other WSGI 
> values.  If we have text then we have to choose an encoding.  Latin1 will 
> work, but it will be the exact wrong encoding most of the time as UTF-8 is 
> the typical  (unlike other headers, where Latin1 will mostly be an okay 
> encoding, or as good a guess as we have).  If we firmly remove these keys 
> then we can avoid this choice entirely... and we conveniently also get a 
> better representation of the request.

One reason I don't want to see the existing keys removed is for
debugging purposes. In Apache, various modules such as mod_rewrite
operate on the translated path. I am concerned that if only the raw
one is available in the WSGI application then confusion may arise
when something doesn't go right with rewrites, because the only
information an application can dump for debugging will be different
from what other Apache modules operate on. If you aren't going to make
use of the CGI versions, then I would still like to see them present
but perhaps renamed. That way you don't lose information when it comes
to trying to debug stuff. I could perhaps just put this in an
Apache/mod_wsgi specific key as well, given that the issue is
particular to it. Thus we might have apache.path_info or cgi.path_info.
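
A small sketch of why having both forms helps when debugging. It
assumes a server that exposes the raw path under wsgi.raw_path_info
(the name used earlier in this thread) and the decoded form under
cgi.path_info; the helper add_path_keys is hypothetical and the decoding
shown is only an approximation of what a real server does.

```python
from urllib.parse import unquote


def add_path_keys(environ, raw_path):
    """Store both the raw and the decoded path in the environ."""
    # Raw form, exactly as received on the request line (still %-encoded).
    environ["wsgi.raw_path_info"] = raw_path
    # Decoded form, roughly what a CGI-style server would expose.
    # Latin-1 is used so that every byte maps to exactly one character.
    environ["cgi.path_info"] = unquote(raw_path, encoding="latin-1")
    return environ


# Two different requests that look identical once decoded.
with_encoded_slash = add_path_keys({}, "/a%2Fb/c")
with_plain_slash = add_path_keys({}, "/a/b/c")
```

With only the raw value available, an application log would show
/a%2Fb/c while a module working on the translated path sees /a/b/c,
which is the mismatch described above.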

Graham

> Note that libraries can smooth over this change; WebOb for instance will 
> certainly still support req.script_name/req.path_info by decoding the raw 
> values.  Admittedly lots of code use these values directly... but at least if 
> they get a KeyError the port/fix will be obvious (as opposed to out of sync 
> values, which will only emerge as a problem occasionally -- I'd rather not 
> invite more occasional bugs).
>
> --
> Ian Bicking  |  http://blog.ianbicking.org
>

Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Graham Dumpleton

On Friday, July 16, 2010, And Clover  wrote:
> On 07/14/2010 06:43 AM, Ian Bicking wrote:
>
>
> There's only a couple tricky keys: SCRIPT_NAME, PATH_INFO,
> and HTTP_COOKIE.
>
>
> (And of those, PATH_INFO is the only one that really matters, in that no-one 
> really uses non-ASCII script filenames,

FWIW, I had to go to a lot of trouble to allow non ASCII in final
SCRIPT_NAME in mod_wsgi. Specifically using AddHandler directive in
Apache means a file system path can make up part of SCRIPT_NAME. I had
someone who was specifically using Russian in a WSGI script file name
and because with AddHandler that becomes part of SCRIPT_NAME you had
to cater for it. Anyway this was more of a Windows issue in having to
use special file system functions to deal with fact that on Windows
filesystem paths aren't UTF-8 but something else.

What this does highlight though is that although one can talk about
passing raw script name through to application, that isn't necessarily
right as it isn't the application that dictates what encoding may be
used but the web server which is performing the mapping of that part
of the original URL path to a potential filesystem resource, or
alternatively, where file-based configuration is used for the mount point, the
encoding of the web server configuration file.

We touched on all of this before in prior discussions, thus original
raw value is only relevant in PATH_INFO and not SCRIPT_NAME as in the
case of the latter it is the web server that dictates the charset
based on configuration file encoding or file system encoding.

Graham