Re: [Web-SIG] [Python-Dev] wsgi validator with asynchronous handlers/servers
On 24/03/2013 06:14, PJ Eby wrote:
> [...]
>> Thanks for response PJ,
>> that is what I, unfortunately, didn't want to hear, the validator being
>> correct for the "spec" means I can't use it for my asynchronous stuff, which
>> is a shame :-(((
>> But why commit to send headers when you may not know about your response?
>> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
>> along.
>
> Because async was added as an afterthought to WSGI about nine years
> ago, and we didn't get it right, but it long ago was too late to do
> anything about it. A properly async WSGI implementation will probably
> have to wait for Tulip (Guido's project to bring a standard async
> programming API to Python).

Do you really need a standard async programming API to design and implement an async WSGI specification? I don't think it is needed.

Some time ago I posted a sample implementation and documentation for a very simple async extension for WSGI:
https://bitbucket.org/mperillo/txwsgi

An interesting example of how an async API can be designed is PostgreSQL's libpq, where the API exposes a direct interface to the protocol state machine (PQconsumeInput), so you can not only use it with any async framework you like, but you can also use it in blocking mode.

As far as I know, this is impossible with the network protocol implementations in Twisted or other async frameworks.

Regards
Manlio
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
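The libpq design point in the message above can be sketched in a few lines of Python. This is only an illustration of the idea (a protocol state machine that never touches a socket, so the caller decides how the bytes arrive); `LineParser`, `consume_input` and `get_result` are hypothetical names chosen to echo libpq's PQconsumeInput/PQgetResult, and the toy line protocol is an assumption, not anything from libpq or txwsgi.

```python
class LineParser:
    """Toy protocol state machine: it never does I/O, the caller feeds it bytes."""

    def __init__(self):
        self._buffer = b""   # bytes of the line that is still incomplete
        self._lines = []     # complete lines waiting to be collected

    def consume_input(self, data):
        # Append whatever the caller read (blocking or not) and split off
        # every complete line; the last element is the unfinished remainder.
        self._buffer += data
        *complete, self._buffer = self._buffer.split(b"\n")
        self._lines.extend(complete)

    def get_result(self):
        """Return the next complete line, or None if more input is needed."""
        return self._lines.pop(0) if self._lines else None


parser = LineParser()
parser.consume_input(b"hel")
first = parser.get_result()       # not enough data yet
parser.consume_input(b"lo\nwor")
second = parser.get_result()      # one full line is now available
third = parser.get_result()       # "wor" is still incomplete
```

Because the parser only sees bytes, the same object could be driven by a blocking `recv()` loop or by an event-loop callback, which is the property attributed to libpq above.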
Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications
On 23/09/2010 18:32, P.J. Eby wrote:
> Just a reminder: I'm planning to actually update PEP 333 over the
> weekend and start working on wsgiref updates, so if you have any
> last-minute comments on the proposal, now's the time to post them,
> however unpolished they may be!
>

Where can I find a draft of the update?

Thanks
Manlio
Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension
Ludvig Ericson wrote:

I have put web-sig in Cc.

> On 11 apr 2010, at 22:07, Manlio Perillo wrote:
>
>> I here propose the x-wsgiorg.suspend to be accepted as official WSGI
>> extension, using the wsgiorg namespace.
>
> I'm sorry, but I don't see how such a solution wins out over any other stab
> at event-based concurrency (like gevent, eventlet, etc.)
>
> I've made a WSGI application using gevent, and then gunicorn's gevent arbiter
> thing. Works like a charm.
>

Because eventlet, gevent and friends work *because* they have full control over the event loop, and they can use greenlets as they like.

This is not possible with implementations like txwsgi (Twisted) and ngx_http_wsgi_module (Nginx).

eventlet has support for Twisted but, as far as I can tell, it works by running the Twisted event loop inside a greenlet. This is of course impossible with ngx_http_wsgi_module, since it is embedded in a web server written in C.

> I get the point in trying to standardize something, but this solution seems
> rather intrusive and not something I'd adopt any time soon.
>

Can you suggest a less intrusive extension that works with *every* WSGI implementation?

> Nice work though!
>

Regards
Manlio
Re: [Web-SIG] Draft PEP: WSGI 1.1
And Clover wrote:
> [...]
>> 8. The value passed to the 'write()' callback returned by
>>    'start_response()' should be a byte string. Where native strings
>>    are unicode strings, a native string type can also be supplied, in
>>    which case it would be encoded as ISO-8859-1.
>
> Weren't we going to only allow US-ASCII for the output? (These threads
> are always so far apart I can never remember what conclusion we
> reached... if any.)
>

By the way, yesterday I wrote some tests for Python 3.x and found a possible problem (only indirectly related to WSGI, however).

The example consists of a simple client -> proxy -> server chain, where the client and server run on Python 2.5 and the proxy on Python 3.2 (compiled from tip, some time ago).

Here is the proxy:
http://paste.pocoo.org/show/202212/

The application fails if a cookie contains a non-ASCII character. The reason is that, for reasons I do not understand, http.client encodes request headers using us-ascii instead of iso-8859-1.

The offending code is:
http://hg.python.org/cpython/file/7dcb7a2fb54d/Lib/http/client.py#l912

Regards
Manlio
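The encoding difference described above can be reproduced without any HTTP machinery. The following is a minimal sketch, not the http.client code path itself; `encode_header` and the sample cookie value are assumptions made for illustration.

```python
def encode_header(value, encoding):
    """Encode a header value; return None if the codec cannot represent it."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical cookie value containing one non-ASCII character (U+00E8).
cookie = "name=caff\u00e8"

ascii_result = encode_header(cookie, "us-ascii")      # fails: not ASCII
latin1_result = encode_header(cookie, "iso-8859-1")   # one byte per character
round_trip = latin1_result.decode("iso-8859-1")       # original text recovered
```

Every code point up to U+00FF maps to exactly one byte in ISO-8859-1, which is why that codec preserves header bytes that us-ascii rejects.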
Re: [Web-SIG] Draft PEP: WSGI 1.1
Dirkjan Ochtman wrote:
> Mostly taking Graham's list of issues and incorporating it into PEP 333.
>
> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt
>
> Let's have comments here (comments in the form of diffs are
> particularly welcome, of course). Remember, the idea is not to change
> or improve WSGI right now, but only to improve the spec, improving
> interoperability and enabling Python 3 support.
>
> [...]

Another comment.

The run_with_cgi sample function should be changed, since it probably does not work correctly on Python 3.x.

I'm not sure, since sys.stdout.write accepts a native string; however, how that string is encoded is platform specific (with the current text of WSGI 1.1, however, it seems this is allowed).

I would like to do some tests with CGI, Python 3.2, IIS and Windows.

Regards
Manlio
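One way to sidestep the platform-specific encoding mentioned above is to write bytes to a binary stream and encode the header lines explicitly. The sketch below is not the PEP's run_with_cgi function; `write_cgi_response` is a hypothetical helper, and the choice of ISO-8859-1 for header lines is an assumption taken from the draft text under discussion.

```python
import io


def write_cgi_response(out, status, headers, body_chunks):
    """Write a CGI-style response to a *binary* stream."""
    # Header lines are native strings, encoded explicitly so the result
    # does not depend on the platform's default stdout encoding.
    out.write(("Status: %s\r\n" % status).encode("iso-8859-1"))
    for name, value in headers:
        out.write(("%s: %s\r\n" % (name, value)).encode("iso-8859-1"))
    out.write(b"\r\n")
    # Body chunks are already byte strings and are passed through untouched.
    for chunk in body_chunks:
        out.write(chunk)
    out.flush()


# An in-memory stream stands in for the real output here.
demo = io.BytesIO()
write_cgi_response(demo, "200 OK", [("Content-Type", "text/plain")], [b"hello\n"])
demo_output = demo.getvalue()
```

In an actual CGI script on Python 3 the stream passed in would presumably be `sys.stdout.buffer`, the binary layer underneath `sys.stdout`.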
Re: [Web-SIG] Draft PEP: WSGI 1.1
Dirkjan Ochtman ha scritto: > [...] > --- pep-0333.txt 2010-04-15 14:46:02.0 +0200 > +++ wsgi-1.1.txt 2010-04-15 14:51:39.0 +0200 > @@ -1,114 +1,124 @@ > [...] > Abstract > > > [...] > -Thus, simplicity of implementation on *both* the server and framework > -sides of the interface is absolutely critical to the utility of the > -WSGI interface, and is therefore the principal criterion for any > -design decisions. > - > -Note, however, that simplicity of implementation for a framework > -author is not the same thing as ease of use for a web application > -author. WSGI presents an absolutely "no frills" interface to the > -framework author, because bells and whistles like response objects and > -cookie handling would just get in the way of existing frameworks' > -handling of these issues. Again, the goal of WSGI is to facilitate > -easy interconnection of existing servers and applications or > -frameworks, not to create a new web framework. > - This, and the rest of the abstract, should not entirely be removed, IMHO. > [...] > - > -Finally, it should be mentioned that the current version of WSGI > -does not prescribe any particular mechanism for "deploying" an > -application for use with a web server or server gateway. At the > -present time, this is necessarily implementation-defined by the > -server or gateway. After a sufficient number of servers and > -frameworks have implemented WSGI to provide field experience with > -varying deployment requirements, it may make sense to create > -another PEP, describing a deployment standard for WSGI servers and > -application frameworks. This should not be removed. > [...] > + > +Differences with WSGI 1.0 > += > + > +Descriptive changes > +--- > + > +The following changes were made to realign the spec with > +implementations 'in the wild'. > + This text feels wrong, to me, > +1. The 'readline()' function of 'wsgi.input' must optionally take > + a size hint. 
This is required because many applications use > + cgi.FieldStorage, which uses this functionality. > + What values are supported for size? Are values -1 and None supported? > [...] > +3. Any WSGI application or middleware should not itself return, or > + consume from a wrapped WSGI component This is not very clear. What is the meaning of "consume from a wrapped WSGI component"? > , more data than specified by > + the Content-Length response header if defined. Middleware that > + does this is arguably broken and can generate incorrect data. > + This is just a clarification of obligations. > + > [...] > + > +String handling changes > +--- > + > +The following changes were made to make WSGI work on Python 3.x. > + > +1. The application is passed an instance of a Python dictionary > + containing what is referred to as the WSGI environment. All keys > + in this dictionary are native strings. For CGI variables, all names > + are going to be ISO-8859-1 "going to be ISO-8859-1" should be expressed in more precise terms. Moreover, you should probably define first what a "native string" is, or you shoudl add a note that it is defined later in the document. > and so where native strings are > + unicode strings, that encoding is used for the names of CGI > + variables. > + > +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI > + environment, the value of the variable should be a native string. > + > +3. For the CGI variables contained in the WSGI environment, the values > + of the variables are native strings. Where native strings are > + unicode strings, ISO-8859-1 encoding would be used such that the What is the precise meaning of *would*, here? > + original character data is preserved and as necessary the unicode > + string can be converted back to bytes and thence decoded to unicode > + again using a different encoding. > + > +4. 
The WSGI input stream 'wsgi.input' contained in the WSGI environment
> +   and from which request content is read, should yield byte strings.
> +

"yield" should be replaced with "return". And, again, why are you using *should* here? Is an implementation allowed to return a native string?

See my previous comment about "native string", which also applies to the use of "byte string" here.

> [...]
> @@ -575,13 +602,14 @@
>  = ===
>  Variable Value
>  = ===
> -``wsgi.version`` The tuple ``(1,0)``, representing WSGI
> +``wsgi.version`` The tuple ``(1, 0)``, representing WSGI
>  version 1.0.
>

Should be (1, 1), not (1, 0).

> [...]
>
> -Proposed/Under Discussion
> -=
> -

I see no real reason for removing this section.

> [...]

Moreover, should the section "Supporting Older (<2.2) Versions of Python" be removed?

> -
>  Acknowledgements
>  =
Re: [Web-SIG] WSGI and start_response
Dirkjan Ochtman wrote:
> [...]
>> Such a significant change as removing the requirement for write()
>> should also not be done within a minor version of the WSGI
>> specification because anything that works with WSGI 1.0 should still
>> work with WSGI 1.1 and vice versa. On that basis it can't really be
>> entertained until WSGI 2.0 where incompatible changes would be
>> allowed.
>
> I think it's a good idea to consider for 2.0, certainly.
>

Ehm, the purpose of WSGI 2.0 is precisely to remove start_response, and the write callable with it...

Manlio
Re: [Web-SIG] WSGI and start_response
Dirkjan Ochtman ha scritto: > On Tue, Apr 13, 2010 at 14:46, Graham Dumpleton > wrote: >> The last attempt was to have WSGI 1.1 as clarifications and Python 3.X. >> >> And when I say 'last attempt', yes there have been people who have >> stepped up to try and get this to happen in the past. I think you >> would be the 3rd time, excluding me in general having tried to push it >> in the past and also given up. >> >> You really should perhaps look back through the archive of WEB-SIG >> posts on Google Groups to understand the history and how this always >> seems to just go around in circles. :-) > > I've been on Web-SIG for quite a while now, exactly to keep track of > these issues. > > Since there doesn't seem to be much traction, I figured it would be > time to just get a new PEP together. To limit the amount of work, I'd > go in the direction of having a single WSGI 2.0 PEP incorporating your > suggestions (maybe minus the number 3), everything required for Python > 3 (as outlined by your wiki page). > If you volunteer for this task, I have some suggestions: * target WSGI 1.1, not WSGI 2.0 * take the original WSGI 1.0 spec text * start to integrate all changes documented by Graham * I would really like to have changes integrates as a series of diff, using and HTML elements. Unfortunately docutils seems to not have support for this, but should not be hard to implement. I can help. * You should keep a separate, unofficial document, with the rationale of the changes. You can just copy the content of Graham blog post, and reformatting it, if this is ok for Graham * For each of the main changea, start a thread on this mailing list asking for votation. If, after 1 week, there is no vote against it, consider it approved If we are really going to approve WSGI 1.1, I have a request: remove the ``write`` callable. 
Rationale:

* it was added in WSGI 1.0 only for compatibility
* new code does not use it
* this will force applications under development that still use the ``write`` callable to be fixed. See the work on Mercurial
* it is very easy for current implementations to support both WSGI 1.0 and WSGI 1.1
* legacy applications will continue to work
* removing the ``write`` callable will make middleware easier to write

Thanks
Manlio
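For readers unfamiliar with the two styles being compared in the rationale above, here is a minimal sketch of the same response produced through the ``write`` callable and through the returned iterable. The `run` harness is a hypothetical stand-in for a server, written only for this illustration; it is not wsgiref and it skips most of what a real server must do.

```python
def app_with_write(environ, start_response):
    # Legacy style: push data through the callable returned by start_response.
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"hello ")
    write(b"world")
    return []


def app_with_iterable(environ, start_response):
    # Preferred style: the body is simply the returned iterable.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]


def run(app):
    """Toy harness: call the app and collect every body chunk in order."""
    chunks = []

    def start_response(status, headers, exc_info=None):
        return chunks.append  # serves as the write() callable

    result = app({}, start_response)
    try:
        chunks.extend(result)
    finally:
        if hasattr(result, "close"):
            result.close()
    return b"".join(chunks)


body_from_write = run(app_with_write)
body_from_iterable = run(app_with_iterable)
```

Middleware that wraps an application has to intercept both paths today (the iterable and the ``write`` callable), which is the extra work the last bullet refers to.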
Re: [Web-SIG] WSGI and start_response
Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 13:13, Graham Dumpleton
> wrote:
>> There is no such thing as a WSGI 2.0 PEP and there is no proper
>> consensus either on what it should look like. Thus if you see anything
>> claiming to implement WSGI 2.0, then it isn't and you should only view
>> it as an experimental proposal. You are warned. :-)
>
> Do you (or someone else) have a status on where WSGI 2 is? IIRC WSGI 1
> isn't really usable with Python 3.x, so it seems about time we get
> something going again (AIUI this is blocking Werkzeug from being
> ported to 3.x, for example).
>

WSGI 2.0 ideas are here:
http://wsgi.org/wsgi/WSGI_2.0
but they do not include support for Python 3.x.

Some corrections to WSGI 1.0 are here:
http://wsgi.org/wsgi/Amendments_1.0

You may add support for Python 3.x to an existing WSGI 1.0 implementation, but your implementation will end up as something that is no longer WSGI 1.0.

Manlio
Re: [Web-SIG] WSGI and start_response
P.J. Eby ha scritto: > At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote: >> Suppose I have an HTML template file, and I want to use a sub request. >> >> ... >> ${subrequest('/header/'} >> ... >> >> The problem with this code is that, since Mako will buffer all generated >> content, the result response body will contain incorrect data. >> >> It will first contain the response body generated by the sub request, >> then the content generated from the Mako template (XXX I have not >> checked this, but I think it is how it works). > > Okay, I'm confused even more now. It seems to me like what you've just > described is something that's fundamentally broken, even if you're not > using WSGI at all. > If you are referring to Mako being turned in a generator, yes, this implementation is rather obscure. I wrote it as a proof of concept. Before this, I wrote a more polite implementation: http://paste.pocoo.org/show/201324/ > >> So, when executing a sub request, it is necessary to flush (that is, >> send to Nginx, in my case) the content generated from the template >> before the sub request is done. > > This seems to only makes sense if you're saying that the subrequest *has > to* send its output directly to the client, rather than to the parent > request. Yes, this is how subrequests work in Nginx. And I assume the same is true for Apache. > If the subrequest sends its output to the parent request (as a > sane implementation would), then there is no problem. You are forgetting that Nginx is not an application server. Why should the subrequest output returned to the parent? This would only make it less efficient. > Likewise, if the > subrequest is sent to a buffer that's then inserted into the parent > invocation. > > Anything else seems utterly insane to me, unless you're basically taking > a bunch of legacy CGI code using 'print' statements and hacking it into > something else. (Which is still insane, just differently. 
;-) ) > We are talking about subrequest implementation in a efficient web server written in C, like Nginx and Apache. > >> Ah, you are right sorry. >> But this is not required for the Mako example (I was focusing on that >> example). > > As far as I can tell, that example is horribly wrong. ;-) > I agree ;-) > >> But when using the greenlet middleware, and when using the function for >> flushing Mako buffer, some data will be yielded *before* the application >> returns and status and headers are passed to Nginx. > > And that's probably because sharing a single output channel between the > parent and child requests is a bad idea. ;-) > No, this is not specific to subrequests. As an example, here you can find an up to date greenlet adapters: http://bitbucket.org/mperillo/txwsgi/src/tip/txwsgi/greenlet.py The ``write_adapter`` **needs** to yield some data before WSGI application return, because this is how the write callable workd. The exposed ``gsuspend`` function, instead, will cause an empty string to be yielded to the server, before the WSGI application returns. > (Specifically, it's an increase in "temporal coupling", I believe. I > know it's some kind of coupling between functions that's considered bad, > I just don't remember if that's the correct name for it.) > Nginx code contains some coupling; I assume this is done because it was designed with efficiency in mind. > [...] > It's true that dropping start_response() means you can't yield empty > strings prior to determining your headers, yes. > > >> > - yielding is for server push or >> > sending blocks of large files, not tiny strings. >> >> Again, consider the use of sub requests. >> yielding a "not large" block is the only choice you have. > > No, it isn't. You can buffer your output and yield empty strings until > you're ready to flush. > As I wrote, this will not work if you want to use subrequest support from Nginx. 
> > >> Unless, of course, you implement sub request support in pure Python (or >> using SSI - Server Side Include). > > I don't see why it has to be "pure", actually. It just that the > subrequest needs to send data to the invoker rather than sending it > straight to the client. > You may say this, but it is not how subrequests are implemented in Nginx ;-). > That's the bit that's crazy in your example -- it's not a scenario that > WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do > it to be a bug, not a feature. ;-) > Are you referring to the bad Mako examp
Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension
Graham Dumpleton ha scritto: > [...] >> Just yielding an empty string does not give the server some important >> informations. >> >> As an example, with x-wsgi.suspend application can specify a timeout, >> that tells the server that the application must be resumed before >> timeout milliseconds have elapsed. >> >> And x-wsgi.suspend returns a callable that, when called, tell the server >> to poll the app again. > > There are other ways of doing that, the callable doesn't need to be in > the WSGI environment. This is because since it is single threaded, the > WSGI server need only record in a global variable for that WSGI > application some state about the current request. The separate > function to note the suspension can then lookup that and does what it > needs to. In other words, you don't need the WSGI environment to > maintain that relationship. > This seems completely broken, to me; do you have looked at txwsgi implementation? It is true that the WSGI server is single threaded, but there can be multiple concurrent requests processed in this thread. What happens if one request is being suspended and a new one is being processed? As far as I can tell, the new request will note the suspend flag set to True, and will be suspended as well. > Having the timeout as argument is also questionable anyway. All you > really need to do is to tell the WSGI server that I don't want to be > called until I tell it otherwise. The WSGI application could itself > handle the timeout in other ways. > But I can't see the reason why this can not be done by x-wsgiorg.suspend, since it is a very convenient interface. > Overall one could do all of this without having to do anything in the > WSGI environment. As PJE points out, it can be done by relying only on > the ability to yield an empty string. Everything else can be in the > application realm with the application normally being bound to a > specific WSGI server/event loop implementation, thus no portability. 
>

From what I can tell, this is only possible by having a custom variable in the WSGI environ.

But since I wrote txwsgi for precisely this reason, it should not be hard to prove that your idea is actually possible to implement (and that it does not make the implementation more complex than it should be; think about an implementation written in C).

> The problem of a middleware not passing through an empty string
> doesn't even need to be an issue in as much as the application could
> track when it requested to be suspended and if called into again
> before the required criteria had been met, it could detect a
> middleware that wasn't playing by the rules and at least raise an
> error rather than potentially go into blocking state and tight loop.
>

Yes. This is something that can be done by an implementation. Currently txwsgi only checks the suspend flag when an empty string is yielded by the application.

> One could theoretically abstract out an interface for a generic event
> system, but what you don't want is a general purpose one. You want one
> which is specifically associated with the concept of a WSGI server.

Why? This is not required at all.

> That way the API for it can expose methods which specifically relate
> to stuff like suspension of calling into the WSGI application for data
> until specific events occur.

The event API just needs to deal with events, using callbacks to report data to the application.

Please see the demo_getpage_green.py example in txwsgi.

> [...]

Regards
Manlio
Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension
P.J. Eby wrote:
> At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote:
>> The purpose of the extension is to just have a standard interface that
>> WSGI applications can use to take advantage of the possibility, offered
>> by asynchronous server, to suspend execution and resume it later.
>
> WSGI has this ability now - it's yielding an empty string. Yielding an
> empty string is a hint to the server that the application is not ready
> to send any output, and the server is free to schedule other
> applications next. And WSGI does not require the application to be
> rescheduled any time soon.
>
> In other words, if saying "don't call me for a while" is the purpose of
> the extension, it is not needed. As Graham says, the thing that would
> actually be needed is a way to tell the server when to poll the app again.
>

Just yielding an empty string does not give the server some important information.

As an example, with x-wsgiorg.suspend the application can specify a timeout, which tells the server that the application must be resumed before timeout milliseconds have elapsed.

And x-wsgiorg.suspend returns a callable that, when called, tells the server to poll the app again.

Regards
Manlio
Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension
Graham Dumpleton ha scritto: > [...] >> >> Claiming that x-wsgiorg.suspend does not help writing portable WSGI >> application is something similar (well, I'm a bit exaggerating here) of >> saying that WSGI does not allow to write portable web applications, >> because real world WSGI applications needs a database, a database >> engine, and so on. > > It is not the same. I can take code using a specific database instance > and still run that WSGI application, using the same database, on a > different WSGI hosting mechanism without really changing anything > about how I interact with the WSGI server and its request handling. > The concern here is the WSGI interface and interacting with the web > server, not other non related third party packages. > This is true. However you can say the same for x-wsgorg.suspend extension. As an example, you can have an application that use a standard event API, and you can run it on several asynchronous WSGI implementations. The difference is that here we speak about event API, and not specific event implementation. Note however that we can also speak about specific implementations. As an example, I can implement Twisted reactor API in Nginx, so that WSGI applications using Twisted API can be executed on both Twisted and Nginx. I could do the same with libevent API. It's only a technical problem. > You are articificially adding something to the WSGI interface as an > extension which is pointless. Since you are bound to the specific > event loop of the underlying WSGI server or event framework being used You are not bound to a specific event framework, when using x-wsgiorg.suspend! > you may just as well call a function directly on the WSGI server. 
> Adding that function under a key in the WSGI environment and accessing
> it that way does not in itself provide any value and doesn't somehow
> make the code easily portable to a different WSGI hosting mechanism
> using a different event loop as you still have to change lots of other
> code in your application.
>

This is absolutely not true!

> In some respects this is similar to the issues between using a WSGI
> wrapper which injects stuff in WSGI environment versus that
> functionality being in a separate library. Read:
>
> http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html
>

This is simply wrong. x-wsgiorg.suspend **cannot** be implemented as simple library code; it **must** be accessed from the environ dictionary.

The reason is simple:

1) First of all, in order to suspend the application, you **must** return control to the server, and this can only be done by yielding some value from the application generator.

2) In order for the implementation to know whether the application requested suspension, it must keep a flag in its *internal* state. The x-wsgiorg.suspend function simply sets this flag.

Manlio
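The two points above can be sketched with a toy, single-threaded server loop. This is not txwsgi code: `Request`, `serve`, `pending_events` and the `toy.name` environ key are all hypothetical, the timeout is ignored, and the "event loop" is just a queue of resume callables. The only thing the sketch tries to show is a per-request flag set through the environ and acted on when the application yields.

```python
import collections

# Callbacks that a real event loop would fire on I/O readiness or timeout.
pending_events = collections.deque()


class Request:
    """Per-request state kept by a toy server (hypothetical, not txwsgi)."""

    def __init__(self, app, name):
        self.suspended = False  # the internal flag described in point 2
        self.output = []
        environ = {"x-wsgiorg.suspend": self.suspend, "toy.name": name}
        self.iterator = iter(app(environ, lambda status, headers: None))

    def suspend(self, timeout):
        # Only records the request; the timeout is ignored in this toy.
        self.suspended = True
        return self.resume

    def resume(self):
        self.suspended = False


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"start:" + environ["toy.name"]
    resume = environ["x-wsgiorg.suspend"](1000)
    pending_events.append(resume)  # pretend some I/O event will call resume
    yield b""                      # point 1: control returns to the server
    yield b"done"


def serve(requests):
    active = list(requests)
    while active:
        progressed = False
        for request in list(active):
            if request.suspended:
                continue  # parked until its own resume callable is called
            progressed = True
            try:
                chunk = next(request.iterator)
            except StopIteration:
                active.remove(request)
                continue
            if chunk:
                request.output.append(chunk)
        if not progressed:
            if not pending_events:
                break  # nothing can ever wake the remaining requests
            pending_events.popleft()()  # simulate one external event


first, second = Request(app, b"a"), Request(app, b"b")
serve([first, second])
```

Because the flag lives on each `Request` rather than in a module-level variable, suspending one request leaves the other free to run, and each request is woken only by its own resume callable.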
Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension
Graham Dumpleton ha scritto: > On 12 April 2010 06:07, Manlio Perillo wrote: >> I'm not sure about the correct procedure to follow, I hope it is not a >> problem. >> >> I here propose the x-wsgiorg.suspend to be accepted as official WSGI >> extension, using the wsgiorg namespace. >> First of all thanks for the feedback. > [...] > In the code of demo_fdevent.py it has: > > while True: > while True: > ret, num_handles = m.perform() > if ret != pycurl.E_CALL_MULTI_PERFORM: > break > if not num_handles: > break > > read, write, exc = m.fdset() > resume = environ['x-wsgiorg.suspend'](1000) > if read: > readable(read[0], resume) > yield '' > else: > writeable(write[0], resume) > yield '' > > The registration of file descriptors doesn't occur until after the > first suspend() call. > > If the underlying reactor that the WSGI server is presumably also > using doesn't know about the file descriptors at that point, then how > does it now to return from the suspend(). > I'm not sure to understand your concern, but the execution is not suspended when you call x-wsgiorg.suspend, but only when you yield a empty string. In the example, registration of file descriptor occur before application is suspended. > You are also calling perform() before that point. When calling that, > it is presumed you have already done a select/poll to know data is > available, but you haven't done that on first pass through the loop. > If you call that and data isn't ready, can't it block still. > I have to admit that I just copied the example from fdevent specification. However the code seems correct, to me. > This example also illustrates well why I am so against an asynchronous > WSGI server extension. > > The reason is that your specific application has to be with this > extension bound to the specific event loop mechanism used by the > underlying WSGI server. 
> > I can't for example take this application and host it on a different > WSGI server which implements the same WSGI extension but uses a > different event loop. > Instead I think that being "agnostic" about how it is used, in one of the most important feature of x-wsgiorg.suspend extension. After all, if you think about it, how to interface with a database in a WSGI application is not specified by WSGI. This is done by a separate standard, dbapi2. For applications that need a template engine, we don't even have a standard inteface. The lack of a standard event API is not a problem that should be discussed in WSGI. It is a problem with the Python community; in fact I would like to define a standard event API *and* a standard efficient network API (the reason is expressed at the end of the README file in txwsgi). > If one can't do that and it is tied to the event loop and > infrastructure of the underlying WSGI server, what is the point of > defining and implementing the WSGI extension as it doesn't aid > portability at all, so what service is it actually providing? > The service it provides is: "allow a WSGI application to suspend its execution and resume it later". > In that respect, the extension: > > http://www.wsgi.org/wsgi/Specifications/fdevent/ > > provided more as at least it tried to abstract out a generic interface > for registering interest in file descriptor activity and so perhaps > allow the application not to be dependent on the specific event loop > used by the underlying WSGI server. > However exposing this event interface is really something that has little to do with WSGI. Moreover, the fdevent example is rather inefficient. Suspensions should be minimized, and this is not possible with x-wsgiorg.fdevent but it is possible with x-wsgiorg.suspend. >>From the open issues of that other specification however, you can see > that there can be problems. 
It only allowed an application to be > interested in a single file descriptor where some packages may need to > express interest in more than one. > > Quite often an application is never going to be that simple anyway. > Some event systems allow a lot more than just watching of file > descriptors and timeouts however. You cant come up with a generic > interface for all these as they will not be able to be implemented by > a different event system which isn't so feature rich or which has a > different style of interface. Thus applications are restricted to the > lowest common denominator and likely that is not going to be enough > for most and so have no choice but to bind it to in
Re: [Web-SIG] wsgi and generators (was Re: WSGI and start_response)
P.J. Eby ha scritto: > At 02:04 PM 4/10/2010 +0100, Chris Dent wrote: >> I realize I'm able to build up a complete string or yield via a >> generator, or a whole bunch of various ways to accomplish things >> (which is part of why I like WSGI: that content is just an iterator, >> that's a good thing) so I'm not looking for a statement of what is or >> isn't possible, but rather opinions. Why is yielding lots of moderately >> sized strings *very bad*? Why is it _not_ very bad (as presumably >> others think)? > > How bad it is depends a lot on the specific middleware, server > architecture, OS, and what else is running on the machine. The more > layers of architecture you have, the worse the overhead is going to be. > > The main reason, though, is that alternating control between your app > and the server means increased request lifetime and worsened average > request completion latency. > This is not completely true. At least this is not how things will work on an asynchronous WSGI implementation. It is true that alternating control between your app and server decrease performance. This can be verified with: http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_cooperative.py However yielding small strings in the application iterator, because the application does not want to buffer data, will usually not cause the problems you describe. Instead, the possible performance problems have been described by Graham. Moreover, when we speak about latency, we should also consider that web page are usually served to human users. In this case, latency is not the only factor to consider. Is it better for the user to wait 3 seconds for some text to appear on the browser window, and then wait for other 5 seconds for the complete page to be rendered, or having to wait 5 seconds for some text to appear on the browser window? > [...] 
> If you translate this to the architecture of a web application, where > the "work" is the server serving up bytes produced by the application, > then you will see that if the application serves up small chunks, the > web server is effectively forced to multitask, and keep more application > instances simultaneously running, with lowered latency, increased memory > usage, etc. > Yielding small strings *will* not force multitasking. This can be verified with: http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_producer.py WSGI application will be suspended *only* when data can not be sent to the OS socket buffer. Yielding several small strings will *usually* not cause socket buffer overflow, unless the client is very slow at reading data. Instead, ironically, you will have a problem when the application yields several big strings. In this case it is better to yield only one very big string, but this is not always feasible. And I'm not sure if it is worse to keep a very big buffer in memory, or to send several not small chunks to the client. > [...] Regards Manlio ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
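The claim above, that the application is suspended only when the socket buffer is full, can be illustrated with a toy model. This is only a sketch under stated assumptions: `count_suspensions`, `buffer_size` and `drained_per_wakeup` are hypothetical names, no real socket is involved, and the client is assumed to read a fixed number of bytes each time the server waits.

```python
def count_suspensions(chunks, buffer_size, drained_per_wakeup):
    """Count how often an app iterator would be suspended (toy model).

    chunks: the byte strings yielded by the application.
    buffer_size: assumed capacity of the socket send buffer, in bytes.
    drained_per_wakeup: bytes the client is assumed to read while we wait.
    """
    if buffer_size <= 0 or drained_per_wakeup <= 0:
        raise ValueError("buffer_size and drained_per_wakeup must be positive")
    free = buffer_size
    suspensions = 0
    for chunk in chunks:
        remaining = len(chunk)
        while remaining > 0:
            if free == 0:
                # Buffer full: the application is suspended until the
                # client has read some data.
                suspensions += 1
                free = min(buffer_size, drained_per_wakeup)
            sent = min(remaining, free)
            remaining -= sent
            free -= sent
    return suspensions


# 100 small strings (1000 bytes in total) fit in a 4096-byte buffer.
print(count_suspensions([b"x" * 10] * 100, 4096, 4096))  # 0
# One 10000-byte string does not fit, so the iterator has to wait twice.
print(count_suspensions([b"x" * 10000], 4096, 4096))     # 2
```

In this model the number of yields does not matter; only the total amount of data relative to the free buffer space does.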
Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web
Gustavo Narea wrote:
> Hello,
>
> Maybe I'm missing something obvious, but if the gateway doesn't support
> applications that return write() callables, then it's not WSGI.
>
> A callable that raises an exception does not even count. It's obvious
> that they must not raise exceptions -- Then what's the point of
> providing the callable?
>
Nothing is obvious in an official specification ;-). The reason I chose not to completely remove the write callable is that it will raise a nice error message if someone ever tries to use my implementation to execute a WSGI application that requires the write callable. Moreover, some middlewares or applications may assume that the write callable exists and that the value returned by start_response is not None, even if it is never used.
> That said, I *think* it might be OK to disable support for the write()
> callable *optionally* on a per application basis. For example, the
> gateway could look at the "requires_write" attribute of the application
> callable, if any:
> """
> def wsgi_app(environ, start_response):
>     # ... process the request and return a response
>
> wsgi_app.requires_write = False
> """
>
> That way, applications which don't use the write() callable can let your
> gateway know and thus it won't pass one on.
>
The problem is that applications that require the write callable are not aware of this extension. This is really not a problem, IMHO. If you try to execute an application and you get a NotImplementedError exception, then you *know* that the write callable is required. Then you just configure the WSGI gateway to use the required adapter. See http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_write.py for a practical example using txwsgi. With ngx_http_wsgi_module, you just have to add a wsgi_middleware txwsgi.greenlet write_adapter; directive to the Nginx configuration file.
> We could even standardize this (at wsgi.org) so that any WSGI middleware > which wraps an application can expose the "requires_write" attribute of > the wrapped application... As long as such a middleware doesn't use > write() either. > > On the other hand, I would avoid using "middleware" in this context for > something specific to your implementation as people will believe it's a > proper WSGI middleware. Yes. I now use the term "adapter". Regards Manlio ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
[Web-SIG] [RFC] x-wsgiorg.suspend extension
I'm not sure about the correct procedure to follow; I hope this is not a problem. I hereby propose that x-wsgiorg.suspend be accepted as an official WSGI extension, using the wsgiorg namespace. The extension is documented in the doc/wsgiorg.suspend.rst document in the txwsgi source distribution, available at: http://bitbucket.org/mperillo/txwsgi/ The direct link to the specification is: http://bitbucket.org/mperillo/txwsgi/src/tip/doc/wsgiorg.suspend.rst The extension is implemented in txwsgi, the implementation for the Twisted Web server, and I'm going to implement it in ngx_http_wsgi_module, the implementation for the Nginx server. The extension is very easy to implement. It also generalizes the proposed x-wsgiorg.fdevent extension. Please see http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_fdevent.py for a comparison with the example described in the fdevent specification, implemented here using suspend and the Twisted reactor API. Thanks to Christopher Stawarz for writing the fdevent specification, which I was able to use as a reference. Some additional notes: the x-wsgiorg.suspend extension can be implemented in both WSGI 1.0 and the proposed WSGI 2.0. However, in WSGI 2.0, due to the lack of start_response support, its usability is limited. Thanks and regards Manlio Perillo
[Web-SIG] [ANN] txwsgi 0.1
I'm pleased to announce txwsgi, version 0.1. txwsgi is a fork of twisted.web.wsgi that, unlike the original implementation, executes the WSGI application in the main I/O thread. txwsgi implements the proposed x-wsgiorg.suspend extension, which enables support for asynchronous WSGI applications. Some examples are available in the doc/examples directory of the source distribution. The project is available on BitBucket: http://bitbucket.org/mperillo/txwsgi/ More information is available in the README file. The x-wsgiorg.suspend extension is specified in doc/wsgiorg.suspend.rst. I will start a new thread for the official approval process. I have tried to write as much documentation as possible, also taking into consideration the feedback received in previous threads; thanks for the support. Thanks and regards Manlio Perillo
Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web
Graham Dumpleton ha scritto: > On 9 April 2010 22:15, Manlio Perillo wrote: >> Graham Dumpleton ha scritto: >>> [...] >>>> But since the write callable **can** be implemented in a middleware >>>> (using greenlets) and since middlewares *can* be configured inside WSGI >>>> gateway, implementations can still claim to be WSGI 1.0 conformant. >>> Then only the higher level middleware adapter can even claim to be >>> WSGI compliant and deserve to use the WSGI name. >> Since the middleware is executed inside WSGI gateway, and the gateway >> can be configured to always execute some middleware, the final >> application will simply have at disposal a WSGI conformant write callable. > > Then it isn't really a middleware at all then, but a part of your > overall solution. It is just that the gateway has support to direct execution of middlewares, since this make the implementation more flexible. > So long as only the complete solution is exposed and > is WSGI compliant then fine. But if it is going to be layered in any > way such that lower level layers can be used in their own right, then > the lower level layers shouldn't really be said to be WSGI if they > don't implement full WSGI specification. As much as we all have our > complaints about WSGI specification, it is what it is and is all we > have right now. > By the way, as a matter of curiosity. WSGI 1.0 states: """The start_response callable must return a write(body_data) callable that takes one positional parameter: a string to be written as part of the HTTP response body. (Note: the write() callable is provided only to support certain existing frameworks' imperative output APIs; it should not be used by new applications or frameworks if it can be avoided. See the Buffering and Streaming section for more details.)""" There is nothing that prevents the write callable to raise an exception. 
Of course, an implementation that always raises NotImplementedError is going to be useless (for applications that require the write callable), but it seems to me that such an implementation can still claim to conform to WSGI 1.0. > [...] Manlio
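A minimal sketch of the behaviour under discussion: a start_response that does return a write callable, as WSGI 1.0 requires, while the callable itself always raises NotImplementedError. The helper name `make_start_response` and the `state` dictionary are hypothetical; this is not the twsgi or ngx_http_wsgi_module code.

```python
def make_start_response(state):
    """Build a start_response that records the status and headers in `state`."""

    def write(body_data):
        # The callable exists, so code that only checks for it keeps
        # working, but actually using it fails with a clear message.
        raise NotImplementedError(
            "the write callable is not supported by this gateway")

    def start_response(status, response_headers, exc_info=None):
        state["status"] = status
        state["headers"] = list(response_headers)
        return write

    return start_response


state = {}
start_response = make_start_response(state)
write = start_response("200 OK", [("Content-Type", "text/plain")])
try:
    write(b"hello")
except NotImplementedError as exc:
    print("write failed:", exc)
```

An application that never calls write runs unchanged; one that does call it fails at that point, which is the signal that the gateway has to be configured with a write adapter.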
Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web
Graham Dumpleton ha scritto: > [...] >> But since the write callable **can** be implemented in a middleware >> (using greenlets) and since middlewares *can* be configured inside WSGI >> gateway, implementations can still claim to be WSGI 1.0 conformant. > > Then only the higher level middleware adapter can even claim to be > WSGI compliant and deserve to use the WSGI name. Since the middleware is executed inside WSGI gateway, and the gateway can be configured to always execute some middleware, the final application will simply have at disposal a WSGI conformant write callable. > Any underlying > abstraction you use at the web server interface isn't WSGI and by > rights should be called something else so there is no confusion and > also shouldn't use 'wsgi' keys in its environ dictionary. Have your > high level middleware do a completely remapping of names as > appropriate. > This will add useless overhead. >>> Why don't you given it all a completely different name else you will >>> just cause ongoing confusion >> In don't really see how this can cause confusion! > > So, when someone goes and runs a WSGI application directly against you > WSGIish web server interface which you still insist you can describe > as being WSGI and it fails because the write() method isn't > implemented what is your answr going to be? If something is going to > use WSGI name it should implement the full WSGI specification. > To make people happy, I can just have the default implementation include the required middleware by default. >>> like you did with when you felt you could >>> reuse the 'mod_wsgi' name for your nginx >> In fact the first thing I did during code refactoring was to rename it >> to ngx_http_wsgi_module. > > The mod_wsgi name is still used all through > http://wiki.nginx.org/NginxNgxWSGIModule that I can tell. > I still have to update it. 
Manlio
Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web
Graham Dumpleton ha scritto: > [...] >>- the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output' >> >> The wsgiorg namespace is used, since the plan is to have it >> standardized [1], but it can only be implemented on asynchronous >> servers. > > Please read: > > http://www.wsgi.org/wsgi/Specifications > > If a proposal is suggested, it MUST use 'x-wsgiorg.' and not > 'wsgiorg.'. Only after it is officially accepted can it use the > 'wsgiorg.'. > Well; since the original propose was using wsgi namespace, I just suggested the use of wsgiorg namespace instead Of course, when it will be implemented I will use a different namespace, until it gots approved. > I would question whether you should even be using 'x-wsgiorg.' as as > far as I can see from my quick scans of emails, you aren't even > supporting WSGI proper as you are dropping support for bits. As such, > it isn't WSGI, only WSGIish so how can you justify using the name. > This is not completely correct. The twsgi implementation, as well ngx_http_wsgi_module implementation, does not implement the write callable. The reason is simple: write callable was an huge mistake in WSGI 1.0 since it can not be implemented in an asynchronous web server. But since the write callable **can** be implemented in a middleware (using greenlets) and since middlewares *can* be configured inside WSGI gateway, implementations can still claim to be WSGI 1.0 conformant. > Why don't you given it all a completely different name else you will > just cause ongoing confusion In don't really see how this can cause confusion! > like you did with when you felt you could > reuse the 'mod_wsgi' name for your nginx In fact the first thing I did during code refactoring was to rename it to ngx_http_wsgi_module. > version even though I asked > you to use a different name. 
It has been an absolute pain seeing > discussions on places like #django irc where people don't know when > people mention mod_wsgi whether they are talking about Apache or > nginx. > Apologies for having underestimated this. Manlio
[Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web
I have started to write an asynchronous WSGI implementation for Twisted Web. The standard implementation executes the WSGI application in a separate thread; twsgi will instead execute the application in the main Twisted thread. The advantage is that twsgi is better integrated with Twisted, and WSGI applications will be able to use all the features available in Twisted. The code is available from a Mercurial repository: http://hg.mperillo.ath.cx/twisted/twsgi The purpose of twsgi is to have a pure Python implementation of WSGI with support for asynchronous HTTP servers and asynchronous WSGI applications. The implementation is similar to ngx_http_wsgi_module, and can be used to quickly test asynchronous extensions. The write callable is not implemented (calling it will raise NotImplementedError), since the write callable cannot be implemented in an asynchronous web server without using threads (and twsgi does *not* use threads). ngx_http_wsgi_module does the same.

TODO

* support for suspending iteration over the WSGI app iterator when the socket is not ready to send data. Execution will be resumed when the socket is ready again.
* support for the suspend/resume extension, as described here: http://comments.gmane.org/gmane.comp.python.twisted.web/632 It will have some differences:
  - the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output'. The wsgiorg namespace is used, since the plan is to have it standardized [1], but it can only be implemented on asynchronous servers.
  - the wsgi.pause_output function will accept an optional timeout, in milliseconds. If a timeout is specified, the application will be implicitly resumed when the timeout expires.
  - the resume function will return a boolean value: True if execution was suspended and is going to be resumed, False if execution was not suspended. The return value can be used to check whether the timeout specified in wsgiorg.suspend expired. I'm not sure a boolean value is the best solution. Maybe it should return -1 if execution was not suspended, and 0 otherwise.
[1] unlike other proposed async extensions, suspend/resume is much simpler and easier to implement, so it is more likely to reach a wide consensus over the specification. Feedback is welcome. Regards Manlio
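The return-value semantics proposed above for the resume function can be sketched with a toy bookkeeping object. Everything here is an assumption made for illustration: the class `SuspendState`, its method names, and the idea that suspend returns the resume callable. There is no event loop; the timeout is only recorded, and its expiry is simulated by calling `timeout_expired` by hand.

```python
class SuspendState:
    """Toy per-request bookkeeping for a suspend/resume pair (no event loop)."""

    def __init__(self):
        self.suspended = False
        self.timeout = None

    def suspend(self, timeout=None):
        # A real server would register `timeout` (milliseconds) with its
        # event loop; this sketch only remembers it.
        self.suspended = True
        self.timeout = timeout
        return self.resume

    def resume(self):
        # True: execution was suspended and is going to be resumed.
        # False: execution was not suspended (for example, the timeout
        # already expired and the server resumed the application itself).
        if not self.suspended:
            return False
        self.suspended = False
        return True

    def timeout_expired(self):
        # Called by the (imaginary) server when the timer fires.
        self.suspended = False


state = SuspendState()
resume = state.suspend(timeout=100)
print(resume())   # True
print(resume())   # False

resume = state.suspend(timeout=50)
state.timeout_expired()
print(resume())   # False
```

The last case is the one the boolean is meant for: a False result tells the caller that the timeout fired before resume was called.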
Re: [Web-SIG] WSGI and start_response
P.J. Eby ha scritto: > At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote: >> What I'm trying to do is: >> >> * as in the example I posted, turn Mako render function in a generator. >> >> The reason is that I would lite to to implement support for Nginx >> subrequests. > > By subrequest, do you mean that one request is invoking another, like > one WSGI application calling multiple other WSGI applications to render > one page containing contents from more than one? > Yes. > >> During a subrequest, the generated response body is sent directly to >> the client, so it is necessary to be able to flush the Mako buffer > > I don't quite understand this, since I don't know what Mako is, or, if > it's a template engine, what flushing its buffer would have to do with > WSGI buffering. > Ah, sorry. Mako is a template engine. Suppose I have an HTML template file, and I want to use a sub request. ... ${subrequest('/header/'} ... The problem with this code is that, since Mako will buffer all generated content, the result response body will contain incorrect data. It will first contain the response body generated by the sub request, then the content generated from the Mako template (XXX I have not checked this, but I think it is how it works). So, when executing a sub request, it is necessary to flush (that is, send to Nginx, in my case) the content generated from the template before the sub request is done. Since Mako does not return a generator (I asked the author, and it was too hard to implement), I use a greenlet in order to "turn" the Mako render function in a generator. > >> > Under >> > WSGI 1, you can do this by yielding empty strings before calling >> > start_response. >> >> No, in this case this is not what I need to do. > > Well, if that's not when you're needing to suspend the application, then > I don't see what you're losing in WSGI 2. > > >> I need to call start_response, since the greenlet middleware will yield >> data to the caller before the application returns. 
>
> I still don't understand you. In WSGI 1, the only way to suspend
> execution (without using greenlets) prior to determining the headers is
> to yield empty strings.
>
Ah, you are right, sorry. But this is not required for the Mako example (I was focusing on that example).
> I'm beginning to wonder if maybe what you're saying is that you want to
> be able to write an application function in the form of a generator?
The greenlet middleware returns a generator, in order to work.
> If
> so, be aware that any WSGI 1 app written as:
>
> def app(environ, start_response):
>     start_response(status, headers)
>     yield "foo"
>     yield "bar"
>
> can be written as a WSGI 2 app thus:
>
> def app(environ, start_response):
>     def respond():
>         yield "foo"
>         yield "bar"
>     return status, headers, respond()
>
The problem, as I wrote, is that with the greenlet middleware, the application does not need to return a generator.

    def app(environ):
        tmpl = ...
        body = tmpl.render(...)
        return status, headers, [body]

This is a very simple WSGI application. But when using the greenlet middleware, and when using the function for flushing the Mako buffer, some data will be yielded *before* the application returns and the status and headers are passed to Nginx.
> This is also a good time for people to learn that generators are usually
> a *very bad* way to write WSGI apps
It's the only way to be able to suspend execution when the WSGI implementation is embedded in an async web server not written in Python. The reason is that you cannot use (XXX check me) greenlets in C code; you should probably use something like http://code.google.com/p/coev/ Greenlets can be used in gevent, for example, because scheduling is under the control of Python code. This is not the case with Nginx.
> - yielding is for server push or
> sending blocks of large files, not tiny strings.
Again, consider the use of sub requests. Yielding a "not large" block is the only choice you have.
Unless, of course, you implement sub request support in pure Python (or using SSI - Server Side Include). Another use case is when you have a very large page, and you want to return some data as soon as possible to avoid the user to abort request if it takes some time. Also, note that with Nginx (as with Apache, if I'm not wrong), even if application yields small strings, the server can still do some buffering in order to increase performance. In ngx_http_wsgi_module buffering is optional (and disabled by def
Re: [Web-SIG] WSGI and start_response
P.J. Eby ha scritto: > At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote: >> With WSGI 2.0 we will end up with: >> >> - WSGI 1.0, a full featured protocol, but with hard to implement >> middlewares >> - WSGI 2.0, a simple protocol, with more easy to implement middlewares >> but without support for some "advanced" applications > > Let me see if I understand what you're saying. You want to support > suspending an application, without using greenlets or threads. What I'm trying to do is: * as in the example I posted, turn Mako render function in a generator. The reason is that I would lite to to implement support for Nginx subrequests. During a subrequest, the generated response body is sent directly to the client, so it is necessary to be able to flush the Mako buffer * implement the simple suspend/resume extension, as described here: http://comments.gmane.org/gmane.comp.python.twisted.web/632 Note that my ngx_http_wsgi_module already support asynchronous web server, since when the application returns a generator and sending a yielded buffer to the client would block, execution of WSGI application is suspended, and resumed when the socket is ready to send data. The suspend/resume extension allows an application to explicitly suspend/resume execution, so it is a nice complement for an asynchronous server. I would like to propose this extension for wsgiorg namespace. Not that, however, greenlets are still required, since it will make the code much more usable. > Under > WSGI 1, you can do this by yielding empty strings before calling > start_response. No, in this case this is not what I need to do. I need to call start_response, since the greenlet middleware will yield data to the caller before the application returns. > Under WSGI 2, you can only do this by directly > suspending execution, e.g. via greenlet or eventlets or some similar API > provided by the server. Is this your objection? > In WSGI 2 what I want to do is not really possible. 
The reason is that I don't use greenlets in the C module (I'm not even sure greenlets can be used in my ngx_http_wsgi module) Execution is suspended using the "normal" suspend extension. The problem is with the greenlet middleware that will force a different code flow. > As far as I know, nobody has actually implemented an async app facility > for WSGI 1, although it sounds like perhaps you're trying to design or > implement such a thing now. Right. My previous attempt was a failure, since the extensions have severe usability problem. It is the same problem you have with Twisted deferred. In this case every function that call a function that use the async extension must be a generator. In my new attempt I plan to: 1) Implement the simple suspend/resume extension 2) Implement a Python extension module that wraps the Nginx events system. 3) Implement a pure Python WSGI middleware that, using greenlets, will enable normal applications to take advantage of Nginx async features. This middleware will have the same purpose as the Hub available in gevent > If so, then there's nothing stopping you > from implementing a WSGI 1 server and providing a WSGI 2 adapter, since > as you point out, WSGI 2 is easier to implement on top of WSGI 1 than > the other way around. > Yes, this is what I would like to do. Do you think it will possible to implement all the requirements of WSGI 2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0 ? And what about applications that need to use the WSGI 1.0 API but require to run with Python 3.x? Thanks Manlio ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] WSGI and start_response
P.J. Eby ha scritto: > At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote: > [...] >> There should be a sample WSGI 2.0 implementation for CGI, and a sample >> WSGI 1.0 -> 2.0 adapter. >> >> This adapter should be able to support the coroutine example, >> > http://paste.pocoo.org/show/199202/ >> but I would like to test. >> >> write callable, as far as I know, can not be implemented. > > Implementing it requires greenlets or threads, but it's implementable. > See: > > http://mail.python.org/pipermail/web-sig/2009-September/003986.html > Right. In fact, in the example I posted, I implemented the write callable using greenlets (although the implementation is different). > (Btw, I've noticed that this early sketch of mine doesn't support the > case where an application is a generator, because start_response won't > have been called when the application returns. This can be fixed, but > it requires the addition of a wrapper class and a few other annoying > details. It also doesn't support exc_info properly, so it's still a > ways from being a correct WSGI 1 server implementation. Getting rid of > all these little variations, though, is the goal of having a WSGI 2 - > it's difficult to write *any* middleware to be completely WSGI 1 > compliant.) > I agree that this is a good goal. However I don't like the idea of losing support for some features. With WSGI 2.0 we will end up with: - WSGI 1.0, a full featured protocol, but with hard to implement middlewares - WSGI 2.0, a simple protocol, with more easy to implement middlewares but without support for some "advanced" applications Both WSGI 1.0 can be implemented on top of WSGI 2.0, and WSGI 2.0 on top of WSGI 1.0. The latter should be more "easy" to implement. 
I would like to have a WSGI 1.1 specification without the write callable, and a *standard* adapter that exposes a simpler API (like WSGI 2.0), so that applications and middlewares can be implemented using this simple API while the full-featured API remains available. This is important, IMHO, because the next version of WSGI will also add support for Python 3.x. If the next version does not support the start_response function, applications that need Python 3.x and want to use "advanced features" will not be able to rely on a standard protocol. Regards Manlio
Re: [Web-SIG] WSGI and start_response
Aaron Watters wrote: > someone remind me: where is the canonical WSGI 2 spec? http://wsgi.org/wsgi/WSGI_2.0 > I assume there is a way to "wrap" WSGI 1 applications > without breaking them? Or is this the regex-->re fiasco > all over again? > start_response can be implemented by a function that stores the status code and response headers. There should be a sample WSGI 2.0 implementation for CGI, and a sample WSGI 1.0 -> 2.0 adapter. This adapter should be able to support the coroutine example, http://paste.pocoo.org/show/199202/ but I would like to test it. The write callable, as far as I know, cannot be implemented. > [...] Regards Manlio
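A rough sketch of such an adapter, assuming the WSGI 2.0 draft calling convention used elsewhere in this thread (the application takes environ and returns status, headers and a body iterable). The name `wsgi1_to_wsgi2` is hypothetical. The sketch collects the whole body before returning, so it gives up streaming, and its write callable simply raises.

```python
def wsgi1_to_wsgi2(app):
    """Wrap a WSGI 1.0 application so it can be called WSGI 2.0 style."""

    def wsgi2_app(environ):
        captured = {}

        def write(data):
            # Not supported by this sketch.
            raise NotImplementedError("write callable is not supported")

        def start_response(status, headers, exc_info=None):
            # Nothing is sent here; the values are only stored.
            captured["status"] = status
            captured["headers"] = list(headers)
            return write

        result = app(environ, start_response)
        try:
            # Consume the iterable first: a generator-based application
            # only calls start_response once iteration begins.
            body = list(result)
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()
        return captured["status"], captured["headers"], body

    return wsgi2_app


def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"foo"
    yield b"bar"


status, headers, body = wsgi1_to_wsgi2(hello)({})
print(status, headers, body)
```

Because the body is buffered, an application that needs part of the response to reach the client before the rest is produced is not served by this approach.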
[Web-SIG] WSGI and start_response
Hi. Some time ago I objected to the decision to remove the start_response function from the next version of WSGI, using as rationale the fact that without start_response, asynchronous extensions are impossible to support. Now I have found that removing start_response will also make it impossible to support coroutines (or, at least, some uses of coroutines). Here is an example (this is the same example I posted a few days ago): http://paste.pocoo.org/show/199202/ Forgetting about the write callable, the problem is that the application starts to yield data when the tmpl.render_unicode function is called. Please note that this has *nothing* to do with asynchronous applications. The code should work with *all* WSGI implementations. In the pasted example, the Mako render_unicode function is "turned" into a generator, with a simple function that allows flushing the current buffer. Can someone else confirm that this code is impossible to support in WSGI 2.0? If my suspicion is correct, I once again object to removing start_response. WSGI 1.0 is really a well designed protocol, since it is able to support both asynchronous applications (with a custom extension) and coroutines, *even* if this was not considered during protocol design. Thanks Manlio
Re: [Web-SIG] WSGI safe write callable using greenlet
Manlio Perillo ha scritto: > Hi. > > In this period I'm upgrading my WSGI implementation for Nginx: > http://hg.mperillo.ath.cx/nginx/ngx_http_wsgi_module/ > [...] > So, I was thinking: what about a WSGI middleware that, using greenlets, > expose to the application a write callable with the correct code flow? > > > Here is a very first draft: > http://pastebin.com/4k1Ep4dH > > It should work with every standard WSGI implementation. > Here is a more generic middleware and example application: http://pastebin.com/S8c1gRfY and here is the output: http://pastebin.com/zzkRiRuA The example also contains hints about features I plan to implement, like the wsgiorg.suspend extension, and subrequests. Regards Manlio ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
[Web-SIG] WSGI safe write callable using greenlet
Hi. At the moment I'm upgrading my WSGI implementation for Nginx: http://hg.mperillo.ath.cx/nginx/ngx_http_wsgi_module/ I'm not only updating the code to work with recent Nginx versions (after 2 years) but, above all, I'm cleaning up the code, removing stuff that is not strictly required and is hard to maintain. I have already removed support for multiple Python subinterpreters, and now I'm going to remove the async extensions I wrote (there will be only one very simple API, for applications using greenlets); finally, I would like to remove support for the write callable. The problem, to put it simply, is that the write callable *cannot* be implemented in an asynchronous web server like Nginx. I have two implementations: * the first (not the default) simply keeps a buffer. This is explicitly forbidden by WSGI. * the second puts the Nginx connection socket in synchronous mode; it works, but it is something that *should not* be done. So, I was thinking: what about a WSGI middleware that, using greenlets, exposes to the application a write callable with the correct code flow? Here is a very first draft: http://pastebin.com/4k1Ep4dH It should work with every standard WSGI implementation. I would really like to receive feedback about this implementation, since I have never used greenlets before. P.S.: the license is MIT. Thanks Manlio
Re: [Web-SIG] wsgi.errors and close method
Dirkjan Ochtman wrote:
> On Tue, Mar 30, 2010 at 11:28, Manlio Perillo wrote:
>> Note however, that Mercurial has fixed the problem:
>
> So, as the guy who inherited Mercurial's hgweb WSGI application (or
> rather, made it much more WSGI-compliant),

Did you manage to remove usage of the write callable?

> [...]

Regards
Manlio
Re: [Web-SIG] wsgi.errors and close method
Graham Dumpleton wrote:
> [...]
>> Here is the culprit:
>> http://lists.alioth.debian.org/pipermail/python-modules-team/2009-January/003514.html
>> http://code.google.com/p/modwsgi/issues/detail?id=82
>>
>> So it seems safe, when the Log object used in wsgi.errors is also used
>> to replace sys.stderr, to just add the closed attribute (but *not* the
>> close method).
>
> It is all very silly. Technically a file like object is not required
> to have a 'closed' attribute, so that code expecting it was wrong in
> the first place.
>
> http://docs.python.org/library/stdtypes.html#file-objects

Right, thanks; I did not notice it.

Note however, that Mercurial has fixed the problem:

    # stderr may be buffered under win32 when redirected to files,
    # including stdout.
    if not getattr(sys.stderr, 'closed', False):
        sys.stderr.flush()

I would probably do something like:

    try:
        sys.stderr.flush()
    except:
        pass

> The close() method is however required of file like objects so if you
> are going to replace a file like object, you should have it.

Yes, I should. But since calling close() on this object should raise an exception anyway, raising AttributeError instead should not be a critical problem.

Manlio
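The approach discussed in this message can be sketched in pure Python. This is only an illustration under stated assumptions: the `Log` class and the `safe_flush` helper below are hypothetical names, and the real wsgi.errors objects in ngx_http_wsgi_module and mod_wsgi are implemented in C. The sketch shows an object that exposes `closed` but deliberately has no `close` method, together with the Mercurial-style guard quoted above.

```python
import io


class Log:
    """Hypothetical stand-in for a server's wsgi.errors object."""

    # Always report the stream as open; there is deliberately no close().
    closed = False

    def __init__(self):
        self._buffer = io.StringIO()

    def write(self, text):
        self._buffer.write(text)

    def writelines(self, lines):
        for line in lines:
            self.write(line)

    def flush(self):
        pass  # nothing to do for an in-memory buffer

    def getvalue(self):
        return self._buffer.getvalue()


def safe_flush(stream):
    """Flush only if the stream does not report itself as closed."""
    if not getattr(stream, 'closed', False):
        stream.flush()
        return True
    return False


errors = Log()
errors.write("something went wrong\n")
flushed = safe_flush(errors)
```

Because the guard uses `getattr` with a default, it also works for file-like objects that have no `closed` attribute at all, which is the case Graham points out.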
Re: [Web-SIG] wsgi.errors and close method
Manlio Perillo wrote:
> Hi.
>
> Some time ago, someone reported to me that an application embedded in Nginx
> with my WSGI module failed to execute, since in my implementation the
> wsgi.errors object does not implement the .close method.
>
> [...]
> Any idea?

Here is the culprit:
http://lists.alioth.debian.org/pipermail/python-modules-team/2009-January/003514.html
http://code.google.com/p/modwsgi/issues/detail?id=82

So it seems safe, when the Log object used in wsgi.errors is also used to replace sys.stderr, to just add the closed attribute (but *not* the close method).

Thanks
Manlio
Re: [Web-SIG] wsgi.errors and close method
Graham Dumpleton wrote:
> On 28 March 2010 22:21, Manlio Perillo wrote:
>> Graham Dumpleton wrote:
>>> [...]
>>>> Unfortunately I never got to know what application or framework was
>>>> causing the problem.
>>>>
>>>> Any idea?
>> Sorry, my question was not clear.
>>
>> I was asking what applications or frameworks call the .close method on
>> the errors object.
>
> I know what you were asking. My point was that it doesn't help to find
> out, as it is nearly impossible to get them to change the code.

Ok, thanks.

My point is that I don't have strict compatibility requirements for my ngx_http_wsgi_module, as you have with Apache mod_wsgi. As an example, the other day I removed support for CPython subinterpreters, since they make the code more complex than it should be.

The reason I want to know the "bad" applications/frameworks is that I would like to see why they are calling the .close method.

[...]
>>> static PyGetSetDef Log_getset[] = {
>>>     { "closed", (getter)Log_closed, NULL, 0 },
>>> #if PY_MAJOR_VERSION < 3
>>>     { "softspace", (getter)Log_get_softspace, (setter)Log_set_softspace, 0 },
>>> #else
>> I noted that you added the softspace descriptor in recent versions.
>> What is its purpose?
>> Is it here just for compatibility?
>
> It is related to how comma separated lists and comma at end of line is
> used in the following.
>
> print >> sys.stderr, "a", "b",
> print >> sys.stderr, "c"

I will check it with my module, thanks.

Regards
Manlio
Re: [Web-SIG] wsgi.errors and close method
Graham Dumpleton wrote:
> [...]
>> Unfortunately I never got to know what application or framework was
>> causing the problem.
>>
>> Any idea?

Sorry, my question was not clear.

I was asking what applications or frameworks call the .close method on the errors object. I want to check if:

* they are really calling the .close method on wsgi.errors, and why
* they are calling the .close method on stderr, and why

> [...]
> static PyGetSetDef Log_getset[] = {
>     { "closed", (getter)Log_closed, NULL, 0 },
> #if PY_MAJOR_VERSION < 3
>     { "softspace", (getter)Log_get_softspace, (setter)Log_set_softspace, 0 },
> #else

I noted that you added the softspace descriptor in recent versions. What is its purpose? Is it here just for compatibility?

Thanks
Manlio
[Web-SIG] wsgi.errors and close method
Hi.

Some time ago, someone reported to me that an application embedded in Nginx with my WSGI module failed to execute, since in my implementation the wsgi.errors object does not implement the .close method.

The same object type is used to replace sys.stderr.

Of course, trying to close either wsgi.errors or sys.stderr means an application/framework is broken, IMHO.

Unfortunately I never got to know what application or framework was causing the problem.

Any idea?

Thanks
Manlio
Re: [Web-SIG] Generic configuration
Alex Morega wrote:
> On 17 Mar 2010, at 13:47, Manlio Perillo wrote:
> [...]
>>> =
>>> [daemon]
>>> factory = egg:PasteScript#wsgiutils
>>> host = 127.0.0.1
>>> port = 8000
>>> app = my_site
>>>
>>> [...]
>>>
>> If you want this, isn't it more simple and generic to use YAML?
>
> Yaml buys you flexibility at the cost of readability, which might be a good
> trade-off, but that's not the point. You still need a tool that reads the
> configuration file and does the actual setup.
>
> Does the wsgix configuration loader allow for plugins, i.e. defining my own
> constructors? Is it documented?

This is a non-problem. You can write your own YAML loader (maybe deriving it from the existing one), write a small middleware that uses this loader, and push it onto the middleware stack.

There is no need to support generic plugins.

> I chose to base my example on Paster configuration because it already knows
> about egg entry points and explicitly pointing to factory functions.

Manlio
Re: [Web-SIG] Generic configuration
Alex Morega wrote:
> On 17 Mar 2010, at 0:24, Manlio Perillo wrote:
>
>> Alex Morega wrote:
>>> Hello,
>>>
>>> This is not really a WSGI question, it's more into general configuration,
>>> but I don't know of a better place to ask it.
>>>
> [...]
>> I use YAML with custom constructors:
>> http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/loader.py
>>
>> There is a middleware:
>> http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/middleware.py
>> that reads a list of configuration files to load from the WSGI environ
>> (I set this in the Nginx mod_wsgi configuration), and merges all the
>> configuration into the WSGI environ.
>>
>> Using custom YAML constructors it is possible to do something like:
>> http://hg.mperillo.ath.cx/wsgix/examples/file/tip/dbview/settings.yml
>>
> [...]
>
> That's still configuring a piece of WSGI middleware or application. I'm
> thinking about something along these lines:

Yes, since it works quite differently from other frameworks. Middleware is very easy to configure; in the Nginx configuration file:

    wsgi_middleware wsgix.conf.middleware;

The reason is that how you want to store configuration parameters should not be hard-coded in the framework. Using YAML in my framework does not prevent using other methods, like ConfigParser or Python modules.

> =
> [daemon]
> factory = egg:PasteScript#wsgiutils
> host = 127.0.0.1
> port = 8000
> app = my_site
>
> [...]

If you want this, isn't it simpler and more generic to use YAML?

Manlio
Re: [Web-SIG] Generic configuration
Alex Morega wrote:
> Hello,
>
> This is not really a WSGI question, it's more into general configuration, but
> I don't know of a better place to ask it.
>
> Paster config files allow you to hook up WSGI applications, middleware, and a
> server, plus some (undocumented?) magic configuration of the logging module.
> But what about random components, like a database? Ideally I'd like to
> specify a factory for database connections and give it some parameters; this
> would return a reference to a new database connection. I could then pass this
> reference to my wsgi app or middleware.
>
> Apparently the pattern is to perform this database configuration as part of a
> wsgi middleware, but that feels unnatural. Or one could do this outside of
> the paste configuration file, but that just splits the configuration
> needlessly into several pieces. Am I missing something obvious?

I use YAML with custom constructors:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/loader.py

There is a middleware:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/middleware.py
that reads a list of configuration files to load from the WSGI environ (I set this in the Nginx mod_wsgi configuration), and merges all the configuration into the WSGI environ.

Using custom YAML constructors it is possible to do something like:
http://hg.mperillo.ath.cx/wsgix/examples/file/tip/dbview/settings.yml

It is also possible to configure the global Python logging, create temporary files and so on.

> Thanks,
> -- Alex

Manlio
Re: [Web-SIG] Migrating from mod_wsgi to FastCGI
Gustavo Narea wrote:
> Hello,
>
> We're considering migrating from mod_wsgi to FastCGI (Apache) because
> we'll need to use versions of Python compiled by ourselves.

Note that you can simply recompile mod_wsgi to use your custom Python.

> [...]

Manlio
[Web-SIG] host_name and request_uri_path
Hi.

Recently I have implemented these two functions:
http://paste.pocoo.org/show/170198/

I would like to know whether it is worth having them as separate functions, or whether there is a better method to get the host name and the request URI path.

About the host_name function: why is it not included in wsgiref?

Thanks
Manlio
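The paste linked in this message is not reproduced in the archive, so the following is only a guess at what two such helpers might look like, not the author's code. It assumes the usual WSGI environ keys (`HTTP_HOST`, `SERVER_NAME`, `SCRIPT_NAME`, `PATH_INFO`); the function bodies and the handling of ports and IPv6 literals are assumptions made for illustration.

```python
from urllib.parse import quote


def host_name(environ):
    """Return the host name without the port, preferring the Host header."""
    host = environ.get('HTTP_HOST')
    if host:
        if host.startswith('['):
            # IPv6 literal such as "[::1]:8080": keep the bracketed part.
            end = host.find(']')
            if end != -1:
                return host[:end + 1]
        # Strip an optional ":port" suffix.
        return host.split(':', 1)[0]
    return environ['SERVER_NAME']


def request_uri_path(environ):
    """Return the quoted path of the request URI, without the query string."""
    path = environ.get('SCRIPT_NAME', '') + environ.get('PATH_INFO', '')
    return quote(path or '/')


environ = {
    'HTTP_HOST': 'example.org:8080',
    'SERVER_NAME': 'localhost',
    'SCRIPT_NAME': '/app',
    'PATH_INFO': '/a b',
}
name = host_name(environ)
path = request_uri_path(environ)
```

For comparison, `wsgiref.util.request_uri` returns the full URI including scheme and host, which is more than a dispatcher usually needs when it only wants the path.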
Re: [Web-SIG] CGI WSGI and Unicode
Graham Dumpleton wrote:

Note: I'm sending the entire message to the mailing list.

> 2009/12/7 Manlio Perillo:
>> Hi.
>>
>> I'm playing with Python 3.x, current revision.
>>
>> I have noted that the data in os.environ are now Unicode strings.
>>
>> In a CGI application, HTTP headers are Unicode strings, and are decoded
>> using system default encoding.
>> In a future WSGI application, HTTP headers are Unicode strings, and are
>> decoded using latin-1 encoding.
>>
>> In both cases, 'surrogateescape' is used.
>
> No, 'surrogateescape' is not necessary when using latin-1, or at least
> for variables which use latin-1.

The problem is that not all browsers use latin-1; HTTP Digest authentication is one example.

> Use of 'surrogateescape' is only relevant in the context of some web
> servers and only relevant for specific variables, some of which aren't
> even part of set of variables which are required by WSGI.
>
> For example, in Apache/mod_wsgi, 'surrogateescape' is used on
> DOCUMENT_ROOT and SCRIPT_FILENAME.

What about HTTP_COOKIE?

> [...]
>> Can this cause troubles and incompatibility problems?
>> I'm interested in special header handling, like cookies, that contain
>> opaque data.
>
> The issues which CGI/WSGI bridge in Python 3.X has been discussed
> previously on the list.

It seems I missed it.

> It is acknowledged that there are problems to
> be solved there, at least to extent that CGI/WSGI bridge
> implementation has to correct the encoding, and also that that may
> only be solvable in Python 3.1 onwards due to not having access to
> what encoding was used for environment variables in Python 3.0. Not
> many people care about CGI these days and so no one has bothered to
> come up with working CGI/WSGI bridge for Python 3.X.

CGI is very important; there are some kinds of web applications that have problems when executing in a long-running process.

As an example, I prefer to run Trac and Mercurial instances as CGI.

> Graham

Regards
Manlio
[Web-SIG] CGI WSGI and Unicode
Hi.

I'm playing with Python 3.x, current revision.

I have noted that the data in os.environ are now Unicode strings.

In a CGI application, HTTP headers are Unicode strings, and are decoded using the system default encoding.
In a future WSGI application, HTTP headers are Unicode strings, and are decoded using the latin-1 encoding.

In both cases, 'surrogateescape' is used.

Can this cause trouble and incompatibility problems?
I'm interested in special header handling, like cookies, that contain opaque data.

Thanks
Manlio
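The difference between the two decoding strategies mentioned here can be checked with a few lines of Python. This is a small sketch, not taken from the thread; the sample byte strings are made up for illustration. It shows that latin-1 round-trips any byte sequence without an error handler, while UTF-8 needs 'surrogateescape' (PEP 383) to carry bytes that are not valid UTF-8.

```python
# Bytes as a UTF-8 client might send them in a header value.
raw = 'à è€'.encode('utf-8')

# latin-1 maps every byte 0x00-0xFF to a code point, so decoding never
# fails and encoding back is lossless; no error handler is needed.
as_latin1 = raw.decode('latin-1')
latin1_roundtrip = as_latin1.encode('latin-1')

# A lone 0x80 byte (the cp1252 Euro sign) is not valid UTF-8;
# surrogateescape carries it through as a lone surrogate code point
# and restores the original byte when encoding.
odd = b'user=\x80'
as_utf8 = odd.decode('utf-8', 'surrogateescape')
utf8_roundtrip = as_utf8.encode('utf-8', 'surrogateescape')
```

In both cases the original bytes are recoverable; what differs is how readable the intermediate string is and which codec the application has to use to get the bytes back.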
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
Henry Precheur wrote:
> On Fri, Dec 04, 2009 at 07:40:55PM +0100, Manlio Perillo wrote:
>> What are the functions that do not work with byte strings?
>
> Just to make things clear, I was talking about Python 3.

I know. Unfortunately I don't have Python 3 installed; I'm just reading the code.

> All the functions I tried not ending with _from_bytes raise an exception
> with bytes. This includes urllib.parse.parse_qs & urllib.parse.urlparse
> which are rather critical ...

Ah, ok. Can you show me the traceback of parse_qs? Thanks.

>> First of all, HTTP never says that whole headers are of type TEXT.
>> Only specific components are of type TEXT.
>
> If parts of a header contain latin-1 characters, that means its
> encoding is latin-1 (at least partially).

This is not completely true.

> [...]
> And WSGI is not about HTTP in a distant future, it's about HTTP right
> now.
>
>> Do you really want to define the new WSGI specification to be "against"
>> the new (possible) HTTP spec?
>
> I don't know why it would be "against" it.

Well, I have quoted it for this reason. What I mean is that, IMHO:

- Using Unicode strings in WSGI is an abuse of Unicode strings
- This abuse is not justified by the HTTP spec

> [...]

Regards
Manlio
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
Henry Precheur wrote:
> On Fri, Dec 04, 2009 at 10:17:09AM +0100, Manlio Perillo wrote:
>> It is just as simple as using byte strings, IMHO.
>
> No, it's not. There were lots of discussions regarding this on the
> mailing list. One of the main issue is that the standard library
> supports bytes poorly. urllib for example expects strings not bytes.

I read last month's discussions 3 days ago!

The quote function supports byte strings, as an example. What are the functions that do not work with byte strings?

>>> * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
>>> says. WSGI is about HTTP, but that doesn't necessarily includes all
>>> other standards extending HTTP.
>>>
>> HTTP never says to consider whole headers as latin-1 text, IMHO.
>
> It does:
>
> When no explicit charset parameter is provided by the sender, media
> subtypes of the "text" type are defined to have a default charset value
> of "ISO-8859-1" when received via HTTP.
>
> http://tools.ietf.org/html/rfc2616#section-3.7.1

This is not correct. First of all, HTTP never says that whole headers are of type TEXT. Only specific components are of type TEXT.

Moreover, HTTPbis has finally clarified this; TEXT is no longer used, and non-ASCII characters are instead to be considered opaque.

Do you really want to define the new WSGI specification to be "against" the new (possible) HTTP spec?

Of course it will work; but since some code in the standard library needs to be fixed anyway (wsgiref.util.application_uri, as an example), maybe it is better to fix it to work with byte strings.

Just my two cents.

> [...]

Regards
Manlio
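The claim that `quote` accepts byte strings can be tried directly. The sketch below uses only `urllib.parse` functions that exist in current Python 3 releases; it says nothing about the state of Python 3.1 at the time of this thread, and the sample path is made up.

```python
from urllib.parse import quote, quote_from_bytes, unquote_to_bytes

# A path encoded as latin-1, which is not valid UTF-8.
path = b'/caf\xe9'

# quote() accepts bytes directly and percent-encodes them as-is.
quoted = quote(path)

# quote_from_bytes() is the bytes-only variant and gives the same result.
same = quote_from_bytes(path)

# The original bytes can be recovered without choosing a text encoding.
restored = unquote_to_bytes(quoted)
```

No character encoding is involved at any step, which is the property a byte-oriented WSGI would rely on.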
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
And Clover wrote:
> Manlio Perillo wrote:
>
>> Words of *TEXT MAY contain characters from character sets other than
>> ISO-8859-1 [22] only when encoded according to the rules of RFC 2047
>
> Yeah, this is, unfortunately, a lie. The rules of RFC 2047 apply only to
> RFC*822-family 'atoms' and not elsewhere; indeed, RFC2047 itself
> specifically denies that an encoded-word can go in a quoted-string.
>
> RFC2047 encoded-words are not on-topic in an HTTP header(*); this has
> been confirmed by newer development work on HTTPbis by Reschke et al.
> (http://tools.ietf.org/wg/httpbis/).

Thanks. HTTPbis seems to fix all these problems:

"Historically, HTTP has allowed field content with text in the ISO-8859-1 [ISO-8859-1] character encoding and supported other character sets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII character encoding [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII characters. Recipients SHOULD treat other (obs-text) octets in field content as opaque data."

This is the new rule for `quoted-string`:

    quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
    qdtext         = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
                   ; OWS / <VCHAR except DQUOTE and "\"> / obs-text
    obs-text       = %x80-FF
    quoted-pair    = "\" ( WSP / VCHAR / obs-text )

> The "correct" way of escaping header parameters in an RFC*822-family
> protocol would be RFC2231's complex encoding scheme, but HTTP is
> explicitly not an 822-family protocol despite sharing many of the same
> constructs. See
> http://tools.ietf.org/html/draft-reschke-rfc2231-in-http-06 for a
> strategy for how 2231 should interact with HTTP, but note that for now
> RFC2231-in-HTTP simply does not exist in any deployed tools.

It seems reasonable.

> So for now there is basically nothing useful WSGI can do other than
> provide direct, byte-oriented (even if wrapped in 8859-1 unicode
> strings) access to headers.

Yes, this is what I think.

I have some doubts about wrapping the headers in 8859-1 unicode strings, but luckily there is surrogateescape.

Regards
Manlio
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
Henry Precheur wrote:
> On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
>> There is something that I don't understand.
>>
>> Some HTTP headers, like Accept-Language, contain data described as
>> `token`, where:
>>
>> token = 1*<any CHAR except CTLs or separators>
>>
>> So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded.
>> In Python 3.x it SHOULD be a byte string.
>
> I think this is more an issue that frameworks should deal with. By
> decoding every headers value to latin-1:
>
> * It keeps WSGI simple. Simple is good.

It is just as simple as using byte strings, IMHO. It is not simpler; it is convenient because of (if I understand correctly) how code is converted by 2to3.

> * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
> says. WSGI is about HTTP, but that doesn't necessarily includes all
> other standards extending HTTP.

HTTP never says to consider whole headers as latin-1 text, IMHO.

> * It's possible to convert latin-1 strings to bytes without losing data.

Yes, but it is quite wasteful to first convert to Unicode and then convert back to a byte string.

It is true, however, that this does not happen often, but only for:

- WSGI applications that implement an HTTP proxy
- WSGI applications that need to support HTTP Digest Authentication
- WSGI applications that store encoded data in cookies

Regards
Manlio
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
And Clover wrote:
> Manlio Perillo wrote:
>
>> However what about URI (that is, for PATH_INFO and the like)?
>> For URI (if I remember correctly) the suggested encoding is UTF-8, so
>> URLS should be decoded using
>
>> url.decode('utf-8', 'surrogateescape')
>
>> Is this correct?
>
> The currently-discussed proposal is ISO-8859-1, allowing the real bytes
> to be trivially extracted. This is consistent with the other headers and
> would be my preferred approach.

There is something that I don't understand.

Some HTTP headers, like Accept-Language, contain data described as `token`, where:

    token = 1*<any CHAR except CTLs or separators>

So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded. In Python 3.x it SHOULD be a byte string.

Text content is described as `TEXT`, where:

    The TEXT rule is only used for descriptive field contents and values
    that are not intended to be interpreted by the message parser. Words
    of *TEXT MAY contain characters from character sets other than
    ISO-8859-1 [22] only when encoded according to the rules of
    RFC 2047 [14].

    TEXT = <any OCTET except CTLs, but including LWS>

The only type of data where TEXT can be used is `quoted-string`. A `quoted-string` only appears in well specified portions of a header.

So, IMHO, it is *not* correct for a WSGI middleware to return all HTTP headers as Unicode strings. This is up to the application/framework, which must parse each header, split it into components and handle them as appropriate (as a byte string, a Unicode string or an instance of some other data type).

> [...]

Regards
Manlio
Re: [Web-SIG] HTTP headers encoding
Henry Precheur wrote:
> [...]
>> How is authorization username handled in common WSGI frameworks?
>
> As far as I know, they don't handle this. They just return the string
> without dealing with the encoding issues.
>
> I think there is no correct way of handling this, because 99% of
> username/password contain only ascii characters. A possible 'workaround'
> would be to limit yourself to the ascii charset. If you get a non-ascii
> character raise an Exception.

Right now I'm doing a:

    username.decode('us-ascii', 'replace')

Regards
Manlio
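The two options discussed here, replacing non-ASCII bytes or rejecting them, look like this in Python 3, where the raw username would be a bytes object. This is a sketch with made-up input; the variable names are assumptions, not code from any framework.

```python
# Raw username bytes as received; 0xC3 0xA0 is "à" encoded in UTF-8.
raw_username = b'\xc3\xa0 user'

# The approach from the message: every non-ASCII byte becomes U+FFFD.
replaced = raw_username.decode('us-ascii', 'replace')

# The stricter workaround suggested in the quoted text: reject the input.
try:
    raw_username.decode('us-ascii')
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

With 'replace' the original bytes cannot be recovered afterwards, so it only suits cases where a non-ASCII username is going to fail authentication anyway.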
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
And Clover wrote:
> [...]
>> Cookie data SHOULD be transparent to the server/gateway; however WSGI is
>> going to assume that data is encoded in latin-1.
>
> Yeah. This is no big deal because non-ASCII characters in cookies are
> already broken everywhere(*). Given this and other limitations on what
> characters can go in cookies, they are habitually encoded using ad-hoc
> mechanisms handled by the application (typically a round of URL-encoding).
>
> *: in particular:
>
> - Opera and Chrome send non-ASCII cookie characters in UTF-8.
> - IE encodes using the system codepage (which can never be UTF-8),
>   mangling any characters that don't fit in the codepage through the
>   traditional Windows 'similar replacement character' scheme.
> - Mozilla uses the low byte of each UTF-16 code point (so ISO-8859-1
>   gets through but everything else is mangled)
> - Safari refuses to send any cookie containing non-ASCII characters.

Thanks for this summary. I think it should go in a wiki or in a separate document (like a rationale) accompanying the WSGI spec.

However, this should never happen with cookies, since cookie data is opaque to the browser, which MUST send it "as is". What you describe happens with other headers containing TEXT.

And now I understand the strange behaviour of Firefox with non-latin-1 strings in the username, in HTTP Basic Authentication.

> [...]

Regards
Manlio
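The browser behaviours listed above can be reproduced for the Euro sign (U+20AC) with a few lines of Python. This is a small worked example, not part of the original thread; it simply computes the three byte sequences the summary describes.

```python
euro = '\u20ac'

# Opera/Chrome behaviour: UTF-8 encoding of the character.
as_utf8 = euro.encode('utf-8')

# IE behaviour on a Western European system: the cp1252 code page.
as_cp1252 = euro.encode('cp1252')

# Mozilla behaviour: keep only the low byte of the UTF-16 code unit.
low_byte = bytes([ord(euro) & 0xFF])

# ISO-8859-1 itself has no Euro sign, so a strict encode fails.
try:
    euro.encode('latin-1')
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```

The low byte of U+20AC is 0xAC, which happens to coincide with the last byte of the UTF-8 encoding of the same character.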
Re: [Web-SIG] HTTP headers encoding
Manlio Perillo wrote:
> Hi.
>
> I'm doing some tests to try to understand how HTTP headers are encoded
> by browsers.
>
> I have written a simple WSGI application that asks authentication
> credentials and then print them on the terminal and return the data as
> response, as raw bytes
> http://paste.pocoo.org/show/154633/

I'm now testing using HTTP Digest Authentication. The application is here:
http://paste.pocoo.org/show/154667/

It uses my wsgix framework
http://hg.mperillo.ath.cx/wsgix/
since I don't want to rewrite the entire Digest Authentication handling.

As user name I use the string "à è€".

The results are:

- Firefox does not send any request, and instead shows me the returned response body "Authentication required". This is quite strange.
- Internet Explorer 6 encodes the username using cp1252, as always.
- Opera (10.01) encodes the username using utf-8.

I can not test with Konqueror, since the wsgiref server has problems with it.

All these implementations are against the HTTP spec. username is a quoted string, and so it SHOULD be encoded using the default latin-1, or using another charset, in which case it should be formatted as specified by MIME (unfortunately there are no examples in the HTTP spec).

This is really a mess.

How is the authorization username handled in common WSGI frameworks?

Thanks
Manlio
[Web-SIG] HTTP headers encoding
Hi.

I'm doing some tests to try to understand how HTTP headers are encoded by browsers.

I have written a simple WSGI application that asks for authentication credentials, prints them on the terminal and returns the data in the response, as raw bytes:
http://paste.pocoo.org/show/154633/

Then I used some browsers to try to send a username with non-ASCII characters.

When I try with simple characters in the iso-8859-1 charset, things work well; the data is encoded using this charset. However, when I try to use a character outside it, like the Euro sign, there are problems.

Firefox (Iceweasel 3.0.14, Linux Debian Squeeze) sends me a '\xac'. I don't know where \xac comes from, but it is the last byte of the utf-8 encoded Euro sign: '\xe2\x82\xac'.

Internet Explorer 6.0 sends me a '\x80', and this is the Euro character encoded using cp1252 (and I suspect that it always uses this encoding, instead of iso-8859-1).

Unfortunately I can not test with IE 7 and 8.

With a browser working on a terminal, like lynx, things get worse. If I enter as user name the string "à è", lynx sends me '\xc3\xa0\xc3\xa8'. This happens in a GNOME terminal, with an it_IT.utf8 locale. wget and curl do the same.

Can someone else reproduce this?

Thanks
Manlio
Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec
James Y Knight wrote:
> I move to bless mod_wsgi's definition of WSGI 1.1 [1]
> [...]
>
> [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X

Hi. Just a few questions.

It is true that HTTP headers can be decoded assuming latin-1, and that they can be decoded using PEP 383.

However, what about URIs (that is, PATH_INFO and the like)? For URIs (if I remember correctly) the suggested encoding is UTF-8, so URLs should be decoded using

    url.decode('utf-8', 'surrogateescape')

Is this correct?

Now another question. Let's consider the `wsgiref.util.application_uri` function:

    def application_uri(environ):
        url = environ['wsgi.url_scheme']+'://'
        from urllib.parse import quote

        if environ.get('HTTP_HOST'):
            url += environ['HTTP_HOST']
        else:
            url += environ['SERVER_NAME']

            if environ['wsgi.url_scheme'] == 'https':
                if environ['SERVER_PORT'] != '443':
                    url += ':' + environ['SERVER_PORT']
            else:
                if environ['SERVER_PORT'] != '80':
                    url += ':' + environ['SERVER_PORT']

        url += quote(environ.get('SCRIPT_NAME') or '/')
        return url

There is a potential problem here with the quote function. This function does the following:

    def quote(string, safe='/', encoding=None, errors=None):
        if isinstance(string, str):
            if encoding is None:
                encoding = 'utf-8'
            if errors is None:
                errors = 'strict'
            string = string.encode(encoding, errors)

This means that if we use surrogateescape, the information about the original bytes is lost here. This can be easily fixed by changing the application_uri function, but this also means that a WSGI application will not work with Python 3.1.x.

Finally, a question about cookies. Cookie data SHOULD be transparent to the server/gateway; however WSGI is going to assume that data is encoded in latin-1. I don't know what the HTTP/Cookie spec says about this.

However, from a WSGI application point of view, the cookie data can, as an example, contain some text encoded in UTF-8; this means that the application must first encode the data:

    cookie_bytes = cookie.encode('latin-1', 'surrogateescape')

and then decode it using UTF-8:

    my_cookie_data = cookie_bytes.decode('utf-8')

This is a bit unreasonable, but I don't know if this is a common practice (I do this, just to make an example).

Manlio Perillo
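Both points raised in this message can be exercised with current Python 3. The sketch below is an illustration with made-up sample data, not code from the thread. It shows what `quote` does with a surrogate-escaped path under its default `errors='strict'`, how passing `errors='surrogateescape'` preserves the original bytes, and the latin-1 then UTF-8 two-step for cookie data. Note that on current releases the strict default raises an exception rather than silently dropping the byte.

```python
from urllib.parse import quote

# A path whose raw bytes are not valid UTF-8, decoded the PEP 383 way.
path = b'/caf\xe9'.decode('utf-8', 'surrogateescape')

# With the default errors='strict', the lone surrogate cannot be encoded.
try:
    quote(path)
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Passing the same error handler restores the original byte before quoting.
quoted = quote(path, errors='surrogateescape')

# Cookie case: UTF-8 bytes on the wire, exposed by the server as a
# latin-1 decoded str, then re-decoded as UTF-8 by the application.
wire = 'name=caffè'.encode('utf-8')
cookie = wire.decode('latin-1')
cookie_bytes = cookie.encode('latin-1')
my_cookie_data = cookie_bytes.decode('utf-8')
```

The cookie two-step works only because latin-1 decoding is lossless; the same trick would fail if the server had decoded the header with a codec that rejects or rewrites some bytes.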
Re: [Web-SIG] Closing long-running WSGI requests (possible?)
Chimezie Ogbuji wrote:
> Hello. I have a problem with a WSGI-based SPARQL server that I have been
> unable to resolve for some time. I was told this is the best place to ask
> :). I'm building a SPARQL [1] server that is deployed as WSGI/Paste
> server. SPARQL queries are handled by the server and evaluated against a
> MySQL database using mysql-python/MySQLdb to manage the connection.
>
> My goal is to be able to allow clients to close the connection in order to
> kill queries that have been dispatched (in order to 'abort' them).
> Unfortunately, when the client kills the connection, the application is not
> signaled in any way. So, the result is that (for long-running queries), the
> MySQL query continues to run even after the connection is closed (by
> clicking cancel in the browser for instance).
>
> [...]

What you want to do is not possible.

A more viable solution is to use JavaScript. Add a custom "abort button" on the web page, with a function associated with its "click" event. You should also associate a function with the "unload" event (where you can check if there are active queries).

In the JavaScript function you can issue an XMLHttpRequest, using a unique identifier.

Note that if you use PostgreSQL, you can use:
http://www.postgresql.org/docs/8.3/interactive/protocol-flow.html#AEN73870

When you create a connection to PostgreSQL, the server will send you the backend process ID and a unique key. You can use this data to send a cancellation request. All you need to do is to pass the process ID and the unique key to the client (with some encryption, so that the client can use the data only once).

Unfortunately, libpq does not offer a flexible interface to this feature. The PGcancel structure is opaque, so you need some hacking.

Manlio Perillo
Re: [Web-SIG] HTML parsing - get text position and font size
Girish Redekar wrote:
> I'm trying to build a search engine in Python and am stuck at the place where I parse HTML to get useful text. One should ideally be able to parse the text (out of HTML tags) along with its position (for phrase searches) and font-size (to weigh words appropriately).

Word weighting should be based on semantics, not style. However, if you really need it, there is the cssutils package for CSS parsing. I'm writing a CSS parser, too: http://hg.mperillo.ath.cx/pdfimg/file/tip/pdfimg/style/css/ using PLY, so it should be easy to read/modify. It is still at a very early stage.

> [...]

Regards
Manlio Perillo
[Web-SIG] setup_testing_defaults and SERVER_PROTOCOL
Wouldn't it be more appropriate for the wsgiref.util.setup_testing_defaults function to set SERVER_PROTOCOL to HTTP/1.1 instead of HTTP/1.0, since HTTP/1.1 is the current version of the protocol?

Thanks
Manlio Perillo
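For reference, a minimal sketch of the behaviour being discussed, as it stands in the standard library's wsgiref: the helper only fills in keys that are missing, so a test that wants HTTP/1.1 can set SERVER_PROTOCOL before calling it. The variable names are only illustrative.

```python
from wsgiref.util import setup_testing_defaults

# Default behaviour: the helper fills in HTTP/1.0.
environ = {}
setup_testing_defaults(environ)
default_protocol = environ["SERVER_PROTOCOL"]

# Keys that are already present are left alone, so a test can opt in to HTTP/1.1.
environ11 = {"SERVER_PROTOCOL": "HTTP/1.1"}
setup_testing_defaults(environ11)
overridden_protocol = environ11["SERVER_PROTOCOL"]

print(default_protocol, overridden_protocol)
```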
[Web-SIG] logging support in a multiprocess web server
Hi. I have noticed that some WSGI-based web applications use the standard logging module. However, I have some doubts about how this works when the application is embedded in a web server that uses multiple processes (like Nginx, or Apache with prefork).

Thanks
Manlio Perillo
Re: [Web-SIG] handling URLs with ending slash
Thomas Broyer wrote:
> On Sun, Dec 14, 2008 at 11:23 AM, Manlio Perillo wrote:
>> In my WSGI applications I always have an ending slash to the URLs. This means that an URL without the ending slash will cause the underlying resource to return 404 Not Found HTTP response. What is the best method to handle this, using a regex based URL dispatcher?
>
> I would add some kind of "catch-all entry" to dispatch to a "trailing slash redirector" WSGI app:
> routes.add("[^/]$", force_trailing_slash)

I'm not sure I like this.

> or eventually add a WSGI middleware to each mapped application

The URL dispatcher is a WSGI middleware, so it is OK for me to do this in the URL dispatcher. I would like to keep the number of middlewares to a minimum (function calls in Python are not cheap).

> (...that need such a treatment, could be all of them) that would issue a redirect to the "slash-appended" URL when needed, or just pass through to the application otherwise:
> routes.add(, force_trailing_slash(my_application))

Yes, that's an idea. Note that this is a special case of a "URL normalizer" middleware. A middleware can be used as a function decorator, which is a point in favour of using a dedicated middleware for this.

Thanks
Manlio Perillo
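To make the middleware/decorator idea concrete, here is a minimal sketch of what a force_trailing_slash wrapper might look like. Only the name comes from the thread; the body, the choice of 301 and the use of wsgiref.util.request_uri to rebuild the URL are assumptions, not code from either poster.

```python
from wsgiref.util import request_uri


def force_trailing_slash(app):
    """Wrap a WSGI app: redirect slash-less paths, pass everything else through."""
    def wrapper(environ, start_response):
        if environ.get("PATH_INFO", "").endswith("/"):
            return app(environ, start_response)
        # Rebuild the requested URL without the query string, add the
        # slash if it is missing, then put the query string back.
        location = request_uri(environ, include_query=False)
        if not location.endswith("/"):
            location += "/"
        query = environ.get("QUERY_STRING")
        if query:
            location += "?" + query
        start_response("301 Moved Permanently",
                       [("Location", location), ("Content-Length", "0")])
        return [b""]
    return wrapper
```

Because the wrapper takes and returns a WSGI callable, it can be applied either in the dispatcher table or as a decorator (`@force_trailing_slash`) on individual applications.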
Re: [Web-SIG] handling URLs with ending slash
Randy Syring wrote:
> Manilo,

Manlio, not Manilo, please!

> Here is a thread on this topic, well a partial thread, start reading about half way down: http://groups.google.com/group/pylons-discuss/browse_thread/thread/6888b790239b488b I found it informative.

Thanks, it is interesting.

Regards
Manlio Perillo
[Web-SIG] handling URLs with ending slash
Hi. In my WSGI applications every URL ends with a slash. This means that a URL without the ending slash will cause the underlying resource to return a 404 Not Found HTTP response. What is the best method to handle this, using a regex based URL dispatcher? I'm planning to add an option to my URL dispatcher to force any URL to have an ending slash (for example by issuing an HTTP redirect, either 302 or 301, or by just internally modifying the URL), but I'm not sure this is the best solution.

Thanks
Manlio Perillo
Re: [Web-SIG] wsgiref.validate allows wsgi.input.read() with no argument
Graham Dumpleton wrote:
> Just noticed that although WSGI PEP doesn't specifically mention that argument to read() on wsgi.input is optional, wsgiref.validate allows calling read() with no argument.

wsgiref.validate also makes other assumptions about a WSGI application that are not required by the WSGI PEP. As an example, it reports as an error the presence of HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH in the environ dictionary, but the PEP says nothing about this, and CGI [1] says:

"""The server may exclude any headers which it has already processed, such as Authorization, Content-type, and Content-length. If necessary, the server may choose to exclude any or all of these headers if including them would exceed any system environment limits."""

[1] http://hoohoo.ncsa.uiuc.edu/cgi/env.html

P.S.: the link "http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt" is broken.

[...]

Regards
Manlio Perillo
Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification
Ian Bicking wrote:
> [...]
>> Fine for me, but of course we need to do this as:
>> 1) Errata to WSGI 1.0 or
>> 2) WSGI 1.1 or
>> 3) WSGI 2.0
>> You can't just modify the current WSGI 1.0 spec. I'm for 2), with the other clarifications about WSGI we have discussed in the past.
>
> I'm for 1. What other clarifications were you thinking of?

Here is a list of messages I have posted in the past.

- start_response and error checking, 25 September 2007
  http://mail.python.org/pipermail/web-sig/2007-September/002771.html
- hop-by-hop headers handling, 1 October 2007
  http://mail.python.org/pipermail/web-sig/2007-October/002775.html
- HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH, 12 December 2007
  http://mail.python.org/pipermail/web-sig/2007-December/003014.html
- a possible error in the WSGI spec, 20 December 2007
  http://mail.python.org/pipermail/web-sig/2007-December/003064.html
- calling start_response and the write from a separate thread, 27 December 2007
  http://mail.python.org/pipermail/web-sig/2007-December/003104.html
- WSGI and PEP 325, 20 May 2008
  http://mail.python.org/pipermail/web-sig/2008-May/003438.html

I'm rather sure there were other threads about clarifications of WSGI 1.0. One of these was about whether a WSGI gateway is allowed to skip the generation of the response body (assuming the WSGI application returns a generator) when it is not required (the client's cached copy of the requested entity is up to date and the server is going to return 304 Not Modified).

Regards
Manlio Perillo
Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification
Phillip J. Eby wrote:
> At 08:49 PM 11/17/2008 +0100, Manlio Perillo wrote:
>> Ian Bicking wrote:
>>> [...] We need to propose a change to the WSGI specification. I propose, in "Input and Error Streams" (http://www.python.org/dev/peps/pep-0333/#input-and-error-streams) we change it to have "readline(hint)" and expand Note 3 to include readline as well as readlines, removing Note 2. Also I suppose some sort of change note in the specification? Does this sound like a sufficient change to the spec, and are there any objections to the change?
>>
>> Fine for me, but of course we need to do this as:
>> 1) Errata to WSGI 1.0 or
>> 2) WSGI 1.1 or
>> 3) WSGI 2.0
>> You can't just modify the current WSGI 1.0 spec. I'm for 2), with the other clarifications about WSGI we have discussed in the past.
>
> I'm more inclined towards #1.

I'm not sure, since it is an API change; of course, if there were an error in the API this should be an erratum, but there is a rationale behind the current API. I'm fine, however, with an amendment.

> [...]

Regards
Manlio Perillo
Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification
Ian Bicking wrote:
> [...] We need to propose a change to the WSGI specification. I propose, in "Input and Error Streams" (http://www.python.org/dev/peps/pep-0333/#input-and-error-streams) we change it to have "readline(hint)" and expand Note 3 to include readline as well as readlines, removing Note 2. Also I suppose some sort of change note in the specification? Does this sound like a sufficient change to the spec, and are there any objections to the change?

Fine for me, but of course we need to do this as:
1) Errata to WSGI 1.0 or
2) WSGI 1.1 or
3) WSGI 2.0
You can't just modify the current WSGI 1.0 spec. I'm for 2), with the other clarifications about WSGI we have discussed in the past.

Regards
Manlio Perillo
Re: [Web-SIG] Async API for Python
Jerry Spicklemire wrote:
> Sorry, if this turns up twice ...
> Phillip J. Eby wrote, on Tue Jul 29 03:21:18 CEST 2008:
> "There is no async API that's part of WSGI itself, and it's unlikely there will ever be one unless there ends up being an async API for Python as well."
> http://mail.python.org/pipermail/web-sig/2008-July/003547.html
> Following up, perhaps this would be of interest: "New PEP proposal: C Micro-Threading"
> "This PEP adds micro-threading (or 'green threads') at the C level so that micro-threading is built in and can be used with very little coding effort at the python level.

Personally I think that implementing a standard reactor in Python is a bad idea. Micro-threading should just offer an API, as Twisted Deferreds, generators and greenlets do; the reactor should be implemented separately.

> [...]

Manlio Perillo
[Web-SIG] a new implementation of multipart/form-data parser
Hi all. For my WSGI framework I have implemented a multipart/form-data parser: http://hg.mperillo.ath.cx/wsgix/diff/70aacc4a8301/wsgix/parse.py

The code has been adapted from cgi.parse_multipart. I think that the function is more robust than FieldStorage, since you can set a maximum size for field data stored in memory. The code is simpler, too (I have done a small review of current browser behaviour, and none of them use multipart/mixed when encoding multiple file fields with the same name).

Now I'm going to write a middleware that takes a POST request with data encoded as multipart/form-data and transcodes the request entity to application/x-www-form-urlencoded, with file fields saved as:

field_name=&field_path=&field_content_type=

where is the temporary path where the file has been stored. Note that there is an Nginx module, http://www.grid.net.ru/nginx/upload.en.html, that does this (but it doesn't transcode to application/x-www-form-urlencoded).

Is anyone interested? I really would like some reviews.

Thanks
Manlio Perillo
Re: [Web-SIG] parsing of urlencoded data and Unicode
Deron Meranda wrote:
> [...]
>> But, at this point, can one consider the content of form post to be encoded "text" string? Or it should be considered encoded "byte" string?
>
> Both/either. I'd say follow the RFC, but perhaps allow a caller to provide an override default. So yes, you should assume an encoded string if the subpart has a text/* Content-Type, or if it has no content type at all (which must then be assumed to be text/plain US-ASCII). That is the intent of the MIME text/* media type after all; that it should be interpreted as a character string and not a byte string. In other cases, I would say returning a byte string is the correct thing to do.

I'm not sure I understand. If you want non-text data in the POST request body, you can use the file control. I can't really see use cases for normal input fields holding byte strings.

> [...]

Manlio Perillo
Re: [Web-SIG] parsing of urlencoded data and Unicode
James Y Knight wrote:
> On Jul 29, 2008, at 1:14 PM, Bill Janssen wrote:
>>> Ok with theory. But in practice:
>>
>> Seems like you're looking at a broken browser there. Can anyone point to where a W3C standard or IETF RFC describes this behavior?
>
> You seem to be under the mistaken impression that form post content is MIME. It is not. It looks kinda like it should be, and maybe it's even specified to be [rfc2388], but actually treating it as MIME is a rather critical error. RFC2388 is just wrong, don't believe a thing it says.

But, at this point, can one consider the content of a form post to be an encoded "text" string? Or should it be considered an encoded "byte" string?

> [...]

Manlio Perillo
Re: [Web-SIG] parsing of urlencoded data and Unicode
Bill Janssen wrote:
>> Ok with theory. But in practice:
>
> Seems like you're looking at a broken browser there.

Right. It's Firefox. But it's the same with IE 6 and Opera.

> Can anyone point to where a W3C standard or IETF RFC describes this behavior?
>
>> I think that it is safe to decode data from the QUERY_STRING and POST data to Unicode, and to return Bad Request in case of errors.
>
> It's clearly not safe to do so generally. If you do decide to do this, please tell me what framework you're building so that I can avoid it :-).

No, wait. I don't blindly guess the encoding. I first try the Content-Type header, then the special _charset_ field, and finally utf-8. If there is a problem in the decoding, the client is broken (or there is a bug in the application). So the correct response is Bad Request, IMHO.

> Bill

Manlio Perillo
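The fallback order described above (Content-Type charset, then the _charset_ field, then utf-8, with a Bad Request on failure) could be sketched roughly as follows for an application/x-www-form-urlencoded body. This is not the wsgix code: decode_form, BadRequest and the header_charset parameter are hypothetical names, and the sketch uses the modern urllib.parse.parse_qs rather than anything available in 2008.

```python
from urllib.parse import parse_qs


class BadRequest(Exception):
    """Hypothetical exception that a framework would map to a 400 response."""


def decode_form(body, header_charset=None):
    """Decode a urlencoded body (bytes) into a dict of lists of str."""
    # First pass as latin-1: every byte survives, so _charset_ can be found
    # before the real charset is known.
    raw = parse_qs(body.decode("latin-1"), keep_blank_values=True,
                   encoding="latin-1")
    field_charset = raw.get("_charset_", [None])[0]
    charset = header_charset or field_charset or "utf-8"
    try:
        # A well-formed urlencoded body is pure ASCII; anything else is an error.
        return parse_qs(body.decode("ascii"), keep_blank_values=True,
                        encoding=charset, errors="strict")
    except (UnicodeError, LookupError) as exc:
        # Undecodable bytes or an unknown charset name: blame the client.
        raise BadRequest(str(exc))
```

With errors="strict" an undecodable value raises instead of being silently replaced, which is the difference from the "replace" handling mentioned elsewhere in the thread.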
Re: [Web-SIG] parsing of urlencoded data and Unicode
Bill Janssen wrote:
>>> That's probably wrong. We went through this recently on the python-dev list. While it's possible to tell the encoding of multipart/form-data,
>>
>> With multipart/form-data the problem should be the same. The content type is defined only for file fields.
>
> Actually, it's defined for all fields, isn't it? From RFC 2388:
>
> ``As with all multipart MIME types, each part has an optional "Content-Type", which defaults to text/plain.''
>
> So the type is "text/plain" unless it says something else. And, according to RFC 2046, the default charset for "text/plain" is "US-ASCII".

OK in theory. But in practice:

    Content-Type: multipart/form-data; boundary=abcde

    abcde
    Content-Disposition: form-data; name="Title"

    hello
    abcde
    Content-Disposition: form-data; name="body"

    àèìòù
    abcde

In theory I should assume ASCII-encoded data for the body field; and since this data cannot be decoded, I should treat it as a byte string. However, the body field is encoded in utf-8, and if I add a hidden _charset_ field, FF and IE include this field in the request, with the charset used in the encoding.

I think that it is safe to decode data from the QUERY_STRING and POST data to Unicode, and to return Bad Request in case of errors. If the user has specialized needs, he can use the low-level parsing functions. In wsgix the "high" level functions are parse_query_string and parse_simple_post_data; the "low" level function is parse_qs.

> [...]

Thanks
Manlio Perillo
Re: [Web-SIG] parsing of urlencoded data and Unicode
Bill Janssen wrote:
>> In wsgix I use utf-8 for decoding the QUERY_STRING, and the charset specified in the POST'ed data (utf-8 or the charset found in the special _charset_ field).
>
> That's probably wrong. We went through this recently on the python-dev list. While it's possible to tell the encoding of multipart/form-data,

With multipart/form-data the problem should be the same. The content type is defined only for file fields.

> the query_string and x-www-form-urlencoded data may be in arbitrary character set encodings (see RFC 3986). It's probably best to not try to map them to strings; instead, return byte arrays for the value, and only return strings for data that can be correctly decoded. Otherwise, you lose information that the app cannot recover.

Interesting, thanks. I have read the Django code and, as far as I can tell, it always decodes data to strings, but using "replace" error handling. Can you point me to the discussion on the python-dev list?

> Bill

Manlio Perillo
Re: [Web-SIG] parsing of urlencoded data and Unicode
Ian Bicking wrote:
> Manlio Perillo wrote:
>> Hi. In my WSGI framework: http://hg.mperillo.ath.cx/wsgix I have, in the `http` module, the functions `parse_query_string` and `parse_simple_post_data`. The first parse the query string and return a dictionary of strings, the latter parse the application/x-www-form-urlencoded client body and return a dictionary of strings and the charset used by the client for the unicode encoding. Now, I'm thinking if these two function should instead return Unicode strings instead of plain strings. I think that Unicode strings should be returned, but I would like to know what other web frameworks do. Django seems to convert to Unicode, but the Python standard library does not (and I would like to know if changes are planned for Python 3.x).
>
> WebOb decodes to request data to str, then lazily decodes to unicode based on the request encoding. The request encoding is a bit fuzzy to calculate, which is part of why the decoding is lazy, so that the request encoding can be set or changed at any time.

Ok, thanks. In wsgix I use utf-8 for decoding the QUERY_STRING, and the charset specified in the POST'ed data (utf-8 or the charset found in the special _charset_ field).

Manlio Perillo
Re: [Web-SIG] Could WSGI handle Asynchronous response?
est wrote:
> I am writing a small 'comet'-like app using flup, something like this:
>
> def myapp(environ, start_response):
>     start_response('200 OK', [('Content-Type', 'text/plain')])
>     return ['Flup works!\n']    <-Could this be part of response output?

What do you mean by "part of response output"?

> Could I time.sleep() for a while then write other outputs?

Not with flup.

> if __name__ == '__main__':
>     from flup.server.fcgi import WSGIServer
>     WSGIServer(myapp, multiplexed=True, bindAddress=('0.0.0.0', )).run()
>
> So is WSGI really synchronous?

Not really. Since you can return a generator, it's possible to support asynchronous programming, but the WSGI gateway must support it, as for example Nginx mod_wsgi and some other implementations do (search the mailing list archive). But this support has not been standardized.

> How can I handle asynchronous outputs with flup/WSGI ?

Regards
Manlio Perillo
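As an illustration of the "return a generator" point, here is a minimal sketch of an application that produces its body in pieces. It is plain WSGI (written for Python 3, so the chunks are bytes); whether the pause between chunks blocks a worker or is handled cooperatively depends entirely on the gateway, which is exactly the part that is not standardized.

```python
import time


def myapp(environ, start_response):
    """WSGI app that yields its body in pieces instead of returning one list."""
    start_response("200 OK", [("Content-Type", "text/plain")])

    def body():
        yield b"first chunk\n"
        # On a synchronous gateway this sleep simply blocks the worker
        # that is serving the request.
        time.sleep(0.01)
        yield b"second chunk\n"

    return body()
```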
[Web-SIG] parsing of urlencoded data and Unicode
Hi. In my WSGI framework: http://hg.mperillo.ath.cx/wsgix I have, in the `http` module, the functions `parse_query_string` and `parse_simple_post_data`. The first parses the query string and returns a dictionary of strings; the second parses the application/x-www-form-urlencoded client body and returns a dictionary of strings plus the charset used by the client for the Unicode encoding.

Now I'm wondering whether these two functions should return Unicode strings instead of plain strings. I think that Unicode strings should be returned, but I would like to know what other web frameworks do. Django seems to convert to Unicode, but the Python standard library does not (and I would like to know if changes are planned for Python 3.x).

Thanks
Manlio Perillo
[Web-SIG] problem with wsgiref.util.request_uri and decoded uri
I'm having a nightmare with encoded/decoded URIs and the request_uri function:

    >>> from wsgiref.util import request_uri
    >>> environ = {
    ...     'HTTP_HOST': 'www.test.org',
    ...     'SCRIPT_NAME': '',
    ...     'PATH_INFO': '/b%40x/',
    ...     'wsgi.url_scheme': 'http'
    ... }
    >>> print request_uri(environ)
    http://www.test.org/b%2540x/

Here I'm assuming that the WSGI gateway does *not* decode the URI. The result of request_uri is incorrect in this case. On the other hand, if the WSGI gateway *does* decode the URI, I can no longer handle '/' in the URI. I can usually avoid having '/' in a URI, but right now I'm implementing a WSGI application that implements a RESTful interface to an SQL database: http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/sqltables.py so I cannot avoid fields with a '/' character in them.

The solution proposed in a previous thread, http://mail.python.org/pipermail/web-sig/2008-January/003122.html, is to implement a custom encoding scheme (as done in MoinMoin). Are there really no other good solutions? Assuming that WSGI requires the URI not to be encoded, is the solution then to modify the request_uri function, replacing:

    quote(SCRIPT_NAME)

with:

    quote(unquote(SCRIPT_NAME))

? Where can I find information about alternate encoding schemes?

Thanks
Manlio Perillo
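The two cases can be reproduced side by side with the Python 3 wsgiref. This is a small sketch of the behaviour described above, not a fix; the variable names are only illustrative.

```python
from urllib.parse import unquote
from wsgiref.util import request_uri

base = {"HTTP_HOST": "www.test.org", "SCRIPT_NAME": "", "wsgi.url_scheme": "http"}

# Gateway passed the path through still percent-encoded: the '%' is quoted again.
encoded = dict(base, PATH_INFO="/b%40x/")
double_quoted = request_uri(encoded)

# Gateway (or the application) decoded the path first: quoting happens once.
decoded = dict(base, PATH_INFO=unquote("/b%40x/"))
single_quoted = request_uri(decoded)

print(double_quoted)  # http://www.test.org/b%2540x/
print(single_quoted)  # http://www.test.org/b%40x/
```

Note that decoding first does not help with an encoded '/' (%2F): once decoded it is indistinguishable from a real path separator, which is the problem the message goes on to describe.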
Re: [Web-SIG] Fwd: wsgiref.simple_server slow on slow network
Tibor Arpas wrote:
> Hi, I'm quite new to python and I ran into a performance problem with wsgiref.simple_server. I'm running this little program.
>
> from wsgiref import simple_server
>
> def app(environ, start_response):
>     start_response('200 OK', [('content-type', 'text/html')])
>     return ['*'*5]
>
> httpd = simple_server.make_server('',8080,app)
> try:
>     httpd.serve_forever()
> except KeyboardInterrupt:
>     pass
>
> I get many hundreds of responses/second on my local computer, which is fine. But when I access this server through our VPN it performs very bad.

wsgiref is an iterative server, if I'm not wrong; it serves only one request at a time. On the loopback interface this is not a problem, but over the Internet the latency of the connection makes the time for a single request high. paste.httpserver uses a thread pool.

> [...]

Manlio Perillo
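If the one-request-at-a-time behaviour is indeed the cause, one possible workaround that stays within the standard library is to mix socketserver.ThreadingMixIn into the wsgiref server. This is only a sketch, written for Python 3: it uses one thread per connection rather than the thread pool that paste.httpserver provides, and ThreadingWSGIServer and build_server are names made up for this example.

```python
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIServer, make_server


class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    """WSGIServer that handles each connection in its own thread."""
    daemon_threads = True  # do not keep the process alive for open connections


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"*" * 5]


def build_server(host="", port=8080):
    # make_server accepts an alternative server class via server_class.
    return make_server(host, port, app, server_class=ThreadingWSGIServer)

# Usage: build_server().serve_forever()
```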
Re: [Web-SIG] Alternative to threading.local, based on the stack
Donovan Preston wrote:
> On Jul 8, 2008, at 11:45 AM, Manlio Perillo wrote:
>> Using greenlets, there is always a current greenlet, so you can use this for local storage. A library function can check if there is an active greenlet, and use it as data key; otherwise it will use the current thread id.
>
> Yes, this is exactly what I did in the wrap_threading_local_with_coro_local here: http://donovanpreston.com:/eventlet/file/b6f9627e88df/eventlet/util.py

Ok.

>> However this will not work if you have an asynchronous server that does not make use of greenlets.
>
> Exactly, which is why I am proposing just standardizing something that does exactly what people use threading.local for, but whose implementation is pluggable by the wsgi server.

But this will not be easy to implement, especially if it should go in a separate module. Maybe it's better to have something like:

    wsgiorg.local_scope

a function that returns the current request id. The function itself is not bound to the current request, so it can be safely stored. Maybe this would be easier to implement, I'm not sure.

Manlio Perillo
Re: [Web-SIG] Alternative to threading.local, based on the stack
Donovan Preston wrote:
> On Jul 7, 2008, at 6:11 PM, Phillip J. Eby wrote:
>> At 02:12 PM 7/7/2008 -0700, Donovan Preston wrote:
>>> It seems to me that what is really needed here is an extension of wsgi that specifies how to get, set, and list request local storage, and for people to use that instead of the threadlocal module.
>>
>> I don't follow why you wouldn't just put that in the environ. (If you need it to be carried back from the application, use mutable objects in the environ.)
>
> Yes, the logical place to store it is in the environ, but this whole thread is about having an api for doing request-local storage that doesn't involve passing the request everywhere. Here's what I am imagining: There's just a module, called requestlocal or something. It has an API just like threading.local(), except the implementation can be changed by the wsgi server.

Using greenlets, there is always a current greenlet, so you can use this for local storage. A library function can check if there is an active greenlet, and use it as data key; otherwise it will use the current thread id. However this will not work if you have an asynchronous server that does not make use of greenlets.

> [...]

Manlio Perillo
Re: [Web-SIG] help with the implementation of a WSGI middleware
Phillip J. Eby wrote:
> At 11:21 PM 7/7/2008 +0200, Manlio Perillo wrote:
>> So this is not a "bad" middleware, IMHO.
>
> True, but it's part of the application, rather than being transparent.

Ok, I agree. Does this mean that such non-transparent middlewares must not be inserted in the "gateway middleware stack", even if this is done only as a convenience (so that you don't have to use a decorator for every function)?

>> By the way, a middleware that is responsible for user authentication: http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/auth/http_middleware.py is a good middleware? To keep it simple, the middleware check if there is an authorization header and the credentials are correct. If this is true, execute the WSGI application (setting environ['REMOTE_USER']), otherwise return a forbidden response.
>
> Right - that's transparent middleware: the application doesn't need to know it's there.

I think that it's rather subtle. If you remove the middleware, the application is no longer able to handle authenticated users. This is not a problem, since the application is still able to work correctly, but the same applies to my messages middleware, IMHO.

>>> Under WSGI 2.0, it's even easier since you don't need decorators to manipulate your response: you can just "return someapi(...)" where the "..." is whatever you were going to return directly.
>>
>> return someapi() from inside the WSGI application?
>
> Yes.

Do you have a working example? Also, can you post an example of a middleware that needs to replace the environ dictionary?

Thanks
Manlio Perillo
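For readers who have not seen the linked file, the authentication middleware described above could look roughly like this. It is not the wsgix implementation: http_basic_auth and check_credentials are hypothetical names, HTTP Basic is assumed as the scheme, and the 403 status follows the "forbidden response" wording (a real deployment would more commonly answer 401 with a WWW-Authenticate header).

```python
import base64
import binascii


def http_basic_auth(app, check_credentials):
    """Run `app` only when the request carries valid Basic credentials."""
    def middleware(environ, start_response):
        header = environ.get("HTTP_AUTHORIZATION", "")
        scheme, _, token = header.partition(" ")
        user = None
        if scheme.lower() == "basic":
            try:
                decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
            except (binascii.Error, UnicodeDecodeError):
                decoded = ""
            name, sep, password = decoded.partition(":")
            if sep and check_credentials(name, password):
                user = name
        if user is None:
            body = b"Forbidden\n"
            start_response("403 Forbidden",
                           [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(body)))])
            return [body]
        # Transparent to the application: it only ever sees REMOTE_USER.
        environ["REMOTE_USER"] = user
        return app(environ, start_response)
    return middleware
```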
Re: [Web-SIG] Alternative to threading.local, based on the stack
Donovan Preston wrote:
> [...] It seems to me that what is really needed here is an extension of wsgi that specifies how to get, set, and list request local storage, and for people to use that instead of the threadlocal module.

There seems to be something that I don't understand: why not just store the values inside the WSGI environ dictionary? It is a per-request dictionary, so it is really what you want.

> [...]

Manlio Perillo
Re: [Web-SIG] help with the implementation of a WSGI middleware
Phillip J. Eby wrote:
> At 09:58 PM 7/7/2008 +0200, Manlio Perillo wrote:
>> In this case the first solution is to use this middleware as a decorator, instead of a full middleware.
>
> This is the correct way to implement non-transparent middleware; i.e., so-called middleware which is in fact an application API. See: http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html for more about this. Basically, if a piece of middleware has to be there for the application to run, it's not really "middleware"; it's a misnamed decorator.

Right, this is what I thought (and yes, I have read your article). However, as a "justification" I used the following argument: OK, the application does not "fully" work without the middleware, but it "mainly" works, and it's not a big problem if messages are not actually sent to the client.

Fortunately, in wsgix a "middleware" is very easy to use both in a full middleware stack and as a decorator (since all the state is maintained in the environ dictionary and there is no need for factory functions). In Nginx you can do, in the server config:

    wsgi_middleware wsgix.contrib.messages;

However, I want to document that this is not a "good" middleware. "Non-transparent middleware" is a good term, thanks.

> In the original WSGI spec, I overestimated the usefulness of adding extension APIs to the environ... or more likely, I went along with some of Ian's overenthusiasm for the idea. ;-) Extension APIs in the environ just mean you have to write your code to handle the case where the API isn't there -- in which case you might as well have used a library. Extension APIs really only make sense if they are true *server* features, not application features; otherwise, you are better off using a library rather than "middleware" per se.

Yes. However, my messages middleware does not "inject" an API into the WSGI environment. The API uses the environ to store state; the middleware is only required to "activate" the cookies to actually send messages to the client. So this is not a "bad" middleware, IMHO.

By the way, is a middleware that is responsible for user authentication: http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/auth/http_middleware.py a good middleware? To keep it simple, the middleware checks whether there is an authorization header and the credentials are correct. If so, it executes the WSGI application (setting environ['REMOTE_USER']); otherwise it returns a forbidden response.

> Under WSGI 2.0, it's even easier since you don't need decorators to manipulate your response: you can just "return someapi(...)" where the "..." is whatever you were going to return directly.

return someapi() from inside the WSGI application?

Thanks
Manlio Perillo
[Web-SIG] help with the implementation of a WSGI middleware
As I have informally written in previous messages, I'm writing a small WSGI framework. The framework is available here (a Mercurial repository):
http://hg.mperillo.ath.cx/wsgix

In wsgix I have written two middlewares that I find interesting, since writing them taught me a bit more about how to write middleware (and about Eby's concerns with WSGI 1.0). One of these is wsgix.contrib.messages:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/messages.py

The purpose of this middleware is to support sending messages to a client. The idea originates from Django; however, in wsgix I use cookies (since I don't think it is a good idea to use a database for this) and messages can be sent to every user (Django sends messages only to authenticated users, if I'm correct).

The wsgix support for messages consists of two parts. The first is the implementation of a simple API for sending and retrieving messages (only Unicode strings are supported):

    message_push(environ, message)
    message_pop(environ)  # this returns and removes the messages

These functions do not actually manage cookies: the messages are stored in environ['wsgix.messages'], as a list. The second part is the implementation of a middleware that takes care of cookie handling.

The problem is that, if I have understood correctly, a middleware is allowed to entirely replace the environ dictionary. This means that if such a middleware is present before the messages middleware is called, messages are not sent to the client. Is this true?

In this case the first solution is to use this middleware as a decorator, instead of a full middleware. The other solution is to implement an additional interface:

    message_push(environ, start_response, headers, message)

that explicitly handles the cookie (this is possible but harder to implement and less flexible to use).

Any suggestions?
Thanks

Manlio Perillo
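To make the two parts concrete, here is a minimal sketch of the message API together with the decorator form of the cookie handling. It is an assumption-laden re-creation, not the code in wsgix.contrib.messages: the with_messages name, the cookie name "messages" and the newline-joined, URL-quoted cookie value are all hypothetical, and it targets Python 3.

```python
from urllib.parse import quote


def message_push(environ, message):
    """Queue a message; state lives only in the environ dictionary."""
    environ.setdefault('wsgix.messages', []).append(message)


def message_pop(environ):
    """Return the queued messages and remove them from the environ."""
    return environ.pop('wsgix.messages', [])


def with_messages(app):
    """Decorator form: wraps the app directly, so both share one environ."""
    def wrapper(environ, start_response):
        def start(status, headers, exc_info=None):
            pending = message_pop(environ)
            if pending:
                # hypothetical cookie encoding, for illustration only
                value = quote('\n'.join(pending))
                headers = headers + [('Set-Cookie', 'messages=' + value)]
            return start_response(status, headers, exc_info)
        return app(environ, start)
    return wrapper
```

Because the decorator is applied to the application itself, no other middleware can sit between the two and swap the environ. Note that in this sketch only messages pushed before start_response is called end up in the cookie.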
Re: [Web-SIG] Alternative to threading.local, based on the stack
Ian Bicking wrote:
> Manlio Perillo wrote:
>> [...]
>> As an example, in Paste you have chosen to use a config dictionary for middleware configuration, that is, you have middleware factories.
>
> I think this is a red herring. WebOb specifically doesn't do anything related to configuration or the setup of the stack. What it does do is stuff like:
>
>     expires = http.format_time(0)
>     http.generate_cookie(
>         environ, headers, name, '', expires=expires,
>         domain=cookie_domain(environ), path=path, max_age=0)
>
> which would be resp.delete_cookie(name) (well, cookie_domain seems to be derived from a setting, but that's mostly unrelated). This isn't a particularly substantial difference, but these small conveniences add up.

As I have said, this is a matter of personal taste: I don't like the "architecture" used by WebOb and prefer to use the environ dictionary directly, without introducing other abstractions. This is possible; I'm writing a "not simple" application using wsgix.

I'm still evaluating whether I can reuse WebOb's parsing functions (and this would be a great thing: I think that we *really* need a package with *only* low-level parsing functions for the HTTP protocol). From what I can see, WebOb does *not* offer a low-level interface for the parsers: you *have* to use the Request object. I really like multilevel architectures instead.

Manlio Perillo
Re: [Web-SIG] Alternative to threading.local, based on the stack
Ian Bicking wrote:
> Manlio Perillo wrote:

I'm adding web-sig in Cc.

>> [...]
>> I'm developing a WSGI framework with all these (and other) ideas:
>> http://hg.mperillo.ath.cx/wsgix
>> It's still not documented, so I have not yet made an official announcement. The main design goal is to keep the level of the interface as low level as possible. I don't like additional interfaces (like Request and Response objects) around the WSGI dictionary, and I don't like frameworks like Django that completely hide the WSGI interface.
>
> Have you tried webob? My first run as Paste avoided wrappers around those objects, but an object interface has been very helpful.

I have not tried it, but I have read the code (as I have read the code of Paste). In principle I'm against using additional interfaces, and one of the reasons I wrote wsgix is to have a proof of concept, to understand whether it is feasible to write a WSGI application using an alternative framework.

wsgix (+ mod_wsgi for Nginx) has the same role as Paste, but I have decided to use a rather different approach. As an example, in Paste you have chosen to use a config dictionary for middleware configuration, that is, you have middleware factories. In wsgix it is very different. As an example:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/messages.py
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/error_page.py

There are no factories. The configuration is read (and globally cached) at request time from the environ dictionary. With Nginx, configuration parameters can be defined in the server configuration.

There is a helper class:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/options.py
that helps with the parsing. There is also a middleware:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/middleware.py
that reads the configuration from a YAML file and merges it into the environ dictionary.

Of course it's all a matter of personal taste :).

The goal is to be able to write "truly" reusable middlewares that are easy to "plug" into any WSGI server (almost all of the configuration parameters have default values).

Manlio Perillo
Re: [Web-SIG] Alternative to threading.local, based on the stack
Matt Goodall wrote:
> [...]
>> True, but even passing a request or env dict around to everyone gets tedious don't you think?
>
> Yes, it can be tedious but I believe explicit arg passing is necessary to make code readable, testable and reusable. If it's web-related code then give it the request, it will almost certainly need it. Otherwise, don't.
>
> I would even advocate extracting request-scope objects, e.g. a database connection, the current user, etc, as early as possible and passing them around explicitly (along with the request, if necessary).

This is exactly what I too have realized! I'm developing a WSGI framework with all these (and other) ideas:
http://hg.mperillo.ath.cx/wsgix

It's still not documented, so I have not yet made an official announcement. The main design goal is to keep the level of the interface as low level as possible. I don't like additional interfaces (like Request and Response objects) around the WSGI dictionary, and I don't like frameworks like Django that completely hide the WSGI interface.

> [...]

Manlio Perillo
Re: [Web-SIG] Alternative to threading.local, based on the stack
Iwan Vosloo wrote:
> On Fri, 2008-07-04 at 13:42 +0200, Manlio Perillo wrote:
>> Iwan Vosloo wrote:
>>> Hi,
>>>
>>> Many web frameworks and ORM tools have the need to propagate data depending on some or other context within which a request is dealt with. Passing it all via parameters to every nook of your code is cumbersome.
>>
>> The natural solution with WSGI is to store objects in the environ dictionary. In fact in my web applications I always pass the environ dictionary explicitly to every function.
>
> But, this passing of the environ dictionary to every function in your web app is exactly what I'd want to avoid?

Yes, but you only need to pass the environ dictionary and not N parameters. I think this is a good compromise. Using thread local storage is not the solution to every problem (as you have noted, it cannot be used when the server handles more than one request per thread).

> -i

Manlio Perillo
Re: [Web-SIG] Alternative to threading.local, based on the stack
Iwan Vosloo wrote:
> Hi,
>
> Many web frameworks and ORM tools have the need to propagate data depending on some or other context within which a request is dealt with. Passing it all via parameters to every nook of your code is cumbersome.
>
> A lot of the frameworks use a thread local context to solve this problem. I'm assuming these are based on threading.local. (See, for example: http://www.sqlalchemy.org/docs/05/session.html#unitofwork_contextual )
>
> Such usage assumes that one request is served per thread. This is not necessarily the case. (Twisted would perhaps be an example, but I have not checked how the twisted people deal with the issue.)

The natural solution with WSGI is to store objects in the environ dictionary. In fact, in my web applications I always pass the environ dictionary explicitly to every function.

> [...]

Manlio Perillo
Re: [Web-SIG] Proposed specification: waiting for file descriptor events
Christopher Stawarz wrote:
> On May 21, 2008, at 1:34 PM, Manlio Perillo wrote:
>>> Instead, the spec recommends that async servers pre-read the request body before invoking the app (either by default or as a configurable option).
>>
>> This is the best solution most of the time (but not all of the time), especially if the "server" can do some "pre-parsing" of a multipart/form-data request body. In fact I plan to write a custom function (in C for Nginx) that will "reduce", as an example:
>>
>>     Content-Type: multipart/form-data; boundary=AaB03x
>>
>>     --AaB03x
>>     Content-Disposition: form-data; name="submit-name"
>>
>>     Larry
>>     --AaB03x
>>     Content-Disposition: form-data; name="files"; filename="file1.txt"
>>     Content-Type: text/plain
>>
>>     ... contents of file1.txt ...
>>     --AaB03x--
>>
>> to (not properly escaped):
>>
>>     Content-Type: application/x-www-form-urlencoded
>>
>>     submit-name=Larry&files.filename=file1.txt&files.ctype=text/plain&files.path=xxx
>>
>> and the contents of file1.txt will be saved to a temporary file 'xxx'.
>
> It seems like you're making this more complicated than it needs to be. Why not just store the entire request body in a temporary file, and then pass an open handle to it as wsgi.input?

Because if you have a big file (like a video of > 100 MB), your application will block everything while parsing the request body. Parsing the body incrementally is far more efficient (although it is harder).

> That way, the server doesn't have to rewrite the request, and the application doesn't need to know how to interpret the files.* parameters.

How to interpret the files.* parameters is not really a problem.

>> 1) Why not add a more generic poll like interface?
>
> Because such an interface would be more complicated than what I've proposed and harder for server authors to implement. Also, I'm not sure that it gains you much.

Well, I have modelled my extension so that it has a "well known" interface and is not hard to implement. But I have to say that I'm not sure whether one wants to poll multiple sockets. Moreover, in my implementation ngx.poll only returns one "ready" socket at a time.

By the way: I see a problem with your API. What happens if an application does:

    read, write, exc = m.fdset()
    environ['x-wsgiorg.fdevent.readable'](read[0], 1.0)
    environ['x-wsgiorg.fdevent.writable'](write[0], 1.0)
    yield ''

There is no way to know, when the application is resumed, whether the socket is ready for read or write. This probably should not be a problem, but I'm not sure.

> Note that I'm not 100% sure on this, as I tried to indicate in the "Open Issues" section of my proposal. The approach I'd like to take is to try writing apps with my interface for a while, and if real-world usage shows that a poll-like interface would be very useful (or necessary), then the spec could be extended to add one. I think this is a safe route, since the readable/writable functions could easily be implemented in terms of a more generic poll-like interface, so existing apps that use the fdevent extensions would continue to work.
>
>> Moreover IMHO storing a timeout variable in the environ to check if the previous call timed out is not the best solution.
>
> I think it's a simple and effective solution. Server authors don't need to implement any new functions or data types. They just create and hold on to a mutable object instance (the simplest being a list instance) for each app instance and toggle its truth value as required.
>
>> In my implementation I return a function, but with generators in Python 2.5 this can be done in a better way.
>
> What advantage does this have over what I've proposed?

You don't need to store a mutable variable in the environ.

>> 2) In Nginx it is not possible to simply handle "plain" file descriptors, since these are wrapped in a connection structure. This is the reason why I had to add a connection_wrapper function in my WSGI module for Nginx.
>
> But the connection structure just wraps an integer file descriptor, right? So the readable/writable functions can create the required wrapper to register with nginx. There's no reason to make the application author do it.

The "problem" is that Nginx keeps a list of preallocated connection objects (the size of the list being controlled by worker_connections). This means that a newly constructed connection *must* be freed as soon as it is no longer used, otherwise it can limit the number of concurrent connections that can be handled by Nginx.

Since with my API (register/unregister) a connection should be kept alive until it is unregistered, I have chosen to create a wrapper for the Nginx connection object. Probably with your API it can be possible to c
Re: [Web-SIG] WSGI and greenlets
Christopher Stawarz wrote:
> On May 7, 2008, at 4:44 AM, Manlio Perillo wrote:
>> [...]
>> I don't think this will solve the problem. Moreover in your example you buffer the whole request body so that you have to yield only one time.
>
> Your example was:
>
>     def application(environ, start_response):
>         def nested():
>             while True:
>                 poll(xxx)
>                 yield ''
>             yield result
>
>         for r in nested():
>             if not r:
>                 yield ''
>             yield r
>
> My suggestion would allow you to rewrite this like so:
>
>     @awsgiref.callstack.add_callstack
>     def application(environ, start_response):
>         def nested():
>             while True:
>                 poll(xxx)
>                 yield ''
>             yield result
>
>         yield nested()
>
> The nesting can be arbitrarily deep, so nested() could yield doubly_nested() and so on. While not as elegant as greenlets, I think this does address your concern.

I'm reading PEP 342, and I still think that this will not work as I want for Nginx (where I have no control over the "scheduler"). In fact PEP 342 says:

"""However, if it were possible to pass values or exceptions *into* a generator at the point where it was suspended, a simple co-routine scheduler or "trampoline function" would let coroutines "call" each other without blocking."""

However, writing a co-routine scheduler or "trampoline function" when your application is embedded in an external server is not possible (but please, correct me if I'm wrong).

> [...]

Regards

Manlio Perillo
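For readers unfamiliar with the term, the kind of "call stack" trampoline discussed above can be sketched as follows. This is a hypothetical, simplified illustration written for Python 3; it is not the awsgiref.callstack code, and it ignores return values and exception propagation. It only shows how a wrapper generator can flatten nested generators into the single stream that a gateway iterates over.

```python
import types


def trampoline(gen):
    """Run nested generators, flattening their output into one stream."""
    stack = [gen]
    while stack:
        try:
            item = stack[-1].send(None)
        except StopIteration:
            stack.pop()          # child finished: resume its caller
            continue
        if isinstance(item, types.GeneratorType):
            stack.append(item)   # "call" the child generator
        else:
            yield item           # plain value: hand it to the gateway


def nested():
    yield ''                     # pretend we are waiting for I/O
    yield 'result'


def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield 'head'
    yield nested()               # no explicit "for r in nested()" loop
```

A gateway would iterate over trampoline(application(environ, start_response)) instead of the application's own iterable. Whether such a wrapper can cooperate with a scheduler that lives inside an external server such as Nginx is exactly the open question raised in the message.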
Re: [Web-SIG] Proposed specification: waiting for file descriptor events
Christopher Stawarz wrote:
> This is the third draft of my proposed extensions for better supporting WSGI apps on asynchronous servers. The major changes since the last draft are as follows:

First of all, thanks for your effort.

> * The title and abstract now accurately reflect the scope of the proposal. In addition, the extensions are now in the namespace "x-wsgiorg.fdevent" (instead of "x-wsgiorg.async").
>
> * The proposal for an alternative, non-blocking input stream has been dropped, since I don't see a way to add one that wouldn't break middleware.

Well, IMHO the "general" solution here is to use greenlets.

> Instead, the spec recommends that async servers pre-read the request body before invoking the app (either by default or as a configurable option).

This is the best solution most of the time (but not all of the time), especially if the "server" can do some "pre-parsing" of a multipart/form-data request body. In fact I plan to write a custom function (in C for Nginx) that will "reduce", as an example:

    Content-Type: multipart/form-data; boundary=AaB03x

    --AaB03x
    Content-Disposition: form-data; name="submit-name"

    Larry
    --AaB03x
    Content-Disposition: form-data; name="files"; filename="file1.txt"
    Content-Type: text/plain

    ... contents of file1.txt ...
    --AaB03x--

to (not properly escaped):

    Content-Type: application/x-www-form-urlencoded

    submit-name=Larry&files.filename=file1.txt&files.ctype=text/plain&files.path=xxx

and the contents of file1.txt will be saved to a temporary file 'xxx'.

> Once again, I'd appreciate your comments.

I have some comments:

1) Why not add a more generic poll-like interface? Moreover, IMHO storing a timeout variable in the environ to check if the previous call timed out is not the best solution. In my implementation I return a function, but with generators in Python 2.5 this can be done in a better way.

2) In Nginx it is not possible to simply handle "plain" file descriptors, since these are wrapped in a connection structure. This is the reason why I had to add a connection_wrapper function in my WSGI module for Nginx.

3) If you read an example that implements a database connection pool:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-postgres-async.py
you can see that there is a problem. In fact the pool is not very flexible; the application cannot handle more than POOL_SIZE concurrent requests.

However, it is possible to just have a new request wait until a previous connection is free (or a timeout occurs). I have attached an example (it is not in the repository since there are some problems). The example uses a new extension:

- ctx = environ['ngx.request_context']()
- ctx.resume()

ctx.resume() "asynchronously" resumes the given request (it will be resumed as soon as control returns to Nginx, when the application yields something).

Note that the problem of resuming another request is easily solved with greenlets, without the need for new extensions (this is one of the reasons why I like greenlets).

> [...]
Regards

Manlio Perillo

from collections import deque

import psycopg2 as db


# The table and the function are created by the setup script `postgres_setup.py`
query_select = "SELECT a, b, c, d, e FROM RandomTable LIMIT 10"
query_sleep = "SELECT * FROM sleep(1)"

# These constants are defined in the WSGI environment but their value
# is known
WSGI_POLLIN = 0x01
WSGI_POLLOUT = 0x04

# Size of the connection pool
POOL_SIZE = 20

# Free connections available
free_connections = deque()

# Connections waiting for a free slot
waiting_requests = deque()

# Number of concurrent connections
connections = 0

# State to be kept between requests
request_state = {}


def get_connection(environ):
    global connections

    print 'open', connections, len(free_connections), len(waiting_requests)
    if free_connections:
        print 'reuse'
        # reuse existing connection
        dbconn, c = free_connections.pop()
    elif connections < POOL_SIZE:
        print 'new'
        # create a new connection
        dbconn = db.connect(database='test')
        curs = dbconn.cursor()
        # XXX bad API, fileno should be a property of the connection object
        fd = curs.fileno()
        c = environ['ngx.connection_wrapper'](fd)
        connections = connections + 1
    else:
        print 'wait'
        # no free slots, this request will have to wait
        ctx = environ['ngx.request_context']()
        waiting_requests.append(ctx)
        return None, None

    # XXX check me
    environ['ngx.poll_register'](c, WSGI_POLLIN)
[Web-SIG] WSGI and PEP 325
The WSGI PEP explicitly mentions PEP 325 (for the close method of the application iterable). Maybe this should be updated for the next WSGI spec, since Python 2.5 implements PEP 342?

Regards

Manlio Perillo
Re: [Web-SIG] Proposed WSGI extensions for asynchronous servers
James Y Knight wrote:
> On May 11, 2008, at 6:15 PM, Christopher Stawarz wrote:
>> Abstract
>>
>> This specification defines a set of extensions that allow WSGI applications to run effectively on asynchronous (aka event driven) servers.
>>
>> Rationale
>>
>> The architecture of an asynchronous server requires all I/O operations, including both interprocess and network communication, to be non-blocking. For a WSGI-compliant server, this requirement extends to all applications run on the server. However, the WSGI specification does not provide sufficient facilities for an application to ensure that its I/O is non-blocking. Specifically, there are two issues:
>>
>> * The methods provided by the input stream (``environ['wsgi.input']``) follow the semantics of the corresponding methods of the ``file`` class.
>>
>> * WSGI does not provide the application with a mechanism to test arbitrary file descriptors (such as those belonging to sockets or pipes opened by the application) for I/O readiness.
>
> There are other issues. How do you do a DNS lookup? How do you get process completion notification? Heck, how do you run a process? Once you have I/O readiness information, what do you do with that? I guess you'd need to write a whole new asynchronous server framework on top of AWSGI? I can't see being able to use it "raw" for any real applications.

This is not a problem with AWSGI. As an example, there are libraries like PostgreSQL and curl that can be used with an external event loop. In the WSGI implementation for Nginx I can provide an interface for using the builtin support for an asynchronous DNS client.

>> The first argument, ``fd``, is either an integer representing a file descriptor or an object with a ``fileno`` method that returns such an integer. (In addition, ``fd`` may be ``x-wsgiorg.async.input``, even if it lacks a ``fileno`` method.) The second, optional argument, ``timeout``, is either ``None`` or a floating-point value in seconds. If omitted, it defaults to ``None``.
>
> What if the event-loop of the server doesn't use integer fds, but windows file handles or a java channel object? Where are you allowed to get these integers from? Is it always a socket from socket.socket().fileno()? Or can it be a file from open().fileno() or os.open()? A pipe from os.pipe()? Note that these distinctions are important everywhere but UNIX.

This has the same problems that we have with wsgi.file_wrapper. This is the reason, among other things, why the API in my implementation uses ngx.connection_wrapper and ngx.poll_register.

> [...]

Manlio Perillo
Re: [Web-SIG] Proposed WSGI extensions for asynchronous servers
Phillip J. Eby wrote:
> [...]
>> If ``timeout`` seconds elapse without the file descriptor becoming ready for I/O, the variable ``x-wsgiorg.async.timeout`` will be true when the application resumes. Otherwise, it will be false. The value of ``x-wsgiorg.async.timeout`` when the application is first started or after it yields each response-body string is undefined.
>
> Er, I think you are confused here. There is no way for the server to know what environ dictionary the application is using, unless you explicitly pass it into your extension API.

Interesting, this is something I have never considered. In my implementation ngx.poll returns a function, so there should be no problems.

Regards

Manlio Perillo
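Eby's objection can be reproduced with plain dictionaries. The sketch below is a Python 3 illustration under stated assumptions: copying_middleware and the 'demo.timeout_flag' key are hypothetical names, and the "server" is just module-level code. It shows that a plain value set in the server's dictionary never reaches an application that was handed a copy, while a shared mutable object (the list-based approach mentioned in the later draft of the proposal) survives a shallow copy.

```python
def copying_middleware(app):
    """Middleware that gives the application its own copy of the environ."""
    def middleware(environ, start_response):
        return app(dict(environ), start_response)   # shallow copy
    return middleware


seen = {}


def app(environ, start_response):
    seen['environ'] = environ     # remember which dict the app really got
    return []


server_environ = {
    'x-wsgiorg.async.timeout': False,   # plain value, rebound by the server
    'demo.timeout_flag': [],            # hypothetical key: empty list is false
}
copying_middleware(app)(server_environ, None)

# Later the server records a timeout in *its* dictionary and objects.
server_environ['x-wsgiorg.async.timeout'] = True
server_environ['demo.timeout_flag'].append(True)

app_environ = seen['environ']
stale = app_environ['x-wsgiorg.async.timeout']     # still the old value
shared = bool(app_environ['demo.timeout_flag'])    # the list is shared
```

A middleware that builds a completely new environ without carrying the shared object over would defeat the second approach as well, which is one argument for returning the result from the extension call itself.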
Re: [Web-SIG] Proposal for asynchronous WSGI variant
Christopher Stawarz wrote:
> On May 7, 2008, at 4:20 AM, Graham Dumpleton wrote:
>> 2008/5/7 Manlio Perillo <[EMAIL PROTECTED]>:
>>> With your solution it seems that writing middlewares will not become easier.
>>
>> Part of what I was trying to say was that this needn't be exposed to middlewares, unless it has to be. It was effectively a lower level of interaction which a middleware immediately on top of the WSGI adapter would use to hook into the async type model, but then present it to higher levels as more traditional WSGI interface.
>
> That would be a really elegant solution, except, as you say:
>
>> That layer would though obviously use something like greenlets to bridge the two.
>
> The problem being that greenlets aren't part of the Python language. They're an extension that works by doing clever stuff with the C stack. And as much as we might wish that Python supported them natively (which I do, since they're a really nice alternative to OS threads), it doesn't, so I don't think they can play any role in a WSGI-ASYNC spec.

This is not fully true; after all, WSGI explicitly exposes the concept of processes and threads (via the relevant variables in the WSGI environ and some hints in the specification), and these are not really part of the Python language.

> Chris

Manlio Perillo
Re: [Web-SIG] WSGI and greenlets
Manlio Perillo wrote:
> [...]
> The main problem I see with greenlet is that it is not yet stable (there are some problems with the garbage collector) and that it is not part of CPython. This means that it may not be acceptable to write a PEP for a WSGI-like interface with coroutine support.

Maybe a solution can be to add a new variable to the WSGI environ: wsgi.microthreads.

When it is true, it means that the WSGI implementation will execute the application inside a micro thread (be it Stackless, greenlet, or a PyPy coroutine).

Also note that when using coroutines there will be no problems with WSGI 2.0. However, I still think that we should release a WSGI 1.1, since many applications still use and will continue to use WSGI 1.x, and a gateway will have to support WSGI 1.x in order to support both WSGI 1.x and 2.x.

Regards

Manlio Perillo
[Web-SIG] WSGI and greenlets
Christopher Stawarz wrote:
> On May 6, 2008, at 6:17 AM, Manlio Perillo wrote:
>> I'm glad to know that there are some other people interested in asynchronous applications. Have you seen my extensions to WSGI in my module for Nginx?
>
> Yes, I have, and I had your module in mind as a potential provider of the AWSGI interface.
>
>> Note that in Nginx the request body is pre-read before the application is called (in fact wsgi.input is either a cStringIO or File object).
>
> Although I didn't state it explicitly in my spec, my intention is for the server to be able to implement awsgi.input in any way it likes, as long as it provides a recv() method. It's totally acceptable for the request body to be pre-read.

Ok. But what I meant was that, since Nginx pre-reads the request body, I have not tried to implement an interface for dealing with an asynchronous wsgi.input ;-). Moreover, I don't see any reason to have a recv method instead of read.

>> Unfortunately there is a *big* usability problem: the extension is based on a well specified feature of WSGI: the gateway can suspend the execution of the WSGI application when it yields. However if the asynchronous code is present in a "child" function, we have something like this:
>> ...
>> That is, all the functions in the "chain" have to yield, and this is not very good.
>
> Yes, you're right. However, if you're willing/able to use Python 2.5, you can use the new features of generators to implement a call stack that lets you call child functions and receive return values and exceptions from them. I've implemented this in awsgiref.callstack. Have a look at
> http://pseudogreen.org/bzr/awsgiref/examples/echo_request_with_callstack.py
> for an example of how it works.

I don't think this will solve the problem. Moreover, in your example you buffer the whole request body so that you have to yield only one time.

>> The solution is to use coroutines, and I'm planning to integrate greenlets (from the pylib project) into the WSGI module for Nginx.
>
> Interesting, but it's not clear to me how/if this would work. Can you explain more or point me to some code?

http://codespeak.net/py/dist/greenlet.html

    def process_commands(*args):
        while True:
            line = ''
            while not line.endswith('\n'):
                line += read_next_char()
            if line == 'quit\n':
                print "are you sure?"
                if read_next_char() != 'y':
                    continue    # ignore the command
            process_command(line)

With greenlets the execution can be suspended by any of the functions called by the main greenlet. This has a lot of advantages.

You can implement wsgi.input.read(n) so that it will suspend the execution of the current greenlet until *all* the n bytes have been read. You can also implement the write callable so that control is returned to the main greenlet when the socket is ready to send more data. And, of course, you can implement a poll-like interface and a sleep-like interface.

I think that it is a great advantage; moreover, it is the only way to implement truly reusable components.

Note that there is an effort to integrate greenlets with Twisted:
http://radix.twistedmatrix.com/2008/03/corotwine-01.html

The "problem" is that once you add support for greenlets, you no longer have WSGI. The interface can be the same, and applications can work on it without problems, but the semantics are *completely* different.

Also note that with greenlets it should be possible to "magically" transform blocking applications like Django into non-blocking ones.

The main problem I see with greenlet is that it is not yet stable (there are some problems with the garbage collector) and that it is not part of CPython. This means that it may not be acceptable to write a PEP for a WSGI-like interface with coroutine support.

> Thanks,
> Chris

Regards

Manlio Perillo
Re: [Web-SIG] Proposal for asynchronous WSGI variant
Ionel Maries Cristian wrote:
> This is a very interesting initiative. However there are a few problems:
> - there is no support for chunked input - that would require having
>   support for readline in the first place; also, it should be the
>   gateway's business to decode the chunked input.

Unfortunately Nginx does not yet support chunked input, so I can't help
here.

> - the original wsgi spec somewhat has some support for streaming and
>   asynchronicity [*1]

Right, and in fact I have used this for the implementation of some
extensions in the WSGI module for Nginx.

> - I don't see how removing the write callable will help (I don't see an
>   issue having the server providing a stringio.write as the write
>   callable for synchronous apps)

To summarize: the main problem with the write callable is that after you
call it, control is not returned to the WSGI gateway. With an
asynchronous server this is a problem, since if you write a lot of data
the server may not be able to send it all to the client immediately.

This is not a problem if the application returns a generator, since the
gateway can suspend the execution until the socket is ready to send
data. With the write callable this is not possible.

In my implementation of WSGI for Nginx I provide two separate
implementations of the write callable:
- put the socket temporarily in synchronous mode (this is WSGI compliant
  but it is very bad for Nginx)
- buffer all the written data until control is returned to the gateway
  (this is *not* WSGI compliant)

However, if you use greenlets, then implementing the write callable is
not a problem.

> - passing nonstring values through middleware will make using/porting
>   existing wsgi middleware hairy (suppose you have a middleware that
>   applies some filter to the appiter - you'll have your code full of
>   isinstance nastiness)

Yes, this should be avoided.

> Also, have you looked at the existing gateway implementations with
> asynchronous support?
> There are a bunch of them:
> http://trac.wiretooth.com/public/wiki/asycwsgi
> http://chiral.j4cbo.com/trac
> http://wiki.secondlife.com/wiki/Eventlet
> my own shot at the problem: http://code.google.com/p/cogen/
> and manlio's mod_wsgi for nginx
> (I may be missing some)
> However there is absolutely no unity in handling the wsgi.input (or
> equivalent)

The wsgi.input can be handled with ngx.poll:

    c = ngx.connection_wrapper(wsgi.input)
    ...
    ngx.poll_register(c, WSGI_POLLIN)
    ...
    ngx.poll(1000)

Unfortunately I cannot test whether this is implementable. I have some
doubts.

> [...]

Manlio Perillo
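The difference between the write callable and a returned generator, as described earlier in this message, can be made concrete with a toy gateway. This is only a sketch under stated assumptions: `run_gateway`, `generator_app` and `write_app` are hypothetical names invented for the illustration, nothing here is taken from the Nginx module, and the code is written for Python 3. It just records who is in control at each step.

```python
# Toy illustration only: run_gateway is a hypothetical stand-in for a
# WSGI gateway, not code from any real server.

def generator_app(environ, start_response):
    """Stream the body by yielding; the gateway runs between chunks."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        yield ("chunk %d\n" % i).encode("ascii")


def write_app(environ, start_response):
    """Stream the body through write(); control stays in the application."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        write(("chunk %d\n" % i).encode("ascii"))
    return []


def run_gateway(app):
    """Call app and log which side is in control at each step."""
    log = []

    def start_response(status, headers, exc_info=None):
        def write(data):
            # Called from inside the application: the gateway cannot
            # return to its event loop here, it can only block or buffer.
            log.append(("write", data))
        return write

    result = app({}, start_response)
    try:
        for chunk in result:
            log.append(("yield", chunk))
            # Between two chunks the gateway is in control and could wait
            # for the socket to become writable before resuming.
            log.append(("gateway", None))
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return log


if __name__ == "__main__":
    print(run_gateway(generator_app))
    print(run_gateway(write_app))
```

With the generator, every chunk is followed by a point where the gateway has control; with write(), all three chunks arrive before the application returns, which is why the two workarounds listed above (blocking or buffering) are the only options without coroutines.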
Re: [Web-SIG] Proposal for asynchronous WSGI variant
Graham Dumpleton wrote:
> 2008/5/7 Christopher Stawarz:
>> On May 5, 2008, at 10:08 PM, Graham Dumpleton wrote:
>>> If write() isn't to be returned by start_response(), then do away
>>> with start_response() if possible as per discussions for WSGI 2.0.
>>
>> I think start_response() is necessary, because the application may
>> need to yield for I/O readiness (e.g. to read the request body, as in
>> my example app) before it decides what response status and headers to
>> send.
>
> One could come up with other ways of doing it which align better with
> WSGI 2.0. I previously gave an idea as a starting point for discussion,
> but don't think others really understood what I was suggesting. But
> then I did post it at 4am in the middle of a baby-induced period of
> sleep deprivation. See post 24 in:
>
> http://groups.google.com/group/python-web-sig/tree/browse_frm/thread/74c1f8cf15adf114/d98086a8db568ebd?rnum=24
>
> I think what was missed by others was that I wasn't suggesting that the
> 102 code be sent all the way back to the client, but used as a
> convention between the WSGI application and the underlying WSGI adapter
> only, to facilitate the ability to return control back to the WSGI
> adapter before one had decided what actual response headers to send.
> This seems to align with what you want.

It seems a bit more complex to implement than the start_callable.

Moreover, the whole point of removing the start_callable is to simplify
the writing of middleware. With your solution it seems that writing
middleware will not become any easier.

> Graham

Manlio Perillo
Re: [Web-SIG] [proposal] wsgiref.util.abs_url
Phillip J. Eby wrote:
> At 06:27 PM 5/5/2008 +0200, Manlio Perillo wrote:
>> Phillip J. Eby wrote:
>>> I think that it doesn't accept a relative URL, it accepts an
>>> absolute path.
>>
>> What do you mean?
>>
>>     environ = {}
>>     setup_testing_defaults(environ)
>>     url = '/a/b/'
>
> That's a relative URL that's also an absolute path. Try a relative URL
> like './a/b', or just plain 'a/b'.
>
>>     self.failUnlessEqual(
>>         util.abs_url(environ, url), 'http://127.0.0.1/a/b/')
>>
>>> I also think that using urlparse.urljoin() with either request_uri()
>>> or application_uri() would be a clearer (and tested) way to obtain
>>> an absolute URL, and more generally useful.
>>
>> But application_uri also includes SCRIPT_NAME.
>
> Yes, and you might want to use it as the base against which a relative
> URL will be resolved -- i.e. an application-relative URL, vs. a
> request-relative URL. In fact, application_uri() would probably be
> *more* useful, since if you want a request-relative URL, there's no
> need to turn it into an absolute URL, since you could just use it in
> its relative form.

Yes, but this is not always the case.

> Note, however, that in either case, using a relative URL that's an
> absolute path (e.g. '/a/b') will still produce the same result as your
> function would. It's just that urljoin also works properly for all
> kinds of relative URLs, not just the absolute-path subset.

You are right, thanks.

Regards

Manlio Perillo
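The urljoin() approach discussed in this message can be tried directly with the standard library. The sketch below uses Python 3, where `urlparse.urljoin` lives in `urllib.parse`; the mount point `/app` and request path `/x/y` are made-up values for illustration. Since `application_uri()` returns the application root without a trailing slash, one is appended here (an assumption of this sketch) so that an application-relative URL resolves below the root instead of replacing its last path segment.

```python
from urllib.parse import urljoin  # urlparse.urljoin in Python 2
from wsgiref.util import application_uri, request_uri, setup_testing_defaults

# Hypothetical request: application mounted at /app, request path /x/y.
environ = {"SCRIPT_NAME": "/app", "PATH_INFO": "/x/y"}
setup_testing_defaults(environ)  # fills in host, scheme and port defaults

app_base = application_uri(environ)  # root of the application
req_base = request_uri(environ)      # full URL of the current request

# Application-relative: resolved against the application root.
# The trailing slash is added by this sketch (see the note above).
app_relative = urljoin(app_base + "/", "a/b")

# Request-relative: resolved against the current request URL.
req_relative = urljoin(req_base, "a/b")
dot_relative = urljoin(req_base, "./a/b")

# A relative URL that is an absolute path ignores the base path entirely.
absolute_path = urljoin(req_base, "/a/b")

if __name__ == "__main__":
    for name in ("app_base", "req_base", "app_relative",
                 "req_relative", "dot_relative", "absolute_path"):
        print(name, "=", globals()[name])
```

The absolute-path case gives the same answer whichever base is used, which matches the point made above: only the 'a/b' and './a/b' forms depend on the choice between application_uri() and request_uri().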
Re: [Web-SIG] Proposal for asynchronous WSGI variant
Christopher Stawarz wrote:
> (I'm new to the list, so please forgive me for making my first post a
> specification proposal :)
>
> Browsing through the list archives, I see there have been some
> inconclusive discussions on adding better support for asynchronous web
> servers to the WSGI spec. Since such support would be very useful for
> some upcoming projects of mine, I decided to take a shot at specing out
> and implementing it. I'd be grateful for any feedback you have. If
> this seems like something worth pursuing, I would also welcome
> collaborators to help develop the spec further.

I'm glad to know that there are other people interested in asynchronous
applications. Have you seen my extensions to WSGI in my module for Nginx?

The extension is documented here:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/README
(see the Extensions chapter).

For some examples:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-postgres-async.py
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-poll-sleep.py

Note that in Nginx the request body is pre-read before the application
is called (in fact wsgi.input is either a cStringIO or a File object).

Unfortunately there is a *big* usability problem: the extension is based
on a well-specified feature of WSGI: the gateway can suspend the
execution of the WSGI application when it yields.

However, if the asynchronous code is present in a "child" function, we
have something like this:

    def application(environ, start_response):
        def nested():
            while True:
                poll(xxx)
                yield ''    # give control back to the gateway
            yield result

        for r in nested():
            if not r:
                yield ''    # the caller must re-yield the empty string
            yield r

That is, all the functions in the "chain" have to yield, which is not
very good.

The solution is to use coroutines, and I'm planning to integrate
greenlets (from the pylib project) into the WSGI module for Nginx.

> [...]

Regards

Manlio Perillo
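The "every function in the chain has to yield" problem from this message can be reproduced in a small runnable form. This is a sketch, not the Nginx extension: `fake_poll` is a hypothetical stand-in for a real poll call and simply reports "ready" on its third invocation. The second application uses `yield from`, which exists only in Python 3.3 and later; it shortens the forwarding loop, but each level of the chain still has to be a generator, so it does not remove the problem that greenlets are meant to solve.

```python
def fake_poll(state):
    """Hypothetical poll: reports readiness on the third call."""
    state["calls"] += 1
    return state["calls"] >= 3


def nested(state):
    """Child function containing the asynchronous wait."""
    while not fake_poll(state):
        yield b""          # not ready yet: hand control back to the gateway
    yield b"result"        # ready: produce the real data


def application(environ, start_response):
    """Each level of the call chain must re-yield what its child yields."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for chunk in nested({"calls": 0}):
        yield chunk


def application_delegating(environ, start_response):
    """Same behaviour, with the forwarding loop written as `yield from`."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield from nested({"calls": 0})


if __name__ == "__main__":
    def ignore(*args, **kwargs):
        return None

    print(list(application({}, ignore)))
    print(list(application_delegating({}, ignore)))
```

Every empty chunk is a point where a gateway could suspend the application until the polled event fires; the price is that `application` cannot call `nested` like an ordinary function.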
Re: [Web-SIG] [proposal] wsgiref.util.abs_url
Phillip J. Eby wrote:
> At 11:03 PM 5/2/2008 +0200, Manlio Perillo wrote:
>> Hi.
>>
>> I think that a function like (not tested):
>>
>>     def abs_url(environ, relative_url):
>>         """Return the absolute url"""
>>         [...]
>>         url += quote(relative_url)
>>         return url
>>
>> would be a useful addition to the wsgiref.util module.
>>
>> What do you think?
>
> I think that it doesn't accept a relative URL, it accepts an absolute
> path.

What do you mean?

    environ = {}
    setup_testing_defaults(environ)
    url = '/a/b/'
    self.failUnlessEqual(
        util.abs_url(environ, url), 'http://127.0.0.1/a/b/')

> I also think that using urlparse.urljoin() with either request_uri() or
> application_uri() would be a clearer (and tested) way to obtain an
> absolute URL, and more generally useful.

But application_uri also includes SCRIPT_NAME.

Regards

Manlio Perillo