Re: [Web-SIG] [Python-Dev] wsgi validator with asynchronous handlers/servers

2013-03-25 Thread Manlio Perillo
On 24/03/2013 06:14, PJ Eby wrote:
> [...]
>> Thanks for response PJ,
>> that is what I, unfortunately, didn't want to hear, the validator being
>> correct for the "spec" means I can't use it for my asynchronous stuff, which
>> is a shame :-(((
>> But why commit to send headers when you may not know about your response?
>> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
>> along.
> 
> Because async was added as an afterthought to WSGI about nine years
> ago, and we didn't get it right, but it long ago was too late to do
> anything about it.  A properly async WSGI implementation will probably
> have to wait for Tulip (Guido's project to bring a standard async
> programming API to Python).

Do you really need a standard async programming API to design and
implement an async WSGI specification?

I think it is not needed.
Some time ago I posted a sample implementation and documentation for a
very simple async extension for WSGI:
https://bitbucket.org/mperillo/txwsgi

An interesting example of how an async API can be designed is
PostgreSQL's libpq, where the API exposes a direct interface to the
protocol state machine (PQconsumeInput), so you can not only use it with
any async framework you like, but you can also use it in blocking mode.

This, as far as I know, is impossible with the network protocol
implementations in Twisted or other async frameworks.
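
As a rough illustration of the libpq-style design described above, here
is a minimal sketch of a protocol state machine that performs no I/O of
its own. The names LineProtocol, feed and read_lines_blocking are
hypothetical and are not taken from libpq, Twisted or txwsgi; the
"protocol" is just CRLF-terminated lines.

```python
class LineProtocol:
    """Protocol state machine that never touches a socket itself."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Consume raw bytes and return the complete lines found so far."""
        self._buffer += data
        # Everything before the last separator is complete; the tail is kept.
        *lines, self._buffer = self._buffer.split(b"\r\n")
        return lines


def read_lines_blocking(sock, protocol):
    """Blocking driver: the caller owns the I/O and only feeds the machine."""
    while True:
        data = sock.recv(4096)
        if not data:
            return
        for line in protocol.feed(data):
            yield line


protocol = LineProtocol()
first = protocol.feed(b"HELLO\r\nWOR")   # second line is still incomplete
second = protocol.feed(b"LD\r\n")        # now it can be completed
```

An event loop could call feed() from its own read callback instead of
using read_lines_blocking(), which is the point of keeping the state
machine separate from the I/O.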



Regards   Manlio
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications

2010-09-24 Thread Manlio Perillo
On 23/09/2010 18:32, P.J. Eby wrote:
> Just a reminder: I'm planning to actually update PEP 333 over the
> weekend and start working on wsgiref updates, so if you have any
> last-minute comments on the proposal, now's the time to post them,
> however unpolished they may be!
>

Where can I find a draft of the update?



Thanks   Manlio


Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-17 Thread Manlio Perillo
Ludvig Ericson wrote:

I have put web-sig in Cc.

> On 11 apr 2010, at 22:07, Manlio Perillo wrote:
> 
>> I here propose the x-wsgiorg.suspend to be accepted as official WSGI
>> extension, using the wsgiorg namespace.
> 
> I'm sorry, but I don't see how such a solution wins out over any other stab 
> at event-based concurrency (like gevent, eventlet, etc.)
> 
> I've made a WSGI application using gevent, and then gunicorn's gevent arbiter 
> thing. Works like a charm.
> 

Because eventlet, gevent and friends only work *because* they have full
control over the event loop, and they can use greenlets as they like.

This is not possible with implementations like txwsgi (Twisted) and
ngx_http_wsgi_module (Nginx).

eventlet has support for Twisted, but, as far as I can tell, it works by
running the Twisted event loop inside a greenlet.
This is of course impossible with ngx_http_wsgi_module, since it is
embedded in a web server written in C.


> I get the point in trying to standardize something, but this solution seems 
> rather intrusive and not something I'd adopt any time soon.
> 

Can you suggest a less intrusive extension that works with *every* WSGI
implementation?

> Nice work though!
> 

Regards   Manlio


Re: [Web-SIG] Draft PEP: WSGI 1.1

2010-04-15 Thread Manlio Perillo
And Clover wrote:
> [...]
>> 8. The value passed to the 'write()' callback returned by
>>'start_response()' should be a byte string. Where native strings
>>are unicode strings, a native string type can also be supplied, in
>>which case it would be encoded as ISO-8859-1.
> 
> Weren't we going to only allow US-ASCII for the output? (These threads
> are always so far apart I can never remember what conclusion we
> reached... if any.)
> 

By the way, yesterday I wrote some tests for Python 3.x and I found a
possible problem (only indirectly related to WSGI, however).

The example consists of a simple client -> proxy -> server chain, where
the client and server run on Python 2.5 and the proxy on Python 3.2
(compiled from tip, some time ago).

Here is the proxy:
http://paste.pocoo.org/show/202212/

The application fails if a cookie contains a non-ASCII character.
The reason is that, for reasons I do not understand, http.client encodes
request headers using us-ascii instead of iso-8859-1.

The offending code is:
http://hg.python.org/cpython/file/7dcb7a2fb54d/Lib/http/client.py#l912
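
The difference between the two encodings can be reproduced without any
HTTP code at all. This is only a sketch of the encoding step, with a
made-up cookie value; it says nothing about what any particular
http.client version does.

```python
cookie = "name=caffè"  # hypothetical header value with one non-ASCII character

# ISO-8859-1 can represent every code point up to U+00FF.
latin1_bytes = cookie.encode("iso-8859-1")

# US-ASCII cannot, so encoding the same header value raises.
try:
    cookie.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```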



Regards  Manlio


Re: [Web-SIG] Draft PEP: WSGI 1.1

2010-04-15 Thread Manlio Perillo
Dirkjan Ochtman wrote:
> Mostly taking Graham's list of issues and incorporating it into PEP 333.
> 
> Latest revision: http://hg.xavamedia.nl/peps/file/tip/wsgi-1.1.txt
> 
> Let's have comments here (comments in the form of diffs are
> particularly welcome, of course). Remember, the idea is not to change
> or improve WSGI right now, but only to improve the spec, improving
> interoperability and enabling Python 3 support.
> 

> [...]

Another comment.
The run_with_cgi sample function should be changed, since it probably
does not work correctly on Python 3.x.

I'm not sure, since sys.stdout.write accepts a native string; however,
how it is encoded is platform specific (with the current text of WSGI 1.1,
however, it seems this is allowed).
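
One way to avoid the platform-specific encoding is to write bytes to a
binary stream. The sketch below assumes Python 3 and uses a hypothetical
helper, write_response, that is not part of the PEP's run_with_cgi; a
real CGI gateway would pass sys.stdout.buffer where the demo passes an
in-memory stream.

```python
import io


def write_response(out, status, headers, body_chunks):
    """Write a CGI-style response as bytes to a binary stream."""
    out.write(("Status: %s\r\n" % status).encode("iso-8859-1"))
    for name, value in headers:
        out.write(("%s: %s\r\n" % (name, value)).encode("iso-8859-1"))
    out.write(b"\r\n")
    for chunk in body_chunks:
        out.write(chunk)
    out.flush()


# In a CGI script the first argument would be sys.stdout.buffer.
demo = io.BytesIO()
write_response(demo, "200 OK", [("Content-Type", "text/plain")], [b"Hello\n"])
```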


I would like to do some tests with CGI, Python 3.2, IIS and Windows.


Regards  Manlio


Re: [Web-SIG] Draft PEP: WSGI 1.1

2010-04-15 Thread Manlio Perillo
Dirkjan Ochtman wrote:
> [...]
> --- pep-0333.txt  2010-04-15 14:46:02.0 +0200
> +++ wsgi-1.1.txt  2010-04-15 14:51:39.0 +0200
> @@ -1,114 +1,124 @@
> [...]


>  Abstract
>  
> 
> [...]
> -Thus, simplicity of implementation on *both* the server and framework
> -sides of the interface is absolutely critical to the utility of the
> -WSGI interface, and is therefore the principal criterion for any
> -design decisions.
> -
> -Note, however, that simplicity of implementation for a framework
> -author is not the same thing as ease of use for a web application
> -author.  WSGI presents an absolutely "no frills" interface to the
> -framework author, because bells and whistles like response objects and
> -cookie handling would just get in the way of existing frameworks'
> -handling of these issues.  Again, the goal of WSGI is to facilitate
> -easy interconnection of existing servers and applications or
> -frameworks, not to create a new web framework.
> -

This, and the rest of the abstract, should not entirely be removed, IMHO.

> [...]
> -
> -Finally, it should be mentioned that the current version of WSGI
> -does not prescribe any particular mechanism for "deploying" an
> -application for use with a web server or server gateway.  At the
> -present time, this is necessarily implementation-defined by the
> -server or gateway.  After a sufficient number of servers and
> -frameworks have implemented WSGI to provide field experience with
> -varying deployment requirements, it may make sense to create
> -another PEP, describing a deployment standard for WSGI servers and
> -application frameworks.

This should not be removed.

> [...]
> +
> +Differences with WSGI 1.0
> +=
> +
> +Descriptive changes
> +---
> +
> +The following changes were made to realign the spec with
> +implementations 'in the wild'.
> +

This text feels wrong to me.

> +1. The 'readline()' function of 'wsgi.input' must optionally take
> +   a size hint. This is required because many applications use
> +   cgi.FieldStorage, which uses this functionality.
> +

What values are supported for size?
Are values -1 and None supported?
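
For comparison, this is how the standard library's in-memory binary
stream treats the size argument on Python 3. It is only a reference
point; whether 'wsgi.input' has to behave the same way is exactly the
open question.

```python
import io

stream = io.BytesIO(b"first line\nsecond line\n")

limited = stream.readline(5)     # at most 5 bytes are returned
rest = stream.readline(-1)       # negative size: read up to the newline
whole = stream.readline(None)    # None is accepted too and means no limit
```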

> [...]
> +3. Any WSGI application or middleware should not itself return, or
> +   consume from a wrapped WSGI component

This is not very clear.
What is the meaning of "consume from a wrapped WSGI component"?

> , more data than specified by
> +   the Content-Length response header if defined. Middleware that
> +   does this is arguably broken and can generate incorrect data.
> +   This is just a clarification of obligations.
> +
> [...]
> +
> +String handling changes
> +---
> +
> +The following changes were made to make WSGI work on Python 3.x.
> +
> +1. The application is passed an instance of a Python dictionary
> +   containing what is referred to as the WSGI environment. All keys
> +   in this dictionary are native strings. For CGI variables, all names
> +   are going to be ISO-8859-1 

"going to be ISO-8859-1" should be expressed in more precise terms.

Moreover, you should probably first define what a "native string" is, or
you should add a note that it is defined later in the document.

> and so where native strings are
> +   unicode strings, that encoding is used for the names of CGI
> +   variables.
> +
> +2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI
> +   environment, the value of the variable should be a native string.
> +
> +3. For the CGI variables contained in the WSGI environment, the values
> +   of the variables are native strings. Where native strings are
> +   unicode strings, ISO-8859-1 encoding would be used such that the

What is the precise meaning of *would*, here?

> +   original character data is preserved and as necessary the unicode
> +   string can be converted back to bytes and thence decoded to unicode
> +   again using a different encoding.
> +
> +4. The WSGI input stream 'wsgi.input' contained in the WSGI environment
> +   and from which request content is read, should yield byte strings.
> +

"yield" should be replaced with "return".

And, again, why are you using *should*, here? Is an implementation
allowed to return a native string?

See my previous comment on "native string" regarding the use of "byte
string" here.
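
The round trip described in item 3 of the quoted draft can be sketched
as follows, assuming Python 3 and a made-up request path. ISO-8859-1
maps every byte to exactly one code point, so no data is lost on the
way.

```python
raw = "/caffè".encode("utf-8")            # bytes as sent by the client
native = raw.decode("iso-8859-1")         # what would end up in environ
recovered = native.encode("iso-8859-1")   # the original bytes, unchanged
text = recovered.decode("utf-8")          # decoded with the real encoding
```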

> [...]

> @@ -575,13 +602,14 @@
>  =  ===
>  Variable   Value
>  =  ===
> -``wsgi.version``   The tuple ``(1,0)``, representing WSGI
> +``wsgi.version``   The tuple ``(1, 0)``, representing WSGI
> version 1.0.
> 

Should be (1, 1), not (1, 0).

> [...]
> 
> -Proposed/Under Discussion
> -=
> -

I see no real reasons for removing this section.

> [...]

Moreover, should the section
"Supporting Older (<2.2) Versions of Python" be removed?

> -
>  Acknowledgements
>  =

Re: [Web-SIG] WSGI and start_response

2010-04-15 Thread Manlio Perillo
Dirkjan Ochtman wrote:
> [...]
>> Such a significant change as removing the requirement for write()
>> should also not be done within a minor version of the WSGI
>> specification because anything that works with WSGI 1.0 should still
>> work with WSGI 1.1 and vice versa. On that basis it can't really be
>> entertained until WSGI 2.0 where incompatible changes would be
>> allowed.
> 
> I think it's a good idea to consider for 2.0, certainly.
> 

Ehm, the purpose of WSGI 2.0 is precisely to remove start_response, and
the write callable with it...
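
To make the difference concrete, here is a sketch of an application
without start_response, assuming the calling convention discussed in the
WSGI 2.0 ideas (the application returns a status, a header list and a
body iterable). No WSGI 2.0 specification exists, so both names below
are hypothetical.

```python
def wsgi2_style_app(environ):
    # Hypothetical draft-style signature: no start_response, no write().
    body = b"Hello\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]


def as_wsgi1(app):
    """Adapt the tuple-returning style to the WSGI 1.0 calling convention."""
    def wrapper(environ, start_response):
        status, headers, body = app(environ)
        start_response(status, headers)
        return body
    return wrapper


captured = []
body = as_wsgi1(wsgi2_style_app)({}, lambda s, h: captured.append((s, h)))
```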


Manlio


Re: [Web-SIG] WSGI and start_response

2010-04-14 Thread Manlio Perillo
Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 14:46, Graham Dumpleton
>  wrote:
>> The last attempt was to have WSGI 1.1 as clarifications and Python 3.X.
>>
>> And when I say 'last attempt', yes there have been people who have
>> stepped up to try and get this to happen in the past. I think you
>> would be the 3rd time, excluding me in general having tried to push it
>> in the past and also given up.
>>
>> You really should perhaps look back through the archive of WEB-SIG
>> posts on Google Groups to understand the history and how this always
>> seems to just go around in circles. :-)
> 
> I've been on Web-SIG for quite a while now, exactly to keep track of
> these issues.
> 
> Since there doesn't seem to be much traction, I figured it would be
> time to just get a new PEP together. To limit the amount of work, I'd
> go in the direction of having a single WSGI 2.0 PEP incorporating your
> suggestions (maybe minus the number 3), everything required for Python
> 3 (as outlined by your wiki page).
> 

If you volunteer for this task, I have some suggestions:

* target WSGI 1.1, not WSGI 2.0
* take the original WSGI 1.0 spec text
* start to integrate all changes documented by Graham
* I would really like to have changes integrated as a series of diffs,
  using  and  HTML elements.

  Unfortunately docutils seems not to support this, but it should
  not be hard to implement. I can help.
* You should keep a separate, unofficial document, with the rationale of
  the changes.
  You can just copy the content of Graham's blog post and reformat
  it, if this is OK with Graham.
* For each of the main changes, start a thread on this mailing list
  asking for a vote.
  If, after 1 week, there is no vote against it, consider it approved.


If we are really going to approve WSGI 1.1, I have a request: remove the
``write`` callable.
Rationale:
* it was added in WSGI 1.0 only for compatibility
* new code does not use it
* this will force applications under development that still use the
  ``write`` callable to be fixed. See work on Mercurial
* it is very easy for current implementations to support both WSGI 1.0
  and WSGI 1.1
* legacy applications will continue to work
* removing the ``write`` callable will make middleware easier to
  write
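
The two styles in question look like this. The run() function is a toy
driver written for this example only, not a compliant server; it just
shows that both applications produce the same bytes.

```python
def app_with_write(environ, start_response):
    # Legacy style: push data through the callable from start_response().
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"Hello, ")
    write(b"world\n")
    return []


def app_with_iterable(environ, start_response):
    # Preferred style: the body is whatever the application yields.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Hello, "
    yield b"world\n"


def run(app):
    """Toy driver: collect everything the application sends."""
    sent = []

    def start_response(status, headers, exc_info=None):
        return sent.append  # stands in for the write() callable

    for chunk in app({}, start_response):
        sent.append(chunk)
    return b"".join(sent)
```

Middleware that wraps app_with_write has to intercept both the write()
calls and the returned iterable, which is why dropping write() makes
middleware simpler.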


Thanks  Manlio


Re: [Web-SIG] WSGI and start_response

2010-04-13 Thread Manlio Perillo
Dirkjan Ochtman wrote:
> On Tue, Apr 13, 2010 at 13:13, Graham Dumpleton
>  wrote:
>> There is no such thing as a WSGI 2.0 PEP and there is no proper
>> concensus either on what it should look like. Thus if you see anything
>> claiming to implement WSGI 2.0, then it isn't and you should only view
>> it as an experimental proposal. You are warned. :-)
> 
> Do you (or someone else) have a status on where WSGI 2 is? IIRC WSGI 1
> isn't really usable with Python 3.x, so it seems about time we get
> something going again (AIUI this is blocking Werkzeug from being
> ported to 3.x, for example).
> 

WSGI 2.0 ideas are here:
http://wsgi.org/wsgi/WSGI_2.0

But it does not have support for Python 3.x.

Some corrections to WSGI 1.0 are here:
http://wsgi.org/wsgi/Amendments_1.0


You may add support for Python 3.x to an existing WSGI 1.0 implementation,
but your implementation will end up as something that is no longer WSGI 1.0.


Manlio


Re: [Web-SIG] WSGI and start_response

2010-04-13 Thread Manlio Perillo
P.J. Eby wrote:
> At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> Suppose I have an HTML template file, and I want to use a sub request.
>>
>> ...
>> ${subrequest('/header/')}
>> ...
>>
>> The problem with this code is that, since Mako will buffer all generated
>> content, the result response body will contain incorrect data.
>>
>> It will first contain the response body generated by the sub request,
>> then the content generated from the Mako template (XXX I have not
>> checked this, but I think it is how it works).
> 
> Okay, I'm confused even more now.  It seems to me like what you've just
> described is something that's fundamentally broken, even if you're not
> using WSGI at all.
> 

If you are referring to Mako being turned into a generator, yes, this
implementation is rather obscure.

I wrote it as a proof of concept.
Before this, I wrote a more polite implementation:
http://paste.pocoo.org/show/201324/

> 
>> So, when executing a sub request, it is necessary to flush (that is,
>> send to Nginx, in my case) the content generated from the template
>> before the sub request is done.
> 
> This seems to only makes sense if you're saying that the subrequest *has
> to* send its output directly to the client, rather than to the parent
> request.  

Yes, this is how subrequests work in Nginx. And I assume the same is
true for Apache.

> If the subrequest sends its output to the parent request (as a
> sane implementation would), then there is no problem. 

You are forgetting that Nginx is not an application server.
Why should the subrequest output be returned to the parent?

This would only make it less efficient.

> Likewise, if the
> subrequest is sent to a buffer that's then inserted into the parent
> invocation.
> 
> Anything else seems utterly insane to me, unless you're basically taking
> a bunch of legacy CGI code using 'print' statements and hacking it into
> something else.  (Which is still insane, just differently. ;-) )
> 

We are talking about subrequest implementation in an efficient web server
written in C, like Nginx and Apache.

> 
>> Ah, you are right sorry.
>> But this is not required for the Mako example (I was focusing on that
>> example).
> 
> As far as I can tell, that example is horribly wrong.  ;-)
> 

I agree ;-)

> 
>> But when using the greenlet middleware, and when using the function for
>> flushing Mako buffer, some data will be yielded *before* the application
>> returns and status and headers are passed to Nginx.
> 
> And that's probably because sharing a single output channel between the
> parent and child requests is a bad idea.  ;-)
> 

No, this is not specific to subrequests.

As an example, here you can find up-to-date greenlet adapters:
http://bitbucket.org/mperillo/txwsgi/src/tip/txwsgi/greenlet.py

The ``write_adapter`` **needs** to yield some data before the WSGI
application returns, because this is how the write callable works.

The exposed ``gsuspend`` function, instead, will cause an empty string
to be yielded to the server, before the WSGI application returns.

> (Specifically, it's an increase in "temporal coupling", I believe.  I
> know it's some kind of coupling between functions that's considered bad,
> I just don't remember if that's the correct name for it.)
> 

Nginx code contains some coupling; I assume this is done because it was
designed with efficiency in mind.

> [...] 
> It's true that dropping start_response() means you can't yield empty
> strings prior to determining your headers, yes.
> 
> 
>> > - yielding is for server push or
>> > sending blocks of large files, not tiny strings.
>>
>> Again, consider the use of sub requests.
>> yielding a "not large" block is the only choice you have.
> 
> No, it isn't.  You can buffer your output and yield empty strings until
> you're ready to flush.
> 

As I wrote, this will not work if you want to use subrequest support
from Nginx.

> 
> 
>> Unless, of course, you implement sub request support in pure Python (or
>> using SSI - Server Side Include).
> 
> I don't see why it has to be "pure", actually.  It just that the
> subrequest needs to send data to the invoker rather than sending it
> straight to the client.
> 

You may say this, but it is not how subrequests are implemented in Nginx
;-).


> That's the bit that's crazy in your example -- it's not a scenario that
> WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do
> it to be a bug, not a feature.  ;-)
> 

Are you referring to the bad Mako examp

Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-13 Thread Manlio Perillo
Graham Dumpleton wrote:
> [...]
>> Just yielding an empty string does not give the server some important
>> information.
>>
>> As an example, with x-wsgiorg.suspend an application can specify a
>> timeout, which tells the server that the application must be resumed
>> before timeout milliseconds have elapsed.
>>
>> And x-wsgiorg.suspend returns a callable that, when called, tells the
>> server to poll the app again.
> 
> There are other ways of doing that, the callable doesn't need to be in
> the WSGI environment. This is because since it is single threaded, the
> WSGI server need only record in a global variable for that WSGI
> application some state about the current request. The separate
> function to note the suspension can then lookup that and does what it
> needs to. In other words, you don't need the WSGI environment to
> maintain  that relationship.
> 

This seems completely broken to me; have you looked at the txwsgi
implementation?

It is true that the WSGI server is single threaded, but there can be
multiple concurrent requests processed in this thread.

What happens if one request is being suspended and a new one is being
processed?
As far as I can tell, the new request will note the suspend flag set to
True, and will be suspended as well.
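
The concern can be shown with a deliberately naive model. It follows one
possible reading of the suggestion quoted above (a single server-wide
flag); both classes are hypothetical and neither is taken from txwsgi or
any real server.

```python
class GlobalFlagServer:
    """One suspend flag for the whole server."""

    def __init__(self):
        self.suspended = False

    def suspend(self):
        self.suspended = True

    def is_suspended(self, request_id):
        # The flag carries no information about which request set it.
        return self.suspended


class PerRequestServer:
    """One suspend flag per in-flight request."""

    def __init__(self):
        self.suspended = set()

    def suspend_for(self, request_id):
        def suspend():
            self.suspended.add(request_id)
        return suspend

    def is_suspended(self, request_id):
        return request_id in self.suspended


shared = GlobalFlagServer()
shared.suspend()                     # called while handling request "a"

separate = PerRequestServer()
separate.suspend_for("a")()          # request "a" asks to be suspended
```

With the shared flag, request "b" is also seen as suspended; with
per-request state it is not.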

> Having the timeout as argument is also questionable anyway. All you
> really need to do is to tell the WSGI server that I don't want to be
> called until I tell it otherwise. The WSGI application could itself
> handle the timeout in other ways.
> 

But I can't see the reason why this can not be done by
x-wsgiorg.suspend, since it is a very convenient interface.

> Overall one could do all of this without having to do anything in the
> WSGI environment. As PJE points out, it can be done by relying only on
> the ability to yield an empty string. Everything else can be in the
> application realm with the application normally being bound to a
> specific WSGI server/event loop implementation, thus no portability.
> 

From what I can tell, this is only possible by having a custom variable
in the WSGI environ.
But since I wrote txwsgi for precisely this reason, it should not be
hard to prove that your idea is actually possible to implement (and that
it does not make the implementation more complex than it should be; think
about an implementation written in C).

> The problem of a middleware not passing through an empty string
> doesn't even need to be an issue in as much as the application could
> track when it requested to be suspended and if called into again
> before the required criteria had been met, it could detect a
> middleware that wasn't playing by the rules and at least raise an
> error rather than potentially go into blocking state and tight loop.
> 

Yes.
This is something that can be done by an implementation.
Currently txwsgi only checks the suspend flag when an empty string is
yielded by the application.

> One could theoretically abstract out an interface for a generic event
> system, but what you don't want is a general purpose one. You want one
> which is specifically associated with the concept of a WSGI server.

Why?
This is not required at all.

> That way the API for it can expose methods which specifically relate
> to stuff like suspension of calling into the WSGI application for data
> until specific events occur. 

The event API just needs to deal with events, using callbacks to report
data to the application.

Please, see the demo_getpage_green.py example, in txwsgi.

> [...]



Regards  Manlio


Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-12 Thread Manlio Perillo
P.J. Eby wrote:
> At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote:
>> The purpose of the extension is to just have a standard interface that
>> WSGI applications can use to take advantage of the possibility, offered
>> by asynchronous server, to suspend execution and resume it later.
> 
> WSGI has this ability now - it's yielding an empty string.  Yielding an
> empty string is a hint to the server that the application is not ready
> to send any output, and the server is free to schedule other
> applications next.  And WSGI does not require the application to be
> rescheduled any time soon.
> 
> In other words, if saying "don't call me for a while" is the purpose of
> the extension, it is not needed.  As Graham says, the thing that would
> actually be needed is a way to tell the server when to poll the app again.
> 

Just yielding an empty string does not give the server some important
information.

As an example, with x-wsgiorg.suspend an application can specify a
timeout, which tells the server that the application must be resumed
before timeout milliseconds have elapsed.

And x-wsgiorg.suspend returns a callable that, when called, tells the
server to poll the app again.


Regards  Manlio


Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-12 Thread Manlio Perillo
Graham Dumpleton wrote:
> [...]
>>
>> Claiming that x-wsgiorg.suspend does not help writing portable WSGI
>> application is something similar (well, I'm a bit exaggerating here) of
>> saying that WSGI does not allow to write portable web applications,
>> because real world WSGI applications needs a database, a database
>> engine, and so on.
> 
> It is not the same. I can take code using a specific database instance
> and still run that WSGI application, using the same database, on a
> different WSGI hosting mechanism without really changing anything
> about how I interact with the WSGI server and its request handling.
> The concern here is the WSGI interface and interacting with the web
> server, not other non related third party packages.
> 

This is true.

However, you can say the same for the x-wsgiorg.suspend extension.

As an example, you can have an application that uses a standard event
API, and you can run it on several asynchronous WSGI implementations.

The difference is that here we speak about an event API, and not a
specific event implementation.

Note however that we can also speak about specific implementations.
As an example, I can implement Twisted reactor API in Nginx, so that
WSGI applications using Twisted API can be executed on both Twisted and
Nginx.  I could do the same with libevent API.  It's only a technical
problem.

> You are artificially adding something to the WSGI interface as an
> extension which is pointless. Since you are bound to the specific
> event loop of the underlying WSGI server or event framework being used

You are not bound to a specific event framework, when using
x-wsgiorg.suspend!

> you may just as well call a function directly on the WSGI server.
> Adding that function under a key in the WSGI environment and accessing
> it that way does not in itself provide any value and doesn't somehow
> make the code easily portable to a different WSGI hosting mechanism
> using a different event loop as you still have to change lots of other
> code in your application.
> 

This is absolutely not true!

> In some respects this is similar to the issues between using a WSGI
> wrapper which injects stuff in WSGI environment versus that
> functionality being in a separate library. Read:
> 
>   http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html
> 

This is simply wrong.

x-wsgiorg.suspend **can not** be implemented simply as library code; it
**must** be accessed from the environ dictionary.

The reason is simple:

1) First of all, in order to suspend application, you **must** return
   control to the server, and this can only be done by yielding some
   value in the application generator.
2) In order for the implementation to know if application requested
   suspension, it must keep a flag in its *internal* state.
   The x-wsgiorg.suspend function simply sets this flag.
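
The two points above can be sketched with a toy round-robin driver. This
is an illustration under stated assumptions, not the txwsgi
implementation: serve, mailbox, waiter and worker are hypothetical names,
the timeout argument is accepted but ignored, and a request that is never
resumed is simply dropped when the run queue empties.

```python
import collections


def serve(apps):
    """Toy round-robin driver. `apps` maps a request name to an application."""
    output = collections.defaultdict(list)
    ready = collections.deque()
    parked = {}

    def make_suspend(name, state):
        def suspend(timeout):
            # Point 2: only set a flag in the server's per-request state.
            # (This toy ignores `timeout`.)
            state["suspend_requested"] = True

            def resume():
                # Put the request back on the run queue if it is parked.
                if name in parked:
                    ready.append(parked.pop(name))
            return resume
        return suspend

    for name, app in apps.items():
        state = {"suspend_requested": False}
        environ = {"x-wsgiorg.suspend": make_suspend(name, state)}
        iterator = iter(app(environ, lambda status, headers: None))
        ready.append((name, iterator, state))

    while ready:
        name, iterator, state = ready.popleft()
        try:
            chunk = next(iterator)  # Point 1: control returns at a yield.
        except StopIteration:
            continue
        if chunk == b"" and state["suspend_requested"]:
            state["suspend_requested"] = False
            parked[name] = (name, iterator, state)
        else:
            if chunk:
                output[name].append(chunk)
            ready.append((name, iterator, state))
    return dict(output)


mailbox = {}  # stand-in for an event source shared by the two demo apps


def waiter(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    mailbox["resume"] = environ["x-wsgiorg.suspend"](1000)
    yield b""  # the request is actually suspended here
    yield b"waiter done"


def worker(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"worker done"
    mailbox["resume"]()  # pretend an I/O event fired


result = serve({"waiter": waiter, "worker": worker})
```

Because each request gets its own closure over its own state, the
function has to be handed to the application per request, and the
environ dictionary is the natural place for that.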


Manlio


Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-12 Thread Manlio Perillo
Graham Dumpleton wrote:
> On 12 April 2010 06:07, Manlio Perillo  wrote:
>> I'm not sure about the correct procedure to follow, I hope it is not a
>> problem.
>>
>> I here propose the x-wsgiorg.suspend to be accepted as official WSGI
>> extension, using the wsgiorg namespace.
>>

First of all thanks for the feedback.

> [...]
> In the code of demo_fdevent.py it has:
> 
> while True:
>     while True:
>         ret, num_handles = m.perform()
>         if ret != pycurl.E_CALL_MULTI_PERFORM:
>             break
>     if not num_handles:
>         break
> 
>     read, write, exc = m.fdset()
>     resume = environ['x-wsgiorg.suspend'](1000)
>     if read:
>         readable(read[0], resume)
>         yield ''
>     else:
>         writeable(write[0], resume)
>         yield ''
> 
> The registration of file descriptors doesn't occur until after the
> first suspend() call.
> 
> If the underlying reactor that the WSGI server is presumably also
> using doesn't know about the file descriptors at that point, then how
> does it know to return from the suspend().
> 

I'm not sure I understand your concern, but the execution is not
suspended when you call x-wsgiorg.suspend, but only when you yield an
empty string.

In the example, registration of the file descriptor occurs before the
application is suspended.

> You are also calling perform() before that point. When calling that,
> it is presumed you have already done a select/poll to know data is
> available, but you haven't done that on first pass through the loop.
> If you call that and data isn't ready, can't it block still.
> 

I have to admit that I just copied the example from fdevent specification.
However the code seems correct, to me.

> This example also illustrates well why I am so against an asynchronous
> WSGI server extension.
> 
> The reason is that your specific application has to be with this
> extension bound to the specific event loop mechanism used by the
> underlying WSGI server.
> 
> I can't for example take this application and host it on a different
> WSGI server which implements the same WSGI extension but uses a
> different event loop.
> 

Instead, I think that being "agnostic" about how it is used is one of
the most important features of the x-wsgiorg.suspend extension.

After all, if you think about it, how to interface with a database in a
WSGI application is not specified by WSGI.
This is done by a separate standard, dbapi2.

For applications that need a template engine, we don't even have a
standard interface.

The lack of a standard event API is not a problem that should be
discussed in WSGI.
It is a problem with the Python community; in fact I would like to
define a standard event API *and* a standard efficient network API (the
reason is expressed at the end of the README file in txwsgi).

> If one can't do that and it is tied to the event loop and
> infrastructure of the underlying WSGI server, what is the point of
> defining and implementing the WSGI extension as it doesn't aid
> portability at all, so what service is it actually providing?
> 

The service it provides is: "allow a WSGI application to suspend its
execution and resume it later".

> In that respect, the extension:
> 
> http://www.wsgi.org/wsgi/Specifications/fdevent/
> 
> provided more as at least it tried to abstract out a generic interface
> for registering interest in file descriptor activity and so perhaps
> allow the application not to be dependent on the specific event loop
> used by the underlying WSGI server.
> 

However exposing this event interface is really something that has
little to do with WSGI.

Moreover, the fdevent example is rather inefficient.
Suspensions should be minimized, and this is not possible with
x-wsgiorg.fdevent but it is possible with x-wsgiorg.suspend.

> From the open issues of that other specification however, you can see
> that there can be problems. It only allowed an application to be
> interested in a single file descriptor where some packages may need to
> express interest in more than one.
> 
> Quite often an application is never going to be that simple anyway.
> Some event systems allow a lot more than just watching of file
> descriptors and timeouts however. You cant come up with a generic
> interface for all these as they will not be able to be implemented by
> a different event system which isn't so feature rich or which has a
> different style of interface. Thus applications are restricted to the
> lowest common denominator and likely that is not going to be enough
> for most and so have no choice but to bind it to in

Re: [Web-SIG] wsgi and generators (was Re: WSGI and start_response)

2010-04-11 Thread Manlio Perillo
P.J. Eby ha scritto:
> At 02:04 PM 4/10/2010 +0100, Chris Dent wrote:
>> I realize I'm able to build up a complete string or yield via a
>> generator, or a whole bunch of various ways to accomplish things
>> (which is part of why I like WSGI: that content is just an iterator,
>> that's a good thing) so I'm not looking for a statement of what is or
>> isn't possible, but rather opinions. Why is yielding lots of moderately
>> sized strings *very bad*? Why is it _not_ very bad (as presumably
>> others think)?
> 
> How bad it is depends a lot on the specific middleware, server
> architecture, OS, and what else is running on the machine.  The more
> layers of architecture you have, the worse the overhead is going to be.
> 
> The main reason, though, is that alternating control between your app
> and the server means increased request lifetime and worsened average
> request completion latency.
> 

This is not completely true.
At least this is not how things will work on an asynchronous WSGI
implementation.

It is true that alternating control between your app and the server
decreases performance.
This can be verified with:
http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_cooperative.py

However yielding small strings in the application iterator, because the
application does not want to buffer data, will usually not cause the
problems you describe.

Instead, the possible performance problems have been described by Graham.


Moreover, when we speak about latency, we should also consider that web
pages are usually served to human users.
In this case, latency is not the only factor to consider.

Is it better for the user to wait 3 seconds for some text to appear in
the browser window, and then wait another 5 seconds for the complete
page to be rendered, or to wait 5 seconds for some text to appear
in the browser window?

> [...]
> If you translate this to the architecture of a web application, where
> the "work" is the server serving up bytes produced by the application,
> then you will see that if the application serves up small chunks, the
> web server is effectively forced to multitask, and keep more application
> instances simultaneously running, with lowered latency, increased memory
> usage, etc.
> 

Yielding small strings will *not* force multitasking.
This can be verified with:
http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_producer.py

A WSGI application will be suspended *only* when data cannot be sent to
the OS socket buffer.

Yielding several small strings will *usually* not cause socket buffer
overflow, unless the client is very slow at reading data.

Instead, ironically, you will have a problem when the application yields
several big strings.

In this case it is better to yield only one very big string, but this is
not always feasible.
And I'm not sure whether it is worse to keep a very big buffer in memory, or
to send several fairly large chunks to the client.
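
The point about the socket buffer can be illustrated with a toy model.
This is only a sketch under stated assumptions, not a measurement and not
code from txwsgi: count_suspensions, buffer_size and drain_per_wait are
hypothetical names, and the "client" only reads when the buffer is full.

```python
def count_suspensions(chunks, buffer_size, drain_per_wait):
    """Count how often a toy server has to suspend the application.

    Hypothetical model: the "socket buffer" holds at most buffer_size
    bytes, and each time it is full the client reads drain_per_wait bytes
    before the application is resumed.
    """
    if buffer_size <= 0 or drain_per_wait <= 0:
        raise ValueError("buffer_size and drain_per_wait must be positive")
    buffered = 0
    suspensions = 0
    for chunk in chunks:
        pending = len(chunk)
        while pending:
            room = buffer_size - buffered
            if room == 0:
                # Buffer full: the application is suspended until the
                # client has read some data.
                suspensions += 1
                buffered = max(0, buffered - drain_per_wait)
                continue
            accepted = min(room, pending)
            buffered += accepted
            pending -= accepted
    return suspensions


# Ten 10-byte chunks fit in the buffer; three 1000-byte chunks do not.
small = count_suspensions([b"x" * 10] * 10, buffer_size=1024, drain_per_wait=512)
big = count_suspensions([b"x" * 1000] * 3, buffer_size=1024, drain_per_wait=512)
```

In this model the small chunks never suspend the application, while the
large ones do as soon as the buffer fills up.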

> [...]


Regards  Manlio


Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web

2010-04-11 Thread Manlio Perillo
Gustavo Narea ha scritto:
> Hello,
> 
> Maybe I'm missing something obvious, but if the gateway doesn't support
> applications that return write() callables, then it's not WSGI.
> 
> A callable that raises an exception does not even count. It's obvious
> that they must not raise exceptions -- Then what's the point of
> providing the callable?
> 

Nothing is obvious in an official specification ;-).

The reason I chose not to remove the write callable completely is
that it will raise a clear error if someone ever tries to use my
implementation to execute a WSGI application that requires the write
callable.

Moreover some middlewares or applications may assume the write callable
exists and the value returned by start_response is not None, even if it
is never used.
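
A minimal sketch of this behaviour, assuming a gateway that records the
status and headers itself. The names make_start_response and captured are
hypothetical and are not taken from txwsgi or ngx_http_wsgi_module.

```python
def make_start_response(captured):
    """Build a start_response whose write callable exists but always fails."""

    def write(data):
        # The callable is present, so code that only checks for it keeps
        # working, but any real use fails with a clear message.
        raise NotImplementedError(
            "write() is not supported by this gateway; "
            "configure a write adapter instead"
        )

    def start_response(status, headers, exc_info=None):
        # exc_info handling is omitted in this sketch.
        captured["status"] = status
        captured["headers"] = list(headers)
        return write

    return start_response


captured = {}
start_response = make_start_response(captured)
write = start_response("200 OK", [("Content-Type", "text/plain")])
```

An application that never calls write() runs unchanged; one that does call
it fails immediately, which tells you the adapter is needed.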


> That said, I *think* it might be OK to disable support for the write()
> callable *optionally* on a per application basis. For example, the
> gateway could look at the "requires_write" attribute of the application
> callable, if any:
> """
> def wsgi_app(environ, start_response):
>     # ... process the request and return a response
> 
> wsgi_app.requires_write = False
> """
> 
> That way, applications which don't use the write() callable can let your
> gateway know and thus it won't pass one on.
> 

The problem is that applications that require the write callable are
not aware of this extension.

This is really not a problem, IMHO.
If you try to execute an application and you get a NotImplementedError
exception, then you *know* that the write callable is required.

Then, you just configure the WSGI gateway to use the required adapter.
See http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_write.py
for a practical example using txwsgi.

With ngx_http_wsgi_module, you just have to add a
wsgi_middleware  txwsgi.greenlet write_adapter;
directive in Nginx configuration file.

> We could even standardize this (at wsgi.org) so that any WSGI middleware
> which wraps an application can expose the "requires_write" attribute of
> the wrapped application... As long as such a middleware doesn't use
> write() either.
> 
> On the other hand, I would avoid using "middleware" in this context for
> something specific to your implementation as people will believe it's a
> proper WSGI middleware. 

Yes.
I now use the term "adapter".


Regards   Manlio


[Web-SIG] [RFC] x-wsgiorg.suspend extension

2010-04-11 Thread Manlio Perillo
I'm not sure about the correct procedure to follow; I hope it is not a
problem.

I hereby propose that x-wsgiorg.suspend be accepted as an official WSGI
extension, using the wsgiorg namespace.

The extension is documented in doc/wsgiorg.suspend.rst document in the
txwsgi source distribution, available on:
http://bitbucket.org/mperillo/txwsgi/

The direct link to the specification is:
http://bitbucket.org/mperillo/txwsgi/src/tip/doc/wsgiorg.suspend.rst

The extension is implemented in txwsgi implementation for Twisted Web
server, and I'm going to implement it in the ngx_http_wsgi_module
implementation for Nginx server.

The extension is very easy to implement.
It also generalizes the proposed x-wsgiorg.fdevent extension.

Please, see
http://bitbucket.org/mperillo/txwsgi/src/tip/doc/examples/demo_fdevent.py
for a comparison of the same example described in fdevent specification,
implemented using suspend and Twisted reactor API.


Thanks to Christopher Stawarz for writing the fdevent specification,
since I was able to use it as a reference.


Some additional notes.
The x-wsgiorg.suspend extension can be implemented in both WSGI 1.0 and the
proposed WSGI 2.0.  However, because WSGI 2.0 has no start_response,
its usability there is limited.



Thanks and regards   Manlio Perillo


[Web-SIG] [ANN] txwsgi 0.1

2010-04-11 Thread Manlio Perillo
I'm pleased to announce txwsgi, version 0.1.

txwsgi is a fork of twisted.web.wsgi, that, unlike the original
implementation, executes the WSGI application in the main I/O thread.

txwsgi implements the proposed x-wsgiorg.suspend extension, which enables
support for asynchronous WSGI applications.

Some examples are available in the doc/examples directory, in the source
distribution.

The project is available on BitBucket:
http://bitbucket.org/mperillo/txwsgi/

More information is available in the README file.
The x-wsgiorg.suspend extension is specified in doc/wsgiorg.suspend.rst.
I will start a new thread for the official approval process.

I have tried to write as much documentation as possible, also taking into
consideration feedback received in previous threads; thanks for the support.



Thanks and regards  Manlio Perillo


Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web

2010-04-09 Thread Manlio Perillo
Graham Dumpleton ha scritto:
> On 9 April 2010 22:15, Manlio Perillo  wrote:
>> Graham Dumpleton ha scritto:
>>> [...]
>>>> But since the write callable **can** be implemented in a middleware
>>>> (using greenlets) and since middlewares *can* be configured inside WSGI
>>>> gateway, implementations can still claim to be WSGI 1.0 conformant.
>>> Then only the higher level middleware adapter can even claim to be
>>> WSGI compliant and deserve to use the WSGI name.
>> Since the middleware is executed inside WSGI gateway, and the gateway
>> can be configured to always execute some middleware, the final
>> application will simply have at disposal a WSGI conformant write callable.
> 
> Then it isn't really a middleware at all then, but a part of your
> overall solution.

It is just that the gateway supports direct execution of
middlewares, since this makes the implementation more flexible.

> So long as only the complete solution is exposed and
> is WSGI compliant then fine. But if it is going to be layered in any
> way such that lower level layers can be used in their own right, then
> the lower level layers shouldn't really be said to be WSGI if they
> don't implement full WSGI specification. As much as we all have our
> complaints about WSGI specification, it is what it is and is all we
> have right now.
> 

By the way, as a matter of curiosity.
WSGI 1.0 states:

"""The start_response callable must return a write(body_data) callable
that takes one positional parameter: a string to be written as part of
the HTTP response body. (Note: the write() callable is provided only to
support certain existing frameworks' imperative output APIs; it should
not be used by new applications or frameworks if it can be avoided. See
the Buffering and Streaming section for more details.)"""


There is nothing that prevents the write callable from raising an exception.

Of course an implementation that always raises a NotImplementedError is
going to be useless (for applications that require the write callable),
but it seems to me that such an implementation can still claim to
conform to WSGI 1.0.

> [...]

Manlio


Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web

2010-04-09 Thread Manlio Perillo
Graham Dumpleton ha scritto:
> [...]
>> But since the write callable **can** be implemented in a middleware
>> (using greenlets) and since middlewares *can* be configured inside WSGI
>> gateway, implementations can still claim to be WSGI 1.0 conformant.
> 
> Then only the higher level middleware adapter can even claim to be
> WSGI compliant and deserve to use the WSGI name. 

Since the middleware is executed inside the WSGI gateway, and the gateway
can be configured to always execute some middleware, the final
application will simply have a WSGI conformant write callable at its disposal.

> Any underlying
> abstraction you use at the web server interface isn't WSGI and by
> rights should be called something else so there is no confusion and
> also shouldn't use 'wsgi' keys in its environ dictionary. Have your
> high level middleware do a completely remapping of names as
> appropriate.
> 

This will add useless overhead.

>>> Why don't you given it all a completely different name else you will
>>> just cause ongoing confusion
>> In don't really see how this can cause confusion!
> 
> So, when someone goes and runs a WSGI application directly against you
> WSGIish web server interface which you still insist you can describe
> as being WSGI and it fails because the write() method isn't
> implemented what is your answer going to be? If something is going to
> use WSGI name it should implement the full WSGI specification.
> 

To make people happy, I can simply have the implementation include
the required middleware by default.

>>> like you did with when you felt you could
>>> reuse the 'mod_wsgi' name for your nginx
>> In fact the first thing I did during code refactoring was to rename it
>> to ngx_http_wsgi_module.
> 
> The mod_wsgi name is still used all through
> http://wiki.nginx.org/NginxNgxWSGIModule that I can tell.
> 

I still have to update it.


Manlio


Re: [Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web

2010-04-09 Thread Manlio Perillo
Graham Dumpleton ha scritto:
> [...]
>>- the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output'
>>
>>  The wsgiorg namespace is used, since the plan is to have it
>>  standardized [1], but it can only be implemented on asynchronous
>>  servers.
> 
> Please read:
> 
>   http://www.wsgi.org/wsgi/Specifications
> 
> If a proposal is suggested, it MUST use 'x-wsgiorg.' and not
> 'wsgiorg.'. Only after it is officially accepted can it use the
> 'wsgiorg.'.
> 

Well, since the original proposal was using the wsgi namespace, I just
suggested the use of the wsgiorg namespace instead.

Of course, when it is implemented I will use a different namespace
until it gets approved.

> I would question whether you should even be using 'x-wsgiorg.' as as
> far as I can see from my quick scans of emails, you aren't even
> supporting WSGI proper as you are dropping support for bits. As such,
> it isn't WSGI, only WSGIish so how can you justify using the name.
> 

This is not completely correct.
The twsgi implementation, as well as the ngx_http_wsgi_module implementation,
does not implement the write callable.

The reason is simple: the write callable was a huge mistake in WSGI 1.0,
since it cannot be implemented in an asynchronous web server.

But since the write callable **can** be implemented in a middleware
(using greenlets) and since middlewares *can* be configured inside WSGI
gateway, implementations can still claim to be WSGI 1.0 conformant.

> Why don't you given it all a completely different name else you will
> just cause ongoing confusion 

I don't really see how this can cause confusion!

> like you did with when you felt you could
> reuse the 'mod_wsgi' name for your nginx 

In fact the first thing I did during code refactoring was to rename it
to ngx_http_wsgi_module.

> version even though I asked
> you to use a different name. It has been an absolute pain seeing
> discussions on places like #django irc where people don't know when
> people mention mod_wsgi whether they are talking about Apache of
> nginx.
> 

Apologies for having underestimated this.


Manlio


[Web-SIG] [ANN] twsgi: asynchronous WSGI implementation for Twisted Web

2010-04-09 Thread Manlio Perillo
I have started to write an asynchronous WSGI implementation for Twisted Web.

The standard implementation executes the WSGI application in a separate
thread.
twsgi will instead execute the application in the main Twisted thread.

The advantage is that twsgi is better integrated in Twisted, and WSGI
applications will be able to use all features available in Twisted.


Code is available from a Mercurial repository:
http://hg.mperillo.ath.cx/twisted/twsgi


The purpose of twsgi is to have a pure Python implementation of WSGI
with support for asynchronous HTTP servers and asynchronous WSGI
applications.

The implementation is similar to ngx_http_wsgi_module, and can be used
to quickly test asynchronous extensions.

The write callable is not implemented (calling it will raise
NotImplementedError), since the write callable cannot be implemented in an
asynchronous web server without using threads (and twsgi does *not* use
threads).

ngx_http_wsgi_module does the same.


TODO


* support for suspending iteration over the WSGI app iterator when the
  socket is not ready to send data.
  Execution will be resumed when the socket is ready again.

* support for suspend/resume extension, as described here:
  http://comments.gmane.org/gmane.comp.python.twisted.web/632

  It will have some differences:

- the name will be 'wsgiorg.suspend' instead of 'wsgi.pause_output'

  The wsgiorg namespace is used, since the plan is to have it
  standardized [1], but it can only be implemented on asynchronous
  servers.

- wsgi.pause_output function will accept an optional timeout, in
  milliseconds.

  If timeout is specified, application will be implicitly resumed
  when timeout expires.

- resume function will return a boolean value.
  True: if execution was suspended and it is going to be resumed
  False: if execution was not suspended

  The return value can be used to check if timeout specified in
  wsgiorg.suspend expired.

  I'm not sure if a boolean value is the best solution.
  Maybe it should return -1 if execution was not suspended, and 0
  otherwise.


[1] unlike other proposed async extensions, suspend/resume is much more
simple and easy to implement, so it is more likely to have a wide
consensus over the specification.
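
The suspend/resume behaviour listed in the TODO can be modelled with a toy
gateway. This is only a sketch under stated assumptions, not the extension
itself: it assumes the suspend function is exposed as an environ key, the
ToyGateway class and the "toy.pending" key are hypothetical, and the
timeout argument is accepted but ignored.

```python
class ToyGateway:
    """Drives one application iterator and honours a suspend request."""

    def __init__(self):
        self.suspended = False

    def suspend(self, timeout=None):
        # timeout (milliseconds in the proposal) is ignored in this toy.
        self.suspended = True

        def resume():
            # True if execution was suspended and will now be resumed,
            # False if it was not suspended.
            was_suspended = self.suspended
            self.suspended = False
            return was_suspended

        return resume

    def step(self, app_iter, out):
        """Pull chunks until the app suspends; return True when it is done."""
        while not self.suspended:
            try:
                chunk = next(app_iter)
            except StopIteration:
                return True
            if chunk:
                out.append(chunk)
        return False


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    resume = environ["x-wsgiorg.suspend"]()
    # Stand-in for registering resume with an event loop.
    environ["toy.pending"].append(resume)
    yield b""  # give control back to the gateway
    yield b"done"


gateway, out, pending = ToyGateway(), [], []
environ = {"x-wsgiorg.suspend": gateway.suspend, "toy.pending": pending}
app_iter = app(environ, lambda status, headers: None)
finished_first = gateway.step(app_iter, out)   # app suspends itself
resumed = pending[0]()                         # the "event" fires
finished_second = gateway.step(app_iter, out)  # app runs to completion
```

A real gateway would call step again from its event loop once resume has
been invoked; here the calls are made by hand.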


Feedback is welcome.


Regards  Manlio


Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread Manlio Perillo
P.J. Eby ha scritto:
> At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> What I'm trying to do is:
>>
>> * as in the example I posted, turn Mako render function in a generator.
>>
>>   The reason is that I would like to implement support for Nginx
>>   subrequests.
> 
> By subrequest, do you mean that one request is invoking another, like
> one WSGI application calling multiple other WSGI applications to render
> one page containing contents from more than one?
> 

Yes.

> 
>>   During a subrequest, the generated response body is sent directly to
>>   the client, so it is necessary to be able to flush the Mako buffer
> 
> I don't quite understand this, since I don't know what Mako is, or, if
> it's a template engine, what flushing its buffer would have to do with
> WSGI buffering.
> 

Ah, sorry.

Mako is a template engine.
Suppose I have an HTML template file, and I want to use a sub request.


  ...
  
${subrequest('/header/')}
...
  



The problem with this code is that, since Mako will buffer all generated
content, the resulting response body will contain incorrect data.

It will first contain the response body generated by the sub request,
then the content generated from the Mako template (XXX I have not
checked this, but I think it is how it works).

So, when executing a sub request, it is necessary to flush (that is,
send to Nginx, in my case) the content generated from the template
before the sub request is done.

Since Mako does not return a generator (I asked the author, and it was
too hard to implement), I use a greenlet in order to "turn" the Mako
render function in a generator.
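
The general idea of turning a render function into a generator can be
sketched with the standard library only. Note that this sketch swaps in a
worker thread and a queue where the text above uses a greenlet, and that
as_generator, render and flush are hypothetical names, not Mako's API.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the rendered output


def as_generator(render):
    """Run render(flush) in a worker thread and yield each flushed chunk.

    render must accept a flush callable and may return a final chunk.
    """
    chunks = queue.Queue(maxsize=1)

    def worker():
        try:
            tail = render(chunks.put)
            if tail:
                chunks.put(tail)
        finally:
            # Always signal completion, even if render raised.
            chunks.put(_DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = chunks.get()
        if item is _DONE:
            return
        yield item


def render(flush):
    flush("content before the sub request")
    flush("content after the sub request")
    return "footer"


body = list(as_generator(render))
```

Two limits of this sketch: an exception raised inside render is not
re-raised in the consumer, and a consumer that stops iterating early leaves
the worker blocked. A greenlet-based version avoids the extra thread.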

> 
>> > Under
>> > WSGI 1, you can do this by yielding empty strings before calling
>> > start_response.
>>
>> No, in this case this is not what I need to do.
> 
> Well, if that's not when you're needing to suspend the application, then
> I don't see what you're losing in WSGI 2.
> 
> 
>> I need to call start_response, since the greenlet middleware will yield
>> data to the caller before the application returns.
> 
> I still don't understand you.  In WSGI 1, the only way to suspend
> execution (without using greenlets) prior to determining the headers is
> to yield empty strings.
> 

Ah, you are right, sorry.
But this is not required for the Mako example (I was focusing on that
example).

> I'm beginning to wonder if maybe what you're saying is that you want to
> be able to write an application function in the form of a generator? 

The greenlet middleware returns a generator, in order to work.

> If
> so, be aware that any WSGI 1 app written as:
> 
>  def app(environ, start_response):
>      start_response(status, headers)
>      yield "foo"
>      yield "bar"
> 
> can be written as a WSGI 2 app thus:
> 
>  def app(environ, start_response):
>      def respond():
>          yield "foo"
>          yield "bar"
>      return status, headers, respond()
> 

The problem, as I wrote, is that with the greenlet middleware, the
application does not need to return a generator.

def app(environ):
    tmpl = ...
    body = tmpl.render(...)

    return status, headers, [body]

This is a very simple WSGI application.

But when using the greenlet middleware, and when using the function for
flushing Mako buffer, some data will be yielded *before* the application
returns and status and headers are passed to Nginx.


> This is also a good time for people to learn that generators are usually
> a *very bad* way to write WSGI apps 

It's the only way to be able to suspend execution, when the WSGI
implementation is embedded in an async web server not written in Python.

The reason is that you can not use (XXX check me) greenlets in C code,
you should probably use something like http://code.google.com/p/coev/

Greenlets can be used in gevent, as an example, because scheduling is
under control of Python code.
This is not the case with Nginx.

> - yielding is for server push or
> sending blocks of large files, not tiny strings.  

Again, consider the use of sub requests.
yielding a "not large" block is the only choice you have.

Unless, of course, you implement sub request support in pure Python (or
using SSI - Server Side Include).

Another use case is when you have a very large page, and you want to
return some data as soon as possible so that the user does not abort the
request if it takes some time.

Also, note that with Nginx (as with Apache, if I'm not wrong), even if
application yields small strings, the server can still do some buffering
in order to increase performance.

In ngx_http_wsgi_module buffering is optional (and disabled by def

Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread Manlio Perillo
P.J. Eby ha scritto:
> At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote:
>> With WSGI 2.0 we will end up with:
>>
>> - WSGI 1.0, a full featured protocol, but with hard to implement
>>   middlewares
>> - WSGI 2.0, a simple protocol, with more easy to implement middlewares
>>   but without support for some "advanced" applications
> 
> Let me see if I understand what you're saying.  You want to support
> suspending an application, without using greenlets or threads. 

What I'm trying to do is:

* as in the example I posted, turn Mako render function in a generator.

  The reason is that I would like to implement support for Nginx
  subrequests.
  During a subrequest, the generated response body is sent directly to
  the client, so it is necessary to be able to flush the Mako buffer

* implement the simple suspend/resume extension, as described here:
  http://comments.gmane.org/gmane.comp.python.twisted.web/632

  Note that my ngx_http_wsgi_module already supports an asynchronous web
  server: when the application returns a generator and sending a
  yielded buffer to the client would block, execution of the WSGI
  application is suspended, and resumed when the socket is ready to send
  data.

  The suspend/resume extension allows an application to explicitly
  suspend/resume execution, so it is a nice complement for an
  asynchronous server.

  I would like to propose this extension for wsgiorg namespace.


Note, however, that greenlets are still required, since they make the
code much more usable.

> Under
> WSGI 1, you can do this by yielding empty strings before calling
> start_response.  

No, in this case this is not what I need to do.

I need to call start_response, since the greenlet middleware will yield
data to the caller before the application returns.

> Under WSGI 2, you can only do this by directly
> suspending execution, e.g. via greenlet or eventlets or some similar API
> provided by the server.  Is this your objection?
> 

In WSGI 2 what I want to do is not really possible.
The reason is that I don't use greenlets in the C module (I'm not even
sure greenlets can be used in my ngx_http_wsgi module)

Execution is suspended using the "normal" suspend extension.
The problem is with the greenlet middleware that will force a different
code flow.

> As far as I know, nobody has actually implemented an async app facility
> for WSGI 1, although it sounds like perhaps you're trying to design or
> implement such a thing now.  

Right.

My previous attempt was a failure, since the extensions had severe
usability problems.

It is the same problem you have with Twisted deferreds. In this case
every function that calls a function that uses the async extension must be
a generator.

In my new attempt I plan to:

1) Implement the simple suspend/resume extension
2) Implement a Python extension module that wraps the Nginx events
   system.
3) Implement a pure Python WSGI middleware that, using greenlets, will
   enable normal applications to take advantage of Nginx async features.

   This middleware will have the same purpose as the Hub available in
   gevent


> If so, then there's nothing stopping you
> from implementing a WSGI 1 server and providing a WSGI 2 adapter, since
> as you point out, WSGI 2 is easier to implement on top of WSGI 1 than
> the other way around.
> 

Yes, this is what I would like to do.

Do you think it will be possible to implement all the requirements of WSGI
2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0?

And what about applications that need to use the WSGI 1.0 API but
require to run with Python 3.x?


Thanks  Manlio


Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread Manlio Perillo
P.J. Eby ha scritto:
> At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote:
> [...]
>> There should be a sample WSGI 2.0 implementation for CGI, and a sample
>> WSGI 1.0 -> 2.0 adapter.
>>
>> This adapter should be able to support the coroutine example,
>> > http://paste.pocoo.org/show/199202/
>> but I would like to test.
>>
>> write callable, as far as I know, can not be implemented.
> 
> Implementing it requires greenlets or threads, but it's implementable. 
> See:
> 
> http://mail.python.org/pipermail/web-sig/2009-September/003986.html
> 

Right.
In fact, in the example I posted, I implemented the write callable using
greenlets (although the implementation is different).

> (Btw, I've noticed that this early sketch of mine doesn't support the
> case where an application is a generator, because start_response won't
> have been called when the application returns.  This can be fixed, but
> it requires the addition of a wrapper class and a few other annoying
> details.  It also doesn't support exc_info properly, so it's still a
> ways from being a correct WSGI 1 server implementation.  Getting rid of
> all these little variations, though, is the goal of having a WSGI 2 -
> it's difficult to write *any* middleware to be completely WSGI 1
> compliant.)
> 

I agree that this is a good goal.
However I don't like the idea of losing support for some features.

With WSGI 2.0 we will end up with:

- WSGI 1.0, a full featured protocol, but with hard to implement
  middlewares
- WSGI 2.0, a simple protocol, with more easy to implement middlewares
  but without support for some "advanced" applications


WSGI 1.0 can be implemented on top of WSGI 2.0, and WSGI 2.0 on top
of WSGI 1.0.

The latter should be easier to implement.
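
The second direction (WSGI 2.0 on top of WSGI 1.0) might look roughly like
this. It is a minimal sketch, assuming the draft WSGI 2.0 calling
convention in which the application takes environ and returns a
(status, headers, body) tuple; wsgi2_to_wsgi1 is a hypothetical name.

```python
def wsgi2_to_wsgi1(app2):
    """Expose a WSGI 2.0 style application through the WSGI 1.0 interface."""

    def app1(environ, start_response):
        # The WSGI 2.0 style app returns everything at once, so the
        # adapter only has to forward status and headers.
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body

    return app1


def hello(environ):
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]


seen = []
result = wsgi2_to_wsgi1(hello)({}, lambda status, headers: seen.append((status, headers)))
```

The adapter is short because nothing has to be deferred: status and
headers are already known when the WSGI 2.0 style application returns.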


I would like to have a WSGI 1.1 specification without the write
callable, and a *standard* adapter that will expose a more simple API
(like WSGI 2.0) so that applications and middlewares can be implemented
using this simple API but you still have the full featured API.

This is important, IMHO.
Because with the next version of WSGI, there will be also support for
Python 3.x.
And if the next version does not support the start_response
function, applications that need Python 3.x and want to use "advanced
features" will not be able to rely on a standard protocol.



Regards   Manlio


Re: [Web-SIG] WSGI and start_response

2010-04-08 Thread Manlio Perillo
Aaron Watters ha scritto:
> someone remind me: where is the canonical WSGI 2 spec?

http://wsgi.org/wsgi/WSGI_2.0

> I assume there is a way to "wrap" WSGI 1 applications
> without breaking them?  Or is this the regex-->re fiasco
> all over again?
> 

start_response can be implemented by a function that will store the
status code and response headers.

There should be a sample WSGI 2.0 implementation for CGI, and a sample
WSGI 1.0 -> 2.0 adapter.

This adapter should be able to support the coroutine example,
> http://paste.pocoo.org/show/199202/
but I would like to test.

write callable, as far as I know, can not be implemented.

> [...]


Regards  Manlio


[Web-SIG] WSGI and start_response

2010-04-08 Thread Manlio Perillo
Hi.

Some time ago I objected to the decision to remove the start_response
function from the next version of WSGI, on the grounds that without
start_response, asynchronous extensions are impossible to support.

Now I have found that removing start_response will also make it impossible
to support coroutines (or, at least, some uses of coroutines).

Here is an example (this is the same example I posted few days ago):
http://paste.pocoo.org/show/199202/

Forgetting about the write callable, the problem is that the application
starts to yield data when tmpl.render_unicode function is called.

Please note that this has *nothing* to do with asynchronous applications.
The code should work with *all* WSGI implementations.


In the pasted example, the Mako render_unicode function is "turned" into
a generator, with a simple function that allows flushing the current buffer.


Can someone else confirm that this code is impossible to support in WSGI
2.0?

If my suspect is true, I once again object against removing start_response.

WSGI 1.0 is really a well designed protocol, since it is able to support
both asynchonous application (with a custom extension) and coroutines,
*even* if this was not considered during protocol design.


Thanks  Manlio


Re: [Web-SIG] WSGI safe write callable using greenlet

2010-03-31 Thread Manlio Perillo
Manlio Perillo wrote:
> Hi.
> 
> I'm currently upgrading my WSGI implementation for Nginx:
> http://hg.mperillo.ath.cx/nginx/ngx_http_wsgi_module/
> [...]
> So, I was thinking: what about a WSGI middleware that, using greenlets,
> exposes to the application a write callable with the correct code flow?
> 
> 
> Here is a very first draft:
> http://pastebin.com/4k1Ep4dH
> 
> It should work with every standard WSGI implementation.
> 

Here is a more generic middleware and example application:
http://pastebin.com/S8c1gRfY

and here is the output:
http://pastebin.com/zzkRiRuA


The example also contains hints about features I plan to implement,
like the wsgiorg.suspend extension, and subrequests.



Regards  Manlio


[Web-SIG] WSGI safe write callable using greenlet

2010-03-30 Thread Manlio Perillo
Hi.

I'm currently upgrading my WSGI implementation for Nginx:
http://hg.mperillo.ath.cx/nginx/ngx_http_wsgi_module/

I'm not only updating the code to work with recent Nginx versions (after
2 years) but, above all, I'm cleaning up the code, removing stuff that is
not strictly required and is hard to maintain.

I have already removed support for multiple Python subinterpreters, and
now I'm going to remove the async extensions I wrote (there will be only
one very simple API, for applications using greenlets); finally, I would
like to remove support for the write callable.

The problem, to put it simply, is that the write callable *cannot* be
implemented in an asynchronous web server like Nginx.

I have two implementations:
* the first (not the default) simply keeps a buffer.
  This is explicitly forbidden by WSGI.
* the second puts the Nginx connection socket in synchronous mode;
  it works, but it is something that *should not* be done.

So, I was thinking: what about a WSGI middleware that, using greenlets,
exposes to the application a write callable with the correct code flow?


Here is a very first draft:
http://pastebin.com/4k1Ep4dH

It should work with every standard WSGI implementation.

I would really like to receive feedback about this implementation, since
I have never used greenlets before.


P.S.: the code is released under the MIT license


Thanks   Manlio


Re: [Web-SIG] wsgi.errors and close method

2010-03-30 Thread Manlio Perillo
Dirkjan Ochtman wrote:
> On Tue, Mar 30, 2010 at 11:28, Manlio Perillo  
> wrote:
>> Note however, that Mercurial has fixed the problem:
> 
> So, as the guy who inherited Mercurial's hgweb WSGI application (or
> rather, made it much more WSGI-compliant), 

Did you manage to remove the usage of the write callable?

> [...]


Regards  Manlio



Re: [Web-SIG] wsgi.errors and close method

2010-03-30 Thread Manlio Perillo
Graham Dumpleton wrote:
> [...]

>> Here is the culprit:
>> http://lists.alioth.debian.org/pipermail/python-modules-team/2009-January/003514.html
>> http://code.google.com/p/modwsgi/issues/detail?id=82
>>
>> So it seems safe, when the Log object used in wsgi.errors is also used
>> to replace sys.stderr, to just add the closed attribute (but *not* the
>> close method).
> 
> It is all very silly. Technically a file like object is not required
> to have a 'closed' attribute, so that code expecting it was wrong in
> the first place.
> 
>   http://docs.python.org/library/stdtypes.html#file-objects
> 

Right, thanks; I did not notice it.

Note, however, that Mercurial has fixed the problem:

    # stderr may be buffered under win32 when redirected to files,
    # including stdout.
    if not getattr(sys.stderr, 'closed', False):
        sys.stderr.flush()


I would probably do something like:

    try:
        sys.stderr.flush()
    except Exception:
        pass


> The close() method is however required of file like objects so if you
> are going to replace a file like object, you should have it.
> 

Yes, I should.
But since calling close() on these objects should raise an exception anyway,
raising AttributeError instead should not be a critical problem.


Manlio


Re: [Web-SIG] wsgi.errors and close method

2010-03-30 Thread Manlio Perillo
Manlio Perillo wrote:
> Hi.
> 
> Some time ago, someone reported to me that an application embedded in Nginx
> with my WSGI module failed to execute, since in my implementation the
> wsgi.errors object does not implement the .close method.
> 
> [...]
> Any idea?
> 

Here is the culprit:
http://lists.alioth.debian.org/pipermail/python-modules-team/2009-January/003514.html
http://code.google.com/p/modwsgi/issues/detail?id=82

So it seems safe, when the Log object used in wsgi.errors is also used
to replace sys.stderr, to just add the closed attribute (but *not* the
close method).
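
In pure Python the idea could look roughly like the sketch below. The real
object in my module is written in C; the ErrorLog class and its wrapped
stream are hypothetical names used only for this illustration:

```python
import io


class ErrorLog:
    """Write-only stream for wsgi.errors: has `closed`, but no close()."""

    # Code such as getattr(stream, 'closed', False) sees an open stream.
    closed = False

    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        self._stream.write(text)

    def writelines(self, lines):
        for line in lines:
            self.write(line)

    def flush(self):
        self._stream.flush()


log = ErrorLog(io.StringIO())
log.write("something went wrong\n")
log.flush()
```

Code that checks the closed attribute before flushing keeps working, while
an attempt to call log.close() fails with AttributeError.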


Thanks  Manlio


Re: [Web-SIG] wsgi.errors and close method

2010-03-28 Thread Manlio Perillo
Graham Dumpleton wrote:
> On 28 March 2010 22:21, Manlio Perillo wrote:
>> Graham Dumpleton wrote:
>>> [...]
>>>> Unfortunately I never got to know what application or framework was
>>>> causing the problem.
>>>>
>>>> Any idea?
>> Sorry, my question was not clear.
>>
>> I was asking what applications or frameworks call the .close method on
>> the errors object.
> 
> I know what you were asking. My point was that it doesn't help to find
> out as nearly impossible to get them to change the code. 

Ok, thanks.

My point is that I don't have strict compatibility requirements for my
ngx_http_wsgi_module, as you have with Apache mod_wsgi.

As an example, the other day I removed support for CPython
subinterpreters, since they make the code more complex than it should be.

The reason I want to know the "bad" applications/frameworks is that I
would like to see why they are calling the .close method.

[...]
>>> static PyGetSetDef Log_getset[] = {
>>> { "closed", (getter)Log_closed, NULL, 0 },
>>> #if PY_MAJOR_VERSION < 3
>>> { "softspace", (getter)Log_get_softspace, (setter)Log_set_softspace, 0 },
>>> #else
>> I noticed that you added the softspace descriptor in recent versions.
>> What is its purpose?
>> Is it here just for compatibility?
> 
> It is related to how comma separated lists and comma at end of line is
> used in the following.
> 
>   print >> sys.stderr, "a", "b",
>   print >> sys.stderr, "c"
> 

I will check it with my module, thanks.


Regards  Manlio


Re: [Web-SIG] wsgi.errors and close method

2010-03-28 Thread Manlio Perillo
Graham Dumpleton wrote:
> [...]
>> Unfortunately I never got to know what application or framework was
>> causing the problem.
>>
>> Any idea?
> 

Sorry, my question was not clear.

I was asking what applications or frameworks call the .close method on
the errors object.

I want to check if:
* they are really calling the .close method on wsgi.errors, and why
* they are calling the .close method on stderr, and why


> [...]
> static PyGetSetDef Log_getset[] = {
> { "closed", (getter)Log_closed, NULL, 0 },
> #if PY_MAJOR_VERSION < 3
> { "softspace", (getter)Log_get_softspace, (setter)Log_set_softspace, 0 },
> #else

I noticed that you added the softspace descriptor in recent versions.
What is its purpose?
Is it there just for compatibility?



Thanks   Manlio


[Web-SIG] wsgi.errors and close method

2010-03-27 Thread Manlio Perillo
Hi.

Some time ago, someone reported to me that an application embedded in Nginx
with my WSGI module failed to execute, since in my implementation the
wsgi.errors object does not implement the .close method.

The same object type is used to replace sys.stderr.

Of course, trying to close either wsgi.errors or sys.stderr means that an
application/framework is broken, IMHO.

Unfortunately I never got to know what application or framework was
causing the problem.

Any idea?


Thanks  Manlio


Re: [Web-SIG] Generic configuration

2010-03-18 Thread Manlio Perillo
Alex Morega wrote:
> On 17 Mar 2010, at 13:47, Manlio Perillo wrote:
> [...]
>>> =
>>> [daemon]
>>> factory = egg:PasteScript#wsgiutils
>>> host = 127.0.0.1
>>> port = 8000
>>> app = my_site
>>>
>>> [...]
>>>
>> If you want this, isn't it more simple and generic to use YAML?
> 
> Yaml buys you flexibility at the cost of readability, which might be a good 
> trade-off, but that's not the point. You still need a tool that reads the 
> configuration file and does the actual setup.
> 
> Does the wsgix configuration loader allow for plugins, i.e. defining my own 
> constructors? Is it documented?
> 

This is not a problem.

You can write your own YAML loader (maybe deriving it from the existing
one), write a small middleware that uses this loader, and push it onto the
middleware stack.

There is no need to support generic plugins.

> I chose to base my example on Paster configuration because it already knows 
> about egg entry points and explicitly pointing to factory functions.
> 


Manlio


Re: [Web-SIG] Generic configuration

2010-03-17 Thread Manlio Perillo
Alex Morega wrote:
> On 17 Mar 2010, at 0:24, Manlio Perillo wrote:
> 
>> Alex Morega wrote:
>>> Hello,
>>>
>>> This is not really a WSGI question, it's more into general configuration, 
>>> but I don't know of a better place to ask it.
>>>
> [...]
>> I use YAML with custom constructors:
>> http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/loader.py
>>
>> There is a middleware:
>>  http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/middleware.py
>> that reads a list of configuration files to load from the WSGI environ
>> (I set this in the Nginx mod_wsgi configuration), and merges all the
>> configuration into the WSGI environ.
>>
>> Using custom YAML constructors it is possible to do something like:
>>  http://hg.mperillo.ath.cx/wsgix/examples/file/tip/dbview/settings.yml
>>
> [...]
> 
> That's still configuring a piece of WSGI middleware or application. I'm 
> thinking about something along these lines:
> 

Yes, since it works quite differently from other frameworks.

Middleware is very easy to configure; in the Nginx configuration file:
   wsgi_middleware wsgix.conf.middleware;


The reason is that how you want to store configuration parameters should
not be hard-coded in the framework.
Using YAML in my framework does not prevent using other methods like
ConfigParser or Python modules.
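
To illustrate that the storage format is interchangeable, here is a rough
sketch of a configuration middleware built on the standard configparser
module instead of YAML. It is not the wsgix code: the environ keys
"example.conf.files" and "example.conf" and the name conf_middleware are
assumptions made up for this example:

```python
import configparser

# Hypothetical environ keys, chosen for this sketch only.
FILES_KEY = "example.conf.files"
CONF_KEY = "example.conf"


def conf_middleware(app):
    """Load the INI files listed in the environ and expose the result."""

    def middleware(environ, start_response):
        parser = configparser.ConfigParser()
        # read() silently skips files that do not exist.
        parser.read(environ.get(FILES_KEY, []))
        environ[CONF_KEY] = {
            name: dict(parser[name]) for name in parser.sections()
        }
        return app(environ, start_response)

    return middleware
```

The wrapped application then finds the merged settings under
environ["example.conf"], whatever format they were stored in.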

> =
> [daemon]
> factory = egg:PasteScript#wsgiutils
> host = 127.0.0.1
> port = 8000
> app = my_site
>
> [...]
>

If you want this, isn't it simpler and more generic to use YAML?


Manlio


Re: [Web-SIG] Generic configuration

2010-03-16 Thread Manlio Perillo
Alex Morega wrote:
> Hello,
> 
> This is not really a WSGI question, it's more into general configuration, but 
> I don't know of a better place to ask it.
> 
> Paster config files allow you to hook up WSGI applications, middleware, and a 
> server, plus some (undocumented?) magic configuration of the logging module. 
> But what about random components, like a database? Ideally I'd like to 
> specify a factory for database connections and give it some parameters; this 
> would return a reference to a new database connection. I could then pass this 
> reference to my wsgi app or middleware.
> 
> Apparently the pattern is to perform this database configuration as part of a 
> wsgi middleware, but that feels unnatural. Or one could do this outside of 
> the paste configuration file, but that just splits the configuration 
> needlessly into several pieces. Am I missing something obvious?
> 

I use YAML with custom constructors:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/loader.py

There is a middleware:
  http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/middleware.py
that reads a list of configuration files to load from the WSGI environ
(I set this in the Nginx mod_wsgi configuration), and merges all the
configuration into the WSGI environ.

Using custom YAML constructors it is possible to do something like:
  http://hg.mperillo.ath.cx/wsgix/examples/file/tip/dbview/settings.yml

It is also possible to configure the global Python logging, create
temporary files and so on.


> Thanks,
> -- Alex
> 


Manlio


Re: [Web-SIG] Migrating from mod_wsgi to FastCGI

2010-03-15 Thread Manlio Perillo
Gustavo Narea wrote:
> Hello,
> 
> We're considering migrating from mod_wsgi to FastCGI (Apache) because
> we'll need to use versions of Python compiled by ourselves.
> 

Note that you can simply recompile mod_wsgi to use your custom Python.

> [...]


Manlio


[Web-SIG] host_name and request_uri_path

2010-01-26 Thread Manlio Perillo
Hi.

Recently I have implemented these two functions:
http://paste.pocoo.org/show/170198/


I would like to know whether it is worth having them as separate functions
or whether there is a better way to get the host name and the request URI
path.


About the host_name function, why is it not included in wsgiref?
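
Since the paste may no longer be available, here is a rough sketch of what
two such helpers could look like. It is a reconstruction based on the URL
reconstruction rules of the WSGI spec (HTTP_HOST first, then SERVER_NAME;
SCRIPT_NAME plus PATH_INFO for the path), not necessarily the code that was
pasted:

```python
from urllib.parse import quote


def host_name(environ):
    """Return the host name of the request, without the port."""
    host = environ.get("HTTP_HOST")
    if not host:
        return environ["SERVER_NAME"]
    # Drop a trailing ":port" but leave a bare IPv6 literal such as
    # "[::1]" untouched (its last component is not purely numeric).
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit():
        return name
    return host


def request_uri_path(environ):
    """Return the quoted path portion of the request URI."""
    return quote(environ.get("SCRIPT_NAME", "")) + quote(
        environ.get("PATH_INFO", "")
    )
```

For example, an environ with HTTP_HOST set to "example.com:8000",
SCRIPT_NAME "/app" and PATH_INFO "/a b" gives "example.com" and
"/app/a%20b".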


Thanks  Manlio


Re: [Web-SIG] CGI WSGI and Unicode

2009-12-07 Thread Manlio Perillo
Graham Dumpleton wrote:

Note: I'm sending the entire message to the mailing list.

> 2009/12/7 Manlio Perillo :
>> Hi.
>>
>> I'm playing with Python 3.x, current revision.
>>
>> I have noticed that the data in os.environ are now Unicode strings.
>>
>> In a CGI application, HTTP headers are Unicode strings, and are decoded
>> using system default encoding.
>> In a future WSGI application, HTTP headers are Unicode strings, and are
>> decoded using latin-1 encoding.
>>
>> In both cases, 'surrogateescape' is used.
> 
> No, 'surrogateescape' is not necessary when using latin-1, or at least
> for variables which use latin-1.
> 

The problem is that not all browsers use latin-1;
HTTP Digest authentication is one example.

> Use of 'surrogateescape' is only relevant in the context of some web
> servers and only relevant for specific variables, some of which aren't
> even part of set of variables which are required by WSGI.
> 
> For example, in Apache/mod_wsgi, 'surrogateescape' is used on
> DOCUMENT_ROOT and SCRIPT_FILENAME. 

What about HTTP_COOKIE?

> [...] 
>> Can this cause troubles and incompatibility problems?
>> I'm interested in special header handling, like cookies, that contain
>> opaque data.
> 
> The issues with a CGI/WSGI bridge in Python 3.X have been discussed
> previously on the list.

It seems I missed it.

> It is acknowledged that there are problems to
> be solved there, at least to extent that CGI/WSGI bridge
> implementation has to correct the encoding, and also that that may
> only be solvable in Python 3.1 onwards due to not having access to
> what encoding was used for environment variables in Python 3.0. Not
> many people care about CGI these days and so no one has bothered to
> come up with a working CGI/WSGI bridge for Python 3.X.
> 

CGI is very important; there are some kinds of web applications that have
problems when executing in a long-running process.

As an example, I prefer to run Trac and Mercurial instances as CGI.

> Graham


Regards  Manlio


[Web-SIG] CGI WSGI and Unicode

2009-12-06 Thread Manlio Perillo
Hi.

I'm playing with Python 3.x, current revision.

I have noticed that the data in os.environ are now Unicode strings.

In a CGI application, HTTP headers are Unicode strings, and are decoded
using the system default encoding.
In a future WSGI application, HTTP headers are Unicode strings, and are
decoded using the latin-1 encoding.

In both cases, 'surrogateescape' is used.

Can this cause trouble and incompatibility problems?
I'm interested in the handling of special headers, like cookies, that
contain opaque data.
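
A small sketch of what the two decoding strategies do to opaque bytes. The
cookie value below is invented for the example; only built-in codecs and
the PEP 383 'surrogateescape' error handler are used:

```python
raw = b"session=\xe0\xe8\xff"  # cookie bytes that are not valid UTF-8

# latin-1 maps every byte to one code point, so decoding never fails
# and the original bytes can always be recovered.
as_latin1 = raw.decode("latin-1")
assert as_latin1.encode("latin-1") == raw

# A stricter codec needs surrogateescape to keep the undecodable bytes.
as_utf8 = raw.decode("utf-8", "surrogateescape")
assert as_utf8.encode("utf-8", "surrogateescape") == raw

# Without the handler on the way back, the lone surrogates are an error.
lossy = False
try:
    as_utf8.encode("utf-8")
except UnicodeEncodeError:
    lossy = True
```

So the data survives in both cases, provided that every piece of code that
converts back to bytes uses the same encoding and error handler.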


Thanks  Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-04 Thread Manlio Perillo
Henry Precheur wrote:
> On Fri, Dec 04, 2009 at 07:40:55PM +0100, Manlio Perillo wrote:
>> Which functions do not work with byte strings?
> 
> Just to make things clear, I was talking about Python 3.
> 

I know.

Unfortunately I don't have Python 3 installed; I'm just reading the code.

> All the functions I tried not ending with _from_bytes raise an exception
> with bytes. This includes urllib.parse.parse_qs & urllib.parse.urlparse
> which are rather critical ...
> 

Ah, ok.
Can you show me the traceback of parse_qs? Thanks.


>> First of all, HTTP never says that whole headers are of type TEXT.
>> Only specific components are of type TEXT.
> 
> If parts of a header contain latin-1 characters, that means its
> encoding is latin-1 (at least partially).
> 

This is not completely true.

> [...]

> And WSGI is not about HTTP in a distant future, it's about HTTP right
> now.
> 
>> Do you really want to define the new WSGI specification to be "against"
>> the new (possible) HTTP spec?
> 
> I don't know why it would be "against" it.

Well, that is why I put it in quotes.
What I mean is that, IMHO:

- Using Unicode strings in WSGI is an abuse of Unicode strings
- This abuse is not justified by the HTTP spec


> [...]


Regards  Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-04 Thread Manlio Perillo
Henry Precheur wrote:
> On Fri, Dec 04, 2009 at 10:17:09AM +0100, Manlio Perillo wrote:
>> It is just as simple as using byte strings, IMHO.
> 
> No, it's not. There were lots of discussions regarding this on the
> mailing list. One of the main issues is that the standard library
> supports bytes poorly. urllib for example expects strings not bytes.
> 

I read last month's discussions 3 days ago!
The quote function supports byte strings, for example.

Which functions do not work with byte strings?

>>> * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
>>>   says. WSGI is about HTTP, but that doesn't necessarily includes all
>>>   other standards extending HTTP.
>>>
>> HTTP never says to consider whole headers as latin-1 text, IMHO.
> 
> It does:
> 
>   When no explicit charset parameter is provided by the sender, media
>   subtypes of the "text" type are defined to have a default charset value
>   of "ISO-8859-1" when received via HTTP.
> 
>   http://tools.ietf.org/html/rfc2616#section-3.7.1
> 

This is not correct.

First of all, HTTP never says that whole headers are of type TEXT.
Only specific components are of type TEXT.

Moreover, HTTPbis has finally clarified this; TEXT is no longer used,
and non-ASCII characters are instead to be considered opaque.

Do you really want to define the new WSGI specification to be "against"
the new (possible) HTTP spec?

Of course it will work; but since some code in the standard library
needs to be fixed anyway (wsgiref.util.application_uri, for example),
maybe it is better to fix it to work with byte strings.

Just my two cents.

> [...]


Regards  Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-04 Thread Manlio Perillo
And Clover wrote:
> Manlio Perillo wrote:
> 
>> Words of *TEXT MAY contain characters from character sets other than
>> ISO-8859-1 [22] only when encoded according to the rules of RFC 2047
> 
> Yeah, this is, unfortunately, a lie. The rules of RFC 2047 apply only to
> RFC*822-family 'atoms' and not elsewhere; indeed, RFC2047 itself
> specifically denies that an encoded-word can go in a quoted-string.
> 
> RFC2047 encoded-words are not on-topic in an HTTP header(*); this has
> been confirmed by newer development work on HTTPbis by Reschke et al.
> (http://tools.ietf.org/wg/httpbis/).
> 

Thanks.
HTTPbis seems to fix all these problems:

"Historically, HTTP has allowed field content with text in the ISO-
8859-1 [ISO-8859-1] character encoding and supported other character
sets only through use of [RFC2047] encoding.  In practice, most HTTP
header field values use only a subset of the US-ASCII character
encoding [USASCII].  Newly defined header fields SHOULD limit their
field values to US-ASCII characters.  Recipients SHOULD treat other
(obs-text) octets in field content as opaque data."


This is the new rule for `quoted-string`:

quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext         = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
               ; OWS / <VCHAR except DQUOTE and "\"> / obs-text
obs-text       = %x80-FF

quoted-pair    = "\" ( WSP / VCHAR / obs-text )
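
As a rough illustration of treating obs-text as opaque, the grammar above
can be applied directly to bytes. This is only a sketch: it treats OWS and
WSP as plain SP / HTAB (no obs-fold), and the name unquote is made up for
the example:

```python
import re

# SP / HTAB stand in for OWS and WSP; obs-fold is not handled here.
QDTEXT = rb"[\t \x21\x23-\x5b\x5d-\x7e\x80-\xff]"
QUOTED_PAIR = rb"\\[\t \x21-\x7e\x80-\xff]"
QUOTED_STRING = re.compile(
    rb'"((?:' + QDTEXT + rb"|" + QUOTED_PAIR + rb')*)"'
)


def unquote(value):
    """Return the content of a quoted-string, as uninterpreted bytes."""
    match = QUOTED_STRING.fullmatch(value)
    if match is None:
        raise ValueError("not a quoted-string: %r" % (value,))
    # Replace every quoted-pair with the octet that follows the backslash.
    return re.sub(rb"\\(.)", rb"\1", match.group(1), flags=re.DOTALL)
```

The result stays a byte string; whether the octets in the %x80-FF range
are latin-1, UTF-8 or something else is left to the caller.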


> The "correct" way of escaping header parameters in an RFC*822-family
> protocol would be RFC2231's complex encoding scheme, but HTTP is
> explicitly not an 822-family protocol despite sharing many of the same
> constructs. See
> http://tools.ietf.org/html/draft-reschke-rfc2231-in-http-06 for a
> strategy for how 2231 should interact with HTTP, but note that for now
> RFC2231-in-HTTP simply does not exist in any deployed tools.
> 

It seems reasonable.

> So for now there is basically nothing useful WSGI can do other than
> provide direct, byte-oriented (even if wrapped in 8859-1 unicode
> strings) access to headers.
> 

Yes, this is what I think.
I have some doubts about wrapping the headers in 8859-1 unicode strings,
but luckily there is surrogateescape.



Regards  Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-04 Thread Manlio Perillo
Henry Precheur wrote:
> On Thu, Dec 03, 2009 at 09:15:06PM +0100, Manlio Perillo wrote:
>> There is something that I don't understand.
>>
>> Some HTTP headers, like Accept-Language, contains data described as
>> `token`, where:
>>
>> token  = 1*<any CHAR except CTLs or separators>
>>
>> So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded.
>> In Python 3.x it SHOULD be a byte string.
> 
> I think this is more an issue that frameworks should deal with. By
> decoding every headers value to latin-1:
> 
> * It keeps WSGI simple. Simple is good.
> 

Using byte strings is just as simple, IMHO.
Decoding to latin-1 is not simpler; it is convenient because of (if I
understand correctly) how code is converted by 2to3.

> * WSGI sticks to what RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1)
>   says. WSGI is about HTTP, but that doesn't necessarily includes all
>   other standards extending HTTP.
> 

HTTP never says to consider whole headers as latin-1 text, IMHO.

> * It's possible to convert latin-1 strings to bytes without losing data.
> 

Yes, but it is quite stupid to first convert to Unicode and then convert
back to a byte string.

It is true, however, that this does not happen often; only for:

- WSGI applications that implement an HTTP proxy
- WSGI applications that need to support HTTP Digest Authentication
- WSGI applications that store encoded data in cookies
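
For those cases the double conversion could look like the sketch below,
assuming the server hands over header values decoded as latin-1. The helper
names header_bytes and redecode are invented for the example:

```python
def header_bytes(value):
    """Return the original bytes behind a latin-1 decoded header value."""
    return value.encode("latin-1")


def redecode(value, encoding="utf-8"):
    """Reinterpret a latin-1 decoded header value in another encoding."""
    return header_bytes(value).decode(encoding)


# What the application receives when the client sent UTF-8 bytes.
received = "name=\u20ac".encode("utf-8").decode("latin-1")
cookie = redecode(received)
```

Here received is mojibake ("name=" followed by three latin-1 characters),
while cookie is the intended "name=" followed by the Euro sign.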


Regards  Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
And Clover wrote:
> Manlio Perillo wrote:
> 
>> However what about URI (that is, for PATH_INFO and the like)?
>> For URI (if I remember correctly) the suggested encoding is UTF-8, so
>> URLS should be decoded using
> 
>>   url.decode('utf-8', 'surrogateescape')
> 
>> Is this correct?
> 
> The currently-discussed proposal is ISO-8859-1, allowing the real bytes
> to be trivially extracted. This is consistent with the other headers and
> would be my preferred approach.
> 

There is something that I don't understand.

Some HTTP headers, like Accept-Language, contain data described as
`token`, where:

token  = 1*<any CHAR except CTLs or separators>

So a token, IMHO, is an opaque string, and it SHOULD NOT be decoded.
In Python 3.x it SHOULD be a byte string.

Text content is described as `TEXT`, where:

The TEXT rule is only used for descriptive field contents and values
that are not intended to be interpreted by the message parser. Words
of *TEXT MAY contain characters from character sets other than
ISO-8859-1 [22] only when encoded according to the rules of RFC 2047
[14].

TEXT   = <any OCTET except CTLs,
         but including LWS>


The only type of data where TEXT can be used is `quoted-string`.

A `quoted-string` only appears in well-specified portions of a header.
So, IMHO, it is *not* correct for a WSGI middleware to return all HTTP
headers as Unicode strings.

This is up to the application/framework, which must parse each header,
split it into components and handle each one as appropriate (as a byte
string, a Unicode string or an instance of some other data type).


> [...]


Regards   Manlio


Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread Manlio Perillo
Henry Precheur wrote:
> [...]
>> How is authorization username handled in common WSGI frameworks?
> 
> As far as I know, they don't handle this. They just return the string
> without dealing with the encoding issues.
> 
> I think there is no correct way of handling this, because 99% of
> username/password contain only ascii characters. A possible 'workaround'
> would be to limit yourself to the ascii charset. If you get a non-ascii
> character raise an Exception.
> 

Right now I'm doing a: username.decode('us-ascii', 'replace')



Regards  Manlio
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
And Clover wrote:
> [...]
>> Cookie data SHOULD be transparent to the server/gateway; however WSGI is
>> going to assume that data is encoded in latin-1.
> 
> Yeah. This is no big deal because non-ASCII characters in cookies are
> already broken everywhere(*). Given this and other limitations on what
> characters can go in cookies, they are habitually encoded using ad-hoc
> mechanisms handled by the application (typically a round of URL-encoding).
> 
> *: in particular:
> 
> - Opera and Chrome send non-ASCII cookie characters in UTF-8.
> - IE encodes using the system codepage (which can never be UTF-8),
>   mangling any characters that don't fit in the codepage through the
>   traditional Windows 'similar replacement character' scheme.
> - Mozilla uses the low byte of each UTF-16 code point (so ISO-8859-1
>   gets through but everything else is mangled)
> - Safari refuses to send any cookie containing non-ASCII characters.
> 

Thanks for this summary.
I think it should go in a wiki or in a separate document (like a
rationale) accompanying the WSGI spec.

However, this should never happen with cookies, since cookie data is
opaque to the browser, and it MUST send it "as is".

What you describe happens with other headers containing TEXT.
And now I understand the strange behaviour of Firefox with non-latin-1
strings in the username, in HTTP Basic Authentication.

> [...]

Regards   Manlio


Re: [Web-SIG] HTTP headers encoding

2009-12-03 Thread Manlio Perillo
Manlio Perillo wrote:
> Hi.
> 
> I'm doing some tests to try to understand how HTTP headers are encoded
> by browsers.
> 
> I have written a simple WSGI application that asks authentication
> credentials and then print them on the terminal and return the data as
> response, as raw bytes
> http://paste.pocoo.org/show/154633/
> 

I'm now testing using HTTP Digest Authentication.
The application is here:
http://paste.pocoo.org/show/154667/

It uses my wsgix framework
http://hg.mperillo.ath.cx/wsgix/
since I don't want to rewrite the entire Digest Authentication handling.


As user name I use the string "àè€".
The results are:

- Firefox does not send any request, and instead shows me the returned
  response body "Authentication required".

  This is quite strange.

- Internet Explorer 6 encodes the username using cp1252, as always.

- Opera (10.01) encodes the username using utf-8.

I cannot test with Konqueror, since the wsgiref server has problems
with it.


All these implementations are against the HTTP spec.
username is a quoted string, and so it SHOULD be encoded using the
default latin-1, or using another charset, in which case it should be
formatted as specified by MIME (unfortunately there are no examples in
the HTTP spec).


This is really a mess.
How is the authorization username handled in common WSGI frameworks?




Thanks  Manlio


[Web-SIG] HTTP headers encoding

2009-12-03 Thread Manlio Perillo
Hi.

I'm doing some tests to try to understand how HTTP headers are encoded
by browsers.

I have written a simple WSGI application that asks for authentication
credentials, prints them on the terminal and returns the data in the
response, as raw bytes:
http://paste.pocoo.org/show/154633/

Then I used some browsers to try to send a username with non-ASCII
characters.


When I try with simple characters in the iso-8859-1 charset, things
work well; the data is encoded using this charset.

However, when I try to use a character outside that charset, like the
Euro sign, there are problems.

Firefox (Iceweasel 3.0.14, Linux Debian Squeeze) sends me a
'\xac'

I don't know where \xac comes from, but it is the last byte of the utf-8
encoded Euro: '\xe2\x82\xac'


Internet Explorer 6.0 sends me a
'\x80'
and this is the Euro character encoded using cp1252 (I suspect that it
always uses this encoding, instead of iso-8859-1).
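
The byte values reported above can be checked directly in Python 3. This
only reproduces the encodings; the remark about truncation is a guess at
what Firefox might be doing, not something verified against its source.

```python
euro = "\u20ac"  # the Euro sign

print(euro.encode("utf-8"))    # b'\xe2\x82\xac'
print(euro.encode("cp1252"))   # b'\x80'

# iso-8859-1 has no Euro sign, so a strict encode fails.
try:
    euro.encode("iso-8859-1")
except UnicodeEncodeError as exc:
    print("not representable in iso-8859-1:", exc.reason)

# The low byte of the code point U+20AC is also 0xAC, so the '\xac' seen
# from Firefox could be a truncated code point rather than a utf-8 byte.
print(hex(ord(euro) & 0xFF))   # 0xac
```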

Unfortunately I can not test with IE 7 and 8.



With a browser working on a terminal, like lynx, things get worse.
If I enter as user name the string "àè", lynx sends me
'\xc3\xa0\xc3\xa8'

This happens in a GNOME terminal, with an it_IT.utf8 locale.

wget and curl do the same.


Can someone else reproduce this?



Thanks   Manlio


Re: [Web-SIG] Move to bless Graham's WSGI 1.1 as official spec

2009-12-03 Thread Manlio Perillo
James Y Knight ha scritto:
> I move to bless mod_wsgi's definition of WSGI 1.1 [1]
> [...]
> 
> [1] http://code.google.com/p/modwsgi/wiki/SupportForPython3X

Hi.

Just a few questions.

It is true that HTTP headers can be decoded assuming latin-1; and they
can be decoded using PEP 383.

However, what about URIs (that is, PATH_INFO and the like)?
For URIs (if I remember correctly) the suggested encoding is UTF-8, so
URLs should be decoded using

  url.decode('utf-8', 'surrogateescape')

Is this correct?
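
As a quick illustration of what the surrogateescape handler from PEP 383
does (Python 3.1 or later), here is a small round trip. The sample path
is made up for the example.

```python
raw_path = b"/caf\xe9/"          # latin-1 encoded bytes, not valid UTF-8

# Undecodable bytes become lone surrogates instead of raising an error.
path = raw_path.decode("utf-8", "surrogateescape")
print(ascii(path))               # '/caf\udce9/'

# Encoding with the same handler restores the original bytes exactly.
restored = path.encode("utf-8", "surrogateescape")
print(restored == raw_path)      # True
```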


Now another question.
Let's consider the `wsgiref.util.application_uri` function

def application_uri(environ):
    url = environ['wsgi.url_scheme']+'://'
    from urllib.parse import quote

    if environ.get('HTTP_HOST'):
        url += environ['HTTP_HOST']
    else:
        url += environ['SERVER_NAME']

        if environ['wsgi.url_scheme'] == 'https':
            if environ['SERVER_PORT'] != '443':
                url += ':' + environ['SERVER_PORT']
        else:
            if environ['SERVER_PORT'] != '80':
                url += ':' + environ['SERVER_PORT']

    url += quote(environ.get('SCRIPT_NAME') or '/')
    return url


There is a potential problem, here, with the quote function.
This function does the following:

def quote(string, safe='/', encoding=None, errors=None):
    if isinstance(string, str):
        if encoding is None:
            encoding = 'utf-8'
        if errors is None:
            errors = 'strict'
        string = string.encode(encoding, errors)

This means that if we use surrogateescape, the information about the
original bytes is lost here.
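
A small sketch of the problem and of one possible workaround, assuming a
current Python 3 where quote() accepts an errors argument. As far as I can
tell the default 'strict' handler refuses the escaped byte outright, while
passing 'surrogateescape' recovers it. The SCRIPT_NAME value is invented
for the example.

```python
from urllib.parse import quote

script_name = b"/caf\xe9".decode("utf-8", "surrogateescape")

try:
    quote(script_name)                       # default errors='strict'
except UnicodeEncodeError:
    print("strict encoding rejects the escaped byte")

# Passing the same error handler lets quote() recover the original byte.
quoted = quote(script_name, errors="surrogateescape")
print(quoted)                                # /caf%E9
```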

This can be easily fixed by changing the application_uri function, but
this also means that a WSGI application will not work with Python 3.1.x.


Finally, a question about cookies.
Cookie data SHOULD be transparent to the server/gateway; however WSGI is
going to assume that data is encoded in latin-1.

I don't know what the HTTP/Cookie spec says about this.
However, from a WSGI application point of view, the cookie data can, as
an example, contain some text encoded in UTF-8; this means that the
application must first encode the data:

  cookie_bytes = cookie.encode('latin-1', 'surrogateescape')

and then decode it using UTF-8:

  my_cookie_data = cookie_bytes.decode('utf-8')


This is a bit unreasonable, but I don't know if this is a common
practice (I do this, just to make an example).
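
To make the two steps concrete, here is a self-contained sketch. It assumes
the gateway decoded the header as plain latin-1, in which case the
surrogateescape handler is not needed for the first step; the cookie value
is made up.

```python
# What the client actually sent: UTF-8 encoded text in the Cookie header.
wire_bytes = "name=caffè".encode("utf-8")

# What the application sees if the gateway decodes headers as latin-1.
cookie = wire_bytes.decode("latin-1")

# Undo the gateway's decoding, then apply the real encoding.
cookie_bytes = cookie.encode("latin-1")
my_cookie_data = cookie_bytes.decode("utf-8")
print(my_cookie_data)            # name=caffè
```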



Manlio Perillo


Re: [Web-SIG] Closing long-running WSGI requests (possible?)

2009-04-13 Thread Manlio Perillo
Chimezie Ogbuji ha scritto:
> Hello.  I have a problem with a WSGI-based SPARQL server that I have been
> unable to resolve for some time.  I was told this is the best place to ask
> :).  I'm building a SPARQL [1] server that is deployed as  WSGI/Paste
> server.  SPARQL queries are handled by the server and evaluated against a
> MySQL database using mysql-python/MySQLdb to manage the connection.
> 
> My goal is to be able to allow clients to close the connection in order to
> kill queries that have been dispatched (in order to 'abort' them).
> Unfortunately, when the client kills the connection, the application is not
> signaled in any way.  So, the result is that (for long-running queries), the
> MySQL query continues to run even after the connection is closed (by
> clicking cancel in the browser for instance).
> 
> [...]

What you want to do is not possible.

A more viable solution is to use JavaScript.
Add a custom "abort button" to the web page and associate a function
with its "click" event.

Also, you should associate a function with the "unload" event (where you
can check if there are active queries).

In the JavaScript function you can issue an XMLHttpRequest, using a
unique identifier.

Note that if you use PostgreSQL, you can use:
http://www.postgresql.org/docs/8.3/interactive/protocol-flow.html#AEN73870

When you create a connection to PostgreSQL, the server will send you the
backend process id and a unique key.

You can use this data to send a cancellation request.
All you need to do is to pass the process id and the unique key to the
client (with some encryption so that the client can use the data only once).

Unfortunately, libpq does not offer a flexible interface to this feature.
The PGcancel structure is opaque, so you need some hacking.
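
One way to get the "use only once" property is sketched below. Note that it
swaps the encryption idea for a simpler technique: the process id and key
stay on the server, and the client only receives an opaque random token.
The function names and the in-memory dictionary are hypothetical, and
actually sending the cancel request to PostgreSQL is left out.

```python
import secrets

# token -> (backend_pid, secret_key); kept on the server side only.
_pending_cancels = {}


def issue_cancel_token(backend_pid: int, secret_key: int) -> str:
    """Register a query's cancellation data and return an opaque token."""
    token = secrets.token_urlsafe(16)
    _pending_cancels[token] = (backend_pid, secret_key)
    return token


def redeem_cancel_token(token: str):
    """Return (backend_pid, secret_key) once; later calls return None."""
    return _pending_cancels.pop(token, None)


token = issue_cancel_token(12345, 987654321)
print(redeem_cancel_token(token))   # (12345, 987654321)
print(redeem_cancel_token(token))   # None
```

With multiple server processes the dictionary would have to live in shared
storage, which this sketch does not address.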



Manlio Perillo


Re: [Web-SIG] HTML parsing - get text position and font size

2009-01-12 Thread Manlio Perillo

Girish Redekar ha scritto:
> I'm trying to build a search engine in python and am stuck at the place
> where I parse HTML to get useful text. One should ideally be able to
> parse the text (out of HTML tags) along with its position (for phrase
> searches) and font-size (to weigh words appropriately).




Word weighting should be based on semantics, not style.

However, if you really need it, for CSS parsing there is the cssutils package.
I'm writing a CSS parser, too:
http://hg.mperillo.ath.cx/pdfimg/file/tip/pdfimg/style/css/

using PLY, so it should be easy to read/modify.
It is still at a very early stage.



> [...]


Regards  Manlio Perillo


[Web-SIG] setup_testing_defaults and SERVER_PROTOCOL

2008-12-18 Thread Manlio Perillo
Wouldn't it be more appropriate for the wsgiref.util.setup_testing_defaults
function to set SERVER_PROTOCOL to HTTP/1.1, instead of HTTP/1.0, since
HTTP/1.1 is the current version of the protocol?
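
For reference, the value can be inspected and overridden as follows. As far
as I know the function only fills in keys that are missing, so a test that
needs HTTP/1.1 can set it beforehand.

```python
from wsgiref.util import setup_testing_defaults

# Let the function choose the default value.
default_environ = {}
setup_testing_defaults(default_environ)
print(default_environ["SERVER_PROTOCOL"])

# A value set in advance is left untouched.
custom_environ = {"SERVER_PROTOCOL": "HTTP/1.1"}
setup_testing_defaults(custom_environ)
print(custom_environ["SERVER_PROTOCOL"])   # HTTP/1.1
```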





Thanks   Manlio Perillo


[Web-SIG] logging support in a multiprocess web server

2008-12-17 Thread Manlio Perillo

Hi.

I have noticed that some WSGI based web applications use the standard
logging module for logging.


However I have some doubts about how this works when the application is
embedded in a web server that uses multiple processes (like Nginx or
Apache with prefork).




Thanks   Manlio Perillo




Re: [Web-SIG] handling URLs with ending slash

2008-12-14 Thread Manlio Perillo

Thomas Broyer ha scritto:

> On Sun, Dec 14, 2008 at 11:23 AM, Manlio Perillo wrote:
>> In my WSGI applications I always have an ending slash to the URLs.
>> This means that an URL without the ending slash will cause the underlying
>> resource to return 404 Not Found HTTP response.
>>
>> What is the best method to handle this, using a regex based URL dispatcher?
>
> I would add some kind of "catch-all entry" to dispatch to a "trailing
> slash redirector" WSGI app:
>
>    routes.add("[^/]$", force_trailing_slash)

I'm not sure I like this.

> or eventually add a WSGI middleware to each mapped application

The URL dispatcher is a WSGI middleware, so it is ok for me to do this
in the URL dispatcher.
I would like to keep the number of middlewares to a minimum (function
calls in Python are not cheap).

> (...that need such a treatment, could be all of them) that would issue
> a redirect to the "slash-appended" URL when needed, or just pass
> through to the application otherwise:
>
>    routes.add(, force_trailing_slash(my_application))



Yes, that's an idea.
Note that this is a special case of an "URL normalizer" middleware.
A middleware can be used as a function decorator.
This is a point in favour to use a dedicated middleware for this.
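
A minimal sketch of such a middleware used as a decorator. The name
force_trailing_slash comes from the message above; the body is only my
guess at how it could look, it sends a relative Location header, and it
does no quoting of the path.

```python
def force_trailing_slash(app):
    """Wrap a WSGI app so that paths without a trailing slash are redirected."""
    def wrapper(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.endswith("/"):
            return app(environ, start_response)
        location = environ.get("SCRIPT_NAME", "") + path + "/"
        query = environ.get("QUERY_STRING")
        if query:
            location += "?" + query
        start_response("301 Moved Permanently",
                       [("Location", location), ("Content-Length", "0")])
        return [b""]
    return wrapper


@force_trailing_slash
def my_application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]
```

The cost is one extra function call per request for each decorated
application, which is the overhead mentioned earlier in this message.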


Thanks   Manlio Perillo


Re: [Web-SIG] handling URLs with ending slash

2008-12-14 Thread Manlio Perillo

Randy Syring ha scritto:

> Manilo,

Manlio not Manilo, please!

> Here is a thread on this topic, well a partial thread, start reading
> about half way down:
>
> http://groups.google.com/group/pylons-discuss/browse_thread/thread/6888b790239b488b
>
> I found it informative.



Thanks, it is interesting.



Regards  Manlio Perillo


[Web-SIG] handling URLs with ending slash

2008-12-14 Thread Manlio Perillo

Hi.

In my WSGI applications the URLs always end with a slash.
This means that a URL without the ending slash will cause the
underlying resource to return a 404 Not Found HTTP response.


What is the best method to handle this, using a regex based URL dispatcher?

I'm planning to add an option to my URL dispatcher to force any URL to
have an ending slash (for example by issuing an HTTP redirect, either
302 or 301, or by just internally modifying the URL), but I'm not sure
this is the best solution.



Thanks   Manlio Perillo


Re: [Web-SIG] wsgiref.validate allows wsgi.input.read() with no argument

2008-12-12 Thread Manlio Perillo

Graham Dumpleton ha scritto:

> Just noticed that although WSGI PEP doesn't specifically mention that
> argument to read() on wsgi.input is optional, wsgiref.validate allows
> calling read() with no argument.



wsgiref.validate also makes other assumptions about a WSGI application
that are not required by the WSGI PEP.

As an example it reports as an error the presence in the environ
dictionary of HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH, but the PEP
says nothing about this, and CGI [1] says:
"""The server may exclude any headers which it has already processed,
such as Authorization, Content-type, and Content-length. If necessary,
the server may choose to exclude any or all of these headers if
including them would exceed any system environment limits."""


[1] http://hoohoo.ncsa.uiuc.edu/cgi/env.html

P.S.:
the link "http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt" is broken.



[...]



Regards  Manlio Perillo



Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-17 Thread Manlio Perillo

Ian Bicking ha scritto:

> [...]
>> Fine for me, but of course we need to do this as:
>> 1) Errata to WSGI 1.0
>> or
>> 2) WSGI 1.1
>> or
>> 3) WSGI 2.0
>>
>> You can't just modify the current WSGI 1.0 spec.
>>
>> I'm for 2), with the other clarifications about WSGI we have discussed
>> in the past.
>
> I'm for 1.  What other clarifications were you thinking of?



Here is a list of messages I have posted in the past.

- start_response and error checking
  25 September 2007
  http://mail.python.org/pipermail/web-sig/2007-September/002771.html
- hop-by-hop headers handling
  1 October 2007
  http://mail.python.org/pipermail/web-sig/2007-October/002775.html
- HTTP_CONTENT_TYPE and HTTP_CONTENT_LENGTH
  12 December 2007
  http://mail.python.org/pipermail/web-sig/2007-December/003014.html
- a possible error in the WSGI spec
  20 December 2007
  http://mail.python.org/pipermail/web-sig/2007-December/003064.html
- calling start_response and the write from a separate thread
  27 December 2007
  http://mail.python.org/pipermail/web-sig/2007-December/003104.html
- WSGI and PEP 325
  20 May 2008
  http://mail.python.org/pipermail/web-sig/2008-May/003438.html


I'm fairly sure there were other threads about clarifications of WSGI 1.0.

One of these was about whether a WSGI gateway is allowed to skip the
generation of the response body (assuming the WSGI application returns a
generator) if this is not required (the client's cached copy of the
entity is up to date and the server is going to return 304 Not
Modified).




Regards   Manlio Perillo


Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-17 Thread Manlio Perillo

Phillip J. Eby ha scritto:

> At 08:49 PM 11/17/2008 +0100, Manlio Perillo wrote:
>> Ian Bicking ha scritto:
>>> [...]
>>> We need to propose a change to the WSGI specification.  I propose, in
>>> "Input and Error Streams"
>>> (http://www.python.org/dev/peps/pep-0333/#input-and-error-streams) we
>>> change it to have "readline(hint)" and expand Note 3 to include
>>> readline as well as readlines, removing Note 2.  Also I suppose some
>>> sort of change note in the specification?
>>> Does this sound like a sufficient change to the spec, and are there
>>> any objections to the change?
>>
>> Fine for me, but of course we need to do this as:
>> 1) Errata to WSGI 1.0
>> or
>> 2) WSGI 1.1
>> or
>> 3) WSGI 2.0
>>
>> You can't just modify the current WSGI 1.0 spec.
>>
>> I'm for 2), with the other clarifications about WSGI we have discussed
>> in the past.
>
> I'm more inclined towards #1.


I'm not sure, since it is an API change; of course if there was an error
in the API this should be an erratum, but there is a rationale behind the
current API.


I'm fine, however, with an amendment.


> [...]


Regards   Manlio Perillo


Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-17 Thread Manlio Perillo

Ian Bicking ha scritto:

> [...]
> We need to propose a change to the WSGI specification.  I propose, in
> "Input and Error Streams"
> (http://www.python.org/dev/peps/pep-0333/#input-and-error-streams) we
> change it to have "readline(hint)" and expand Note 3 to include readline
> as well as readlines, removing Note 2.  Also I suppose some sort of
> change note in the specification?
>
> Does this sound like a sufficient change to the spec, and are there any
> objections to the change?




Fine for me, but of course we need to do this as:
1) Errata to WSGI 1.0
or
2) WSGI 1.1
or
3) WSGI 2.0

You can't just modify the current WSGI 1.0 spec.

I'm for 2), with the other clarifications about WSGI we have discussed 
in the past.




Regards  Manlio Perillo


Re: [Web-SIG] Async API for Python

2008-09-10 Thread Manlio Perillo

Jerry Spicklemire ha scritto:

> Sorry, if this turns up twice ...
>
> Phillip J. Eby wrote, on Tue Jul 29 03:21:18 CEST 2008:
>
> "There is no async API that's part of WSGI itself, and it's
> unlikely there will ever be one unless there ends up
> being an async API for Python as well."
>
> http://mail.python.org/pipermail/web-sig/2008-July/003547.html
>
> Following up, perhaps this would be of interest:
>
> "New PEP proposal: C Micro-Threading"
>
> "This PEP adds micro-threading (or 'green threads')
> at the C level so that micro-threading is built in and
> can be used with very little coding effort at the python
> level.



Personally I think that implementing a standard reactor in Python is bad.
The Micro-Threading should just offer an API, like Twisted Deferred, 
generators and greenlets do; the reactor should be implemented separately.


> [...]


Manlio Perillo


[Web-SIG] a new implementation of multipart/form-data parser

2008-09-10 Thread Manlio Perillo

Hi all.

For my WSGI framework I have implemented a multipart/form-data parser.
http://hg.mperillo.ath.cx/wsgix/diff/70aacc4a8301/wsgix/parse.py

The code has been adapted from cgi.parse_multipart.

I think that the function is more robust than FieldStorage, since you can
set a max size for field data stored in memory.
The code is simpler, too (since I have done a little review of
current browsers' behaviour, and none of them use multipart/mixed when
encoding multiple file fields with the same name).


Now I'm going to write a middleware that takes a POST request with data
encoded in multipart/form-data, and transcodes the request entity to
application/x-www-form-urlencoded, with file fields saved as:

field_name=&field_path=&field_content_type=

where  is the temporary path where the file has been stored.

Note that there is a Nginx module
http://www.grid.net.ru/nginx/upload.en.html

that does this (but doesn't transcode to application/x-www-form-urlencoded).


Anyone interested?
I would really like some reviews.



Thanks   Manlio Perillo


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Manlio Perillo

Deron Meranda ha scritto:

> [...]
>> But, at this point, can one consider the content of form post to be encoded
>> "text" string?
>>
>> Or it should be considered encoded "byte" string?
>
> Both/either.
>
> I'd say follow the RFC, but perhaps allow a caller to provide
> an override default.  So yes, you should assume an encoded
> string if the subpart has a text/* Content-Type, or if it has no
> content type at all (which must then be assumed to be text/plain
> US-ASCII).  That is the intent of the MIME text/* media type
> after all; that it should be interpreted as a character string
> and not a byte string.
>
> In other cases, I would say returning a byte string is the
> correct thing to do.




I'm not sure I understand.
If you want non-text data in the POST request body, you can use the file
control.


I can't really see use cases for normal input fields containing byte strings.


> [...]


Manlio Perillo


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Manlio Perillo

James Y Knight ha scritto:

> On Jul 29, 2008, at 1:14 PM, Bill Janssen wrote:
>>> Ok with theory.
>>> But in practice:
>>
>> Seems like you're looking at a broken browser there.
>>
>> Can anyone point to where a W3C standard or IETF RFC describes this
>> behavior?
>
> You seem to be under the mistaken impression that form post content is
> MIME. It is not. It looks kinda like it should be, and maybe it's even
> specified to be [rfc2388], but actually treating it as MIME is a rather
> critical error. RFC2388 is just wrong, don't believe a thing it says.




But, at this point, can one consider the content of a form post to be
an encoded "text" string?


Or should it be considered an encoded "byte" string?


> [...]


Manlio Perillo


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Manlio Perillo

Bill Janssen ha scritto:

>> Ok with theory.
>> But in practice:
>
> Seems like you're looking at a broken browser there.



Right.
It's Firefox.
But it's the same with IE 6 and Opera.


> Can anyone point to where a W3C standard or IETF RFC describes this
> behavior?
>
>> I think that it is safe to decode data from the QUERY_STRING and POST
>> data to Unicode, and to return Bad Request in case of errors.
>
> It's clearly not safe to do so generally.  If you do decide to do
> this, please tell me what framework you're building so that I can
> avoid it :-).



No, wait.
I don't blindly guess the encoding.

I first try the content-type header, then the special _charset_ field, 
and finally utf-8.



If there is a problem in the decoding, the client is broken (or there is 
a bug in the application).

So the correct response is Bad Request, IMHO.
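
The lookup order described above can be sketched as follows. This is not
the wsgix code: the function name and signature are hypothetical, and I am
assuming that "try" means picking the first charset that is declared
(header, then _charset_, then utf-8) rather than falling through on
decoding errors.

```python
from email.message import Message


def decode_form_value(raw, content_type=None, charset_field=None):
    """Decode one form value, or raise ValueError (to be mapped to 400)."""
    candidates = []
    if content_type:
        # Let the stdlib parse the charset parameter of the header value.
        msg = Message()
        msg["Content-Type"] = content_type
        header_charset = msg.get_content_charset()
        if header_charset:
            candidates.append(header_charset)
    if charset_field:
        candidates.append(charset_field)
    candidates.append("utf-8")

    # Use the first declared charset; do not guess further if it fails.
    charset = candidates[0]
    try:
        return raw.decode(charset)
    except (UnicodeDecodeError, LookupError) as exc:
        raise ValueError("400 Bad Request: cannot decode as %s" % charset) from exc
```

The caller would catch ValueError and answer with a 400 Bad Request status.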


> Bill




Manlio Perillo


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-29 Thread Manlio Perillo

Bill Janssen ha scritto:

>>> That's probably wrong.  We went through this recently on the
>>> python-dev list.  While it's possible to tell the encoding of
>>> multipart/form-data,
>>
>> With multipart/form-data the problem should be the same.
>> The content type is defined only for file fields.
>
> Actually, it's defined for all fields, isn't it?  From RFC 2388:
>
> ``As with all multipart MIME types, each part has an optional
> "Content-Type", which defaults to text/plain.''
>
> So the type is "text/plain" unless it says something else.  And,
> according to RFC 2046, the default charset for "text/plain" is
> "US-ASCII".



Ok with theory.
But in practice:




Content-Type: multipart/form-data; boundary=abcde
abcde
Content-Disposition: form-data; name="Title"

hello
abcde
Content-Disposition: form-data; name="body"

à Úìòù
abcde


In theory I should assume ascii encoded data for the body field; and
since this data can not be decoded, I should treat it as a byte string.


However the body field is encoded in utf-8, and if I add a hidden
_charset_ field, FF and IE include this field in the request, with the
charset used in the encoding.



I think that it is safe to decode data from the QUERY_STRING and POST
data to Unicode, and to return Bad Request in case of errors.


If the user has specialized needs, they can use low level parsing functions.

In wsgix the "high" level functions are parse_query_string and
parse_simple_post_data; the "low" level function is parse_qs.


> [...]



Thanks   Manlio Perillo


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-28 Thread Manlio Perillo

Bill Janssen ha scritto:
>> In wsgix I use utf-8 for decoding the QUERY_STRING, and the charset
>> specified in the POST'ed data (utf-8 or the charset found in the special
>> _charset_ field).
>
> That's probably wrong.  We went through this recently on the
> python-dev list.  While it's possible to tell the encoding of
> multipart/form-data,

With multipart/form-data the problem should be the same.
The content type is defined only for file fields.

> the query_string and x-www-form-urlencoded data
> may be in arbitrary character set encodings (see RFC 3986).  It's
> probably best to not try to map them to strings; instead, return byte
> arrays for the value, and only return strings for data that can be
> correctly decoded.  Otherwise, you lose information that the app
> cannot recover.



Interesting, thanks.

I have read the Django code and, as far as I can tell, it always decodes
data to strings, but using "replace" error handling.


Can you point me to the discussion on python-dev list?


Bill




Manlio Perillo


Re: [Web-SIG] parsing of urlencoded data and Unicode

2008-07-28 Thread Manlio Perillo

Ian Bicking ha scritto:

> Manlio Perillo wrote:
>> Hi.
>>
>> In my WSGI framework:
>> http://hg.mperillo.ath.cx/wsgix
>>
>> I have, in the `http` module, the functions `parse_query_string` and
>> `parse_simple_post_data`.
>>
>> The first parse the query string and return a dictionary of strings, the
>> latter parse the application/x-www-form-urlencoded client body and
>> return a dictionary of strings and the charset used by the client for
>> the unicode encoding.
>>
>> Now, I'm thinking if these two function should instead return Unicode
>> strings instead of plain strings.
>>
>> I think that Unicode strings should be returned, but I would like to
>> know what other web frameworks do.
>>
>> Django seems to convert to Unicode, but the Python standard library
>> does not (and I would like to know if changes are planned for Python
>> 3.x).
>
> WebOb decodes to request data to str, then lazily decodes to unicode
> based on the request encoding.  The request encoding is a bit fuzzy to
> calculate, which is part of why the decoding is lazy, so that the
> request encoding can be set or changed at any time.




Ok, thanks.
In wsgix I use utf-8 for decoding the QUERY_STRING, and the charset 
specified in the POST'ed data (utf-8 or the charset found in the special 
_charset_ field).




Manlio Perillo



Re: [Web-SIG] Could WSGI handle Asynchronous response?

2008-07-28 Thread Manlio Perillo

est ha scritto:

> I am writing a small 'comet'-like app using flup, something like
> this:
>
> def myapp(environ, start_response):
>     start_response('200 OK', [('Content-Type', 'text/plain')])
>     return ['Flup works!\n']   <- Could this be part of response output?


What do you mean by "part of response output"?


> Could I time.sleep() for a while then write other outputs?



Not with flup.



> if __name__ == '__main__':
>     from flup.server.fcgi import WSGIServer
>     WSGIServer(myapp, multiplexed=True, bindAddress=('0.0.0.0',
> )).run()
>
> So is WSGI really synchronous?


Not really.
Since you can return a generator, it's possible to support asynchronous 
programming, but the WSGI gateway must support it, as an example with 
Nginx mod_wsgi and some other implementations (search in the mailing 
list archive).


But this support has not been standardized.
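
To show what "returning a generator" looks like, here is a minimal sketch
in Python 3 syntax. Whether anything useful happens between the two chunks
depends entirely on the gateway; with a plain synchronous server the second
chunk is simply sent right after the first.

```python
def myapp(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])

    def body():
        yield b"first chunk\n"
        # A gateway with asynchronous support could suspend the application
        # here and resume it later; a plain synchronous server just goes on.
        yield b"second chunk\n"

    return body()


# Driving the application by hand, the way a gateway would:
chunks = list(myapp({}, lambda status, headers: None))
print(chunks)   # [b'first chunk\n', b'second chunk\n']
```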


> How can I handle asynchronous outputs with flup/WSGI ?



Regards  Manlio Perillo


[Web-SIG] parsing of urlencoded data and Unicode

2008-07-28 Thread Manlio Perillo

Hi.

In my WSGI framework:
http://hg.mperillo.ath.cx/wsgix

I have, in the `http` module, the functions `parse_query_string` and
`parse_simple_post_data`.

The first parses the query string and returns a dictionary of strings; the
latter parses the application/x-www-form-urlencoded client body and
returns a dictionary of strings and the charset used by the client for
the unicode encoding.


Now, I'm wondering whether these two functions should return Unicode
strings instead of plain strings.

I think that Unicode strings should be returned, but I would like to
know what other web frameworks do.

Django seems to convert to Unicode, but the Python standard library does 
not (and I would like to know if changes are planned for Python 3.x).




Thanks  Manlio Perillo



[Web-SIG] problem with wsgiref.util.request_uri and decoded uri

2008-07-23 Thread Manlio Perillo

I'm having a nightmare with encoded/decoded uri and request_uri function:

>>> from wsgiref.util import request_uri
>>> environ = {
... 'HTTP_HOST': 'www.test.org',
... 'SCRIPT_NAME': '',
... 'PATH_INFO': '/b%40x/',
... 'wsgi.url_scheme': 'http'
... }
>>> print request_uri(environ)
http://www.test.org/b%2540x/

Here I'm assuming that the WSGI gateway does *not* decode the URI.
The result of request_uri is incorrect, in this case.

On the other hand, if the WSGI gateway *does* decode the URI, I can no
longer handle '/' in the URI.


I can usually avoid having '/' in the URI, but right now I'm implementing a
WSGI application that implements a RESTful interface to an SQL database:

http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/sqltables.py

so I can not avoid fields with the '/' character in them.


The proposed solution in a previous thread
http://mail.python.org/pipermail/web-sig/2008-January/003122.html

is to implement a custom encoding scheme (like done in MoinMoin).

There are really no other good solutions?

Assuming that WSGI requires the URI not to be encoded, is the solution
then to modify the request_uri function, replacing:

quote(SCRIPT_NAME) with:
quote(unquote(SCRIPT_NAME))
?
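
The effect of the two variants can be seen with urllib.parse from Python 3
(the session earlier in this message is Python 2). The last line shows why
this does not help with an encoded '/': the round trip cannot tell it from
a real path separator.

```python
from urllib.parse import quote, unquote

path_info = "/b%40x/"                 # still percent-encoded by the gateway

print(quote(path_info))               # /b%2540x/  (the '%' is escaped again)
print(quote(unquote(path_info)))      # /b%40x/

# An encoded slash is lost in the round trip.
print(quote(unquote("/a%2Fb/")))      # /a/b/
```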


Where can I find information about alternative encoding schemes?


Thanks  Manlio Perillo


Re: [Web-SIG] Fwd: wsgiref.simple_server slow on slow network

2008-07-22 Thread Manlio Perillo

Tibor Arpas ha scritto:

> Hi,
> I'm quite new to python and I ran into a performance problem with
> wsgiref.simple_server. I'm running this little program.
>
> from wsgiref import simple_server
>
> def app(environ, start_response):
>     start_response('200 OK', [('content-type', 'text/html')])
>     return ['*'*5]
>
> httpd = simple_server.make_server('',8080,app)
> try:
>     httpd.serve_forever()
> except KeyboardInterrupt:
>     pass
>
> I get many hundreds of responses/second on my local computer, which is fine.
> But when I access this server through our VPN it performs very bad.



wsgiref is an iterative server, if I'm not wrong; it serves only one
request at a time.


On the loopback interface this is not a problem, but over the Internet the
latency of the connection makes the time of a single request high.


paste.httpserver uses a thread pool.
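
If switching servers is not an option, a threaded variant of the wsgiref
server can be put together from the standard library. This is a sketch in
Python 3 syntax, not a tuned setup; the class name ThreadingWSGIServer and
the serve() helper are mine, and it starts one thread per request instead
of using a pool like paste.httpserver does.

```python
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIServer, make_server


class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    """wsgiref's server, but each request is handled in its own thread."""
    daemon_threads = True


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"*" * 5]


def serve(port=8080):
    """Run the application until interrupted with Ctrl-C."""
    httpd = make_server("", port, app, server_class=ThreadingWSGIServer)
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        pass
```

Calling serve() starts the server; slow clients then no longer block each
other, although each request still pays the network latency.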

> [...]


Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-08 Thread Manlio Perillo

Donovan Preston ha scritto:


> On Jul 8, 2008, at 11:45 AM, Manlio Perillo wrote:
>
>> Using greenlets, there is always a current greenlet, so you can use
>> this for local storage.
>>
>> A library function can check if there is an active greenlet, and use
>> it as data key; otherwise it will use the current thread id.
>
> Yes, this is exactly what I did in the
> wrap_threading_local_with_coro_local here:
>
> http://donovanpreston.com:/eventlet/file/b6f9627e88df/eventlet/util.py

Ok.

>> However this will not work if you have an asynchronous server that
>> does not make use of greenlets.
>
> Exactly, which is why I am proposing just standardizing something that
> does exactly what people use threading.local for, but whose
> implementation is pluggable by the wsgi server.




But this will not be easy to implement, especially if it has to go in a
separate module.



Maybe it's better to have something like:

wsgiorg.local_scope
a function that returns the current request id.

The function itself is not bound to the current request, so it can be 
safely stored.


Maybe this would be easier to implement, I'm not sure.
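
To make the idea concrete, here is a rough sketch of how a library could build request-local storage on top of such a scope function. RequestLocal and thread_scope are hypothetical names, and the scope function is assumed to be whatever the server supplies; nothing here is an agreed wsgiorg interface.

```python
import threading


class RequestLocal:
    """Storage keyed by the id that the server-provided scope function returns."""

    def __init__(self, local_scope):
        self._local_scope = local_scope
        self._storage = {}

    def set(self, name, value):
        self._storage.setdefault(self._local_scope(), {})[name] = value

    def get(self, name, default=None):
        return self._storage.get(self._local_scope(), {}).get(name, default)

    def clear(self):
        """Drop everything stored for the current request."""
        self._storage.pop(self._local_scope(), None)


def thread_scope():
    """Scope function for a server that handles one request per thread."""
    return threading.get_ident()


requestlocal = RequestLocal(thread_scope)
```

An asynchronous server would pass in a different scope function, and someone (the server or a middleware) still has to call clear() when the request ends.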




Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-08 Thread Manlio Perillo

Donovan Preston wrote:


On Jul 7, 2008, at 6:11 PM, Phillip J. Eby wrote:


At 02:12 PM 7/7/2008 -0700, Donovan Preston wrote:

It seems to me that what is really needed here is an extension of wsgi
that specifies how to get, set, and list request local storage, and
for people to use that instead of the threadlocal module.


I don't follow why you wouldn't just put that in the environ.  (If you 
need it to be carried back from the application, use mutable objects 
in the environ.)


Yes, the logical place to store it is in the environ, but this whole 
thread is about having an api for doing request-local storage that 
doesn't involve passing the request everywhere.


Here's what I am imagining:

There's just a module, called requestlocal or something. It has an API 
just like threading.local(), except the implementation can be changed by 
the wsgi server.




Using greenlets, there is always a current greenlet, so you can use this 
for local storage.


A library function can check if there is an active greenlet, and use it 
as data key; otherwise it will use the current thread id.
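
A minimal sketch of that lookup, assuming the greenlet package may or may not be installed. current_key and local_data are names invented here; the only greenlet call used is getcurrent(), and the root greenlet of a thread is recognised by its parent being None.

```python
import threading

try:
    from greenlet import getcurrent
except ImportError:  # greenlet not installed: only threads are available
    getcurrent = None


def current_key():
    """Return a key for the active greenlet, or for the thread otherwise."""
    if getcurrent is not None:
        current = getcurrent()
        # The implicit root greenlet of a thread has no parent.
        if current.parent is not None:
            return ('greenlet', id(current))
    return ('thread', threading.get_ident())


_storage = {}


def local_data():
    """Dictionary private to the current greenlet or thread."""
    return _storage.setdefault(current_key(), {})
```

Entries are never removed in this sketch, and id() values can be reused once a greenlet is gone, so a real implementation would have to clean up when the request finishes.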


However this will not work if you have an asynchronous server that does 
not make use of greenlets.


> [...]


Manlio Perillo


Re: [Web-SIG] help with the implementation of a WSGI middleware

2008-07-08 Thread Manlio Perillo

Phillip J. Eby wrote:

At 11:21 PM 7/7/2008 +0200, Manlio Perillo wrote:

So this is not a "bad" middleware, IMHO.


True, but it's part of the application, rather than being transparent.



Ok, I agree.

Does this mean that such non-transparent middleware must not be
inserted inside the "gateway middleware stack", even if this is done
only as a convenience (so that you don't have to use a decorator for
every function)?




By the way, is a middleware that is responsible for user authentication:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/auth/http_middleware.py

a good middleware?

To keep it simple, the middleware checks whether there is an authorization
header and the credentials are correct.


If this is true, execute the WSGI application (setting 
environ['REMOTE_USER']), otherwise return a forbidden response.


Right - that's transparent middleware: the application doesn't need to 
know it's there.




I think that it's rather subtle.
If you remove the middleware, the application is no longer able to handle
authenticated users.


This is not a problem, the application is still able to work correctly, 
but the same applies to my messages middleware, IMHO.




Under WSGI 2.0, it's even easier since you don't need decorators to 
manipulate your response: you can just "return someapi(...)" where 
the "..." is whatever you were going to return directly.


return someapi() from inside the WSGI application?


Yes.



Do you have a working example?

Also, can you post an example of a middleware that needs to replace the 
environ dictionary?




Thanks   Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-08 Thread Manlio Perillo

Donovan Preston wrote:

[...]
It seems to me that what is really needed here is an extension of wsgi 
that specifies how to get, set, and list request local storage, and for 
people to use that instead of the threadlocal module. 


There seems to be something that I don't understand: why not just store 
the values inside the WSGI environ dictionary?


It is a per request dictionary, so it is really what you want.

> [...]


Manlio Perillo


Re: [Web-SIG] help with the implementation of a WSGI middleware

2008-07-07 Thread Manlio Perillo

Phillip J. Eby wrote:

At 09:58 PM 7/7/2008 +0200, Manlio Perillo wrote:
In this case the first solution is to use this middleware as a 
decorator, instead of a full middleware.


This is the correct way to implement non-transparent middleware; i.e., 
so-called middleware which is in fact an application API.  See:


http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html

for more about this.

Basically, if a piece of middleware has to be there for the application 
to run, it's not really "middleware"; it's a misnamed decorator.




Right, this is what I thought (and yes, I have read your article).

However, as a "justification" I used the following argument:
OK, the application does not "fully" work without the middleware,
however it "mainly" works, and it's not a big problem if messages are
not actually sent to the client.



Fortunately, in wsgix a "middleware" is very easy to use both in a full 
middleware stack and as a decorator (since all the state is maintained 
in the environ dictionary and there is no need for factory functions).


In Nginx you can do, in server config:

   wsgi_middleware  wsgix.contrib.messages;


However I want to document that this is not a "good" middleware.
"non-transparent middleware" is a good term, thanks.

In the original WSGI spec, I overestimated the usefulness of adding 
extension APIs to the environ... or more likely, I went along with some 
of Ian's overenthusiasm for the idea.  ;-)  Extension APIs in the 
environ just mean you have to write your code to handle the case where 
the API isn't there -- in which case you might as well have used a library.


Extension APIs really only make sense if they are true *server* 
features, not application features; otherwise, you are better off using 
a library rather than "middleware" per se.




Yes.
However my messages middleware does not "inject" an API into the WSGI 
environment.


The API uses the environ to store state; the middleware is only required 
to "activate" the cookies to actually send messages to the client.


So this is not a "bad" middleware, IMHO.


By the way, is a middleware that is responsible for user authentication:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/auth/http_middleware.py

a good middleware?

To keep it simple, the middleware checks whether there is an authorization
header and the credentials are correct.


If this is true, execute the WSGI application (setting 
environ['REMOTE_USER']), otherwise return a forbidden response.
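
For readers who have not seen such a middleware, a rough sketch of the behaviour described here follows. It is not the code from the wsgix repository: basic_auth and the users dictionary are hypothetical, only HTTP Basic is handled, and the refusal is sent as 401 with a challenge (the usual answer for Basic) rather than a literal 403.

```python
import base64
import hmac


def basic_auth(app, users):
    """Wrap app; users maps user names to plain-text passwords (demo only)."""
    def middleware(environ, start_response):
        header = environ.get('HTTP_AUTHORIZATION', '')
        scheme, _, payload = header.partition(' ')
        user = None
        if scheme.lower() == 'basic':
            try:
                decoded = base64.b64decode(payload).decode('utf-8')
            except (ValueError, UnicodeDecodeError):
                decoded = ''
            name, _, password = decoded.partition(':')
            expected = users.get(name)
            if expected is not None and hmac.compare_digest(
                    expected.encode('utf-8'), password.encode('utf-8')):
                user = name
        if user is None:
            start_response('401 Unauthorized', [
                ('Content-Type', 'text/plain'),
                ('WWW-Authenticate', 'Basic realm="demo"'),
            ])
            return [b'authentication required\n']
        # Credentials are fine: tell the application who the user is.
        environ['REMOTE_USER'] = user
        return app(environ, start_response)
    return middleware
```

The wrapped application only ever looks at environ['REMOTE_USER'], which is why it does not need to know whether this middleware or the server itself did the check.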



Under WSGI 2.0, it's even easier since you don't need decorators to 
manipulate your response: you can just "return someapi(...)" where the 
"..." is whatever you were going to return directly.





return someapi() from inside the WSGI application?



Thanks   Manlio Perillo


[Web-SIG] help with the implementation of a WSGI middleware

2008-07-07 Thread Manlio Perillo
As I have informally written in previous messages, I'm writing a small 
WSGI framework.


The framework is available here (a Mercurial repository):
http://hg.mperillo.ath.cx/wsgix


In wsgix I have written two middlewares that I find interesting, since I
have learned a bit more about how to write middleware

(and about Eby's concerns with WSGI 1.0).

One of these middlewares is wsgix.contrib.messages:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/messages.py


The purpose of this middleware is to support sending messages to a client.

The idea originates from Django; however, in wsgix I use cookies (since I
don't think it is a good idea to use a database for this) and messages can
be sent to every user (Django sends messages only to authenticated
users, if I'm correct).



The wsgix support for messages consists of two parts.
The first is the implementation of a simple API for sending and
retrieving messages (only Unicode strings are supported):


message_push(environ, message)
message_pop(environ) # this returns and removes the messages

These functions do not actually manage cookies: the messages are
stored in environ['wsgix.messages'], as a list.
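
A minimal sketch of what these two functions could look like, based only on the description above. The real wsgix code may differ, and the checks for Unicode strings are left out.

```python
def message_push(environ, message):
    """Queue a message for the client in the per-request environ."""
    environ.setdefault('wsgix.messages', []).append(message)


def message_pop(environ):
    """Return the queued messages and remove them from the environ."""
    messages = environ.get('wsgix.messages', [])
    environ['wsgix.messages'] = []
    return messages
```

Usage inside an application is then just message_push(environ, u'Saved.') before the response is produced, and message_pop(environ) in the code that renders the page.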



The second is the implementation of a middleware that takes care of
cookie handling.



The problem is that, if I have understood correctly, a middleware is allowed
to entirely replace the environ dictionary.


This means that if such a middleware is present before the messages
middleware is called, messages are not sent to the client.



Is this true?
In this case the first solution is to use this middleware as a 
decorator, instead of a full middleware.
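
As an illustration of the decorator variant, here is a hedged sketch. with_messages and the cookie name "messages" are invented, the cookie encoding (JSON, percent-encoded) is just one possible choice, and reading the cookie back on the next request is not shown.

```python
import json
from functools import wraps
from urllib.parse import quote


def with_messages(app):
    """Decorator: send messages queued in the environ as a cookie."""
    @wraps(app)
    def wrapper(environ, start_response):
        environ.setdefault('wsgix.messages', [])

        def start(status, headers, exc_info=None):
            pending = environ['wsgix.messages']
            if pending:
                cookie = 'messages=%s; Path=/' % quote(json.dumps(pending))
                headers = list(headers) + [('Set-Cookie', cookie)]
            return start_response(status, headers, exc_info)

        return app(environ, start)
    return wrapper
```

Because the decorator sits directly around the application, both see the same environ dictionary, so an outer middleware that replaces the environ cannot separate them. Messages queued after start_response has been called are lost in this sketch.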


The other solution is to implement an additional interface:

message_push(environ, start_response, headers, message)

that explicitly handles the cookie (this is possible but harder to
implement and less flexible to use).



Any suggestions?


Thanks   Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-07 Thread Manlio Perillo

Ian Bicking wrote:

Manlio Perillo wrote:
[...]


As an example, in Paste you have chosen to use a config dictionary
for middleware configuration, that is, you have middleware factories.


I think this is a red herring.  WebOb specifically doesn't do anything 
related to configuration or the setup of the stack.  What it does do is 
stuff like:


expires = http.format_time(0)
http.generate_cookie(
    environ, headers, name, '', expires=expires,
    domain=cookie_domain(environ), path=path,
    max_age=0)

which would be resp.delete_cookie(name) (well, cookie_domain seems to be 
derived from a setting, but that's mostly unrelated).  This isn't a 
particularly substantial difference, but these small conveniences add up.




As I have said, this is a matter of personal taste: I don't like the
"architecture" used by WebOb and prefer to directly use the environ
dictionary without introducing other abstractions.

This is possible, I'm writing a "not simple" application using wsgix.


I'm still evaluating if I can reuse WebOb parsing functions (and this 
would be a great thing: I think that we *really* need a package with 
*only* low *level* parsing functions for the HTTP protocol).


From what I can see, WebOb *does* not offer a low level interface for 
the parsers: you *have* to use the Request object.


I really like multilevel architectures, instead.




Manlio Perillo



Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-07 Thread Manlio Perillo

Ian Bicking wrote:

Manlio Perillo wrote:


I'm adding web-sig in Cc.


 [...]

I'm developing a WSGI framework with all these (and other) ideas:
http://hg.mperillo.ath.cx/wsgix

It's still not documented, so I have not yet made an official 
announcement.


The main design goal is to keep the level of the interface as low 
level as possible.


I don't like additional interfaces (like Request and Response) objects 
around the WSGI dictionary, and I don't like frameworks like Django 
that completely hides the WSGI interface.


Have you tried webob?  My first run as Paste avoided wrappers around 
those objects, but an object interface has been very helpful.




I have not tried it, but I have read the code (as I have read the code 
of Paste).


In principle I'm against using additional interfaces, and one of the
reasons I wrote wsgix is to have a proof of concept, to
understand whether it is feasible to write a WSGI application using an
alternative framework.


wsgix (+ mod_wsgi for Nginx) has the same role as Paste, but I have 
decided to use a rather different approach.


As an example, in Paste you have chosen to use a config dictionary for
middleware configuration, that is, you have middleware factories.


In wsgix it is very different.
As an example:

http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/messages.py
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/contrib/error_page.py

There are no factories.
The configuration is read (and globally cached) at request time from the 
environ dictionary.


With Nginx, configuration parameters can be defined in the server 
configuration.


There is a helper class:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/options.py
that helps with the parsing.

There is also a middleware:
http://hg.mperillo.ath.cx/wsgix/file/tip/wsgix/conf/middleware.py

that reads the configuration from a YAML file, and merges it into the
environ dictionary.
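
A small sketch of the "no factories" style, for illustration only. get_option and the environ keys used below are hypothetical and are not taken from wsgix/options.py; the point is just that every option is looked up in the environ at request time and has a default.

```python
def get_option(environ, key, default, convert=str):
    """Read one option from the environ, falling back to a default."""
    raw = environ.get(key)
    if raw is None:
        return default
    return convert(raw)


def application(environ, start_response):
    # Hypothetical option names; a server such as Nginx would set them
    # in its own configuration and pass them as strings in the environ.
    title = get_option(environ, 'myapp.title', 'Untitled')
    page_size = get_option(environ, 'myapp.page_size', 20, convert=int)
    body = ('%s (%d items per page)\n' % (title, page_size)).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [body]
```

The same application object can then be plugged into any server without a construction step, since nothing is bound before the first request arrives.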



Of course it's all a matter of personal taste :).

The goal is to have the possibility to write "truly" reusable 
middlewares, that are easy to "plug" inside any WSGI server (almost all 
of configuration parameters have default values).





Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-07 Thread Manlio Perillo

Matt Goodall wrote:

[...]

True, but even passing a request or env dict around to everyone gets
tedious don't you think?


Yes, it can be tedious but I believe explicit arg passing is necessary
to make code readable, testable and reusable.

If it's web-related code then give it the request, it will almost
certainly need it. Otherwise, don't.

I would even advocate extracting request-scope objects, e.g. a database
connection, the current user, etc, as early as possible and passing them
around explicitly (along with the request, if necessary).



This is exactly what I too have realized!

I'm developing a WSGI framework with all these (and other) ideas:
http://hg.mperillo.ath.cx/wsgix

It's still not documented, so I have not yet made an official announcement.

The main design goal is to keep the level of the interface as low level 
as possible.


I don't like additional interfaces (like Request and Response objects)
around the WSGI dictionary, and I don't like frameworks like Django that
completely hide the WSGI interface.



> [...]



Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-04 Thread Manlio Perillo

Iwan Vosloo wrote:

On Fri, 2008-07-04 at 13:42 +0200, Manlio Perillo wrote:

Iwan Vosloo wrote:

Hi,

Many web frameworks and ORM tools have the need to propagate data
depending on some or other context within which a request is dealt with.
Passing it all via parameters to every nook of your code is cumbersome.

The natural solution with WSGI is to store objects in the environ 
dictionary.


In fact in my web applications I always pass the environ dictionary
explicitly to every function.


But, this passing of the environ dictionary to every function in your web
app is exactly what I'd want to avoid?



Yes, but you only need to pass the environ dictionary and not N parameters.
I think this is a good compromise.

Using thread local storage is not the solution to every problem (as you
have noted, it cannot be used when the server handles more than one
request per thread).



-i




Manlio Perillo


Re: [Web-SIG] Alternative to threading.local, based on the stack

2008-07-04 Thread Manlio Perillo

Iwan Vosloo wrote:

Hi,

Many web frameworks and ORM tools have the need to propagate data
depending on some or other context within which a request is dealt with.
Passing it all via parameters to every nook of your code is cumbersome.

A lot of the frameworks use a thread local context to solve this
problem. I'm assuming these are based on threading.local.  


(See, for example:
http://www.sqlalchemy.org/docs/05/session.html#unitofwork_contextual )

Such usage assumes that one request is served per thread.

This is not necessarily the case.  (Twisted would perhaps be an example,
but I have not checked how the twisted people deal with the issue.)



The natural solution with WSGI is to store objects in the environ 
dictionary.


In fact in my web applications I always pass the environ dictionary
explicitly to every function.



> [...]


Manlio Perillo


Re: [Web-SIG] Proposed specification: waiting for file descriptor events

2008-05-22 Thread Manlio Perillo

Christopher Stawarz wrote:

On May 21, 2008, at 1:34 PM, Manlio Perillo wrote:

 Instead, the spec recommends that async servers pre-read the request 
body
 before invoking the app (either by default or as a configurable 
option).


This is the best solution most of the time (but not for all of the 
time), especially if the "server" can do some "pre-parsing" of 
multipart/form-data request body.


In fact I plan to write a custom function (in C for Nginx) that will 
"reduce", as an example:


  Content-Type: multipart/form-data; boundary=AaB03x

  --AaB03x
  Content-Disposition: form-data; name="submit-name"

  Larry
  --AaB03x
  Content-Disposition: form-data; name="files"; filename="file1.txt"
  Content-Type: text/plain

  ... contents of file1.txt ...
  --AaB03x--

to (not properly escaped):

Content-Type: application/x-www-form-urlencoded

submit-name=Larry&files.filename=file1.txt&files.ctype=text/plain&files.path=xxx 




and the contents of file1.txt will be saved to a temporary file 'xxx'.


It seems like you're making this more complicated than it needs to be.  
Why not just store the entire request body in a temporary file, and then 
pass an open handle to it as wsgi.input?  


Because if you have a big file (like a video of > 100 MB), your 
application will block everything while parsing the request body.


Parsing the body incrementally is far more efficient (although it is
harder).



That way, the server doesn't 
have to rewrite the request, and the application doesn't need to know 
how to interpret the files.* parameters.




How to interpret the files.* parameters is not really a problem.
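
For instance, an application could split the reduced body back into ordinary fields and uploaded files along these lines. This is only a sketch: split_fields is a made-up name, the attribute names are the ones from the example above, and the body is assumed to be properly escaped.

```python
from urllib.parse import parse_qsl

FILE_ATTRS = ('filename', 'ctype', 'path')


def split_fields(body):
    """Separate plain form fields from the name.attr file parameters."""
    fields, files = {}, {}
    for key, value in parse_qsl(body, keep_blank_values=True):
        name, dot, attr = key.partition('.')
        if dot and attr in FILE_ATTRS:
            files.setdefault(name, {})[attr] = value
        else:
            fields[key] = value
    return fields, files


body = ('submit-name=Larry&files.filename=file1.txt'
        '&files.ctype=text/plain&files.path=xxx')
fields, files = split_fields(body)
```

A form field whose own name happens to end in ".path" or ".filename" would be misread by this scheme, so the server and the application have to agree on the naming rule.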


1) Why not add a more generic poll like interface?


Because such an interface would be more complicated than what I've 
proposed and harder for server authors to implement.  Also, I'm not sure 
that it gains you much.




Well, I have modelled my extension so that it has a "well-known"
interface and is not hard to implement.


But I have to say that I'm not sure whether one wants to poll multiple sockets.

Moreover, in my implementation ngx.poll only returns one "ready" socket
at a time.



By the way: I see a problem with your API.
What happens if an application does:

read, write, exc = m.fdset()

environ['x-wsgiorg.fdevent.readable'](read[0], 1.0)
environ['x-wsgiorg.fdevent.writable'](write[0], 1.0)

yield ''


There is no way to know, when the application is resumed, if the socket 
is ready for read or write.
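
For comparison, a poll-like call reports which condition fired. The sketch below uses plain select.select from the standard library on a socket pair, outside of any WSGI server; wait_ready and demo are names invented for the illustration, and this is neither the fdevent extension nor ngx.poll.

```python
import select
import socket


def wait_ready(read_fds, write_fds, timeout):
    """Return (readable, writable): the descriptors ready for each event."""
    readable, writable, _ = select.select(read_fds, write_fds, [], timeout)
    return readable, writable


def demo():
    """Register one socket for both events and see what is reported."""
    a, b = socket.socketpair()
    try:
        a.sendall(b'ping')
        readable, writable = wait_ready([b], [b], 1.0)
        return (b in readable, b in writable)
    finally:
        a.close()
        b.close()
```

Here the caller can tell the two events apart because they come back in separate lists, which is the information that is missing when the application is simply resumed.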


This probably should not be a problem, but I'm not sure.

Note that I'm not 100% sure on this, as I tried to indicate in the "Open 
Issues" section of my proposal.  The approach I'd like to take is to try 
writing apps with my interface for a while, and if real-world usage 
shows that a poll-like interface would be very useful (or necessary), 
then the spec could be extended to add one.  I think this is a safe 
route, since the readable/writable functions could easily be implemented 
in terms of a more generic poll-like interface, so existing apps that 
use the fdevent extensions would continue to work.



  Moreover IMHO storing a timeout variable in the environ to check if
  the previous call timed out is not the best solution.


I think it's a simple and effective solution.  Server authors don't need 
to implement any new functions or data types.  They just create and hold 
on to a mutable object instance (the simplest being a list instance) for 
each app instance and toggle its truth value as required.
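
Read literally, the list-based flag could look like the sketch below. The helper names are made up, and the environ key is an assumption: it combines the new "x-wsgiorg.fdevent" namespace with the "timeout" name from the earlier draft.

```python
def new_timeout_flag():
    """An empty list is false; a non-empty one is true."""
    return []


def set_timed_out(flag, timed_out):
    """Server side: toggle the truth value without replacing the object."""
    del flag[:]
    if timed_out:
        flag.append(True)


# The server creates the flag once per application instance ...
environ = {}
flag = new_timeout_flag()
environ['x-wsgiorg.fdevent.timeout'] = flag

# ... and flips it before resuming the application.
set_timed_out(flag, True)
expired = bool(environ['x-wsgiorg.fdevent.timeout'])
```

Because the same list object stays in the environ, the server can update it even if a middleware has copied the dictionary, as long as the copy is shallow.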



  In my implementation I return a function, but with generators in
  Python 2.5 this can be done in a better way.


What advantage does this have over what I've proposed?



You don't need to store a mutable variable in the environ.


2) In Nginx it is not possible to simply handle "plain" file
  descriptors, since these are wrapped in a connection structure.

  This is the reason why I had to add a connection_wrapper function in
  my WSGI module for Nginx.


But the connection structure just wraps an integer file descriptor, 
right?  So the readable/writable functions can create the required 
wrapper to register with nginx. There's no reason to make the 
application author do it.




The "problem" is that Nginx keeps a list of preallocated connection
objects (the size of the list being controlled by worker_connections).


This means that a newly constructed connection *must* be freed as soon
as it is no longer used, otherwise it can limit the number of concurrent
connections that can be handled by Nginx.


Since with my API (register/unregister) a connection should be kept
alive until it is unregistered, I have chosen to create a wrapper for
the Nginx connection object.



Probabily with your API it can be possible to c

Re: [Web-SIG] WSGI and greenlets

2008-05-22 Thread Manlio Perillo

Christopher Stawarz wrote:

On May 7, 2008, at 4:44 AM, Manlio Perillo wrote:
[...]

I don't think this will solve the problem.
Moreover in your example you buffer the whole request body so that you 
have to yield only one time.


Your example was:

def application(environ, start_response):
    def nested():
        while True:
            poll(xxx)
            yield ''
        yield result

    for r in nested():
        if not r:
            yield ''

    yield r

My suggestion would allow you to rewrite this like so:

@awsgiref.callstack.add_callstack
def application(environ, start_response):
    def nested():
        while True:
            poll(xxx)
            yield ''
        yield result

    yield nested()

The nesting can be arbitrarily deep, so nested() could yield 
doubly_nested() and so on.  While not as elegant as greenlets, I think 
this does address your concern.
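
To make the mechanism easier to follow, here is a guess at how such a decorator could work. It is not the awsgiref implementation: this add_callstack is a stand-in written for the illustration, and it only flattens nested generators without passing values or exceptions back into them.

```python
import types
from functools import wraps


def add_callstack(app):
    """Run generators yielded by the application as nested 'calls'."""
    @wraps(app)
    def wrapper(environ, start_response):
        stack = [app(environ, start_response)]
        while stack:
            try:
                item = next(stack[-1])
            except StopIteration:
                stack.pop()          # innermost generator finished
                continue
            if isinstance(item, types.GeneratorType):
                stack.append(item)   # "call" the nested generator
            else:
                yield item           # ordinary output goes to the server
    return wrapper
```

The server still sees a plain iterable of strings, so every empty string yielded at any depth reaches it and gives it a chance to reschedule the request.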





I'm reading the PEP 342, and I still think that this will not work as I 
want for Nginx (where I have no control over the "scheduler").


In fact the PEP 342 says:
"""However, if it were possible to pass values or exceptions *into* a
generator at the point where it was suspended, a simple co-routine
scheduler or "trampoline function" would let coroutines "call" each
other without blocking."""



However writing a co-routine scheduler or "trampoline function" when 
your application is embedded in an external server is not possible (but 
please, correct me if I'm wrong).





> [...]


Regards   Manlio Perillo


Re: [Web-SIG] Proposed specification: waiting for file descriptor events

2008-05-21 Thread Manlio Perillo

Christopher Stawarz wrote:
This is the third draft of my proposed extensions for better supporting 
WSGI apps on asynchronous servers.  The major changes since the last 
draft are as follows:




First of all, thanks for your effort.


* The title and abstract now accurately reflect the scope of the proposal.
  In addition, the extensions are now in the namespace "x-wsgiorg.fdevent"
  (instead of "x-wsgiorg.async").

* The proposal for an alternative, non-blocking input stream has been
  dropped, since I don't see a way to add one that wouldn't break 
middleware.


Well, IMHO the "general" solution here is to use greenlets.


  Instead, the spec recommends that async servers pre-read the request body
  before invoking the app (either by default or as a configurable option).



This is the best solution most of the time (but not for all of the 
time), especially if the "server" can do some "pre-parsing" of 
multipart/form-data request body.


In fact I plan to write a custom function (in C for Nginx) that will 
"reduce", as an example:


   Content-Type: multipart/form-data; boundary=AaB03x

   --AaB03x
   Content-Disposition: form-data; name="submit-name"

   Larry
   --AaB03x
   Content-Disposition: form-data; name="files"; filename="file1.txt"
   Content-Type: text/plain

   ... contents of file1.txt ...
   --AaB03x--

to (not properly escaped):

Content-Type: application/x-www-form-urlencoded

submit-name=Larry&files.filename=file1.txt&files.ctype=text/plain&files.path=xxx


and the contents of file1.txt will be saved to a temporary file 'xxx'.




Once again, I'd appreciate your comments.




I have some comments:

1) Why not add a more generic poll like interface?

   Moreover IMHO storing a timeout variable in the environ to check if
   the previous call timed out is not the best solution.

   In my implementation I return a function, but with generators in
   Python 2.5 this can be done in a better way.

2) In Nginx it is not possible to simply handle "plain" file
   descriptors, since these are wrapped in a connection structure.

   This is the reason why I had to add a connection_wrapper function in
   my WSGI module for Nginx.

3) If you read an example that implements a database connection pool:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-postgres-async.py

   you can see that there is a problem.

   In fact the pool is not very flexible; the application can not handle
   more than POOL_SIZE concurrent requests.

   However it is possible to just have a new request to wait until a
   previous connection is free (or a timeout occurs).

   I have attached an example (it is not in the repository since there
   are some problems).

   The examples use a new extension:

 - ctx = environ['ngx.request_context']()
 - ctx.resume()

   ctx.resume() "asynchronously" resumes the given request
   (it will be resumed as soon as control returns to Nginx, when the
application yields something).


   Note that the problem of resuming another request is easily solved
   with greenlets, without the need for new extensions
   (this is one of the reasons why I like greenlets).
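
The release side of such a pool might look roughly like this. The attached example is cut off in the archive, so this is a guess and not its code: release_connection is a hypothetical name, and the only thing assumed about the context object is the resume() method described in point 3.

```python
from collections import deque

# Free (dbconn, wrapper) pairs and request contexts waiting for one.
free_connections = deque()
waiting_requests = deque()


def release_connection(dbconn, c):
    """Put a connection back in the pool and wake one waiting request."""
    free_connections.append((dbconn, c))
    if waiting_requests:
        ctx = waiting_requests.popleft()
        # The waiting request is resumed once control returns to the server.
        ctx.resume()
```

The resumed request would then retry its lookup and find the connection that was just returned to the pool.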


> [...]



Regards  Manlio Perillo
from collections import deque
import psycopg2 as db


# The table and the function are created by the setup script `postgres_setup.py`
query_select = "SELECT a, b, c, d, e FROM RandomTable LIMIT 10"
query_sleep = "SELECT * FROM sleep(1)"


# These constants are defined in the WSGI environment but their value
# is known
WSGI_POLLIN = 0x01
WSGI_POLLOUT = 0x04


# Size of the connection pool
POOL_SIZE = 20

# Free connections available
free_connections = deque()

# Connections waiting for a free slot
waiting_requests = deque()

# Number of concurrent connections
connections = 0

# State to be kept between requests
request_state = {}



def get_connection(environ):
    global connections

    print 'open', connections, len(free_connections), len(waiting_requests)

    if free_connections:
        print 'reuse'
        # reuse existing connection
        dbconn, c = free_connections.pop()
    elif connections < POOL_SIZE:
        print 'new'
        # create a new connection
        dbconn = db.connect(database='test')

        curs = dbconn.cursor()
        # XXX bad API, fileno should be a property of the connection object
        fd = curs.fileno()
        c = environ['ngx.connection_wrapper'](fd)

        connections = connections + 1
    else:
        print 'wait'
        # no free slots, this request will have to wait
        ctx = environ['ngx.request_context']()
        waiting_requests.append(ctx)

        return None, None

    # XXX check me
    environ['ngx.poll_register'](c, WSGI_POLLIN)


[Web-SIG] WSGI and PEP 325

2008-05-20 Thread Manlio Perillo
The WSGI PEP explicitly mentions PEP 325 (for the application
iterable close method).


Maybe this should be updated for the next WSGI spec, since Python 2.5
implements PEP 342?




Regards
Manlio Perillo


Re: [Web-SIG] Proposed WSGI extensions for asynchronous servers

2008-05-13 Thread Manlio Perillo

James Y Knight wrote:


On May 11, 2008, at 6:15 PM, Christopher Stawarz wrote:

Abstract


This specification defines a set of extensions that allow WSGI
applications to run effectively on asynchronous (aka event driven)
servers.

Rationale
-

The architecture of an asynchronous server requires all I/O
operations, including both interprocess and network communication, to
be non-blocking.  For a WSGI-compliant server, this requirement
extends to all applications run on the server.  However, the WSGI
specification does not provide sufficient facilities for an
application to ensure that its I/O is non-blocking.  Specifically,
there are two issues:

* The methods provided by the input stream (``environ['wsgi.input']``)
 follow the semantics of the corresponding methods of the ``file``
 class.

* WSGI does not provide the application with a mechanism to test
 arbitrary file descriptors (such as those belonging to sockets or
 pipes opened by the application) for I/O readiness.


There are other issues. How do you do a DNS lookup? How do you get 
process completion notification? Heck, how do you run a process? Once 
you have I/O readiness information, what do you do with that? I guess 
you'd need to write a whole new asynchronous server framework on top of 
AWSGI? I can't see being able to use it "raw" for any real applications.




This is not a problem with AWSGI.
As an example there are libraries like PostgreSQL and curl that can be 
used with an external event loop.


In the WSGI implementation for Nginx I can provide an interface for
using the built-in support for an asynchronous DNS client.




The first argument, ``fd``, is either an integer representing a file
descriptor or an object with a ``fileno`` method that returns such an
integer.  (In addition, ``fd`` may be ``x-wsgiorg.async.input``, even
if it lacks a ``fileno`` method.)  The second, optional argument,
``timeout``, is either ``None`` or a floating-point value in seconds.
If omitted, it defaults to ``None``.


What if the event loop of the server doesn't use integer fds, but 
Windows file handles or a Java channel object? Where are you allowed to 
get these integers from? Is it always a socket from 
socket.socket().fileno()? Or can it be a file from open().fileno() or 
os.open()? A pipe from os.pipe()? Note that these distinctions are 
important everywhere but UNIX.




This has the same problems that we have with wsgi.file_wrapper.

This is the reason, among other things, why the API in my implementation 
uses ngx.connection_wrapper and ngx.poll_register.


> [...]




Manlio Perillo


Re: [Web-SIG] Proposed WSGI extensions for asynchronous servers

2008-05-12 Thread Manlio Perillo

Phillip J. Eby wrote:

[...]



If ``timeout`` seconds elapse without the file descriptor becoming
ready for I/O, the variable ``x-wsgiorg.async.timeout`` will be true
when the application resumes.  Otherwise, it will be false.  The value
of ``x-wsgiorg.async.timeout`` when the application is first started
or after it yields each response-body string is undefined.


Er, I think you are confused here.  There is no way for the server to 
know what environ dictionary the application is using, unless you 
explicitly pass it into your extension API.




Interesting, this is something I have never considered.
In my implementation ngx.poll returns a function, so there should be no 
problems.




Regards  Manlio Perillo


Re: [Web-SIG] Proposal for asynchronous WSGI variant

2008-05-07 Thread Manlio Perillo

Christopher Stawarz wrote:

On May 7, 2008, at 4:20 AM, Graham Dumpleton wrote:


2008/5/7 Manlio Perillo <[EMAIL PROTECTED]>:
With your solution it seems that writing middlewares will not become 
any easier.


Part of what I was trying to say was that this needn't be exposed to
middlewares, unless it has to be. It was effectively a lower level of
interaction which a middleware immediately on top of the WSGI adapter
would use to hook into the async type model, but then present it to
higher levels as more traditional WSGI interface.


That would be a really elegant solution, except, as you say:


That layer would
though obviously use something like greenlets to bridge the two.


The problem being that greenlets aren't part of the Python language.  
They're an extension that works by doing clever stuff with the C stack.  
And as much as we might wish that Python supported them natively (which 
I do, since they're a really nice alternative to OS threads), it 
doesn't, so I don't think they can play any role in a WSGI-ASYNC spec.




This is not entirely true; after all, WSGI explicitly exposes the concept of 
processes and threads (via the relevant variables in the WSGI environ and 
some hints in the specification), and these are not really part of the 
Python language.





Chris




Manlio Perillo


Re: [Web-SIG] WSGI and greenlets

2008-05-07 Thread Manlio Perillo

Manlio Perillo wrote:

[...]
The main problem I see with greenlet is that it is not yet stable (there 
are some problems with the garbage collector) and that it is not part of 
CPython.


This means that it may not be acceptable to write a PEP for a WSGI-like 
interface with coroutine support.




Maybe a solution can be to add a new variable to the WSGI environ:
wsgi.microthreads


When it is true, it means that the WSGI implementation will execute the 
application inside a micro thread (be it Stackless, greenlet, or a PyPy 
coroutine).



Also note that when using coroutines there will be no problems with WSGI 
2.0.


However, I still think that we should release a WSGI 1.1, since many 
applications still use, and will continue to use, WSGI 1.x, and a gateway 
will have to support both WSGI 1.x and 2.x.




Regards  Manlio Perillo


[Web-SIG] WSGI and greenlets

2008-05-07 Thread Manlio Perillo

Christopher Stawarz wrote:

On May 6, 2008, at 6:17 AM, Manlio Perillo wrote:

I'm glad to know that there are some other people interested in 
asynchronous applications; have you seen my extensions to WSGI in my 
module for Nginx?


Yes, I have, and I had your module in mind as a potential provider of 
the AWSGI interface.


Note that in Nginx the request body is pre-read before the application 
is called (in fact wsgi.input is either a cStringIO or File object).


Although I didn't state it explicitly in my spec, my intention is for 
the server to be able to implement awsgi.input in any way it likes, as 
long as it provides a recv() method.  It's totally acceptable for the 
request body to be pre-read.




Ok.
But what I meant was that since Nginx pre-reads the request body I have 
not tried to implement an interface for dealing with an asynchronous 
wsgi.input ;-).



Moreover I don't see any reason to have a recv method instead of read.

Unfortunately there is a *big* usability problem: the extension is 
based on a well specified feature of WSGI: the gateway can suspend the 
execution of the WSGI application when it yields.


However if the asynchronous code is present in a "child" function, we 
have something like this:

...
That is, all the functions in the "chain" have to yield, which is not 
very good.


Yes, you're right.  However, if you're willing/able to use Python 2.5, 
you can use the new features of generators to implement a call stack 
that lets you call child functions and receive return values and 
exceptions from them.  I've implemented this in awsgiref.callstack.  
Have a look at


  
http://pseudogreen.org/bzr/awsgiref/examples/echo_request_with_callstack.py


for an example of how it works.
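
The general idea of such a generator-based call stack can be sketched as follows. This is not the code from awsgiref.callstack; it is a minimal trampoline, written for Python 3 (where a generator can return a value, which Python 2.5 did not allow), and the names run, read_body and application are made up for the illustration.

```python
import types


def run(gen):
    """Drive a stack of generators, passing plain yielded values to the caller."""
    stack = [gen]
    value = None      # value to send into the generator on top of the stack
    pending = None    # exception to throw into the generator on top of the stack
    while stack:
        try:
            if pending is not None:
                error, pending = pending, None
                result = stack[-1].throw(error)
            else:
                result = stack[-1].send(value)
        except StopIteration as stop:
            stack.pop()
            value = stop.value          # becomes the result of the parent's yield
            continue
        except Exception as error:
            stack.pop()
            if not stack:
                raise                   # nobody left to handle it
            pending = error             # re-raise inside the parent
            value = None
            continue
        value = None
        if isinstance(result, types.GeneratorType):
            stack.append(result)        # "call" the child function
        else:
            yield result                # e.g. '' to hand control to the gateway


def read_body():
    yield ''                            # pretend to wait for I/O readiness
    yield ''
    return b'hello'


def application():
    body = yield read_body()            # child call; only the trampoline sees it
    yield body.upper()


print(list(run(application())))
```

Every function in the chain still has to be a generator; the trampoline only removes the need to re-yield each value by hand.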



I don't think this will solve the problem.
Moreover in your example you buffer the whole request body so that you 
have to yield only one time.


The solution is to use coroutines, and I'm planning to integrate 
greenlets (from the pylib project) into the WSGI module for Nginx.


Interesting, but it's not clear to me how/if this would work.  Can you 
explain more or point me to some code?




http://codespeak.net/py/dist/greenlet.html

def process_commands(*args):
    while True:
        line = ''
        while not line.endswith('\n'):
            line += read_next_char()
        if line == 'quit\n':
            print("are you sure?")
            if read_next_char() != 'y':
                continue    # ignore the command
        process_command(line)


With greenlets the execution can be suspended by any of the functions 
called by the main greenlet.


This has a lot of advantages.

You can implement wsgi.input.read(n) so that it will suspend the 
execution of the current greenlet until *all* the n bytes have been read.


You can also implement the write callable so that control is returned to 
the main greenlet when the socket is ready to send more data.


And, of course, you can implement a poll like interface and a sleep like 
interface.



I think that it is a great advantage, moreover it is the only way to 
implement truly reusable components.


Note that there is an effort of integrating greenlets with Twisted:
http://radix.twistedmatrix.com/2008/03/corotwine-01.html


The "problem" is that once you add support for greenlets, you no longer 
have WSGI.


The interface can be the same, and applications can work on it without 
problems, but the semantics are *completely* different.



Also note that with greenlets it should be possible to "magically" 
transform blocking applications like Django into non-blocking ones.




The main problem I see with greenlet is that it is not yet stable (there 
are some problems with the garbage collector) and that it is not part of 
CPython.


This means that it may not be acceptable to write a PEP for a WSGI-like 
interface with coroutine support.





Thanks,
Chris




Regards  Manlio Perillo


Re: [Web-SIG] Proposal for asynchronous WSGI variant

2008-05-07 Thread Manlio Perillo

Ionel Maries Cristian wrote:

This is a very interesting initiative.
 
However there are few problems:
- there is no support for chunked input - that would require having 
support for readline in the first place, also, it should be the 
gateway's business decoding the chunked input.


Unfortunately Nginx does not yet support chunked input, so I can't help 
here.


- the original wsgi spec somewhat has some support for streaming and 
asynchronicity [*1]


Right, and in fact I have used this for the implementation of some 
extensions in the WSGI module for Nginx.


- i don't see how removing the write callable will help (i don't see an 
issue having the server providing a stringio.write as the write callable 
for synchronous apps)


To summarize: the main problem with the write callable is that after you 
call it control is not returned to the WSGI gateway.


With an asynchronous server it is a problem since if you write a lot of 
data the server may not be able to send it to the client.


This is not a problem if the application returns a generator, since the 
gateway can suspend the execution until the socket is ready to send data.


With the write callable this is not possible.
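
The difference can be seen with two standard WSGI applications and a toy driver. This is only a sketch: toy_gateway is a made-up stand-in for a real gateway, and it simply counts how many times the application hands control back while producing output.

```python
def app_with_write(environ, start_response):
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        # The gateway must deliver (or buffer) the data before write() returns;
        # it cannot suspend the application here.
        write(b"chunk %d\n" % i)
    return []


def app_with_generator(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    for i in range(3):
        # The gateway regains control after every chunk and may wait until
        # the socket is ready before asking for the next one.
        yield b"chunk %d\n" % i


def toy_gateway(app):
    """Collect the output and count how often control returned to us."""
    sent = []

    def start_response(status, headers, exc_info=None):
        return sent.append              # acts as the write callable

    resumed = 0
    for chunk in app({}, start_response):
        resumed += 1                    # an async server could suspend here
        sent.append(chunk)
    return sent, resumed


print(toy_gateway(app_with_write)[1], toy_gateway(app_with_generator)[1])
```

Both applications produce the same body, but only the generator version gives the gateway a point at which it can wait for the socket.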

In my implementation of WSGI for Nginx I provide two separate 
implementations of the write callable:

- put the socket temporarily in synchronous mode
  (this is WSGI compliant but it is very bad for Nginx)
- buffer all the written data until control is returned to the
  gateway (this is *not* WSGI compliant)


However if you use greenlets, then implementing the write callable is 
not a problem.


- passing nonstring values though middleware will make using/porting 
existing wsgi middleware hairy (suppose you have a middleware that 
applies some filter to the appiter - you'll have your code full of 
isinstance nastiness)
 


Yes, this should be avoided.

Also, have you looked at the existing gateway implementations with 
asynchronous support?

There are a bunch of them:
http://trac.wiretooth.com/public/wiki/asycwsgi
http://chiral.j4cbo.com/trac
http://wiki.secondlife.com/wiki/Eventlet
my own shot at the problem: http://code.google.com/p/cogen/
and manlio's mod_wsgi for nginx
(I may be missing some)
 
However there is absolutely no unity in handling the wsgi.input (or 
equivalent)
 


The wsgi.input can be handled with ngx.poll:

c = ngx.connection_wrapper(wsgi.input)
...

ngx.poll_register(c, WSGI_POLLIN)
...

ngx.poll(1000)


Unfortunately I can not test if this is implementable.
I have some doubts.


> [...]



Manlio Perillo


Re: [Web-SIG] Proposal for asynchronous WSGI variant

2008-05-07 Thread Manlio Perillo

Graham Dumpleton wrote:

2008/5/7 Christopher Stawarz <[EMAIL PROTECTED]>:

On May 5, 2008, at 10:08 PM, Graham Dumpleton wrote:



If write() isn't to be returned by start_response(), then do away with
start_response() if possible as per discussions for WSGI 2.0.

 I think start_response() is necessary, because the application may need to
yield for I/O readiness (e.g. to read the request body, as in my example
app) before it decides what response status and headers to send.


One could come up with other ways of doing it which aligns better with
WSGI 2.0. I previously gave an idea as a starting point for
discussion, but don't think others really understood what I was
suggesting. But then I did post it at 4am in the morning in the middle
of a baby induced period of sleep deprivation. See post 24 in:

http://groups.google.com/group/python-web-sig/tree/browse_frm/thread/74c1f8cf15adf114/d98086a8db568ebd?rnum=24

I think what was missed by others was that I wasn't suggesting that the
102 code be sent all the way back to the client, but as a convention
between WSGI application and underlying WSGI adapter only, to
facilitate the ability to return control back to the WSGI adapter
before one had decided what actual response headers to send. This
seems to align with what you want.



It seems a bit more complex to implement than the start_response callable.

Moreover the whole point of removing the start_response callable is to 
simplify the writing of middlewares.


With your solution it seems that writing middlewares will not become 
any easier.





Graham




Manlio Perillo


Re: [Web-SIG] [proposal] wsgiref.util.abs_url

2008-05-06 Thread Manlio Perillo

Phillip J. Eby wrote:

At 06:27 PM 5/5/2008 +0200, Manlio Perillo wrote:

Phillip J. Eby wrote:
I think that it doesn't accept a relative URL, it accepts an absolute 
path.


What do you mean?

 environ = {}
 setup_testing_defaults(environ)

 url = '/a/b/'


That's a relative URL that's also an absolute path.  Try a relative URL 
like './a/b', or just plain 'a/b'.





   self.failUnlessEqual(
  util.abs_url(environ, url), 'http://127.0.0.1/a/b/')

I also think that using urlparse.urljoin() with either request_uri() 
or application_uri() would be a clearer (and tested) way to obtain an 
absolute URL, and more generally useful.


But application_uri also includes SCRIPT_NAME.


Yes, and you might want to use it as the base against which a relative 
URL will be resolved -- i.e. an application-relative URL, vs. a 
request-relative URL.  In fact, application_uri() would probably be 
*more* useful, since if you want a request-relative URL, there's no need 
to turn it into an absolute URL, since you could just use it in its 
relative form.




Yes, but this is not always the case.

Note, however, that in either case, using a relative URL that's an 
absolute path (e.g. '/a/b'), will still produce the same result as your 
function would.  It's just that urljoin also works properly for all 
kinds of relative urls, not just the absolute-path subset.
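
A small sketch of the urljoin approach described above. It uses Python 3, where urlparse.urljoin lives in urllib.parse; the SCRIPT_NAME and PATH_INFO values are arbitrary test data, and adding a trailing slash to the application URI is a choice made here so that 'a/b' resolves below the application rather than replacing its last path segment.

```python
from urllib.parse import urljoin
from wsgiref.util import application_uri, request_uri, setup_testing_defaults

# Arbitrary test values; everything else is filled in with testing defaults.
environ = {"SCRIPT_NAME": "/app", "PATH_INFO": "/x/y"}
setup_testing_defaults(environ)

req_base = request_uri(environ)
app_base = application_uri(environ).rstrip("/") + "/"

# A relative URL that is an absolute path ignores the base path entirely.
absolute_path = urljoin(req_base, "/a/b/")

# A plain relative URL is resolved against the request or the application.
request_relative = urljoin(req_base, "a/b")
app_relative = urljoin(app_base, "a/b")

print(absolute_path)
print(request_relative)
print(app_relative)
```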




You are right, thanks.


Regards  Manlio Perillo


Re: [Web-SIG] Proposal for asynchronous WSGI variant

2008-05-06 Thread Manlio Perillo

Christopher Stawarz wrote:

(I'm new to the list, so please forgive me for making my first post a
specification proposal :)

Browsing through the list archives, I see there's been some
inconclusive discussions on adding better support for asynchronous web
servers to the WSGI spec.  Since such support would be very useful for
some upcoming projects of mine, I decided to take a shot at specing
out and implementing it.  I'd be grateful for any feedback you have.
If this seems like something worth pursuing, I would also welcome
collaborators to help develop the spec further.



I'm glad to know that there are some other people interested in 
asynchronous applications; have you seen my extensions to WSGI in my 
module for Nginx?


The extension is documented here:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/README

see the Extensions chapter.

For some examples:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-postgres-async.py
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-poll-sleep.py

Note that in Nginx the request body is pre-read before the application 
is called (in fact wsgi.input is either a cStringIO or File object).



Unfortunately there is a *big* usability problem: the extension is based 
on a well specified feature of WSGI: the gateway can suspend the 
execution of the WSGI application when it yields.


However if the asynchronous code is present in a "child" function, we 
have something like this:


def application(environ, start_response):
    def nested():
        while True:
            poll(xxx)
            yield ''

        yield result


    for r in nested():
        if not r:
            yield ''

    yield r


That is, all the functions in the "chain" have to yield, which is not 
very good.



The solution is to use coroutines, and I'm planning to integrate 
greenlets (from the pylib project) into the WSGI module for Nginx.



> [...]



Regards   Manlio Perillo


Re: [Web-SIG] [proposal] wsgiref.util.abs_url

2008-05-05 Thread Manlio Perillo

Phillip J. Eby wrote:

At 11:03 PM 5/2/2008 +0200, Manlio Perillo wrote:

Hi.

I think that a function like (not tested):

def abs_url(environ, relative_url):
    """Return the absolute url"""

    [...]

    url += quote(relative_url)
    return url

would be a useful addition to the wsgiref.util module.


What do you think?


I think that it doesn't accept a relative URL, it accepts an absolute path.



What do you mean?

 environ = {}
 setup_testing_defaults(environ)

 url = '/a/b/'
 self.failUnlessEqual(
     util.abs_url(environ, url), 'http://127.0.0.1/a/b/')

I also think that using urlparse.urljoin() with either request_uri() or 
application_uri() would be a clearer (and tested) way to obtain an 
absolute URL, and more generally useful.




But application_uri also includes SCRIPT_NAME.



Regards   Manlio Perillo

