Re: [Web-SIG] Any practical reason type(environ) must be dict (not subclass)?

2016-03-24 Thread Alan Kennedy
I don't see this relevant message in your references.

https://mail.python.org/pipermail/web-sig/2004-September/000749.html

Perhaps that, and following messages, might shed more light?

On Thu, Mar 24, 2016 at 3:18 PM, Jason Madden 
wrote:

> Hi all,
>
>
> Is there any practical reason that the type of the `environ` object must
> be exactly `dict`, as specified in PEP?
>
> I'm asking because it was recently pointed out that gevent's WSGI server
> can sometimes print `environ` (on certain error cases), but that can lead
> to sensitive information being kept in the server's logs (e.g.,
> HTTP_AUTHORIZATION, HTTP_COOKIE, maybe other things). The simplest and most
> flexible way to prevent this from happening, not just inadvertently within
> gevent itself but also for client applications, I thought, was to have
> `environ` be a subclass of `dict` with a customized `__repr__` (much like
> WebOb does for MultiDict, and repoze.who does for Identity, both for
> similar reasons).
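
The approach described in the paragraph above can be sketched roughly as follows. This is only an illustration, not gevent's actual code: the class name `SecureEnviron`, the key list and the `<MASKED>` placeholder are all assumptions.

```python
class SecureEnviron(dict):
    """A dict subclass whose repr masks the values of sensitive keys.

    Hypothetical sketch; the class name and key list are assumptions.
    """

    # Illustrative only; a real server would likely make this configurable.
    SENSITIVE_KEYS = frozenset({"HTTP_AUTHORIZATION", "HTTP_COOKIE"})

    def __repr__(self):
        masked = {
            key: ("<MASKED>" if key in self.SENSITIVE_KEYS else value)
            for key, value in self.items()
        }
        return repr(masked)


environ = SecureEnviron(REQUEST_METHOD="GET", HTTP_COOKIE="session=secret")
print(environ)                 # the cookie value is masked in the output
print(environ["HTTP_COOKIE"])  # normal lookups still return the real value
```

Only the printed form changes; lookups, iteration and `isinstance(environ, dict)` behave as for a builtin dict.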
>
> Unfortunately, when I implemented that in [0], I discovered that
> `wsgiref.validator` asserts that type(environ) is dict. I looked up the
> PEP, and sure enough, PEP  states that environ "must be a builtin
> Python dictionary (not a subclass, UserDict or other dictionary
> emulation)." [1]
>
> Background/History
> ==================
>
> That seemed overly restrictive to me, so I tried to backtrack the history
> of that language in hopes of discovering the rationale.
>
> - It was present in the predecessor of PEP , PEP 0333, in the first
> version committed to the repository in August 2004. [2]
> - Prior to that, it was in both drafts of what would become PEP 0333
> posted to this mailing list, again from August 2004: [3], [4].
> - The ancestor of those drafts, the "Python Web Container Interface v1.0"
> was posted in December of 2003 with somewhat less restrictive language:
> "the environ object *must* be a Python dictionary... The rationale for
> requiring a dictionary is to maximize portability between containers" [5].
>
> Now, the discussion on that earliest draft in [5] specifically brought up
> using other types that implement all the methods of a dictionary, like
> UserDict.DictMixin [6]. The last post on the subject in that thread seemed
> to be leaning towards accepting non-dict objects, at least if they were
> good enough [7].
>
> By the time the draft became recognizable as the precursor to PEP 0333 in
> [3], the very strict language we have now was in place. That draft,
> however, specifically stated that it was intended to be compatible with
> Python 1.5.2. In Python 1.5.2, it wasn't possible to subclass the builtin
> dict, so imitations, like UserDict.DictMixin, were necessarily imprecise.
> This was later changed to the much-maligned Python 2.2.2 release [8];
> Python 2.2 added the ability to subclass dict, but the language wasn't
> changed.
>
> Today
> =====
>
> Given that today, we can subclass dict with full fidelity, is there still
> any practical reason not to be able to do so? I'm probably OK with gevent
> violating the letter of the spec in this regard, so long as there are no
> practical consequences. I was able to think of two possible objections, but
> both can be solved:
>
> - Pickling the custom `environ` type and then loading it in another
> process might not work if the class is not available. I can imagine this
> coming up with Celery, for example. This is easily fixed by adding an
> appropriate `__reduce_ex__` implementation.
>
> - Code somewhere relies on `if type(some_object) is dict:` (where
> `environ` became `some_object`, presumably through several levels of
> calls), instead of `isinstance(some_object, dict)` or
> `isinstance(some_object, collections.MutableMapping)`. The solution here is
> simply to not do that :) Pylint, among other linters, produces warnings if
> you do.
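
Both objections in the list above can be illustrated with a short sketch. The class name `SecureEnviron` is hypothetical, and reducing to a plain `dict` is just one possible `__reduce_ex__` strategy, chosen here so that the unpickling process does not need the class at all.

```python
import pickle


class SecureEnviron(dict):
    """Hypothetical dict subclass standing in for a custom environ type."""

    def __reduce_ex__(self, protocol):
        # Pickle as a plain dict so the receiving process does not need
        # this class to be importable.
        return (dict, (dict(self),))


environ = SecureEnviron(PATH_INFO="/")

# An exact type check rejects the subclass; isinstance accepts it.
exact_match = type(environ) is dict
is_mapping = isinstance(environ, dict)

# The round trip yields a builtin dict with the same contents.
restored = pickle.loads(pickle.dumps(environ))
```

The trade-off of this choice is that the custom `__repr__` is lost after unpickling, since the restored object is an ordinary dict.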
>
> Can anyone think of any other practical reasons I've overlooked? Is this
> just a horrible idea for other reasons?
>
> I appreciate any discussion!
>
> Thanks,
> Jason
>
> [0] https://github.com/gevent/gevent/compare/secure-environ
> [1] https://www.python.org/dev/peps/pep-/#specification-details
> [2]
> https://github.com/python/peps/commit/d5864f018f58a35fa787492e6763e382f98b923c#diff-ff370d50af3db062b015d1ef85935779
> [3] https://mail.python.org/pipermail/web-sig/2004-August/000518.html
> [4] https://mail.python.org/pipermail/web-sig/2004-August/000562.html
> [5] https://mail.python.org/pipermail/web-sig/2003-December/000394.html
> [7] https://mail.python.org/pipermail/web-sig/2003-December/000401.html
> [8] https://mail.python.org/pipermail/web-sig/2004-August/000565.html
>
> ___
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe:
> https://mail.python.org/mailman/options/web-sig/alan%40xhaus.com
>
___
Web-SIG mailing list

Re: [Web-SIG] REMOTE_ADDR and proxys

2014-09-24 Thread Alan Kennedy
[Collin]
 It seems to me, it is the role of the server/gateway, not the
 application/framework to determine the correct client ip address and
 correctly account for the situation of being behind a known proxy.

I disagree. I think it is the role of the server/gateway to represent the
actual incoming HTTP request as accurately as possible.

If the application knows about remote proxies and local reverse proxies,
then it can take action accordingly.

But the server should not attempt any magic: it is up to the application to
interpret the request in whatever way it sees fit.

[Collin]
 Also, I am aware of the security issues of improperly handling
 X-Forwarded-For, but that's an issue no matter where it's being
 handled.

This is exactly why the server/gateway should refuse the temptation to
guess. It should leave it to the application to be smart enough to handle
all scenarios appropriately, knowing that it has access to the original
unmodified request.

If you want the magic rewriting functionality to be isolated from the
application, then it could easily be implemented as middleware.
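
The middleware option mentioned above could look roughly like this. It is a sketch under stated assumptions: the class name, the `trusted_proxies` parameter and the `x.original_remote_addr` key are hypothetical, and it assumes a single trusted reverse proxy that appends the address of its own peer to X-Forwarded-For.

```python
class TrustedProxyMiddleware:
    """Hypothetical middleware: rewrite REMOTE_ADDR from X-Forwarded-For.

    The header is honoured only when the immediate peer is a known proxy;
    otherwise the request is passed through unmodified.
    """

    def __init__(self, app, trusted_proxies=("127.0.0.1",)):
        self.app = app
        self.trusted_proxies = frozenset(trusted_proxies)

    def __call__(self, environ, start_response):
        peer = environ.get("REMOTE_ADDR", "")
        forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
        if peer in self.trusted_proxies and forwarded:
            # Keep the original value so the application can still see it.
            environ["x.original_remote_addr"] = peer
            # The last entry is the one appended by our own proxy.
            environ["REMOTE_ADDR"] = forwarded.split(",")[-1].strip()
        return self.app(environ, start_response)


def show_addr(environ, start_response):
    """Tiny demo application that echoes REMOTE_ADDR."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["REMOTE_ADDR"].encode("ascii")]


wrapped = TrustedProxyMiddleware(show_addr)
```

The server still reports the connection exactly as it saw it; the guessing is confined to a component that the deployer opts into and configures.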

Alan.


On Wed, Sep 10, 2014 at 7:41 PM, Collin Anderson cmawebs...@gmail.com
wrote:

 Hi All,

 The CGI spec says:

 Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
 meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
 ultimate source of the request.  They identify the client for the
 immediate request to the server; that client may be a proxy, gateway,
 or other intermediary acting on behalf of the actual source client.

 However, if there is a reverse proxy on the server side (such as
 nginx), it seems to me, the ip address of the immediate request to
 the server will be 127.0.0.1 and the actual address will be in an
 X-Forwarded-For header.

 It seems to me, it is the role of the server/gateway, not the
 application/framework to determine the correct client ip address and
 correctly account for the situation of being behind a known proxy.

 Also, I am aware of the security issues of improperly handling
 X-Forwarded-For, but that's an issue no matter where it's being
 handled.

 So, in the case of a reverse proxy, is it ok if the WSGI server sends
 back a REMOTE_ADDR that isn't 127.0.0.1, even if the immediate
 connection to the WSGI server is local?

 Basically can we interpret the server above to be the machine rather
 than the program?

 Thanks,
 Collin



Re: [Web-SIG] Fwd: Can writing to stderr be evil for web apps?

2012-05-19 Thread Alan Kennedy
[anatoly]
 Martin expressed concerns that using logging module with stderr output
 can break web applications, such as PyPI.

Please can you specify exactly what you mean by "using logging module
with stderr output"?

Dealing with stderr is a webserver-specific concern.

Consider the case where you're the author of a webserver that deals
with CGI scripts.

When you get a request for the CGI script, you start a subprocess to
run the script. You must decide what to do with the stdin, stdout and
stderr of the process.

 - CGI mandates that any content that came with the request (e.g. a
POST body) should be fed into stdin (if no other mechanism is in
place[0])
 - CGI mandates that the stdout of the process is sent back to the
client (if no other mechanism is in place[1]).
 - CGI makes no mention of stderr.

Various webservers permit configurable handling of stderr.

For example, Tomcat has a setting called swallowOutput which
redirects both stdout and stderr to a log file. (Obviously, Tomcat's
treatment of stdout is different for CGI)

http://tomcat.apache.org/tomcat-6.0-doc/config/context.html

WSGI has a specific mechanism for diagnostic output, wsgi.errors.


wsgi.errors 

An output stream (file-like object) to which error output can be
written, for the purpose of recording program or other errors in a
standardized and possibly centralized location. This should be a text
mode stream; i.e., applications should use \n as a line ending, and
assume that it will be converted to the correct line ending by the
server/gateway.

...

For many servers, wsgi.errors will be the server's main error log.
Alternatively, this may be sys.stderr, or a log file of some sort. The
server's documentation should include an explanation of how to
configure this or where to find the recorded output. A server or
gateway may supply different error streams to different applications,
if this is desired.


Lastly, note that WSGI supplies an example CGI gateway, and has this
to say about its error handling:


Note that this simple example has limited error handling, because by
default an uncaught exception will be dumped to sys.stderr and logged
by the web server.


http://www.python.org/dev/peps/pep-/#the-server-gateway-side

So I would say that

1. If you are writing a web application, and want it run under any
WSGI container, and for the user to be able to control that output in
a way with which they are familiar (i.e. which is documented and may
have specific configuration options), send the output to wsgi.errors.

2. If you are writing a web server, you should either capture or
ignore stderr. If it is captured, then it is reasonable to, e.g.,
write it to a file so that the user can find it. It should never be
mixed with stdout if stdout is the mechanism by which the application
communicates with the webserver, as with CGI.
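
As a minimal sketch of point 1, an application can write its diagnostics to the stream the server hands it in the environ. The application and the log message are made up for illustration, and the "server" here is just a few lines that supply an in-memory stream.

```python
import io
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Send diagnostics to the stream supplied by the server, not sys.stderr.
    environ["wsgi.errors"].write("cache miss for %s\n" % environ["PATH_INFO"])
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


# Stand-in for a server: supply an in-memory error stream and call the app.
environ = {"PATH_INFO": "/report", "wsgi.errors": io.StringIO()}
setup_testing_defaults(environ)  # fills in the remaining required keys
body = app(environ, lambda status, headers: None)
logged = environ["wsgi.errors"].getvalue()
```

Where the output ends up is then entirely the server's decision, which is the portability argument made above.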

Alan.

[0] http://ken.coar.org/cgi/draft-coar-cgi-v11-03.txt

Section 6.2 Request Message-Bodies

   As there may be a data entity attached to the request, there
   MUST be a system defined method for the script to read these
   data. Unless defined otherwise, this will be via the 'standard
   input' file descriptor.


[1] http://ken.coar.org/cgi/draft-coar-cgi-v11-03.txt

Section 7. Data Output from the CGI Script

   There MUST be a system defined method for the script to send
   data back to the server or client; a script MUST always return
   some data. Unless defined otherwise, this will be via the
   'standard output' file descriptor



Re: [Web-SIG] WSGI for Python 3

2010-07-17 Thread Alan Kennedy
[PJ Eby]
 IOW, the bytes/string discussion on Python-dev has kind of led me to realize
 that we might just as well make the *entire* stack bytes (incoming and
 outgoing headers *and* streams), and rewrite that bit in PEP 333 about using
 str on Python 3000 to say we go with bytes on Python 3+ for everything
 that's a str in today's WSGI.

 Or, to put it another way, if I knew then what I know *now*, I think I'd
 have written the PEP the other way around, such that the use of 'str' in
 WSGI would be a substitute for the future 'bytes' type, rather than viewing
 some byte strings as a forward-compatible substitute for Py3K unicode
 strings.

 Of course, this would be a WSGI 2 change, but IMO we're better off making a
 clean break with backward compatibility here anyway, rather than having
 conditionals.  Also, going with bytes everywhere means we don't have to
 rename SCRIPT_NAME and PATH_INFO, which in turn avoids deeper rewrites being
 required in today's apps.

+1

 (Hm.  Although actually, I suppose we *could* just borrow the time machine
 and pretend that WSGI called for byte-strings everywhere all along...)

+1/0

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[Ian]
 OK, another proposal entirely: we kill SCRIPT_NAME and PATH_INFO and
 introduce two equivalent variables that hold the NOT url-decoded values.

[Graham]
 That may be fine for pure Python web servers where you control the
 split of REQUEST_URI into SCRIPT_NAME and PATH_INFO in the first place
 but don't have that luxury in Apache or via FASTCGI/SCGI/CGI etc as
 that is done by the web server. Also, as pointed out in my blog,
 because of rewrites in web server, it may be difficult to try and map
 SCRIPT_NAME and PATH_INFO back into REQUEST_URI provided to try and
 reclaim original characters. There is also the problem that often
 FASTCGI totally stuffs up SCRIPT_NAME/PATH_INFO split anyway and
 manual overrides needed to tweak them.

This applies doubly under Java servlets, where different containers
take different approaches to solve these rather hard problems. It is
worth noting that they have to do so because the java servlet spec,
even under the most recent 2.5, punts on *all* of the issues being
discussed here.

See here for how Tomcat does it. Or half does it, messily.

http://wiki.apache.org/tomcat/FAQ/CharacterEncoding

I know this is not helpful ;-)

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[Ian]
 When things get messed up I recommend people use a middleware
 (paste.deploy.config.PrefixMiddleware, though I don't really care what they
 use) to fix up the request to be correct.  Pulling it from REQUEST_URI would
 be fine.

That would be unworkable under java servlet containers, since they
each take a different approach to addressing encoding issues, or fail
to deal with them entirely.

So there would probably have to be a special case for every single one of these

http://en.wikipedia.org/wiki/List_of_Servlet_containers

Each of which has a number of different ways of being configured in
relation to these issues.

I don't know if it would even be possible to write such a middleware.

And retain all of one's hair.

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[P.J. Eby]
 Actually, latin-1 bytes encoding is the *simplest* thing that could
 possibly work, since it works already in e.g. Jython, and is actually
 in the spec already...  and any framework that wants unicode URIs
 already has to decode them, so the code is already written.

[Armin]
 Except that nobody implements that

So, if nobody implements that, then why are we trying to standardise it?

Is there a real need out there?

Or are all these discussions solely driven by the need/desire to have
only unicode strings in the WSGI dictionary under python 3?

Which is a worthy goal, IMHO. Java has been there since the very
start, since java strings have always been unicode. Take a look at the
java docs for HttpServlet: no methods return bytes/bytearrays.

http://java.sun.com/products/servlet/2.5/docs/servlet-2_5-mr2/javax/servlet/http/HttpServletRequest.html

But the java servlet spec still ignores *all* of the encoding concerns
being discussed here. Which means that mistakes/mojibake must happen
all the time. And it's up to the author of the individual java web
application to solve those problems, using a mechanism appropriate for
their needs and local environment.

Java programmers just tolerate this, although they may curse the
developers of the servlet spec for not having solved their specific
problem for them.

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[Alan]
 Is there a real need out there?

[Armin]
 In python 3, yes.  Because the stdlib no longer works with bytes and the
 bytes object has few string semantics left.

Why can't we just do the same as the java servlet spec? I.E.

1. Ignore the encoding issues being discussed
2. Give the programmer (possibly mojibake) unicode strings in the WSGI
environ anyway
3. And let them solve their problems themselves, using server
configuration or bespoke middleware

[Alan]
 Java programmers just tolerate this, although they may curse the
 developers of the servlet spec for not having solved their specific
 problem for them.

[Armin]
 Many Java apps are also still using latin1 only or have all kinds of
 problems with charsets.

My point exactly.

Many web developers simply never have to deal with these issues,
perhaps a majority.

The ones that do have to sort it out for themselves.

To do so, the publishers of the various containers give them
(non-standard) options to control the decoding of the incoming request
and all of its component parts: you cited the Tomcat approach above.
Other containers do it differently. Which means that i18n knowledge is
not portable between containers.

It would be nice if we could avoid such a situation with i18n and WSGI.

But I suppose I'm a little dubious that this group can out-do the
enormous java community, and the enormous financial resources that
Sun, IBM, Oracle, etc, etc, plough into it. And still failed to solve
this complex problem satisfactorily.

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[Armin]
 Because that problem was solved a long ago in applications themselves.
 Webob, Werkzeug, Paste, Pylons, Django, you name it, all are operating
 on unicode.  And the way they do that is straightforward.

So what are we all discussing?

Those frameworks obviously have solved all of the problems of decoding
incoming request components, e.g.

1. SCRIPT_NAME
2. PATH_INFO
3. QUERY_STRING
4. Etc

from miscellaneous unknown character sets into unicode, without any
mistakes, under all possible WSGI environments, e.g.

1. Mod_wsgi
2. Modjy (java servlets)
3. IIS
4. CGI
5. FCGI
6. Etc

So why not just adopt one of those mechanisms, e.g. Django, and make
it the de-facto standard? Since they all deliver unicode, python 3 is
no longer a problem, since it permits only unicode strings.

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[Armin]
 No, they know the character sets.

Hmmm, define "know" ;-)

[Armin]
 You tell them what character set you
 want to use.  For example you can specify utf-8, and they will
 decode/encode from/to utf-8.  But there is no way for the application to
 send information to the server before they are invoked to tell the
 server what encoding they want to use.

I see this as being the same as Graham's suggested approach of a
per-server configurable charset, which is then stored in the WSGI
dictionary, so that applications that have problems, i.e. that detect
mojibake in the unicode SCRIPT_NAME or PATH_INFO, can attempt to undo
the faulty decoding by the server.
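
A rough sketch of what "undoing the faulty decoding" could mean in application code follows. The function name and its parameters are hypothetical; under the proposal the server's charset would be read from the WSGI dictionary, under a key that this thread does not name, so here it is simply passed in.

```python
def redecode(value, server_charset="latin-1", wanted_charset="utf-8"):
    """Undo the server's decoding and decode again with another charset.

    Falls back to the original string when the bytes are not valid in
    the wanted charset.
    """
    raw = value.encode(server_charset)
    try:
        return raw.decode(wanted_charset)
    except UnicodeDecodeError:
        return value


# Simulate a server that decoded UTF-8 request bytes as latin-1.
mojibake = "/caf\u00e9".encode("utf-8").decode("latin-1")
fixed = redecode(mojibake)
```

The round trip is lossless with latin-1 because every byte maps to exactly one code point; other server charsets may not allow the original bytes to be recovered.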

Alan.


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-22 Thread Alan Kennedy
[Armin]
 Of course a server configuration variable would be a solution for many
 of these problems, but I don't like the idea of changing application
 behavior based on server configuration.

So you don't like the way that Django, Werkzeug, WebOb, etc, do it
now, even though they appear to be mostly successful, and you're happy
to cite them as such?

From the application's point of view, a framework-level configuration
variable is the same as a server-level configuration variable.

 At that point we will finally
 have successfully killed the idea of nested WSGI applications, because
 those could depend on different charsets.

Wouldn't well-written applications depend on unicode?

The server configured charset is simply an explicit statement of the
character set from which incoming requests are to be decoded, into
unicode, and no other character set.

Alan.


Re: [Web-SIG] WSGI 1 Changes [ianb's and my changes]

2009-09-18 Thread Alan Kennedy
[Rene]
 I think you mean pre-2.2 support, not python 2.2?  iterators came
 about in python 2.2.

[Armin]
 That might be.  That was before my time.  I'm pretty sure the first
 Python version I used was 2.3, but don't quote me on that.

As WSGI was being developed, cpython was at version 2.3.

The only reason that support for older versions was in the spec was
because jython was at version 2.1 at the time.

The WSGI spec was made much simpler by the use of the iterator
protocol (PEP 234), which was introduced into the language in 2.2.
So where the spec says

Supporting Older (<2.2) Versions of Python

It should probably have read

Supporting Older (pre-pep-234-iterator-protocol) Versions of Python

I don't know of any modern python implementation that doesn't support
the iterator protocol.

It's probably time to drop that section from the PEP.
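
For readers who have not met the distinction, here is a small sketch of the two styles. The generator application relies on the PEP 234 iterator protocol; `LegacyBody` is a hypothetical class showing the older `__getitem__`/IndexError convention that a pre-2.2 implementation would have needed.

```python
def app(environ, start_response):
    """A WSGI application written as a generator (PEP 234 iterator)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Hello, "
    yield b"world\n"


class LegacyBody:
    """Pre-2.2 style sequence: iterated through __getitem__/IndexError."""

    def __init__(self, chunks):
        self.chunks = chunks

    def __getitem__(self, index):
        return self.chunks[index]  # IndexError ends the loop


# Both forms can be consumed by a plain for loop on the server side.
new_style = b"".join(app({}, lambda status, headers: None))
old_style = b"".join(chunk for chunk in LegacyBody([b"Hello, ", b"world\n"]))
```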

Alan.


Re: [Web-SIG] Announcing bobo

2009-06-16 Thread Alan Kennedy
[Etienne]
 If you want to start a thread for Bobo, please switch mailing-list or
 create a new thread, as all I wanted was to tell Jim my disappointement
 regarding Bobo, and I still think its not very revolutionary.

I completely disagree; this is definitely the appropriate list for
discussing web frameworks and new approaches. There is no perfect
framework in python, or any other language. It is only with the
introduction, discussion, acceptance and assimilation of new ideas
that we all move forward together.

Jim has the longest history of all in Python web frameworks; he
created the very concept. He founded and built the entire Zope
community; I will always listen to what he has to say.

I wish you the best of luck with your own web framework, notmm

http://gthc.org/projects/notmm/0.2.12/

Which seems to have some potential, but currently lacks community support.

http://gthc.org/community/

I'm looking forward to Europython, where I know I'll be meeting some
great python folks, and hopefully some of us will get to continue our
WSGI revision discussions.

All the best,

Alan.


Re: [Web-SIG] RESTful Python email list?

2009-04-11 Thread Alan Kennedy
[Pete]
 Any interest in a dedicated email list for REST + python, a la the
 restful-json group [0]?  The group would discuss strategies for REST
 architecture built with and within Python.  WSGI 1.0 vs. 2.0 vs. 2e6 is out
 of scope. ;-)

Just a thought: is there any reason why RESTful python discussions
cannot take place on the restful-json group referred to?

Alan.


Re: [Web-SIG] FW: Closing #63: RFC2047 encoded words

2009-04-08 Thread Alan Kennedy
[James]
 If you want to start a discussion about having a standard parsed-header
 object in WSGI, that's another thing, but saying that WSGI servers should
 *partially* decode the headers seems rather silly to me.

Hi James,

It's a shame that your proposal to add the twisted header parsing
library to the standard library didn't catch on years ago.

http://mail.python.org/pipermail/web-sig/2006-February/002119.html

Alan.


Re: [Web-SIG] Python 3.0 and WSGI 1.0.

2009-04-02 Thread Alan Kennedy
[Sylvain]
 Would there be any interest in asking the HTTP-BIS working group [1] what
 they think about it?

 Currently I couldn't find anything in their drafts suggesting they had
 decided to clarify this issue from a protocol's perspective but they might
 consider it to be relevant to their goals.

 - Sylvain

 [1] http://www.ietf.org/html.charters/httpbis-charter.html

I checked the current version of their replacement for RFC 2616. It says


2.1.3.  URI Comparison

   When comparing two URIs to decide if they match or not, a client
   SHOULD use a case-sensitive octet-by-octet comparison of the entire
   URIs


Which doesn't work if the two URIs to be compared are in different encodings.
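
A short illustration of the problem, using an example path chosen for this sketch: the same characters, percent-encoded from two different byte encodings, produce URIs that are not equal octet by octet.

```python
from urllib.parse import quote

path = "/caf\u00e9"

# The same characters percent-encoded from two different byte encodings.
as_utf8 = quote(path, encoding="utf-8")
as_latin1 = quote(path, encoding="latin-1")

print(as_utf8)     # /caf%C3%A9
print(as_latin1)   # /caf%E9
print(as_utf8 == as_latin1)  # False
```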

I did find this page on the W3C site which at least explains the
issues, and does a survey of existing modern browsers for how they
encode URIs and IRIs.

http://www.w3.org/International/articles/idn-and-iri/


Paths

The conversion process for parts of the IRI relating to the path is
already supported natively in the latest versions of IE7, Firefox,
Opera, Safari and Google Chrome.

It works in Internet Explorer 6 if the option in Tools > Internet
Options > Advanced > Always send URLs as UTF-8 is turned on. This means
that links in HTML, or addresses typed into the browser's address bar
will be correctly converted in those user agents. It doesn't work out
of the box for Firefox 2 (although you may obtain results if the IRI
and the resource name are in the same encoding), but technically-aware
users can turn on an option to support this (set
network.standard-url.encode-utf8 to true in about:config).

Whether or not the resource is found on the server, however, is a
different question. If the file system is in UTF-8, there should be no
problem. If not, and no mechanism is available to convert addresses
from UTF-8 to the appropriate encoding, the request will fail.

Files are normally exposed as UTF-8 by servers such as IIS and Apache
2 on Windows and Mac OS X. Unix and Linux users can store file names
in UTF-8, or use the mod_fileiri module mentioned earlier. Version 1
of the Apache server doesn't yet expose filenames as UTF-8.

You can run a basic check whether it works for your client and
resource using this simple test.

Note that, while the basics may work, there are other somewhat more
complicated aspects of IRI support, such as handling of bidirectional
text in Arabic or Hebrew, which may need some additional time for full
implementation.


Alan.


Re: [Web-SIG] Python 3.0 and WSGI 1.0.

2009-04-02 Thread Alan Kennedy
[Sylvain]
 Would there be any interest in asking the HTTP-BIS working group [1] what
 they think about it?

 Currently I couldn't find anything in their drafts suggesting they had
 decided to clarify this issue from a protocol's perspective but they might
 consider it to be relevant to their goals.

 - Sylvain

 [1] http://www.ietf.org/html.charters/httpbis-charter.html

As mentioned in an earlier post, I think their current spec avoids the
issue, by still relying on octet-by-octet comparison.

But I did come across this discussion on their list, which goes into
all of the issues in fine detail.

http://www.nabble.com/PROPOSAL%3A-i74%3A-Encoding-for-non-ASCII-headers-tt16274487.html#a16291951

Quote of the thread

[Roy Fielding]
 We are simply passing through the one and only defined i18n solution
 for HTTP/1.1 because it was the only solution available in 1994.
 If email clients can (and do) implement it, then so can WWW clients.

 People who want to fix that should start queueing for HTTP/1.2.

Alan.


Re: [Web-SIG] WSGI Open Space @ PyCon.

2009-04-01 Thread Alan Kennedy
[Noah]
 +1 on the iterator, although I might just like the idea and might be missing
 something important.  It seems like there are a lot of powerful things being
 developed with generators in mind, and there are some nifty things you can
 do with them like the contextlib example:
  http://docs.python.org/library/contextlib.html#contextlib.closing

Indeed, like coroutines.

http://www.python.org/dev/peps/pep-0342/

[Robert]
 The counter-argument was that
 servers could use non-blocking sockets to allow apps which read() to
 yield in the case of no immediate data rather than block indefinitely.

Ah, but the problem with that is that one can't magically suspend
methods like that and return control to the scheduler, without using
coroutines or stackless.

Who does the read() method return control to when there's no data
available (i.e. no bytes on the socket)? If wsgi.input is a simple
file-like object, then its methods must be coded to recognise, rather
than block, when the data is not yet available to fulfill the
application's expectation. How does it know how to return control to
the scheduler, instead of the application?

If the application expects to receive all of the data that it asked
for with, say, a read(1024) call, it has to be prepared to accept that
it may get less than 1024 bytes, in an asynchronous situation. What
does it return to the application in the case when fewer than 1024
bytes are available?
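
To make the short-read case concrete, here is a sketch of the loop an application would need if read(1024) may return fewer bytes than requested. It assumes a blocking stream where an empty result means end of input, so it does not address the scheduler question raised above; `read_exactly` and `TrickleInput` are hypothetical names.

```python
import io


def read_exactly(stream, size):
    """Call read() until `size` bytes arrive or the stream is exhausted.

    Assumes a blocking stream, where an empty result means end of input.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class TrickleInput:
    """Hypothetical input stream that returns at most 4 bytes per read()."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(min(size, 4) if size >= 0 else 4)


data = read_exactly(TrickleInput(b"0123456789"), 1024)
```

In a truly asynchronous server an empty read could also mean "nothing yet", and this loop cannot tell the two cases apart, which is the ambiguity the questions above point at.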

 If a file-like object were retained, it would help to publish a
 chainable file example to help middleware re-stream files they read any
 part of.

I don't think that re-streaming of input should be a part of the spec;
it's an application layer thing. We don't expect to re-stream the
output of an application: why re-stream the input?

If some application needs to examine the entire byte sequence for
whatever reasons, that's a special case that can be catered for with
itertools, and dedicated middleware.
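
Such a dedicated middleware might look like the sketch below. It uses io.BytesIO rather than itertools, buffers the whole body in memory, and relies on CONTENT_LENGTH being set; the class name and the `inspect` callback are hypothetical.

```python
import io


class BufferInputMiddleware:
    """Hypothetical middleware: read the whole body, then replay it."""

    def __init__(self, app, inspect):
        self.app = app
        self.inspect = inspect  # callable that receives the body bytes

    def __call__(self, environ, start_response):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        self.inspect(body)
        # Hand the application a fresh stream over the same bytes.
        environ["wsgi.input"] = io.BytesIO(body)
        return self.app(environ, start_response)


def echo(environ, start_response):
    """Tiny demo application that returns the request body."""
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return [environ["wsgi.input"].read()]


seen = []
wrapped = BufferInputMiddleware(echo, seen.append)
environ = {"CONTENT_LENGTH": "5", "wsgi.input": io.BytesIO(b"hello")}
result = wrapped(environ, lambda status, headers: None)
```

Keeping this in middleware means only the applications that need the whole body pay the memory cost.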

 Continuing deferred issues

  * Lifecycle methods (start/stop/etc event API driven by the container)

I'd really like to get this one nailed: java people and .net people
expect this stuff.

Alan.


[Web-SIG] WSGI Open Space @ PyCon.

2009-03-27 Thread Alan Kennedy
Dear all,

For those of you at PyCon, there is a WSGI Open Space @ 5pm today (Friday).

The sub-title of the open space is "Does WSGI need revision?"

An example: Philip Jenvey (http://dunderboss.blogspot.com/) raised the
need for something akin to what Java folks call "Lifecycle methods",
so that WSGI apps can do initialization and finalization.

http://java.sun.com/j2ee/tutorial/1_3-fcs/doc/Servlets4.html

I'm sure there are plenty of other topics that could be discussed as well.

See you @5pm.

Alan.


Re: [Web-SIG] Use both Python and Javascript in html webpages

2009-03-05 Thread Alan Kennedy
[David]
 Can we use both Python and Javascript in html webpages?   Any demo on this?

If you're willing to write rpython, PyPy can compile it to javascript
which can run in a browser.

http://codespeak.net/pypy/dist/pypy/doc/js/using.html

HTH,

Alan.


Re: [Web-SIG] Revising environ['wsgi.input'].readline in the WSGI specification

2008-11-18 Thread Alan Kennedy
[Graham]
 I would be for (1) errata or amendment as reality is that there is
 probably no WSGI implementation that disallows an argument to
 readline() given that certain Python code such as cgi.FieldStorage
 wouldn't work otherwise.

 For such a clarification on existing practice, I see no point in
 having to change wsgi.version in environ as it would just cause
 confusion.

+1

[Graham]
 I would also like to see other changes to WSGI specification but now
 is not the time, let us at least though get this obvious issue with
 API dealt with. After that we can then perhaps have a discussion of
 future of WSGI specification and whether there really is any interest
 in future versions with more significant changes.

+1

Alan.


Re: [Web-SIG] Newline values in WSGI response header values.

2008-06-12 Thread Alan Kennedy
[Graham]
 Thus, is an embedded newline in value invalid? Would it be reasonable
 for a WSGI adapter to flag it as an error?

From a security POV, it may be advisable for WSGI servers to *not*
allow newlines in HTTP response headers; newlines in response headers
may be the result of an application's failure to sanitise its inputs.

http://en.wikipedia.org/wiki/HTTP_response_splitting
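
As a rough illustration of what such a check might look like, here is a
minimal sketch of a wrapper around start_response that refuses header
names or values containing CR or LF. The function names are hypothetical,
and whether a server should raise, strip, or log is a policy choice this
sketch does not settle.

```python
def check_response_headers(headers):
    """Raise ValueError if any header name or value contains CR or LF."""
    for name, value in headers:
        for part in (name, value):
            if "\r" in part or "\n" in part:
                raise ValueError("newline in response header %r" % (name,))
    return headers


def guard_start_response(start_response):
    """Wrap a server's start_response so headers are checked first."""
    def guarded(status, headers, exc_info=None):
        check_response_headers(headers)
        if exc_info is not None:
            return start_response(status, headers, exc_info)
        return start_response(status, headers)
    return guarded
```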

Regards,

Alan.


Re: [Web-SIG] Time a for JSON parser in the standard library?

2008-04-10 Thread Alan Kennedy
[Bob]
  simplejson would give you an error and tell you exactly where the
  problem was,

Another good point.

Other JSON modules should follow simplejson's lead, and provide access
to the location in the document where the lexical or parse error
occurred, so that the offending document can be opened in a text
editor to determine the source of the problem, and perhaps fix it.

This should also apply to junk after the document object, i.e. JSON
expressions present in the document after the main document has been
successfully parsed. A strict interpretation of the spec is that such
junk is not permitted, and makes the JSON document broken, even
though the main object representation is valid.

Simplejson has an option for the user to control this, and jyson does
too; I don't know about the others.
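
For illustration only, here is a minimal sketch of both behaviours using
the json module that later entered the python standard library (Python 3
assumed, since json.JSONDecodeError is used). The helper names
parse_strict and error_location are hypothetical; this is not how
simplejson or jyson implement their options.

```python
import json


def parse_strict(text):
    """Parse one JSON document; report where trailing junk starts."""
    decoder = json.JSONDecoder()
    # raw_decode() does not skip leading whitespace, so do it here.
    start = len(text) - len(text.lstrip())
    obj, end = decoder.raw_decode(text, start)
    rest = text[end:]
    if rest.strip():
        junk_at = end + (len(rest) - len(rest.lstrip()))
        raise ValueError("junk after document at offset %d" % junk_at)
    return obj


def error_location(text):
    """Return (line, column) of the first parse error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return (exc.lineno, exc.colno)
    return None
```

A permissive variant would simply return obj and ignore whatever follows
it, which is the choice the option is meant to expose.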

[Bob]
 but there isn't currently a non-strict mode and honestly
  nobody has asked for it.

If we only need strict mode, then why do all of our parsers have options?

Isn't permissive mode just a way of setting all of the parse options
to liberal, in one go?

Alan.


Re: [Web-SIG] Time a for JSON parser in the standard library?

2008-04-09 Thread Alan Kennedy
[John]
 I'm interested in whether you generally use JSON to communicate with a
 JavaScript client, or another JSON library. Both the demjson and simplejson
 libraries are written with the assumption that they are to be used to
 interact with JavaScript.

Answer #1: My motive is simply to implement the JSON spec, in a
[j|p]ythonic way. If the ideal of JSON is to be realised, then the
producer of the document is not relevant: it is only the document
itself that matters.

Answer #2: I'm working (i.e. day job) with JSON at the moment: a
javascript client talking to a java server. The JS guy had a problem
last week with a sample JSON document I gave him to prototype on. I
wrote the sample by hand (it later became my freemarker template), and
so inadvertently left in a hard-to-spot dangling comma, from all the
copying and pasting. That broke his javascript library; he solved the
problem by passing it through a PHP JSON codec on his local Apache. It
worked, i.e. his problem disappeared, but he didn't know why (the PHP
lib had eliminated the dangling comma). Which all goes to confirm,
IMHO, that you should be liberal in what you consume and strict in
what you produce.

[John]
 You mentioned in an earlier e-mail that jyson supports reading arrays with
 trailing commas -- is this intentional, or accidental? Do you read them with
 Python or JavaScript semantics?

Went out of my way to accept them, with python semantics.

Javascript semantics differ. Last time I tested, FireFox and IE
interpreted [1,2,3,] differently as [1,2,3] and [1,2,3,null].
Although that may have changed during the meanwhilst.

[Alan]
  2. To have a native-code implementation, customised for jython.

[John]
 Did you encounter any particular issues related to implementing a JSON
 library in Jython that would affect how a standard library implementation's
 API should be designed?

Jython is changing rapidly. It is evolving from a 2.2 stage (from
__future__ import generators) to a 2.5 stage in one leap. Jython 2.5
is built with java 1.5 (1.5 is where java grew annotations and
generics). Between 2.2 and 2.5, python has grown Decimals, generator
expressions, decorators, context managers, bi-directional
generators, etc. I prefer for a pure java implementation of a JSON
codec to remain flexible in terms of the way that it maps
fundamental JSON types into the jython type hierarchy and
interpreter machinery[1].

I'm beginning to think that any putative JSON API should permit the
user to specify which class will be used to instantiate JSON objects.
If the users can specify their own classes, that might go a long way
to resolve issues such as "I need my javascript client to communicate
Numbers representing radians to my python server which uses Decimal
because it works better with my geo-positioning library". Standard
libraries should provide their own set of default instantiation
classes, which the user could override.
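
A minimal sketch of what such overridable instantiation can look like,
using the hooks that the json module in today's standard library happens
to offer (parse_float and object_pairs_hook). The Record class and the
load_with_user_types name are hypothetical, chosen only for the example.

```python
import json
from decimal import Decimal


class Record(dict):
    """Hypothetical user-supplied class used for every JSON object."""


def load_with_user_types(text):
    """Decode JSON numbers with fractions as Decimal and objects as Record."""
    return json.loads(text, parse_float=Decimal, object_pairs_hook=Record)
```

With this, the radians example above arrives on the python side as a
Decimal without any post-processing of the decoded structure.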

Regards,

Alan.

[1] There is an argument that a pure java JSON parser for jython is
not worth the effort, in performance terms at least. JVM optimisation
is very sophisticated these days, and it is conceivable that pure
python (byte)code could run as fast or faster on a JVM than equivalent
java code. Think PyPy. So maybe a single well-designed pure-python
JSON module in the cpython standard library is the way to go.


Re: [Web-SIG] Time a for JSON parser in the standard library?

2008-04-09 Thread Alan Kennedy
[Alan]
 [hand written JSON containing a] hard-to-spot dangling comma, from all the
  copying and pasting. That broke his javascript library; he solved the
  problem by passing it through a PHP JSON codec on his local Apache. It
  worked, i.e. his problem disappeared, but he didn't know why (the PHP
  lib had eliminated the dangling comma). Which all goes to confirm,
  IMHO, that you should be liberal in what you consume and strict in
  what you produce.

[John]
  Sounds like a case *for* strict parsing, in my opinion. PHP's loose
  parsing made it difficult to figure out why the JSON was invalid. If
  trailing comma handling is to try to work around copy-paste errors, -1
  from me.

No, the PHP lib did exactly what it should, IMHO. The PHP lib was
liberal in what it consumed (a dangling comma), and strict in what it
produced (no dangling comma).

It accepted my broken document with a dangling-comma, and emitted a
strictly conformant document with the offending comma removed, which
enabled my co-worker to proceed with his job.

+1 from me.

Other opinions?

Alan.


Re: [Web-SIG] Time a for JSON parser in the standard library?

2008-03-11 Thread Alan Kennedy
[Massimo]
 It would also be nice to have a common interface to all modules that
 do serialization. For example pickle, cPickle, marshal have dumps, so
 json should also have dumps.

Indeed, this is my primary concern also.

The reason is that I have a pure-java JSON codec for jython, that I
will either publish separately or contribute to jython itself.

If we're going to have the facility in both cpython and jython (and
probably ironpython, etc), then it would be optimal to have a
compatible API so that we have full interoperability. And given that
we in jython land are always left implementing cpython APIs (which are
not necessarily always the optimal design for jython) it would be nice
if we could agree on APIs, etc, *before* stuff goes into the standard
library.

The API for my codec is slightly different from simplejson, although
it could be made the same with a little work, including exception
signatures, etc.

But there are some things about my own design that I like. For
example, simplejson allows override of the JSON output representing
certain objects, by the use of subclasses of JSONEncoder. My design
does it differently; it simply looks for a __json__() callable on
every object being serialised, and if found, calls it and uses its
return value to represent the object. I have no equivalent of
simplejson's decoding extensions.
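
To show the shape of the __json__() idea, here is a minimal sketch built
on the default= hook of the json module in today's standard library. It
is an approximation, not my codec: json.dumps only consults the hook for
objects it cannot serialise natively, rather than looking at every
object. The Point class and the helper names are hypothetical.

```python
import json


def call_json_hook(obj):
    """Fallback for json.dumps: use obj.__json__() when it is available."""
    hook = getattr(obj, "__json__", None)
    if callable(hook):
        return hook()
    raise TypeError("%r is not JSON serializable" % (obj,))


class Point(object):
    """Hypothetical application class that knows its own JSON form."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __json__(self):
        return {"x": self.x, "y": self.y}


def dumps(obj):
    """Serialise obj, letting objects describe themselves via __json__()."""
    return json.dumps(obj, default=call_json_hook, sort_keys=True)
```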

Another difference is the set of options. Simplejson has options to
control parsing and generation, and so does mine. But the sets of
options are different, e.g. simplejson has no option to permit/reject
dangling commas (e.g. [1,2,3,])*, whereas mine has no support for
accepting NaN, infinity, etc, etc.

On the encoding side, I simply make the assumption that all character
transcoding has happened before the JSON text reaches the JSON parser.
(I think this is a reasonable assumption, given that byte streams are
always associated with file storage, network transmission, etc, and
only the programmer has access to the relevant encoding information).
But given that RFC 4627 specifies how to guess encoding of JSON byte
streams, I'll probably change that policy.
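
For reference, the RFC 4627 guess relies on the first two characters of a
JSON text being ASCII, so the pattern of zero octets in the first four
bytes identifies the encoding. A minimal sketch follows; the function
names are hypothetical and byte order marks are deliberately not handled.

```python
def guess_json_encoding(data):
    """Guess the Unicode encoding of JSON bytes from where the NULs fall."""
    head = bytearray(data[:4])
    if len(head) < 4:
        # An RFC 4627 text has at least two characters, so fewer than
        # four octets can only be UTF-8.
        return "utf-8"
    nul = [octet == 0 for octet in head]
    if nul == [True, True, True, False]:
        return "utf-32-be"
    if nul == [True, False, True, False]:
        return "utf-16-be"
    if nul == [False, True, True, True]:
        return "utf-32-le"
    if nul == [False, True, False, True]:
        return "utf-16-le"
    return "utf-8"


def decode_json_bytes(data):
    """Decode a JSON byte stream to text using the guessed encoding."""
    return data.decode(guess_json_encoding(data))
```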

Lastly, another area of potential cooperation is testing: I have over
100 unit-tests, with fairly extensive coverage. I think that test
coverage is very important in the case of JSON; you can never have too
many tests.

So, what is the best way to go about agreeing on the best API?

1. Discussion on web-sig?
2. Discussion on stdlib-sig?
3. Collaborative authoring/discussion on a WIKI page?
4. 

Regards,

Alan.

* Which can mean different things to different software. Some
javascript interpreters interpret it as a 4-element list (inferring
the last object between the comma and the closing square bracket as a
null), others as a 3-element list. Python obviously interprets it as
a 3-element list. So the general internet maxim "be liberal in what
you accept and strict in what you produce" applies. My API gives control
of this strictness/relaxedness to the user.


Re: [Web-SIG] Time a for JSON parser in the standard library?

2008-03-11 Thread Alan Kennedy
[Graham]
  The problem areas were, different interpretations of what could be
  supplied in an error response. Whether an integer, string or arbitrary
  object could be supplied as the id attribute in a request. Finally,
  some JavaScript clients would only work with a server side
  implementation which provided introspection methods as they would
  dynamically create a JavaScript proxy object based on a call of the
  introspection methods.

These are JSON-RPC concerns, and nothing to do with JSON text de/serialization.

I do believe we're only discussing JSON<->python object
transformation, in this thread at least.

  Unfortunately the JSON 1.1 draft specification didn't necessarily make
  things better.

There is no JSON 1.1 spec; but there is a JSON-RPC 1.1 spec.

http://json-rpc.org/wiki/specification

  Thus my question is, what version of the JSON specification are you
  intending to support.

The one specified in RFC 4627

http://www.ietf.org/rfc/rfc4627.txt

Regards,

Alan.


[Web-SIG] Time a for JSON parser in the standard library?

2008-03-10 Thread Alan Kennedy
Dear all,

Given that

1. Python comes with batteries included

2. There is a standard library re-org happening because of Py3K

3. JSON is now a very commonly used format on the web

Is it time there was a JSON codec included in the python standard library?

(If XML is already supported, I see no reason why JSON shouldn't be)

Or is it best to make users who want to use JSON go and research all
of the different options available to them?

Choosing a Python JSON Translator
http://blog.hill-street.net/?p=7

Just a thought.

Regards,

Alan.


Re: [Web-SIG] WSGI, Python 3 and Unicode

2007-12-07 Thread Alan Kennedy
[Alan]
 The restriction to iso-8859-1 is really a distraction; iso-8859-1 is
 used simply as an identity encoding that also enforces that all
 bytes in the string have a value from 0x00 to 0xff, so that they are
 suitable for byte-oriented IO. So, in output terms at least, WSGI *is*
 a byte-oriented protocol. The problem is that python-the-language
 didn't have support for bytes at the time WSGI was designed.

[Thomas]
 If you're talking about the output stream, then yes, it's all about
 bytes (or should be).

Indeed, I was only talking about output, specifically the response body.

 But at the status and headers level, HTTP/1.1 is
 fundamentally ISO-8859-1-encoded.

Agreed.

That is why the WSGI spec also states


Note also that strings passed to start_response() as a status or as
response headers must follow RFC 2616 with respect to encoding. That
is, they must either be ISO-8859-1 characters, or use RFC 2047 MIME
encoding.


So in order to use non-ISO-8859-1 characters in response status
strings or headers, you must use RFC 2047.
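
A minimal sketch of that rule, assuming Python 3 and the standard
library's email.header module: values that fit in ISO-8859-1 pass through
untouched, anything else becomes an RFC 2047 encoded-word. The function
name encode_header_value is hypothetical. Note that Header.encode() may
fold long values across lines, which a real server would have to deal
with before emitting the header.

```python
from email.header import Header, decode_header


def encode_header_value(text):
    """Return text unchanged if it is ISO-8859-1, else RFC 2047 encode it."""
    try:
        text.encode("iso-8859-1")
        return text
    except UnicodeEncodeError:
        return Header(text, "utf-8").encode()


def decode_header_value(value):
    """Reverse encode_header_value() for a single encoded-word."""
    raw, charset = decode_header(value)[0]
    if charset is None:
        return raw if isinstance(raw, str) else raw.decode("iso-8859-1")
    return raw.decode(charset)
```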

As confirmed by the links you posted, this is a HTTP restriction, not
a WSGI restriction.

Regards,

Alan.


Re: [Web-SIG] WSGI, Python 3 and Unicode

2007-12-07 Thread Alan Kennedy
[Phillip]
 WSGI already copes, actually.  Note that Jython and IronPython have
 this issue today, and see:

 http://www.python.org/dev/peps/pep-0333/#unicode-issues

[James]
 It would seem very odd, however, for WSGI/python3 to use strings-
 restricted-to-0xFF for network I/O while everywhere else in python3 is
 going to use bytes for the same purpose.

I think it's worth pointing out the reason for the current restriction
to iso-8859-1 is *because* python did not have a bytes type at the
time the WSGI spec was drawn up. IIRC, the bytes type had not yet even
been proposed for Py3K. Cpython effectively held all byte sequences as
strings, a paradigm which is (still) followed by jython (not sure
about ironpython).

The restriction to iso-8859-1 is really a distraction; iso-8859-1 is
used simply as an identity encoding that also enforces that all
bytes in the string have a value from 0x00 to 0xff, so that they are
suitable for byte-oriented IO. So, in output terms at least, WSGI *is*
a byte-oriented protocol. The problem is that python-the-language
didn't have support for bytes at the time WSGI was designed.
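
The identity property is easy to demonstrate once a bytes type exists. A
minimal sketch, assuming Python 3 str/bytes semantics (which postdate the
WSGI spec under discussion); the two helper names are hypothetical.

```python
def bytes_to_wsgi_str(data):
    """Carry arbitrary bytes in a str whose code points are all <= 0xFF."""
    return data.decode("iso-8859-1")


def wsgi_str_to_bytes(text):
    """Recover the bytes; raises UnicodeEncodeError for code points > 0xFF."""
    return text.encode("iso-8859-1")
```

Every one of the 256 byte values survives the round trip, so the string
is only a container: a UTF-8 encoded body passes through unchanged, while
a string holding a real character above 0xff is rejected.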

[James]
 You'd have to modify your app
 to call write(unicodetext.encode('utf-8').decode('latin-1')) or so

Did you mean: write(unicodetext.encode('utf-8').encode('latin-1'))?

Either way, the second encode is not required;
write(unicodetext.encode('utf-8')) is sufficient, since it will
generate a byte-sequence(string) which will (actually should: see
(*) note below) pass the following test.

try:
   wsgi_response_data.encode('iso-8859-1')
except UnicodeError:
   # Illegal WSGI response data!
   raise

On a side note, it's worth noting that Philip Jenvey's excellent
rework of the jython IO subsystem to use java.nio is fundamentally
byte oriented.

http://www.nabble.com/fileno-support-is-not-in-jython.-Reason--t4750734.html
http://fisheye3.cenqua.com/browse/jython/trunk/jython/src/org/python/core/io

Because it is based on the new IO design for Python 3K, as described in PEP 3116

http://www.python.org/dev/peps/pep-3116/

Regards,

Alan.

[*] Although I notice that cpython 2.5, for a reason I don't fully
understand, fails this particular encoding sequence. (Maybe it's to do
with the possibility that the result of an encode operation is no
longer an encodable string?)

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> response = u"interferon-gamma (IFN-\u03b3) responses in cattle"
>>> response.encode('utf-8').encode('latin-1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position
22: ordinal not in range(128)


Meaning that to enforce the WSGI iso-8859-1 convention on cpython 2.5,
you would have to carry out this rigmarole

>>> response.encode('utf-8').decode('latin-1').encode('latin-1')
'interferon-gamma (IFN-\xce\xb3) responses in cattle'


Perhaps this behaviour is an artifact of the cpython implementation?

Whereas jython passes it just fine (and correctly, IMHO)

Jython 2.2.1 on java1.4.2_15
Type "copyright", "credits" or "license" for more information.
>>> response = u"interferon-gamma (IFN-\u03b3) responses in cattle"
>>> response.encode('utf-8')
'interferon-gamma (IFN-\xCE\xB3) responses in cattle'
>>> response.encode('utf-8').encode('latin-1')
'interferon-gamma (IFN-\xCE\xB3) responses in cattle'



[Web-SIG] Modjy and jython 2.2.

2007-09-05 Thread Alan Kennedy
Dear all,

Now that jython 2.2 has been released (hooray!)

http://www.jython.org/Project/download.html

it's time for a quick update on the status of modjy, the jython 
WSGI/J2EE gateway.

http://www.xhaus.com/modjy/

Previous versions of modjy were based on jython 2.1, which didn't have 
support for the iterator protocol. However, the new jython 2.2 has full 
iterator and generator support, and so is capable of full WSGI support 
(round of applause for the hard work of the jython-dev team).

In a testament to the stability of jython and the clean design of WSGI, 
the modjy code has not changed; the original jython 2.1 version of modjy 
works seamlessly with jython 2.2, unmodified.

Still, I am making an interim release, for two purposes

1. To fix a longstanding bug in the implementation
2. To explicitly mention jython 2.2 in the documentation

I'm off on vacation soon, and wanted to make this small publicity 
release before I go.

When I return, I will be making the following modifications

1. Adding a full test suite, based on MockRunner, the mock Java Servlet 
framework.
2. Improving J2EE resource handling
3. Improving import handling
4. Various small improvements and documentation updates.

All the best,

Alan.


Re: [Web-SIG] Web Site Process Bus

2007-06-26 Thread Alan Kennedy
[Graham Dumpleton]
  First comment is about WSGI applications somehow themselves using
  SIGTERM etc as triggers for things they want to do. For Apache at
  least, allowing any part of a hosted Python application to register
  its own signal handlers is a big no no. This is because Apache itself
  uses a whole range of signals to manage such tasks as shutting down
  sub processes or signaling worker and/or listener threads within a
  process that it's time to wake up or shut down. If a WSGI application
  starts registering signal handlers it can as a result stop Apache from
  even being able to process requests. In mod_wsgi I have had to
  specifically take steps to prevent applications breaking things in
  this way by replacing signal.signal() on creation of an interpreter.
  Instead I log a warning that the signal registration has been ignored
  and otherwise do nothing. This was simply the safest thing to do.
 
  Thus I believe a clear statement should be made that UNIX signals are
  off limits to WSGI applications or components.

From a jython POV, I agree with this statement; signals don't even 
exist on java/jython (although some JVMs have non-standard extensions 
for signals).

Thus, any standard involving signals would not be implementable on 
jython, and I guess ironpython too.
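
For readers wondering what the guard Graham describes might look like
from the python side, here is a minimal sketch. It is purely
illustrative: mod_wsgi's real implementation is not shown in this thread,
and the function name and logger name below are hypothetical.

```python
import logging
import signal

log = logging.getLogger("wsgi.signals")


def install_signal_guard():
    """Replace signal.signal with a stub; return the original for restoring."""
    original = signal.signal

    def ignore_registration(signum, handler):
        # Log and do nothing, so the host server keeps its own handlers.
        log.warning("ignoring signal registration for signal %s", signum)
        return None

    signal.signal = ignore_registration
    return original
```

A container would call install_signal_guard() once, before importing any
application code, and keep the returned function for its own use.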

[Graham Dumpleton]
  Anyway, just wanted to make it absolutely clear that I don't believe a
  hosted WSGI application and associated framework has any business
  taking direct interest in low level UNIX signals.

Agreed.

Regards,

Alan.


Re: [Web-SIG] Relationship between SCRIPT_NAME and PATH_INFO.

2007-01-28 Thread Alan Kennedy
[Graham Dumpleton]
 Should a WSGI adapter for a web server which allows a mount point to
 have a trailing slash specifically flag as a configuration error an
 attempt to use such a mount point given that it appears to be
 incompatible with WSGI?

OK, I'll have a go.

I think the question boils down to the following:

Assume an application mount point of /application.

If a request is received for

/application

Then it will (and should) be redirected to the URL

/application/

Is that new URL to be interpreted as

SCRIPT_NAME: /application
PATH_INFO:   /

or interpreted as

SCRIPT_NAME: /application/
PATH_INFO:

I think that the WSGI interpretation is the first interpretation, and
the correct one, because it gives correct results when deriving
relative URLs for resources contained within the application.
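
A small sketch of why the first split behaves better. Both splits rebuild
the same request path, but only the first gives a clean URL when the
application joins SCRIPT_NAME to a resource name. Python 3's urllib.parse
is assumed, and the two helper names are hypothetical.

```python
from urllib.parse import quote


def request_path(environ):
    """Rebuild the request path as SCRIPT_NAME followed by PATH_INFO."""
    script_name = quote(environ.get("SCRIPT_NAME", ""))
    path_info = quote(environ.get("PATH_INFO", ""))
    return script_name + path_info


def resource_url(environ, name):
    """URL of a resource directly under the application's mount point."""
    return quote(environ.get("SCRIPT_NAME", "")) + "/" + quote(name)


first = {"SCRIPT_NAME": "/application", "PATH_INFO": "/"}
second = {"SCRIPT_NAME": "/application/", "PATH_INFO": ""}
```

With the first environ, resource_url(first, "style.css") is
/application/style.css; with the second, the same call yields
/application//style.css.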

Is that addressing the question?

[Graham Dumpleton]
 It therefore seems that the idea of the mount point for an
 application having a trailing slash may be incompatible
 with WSGI. Can this be considered to be the case or is there
 some other way one is meant to deal with this?

I don't know about incompatible, although it obviously creates the
double-slash problem with computed URLs.

Perhaps the Apache policy on this issue is influenced by its origins
as a http server for serving hierarchies of directories and files from
a filesystem?

When it comes to CGI though, Apache does the right thing and passes

SCRIPT_NAME: /application
PATH_INFO:   /

to CGI scripts.

I don't know if this provides any insight into whether or not mounting
applications with a trailing slash is an error.

Does that help at all?

Alan.


Re: [Web-SIG] WSGI input filter that changes content length.

2007-01-15 Thread Alan Kennedy
[Graham Dumpleton]
 How does one implement in WSGI an input filter that manipulates the request
 body in such a way that the effective content length would be changed?

 The problem I am trying to address here is how one might implement using
 WSGI a decompression filter for the body of a request, i.e., where
 Content-Encoding: gzip has been specified.

 So, how is one meant to deal with this in WSGI?

The usual approach to modifying something in the WSGI
environment, in this case the wsgi.input file-like object, is to wrap
it or replace it with an object that behaves as desired.

In this case, the approach I would take would be to wrap the
wsgi.input object with a gzip.GzipFile object, which should only read
the input stream data on demand. The code would look like this

import gzip
wsgi_env['wsgi.input'] = gzip.GzipFile(fileobj=wsgi_env['wsgi.input'], mode='rb')

Notes.

1. The application should be completely unaware that it is dealing
with a compressed stream: it simply reads from wsgi.input, unaware
that reading from what it thinks is the input stream is actually
causing cascading reads down a series of file-like objects.

2. The GzipFile object will decompress on the fly, meaning that it
will only read from the wrapped input stream when it needs input.
Which means that if the application does not read data from
wsgi.input, then no data will be read from the client connection.

3. The GzipFile should not be responsible for enforcement of the
incoming Content-Length boundary. Instead, this should be enforced by
the original server-provided file-like input stream that it wraps. So
if the application attempts to read past Content-Length bytes, the
server-provided input stream is allowed to simulate an end-of-file
condition. Which would cause the GzipFile to return an EOF to the
application, or possibly an exception.

4. Because of the on-the-fly nature of the GzipFile decompression, it
would not be possible to provide a meaningful Content-Length value to
the application. To do so would require buffering and decompressing
the entire input data stream. But the application should still be able
to operate without knowing Content-Length.

5. The wrapping can NOT be done in middleware. PEP 333, Section "Other
HTTP Features" has this to say: "WSGI applications must not generate
any hop-by-hop headers [4], attempt to use HTTP features that would
require them to generate such headers, or rely on the content of any
incoming hop-by-hop headers in the environ dictionary. WSGI servers
must handle any supported inbound hop-by-hop headers on their own,
such as by decoding any inbound Transfer-Encoding, including chunked
encoding if applicable." So the wrapping and replacement of wsgi.input
should happen in the server or gateway, NOT in middleware.

6. Exactly the same principles should apply to decoding incoming
Transfer-Encoding: chunked.
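
Putting the notes above together, a server-side helper might look roughly
like the following sketch (Python 3 assumed). The names wrap_gzip_input
and make_gzip_environ are hypothetical, io.BytesIO stands in for the
server's real input stream, and dropping CONTENT_LENGTH is one possible
reading of note 4, not something the spec mandates.

```python
import gzip
import io


def wrap_gzip_input(environ):
    """Replace wsgi.input with a decompressing wrapper for gzipped bodies."""
    if environ.get("HTTP_CONTENT_ENCODING", "").lower() == "gzip":
        environ["wsgi.input"] = gzip.GzipFile(
            fileobj=environ["wsgi.input"], mode="rb")
        # The decompressed length is unknown until the stream is consumed.
        environ.pop("CONTENT_LENGTH", None)
    return environ


def make_gzip_environ(body):
    """Build a tiny stand-in environ carrying a gzip-compressed body."""
    compressed = gzip.compress(body)
    return {
        "HTTP_CONTENT_ENCODING": "gzip",
        "CONTENT_LENGTH": str(len(compressed)),
        "wsgi.input": io.BytesIO(compressed),
    }
```

The application then calls environ["wsgi.input"].read() as usual and gets
the decompressed bytes, with decompression happening only as it reads.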

HTH,

Alan.

P.S. Thanks for all your great work on mod_python Graham!


Re: [Web-SIG] [Fwd: Summer of Code preparation]

2006-04-19 Thread Alan Kennedy
[Peter Hunt]
 I think an interesting project would be complete integration of the
 client and server via AJAX. That is, whenever a DHTML event handler
 needs to be called on the client-side, the document state is serialized
 and it is sent along with the DHTML event information to the server,
 informing it that an event occurred.

[Matt Goodall]
 Invoking something server-side every time there's some (interesting)
 event in the browser will almost certainly perform badly due to network
 latency and possibly put unnecessary load on the server.

I was going to refrain from this conversation, but now find the
following point relevant:

How long before we end up reinventing X-windows-style transmission of
UI events across the network, i.e. by sending all browser events over
HTTP to the server?

It's worth noting that, in the early days of X-windows, people said it
was far too heavyweight, and would saturate networks and quickly
become unusable. But those people reckoned without advances in network
technology, and the X-windows people claimed that they were
specifically designing for network technologies from several years in
the future, by which time their software technology would be mature
and ready to take advantage of the newer and higher bandwidths. And
they were pretty much right: having used X-windows over corporate WANs
since the early 1990s, I think it works pretty well.

But the X-windows people weren't designing for Internet scale: how
many connections should a server be able to handle?

 Serializing and sending document state will only make it slower.

Agreed: serialising and transmitting whole documents is taking it too far ;-)

Regards,

Alan.


Re: [Web-SIG] [Fwd: Summer of Code preparation]

2006-04-18 Thread Alan Kennedy
[Titus Brown]
 I'm thinking of proposing a project to build a JavaScript interpreter
 interface for Python; the goal (for me) is to get twill/mechanize to
 understand JavaScript.  I think the project has wider applications,
 but I'm not sure what people actually want to do with JavaScript.
 I could imagine server-side parsing of javascript, and/or integration of
 javascript and python code.  Thoughts?

Have you looked at WebCleaner? WebCleaner is a filtering HTTP proxy,
written in python.

http://webcleaner.sourceforge.net/

WebCleaner uses the Mozilla SpiderMonkey javascript engine to execute
JS from web pages: From the webcleaner front page


Another feature is the JavaScript filtering: JavaScript data is
executed in the integrated Spidermonkey JavaScript engine which is
also used by the Mozilla browser suite. This eliminates all JavaScript
obfuscation, popups, and document.write() stuff, but the other
JavaScript functions still work as usual.


Perhaps webcleaner has code that already does what you need? Although
the GPL licensing might be problematic.

Regards,

Alan.


Re: [Web-SIG] Standalone WSGI form framework.

2006-03-16 Thread Alan Kennedy
[Alan Kennedy]
 I'm looking for a framework-independent form library. I'm using the
 Quixote forms library at the moment, inside my own framework, but
 would ideally like something more WSGI oriented, so that it is easier
 to mock and unittest.

[Daniel Miller]
 Have you looked at Ian Bicking's FormEncode? I'm not sure if it
 meets all your requirements, but it seems like a good base to start
 with (most of the hard stuff has already been done).

Thanks Daniel.

Indeed, it not only appears that FormEncode is the closest thing to
what I need, it also seems to be the only show in town, i.e. the only
framework-independent form library.

[Alan Kennedy]
 If anyone is familiar with the Java Spring Framework, it's got pretty
 much everything I need, but is overly complex, and is written in Java

[Daniel Miller]
 I wrote an app using Spring and I have to say it's the best web
 framework I've ever used in terms of completeness and flexibility,
 but it's written in Java...

Agreed. I find its interface-based design very simple and powerful.
But, IMHO, the actual implementations of the classes that implement
the interfaces are excessively complex and rigidly structured.

[Daniel Miller]
 I actually wrote a few simple classes on top of CherryPy that exposes
 the Spring webmvc Controller interface as well as the
 SimpleFormController class (those are the two main building blocks
 I found most useful in Spring's WebMVC). My SimpleFormController
 implementation uses FormEncode for validation. I'd be willing to
 share the code if you're interested.

I'd be very interested to see that, and potentially use it, if you're
willing ...

[Daniel Miller]
 I think the one true web framework could be made for Python if
 someone took the best ideas from Spring WebMVC and made a few
 component-ized building blocks on top of which complex and widely
 varied applications could be built.

Completely agreed. The term meta-framework is most appropriate, I
think. If we could agree on a set of interfaces, then everyone would
be free to contribute implementations of their own components.

For example, I like the idea of Routes URL-mapping library: it's
precisely the kind of task that is simple enough in concept, but yet
complex enough to require a dedicated (and thoroughly tested) library.

Most of the popular web frameworks make the fundamental mistake of
picking a single URL-object mapping mechanism, and making you
shoehorn all your requirements into it. IIRC, Django, Turbogears,
Pylons, all make this mistake.

However, if URL-object mapping were controlled by an interface, then
we'd be free to choose from multiple implementations, e.g.
routes-style, quixote-style, zope-style, etc, etc.
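
A minimal sketch of what such an interface could look like, assuming a single resolve(path) method. The class names PrefixMapper and PatternMapper, the dispatch helper and the show_article target are all hypothetical, invented here for illustration; they are not taken from Routes, Quixote or any other library.

```python
import re


class PrefixMapper:
    """Hypothetical mapper that looks only at the first path segment."""

    def __init__(self, table):
        self.table = table

    def resolve(self, path):
        segment = path.strip("/").split("/")[0]
        return self.table.get(segment), {}


class PatternMapper:
    """Hypothetical mapper that tries regular expressions in order."""

    def __init__(self, routes):
        self.routes = [(re.compile(pattern), target) for pattern, target in routes]

    def resolve(self, path):
        for pattern, target in self.routes:
            match = pattern.match(path)
            if match:
                return target, match.groupdict()
        return None, {}


def dispatch(mapper, path):
    """Framework code that depends only on the resolve(path) interface."""
    target, args = mapper.resolve(path)
    if target is None:
        return "404 Not Found"
    return target(**args)


def show_article(article_id="latest"):
    return "article " + article_id


by_prefix = PrefixMapper({"articles": show_article})
by_pattern = PatternMapper([(r"^/articles/(?P<article_id>\d+)$", show_article)])
```

Either mapper can be handed to dispatch() without the calling code changing, which is the point of putting the mapping behind an interface.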

 However, to make this possible we'd most likely need a standard
 request object (or at least an interface definition).

ISTM that WSGI eliminates the need for that. Is there any specific
thing you have in mind that WSGI doesn't cover?

Regards,

Alan.


Re: [Web-SIG] Standalone WSGI form framework.

2006-03-16 Thread Alan Kennedy
[Alan Kennedy]
 I'm looking for a framework-independent form library. I'm using the
 Quixote forms library at the moment, inside my own framework, but
 would ideally like something more WSGI oriented, so that it is easier
 to mock and unittest.

[Titus Brown]
 I'm confused by this -- this could mean that you want to separate the
 quixote forms lib from the Quixote 'request' object, I guess.  What
 else?

Hi Titus,

I realise that I can rewrite the Quixote form lib to achieve what I
want, but at the cost of a fairly significant effort.

As it is, I've rewritten the rendering, to work with Kid and ElementTree.

But I'm tired of hacking on it to make it do what I want: I'd much
prefer to start afresh with my own design than to continue to use
Quixote: it's just too limiting.

Regards,

Alan.


Re: [Web-SIG] Standalone WSGI form framework.

2006-03-16 Thread Alan Kennedy
[Alan Kennedy]
 But I'm tired of hacking on it to make it do what I want: I'd much
 prefer to start afresh with my own design than to continue to use
 Quixote: it's just too limiting.

[Titus Brown]
 I think you mistook my question for a criticism ;).  Rewrite or no, I'm
 mostly interested in what you meant by WSGI oriented and what that
 would mean specifically in the context of the Quixote forms lib.

No criticism detected ;-)

By WSGI oriented, I mean that I don't have to mock request objects: I
can just use a dictionary to mock a WSGI request: I've found that
testing approach exceedingly straightforward to work with. Also, I've
had problems in the past with Quixote not handling response encodings
correctly. And its HTML escaping mechanism is excessively PTL
oriented: I ended up making too many changes to Quixote, which made me
question why I was using it in the first place.
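
A minimal sketch of that testing approach, written against the Python 3 standard library, which postdates this thread. The application greeting_app and the helper call_app are hypothetical names used only for illustration.

```python
from wsgiref.util import setup_testing_defaults


def greeting_app(environ, start_response):
    """Hypothetical application under test: echoes the requested path."""
    body = ("Hello from %s" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, environ):
    """Call a WSGI app with a plain dict and capture status, headers, body."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


# The "mock request" is nothing more than a dictionary.
environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/forms/edit"}
setup_testing_defaults(environ)  # fills in the remaining required keys
status, headers, body = call_app(greeting_app, environ)
```

No request class has to be stubbed out: the test builds a dict, calls the application, and inspects what came back.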

Regards,

Alan.


[Web-SIG] Standalone WSGI form framework.

2006-03-15 Thread Alan Kennedy
Greetings All.

I'm looking for a framework-independent form library. I'm using the
Quixote forms library at the moment, inside my own framework, but
would ideally like something more WSGI oriented, so that it is easier
to mock and unittest.

My ideal form framework should do the following

1. Parsing of submitted POST requests, etc
2. Binding of incoming form variables to the attributes of a target
python data object
3. Customisable validation, with management of validation error messages.
4. Generate unique (hierarchical) field names for sub-attributes of
the data object to be edited, which are javascript-identifier-safe,
i.e. can be used as the names of HTML form input elements.
5. Handle multipart/form-data
6. Nice-to-have: transparently handle multi-page forms, e.g. hub forms, etc.

It should NOT

1. Attempt to generate HTML, or be tied to a specific templating mechanism
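
To make requirements 2 to 4 concrete, here is a rough sketch of binding submitted fields to a target object. It assumes a double underscore as the hierarchy separator (so that field names remain identifier-safe) and a urlencoded body; the bind function and the Person and Address classes are hypothetical, and multipart handling is left out.

```python
from urllib.parse import parse_qs


class Address:
    def __init__(self):
        self.city = ""


class Person:
    def __init__(self):
        self.name = ""
        self.address = Address()


def bind(target, form_body, separator="__"):
    """Copy submitted values onto existing attributes of 'target'.

    A field such as 'address__city' is bound to target.address.city.
    Returns a dict mapping rejected field names to an error message.
    """
    errors = {}
    for field, values in parse_qs(form_body).items():
        *parents, leaf = field.split(separator)
        obj = target
        try:
            for part in parents:
                obj = getattr(obj, part)
            if not hasattr(obj, leaf):
                raise AttributeError(leaf)
            setattr(obj, leaf, values[0])
        except AttributeError:
            errors[field] = "unknown field"
    return errors


person = Person()
errors = bind(person, "name=Alan&address__city=Dublin&bogus=1")
```

Nothing here generates HTML; a template can reuse the same field names for its input elements.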

If anyone is familiar with the Java Spring Framework, it's got pretty
much everything I need, but is overly complex, and is written in Java
:-(

TIA,

Alan.


Re: [Web-SIG] WSGI in standard library

2006-02-19 Thread Alan Kennedy
[Alan Kennedy]
 Maybe we need a PEP

[Bill Janssen]
 Great idea!  That's exactly what I thought when I organized this SIG a
 couple of years ago.

[Guido van Rossum]
  At first I was going to respond +1. But the fact that a couple of
  years haven't led to much suggests that it's unlikely to be fruitful;
  there just are too many diverging ideas on what is right. (Which makes
  sense since it's a huge and fast developing field.)

Having considered the area for a couple of days, I think you're right:
the generic concept "web", as in "web-sig", covers far too much ground,
and there are too many schools of thought.

  So unless someone (Alan Kennedy?) actually puts forward a PEP and gets
  it through a review of the major players on web-sig, I'm skeptical.

But there is a subset which I think is achievable, namely http support, 
which IMO is the subset that most needs a rework. And now that we have a 
nice web standard, WSGI, it would be nice to make use of it to refactor 
the current http support. The following are important omissions in the 
current stdlib.

  - Asynchronous http client/server support (use asyncore? twisted?)
  - SSL support in threaded http servers
  - Asynchronous SSL support
  - Simple client file upload support
  - HTTP header parsing support, e.g. language codes, quality lists, etc
  - Simple object publishing framework?

Addressing all of the above would be a significant piece of work. And
IMHO, it is only achievable by staying focussed on http and NOT 
addressing requirements such as

  - Content processing, e.g. html tidy, html parsing, css parsing
  - Foreign script language parsing or execution
  - Page templating API

I think it would be a good idea to address these concerns in separate PEPs.

[Guido van Rossum]
  I certainly don't want this potential effort to keep us from adding
  the low-hanging fruit (wsgiref, with perhaps some tweaks as PJE can
  manage based on recent feedback here) to the 2.5 stdlib.

Completely agreed. Any web-related PEPs are going to take a long time, 
and are unlikely to be ready in time for 2.5.

Regards,

Alan.


Re: [Web-SIG] WSGI in standard library

2006-02-16 Thread Alan Kennedy
[Ian Bicking]
 Anyway, I'm +1 on the object [wsgiref's wsgi header manipulation class]
 going somewhere.  I don't know if the
 parent package has to be named wsgi -- and wsgiref seems even
 stranger to me, as anything in the standard library isn't a reference
 implementation anymore, but an actual implementation.  I personally
 like a package name like web.  Everyone will know what that means
 (though it would start with most of the web related modules not in it,
 which is a problem).

While we're on the subject, can we find a better home for the HTTP
status codes-messages mapping?

Integer status codes.
http://mail.python.org/pipermail/web-sig/2004-September/000764.html

Adding status code constants to httplib
http://mail.python.org/pipermail/web-sig/2004-September/000842.html
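
For comparison, the Python 3 standard library (which postdates this thread) keeps such a mapping in http.HTTPStatus and http.client.responses. The sketch below builds a WSGI status string from an integer code; the helper name status_line is made up for illustration.

```python
from http import HTTPStatus
from http.client import responses


def status_line(code):
    """Build a WSGI status string such as '404 Not Found' from an integer.

    HTTPStatus(code) raises ValueError for codes it does not know.
    """
    status = HTTPStatus(code)
    return "%d %s" % (status.value, status.phrase)


not_found = status_line(404)
ok_phrase = responses[200]
```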

Regards,

Alan.


Re: [Web-SIG] WSGI in standard library

2006-02-16 Thread Alan Kennedy
[Guido Van Rossum]
 Actually BaseHTTPServer.py and friends use a deprecated naming scheme
 -- just as StringIO, UserDict and many other fine standard library
 modules.
 If you read PEP 8, the current best practice is for module names to be
 all-lowercase and *different* from the class name.

[Clark C Evans]
 I propose we add wsgiref, but look at other implementations and
 steal what ever you can from them.  This is not a huge chunk of
 code -- no reason why you can't have the best combination of
 features and correctness.

[Jean Paul Calderone]
 HTTPS is orthogonal.  Besides, how would you support it in the stdlib?  It's 
 currently not  possible to write an SSL server in Python without a 
 third-party library.  Maybe someone
 would be interested in rectifying /that/? :)

[Ian Bicking]
 I've used this several times (well, not wsgiref's implementation, but
 paste.response.HeaderDict).  rfc822 is heavier than this dictionary-like
 object, and apparently is also deprecated.

[Alan Kennedy]
 While we're on the subject, can we find a better home for the HTTP
 status codes-messages mapping?

Folks,

Thinking about this some more, it's beginning to sound to me like the
server-side web support in the standard library needs a proper review
and possible rework: it's slowly decohering/kipplizing.

Maybe we need a PEP, so that we can all discuss the subject
(rationally ;-) and sort out all of the issues before we go ahead and
commit anything?

Just a thought. Feel free to dis-regard

Alan.


Re: [Web-SIG] WSGI in standard library

2006-02-14 Thread Alan Kennedy
[Ian Bicking]
 Note that the scope of a WSGI server is very very limited.  It is quite 
 distinct from an XMLRPC server from that perspective -- an XMLRPC server 
 actually *does* something.  A WSGI server does nothing but delegate.

and

 I'm not set on production quality code, but I think the general 
 sentiment against that is entirely premature.  The implementations 
 brought up -- CherryPy's 
 (http://svn.cherrypy.org/trunk/cherrypy/_cphttpserver.py) and Paste's 
 (http://svn.pythonpaste.org/Paste/trunk/paste/httpserver.py) and 
 wsgiref's 
 (http://cvs.eby-sarna.com/wsgiref/src/wsgiref/simple_server.py?rev=1.2&view=markup)
  
 are all pretty short.  It would be better to discuss the particulars. Is 
 there a code path in one or more of these servers which you think is 
 unneeded and problematic?

A few points.

1. My opinion is not relevant to whether/which WSGI server goes into the 
standard library. What's required is for someone to propose to 
python-dev that a particular WSGI server should go into the standard 
library. I imagine that the response on python-dev to the proposer is
going to be along the lines of "Will you be maintaining this?" If/when
python-dev is happy, then it'll go into the distribution.

2. What's wrong with leaving the current situation as-is, i.e. the 
available WSGI implementations are listed on the WSGI Moin page

http://wiki.python.org/moin/WSGIImplementations

3. If I had to pick one of the 3 you suggested, I'd pick the last one, 
i.e. PJE's, because it fulfills exactly the criteria I listed

  - It's pretty much the simplest possible implementation, meaning it's 
easiest to understand.
  - It's based on the existing *HttpServer hierarchy
  - It's got a big notice at the top saying "This is both an example
of how WSGI can be implemented, and a basis for running simple web
applications on a local machine, such as might be done when testing or
debugging an application.  It has not been reviewed for security issues,
however, and we strongly recommend that you use a real web server for
production use."
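
As a rough illustration of the "local machine, testing or debugging" use case, the sketch below uses wsgiref.simple_server as it exists in today's Python 3 standard library, not the 2006 CVS revision linked above. The QuietHandler class and the fetch_once helper are hypothetical names for this example.

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, demo_app, make_server


class QuietHandler(WSGIRequestHandler):
    """Request handler that skips the per-request line on stderr."""

    def log_message(self, format, *args):
        pass


def fetch_once(app, path="/"):
    """Serve exactly one request on a free loopback port and return the reply."""
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    server.timeout = 5  # give up rather than wait forever for a client
    port = server.server_address[1]
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
        worker.join()
        server.server_close()


status, body = fetch_once(demo_app)
```

A long-running development server would call serve_forever() instead of handle_request(); the warning in the notice applies either way.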

Regards,

Alan.


Re: [Web-SIG] WSGI in standard library

2006-02-14 Thread Alan Kennedy

[Alan Kennedy]
 3. If I had to pick one of the 3 you suggested, I'd pick the
 last one, i.e. PJE's, because it fulfills exactly the criteria
 I listed


[Robert Brewer]

I have to disagree (having examined/unraveled it quite a bit recently,
to remove modpython_gateway's dependency on it). 


[Ian Bicking]
I think it also tries to enforce a lot of the details of WSGI, and thus 
guide a WSGI implementor into creating a compliant server.  


Well, I'm sure we all want our favourite server in the stdlib ;-)

But a few things have to happen first.

Priority #1: Make the requisite server a single standalone module.

Anticipating PJE's willingness to have WSGIRef included in the stdlib, 
I've taken the liberty of putting it all into one big file. And I think 
it looks pretty damn good: fully WSGI compliant, with code to represent 
every single aspect of the spec. Take a look for yourself: the file is 
attached. If the attachment doesn't make it to the list, I'll upload it 
somewhere.


But that doesn't mean the decision's over. It means that the bar has 
been raised. Anyone else who wants their module to be a contender has to 
get it all into the one file, i.e. eliminating all framework 
dependencies, etc.


Here's a few comments I put together about the three contenders that 
have been proposed so far. They're just my own comments from reading the 
code: feel free to treat them as the ravings of a madman if you so wish.


1. CherryPy server - 407 lines (non-code lines: ~80)

 - Depends on cherrypy, cherrypy._cputil, cherrypy.lib.httptools
 - Depends on cherrypy.config
 - Implements HTTP header length limit checking
 - Implements HTTP body length limit checking
 - Uses own logging handler
 - Subclasses SocketServer.BaseServer, not BaseHTTPServer.HTTPServer
   - Therefore does low-level socket mucking-about
 - Provides 2 server implementations
   - CherryHTTPServer
   - PooledThreadServer
 - Explicitly checks for KeyboardInterrupt exceptions
 - PooledThreadServer has clean shutdown through Queue.Queue messaging
 - Does not detect hop-by-hop headers
 - No demo application

My gut feeling: too complex, works too hard to be production-ready, at
the expense of readability.


2. Paste Server - 450 lines

 - Supports 100 continue responses
 - No imports from outside stdlib
 - Provides HTTPS/SSL server, with fallback if no SSL
 - Supports socket timeout
 - Demo application is (imported) paste.wsgilib.dump_environ
 - Does not detect hop-by-hop headers

My gut feeling: Ignores many parts of the WSGI spec (sendfile, strict 
error checking), supports unnecessary stuff for stdlib, i.e. Continue 
support, HTTPS.


3. WSGIRef_onefile.py - 660 lines

 - No imports from outside stdlib
 - Detects hop-by-hop headers
 - Has WSGI sendfile support
 - Has dedicated class to manage WSGI headers list as dictionary
 - Has builtin demo app

My gut feeling: WSGIRef is the sweetspot in terms of simplicity vs. 
usability. Covers all aspects of WSGI (which is what it was designed 
for, IIRC ;-)
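
For readers who want to poke at those three features, the sketch below uses the wsgiref package as shipped in the Python 3 standard library today, not the single-file version attached to this message. The header values and the sample file contents are made up.

```python
from io import BytesIO
from wsgiref.headers import Headers
from wsgiref.util import FileWrapper, is_hop_by_hop

# Hop-by-hop detection: applications must not set these headers themselves.
response_headers = [("Content-Type", "text/plain"), ("Connection", "close")]
forbidden = [name for name, _ in response_headers if is_hop_by_hop(name)]

# Headers wraps the WSGI list of (name, value) tuples in a
# case-insensitive, dictionary-like interface.
headers = Headers([("Content-Type", "text/plain")])
headers["content-length"] = "11"
headers.add_header("Content-Disposition", "attachment", filename="report.txt")

# FileWrapper turns a file-like object into an iterable of fixed-size blocks.
chunks = list(FileWrapper(BytesIO(b"hello world"), 4))
```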


The ball's in yizzir court now..

Alan.

"""BaseHTTPServer that implements the Python WSGI protocol (PEP 333, rev 1.21)

This is both an example of how WSGI can be implemented, and a basis for running
simple web applications on a local machine, such as might be done when testing
or debugging an application.  It has not been reviewed for security issues,
however, and we strongly recommend that you use a "real" web server for
production use.

For example usage, see the 'if __name__=="__main__"' block at the end of the
module.  See also the BaseHTTPServer module docs for other API information.
"""

from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
import urllib, sys, os, mimetools, types, time

__version__ = "0.1"
__all__ = ['WSGIServer','WSGIRequestHandler','demo_app']

server_version = "WSGIServer/" + __version__
sys_version = "Python/" + sys.version.split()[0]
software_version = server_version + ' ' + sys_version

hop_by_hop_headers = {
    'connection':1,
    'keep-alive':1,
    'proxy-authenticate':1,
    'proxy-authorization':1,
    'te':1,
    'trailers':1,
    'transfer-encoding':1,
    'upgrade':1
}

def is_hop_by_hop(header_name):
    """Return true if 'header_name' is an HTTP/1.1 "Hop-by-Hop" header"""
    return hop_by_hop_headers.has_key(header_name.lower())

class FileWrapper:
    """Wrapper to convert file-like objects to iterables"""

    def __init__(self, filelike, blocksize=8192):
        self.filelike = filelike
        self.blocksize = blocksize
        if hasattr(filelike,'close'):
            self.close = filelike.close

    def __getitem__(self,key):
        data = self.filelike.read(self.blocksize)
        if data:
            return data
        raise IndexError

    def __iter__(self):
        return self

    def next(self):
        data = self.filelike.read(self.blocksize)
        if data:
            return data
        raise StopIteration

class Headers:

    """Manage a collection of HTTP response headers"""

def

Re: [Web-SIG] WSGI in standard library

2006-02-14 Thread Alan Kennedy
[Guido van Rossum]
Let's make it so. I propose to add wsgiref to the standard library and
nothing more.

[Blake Winton]
Will you be maintaining this?  ;)

[Guido van Rossum]
I'd expect we could twist Phillip's arm to maintain it; he's not
expecting much maintenance.

[Phillip J. Eby]
 Yes, and yes.

Whew! :-)

Phillip: Hope you don't mind me taking the liberty of rearranging your code?

And before we go finalising anything, please let's give the other 
contenders a chance to come up with something competitive.

Alan.



Re: [Web-SIG] WSGI in standard library

2006-02-14 Thread Alan Kennedy
[Alan Kennedy]
Priority #1: Make the requisite server a single standalone module.

[Guido van Rossum]
 Huh? What makes you think this?

My bad :-(

Two things made me think like that

1. BaseHttpServer - BaseHttpServer.py
SimpleHttpServer - SimpleHttpServer.py
WSGIHttpServer - WSGIHttpServer.py

2. The comment was more aimed at the CherryPy entry, which imports a 
fair amount of CherryPy support code.

i'll-get-me-coat-ly'yrs,

Alan.


Re: [Web-SIG] WSGI in standard library

2006-02-12 Thread Alan Kennedy
[Alan Kennedy]
Instead, I think the right approach is to continue with the existing 
approach: put the most basic possible WSGI server in the standard 
library, for educational purposes only, and a warning that it shouldn't 
really be used for production purposes.

[Bill Janssen]
 I strongly disagree with this thinking.  Non-production code shouldn't
 go into the stdlib; instead, Alan's proposed module should go onto
 some pedagogical website somewhere with appropriate tutorial
 documentation.

I still disagree ;-)

IMO, the primary reason for not including production servers in the 
standard library is that servers need to be maintained much more 
fastidiously than the standard library, and need to be released on a 
timescale that is independent of python releases.

Note the security hole uncovered in the standard library xml-rpc lib
last year.

PSF-2005-001 - SimpleXMLRPCServer.py allows unrestricted traversal
http://www.python.org/security/PSF-2005-001/

This particular security hole is the very reason why the Python Security 
response team had to be founded, and required point-releases of the 
entire python distribution to fix, i.e. python 2.3.5 and python 2.4.1 
were released simply to fix this bug.

There are two primary areas of the python distro that can result in such 
significant security holes.

1. Crypto libraries. Fortunately, the Timbot has been carefully watching 
over us, and ensuring the excellence of the python crypto libraries (as 
witnessed by the appearance of Ron Rivest on python-dev (!) last December):

http://mail.python.org/pipermail/python-dev/2005-December/058850.html

2. Internet-exposed servers. No matter how careful developers are, it is 
very difficult to avoid designing security holes into such servers. 
Therefore, IMHO, it is wrong to include such servers into the standard 
distribution. Instead, production-ready servers should be independent of 
the standard distribution, have their own development teams, have 
independent release-cycles, etc, etc: think Twisted, mod_python, etc.

So, I still think that only basic educational/playpen servers
should go in the standard library, with an indication that the user
should pick a server from outside the distro if they need to
do serious server work.

Maybe if there were no production-ready servers in the standard 
library, there would be no need for a Python Security Response Team.

Just my €0,02.

Regards,

Alan.



Re: [Web-SIG] WSGI in standard library

2006-02-12 Thread Alan Kennedy
[Graham Dumpleton]
 Anyway, not that it matters, but the security fix was not the only thing
 in those releases.

Still, I think my point stands that internet-facing servers in the 
standard library are currently the only source of security advisories
in python.

http://www.python.org/security/

How sure are we that any proposed production WSGI server in the standard 
library will not become a source of further holes, especially if it 
tries to cover all the bases of a true production server, i.e. security, 
flexibility, efficiency, full http1.1 compliance, etc?

Regards,

Alan.


Re: [Web-SIG] Bowing out (was Re: A trivial template API counter-proposal)

2006-02-12 Thread Alan Kennedy
[Alan Kennedy]
 Looking at this in an MVC context ...

[Phillip J. Eby]
 As soon as you start talking about what templates should or should not 
 do (as opposed to what they *already* do), you've stopped writing an 
 inclusive spec and have wandered off into evangelizing a particular 
 framework philosophy.

Sorry if my message seemed unreasonable. My approach to such matters is 
to attempt to start from best design practice, keeping a keen focus on 
the best way to do things in the future, relegating poorly-architected 
legacy systems, e.g. active page systems, to being a secondary concern.

Also, my take on active page systems is that they could easily be 
encompassed by an MVC model. The View is the active page, the Model is 
the namespace in which the active page is rendered and the Controller is 
the thing that does the rendering.

[Phillip J. Eby]
  At this point it has become clear to me that even if I spent my days
  and nights writing a compelling spec of what I'm proposing and then
  trying to sell it to the Web SIG, it would be at best a 50/50 chance
  of getting through, and in the process it appears that I'd be burning
  through every bit of goodwill I might have previously possessed here.

  .. I'd rather save whatever karma I
  have left here for something with a better chance of success.

I'm sorry to hear that.

[Phillip J. Eby]
 Good luck with the spec.

Well, I'm currently designing and implementing a View and ViewResolver 
in Spring for a customer, so I'll be keeping a note of requirements as I 
go, and will attempt to come up with a generic design which is suitable 
for a templating standard. But it will be a few weeks before I can
spec that, document it and start doing sample implementations which I 
can open source.

Regards,

Alan.



Re: [Web-SIG] Standardized template API

2006-01-31 Thread Alan Kennedy
[Clark C. Evans]
 I'd stick with the notion of a template_name that is neither the
 template file nor the template body.  Then you'd want a template factory
 method that takes the name and produces the template body (compiled if
 necessary).  

I agree.

If you're looking for an existing model (in java), the Spring framework 
has View objects (i.e. the V in MVC) and View Resolver objects. The 
latter resolve logical template names to actual templates, compiled if 
necessary.

View Interface
http://static.springframework.org/spring/docs/1.2.x/api/org/springframework/web/servlet/View.html

ViewResolver Interface
http://static.springframework.org/spring/docs/1.2.x/api/org/springframework/web/servlet/ViewResolver.html

 This way your template could be stored
 in-memory, on-disk, or in a database, or even remotely using an HTTP
 cache.  The actual storage mechanism for the template source code should
 not be part of this interface.

A very important requirement IMHO.
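
A rough Python sketch of the resolver idea, assuming string.Template stands in for a real templating engine. DictResolver and render are hypothetical names; this is not a port of the Spring interfaces linked above.

```python
from string import Template


class DictResolver:
    """Hypothetical resolver backed by an in-memory mapping of template source.

    A resolver reading from disk, a database or an HTTP cache would expose
    the same resolve(template_name) method.
    """

    def __init__(self, sources):
        self.sources = sources
        self.compiled = {}

    def resolve(self, template_name):
        # "Compile" on first use, then reuse the cached template object.
        if template_name not in self.compiled:
            self.compiled[template_name] = Template(self.sources[template_name])
        return self.compiled[template_name]


def render(resolver, template_name, model):
    """Calling code knows only the logical name, never where the source lives."""
    return resolver.resolve(template_name).substitute(model)


resolver = DictResolver({"welcome": "Hello, $user!"})
page = render(resolver, "welcome", {"user": "Alan"})
```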

Regards,

Alan.



Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Alan Kennedy
[Jim Fulton]
 Is Zope the only WSGI application that performs authentication
 itself?

[Phillip J. Eby]
 I think Zope is the only WSGI application that cares about
  communicating this information back to the web server's logs.  :)

[Jim Fulton]
  I hope that's not true.  Certainly, if anyone else is doing
  authentication in their applications or middleware, they
  *should* care about getting information into the access logs.

Well, Apache records auth info in logs as well, and it seems like a 
perfectly reasonable thing for a server to do.

http://httpd.apache.org/docs/2.0/logs.html#accesslog

[Phillip J. Eby]
  Perhaps an X-Authenticated-User: foo header could be added
  in a future spec version?  (And as an optional feature in the
  current PEP.)

[Jim Fulton]
  Perhaps. Note that it should be clear that this is solely for use
  in the access log.  There should be no assumption that this is
  a principal id or a login name.  It is really just a label for the
  log.  To make this clearer, I'd use something like:
  X-Access-User-Label: foo.

Sending X-headers seems hacky, and results in unnecessary information 
being transmitted back to the user (possibly revealing sensitive 
information, or opening security holes?)

I think that the communication mechanism for auth information is 
possibly best served by a simple convention between auth middleware 
authors. Perhaps servers that are aware that auth middleware is in use 
can put a callable into the WSGI environment, which auth middleware 
calls when it has auth'ed the user?
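
A minimal sketch of that convention. The environ key "example.report_user" is invented for this example and appears in no spec; auth_middleware, handle and the X-Token header are likewise hypothetical stand-ins for real auth middleware and a real server.

```python
AUTH_CALLBACK_KEY = "example.report_user"  # hypothetical key, not in any spec


def auth_middleware(app, token_to_label):
    """Hypothetical auth middleware: maps a token header to a user label."""

    def wrapper(environ, start_response):
        label = token_to_label.get(environ.get("HTTP_X_TOKEN"))
        report = environ.get(AUTH_CALLBACK_KEY)
        if label is not None and report is not None:
            report(label)  # tell the server, never the browser
        return app(environ, start_response)

    return wrapper


def handle(app, environ, access_log):
    """Stand-in for a server: offers the callback, then writes a log line."""
    seen = {"user": "-"}
    environ[AUTH_CALLBACK_KEY] = lambda label: seen.update(user=label)
    sent = {}

    def start_response(status, headers, exc_info=None):
        sent["status"], sent["headers"] = status, headers

    body = b"".join(app(environ, start_response))
    access_log.append("%s %s %s" % (seen["user"], environ["PATH_INFO"], sent["status"]))
    return sent["headers"], body


def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hi"]


log = []
app = auth_middleware(hello, {"s3cret": "alan"})
headers, body = handle(app, {"PATH_INFO": "/", "HTTP_X_TOKEN": "s3cret"}, log)
```

The user label reaches the access log through the environ callable, and the response headers sent to the client are untouched.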

[Phillip J. Eby]
  This seems a simpler way to incorporate the feature than adding
  an extension API to environ.

[Jim Fulton]
  Why is that?  Isn't the env meant for communication between
  the WSGI layers?  I'm not sure I'd want to send this information
  back to the browser.

I think an API could be very simple, and optional for servers that know 
they won't be logging auth information.

I agree about not sending this information back to the user: it's 
unnecessary and potentially dangerous.

Regards,

Alan Kennedy.


Re: [Web-SIG] Communicating authenticated user information

2006-01-22 Thread Alan Kennedy
[Alan Kennedy]
 I agree about not sending this information back to the user: it's
 unnecessary and potentially dangerous.

[Phillip J. Eby]
 Yep, it would be really dangerous to let me know who I just logged in to 
 an application as.  I might find out who I really am! ;)

Very droll ;-)

What if other information, such as meta-information about the auth 
directory or database in which the credentials were looked up, was also 
communicated through X-headers, e.g. server connection details, etc.?

Happy for that to go back to the user too?

If X-headers are to be used in WSGI, I think there should be something 
in the spec about whether or not they should be transmitted to the user.

Alan.