Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-21 Thread James Bennett
On Mon, Sep 21, 2009 at 10:19 AM, P.J. Eby wrote:
> +1.  I'd really rather not have the spec dictated by the need to work around
> problems in the stdlib or language definition.  Better to fix them ASAP.

This is a *Python* web server gateway interface, yes? Fixing stdlib
bugs is fine, but asking for the language to change just to make
gateway interfaces a bit easier to write seems a bit much; I'd hope we
can take Python the language as granted, and work from there.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-20 Thread James Bennett
On Mon, Sep 21, 2009 at 1:28 AM, Armin Ronacher wrote:
> If it was just that I would be happy to stay with bytes.  But unless the
> standard library changes in the way it works on Python 3 there is not
> much but unicode we can use.  bytes no longer behave like strings, it's
> not very comfortable to work with them.

Indeed. Hence my comments about WSGI leaking up into other code. Now
that bytes and strings are incompatible, a lot of code which relied on
(arguably) a wart in Python will break.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-20 Thread James Bennett
On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough wrote:
> WSGI is a fairly low-level protocol aimed at folks who need to interface a
> server to the outside world.  The outside world (by its nature) talks bytes.
>  I fear that any implied conversion of environment values and iterable
> return values to Unicode will actually eventually make things harder than
> they are now.  I realize that it would make middleware implementors lives
> harder to need to deal in bytes.  However, at this point, I also believe
> that middleware kinda should be hard.  We have way too much middleware that
> shouldn't be middleware these days (some written by myself).

Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an
interface to HTTP should deal in bytes as well.

The problem, really, is that despite being a very low-level interface,
WSGI has a tendency to leak up into much higher-level code, and (IMO)
authors of that high-level code really shouldn't have to waste their
time dealing with details of the underlying low-level gateway.

You've said you don't want to hear "Python 3" as the reason, but it
provides some useful examples: in high-level code you'll commonly want
to be doing things like, say, comparing parts of the requested URL
path to known strings or patterns. And that high-level code will
almost certainly use strings, while WSGI, in theory, will be using
bytes. That's just a recipe for disaster; if WSGI mandates bytes, then
bytes will have to start "infecting" much higher-level code (since
Python 3 -- rightly -- doesn't let you be nearly as promiscuous about
mixing bytes and strings).
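
To make the mismatch concrete, here is a minimal sketch of what happens
in Python 3 when high-level code compares a bytes path against an
ordinary string. The path value, the helper name starts_with_prefix and
the choice of UTF-8 for decoding are assumptions made for illustration;
none of them comes from WSGI itself.

```python
# Python 3: bytes and str never compare equal, and mixing them in
# string methods raises TypeError instead of silently coercing.
raw_path = b"/blog/2009/09/"   # what a bytes-only gateway might hand over
known_prefix = "/blog/"        # what high-level code naturally writes


def starts_with_prefix(path, prefix):
    """Return True/False, or the exception name if the types cannot mix."""
    try:
        return path.startswith(prefix)
    except TypeError as exc:
        return type(exc).__name__


mixed_equal = (raw_path == known_prefix)                    # no error, just False
mixed_prefix = starts_with_prefix(raw_path, known_prefix)   # types cannot mix

# Normalizing once at the boundary lets everything above it use str.
# (UTF-8 is an assumed encoding here, not something WSGI specifies.)
text_path = raw_path.decode("utf-8")
clean_prefix = starts_with_prefix(text_path, known_prefix)
```

The equality test fails quietly while the prefix test fails loudly, so
the same mistake can show up either as a bug or as a crash depending on
which operation the high-level code happens to use.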

Once I'm at a point where I can use Python 3, I know I'll personally
be looking for some library which will normalize everything for me
before I interact with it, precisely to avoid this sort of leakage; if
WSGI itself would at least *allow* that normalization to happen at the
low level (mandating it is another discussion entirely) I'd feel much
happier about it going forward.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


[Web-SIG] PEP 333 and gzipping of responses

2009-08-10 Thread James Bennett
Earlier today I posted an article on my blog following up on some
discussions of WSGI; one criticism presented was of language in PEP
333 regarding gzipping of responses by WSGI applications. Ian posted a
comment which stated that the criticism was not correct, but I'm at a
loss to figure out what *is* correct, so I'll bring up the question
here.

In a parenthetical at the end of the section entitled "Handling the
Content-Length Header", PEP 333 states:

> Note: applications and middleware must not apply any kind of
> Transfer-Encoding to their output, such as chunking or gzipping; as
> "hop-by-hop" operations, these encodings are the province of the
> actual web server/gateway. See Other HTTP Features below, for more
> details.

In the section "Other HTTP Features", PEP 333 states, in part:

> However, because WSGI servers and applications do not communicate
> via HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to
> WSGI internal communications. WSGI applications must not generate
> any "hop-by-hop" headers [4], attempt to use HTTP features that
> would require them to generate such headers, or rely on the content
> of any incoming "hop-by-hop" headers in the environ dictionary.

My criticism is that this language is at best ambiguous, and quite
possibly openly misleading to readers of the PEP.

The ambiguity here is that "gzip" is a valid value for the
Transfer-Encoding header in HTTP (RFC 2616, Sections 3.6 and 14.41),
but is also a valid value for the Content-Encoding header (RFC 2616,
Sections 3.5 and 14.11).

Web frameworks and libraries (in many languages, not just Python)
which support gzipping of responses all seem to opt for the latter
method. Additionally, Apache's mod_deflate -- which so far as I know
is overwhelmingly the most common mechanism for enabling gzipping at
the server level -- also opts for this method, and uses the
Content-Encoding header.
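
As a rough illustration of the Content-Encoding approach described
above, here is a minimal sketch of a WSGI application that compresses
its own body. The names gzip_app and PLAIN_BODY are hypothetical, the
Accept-Encoding check is deliberately naive, and whether PEP 333 means
to permit this is exactly the question under discussion.

```python
import gzip

PLAIN_BODY = b"Hello, Web-SIG! " * 64


def gzip_app(environ, start_response):
    """Hypothetical WSGI app that gzips its body when the client allows it."""
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    body = PLAIN_BODY
    # Naive substring check; a real application would parse q-values.
    if "gzip" in environ.get("HTTP_ACCEPT_ENCODING", ""):
        body = gzip.compress(body)
        # Content-Encoding is end-to-end: it describes the entity itself.
        headers.append(("Content-Encoding", "gzip"))
        headers.append(("Vary", "Accept-Encoding"))
    # No Transfer-Encoding header is set anywhere: that one is
    # hop-by-hop and is left to the server or gateway.
    headers.append(("Content-Length", str(len(body))))
    start_response("200 OK", headers)
    return [body]
```

Nothing in this sketch touches a hop-by-hop header, which is why the
mention of gzipping next to Transfer-Encoding in the PEP reads so oddly.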

Given this, gzipping of responses seems to be rather universally
associated, in the minds of web developers, with the Content-Encoding
header, which is not a "hop-by-hop" header (RFC 2616, Section
13.5.1). As such, the immediate (and misleading) impression given to
readers of PEP 333 will likely be one of:

1. PEP 333 forbids applications using Content-Encoding to signal
   gzipped response bodies (since it mentions gzipping as something
   applications specifically must not do), or

2. PEP 333 is ambiguous or contradictory on account of mentioning
   Transfer-Encoding and "hop-by-hop" headers in a context in which
   no-one uses Transfer-Encoding or a "hop-by-hop" header, or

3. This text in PEP 333 is based upon a misunderstanding of this
   feature of HTTP or of its use in the real world.

None of these seem particularly good, and this is why I took that
section of the spec to task (albeit in a much briefer and more cursory
fashion, since this message is already starting to run a bit long).

If I'm misreading or misunderstanding either PEP 333 or RFC 2616, I'd
appreciate it if someone would explain where I've gone astray. But as
it stands, I believe the text of PEP 333 quoted above is problematic
and likely to lead to confusion, and (if I'm not misreading or
misunderstanding it) should probably be revised to address these
concerns.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] WSGI 2

2009-08-04 Thread James Bennett
On Tue, Aug 4, 2009 at 11:54 AM, James Y Knight wrote:
> But that works just fine today. Your WSGI app sends streaming data back
> using the iterator functionality, and the server automatically turns it into
> chunks if it's talking to an HTTP 1.1 client. What's the problem?

No, it doesn't work just fine today. Either the server has to assume
that every response from that application should be chunked (which is
wrong), or the application needs a way to tell the server to chunk.
Turns out HTTP has a way to indicate that, but WSGI outright forbids
its use. So instead you have to invent out-of-band mechanisms for the
application to tell the server what to do, and in the process reinvent
part of HTTP.
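
For reference, the situation being argued about looks roughly like the
sketch below: an application streams a body of unknown length through
the iterator interface, and nothing in what it hands to the server says
how the response should be framed. The name streaming_app is
hypothetical, and what a given server does with such a response is
server-specific.

```python
def streaming_app(environ, start_response):
    """Hypothetical WSGI app that streams a body of unknown length."""
    # No Content-Length is set, and the application is not allowed to
    # set Transfer-Encoding itself, so the framing decision (chunked,
    # close-delimited, buffered) is made entirely by the server.
    start_response("200 OK", [("Content-Type", "text/plain")])

    def body():
        for i in range(3):
            yield ("line %d\n" % i).encode("ascii")

    return body()
```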


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] WSGI 2

2009-08-04 Thread James Bennett
On Tue, Aug 4, 2009 at 11:05 AM, P.J. Eby wrote:
> 1. Force all encodings to be explicit, and

This can be handled without forcing application authors to work with
bytestrings (or forcing them to remember to coerce to bytestrings
before returning responses).

> 2. Ensure WSGI<->HTTP equivalence (i.e., WSGI==HTTP encoded in Python
> objects)

TBH, WSGI doesn't expose enough of HTTP's functionality to convince me
that this is a good argument. When I can use advanced HTTP features
(chunked transfer and friends) from a WSGI app, maybe I'll feel
differently.

> Please remember that WSGI is not primarily intended to provide application
> developers with a convenient API; its first and most important job is to
> ship the data around without mangling it in the process.

Which it should try very hard to do without forcing *in*convenient
APIs onto developers.

> So I would ask, what is the practical use case for having the server decode
> bytes into strings, instead of leaving them as bytes?

Well, Django (for one example) already does some gymnastics to ensure
that character encoding issues are kept at the request/response
boundary, largely because it's an utter pain for an application
developer to have an API dump a bunch of bytestrings in your lap and
say "here, *you* figure it out". I suspect we're going to keep on
doing that, since it's a big win in terms of usability for application
developers (who end up having to deal with only a drastically-reduced
subset of character-encoding problems).
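
As a sketch of what keeping encoding at the boundary can look like
(this is not Django's actual code; decode_query and its UTF-8 default
are assumptions for illustration), a single helper turns the raw query
string into text once, and everything above it only ever sees str:

```python
from urllib.parse import parse_qs


def decode_query(raw_query, charset="utf-8"):
    """Hypothetical boundary helper: bytes in, str out, exactly once."""
    # A percent-encoded query string is ASCII on the wire; the charset
    # only matters when the %XX escapes are turned back into text.
    text = raw_query.decode("ascii", "replace")
    return parse_qs(text, encoding=charset, errors="replace")


params = decode_query(b"name=J%C3%BCrgen&tag=wsgi&tag=python")
```

Application code that receives params never has to know which charset
was involved, which is the usability win being described.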


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] Time for a JSON parser in the standard library?

2008-03-10 Thread James Bennett
On Mon, Mar 10, 2008 at 9:32 PM, Guido van Rossum wrote:
>  Well, so fix this. How hard can it be?

A bit of poking around turned up the post I was looking for:

http://jmillikin.blogspot.com/2008/02/python-json-catastrophe.html

Seems like his beef with simplejson is mostly Unicode/encoding
handling; the floating-point stuff is a bit more debatable with respect
to the spec, because RFC 4627 doesn't say anything about how to handle these
aside from saying that a "number" in JSON is allowed to contain a
decimal point followed by more digits.
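
To show why the floating-point question is a judgment call rather than
a spec violation, here is a small sketch using the json module that
later shipped in the standard library (it descends from simplejson).
This is not taken from the linked post; it only illustrates that the
same valid JSON number can reasonably be decoded in more than one way.

```python
import json
from decimal import Decimal

# The grammar only says "1.10" is a valid number; it does not say what
# in-memory type a decoder should produce for it.
as_float = json.loads("1.10")                         # binary floating point
as_decimal = json.loads("1.10", parse_float=Decimal)  # keeps the digits as written
```

Both results are defensible readings of the same document, which is why
this part of the complaint is harder to settle than the Unicode part.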

Since the post is only a couple weeks old, I'm assuming that the
Unicode stuff is current, so if the consensus is in favor of
simplejson I suppose that'd be the area to concentrate on.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] Time for a JSON parser in the standard library?

2008-03-10 Thread James Bennett
On Mon, Mar 10, 2008 at 8:37 AM, Mark Ramm wrote:
>  I would definitely support the inclusion of a JSON library in the
>  standard lib.   And, I think that it should be simplejson which is
>  used by TurboGears, Pylons, and bundled with Django.

I'd tentatively agree, though I recall seeing a post not long ago
(which I am currently unable to find) from the author of jsonlib
lamenting the fact that most of the other JSON modules for Python had
various significant inconsistencies with the RFC. While claims by authors
of competing tools should be taken with a grain of salt, I do think
compliance with the spec is an important factor for any particular
module that might be blessed with stdlib membership, and so should
play a bigger role in any such decision than mere benchmark speed.


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] WSGI, Python 3 and Unicode

2007-12-06 Thread James Bennett
On Dec 6, 2007 6:15 PM, Phillip J. Eby wrote:
> WSGI already copes, actually.  Note that Jython and IronPython have
> this issue today, and see:
>
> http://www.python.org/dev/peps/pep-0333/#unicode-issues

I'm glad you brought that up, because it's been bugging me lately.

That section is somewhat ambiguous as-is, because in one sentence
applications are permitted to return strings encoded in a charset
other than ISO-8859-1, but in another they are unequivocally forbidden
to do so (with the "must not" in bold, even). And that's problematic
not only because of the ambiguity, but because the increasing
popularity of "AJAX" and web-based APIs is making it much more common
for WSGI applications to generate responses of types which do not
default to ISO-8859-1 -- e.g., XML and JSON, both of which default to
UTF-8.

Depending on how draconian one wishes to be when reading the relevant
section of WSGI, it's possible to conclude that XML and JSON must
always be transcoded/escaped to ISO-8859-1 -- with all the headaches
that entails -- before being passed to a WSGI-compliant piece of
software.
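
A short sketch of the headache in question, using the json module that
later shipped in the standard library; the payload is made up for
illustration. A JSON document containing a character outside
ISO-8859-1 encodes fine as UTF-8, cannot be encoded as ISO-8859-1 at
all, and only fits through an ISO-8859-1-only channel if every
non-ASCII character is escaped first.

```python
import json

payload = {"title": "\u2603 snowman"}   # U+2603 is outside ISO-8859-1

# The natural encoding for JSON: works for any character.
utf8_body = json.dumps(payload, ensure_ascii=False).encode("utf-8")

# The strict reading: the same text simply cannot be encoded.
try:
    json.dumps(payload, ensure_ascii=False).encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# The workaround: escape everything non-ASCII before encoding.
escaped_body = json.dumps(payload, ensure_ascii=True).encode("iso-8859-1")
```

The escaped form round-trips, but it forces every producer of JSON or
XML to care about a transport detail that the format itself does not
require.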

And the slightly less strict reading of the spec -- that such
gymnastics are required only when the string type of the Python
implementation is Unicode-based -- will grow increasingly troublesome
as/when Py3K enters production use.

So as long as we're talking about this, could the proscriptions with
respect to encoding perhaps be revisited and (hopefully)
clarified/revised?

-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."


Re: [Web-SIG] more comments on Paste Deploy

2007-03-07 Thread James Bennett
On 3/7/07, Jim Fulton wrote:
> Aside from the universal configuration file issue, I think this would
> be a terrific thing for us to focus on.  Something I hear a lot is
> how much easier PHP applications are to deploy to hosting providers.
> I would *love* it if Python had a similar story, even if only for
> smaller applications.
>
> I'd love to get some input from people who know a lot about what makes deploying
> PHP apps so easy.

I've mostly been lurking because everybody here's quite a bit smarter
than I am on most of the issues discussed, but in a past life I had a
fair amount of experience working with and deploying PHP, so I'll
throw in my $0.02.

PHP is (or was, when I was doing it) "easy to deploy" largely because
of two things:

1. mod_php.
2. Baked-in database libraries.

Everybody already knows that web-server setup is a wart for Python
(and the discussion on that lately has been encouraging), so I won't
dwell on it except to say that I live for the day I'll be able to drop
my Apache -> mod_proxy -> lighttpd -> Unix socket -> FastCGI -> WSGI
-> Django setup (this on a "Python-friendly" shared host, no less) and
have a server configuration that's simpler than the blog app it runs.

The database issue is one that seems to get overlooked a bit, but is
also a killer. PHP gives you SQLite and MySQL support for free, and
Postgres is trivially easy to add if a host is offering Postgres
databases. Meanwhile, most hosts are still on Python 2.3 or 2.4, so
you don't even get SQLite out-of-the-box. The better ones will have
appropriate DB modules installed anyway, but that still seems to be
something of a crap shoot, and somebody who has to build their own
copy of mysqldb to use Python on their hosting account is somebody
who's not going to use Python on their hosting account.

I'm hoping that the ongoing framework hype will help a lot with the
database issue, though; a number of hosting companies right now seem
to be waking up and realizing that there's a lot of money to be made
from framework converts who need solid support for languages that
aren't PHP.

I'd say that if/when these two issues are overcome, or even made
slightly less nasty to deal with, there's not really anything else PHP
can compete on; WSGI and the ever-expanding range of kick-ass web
tools Python offers blow PHP out of the water. To take an easy
example, cruft-free URLs are still anywhere from tedious to nasty under
PHP; you have to fiddle with mod_rewrite, and every PHP project has
its own monolithic URL dispatch system. On the Python side, WSGI and
tools like Paste Deploy make it trivially easy to hang any app anywhere you
want it in your URL scheme.
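
To give a feel for how little machinery this takes, here is a minimal
sketch of prefix-based mounting in plain WSGI. It is not how Paste
Deploy itself is implemented; mount, make_app and the example prefixes
are all hypothetical names chosen for illustration.

```python
def make_app(name):
    """Build a trivial WSGI app that reports its name and the path it sees."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [("%s:%s" % (name, environ["PATH_INFO"])).encode("utf-8")]
    return app


def mount(apps, default):
    """Hypothetical dispatcher: route by URL prefix to independent WSGI apps."""
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        # Try longer prefixes first so "/blog/admin" would win over "/blog".
        for prefix in sorted(apps, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME so
                # the mounted app sees itself as living at its own root.
                child = dict(environ)
                child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                child["PATH_INFO"] = path[len(prefix):]
                return apps[prefix](child, start_response)
        return default(environ, start_response)
    return dispatcher


site = mount({"/blog": make_app("blog"), "/wiki": make_app("wiki")},
             default=make_app("home"))
```

Because each mounted app only looks at SCRIPT_NAME and PATH_INFO, it
needs no knowledge of where it was hung in the overall URL scheme.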


And setting aside actual technical issues, I also think there's room
to work with documentation; going back to Jim's comment at the PyCon
frameworks panel about documentation that tells stories, it's worth
pointing out that a lot of the "PHP is easier" perception is largely
just that -- a perception -- and that various languages and tools, PHP
included, have compensated for some pretty nasty warts by telling
compelling stories (Rails certainly wouldn't be where it is today if
not for some great storytelling on the part of the people marketing
it). I'm sure we have plenty of good stories we could tell, and I'm
pretty sure we don't have as many warts :)


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."