Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-21 Thread James Bennett
On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough chr...@plope.com wrote:
 WSGI is a fairly low-level protocol aimed at folks who need to interface a
 server to the outside world.  The outside world (by its nature) talks bytes.
  I fear that any implied conversion of environment values and iterable
 return values to Unicode will actually eventually make things harder than
 they are now.  I realize that it would make middleware implementors' lives
 harder to need to deal in bytes.  However, at this point, I also believe
 that middleware kinda should be hard.  We have way too much middleware that
 shouldn't be middleware these days (some written by myself).

Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an
interface to HTTP should deal in bytes as well.

The problem, really, is that despite being a very low-level interface,
WSGI has a tendency to leak up into much higher-level code, and (IMO)
authors of that high-level code really shouldn't have to waste their
time dealing with details of the underlying low-level gateway.

You've said you don't want to hear Python 3 as the reason, but it
provides some useful examples: in high-level code you'll commonly want
to be doing things like, say, comparing parts of the requested URL
path to known strings or patterns. And that high-level code will
almost certainly use strings, while WSGI, in theory, will be using
bytes. That's just a recipe for disaster; if WSGI mandates bytes, then
bytes will have to start infecting much higher-level code (since
Python 3 -- rightly -- doesn't let you be nearly as promiscuous about
mixing bytes and strings).
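
A small sketch of what that mismatch looks like on Python 3, assuming a hypothetical gateway that hands the request path over as bytes (the variable names are made up for illustration):

```python
# Hypothetical: a bytes-mandating gateway passes the request path as bytes.
path_info = b"/blog/2009/09/"

# Comparing bytes to str does not raise; the two are simply never equal,
# so this kind of bug stays silent.
matches_literal = (path_info == "/blog/2009/09/")

# Mixing the two types in a method call fails loudly instead.
try:
    path_info.startswith("/blog/")
    mixing_error = None
except TypeError as exc:
    mixing_error = exc

# The high-level code only works once it decodes (or switches to bytes itself).
decoded = path_info.decode("utf-8")
matches_after_decode = decoded.startswith("/blog/")
```

Either the decode step lives in every piece of high-level code, or the bytes literals do; that is the leakage described above.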

Once I'm at a point where I can use Python 3, I know I'll personally
be looking for some library which will normalize everything for me
before I interact with it, precisely to avoid this sort of leakage; if
WSGI itself would at least *allow* that normalization to happen at the
low level (mandating it is another discussion entirely) I'd feel much
happier about it going forward.
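
As a rough sketch of the kind of normalizing layer I mean (the helper name, the default charset and the error policy are all assumptions for illustration, not anything a WSGI spec defines):

```python
def decode_environ(environ, charset="utf-8", errors="replace"):
    """Return a copy of environ with every bytes value decoded to str.

    Hypothetical helper: the name, default charset and error policy are
    illustrative assumptions, not part of any WSGI specification.
    """
    return {
        key: value.decode(charset, errors) if isinstance(value, bytes) else value
        for key, value in environ.items()
    }


# A made-up environ, as a bytes-only gateway might supply it.
raw = {
    "PATH_INFO": b"/caf\xc3\xa9",
    "REQUEST_METHOD": b"GET",
    "wsgi.multithread": False,
}
clean = decode_environ(raw)
```

Code above this layer would only ever see str values; non-bytes entries pass through untouched.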


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-21 Thread James Bennett
On Mon, Sep 21, 2009 at 1:28 AM, Armin Ronacher
armin.ronac...@active-4.com wrote:
 If it was just that I would be happy to stay with bytes.  But unless the
 standard library changes in the way it works on Python 3 there is not
 much but unicode we can use.  bytes no longer behave like strings, it's
 not very comfortable to work with them.

Indeed. Hence my comments about WSGI leaking up into other code. Now
that bytes and strings are incompatible, a lot of code which relied on
(arguably) a wart in Python will break.


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.


[Web-SIG] PEP 333 and gzipping of responses

2009-08-10 Thread James Bennett
Earlier today I posted an article on my blog following up on some
discussions of WSGI; one criticism presented was of language in PEP
333 regarding gzipping of responses by WSGI applications. Ian posted a
comment which stated that the criticism was not correct, but I'm at a
loss to figure out what *is* correct, so I'll bring up the question
here.

In a parenthetical at the end of the section entitled "Handling the
Content-Length Header", PEP 333 states:

 Note: applications and middleware must not apply any kind of
 Transfer-Encoding to their output, such as chunking or gzipping; as
 hop-by-hop operations, these encodings are the province of the
 actual web server/gateway. See Other HTTP Features below, for more
 details.

In the section "Other HTTP Features", PEP 333 states, in part:

 However, because WSGI servers and applications do not communicate
 via HTTP, what RFC 2616 calls hop-by-hop headers do not apply to
 WSGI internal communications. WSGI applications must not generate
 any hop-by-hop headers [4], attempt to use HTTP features that
 would require them to generate such headers, or rely on the content
 of any incoming hop-by-hop headers in the environ dictionary.

My criticism of this is that this is at best ambiguous, and quite
possibly openly misleading to readers of the PEP.

The ambiguity here is that gzip is a valid value for the
Transfer-Encoding header in HTTP (RFC 2616, Sections 3.6 and 14.41),
but is also a valid value for the Content-Encoding header (RFC 2616,
Sections 3.5 and 14.11).
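
The two headers are treated very differently by the hop-by-hop rule. A quick check using wsgiref.util.is_hop_by_hop from the standard library, which implements the RFC 2616 hop-by-hop list (the variable names are just for illustration):

```python
from wsgiref.util import is_hop_by_hop

# Transfer-Encoding is on the hop-by-hop list, so PEP 333 keeps it
# out of the hands of applications and middleware.
transfer_is_hop = is_hop_by_hop("Transfer-Encoding")

# Content-Encoding is an end-to-end header and is not on that list.
content_is_hop = is_hop_by_hop("Content-Encoding")
```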

Web frameworks and libraries (in many languages, not just Python)
which support gzipping of responses all seem to opt for the latter
method. Additionally, Apache's mod_deflate -- which so far as I know
is overwhelmingly the most common mechanism for enabling gzipping at
the server level -- also opts for this method, and uses the
Content-Encoding header.
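
For concreteness, here is a minimal sketch of a WSGI application gzipping its own body and signalling it with Content-Encoding. The app, its body and the naive Accept-Encoding check (it ignores q-values) are assumptions for illustration, not code from any framework:

```python
import gzip
from wsgiref.util import setup_testing_defaults


def gzipped_app(environ, start_response):
    """Hypothetical WSGI app that compresses its own response body."""
    body = b"Hello, Web-SIG! " * 64
    headers = [("Content-Type", "text/plain")]
    # Naive check for illustration only; real negotiation must honour q-values.
    if "gzip" in environ.get("HTTP_ACCEPT_ENCODING", ""):
        body = gzip.compress(body)
        headers.append(("Content-Encoding", "gzip"))
        headers.append(("Vary", "Accept-Encoding"))
    headers.append(("Content-Length", str(len(body))))
    start_response("200 OK", headers)
    return [body]


# Drive the app by hand with a fake environ; no server involved.
environ = {"HTTP_ACCEPT_ENCODING": "gzip"}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


chunks = gzipped_app(environ, start_response)
```

No Transfer-Encoding header appears anywhere in this exchange, which is the usage the PEP text does not seem to anticipate.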

Given this, gzipping of responses seems to be rather universally
associated, in the minds of web developers, with the Content-Encoding
header, which is not a hop-by-hop header (RFC 2616, Section
13.5.1). As such, the immediate (and misleading) impression given to
readers of PEP 333 will likely be one of:

1. PEP 333 forbids applications using Content-Encoding to signal
   gzipped response bodies (since it mentions gzipping as something
   applications specifically must not do), or

2. PEP 333 is ambiguous or contradictory on account of mentioning
   Transfer-Encoding and hop-by-hop headers in a context in which
   no-one uses Transfer-Encoding or a hop-by-hop header, or

3. This text in PEP 333 is based upon a misunderstanding of this
   feature of HTTP or of its use in the real world.

None of these seem particularly good, and this is why I took that
section of the spec to task (albeit in a much briefer and more cursory
fashion, since this message is already starting to run a bit long).

If I'm misreading or misunderstanding either PEP 333 or RFC 2616, I'd
appreciate it if someone would explain where I've gone astray. But as
it stands, I believe the text of PEP 333 quoted above is problematic
and likely to lead to confusion, and (if I'm not misreading or
misunderstanding it) should probably be revised to address these
concerns.


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.


Re: [Web-SIG] WSGI 2

2009-08-04 Thread James Bennett
On Tue, Aug 4, 2009 at 11:05 AM, P.J. Eby p...@telecommunity.com wrote:
 1. Force all encodings to be explicit, and

This can be handled without forcing application authors to work with
bytestrings (or forcing them to remember to coerce to bytestrings
before returning responses).

 2. Ensure WSGI-HTTP equivalence (i.e., WSGI==HTTP encoded in Python
 objects)

TBH, WSGI doesn't expose enough of HTTP's functionality to convince me
that this is a good argument. When I can use advanced HTTP features
(chunked transfer and friends) from a WSGI app, maybe I'll feel
differently.

 Please remember that WSGI is not primarily intended to provide application
 developers with a convenient API; its first and most important job is to
 ship the data around without mangling it in the process.

Which it should try very hard to do without forcing *in*convenient
APIs onto developers.

 So I would ask, what is the practical use case for having the server decode
 bytes into strings, instead of leaving them as bytes?

Well, Django (for one example) already does some gymnastics to ensure
that character encoding issues are kept at the request/response
boundary, largely because it's an utter pain for an application
developer to have an API dump a bunch of bytestrings in their lap and
say "here, *you* figure it out." I suspect we're going to keep on
doing that, since it's a big win in terms of usability for application
developers (who end up having to deal with only a drastically-reduced
subset of character-encoding problems).
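
A minimal sketch of what keeping encoding at the response boundary can look like. The wrapper name and the sample app are hypothetical, and this is not how Django or any particular server actually does it:

```python
def encode_at_boundary(text_app, charset="utf-8"):
    """Wrap an app that yields str so the gateway only ever sees bytes.

    Hypothetical wrapper for illustration; the inner app is assumed to
    declare the same charset in its Content-Type header.
    """
    def wsgi_app(environ, start_response):
        for piece in text_app(environ, start_response):
            yield piece.encode(charset)
    return wsgi_app


def text_app(environ, start_response):
    # Application code works purely in text.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return ["na\u00efve ", "caf\u00e9"]


wrapped = encode_at_boundary(text_app)
body = b"".join(wrapped({}, lambda status, headers: None))
```

The encoding is still explicit, but it is stated once at the edge rather than in every view.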


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.


Re: [Web-SIG] WSGI 2

2009-08-04 Thread James Bennett
On Tue, Aug 4, 2009 at 11:54 AM, James Y Knight f...@fuhm.net wrote:
 But that works just fine today. Your WSGI app sends streaming data back
 using the iterator functionality, and the server automatically turns it into
 chunks if it's talking to an HTTP 1.1 client. What's the problem?

No, it doesn't work just fine today. Either the server has to assume
that every response from that application should be chunked (which is
wrong), or the application needs a way to tell the server to chunk.
Turns out HTTP has a way to indicate that, but WSGI outright forbids
its use. So instead you have to invent out-of-band mechanisms for the
application to tell the server what to do, and in the process reinvent
part of HTTP.
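
To make the situation concrete, a sketch of a streaming app (hypothetical names, driven by hand rather than by a real server). It can omit Content-Length, but nothing it is allowed to send says "chunk this":

```python
from wsgiref.util import is_hop_by_hop, setup_testing_defaults


def streaming_app(environ, start_response):
    """Hypothetical app that streams a body of unknown length."""
    # No Content-Length: the server has to decide how to frame the body.
    start_response("200 OK", [("Content-Type", "text/plain")])
    for n in range(3):
        yield ("line %d\n" % n).encode("ascii")


environ = {}
setup_testing_defaults(environ)
sent_headers = []
pieces = list(
    streaming_app(environ, lambda status, headers: sent_headers.extend(headers))
)

# The header HTTP itself uses to signal chunking is hop-by-hop,
# and therefore off limits to the application.
chunking_header_forbidden = is_hop_by_hop("Transfer-Encoding")
```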


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.


Re: [Web-SIG] Time a for JSON parser in the standard library?

2008-03-10 Thread James Bennett
On Mon, Mar 10, 2008 at 8:37 AM, Mark Ramm [EMAIL PROTECTED] wrote:
  I would definitely support the inclusion of a JSON library in the
  standard lib.   And, I think that it should be simplejson which is
  used by TurboGears, Pylons, and bundled with Django.

I'd tentatively agree, though I recall seeing a post not long ago
(which I am currently unable to find) from the author of jsonlib
lamenting the fact that most of the other JSON modules for Python had
various significant inconsistencies with the RFC. While claims from
authors of competing tools should be taken with a grain of salt, I do think
compliance with the spec is an important factor for any particular
module that might be blessed with stdlib membership, and so should
play a bigger role in any such decision than mere benchmark speed.
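
I can't reproduce the specific complaints from that post, so the following is only one illustration of the kind of deviation at stake. It uses the json module that later entered the standard library (derived from simplejson), and is not necessarily what the jsonlib author had in mind:

```python
import json

# By default the encoder emits NaN, which the JSON grammar in the RFC
# does not allow.
lenient = json.dumps(float("nan"))

# Strict behaviour is opt-in via allow_nan=False.
try:
    json.dumps(float("nan"), allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc
```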


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.


Re: [Web-SIG] WSGI, Python 3 and Unicode

2007-12-06 Thread James Bennett
On Dec 6, 2007 6:15 PM, Phillip J. Eby [EMAIL PROTECTED] wrote:
 WSGI already copes, actually.  Note that Jython and IronPython have
 this issue today, and see:

 http://www.python.org/dev/peps/pep-0333/#unicode-issues

I'm glad you brought that up, because it's been bugging me lately.

That section is somewhat ambiguous as-is, because in one sentence
applications are permitted to return strings encoded in a charset
other than ISO-8859-1, but in another they are unequivocally forbidden
to do so (with the "must not" in bold, even). And that's problematic
not only because of the ambiguity, but because the increasing
popularity of AJAX and web-based APIs is making it much more common
for WSGI applications to generate responses of types which do not
default to ISO-8859-1 -- e.g., XML and JSON, both of which default to
UTF-8.

Depending on how draconian one wishes to be when reading the relevant
section of WSGI, it's possible to conclude that XML and JSON must
always be transcoded/escaped to ISO-8859-1 -- with all the headaches
that entails -- before being passed to a WSGI-compliant piece of
software.
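
A sketch of that headache for JSON, written against a Unicode-string Python; the payload is made up for illustration:

```python
import json

payload = {"name": "\u96ea"}  # a character outside ISO-8859-1

# Keeping the real character: the text cannot be encoded as ISO-8859-1.
text = json.dumps(payload, ensure_ascii=False)
try:
    text.encode("iso-8859-1")
    latin1_error = None
except UnicodeEncodeError as exc:
    latin1_error = exc

# UTF-8, JSON's default, handles it without fuss.
utf8_body = text.encode("utf-8")

# The "escaped" route: \uXXXX escapes make the document pure ASCII,
# which is trivially valid ISO-8859-1 as well.
escaped_body = json.dumps(payload).encode("iso-8859-1")
```

Both bodies decode to the same data; the strict reading of the spec just forces the second, uglier form.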

And the slightly less strict reading of the spec -- that such
gymnastics are required only when the string type of the Python
implementation is Unicode-based -- will grow increasingly troublesome
as/when Py3K enters production use.

So as long as we're talking about this, could the proscriptions with
respect to encoding perhaps be revisited and (hopefully)
clarified/revised?

-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.


Re: [Web-SIG] more comments on Paste Deploy

2007-03-07 Thread James Bennett
On 3/7/07, Jim Fulton [EMAIL PROTECTED] wrote:
 Aside from the universal configuration file issue, I think this would
 be a terrific thing for us to focus on.  Something I hear a lot is
 how much easier PHP applications are to deploy to hosting providers.
 I would *love* it if Python had a similar story, even if only for
 smaller applications.

 I'd love to get some input from people who know a lot about what makes deploying
 PHP apps so easy.

I've mostly been lurking because everybody here's quite a bit smarter
than I am on most of the issues discussed, but in a past life I had a
fair amount of experience working with and deploying PHP, so I'll
throw in my $0.02.

PHP is (or was, when I was doing it) easy to deploy largely because
of two things:

1. mod_php.
2. Baked-in database libraries.

Everybody already knows that web-server setup is a wart for Python
(and the discussion on that lately has been encouraging), so I won't
dwell on it except to say that I live for the day I'll be able to drop
my Apache -> mod_proxy -> lighttpd -> Unix socket -> FastCGI -> WSGI
-> Django setup (this on a Python-friendly shared host, no less) and
have a server configuration that's simpler than the blog app it runs.

The database issue is one that seems to get overlooked a bit, but is
also a killer. PHP gives you SQLite and MySQL support for free, and
Postgres is trivially easy to add if a host is offering Postgres
databases. Meanwhile, most hosts are still on Python 2.3 or 2.4, so
you don't even get SQLite out-of-the-box (the sqlite3 module only
arrived in the standard library with 2.5). The better ones will have
appropriate DB modules installed anyway, but that still seems to be
something of a crap shoot, and somebody who has to build their own
copy of mysqldb to use Python on their hosting account is somebody
who's not going to use Python on their hosting account.

I'm hoping that the ongoing framework hype will help a lot with the
database issue, though; a number of hosting companies right now seem
to be waking up and realizing that there's a lot of money to be made
from framework converts who need solid support for languages that
aren't PHP.

I'd say that if/when these two issues are overcome, or even made
slightly less nasty to deal with, there's not really anything else PHP
can compete on; WSGI and the ever-expanding range of kick-ass web
tools Python offers blow PHP out of the water. To take an easy
example, cruft-free URLs are still anywhere from tedious to nasty under
PHP; you have to fiddle with mod_rewrite, and every PHP project has
its own monolithic URL dispatch system. On the Python side, WSGI and
tools like Paste Deploy make it trivially easy to hang any app anywhere you
want it in your URL scheme.
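
To show why this is easy on the WSGI side, here is a hand-rolled sketch of mounting an app under a URL prefix. It is not Paste Deploy's implementation; the mount() helper and the sample apps are hypothetical, and only shift_path_info and setup_testing_defaults come from the standard library:

```python
from wsgiref.util import setup_testing_defaults, shift_path_info


def mount(apps, default):
    """Dispatch on the first path segment (hypothetical helper)."""
    def dispatcher(environ, start_response):
        first = environ.get("PATH_INFO", "").lstrip("/").split("/", 1)[0]
        app = apps.get(first)
        if app is None:
            return default(environ, start_response)
        # Move the matched segment from PATH_INFO to SCRIPT_NAME so the
        # mounted app sees URLs relative to its own root.
        shift_path_info(environ)
        return app(environ, start_response)
    return dispatcher


def blog(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    message = "blog sees %s under %s" % (environ["PATH_INFO"], environ["SCRIPT_NAME"])
    return [message.encode("ascii")]


def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


site = mount({"blog": blog}, default=not_found)

environ = {"PATH_INFO": "/blog/2007/", "SCRIPT_NAME": ""}
setup_testing_defaults(environ)
result = b"".join(site(environ, lambda status, headers: None))
```

The blog app never needs to know where it was hung; the dispatcher adjusts SCRIPT_NAME and PATH_INFO for it.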


And setting aside actual technical issues, I also think there's room
to work with documentation; going back to Jim's comment at the PyCon
frameworks panel about documentation that tells stories, it's worth
pointing out that a lot of the "PHP is easier" perception is largely
just that -- a perception -- and that various languages and tools, PHP
included, have compensated for some pretty nasty warts by telling
compelling stories (Rails certainly wouldn't be where it is today if
not for some great storytelling on the part of the people marketing
it). I'm sure we have plenty of good stories we could tell, and I'm
pretty sure we don't have as many warts :)


-- 
Bureaucrat Conrad, you are technically correct -- the best kind of correct.