Re: [Web-SIG] transaction progress with cgi.FieldStorage

2005-12-30 Thread Chris McDonough
> An aside on cgi.FieldStorage itself. It reads data using readline
> instead of reading in blocks of limited size. Doing this, I think, means
> a file with very long lines (20MB, 100MB, ...) could cause excessive
> memory consumption.

This was reported and solved a long time ago (but not yet fixed in  
any Python distro):

https://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470
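
For illustration only, here is a minimal sketch of the general technique
(passing a size limit to readline so that no single call can buffer an
arbitrarily long line).  It is not the patch itself; the helper name
iter_bounded_lines and the 64 KiB limit are assumptions made up for this
example.

```python
import io


def iter_bounded_lines(stream, max_chunk=1 << 16):
    """Yield pieces of at most max_chunk bytes from stream.

    A line longer than max_chunk comes back in several pieces, so memory
    use per call stays bounded even if the input has no newlines at all.
    """
    while True:
        chunk = stream.readline(max_chunk)
        if not chunk:
            break
        yield chunk


# A 1 MB "line" with no newline inside it, followed by a short line.
data = b"x" * (1 << 20) + b"\nshort\n"
chunks = list(iter_bounded_lines(io.BytesIO(data)))
largest = max(len(c) for c in chunks)
```

The caller has to be prepared for a "line" that does not end in a newline,
which is the price of the bound.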

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Standardized template API

2006-02-01 Thread Chris McDonough
One specific concern about the "returning the published object" for  
publisher-based frameworks is that often the published object has  
references to other objects that might not make sense in the context  
of the thread handling the rendering of the template.  For example,  
if you're using a thread pool behind a Twisted server, and the thing  
doing the rendering is in "the main thread", methods hanging off of  
the "published object" might try to make use of thread-local storage,  
which would fail.  Zope 3 uses thread-local storage for request  
objects, IIRC.
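
A minimal sketch of the failure mode, using only the standard threading
module.  The set_request/get_request helpers are hypothetical stand-ins
for a framework's per-thread request registry; this is not Zope 3's
actual API.

```python
import threading

# Hypothetical stand-in for a framework's per-thread request registry.
_local = threading.local()


def set_request(request):
    _local.request = request


def get_request():
    # Returns None when the current thread never stored a request.
    return getattr(_local, "request", None)


seen = {}


def worker():
    # The pool thread that handled publishing stores the request...
    set_request({"path": "/index"})
    seen["worker"] = get_request()


t = threading.Thread(target=worker)
t.start()
t.join()

# ...but the "main thread" doing the rendering cannot see it.
seen["main"] = get_request()
```

Any method on the published object that calls get_request() from the
rendering thread would get None here (or an AttributeError without the
getattr default).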

This might be a nonissue, because I'm a little fuzzy on which  
component(s) actually do(es) the rendering of the template in the  
models being proposed.  But the amount of fuzziness I have about  
what's trying to be specified here makes me wonder if there aren't  
better things to go specify.

>
> As I mentioned in my counter-proposal, there should probably be a key like
> 'wti.source' to contain either the object to be published (for
> publisher-oriented frameworks) or a dictionary of variables (for
> controller-oriented frameworks).  I originally called it "published
> object", but that's biased towards publisher frameworks so perhaps a more
> neutral name like 'source' or 'data' would be more appropriate.


Re: [Web-SIG] My original template API proposal

2006-02-06 Thread Chris McDonough
Although I've been trying to follow this thread, I'm finding it  
difficult to get a handle on what is meant to *call* the template API  
(e.g. what typically calls "render" in Ian's ITemplatePlugin  
interface at
http://svn.pythonpaste.org/home/ianb/templateapi/interface.py)?  Is the
framework meant to call "render"?

Sorry for the remedial question ;-)

- C


On Feb 5, 2006, at 5:19 PM, Phillip J. Eby wrote:

> At 02:46 PM 2/5/2006 -0600, Ian Bicking wrote:
>> Ian Bicking wrote:
>>>   def render(template_instance, vars, format="html", fragment=False):
>>
>> Here I can magically turn this into a WEB templating spec:
>>
>> def render(template_instance, vars, format="html", fragment=False,
>>            wsgi_environ=None, set_header_callback=None)
>>
>> wsgi_environ is the environ dictionary if this is being called in a WSGI
>> context.  set_header_callback can be called like
>> set_header_callback(header_name, header_value) to write such a header to
>> the response.  Frameworks may or may not allow for setting headers.  If
>> they don't allow for it, they shouldn't provide that callback (thus
>> headers will not be mysteriously thrown away -- instead they will be
>> rejected immediately).  [Should set_header_callback('Status', '404 Not
>> Found') be used, or a separate callback, or...?]
>>
>> This follows what all "server pages" templates I know of do.  That is,
>> they do not have special syntax related to any metadata (i.e., headers)
>> or even any special syntax related to web requests.  Instead the web
>> request is represented through some set of variables available in the
>> template.
>
> Yes, but different template systems offer different APIs based on it; the
> idea of using WSGI here was to make it possible for them to offer their
> *own*, native APIs under this spec, not to force the use of the host
> framework's API.
>
> The only thing that's missing from your proposal is streaming control or
> large file support.  I'll agree that it's an edge use case, but it seems to
> me just as easy to just offer a plain WSGI interface and not have to
> document a bunch of differences and limitations.  OTOH, if this is what it
> takes to get consensus, so be it.
>
> The additional advantage to using plain ol' WSGI as the calling interface,
> however, is that it also lets you embed *anything* as a template, including
> whole applications if they provide a "template engine" whose syntax is
> actually the application's configuration.
>
> Anyway, the only differences I'm aware of between what you're proposing and
> what I'm proposing are:
>
> 1. Syntax sugar (each proposal sweetens different use cases)
> 2. Feature restrictions (yours takes away streaming)
> 3. What's optional (you consider WSGI optional, I want strings to be optional)
>
> It would be better, I think, to address further discussion to addressing
> the actual points of difference.
>
> Regarding #2, I'm willing to compromise to get consensus.  Regarding #3,
> I'd be willing to compromise by making *both* optional, with clearly
> defined variations of the spec so that plugins and frameworks that support
> each are clearly distinguishable.  This would also mean that we'd both be
> able to get the syntaxes we want under #1.


Re: [Web-SIG] WSGI in standard library

2006-02-12 Thread Chris McDonough
On Feb 12, 2006, at 6:39 AM, Alan Kennedy wrote:
> So, I still think that only basic educational/playpen servers
> should go in the standard library, with an indication that the user
> should pick an openly server from outside the distro if they require to
> do serious server work.

I agree 100%.

>
> Maybe if there were no "production-ready" servers in the standard
> library, there would be no need for a "Python Security Response Team".

As an example, it's currently possible to mount a denial-of-service
attack on any framework/server that uses cgi.FieldStorage.  See
http://sourceforge.net/tracker/?func=detail&aid=1112549&group_id=5470&atid=105470 .
That module probably doesn't belong in the stdlib in the first
place, but it's in there, and now things depend on it.

In the meantime, this patch *really* should have been applied by now  
but hasn't been.  If anyone has checkin access, or can help me poke  
the appropriate person, it would help... this was reported to the SRT  
at the time.

- C



Re: [Web-SIG] html dom like javascript ?

2006-08-17 Thread Chris McDonough
You probably want elementtree (http://effbot.org/zone/element-index.htm).
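
As a rough sketch of what that looks like, here is the JavaScript from
the quoted question (below) redone with ElementTree, which has shipped in
the standard library as xml.etree.ElementTree since Python 2.5.  The
helper name hidden_input and the "row1" control name are made up for the
example.

```python
import xml.etree.ElementTree as ET


def hidden_input(ctl, checked):
    """Build a hidden <input> element, roughly as the quoted JavaScript does."""
    hidden = ET.Element("input")
    hidden.set("type", "hidden")
    hidden.set("name", "active_flag_hidden_" + ctl)
    hidden.set("value", "N" if checked else "Y")
    return hidden


form = ET.Element("form", {"name": "listForm"})
form.append(hidden_input("row1", checked=True))

# Serialize the tree to a text string.
markup = ET.tostring(form, encoding="unicode")
```

ElementTree knows nothing about which attributes are valid for which HTML
element; it only builds and serializes the tree.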

>
> Thanks for the rapid reply.  I am familiar with a number of these and
> have searched the web documentation but for the most part these appear
> to be parsers or things like:
>
> http://www.acooke.org/andrew/writing/python-xml.html#code
>
> That are xml centric and not html related.   I'm looking for something
> that is more html specific that contains all the options for any html
> widget, like a form element with all of its options like style, css,
> and so forth.  In other words I don't want to have to write my own xml
> file with all the html tags and options.
>
>
>
> Jean-Paul Calderone wrote:
>> On Thu, 17 Aug 2006 10:10:47 -0400, seth <[EMAIL PROTECTED]> wrote:
>>> Is there a python library which is analogous to javascript for creating
>>> html/xhtml documents? e.g.:
>>>
>>> hidden = document.createElement("input")
>>> hidden.setAttribute("type", "hidden")
>>> hidden.setAttribute("name", "active_flag_hidden_" + ctl)
>>> if( dirtyArray[ctl].checked == true) {
>>>    hidden.setAttribute("value", 'N')
>>> } else {
>>>    hidden.setAttribute("value", 'Y')
>>> }
>>> document.forms['listForm'].appendChild(hidden)
>>
>> At least fifty.  The DOM API is heavily standardized with hundreds of
>> implementations in dozens of languages.
>>
>> http://python.org/doc/lib/module-xml.dom.html
>>
>> Jean-Paul
>>
>


Re: [Web-SIG] Web Site Process Bus

2007-06-26 Thread Chris McDonough
On Jun 26, 2007, at 1:04 AM, Graham Dumpleton wrote:
> In Apache changing the certificates would need a complete restart of
> everything. Because the child processes aren't privileged they would
> not be able to trigger the main server to do so. This actually gets to
> one of my reservations about some of the stuff being discussed. That
> is, that the WSGI applications should even have any ability to control
> the underlying web server. In a shared web hosting environment using
> Apache, allowing such control is not practical as you don't want
> arbitrary users doing things to the server. If you are running Apache
> as a dedicated server for a single application that is a different
> matter however. Thus some aspects of what can be done via the bus
> would have to be controllable dependent on the environment in which
> one is running.
>
> At least with Apache, even initiating this sort of stuff from inside
> of a WSGI application may not make a great deal of sense even then. It
> would be far easier and preferable in Apache to use a suexec CGI
> script to accept the upload of the SSL certificate and then trigger a
> restart of Apache. So in the end the bus concept may be great for pure
> Python system, but not so sure about a complicated mixed code system
> like Apache, especially where there may be better ways of handling it
> through other features of Apache.

There are also non-webbish processes like postgres, mysql, etc. that  
need to be treated as "part of the application".

I handle this currently by running all of the processes related to a
specific project under a process controller (which happens to be
implemented in Python, but that's beside the point; see
http://www.plope.com/software/supervisor2/).  The process controller is
responsible for execing the child processes upon its own startup.  It
is also responsible for restarting children if they die, capturing
their output (if any), and allowing sufficiently privileged users to
start and stop each one independently.  The only promise a subprocess
must make to be managed is that it must be possible to start the
process "in the foreground" (not under its own daemon manager).
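
To make the "foreground" requirement concrete, here is a toy sketch of
the restart loop such a controller runs.  It is not supervisor's
implementation; the supervise function and its max_restarts parameter are
invented for the example, and a real controller would also capture
output, handle signals, and back off between restarts.

```python
import subprocess
import sys


def supervise(argv, max_restarts=3):
    """Run argv in the foreground, restarting it each time it exits non-zero.

    Returns the list of exit codes observed.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        # The child stays attached to us; it must not daemonize itself,
        # or wait() would return immediately and we would lose track of it.
        child = subprocess.Popen(argv)
        exit_codes.append(child.wait())
        if exit_codes[-1] == 0:
            break  # clean exit: do not restart
    return exit_codes


# A child that always fails, so it is started 1 + max_restarts times.
codes = supervise([sys.executable, "-c", "import sys; sys.exit(3)"],
                  max_restarts=2)
```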

If a "process bus" is implemented I suspect it should be implemented
at this kind of level.  "Actions" could be registered for specific
subprocess types to send some input to a pipe file descriptor, send a
signal to the process, etc.  It would also be possible to create some
sort of dependency map between processes in a configuration, relating
the actions of one process to another (restart process A if
process B is restarted, send a signal S to process C if signal T is
sent to process D, etc).
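
One possible shape for such a dependency map, sketched as plain data plus
a function that works out the follow-up actions.  Everything here (the
DEPENDENCIES table, the plan_actions name, the "logger" process) is
hypothetical; it only computes what should happen and does not signal or
restart anything.

```python
from collections import deque

# Hypothetical configuration: (process, event) -> list of (process, action).
DEPENDENCIES = {
    ("B", "restart"): [("A", "restart")],
    ("D", "signal T"): [("C", "signal S")],
    ("A", "restart"): [("logger", "signal HUP")],
}


def plan_actions(process, event, dependencies=DEPENDENCIES):
    """Return every action triggered, directly or transitively, by one event."""
    planned = []
    seen = {(process, event)}
    queue = deque([(process, event)])
    while queue:
        for follow_up in dependencies.get(queue.popleft(), []):
            if follow_up not in seen:  # guard against cycles in the map
                seen.add(follow_up)
                planned.append(follow_up)
                queue.append(follow_up)
    return planned


actions = plan_actions("B", "restart")
```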

- C



Re: [Web-SIG] Web Site Process Bus

2007-06-26 Thread Chris McDonough

On Jun 26, 2007, at 2:39 PM, Robert Brewer wrote:

> Chris McDonough wrote:
>> There are also non-webbish processes like postgres, mysql, etc. that
>> need to be treated as "part of the application".
>>
>> I handle this currently by running all of the processes related to a
>> specific project under a process controller (which happens to be
>> implemented in Python, but that's beside the point; see
>> http://www.plope.com/software/supervisor2/).  The process controller is
>> responsible for execing the child processes upon its own startup.
>> It is also responsible for restarting children if they die,
>> capturing their output (if any), and allowing sufficiently
>> privileged users to start and stop each one independently.
>> The only promise a subprocess must make to be managed is that
>> it must be possible to start the process "in the foreground"
>> (not under its own daemon manager).
>>
>> If a "process bus" is implemented I suspect it should be implemented
>> at this kind of level.
>
> Ah, but there's the rub: we all have different ideas about how to
> *implement* IPC and control.

I'm confused by this in your earlier message, describing example  
scenarios:

"""
If I'm primarily a Zope user instead, I might start my website with
zdaemon. This would work exactly like the above, but the Bus object
would be instantiated and started by the zdaemon package. If I'm using
Graham's new mod_wsgi with Apache, I might expect it to create and
control the Bus.
"""

I'm confused because zdaemon is a generic process controller, it  
knows nothing in particular about the application running under it  
except that it's a UNIX process.  It could start postgres instead of  
Zope if you configured it to.  If zdaemon creates a Bus object,  
nothing will be able to send messages to the bus except zdaemon  
itself, and there can't be any useful listeners because it doesn't  
share the same process space as its child.

I think I'm mostly confused by the name "process bus" because it  
seems like the primary use case for something like this is where all  
of the applications share the same process space and are all written  
in Python.  Am I right?  If so, maybe a different name is in order?
"Application Bus"?  Or even "WSGI Bus", if it's presumed that all of
the applications will be WSGI applications?
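
To show what I have in mind by "same process space", here is a toy
publish/subscribe bus.  It is my own guess at the idea, not the API being
proposed; the Bus class, its method names, and the "stop" channel are all
invented for the example.

```python
class Bus:
    """Hypothetical in-process publish/subscribe bus (not any proposed API)."""

    def __init__(self):
        self._listeners = {}

    def subscribe(self, channel, callback):
        self._listeners.setdefault(channel, []).append(callback)

    def publish(self, channel, *args):
        # Listeners are plain Python callables, so only code registered in
        # *this* process can ever be reached from here.
        return [cb(*args) for cb in self._listeners.get(channel, [])]


bus = Bus()
log = []
bus.subscribe("stop", lambda reason: log.append("app stopping: " + reason))
bus.publish("stop", "reload")
```

A Bus object created inside a separate controller process would have an
empty listener table as far as the child is concerned, unless some IPC
transport carries the messages across.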

- C



Re: [Web-SIG] Web Site Process Bus

2007-06-26 Thread Chris McDonough
On Jun 26, 2007, at 5:07 PM, Robert Brewer wrote:
>> I think I'm mostly confused by the name "process bus" because it
>> seems like the primary use case for something like this is where all
>> of the applications share the same process space
>
> I don't see why it should be limited by that. The primary use case is
> anywhere site components and application components are interacting,
> that could benefit from a shared understanding (and control) of the
> state of the site. To me, that requires a common set of messages, but
> the transport mechanism for those messages should be flexible so that
> it's useful in both multithread and multiprocess architectures.

Thank you.  I see.  This is a little too abstract for me to get my  
brain around, but I'll continue listening and maybe I'll get  
religion. ;-)

- C





Re: [Web-SIG] [extension] x-wsgiorg.flush

2007-10-04 Thread Chris McDonough

On Oct 4, 2007, at 11:55 AM, Phillip J. Eby wrote:

> At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> You are making a critical decision here.
>> You are lowering the level of WSGI to match the level of average WSGI
>> middlewares programmers.
>
> No, we're just getting rid of legacy cruft that's hard to support
> correctly.  There's a big difference.

Getting the start_response dance down and understanding how it plays  
with middleware is *hard*.  Even if we called it something other than  
WSGI 2.0 (which I don't think we should, because it really is an  
evolution), returning the three-tuple is the right thing to do.
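
For comparison, a small sketch of the two calling conventions side by
side.  The first function follows PEP 333; the tuple-returning form is
only my reading of the idea under discussion, and the tuple_to_v1 adapter
and call_v1 driver are hypothetical helpers written for this example.

```python
def app_v1(environ, start_response):
    """WSGI 1.0 (PEP 333) style: status and headers go through a callback."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def app_tuple(environ):
    """Style under discussion: simply return (status, headers, body)."""
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]


def tuple_to_v1(app):
    """Hypothetical adapter exposing a tuple-returning app as WSGI 1.0."""
    def wrapper(environ, start_response):
        status, headers, body = app(environ)
        start_response(status, headers)
        return body
    return wrapper


def call_v1(app, environ):
    """Tiny driver standing in for a server; returns (status, headers, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"], captured["headers"] = status, headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


result_v1 = call_v1(app_v1, {})
result_adapted = call_v1(tuple_to_v1(app_tuple), {})
```

Middleware for the tuple form is an ordinary function that calls the
inner app and edits the returned values; there is no callback to wrap.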

- C






Re: [Web-SIG] HEAD requests, WSGI gateways, and middleware

2008-01-24 Thread Chris McDonough
I have applications that do detect the difference between a GET and a HEAD
(they do slightly less work if the request is a HEAD request), so I suspect
this is not a totally reasonable thing to add to the spec.  Maybe the
middleware that does what you're describing should instead be changed to
deal with HEAD requests.
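
A minimal sketch of the kind of application I mean, assuming it can work
out its Content-Length without building the body.  The names
expensive_body and run are made up for the example.

```python
def expensive_body():
    # Stand-in for work the application would rather skip on HEAD.
    return b"x" * 1000


def app(environ, start_response):
    # Assumption for this sketch: the length is known without rendering.
    headers = [("Content-Type", "text/plain"), ("Content-Length", "1000")]
    start_response("200 OK", headers)
    if environ.get("REQUEST_METHOD") == "HEAD":
        return []  # same headers as GET, but no body is generated
    return [expensive_body()]


def run(method):
    """Call app with a minimal environ and return (headers, body)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["headers"] = headers

    body = b"".join(app({"REQUEST_METHOD": method}, start_response))
    return seen["headers"], body


get_headers, get_body = run("GET")
head_headers, head_body = run("HEAD")
```

If the gateway rewrote HEAD to GET before calling the application, the
shortcut in app could never be taken.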

In general, I don't think there is (or should be) any guarantee that an
arbitrary middleware stack will work with an arbitrary application.
Although that would be nice in theory, I suspect it would require a very
complex protocol (more complex than what WSGI requires now).

- C

Brian Smith wrote:
> My application correctly responds to HEAD requests as-is. However, it doesn't 
> work with middleware that sets headers based on the content of the response 
> body.
> 
> For example, a gateway or middleware that sets ETag based on a checksum, 
> Content-Encoding, Content-Length and/or Content-MD5 will all result in wrong 
> results by default. Right now, my applications assume that any such gateway 
> or the first such middleware will change environ["REQUEST_METHOD"] from 
> "HEAD" to "GET" before the application is invoked, and discard the response 
> body that the application generates. 
> 
> However, many gateways and middleware do not do this, and PEP 333 doesn't 
> have anything to say about it. As a result, a 100% WSGI 1.0-compliant 
> application is not portable between gateways.
> 
> I suggest that a revision of PEP 333 should require the following behavior:
> 
> 1. WSGI gateways must always set environ["REQUEST_METHOD"] to "GET" for HEAD 
> requests. Middleware and applications will not be able to detect the 
> difference between GET and HEAD requests.
> 
> 2. For a HEAD request, a WSGI gateway must not iterate through the response 
> iterable, but it must call the response iterable's close() method, if any. It 
> must not send any output that was written via start_response(...).write() 
> either. Consequently, WSGI applications must work correctly, and must not 
> leak resources, when their output is not iterated; an application should not 
> signal or log an error if the iterable's close() method is invoked without 
> any iteration taking place.
> 
> Please add this issue to http://wsgi.org/wsgi/WSGI_2.0.
> 
> Regards,
> Brian
> 


Re: [Web-SIG] Prototype of wsgi.input.readline().

2008-01-30 Thread Chris McDonough
Graham Dumpleton wrote:
> As I think we all know, no one implements readline() for wsgi.input as
> defined in the WSGI specification. The reason for this is that stuff
> like cgi.FieldStorage would refuse to work and would just generate an
> exception. This is because cgi.FieldStorage expects to pass an
> argument to readline().

I haven't been keeping up on the issues this has caused wrt WSGI, but note that 
the reason that cgi.FieldStorage passes a size argument to readline is in order 
to prevent memory exhaustion when reading files that don't have any linebreaks 
(denial of service).  See http://bugs.python.org/issue1112549 .

> 
> So, although this is linked in the issues list for possible amendments
> to WSGI specification, there hasn't that I recall been a discussion on
> how readline() would be defined in any amendment or future version.
> 
> In particular, would the specification be changed to either:
> 
> 1. readline(size) where size argument is mandatory, or:
> 
> 2. readline(size=-1) where size argument is optional.
> 
> If the size argument is made mandatory, then it would parallel how
> read() function is defined, but this in itself would mean
> cgi.FieldStorage would break.
> 
> This is because cgi.FieldStorage actually calls readline() with no
> argument as well as an argument in different places in the code.

cgi.FieldStorage doesn't call readline() without an argument. 
cgi.parse_multipart does, but this function is not used by cgi.FieldStorage.  I 
don't know if this changes anything.
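
For what it's worth, here is a rough sketch of what option 2,
readline(size=-1), could look like as a wrapper around an input stream
that only offers read().  The SizedReadlineInput class and its block_size
parameter are hypothetical; this is not code from any server or from the
spec.

```python
import io


class SizedReadlineInput:
    """Hypothetical wrapper adding readline(size=-1) on top of read()."""

    def __init__(self, stream, block_size=8192):
        self._stream = stream
        self._block_size = block_size
        self._buffer = b""

    def readline(self, size=-1):
        # Read blocks until we have a newline, or enough bytes to satisfy
        # the size limit, or the stream is exhausted.
        while b"\n" not in self._buffer and (
                size < 0 or len(self._buffer) < size):
            block = self._stream.read(self._block_size)
            if not block:
                break
            self._buffer += block
        # End just past the first newline, or at the end of the buffer.
        end = self._buffer.find(b"\n") + 1 or len(self._buffer)
        if size >= 0:
            end = min(end, size)
        line, self._buffer = self._buffer[:end], self._buffer[end:]
        return line


raw = b"ab\ncdefgh\nij"
wrapped = SizedReadlineInput(io.BytesIO(raw), block_size=4)
lines = [wrapped.readline(), wrapped.readline(3), wrapped.readline(3),
         wrapped.readline(), wrapped.readline(), wrapped.readline()]
```

When a size is given, the buffer never holds more than about
size + block_size bytes; when it is omitted, a body with no newline is
still read into memory in full, which is the case the patch was meant to
avoid.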

- C



Re: [Web-SIG] Prototype of wsgi.input.readline().

2008-01-30 Thread Chris McDonough
Graham Dumpleton wrote:
> 
>>>
>>> If the size argument is made mandatory, then it would parallel how
>>> read() function is defined, but this in itself would mean
>>> cgi.FieldStorage would break.
>>>
>>> This is because cgi.FieldStorage actually calls readline() with no
>>> argument as well as an argument in different places in the code.
>> cgi.FieldStorage doesn't call readline() without an argument.
>> cgi.parse_multipart does, but this function is not used by
>> cgi.FieldStorage.  I don't know if this changes anything.
> 
> Not really, I should have said 'cgi' module as a whole rather than
> specifically cgi.FieldStorage. Given that people might be using
> cgi.parse_multipart in standard CGI, there would probably still be an
> expectation that it worked for WSGI. We can't really say that you can
> use cgi.FieldStorage but not cgi.parse_multipart. People will just
> expect all the normal tools people would use for this to work.

Personally, I think parse_multipart should go away.  It's not suitable for 
anything but toy usage.

If people use it, and they expose their site to the world, arbitrary anonymous
visitors can cause their Python process's size to grow arbitrarily.  I don't
think any existing well-known framework uses it, for this very reason.

If it can't go away, and there's a problem due to the non-parity between
parse_multipart's use and FieldStorage's use, I suspect the right answer is to
change cgi.parse_multipart to pass in a size value for readline too.  I
probably should have done that when I made the patch. :-(

- C


Re: [Web-SIG] Prototype of wsgi.input.readline().

2008-01-30 Thread Chris McDonough
Graham Dumpleton wrote:
> On 31/01/2008, Chris McDonough <[EMAIL PROTECTED]> wrote:
>> Graham Dumpleton wrote:
>>>>> If the size argument is made mandatory, then it would parallel how
>>>>> read() function is defined, but this in itself would mean
>>>>> cgi.FieldStorage would break.
>>>>>
>>>>> This is because cgi.FieldStorage actually calls readline() with no
>>>>> argument as well as an argument in different places in the code.
>>>> cgi.FieldStorage doesn't call readline() without an argument.
>>>> cgi.parse_multipart does, but this function is not used by
>>>> cgi.FieldStorage.  I don't know if this changes anything.
>>> Not really, I should have said 'cgi' module as a whole rather than
>>> specifically cgi.FieldStorage. Given that people might be using
>>> cgi.parse_multipart in standard CGI, there would probably still be an
>>> expectation that it worked for WSGI. We can't really say that you can
>>> use cgi.FieldStorage but not cgi.parse_multipart. People will just
>>> expect all the normal tools people would use for this to work.
>> Personally, I think parse_multipart should go away.  It's not suitable for
>> anything but toy usage.
> 
> Not necessarily. Someone may see it as a trade off. The code itself says:
> 
> """This is easy to use but not
> much good if you are expecting megabytes to be uploaded -- in that case,
> use the FieldStorage class instead which is much more flexible."""
> 
> So comment implies it is easier to use and so some may think it is
> simpler for what they are doing if they are only dealing with small
> requests.
> 
> Of course, it would probably be prudent if you know your requests are
> always going to be small to use LimitRequestBody in Apache, or a
> specific check on content length if handled in Python code, to block
> someone sending over sized requests intentionally to try and break
> things. Provided you did this, may be quite reasonable to use it in
> specific circumstances.

Indeed.  But then again, I doubt the casual user would be able to make this
judgment and take the necessary precautions.  This kind of user is likely the
same class of user for whom cgi.FieldStorage is "too hard" (which it really
isn't).
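
For reference, the "specific check on content length" precaution could be
sketched as WSGI middleware along these lines.  The LimitRequestBody
class name is borrowed from the Apache directive purely as a label; the
class, the 1024-byte limit, and the status_for helper are hypothetical.

```python
class LimitRequestBody:
    """Hypothetical WSGI middleware rejecting bodies larger than max_bytes."""

    def __init__(self, app, max_bytes):
        self.app = app
        self.max_bytes = max_bytes

    def __call__(self, environ, start_response):
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = -1  # malformed header: treat as invalid
        if length < 0:
            start_response("400 Bad Request",
                           [("Content-Type", "text/plain")])
            return [b"bad Content-Length\n"]
        if length > self.max_bytes:
            start_response("413 Request Entity Too Large",
                           [("Content-Type", "text/plain")])
            return [b"request body too large\n"]
        return self.app(environ, start_response)


def ok_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def status_for(content_length):
    """Return the status line produced for a given CONTENT_LENGTH value."""
    seen = []
    app = LimitRequestBody(ok_app, max_bytes=1024)
    app({"CONTENT_LENGTH": content_length},
        lambda status, headers, exc_info=None: seen.append(status))
    return seen[0]
```

This only trusts the declared length; a client that lies about it, or
sends no Content-Length at all, still has to be handled by bounded reads
further down.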

- C



[Web-SIG] repoze.bfg web framework 1.0 released

2009-07-05 Thread Chris McDonough

Summary
---

The first major release of the BFG web framework (aka "repoze.bfg"),
version 1.0, is available.  See http://bfg.repoze.org/ for general
information about repoze.bfg.

Details
---

BFG is a Python web framework based on WSGI.  It is inspired by Zope,
Pylons, and Django.  It makes use of a number of Zope technologies
under the hood.

BFG is developed as part of the more general Repoze project
(http://repoze.org).  It is released under the BSD-like license
available from http://repoze.org/license.html .

BFG version 1.0 represents one year of development effort.  The first
release of BFG, version 0.1, was made in July of 2008.  Since then,
roughly 80 pre-1.0 releases have been made.  None of these pre-1.0
releases explicitly promised any backwards compatibility with any
earlier release.

Version 1.0, however, marks the first point at which the repoze.bfg
API has been "frozen".  Future releases in the 1.X line guarantee
API-level backward compatibility with 1.0.  A backwards
incompatibility with 1.0 at the API level in any future 1.X version
will be considered a bug.

More Details
---


BFG contains moderate, incremental improvements to patterns found in
earlier-generation web frameworks.  It tries to make real-world web
application development and deployment more fun, more predictable, and
more productive.  To this end, BFG has the following features:

- WSGI-based deployment: PasteDeploy and mod_wsgi compatible.

- Runs under Python 2.4, 2.5, and 2.6.

- Runs on UNIX, Windows, and Google App Engine.

- Full documentation coverage: no feature or API is undocumented.

- A comprehensive set of unit tests.  The repoze.bfg package contains
  11K lines of Python code.  8000 lines of that total line count is
  unit test code that tests the remaining 3000 lines.

- Sparse resource utilization: BFG has a small memory footprint and
  doesn't waste any CPU cycles.

- Doesn't have an unreasonable set of dependencies: "easy_install"-ing
  repoze.bfg over broadband takes less than a minute.

- Quick startup: a typical BFG application starts up in about a
  second.

- Offers extremely fast XML/HTML and text templating via Chameleon
  (http://chameleon.repoze.org/).

- Persistence-agnostic: use SQLAlchemy, "raw" SQL, ZODB, CouchDB,
  filesystem files, LDAP, or anything else which suits a particular
  application's needs.

- Provides a variety of starter project templates.  Each template
  makes it possible to quickly start developing a BFG application
  using a particular application stack.

- Offers URL-to-code mapping like Django or Pylons' *URL routing* or
  like Zope's *graph traversal*, or allows a combination of both
  routing and traversal.  This helps make it feel familiar to both
  Zope and Pylons developers.

- Offers debugging modes for common development error conditions (for
  example, when a view cannot be found, or when authorization is being
  inappropriately granted or denied).

- Allows developers to organize their code however they see fit; the
  framework is not opinionated about code structure.

- Allows developers to write code that is easily unit-testable.
  Avoids using thread local data structures which hamper testability.
  Provides helper APIs which make it easy to mock framework components
  such as templates and views.

- Provides an optional declarative context-sensitive authorization
  system.  This system prevents or allows the execution of code based
  on a comparison of credentials possessed by the requestor against
  ACL information stored by a BFG application.

- Behavior of an application built using BFG can be extended or
  overridden arbitrarily by a third-party developer without any
  modification to the original application's source code.  This makes
  BFG a good choice for building frameworks and other "extensible
  applications".

- Zope and Plone developers will be comfortable with the terminology
  and concepts used by BFG; they are almost all Zope-derived.

Excruciating Details
---


Quick installation:

  easy_install -i http://dist.repoze.org/bfg/current repoze.bfg

General support and information:

  http://bfg.repoze.org

Tutorials:

  http://docs.repoze.org/bfg/current/#tutorials

Sample Applications:

  http://docs.repoze.org/bfg/current/#sample-applications

Detailed narrative and API documentation:

  http://docs.repoze.org/bfg/current

Bug tracker:

  http://bfg.repoze.org/trac

Maillist:

  http://lists.repoze.org/listinfo/repoze-dev

IRC support:

  irc://irc.freenode.net#repoze

repoze.bfg is developed primarily by Agendaless Consulting
(http://agendaless.com) and a team of contributors.

Special thanks to these people, without whom this release would not
have been possible:

Malthe Borch, Carlos de la Guardia, Chris Rossi, Shane Hathaway, Tom
Moroz, Yalan Teng, Jason Lantz, Todd Koym, Jessica Geist, Hanno
Schlichting, Reed O'Brien, Sebastien Douche, Ian Bicking, Jim Fulton,
Martijn Faassen, Ben Bangert, Fernando Co

Re: [Web-SIG] repoze.bfg web framework 1.0 released

2009-07-06 Thread Chris McDonough

On 7/5/09 10:37 PM, Graham Dumpleton wrote:

>> The first major release of the BFG web framework (aka "repoze.bfg"),
>> version 1.0, is available.  See http://bfg.repoze.org/ for general
>> information about repoze.bfg.
>>
>> ...
>>
>> - WSGI-based deployment: PasteDeploy and mod_wsgi compatible.
>>
>> ...
>>
>> - A comprehensive set of unit tests.  The repoze.bfg package contains
>>   11K lines of Python code.  8000 lines of that total line count is
>>   unit test code that tests the remaining 3000 lines.
>
> A question about your testing if you have time. Is this done in a fake
> WSGI hosting environment, i.e., test harness, or is it able to be run
> through WSGI servers such as Paste server, Apache/mod_wsgi, etc, in
> some way?


The tests I mentioned in there are mostly unit tests; they don't test any 
particular system configuration functionally.  In particular, none of the tests 
actually invokes a request via a WSGI stack.


But we do use functional testing in projects that use the framework.  For 
example, we use Twill (created by Titus Brown) to make sure things don't break 
at the request/response level in this project:  http://karlproject.org.



> Am curious from the point of view that standalone test suites for WSGI
> itself to run against WSGI hosting mechanisms don't really exist, so
> the test suite for BFG, with the presumption that it would exercise a
> lot of WSGI functionality, might be a good regression test for WSGI
> servers themselves.


I think maybe some "ACID test" WSGI application could be built, and then some 
set of functional HTTP-level tests could be run against that application to gain 
confidence in a WSGI app.  This is more or less what we do with Twill on that 
KARL project:  the developers use the Paste#http server, but we actually deploy 
to a mod_wsgi server.  We can (and do) run the Twill tests against both to get 
confidence that the app isn't going to fall over in production.
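A rough sketch of what such an "ACID test" application and one check against it might look like, using only the standard library.  The names ``acid_app`` and ``call_app`` are made up for illustration, and the check below calls the application in-process (through ``wsgiref.validate``) rather than over HTTP; pointing the same assertions at a real Paste or mod_wsgi deployment would need an HTTP client instead.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator


def acid_app(environ, start_response):
    """Hypothetical test application: echoes the method and path it was given."""
    text = "%s %s" % (environ["REQUEST_METHOD"], environ["PATH_INFO"])
    body = text.encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/", method="GET"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    environ["REQUEST_METHOD"] = method
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        if hasattr(result, "close"):
            result.close()
    return captured["status"], dict(captured["headers"]), body


# validator() wraps the app and asserts on WSGI spec violations as it runs.
status, headers, body = call_app(validator(acid_app), "/ping")
```

The same assertions could then be repeated against each server the application is deployed under.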


- C
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] repoze.bfg web framework 1.0 released

2009-07-06 Thread Chris McDonough

On 7/5/09 11:44 PM, Randy Syring wrote:

Chris,

Sounds interesting.  Question: Does it support some
kind of module/plugin architecture that will allow me to develop "plug-
in" functionality across projects?  What would be called in
Django an "app".

For example, I would like to have a "news", "blog", and "calendar"
module that I can plug into different applications.  The goal is to
have everything for the module contained in one subdirectory or package
including
any configuration, routing, templates, controllers, model, etc.  So,
something like this:

/modules/news/...
/modules/calendar/...
/modules/blog/...

Or:

packages/
MyProj
NewsComponent
CalendarComponent
BlogComponent



I'm not sure if I can do this topic justice here (many have fallen on the sword 
when approaching it before), but I'll try.


"Plugin apps" is maybe less a feature of BFG than the stuff that BFG is built on 
top of.  Like Zope, BFG makes use of the Zope Component Architecture "under the 
hood".  Unlike Zope, BFG tends to hide the ZCA (conceptually and API-wise) from 
developers, because the ZCA introduces concepts like "adapters", "interfaces", 
and "utilities".  Direct exposure to these concepts in user-visible code evokes 
suspicion in people who just don't have the problems they try to solve.  The 
problems that the ZCA tries to solve usually revolve around code testability and 
reusability, and most people just don't care that much about these things.


So BFG is more like Pylons or Django in this respect: it provides helper APIs 
and places to hang your code so that you can build a single-purpose application 
reasonably easily without making you think in terms of building anything 
reusable.  The final application usually happens to be overrideable and 
extensible, but that's just a byproduct of using BFG, and doesn't really have 
very much to do with building a system out of plugins.


In the meantime, the Zope Component Architecture is a fantastic system on which 
to build a *framework* (as opposed to an application).  This is why BFG is built 
on top of it.  If you are willing to use the ZCA conceptually and API-wise *in 
your application code*, it becomes straightforward to build reusable 
applications like you mention.


So the answer to your original question is probably no.  BFG itself isn't a 
system which allows you to slot arbitrary components into place and have them 
"show up" somewhere.  It's instead a system (like Zope) in which you can build 
such a thing.  In fact, many of the applications that we (my company, 
Agendaless) build are these kinds of applications, where we tend to want to 
reuse a single application component across many "customers" or "projects".


The trick is this: when you build "pluggable applications", there's presumably 
something you're going to want to plug these applications into.  I *think* this
is the piece that most people are after when they talk about "pluggable
applications"; they actually don't care too much about the applications
themselves (because they'll build them themselves), it's the higher-level thing
that gets plugged into that is of primary interest.  For better or worse,
systems like Plone, Drupal, and Joomla are examples of such application
frameworks.  These systems allow you to build small pieces of functionality that
drop in to some larger system.


We've done lots of Zope and Plone work, and we know the downsides of the "plug 
this bit into the larger framework" pattern pretty well.  We've found that it's 
useful to have the tools at hand to build miniature versions of such large
frameworks, so we can quickly come up with a custom solution to some
problem without "fighting the framework" (any particular framework) so much. 
BFG plus direct use of the ZCA in application code tends to let us avoid using 
the larger frameworks in favor of rolling our own (more focused, simpler) 
frameworks.


Unfortunately, I don't have any "simple example" application code to show with 
respect to this pattern, because anything I could show here would be too trivial 
to be useful.  More unfortunately, anything I can point you to that we've built 
using this pattern will probably be too large to understand in any reasonable 
amount of time (e.g. http://karlproject.org).


This has always been the historical problem with trying to promote use of the 
ZCA for application code: until you work on a larger project that uses it 
"right", it's just too abstract.  So by the time you actually need it, it's too 
late and you've already invented your own mechanisms to do similar indirections. 
 For those reasons, I think it would be a useful exercise to build some very 
simple system that took "app plugins" and just exposed them in some very 
concrete way to end users, even if it meant losing some presentation 
flexibility.  Such a system could be created in any web framework, but using the 
ZCA inside the web framework for such a task is a no-brainer to me.


A

Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-20 Thread Chris McDonough

I'll try to digest some of this, currently I'm pretty clueless.

Personally, I find it a bit hard to get excited about Python 3 as a web 
application deployment platform.  This is of course a personal judgment (I 
don't mean to slight Python 3) but at this point, I think I'll probably be
writing software that targets 2.X exclusively for at least the next five years.


Given this point of view, it would be extremely helpful if someone could 
explain to people with the same outlook why we should want to deal with Unicode 
strings in any WSGI specification.


WSGI is a fairly low-level protocol aimed at folks who need to interface a 
server to the outside world.  The outside world (by its nature) talks bytes.  I 
fear that any implied conversion of environment values and iterable return 
values to Unicode will actually eventually make things harder than they are 
now.  I realize that it would make middleware implementors' lives harder to need
to deal in bytes.  However, at this point, I also believe that middleware kinda 
should be hard.  We have way too much middleware that shouldn't be middleware 
these days (some written by myself).


Anyway, for us slower (and maybe wrongly fearful) folks, could someone 
summarize the benefits of having a WSGI specification that requires Unicode?
Bonus points for an explanation that does not boil down to "it will be 
compatible with Python 3".


- C


Armin Ronacher wrote:

Hello everybody,

Thanks to Graham Dumpleton and Robert Brewer there is some serious
progress on WSGI currently.  I proposed a roadmap with some PEP changes
now that need some input.

Summary:

  WSGI 1.0   stays the same as PEP 0333 currently is
  WSGI 1.1   becomes what Ian and I added to PEP 0333
  WSGI 2.0   becomes a unicode powered version of WSGI 1.1
  WSGI 3.0   becomes WSGI 2.0 just without start_response

  WSGI 1.0 and 1.1 are byte based and nearly impossible to use on Python
  3 because of changes in the standard library that no longer work with
  a byte-only approach.


The PEPs themselves are here: http://bitbucket.org/ianb/wsgi-peps/
Neither the wording nor the changes in there are anywhere near final.


Graham wrote down two questions he wants every major framework developer
to answer.  These should guide the way to new WSGI standards:

1. Do we keep bytes everywhere forever in Python 2.X, or try to
   introduce unicode there at all to at least mirror what changes might
   be made to make WSGI workable in Python 3.X?

2. Do we skip WSGI 1.X completely for Python 3.X and go straight to
   WSGI 2.0 for Python 3.X?

I added a new question I think should be asked too:

3. Do we skip WSGI 2.0 as specified in the PEP and go straight to
   WSGI 3.0 and drop start_response?


The following things became pretty clear when playing around with
various specifications on Python 3:

-  Python 3 no longer implicitly converts between unicode and byte
   strings.  This covers comparisons, the regular expression engine,
   all string functions and many modules in the stdlib.

-  The Python 3 stdlib radically moved to unicode for non unicode things
   as well (the http servers, http clients, url handling etc.)

-  A byte only version of WSGI appears unrealistic on Python 3 because
   it would require server and middleware implementors to reimplement
   parts of the standard library to work on bytes again.

-  unicode support can be added for WSGI on both Python 2.x and Python
   3.x without removing functionality.  Browsers are already doing
   a similar encoding trick as proposed by Graham Dumpleton to handle
   URLs.

-  Python 2.x already accepts unicode strings for many things such as
   URL handling thanks to the fact that unicode and byte strings are
   surprisingly interchangeable.

-  cgi.FieldStorage and some other parts are now totally broken on
   Python 3 and should no longer be used in 3.0 and 3.1 because it
   reads the response body into memory.  This currently affects
   WebOb, Pylons and TurboGears.


I sent this mail to every major framework / WSGI implementor so that we
get input even if you're missing the discussion on web-sig.


Regards,
Armin


Re: [Web-SIG] Request for Comments on upcoming WSGI Changes

2009-09-21 Thread Chris McDonough

OK, after some consideration, I think I'm sold.

Answering my own original question about why unicode seems to make sense as 
values in the WSGI environment even without consideration for Python 3 
compatibility:  *something* needs to do this translation.  Currently I 
personally rely on WebOb to do a lot of this translation.  I can't think of a 
good reason that implementations at the level of WebOb would each need to do 
this translation work; pushing the job into WSGI itself seems to make sense 
here.  This is particularly true for PATH_INFO and QUERY_STRING; these days 
it's foolish to assume these values will be entirely composed of "low order" 
characters, and thus being able to access them as bytes natively isn't very useful.


OTOH, I suspect the Python 3 stdlib is still broken if it requires native 
strings in various places (and prohibits the use of bytes).


James Bennett wrote:

On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough  wrote:

WSGI is a fairly low-level protocol aimed at folks who need to interface a
server to the outside world.  The outside world (by its nature) talks bytes.
 I fear that any implied conversion of environment values and iterable
return values to Unicode will actually eventually make things harder than
they are now.  I realize that it would make middleware implementors' lives
harder to need to deal in bytes.  However, at this point, I also believe
that middleware kinda should be hard.  We have way too much middleware that
shouldn't be middleware these days (some written by myself).


Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an
interface to HTTP should deal in bytes as well.

The problem, really is that despite being a very low-level interface,
WSGI has a tendency to leak up into much higher-level code, and (IMO)
authors of that high-level code really shouldn't have to waste their
time dealing with details of the underlying low-level gateway.

You've said you don't want to hear "Python 3" as the reason, but it
provides some useful examples: in high-level code you'll commonly want
to be doing things like, say, comparing parts of the requested URL
path to known strings or patterns. And that high-level code will
almost certainly use strings, while WSGI, in theory, will be using
bytes. That's just a recipe for disaster; if WSGI mandates bytes, then
bytes will have to start "infecting" much higher-level code (since
Python 3 -- rightly -- doesn't let you be nearly as promiscuous about
mixing bytes and strings).

Once I'm at a point where I can use Python 3, I know I'll personally
be looking for some library which will normalize everything for me
before I interact with it, precisely to avoid this sort of leakage; if
WSGI itself would at least *allow* that normalization to happen at the
low level (mandating it is another discussion entirely) I'd feel much
happier about it going forward.






Re: [Web-SIG] Session events

2009-10-05 Thread Chris McDonough

This is supported at least here:

http://docs.repoze.org/session/usage.html#using-begin-and-end-subscribers



Alastair "Bell" Turner wrote:

Hi

I've been looking through the range of choices for Python web
[application] frameworks/libraries (Just to have all the bases
covered) for a new build project and standardisation of some small
utilities. There's one feature that I'm not finding and was just
wanting to check on before considering the joys of rolling my own: I'm
not finding any support for user session events, I'm particularly
interested in being able to register a handler on session expiry or
cleanup. I've mainly been looking at the lighter weight frameworks
since my requirement for the new build is mainly aggregate and list
operations, so the least suitable load for ORMs.

Have I missed the feature session event somewhere?

Thanks

Alastair "Bell" Turner


Re: [Web-SIG] [Paste] WebOb API

2009-10-29 Thread Chris McDonough

Ian Bicking wrote:

Also I'm planning on introducing a BaseRequest (and *maybe*
BaseResponse) class, that removes some functionality.  Specifically
for Repoze they'd like to remove __getattr__ and __setattr__ (which
has some performance implications),


FTR, after thinking about it, I'm not even sure BaseRequest is necessary for 
this purpose.  This seems to work too (at least it gets previously visible 
setattr/getattr stuff out of the profiling info):


class Request(WebobRequest):
    __setattr__ = object.__setattr__
    __getattr__ = object.__getattribute__
    __delattr__ = object.__delattr__
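To show the effect of that override without depending on WebOb, here is a small sketch.  ``AdhocBase`` is a made-up stand-in for a request class whose ``__setattr__``/``__getattr__`` route ad hoc attributes through a side dictionary; it is not WebOb's actual implementation.  The subclass restores plain ``object`` behaviour in the same way as the snippet above.

```python
class AdhocBase:
    """Hypothetical stand-in: stores ad hoc attributes in a side dict."""

    def __init__(self):
        # Bypass our own __setattr__ so 'store' lands in the instance dict.
        object.__setattr__(self, "store", {})

    def __setattr__(self, name, value):
        self.store[name] = value

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        try:
            return self.store[name]
        except KeyError:
            raise AttributeError(name)

    def __delattr__(self, name):
        try:
            del self.store[name]
        except KeyError:
            raise AttributeError(name)


class PlainRequest(AdhocBase):
    # Same trick as in the message above: fall back to object's own
    # attribute machinery so the Python-level hooks are skipped.
    __setattr__ = object.__setattr__
    __getattr__ = object.__getattribute__
    __delattr__ = object.__delattr__


base = AdhocBase()
base.user = "chris"      # goes into base.store

plain = PlainRequest()
plain.user = "chris"     # goes into the instance __dict__
```

With the override in place, attribute writes no longer pass through the Python-level hooks, which is why they stop showing up in a profile.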





[Web-SIG] http://wiki.python.org/moin/WebFrameworks

2009-11-26 Thread Chris McDonough
http://wiki.python.org/moin/WebFrameworks seems to be the place where folks are 
registering their respective web frameworks.


I'd like to move some of the frameworks which are currently in the various 
categories which haven't been active in a few years.  In particular, I'd like 
to move any framework which hasn't had a release since the beginning of 2008 
(arbitrary) into the "Discontinued / Inactive" framework category.  I'd be 
willing to do the work to make sure I wasn't moving one that actually *did* 
have releases past that but just hadn't updated the page.


Any dissent?

- C



Re: [Web-SIG] http://wiki.python.org/moin/WebFrameworks

2009-11-28 Thread Chris McDonough

Aaron Watters wrote:

On Thu, Nov 26, 2009 at 1:02 PM, Chris McDonough wrote:

http://wiki.python.org/moin/WebFrameworks seems to be the place where folks
are registering their respective web frameworks.

I'd like to move some of the frameworks which are currently in the various
categories which haven't been active in a few years.  In particular, I'd
like to move any framework which hasn't had a release since the beginning of
2008 (arbitrary) into the "Discontinued / Inactive" framework category.  I'd
be willing to do the work to make sure I wasn't moving one that actually
*did* have releases past that but just hadn't updated the page.

Any dissent?

- C


Why not call them "apparently stable"
versus "under active development"?  Is the
cgi module "discontinued"?


No, but the cgi module has undergone a lot of changes over the last couple of 
years which were present in Python releases:


http://svn.python.org/view/python/branches/release26-maint/Lib/cgi.py?view=log


I'm a little sensitive on this topic
because people tell me that Gadfly is "inactive"
or "discontinued"
but it still does what it does
as documented very well.

Frequent releases may actually be a sign of 
bugginess and bad design.


Agreed.  On the other hand, though, no release for two years sometimes *does* 
mean it's dead.  It's slightly unfair to the folks who are very actively 
improving a web framework to live in a "slot" on that page right next to 
actually-really-dead software because of the vagaries of lexical sorting.



If you suspect a project is really dead, maybe you
could try to contact the authors and ask about
what they think.


Well, that was my intention.  I don't want to remove *actually* active or 
stable-and-still-used packages from the list.  Maybe I should just dial back 
the date to the beginning of 2007 or something.


- C



Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough
On Fri, 2010-07-16 at 11:07 -0500, Ian Bicking wrote:

> And this doesn't help with Python 3: either we have byte values of
> SCRIPT_NAME and PATH_INFO in Python 3, or we have text values.  I
> think bytes will be more awkward to port to than text, and
> inconsistent with other WSGI values.  If we have text then we have to
> choose an encoding.  Latin1 will work, but it will be the exact wrong
> encoding most of the time as UTF-8 is the typical  (unlike other
> headers, where Latin1 will mostly be an okay encoding, or as good a
> guess as we have).  If we firmly remove these keys then we can avoid
> this choice entirely... and we conveniently also get a better
> representation of the request.

My $.02: I'd rather lobby the core folks for a string ABC (which we can
hook with a stringlike bytes type) and consider all 3.X releases made so
far "dead to WSGI" than to have to tunnel arbitrary bytes through some
misleading Unicode encoding.

- C




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough
On Fri, 2010-07-16 at 17:47 -0400, Tres Seaver wrote:

> > In the past when we've gotten down to specifics, the only holdup has been
> > SCRIPT_NAME/PATH_INFO, hence my suggestion to eliminate those.
> 
> I think I favor PJE's suggestion:  let WSGI deal only in bytes.

I'd prefer that WSGI 2 was defined in terms of a "bytes with benefits"
type (Python 2's ``str`` with an optional encoding attribute as a hint
for cast to unicode str) instead of Python 3-style bytes.

But if I had to make the Hobson's choice between Python 3 style bytes
and Python 3 style str, I'd choose bytes.  If I then needed to write
middleware or applications, I'd use WebOb or an equivalent library to
enable a policy which converted those bytes to strings on my behalf.
Making it easy to write "raw" middleware or applications without using
such a library doesn't seem as compelling a goal as being able to easily
write one which allowed me direct control at the raw level.
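As a purely illustrative sketch of what a "bytes with benefits" value could look like in Python 3: a ``bytes`` subclass carrying an optional encoding hint.  The class name, the ``encoding`` attribute and the ``as_text`` helper are all assumptions made up here; no such type exists in the standard library.

```python
class BytesWithBenefits(bytes):
    """Hypothetical bytes subclass carrying an optional encoding hint."""

    def __new__(cls, value, encoding=None):
        self = super().__new__(cls, value)
        self.encoding = encoding  # a hint only; the payload is still bytes
        return self

    def as_text(self, default="latin-1"):
        """Decode using the hint, falling back to a caller-chosen default."""
        return self.decode(self.encoding or default)


path = BytesWithBenefits(b"/caf\xc3\xa9", encoding="utf-8")
text = path.as_text()
```

One limitation of a naive subclass like this is that the inherited methods (``split``, slicing, and so on) return plain ``bytes`` and drop the hint, which is part of why support from the language or stdlib would be needed for the idea to work well.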

- C




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough
On Fri, 2010-07-16 at 17:11 -0500, Ian Bicking wrote:
> On Fri, Jul 16, 2010 at 5:08 PM, Chris McDonough wrote:
> On Fri, 2010-07-16 at 17:47 -0400, Tres Seaver wrote:
> 
> > > In the past when we've gotten down to specifics, the only holdup has
> > > been SCRIPT_NAME/PATH_INFO, hence my suggestion to eliminate those.
> >
> > I think I favor PJE's suggestion:  let WSGI deal only in bytes.
> 
> I'd prefer that WSGI 2 was defined in terms of a "bytes with benefits"
> type (Python 2's ``str`` with an optional encoding attribute as a hint
> for cast to unicode str) instead of Python 3-style bytes.
> 
> But if I had to make the Hobson's choice between Python 3 style bytes
> and Python 3 style str, I'd choose bytes.  If I then needed to write
> middleware or applications, I'd use WebOb or an equivalent library to
> enable a policy which converted those bytes to strings on my behalf.
> Making it easy to write "raw" middleware or applications without using
> such a library doesn't seem as compelling a goal as being able to easily
> write one which allowed me direct control at the raw level.
> 
> What are the concrete problems you envision with text request headers,
> text (URL-quoted) path, and text response status and headers?

Documentation is the main reason.  For example, the documentation for
making sense of path_info segments in a WSGI that used unicodey-strings
would, as I understand it, read something like this:

"""
The PATH_INFO environment variable is a string.  To decode it,

- First, split it on slashes::

    segments = PATH_INFO.split('/')

- Then turn each segment into bytes::

    bytes_segments = [ bytes(x, encoding='latin-1') for x in segments ]

- Then, de-encode each segment's urlencoded portions::

    urldecoded_segments = [ urllib.unquote(x) for x in bytes_segments ]

- Then re-encode each urldecoded segment into the encoding expected
  by your application::

    app_segments = [ str(x, encoding='utf-8') for x in
                     urldecoded_segments ]

.. note:: We decode from latin-1 above because WSGI tunnels the bytes
   representing the PATH_INFO by way of a string type which contains bytes
   as characters.
"""

That looks pretty apologetic to me, and to be honest, I'm not even sure
it will work reliably in the face of existing/legacy applications which
have emitted URLs that are not url-encoded properly if those old URLs
need to be supported.   http://bugs.python.org/issue8136 contains a
variation on this theme.

I'd much rather be able to say:

"""
The PATH_INFO environment variable is a ``bytes-with-benefits`` type.
To decode it:

- First, split it on slashes::

    segments = PATH_INFO.split('/')

- Then, de-encode each segment's urlencoded portions::

    urldecoded_segments = [ urllib.unquote(x) for x in segments ]

- Then re-encode each urldecoded segment into the encoding expected
  by your application::

    app_segments = [ str(x, encoding='utf-8') for x in
                     urldecoded_segments ]
"""

Let me know if I'm missing something.
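For what it's worth, the first (text-based) recipe can be written as runnable Python 3 roughly as follows.  This is a sketch under assumptions: it takes a raw, still percent-encoded path that has been tunnelled as a latin-1 ``str``, and it substitutes ``urllib.parse.unquote_to_bytes`` for the ``urllib.unquote`` spelling used in the pseudo-documentation.  The function name is made up for illustration.

```python
from urllib.parse import unquote_to_bytes


def decode_path_segments(raw_path, app_encoding="utf-8"):
    """Split a latin-1-tunnelled, percent-encoded path into text segments."""
    # 1. Split on slashes while the path is still percent-encoded.
    segments = raw_path.split("/")
    # 2. Recover the underlying bytes from the latin-1 "tunnel".
    bytes_segments = [s.encode("latin-1") for s in segments]
    # 3. Undo the percent-encoding; the result is still bytes.
    urldecoded_segments = [unquote_to_bytes(b) for b in bytes_segments]
    # 4. Decode into the encoding the application expects.
    return [b.decode(app_encoding) for b in urldecoded_segments]


parts = decode_path_segments("/caf%C3%A9/a%2Fb")
```

Because the split happens before unquoting, an encoded slash (``%2F``) stays inside its segment instead of creating a new one.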

- C





Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough
On Sat, 2010-07-17 at 01:33 +0200, Armin Ronacher wrote:
> Hi,
> 
> On 7/17/10 1:20 AM, Chris McDonough wrote:
>  > Let me know if I'm missing something.
> The only thing you miss is that the bytes type of Python 3 is badly 
> supported in the stdlib (not an issue if we reimplement everything in 
> our libraries, not an issue for me) and that the bytes type has no 
> string formatting, which makes us do the encode/decode dance in our own 
> implementations of some of the missing stdlib functions.

This is why the docs mention "bytes with benefits" instead (like the
Python 2 "str" type). The existence of such a type would be the result
of us lobbying for its inclusion into some future Python 3, or at least
the result of lobbying for a String ABC that would allow us to define
our own.

But.. yeah.  Stdlib support for bytes.  Dunno.   What I really don't
want to do is implement a WSGI spec in terms of Unicodey strings just
because the webby stuff in the stdlib cannot deal with bytes.  Those
stdlib implementations should be changed to deal with bytes-ish things
instead.  I actually think fixing the stdlib will end up being a driver
for the "bytes with benefits" type.  Supporting such a type in the
implementation of stdlib functions is clearly the right way to fix it in
lots of cases, because they will be able to deal with BwB and
Unicodey-strings in exactly the same way.

In the meantime, I think using bytes is the only sane thing to do in
some interim specification, because moving from a spec which is
bytes-oriented to a spec that is text-oriented now will leave us in the
embarrassing position of needing to create yet another bytes-oriented
spec later (as, well, I/O is bytes), when Python 3 matures and realizes
it needs such a hybrid type.

- C




Re: [Web-SIG] WSGI for Python 3

2010-07-16 Thread Chris McDonough
On Fri, 2010-07-16 at 20:46 -0500, Ian Bicking wrote:
> On Fri, Jul 16, 2010 at 6:20 PM, Chris McDonough wrote:
> > What are the concrete problems you envision with text request headers,
> > text (URL-quoted) path, and text response status and headers?
> 
> Documentation is the main reason.  For example, the documentation for
> making sense of path_info segments in a WSGI that used unicodey-strings
> would, as I understand it, read something like this:
> 
> Nah, not nearly that hard:
> 
> path_info =
> urllib.parse.unquote_to_bytes(environ['wsgi.raw_path_info']).decode('UTF-8')
> 
> I don't see the problem?  If you want to distinguish %2f from /, then
> you'll do it slightly differently, like:
> 
> path_parts = [
> urllib.parse.unquote_to_bytes(p).decode('UTF-8')
> for p in environ['wsgi.raw_path_info'].split('/')]
>  
> This second recipe is impossible to do currently with WSGI.
> 
> So... before jumping to conclusions, what's the hard part with using
> text?

It's extremely hard to swallow Python 3's current disregard for the
primacy of bytes at I/O boundaries.  I'm trying, but I can't help but
feel that the existence of an API like "unquote_to_bytes" is more
symptom treatment than solution.  Of course something that unquotes a
URL segment unquotes it into bytes; it's the only sane default because
URL segments found in URLs on the internet are bytes.

So I guess the "hard part" is more meta.  When you have legitimate
backwards compatibility constraints, suboptimal choices made during
protocol design are excusable.  But it just seems really very weird to
design one (WSGI 2) from scratch with such choices when the only reason
to do so is a systematic low-level denial of reality.  Why would we use
(and, worse, by doing so, implicitly promote) such a system in the first
place?

On the other hand, indignance about the issue shouldn't rule the day
either.  To me, the most pragmatic thing to do that doesn't deny reality
would be to use bytes.  It's also the easiest thing to remember (the
values in the environment are all bytes) and I think we'll be able to
drive the Py3K stdlib forward in a much saner direction if we choose
bytes than if we choose text to represent things that are naturally more
bytes-like.
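For reference, the two recipes quoted above can be run as written on Python 3, and doing so makes the ``%2f`` versus ``/`` distinction concrete.  Note that ``wsgi.raw_path_info`` is only a key proposed in this thread, not part of any accepted WSGI specification, and the helper names below are made up for illustration.

```python
from urllib.parse import unquote_to_bytes


def whole_path(environ, encoding="utf-8"):
    """First recipe: unquote the whole path at once."""
    return unquote_to_bytes(environ["wsgi.raw_path_info"]).decode(encoding)


def path_parts(environ, encoding="utf-8"):
    """Second recipe: split first, then unquote each segment."""
    raw = environ["wsgi.raw_path_info"]
    return [unquote_to_bytes(p).decode(encoding) for p in raw.split("/")]


# Hypothetical environ carrying the proposed raw (still quoted) path.
environ = {"wsgi.raw_path_info": "/docs/a%2Fb"}
flat = whole_path(environ)
parts = path_parts(environ)
```

The first form collapses the encoded slash into an ordinary path separator, while the second keeps ``a/b`` together as one segment.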

- C



Re: [Web-SIG] WSGI for Python 3

2010-07-17 Thread Chris McDonough
On Fri, 2010-07-16 at 23:38 -0500, Ian Bicking wrote:
> On Fri, Jul 16, 2010 at 9:43 PM, Chris McDonough wrote:
> 
> > Nah, not nearly that hard:
> >
> > path_info =
> > urllib.parse.unquote_to_bytes(environ['wsgi.raw_path_info']).decode('UTF-8')
> >
> > I don't see the problem?  If you want to distinguish %2f from /, then
> > you'll do it slightly differently, like:
> >
> > path_parts = [
> >     urllib.parse.unquote_to_bytes(p).decode('UTF-8')
> >     for p in environ['wsgi.raw_path_info'].split('/')]
> >
> > This second recipe is impossible to do currently with WSGI.
> >
> > So... before jumping to conclusions, what's the hard part with using
> > text?
> 
> It's extremely hard to swallow Python 3's current disregard for the
> primacy of bytes at I/O boundaries.  I'm trying, but I can't help but
> feel that the existence of an API like "unquote_to_bytes" is more
> symptom treatment than solution.  Of course something that unquotes a
> URL segment unquotes it into bytes; it's the only sane default because
> URL segments found in URLs on the internet are bytes.
> 
> Yes, URL quoted strings should decode to bytes, though arguably it is
> reasonable to also use the very reasonable UTF-8 default that
> urllib.parse.quote/unquote uses.  So it's really just a question of
> names, should be quote_to_string or quote_to_bytes that name.  Which
> honestly... whatever.

After some careful consideration, I realize I'm only able to offer stop
energy regarding the WSGI-as-text proposal, so I'll bow out of any
maillist conversation about it for now.

- C







[Web-SIG] PEP 444 (aka Web3)

2010-09-15 Thread Chris McDonough
A PEP was submitted and accepted today for a WSGI successor protocol
named Web3:

http://python.org/dev/peps/pep-0444/

I'd encourage other folks to suggest improvements to that spec or to
submit a competing spec, so we can get WSGI-on-Python3 settled soon.

- C




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-15 Thread Chris McDonough
On Wed, 2010-09-15 at 20:05 -0400, P.J. Eby wrote:
> At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:
> >A PEP was submitted and accepted today for a WSGI successor protocol
> >named Web3:
> >
> >http://python.org/dev/peps/pep-0444/
> >
> >I'd encourage other folks to suggest improvements to that spec or to
> >submit a competing spec, so we can get WSGI-on-Python3 settled soon.
> 
> The first thing I notice is that web3.async appears to force all 
> existing middleware to delete it from the environment if it wishes to 
> remain compatible, unless it adapts to support receiving callables itself.

We can ditch everything concerning web3.async as far as I'm concerned.
Ian has told me that this feature won't be liked by the async people
anyway, as it doesn't have a trigger mechanism.

> On further reading I see you have something about middleware 
> disabling itself if it doesn't support async execution, but this 
> doesn't make any sense to me: if it can't support async execution, 
> why wouldn't it just delete web3.async from the environ, forcing its 
> wrapped app to be synchronous instead?
> 
> I'm also not a fan of the bytes environ, or the new 
> path_info/script_name variables; note that the spec's sample CGI 
> implementation does not itself provide the new variables, and that 
> middleware must be explicitly written to handle the case where there 
> is duplication.

I'm not concerned about which environment variables have it, but I would
definitely like to be able to get at the "original" (non-%2F-decoded)
path info somewhere.  I'd be fine if PATH_INFO was just that, and get
rid of web3.path_info.  web3.script_name is probably just a mistake
entirely.

> My main fear with this spec is that people will assume they can just 
> make a few superficial changes to run WSGI code on it, when in fact 
> it is deeply incompatible where middleware is concerned.  In fact, 
> AFAICT, it seems like it will be *harder* to write correct web3 
> middleware than it is to write correct WSGI middleware now.

I'm very willing to drop web3.async entirely.  It seems reasonable to do
so.  I should have done so before I mailed the spec, as I knew it would
be unpopular.

> This seems like a step backward, since the whole idea behind dropping 
> start_response() was to make correct middleware *easier* to write.
> 
> Any time a spec makes something optional or allows More Than One Way 
> To Do It, it immediately doubles the minimum code required to 
> implement that portion of the spec in compliant middleware.  This 
> spec has two optionalities: web3.async, and the optional 
> path_info/script_name, so the return handling of every piece of 
> middleware is doubled (or else  "environ['web3.async'] = False" must 
> be added at the top), and any code that modifies paths must similarly 
> ditch the special variables or do double work to update them.

No worries, let's get rid of both, with the caveat that it's pretty
essential (to me anyway) to be able to get at the non-%2F-encoded path
somewhere.  The most sensible thing to me would be to put it in
PATH_INFO.

As far as bytes vs. strings, whatever, we have to pick one.  Bytes makes
more sense to me.  I'll leave it to the native-string and/or unicode
people to create their own spec.

- C





Re: [Web-SIG] [Python-Dev] Add PEP 444, Python Web3 Interface.

2010-09-15 Thread Chris McDonough
It's, e.g.

b'8080'

.. instead of the integer value 8080.

Apparently the type of this value was not spelled out sufficiently in
the WSGI spec and string values and integer values were used
interchangeably, making it harder to join them with the other values in
the environ (a common thing to want to do).  Bytes instances are
attractive, as the rest of the values are also bytes, so they can be
joined together easily.
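
To make the joining case concrete, here is a minimal sketch.  It assumes
native-string environ keys with bytes values, and the helper name
host_and_port is made up for illustration; neither detail is settled by
this message.

```python
def host_and_port(environ):
    """Join two environ values; this only works if both are bytes."""
    return environ['SERVER_NAME'] + b':' + environ['SERVER_PORT']


# All-bytes values join without any conversion step.
bytes_environ = {'SERVER_NAME': b'example.com', 'SERVER_PORT': b'8080'}
joined = host_and_port(bytes_environ)

# With an integer port the same join raises TypeError on Python 3.
mixed_environ = {'SERVER_NAME': b'example.com', 'SERVER_PORT': 8080}
try:
    host_and_port(mixed_environ)
    join_failed = False
except TypeError:
    join_failed = True
```

The point is only that a uniform value type removes the need for
per-key conversions before concatenation.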

(I also redirected this to web-sig at the request of PJE).

- C

On Wed, 2010-09-15 at 17:02 -0700, John Nagle wrote:
> On 9/15/2010 4:44 PM, python-dev-requ...@python.org wrote:
> > ``SERVER_PORT`` must be a bytes instance (not an integer).
> 
> What's that supposed to mean?  What goes in the "bytes
> instance"?  A character string in some format?  A long binary
> number?  If the latter, with which byte ordering?  What
> problem does this solve?
> 
>   John Nagle
> 
> ___
> Python-Dev mailing list
> python-...@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/lists%40plope.com
> 




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread Chris McDonough
On Thu, 2010-09-16 at 12:01 -0500, Ian Bicking wrote:
> Well, reiterating some things I've said before:
> 
> * This is clearly just WSGI slightly reworked, why the new name?

The PEP says "Web3 is clearly a WSGI derivative; it only uses a
different name than "WSGI" in order to indicate that it is not in any
way backwards compatible."

I don't really care what the name is.  My experience in various
communities suggests that naming the new totally-bw-incompat thing the
same as the old thing weakens both the new thing and the old thing,
but.. whatever.  I just don't care much.

> * Why byte values in the environ?  No one has offered any real reason
> they are better than native strings.  I keep asking people to offer a
> reason, *and no one ever does*.  It's just hyperbole and distraction.
> Frankly I'm feeling annoyed.  So far my experience makes me believe
> using native strings will make it easier to port and support libraries
> across 2 and 3.

I'm sorry you're annoyed.  I chose bytes here mainly out of ignorance
and fear. This is an extremely low level protocol, and I just literally
don't know how we can sanely convert environ values to Unicode without
some loss of control or potential for incorrect decoding without having
server encoding configuration.  You say it's easy and straightforward,
and that's fine.  I just haven't internalized enough specification to
know.

I'd very much encourage folks who want to use native strings to create
another PEP: it's just a lot easier to argue about one "thing" than it
is to argue endlessly in snippets on blogs and epic maillist threads.  I
couldn't care less if this *particular* PEP is selected, to be honest.
Let's just get it over with in a process where there's at least some
chance of resolution.

> * It makes sense to me that the error stream should accept both bytes
> and unicode, and should do a best effort to handle either.  Getting
> encoding errors or type errors when logging an error is very
> distracting.

Sounds good.
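
A best-effort error stream along those lines might look roughly like the
sketch below.  The class name and the choice of UTF-8 with replacement
characters are assumptions for illustration, not anything the spec says.

```python
import io


class TolerantErrorStream:
    """Wrap a text stream so write() accepts both bytes and text."""

    def __init__(self, stream, encoding='utf-8'):
        self._stream = stream
        self._encoding = encoding

    def write(self, data):
        # Decode bytes leniently so logging never raises on bad input.
        if isinstance(data, bytes):
            data = data.decode(self._encoding, 'replace')
        self._stream.write(data)

    def flush(self):
        self._stream.flush()


sink = io.StringIO()
errors = TolerantErrorStream(sink)
errors.write('text message\n')
errors.write(b'bytes message \xff\n')  # invalid UTF-8 byte is replaced
logged = sink.getvalue()
```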

> * Instead of focusing on Response(*response_tuple), I'd rather just
> rely on something like Response.from_wsgi(response_tuple).  Body first
> feels very unnatural.

Others have said same, also good.

> * Regarding long response headers, I think we should ignore the HTTP
> spec.  You can put 4k in a Set-Cookie header, such headers aren't
> easily or safely folded... I think the line length constraint in the
> HTTP spec isn't a constraint we need to pay attention to.

OK.

- C




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-16 Thread Chris McDonough
On Thu, 2010-09-16 at 14:04 -0400, P.J. Eby wrote:
> At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
> >No comments on the rest except to note that at this point it looks
> >unlikely that we can make everyone happy (or even get an agreement to
> >adopt what would be the long-term technically optimal solution --
> >AFAICT there is no agreement on what that solution would be, if one
> >weren't to take porting Python 2 code into account). IOW
> >something/somebody has gotta give.
> 
> Indeed.  This entire discussion has pushed me strongly in favor of 
> doing a super-minimalist update to PEP 333 with the following points:

Right on, write it all down! ;-)

- C




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-17 Thread Chris McDonough
On Fri, 2010-09-17 at 19:47 +0300, Ionel Maries Cristian wrote:
> I don't like this proposal at all. Besides having to go through the
> bytes craziness the design is pretty backwards for middleware and
> asynchronous applications.

We've acknowledged in other messages to this thread that the web3.async
red herring is speculative, and Armin has indicated that if he does not
find a champion willing to create a reference implementation for it
today that it will be taken out.  This doesn't help async people, but it
also doesn't harm them (no difference from WSGI really).  Personally, I
hope nobody steps up and we just rip it out. ;-)

I'm not sure why you characterize using bytes as "bytes craziness".  We
have been using strings as byte sequences in WSGI for over five years.
Python itself draws an equivalence between the Python 3 bytes type and
Python 2 "str" ("bytes" is aliased to "str" under Python 2).  I'm not
really sure why we shouldn't take advantage of that equivalence, and why
people are so enamored of treating envvar values, headers, and such as
text other than the brokenness of the Python 3 stdlib urllib stuff.  
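
The equivalence can be checked directly.  This is a small sketch that
should run unchanged on Python 2.6+ and Python 3; the variable names are
only illustrative.

```python
import sys

# A bytes literal is valid syntax on Python 2.6+ and on Python 3.
status = b'200 OK'

# On Python 2, "bytes" is simply another name for "str";
# on Python 3 they are distinct types.
bytes_is_str = bytes is str
running_python2 = sys.version_info[0] == 2

is_bytes = isinstance(status, bytes)
```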

IMO, WSGI/Web3 isn't really a programming platform (or at least if it
is, it is destined to be a pretty crappy one), it's just a connection
protocol, so any "it's more typing" or "it's ugly" argument seems pretty
thin to me.  I'd personally rather have it be more general and less easy
to use than potentially broken in some corner case circumstance.

- C




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough
On Thu, 2010-09-16 at 05:29 +0200, Roberto De Ioris wrote:
> About the *.file_wrapper removal, I suggest
> a PSGI-like approach where 'body' can contain a File Object.
> 
> def file_app(environ):
>     fd = open('/tmp/pippo.txt', 'r')
>     status = b'200 OK'
>     headers = [(b'Content-type', b'text/plain')]
>     body = fd
>     return body, status, headers

I don't see why this couldn't work as long as middleware didn't convert
the body into something not-file-like.  But it is really an
implementation detail of the origin server (it might specialize when the
body is a file), and doesn't really need to be in the spec.

> or
> 
> def file_app(environ):
>     fd = open('/tmp/pippo.txt', 'r')
>     status = b'200 OK'
>     headers = [(b'Content-type', b'text/plain')]
>     body = [b'Header', fd, b'Footer']
>     return body, status, headers

This won't work, as the body is required to return an iterable which
returns bytes, and cannot be an iterable which returns either bytes or
other iterables (it must be a "flat" sequence).
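
One way to get the same effect while keeping the body flat is to chain
the pieces together.  This is only a sketch: the iter_file helper and
the extra fileobj argument are made up so the example is self-contained,
and the (body, status, headers) order follows the example quoted above.

```python
import io
import itertools


def iter_file(fileobj, blocksize=8192):
    """Yield successive bytes chunks from a binary file-like object."""
    while True:
        chunk = fileobj.read(blocksize)
        if not chunk:
            break
        yield chunk


def file_app(environ, fileobj):
    status = b'200 OK'
    headers = [(b'Content-type', b'text/plain')]
    # chain() flattens the three pieces into one iterable of bytes,
    # so no element of the body is itself an iterable.
    body = itertools.chain([b'Header'], iter_file(fileobj), [b'Footer'])
    return body, status, headers


# BytesIO stands in for a file opened in binary mode.
body, status, headers = file_app({}, io.BytesIO(b'file contents'))
flat = list(body)
```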

- C




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough
On Thu, 2010-09-16 at 13:44 +0200, Tarek Ziadé wrote:
> On Thu, Sep 16, 2010 at 1:03 AM, Chris McDonough  wrote:
> > A PEP was submitted and accepted today for a WSGI successor protocol
> > named Web3:
> >
> > http://python.org/dev/peps/pep-0444/
> >
> > I'd encourage other folks to suggest improvements to that spec or to
> > submit a competing spec, so we can get WSGI-on-Python3 settled soon.
> 
> I have a request for the middleware stack. There should be one obvious
> way to get back to the original application, through the stack
> 
> Right now, I have to write crazy things like this depending on the stack:
> 
>   original_app = self.app.app.application.app
> 
> Because some middleware use "app", some "application" etc..
> 
> I propose to write in the PEP that a middleware should provide an
> "app" attribute to get the wrapped application or middleware.
> It seems to be the most common name used out there.

We can't really mandate this because middleware is not required to be an
instance.  It can be a function.
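
A short sketch of the difference, using a Web3-style callable that takes
only environ (all names here are illustrative):

```python
def app(environ):
    return [b'hello'], b'200 OK', [(b'Content-type', b'text/plain')]


class ClassMiddleware:
    """Instance-style middleware: the wrapped app is an attribute."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ):
        return self.app(environ)


def function_middleware(wrapped):
    """Function-style middleware: the wrapped app lives in a closure."""
    def middleware(environ):
        return wrapped(environ)
    return middleware


as_instance = ClassMiddleware(app)
as_function = function_middleware(app)
```

Both wrappers behave identically when called, but only the instance has
anywhere natural to hang an "app" attribute.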

- C




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough


On Sat, 2010-09-18 at 14:08 +0300, Ionel Maries Cristian wrote:
> There's a framework called cogen and it relies on this policy.

I've been told by a number of people (both async and sync people) that
WSGI is a poor protocol on top of which to develop async applications,
and they usually go on to say that async applications and servers really
should communicate over a separate (perhaps-WSGI-like) protocol.

I don't really know much about developing async web applications, but
frankly I'm loath to keep features in this thing that are only tolerated
(spat upon lightly! ;-))  by async folks, but which are also common
tripping points for people who never write async applications.

This is an apologetic way of saying "please find more champions for this
feature".

- C




> 
> -- ionel
> 
> On Sat, Sep 18, 2010 at 12:34, Ian Bicking 
> wrote:
> On Sat, Sep 18, 2010 at 5:03 AM, Marcel Hellkamp
>  wrote:
> 
> With WSGI it was possible to yield empty strings as
> long as the
> application is waiting for data and call
> start_response once the headers
> are final. Not perfect, but at least non-blocking.
> Web3 removes this
> possibility. The headers must be returned before the
> body iterable
> yielded its first element, empty or not.
> 
> Removing any support for this type of asynchronism
> would render web3
> useless for all but completely synchronous and trivial
> applications.
> Even frameworks would have no way to work around this
> anymore.
> 
> I'm aware of what a lot of people have done with WSGI, but I'm
> not aware of anyone doing an async proxy of any sort, or
> implementing anything in a way where this empty string policy
> served any function.  It's not implausible that it *could* be
> used, but years of practice have shown it is not used.
> 
> 
> 
> -- 
> Ian Bicking  |  http://blog.ianbicking.org
> 
> 
> ___
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe:
> http://mail.python.org/mailman/options/web-sig/ionel.mc%
> 40gmail.com
> 
> 
> ___
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough
On Fri, 2010-09-17 at 14:14 -0400, Ian Bicking wrote:

> OK, so maybe it should just be clarified:
> 
> * Middleware and servers should not modify or add Content-Length,
> Date, or other headers unless they have reason to do so, and they must
> ensure that the response is valid (e.g., there should never be two
> Content-Length headers).

I tried adding such a statement to a local copy of the specification,
but I wasn't able to really "nail" it.  If someone here can come up with
some unambiguous wording (defining "unless they have reason to do so"
and "other headers" above would be a good start), I'd just put it in.

> It still seems reasonable that *if* there is no Content-Length, and
> the server can guess easily enough (mostly it is returned an actual
> list/tuple that we know can be introspected fast and without side
> effects), then it's perfectly reasonable to set it -- but certainly
> the server doesn't "own" that header (or any other, except maybe some
> connection-related headers?).

I'm -0 on the server trying to guess the Content-Length header.  It just
doesn't seem like much of a burden to place on an application and it's
easier to specify that an application must do this than it is to specify
how a server should behave in the face of a missing Content-Length.  I
also believe Graham has argued against making the server guess; I
presume this causes him some pain somehow (probably underspecification
in WSGI).
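
For concreteness, the guessing Ian describes would amount to something
like the sketch below.  The function name is hypothetical, and this is
only one reading of "can guess easily enough".

```python
def guess_content_length(headers, body):
    """Return headers, adding Content-Length only when it is cheap to know."""
    # Never add a second Content-Length header.
    for name, _ in headers:
        if name.lower() == b'content-length':
            return headers
    # Only introspect real lists/tuples; consuming a generator here
    # could have side effects or exhaust the body.
    if not isinstance(body, (list, tuple)):
        return headers
    total = sum(len(chunk) for chunk in body)
    return headers + [(b'Content-Length', str(total).encode('ascii'))]
```

Even this small helper has to make several policy decisions, which is
part of why requiring the application to set the header is easier to
specify.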

- C





Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-19 Thread Chris McDonough
On Sun, 2010-09-19 at 21:52 -0400, Chris McDonough wrote:

> I'm -0 on the server trying to guess the Content-Length header.  It just
> doesn't seem like much of a burden to place on an application and it's
> easier to specify that an application must do this than it is to specify
> how a server should behave in the face of a missing Content-Length.  I
> also believe Graham has argued against making the server guess, I
> presume this causes him some pain somehow (probably underspecification
> in WSGI).

Graham's issues with requiring the server to set Content-Length are
detailed here:

http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html




Re: [Web-SIG] PEP 444 (aka Web3)

2010-09-21 Thread Chris McDonough
I have some pending changes to the PEP 444 spec (the working copy is at
http://github.com/mcdonc/web3/blob/master/pep-0444.rst but please don't
consider that canonical in any sense, it will change before an official
republication of the proposal).  The modifications fold in most of what
we've talked about on the list, or at least acknowledge the issues; a
change log is contained near the top.

However, I'm currently trying to work through what to do about
offering up quoted PATH_INFO and SCRIPT_NAME values (quoted in the
sense that, at least on platforms that support it, these would be the
original values before being run through urllib.unquote).

The current published proposal on Python.org indicates that these would
go into "web3.path_info" and "web3.script_name" but nobody seems to much
like that because it would make things like "path_info_pop" hard (the
code would need to keep two data structures in sync, and would need to
be pretty magical in the face of %2F markers).
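
A small sketch of why the %2F case matters, using urllib.parse.unquote
(the Python 3 spelling of urllib.unquote); the example path is made up:

```python
from urllib.parse import unquote

raw_path = '/files/a%2Fb/report'

# Splitting the quoted path first, then unquoting each segment,
# keeps "a/b" together as a single segment.
segments_from_raw = [unquote(seg) for seg in raw_path.split('/')[1:]]

# Unquoting first turns %2F into a real separator, so the
# original segment boundaries can no longer be recovered.
segments_from_unquoted = unquote(raw_path).split('/')[1:]
```

Here segments_from_raw has three entries and segments_from_unquoted has
four, so code that pops path segments would have to reconcile the two.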

The pending, unpublished proposal turns SCRIPT_NAME and PATH_INFO into
*quoted* values, and adds a ``web3.path_requoted`` flag for debugging
purposes, which will be True if the SCRIPT_NAME and/or PATH_INFO needed
to be recomposed and requoted (eg. on CGI platforms).  But private
conversations lead me to believe that not many folks will like this
either, because it commandeers CGI names that are well-understood to be
unquoted.

The only sensible way to break the deadlock seems to be to not use any
"CGI names" in the specification at all, so as not to break people's
expectations.  I know that when I change it to not use any CGI names, it
will be received poorly, but I can't think of a better idea.

- C

On Wed, 2010-09-15 at 19:03 -0400, Chris McDonough wrote:
> A PEP was submitted and accepted today for a WSGI successor protocol
> named Web3:
> 
> http://python.org/dev/peps/pep-0444/
> 
> I'd encourage other folks to suggest improvements to that spec or to
> submit a competing spec, so we can get WSGI-on-Python3 settled soon.
> 
> - C
> 
> 
> ___
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com
> 




Re: [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread Chris McDonough
On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
> While the Web-SIG is trying to hash out PEP 444, I thought it would 
> be a good idea to have a backup plan that would allow the Python 3 
> stdlib to move forward, without needing a major new spec to settle 
> out implementation questions.

If a WSGI-1-compatible protocol seems more sensible to folks, I'm
personally happy to defer discussion on PEP 444 or any other
backwards-incompatible proposal.

- C




Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-22 Thread Chris McDonough
For what it's worth, I'm happy with the changes made to WSGI 1 that
produced PEP 3333.

I'm unlikely to champion PEP 444 going forward.  It has already served
its primary duty to me personally (which was to catalyze the
formalization of some specification that is Python 3 inclusive).

However, Armin may feel differently about it, so this doesn't constitute
a withdrawal of PEP 444.  I'm instead just signaling my own personal
attitude: "don't really care as much now that there's something out
there".

On Fri, 2010-10-22 at 10:35 +1100, Graham Dumpleton wrote:
> Any one care to comment on my blog post?
> 
>   http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html
> 
> As far as web framework developers commenting, Armin at:
> 
>   
> http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/
> 
> has said:
> 
>   """Hopefully not. WSGI could do better and there is a proposal for
> that (444)."""
> 
> So, looks he is very cool on the idea.
> 
> No other developers of actual web frameworks have commented at all on
> PEP 3333 from what I can see.
> 
> Graham
> ___
> Web-SIG mailing list
> Web-SIG@python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/chrism%40plope.com
> 




Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-24 Thread Chris McDonough
On Sun, 2010-10-24 at 10:17 +0300, Armin Ronacher wrote:

> I have to admit that my interest in Python 3 is not very high and I am 
> most likely not the most reliable person when it comes to driving PEP 444 :)

We should probably withdraw the PEP, then (unless someone else wants to
step up and champion it), because neither am I.

- C




Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?

2010-10-24 Thread Chris McDonough
On Sun, 2010-10-24 at 17:16 +0200, Georg Brandl wrote:
> Am 24.10.2010 16:40, schrieb Chris McDonough:
> > On Sun, 2010-10-24 at 10:17 +0300, Armin Ronacher wrote:
> > 
> >> I have to admit that my interest in Python 3 is not very high and I am 
> >> most likely not the most reliable person when it comes to driving PEP 444 
> >> :)
> > 
> > We should probably withdraw the PEP, then (unless someone else wants to
> > step up and champion it), because neither am I.
> 
> Don't give it up yet -- Deferring is probably the better option.

TBH, unless someone has immediate interest in championing it, I'd rather
just withdraw it and let someone else resubmit it (or something like it)
later if they want.  It's just going to cause confusion if it's left in
a zombie state without a champion.

- C




Re: [Web-SIG] PEP 444

2010-11-21 Thread Chris McDonough
PEP 444 has no champion currently.  Both Armin and I have basically left
it behind.  It would be great if you wanted to be its champion.

- C

On Sun, 2010-11-21 at 03:12 -0800, Alice Bevan-McGregor wrote:
> (A version of this is available at http://web-core.org/2.0/pep-0444/ where
> links are links and code may be easier to read.)
> 
> PEP 444 is quite exciting to me.  So much so that I’ve been spending a few 
> days writing a high-performance (C10K, 10Krsec) Py2.6+/3.1+ HTTP/1.1 server 
> which implements much of the proposed standard.  The server is functional 
> (less web3.input at the time of this writing), but differs from PEP 444 in 
> several ways.  It also adds several features I feel should be part of the 
> spec.
> 
> Source for the server is available on GitHub:
> 
>   https://github.com/pulp/marrow.server.http
> 
> I have made several notes about the PEP 444 specification during 
> implementation of the above, and concern over some implementation details:
> 
> First, async is poorly defined:
> 
> > If the origin server advertises that it has the web3.async capability, a 
> > Web3 application callable used by the server is permitted to return a 
> > callable that accepts no arguments. When it does so, this callable is to be 
> > called periodically by the origin server until it returns a non-None 
> > response, which must be a normal Web3 response tuple.
> 
> Polling is not true async.  I believe that it should be up to the server to 
> define how async is utilized, and that the specification should be clarified 
> on this point.  (“Called periodically” is too vague.)  “Callable” should 
> likely be redefined as “generator” (a callable that yields) as most 
> applications require holding on to state and wrapping everything in 
> functools.partial() is somewhat ugly.  Utilizing generators would improve 
> support for existing Python async frameworks, and allow four modes of 
> operation: yield None (no response, keep waiting), yield response_tuple 
> (standard response), return / raise StopIteration (close the async 
> connection) and allow for data to be passed back to the async callable by the 
> higher-level async framework.
> 
> Second, WSGI middleware, while impressive in capability, are somewhat… 
> heavy-weight.  Heavily nesting function calls is wasteful of CPU and RAM, 
> especially if the middleware decides it can’t operate, for example, GZip 
> compression disabling itself for non-text/ mimetypes.  The majority of WSGI 
> middleware can, and probably should be, implemented as linear ingress or 
> egress filters.  For example, on-disk static file serving could be an ingress 
> filter, and GZip compression an egress filter.  m.s.http supports this 
> filtering and demonstrates one API for such.  Also, I am in the process of 
> writing an example egress CompressionFilter.
> 
> An example API and filter use implementation: (paraphrased from 
> marrow.server.http)
> 
> > # No filters, near 0 overhead.
> > for filter_ in ingress_filters:
> >     # Can mutate the environment.
> >     result = filter_(env)
> > 
> >     # Allow the filter to return a response rather than continuing.
> >     if result:
> >         # result is a status, headers, body_iter tuple
> >         return result[0], result[1], result[2]
> > 
> > status, headers, body = application(env)
> > 
> > for filter_ in egress_filters:
> >     # Can mutate the environment, status, headers, body, or
> >     # return completely new status, headers, and body.
> >     status, headers, body = filter_(env, status, headers, body)
> > 
> > return status, headers, body
> 
> The environment has some minor issues.  I’ll write up my changes in RFC-style:
> 
> SERVER_NAME is REQUIRED and MUST contain the DNS name of the server OR 
> virtual server name for the web server if available OR an empty bytestring if 
> DNS resolution is unavailable.  SERVER_ADDR is REQUIRED and MUST contain the 
> web server’s bound IP address.  URL reconstruction SHOULD use HTTP_HOST if 
> available, SERVER_NAME if there is no HTTP_HOST, and fall back on SERVER_ADDR 
> if SERVER_NAME is an empty bytestring.
> 
> CONTENT_LENGTH is REQUIRED and MUST be None if not defined by the client.  
> Testing explicitly for None is more efficient than armoring against missing 
> values; also, explicit is better than implicit.  (Paste’s WSGI1 server 
> defines CONTENT_LENGTH as 0, but this implies the client explicitly declared 
> it as zero, which is not the case.)
> 
> FRAGMENT and PARAMETERS are REQUIRED and are parsed out of the URL in the 
> same way as the QUERY_STRING. FRAGMENT is the text after a hash mark (a.k.a. 
> “anchor” to browsers, e.g. /foo#bar). PARAMETERS come before QUERY_STRING, 
> and after PATH_INFO separated by a semicolon, e.g. /foo;bar?baz.  Both values 
> MUST be empty bytestrings if not present in the URL. (Rarely used — I’ve only 
> seen it in Java and ColdFusion applications — but still useful.)
> 
> Points of contention:
> 
> Changing the namesp

Re: [Web-SIG] PEP 444

2010-11-21 Thread Chris McDonough
On Sun, 2010-11-21 at 09:32 -0800, Alice Bevan-McGregor wrote:
> > PEP 444 has no champion currently.  Both Armin and I have basically left it 
> > behind.  It would be great if you wanted to be its champion.
> 
> Done.
> 
> As I already have a functional, performant HTTP server[1] and example 
> filter[2] (compression) utilizing a slightly modified version of PEP 444, and 
> hope to be giving a presentation on its design and related utilities[3] early 
> next year, I’d love to have the opportunity to directly shape its future.  My 
> server may be a bit large to be a reference implementation, but until it has 
> its first user I have the benefit of being able to experiment whole-heartedly 
> with features and proposals.
> 
> Since Python 3 was released I haven’t heard of much forward-progress in 
> getting web frameworks compatible.  The largest complaint I’ve heard is that 
> there are too few things already ported, which is a chicken and the egg 
> problem.  This is one scenario where re-inventing the wheel may be the only 
> way to see forward movement.  So far, I seem to be buckling down and Getting 
> Things Done™ in this regard.
> 
> How would I go about getting access to the PEP in order to fix the issues 
> I’ve been catching up on?  (I’ve been reading through quite a bit of old 
> mailing list traffic these last few hours in-between writing docs and unit 
> tests for the compression egress filter.)

Georg Brandl has thus far been updating the canonical PEP on python.org.
I don't know how you get access to that.  My working copy is at
https://github.com/mcdonc/web3 .

- C




Re: [Web-SIG] PEP 444

2010-11-22 Thread Chris McDonough
On Mon, 2010-11-22 at 00:08 -0800, Alice Bevan-McGregor wrote:
> Would you prefer to give me collaboration permissions on your repo, or
> should I fork it?

Please fork it or create another repository entirely. I have no plans to
do more work on it personally, so I don't think it should really be
associated with me.  To that end, I think I'd prefer my name to either
be off the PEP entirely or just listed as a helper or typist or
something. ;-)

- C


> 
> This message was sent from a mobile device. Please excuse any
> terseness and spelling or grammatical errors. If additional
> information is indicated it will be sent from a desktop computer as
> soon as possible. Thank you.
> 
> On 2010-11-21, at 11:40 PM, Chris McDonough  wrote:
> 
> 
> > Georg Brandl has thus far been updating the canonical PEP on
> > python.org.
> > I don't know how you get access to that.  My working copy is at
> > https://github.com/mcdonc/web3 .
> 




Re: [Web-SIG] PEP 444 != WSGI 2.0

2011-01-02 Thread Chris McDonough
On Sun, 2011-01-02 at 09:21 -0800, Guido van Rossum wrote:
> Graham, I hope that you can stop being grumpy about the process that
> is being followed and start using your passion to write up a critique
> of the technical merits of Alice's draft. You don't have to attack the
> whole draft at once -- you can start by picking one or two important
> issues and try to guide a discussion here on web-sig to tease out the
> best solutions.  Please  understand that given the many different ways
> people use and implement WSGI there may be no perfect solution within
> reach -- writing a successful standard is the art of the compromise.
> (If you still think the process going forward should be different,
> please write me off-list with your concerns.)
> 
> Everyone else on this list, please make a new year's resolution to
> help the WSGI 2.0 standard become a reality in 2011.

I think Graham mostly has an issue with this thing being called "WSGI
2".

FTR, avoiding naming arguments is why I titled the original PEP "Web3".
I knew that if I didn't (even though personally I couldn't care less if
it was called Buick or McNugget), people would expend effort arguing
about the name rather than concentrate on the process of creating a new
standard.  They did anyway of course; many people argued publicly
wishing to rename Web3 to WSGI2.  On balance, though, I think giving the
standard a "neutral" name before it's widely accepted as a WSGI
successor was (and still is) a good idea, if only as a conflict
avoidance strategy. ;-)

That said, I have no opinion on the technical merits of the new PEP 444
draft; I've resigned myself to using derivatives of PEP  "forever".
It's good enough.  Most of the really interesting stuff seems to happen
at higher levels anyway, and the benefit of a new standard doesn't
outweigh the angst caused by trying to reach another compromise.  I'd
suggest we just embrace it, adding minor tweaks as necessary, until we
reach some sort of technical impasse it doesn't address.

- C




[Web-SIG] wsgi server...

2011-12-26 Thread Chris McDonough
Does anyone know of a pure-Python WSGI server that:

- Is distributed independently from a web framework or larger whole.

- Runs on UNIX and Windows.

- Runs on both Python 2 and Python 3.

- Has good test coverage.

- Is useful in production.

(I sent this already to the Pylons-discuss maillist and got some good
responses, so not ignoring those, just want to ask a wider audience)

Thanks!

- C



[Web-SIG] PEP3333 and PATH_INFO

2012-01-03 Thread Chris McDonough
Perennial topic, it seems, from the archives.

As far as I can tell from PEP , every WSGI application that wants to
run on both Python 2 and Python 3 and which uses PATH_INFO will need to
define a helper function something like this:

"""
import sys

def decode_path_info(environ, encoding='utf-8'):
    PY3 = sys.version_info[0] == 3
    path_info = environ['PATH_INFO']
    if PY3:
        return path_info.encode('latin-1').decode(encoding)
    else:
        return path_info.decode(encoding)
"""

Is there a more elegant way to handle this?
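
For illustration, here is a sketch of the round trip that helper relies on,
written for Python 3 only.  It assumes the server decoded the raw path bytes
as latin-1 before placing them in the environ; the sample path and the
environ dict are made up for the example.

```python
def decode_path_info(environ, encoding='utf-8'):
    # On Python 3 the server hands over PATH_INFO as a str whose code
    # points are really bytes (latin-1).  Re-encode to recover the bytes,
    # then decode them with the encoding the application expects.
    return environ['PATH_INFO'].encode('latin-1').decode(encoding)


# Hypothetical request for "/café", sent on the wire as UTF-8 bytes.
wire_bytes = '/caf\u00e9'.encode('utf-8')

# What a server following the latin-1 rule would put in the environ.
environ = {'PATH_INFO': wire_bytes.decode('latin-1')}

decoded = decode_path_info(environ)
```

The value in the environ is not the readable path; only after the
encode/decode step does the application see the text it expects.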

- C




Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-20 Thread Chris McDonough
On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:
> The standard way to do this would be to define an "optional server
> extension" API supplied in the environ; for example, a
> 'x-wsgiorg.register_shutdown' function.

Unlikely, AFAICT, as shutdown may happen when no request is active.
Even if this somehow happened to not be the case, asking the application
to put it in the environ is not useful, as the environ can't really be
relied on to retain values "up" the call stack.

- C


>   The wsgi.org wiki used to be the place to propose these sorts of
> things for standardization, but it appears to no longer be a wiki, so
> the mailing list is probably a good place to discuss such a proposal.
> 
> On Mon, Feb 20, 2012 at 2:30 PM, Tarek Ziadé 
> wrote:
> oops my examples were broken, should be:
> 
> def hello_world_app(environ, start_response):
>     status = '200 OK' # HTTP Status
>     headers = [('Content-type', 'text/plain')]
>     start_response(status, headers)
>     return ["Hello World"]
> 
> def shutdown():   # or maybe something else as an argument I don't know
>     do_some_cleanup()
> 
> 
> 
> and:
> 
> $ gunicorn myapp:hello_world_app myapp:shutdown
> 
> 
> 
> Cheers
> Tarek
> 




Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-20 Thread Chris McDonough
On Mon, 2012-02-20 at 20:54 -0500, PJ Eby wrote:
> 2012/2/20 Chris McDonough 
> On Mon, 2012-02-20 at 17:39 -0500, PJ Eby wrote:
> > The standard way to do this would be to define an "optional
> server
> > extension" API supplied in the environ; for example, a
> > 'x-wsgiorg.register_shutdown' function.
> 
> 
> Unlikely, AFAICT, as shutdown may happen when no request is
> active.
> Even if this somehow happened to not be the case, asking the
> application
> to put it in the environ is not useful, as the environ can't
> really be
> relied on to retain values "up" the call stack.
> 
> 
> "Optional server extension APIs" are things that the server puts in
> the environ, not things the app puts there.  That's why it's
> 'register_shutdown', e.g.
> environ['x-wsgiorg.register_shutdown'](shutdown_function).  

I get it now, but it's still not the right thing I don't think.  Servers
shut down without issuing any requests at all.
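
A rough sketch of the concern, assuming a toy server that supplies the
proposed 'x-wsgiorg.register_shutdown' key.  ToyServer and its methods are
hypothetical and only exist for this example; no real server is known to
implement the extension this way.

```python
class ToyServer:
    """Hypothetical server, only here to illustrate the extension idea."""

    def __init__(self, app):
        self.app = app
        self.shutdown_callbacks = []

    def handle_request(self, environ):
        # The server, not the application, supplies the registration hook.
        environ['x-wsgiorg.register_shutdown'] = self.shutdown_callbacks.append
        return list(self.app(environ, lambda status, headers: None))

    def shutdown(self):
        for callback in self.shutdown_callbacks:
            callback()


cleaned_up = []


def do_some_cleanup():
    cleaned_up.append(True)


def app(environ, start_response):
    # The app can only register once it has been handed an environ,
    # which means once a request has arrived.
    register = environ.get('x-wsgiorg.register_shutdown')
    if register is not None:
        register(do_some_cleanup)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']


idle_server = ToyServer(app)
idle_server.shutdown()  # no request ever arrived, so nothing was registered
```

In this sketch the cleanup function is never called for the idle server,
because the only place the application could register it is inside a
request.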

- C





Re: [Web-SIG] A 'shutdown' function in WSGI

2012-02-22 Thread Chris McDonough
On Wed, 2012-02-22 at 09:06 +1100, Graham Dumpleton wrote:
> If you want to be able to control a thread like that from an atexit
> callback, you need to create the thread as daemonised. Ie.
> setDaemon(True) call on thread.
> 
> By default a thread will actually inherit the daemon flag from the
> parent. For a command line Python where thread created from main
> thread it will not be daemonised and thus why the thread will be
> waited upon on shutdown prior to atexit being called.
> 
> If you ran the same code in mod_wsgi, my memory is that the thread
> will actually inherit as being daemonised because request handler in
> mod_wsgi, from which import is triggered, are notionally daemonised.
> 
> Thus the code should work in mod_wsgi. Even so, to be portable, if
> wanting to manipulate thread from atexit, make it daemonised.
> 
> Example of background threads in mod_wsgi at:
> 
> http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode#Monitoring_For_Code_Changes
> 
> shows use of setDaemon().
> 
> Graham

I've read all the messages in this thread and the traffic on the bug
entry at http://bugs.python.org/issue14073 but I'm still not sure what
to tell people who want to invoke code "at shutdown".

Do we tell them to use atexit?  If so, are we saying that atexit is
sufficient for all user-defined shutdown code that needs to run save for
code that needs to stop threads?

Is it sufficient to define "shutdown" as "when the process associated
with the application exits"?  It still seems to not necessarily be
directly correlated.
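
For reference, a minimal sketch of the atexit-plus-daemon-thread pattern
Graham describes, written for Python 3, where daemon=True is the current
spelling of setDaemon(True).  The worker body, the Event and the timeout
values are assumptions made for the example.

```python
import atexit
import threading

stop_event = threading.Event()


def worker():
    # Wake up periodically until asked to stop.
    while not stop_event.wait(0.05):
        pass


# Daemonised, so interpreter exit does not block waiting for the thread
# before the atexit callbacks get a chance to run.
thread = threading.Thread(target=worker, daemon=True)
thread.start()


def shutdown():
    # Safe to call more than once: setting an Event twice is harmless.
    stop_event.set()
    thread.join(timeout=1.0)


atexit.register(shutdown)
```

This only covers the "process exits" definition of shutdown; it says
nothing about an application being unloaded while the process lives on.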

- C




[Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough
I've also been putting a bit of thought into middleware configuration,
although maybe in a different direction.  I'm not too concerned yet
about being able to introspect the configuration of an individual
component.  Maybe that's because I haven't thought about the problem
enough to be concerned about it.  In the meantime, though, I *am*
concerned about being able to configure a middleware "pipeline" easily
and have it work.

I've been attempting to divine a declarative way to configure a pipeline
of WSGI middleware components.  This is simple enough through code,
except that at least in terms of how I'm attempting to factor my
middleware, some components in the pipeline may have dependencies on
other pipeline components.

For example, it would be useful in some circumstances to create separate
WSGI components for user identification and user authorization.  The
process of identification -- obtaining user credentials from a request
-- and user authorization  -- ensuring that the user is who he says he
is by comparing the credentials against a data source -- are really
pretty much distinct operations.  There might also be a "challenge"
component which forces a login dialog.

In practice, I don't know if this is a truly useful separation of
concerns that need to be implemented in terms of separate components in
the middleware pipeline (I see that paste.login conflates them), it's
just an example.  But at very least it would keep each component simpler
if the concerns were factored out into separate pieces.

But in the example I present, the "authentication" component depends
entirely on the result of the "identification" component.  It would be
simple enough to glom them together by using a distinct environment key
for the identification component results and have the authentication
component look for that key later in the middleware result chain, but
then it feels like you might as well have written the whole process
within one middleware component because the coupling is pretty strong.
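
To make the coupling concrete, a sketch of the two components sharing an
environ key.  The key names, the X-Login / X-Password request headers and
the in-memory password table are all invented for the example.

```python
CREDENTIALS_KEY = 'example.identification.credentials'  # hypothetical key
USER_KEY = 'example.authentication.user'                # hypothetical key


class IdentificationMiddleware:
    """Pulls credentials out of the request and records them."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        login = environ.get('HTTP_X_LOGIN')        # made-up header
        password = environ.get('HTTP_X_PASSWORD')  # made-up header
        if login is not None and password is not None:
            environ[CREDENTIALS_KEY] = (login, password)
        return self.app(environ, start_response)


class AuthenticationMiddleware:
    """Checks recorded credentials against a data source."""

    def __init__(self, app, passwords):
        self.app = app
        self.passwords = passwords

    def __call__(self, environ, start_response):
        # This lookup is the dependency on the identification component.
        credentials = environ.get(CREDENTIALS_KEY)
        if credentials is not None:
            login, password = credentials
            if self.passwords.get(login) == password:
                environ[USER_KEY] = login
        return self.app(environ, start_response)


def whoami_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ.get(USER_KEY, 'anonymous').encode('utf-8')]


pipeline = IdentificationMiddleware(
    AuthenticationMiddleware(whoami_app, {'chris': 'secret'}))
```

If the two components are wired in the opposite order, the authentication
step never sees any credentials and every request stays anonymous, which
is the placement dependency at issue.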

I have a feeling that adapters fit in here somewhere, but I haven't
really puzzled that out yet.  I'm sure this has been discussed somewhere
in the lifetime of WSGI but I can't find much in this list's archives.

> Lately I've been thinking about the role of Paste and WSGI and
> whatnot. Much of what makes a Paste component Pastey is
> configuration;  otherwise the bits are just independent pieces of
> middleware, WSGI applications, etc.  So, potentially if we can agree
> on configuration, we can start using each other's middleware more
> usefully.
>
> I think we should avoid questions of configuration file syntax for
> now.  Let's instead simply consider configuration consumers.  A
> standard would consist of:
>
> * A WSGI environment key (e.g., 'webapp01.config')
> * A standard for what goes in that key (e.g., a dictionary object)
> * A reference implementation of the middleware
> * Maybe a non-WSGI-environment way to access the configuration (like 
> paste.CONFIG, which is a global object that dispatches to per-request 
> configuration objects) -- in practice this is really really useful, as 
> you don't have to pass the configuration object around.
>
> There's some other things we have to consider, as configuration syntaxes 
> do affect the configuration objects significantly.  So, the standard for 
> what goes in the key has to take into consideration some possible 
> configuration syntaxes.
>
> The obvious starting place is a dictionary-like object.  I would suggest 
> that the keys should be valid Python identifiers.  Not all syntaxes 
> require this, but some do.  This restriction simply means that 
> configuration consumers should try to consume Python identifiers.
>
> There's also a question about name conflicts (two consumers that are 
> looking for the same key), and whether nested configuration should be 
> preferred, and in what style.
>
> Note that the standard we decide on here doesn't have to be the only way 
> the object can be accessed.  For instance, you could make your 
> configuration available through 'myframework.config', and create a 
> compliant wrapper that lives in 'webapp01.config', perhaps even doing 
> different kinds of mapping to fix convention differences.
>
> There's also a question about what types of objects we can expect in the 
> configuration.  Some input styles (e.g., INI and command line) only 
> produce strings.  I think consumers should treat strings (or maybe a 
> special string subclass) specially, performing conversions as necessary 
> (e.g., 'yes'->True).
>
> Thoughts?





Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough
On Sat, 2005-07-16 at 23:29 -0500, Ian Bicking wrote:
> There's nothing in WSGI to facilitate introspection.  Sometimes that 
> seems annoying, though I suspect lots of headaches are removed because 
> of it, and I haven't found it to be a stopper yet.  The issue I'm 
> interested in is just how to deliver configuration to middleware.

Whew, I hoped you'd respond. ;-)

It appears that I haven't gotten as far as to want introspection into
the implementation or configuration of a middleware component.  Instead,
I want the ability to declaratively construct a pipeline out of largely
opaque and potentially interdependent (but loosely coupled) WSGI
middleware components, which is another problem entirely.  It seemed
cogent, so I just somewhat belligerently coopted this thread, sorry!

> Because middleware can't be introspected (generally), this makes things 
> like configuration schemas very hard to implement.  It all needs to be 
> late-bound.

The pipeline itself isn't really late bound.  For instance, if I was to
create a WSGI middleware pipeline something like this:

   server <--> session <--> identification <--> authentication <--> 
   <--> challenge <--> application

... session, identification, authentication, and challenge are
middleware components (you'll need to imagine their implementations).
And within a module that started a server, you might end up doing
something like:

def configure_pipeline(app):
    return SessionMiddleware(
        IdentificationMiddleware(
            AuthenticationMiddleware(
                ChallengeMiddleware(app))))

if __name__ == '__main__':
    app = Application()
    pipeline = configure_pipeline(app)
    server = Server(pipeline)
    server.serve()

The pipeline is static.  When a request comes in, the pipeline itself is
already constructed.  I don't really want a way to prevent "improper"
pipeline construction at startup time (right now anyway), because
failures due to missing dependencies will be fairly obvious.

But some elements of the pipeline at this level of factoring do need to
have dependencies on availability and pipeline placement of the other
elements.  In this example, proper operation of the authentication
component depends on the availability and pipeline placement of the
identification component.  Likewise, the identification component may
depend on values that need to be retrieved from the session component.

I've just seen Phillip's post where he implies that this kind of
fine-grained component factoring wasn't really the initial purpose of
WSGI middleware.  That's kind of a bummer. ;-)

Factoring middleware components in this way seems to provide clear
demarcation points for reuse and maintenance.  For example, I imagined a
declarative security module that might be factored as a piece of
middleware here:  http://www.plope.com/Members/chrism/decsec_proposal .

Of course, this sort of thing doesn't *need* to be middleware.  But
making it middleware feels very right to me in terms of being able to
deglom nice features inspired by Zope and other frameworks into pieces
that are easy to recombine as necessary.  Implementations as WSGI
middleware seems a nice way to move these kinds of features out of our
respective applications and into more application-agnostic pieces that
are very loosely coupled, but perhaps I'm taking it too far.

> > For example, it would be useful in some circumstances to create separate
> > WSGI components for user identification and user authorization.  The
> > process of identification -- obtaining user credentials from a request
> > -- and user authorization  -- ensuring that the user is who he says he
> > is by comparing the credentials against a data source -- are really
> > pretty much distinct operations.  There might also be a "challenge"
> > component which forces a login dialog.
> 
> I've always thought that a 401 response is a good way of indicating 
> that, but not everyone agrees.  (The idea being that the middleware 
> catches the 401 and possibly translates it into a redirect or something.)

Yep.  That'd be a fine signaling mechanism.
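
A sketch of what such a challenge component might look like, assuming the
simplest possible treatment: buffer the wrapped app's response, and turn a
401 into a redirect.  The class name matches the pipeline above, but the
login_url parameter is invented, and the sketch ignores the write()
callable and close() handling that real middleware would need.

```python
class ChallengeMiddleware:
    def __init__(self, app, login_url='/login'):
        self.app = app
        self.login_url = login_url  # hypothetical parameter

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold on to the status instead of passing it straight through.
            captured['status'] = status
            captured['headers'] = headers

        # Buffers the whole body; acceptable for a sketch only.
        body = list(self.app(environ, capture))

        if captured['status'].startswith('401'):
            # Translate the 401 signal into a redirect to a login page.
            start_response('302 Found', [('Location', self.login_url)])
            return [b'']

        start_response(captured['status'], captured['headers'])
        return body
```

The wrapped application only has to answer 401; it does not need to know
whether the challenge ends up being a redirect, a form or a basic-auth
dialog.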

> > In practice, I don't know if this is a truly useful separation of
> > concerns that need to be implemented in terms of separate components in
> > the middleware pipeline (I see that paste.login conflates them), it's
> > just an example.  
> 
> Do you mean identification and authentication (you mention authorization 
> above)? 

Aggh.  Yes, I meant to write authentication, sorry.

>  I think authorization is different, and is conflated in 
> paste.login, but I don't have many use cases where it's a useful 
> distinction.  I guess there's a number of ways of getting a username and 
> password; and to some degree the  authenticator object works at that 
> level of abstraction.  And there's a couple other ways of authenticating 
> a user as well (public keys, IP address, etc).  I've generally used a 
> "user manager" object for this kind of abstraction, with subclassing f

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Chris McDonough
On Sun, 2005-07-17 at 03:16 -0500, Ian Bicking wrote:
> This is what Paste does in configuration, like:
> 
> middleware.extend([
>  SessionMiddleware, IdentificationMiddleware,
>  AuthenticationMiddleware, ChallengeMiddleware])
> 
> This kind of middleware takes a single argument, which is the 
> application it will wrap.  In practice, this means all the other 
> parameters go into lazily-read configuration.

I'm finding it hard to imagine a reason to have another kind of
middleware.

Well, actually that's not true.  In noodling about this, I did think it
would be kind of neat in a twisted way to have "decision middleware"
like:

class DecisionMiddleware:
    def __init__(self, apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        app = self.choose(environ)
        for chunk in app(environ, start_response):
            yield chunk

    def choose(self, environ):
        return some_decision_function(self.apps, environ)

I can imagine using this pattern as a decision point for a WSGI pipeline
serving multiple application end-points (perhaps based on URL matching
of the PATH_INFO in environ).
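
One possible shape for the decision function, assuming "apps" is a dict
mapping URL prefixes to WSGI applications.  That mapping, the
longest-prefix rule and the sample apps are assumptions for the example,
not something the pattern requires.

```python
def some_decision_function(apps, environ):
    path = environ.get('PATH_INFO', '')
    # Try the longest prefix first so '/blog/admin' wins over '/blog'.
    for prefix in sorted(apps, key=len, reverse=True):
        if path == prefix or path.startswith(prefix.rstrip('/') + '/'):
            return apps[prefix]
    raise LookupError('no application for %r' % path)


class DecisionMiddleware:
    def __init__(self, apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        app = some_decision_function(self.apps, environ)
        return app(environ, start_response)


def make_app(text):
    # Tiny factory for sample endpoint applications.
    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [text]
    return app


decider = DecisionMiddleware({
    '/': make_app(b'default'),
    '/blog': make_app(b'blog'),
    '/blog/admin': make_app(b'admin'),
})
```

Every application in the mapping is constructed before the first request;
only the choice between them is made per request.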

But by and large, most middleware components seem to be just wrappers
for the next application in the chain.  There seem to be two types of
middleware that take a single application object as a parameter to their
constructor.  There is "decorator" middleware where you want to add
something to the environment for an application to find later and
"action" middleware that does some rewriting of the body or the response
headers before the response is sent back to the client.  Some of this
kind of middleware does both.
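
A sketch of the two flavors side by side.  The environ key and the
response header are invented for the example.

```python
class DecoratorMiddleware:
    """Adds something to the environ for the application to find later."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ['example.greeting'] = 'hello'  # hypothetical key
        return self.app(environ, start_response)


class ActionMiddleware:
    """Rewrites response headers on the way back to the client."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Copy the list so the application's own list is left alone.
            headers = list(headers) + [('X-Example', 'added')]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


def greeting_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ.get('example.greeting', 'none').encode('utf-8')]


wrapped = ActionMiddleware(DecoratorMiddleware(greeting_app))
```

Both take a single application in their constructor, so either can wrap
the other without any special wiring.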

> You can also define a "framework" (a plugin to Paste), which in addition 
> to finding an "app" can also add middleware; basically embodying all the 
> middleware that is typical for a framework.

This appears to be what I'm trying to do too, which is why I'm intrigued
by Paste.

OTOH, I'm not sure that I want my framework to "find" an app for me.
I'd like to be able to define pipelines that include my app, but I'd
typically just want to statically declare it as the end point of a
pipeline composed of service middleware.  I should look at Paste a
little more to see if it has the same philosophy or if I'm
misunderstanding you.

> Paste is really a deployment configuration.  Well, that as well as stuff 
> to deploy.  And two frameworks.  And whatever else I feel a need or 
> desire to throw in there.

Yeah.  FWIW, as someone who has recently taken a brief look at Paste, I
think it would be helpful (at least for newbies) to partition out the
bits of Paste which are meant to be deployment configuration from the
bits that are meant to be deployed.  Zope 2 fell into the same trap
early on, and never recovered.  For example, ZPublisher (nee Bobo) was
always meant to be able to be useful outside of Zope, but in practice it
never happened because nobody could figure out how to disentangle it
from its ever-increasing dependencies on other software only found in a
Zope checkout.  In the end, nobody even remembered what its dependencies
were *supposed* to be.  If you ask ten people, you'd get ten different
answers.

I also think that the rigor of separating out different components helps
to make the software stronger and more easily understood in bite-sized
pieces.  Unfortunately, separating them makes configuration tough, but I
think that's what we're trying to find an answer about how to do "the
right way" here.

> Note also that parts of the pipeline are very much late bound.  For 
> instance, the way I implemented Webware (and Wareweb) each servlet is a 
> WSGI application.  So while there's one URLParser application, the 
> application that actually handles the request differs per request.  If 
> you start hanging more complete applications (that might have their own 
> middleware) at different URLs, then this happens more generally.

Well, if you put the "decider" in middleware itself, all of the
middleware components in each pipeline could still be at least
constructed early.  I'm pretty sure this doesn't really strictly qualify
as "early binding" but it's not terribly dynamic either.  It also makes
configuration pretty straightforward.  At least I can imagine a
declarative syntax for configuring pipelines this way.

I'm pretty sure you're not advocating it, but in case you are, I'm not
sure it adds as much value as it removes to be able to have a "dynamic"
middleware chain whereby new middleware elements can be added "on the
fly" to a pipeline after a request has begun.  That is *very* "late
binding" to me and it's impossible to configure declaratively.

> > But some elements of the pipeline at this level of factoring do need to
> > have dependencies on availability and pipeline placement of the other
> > elements.  In this example, proper operation of the authentication
> > component depends on the availability and pipeline placement of the
> > identification com

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Chris McDonough
I tried to think of this today in terms of creating a "deployment spec"
but boy, it gets complicated if you want a lot of useful features out of
it.  I have about four or five pages of a straw man "deployment
configuration" proposal, but it makes way too many assumptions.

So I tried to boil the problem down into its parts.  There seem to be
three distinct categories of configuration:

- Server/gateway/application instance configuration.  This is the
  kind of configuration that may be exposed to deployers by
  application authors.  Creating an instance configuration results
  in an instance of an application or gateway or maybe even
  a server.

- "Wiring" configuration which allows you to string together a
  "stack" out of instances.   I like calling it a "pipeline" better,
  but when in Rome... This is the kind of configuration that
  would be useful if you already have a bunch of instance configurations
  from the step above laying around and you want to create a stack
  out of them for deployment purposes.

- "Service" configuration which allows you to create bits of 
  context that can be used by applications in the stack, but which
  aren't inserted into the stack itself.

I suspect we should stick to the first category of configuration first,
but I'll note that the desire for the other two categories might impose
some design constraints on the first.  The last kind of configuration
definitely ventures far out into framework land and though it'd be
terribly useful and seems to be where a lot of people think the value of
WSGI is, it might be something other than WSGI entirely.

So, anyway, towards the first category, I'll throw something out to the
wolves.  Note that below when I say "component" I mean a WSGI server,
gateway, or application:

  Each Python package which includes one or more WSGI components may
  optionally include descriptions of these components'
  "meta-configuration".  This meta-configuration would take the form
  of one or more "schemas".  Each schema would enumerate the
  configurable elements of a single WSGI component implementation.
  A schema for a component defines *the minimal number* of typed,
  component-specific keys and values that may be used to create
  instances of this component.

  >>> # load the schemas
  >>> server_schema  = loadSchema('components/server/server.schema')
  >>> gateway_schema = loadSchema('components/gateway/gateway.schema')
  >>> app_schema = loadSchema('components/app/app.schema')

  >>> # create the factories; any one of these steps would fail
  >>> # if the config file violated its schema.
  >>> server_factory  = loadConfig('instances/server/server.conf',
  ...                              schema=server_schema)
  >>> gateway_factory = loadConfig('instances/gateway/gateway.conf',
  ...                              schema=gateway_schema)
  >>> app_factory = loadConfig('instances/app/app.conf',
  ...                          schema=app_schema)

  >>> # create instances from the factories
  >>> server = server_factory.create()
  >>> gateway = gateway_factory.create()
  >>> app = app_factory.create()

  >>> # configure the instances into a pipeline
  >>> pipeline = server(gateway(app))

  >>> # serve up the pipeline (notionally)
  >>> server.serve()

Of course this is just a more declarative way to do what is already
possible in code except for the schema-checking part, which presumably
would supply the deployer with clues if he had screwed up a config file.

I purposely didn't attempt to describe the syntax of the configuration
or schema files, but I suspect it would be best to make them both
ConfigParser files.  FWIW, ZConfig already does this exact thing, and
it's already written, but introducing dependencies on non-stdlib things
seems problematic.
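
As a very rough sketch of the schema-checking part using only the stdlib
(Python 3's configparser here), assuming a schema file is an INI section
mapping each key to a type name.  The [schema] and [instance] section
names, the type names and the load_schema / load_config helpers are all
invented for the example; they are not ZConfig's format or API.

```python
import configparser


def to_bool(value):
    lowered = value.strip().lower()
    if lowered in ('yes', 'true', 'on', '1'):
        return True
    if lowered in ('no', 'false', 'off', '0'):
        return False
    raise ValueError('not a boolean: %r' % value)


CONVERTERS = {'str': str, 'int': int, 'bool': to_bool}


def load_schema(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Map each configurable key to the converter for its declared type.
    return {key: CONVERTERS[type_name]
            for key, type_name in parser['schema'].items()}


def load_config(text, schema):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = dict(parser['instance'])
    unknown = set(raw) - set(schema)
    missing = set(schema) - set(raw)
    if unknown or missing:
        raise ValueError('unknown keys %s, missing keys %s'
                         % (sorted(unknown), sorted(missing)))
    # Converters raise ValueError when a value has the wrong type.
    return {key: schema[key](value) for key, value in raw.items()}


SERVER_SCHEMA = """
[schema]
host = str
port = int
debug = bool
"""

SERVER_CONF = """
[instance]
host = localhost
port = 8080
debug = yes
"""

settings = load_config(SERVER_CONF, load_schema(SERVER_SCHEMA))
```

A deployer who misspells a key or writes a non-numeric port gets a
ValueError at load time rather than a failure somewhere inside a request.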

Is this more or less what people have in mind for deployment
configuration or am I out in left field?

On Sun, 2005-07-17 at 13:56 -0400, Phillip J. Eby wrote:
> At 07:29 AM 7/17/2005 -0400, Chris McDonough wrote:
> >I'm a bit confused because one of the canonical examples of
> >how WSGI middleware is useful seems to be the example of implementing a
> >framework-agnostic sessioning service.  And for that sessioning service
> >to be useful, your application has to be able to depend on its
> >availability so it can't be "oblivious".
> 
> Exactly.  As soon as you start trying to have configured services, you are 
> creating Yet Another Framework.  Which isn't a bad thing per se, except 
> that it falls outside the scope of  PEP 333.  It deserves a separate PEP, I 
> think, and a separate implementation me

Re: [Web-SIG] Standardized configuration

2005-07-18 Thread Chris McDonough
On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
> In addition to the examples I gave in response to Graham, I wrote a 
> document on this a while ago: 
> http://pythonpaste.org/docs/url-parsing-with-wsgi.html
> 
> The hard part about this is configuration; it's easy to configure a 
> non-branching chain of middleware.  Once it branches the configuration 
> becomes hard (like programming-hard; which isn't *hard*, but it quickly 
> stops feeling like configuration).

Yep.  I think I'm getting it.  For example, I see that Paste's URLParser
seems to *construct* applications if they don't already exist based on
the URL.  And I assume that these applications could themselves be
middleware.  I don't think that is configurable declaratively if you
want to decide which app to use based on arbitrary request parameters.

But if we already had the config for each app "instance" that URLParser
wanted to consult laying around as files on disk, wouldn't it be just as
easy to construct these app objects "eagerly" at startup time?  Then
URLParser could choose an already-configured app based on some sort of
configuration file in the URLParser component itself.  The "apps"
themselves may be pipelines, too, I realize that, but that is still
configurable without coding.

Maybe there'd be some concern about needing to stop the process in order
to add new applications.  That's a use case I hadn't really considered.
I suspect this could be done with a signal handler, though, which could
tell the URLParser to reload its config file instead of potentially
locating and creating a new application within every request.

This would make URLParser a kind of "decision" middleware, but it would
choose from a static set of existing applications (or pipelines) for the
lifetime of the process as opposed to constructing them lazily.
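
A sketch of the signal-handler idea, assuming a stand-in for URLParser
that looks apps up by exact path.  ReloadingDecider, load_apps and install
are hypothetical names; SIGHUP is the conventional choice on UNIX and does
not exist on Windows, hence the hasattr check.

```python
import signal


class ReloadingDecider:
    def __init__(self, load_apps):
        # load_apps is a hypothetical callable that reads the config file
        # and returns a dict mapping paths to constructed applications.
        self.load_apps = load_apps
        self.apps = load_apps()
        self.reload_requested = False

    def request_reload(self, signum=None, frame=None):
        # Signal handlers should do very little; just set a flag.
        self.reload_requested = True

    def __call__(self, environ, start_response):
        if self.reload_requested:
            # Rebuild the set of apps once, not on every request.
            self.apps = self.load_apps()
            self.reload_requested = False
        app = self.apps[environ.get('PATH_INFO', '/')]  # exact match only
        return app(environ, start_response)


def install(decider):
    # Must be called from the main thread of the process.
    if hasattr(signal, 'SIGHUP'):
        signal.signal(signal.SIGHUP, decider.request_reload)
```

Between signals the set of applications stays fixed, so the per-request
work is a dictionary lookup rather than application construction.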

> > OTOH, I'm not sure that I want my framework to "find" an app for me.
> > I'd like to be able to define pipelines that include my app, but I'd
> > typically just want to statically declare it as the end point of a
> > pipeline composed of service middleware.  I should look at Paste a
> > little more to see if it has the same philosophy or if I'm
> > misunderstanding you.
> 
> Mostly I wanted to avoid lots of magical incantations for the simple 
> case.  If you are used to Webware, well it has a very straight-forward 
> way of finding your application -- you give it a directory name.  If 
> Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
> ZEO connection string, and so on.

I think I understand now.

In general, I think I'd rather create "instance" locations of WSGI
applications (which would essentially consist of a config file on disk
plus any state info required by the app), configure and construct Python
objects out of those instances eagerly at "startup time" and just choose
between already-constructed apps in "decision middleware" that has
its own declarative configuration, if decisions need to be made about
which app to use.

This is mostly because I want the configuration info to live within the
application/middleware instance and have some other "starter" import
those configurations from application/middleware instance locations on
the filesystem.  The "starter" would construct required instances as
Python objects, and chain them together arbitrarily based on some other
"pipeline configuration" file that lives with the "starter".  The first
part of that (construct required instances) is described in a post I
made to this list yesterday.

This is probably because I'd like there to be one well-understood way to
declaratively configure pipelines as opposed to each piece of middleware
potentially needing to manage app construction and having its own
configuration to do so.

I don't know if this is reasonable for simpler requirements.  This is
more of a "formal deployment spec" idea and of course is likely flawed
in some subtle way I don't understand yet.

> > I'm pretty sure you're not advocating it, but in case you are, I'm not
> > sure it adds as much value as it removes to be able to have a "dynamic"
> > middleware chain whereby new middleware elements can be added "on the
> > fly" to a pipeline after a request has begun.  That is *very* "late
> > binding" to me and it's impossible to configure declaratively.
> 
> I'm comfortable with a little of both.  I don't even know *how* I'd stop 
> dynamic middleware.  For instance, one of the methods I added to Wareweb 
> recently allows any servlet to forward to any WSGI application; but from 
> the outside the servlet looks like a normal WSGI application just like 
> before.

It's obviously fine if applications themselves want to do this.  I'm not
sure that it would be possible to create a "deployment spec" that
canonized *how* to do it because as you mentioned it's not really a
configuration task, it's a programming task.

> > I agree!  I'm a bit confused because one of the canonical examples of
> > how WSGI middleware i

Re: [Web-SIG] Standardized configuration

2005-07-22 Thread Chris McDonough
I've had a stab at creating a simple WSGI deployment implementation.
I use the term "WSGI component" in here as shorthand to indicate all
types of WSGI implementations (server, application, gateway).

The primary deployment concern is to create a way to specify the
configuration of an instance of a WSGI component, preferably within a
declarative configuration file.  A secondary deployment concern is to
create a way to "wire up" components together into a specific
deployable "pipeline".  

Here is a strawman implementation that addresses both concerns via a
"configurator" module, which would presumably live in "wsgiref".
Currently it lives in a package named "wsgiconfig" on my laptop.  The
module follows.

""" Configurator for establishing a WSGI pipeline """

from ConfigParser import ConfigParser
import types

def configure(path):
    """ Build a WSGI pipeline from the config file (or file object) at path """
    config = ConfigParser()
    if isinstance(path, types.StringTypes):
        config.readfp(open(path))
    else:
        config.readfp(path)

    appsections = []

    # only 'application:<name>' sections and a 'pipeline' section are allowed
    for name in config.sections():
        if name.startswith('application:'):
            appsections.append(name)
        elif name == 'pipeline':
            pass
        else:
            raise ValueError, '%s is not a valid section name' % name

    app_defs = {}

    # NB: ConfigParser.get raises NoOptionError for a missing option, so
    # the None checks below are only a defensive fallback
    for appsection in appsections:
        app_config_file = config.get(appsection, 'config')
        app_factory_name = config.get(appsection, 'factory')
        app_name = appsection.split('application:')[1]
        if app_config_file is None:
            raise ValueError, ('application section %s requires a "config" '
                               'option' % appsection)
        if app_factory_name is None:
            raise ValueError, ('application %s requires a "factory"'
                               ' option' % app_name)
        app_defs[app_name] = {'config':app_config_file,
                              'factory':app_factory_name}

    if not config.has_section('pipeline'):
        raise ValueError, 'must have a "pipeline" section in config'

    pipeline_str = config.get('pipeline', 'apps')
    if pipeline_str is None:
        raise ValueError, ('must have an "apps" definition in the '
                           'pipeline section')

    pipeline_def = pipeline_str.split()

    next = None

    # walk the pipeline from the end-point app backwards, handing each
    # app the one that follows it
    while pipeline_def:
        app_name = pipeline_def.pop()
        app_def = app_defs.get(app_name)
        if app_def is None:
            raise ValueError, ('appname %s is defined in pipeline '
                               '%s but no application is defined '
                               'with that name' % (app_name, pipeline_str))
        factory_name = app_def['factory']
        factory = import_by_name(factory_name)
        config_file = app_def['config']
        app_factory = factory(config_file)
        app = app_factory(next)
        next = app

    if not next:
        raise ValueError, 'no apps defined in pipeline'
    return next

def import_by_name(name):
    """ Resolve a dotted name to the module or object it refers to """
    if "." not in name:
        raise ValueError("unloadable name: " + repr(name))
    components = name.split('.')
    start = components[0]
    g = globals()
    package = __import__(start, g, g)
    modulenames = [start]
    for component in components[1:]:
        modulenames.append(component)
        try:
            package = getattr(package, component)
        except AttributeError:
            # not available as an attribute yet; import the submodule
            n = '.'.join(modulenames)
            package = __import__(n, g, g, component)
    return package
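
For comparison, a minimal sketch of the same dotted-name lookup on a
current Python 3, using importlib.  This is not part of the module above;
it is an assumption about how one might write it today, and it only
handles "module" or "module.attribute" names rather than walking
arbitrarily nested attributes the way the loop above does.

```python
import importlib


def import_by_name(name):
    """Resolve a dotted name to a module, or to an attribute of a module."""
    if "." not in name:
        raise ValueError("unloadable name: %r" % name)
    try:
        # The whole name may itself be an importable module.
        return importlib.import_module(name)
    except ImportError:
        # Otherwise treat the last component as an attribute.  Note that
        # this also hides import errors raised inside the module itself.
        module_name, _, attr = name.rpartition(".")
        module = importlib.import_module(module_name)
        try:
            return getattr(module, attr)
        except AttributeError:
            raise ValueError("unloadable name: %r" % name)
```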

  We configure a pipeline based on a config file, which
  creates and chains two "sample" WSGI applications together.

  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

    [application:sample1]
    config = sample1.conf
    factory = wsgiconfig.tests.sample_components.factory1

    [application:sample2]
    config = sample2.conf
    factory = wsgiconfig.tests.sample_components.factory2

    [pipeline]
    apps = sample1 sample2

  The configurator exposes a function, "configure", that accepts a
  single argument: the path of the config file (or an open file object).

    >>> from wsgiconfig.configurator import configure
    >>> appchain = configure('myapplication.conf')

  The "sample_components" module referred to in the
  'myapplication.conf' file application definitions might look like
  this::

      class sample1:
          """ middleware """
          def __init__(self, app):
              self.app = app
          def __call__(self, environ, start_response):
              environ['sample1'] = True
              return self.app(environ, start_response)

      class sample2:
          """ end-point app """
          def __init__(self, app):
              self.app = app

          def __call__(self, environ, start_response):
              environ[

Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Chris McDonough
On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:

> >   To do this, we use a ConfigParser-format config file named
> >   'myapplication.conf' that looks like this::
> > 
> > [application:sample1]
> > config = sample1.conf
> > factory = wsgiconfig.tests.sample_components.factory1
> > 
> > [application:sample2]
> > config = sample2.conf
> > factory = wsgiconfig.tests.sample_components.factory2
> > 
> > [pipeline]
> > apps = sample1 sample2
> 
> I think it's confusing to call both these applications.  I think 
> "middleware" or "filter" would be better.  I think people understand 
> "filter" far better, so I'm inclined to use that.  So...

The reason I called them applications instead of filters is that all
of them implement the WSGI "application" API (they all implement "a
callable that accepts two parameters, environ and start_response").
Some happen to be gateways/filters/middleware/whatever but at least one
is just an application and does no delegation.  In my example above,
"sample2" is not a filter, it is the end-point application.  "sample1"
is a filter, but it's of course also an application.

Would you maybe rather make it more explicit that some apps are also
gateways, e.g.:

[application:bleeb]
config = bleeb.conf
factory = bleeb.factory

[filter:blaz]
config = blaz.conf
factory = blaz.factory

?  I don't know that there's any way we could make use of the
distinction between the two types in the configurator other than
validating that nobody places an application "before" a filter in a
pipeline.  Is there something else you had in mind?

> [application:sample2]
> # What is this relative to?  I hate both absolute paths and
> # paths relative to pwd equally...
> config = sample1.conf
> factory = wsgiconfig...

This was from a doctest I wrote so I could rely on relative paths,
sorry.  You're right.  We could probably use the environment as
"defaults" for ConfigParser interpolation and set whatever we need
before the configurator is run:

$ export APP_ROOT=/home/chrism/myapplication
$ ./wsgi-configurator.py myapplication.conf

And in myapplication.conf:

[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory

That would probably be the least-effort and most flexible thing to do
and doesn't mandate any particular directory structure.  Of course, we
could provide a convention for a recommended directory structure, but
this gives us an "out" from being painted into that in specific cases.
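
A minimal sketch of the environment-as-defaults idea, assuming a current
Python 3 (where the module is spelled configparser).  The helper name
load_with_env_defaults is made up for illustration, and passing only a
chosen list of variables rather than all of os.environ is my own
assumption: arbitrary environment values may contain "%" characters that
would trip up interpolation.

```python
import configparser

CONF_TEXT = """
[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory
"""


def load_with_env_defaults(text, environ, names=("APP_ROOT",)):
    """Parse text, exposing selected environment variables as defaults."""
    # Hypothetical helper: only the variables listed in `names` are
    # passed through to the parser.
    defaults = {name: environ[name] for name in names if name in environ}
    parser = configparser.ConfigParser(defaults=defaults)
    parser.read_string(text)
    return parser


# In a real run `environ` would be os.environ; a plain dict keeps the
# example independent of the machine it runs on.
parser = load_with_env_defaults(
    CONF_TEXT, {"APP_ROOT": "/home/chrism/myapplication"}
)
config_path = parser.get("application:sample1", "config")
```

Option names are case-insensitive by default, so "%(APP_ROOT)s" still
resolves against the APP_ROOT default.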

> [pipeline]
> # The app is unique and special...?
> app = sample2
> filters = sample1
> 
> 
> 
> Well, that's just a first refactoring; I'm having other inclinations...

I'm not sure whether this is just a stylistic thing or if there's a
reason you want to treat the endpoint app specially.  By definition, in
my implementation, the endpoint app is just the last app mentioned in
the pipeline.
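
To make the "last app mentioned is the endpoint" convention concrete, here
is a small sketch in current Python 3 of the two-stage factory contract:
factory(config_file) returns a second factory, which takes the next app
and returns a WSGI app.  The factories and build_pipeline are hypothetical
stand-ins, not the configurator itself.

```python
def filter_factory(config_file):
    """Hypothetical first-stage factory for a filter (middleware)."""
    def app_factory(next_app):
        def app(environ, start_response):
            environ["seen_by"] = config_file
            return next_app(environ, start_response)
        return app
    return app_factory


def endpoint_factory(config_file):
    """Hypothetical first-stage factory for the end-point app."""
    def app_factory(next_app):
        # The last app in the pipeline is handed None and ignores it.
        def app(environ, start_response):
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [("configured from %s" % config_file).encode("ascii")]
        return app
    return app_factory


def build_pipeline(stages):
    """stages: list of (factory, config_file) pairs, outermost first."""
    next_app = None
    # Build from the end backwards so each app can be given its successor.
    for factory, config_file in reversed(stages):
        next_app = factory(config_file)(next_app)
    if next_app is None:
        raise ValueError("no apps defined in pipeline")
    return next_app


pipeline = build_pipeline([
    (filter_factory, "sample1.conf"),
    (endpoint_factory, "sample2.conf"),
])
```

Nothing marks the endpoint as special except its position: it is simply
the stage that gets None as its next app.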

> > Potential points of contention
> > 
> >  - The WSGI configurator assumes that you are willing to write WSGI
> >component factories which accept a filename as a config file.  This
> >factory returns *another* factory (typically a class) that accepts
> >"the next" application in the pipeline chain and returns a WSGI
> >application instance.  This pattern is necessary to support
> >argument currying across a declaratively configured pipeline,
> >because the WSGI spec doesn't allow for it.  This is more contract
> >than currently exists in the WSGI specification but it would be
> >trivial to change existing WSGI components to adapt to this
> >pattern.  Or we could adopt a pattern/convention that removed one
> >of the factories, passing both the "next" application and the
> >config file into a single factory function.  Whatever.  In any
> >case, in order to do declarative pipeline configuration, some
> >convention will need to be adopted.  The convention I'm advocating
> >above seems to already have been for the current crop of middleware
> >components (using a factory which accepts the application as the
> >first argument).
> 
> I hate the proliferation of configuration files this implies.  I 
> consider the filters an implementation detail; if they each have 
> partitioned configuration then they become a highly exposed piece of the 
> architecture.
> 
> It's also a lot of management overhead.  Typical middleware takes 0-5 
> configuration parameters.  For instance, paste.profilemiddleware is 
> perfectly usable with no configuration at all, and only has two parameters.

True.  The config file param should be optional.  Apps might use the
environment to configure themselves.

> But this is reasonably easy to resolve -- there's a perfectly good 
> configuration section sitting there, waiting to be used:
> 
>[filter:profile]
>factory = paste.profilemiddleware.ProfileMiddleware
># Show top 50 functions:
>limit = 50
> 
> This in no way precludes 'co

Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Chris McDonough
On Sat, 2005-07-23 at 20:21 -0400, Phillip J. Eby wrote:
> At 08:08 PM 7/23/2005 -0400, Chris McDonough wrote:
> >Would you maybe rather make it more explicit that some apps are also
> >gateways, e.g.:
> >
> >[application:bleeb]
> >config = bleeb.conf
> >factory = bleeb.factory
> >
> >[filter:blaz]
> >config = blaz.conf
> >factory = blaz.factory
> 
> That looks backwards to me.  Why not just list the sections in pipeline 
> order?  i.e., outermost middleware first, and the final application last?
> 
> For that matter, if you did that, you could specify the above as:
> 
>  [blaz.factory]
>  config=blaz.conf
> 
>  [bleeb.factory]
>  config=bleeb.conf

Guess that would work for me, but out of the box, ConfigParser doesn't
appear to preserve section ordering.  I'm sure we could make it do that.
Not a dealbreaker either, but if you ever did want a way to
declaratively configure something in the config file like the generic
"decision middleware" I described in that message, this wouldn't really
work.  I hadn't described it yet, but I can also imagine declaring
multiple pipelines in the config file and using decision middleware to
choose the first app in the next pipeline (as opposed to just an app).

- C




Re: [Web-SIG] Standardized configuration

2005-07-24 Thread Chris McDonough
On Sat, 2005-07-23 at 21:57 -0400, Phillip J. Eby wrote:
> > > For that matter, if you did that, you could specify the above as:
> > >
> > >  [blaz.factory]
> > >  config=blaz.conf
> > >
> > >  [bleeb.factory]
> > >  config=bleeb.conf
> >
> >Guess that would work for me, but out of the box, ConfigParser doesn't
> >appear to preserve section ordering.  I'm sure we could make it do that.
> >Not a dealbreaker either, but if you ever did want a way to
> >declaratively configure something in the config file like the generic
> >"decision middleware" I described in that message, this wouldn't really
> >work.  I hadn't described it yet, but I can also imagine declaring
> >multiple pipelines in the config file and using decision middleware to
> >choose the first app in the next pipeline (as opposed to just an app).
> 
> I consider this a YAGNI, myself.  But then again, most of the pipeline 
> stuff seems like a YAGNI to me.
> 
> Probably that's because everything you guys are talking about implementing 
> with pipelines of middleware, I'd use a single generic function for. 

FWIW, I think I fall somewhere between you and Ian on this, and maybe
more towards you.

I believe that there are services that are usefully composed as
middleware ("oblivious" things like XSL rendering and caches).  But
sessioning and auth services and whatnot I wouldn't put into middleware.
Instead, I'd use some service library that would have a much nicer
configuration API.  But none of that should really be described within
the deployment spec, so I haven't done so.

I'm trying to be sensitive to Ian's desire to use middleware for all
kinds of services.  I also do think there is a place for middleware, so
it's useful to be able to compose pipelines declaratively even if they
are terribly simple.  OTOH, if I set up an actual deployment for a
customer, it would rarely consist of more than one or two gateways and
then the application, and many times it would just be the application
if I had no need for "oblivious" middleware apps in the pipeline.

Anyway, back to the nitty gritty of config, I'd rather just use
ConfigParser "as is" right now than come up with another .ini parser
that preserves section ordering, thus the non-dependence on ordering
within the deployment file.
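
As a point of comparison only: on a current Python 3, the configparser
module does return sections in the order they appear in the file, so the
ordering concern would not need a custom parser there.  A tiny sketch,
assuming Python 3 rather than the ConfigParser this thread is written
against:

```python
import configparser

DEPLOYMENT = """
[blaz.factory]
config = blaz.conf

[bleeb.factory]
config = bleeb.conf
"""

parser = configparser.ConfigParser()
parser.read_string(DEPLOYMENT)

# Section names come back in file order, outermost middleware first.
pipeline_order = parser.sections()
```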

>  If I 
> was wrapping oblivious or legacy apps, I'd just make one middleware object 
> that then calls the generic function to do any and all dynamic 
> requirements, because it would only take a little bit of syntax sugar to 
> implement "configuration" scripts like:
> 
>  use_auth("/some/subdir", some_auth_service)
>  mount_app("/other/path", some_app_object)
> 
> etc.  So, all the time spent on coming up with an uglier, less-powerful 
> pseudo-framework to simulate these capabilities using crude .ini files and 
> poking stuff into environ seems kind of wasteful to me, versus defining a 
> powerful API to -- dare I say it -- "paste" applications together.  :)
> 
> However, such an API deserves to be both powerful and easy-to-use, not 
> kludged together with .ini syntax.

I agree.

> That's not saying I don't think WSGI should have a deployment configuration 
> format based on .ini syntax -- I still do!  I just don't think it should 
> even attempt to allow anything complex.  A simple static pipeline and some 
> server-defined and WSGI-defined options will do nicely for the "simple 
> things are simple" case, and a Python file will do nicely for all the 
> "complex things are possible" cases.

That's fine by me.

> That's why I'd like to see this effort split into two parts: 1) simple 
> deployment, and 2) a "pasting" API whose entire purpose in life is to 
> stack, route, and multiplex "middleware" and "applications" without having 
> to explicitly manage a pipeline.
> 
> This API would use *specificity* as a basis for establishing pipelines, 
> because it's not at all scalable (developer-wise) to set up pipelines on a 
> URL-by-URL basis for a complex application -- especially for applications 
> that aren't page-based!  Usually, you'll need some kind of pipeline 
> inheritance to manage that sort of thing.
> 
> There is little reason, however, why you can't configure a significant 
> portion of a URL space using a single WSGI component, using an appropriate 
> mechanism.  For example, recasting my earlier example:
> 
>  def factory(container):
>  container.use_auth("some/subdir", some_auth_service)
>  container.mount_app_factory("other/path", some_app_factory)

Yes.  I hadn't thought about managing service context based on
containment like this (and I like that), but to me, this is a services
registration all the same.
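
A very rough sketch of how I read the container idea, in current Python 3.
The Container and SubContainer classes are hypothetical; only the method
names use_auth, mount_app and mount_app_factory come from the quoted
examples, and plain strings stand in for real apps and services.  It shows
registrations being recorded against paths, with a sub-container
prepending its own prefix before delegating.

```python
class Container:
    """Hypothetical registry mapping URL paths to apps and services."""

    def __init__(self):
        self.apps = {}
        self.auth = {}

    def use_auth(self, path, auth_service):
        self.auth[path] = auth_service

    def mount_app(self, path, app):
        self.apps[path] = app

    def mount_app_factory(self, path, app_factory):
        # The factory registers things against a view rooted at `path`.
        app_factory(SubContainer(self, path))


class SubContainer:
    """Wrapper that prepends a prefix before delegating to its parent."""

    def __init__(self, parent, prefix):
        self.parent = parent
        self.prefix = prefix.rstrip("/")

    def _full(self, path):
        return self.prefix + "/" + path.lstrip("/")

    def use_auth(self, path, auth_service):
        self.parent.use_auth(self._full(path), auth_service)

    def mount_app(self, path, app):
        self.parent.mount_app(self._full(path), app)

    def mount_app_factory(self, path, app_factory):
        self.parent.mount_app_factory(self._full(path), app_factory)


def some_app_factory(container):
    container.mount_app("reports", "reports-app")


def factory(container):
    container.use_auth("some/subdir", "some-auth-service")
    container.mount_app_factory("other/path", some_app_factory)


root = Container()
factory(root)
```

After this runs, the root container knows about "some/subdir" and
"other/path/reports" without either factory having managed a pipeline.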

> Then, the 'mount_app_factory()' call could invoke 
> 'some_app_factory(subcontainer)' where 'subcontainer' is a wrapper that 
> prepends 'other/path' to URLs before delegating to 'container'.
> 
> In other words, once you have this "container API", there's no reason not 
> to just use it to implement t

Re: [Web-SIG] Standardized configuration

2005-07-24 Thread Chris McDonough
Thanks for the response... I'm not going to respond point-by-point here
because probably nobody has time to read this stuff anyway.

But in general:

1) I'm for creating a simple deployment spec that allows you to define
static pipelines declaratively.  The decision middleware thing is just
an idea.  I'm not really sure it's even a good idea, but it's a stab at
a compromise which would allow for a bit of pipeline dynamism.

2) I don't have a strong preference one way or another about what the
main config looks like other than it should be simple.  So I'd probably
be fine with any of:

  [application:foo]
  factory = foo.factory
  config = foo.conf

  [application:bar]
  factory = bar.factory
  config = bar.conf

  [pipeline]
  apps = foo bar

- OR (assuming we have section ordering and we can live with a single
pipeline) -

  [foo.factory]
  config = foo.conf

  [bar.factory]
  config = bar.conf

- OR (if we passed the factory a namespace instead of a filename) -

  [foo.factory]
  arbitrarykey1 = arbitraryvalue1
  arbitrarykey2 = arbitraryvalue2

  [bar.factory]
  arbitrarykey1 = arbitraryvalue1
  arbitrarykey2 = arbitraryvalue2

  (Forget my ramblings about os.environ.  You're right.
  It all comes out the same.)

3) I don't have a strong opinion on whether middleware and endpoint
   apps should be treated differently in the config file.
   If we used section ordering in ConfigParser to imply the pipeline,
   I'd suspect they wouldn't be.
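
To illustrate the third option (passing the factory a namespace instead of
a filename) together with section ordering, here is a hedged sketch in
current Python 3.  The FACTORIES dict stands in for importing the dotted
section names, and make_factory and configure_from_sections are made-up
names; each factory takes the next app plus the section's options as
keyword arguments.

```python
import configparser

DEPLOYMENT = """
[foo.factory]
arbitrarykey1 = arbitraryvalue1
arbitrarykey2 = arbitraryvalue2

[bar.factory]
arbitrarykey1 = arbitraryvalue1
"""


def make_factory(label):
    """Hypothetical factory: takes the next app plus config as keywords."""
    def factory(next_app, **options):
        def app(environ, start_response):
            environ.setdefault("trace", []).append((label, sorted(options)))
            if next_app is None:
                start_response("200 OK", [("Content-Type", "text/plain")])
                return [b"done"]
            return next_app(environ, start_response)
        return app
    return factory


# Stand-in for resolving each section name to a real factory by import.
FACTORIES = {
    "foo.factory": make_factory("foo"),
    "bar.factory": make_factory("bar"),
}


def configure_from_sections(text, factories):
    """Build a pipeline whose order is the order of sections in text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    next_app = None
    for section in reversed(parser.sections()):
        options = dict(parser.items(section))
        next_app = factories[section](next_app, **options)
    return next_app


app = configure_from_sections(DEPLOYMENT, FACTORIES)
```

With this shape the endpoint and the middleware go through exactly the
same code path; the endpoint is just the section that comes last.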

So where does that leave us?

- C

On Sat, 2005-07-23 at 20:01 -0500, Ian Bicking wrote:
> Chris McDonough wrote:
> > On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:
> >>>  To do this, we use a ConfigParser-format config file named
> >>>  'myapplication.conf' that looks like this::
> >>>
> >>>[application:sample1]
> >>>config = sample1.conf
> >>>factory = wsgiconfig.tests.sample_components.factory1
> >>>
> >>>[application:sample2]
> >>>config = sample2.conf
> >>>factory = wsgiconfig.tests.sample_components.factory2
> >>>
> >>>[pipeline]
> >>>apps = sample1 sample2
> >>
> >>I think it's confusing to call both these applications.  I think 
> >>"middleware" or "filter" would be better.  I think people understand 
> >>"filter" far better, so I'm inclined to use that.  So...
> > 
> > 
> > The reason I called them applications instead of filters is because all
> > of them implement the WSGI "application" API (they all implement "a
> > callable that accepts two parameters, environ and start_response").
> > Some happen to be gateways/filters/middleware/whatever but at least one
> > is just an application and does no delegation.  In my example above,
> > "sample2" is not a filter, it is the end-point application.  "sample1"
> > is a filter, but it's of course also an application too.
> 
> Well, the difference I see is that a filter accepts a next-application, 
> where a plain application does not.  From the perspective of this 
> configuration file, those seem ver different.  In fact, it could 
> actually be:
> 
>[application:sample1]
>config = sample1.conf
>factory = ...
> 
>...
> 
>[application:real_sample1]
>pipeline = printdebug_app sample1
> 
> That is, a "pipeline" simply describes a new application.  And then -- 
> perhaps with a conventional name, or through some more global 
> configuration -- we indicate which application we are going to serve.
> 
> Hmm... thinking about it, this seems much more general, in a very useful 
> way, since anyone can plugin in ways to compose applications. 
> "pipeline" is just one use case for how to compose applications.
> 
> > Would you maybe rather make it more explicit that some apps are also
> > gateways, e.g.:
> > 
> > [application:bleeb]
> > config = bleeb.conf
> > factory = bleeb.factory
> > 
> > [filter:blaz]
> > config = blaz.conf
> > factory = blaz.factory
> > 
> > ?  I don't know that there's any way we could make use of the
> > distinction between the two types in the configurator other than
> > disallowing people to place an application "before" a filter in a
> > pipeline through validation.  Is there something else you had in mind?
> 
> I have forgotten what the actual factory interface was, but I think it 
> should be different for the two.  Well, I think it *is* different, and 
> passing in a next-application of None just co

Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-24 Thread Chris McDonough
Sorry, I think I may have lost track of where we were going wrt the
deployment spec.  Specifically, I don't know how we got to using eggs
(which I'd really like to, BTW, they're awesome conceptually!) from
where we were in the discussion about configuring a WSGI pipeline.  What
is a "feature"?  What is an "import map"? "Entry point"?  Should I just
get more familiar with eggs to understand what's being discussed here or
did I miss a few posts?
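
Going by the quoted message below, an "import map" entry appears to have
the shape "name = module:attribute [extra1,extra2]".  The following is only
my own sketch of parsing one such line in current Python 3, to pin down
the vocabulary; parse_entry and its regular expression are hypothetical
and are not the setuptools implementation, and combining the "name ="
form with the bracketed extras in one line is my assumption.

```python
import re

# name = module.path:attribute [optional, extras]
ENTRY_RE = re.compile(
    r"^\s*(?P<name>.+?)\s*=\s*(?P<module>[\w.]+):(?P<attr>[\w.]+)"
    r"\s*(?:\[(?P<extras>[^\]]*)\])?\s*$"
)


def parse_entry(line):
    """Split one import map line into name, module, attribute and extras."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError("not an import map entry: %r" % line)
    extras = match.group("extras") or ""
    return {
        "name": match.group("name"),
        "module": match.group("module"),
        "attr": match.group("attr"),
        "extras": [e.strip() for e in extras.split(",") if e.strip()],
    }


entry = parse_entry("feature2 = another.module:SomeClass [extra1,extra2]")
```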

On Sun, 2005-07-24 at 12:49 -0400, Phillip J. Eby wrote:
> [cc:ed to distutils-sig because much of the below is about a new egg 
> feature; follow-ups about the web stuff should stay on web-sig]
> 
> At 04:04 AM 7/24/2005 -0500, Ian Bicking wrote:
> >So maybe here's a deployment spec we can start with.  It looks like:
> >
> >[feature1]
> >someapplication.somemodule.some_function
> >
> >[feature2]
> >someapplication.somemodule.some_function2
> >
> >You can't get dumber than that!  There should also be a "no-feature"
> >section; maybe one without a section identifier, or some special section
> >identifier.
> >
> >It goes in the .egg-info directory.  This way elsewhere you can say:
> >
> >application = SomeApplication[feature1]
> 
> I like this a lot, although for a different purpose than the format Chris 
> and I were talking about.  I see this fitting into that format as maybe:
> 
> [feature1 from SomeApplication]
> # configuration here
> 
> 
> >And it's quite unambiguous.  Note that there is *no* "configuration" in
> >the egg-info file, because you can't put any configuration related to a
> >deployment in an .egg-info directory, because it's not specific to any
> >deployment.  Obviously we still need a way to get configuration in
> >there, but lets say that's a different matter.
> 
> Easily fixed via what I've been thinking of as the "deployment descriptor"; 
> I would call your proposal here the "import map".  Basically, an import map 
> describes a mapping from some sort of feature name to qualified names in 
> the code.
> 
> I have an extension that I would make, though.  Instead of using sections 
> for features, I would use name/value pairs inside of sections named for the 
> kind of import map.  E.g.:
> 
>  [wsgi.app_factories]
>  feature1 = somemodule:somefunction
>  feature2 = another.module:SomeClass
>  ...
> 
>  [mime.parsers]
>  application/atom+xml = something:atom_parser
>  ...
> 
> In other words, feature maps could be a generic mechanism offered by 
> setuptools, with a 'Distribution.load_entry_point(kind,name)' API to 
> retrieve the desired object.  That way, we don't end up reinventing this 
> idea for dozens of frameworks or pluggable applications that just need a 
> way to find a few simple entry points into the code.
> 
> In addition to specifying the entry point, each entry in the import map 
> could optionally list the "extras" that are required if that entry point is 
> used.
> It could also issue a 'require()' for the corresponding feature if it has 
> any additional requirements listed in the extras_require dictionary.
> 
> So, I'm thinking that this would be implemented with an entry_points.txt 
> file in .egg-info, but supplied in setup.py like this:
> 
>  setup(
>  ...
>  entry_points = {
>  "wsgi.app_factories": dict(
>  feature1 = "somemodule:somefunction",
>  feature2 = "another.module:SomeClass [extra1,extra2]",
>  ),
>  "mime.parsers": {
>  "application/atom+xml": "something:atom_parser [feedparser]"
>  }
>  },
>  extras_require = dict(
>  feedparser = [...],
>  extra1 = [...],
>  extra2 = [...],
>  )
>  )
> 
> Anyway, this would make the most common use case for eggs-as-plugins very 
> easy: an application or framework would simply define entry points, and 
> plugin projects would declare the ones they offer in their setup script.
> 
> I think this is a fantastic idea and I'm about to leap into implementing 
> it.  :)
> 
> 
> >This puts complex middleware construction into the function that is
> >referenced.  This function might be, in turn, an import from a
> >framework.  Or it might be some complex setup specific to the
> >application.  Whatever.
> >
> >The API would look like:
> >
> >wsgiapp = wsgiref.get_egg_application('SomeApplication[feature1]')
> >
> >Which ultimately resolves to:
> >
> >wsgiapp = some_function()
> >
> >get_egg_application could also take a pkg_resources.Distribution object.
> 
> Yeah, I'm thinking that this could be implemented as something like:
> 
>  import pkg_resources
> 
>  def get_wsgi_app(project_name, app_name, *args, **kw):
>  dist = pkg_resources.require(project_name)[0]
>  return dist.load_entry_point('wsgi.app_factories', 
> app_name)(*args,**kw)
> 
> with all the heavy lifting happening in the pkg_resources.Distribution 
> class, 

Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-24 Thread Chris McDonough
Thanks...

I'm still confused about high level requirements so please try to be
patient with me as I try to get back on track.

These are the requirements as I understand them:

1.  We want to be able to distribute WSGI applications and middleware
(presumably in a format supported by setuptools).

2.  We want to be able to configure a WSGI application in order
to create an application instance.

3.  We want a way to combine configured instances of those
applications into pipelines and start an "instance" of a pipeline.

Are these requirements the ones being discussed?  If so, which of the
config file formats we've been discussing matches which requirement?

Thanks,

- C

On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote:
> At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote:
> >Sorry, I think I may have lost track of where we were going wrt the
> >deployment spec.  Specifically, I don't know how we got to using eggs
> >(which I'd really like to, BTW, they're awesome conceptually!) from
> >where we were in the discussion about configuring a WSGI pipeline.  What
> >is a "feature"?  What is an "import map"? "Entry point"?  Should I just
> >get more familiar with eggs to understand what's being discussed here or
> >did I miss a few posts?
> 
> I suggest this post as the shortest architectural introduction to the whole 
> egg thang:
> 
>  http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html
> 
> It explains pretty much all of the terminology I'm currently using, except 
> for the new terms invented today...
> 
> Entry points are a new concept, invented today by Ian and myself.  Ian 
> proposed having a mapping file (which I dubbed an "import map") included in 
> an egg's metadata, and then referring to named entries from a pipeline 
> descriptor, so that you don't have to know or care about the exact name to 
> import.  The application or middleware factory name would be looked up in 
> the egg's import map in order to find the actual factory object.
> 
> I took Ian's proposal and did two things:
> 
> 1) Generalized the idea to a concept of "entry points".  An entry point is 
> a name that corresponds to an import specification, and an optional list of 
> "extras" (see terminology link above) that the entry point may 
> require.  Entry point names exist in a namespace called an "entry point 
> group", and I implied that the WSGI deployment spec would define two such 
> groups: wsgi.applications and wsgi.middleware, but a vast number of other 
> possibilities for entry points and groups exist.  In fact, I went ahead and 
> implemented them in setuptools today, and realized I could use them to 
> register setup commands with setuptools, making it extensible by any 
> project that registers entry points in a 'distutils.commands' group.
> 
> 2) I then proposed that we extend our deployment descriptor (.wsgi file) 
> syntax so that you can do things like:
> 
>  [foo from SomeProject]
>  # configuration here
> 
> What this does is tell the WSGI deployment API to look up the "foo" entry 
> point in either the wsgi.middleware or wsgi.applications entry point group 
> for the named project, according to whether it's the last item in the .wsgi 
> file.  It then invokes the factory as before, with the configuration values 
> as keyword arguments.
> 
> This proposal is of course an *extension*; it should still be possible to 
> use regular dotted names as section headings, if you haven't yet drunk the 
> setuptools kool-aid.  But, it makes for interesting possibilities because 
> we could now have a tool that reads a WSGI deployment descriptor and runs 
> easy_install to find and download the right projects.  So, you could 
> potentially just write up a descriptor that lists what you want and the 
> server could install it, although I think I personally would want to run a 
> tool explicitly; maybe I'll eventually add a --wsgi=FILENAME option to 
> EasyInstall that would tell it to find out what to install from a WSGI 
> deployment descriptor.
> 
> That would actually be pretty cool, when you realize it means that all you 
> have to do to get an app deployed across a bunch of web servers is to copy 
> the deployment descriptor and tell 'em to install stuff.  You can always 
> create an NFS-mounted cache directory where you put pre-built eggs, and 
> EasyInstall would just fetch and extract them in that case.
> 
> Whew.  Almost makes me wish I was back in my web apps shop, where this kind 
> of thing would've been *really* useful to have.
> 



Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-24 Thread Chris McDonough
BTW, a simple example that includes proposed solutions for all of these
requirements would go a long way towards helping me (and maybe others)
understand how all the pieces fit together.  Maybe something like:

- Define two simple WSGI components:  a WSGI middleware and a WSGI
  application.

- Describe how to package each as an independent egg.

- Describe how to configure an instance of the application.

- Describe how to configure an instance of the middleware.

- Describe how to string them together into a pipeline.

- C


On Mon, 2005-07-25 at 02:33 -0400, Chris McDonough wrote:
> Thanks...
> 
> I'm still confused about high level requirements so please try to be
> patient with me as I try get back on track.
> 
> These are the requirements as I understand them:
> 
> 1.  We want to be able to distribute WSGI applications and middleware
> (presumably in a format supported by setuptools).
> 
> 3.  We want to be able to configure a WSGI application in order
> to create an application instance.
> 
> 2.  We want a way to combine configured instances of those
> applications into pipelines and start an "instance" of a pipeline.
> 
> Are these requirements the ones being discussed?  If so, which of the
> config file formats we've been discussing matches which requirement?
> 
> Thanks,
> 
> - C
> 
> On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote:
> > At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote:
> > >Sorry, I think I may have lost track of where we were going wrt the
> > >deployment spec.  Specifically, I don't know how we got to using eggs
> > >(which I'd really like to, BTW, they're awesome conceptually!) from
> > >where we were in the discussion about configuring a WSGI pipeline.  What
> > >is a "feature"?  What is an "import map"? "Entry point"?  Should I just
> > >get more familiar with eggs to understand what's being discussed here or
> > >did I miss a few posts?
> > 
> > I suggest this post as the shortest architectural introduction to the whole 
> > egg thang:
> > 
> >  http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html
> > 
> > It explains pretty much all of the terminology I'm currently using, except 
> > for the new terms invented today...
> > 
> > Entry points are a new concept, invented today by Ian and myself.  Ian 
> > proposed having a mapping file (which I dubbed an "import map") included in 
> > an egg's metadata, and then referring to named entries from a pipeline 
> > descriptor, so that you don't have to know or care about the exact name to 
> > import.  The application or middleware factory name would be looked up in 
> > the egg's import map in order to find the actual factory object.
> > 
> > I took Ian's proposal and did two things:
> > 
> > 1) Generalized the idea to a concept of "entry points".  An entry point is 
> > a name that corresponds to an import specification, and an optional list of 
> > "extras" (see terminology link above) that the entry point may 
> > require.  Entry point names exist in a namespace called an "entry point 
> > group", and I implied that the WSGI deployment spec would define two such 
> > groups: wsgi.applications and wsgi.middleware, but a vast number of other 
> > possibilities for entry points and groups exist.  In fact, I went ahead and 
> > implemented them in setuptools today, and realized I could use them to 
> > register setup commands with setuptools, making it extensible by any 
> > project that registers entry points in a 'distutils.commands' group.
> > 
> > 2) I then proposed that we extend our deployment descriptor (.wsgi file) 
> > syntax so that you can do things like:
> > 
> >  [foo from SomeProject]
> >  # configuration here
> > 
> > What this does is tell the WSGI deployment API to look up the "foo" entry 
> > point in either the wsgi.middleware or wsgi.applications entry point group 
> > for the named project, according to whether it's the last item in the .wsgi 
> > file.  It then invokes the factory as before, with the configuration values 
> > as keyword arguments.
> > 
> > This proposal is of course an *extension*; it should still be possible to 
> > use regular dotted names as section headings, if you haven't yet drunk the 
> > setuptools kool-aid.  But, it makes for interesting possibilities because 
> > we could now have a tool that reads a WSGI deployment descriptor and runs 
> > easy_install to find a

Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-25 Thread Chris McDonough
Actually, let me give this a shot.

We package up an egg called helloworld.egg.  It happens to contain
something that can be used as a WSGI component.  Let's say it's a WSGI
application that always returns 'Hello World'.  And let's say it also
contains middleware that lowercases anything that passes through before
it's returned.

The implementations of these components could be as follows:

class HelloWorld:
    def __init__(self, app, **kw):
        pass # nothing to configure

    def __call__(self, environ, start_response):
        start_response('200 OK', [])
        return ['Hello World']

class Lowercaser:
    def __init__(self, app, **kw):
        self.app = app
        # nothing else to configure

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()

An import map would ship inside of the egg-info dir:

[wsgi.app_factories]
helloworld = helloworld:HelloWorld
lowercaser = helloworld:Lowercaser
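
A minimal sketch of how a deployment tool might resolve a name from an
import map like the one above, assuming the map is plain .ini text with
"name = module:attr" entries.  The resolve() helper and the stand-in
"helloworld" module are hypothetical; setuptools' own entry point
machinery is not shown here.

```python
import configparser
import importlib
import sys
import types

IMPORT_MAP = """\
[wsgi.app_factories]
helloworld = helloworld:HelloWorld
lowercaser = helloworld:Lowercaser
"""


def resolve(import_map_text, group, name):
    """Return the object that `name` maps to within entry point group `group`."""
    # Only "=" separates key from value, so the ":" in "module:attr" survives.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.read_string(import_map_text)
    module_name, _, attr = parser[group][name].partition(":")
    module = importlib.import_module(module_name.strip())
    return getattr(module, attr.strip())


# Stand-in for the installed egg, so the sketch runs without packaging anything.
class HelloWorld:
    pass


class Lowercaser:
    pass


_fake_egg = types.ModuleType("helloworld")
_fake_egg.HelloWorld = HelloWorld
_fake_egg.Lowercaser = Lowercaser
sys.modules["helloworld"] = _fake_egg

factory = resolve(IMPORT_MAP, "wsgi.app_factories", "lowercaser")
print(factory.__name__)
```

The point of the indirection is that the descriptor only ever mentions
"lowercaser"; the module path stays inside the egg's metadata.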

So we install the egg and this does nothing except allow it to be used
from within Python.
  
But when we create a "deployment descriptor" like so in a text editor:

[helloworld from helloworld]

[lowercaser from helloworld]

... and run some "starter" script that parses that as a pipeline,
creates the two instances, wires them together, and we get a running
pipeline?
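
A rough sketch of what such a "starter" script might do, assuming the
convention discussed elsewhere in this thread that the last section names
the application and every earlier section is middleware (so the demo
descriptor lists the middleware first).  The FACTORIES table is a
hypothetical stand-in for the egg's import map, and the application
factory here takes no 'app' argument.

```python
import configparser


class HelloWorld:
    def __init__(self, **kw):
        pass  # nothing to configure

    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello World"]


class Lowercaser:
    def __init__(self, app, **kw):
        self.app = app

    def __call__(self, environ, start_response):
        for chunk in self.app(environ, start_response):
            yield chunk.lower()


# Hypothetical stand-in for looking names up in each project's import map.
FACTORIES = {
    ("helloworld", "helloworld"): HelloWorld,
    ("helloworld", "lowercaser"): Lowercaser,
}

DESCRIPTOR = """\
[lowercaser from helloworld]

[helloworld from helloworld]
"""


def build_pipeline(text):
    """Parse '[name from project]' sections and wire them into one callable."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    sections = parser.sections()

    def factory_for(section):
        name, _, project = section.partition(" from ")
        return FACTORIES[(project.strip(), name.strip())]

    # Last section is the application; wrap it with middleware, innermost first.
    app = factory_for(sections[-1])(**dict(parser[sections[-1]]))
    for section in reversed(sections[:-1]):
        app = factory_for(section)(app, **dict(parser[section]))
    return app


def call(app):
    """Invoke a WSGI callable with an empty environ; return (statuses, body)."""
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    body = b"".join(app({}, start_response))
    return statuses, body


pipeline = build_pipeline(DESCRIPTOR)
print(call(pipeline))
```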

Am I on track?

OK, back to Battlestar Galactica ;-)



On Mon, 2005-07-25 at 02:40 -0400, Chris McDonough wrote:
> BTW, a simple example that includes proposed solutions for all of these
> requirements would go a long way towards helping me (and maybe others)
> understand how all the pieces fit together.  Maybe something like:
> 
> - Define two simple WSGI components:  a WSGI middleware and a WSGI
>   application.
> 
> - Describe how to package each as an independent egg.
> 
> - Describe how to configure an instance of the application.
> 
> - Describe how to configure an instance of the middleware
> 
> - Describe how to string them together into a pipeline.
> 
> - C
> 
> 
> On Mon, 2005-07-25 at 02:33 -0400, Chris McDonough wrote:
> > Thanks...
> > 
> > I'm still confused about high-level requirements so please try to be
> > patient with me as I try to get back on track.
> > 
> > These are the requirements as I understand them:
> > 
> > 1.  We want to be able to distribute WSGI applications and middleware
> > (presumably in a format supported by setuptools).
> > 
> > 2.  We want to be able to configure a WSGI application in order
> > to create an application instance.
> > 
> > 3.  We want a way to combine configured instances of those
> > applications into pipelines and start an "instance" of a pipeline.
> > 
> > Are these requirements the ones being discussed?  If so, which of the
> > config file formats we've been discussing matches which requirement?
> > 
> > Thanks,
> > 
> > - C
> > 
> > On Sun, 2005-07-24 at 22:24 -0400, Phillip J. Eby wrote:
> > > At 08:35 PM 7/24/2005 -0400, Chris McDonough wrote:
> > > >Sorry, I think I may have lost track of where we were going wrt the
> > > >deployment spec.  Specifically, I don't know how we got to using eggs
> > > >(which I'd really like to, BTW, they're awesome conceptually!) from
> > > >where we were in the discussion about configuring a WSGI pipeline.  What
> > > >is a "feature"?  What is an "import map"? "Entry point"?  Should I just
> > > >get more familiar with eggs to understand what's being discussed here or
> > > >did I miss a few posts?
> > > 
> > > > I suggest this post as the shortest architectural introduction to the 
> > > > whole egg thang:
> > > 
> > >  http://mail.python.org/pipermail/distutils-sig/2005-June/004652.html
> > > 
> > > > It explains pretty much all of the terminology I'm currently using, 
> > > > except for the new terms invented today...
> > > 
> > > > Entry points are a new concept, invented today by Ian and myself.  Ian 
> > > > proposed having a mapping file (which I dubbed an "import map") included 
> > > > in an egg's metadata, and then referring to named entries from a pipeline 
> > > > descriptor, so that you don't have to know or care about the exact name 
> > > > to import.  The application or middleware factory name would be looked up 
> > > > in the egg's import map in order to find the actual factory object.
> 

Re: [Web-SIG] Entry points and import maps (was Re: Scarecrow deployment config

2005-07-25 Thread Chris McDonough
Great.  Given that, I've created the beginnings of a more formal
specification:

WSGI Deployment Specification
-----------------------------

  I use the term "WSGI component" in here as shorthand to indicate all
  types of WSGI implementations (application, middleware).

  The primary deployment concern is to create a way to specify the
  configuration of an instance of a WSGI component within a
  declarative configuration file.  A secondary deployment concern is
  to create a way to "wire up" components together into a specific
  deployable "pipeline".

Pipeline Descriptors
--------------------

  Pipeline descriptors are file representations of a particular WSGI
  "pipeline".  They include enough information to configure,
  instantiate, and wire together WSGI apps and middleware components
  into one pipeline for use by a WSGI server.  Installation of the
  software which composes those components is handled separately.

  In order to define a pipeline, we use a ".ini"-format configuration
  file conventionally named '.wsgi'.  This file may
  optionally be marked as executable and associated with a simple UNIX
  interpreter via a leading hash-bang line to allow servers which
  employ stdin and stdout streams (ala CGI) to run the pipeline
  directly without any intermediation.  For example, a deployment
  descriptor named 'myapplication.wsgi' might be composed of the
  following text::

    #!/usr/bin/runwsgi

    [mypackage.mymodule.factory1]
    quux = arbitraryvalue
    eekx = arbitraryvalue

    [mypackage.mymodule.factory2]
    foo = arbitraryvalue
    bar = arbitraryvalue

  Section names are Python-dotted-path names (or setuptools "entry
  point names" described in a later section) which represent
  factories.  Key-value pairs within a given section are used as
  keyword arguments to the factory that can be used as configuration
  for the component being instantiated.

  All sections in the deployment descriptor describe 'middleware'
  except for the last section, which must describe an application.

  Factories which construct middleware must return something which is
  a WSGI "callable" by implementing the following API::

     def factory(next_app, **kw):
         """ next_app is the next application in the WSGI pipeline,
         **kw is optional, and accepts the key-value pairs
         that are used in the section as a dictionary, used
         for configuration """

  Factories which construct applications must return something which is
  a WSGI "callable" by implementing the following API::

     def factory(**kw):
         """ **kw is optional, and accepts the key-value pairs
         that are used in the section as a dictionary, used
         for configuration """
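
  As an illustration only, here is a sketch of one factory of each kind.
  The names hello_factory and truncate_factory and their configuration
  keys are hypothetical; the one assumption carried over from the
  descriptor format is that configuration values arrive as strings, so a
  factory has to coerce them itself.

```python
def hello_factory(greeting="Hello World", **kw):
    """Application factory: takes only configuration, returns a WSGI app."""
    body = greeting.encode("utf-8")

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]

    return application


def truncate_factory(next_app, max_bytes="1024", **kw):
    """Middleware factory: takes the next app in the pipeline plus configuration."""
    limit = int(max_bytes)  # descriptor values are strings, so coerce here

    def middleware(environ, start_response):
        sent = 0
        for chunk in next_app(environ, start_response):
            chunk = chunk[: limit - sent]  # keep only what still fits
            sent += len(chunk)
            if chunk:
                yield chunk

    return middleware


# Wiring by hand what a descriptor with two sections would describe.
app = truncate_factory(hello_factory(greeting="Hello there"), max_bytes="5")
body = b"".join(app({}, lambda status, headers, exc_info=None: None))
print(body)
```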

  A deployment descriptor can also be parsed from within Python.  An
  importable configurator which resides in 'wsgiref' exposes a
  function, 'parse_deployment', that accepts a single argument, the
  filename of the deployment descriptor::

     >>> from wsgiref.runwsgi import parse_deployment
     >>> appchain = parse_deployment('myapplication.wsgi')

  'appchain' will be an object representing the fully configured
  "pipeline".  'parse_deployment' is guaranteed to return something
  that implements the WSGI "callable" API described in PEP 333.
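
  A sketch of how 'parse_deployment' could be implemented under the
  rules above; this is not an existing wsgiref function.  It assumes
  section names are "module.attribute" dotted paths and that the
  standard .ini parser is acceptable.  The 'demo_components' module and
  its two factories are hypothetical and exist only to make the example
  run.

```python
import configparser
import importlib
import os
import sys
import tempfile
import types


def parse_deployment(path):
    """Build one WSGI callable from an .ini-style pipeline descriptor."""
    # A leading "#!" line is ignored because "#" starts a comment.
    parser = configparser.ConfigParser()
    with open(path) as f:
        parser.read_file(f)
    sections = parser.sections()
    if not sections:
        raise ValueError("descriptor defines no components")

    def load(dotted_name):
        module_name, _, attr = dotted_name.rpartition(".")
        return getattr(importlib.import_module(module_name), attr)

    # The last section is the application; earlier sections wrap it,
    # with the first section ending up outermost.
    app = load(sections[-1])(**dict(parser[sections[-1]]))
    for name in reversed(sections[:-1]):
        app = load(name)(app, **dict(parser[name]))
    return app


# Hypothetical components, registered as a module so dotted paths resolve.
def hello(greeting="Hello World", **kw):
    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting.encode("utf-8")]
    return application


def shout(next_app, **kw):
    def middleware(environ, start_response):
        return [chunk.upper() for chunk in next_app(environ, start_response)]
    return middleware


demo = types.ModuleType("demo_components")
demo.hello = hello
demo.shout = shout
sys.modules["demo_components"] = demo

DESCRIPTOR = """\
#!/usr/bin/runwsgi

[demo_components.shout]

[demo_components.hello]
greeting = hi there
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myapplication.wsgi")
    with open(path, "w") as f:
        f.write(DESCRIPTOR)
    appchain = parse_deployment(path)

result = appchain({}, lambda status, headers, exc_info=None: None)
print(result)
```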

Entry Points
------------
  



On Mon, 2005-07-25 at 10:39 -0400, Phillip J. Eby wrote:
> At 03:02 AM 7/25/2005 -0400, Chris McDonough wrote:
> >Actually, let me give this a shot.
> >
> >We package up an egg called helloworld.egg.  It happens to contain
> >something that can be used as a WSGI component.  Let's say it's a WSGI
> >application that always returns 'Hello World'.  And let's say it also
> >contains middleware that lowercases anything that passes through before
> >it's returned.
> >
> >The implementations of these components could be as follows:
> >
> >class HelloWorld:
> >    def __init__(self, app, **kw):
> >        pass # nothing to configure
> >
> >    def __call__(self, environ, start_response):
> >        start_response('200 OK', [])
> >        return ['Hello World']
> 
> I'm thinking that an application like this wouldn't take an 'app' 
> constructor parameter, and if it takes no configuration parameters it 
> doesn't need **kw, but good so far.
> 
> 
> >class Lowercaser:
> >    def __init__(self, app, **kw):
> >        self.app = app
> >        # nothing else to configure
> >
> >    def __call__(self, environ, start_response):
> >        for chunk in self.app(environ, start_response):
> >            yield chu

Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Chris McDonough
On Mon, 2005-07-25 at 20:29 -0500, Ian Bicking wrote:
> > We probably need something like a "site map" configuration, that can 
> > handle tree structure, and can specify pipelines on a per location 
> > basis, including the ability to specify pipeline components to be 
> > applied above everything under a certain URL pattern.  This is more or 
> > less the same as my "container API" concept, but we are a little closer 
> > to being able to think about such a thing.
> 
> It could also be something based on general matching rules, with some 
> notion of precedence and how the rule affects SCRIPT_NAME/PATH_INFO.  Or 
> something like that.
How much of this could be solved by using a web server's
directory/alias-mapping facility?

For instance, if you needed a single Apache webserver to support
multiple pipelines based on URL mapping, wouldn't it be possible in many
cases to compose that out of things like rewrite rules and script
aliases (the below assumes running them just as CGI scripts, obviously
it would be different with something using mod_python or what-have-you):


 ServerAdmin [EMAIL PROTECTED]
 ServerName plope.com
 ServerAlias plope.com
 ScriptAlias /viewcvs "/home/chrism/viewcvs.wsgi"
 ScriptAlias /blog "/home/chrism/blog.wsgi"
 RewriteEngine On
 RewriteRule ^/[^/]viewcvs*$ /home/chrism/viewcvs.wsgi [PT]
 RewriteRule ^/[^/]blog*$ /home/chrism/blog.wsgi [PT]


Obviously it would mean some repetition in "wsgi" files if you needed to
repeat parts of a pipeline for each URL mapping.  But it does mean we
wouldn't need to invent more software.
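
For comparison, here is a sketch of what the in-process alternative would
have to do: match a URL prefix, then shift it from PATH_INFO onto
SCRIPT_NAME before calling the mounted pipeline.  PrefixDispatcher and the
echo() demo apps are hypothetical names, not part of any proposal in this
thread.

```python
class PrefixDispatcher:
    """Route requests to WSGI apps by URL prefix, longest prefix first."""

    def __init__(self, mounts, default):
        self.mounts = sorted(mounts.items(), key=lambda kv: len(kv[0]), reverse=True)
        self.default = default

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.mounts:
            # Match "/blog" and "/blog/...", but not "/blogger".
            if path == prefix or path.startswith(prefix + "/"):
                environ = dict(environ)  # leave the caller's environ untouched
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        return self.default(environ, start_response)


def echo(name):
    """Demo app that reports which mount handled the request."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        text = "%s script=%s path=%s" % (
            name, environ.get("SCRIPT_NAME", ""), environ.get("PATH_INFO", ""))
        return [text.encode("utf-8")]
    return app


dispatch = PrefixDispatcher(
    {"/blog": echo("blog"), "/viewcvs": echo("viewcvs")},
    default=echo("static"),
)


def get(path):
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path}
    chunks = dispatch(environ, lambda status, headers, exc_info=None: None)
    return b"".join(chunks).decode("utf-8")


print(get("/blog/2005/07"))
print(get("/blogger"))
```

It is only about twenty lines, but it is still one more component to
specify and configure, which is the trade-off being weighed here.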


> 
> > Of course, I still think it's something that can be added *after* having 
> > a basic deployment spec.
> 
> I feel a very strong need that this be resolved before settling on 
> anything deployment related.  Not necessarily as a standard, but 
> possibly as a set of practices.  Even a realistic and concrete use case 
> might be enough.


I *think* more complicated use cases may revolve around attempting to
use middleware as services that dynamize the pipeline instead of as
"oblivious" things.  I don't think there's anything really wrong with
that but I also don't think it can ever be specified with as much
clarity as what we've already got because IMHO it's a programming task.

I'm repeating myself, I'm sure, but I'm more apt to put a "service
manager" piece of middleware in the pipeline (or maybe just implement it
as a library) which would allow my endpoint app to use it to do
sessioning and auth and whatnot.  I realize that is essentially
"building a framework" (which is reviled lately) but since the endpoint
app needs to collaborate anyway, I don't see a better way to do it
except to rely completely on convention for service lookup (which is
what you seem to be struggling with in the later bits of your post).

- C






Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Chris McDonough
On Mon, 2005-07-25 at 22:01 -0500, Ian Bicking wrote:
> > 
> >  ServerAdmin [EMAIL PROTECTED]
> >  ServerName plope.com
> >  ServerAlias plope.com
> >  ScriptAlias /viewcvs "/home/chrism/viewcvs.wsgi"
> >  ScriptAlias /blog "/home/chrism/blog.wsgi"
> >  RewriteEngine On
> >  RewriteRule ^/[^/]viewcvs*$ /home/chrism/viewcvs.wsgi [PT]
> >  RewriteRule ^/[^/]blog*$ /home/chrism/blog.wsgi [PT]
> > 

> Messy configuration files (and RewriteRule for that matter) are my bane.

I agree.  In fact, I stole that snippet from my own server and modified
it.  It would probably do *something* but to be honest I'm not even sure
I remember exactly what. ;-)  But there's always the docs to fall back
on...

> To be fair, in a shared hosting situation (websites maintained by 
> customers, not the host) this would seem more workable than a 
> centralized configuration.  Perhaps... it's not the kind of situation I 
> deal with much anymore, so I've lost touch with that case.  And would 
> that mean we'd start seeing ".wsgi" in URLs?  Hrm.

No, I think I just remembered... that's what the RewriteRules are
for! ;-)

- C




Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Chris McDonough
Just for a frame of reference, I'll say how I might do these things.
These all assume I'd use Apache and mod_python, for better or worse:

> I'm not clear exactly what you are proposing.  Let's use a more 
> realistic example.  Components:
> 
> * Exception catcher.  Takes "email_errors", which is a list of addresses 
> to email exceptions to.  I want to apply this globally.

I'd likely do this in my endpoint apps (maybe share some sort of library
between them to do it).  Errors that occur in middleware would be
diagnosable/detectable via mod_python's error logging facility and
something like snort.

> * An application mounted on /, which takes "document_root" and serves up 
> those files directly.

Use the webserver.

> * An application mounted at /blog, takes "database" (a string) where all 
> its information is kept.

Separate WSGI pipeline descriptor with rewrite rules or whatever
aliasing "/blog" to it.

> * An application mounted at /admin.  Takes "document_root", which is 
> where the editable files are located.  Around it goes two pieces of 
> middleware...

Same as above...

> * An authentication middleware, which takes "database", which is where 
> user information is kept.  And...

I'd probably make this into a service that would be consumable by
applications with a completely separate configuration outside of any
deployment spec.  For example, I might try to pull Zope's "Pluggable
Authentication Utility" out of Zope 3, leaving intact its
configurability through ZCML.

But if I did put it in middleware, I'd put it in each of my application
pipelines (implied by /blog, /admin) in an appropriate place.

> * An authorization middleware, that takes "allowed_roles", and checks it 
> against what the authentication middleware puts in.

This one I know I wouldn't make into middleware.  Instead, I'd use a
library much like the thing I proposed as "decsec" (although at the time
I wrote that proposal, I did think it would be middleware; I changed my
mind).

- C




Re: [Web-SIG] WSGI deployment use case

2005-07-26 Thread Chris McDonough
On Tue, 2005-07-26 at 01:18 -0500, Ian Bicking wrote:
> Well, the stack is really just an example, meant to be more realistic 
> than "sample1" and "sample2".  I actually think it's a very reasonable 
> example, but that's not really the point.  Presuming this stack, how 
> would you configure it?

I typically roll out software to clients using a build mechanism (I
happen to use "pymake" at http://www.plope.com/software/pymake/ but
anything dependency-based works).

I write "generic" build scripts for all of the software components.  For
example, I might write makefiles that check out and build python,
openldap, mysql and so on (each into a "non-system" location).  I leave
a bit of room for customization in their build definitions that I can
override from within a "profile".  A "profile" is a set of customized
software builds for a specific purpose.

I might have, maybe, 3 different profiles for each customer where the
profile usually works out to be tied to machine function (load balancer,
app server, database server).  I maintain these build scripts and the
profiles in CVS for each customer.  I never install anything by hand, I
always change the buildout and rerun it if I need to get something set
up.

This usually works out pretty well because to roll out a new major
version of software, I just rerun the build scripts for a particular
profile and move the data over.  Usually the only thing that needs to
change frequently are a few bits of software that are checked out of
version control, so doing "cvs up" on those bits typically gets me where
I need to be unless it's a major revision.

So in this case, I'd likely write a build that either built Apache from
source or at least created an "httpd-includes" file meant to be
referenced from within the "system" Apache config file with the proper
stuff in it given the profile's purpose.  The build would also download
and install Python, it would get the proper eggs and/or Python
software and the database, and so forth.  All the configuration would be
done via the "profile" which is in version control.

I don't know if this kind of thing works for everybody, but it has
worked well for me so far.  I do this all the time, and I have a good
library of buildout scripts already so it's less painful for me than it
might be for someone who is starting from scratch.  That said, it is
time-consuming and imperfect... upgrades are the most painful.  New
installs are simple, though.

So, anyway, the short answer is "I write a script to do the config for
me so I can repeat it on demand".

- C




Re: [Web-SIG] and now for something completely different!

2005-08-15 Thread Chris McDonough
I've also got reams of code in Zope for sessions.

Maybe we should just wait til the next PyCon and have a consolidation
sprint.

- C


On Mon, 2005-08-15 at 10:17 -0700, Shannon -jj Behrens wrote:
> Heh, I'm overwhelmed by too much code and not enough direction. 
> Naturally, I've got nice session code in Aquarium as well.  *Sigh*
> this Python Web thing is going to be the death of me!
> 
> -jj
> 
> On 8/14/05, Titus Brown <[EMAIL PROTECTED]> wrote:
> > -> I think that would be useful.  Flup has a fairly decoupled session store
> > -> (http://www.saddi.com/software/flup/ in
> > -> http://svn.saddi.com/flup/trunk/flup/middleware/session.py).  Is there
> > -> other current work that should be considered?  PythonWeb has a session
> > -> module, but I don't know what its insides look like:
> > -> 
> > http://www.pythonweb.org/projects/webmodules/doc/0.5.3/html_multipage/lib/session.html
> > ->
> > -> Paste has one too, but it's Not Very Good ;)  I started using the flup
> > -> session, but I got lazy and never flipped the switch to make it the
> > -> default.  There's been some discussion about sessions in the last few
> > -> months on the Quixote list as well.
> > 
> > I've been decoupled from Web-SIG e-mails for the last two months, but
> > Mike Orr and I built a simple session store for Quixote that has a
> > fairly simple and generic storage API:
> > 
> > http://cafepy.com/quixote_extras/titus/session2/session2/store/SessionStore.py
> > 
> > With the comments deleted, here's the core API:
> > 
> > class SessionStore:
> >     def load_session(self, id, default=None):
> >         pass
> > 
> >     def save_session(self, session):
> >         pass
> > 
> >     def delete_session(self, session):
> >         pass
> > 
> >     def has_session(self, id):
> >         return self.load_session(id, None)
> > 
> > The only constraint is that 'id' must be a string in order for it to
> > work with all of the session stores.
> > 
> > We have implemented stores for postgres, durus, mysql, directory/file,
> > and shelve persistence mechanisms.
> > 
> > cheers,
> > --titus
> > 
> 
> 
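
To make the quoted storage API concrete, here is a minimal in-memory
store that follows the same four methods.  It is a sketch only: the
Session class and MemorySessionStore name are hypothetical, and unlike the
quoted version, has_session() here returns a strict boolean.

```python
class Session:
    """Hypothetical minimal session object: a string id plus a dict of data."""

    def __init__(self, id):
        self.id = id
        self.data = {}


class MemorySessionStore:
    """Keeps sessions in a dict; nothing survives a process restart."""

    def __init__(self):
        self._sessions = {}

    def load_session(self, id, default=None):
        return self._sessions.get(id, default)

    def save_session(self, session):
        self._sessions[session.id] = session

    def delete_session(self, session):
        # Deleting a session that was never saved is not an error.
        self._sessions.pop(session.id, None)

    def has_session(self, id):
        return self.load_session(id, None) is not None


store = MemorySessionStore()
cart = Session("abc123")
cart.data["items"] = ["widget"]
store.save_session(cart)
print(store.has_session("abc123"), store.load_session("abc123").data)
store.delete_session(cart)
print(store.has_session("abc123"))
```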



Re: [Web-SIG] Session interface

2005-08-16 Thread Chris McDonough
I haven't been closely following this thread and this may have already
been said but IMO sessions are most useful when the querying user is not
identified and you need a place to stash data related to that user (e.g.
a shopping cart).  They are convenient in other circumstances but rarely
necessary.  

I've never quite understood why people use server-side sessions for
authentication.  Maybe it's because they're typically so easy to use and
have been sold as "the way to maintain state" in a web application to a
lot of people.  But in reality they can be quite expensive under high
load because of their generality and there's almost always a better
way.

On Tue, 2005-08-16 at 17:42 -0400, Phillip J. Eby wrote:
> At 05:08 PM 8/16/2005 -0400, Geoffrey Talvola wrote:
> >Jonathan Ellis wrote:
> > > Still, it can be good to have a simple place to store non-permanent
> > > information.
> >
> >For example...
> >
> >I think a good use of sessions is in remembering selections that have been
> >made earlier on.  For example, suppose you have a reporting application
> >where you allow the user to select one or more items to report on from a
> >list box, several filtering options in dropdowns or checkboxes, sorting and
> >grouping behavior, etc.  You want to remember those settings so that if the
> >user returns to the report selection page, their last selected settings are
> >pre-selected.  But, unless the user chooses to save those settings as a
> >"stored report", you'd like to forget the settings when the user logs out or
> >when they close their browser.
> >
> >Also, assume that your application already has this bundle of selections in
> >the form of a Python object.
> >
> >Isn't the cleanest, easiest, and most efficient way to handle this to simply
> >save the Python object in a session variable?
> 
> No.  :)
> 
> I have to admit I'm probably biased by early Zope experience, where cookie 
> variables are as easy to use as form variables or any other kind of 
> variable.  Just set the cookies to save the options, then refer to them in 
> the page.  Sweet and simple.  And if you set the cookie path to the path of 
> the page, then the client doesn't have to send them on every request, only 
> the ones where it makes a difference.
> 
> 
> >   In some cases,
> >using Webware's in-memory sessions, for example, this data never has to be
> >marshaled or leave the application server at all.
> >
> >If I didn't have sessions, I think using either cookies or a back-end db
> >would be more work, less clean, and less efficient in this case.
> 
> Maybe that's a limitation of the framework?  As I said, I'm probably 
> spoiled by how easily Zope merges GET/POST/cookie variables, such that form 
> variables override cookies, but if the form variable isn't supplied the 
> cookie is used as a default.  That one simple behavior made "smart forms" 
> really easy to make in Zope and Zope-like systems.
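
The merging behavior described above can be sketched in a few lines.
This is not Zope's implementation; request_variables() is a hypothetical
helper, and it handles only cookies and the query string (POST bodies are
left out to keep it short).

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs


def request_variables(environ):
    """Merge cookies and query-string variables; form values win over cookies."""
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    merged = {name: morsel.value for name, morsel in cookies.items()}
    # Query-string variables override cookie defaults of the same name.
    for name, values in parse_qs(environ.get("QUERY_STRING", "")).items():
        merged[name] = values[-1]
    return merged


remembered = {"HTTP_COOKIE": "sort=name; group=region", "QUERY_STRING": ""}
overridden = {"HTTP_COOKIE": "sort=name; group=region", "QUERY_STRING": "sort=date"}
print(request_variables(remembered))
print(request_variables(overridden))
```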
> 
> 



Re: [Web-SIG] Session interface

2005-08-16 Thread Chris McDonough
FWIW, some interesting ideas (and not so interesting ideas) for
sessioning architecture in general are captured at

http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/UseCases

and

http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/CoreSessionTrackingDiscussion

UML that more or less represents Zope's current sessioning model is at:

http://www.zope.org/Wikis/DevSite/Projects/CoreSessionTracking/CoreSessionTrackingUML

- C

On Wed, 2005-08-17 at 00:31 -0500, Ian Bicking wrote:
> Mike Orr wrote:
> > Regarding Ian's session interface:
> > http://svn.colorstudy.com/home/ianb/scarecrow_session_interface.py
> > 
> > Ian Bicking wrote:
> > 
> >> Thinking on it more, probably a good place to start would be agreeing 
> >> on specific terminology for the objects involved, since I've seen 
> >> several different sets of terminology, many of which use the same 
> >> words for different ideas:
> >>
> >> Session:
> >>   An instance of this represents one user/browser's session.
> >> SessionStore:
> >>   An instance of this represents the persistence mechanism.  This
> >>   is a functional component, not embodying any policy.
> >> SessionManager:
> >>   This is a container for sessions, and uses a SessionStore.  This
> >>   contains all the policy for loading, saving, locking, expiring
> >>   sessions.
> >>  
> >>
> > 
> > 
> > At minimum, the SessionManager links the SessionStore, Session, and 
> > application together.  It can be generic, along with 
> > loading/saving/locking.  (Although we might allow the application to 
> > choose a locking policy.)  
> 
> That could be a little difficult, since multiple applications may be 
> sharing a session.  But at the same time, applications that don't expect 
> ConflictError are going to be pissed if you configure your system for 
> optimistic locking.
> 
> Of course, given a session ID and a session store, each application 
> could have its own manager.  Possibly.  Hmm... interesting.  In that 
> case each SessionManager needs an id, which is a bit annoying -- it has 
> to be stable and shared, because the same SessionManager has to be 
> identifiable over multiple processes.  But I hate inventing IDs all over 
> the place.  I feel like I'm pulling string keys out of my ass, and if 
> I'm going to pull things out of my ass I at least don't want to then put 
> them into my code.  I sense UUIDs coming on :(
> 
> That said, this isn't the only place I need strings that are unique to 
> an application instance.
> 
> > But expiring is very application-specific, 
> > and it may not be the "application" doing it but a separate cron job.  
> > Perhaps most applications will be happy with an "expire all sessions 
> > unmodified for N minutes", but some will want to inspect the metadata 
> > and others the content.  So maybe all the SessionManager can do is:
> > 
> >.delete_session(id)   => pass message directly to SessionStore
> >.iter_sessions()  =>  tuples of (id, metadata)
> >.iter_sessions_with_content() => tuples of (id, metadata, content)
> 
> I think metadata is probably good; or lazily-loaded sessions or 
> something.  The metadata is important I think, because updating metadata 
> shouldn't be affected by locking and whatnot.  I think Mike mentioned a 
> problem with locking and updating the timestamp contained in the session 
> -- we should avoid that.
> 
> > ... where metadata includes the access time and whatever else we 
> > decide.  Of course, iterating the content may be disk/memory intensive.
> 
> Sure.  We could have a callback to do filtering too, maybe with a 
> default filter by expiration time.  Or event callbacks.
> 
> > If .delete_expired_sessions() is included, the application would have to 
> > subclass SessionManager rather than just using it.  That's not 
> > necessarily bad but a potential limitation.  Or the application could 
> > kludge up a policy from your methods:
> > 
> >    cutoff = time.time() - (60 * 60 * 4)
> >    for sid in sm.session_ids():
> >        if sm.last_accessed(sid) < cutoff:
> >            sm.delete_session(sid)
> > 
> > I suppose kludgy is in the eye of the beholder.  This would not be kludgy:
> > 
> >    cutoff = time.time() - (60 * 60 * 4)
> >    for sid, metadata in sm.iter_sessions():
> >        if metadata.atime < cutoff:
> >            sm.delete_session(sid)
> > 
> > Curses on anybody who says, "What's the difference?"
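
For reference, a runnable sketch of the second form.  The SessionManager
here is a hypothetical in-memory toy, not the scarecrow interface; it
assumes metadata carries only an 'atime' and that the expiry policy lives
outside the manager, as suggested above.

```python
import time
from collections import namedtuple

Metadata = namedtuple("Metadata", ["atime"])


class SessionManager:
    """Toy manager: maps session id to (metadata, content)."""

    def __init__(self):
        self._sessions = {}

    def save_session(self, sid, content, atime=None):
        atime = time.time() if atime is None else atime
        self._sessions[sid] = (Metadata(atime=atime), content)

    def iter_sessions(self):
        # Iterate over a snapshot so callers may delete while looping.
        for sid, (metadata, _content) in list(self._sessions.items()):
            yield sid, metadata

    def delete_session(self, sid):
        self._sessions.pop(sid, None)

    def session_ids(self):
        return sorted(self._sessions)


def expire_idle(sm, max_idle_seconds, now=None):
    """Application-side policy: drop sessions not accessed recently enough."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_seconds
    expired = []
    for sid, metadata in sm.iter_sessions():
        if metadata.atime < cutoff:
            sm.delete_session(sid)
            expired.append(sid)
    return expired


sm = SessionManager()
sm.save_session("old", {"cart": []}, atime=1000.0)
sm.save_session("fresh", {"cart": ["widget"]}, atime=20000.0)
expired = expire_idle(sm, 60 * 60 * 4, now=20000.0)
print(expired, sm.session_ids())
```

Because only metadata is iterated, the loop never has to load or unpickle
session content, which is where the cost would be for a disk-backed store.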
> > 
> > PS. Kudos for using .names_with_underscores rather than .studlyCaps.
> > 
> > Your other methods look all right at first glance.  We'll know when we 
> > port existing frameworks to it whether it's adequate.  (Or should that 
> > be "when we port it to existing frameworks"?  Or "when we make existing 
> > frameworks use it as middleware"?)  We'll also have to keep an eye on a 
> > usage pattern to recommend for future frameworks, and on whether this 
> > API has anything to do with the "sessionless" persistence patterns that 
> > have also been proposed.
> 
> Acquir

Re: [Web-SIG] and now for something completely different!

2005-08-17 Thread Chris McDonough
On Thu, 2005-08-18 at 13:43 +1000, Rene Dudfield wrote:

> ... and now for all the arguments pro Session rolled up into one paragraph.
> 
> Taking load off the database server (with sessions) is a way to make an
> application more scalable.  

In my experience, they can make applications less scalable because
typically people don't need to know much about how sessions work so they
tend to overuse them without understanding their cost.  Very general
persistent session implementations that serialize object data into a
blob are typically even more expensive than simple relational database
row reads and writes, too.  This cost is amplified by session ease of
use.

> Often the database server is the
> bottleneck of the web app.  Being able to move some load to the
> client, or the webservers is a good option to have.

This is probably true for a lot of folks but my web apps are almost
always CPU bound at the web/application server.  I wish I had the
database-too-slow problem.

>   Being able to not
> use 2 tiers is also what people may want.  In this way sessions allow
> you to scale up, and down.  Sessions allow you to do a lot of jobs
> which databases are not needed for.

Typically persistent sessions are backed by some sort of database
anyway.  It's just that they're craftily coded in such a way that you
typically don't need to know much about it.

>   Sessions are also more reliable,
> and secure than cookies.
>  Cookies may not be enabled on the browser,

The most common way of enabling sessions is via cookies, and whether
sessions work reliably or not is often contingent on cookies.  Formvar
or URL-encoded session identifiers tend to be hit-and-miss and much
harder to maintain across pages.

> and storing some stuff on the client side in the clear, or even
> encrypted is dangerous.

I agree.  At least it's harder to get right.

>   Sessions are understood by a large amount of
> php/java/perl/apache people.  Lots of the python web frameworks have
> implemented sessions too.  This means sessions will be used.  So
> making a good working implementation of sessions that everyone can
> share would be double plus good.

What might be more practical, and easier to think about because its scope
is so much smaller, is a common "browser identifier" implementation.

The most useful purpose of a session is to allow you to store state
across requests by some anonymous browser.  If you can reliably detect
that "the requesting browser is the browser identified by token ABC123"
and that token can be associated with the browser reliably for some
extended period of time, that's half the battle.  This can be done with
a cookie, a URL element, a form variable, or a query string element.
The association between an identifier and a browser doesn't really even
need to time out; it could live forever with no ill effect.
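
A minimal sketch of the cookie variant of this, to make the idea
concrete.  The cookie name "browser_id" and the function
identify_browser() are hypothetical, chosen only for illustration; the
sketch trusts whatever token the browser sends back, which a real
implementation would probably want to sign or otherwise validate.

```python
import secrets
from http.cookies import SimpleCookie

COOKIE_NAME = "browser_id"  # hypothetical name, not from any spec


def identify_browser(cookie_header):
    """Return (browser_id, set_cookie_value).

    cookie_header is the raw value of the request's Cookie header (or
    None).  set_cookie_value is None when the browser already presented
    an identifier; otherwise it is the value for a Set-Cookie header.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar and jar[COOKIE_NAME].value:
        # Returning browser: reuse the token it sent.
        return jar[COOKIE_NAME].value, None

    # New browser: mint a random token and ask the browser to keep it.
    new_id = secrets.token_hex(16)
    out = SimpleCookie()
    out[COOKIE_NAME] = new_id
    out[COOKIE_NAME]["path"] = "/"
    out[COOKIE_NAME]["httponly"] = True
    return new_id, out[COOKIE_NAME].OutputString()
```

On the first request the caller would send the returned value in a
Set-Cookie header; on later requests the same token comes back and no
header is needed.  A URL element or form variable would carry the same
token by other means.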

Creating namespaces that can be written to from within application code
and which expire after some number of minutes of inactivity and so forth
(aka sessions) could be written in terms of storing and retrieving data
based on this browser identifier.
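
As a rough illustration of that layering, here is a sketch of an
expiring namespace keyed by a browser identifier.  The SessionStore
class, its in-memory dict, and the 20-minute default are all
assumptions made for the example, not part of any proposed API, and
the sketch naively refreshes the access time on every lookup.

```python
import time


class SessionStore:
    """In-memory namespaces keyed by browser identifier (illustrative)."""

    def __init__(self, timeout_seconds=20 * 60, clock=time.time):
        self.timeout = timeout_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._data = {}     # browser_id -> (last_access, namespace dict)

    def get(self, browser_id):
        """Return the namespace for browser_id, starting a fresh one
        if none exists or the old one has been inactive too long."""
        now = self.clock()
        entry = self._data.get(browser_id)
        if entry is None or now - entry[0] > self.timeout:
            namespace = {}
        else:
            namespace = entry[1]
        # Record this access as the latest activity.
        self._data[browser_id] = (now, namespace)
        return namespace
```

Usage would look something like store.get("ABC123")["cart"] = [...],
where "ABC123" is whatever token the browser-identifier layer produced.
The identifier itself never expires; only the data stored under it does.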

- C

