2005-07-24 Thread Phillip J. Eby
At 04:05 AM 7/24/2005 -0400, Chris McDonough wrote:
>- OR (if we passed the factory a namespace instead of a filename) -
>   [foo.factory]
>   arbitrarykey1 = arbitraryvalue1
>   arbitrarykey2 = arbitraryvalue2
>   [bar.factory]
>   arbitrarykey1 = arbitraryvalue1
>   arbitrarykey2 = arbitraryvalue2

This one's my favorite.  I'd say the semantics are that each factory gets 
passed the key/value pairs as keyword arguments, with a positional argument 
used to pass in the "next application".  The last factory in the file 
wouldn't get the positional argument.
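
A minimal sketch of those calling semantics, under stated assumptions: `build_pipeline`, `registry`, and the two toy factories are hypothetical names invented for illustration, and sections are assumed to arrive as ordered (name, options) pairs with the outermost middleware first.

```python
def build_pipeline(sections, registry):
    """Build one WSGI app from ordered (name, options) pairs, outermost first."""
    app = None
    # Walk backwards so each factory can be handed the app that follows it.
    for index, (name, options) in enumerate(reversed(sections)):
        factory = registry[name]
        if index == 0:
            # Last section in the file: no "next application" argument.
            app = factory(**options)
        else:
            # Middleware: next application is the single positional argument.
            app = factory(app, **options)
    return app


def hello_factory(greeting="hello"):
    """Hypothetical end-point application factory."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting.encode("utf-8")]
    return app


def upper_factory(next_app, enabled="yes"):
    """Hypothetical middleware factory that upper-cases the response body."""
    def app(environ, start_response):
        body = next_app(environ, start_response)
        if enabled == "yes":
            return [chunk.upper() for chunk in body]
        return body
    return app


registry = {"upper.factory": upper_factory, "hello.factory": hello_factory}
sections = [
    ("upper.factory", {"enabled": "yes"}),
    ("hello.factory", {"greeting": "hi there"}),
]
pipeline = build_pipeline(sections, registry)
```

The option values stay raw strings here; whether they should be evaluated first is a separate question.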

If a section's name has len(sectionName.split())>1, then the second and 
subsequent words are directives that change the default interpretation of 
the section, so that we can have things like:

 [WSGI options]
 # WSGI options, like required eggs, threading mode, etc.

 [mod_python options]
 # mod_python-specific options

 [ object]
 # this app is an object, not a factory

I don't care that ConfigParser doesn't support any of this, because 
low-level .ini parsers are easy to write and I've previously written two: 
one for peak.config and one for pkg_resources.  And if the implementation 
can assume pkg_resources is available, it can use the one that's there to 
do the sequential section-splitting part of the job.
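
To give a rough idea of how small such a parser is, here is a sketch of an order-preserving section splitter. It is not the peak.config or pkg_resources code; `split_sections` and the tuple layout are assumptions made for this example, and it handles only `key = value` lines and `#`/`;` comments.

```python
EXAMPLE = """\
[WSGI options]
# WSGI options, like required eggs, threading mode, etc.
multi_thread = 0

[blaz.factory]
config = blaz.conf

[bleeb.factory]
config = bleeb.conf
"""


def split_sections(text):
    """Return a list of (name, directives, options) tuples in file order."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            words = line[1:-1].split()
            if not words:
                raise ValueError("empty section name: %r" % raw)
            current = {}
            # First word is the name; any further words are directives.
            sections.append((words[0], words[1:], current))
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
        else:
            raise ValueError("cannot parse line: %r" % raw)
    return sections


sections = split_sections(EXAMPLE)
# Sections without directive words form the pipeline, in file order.
pipeline_names = [name for name, directives, _ in sections if not directives]
```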

I'm not sure of this, but I tend towards thinking that the 
'arbitraryvalues' should be Python expressions, rather than raw strings.  I 
also think that we should support a source-encoding comment to allow for 
localization of Unicode literals, whether we treat values as raw strings or 
Python expressions.
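
As one conservative reading of "Python expressions", the sketch below evaluates each value with `ast.literal_eval`, which accepts only literals (numbers, strings, tuples, lists, dicts, and so on) rather than arbitrary code. That restriction is an assumption of this example, not something settled above, and `parse_options` is a hypothetical helper name.

```python
import ast


def parse_options(raw_options):
    """Evaluate each raw string value as a Python literal."""
    parsed = {}
    for key, raw in raw_options.items():
        try:
            parsed[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError) as exc:
            # Report which key held the bad value instead of failing opaquely.
            raise ValueError("bad value for %r: %r" % (key, raw)) from exc
    return parsed


options = parse_options({"multi_thread": "0", "blah": '"feh"', "limit": "50"})
```

Under this reading a bare word such as `feh` is an error, so string values have to be quoted in the file.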

Re: [Web-SIG] Standardized configuration

2005-07-24 Thread Chris McDonough
Thanks for the response... I'm not going to respond point-by-point here
because probably nobody has time to read this stuff anyway.

But in general:

1) I'm for creating a simple deployment spec that allows you to define
static pipelines declaratively.  The decision middleware thing is just
an idea.  I'm not really sure it's even a good idea, but it's a stab at
a compromise which would allow for a bit of pipeline dynamism.

2) I don't have a strong preference one way or another about what the
main config looks like other than it should be simple.  So I'd probably
be fine with any of:

  factory = foo.factory
  config = foo.conf

  factory = bar.factory
  config = bar.conf

  apps = foo bar

- OR (assuming we have section ordering and we can live with a single
pipeline) -

  config = foo.conf

  config = bar.conf

- OR (if we passed the factory a namespace instead of a filename) -

  [foo.factory]
  arbitrarykey1 = arbitraryvalue1
  arbitrarykey2 = arbitraryvalue2

  [bar.factory]
  arbitrarykey1 = arbitraryvalue1
  arbitrarykey2 = arbitraryvalue2

  (Forget my ramblings about os.environ.  You're right.
  It all comes out the same.)

3) I don't have a strong opinion on whether middleware and endpoint
   apps should be treated differently in the config file.
   If we used section ordering in configparser to imply the pipeline, 
   I'd suspect they wouldn't be.

So where does that leave us?

- C

On Sat, 2005-07-23 at 20:01 -0500, Ian Bicking wrote:
> Chris McDonough wrote:
> > On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:
> >>>  To do this, we use a ConfigParser-format config file named
> >>>  'myapplication.conf' that looks like this::
> >>>
> >>>[application:sample1]
> >>>config = sample1.conf
> >>>factory = wsgiconfig.tests.sample_components.factory1
> >>>
> >>>[application:sample2]
> >>>config = sample2.conf
> >>>factory = wsgiconfig.tests.sample_components.factory2
> >>>
> >>>[pipeline]
> >>>apps = sample1 sample2
> >>
> >>I think it's confusing to call both these applications.  I think 
> >>"middleware" or "filter" would be better.  I think people understand 
> >>"filter" far better, so I'm inclined to use that.  So...
> > 
> > 
> > The reason I called them applications instead of filters is because all
> > of them implement the WSGI "application" API (they all implement "a
> > callable that accepts two parameters, environ and start_response").
> > Some happen to be gateways/filters/middleware/whatever but at least one
> > is just an application and does no delegation.  In my example above,
> > "sample2" is not a filter, it is the end-point application.  "sample1"
> > is a filter, but it's of course also an application too.
> Well, the difference I see is that a filter accepts a next-application, 
> where a plain application does not.  From the perspective of this 
> configuration file, those seem very different.  In fact, it could 
> actually be:
>config = sample1.conf
>factory = ...
>pipeline = printdebug_app sample1
> That is, a "pipeline" simply describes a new application.  And then -- 
> perhaps with a conventional name, or through some more global 
> configuration -- we indicate which application we are going to serve.
> Hmm... thinking about it, this seems much more general, in a very useful 
> way, since anyone can plug in ways to compose applications. 
> "pipeline" is just one use case for how to compose applications.
> > Would you maybe rather make it more explicit that some apps are also
> > gateways, e.g.:
> > 
> > [application:bleeb]
> > config = bleeb.conf
> > factory = bleeb.factory
> > 
> > [filter:blaz]
> > config = blaz.conf
> > factory = blaz.factory
> > 
> > ?  I don't know that there's any way we could make use of the
> > distinction between the two types in the configurator other than
> > disallowing people to place an application "before" a filter in a
> > pipeline through validation.  Is there something else you had in mind?
> I have forgotten what the actual factory interface was, but I think it 
> should be different for the two.  Well, I think it *is* different, and 
> passing in a next-application of None just covers up that difference.
> >>[application:sample2]
> >># What is this relative to?  I hate both absolute paths and
> >># paths relative to pwd equally...
> >>config = sample1.conf
> >>factory = wsgiconfig...
> > 
> > 
> > This was from a doctest I wrote so I could rely on relative paths,
> > sorry.  You're right.  Um... we could probably use the
> > environment as "defaults" to ConfigParser interpolation and set whatever
> > we need before the configurator is run:
> > 
> > $ export APP_ROOT=/home/chrism/myapplication
> > $ ./ myapplication.conf
> > 
> > And in myapplication.conf:
> > 
> > [application:sampl

2005-07-24 Thread Chris McDonough
On Sat, 2005-07-23 at 21:57 -0400, Phillip J. Eby wrote:
> > > For that matter, if you did that, you could specify the above as:
> > >
> > >  [blaz.factory]
> > >  config=blaz.conf
> > >
> > >  [bleeb.factory]
> > >  config=bleeb.conf
> >
> >Guess that would work for me, but out of the box, ConfigParser doesn't
> >appear to preserve section ordering.  I'm sure we could make it do that.
> >Not a dealbreaker either, but if you ever did want a way to
> >declaratively configure something in the config file like the generic
> >"decision middleware" I described in that message, this wouldn't really
> >work.  I hadn't described it yet, but I can also imagine declaring
> >multiple pipelines in the config file and using decision middleware to
> >choose the first app in the next pipeline (as opposed to just an app).
> I consider this a YAGNI, myself.  But then again, most of the pipeline 
> stuff seems like a YAGNI to me.
> Probably that's because everything you guys are talking about implementing 
> with pipelines of middleware, I'd use a single generic function for. 

FWIW, I think I fall somewhere between you and Ian on this, and maybe
more towards you.

I believe that there are services that are usefully composed as
middleware ("oblivious" things like XSL rendering and caches).  But
sessioning and auth services and whatnot I wouldn't put into middleware.
Instead, I'd use some service library that would have a much nicer
configuration API.  But none of that should really be described within
the deployment spec, so I haven't done so.

I'm trying to be sensitive to Ian's desire to use middleware for all
kinds of services.  I also do think there is a place for middleware, so
it's useful to be able to compose pipelines declaratively even if they
are terribly simple.  OTOH, if I set up an actual deployment for a
customer, it would rarely consist of more than one or two gateways and
then the application and many times it would just be the application if
I had no need for "oblivious" middleware apps in the pipeline.

Anyway, back to the nitty gritty of config, I'd rather just use
ConfigParser "as is" right now than to come up with another .ini parser
that preserves section ordering, thus the non-dependence on ordering
within the deployment file.

>  If I 
> was wrapping oblivious or legacy apps, I'd just make one middleware object 
> that then calls the generic function to do any and all dynamic 
> requirements, because it would only take a little bit of syntax sugar to 
> implement "configuration" scripts like:
>  use_auth("/some/subdir", some_auth_service)
>  mount_app("/other/path", some_app_object)
> etc.  So, all the time spent on coming up with an uglier, less-powerful 
> pseudo-framework to simulate these capabilities using crude .ini files and 
> poking stuff into environ seems kind of wasteful to me, versus defining a 
> powerful API to -- dare I say it -- "paste" applications together.  :)
> However, such an API deserves to be both powerful and easy-to-use, not 
> kludged together with .ini syntax.

I agree.

> That's not saying I don't think WSGI should have a deployment configuration 
> format based on .ini syntax -- I still do!  I just don't think it should 
> even attempt to allow anything complex.  A simple static pipeline and some 
> server-defined and WSGI-defined options will do nicely for the "simple 
> things are simple" case, and a Python file will do nicely for all the 
> "complex things are possible" cases.

That's fine by me.

> That's why I'd like to see this effort split into two parts: 1) simple 
> deployment, and 2) a "pasting" API whose entire purpose in life is to 
> stack, route, and multiplex "middleware" and "applications" without having 
> to explicitly manage a pipeline.
> This API would use *specificity* as a basis for establishing pipelines, 
> because it's not at all scalable (developer-wise) to set up pipelines on a 
> URL-by-URL basis for a complex application -- especially for applications 
> that aren't page-based!  Usually, you'll need some kind of pipeline 
> inheritance to manage that sort of thing.
> There is little reason, however, why you can't configure a significant 
> portion of a URL space using a single WSGI component, using an appropriate 
> mechanism.  For example, recasting my earlier example:
>  def factory(container):
>      container.use_auth("some/subdir", some_auth_service)
>      container.mount_app_factory("other/path", some_app_factory)

Yes.  I hadn't thought about managing service context based on
containment like this (and I like that), but to me, this is a services
registration all the same.

> Then, the 'mount_app_factory()' call could invoke 
> 'some_app_factory(subcontainer)' where 'subcontainer' is a wrapper that 
> prepends 'other/path' to URLs before delegating to 'container'.
> In other words, once you have this "container API", there's no reason not 
> to just use it to implement t

2005-07-23 Thread Phillip J. Eby
At 08:41 PM 7/23/2005 -0400, Chris McDonough wrote:
>On Sat, 2005-07-23 at 20:21 -0400, Phillip J. Eby wrote:
> > At 08:08 PM 7/23/2005 -0400, Chris McDonough wrote:
> > >Would you maybe rather make it more explicit that some apps are also
> > >gateways, e.g.:
> > >
> > >[application:bleeb]
> > >config = bleeb.conf
> > >factory = bleeb.factory
> > >
> > >[filter:blaz]
> > >config = blaz.conf
> > >factory = blaz.factory
> >
> > That looks backwards to me.  Why not just list the sections in pipeline
> > order?  i.e., outermost middleware first, and the final application last?
> >
> > For that matter, if you did that, you could specify the above as:
> >
> >  [blaz.factory]
> >  config=blaz.conf
> >
> >  [bleeb.factory]
> >  config=bleeb.conf
>Guess that would work for me, but out of the box, ConfigParser doesn't
>appear to preserve section ordering.  I'm sure we could make it do that.
>Not a dealbreaker either, but if you ever did want a way to
>declaratively configure something in the config file like the generic
>"decision middleware" I described in that message, this wouldn't really
>work.  I hadn't described it yet, but I can also imagine declaring
>multiple pipelines in the config file and using decision middleware to
>choose the first app in the next pipeline (as opposed to just an app).

I consider this a YAGNI, myself.  But then again, most of the pipeline 
stuff seems like a YAGNI to me.

Probably that's because everything you guys are talking about implementing 
with pipelines of middleware, I'd use a single generic function for.  If I 
was wrapping oblivious or legacy apps, I'd just make one middleware object 
that then calls the generic function to do any and all dynamic 
requirements, because it would only take a little bit of syntax sugar to 
implement "configuration" scripts like:

 use_auth("/some/subdir", some_auth_service)
 mount_app("/other/path", some_app_object)

etc.  So, all the time spent on coming up with an uglier, less-powerful 
pseudo-framework to simulate these capabilities using crude .ini files and 
poking stuff into environ seems kind of wasteful to me, versus defining a 
powerful API to -- dare I say it -- "paste" applications together.  :)

However, such an API deserves to be both powerful and easy-to-use, not 
kludged together with .ini syntax.

That's not saying I don't think WSGI should have a deployment configuration 
format based on .ini syntax -- I still do!  I just don't think it should 
even attempt to allow anything complex.  A simple static pipeline and some 
server-defined and WSGI-defined options will do nicely for the "simple 
things are simple" case, and a Python file will do nicely for all the 
"complex things are possible" cases.

That's why I'd like to see this effort split into two parts: 1) simple 
deployment, and 2) a "pasting" API whose entire purpose in life is to 
stack, route, and multiplex "middleware" and "applications" without having 
to explicitly manage a pipeline.

This API would use *specificity* as a basis for establishing pipelines, 
because it's not at all scalable (developer-wise) to set up pipelines on a 
URL-by-URL basis for a complex application -- especially for applications 
that aren't page-based!  Usually, you'll need some kind of pipeline 
inheritance to manage that sort of thing.

There is little reason, however, why you can't configure a significant 
portion of a URL space using a single WSGI component, using an appropriate 
mechanism.  For example, recasting my earlier example:

 def factory(container):
     container.use_auth("some/subdir", some_auth_service)
     container.mount_app_factory("other/path", some_app_factory)

Then, the 'mount_app_factory()' call could invoke 
'some_app_factory(subcontainer)' where 'subcontainer' is a wrapper that 
prepends 'other/path' to URLs before delegating to 'container'.

In other words, once you have this "container API", there's no reason not 
to just use it to implement the whole stack in a single middleware object.
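
A rough sketch of the prefixing idea, assuming a container that merely records what is registered under each path. The `Container` and `SubContainer` classes, the `mount_app` method on the container, and the string placeholders standing in for real services and applications are all hypothetical; no such API exists yet.

```python
class Container:
    """Hypothetical root container: records registrations by URL path."""

    def __init__(self):
        self.auth = {}  # path -> auth service
        self.apps = {}  # path -> WSGI application

    def use_auth(self, path, service):
        self.auth[path] = service

    def mount_app(self, path, app):
        self.apps[path] = app

    def mount_app_factory(self, path, app_factory):
        # The factory configures its own URL space through a prefixed view.
        app_factory(SubContainer(self, path))


class SubContainer:
    """Wrapper that prepends a prefix to paths, then delegates to the parent."""

    def __init__(self, parent, prefix):
        self.parent = parent
        self.prefix = prefix.strip("/")

    def _full(self, path):
        parts = (self.prefix, path.strip("/"))
        return "/".join(part for part in parts if part)

    def use_auth(self, path, service):
        self.parent.use_auth(self._full(path), service)

    def mount_app(self, path, app):
        self.parent.mount_app(self._full(path), app)

    def mount_app_factory(self, path, app_factory):
        self.parent.mount_app_factory(self._full(path), app_factory)


def some_app_factory(container):
    # Paths here are relative to wherever this factory gets mounted.
    container.use_auth("admin", "admin_auth_service")
    container.mount_app("", "blog_app")


def factory(container):
    container.use_auth("some/subdir", "some_auth_service")
    container.mount_app_factory("other/path", some_app_factory)


root = Container()
factory(root)
```

The point of the wrapper is that `some_app_factory` never needs to know where it was mounted.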

Anyway, this is why I think there should be a "WSGI Services" and/or "WSGI 
Container API" spec, distinct from a "WSGI Deployment Metadata" 
spec.  These two spheres are both valuable, but I think it'll take longer 
to get a "deployment" spec if we mix "container API" stuff into it -- and 
get a much less useful container API than if we set our minds on making a 
good container API, rather than a souped-up deployment descriptor.

2005-07-23 Thread Ian Bicking
Chris McDonough wrote:
> On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:
>>>  To do this, we use a ConfigParser-format config file named
>>>  'myapplication.conf' that looks like this::
>>>[application:sample1]
>>>config = sample1.conf
>>>factory = wsgiconfig.tests.sample_components.factory1
>>>
>>>[application:sample2]
>>>config = sample2.conf
>>>factory = wsgiconfig.tests.sample_components.factory2
>>>
>>>[pipeline]
>>>apps = sample1 sample2
>>I think it's confusing to call both these applications.  I think 
>>"middleware" or "filter" would be better.  I think people understand 
>>"filter" far better, so I'm inclined to use that.  So...
> The reason I called them applications instead of filters is because all
> of them implement the WSGI "application" API (they all implement "a
> callable that accepts two parameters, environ and start_response").
> Some happen to be gateways/filters/middleware/whatever but at least one
> is just an application and does no delegation.  In my example above,
> "sample2" is not a filter, it is the end-point application.  "sample1"
> is a filter, but it's of course also an application too.

Well, the difference I see is that a filter accepts a next-application, 
where a plain application does not.  From the perspective of this 
configuration file, those seem very different.  In fact, it could 
actually be:

   config = sample1.conf
   factory = ...


   pipeline = printdebug_app sample1

That is, a "pipeline" simply describes a new application.  And then -- 
perhaps with a conventional name, or through some more global 
configuration -- we indicate which application we are going to serve.

Hmm... thinking about it, this seems much more general, in a very useful 
way, since anyone can plug in ways to compose applications. 
"pipeline" is just one use case for how to compose applications.

> Would you maybe rather make it more explicit that some apps are also
> gateways, e.g.:
> [application:bleeb]
> config = bleeb.conf
> factory = bleeb.factory
> [filter:blaz]
> config = blaz.conf
> factory = blaz.factory
> ?  I don't know that there's any way we could make use of the
> distinction between the two types in the configurator other than
> disallowing people to place an application "before" a filter in a
> pipeline through validation.  Is there something else you had in mind?

I have forgotten what the actual factory interface was, but I think it 
should be different for the two.  Well, I think it *is* different, and 
passing in a next-application of None just covers up that difference.

>># What is this relative to?  I hate both absolute paths and
>># paths relative to pwd equally...
>>config = sample1.conf
>>factory = wsgiconfig...
> This was from a doctest I wrote so I could rely on relative paths,
> sorry.  You're right.  Um... we could probably use the
> environment as "defaults" to ConfigParser interpolation and set whatever
> we need before the configurator is run:
> $ export APP_ROOT=/home/chrism/myapplication
> $ ./ myapplication.conf
> And in myapplication.conf:
> [application:sample1]
> config = %(APP_ROOT)s/sample1.conf
> factory = myapp.sample1.factory

I hate %(APP_ROOT)s as a syntax; I think it's okay to simply say that 
the configuration loader (in some fashion) should determine the root 
(maybe with an environmental variable or command line parameter).

Though, realistically, there might be several app roots.  Apache's root 
directory configuration (for relative paths) isn't very useful to me, in 
practice, because it isn't flexible enough and doesn't allow more than one root.

>>But this is reasonably easy to resolve -- there's a perfectly good 
>>configuration section sitting there, waiting to be used:
>>   [filter:profile]
>>   factory = paste.profilemiddleware.ProfileMiddleware
>>   # Show top 50 functions:
>>   limit = 50
>>This in no way precludes 'config', which is just a special case of this 
>>general configuration.  The only real problem is a possible conflict if 
>>we wanted to add new special names to the configuration, i.e., 
> I think I'd maybe rather see configuration settings for apps that don't
> require much configuration to come in as environment variables (maybe
> not necessarily in the "environ" namespace that is implied by the WSGI
> callable interface but instead in os.environ).  Envvars are
> uncontroversial, so they don't cost us any coding time, PEP time, or
> brain cycles.

Yikes!  Were you like the ZConfig holdout or something?  os.environ is 
way, way, way too inflexible.

Just the other day I was able to deploy a single application I wrote 
with two configurations in the same process, without having thought 
about that possibility ahead of time, and without doing any extra work 
or avoiding any particular shortcuts. 

2005-07-23 Thread Chris McDonough
On Sat, 2005-07-23 at 20:21 -0400, Phillip J. Eby wrote:
> At 08:08 PM 7/23/2005 -0400, Chris McDonough wrote:
> >Would you maybe rather make it more explicit that some apps are also
> >gateways, e.g.:
> >
> >[application:bleeb]
> >config = bleeb.conf
> >factory = bleeb.factory
> >
> >[filter:blaz]
> >config = blaz.conf
> >factory = blaz.factory
> That looks backwards to me.  Why not just list the sections in pipeline 
> order?  i.e., outermost middleware first, and the final application last?
> For that matter, if you did that, you could specify the above as:
>  [blaz.factory]
>  config=blaz.conf
>  [bleeb.factory]
>  config=bleeb.conf

Guess that would work for me, but out of the box, ConfigParser doesn't
appear to preserve section ordering.  I'm sure we could make it do that.
Not a dealbreaker either, but if you ever did want a way to
declaratively configure something in the config file like the generic
"decision middleware" I described in that message, this wouldn't really
work.  I hadn't described it yet, but I can also imagine declaring
multiple pipelines in the config file and using decision middleware to
choose the first app in the next pipeline (as opposed to just an app).

- C
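
For anyone trying this on a current Python 3, where the module is spelled `configparser`: its `sections()` method returns section names in the order they appear in the file, so the sections-in-pipeline-order layout can be read directly. A small sketch, using the `[blaz.factory]` / `[bleeb.factory]` example from above:

```python
import configparser

CONFIG = """\
[blaz.factory]
config = blaz.conf

[bleeb.factory]
config = bleeb.conf
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)

# Section names come back in file order: outermost middleware first,
# end-point application last.
pipeline_order = parser.sections()
blaz_config = parser["blaz.factory"]["config"]
```

This covers only the ordering question; multi-word section names with directives would still need handling on top of it.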

2005-07-23 Thread Phillip J. Eby
At 08:08 PM 7/23/2005 -0400, Chris McDonough wrote:
>Would you maybe rather make it more explicit that some apps are also
>gateways, e.g.:
>[application:bleeb]
>config = bleeb.conf
>factory = bleeb.factory
>
>[filter:blaz]
>config = blaz.conf
>factory = blaz.factory

That looks backwards to me.  Why not just list the sections in pipeline 
order?  i.e., outermost middleware first, and the final application last?

For that matter, if you did that, you could specify the above as:

 [blaz.factory]
 config=blaz.conf

 [bleeb.factory]
 config=bleeb.conf

Which looks a lot nicer to me.  If you want global WSGI or server options 
for the stack, one could always use multi-word section names e.g.:

 [WSGI options]
 multi_thread = 0

 [mod_python options]
 blah = "feh"

and not treat these sections as part of the pipeline.  For Ian's idea about 
requiring particular projects to be available (via pkg_resources), I'd 
suggest making that sort of thing part of one of the options sections.

2005-07-23 Thread Chris McDonough
On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:

> >   To do this, we use a ConfigParser-format config file named
> >   'myapplication.conf' that looks like this::
> > 
> > [application:sample1]
> > config = sample1.conf
> > factory = wsgiconfig.tests.sample_components.factory1
> > 
> > [application:sample2]
> > config = sample2.conf
> > factory = wsgiconfig.tests.sample_components.factory2
> > 
> > [pipeline]
> > apps = sample1 sample2
> I think it's confusing to call both these applications.  I think 
> "middleware" or "filter" would be better.  I think people understand 
> "filter" far better, so I'm inclined to use that.  So...

The reason I called them applications instead of filters is because all
of them implement the WSGI "application" API (they all implement "a
callable that accepts two parameters, environ and start_response").
Some happen to be gateways/filters/middleware/whatever but at least one
is just an application and does no delegation.  In my example above,
"sample2" is not a filter, it is the end-point application.  "sample1"
is a filter, but it's of course also an application too.

Would you maybe rather make it more explicit that some apps are also
gateways, e.g.:

[application:bleeb]
config = bleeb.conf
factory = bleeb.factory

[filter:blaz]
config = blaz.conf
factory = blaz.factory

?  I don't know that there's any way we could make use of the
distinction between the two types in the configurator other than
disallowing people to place an application "before" a filter in a
pipeline through validation.  Is there something else you had in mind?

> [application:sample2]
> # What is this relative to?  I hate both absolute paths and
> # paths relative to pwd equally...
> config = sample1.conf
> factory = wsgiconfig...

This was from a doctest I wrote so I could rely on relative paths,
sorry.  You're right.  Um... we could probably use the
environment as "defaults" to ConfigParser interpolation and set whatever
we need before the configurator is run:

$ export APP_ROOT=/home/chrism/myapplication
$ ./ myapplication.conf

And in myapplication.conf:

[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory

That would probably be the least-effort and most flexible thing to do
and doesn't mandate any particular directory structure.  Of course, we
could provide a convention for a recommended directory structure, but
this gives us an "out" from being painted in to that in specific cases.
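
A sketch of that interpolation approach, written against the Python 3 `configparser` module. The `load` helper is a hypothetical name, and it copies only `APP_ROOT` into the parser defaults rather than handing over the whole environment, since an unrelated value containing a stray `%` could upset interpolation.

```python
import configparser

CONFIG = """\
[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory
"""


def load(text, environ):
    """Parse text, exposing APP_ROOT from environ as an interpolation default."""
    defaults = {"APP_ROOT": environ["APP_ROOT"]}
    parser = configparser.ConfigParser(defaults=defaults)
    parser.read_string(text)
    return parser


# In a real run the second argument would be os.environ.
parser = load(CONFIG, {"APP_ROOT": "/home/chrism/myapplication"})
config_path = parser.get("application:sample1", "config")
```

The parser lower-cases option names by default, so `APP_ROOT` in the defaults and `%(APP_ROOT)s` in the file both resolve to the same key.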

> [pipeline]
> # The app is unique and special...?
> app = sample2
> filters = sample1
> Well, that's just a first refactoring; I'm having other inclinations...

I'm not sure whether this is just a stylistic thing or if there's a
reason you want to treat the endpoint app specially.  By definition, in
my implementation, the endpoint app is just the last app mentioned in
the pipeline.

> > Potential points of contention
> > 
> >  - The WSGI configurator assumes that you are willing to write WSGI
> >component factories which accept a filename as a config file.  This
> >factory returns *another* factory (typically a class) that accepts
> >"the next" application in the pipeline chain and returns a WSGI
> >application instance.  This pattern is necessary to support
> >argument currying across a declaratively configured pipeline,
> >because the WSGI spec doesn't allow for it.  This is more contract
> >than currently exists in the WSGI specification but it would be
> >trivial to change existing WSGI components to adapt to this
> >pattern.  Or we could adopt a pattern/convention that removed one
> >of the factories, passing both the "next" application and the
> >config file into a single factory function.  Whatever.  In any
> >case, in order to do declarative pipeline configuration, some
> >convention will need to be adopted.  The convention I'm advocating
> >above seems to already have been adopted by the current crop of middleware
> >components (using a factory which accepts the application as the
> >first argument).
> I hate the proliferation of configuration files this implies.  I 
> consider the filters an implementation detail; if they each have 
> partitioned configuration then they become a highly exposed piece of the 
> architecture.
> It's also a lot of management overhead.  Typical middleware takes 0-5 
> configuration parameters.  For instance, paste.profilemiddleware is 
> perfectly usable with no configuration at all, and only has two parameters.

True.  The config file param should be optional.  Apps might use the
environment to configure themselves.

> But this is reasonably easy to resolve -- there's a perfectly good 
> configuration section sitting there, waiting to be used:
>[filter:profile]
>factory = paste.profilemiddleware.ProfileMiddleware
># Show top 50 functions:
>limit = 50
> This in no way precludes 'co

2005-07-23 Thread Ian Bicking
>>  To do this, we use a ConfigParser-format config file named
>>  'myapplication.conf' that looks like this::
>>[application:sample1]
>>config = sample1.conf
>>factory = wsgiconfig.tests.sample_components.factory1
>>
>>[application:sample2]
>>config = sample2.conf
>>factory = wsgiconfig.tests.sample_components.factory2
>>
>>[pipeline]
>>apps = sample1 sample2

On another tack, I think it's important we consider how 
setuptools/pkg_resources fits into this.  Specifically we should allow:

require = WSGIConfig
factory = ...

Since the factory might not be importable until require() is called. 
There's lots of other potential benefits to being able to get that 
information about requirements as well.
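
For context, turning a dotted `factory = ...` value into a callable might look like the sketch below. `resolve` is a hypothetical helper; any `require()` step for the listed distribution would have to run before it, and that step is not shown here.

```python
import importlib


def resolve(dotted_name):
    """Import 'package.module.attr' and return the named attribute."""
    module_name, _, attr = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.attr', got %r" % dotted_name)
    # This import is the step that fails if the distribution providing the
    # module has not been made importable yet.
    module = importlib.import_module(module_name)
    return getattr(module, attr)


dumps = resolve("json.dumps")
```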

Another option is if, instead of a factory (or as an alternative 
alongside it) we make distributions publishable themselves, like:

egg = MyAppSuite[filebrowser]

Which would require('MyAppSuite[filebrowser]'), and look in 
Paste.egg-info for a configuration file.  The [filebrowser] portion is 
pkg_resource's way of defining a feature, and I figure it can also 
identify a specific application if one package holds multiple 
applications.  However, that feature specification would be optional. 
What the configuration file in egg-info looks like, I don't know. 
Probably just like the original configuration file, except this time 
with a factory.

I don't like the configuration key "egg" though.  But eh, that's a detail.

2005-07-22 Thread Ian Bicking
Chris McDonough wrote:
> I've had a stab at creating a simple WSGI deployment implementation.
> I use the term "WSGI component" in here as shorthand to indicate all
> types of WSGI implementations (server, application, gateway).
> The primary deployment concern is to create a way to specify the
> configuration of an instance of a WSGI component, preferably within a
> declarative configuration file.  A secondary deployment concern is to
> create a way to "wire up" components together into a specific
> deployable "pipeline".  
> A strawman implementation that solves both issues via the
> "configurator", which would be presumed to live in "wsgiref". Currently
> it lives in a package named "wsgiconfig" on my laptop.  This module
> follows.

I have a weird problem reading unhighlighted source.  I dunno why.  But 
anyway, the configuration file is what interests me most...

>   To do this, we use a ConfigParser-format config file named
>   'myapplication.conf' that looks like this::
> [application:sample1]
> config = sample1.conf
> factory = wsgiconfig.tests.sample_components.factory1
> [application:sample2]
> config = sample2.conf
> factory = wsgiconfig.tests.sample_components.factory2
> [pipeline]
> apps = sample1 sample2

I think it's confusing to call both these applications.  I think 
"middleware" or "filter" would be better.  I think people understand 
"filter" far better, so I'm inclined to use that.  So...

[application:sample2]
# What is this relative to?  I hate both absolute paths and
# paths relative to pwd equally...
config = sample1.conf
factory = wsgiconfig...

config = sample1.conf
factory = ...

[pipeline]
# The app is unique and special...?
app = sample2
filters = sample1

Well, that's just a first refactoring; I'm having other inclinations...

> Potential points of contention
>  - The WSGI configurator assumes that you are willing to write WSGI
>component factories which accept a filename as a config file.  This
>factory returns *another* factory (typically a class) that accepts
>"the next" application in the pipeline chain and returns a WSGI
>application instance.  This pattern is necessary to support
>argument currying across a declaratively configured pipeline,
>because the WSGI spec doesn't allow for it.  This is more contract
>than currently exists in the WSGI specification but it would be
>trivial to change existing WSGI components to adapt to this
>pattern.  Or we could adopt a pattern/convention that removed one
>of the factories, passing both the "next" application and the
>config file into a single factory function.  Whatever.  In any
>case, in order to do declarative pipeline configuration, some
>convention will need to be adopted.  The convention I'm advocating
>above seems to already have been adopted by the current crop of middleware
>components (using a factory which accepts the application as the
>first argument).
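
The two-stage pattern quoted above can be sketched roughly as follows. The names `endpoint_factory`, `filter_factory`, and `wire`, and the `sketch.seen` environ key, are assumptions for illustration only; the factories just remember the config file name instead of parsing it.

```python
def endpoint_factory(config_filename):
    """Stage one: take the config file name, return the second factory."""
    def make_app(next_app=None):
        # An end-point application ignores the "next" application.
        def app(environ, start_response):
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [("configured by %s" % config_filename).encode("utf-8")]
        return app
    return make_app


def filter_factory(config_filename):
    """Stage one for a filter; stage two receives the next application."""
    def make_app(next_app):
        def app(environ, start_response):
            # Leave a trace so the wiring is observable, then delegate.
            seen = environ.get("sketch.seen", [])
            environ["sketch.seen"] = seen + [config_filename]
            return next_app(environ, start_response)
        return app
    return make_app


def wire(stages):
    """stages: (factory, config_filename) pairs, outermost first."""
    app = None
    for factory, config_filename in reversed(stages):
        # Curry the config first, then hand over the next application.
        app = factory(config_filename)(app)
    return app


pipeline = wire([
    (filter_factory, "sample1.conf"),
    (endpoint_factory, "sample2.conf"),
])
```

In this sketch the end-point factory is handed a next application of None, which is one way to keep a single calling convention for both kinds of component.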

I hate the proliferation of configuration files this implies.  I 
consider the filters an implementation detail; if they each have 
partitioned configuration then they become a highly exposed piece of the 

It's also a lot of management overhead.  Typical middleware takes 0-5 
configuration parameters.  For instance, paste.profilemiddleware is 
perfectly usable with no configuration at all, and only has two parameters.

But this is reasonably easy to resolve -- there's a perfectly good 
configuration section sitting there, waiting to be used:

   factory = paste.profilemiddleware.ProfileMiddleware
   # Show top 50 functions:
   limit = 50
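
A rough sketch of how a loader might turn such a section into keyword arguments, assuming Python 3's `configparser`. The section name `filter:profile`, the `SPECIAL_KEYS` set and `ProfileStub` are all made up for illustration; `ProfileStub` is not the real paste class, and a real loader would import the factory by name rather than reference it directly.

```python
import configparser

SPECIAL_KEYS = {'factory'}  # assumed set of reserved option names


def section_kwargs(parser, section):
    """Return every option in `section` except the reserved ones."""
    return {key: value
            for key, value in parser.items(section)
            if key not in SPECIAL_KEYS}


class ProfileStub:
    """Stand-in for a middleware class that takes one parameter."""

    def __init__(self, app, limit='40'):
        self.app = app
        self.limit = int(limit)  # INI values always arrive as strings


parser = configparser.ConfigParser()
parser.read_string("""
[filter:profile]
factory = ProfileStub
limit = 50
""")

kwargs = section_kwargs(parser, 'filter:profile')
profiled = ProfileStub(app=None, **kwargs)
```

The name conflict mentioned in the surrounding text shows up here as the `SPECIAL_KEYS` set: any name added to it later can no longer be used as a middleware parameter.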

This in no way precludes 'config', which is just a special case of this 
general configuration.  The only real problem is a possible conflict if 
we wanted to add new special names to the configuration, i.e., 

Another option is indirection like:

   factory = paste.profilemiddleware.ProfileMiddleware

   limit = 50

If we do something like this, the interface for these factories does 
become larger, as we're passing in objects that are more complex than 

Another thing this could allow is recursive configuration, like:

factory = paste.urlmap.URLMapBuilder
app1 = blog
app1.url = /
app2 = statview
app2.url = /stats
app3 = cms = dev.*

factory = leonardo.wsgifactory
config = myblog.conf

factory = statview
log_location = /var/logs/apache2

factory = proxy
location = http://localhost:8080
map = / /cms.php

app = urlmap

So URLMapBuilder needs the entire configuration file passed in, along 
with the name of the section it is building.  It then reads some keys, 
and builds some named applications, and creates an application that 
delegates based on patterns.  That's the kind of configuration file I 

2005-07-22 Thread Chris McDonough
I've had a stab at creating a simple WSGI deployment implementation.
I use the term "WSGI component" in here as shorthand to indicate all
types of WSGI implementations (server, application, gateway).

The primary deployment concern is to create a way to specify the
configuration of an instance of a WSGI component, preferably within a
declarative configuration file.  A secondary deployment concern is to
create a way to "wire up" components together into a specific
deployable "pipeline".  

A strawman implementation solves both issues via the
"configurator", which would be presumed to live in "wsgiref". Currently
it lives in a package named "wsgiconfig" on my laptop.  This module

""" Configurator for establishing a WSGI pipeline """

from ConfigParser import ConfigParser
import types

def configure(path):
    config = ConfigParser()
    if isinstance(path, types.StringTypes):
        config.read(path)
    else:
        config.readfp(path)

    appsections = []

    for name in config.sections():
        if name.startswith('application:'):
            appsections.append(name)
        elif name == 'pipeline':
            pass
        else:
            raise ValueError, '%s is not a valid section name' % name

    app_defs = {}

    for appsection in appsections:
        app_config_file = config.get(appsection, 'config')
        app_factory_name = config.get(appsection, 'factory')
        app_name = appsection.split('application:')[1]
        if app_config_file is None:
            raise ValueError, ('application section %s requires a "config" '
                               'option' % app_name)
        if app_factory_name is None:
            raise ValueError, ('application %s requires a "factory"'
                               ' option' % app_name)
        app_defs[app_name] = {'config':app_config_file,
                              'factory':app_factory_name}

    if not config.has_section('pipeline'):
        raise ValueError, 'must have a "pipeline" section in config'

    pipeline_str = config.get('pipeline', 'apps')
    if pipeline_str is None:
        raise ValueError, ('must have an "apps" definition in the '
                           'pipeline section')

    pipeline_def = pipeline_str.split()

    next = None

    # Build from the end of the pipeline so each app wraps the one after it.
    while pipeline_def:
        app_name = pipeline_def.pop()
        app_def = app_defs.get(app_name)
        if app_def is None:
            raise ValueError, ('appname %s is defined in pipeline '
                               '%s but no application is defined '
                               'with that name' % (app_name, pipeline_str))
        factory_name = app_def['factory']
        factory = import_by_name(factory_name)
        config_file = app_def['config']
        app_factory = factory(config_file)
        app = app_factory(next)
        next = app

    if not next:
        raise ValueError, 'no apps defined in pipeline'
    return next

def import_by_name(name):
    if not "." in name:
        raise ValueError("unloadable name: " + `name`)
    components = name.split('.')
    start = components[0]
    g = globals()
    package = __import__(start, g, g)
    modulenames = [start]
    for component in components[1:]:
        modulenames.append(component)
        try:
            package = getattr(package, component)
        except AttributeError:
            # Not yet an attribute: import it as a submodule instead.
            n = '.'.join(modulenames)
            package = __import__(n, g, g, component)
    return package
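
On current Python the same dotted-name lookup can be sketched with `importlib`. This is a simplified variant, not a drop-in replacement for the function above: it only handles a module path followed by at most one attribute, and it treats any `ImportError` as "the last component is an attribute".

```python
import importlib


def import_by_name(name):
    """Resolve 'package.module.attr' to the object it names."""
    if '.' not in name:
        raise ValueError('unloadable name: %r' % name)
    try:
        # The whole dotted name may itself be a module.
        return importlib.import_module(name)
    except ImportError:
        # Otherwise treat the last component as an attribute of a module.
        module_name, attr = name.rsplit('.', 1)
        module = importlib.import_module(module_name)
        return getattr(module, attr)
```

For example, `import_by_name('os.path.join')` returns the `os.path.join` function, while `import_by_name('os.path')` returns the module.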

  We configure a pipeline based on a config file, which
  creates and chains two "sample" WSGI applications together.

  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

    [application:sample1]
    config = sample1.conf
    factory = wsgiconfig.tests.sample_components.factory1

    [application:sample2]
    config = sample2.conf
    factory = wsgiconfig.tests.sample_components.factory2

    [pipeline]
    apps = sample1 sample2

  The configurator exposes a function that accepts a single argument,

>>> from wsgiconfig.configurator import configure
>>> appchain = configure('myapplication.conf')
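
A self-contained sketch of the same right-to-left chaining, assuming Python 3's `configparser`. To keep it runnable without any files or packages, the config is an in-memory string and factories are looked up in a hypothetical `FACTORIES` registry instead of being imported by dotted name; `factory1` and `factory2` here are stand-ins, not the real sample components.

```python
import configparser

CONFIG = """
[application:sample1]
config = sample1.conf
factory = factory1

[application:sample2]
config = sample2.conf
factory = factory2

[pipeline]
apps = sample1 sample2
"""


def factory1(config_file):
    def make(next_app):
        def middleware(environ, start_response):
            environ['sample1'] = True
            return next_app(environ, start_response)
        return middleware
    return make


def factory2(config_file):
    def make(next_app):
        def endpoint(environ, start_response):
            start_response('200 OK', [('Content-Type', 'text/plain')])
            return [b'sample2']
        return endpoint
    return make


FACTORIES = {'factory1': factory1, 'factory2': factory2}


def configure(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    app = None
    # Walk the pipeline from the end so each app wraps the one after it.
    for name in reversed(parser.get('pipeline', 'apps').split()):
        section = 'application:' + name
        factory = FACTORIES[parser.get(section, 'factory')]
        app = factory(parser.get(section, 'config'))(app)
    if app is None:
        raise ValueError('no apps defined in pipeline')
    return app


appchain = configure(CONFIG)
```

Calling `appchain(environ, start_response)` runs sample1 first, which sets its environ key and then delegates to sample2.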

  The "sample_components" module referred to in the
  'myapplication.conf' file application definitions might look like

  class sample1:
      """ middleware """
      def __init__(self, app):
          self.app = app
      def __call__(self, environ, start_response):
          environ['sample1'] = True
          return self.app(environ, start_response)

  class sample2:
      """ end-point app """
      def __init__(self, app):
          self.app = app

      def __call__(self, environ, start_response):

Re: [Web-SIG] Standardized configuration

2005-07-19 Thread ChunWei Ho
> (b)
> Have chain application = authmiddleware(fileserverapp)
> Use Handlers, as Ian suggested, and in the fileserverapp's init:
> Handlers(
>   IfTest(method=GET,MimeOkForGzip=True, RunApp=gzipmiddleware(doGET)),
>   IfTest(method=GET,MimeOkForGzip=False, RunApp=doGET),
>   IfTest(method=POST,MimeOkForGzip=True, RunApp=gzipmiddleware(doPOST)),
>   IfTest(method=POST,MimeOkForGzip=False, RunApp=doPOST),
>   IfTest(method=PUT, RunApp=doPOST)
> )

It was Graham who suggested the use of Handlers initially. Sincere
apologies for my confusion.

> (c)
> Make gzipmiddleware a service in the following form:
> class gzipmiddleware:
>   def __init__(self, application=None, configparam=None):
>  self._application = application
>   def __call__(self, environ, start_response, application=None,
> configparam=None):
>  if application and configparam is specified, use them instead of
> the init values
>  do start_response
>  call self._application(environ, start_response) as iterable
>  get each iterator output and zip and yield it.
> This "middleware" is still compatible with PEP-333, but can also be used as:
> #on main application initialization, create a gzipservice and put it
> in environ without
> #specifying application or configparams for init():
> environ['service.gzip'] = gzipmiddleware()
> Modify fileserverapp to:
> def fileserverapp(environ, start_response):
>if(mimetype ok for gzip):
>gzipservice = environ['service.gzip']
>return gzipservice(environ, start_response, doGET, 
> gzipconfigparams)
>else: return doGET(environ, start_response)
>if(mimetype ok for gzip):
>gzipservice = environ['service.gzip']
>return gzipservice(environ, start_response, doPOST,
> gzipconfigparams)
>else: return doPOST(environ, start_response)
>if(PUT): doPUT(environ, start_response)
> The main difference here is that you don't have to initialize full
> application chains for each possible middleware-path for the request.
> This would be very useful if you had many middleware in the chain with
> many permutations as to which middleware are needed
> You could also instead put a service factory object into environ, it
> will return the gzipmiddleware object as a service if already exist,
> otherwise it will create it and then return it.
2005-07-19 Thread Shannon -jj Behrens

100% agreed.

Libraries are more flexible than middleware because you get to decide
when, if, and how they get called.  Middleware has its place, but it
doesn't make sense to try to package all library code as middleware. 
Even when you do write middleware, it should simply link in library
code so that you can use the library code in the absence of the

Consider an XSLT middleware layer.  It makes sense to have such a
thing.  It doesn't make sense to only be able to use the XSLT code via
the middleware interface.  As much as possible, you want to be able to
interact with libraries directly.

Best Regards,

On 7/17/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 03:28 AM 7/17/2005 -0500, Ian Bicking wrote:
> >Phillip J. Eby wrote:
> >>What I think you actually need is a way to create WSGI application
> >>objects with a "context" object.  The "context" object would have a
> >>method like "get_service(name)", and if it didn't find the service, it
> >>would ask its parent context, and so on, until there's no parent context
> >>to get it from.  The web server would provide a way to configure a root
> >>or default context.
> >
> >I guess I'm treating the request environment as that context.  I don't
> >really see the problem with that...?
> It puts a layer in the request call stack for each service you want to
> offer, versus *no* layers for an arbitrary number of services.  It adds
> work to every request to put stuff into the environment, then take it out
> again, versus just getting what you want in the first place.
> >In many cases, the middleware is modifying or watching the application's
> >output.  For instance, catching a 401 and turning that into the
> >appropriate login -- which might mean producing a 401, a redirect, a login
> >page via internal redirect, or whatever.
> And that would be legitimate middleware, except I don't think that's what
> you really want for that use case.  What you want is an "authentication
> service" that you just call to say, "I need a login" and get the login
> information from, and return its return value so that it does
> start_response for you and sends the right output.
> The difference is obliviousness; if you want to *wrap* an application not
> written to use WSGI services, then it makes sense to make it
> middleware.  If you're writing a new application, just have it use
> components instead of mocking up a 401 just so you can use the existing
> middleware.
> Notice, by the way, that it's trivial to create middleware that detects the
> 401 and then *invokes the service*.  So, it's more reusable to make
> services be services, and middleware be wrappers to apply services to
> oblivious applications.
> >I guess you could make one Uber Middleware that could handle the services'
> >needs to rewrite output, watch for errors and finalize resources, etc.
> Um, it's called a library of functions.  :)  WSGI was designed to make it
> easy to use library calls to do stuff.  If you don't need the
> obliviousness, then library calls (or service calls) are the Obvious Way To
> Do It.
> >   This isn't unreasonable, and I've kind of expected one to evolve at
> > some point.  But you'll have to say more to get me to see how "services"
> > is a better way to manage this.
> I'm saying that middleware can use services, and applications can use
> services.  Making applications *have to* use middleware in order to use the
> services is wasteful of both computer time and developer brainpower.  Just
> let them use services directly when the situation calls for it, and you can
> always write middleware to use the services when you encounter the
> occasional (and ever-rarer with time) oblivious application.
> >>Really, the only stuff that actually needs to be middleware, is stuff
> >>that wraps an *oblivious* application; i.e., the application doesn't know
> >>it's there.  If it's a service the application uses, then it makes more
> >>sense to create a service management mechanism for configuration and
> >>deployment of WSGI applications.
> >
> >Applications always care about the things around them, so any convention
> >that middleware and applications be unaware of each other would rule out
> >most middleware.
> Yes, exactly!  Now you understand me.  :)  If the application is what wants
> the service, let it just call the service.  Middleware is *overhead* in
> that case.
> >>I hope this isn't too vague; I've been wanting to say something about
> >>this since I saw your blog post about doing transaction services in WSGI,
> >>as that was when I first understood why you were making everything into
> >>middleware.  (i.e., to create a poor man's substitute for "placeful"
> >>services and utilities as found in PEAK and Zope 3.)
> >
> >What do they provide that middleware does not?
> Well, some services may be things the application needs only when it's
> being initially configured.  Or maybe the service is somethi

Re: [Web-SIG] Standardized configuration

2005-07-19 Thread mike bayer

While I'm not following every detail of this discussion, this line caught
my attention -

Ian Bicking said:
> Really, if you are building user-visible standard libraries, you are
> building a framework.

only because Fowler recently posted something that made me think about
this, where he distinguishes a "framework" as being something which
employs the "inversion of control" principle, as Paste does, versus a
"library" which does not: .

I know there's a lot of discussion over "A Framework? Not a Framework?"
lately, largely in response to the recent meme "more frameworks == BAD"
that seems to be getting around these days; perhaps Fowler's distinction
is helpful...I hadn't thought of it that way before.
Re: [Web-SIG] Standardized configuration

2005-07-19 Thread ChunWei Ho
Hi, I have been looking at WSGI for only a few weeks, but had some
ideas similar (I hope) to what is being discussed that I'll put down
here. I'm new to this so I beg your indulgence if this is heading down
the wrong track or wildly offtopic :)

It seems to me that a major drawback of WSGI middleware that is
preventing flexible configuration/chain paths is that the application
to be run has to be determined at init time. It would be much more flexible if we
were able to specify what application to run and configuration
information at call time - the middleware would be able to approximate
a service of sorts.

An example:
I have a WSGI application simulating a file-server, and I wish to
authenticate users and gzip served files where applicable. In a
middleware chain it would probably work out to be:
application = authmiddleware(gzipmiddleware(fileserverapp))

For example, a simplified gzipping middleware consists of:
class gzipmiddleware:
  def __init__(self, application, configparam):
 self._application = application
  def __call__(self, environ, start_response):
 do start_response
 call self._application(environ, start_response) as iterable
 get each iterator output and zip and yield it.
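
The pseudocode above could be made concrete along these lines. This is a deliberately simplified sketch: it buffers the whole response before compressing, ignores the `write()` callable and `configparam`, and does not check the client's Accept-Encoding header, all of which a real gzip middleware would need to handle.

```python
import gzip


class GzipMiddleware:
    def __init__(self, application):
        self._application = application

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status and headers until the body is compressed.
            captured['status'] = status
            captured['headers'] = list(headers)

        app_iter = self._application(environ, capture)
        try:
            body = gzip.compress(b''.join(app_iter))
        finally:
            if hasattr(app_iter, 'close'):
                app_iter.close()

        # Content-Length changes after compression, so replace it.
        headers = [(k, v) for k, v in captured['headers']
                   if k.lower() != 'content-length']
        headers.append(('Content-Encoding', 'gzip'))
        headers.append(('Content-Length', str(len(body))))
        start_response(captured['status'], headers)
        return [body]
```

Note that the wrapped application is still fixed at init time here, which is exactly the limitation the rest of this message is about.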

and the fileserverapp, with doGET, doPUT, doPOST subapplications that
do the actual processing:
def fileserverapp(environ, start_response):
   if(GET): return doGET(environ, start_response)
   if(POST): return doPOST(environ, start_response)
   if(PUT): return doPUT(environ, start_response)

Now, the application-server is specific on what it wishes to gzip
(usually only on GET or POST entity responses and only if the mimetype
allows it). But this level of logic is not to be placed in the
gzipping middleware, since it's configurable on the webserver. So in
order to tell the gzipmiddleware whether to gzip or not:

(a) Add a key in environ, say environ['gzip.do_gzip'] = True or False, to
inform the gzipmiddleware whether to gzip or not. This does mean that
gzipmiddleware remains in the chain, regardless of whether it is
needed or not.

(b)
Have chain application = authmiddleware(fileserverapp)
Use Handlers, as Ian suggested, and in the fileserverapp's init:
Handlers(
  IfTest(method=GET,MimeOkForGzip=True, RunApp=gzipmiddleware(doGET)),
  IfTest(method=GET,MimeOkForGzip=False, RunApp=doGET),
  IfTest(method=POST,MimeOkForGzip=True, RunApp=gzipmiddleware(doPOST)),
  IfTest(method=POST,MimeOkForGzip=False, RunApp=doPOST),
  IfTest(method=PUT, RunApp=doPOST)
)

(c)
Make gzipmiddleware a service in the following form:
class gzipmiddleware:
  def __init__(self, application=None, configparam=None):
     self._application = application
  def __call__(self, environ, start_response, application=None,
               configparam=None):
     if application and configparam are specified, use them instead of
     the init values
     do start_response
     call self._application(environ, start_response) as iterable
     get each iterator output and zip and yield it.

This "middleware" is still compatible with PEP-333, but can also be used as:
#on main application initialization, create a gzipservice and put it
in environ without
#specifying application or configparams for init():
environ['service.gzip'] = gzipmiddleware()

Modify fileserverapp to:
def fileserverapp(environ, start_response):
   if(GET):
      if(mimetype ok for gzip):
         gzipservice = environ['service.gzip']
         return gzipservice(environ, start_response, doGET, gzipconfigparams)
      else: return doGET(environ, start_response)
   if(POST):
      if(mimetype ok for gzip):
         gzipservice = environ['service.gzip']
         return gzipservice(environ, start_response, doPOST, gzipconfigparams)
      else: return doPOST(environ, start_response)
   if(PUT): return doPUT(environ, start_response)

The main difference here is that you don't have to initialize full
application chains for each possible middleware-path for the request.
This would be very useful if you had many middleware in the chain with
many permutations as to which middleware are needed

You could also instead put a service factory object into environ, it
will return the gzipmiddleware object as a service if already exist,
otherwise it will create it and then return it.
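
The service-factory idea in the last paragraph might look something like this. All names are hypothetical, including the `service.factory` environ key, and `UppercaseService` stands in for the gzip service so the effect is easy to see: like the service form of gzipmiddleware, it receives the application at call time rather than at init time.

```python
class ServiceFactory:
    """Create each named service on first request and reuse it afterwards."""

    def __init__(self, constructors):
        self._constructors = constructors   # name -> zero-argument callable
        self._instances = {}

    def __call__(self, name):
        if name not in self._instances:
            self._instances[name] = self._constructors[name]()
        return self._instances[name]


class UppercaseService:
    """Stand-in for the gzip service; the app is passed in per call."""

    def __call__(self, environ, start_response, application):
        return [chunk.upper()
                for chunk in application(environ, start_response)]


def do_get(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'file contents']


def fileserverapp(environ, start_response):
    # The application decides, per request, whether to use the service.
    service = environ['service.factory']('upper')
    return service(environ, start_response, do_get)


environ = {'service.factory': ServiceFactory({'upper': UppercaseService})}
```

Only one factory object has to be placed in environ, however many services exist, and services nobody asks for are never constructed.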
2005-07-19 Thread Ian Bicking
Phillip J. Eby wrote:
>> In many cases, the middleware is modifying or watching the 
>> application's output.  For instance, catching a 401 and turning that 
>> into the appropriate login -- which might mean producing a 401, a 
>> redirect, a login page via internal redirect, or whatever.
> And that would be legitimate middleware, except I don't think that's 
> what you really want for that use case.  What you want is an 
> "authentication service" that you just call to say, "I need a login" and 
> get the login information from, and return its return value so that it 
> does start_response for you and sends the right output.

Like I mentioned in my response to Chris, this kind of contract about 
return values is a difficult one to implement.  A "return 401 status" 
contract is pretty simple, in that it's vague in a way that fits with 
typical frameworks -- they all have a way of changing the status, and 
most have a way of aborting with that kind of error.

> The difference is obliviousness; if you want to *wrap* an application 
> not written to use WSGI services, then it makes sense to make it 
> middleware.  If you're writing a new application, just have it use 
> components instead of mocking up a 401 just so you can use the existing 
> middleware.

Who's writing new applications?  OK... I guess a lot of people are.  I 
may be more focused on retrofitting compared to other people.

> Notice, by the way, that it's trivial to create middleware that detects 
> the 401 and then *invokes the service*.  So, it's more reusable to make 
> services be services, and middleware be wrappers to apply services to 
> oblivious applications.

Yes, this would be the single-middleware-multiple-service model.  I 
don't understand exactly how services work myself, so I can't write 
that, but I'm certainly interested in examples.  Well... I'll throw out 
one just for the heck of it:

class ServiceMiddleware(object):

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        context = environ['webapp.service_context'] = ServiceContext()
        # You could also do some thread-local registering of this
        # context at this point
        def replacement_start_response(status, headers):
            writer = context.start_response(
                start_response, status, headers)
            return writer
        app_iter = self.app(environ, replacement_start_response)
        return context.app_iter(app_iter)

class ServiceContext(object):

    def __init__(self):
        self.services = []

    def get_service(self, name):
        ... something I don't understand ...
        return service

    def start_response(self, start_response, status, headers):
        for service in self.services:
            if hasattr(service, 'munge_start_response'):
                status, headers = service.munge_start_response(
                    status, headers)
        return start_response(status, headers)

    def app_iter(self, app_iter):
        return app_iter
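
One possible reading of the `get_service` part, based only on Phillip's description quoted earlier in this thread (look locally, then ask the parent context, and so on up to a root). This is a guess at the shape, not PEAK's or Zope 3's actual API; `Context`, `register` and the string "services" are placeholders.

```python
class Context:
    def __init__(self, parent=None):
        self.parent = parent
        self.services = {}

    def register(self, name, service):
        self.services[name] = service

    def get_service(self, name):
        # Walk up the chain of contexts until some level has the service.
        context = self
        while context is not None:
            if name in context.services:
                return context.services[name]
            context = context.parent
        raise LookupError('no service named %r' % name)


# The server would configure a root context; applications get children.
root = Context()
root.register('auth', 'root-auth')
root.register('session', 'root-session')

child = Context(parent=root)
child.register('auth', 'child-auth')
```

Here `child.get_service('auth')` finds the child's own registration, while `child.get_service('session')` falls through to the root.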

And ServiceContext should also ask services if they care to munge_body 
or something, and then pipe all calls to the writer and all the parts of 
app_iter into that service if so.  And it should let services catch 

>> I guess you could make one Uber Middleware that could handle the 
>> services' needs to rewrite output, watch for errors and finalize 
>> resources, etc.
> Um, it's called a library of functions.  :)  WSGI was designed to make 
> it easy to use library calls to do stuff.  If you don't need the 
> obliviousness, then library calls (or service calls) are the Obvious Way 
> To Do It.

I do use library calls when possible; and even when not possible I 
(generally) try to make the middleware as small as possible, just 
handling the logic of the transformation.  But mostly libraries don't 
need to be discussed here, because they are simple ;)

There are perhaps a few places where standardization of some library 
manipulations would be useful.  E.g., get_cookies() and 
parse_querystring() in paste.wsgilib 
( could be 
standardized, and then WSGI-based libraries that were interested in the 
request could probably retrieve the frameworks' parsed version of URL 
and cookie parameters.

>>> Really, the only stuff that actually needs to be middleware, is stuff 
>>> that wraps an *oblivious* application; i.e., the application doesn't 
>>> know it's there.  If it's a service the application uses, then it 
>>> makes more sense to create a service management mechanism for 
>>> configuration and deployment of WSGI applications.
>> Applications always care about the things around them, so any 
>> convention that middleware and applications be unaware of each other 
>> would rule out most middleware.
> Yes, exactly!  Now you understand me.  :)  If the application is what 
> wants the service, le

2005-07-19 Thread Shannon -jj Behrens
It seems to me that authentication and authorization should be put
into a library that isn't bound to the Web at all.  I thought that
those guys reimplementing J2EE in Python did that. :-/

Oh well,

On 7/16/05, Chris McDonough <[EMAIL PROTECTED]> wrote:
> I've also been putting a bit of thought into middleware configuration,
> although maybe in a different direction.  I'm not too concerned yet
> about being able to introspect the configuration of an individual
> component.  Maybe that's because I haven't thought about the problem
> enough to be concerned about it.  In the meantime, though, I *am*
> concerned about being able to configure a middleware "pipeline" easily
> and have it work.
> I've been attempting to divine a declarative way to configure a pipeline
> of WSGI middleware components.  This is simple enough through code,
> except that at least in terms of how I'm attempting to factor my
> middleware, some components in the pipeline may have dependencies on
> other pipeline components.
> For example, it would be useful in some circumstances to create separate
> WSGI components for user identification and user authorization.  The
> process of identification -- obtaining user credentials from a request
> -- and user authorization  -- ensuring that the user is who he says he
> is by comparing the credentials against a data source -- are really
> pretty much distinct operations.  There might also be a "challenge"
> component which forces a login dialog.
> In practice, I don't know if this is a truly useful separation of
> concerns that need to be implemented in terms of separate components in
> the middleware pipeline (I see that paste.login conflates them), it's
> just an example.  But at very least it would keep each component simpler
> if the concerns were factored out into separate pieces.
> But in the example I present, the "authentication" component depends
> entirely on the result of the "identification" component.  It would be
> simple enough to glom them together by using a distinct environment key
> for the identification component results and have the authentication
> component look for that key later in the middleware result chain, but
> then it feels like you might as well have written the whole process
> within one middleware component because the coupling is pretty strong.
> I have a feeling that adapters fit in here somewhere, but I haven't
> really puzzled that out yet.  I'm sure this has been discussed somewhere
> in the lifetime of WSGI but I can't find much in this list's archives.
> > Lately I've been thinking about the role of Paste and WSGI and
> > whatnot. Much of what makes a Paste component Pastey is
> > configuration;  otherwise the bits are just independent pieces of
> > middleware, WSGI applications, etc.  So, potentially if we can agree
> > on configuration, we can start using each other's middleware more
> > usefully.
> >
> > I think we should avoid questions of configuration file syntax for
> > now.  Let's instead simply consider configuration consumers.  A
> > standard would consist of:
> >
> > * A WSGI environment key (e.g., 'webapp01.config')
> > * A standard for what goes in that key (e.g., a dictionary object)
> > * A reference implementation of the middleware
> > * Maybe a non-WSGI-environment way to access the configuration (like
> > paste.CONFIG, which is a global object that dispatches to per-request
> > configuration objects) -- in practice this is really really useful, as
> > you don't have to pass the configuration object around.
> >
> > There's some other things we have to consider, as configuration syntaxes
> > do affect the configuration objects significantly.  So, the standard for
> > what goes in the key has to take into consideration some possible
> > configuration syntaxes.
> >
> > The obvious starting place is a dictionary-like object.  I would suggest
> > that the keys should be valid Python identifiers.  Not all syntaxes
> > require this, but some do.  This restriction simply means that
> > configuration consumers should try to consume Python identifiers.
> >
> > There's also a question about name conflicts (two consumers that are
> > looking for the same key), and whether nested configuration should be
> > preferred, and in what style.
> >
> > Note that the standard we decide on here doesn't have to be the only way
> > the object can be accessed.  For instance, you could make your
> > configuration available through 'myframework.config', and create a
> > compliant wrapper that lives in 'webapp01.config', perhaps even doing
> > different kinds of mapping to fix convention differences.
> >
> > There's also a question about what types of objects we can expect in the
> > configuration.  Some input styles (e.g., INI and command line) only
> > produce strings.  I think consumers should treat strings (or maybe a
> > special string subclass) specially, performing conversions as necessary
> > (e.g., 'yes'->True).
> >
> > Thoughts?
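
The string-conversion point at the end of the quoted proposal could be handled by a small helper on the consumer side. A minimal sketch, assuming a particular set of accepted words (the proposal only gives 'yes' as an example); `asbool`, `TRUE_WORDS` and `FALSE_WORDS` are illustrative names.

```python
TRUE_WORDS = {'yes', 'true', 'on', '1'}
FALSE_WORDS = {'no', 'false', 'off', '0'}


def asbool(value):
    """Convert a configuration value to bool.

    Strings are interpreted by word; anything else goes through bool().
    """
    if isinstance(value, str):
        word = value.strip().lower()
        if word in TRUE_WORDS:
            return True
        if word in FALSE_WORDS:
            return False
        raise ValueError('not a boolean: %r' % value)
    return bool(value)
```

A consumer can then call `asbool(config['debug'])` without caring whether the value came from an INI file (a string) or from Python code (already a bool).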

2005-07-19 Thread Ian Bicking
Phillip J. Eby wrote:
> At 07:29 AM 7/17/2005 -0400, Chris McDonough wrote:
>> I'm a bit confused because one of the canonical examples of
>> how WSGI middleware is useful seems to be the example of implementing a
>> framework-agnostic sessioning service.  And for that sessioning service
>> to be useful, your application has to be able to depend on its
>> availability so it can't be "oblivious".
> Exactly.  As soon as you start trying to have configured services, you 
> are creating Yet Another Framework.  Which isn't a bad thing per se, 
> except that it falls outside the scope of  PEP 333.  It deserves a 
> separate PEP, I think, and a separate implementation mechanism than 
> being crammed into the request environment.  These things should be 
> allowed to be static, so that an application can do some reasonable 
> setup, and so that you don't have per-request overhead to shove ninety 
> services into the environment.

The services themselves can be fairly lazy; though unfortunately you 
can't be tricky and add laziness when a service was originally written 
in a very concrete way, since that would require fake dictionaries and 
other things WSGI disallows.

But there's not a lot of overhead to environ['paste.session.factory']() 
-- it's just a stub object stuck in a particular key, that knows the 
context in which it was created so it can communicate with that context 

> Also, because we are dealing not with basic plumbing but with making a 
> nice kitchen, it seems to me we can afford to make the fixtures nice.  
> That is, for an add-on specification to WSGI we don't need to adhere to 
> the "let it be ugly for apps if it makes the server easier" principle 
> that guided PEP 333.  The assumption there was that people would mostly 
> port existing wrappers over HTTP/CGI to be wrappers over WSGI.  But for 
> services, we are talking about an actual framework to be used by 
> application developers directly, so more user-friendliness is definitely 
> in order.

My own vision for most middleware is that it get wrapped by frameworks. 
  In fact, that it be so godawful ugly you can't help but wrap it ;) 
Well, not deliberately horrible for no good reason... but at least that 
it doesn't matter that much, because the frameworks will want to wrap it 

This is the "aesthetically neutral" aspect of middleware that I've 
mentioned before.  People get all bothered if you use underscores 
instead of mixed case, or vice versa, even though that's one of the 
least important aspects of the features being implemented.

Of course, there are real problems with wrapping.  Like it reduces the 
transparency -- middleware becomes this magic part of the system because 
it's not something people deal with day-to-day, and if your first chance 
to work with middleware is to write it, that's intimidating.  There's 
also the leaky abstraction problem; though I think well-defined 
middleware helps avoid this.

Really, if you are building user-visible standard libraries, you are 
building a framework.  And maybe I'm just too pessimistic about a 
standard framework... but, well, I am certainly not optimistic about it. 
  On the other hand, it's not like people are breaking down my door with 
their enthusiasm to use Paste middleware either.  So I dunno.

I can only say a good strategy clearly has to build on developers' 
laziness, their fear of new things, and their reluctance to learn new 
things.  Well, that's the negative way of saying it.  It has to build on 
the likelihood that their attention is primarily focused on their 
domain, that it builds on their existing knowledge, and that it presents 
a minimal set of new concepts.

Ian Bicking  /  [EMAIL PROTECTED]  /
2005-07-19 Thread Ian Bicking
Chris McDonough wrote:
> On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
>>In addition to the examples I gave in response to Graham, I wrote a 
>>document on this a while ago: 
>>The hard part about this is configuration; it's easy to configure a 
>>non-branching chain of middleware.  Once it branches the configuration 
>>becomes hard (like programming-hard; which isn't *hard*, but it quickly 
>>stops feeling like configuration).
> Yep.  I think I'm getting it.  For example, I see that Paste's URLParser
> seems to *construct* applications if they don't already exist based on
> the URL.  And I assume that these applications could themselves be
> middleware.  I don't think that is configurable declaratively if you
> want to decide which app to use based on arbitrary request parameters.
> But if we already had the config for each app "instance" that URLParser
> wanted to consult laying around as files on disk, wouldn't it be just as
> easy to construct these app objects "eagerly" at startup time?  Then your
> URLParser could choose an already-configured app based on some sort of
> configuration file in the URLParser component itself.  The "apps"
> themselves may be pipelines, too, I realize that, but that is still
> configurable without coding.

That's what paste.urlmap is for:

(I haven't actually tried using it much for practical things, so it's 
quite possible it has design mistakes in it)

The idea being that you do:

   urlmap['/myapp'] = MyApp()

But additionally (in PathProxyURLMap):

   urlmap['/myapp'] = 'myapp.conf'

And it builds the application from the configuration file.
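
A minimal sketch of the first form (prefix mapped to an app object), 
for illustration only.  SimpleURLMap is a made-up name and this is not 
Paste's actual urlmap code; it assumes prefixes are written without a 
trailing slash, with '' standing for the root.  The config-file variant 
is left out.

```python
# Hypothetical sketch: a dict of path prefixes to WSGI apps.
# This is NOT the real paste.urlmap implementation.

class SimpleURLMap(dict):
    """Dispatch to the app registered under the longest matching prefix."""

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        # Try longer prefixes first so '/myapp/admin' beats '/myapp'.
        for prefix in sorted(self, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                # Move the matched prefix from PATH_INFO to SCRIPT_NAME.
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return self[prefix](environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
```

Usage would look like urlmap = SimpleURLMap(); urlmap['/myapp'] = MyApp(), 
after which urlmap itself is handed to the server as the application.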

> Maybe there'd be some concern about needing to stop the process in order
> to add new applications.  That's a use case I hadn't really considered.
> I suspect this could be done with a signal handler, though, which could
> tell the URLParser to reload its config file instead of potentially
> locating and creating a new application within every request.
> This would make URLParser a kind of "decision" middleware, but it would
> choose from a static set of existing applications (or pipelines) for the
> lifetime of the process as opposed to constructing them lazily.

URLParser itself is just one parsing implementation, though maybe named 
too generically.  I don't think that particular code needs to grow many 
more features, but there's also room for many other parsers.  And it's 
also fairly easy to wrestle control from URLParser if that gets put in 
the stack (for instance, putting an application function in 
will basically take over URL parsing for that  directory).

>>>OTOH, I'm not sure that I want my framework to "find" an app for me.
>>>I'd like to be able to define pipelines that include my app, but I'd
>>>typically just want to statically declare it as the end point of a
>>>pipeline composed of service middleware.  I should look at Paste a
>>>little more to see if it has the same philosophy or if I'm
>>>misunderstanding you.
>>Mostly I wanted to avoid lots of magical incantations for the simple 
>>case.  If you are used to Webware, well it has a very straight-forward 
>>way of finding your application -- you give it a directory name.  If 
>>Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
>>ZEO connection string, and so on.
> I think I understand now.
> In general, I think I'd rather create "instance" locations of WSGI
> applications (which would essentially consist of a config file on disk
> plus any state info required by the app), configure and construct Python
> objects out of those instances eagerly at "startup time" and just choose
> between already-constructed apps in "decision middleware" that has
> its own declarative configuration if decisions need to be made about
> which app to use.

I think this is a laudable goal.  Right now, when I'm deploying 
applications written for Paste, I am reluctant to intermingle them in 
the same process and configuration... but that's because Paste is young, 
not because that's a bad idea.  But as a result I haven't tried it, and 
I only have a moderate concept of what it would mean in practice.

A neat feature would be to configure fairly seamlessly across process 
boundaries.  E.g., add a "fork=True" parameter to an application's 
configuration, and the server would fork a process (or delegate to an 
already forked worker process) for that application.  That's the sort of 
thing that could move Python into PHP-style hosting situations.

> This is mostly because I want the configuration info to live within the
> application/middleware instance and have some other "starter" import
> those configurations from application/middleware instance locations on
> the filesystem.  The "starter" would construct required instances as
> Python objects, and chain them together arbitrarily based on some

Re: [Web-SIG] Standardized configuration

2005-07-18 Thread Chris McDonough
On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
> In addition to the examples I gave in response to Graham, I wrote a 
> document on this a while ago: 
> The hard part about this is configuration; it's easy to configure a 
> non-branching chain of middleware.  Once it branches the configuration 
> becomes hard (like programming-hard; which isn't *hard*, but it quickly 
> stops feeling like configuration).

Yep.  I think I'm getting it.  For example, I see that Paste's URLParser
seems to *construct* applications if they don't already exist based on
the URL.  And I assume that these applications could themselves be
middleware.  I don't think that is configurable declaratively if you
want to decide which app to use based on arbitrary request parameters.

But if we already had the config for each app "instance" that URLParser
wanted to consult laying around as files on disk, wouldn't it be just as
easy to construct these app objects "eagerly" at startup time?  Then your
URLParser could choose an already-configured app based on some sort of
configuration file in the URLParser component itself.  The "apps"
themselves may be pipelines, too, I realize that, but that is still
configurable without coding.

Maybe there'd be some concern about needing to stop the process in order
to add new applications.  That's a use case I hadn't really considered.
I suspect this could be done with a signal handler, though, which could
tell the URLParser to reload its config file instead of potentially
locating and creating a new application within every request.

This would make URLParser a kind of "decision" middleware, but it would
choose from a static set of existing applications (or pipelines) for the
lifetime of the process as opposed to constructing them lazily.
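
Roughly, and only as a sketch: the chooser below holds a table of apps
that were all built up front, and a reload swaps in a freshly built
table.  StaticChooser, load_apps and install_reload_handler are made-up
names, and keying on the first path segment is just one possible
decision rule.

```python
import signal


class StaticChooser(object):
    """Choose among apps that were all constructed at startup."""

    def __init__(self, load_apps):
        # load_apps is a hypothetical callable returning {segment: wsgi_app};
        # in a real deployment it would read the chooser's own config file.
        self._load_apps = load_apps
        self.apps = load_apps()

    def reload(self, signum=None, frame=None):
        # Build the whole new table first, then swap it in with a single
        # assignment, so a request never sees a half-built table.
        self.apps = self._load_apps()

    def __call__(self, environ, start_response):
        # Key on the first path segment: '/blog/2005' -> 'blog'.
        segment = environ.get("PATH_INFO", "").lstrip("/").split("/", 1)[0]
        app = self.apps.get(segment)
        if app is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not Found"]
        return app(environ, start_response)


def install_reload_handler(chooser):
    # SIGHUP exists only on Unix-like systems; do nothing elsewhere.
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, chooser.reload)
```

No app is ever constructed inside a request here; the only per-request
work is one dictionary lookup.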

> > OTOH, I'm not sure that I want my framework to "find" an app for me.
> > I'd like to be able to define pipelines that include my app, but I'd
> > typically just want to statically declare it as the end point of a
> > pipeline composed of service middleware.  I should look at Paste a
> > little more to see if it has the same philosophy or if I'm
> > misunderstanding you.
> Mostly I wanted to avoid lots of magical incantations for the simple 
> case.  If you are used to Webware, well it has a very straight-forward 
> way of finding your application -- you give it a directory name.  If 
> Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
> ZEO connection string, and so on.

I think I understand now.

In general, I think I'd rather create "instance" locations of WSGI
applications (which would essentially consist of a config file on disk
plus any state info required by the app), configure and construct Python
objects out of those instances eagerly at "startup time" and just choose
between already-constructed apps in "decision middleware" that has
its own declarative configuration if decisions need to be made about
which app to use.

This is mostly because I want the configuration info to live within the
application/middleware instance and have some other "starter" import
those configurations from application/middleware instance locations on
the filesystem.  The "starter" would construct required instances as
Python objects, and chain them together arbitrarily based on some other
"pipeline configuration" file that lives with the "starter".  The first
part of that (construct required instances) is described in a post I
made to this list yesterday.

This is probably because I'd like there to be one well-understood way to
declaratively configure pipelines as opposed to each piece of middleware
potentially needing to manage app construction and having its own
configuration to do so.
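
As a sketch of the "starter" step only (the names are invented and the
pipeline configuration is reduced to a list of names), chaining could
look something like this, assuming each middleware factory takes the
next application and the end-point factory takes nothing:

```python
def build_pipeline(names, registry):
    """Chain named components; the last name is the end-point application.

    `registry` is a hypothetical mapping of name -> factory.  Middleware
    factories take the next application as their only argument; the
    application factory takes no arguments.
    """
    app = registry[names[-1]]()
    # Wrap from the inside out so that names[0] ends up outermost.
    for name in reversed(names[:-1]):
        app = registry[name](app)
    return app
```

With a registry in hand, build_pipeline(['session', 'auth', 'myapp'],
registry) would return the outermost callable to pass to the server.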

I don't know if this is reasonable for simpler requirements.  This is
more of a "formal deployment spec" idea and of course is likely flawed
in some subtle way I don't understand yet.

> > I'm pretty sure you're not advocating it, but in case you are, I'm not
> > sure it adds as much value as it removes to be able to have a "dynamic"
> > middleware chain whereby new middleware elements can be added "on the
> > fly" to a pipeline after a request has begun.  That is *very* "late
> > binding" to me and it's impossible to configure declaratively.
> I'm comfortable with a little of both.  I don't even know *how* I'd stop 
> dynamic middleware.  For instance, one of the methods I added to Wareweb 
> recently allows any servlet to forward to any WSGI application; but from 
> the outside the servlet looks like a normal WSGI application just like 
> before.

It's obviously fine if applications themselves want to do this.  I'm not
sure that it would be possible to create a "deployment spec" that
canonized *how* to do it because as you mentioned it's not really a
configuration task, it's a programming task.

> > I agree!  I'm a bit confused because one of the canonical examples of
> > how WSGI middleware i

Re: [Web-SIG] Standardized configuration

2005-07-18 Thread Graham Dumpleton
Ian Bicking wrote ..
> There's several conventions that could be used for trying applications
> in-sequence.  For instance, you could do something like this (untested)
> for delegating to different apps until one of them doesn't respond with
> a 404:
> class FirstFound(object):
>     """Try apps in sequence until one doesn't return 404"""
>     def __init__(self, apps):
>         self.apps = apps
>     def __call__(self, environ, start_response):
>         def replacement_start_response(status, headers):
>             if int(status.split()[0]) == 404:
>                 raise HTTPNotFound
>             return start_response(status, headers)
>         for app in self.apps[:-1]:
>             try:
>                 return app(environ, replacement_start_response)
>             except HTTPNotFound:
>                 pass
>         # If the last one responds with 404, so be it
>         return self.apps[-1](environ, start_response)
> > Anyway, people may feel that this is totally contrary to what WSGI is
> > all about and not relevant and that is fine, I am at least finding it an
> > interesting idea to play with in respect of mod_python at least.

As far as using 404 to indicate this, I had thought of that, but it then
precludes one of those applications actually raising that as a real
response. I often return NotFound as opposed to Forbidden when
access is to files such as ".py" files. Returning Forbidden still gives a
clue as to what implementation language is used, whereas returning
NotFound doesn't. I do this, perhaps in a misguided way, as by not exposing
how something is implemented, I feel it makes it just a bit harder for
people to work out how to breach your security. :-)

If one was going to use a specific error code to indicate that the next
application object should be tried, it might be more appropriate
to use 303 (See Other) with no redirect URL specified, i.e.,
something that doesn't overlap with anything that might
be valid for an application object to do.
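
As a rough, untested-in-anger sketch of that variation: the class below
treats a 303 without a Location header as "try the next one", so a
genuine 404 passes through untouched.  Declined and FirstAccepted are
made-up names, and it only works for applications that call
start_response before returning (not from inside a lazily run
generator).

```python
class Declined(Exception):
    """Raised when an app signals that the next one should be tried."""


class FirstAccepted(object):
    """Try apps in order until one does not decline the request."""

    def __init__(self, apps, decline_status=303):
        self.apps = apps
        self.decline_status = decline_status

    def __call__(self, environ, start_response):
        def checking_start_response(status, headers, exc_info=None):
            code = int(status.split()[0])
            has_location = any(k.lower() == "location" for k, v in headers)
            # A 303 that names no redirect target is taken as "not mine".
            if code == self.decline_status and not has_location:
                raise Declined
            return start_response(status, headers, exc_info)

        for app in self.apps[:-1]:
            try:
                return app(environ, checking_start_response)
            except Declined:
                pass
        # Whatever the last app answers is final.
        return self.apps[-1](environ, start_response)
```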

> It's very relevant, at least in my opinion.  This is exactly the sort of
> architecture I've been attracted to, and the kind of middleware I've 
> been adding to Paste.  The biggest difference is that mod_python uses an
> actual list and return values, where WSGI uses nested function calls.

To say that mod_python uses an actual list is only really true at the
level of Apache configuration where one defines the PythonHandler
directive and can specify multiple handlers to run in succession. Most
people would only have the one.

At the level I am working where I use "Handlers()", not a part of
mod_python itself, I am using both sequences of handlers as well as
recursive nesting. The "IfLocationMatches()" object in my examples was
wrapping the "NotFound()" object, but it could equally have wrapped a
"Handlers()" or another "If" object, which in turn wraps lower level
objects. Even the "PythonModule()" object wrapped objects indirectly,
they just happen to be loaded at run time much like the URLParser
example for Paste.

Thus I am using both lists and nested callable objects in the way of
wrappers. WSGI seems to focus mainly on the latter of using only nested
calls in all the examples I have seen, although you do show above one
way perhaps of having a linear search for an application object.

Anyway, the point I was trying to make was that to me, the linear
search through a list of handlers (or application objects) seems
to be an easier way of dealing with things in some cases and looks
simpler in code than having a long nested chain of objects, yet WSGI
doesn't seem to make any real use of that approach to composing
together middleware components.
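
To make the comparison concrete, here is a small stand-alone sketch of
the list style in plain Python.  It is not mod_python's API nor the
actual "Handlers()" code; HandlerChain, if_path_endswith and set_header
are invented names, and the request is just a dict.  A handler returns
None to mean "no response from me, keep going".

```python
class HandlerChain(object):
    """Run handlers in order until one returns something other than None."""

    def __init__(self, *handlers):
        self.handlers = handlers

    def __call__(self, request):
        for handler in self.handlers:
            result = handler(request)
            if result is not None:
                return result
        # Nobody produced a response; a parent chain may carry on.
        return None


def if_path_endswith(suffix, handler):
    """Only run `handler` when the request path ends with `suffix`."""
    def wrapper(request):
        if request["path"].endswith(suffix):
            return handler(request)
        return None
    return wrapper


def set_header(name, value):
    """Record a response header and let the chain continue."""
    def handler(request):
        request.setdefault("headers", []).append((name, value))
        return None
    return handler
```

Because a chain is itself a handler, lists and nested wrappers mix
freely, e.g. HandlerChain(if_path_endswith('.bak', not_found),
set_header('Pragma', 'no-cache'), HandlerChain(...)).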

I'll leave it at that for the moment. I guess I'll just have to show whether
one way works better and is easier to understand than the other by way
of example at some point. :-)

Thanks for the response.

Re: [Web-SIG] Standardized configuration

2005-07-18 Thread Ian Bicking
Chris McDonough wrote:
> On Sun, 2005-07-17 at 03:16 -0500, Ian Bicking wrote:
>>This is what Paste does in configuration, like:
>> SessionMiddleware, IdentificationMiddleware,
>> AuthenticationMiddleware, ChallengeMiddleware])
>>This kind of middleware takes a single argument, which is the 
>>application it will wrap.  In practice, this means all the other 
>>parameters go into lazily-read configuration.
> I'm finding it hard to imagine a reason to have another kind of
> middleware.
> Well, actually that's not true.  In noodling about this, I did think it
> would be kind of neat in a twisted way to have "decision middleware"
> like:

In addition to the examples I gave in response to Graham, I wrote a 
document on this a while ago:

The hard part about this is configuration; it's easy to configure a 
non-branching chain of middleware.  Once it branches the configuration 
becomes hard (like programming-hard; which isn't *hard*, but it quickly 
stops feeling like configuration).

>>You can also define a "framework" (a plugin to Paste), which in addition 
>>to finding an "app" can also add middleware; basically embodying all the 
>>middleware that is typical for a framework.
> This appears to be what I'm trying to do too, which is why I'm intrigued
> by Paste.
> OTOH, I'm not sure that I want my framework to "find" an app for me.
> I'd like to be able to define pipelines that include my app, but I'd
> typically just want to statically declare it as the end point of a
> pipeline composed of service middleware.  I should look at Paste a
> little more to see if it has the same philosophy or if I'm
> misunderstanding you.

Mostly I wanted to avoid lots of magical incantations for the simple 
case.  If you are used to Webware, well it has a very straight-forward 
way of finding your application -- you give it a directory name.  If 
Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
ZEO connection string, and so on.

>>Paste is really a deployment configuration.  Well, that as well as stuff 
>>to deploy.  And two frameworks.  And whatever else I feel a need or 
>>desire to throw in there.
> Yeah.  FWIW, as someone who has recently taken a brief look at Paste, I
> think it would be helpful (at least for newbies) to partition out the
> bits of Paste which are meant to be deployment configuration from the
> bits that are meant to be deployed.  Zope 2 fell into the same trap
> early on, and never recovered.  For example, ZPublisher (nee Bobo) was
> always meant to be able to be useful outside of Zope, but in practice it
> never happened because nobody could figure out how to disentangle it
> from its ever-increasing dependencies on other software only found in a
> Zope checkout.  In the end, nobody even remembered what its dependencies
> were *supposed* to be.  If you ask ten people, you'd get ten different
> answers.

Maybe with setuptools' namespace packages I can try this sometime.  It's 
not a high priority, though if splitting pieces out would make them more 
appealing then I could do that.

Deployment doesn't actually interest me, it's just a pain in the ass and 
I wanted to give it a go.  There's no real competition that I know of, 
because it's a boring and annoying problem ;)  So if I split it off, it 
might become accidentally orphaned...

> I also think that the rigor of separating out different components helps
> to make the software stronger and more easily understood in bite-sized
> pieces.  Unfortunately, separating them makes configuration tough, but I
> think that's what we're trying to find an answer about how to do "the
> right way" here.

Yes, you've reminded me why I brought this up, for that exact reason, 
though we've digressed a great deal.  Lots of pieces of Paste have zero 
(or close to it) dependencies, except for configuration.  That's what 
distinguishes a Paste component from a generic WSGI component, and I'm 
just as happy if there is no distinction.

>>Note also that parts of the pipeline are very much late bound.  For 
>>instance, the way I implemented Webware (and Wareweb) each servlet is a 
>>WSGI application.  So while there's one URLParser application, the 
>>application that actually handles the request differs per request.  If 
>>you start hanging more complete applications (that might have their own 
>>middleware) at different URLs, then this happens more generally.
> Well, if you put the "decider" in middleware itself, all of the
> middleware components in each pipeline could still be at least
> constructed early.  I'm pretty sure this doesn't really strictly qualify
> as "early binding" but it's not terribly dynamic either.  It also makes
> configuration pretty straightforward.  At least I can imagine a
> declarative syntax for configuring pipelines this way.

This is close to how Paste works now.  The typical middleware stack does 

Re: [Web-SIG] Standardized configuration

2005-07-18 Thread Ian Bicking
Graham Dumpleton wrote:
> My understanding from reading the WSGI PEP and examples like that above is
> that the WSGI middleware stack concept is very much tree like, but where at
> any specific node within the tree, one can only traverse into one child.
> I.e., a parent middleware component could make a decision to defer to one
> child or another, but there is no means of really trying out multiple
> choices until you find one that is prepared to handle the request. The only
> way around it seems to be to make the linear chain of nested applications
> longer and longer, something which to me just doesn't sit right. In some
> respects the need for the configuration scheme is in part to make that less
> unwieldy.

It's not at all limited to this, but these are simply the ones that are 
easy to configure, and can be inserted into a stack without changing the 
stack very much.

> What I am doing is making it acceptable for a handler to also return None.
> If this were returned by the highest level handler, it would equate to
> being the same as DECLINED, but within the context of middleware components
> it has a slightly relaxed meaning. Specifically, it indicates that the
> handler isn't returning a response, but not that the request as a whole is
> being DECLINED, causing a return to Apache.

Incidentally, I'd typically use an exception when the return value 
didn't include the semantics I wanted, but that might not be a problem here.

> One last example is what a session based login mechanism might look like,
> since this was one of the examples posed in the initial discussion. Here
> you might have a handler for a whole directory which contains:
> _userDatabase = _users.UserDatabase()
> handler = Handlers(
>     IfLocationMatches(r"\.bak(/.*)?$",NotFound()),
>     IfLocationMatches(r"\.tmpl(/.*)?$",NotFound()),
>     IfLocationIsADirectory(ExternalRedirect('index.html')),
>     # Create session and stick it in request object.
>     CreateUserSession(),
>     # Login form shouldn't require user to be logged in to access it.
>     IfLocationMatches(r"^/login\.html(/.*)?$",CheetahModule()),
>     # Serve requests against login/logout URLs and otherwise
>     # don't let request proceed if user not yet authenticated.
>     # Will redirect to login form if not authenticated.
>     FormAuthentication(_userDatabase,"login.html"),
>     SetResponseHeader('Pragma','no-cache'),
>     SetResponseHeader('Cache-Control','no-cache'),
>     SetResponseHeader('Expires','-1'),
>     IfLocationMatches(r"/.*\.html(/.*)?$",CheetahModule()),
> )
> Again, one has done away with the need for configuration files as the
> code itself specifies what is required, along with the constraints as to
> what order things should be done in.
> Another thing this example shows is that handlers, when they return None
> due to not returning an actual response, can still add to the response
> headers in the way of special cookies as required by sessions, or headers
> controlling caching etc.

This is not possible in WSGI middleware if handled in a chain-like 
fashion.  Nested middleware can do this, of course.
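
For comparison, a minimal sketch of the nested version: the wrapper 
intercepts start_response and appends its headers before passing the 
call through.  add_response_headers is a made-up name, not something 
from Paste.

```python
def add_response_headers(app, extra_headers):
    """Wrap `app` so every response also carries `extra_headers`."""
    def middleware(environ, start_response):
        def wrapped_start_response(status, headers, exc_info=None):
            # Append rather than replace; the wrapped app's headers stay.
            merged = list(headers) + list(extra_headers)
            return start_response(status, merged, exc_info)
        return app(environ, wrapped_start_response)
    return middleware
```

The cache-control lines from the example above would become something 
like add_response_headers(app, [('Pragma', 'no-cache'), 
('Cache-Control', 'no-cache'), ('Expires', '-1')]).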

This kind of chaining would be necessary if "services" were used, as 
many services have to affect the response, and there's no WSGI-related 
spec about where or how they would do that.  Though I haven't digested 
all the long emails lately...

> In terms of late binding of which handler is executed, the "PythonModule"
> handler is one example in that it selects which Python module to load only
> when the request is being handled. Another example of late construction of
> an instance of a handler in what I am doing, albeit the same type, is:
>   class Handler:
>       def __init__(self,req):
>           self.__req = req
>       def __call__(self,name="value"):
>           self.__req.content_type = "text/html"
>           self.__req.send_http_header()
>           self.__req.write("")
>           self.__req.write("name=%r"%cgi.escape(name))
>           self.__req.write("")
>           return apache.OK
>   handler = IfExtensionEquals("html",HandlerInstance(Handler))
> First off the "HandlerInstance" object is only triggered if the request
> against this specific file based resource was by way of a ".html"
> extension. When it is triggered, it is only at that point that an instance
> of "Handler" is created, with the request object being supplied to the
> constructor.

Incidentally, I'm doing something a little like that with the 
filebrowser example in Paste:

Looking at it now, it's not clear where that's happening, but (in 
application()) context.path(path) creates a WSGI application using a 
class based on the extension/expected mime type.  So the dispatching is 

> To round this off, the special "Handlers" handler only contains the 
> following
> code. Pretty simple, but makes construction of the component

Re: [Web-SIG] Standardized configuration

2005-07-18 Thread mso
A couple things I don't understand in this discussion.

Phillip J. Eby said:
> At 03:28 AM 7/17/2005 -0500, Ian Bicking wrote:
>>I guess I'm treating the request environment as that context.  I don't
>>really see the problem with that...?
> It puts a layer in the request call stack for each service you want to
> offer, versus *no* layers for an arbitrary number of services.  It adds
> work to every request to put stuff into the environment, then take it out
> again, versus just getting what you want in the first place.

But the "overhead" is adding one dictionary item and reading it again.
The most insignificant thing imaginable.  More important is the ugliness
of accessing an arbitrarily-named key in the application, but even that is

> The difference is obliviousness; if you want to *wrap* an application not
> written to use WSGI services, then it makes sense to make it
> middleware.  If you're writing a new application, just have it use
> components instead of mocking up a 401 just so you can use the existing
> middleware.

That seems to suggest the whole PEP 333 exercise was a waste of time. 
(I'm not saying it is, just that it seems to be the logical conclusion of
your statement.)  WSGI is just "backward compatibility" for existing
applications?  Practically all the interesting middleware falls into this
"component" category.  I'm having a hard time seeing what middleware a
naive CGI/legacy application would benefit from, besides access to
alternative webservers.  (But at this point, none of these are "better"
than the frameworks' native servers.)  Especially since legacy apps access
their services in a framework-specific way and would need specific
middleware or patching.

If a new API is in order, it seems high priority to get a PEP out soon, or
at least some reference implementations.  Otherwise the middleware way
will become a de facto standard.


Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Chris McDonough
I tried to think of this today in terms of creating a "deployment spec"
but boy, it gets complicated if you want a lot of useful features out of
it.  I have about four or five pages of a straw man "deployment
configuration" proposal, but it makes way too many assumptions.

So I tried to boil the problem down into its parts.  There seem to be
three distinct categories of configuration:

- Server/gateway/application instance configuration.  This is the
  kind of configuration that may be exposed to deployers by
  application authors.  Creating an instance configuration results
  in an instance of an application or gateway or maybe even
  a server.

- "Wiring" configuration which allows you to string together a
  "stack" out of instances.   I like calling it a "pipeline" better,
  but when in Rome... This is the kind of configuration that
  would be useful if you already have a bunch of instance configurations
  from the step above laying around and you want to create a stack
  out of them for deployment purposes.

- "Service" configuration which allows you to create bits of 
  context that can be used by applications in the stack, but which
  aren't inserted into the stack itself.

I suspect we should stick to the first category of configuration first,
but I'll note that the desire for the other two categories might impose
some design constraints on the first.  The last kind of configuration
definitely ventures far out into framework land and though it'd be
terribly useful and seems to be where a lot of people think the value of
WSGI is, it might be something other than WSGI entirely.

So, anyway, towards the first category, I'll throw something out to the
wolves.  Note that below when I say "component" I mean a WSGI server,
gateway, or application:

  Each Python package which includes one or more WSGI components may
  optionally include descriptions of these components'
  "meta-configuration".  This meta-configuration would take the form
  of one or more "schemas".  Each schema would enumerate the
  configurable elements of a single WSGI component implementation.
  A schema for a component defines *the minimal number* of typed,
  component-specific keys and values that may be used to create
  instances of this component.

  >>> # load the schemas
  >>> server_schema  = loadSchema('components/server/server.schema')
  >>> gateway_schema = loadSchema('components/gateway/gateway.schema')
  >>> app_schema = loadSchema('components/app/app.schema')

  >>> # create the instances; any one of these steps would fail
  >>> # if the config file violated its schema.
  >>> server_factory  = loadConfig('instances/server/server.conf',
  ...                              schema = server_schema)
  >>> gateway_factory = loadConfig('instances/gateway/gateway.conf',
  ...                              schema = gateway_schema)
  >>> app_factory = loadConfig('instances/app/app.conf',
  ...                          schema = app_schema)

  >>> # create instances from the factories
  >>> server = server_factory.create()
  >>> gateway = gateway_factory.create()
  >>> app = app_factory.create()

  # configure the instances into a pipeline
  >>> pipeline = server(gateway(app))

  # serve up the pipeline (notionally)
  >>> server.serve()

Of course this is just a more declarative way to do what is already
possible in code except for the schema-checking part, which presumably
would supply the deployer with clues if he had screwed up a config file.

I purposely didn't attempt to describe the syntax of the configuration
or schema files, but I suspect it would be best to make them both
ConfigParser files.  FWIW, ZConfig already does this exact thing, and
it's already written, but introducing dependencies on non-stdlib things
seems problematic.
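
To give a feel for the schema-checking part using only the stdlib, here
is a small sketch.  It uses the Python 3 module name (configparser), and
the section names [component] and [instance], the type names, and the
choice to reject unknown keys are all assumptions made for the example,
not part of any proposal.

```python
import configparser

# Type names a schema may use, mapped to converter callables.
CONVERTERS = {"int": int, "float": float, "str": str}


def load_schema(text):
    """Parse a schema: a [component] section mapping key -> type name."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["component"])


def load_config(text, schema):
    """Parse an [instance] section and check it against `schema`."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = dict(parser["instance"])
    missing = sorted(set(schema) - set(raw))
    unknown = sorted(set(raw) - set(schema))
    if missing or unknown:
        raise ValueError("missing keys %s, unknown keys %s" % (missing, unknown))
    # Converters such as int() raise ValueError on badly typed values.
    return {key: CONVERTERS[schema[key]](raw[key]) for key in schema}
```

A config file with "port = eighty" against a schema saying "port = int"
fails at load time, which is the clue for the deployer mentioned above.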

Is this more or less what people have in mind for deployment
configuration or am I out in left field?

On Sun, 2005-07-17 at 13:56 -0400, Phillip J. Eby wrote:
> At 07:29 AM 7/17/2005 -0400, Chris McDonough wrote:
> >I'm a bit confused because one of the canonical examples of
> >how WSGI middleware is useful seems to be the example of implementing a
> >framework-agnostic sessioning service.  And for that sessioning service
> >to be useful, your application has to be able to depend on its
> >availability so it can't be "oblivious".
> Exactly.  As soon as you start trying to have configured services, you are 
> creating Yet Another Framework.  Which isn't a bad thing per se, except 
> that it falls outside the scope of  PEP 333.  It deserves a separate PEP, I 
> think, and a separate implementation mechanism than being crammed into the 
> request environment.  These things should be allowed to be static, so that 
> an application can do some reasonable setup, and so that you don't have 
> per-request overhead to shove ninety services into the environment.
> Also, because we are dealing not with basic plumbing but with making a nice 
> kitchen, it seems to me we can afford to make the fixtur

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Phillip J. Eby
At 03:28 AM 7/17/2005 -0500, Ian Bicking wrote:
>Phillip J. Eby wrote:
>>What I think you actually need is a way to create WSGI application 
>>objects with a "context" object.  The "context" object would have a 
>>method like "get_service(name)", and if it didn't find the service, it 
>>would ask its parent context, and so on, until there's no parent context 
>>to get it from.  The web server would provide a way to configure a root 
>>or default context.
>I guess I'm treating the request environment as that context.  I don't 
>really see the problem with that...?

It puts a layer in the request call stack for each service you want to 
offer, versus *no* layers for an arbitrary number of services.  It adds 
work to every request to put stuff into the environment, then take it out 
again, versus just getting what you want in the first place.
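
A minimal sketch of the kind of context object I described, for 
illustration only.  Context and make_app are made-up names and the 
"greeting" service is a stand-in; the point is that the lookup happens 
once, when the application is created, not on every request.

```python
class Context(object):
    """Holds named services and defers to a parent for anything missing."""

    def __init__(self, parent=None, **services):
        self.parent = parent
        self.services = dict(services)

    def get_service(self, name):
        context = self
        while context is not None:
            if name in context.services:
                return context.services[name]
            context = context.parent
        raise LookupError("no service named %r" % name)


def make_app(context):
    # Looked up at construction time; nothing is added to or read from
    # the request environment.
    greeting = context.get_service("greeting")

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [greeting.encode("utf-8")]
    return app
```

A server would configure a root Context, and each application or group 
of applications could get a child context that overrides or adds to it.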

>In many cases, the middleware is modifying or watching the application's 
>output.  For instance, catching a 401 and turning that into the 
>appropriate login -- which might mean producing a 401, a redirect, a login 
>page via internal redirect, or whatever.

And that would be legitimate middleware, except I don't think that's what 
you really want for that use case.  What you want is an "authentication 
service" that you just call to say, "I need a login" and get the login 
information from, and return its return value so that it does 
start_response for you and sends the right output.

The difference is obliviousness; if you want to *wrap* an application not 
written to use WSGI services, then it makes sense to make it 
middleware.  If you're writing a new application, just have it use 
components instead of mocking up a 401 just so you can use the existing 

Notice, by the way, that it's trivial to create middleware that detects the 
401 and then *invokes the service*.  So, it's more reusable to make 
services be services, and middleware be wrappers to apply services to 
oblivious applications.
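
Something along these lines, as a sketch: login_on_401 is an invented 
name, auth_service is assumed to be a callable taking (environ, 
start_response), and the wrapper ignores the legacy write() callable 
and assumes the application calls start_response before returning.

```python
def login_on_401(app, auth_service):
    """Wrap an oblivious app: on a 401, hand the request to the service."""
    def middleware(environ, start_response):
        captured = {}

        def capturing_start_response(status, headers, exc_info=None):
            # Hold the response start back until we know the status.
            captured["args"] = (status, headers, exc_info)

        body = app(environ, capturing_start_response)
        if "args" not in captured:
            raise RuntimeError("app did not call start_response before returning")
        if captured["args"][0].startswith("401"):
            if hasattr(body, "close"):
                body.close()
            # The service does start_response and produces the output.
            return auth_service(environ, start_response)
        start_response(*captured["args"])
        return body
    return middleware
```

A new application would skip the wrapper entirely and just do 
"return auth_service(environ, start_response)" where it needs a login.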

>I guess you could make one Uber Middleware that could handle the services' 
>needs to rewrite output, watch for errors and finalize resources, etc.

Um, it's called a library of functions.  :)  WSGI was designed to make it 
easy to use library calls to do stuff.  If you don't need the 
obliviousness, then library calls (or service calls) are the Obvious Way To 
Do It.

>   This isn't unreasonable, and I've kind of expected one to evolve at 
> some point.  But you'll have to say more to get me to see how "services" 
> is a better way to manage this.

I'm saying that middleware can use services, and applications can use 
services.  Making applications *have to* use middleware in order to use the 
services is wasteful of both computer time and developer brainpower.  Just 
let them use services directly when the situation calls for it, and you can 
always write middleware to use the services when you encounter the 
occasional (and ever-rarer with time) oblivious application.

>>Really, the only stuff that actually needs to be middleware, is stuff 
>>that wraps an *oblivious* application; i.e., the application doesn't know 
>>it's there.  If it's a service the application uses, then it makes more 
>>sense to create a service management mechanism for configuration and 
>>deployment of WSGI applications.
>Applications always care about the things around them, so any convention 
>that middleware and applications be unaware of each other would rule out 
>most middleware.

Yes, exactly!  Now you understand me.  :)  If the application is what wants 
the service, let it just call the service.  Middleware is *overhead* in 
that case.

>>I hope this isn't too vague; I've been wanting to say something about 
>>this since I saw your blog post about doing transaction services in WSGI, 
>>as that was when I first understood why you were making everything into 
>>middleware.  (i.e., to create a poor man's substitute for "placeful" 
>>services and utilities as found in PEAK and Zope 3.)
>What do they provide that middleware does not?

Well, some services may be things the application needs only when it's 
being initially configured.  Or maybe the service is something like a 
scheduler that gives timed callbacks.  There are lots of non-per-request 
services that make sense, so forcing service access to be only through the 
environment makes for cruftier code, since you now have to keep track of 
whether you've been called before, and then do any setup during your first 
web hit.  For that matter, some service configuration might need to be 
dynamically determined, based on the application object requesting it.

But the main thing they provide that middleware does not is simplicity and 
ease of use.  I understand your desire to preserve the appearance of 
neutrality, but you are creating new web frameworks here, and making them 
ugly doesn't make them any less of a framework.  :)

What's worse is that by tying the service access mechanism to the request 
environment, you'r

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Phillip J. Eby
At 07:29 AM 7/17/2005 -0400, Chris McDonough wrote:
>I'm a bit confused because one of the canonical examples of
>how WSGI middleware is useful seems to be the example of implementing a
>framework-agnostic sessioning service.  And for that sessioning service
>to be useful, your application has to be able to depend on its
>availability so it can't be "oblivious".

Exactly.  As soon as you start trying to have configured services, you are 
creating Yet Another Framework.  Which isn't a bad thing per se, except 
that it falls outside the scope of PEP 333.  It deserves a separate PEP, I 
think, and a separate implementation mechanism than being crammed into the 
request environment.  These things should be allowed to be static, so that 
an application can do some reasonable setup, and so that you don't have 
per-request overhead to shove ninety services into the environment.

Also, because we are dealing not with basic plumbing but with making a nice 
kitchen, it seems to me we can afford to make the fixtures nice.  That is, 
for an add-on specification to WSGI we don't need to adhere to the "let it 
be ugly for apps if it makes the server easier" principle that guided PEP 
333.  The assumption there was that people would mostly port existing 
wrappers over HTTP/CGI to be wrappers over WSGI.  But for services, we are 
talking about an actual framework to be used by application developers 
directly, so more user-friendliness is definitely in order.

For WSGI itself, the server-side implementation has to be very server 
specific.  But the bulk of a service stack could be implemented once (e.g. 
as part of wsgiref), and then just used by servers.  So, we don't have to 
worry as much about making it easy for server people to implement, except 
for any server-specific choices about how configuration might be 
stacked.  (For example, in a filesystem-oriented server like Apache, you 
might want subdirectories to inherit services defined in parent directories.)

>OTOH, the primary benefit -- to me, at least -- of modeling services as
>WSGI middleware is the fact that someone else might be able to use my
>service outside the scope of my projects (and thus help maintain it and
>find bugs, etc).  So if I've got the wrong concept of what kinds of
>middleware that I can expect "normal" people to use, I don't want to go
>very far down that road without listening carefully to Phillip.  Perhaps
>I'll have a shot at influencing the direction of WSGI to make it more
>appropriate for this sort of thing or maybe we'll come up with a better
>way of doing it.
>Zope 3 is a component system much like what I'm after, and I may just
>end up using it wholesale.  But my immediate problem with Zope 3 is that
>like Zope 2, it's a collection of libraries that have dependencies on
>other libraries that are only included within its own checkout and don't
>yet have much of a life of their own.  It's not really a technical
>problem, it's a social one... I'd rather have a somewhat messy framework
>with a lot of diversity composed of wildly differing component
>implementations that have a life of their own than to be trapped in a
>clean, pure world where all the components are used only within that
>I suspect there's a middle ground here somewhere.

Right; I'm suggesting that we grow a "WSGI Deployment" or "WSGI Stack" 
specification that includes a simple way to obtain services (using the Zope 
3 definition of "service" as simply a named component).  This would form 
the basis for various "WSGI Service" specifications.  And, for existing 
frameworks there's at least some possibility of integrating with 
this stack, since PEAK and Zope 3 both already have ways to define and 
acquire named services, so it might be possible to define the spec in such 
a way that their implementations could be reused by wrapping them in a thin 
"WSGI Stack" adapter.  Similarly, if there are any other frameworks out 
there that offer similar functionality, then they ought to be able to play 
too, at least in principle.

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Chris McDonough
On Sun, 2005-07-17 at 03:16 -0500, Ian Bicking wrote:
> This is what Paste does in configuration, like:
> middleware.extend([
>  SessionMiddleware, IdentificationMiddleware,
>  AuthenticationMiddleware, ChallengeMiddleware])
> This kind of middleware takes a single argument, which is the 
> application it will wrap.  In practice, this means all the other 
> parameters go into lazily-read configuration.

I'm finding it hard to imagine a reason to have another kind of

Well, actually that's not true.  In noodling about this, I did think it
would be kind of neat in a twisted way to have "decision middleware"

class DecisionMiddleware:
    def __init__(self, apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        # Delegate the whole request to whichever app is chosen.
        app = self.choose(environ)
        for chunk in app(environ, start_response):
            yield chunk

    def choose(self, environ):
        # some_decision_function is a placeholder for the selection logic.
        return some_decision_function(self.apps, environ)

I can imagine using this pattern as a decision point for a WSGI pipeline
serving multiple application end-points (perhaps based on URL matching
of the PATH_INFO in environ).
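
As one possible shape for that decision function, here is a sketch that picks an application by the longest matching PATH_INFO prefix. The names choose_by_prefix, PrefixDispatcher and make_app are invented for the example, and the mapping is assumed to contain an empty-string key as the catch-all.

```python
def make_app(text):
    """Build a trivial WSGI app that always answers with `text` (bytes)."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [text]
    return app


def choose_by_prefix(apps, environ):
    """Return the app whose URL prefix is the longest match for PATH_INFO.

    `apps` maps prefixes such as "/blog" to WSGI applications; the ""
    prefix is assumed to be present and acts as the fallback.
    """
    path = environ.get("PATH_INFO", "")
    best = ""
    for prefix in apps:
        matches = path == prefix or path.startswith(prefix + "/")
        if matches and len(prefix) > len(best):
            best = prefix
    return apps[best]


class PrefixDispatcher:
    """Decision point: hand the request to exactly one end-point app."""

    def __init__(self, apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        app = choose_by_prefix(self.apps, environ)
        return app(environ, start_response)


dispatcher = PrefixDispatcher({
    "": make_app(b"root"),
    "/blog": make_app(b"blog"),
})
```

Matching on "prefix + '/'" rather than a bare startswith keeps "/blogger" from being routed to the "/blog" application.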

But by and large, most middleware components seem to be just wrappers
for the next application in the chain.  There seem to be two types of
middleware that take a single application object as a parameter to their
constructor.  There is "decorator" middleware where you want to add
something to the environment for an application to find later and
"action" middleware that does some rewriting of the body or the response
headers before the response is sent back to the client.  Some of this
kind of middleware does both.
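
A rough sketch of the two flavours, assuming nothing beyond plain WSGI. The class names, the "example.request_id" key and the X-Example header are all made up for the illustration.

```python
class AddRequestId:
    """'Decorator' middleware: leaves something in environ for the app."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ["example.request_id"] = "req-1"  # hypothetical key and value
        return self.app(environ, start_response)


class AddHeader:
    """'Action' middleware: rewrites response headers on the way out."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def patched(status, headers, exc_info=None):
            # Append one header, then defer to the real start_response.
            headers = list(headers) + [("X-Example", "1")]
            return start_response(status, headers, exc_info)
        return self.app(environ, patched)
```

Both take the wrapped application as their only constructor argument, so they can be stacked in either order, e.g. AddHeader(AddRequestId(app)).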

> You can also define a "framework" (a plugin to Paste), which in addition 
> to finding an "app" can also add middleware; basically embodying all the 
> middleware that is typical for a framework.

This appears to be what I'm trying to do too, which is why I'm intrigued
by Paste.

OTOH, I'm not sure that I want my framework to "find" an app for me.
I'd like to be able to define pipelines that include my app, but I'd
typically just want to statically declare it as the end point of a
pipeline composed of service middleware.  I should look at Paste a
little more to see if it has the same philosophy or if I'm
misunderstanding you.

> Paste is really a deployment configuration.  Well, that as well as stuff 
> to deploy.  And two frameworks.  And whatever else I feel a need or 
> desire to throw in there.

Yeah.  FWIW, as someone who has recently taken a brief look at Paste, I
think it would be helpful (at least for newbies) to partition out the
bits of Paste which are meant to be deployment configuration from the
bits that are meant to be deployed.  Zope 2 fell into the same trap
early on, and never recovered.  For example, ZPublisher (nee Bobo) was
always meant to be able to be useful outside of Zope, but in practice it
never happened because nobody could figure out how to disentangle it
from its ever-increasing dependencies on other software only found in a
Zope checkout.  In the end, nobody even remembered what its dependencies
were *supposed* to be.  If you ask ten people, you'd get ten different

I also think that the rigor of separating out different components helps
to make the software stronger and more easily understood in bite-sized
pieces.  Unfortunately, separating them makes configuration tough, but I
think that's what we're trying to find an answer about how to do "the
right way" here.

> Note also that parts of the pipeline are very much late bound.  For 
> instance, the way I implemented Webware (and Wareweb) each servlet is a 
> WSGI application.  So while there's one URLParser application, the 
> application that actually handles the request differs per request.  If 
> you start hanging more complete applications (that might have their own 
> middleware) at different URLs, then this happens more generally.

Well, if you put the "decider" in middleware itself, all of the
middleware components in each pipeline could still be at least
constructed early.  I'm pretty sure this doesn't really strictly qualify
as "early binding" but it's not terribly dynamic either.  It also makes
configuration pretty straightforward.  At least I can imagine a
declarative syntax for configuring pipelines this way.

I'm pretty sure you're not advocating it, but in case you are, I'm not
sure it adds as much value as it removes to be able to have a "dynamic"
middleware chain whereby new middleware elements can be added "on the
fly" to a pipeline after a request has begun.  That is *very* "late
binding" to me and it's impossible to configure declaratively.

> > But some elements of the pipeline at this level of factoring do need to
> > have dependencies on availability and pipeline placement of the other
> > elements.  In this example, proper operation of the authentication
> > component depends on the availability and pipeline placement of the
> > identification com

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Graham Dumpleton

On 17/07/2005, at 6:16 PM, Ian Bicking wrote:
>> The pipeline itself isn't really late bound.  For instance, if I was 
>> to
>> create a WSGI middleware pipeline something like this:
>>server <--> session <--> identification <--> authentication <-->
>><--> challenge <--> application
>> ... session, identification, authentication, and challenge are
>> middleware components (you'll need to imagine their implementations).
>> And within a module that started a server, you might end up doing
>> something like:
>> def configure_pipeline(app):
>>     return SessionMiddleware(
>>         IdentificationMiddleware(
>>             AuthenticationMiddleware(
>>                 ChallengeMiddleware(app))))
> This is what Paste does in configuration, like:
> middleware.extend([
>  SessionMiddleware, IdentificationMiddleware,
>  AuthenticationMiddleware, ChallengeMiddleware])
> This kind of middleware takes a single argument, which is the
> application it will wrap.  In practice, this means all the other
> parameters go into lazily-read configuration.

Sorry, but you have given me a nice opening here to hijack this thread
a bit and make some comments and pose some questions about WSGI that I
have been thinking about for a while.

My understanding from reading the WSGI PEP and examples like that above
is that the WSGI middleware stack concept is very much tree-like, but
where, at any specific node within the tree, one can only traverse into
one child.  I.e., a parent middleware component could make a decision to
defer to one child or another, but there is no means of really trying out
multiple choices until you find one that is prepared to handle the
request.  The only way around it seems to be to make the linear chain of
nested applications longer and longer, something which to me just doesn't
sit right.  In some respects the need for the configuration scheme is in
part to make that less unwieldy.

To explain what I am going on about, I am going to use examples from 
work I have been doing with componentised construction of request 
stacks in mod_python. I will not use the term middleware here, as I 
note that
someone here in this discussion has already made the point of saying 
the components being talked about here aren't really middleware and in 
I have been doing I have been taking it to an even more fine grained 

I believe I can draw a reasonable analogy to mod_python as at the 
a mod_python request handler and a WSGI application are both providing the
most basic function of providing the service for responding to a request,
they just do so in different ways.

Normally in mod_python a handler can return an OK response, an error 
or a DECLINED response. The DECLINED response is special and indicates 
mod_python that any further content handlers defined by mod_python 
should be
skipped and control passed back up to Apache so that it can potentially
serve up a matched static file.

What I am doing is making it acceptable for a handler to also return None.
If this were returned by the highest level handler, it would equate to
the same as DECLINED, but within the context of middleware components it
has a slightly relaxed meaning. Specifically, it indicates that that handler
isn't returning a response, but not that it is indicating that the request
as a whole is being DECLINED causing a return to Apache.

Doing this means that within the context of a tree based middleware 
at a particular node in the stack one can introduce a list of handlers 
a particular node. Each handler in the list will in turn be tried to see
if it wishes to handle the response, returning either an error or valid
response, or None. If it doesn't raise a response, the next handler in the
list would be tried until one is found, and if one isn't, then None is passed
back to the parent middleware component.
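
The "try each handler until one answers" rule can be sketched in plain Python as below. This is only an illustration of the control flow: it is not mod_python's API, the request is modelled as a dict, and the two sample handlers are hypothetical.

```python
class Handlers:
    """Try each handler in order; the first non-None result wins."""

    def __init__(self, *handlers):
        self.handlers = handlers

    def __call__(self, request):
        for handler in self.handlers:
            result = handler(request)
            if result is not None:
                return result
        # Nobody took the request; the parent decides what happens next.
        return None


def deny_backups(request):
    # Hypothetical filter: refuse paths ending in "~", otherwise pass.
    if request["path"].endswith("~"):
        return "403 Forbidden"
    return None


def serve_page(request):
    # Hypothetical end point that only knows about one resource.
    if request["path"] == "/page":
        return "200 OK"
    return None


handler = Handlers(deny_backups, serve_page)
```

Because a Handlers instance is itself a handler that may return None, such lists can be nested at any node of the tree.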

This all means I could write something like:

   handler = Handlers(

This handler might be associated with any access to a directory as a 
In iterating over each of the handlers it filters out requests to files
that we don't want to provide access to, with the final handler 
to a handler within a Python module associated with the actual resource
being requested. Although Apache provides means of filtering out 
it only works properly for physical files and not virtual resources 
by way of the path info.

For example, a file "page.tmpl" (a Cheetah file) could have a ""
file that defines:

   handler = Handlers(

Again, more filtering and finally a handler is triggered which knows how
to trigger a precompiled C

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Ian Bicking
Phillip J. Eby wrote:
> At 01:57 PM 7/11/2005 -0500, Ian Bicking wrote:
>> Lately I've been thinking about the role of Paste and WSGI and whatnot.
>>   Much of what makes a Paste component Pastey is configuration;
>> otherwise the bits are just independent pieces of middleware, WSGI
>> applications, etc.  So, potentially if we can agree on configuration, we
>> can start using each other's middleware more usefully.
> I'm going to go ahead and throw my hat in the ring here, even though 
> I've been trying to avoid it.
> Most of the stuff you are calling middleware really isn't, or at any 
> rate it has no reason to be middleware.

Well, it is if you implement it that way ;)  I think I'd prefer the term 
"filter" actually; less bad connotations for people.  But that's really 
unrelated to your point.

> What I think you actually need is a way to create WSGI application 
> objects with a "context" object.  The "context" object would have a 
> method like "get_service(name)", and if it didn't find the service, it 
> would ask its parent context, and so on, until there's no parent context 
> to get it from.  The web server would provide a way to configure a root 
> or default context.

I guess I'm treating the request environment as that context.  I don't 
really see the problem with that...?

> This would allow you to do early binding of services without needing to 
> do lookups on every web hit.  E.g.::
> class MyApplication:
>     def __init__(self, context):
>         self.authenticate = context.get_service(
>             'security.authentication')
>     def __call__(self, environ, start_response):
>         user = self.authenticate(environ)
> So, you would simply register an application *factory* with the web 
> server instead of an application instance, and it invokes it on the 
> context object in order to get the right thing.

I don't see the distinction between a factory and an instance.  Or at 
least, it's easy to translate from one to the other.

In many cases, the middleware is modifying or watching the application's 
output.  For instance, catching a 401 and turning that into the 
appropriate login -- which might mean producing a 401, a redirect, a 
login page via internal redirect, or whatever.

I guess you could make one Uber Middleware that could handle the 
services' needs to rewrite output, watch for errors and finalize 
resources, etc.  This isn't unreasonable, and I've kind of expected one 
to evolve at some point.  But you'll have to say more to get me to see 
how "services" is a better way to manage this.

> Really, the only stuff that actually needs to be middleware, is stuff 
> that wraps an *oblivious* application; i.e., the application doesn't 
> know it's there.  If it's a service the application uses, then it makes 
> more sense to create a service management mechanism for configuration 
> and deployment of WSGI applications.

Applications always care about the things around them, so any convention 
that middleware and applications be unaware of each other would rule out 
most middleware.

> However, I think that, again, the key part of configuration that 
> actually relates to WSGI here is *deployment* configuration, such as 
> which service implementations to use for the various kinds of services.  
> Configuration *of* the services can and should be private to those 
> services, since they'll have implementation-specific needs.  (This 
> doesn't mean, however, that a "configuration service" couldn't be part 
> of the family of WSGI service interfaces.)
> I hope this isn't too vague; I've been wanting to say something about 
> this since I saw your blog post about doing transaction services in 
> WSGI, as that was when I first understood why you were making everything 
> into middleware.  (i.e., to create a poor man's substitute for 
> "placeful" services and utilities as found in PEAK and Zope 3.)

What do they provide that middleware does not?

> Anyway, I don't have a problem with trying to create a framework-neutral 
> (in theory, anyway) component system, but I think it would be a good 
> idea to take lessons from ones that have solved this problem well, and 
> then create an extremely scaled-down version, rather than kludging 
> application configuration into what's really per-request data.

Per-request or not, from the application's side I don't see the 
difference.  It is convenient to put configuration into the request, 
though paste.CONFIG is also provided as a global variable that 
represents the current request's configuration.

In practice the configuration is usually identical for all requests, but 
Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Ian Bicking
Chris McDonough wrote:
>>Because middleware can't be introspected (generally), this makes things 
>>like configuration schemas very hard to implement.  It all needs to be 
> The pipeline itself isn't really late bound.  For instance, if I was to
> create a WSGI middleware pipeline something like this:
>server <--> session <--> identification <--> authentication <--> 
><--> challenge <--> application
> ... session, identification, authentication, and challenge are
> middleware components (you'll need to imagine their implementations).
> And within a module that started a server, you might end up doing
> something like:
> def configure_pipeline(app):
>     return SessionMiddleware(
>         IdentificationMiddleware(
>             AuthenticationMiddleware(
>                 ChallengeMiddleware(app))))
>
> if __name__ == '__main__':
>     app = Application()
>     pipeline = configure_pipeline(app)
>     server = Server(pipeline)
>     server.serve()

This is what Paste does in configuration, like:

middleware.extend([
 SessionMiddleware, IdentificationMiddleware,
 AuthenticationMiddleware, ChallengeMiddleware])

This kind of middleware takes a single argument, which is the 
application it will wrap.  In practice, this means all the other 
parameters go into lazily-read configuration.
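
For readers who want to see how a flat list of single-argument middleware can become a nested pipeline, here is a small sketch. It is not Paste's actual implementation; build_pipeline and tagger are invented names, and the choice that the first list entry ends up outermost is an assumption.

```python
from functools import reduce


def build_pipeline(app, middleware):
    """Wrap `app` so that the first entry in `middleware` is outermost."""
    return reduce(lambda inner, factory: factory(inner),
                  reversed(middleware), app)


def tagger(name):
    """Return a middleware factory that records `name` as requests pass."""
    def factory(app):
        def wrapped(environ, start_response):
            environ.setdefault("example.trace", []).append(name)
            return app(environ, start_response)
        return wrapped
    return factory
```

With this ordering, build_pipeline(app, [A, B]) is equivalent to A(B(app)), which matches the hand-nested form quoted above.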

You can also define a "framework" (a plugin to Paste), which in addition 
to finding an "app" can also add middleware; basically embodying all the 
middleware that is typical for a framework.

Paste is really a deployment configuration.  Well, that as well as stuff 
to deploy.  And two frameworks.  And whatever else I feel a need or 
desire to throw in there.

Note also that parts of the pipeline are very much late bound.  For 
instance, the way I implemented Webware (and Wareweb) each servlet is a 
WSGI application.  So while there's one URLParser application, the 
application that actually handles the request differs per request.  If 
you start hanging more complete applications (that might have their own 
middleware) at different URLs, then this happens more generally.

There's a newish poorly tested feature where you can do urlmap['/path'] 
= 'config_file.conf' and it'll hang the application described by that 
configuration file at that URL.

> The pipeline is static.  When a request comes in, the pipeline itself is
> already constructed.  I don't really want a way to prevent "improper"
> pipeline construction at startup time (right now anyway), because
> failures due to missing dependencies will be fairly obvious.

I think that's reasonable too; it's what Paste implements now.

> But some elements of the pipeline at this level of factoring do need to
> have dependencies on availability and pipeline placement of the other
> elements.  In this example, proper operation of the authentication
> component depends on the availability and pipeline placement of the
> identification component.  Likewise, the identification component may
> depend on values that need to be retrieved from the session component.

Yes; and potentially you could have several middlewares implementing the 
same functionality for a single request, e.g., if you had different kinds 
of authentication for part of your site/application; that might shadow 
authentication further up the stack.

> I've just seen Phillip's post where he implies that this kind of
> fine-grained component factoring wasn't really the initial purpose of
> WSGI middleware.  That's kind of a bummer. ;-)

Well, I don't understand the services he's proposing yet.  I'm quite 
happy with using middleware the way I have been, so I'm not seeing a 
problem with it, and there's lots of benefits.

> Factoring middleware components in this way seems to provide clear
> demarcation points for reuse and maintenance.  For example, I imagined a
> declarative security module that might be factored as a piece of
> middleware here: .

Yes, I read that before; I haven't quite figured out how to digest it, 
though.  This is probably in part because of the resource-based 
orientation of Zope, and WSGI is application-based, where applications 
are rather opaque and defined only in terms of function.

> Of course, this sort of thing doesn't *need* to be middleware.  But
> making it middleware feels very right to me in terms of being able to
> deglom nice features inspired by Zope and other frameworks into pieces
> that are easy to recombine as necessary.  Implementations as WSGI
> middleware seems a nice way to move these kinds of features out of our
> respective applications and into more application-agnostic pieces that
> are very loosely coupled, but perhaps I'm taking it too far.

Certainly these pieces of code can apply to multiple applications and 
disparate systems.  The most obvious instance right now that I think of 
is a WSGI WebDAV server (and someone's working on that for Google Summer 
of Code

Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough
On Sat, 2005-07-16 at 23:29 -0500, Ian Bicking wrote:
> There's nothing in WSGI to facilitate introspection.  Sometimes that 
> seems annoying, though I suspect lots of headaches are removed because 
> of it, and I haven't found it to be a stopper yet.  The issue I'm 
> interested in is just how to deliver configuration to middleware.

Whew, I hoped you'd respond. ;-)

It appears that I haven't gotten as far as to want introspection into
the implementation or configuration of a middleware component.  Instead,
I want the ability to declaratively construct a pipeline out of largely
opaque and potentially interdependent (but loosely coupled) WSGI
middleware components, which is another problem entirely.  It seemed
cogent, so I just somewhat belligerently coopted this thread, sorry!

> Because middleware can't be introspected (generally), this makes things 
> like configuration schemas very hard to implement.  It all needs to be 
> late-bound.

The pipeline itself isn't really late bound.  For instance, if I was to
create a WSGI middleware pipeline something like this:

   server <--> session <--> identification <--> authentication <--> 
   <--> challenge <--> application

... session, identification, authentication, and challenge are
middleware components (you'll need to imagine their implementations).
And within a module that started a server, you might end up doing
something like:

def configure_pipeline(app):
    return SessionMiddleware(
        IdentificationMiddleware(
            AuthenticationMiddleware(
                ChallengeMiddleware(app))))

if __name__ == '__main__':
    app = Application()
    pipeline = configure_pipeline(app)
    server = Server(pipeline)
    server.serve()

The pipeline is static.  When a request comes in, the pipeline itself is
already constructed.  I don't really want a way to prevent "improper"
pipeline construction at startup time (right now anyway), because
failures due to missing dependencies will be fairly obvious.

But some elements of the pipeline at this level of factoring do need to
have dependencies on availability and pipeline placement of the other
elements.  In this example, proper operation of the authentication
component depends on the availability and pipeline placement of the
identification component.  Likewise, the identification component may
depend on values that need to be retrieved from the session component.
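
To illustrate the placement dependency, here is a sketch in which the authentication component relies on a key left in environ by the identification component, and fails loudly when the pipeline is built in the wrong order. The key "example.credentials" and the HTTP_X_USER header are assumptions made up for the example.

```python
class Identification:
    """Extract credentials from the request and leave them in environ."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Hypothetical source; a real component would parse headers or cookies.
        environ["example.credentials"] = environ.get("HTTP_X_USER")
        return self.app(environ, start_response)


class Authentication:
    """Check the credentials that Identification is expected to provide."""

    def __init__(self, app, known_users):
        self.app = app
        self.known_users = known_users

    def __call__(self, environ, start_response):
        if "example.credentials" not in environ:
            # Missing dependency: the pipeline was assembled in the wrong order.
            raise RuntimeError(
                "Identification must be placed before Authentication")
        if environ["example.credentials"] in self.known_users:
            environ["REMOTE_USER"] = environ["example.credentials"]
        return self.app(environ, start_response)
```

Nothing here prevents a bad pipeline at construction time; the error only surfaces on the first request, which is the "fairly obvious" failure mode described above.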

I've just seen Phillip's post where he implies that this kind of
fine-grained component factoring wasn't really the initial purpose of
WSGI middleware.  That's kind of a bummer. ;-)

Factoring middleware components in this way seems to provide clear
demarcation points for reuse and maintenance.  For example, I imagined a
declarative security module that might be factored as a piece of
middleware here: .

Of course, this sort of thing doesn't *need* to be middleware.  But
making it middleware feels very right to me in terms of being able to
deglom nice features inspired by Zope and other frameworks into pieces
that are easy to recombine as necessary.  Implementations as WSGI
middleware seems a nice way to move these kinds of features out of our
respective applications and into more application-agnostic pieces that
are very loosely coupled, but perhaps I'm taking it too far.

> > For example, it would be useful in some circumstances to create separate
> > WSGI components for user identification and user authorization.  The
> > process of identification -- obtaining user credentials from a request
> > -- and user authorization  -- ensuring that the user is who he says he
> > is by comparing the credentials against a data source -- are really
> > pretty much distinct operations.  There might also be a "challenge"
> > component which forces a login dialog.
> I've always thought that a 401 response is a good way of indicating 
> that, but not everyone agrees.  (The idea being that the middleware 
> catches the 401 and possibly translates it into a redirect or something.)

Yep.  That'd be a fine signaling mechanism.

> > In practice, I don't know if this is a truly useful separation of
> > concerns that need to be implemented in terms of separate components in
> > the middleware pipeline (I see that paste.login conflates them), it's
> > just an example.  
> Do you mean identification and authentication (you mention authorization 
> above)? 

Aggh.  Yes, I meant to write authentication, sorry.

>  I think authorization is different, and is conflated in 
> paste.login, but I don't have many use cases where it's a useful 
> distinction.  I guess there's a number of ways of getting a username and 
> password; and to some degree the authenticator object works at that 
> level of abstraction.  And there's a couple other ways of authenticating 
> a user as well (public keys, IP address, etc).  I've generally used a 
> "user manager" object for this kind of abstraction, with subclassing f

Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Phillip J. Eby
At 01:57 PM 7/11/2005 -0500, Ian Bicking wrote:
>Lately I've been thinking about the role of Paste and WSGI and whatnot.
>   Much of what makes a Paste component Pastey is configuration;
>otherwise the bits are just independent pieces of middleware, WSGI
>applications, etc.  So, potentially if we can agree on configuration, we
>can start using each other's middleware more usefully.

I'm going to go ahead and throw my hat in the ring here, even though I've 
been trying to avoid it.

Most of the stuff you are calling middleware really isn't, or at any rate 
it has no reason to be middleware.

What I think you actually need is a way to create WSGI application objects 
with a "context" object.  The "context" object would have a method like 
"get_service(name)", and if it didn't find the service, it would ask its 
parent context, and so on, until there's no parent context to get it 
from.  The web server would provide a way to configure a root or default 
context.

This would allow you to do early binding of services without needing to do 
lookups on every web hit.  E.g.::

    class MyApplication:
        def __init__(self, context):
            self.authenticate = context.get_service('security.authentication')
        def __call__(self, environ, start_response):
            user = self.authenticate(environ)

So, you would simply register an application *factory* with the web server 
instead of an application instance, and it invokes it on the context object 
in order to get the right thing.
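
A minimal sketch of how such a context might look, under stated assumptions: the Context class, its constructor arguments, and the remote_user service are all hypothetical names made up for illustration, not an API from PEAK, Zope 3, or any WSGI server. Only get_service(name) and the parent-chain lookup come from the description above.

```python
class Context:
    """Hypothetical service context with parent-chain lookup."""

    def __init__(self, parent=None, services=None):
        self.parent = parent
        self.services = dict(services or {})

    def get_service(self, name):
        # Look locally first, then ask the parent, and so on up the chain.
        if name in self.services:
            return self.services[name]
        if self.parent is not None:
            return self.parent.get_service(name)
        raise LookupError("no service named %r" % name)


class MyApplication:
    def __init__(self, context):
        # Early binding: the lookup happens once, not on every web hit.
        self.authenticate = context.get_service('security.authentication')

    def __call__(self, environ, start_response):
        user = self.authenticate(environ)
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [('Hello, %s' % user).encode('utf-8')]


def remote_user(environ):
    # Stand-in authentication service, for the example only.
    return environ.get('REMOTE_USER', 'anonymous')


# The server would configure a root context and call the factory with it
# (or with a child context that inherits from it).
root = Context(services={'security.authentication': remote_user})
child = Context(parent=root)
app = MyApplication(child)
```

Here MyApplication itself serves as the factory: the server calls it with a context and gets back a WSGI callable whose services are already bound.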

Really, the only stuff that actually needs to be middleware, is stuff that 
wraps an *oblivious* application; i.e., the application doesn't know it's 
there.  If it's a service the application uses, then it makes more sense to 
create a service management mechanism for configuration and deployment of 
WSGI applications.

However, I think that, again, the key part of configuration that actually 
relates to WSGI here is *deployment* configuration, such as which service 
implementations to use for the various kinds of services.  Configuration 
*of* the services can and should be private to those services, since 
they'll have implementation-specific needs.  (This doesn't mean, however, 
that a "configuration service" couldn't be part of the family of WSGI 
service interfaces.)

I hope this isn't too vague; I've been wanting to say something about this 
since I saw your blog post about doing transaction services in WSGI, as 
that was when I first understood why you were making everything into 
middleware.  (i.e., to create a poor man's substitute for "placeful" 
services and utilities as found in PEAK and Zope 3.)

Anyway, I don't have a problem with trying to create a framework-neutral 
(in theory, anyway) component system, but I think it would be a good idea 
to take lessons from ones that have solved this problem well, and then 
create an extremely scaled-down version, rather than kludging application 
configuration into what's really per-request data.


Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Ian Bicking
Chris McDonough wrote:
> I've also been putting a bit of thought into middleware configuration,
> although maybe in a different direction.  I'm not too concerned yet
> about being able to introspect the configuration of an individual
> component.  Maybe that's because I haven't thought about the problem
> enough to be concerned about it.  In the meantime, though, I *am*
> concerned about being able to configure a middleware "pipeline" easily
> and have it work.

There's nothing in WSGI to facilitate introspection.  Sometimes that 
seems annoying, though I suspect lots of headaches are removed because 
of it, and I haven't found it to be a stopper yet.  The issue I'm 
interested in is just how to deliver configuration to middleware.

Because middleware can't be introspected (generally), this makes things 
like configuration schemas very hard to implement.  It all needs to be 

> I've been attempting to divine a declarative way to configure a pipeline
> of WSGI middleware components.  This is simple enough through code,
> except that at least in terms of how I'm attempting to factor my
> middleware, some components in the pipeline may have dependencies on
> other pipeline components.

At least in Paste, you just have to set up the stack properly.  It would 
be cool if middleware could detect the presence of its prerequisites, 
and add the prerequisites if they weren't present; I don't think that's 
terribly complicated, but I haven't actually tried it.  Mostly you'd 
test for a key, and if not present then you'd instantiate the middleware 
and reinvoke.
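
A rough sketch of that "test for a key, instantiate, reinvoke" idea, which is untried here too. The class names and the 'example.identity' and 'example.greeting' environ keys are invented for illustration and are not Paste APIs.

```python
class Identify:
    """Hypothetical prerequisite middleware: records who the user claims to be."""

    KEY = 'example.identity'  # made-up environ key

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ[self.KEY] = environ.get('REMOTE_USER')
        return self.app(environ, start_response)


class NeedsIdentity:
    """Middleware that depends on Identify having run earlier in the stack."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if Identify.KEY not in environ:
            # Prerequisite missing: wrap ourselves in it and reinvoke.
            # Identify always sets the key, so this recurses only once.
            return Identify(self)(environ, start_response)
        environ['example.greeting'] = 'hello %s' % environ[Identify.KEY]
        return self.app(environ, start_response)
```

The stack works whether or not Identify was configured explicitly, at the cost of one extra call on the first pass through.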

> For example, it would be useful in some circumstances to create separate
> WSGI components for user identification and user authorization.  The
> process of identification -- obtaining user credentials from a request
> -- and user authorization  -- ensuring that the user is who he says he
> is by comparing the credentials against a data source -- are really
> pretty much distinct operations.  There might also be a "challenge"
> component which forces a login dialog.

I've always thought that a 401 response is a good way of indicating 
that, but not everyone agrees.  (The idea being that the middleware 
catches the 401 and possibly translates it into a redirect or something.)
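
One way such a 401-catching middleware might be sketched. The LoginRedirect name and the login_url default are assumptions, and this version buffers the whole response and ignores the legacy write() callable, so it is an illustration rather than production code.

```python
class LoginRedirect:
    """Sketch: turn a 401 from the wrapped app into a redirect to a login page."""

    def __init__(self, app, login_url='/login'):  # login_url is an assumption
        self.app = app
        self.login_url = login_url

    def __call__(self, environ, start_response):
        captured = []

        def capture(status, headers, exc_info=None):
            captured[:] = [status, headers]
            return lambda data: None  # legacy write() is ignored in this sketch

        result = self.app(environ, capture)
        try:
            body = list(result)  # buffer so the status is known before replying
        finally:
            if hasattr(result, 'close'):
                result.close()

        status, headers = captured
        if status.startswith('401'):
            start_response('302 Found', [('Location', self.login_url)])
            return []
        start_response(status, headers)
        return body
```

The application only has to say "401"; whether that becomes a redirect, a login form, or a real HTTP challenge is decided by whatever middleware is deployed around it.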

> In practice, I don't know if this is a truly useful separation of
> concerns that need to be implemented in terms of separate components in
> the middleware pipeline (I see that paste.login conflates them), it's
> just an example.  

Do you mean identification and authentication (you mention authorization 
above)?  I think authorization is different, and is conflated in 
paste.login, but I don't have many use cases where it's a useful 
distinction.  I guess there's a number of ways of getting a username and 
password; and to some degree the  authenticator object works at that 
level of abstraction.  And there's a couple other ways of authenticating 
a user as well (public keys, IP address, etc).  I've generally used a 
"user manager" object for this kind of abstraction, with subclassing for 
different kinds of generality (e.g., the basic abstract class makes 
username/password logins simple, but a subclass can override that and 
authenticate based on anything in the request).
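
To make the "user manager" shape concrete, here is a small sketch. All class names and the 'example.username' / 'example.password' environ keys are hypothetical; this is not the paste.login code, and the plain == password comparison is for illustration only.

```python
class UserManager:
    """Hypothetical base class: username/password logins are the easy case."""

    def check_password(self, username, password):
        raise NotImplementedError

    def authenticate(self, environ):
        # Default behaviour: credentials were left in the environ earlier.
        username = environ.get('example.username')
        password = environ.get('example.password')
        if username is not None and self.check_password(username, password):
            return username
        return None


class DictUserManager(UserManager):
    """Simple subclass: only has to say how passwords are checked."""

    def __init__(self, passwords):
        self.passwords = passwords

    def check_password(self, username, password):
        return username in self.passwords and self.passwords[username] == password


class IPUserManager(UserManager):
    """Overrides authenticate() entirely: trust by client address."""

    def __init__(self, users_by_ip):
        self.users_by_ip = users_by_ip

    def authenticate(self, environ):
        return self.users_by_ip.get(environ.get('REMOTE_ADDR'))
```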

Maybe there's a better term; the fact that these two words start with 
"auth" causes all kinds of confusion.  Conflating identification and 
authentication isn't so bad, but conflating authentication and 
authorization is really bad (but common).

> But at very least it would keep each component simpler
> if the concerns were factored out into separate pieces.
> But in the example I present, the "authentication" component depends
> entirely on the result of the "identification" component.  It would be
> simple enough to glom them together by using a distinct environment key
> for the identification component results and have the authentication
> component look for that key later in the middleware result chain, but
> then it feels like you might as well have written the whole process
> within one middleware component because the coupling is pretty strong.
> I have a feeling that adapters fit in here somewhere, but I haven't
> really puzzled that out yet.  I'm sure this has been discussed somewhere
> in the lifetime of WSGI but I can't find much in this list's archives.

No, I don't think so.  It was something I experimented with in 
paste.login (purely intellectually, I haven't used it in a real app), 
and Aaron Lav did a little work on it as well, but until it gets some 
use it's hard to know how complete it is.

As long as it's properly partitioned, I don't think it's a terribly hard 
problem.  That is, with proper partitioning the pieces can be 
recombined, even if the implementations aren't general enough for all 
cases.  Apache and Zope 2 authentication being examples where the 
partitioning was done improperly.

Ian Bicking  /  [EMAIL PROTECTED]  /

Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Jp Calderone
might be of interest on this topic.

[Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough
I've also been putting a bit of thought into middleware configuration,
although maybe in a different direction.  I'm not too concerned yet
about being able to introspect the configuration of an individual
component.  Maybe that's because I haven't thought about the problem
enough to be concerned about it.  In the meantime, though, I *am*
concerned about being able to configure a middleware "pipeline" easily
and have it work.

I've been attempting to divine a declarative way to configure a pipeline
of WSGI middleware components.  This is simple enough through code,
except that at least in terms of how I'm attempting to factor my
middleware, some components in the pipeline may have dependencies on
other pipeline components.

For example, it would be useful in some circumstances to create separate
WSGI components for user identification and user authorization.  The
process of identification -- obtaining user credentials from a request
-- and user authorization  -- ensuring that the user is who he says he
is by comparing the credentials against a data source -- are really
pretty much distinct operations.  There might also be a "challenge"
component which forces a login dialog.

In practice, I don't know if this is a truly useful separation of
concerns that need to be implemented in terms of separate components in
the middleware pipeline (I see that paste.login conflates them), it's
just an example.  But at very least it would keep each component simpler
if the concerns were factored out into separate pieces.

But in the example I present, the "authentication" component depends
entirely on the result of the "identification" component.  It would be
simple enough to glom them together by using a distinct environment key
for the identification component results and have the authentication
component look for that key later in the middleware result chain, but
then it feels like you might as well have written the whole process
within one middleware component because the coupling is pretty strong.
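
A sketch of the two-component arrangement described above, to make the coupling visible. The class names and the 'example.credentials' key are made up, HTTP Basic is just one possible source of credentials, and a plain dict stands in for the data source.

```python
import base64

CREDENTIALS_KEY = 'example.credentials'  # made-up key shared by both components


class Identification:
    """Extracts (username, password) from an HTTP Basic Authorization header."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        header = environ.get('HTTP_AUTHORIZATION', '')
        scheme, _, payload = header.partition(' ')
        if scheme.lower() == 'basic' and payload:
            try:
                decoded = base64.b64decode(payload).decode('utf-8')
            except ValueError:  # bad base64 or bad UTF-8
                decoded = ''
            username, sep, password = decoded.partition(':')
            if sep:
                environ[CREDENTIALS_KEY] = (username, password)
        return self.app(environ, start_response)


class Authentication:
    """Checks the credentials left by Identification against a data source."""

    def __init__(self, app, passwords):
        self.app = app
        self.passwords = passwords  # plain dict standing in for a data source

    def __call__(self, environ, start_response):
        username, password = environ.get(CREDENTIALS_KEY, (None, None))
        if username in self.passwords and self.passwords[username] == password:
            environ['REMOTE_USER'] = username
        return self.app(environ, start_response)
```

The only contract between the two is the shared key and the shape of its value, which is exactly the coupling in question: Authentication is useless unless something upstream fills in that key.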

I have a feeling that adapters fit in here somewhere, but I haven't
really puzzled that out yet.  I'm sure this has been discussed somewhere
in the lifetime of WSGI but I can't find much in this list's archives.

> Lately I've been thinking about the role of Paste and WSGI and
> whatnot. Much of what makes a Paste component Pastey is
> configuration;  otherwise the bits are just independent pieces of
> middleware, WSGI applications, etc.  So, potentially if we can agree
> on configuration, we can start using each other's middleware more
> usefully.
> I think we should avoid questions of configuration file syntax for
> now.  Let's instead simply consider configuration consumers.  A
> standard would consist of:
> * A WSGI environment key (e.g., 'webapp01.config')
> * A standard for what goes in that key (e.g., a dictionary object)
> * A reference implementation of the middleware
> * Maybe a non-WSGI-environment way to access the configuration (like 
> paste.CONFIG, which is a global object that dispatches to per-request 
> configuration objects) -- in practice this is really really useful, as 
> you don't have to pass the configuration object around.
> There's some other things we have to consider, as configuration syntaxes 
> do affect the configuration objects significantly.  So, the standard for 
> what goes in the key has to take into consideration some possible 
> configuration syntaxes.
> The obvious starting place is a dictionary-like object.  I would suggest 
> that the keys should be valid Python identifiers.  Not all syntaxes 
> require this, but some do.  This restriction simply means that 
> configuration consumers should try to consume Python identifiers.
> There's also a question about name conflicts (two consumers that are 
> looking for the same key), and whether nested configuration should be 
> preferred, and in what style.
> Note that the standard we decide on here doesn't have to be the only way 
> the object can be accessed.  For instance, you could make your 
> configuration available through 'myframework.config', and create a 
> compliant wrapper that lives in 'webapp01.config', perhaps even doing 
> different kinds of mapping to fix convention differences.
> There's also a question about what types of objects we can expect in the 
> configuration.  Some input styles (e.g., INI and command line) only 
> produce strings.  I think consumers should treat strings (or maybe a 
> special string subclass) specially, performing conversions as necessary 
> (e.g., 'yes'->True).
> Thoughts?

[Web-SIG] Standardized configuration

2005-07-11 Thread Ian Bicking
Lately I've been thinking about the role of Paste and WSGI and whatnot. 
  Much of what makes a Paste component Pastey is configuration; 
otherwise the bits are just independent pieces of middleware, WSGI 
applications, etc.  So, potentially if we can agree on configuration, we 
can start using each other's middleware more usefully.

I think we should avoid questions of configuration file syntax for now. 
  Let's instead simply consider configuration consumers.  A standard 
would consist of:

* A WSGI environment key (e.g., 'webapp01.config')
* A standard for what goes in that key (e.g., a dictionary object)
* A reference implementation of the middleware
* Maybe a non-WSGI-environment way to access the configuration (like 
paste.CONFIG, which is a global object that dispatches to per-request 
configuration objects) -- in practice this is really really useful, as 
you don't have to pass the configuration object around.
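
As a starting point for discussion, the first three items might be sketched like this. ConfigMiddleware is a hypothetical name and not the Paste implementation; the 'webapp01.config' key and the dictionary value come from the list above.

```python
class ConfigMiddleware:
    """Sketch of a reference implementation: publish a config dict per request."""

    def __init__(self, app, config, key='webapp01.config'):
        self.app = app
        self.config = dict(config)
        self.key = key

    def __call__(self, environ, start_response):
        # Copy so one request cannot mutate what the next one sees.
        environ[self.key] = dict(self.config)
        return self.app(environ, start_response)


def show_debug(environ, start_response):
    # A configuration consumer: reads only the agreed-upon key.
    config = environ['webapp01.config']
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [str(config.get('debug', False)).encode('ascii')]


application = ConfigMiddleware(show_debug, {'debug': True})
```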

There's some other things we have to consider, as configuration syntaxes 
do affect the configuration objects significantly.  So, the standard for 
what goes in the key has to take into consideration some possible 
configuration syntaxes.

The obvious starting place is a dictionary-like object.  I would suggest 
that the keys should be valid Python identifiers.  Not all syntaxes 
require this, but some do.  This restriction simply means that 
configuration consumers should try to consume Python identifiers.

There's also a question about name conflicts (two consumers that are 
looking for the same key), and whether nested configuration should be 
preferred, and in what style.

Note that the standard we decide on here doesn't have to be the only way 
the object can be accessed.  For instance, you could make your 
configuration available through 'myframework.config', and create a 
compliant wrapper that lives in 'webapp01.config', perhaps even doing 
different kinds of mapping to fix convention differences.
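
For example, a wrapper along these lines could republish a framework's own configuration under the standard key. The function and class names are hypothetical, and the particular key mapping (dashes and dots to underscores, dropping anything that still isn't an identifier) is only one assumed convention.

```python
def compliant_view(framework_config):
    """Return a dict whose keys are valid Python identifiers."""
    result = {}
    for key, value in framework_config.items():
        name = key.replace('-', '_').replace('.', '_')
        if name.isidentifier():
            result[name] = value
        # Keys that still are not identifiers are dropped in this sketch.
    return result


class ConfigAdapter:
    """Republish 'myframework.config' under 'webapp01.config'."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        native = environ.get('myframework.config', {})
        environ['webapp01.config'] = compliant_view(native)
        return self.app(environ, start_response)
```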

There's also a question about what types of objects we can expect in the 
configuration.  Some input styles (e.g., INI and command line) only 
produce strings.  I think consumers should treat strings (or maybe a 
special string subclass) specially, performing conversions as necessary 
(e.g., 'yes'->True).
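
A consumer-side conversion might look something like the following sketch. The as_bool name and the exact sets of accepted spellings are assumptions, not part of any agreed standard.

```python
_TRUE = {'yes', 'true', 'on', '1'}
_FALSE = {'no', 'false', 'off', '0'}


def as_bool(value):
    """Convert a configuration value to a bool, treating strings specially."""
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in _TRUE:
            return True
        if lowered in _FALSE:
            return False
        raise ValueError('not a boolean string: %r' % value)
    # Non-string values (e.g. from a Python-syntax config) pass through bool().
    return bool(value)
```

With this, a consumer can write as_bool(config['debug']) and not care whether the value came from an INI file as 'yes' or from Python code as True.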


Ian Bicking  /  [EMAIL PROTECTED]  /
