Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Ian Bicking
  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

[application:sample1]
config = sample1.conf
factory = wsgiconfig.tests.sample_components.factory1

[application:sample2]
config = sample2.conf
factory = wsgiconfig.tests.sample_components.factory2

[pipeline]
apps = sample1 sample2

On another tack, I think it's important we consider how 
setuptools/pkg_resources fits into this.  Specifically we should allow:

[application:sample1]
require = WSGIConfig
factory = ...

Since the factory might not be importable until require() is called. 
There's lots of other potential benefits to being able to get that 
information about requirements as well.
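
A minimal sketch of that ordering, assuming a section is already available as a plain dict. The helper name load_factory is hypothetical, and require is passed in as a parameter so the sketch stays self-contained; in the proposal it would be pkg_resources.require.

```python
import importlib


def load_factory(section, require):
    """Resolve the 'factory' option of one config section.

    `section` is a plain dict of options; `require` is whatever callable
    activates a distribution (an assumption made for this sketch).
    """
    requirement = section.get("require")
    if requirement:
        # Activate the distribution first, so the import below can succeed.
        require(requirement)
    module_name, _, attr = section["factory"].rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Stand-in values: a list records what was "required", and a stdlib
# callable plays the part of the factory.
activated = []
factory = load_factory(
    {"require": "WSGIConfig", "factory": "os.path.join"},
    require=activated.append,
)
```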

Another option is if, instead of a factory (or as an alternative 
alongside it) we make distributions publishable themselves, like:

[application:sample]
egg = MyAppSuite[filebrowser]

Which would require('MyAppSuite[filebrowser]'), and look in 
Paste.egg-info for a configuration file.  The [filebrowser] portion is 
pkg_resources' way of defining a feature, and I figure it can also 
identify a specific application if one package holds multiple 
applications.  However, that feature specification would be optional. 
What the configuration file in egg-info looks like, I don't know. 
Probably just like the original configuration file, except this time 
with a factory.

I don't like the configuration key egg though.  But eh, that's a detail.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  / http://blog.ianbicking.org
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Chris McDonough
On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:

To do this, we use a ConfigParser-format config file named
'myapplication.conf' that looks like this::
  
  [application:sample1]
  config = sample1.conf
  factory = wsgiconfig.tests.sample_components.factory1
  
  [application:sample2]
  config = sample2.conf
  factory = wsgiconfig.tests.sample_components.factory2
  
  [pipeline]
  apps = sample1 sample2
 
 I think it's confusing to call both these applications.  I think 
 middleware or filter would be better.  I think people understand 
 filter far better, so I'm inclined to use that.  So...

The reason I called them applications instead of filters is because all
of them implement the WSGI application API (they all implement a
callable that accepts two parameters, environ and start_response).
Some happen to be gateways/filters/middleware/whatever but at least one
is just an application and does no delegation.  In my example above,
sample2 is not a filter, it is the end-point application.  sample1
is a filter, but it's of course also an application too.

Would you maybe rather make it more explicit that some apps are also
gateways, e.g.:

[application:bleeb]
config = bleeb.conf
factory = bleeb.factory

[filter:blaz]
config = blaz.conf
factory = blaz.factory

?  I don't know that there's any way we could make use of the
distinction between the two types in the configurator other than
disallowing people to place an application before a filter in a
pipeline through validation.  Is there something else you had in mind?
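
The validation mentioned here could be as small as the following sketch. The function name validate_pipeline and the name-to-kind mapping are assumptions for illustration; nothing like this exists in the strawman code.

```python
def validate_pipeline(names, kinds):
    """Raise ValueError unless `names` is filters followed by one application."""
    if not names:
        raise ValueError("pipeline is empty")
    for name in names[:-1]:
        if kinds[name] != "filter":
            raise ValueError(
                "%r is an application, so nothing may follow it" % name)
    if kinds[names[-1]] != "application":
        raise ValueError(
            "%r is a filter and needs an application after it" % names[-1])


# Section prefixes from the example above, reduced to a name -> kind map.
kinds = {"bleeb": "application", "blaz": "filter"}
validate_pipeline(["blaz", "bleeb"], kinds)  # passes silently
```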

 [application:sample2]
 # What is this relative to?  I hate both absolute paths and
 # paths relative to pwd equally...
 config = sample1.conf
 factory = wsgiconfig...

This was from a doctest I wrote so I could rely on relative paths,
sorry.  You're right.  Um... we could probably use the
environment as defaults for ConfigParser interpolation and set whatever
we need before the configurator is run:

$ export APP_ROOT=/home/chrism/myapplication
$ ./wsgi-configurator.py myapplication.conf

And in myapplication.conf:

[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory

That would probably be the least-effort and most flexible thing to do
and doesn't mandate any particular directory structure.  Of course, we
could provide a convention for a recommended directory structure, but
this gives us an out from being painted in to that in specific cases.
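
As a rough illustration of the interpolation idea in today's Python 3 configparser (the thread itself predates it), the sketch below passes a single value in as a parser default. The helper name load is hypothetical, and the root is passed explicitly where a real configurator would read os.environ["APP_ROOT"].

```python
import configparser

CONF = """\
[application:sample1]
config = %(APP_ROOT)s/sample1.conf
factory = myapp.sample1.factory
"""


def load(text, app_root):
    # Pass only the values we mean to expose; feeding all of os.environ
    # in can fail on values that contain a stray '%'.
    parser = configparser.ConfigParser(defaults={"APP_ROOT": app_root})
    parser.read_string(text)
    return parser


parser = load(CONF, "/home/chrism/myapplication")
config_path = parser.get("application:sample1", "config")
```

One side effect worth knowing: parser defaults also show up as options of every section, so a loader that forwards section options to a factory would have to filter them out.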

 [pipeline]
 # The app is unique and special...?
 app = sample2
 filters = sample1
 
 
 
 Well, that's just a first refactoring; I'm having other inclinations...

I'm not sure whether this is just a stylistic thing or if there's a
reason you want to treat the endpoint app specially.  By definition, in
my implementation, the endpoint app is just the last app mentioned in
the pipeline.

  Potential points of contention
  
   - The WSGI configurator assumes that you are willing to write WSGI
 component factories which accept a filename as a config file.  This
 factory returns *another* factory (typically a class) that accepts
 the next application in the pipeline chain and returns a WSGI
 application instance.  This pattern is necessary to support
 argument currying across a declaratively configured pipeline,
 because the WSGI spec doesn't allow for it.  This is more contract
 than currently exists in the WSGI specification but it would be
 trivial to change existing WSGI components to adapt to this
 pattern.  Or we could adopt a pattern/convention that removed one
 of the factories, passing both the next application and the
 config file into a single factory function.  Whatever.  In any
 case, in order to do declarative pipeline configuration, some
 convention will need to be adopted.  The convention I'm advocating
 above seems to already have been adopted by the current crop of middleware
 components (using a factory which accepts the application as the
 first argument).
 
 I hate the proliferation of configuration files this implies.  I 
 consider the filters an implementation detail; if they each have 
 partitioned configuration then they become a highly exposed piece of the 
 architecture.
 
 It's also a lot of management overhead.  Typical middleware takes 0-5 
 configuration parameters.  For instance, paste.profilemiddleware is 
 perfectly usable with no configuration at all, and only has two parameters.

True.  The config file param should be optional.  Apps might use the
environment to configure themselves.

 But this is reasonably easy to resolve -- there's a perfectly good 
 configuration section sitting there, waiting to be used:
 
[filter:profile]
factory = paste.profilemiddleware.ProfileMiddleware
# Show top 50 functions:
limit = 50
 
 This in no way precludes 'config', which is just a special case of this 
 general configuration.  The only real problem is a possible conflict if 
 we 

Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Ian Bicking
Chris McDonough wrote:
 On Fri, 2005-07-22 at 17:26 -0500, Ian Bicking wrote:
  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

[application:sample1]
config = sample1.conf
factory = wsgiconfig.tests.sample_components.factory1

[application:sample2]
config = sample2.conf
factory = wsgiconfig.tests.sample_components.factory2

[pipeline]
apps = sample1 sample2

I think it's confusing to call both these applications.  I think 
middleware or filter would be better.  I think people understand 
filter far better, so I'm inclined to use that.  So...
 
 
 The reason I called them applications instead of filters is because all
 of them implement the WSGI application API (they all implement a
 callable that accepts two parameters, environ and start_response).
 Some happen to be gateways/filters/middleware/whatever but at least one
 is just an application and does no delegation.  In my example above,
 sample2 is not a filter, it is the end-point application.  sample1
 is a filter, but it's of course also an application too.

Well, the difference I see is that a filter accepts a next-application, 
where a plain application does not.  From the perspective of this 
configuration file, those seem very different.  In fact, it could 
actually be:

   [application:sample1]
   config = sample1.conf
   factory = ...

   ...

   [application:real_sample1]
   pipeline = printdebug_app sample1

That is, a pipeline simply describes a new application.  And then -- 
perhaps with a conventional name, or through some more global 
configuration -- we indicate which application we are going to serve.

Hmm... thinking about it, this seems much more general, in a very useful 
way, since anyone can plug in ways to compose applications. 
pipeline is just one use case for how to compose applications.
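
To make the "a pipeline simply describes a new application" idea concrete, here is a small sketch. It assumes sections are plain dicts holding callables directly (rather than dotted names), and the keys "factory", "filter_factory" and "pipeline" are illustrative choices, not a proposed format.

```python
def build(name, sections):
    """Return the WSGI application called `name`, composing pipelines on demand."""
    section = sections[name]
    if "pipeline" in section:
        *filter_names, app_name = section["pipeline"].split()
        # The last entry may itself be another pipeline.
        app = build(app_name, sections)
        # Wrap from the inside out, so the first filter listed is outermost.
        for filter_name in reversed(filter_names):
            app = sections[filter_name]["filter_factory"](app)
        return app
    return section["factory"]()


log = []


def make_sample1():
    def sample1(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello"]
    return sample1


def make_printdebug(next_app):
    def printdebug(environ, start_response):
        log.append("printdebug")
        return next_app(environ, start_response)
    return printdebug


sections = {
    "sample1": {"factory": make_sample1},
    "printdebug_app": {"filter_factory": make_printdebug},
    "real_sample1": {"pipeline": "printdebug_app sample1"},
}
served = build("real_sample1", sections)
```

Note that the two factory shapes differ here: a plain application factory takes no next-application, while a filter factory requires one.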

 Would you maybe rather make it more explicit that some apps are also
 gateways, e.g.:
 
 [application:bleeb]
 config = bleeb.conf
 factory = bleeb.factory
 
 [filter:blaz]
 config = blaz.conf
 factory = blaz.factory
 
 ?  I don't know that there's any way we could make use of the
 distinction between the two types in the configurator other than
 disallowing people to place an application before a filter in a
 pipeline through validation.  Is there something else you had in mind?

I have forgotten what the actual factory interface was, but I think it 
should be different for the two.  Well, I think it *is* different, and 
passing in a next-application of None just covers up that difference.

[application:sample2]
# What is this relative to?  I hate both absolute paths and
# paths relative to pwd equally...
config = sample1.conf
factory = wsgiconfig...
 
 
 This was from a doctest I wrote so I could rely on relative paths,
 sorry.  You're right.  Um... we could probably use the
 environment as defaults for ConfigParser interpolation and set whatever
 we need before the configurator is run:
 
 $ export APP_ROOT=/home/chrism/myapplication
 $ ./wsgi-configurator.py myapplication.conf
 
 And in myapplication.conf:
 
 [application:sample1]
 config = %(APP_ROOT)s/sample1.conf
 factory = myapp.sample1.factory

I hate %(APP_ROOT)s as a syntax; I think it's okay to simply say that 
the configuration loader (in some fashion) should determine the root 
(maybe with an environmental variable or command line parameter).

Though, realistically, there might be several app roots.  Apache's root 
directory configuration (for relative paths) isn't very useful to me in 
practice, because it isn't flexible enough and doesn't allow more than one root.

But this is reasonably easy to resolve -- there's a perfectly good 
configuration section sitting there, waiting to be used:

   [filter:profile]
   factory = paste.profilemiddleware.ProfileMiddleware
   # Show top 50 functions:
   limit = 50

This in no way precludes 'config', which is just a special case of this 
general configuration.  The only real problem is a possible conflict if 
we wanted to add new special names to the configuration, i.e., 
meta-filter-configuration.
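
One way to read this, sketched with Python 3's configparser: everything in the section except a short list of reserved names becomes a keyword setting for the factory. The helper name filter_settings and the RESERVED tuple are assumptions for illustration, and the reserved list is exactly where the naming conflict mentioned above would surface.

```python
import configparser

# Names the loader keeps for itself; any new "special" option would have
# to be added here, which is the possible conflict described above.
RESERVED = ("factory", "config")


def filter_settings(parser, section):
    """Split one section into its factory name and its keyword settings."""
    options = dict(parser.items(section))
    factory_name = options["factory"]
    settings = {k: v for k, v in options.items() if k not in RESERVED}
    return factory_name, settings


parser = configparser.ConfigParser()
parser.read_string("""\
[filter:profile]
factory = paste.profilemiddleware.ProfileMiddleware
# Show top 50 functions:
limit = 50
""")
factory_name, settings = filter_settings(parser, "filter:profile")
```

The values arrive as strings ("50", not 50), so either the loader or the factory has to convert them.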
 
 
 I think I'd maybe rather see configuration settings for apps that don't
 require much configuration to come in as environment variables (maybe
 not necessarily in the environ namespace that is implied by the WSGI
 callable interface but instead in os.environ).  Envvars are
 uncontroversial, so they don't cost us any coding time, PEP time, or
 brain cycles.

Yikes!  Were you like the ZConfig holdout or something?  os.environ is 
way, way, way too inflexible.

Just the other day I was able to deploy a single application I wrote 
with two configurations in the same process, without having thought 
about that possibility ahead of time, and without doing any extra work 
or avoiding any particular shortcuts.  It worked absolutely seamlessly, 
because I wasn't using any global variables, and I had stuck to a 
convention where Paste nests configurations in a 

Re: [Web-SIG] Standardized configuration

2005-07-23 Thread Phillip J. Eby
At 08:41 PM 7/23/2005 -0400, Chris McDonough wrote:
On Sat, 2005-07-23 at 20:21 -0400, Phillip J. Eby wrote:
  At 08:08 PM 7/23/2005 -0400, Chris McDonough wrote:
  Would you maybe rather make it more explicit that some apps are also
  gateways, e.g.:
  
  [application:bleeb]
  config = bleeb.conf
  factory = bleeb.factory
  
  [filter:blaz]
  config = blaz.conf
  factory = blaz.factory
 
  That looks backwards to me.  Why not just list the sections in pipeline
  order?  i.e., outermost middleware first, and the final application last?
 
  For that matter, if you did that, you could specify the above as:
 
   [blaz.factory]
   config=blaz.conf
 
   [bleeb.factory]
   config=bleeb.conf

Guess that would work for me, but out of the box, ConfigParser doesn't
appear to preserve section ordering.  I'm sure we could make it do that.
Not a dealbreaker either, but if you ever did want a way to
declaratively configure something in the config file like the generic
decision middleware I described in that message, this wouldn't really
work.  I hadn't described it yet, but I can also imagine declaring
multiple pipelines in the config file and using decision middleware to
choose the first app in the next pipeline (as opposed to just an app).
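
On the ordering point: the ConfigParser of 2005 stored sections in a plain dict, but the configparser module in current Python 3 returns sections in file order, so the "list the sections in pipeline order" idea can be checked directly. A minimal sketch:

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("""\
[blaz.factory]
config = blaz.conf

[bleeb.factory]
config = bleeb.conf
""")

# Outermost middleware first, final application last, as written in the file.
pipeline = parser.sections()
```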

I consider this a YAGNI, myself.  But then again, most of the pipeline 
stuff seems like a YAGNI to me.

Probably that's because everything you guys are talking about implementing 
with pipelines of middleware, I'd use a single generic function for.  If I 
was wrapping oblivious or legacy apps, I'd just make one middleware object 
that then calls the generic function to do any and all dynamic 
requirements, because it would only take a little bit of syntax sugar to 
implement configuration scripts like:

 use_auth("/some/subdir", some_auth_service)
 mount_app("/other/path", some_app_object)

etc.  So, all the time spent on coming up with an uglier, less-powerful 
pseudo-framework to simulate these capabilities using crude .ini files and 
poking stuff into environ seems kind of wasteful to me, versus defining a 
powerful API to -- dare I say it -- paste applications together.  :)

However, such an API deserves to be both powerful and easy-to-use, not 
kludged together with .ini syntax.

That's not saying I don't think WSGI should have a deployment configuration 
format based on .ini syntax -- I still do!  I just don't think it should 
even attempt to allow anything complex.  A simple static pipeline and some 
server-defined and WSGI-defined options will do nicely for the "simple 
things are simple" case, and a Python file will do nicely for all the 
"complex things are possible" cases.

That's why I'd like to see this effort split into two parts: 1) simple 
deployment, and 2) a pasting API whose entire purpose in life is to 
stack, route, and multiplex middleware and applications without having 
to explicitly manage a pipeline.

This API would use *specificity* as a basis for establishing pipelines, 
because it's not at all scalable (developer-wise) to set up pipelines on a 
URL-by-URL basis for a complex application -- especially for applications 
that aren't page-based!  Usually, you'll need some kind of pipeline 
inheritance to manage that sort of thing.

There is little reason, however, why you can't configure a significant 
portion of a URL space using a single WSGI component, using an appropriate 
mechanism.  For example, recasting my earlier example:

 def factory(container):
     container.use_auth("some/subdir", some_auth_service)
     container.mount_app_factory("other/path", some_app_factory)

Then, the 'mount_app_factory()' call could invoke 
'some_app_factory(subcontainer)' where 'subcontainer' is a wrapper that 
prepends 'other/path' to URLs before delegating to 'container'.

In other words, once you have this container API, there's no reason not 
to just use it to implement the whole stack in a single middleware object.
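
A toy version of such a container, only to show the path-prefixing behaviour described above. The class names Container and SubContainer, the routes list, and the string stand-ins for services and apps are all hypothetical; no such API was specified in the thread.

```python
class Container(object):
    """Collects (kind, path, target) registrations for one URL space."""

    def __init__(self):
        self.routes = []

    def use_auth(self, path, auth_service):
        self.routes.append(("auth", path, auth_service))

    def mount_app(self, path, app):
        self.routes.append(("app", path, app))

    def mount_app_factory(self, path, app_factory):
        # The factory only ever sees a wrapper rooted at `path`.
        app_factory(SubContainer(self, path))


class SubContainer(object):
    """Prepends a prefix to every path before delegating to the parent."""

    def __init__(self, parent, prefix):
        self.parent = parent
        self.prefix = prefix.rstrip("/")

    def _full(self, path):
        return self.prefix + "/" + path.lstrip("/")

    def use_auth(self, path, auth_service):
        self.parent.use_auth(self._full(path), auth_service)

    def mount_app(self, path, app):
        self.parent.mount_app(self._full(path), app)

    def mount_app_factory(self, path, app_factory):
        self.parent.mount_app_factory(self._full(path), app_factory)


def some_app_factory(container):
    container.mount_app("admin", "admin_app")


def factory(container):
    container.use_auth("some/subdir", "some_auth_service")
    container.mount_app_factory("other/path", some_app_factory)


root = Container()
factory(root)
```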

Anyway, this is why I think there should be a WSGI Services and/or WSGI 
Container API spec, distinct from a WSGI Deployment Metadata 
spec.  These two spheres are both valuable, but I think it'll take longer 
to get a deployment spec if we mix container API stuff into it -- and 
get a much less useful container API than if we set our minds on making a 
good container API, rather than a souped-up deployment descriptor.



Re: [Web-SIG] Standardized configuration

2005-07-22 Thread Chris McDonough
I've had a stab at creating a simple WSGI deployment implementation.
I use the term "WSGI component" in here as shorthand to indicate all
types of WSGI implementations (server, application, gateway).

The primary deployment concern is to create a way to specify the
configuration of an instance of a WSGI component, preferably within a
declarative configuration file.  A secondary deployment concern is to
create a way to wire up components together into a specific
deployable pipeline.  

Here is a strawman implementation that addresses both concerns via the
"configurator", which would be presumed to live in wsgiref.  Currently
it lives in a package named wsgiconfig on my laptop.  The module
follows.

""" Configurator for establishing a WSGI pipeline """

from ConfigParser import ConfigParser
import types

def configure(path):
    config = ConfigParser()
    if isinstance(path, types.StringTypes):
        config.readfp(open(path))
    else:
        config.readfp(path)

    appsections = []

    for name in config.sections():
        if name.startswith('application:'):
            appsections.append(name)
        elif name == 'pipeline':
            pass
        else:
            raise ValueError('%s is not a valid section name' % name)

    app_defs = {}

    for appsection in appsections:
        app_name = appsection.split('application:')[1]
        # config.get() raises NoOptionError rather than returning None,
        # so test for the required options explicitly
        if not config.has_option(appsection, 'config'):
            raise ValueError('application section %s requires a config '
                             'option' % app_name)
        if not config.has_option(appsection, 'factory'):
            raise ValueError('application %s requires a factory'
                             ' option' % app_name)
        app_defs[app_name] = {'config': config.get(appsection, 'config'),
                              'factory': config.get(appsection, 'factory')}

    if not config.has_section('pipeline'):
        raise ValueError('must have a pipeline section in config')

    if not config.has_option('pipeline', 'apps'):
        raise ValueError('must have an apps definition in the '
                         'pipeline section')

    pipeline_str = config.get('pipeline', 'apps')
    pipeline_def = pipeline_str.split()

    next = None

    # walk the pipeline from the end-point app back to the outermost app
    while pipeline_def:
        app_name = pipeline_def.pop()
        app_def = app_defs.get(app_name)
        if app_def is None:
            raise ValueError('appname %s is defined in pipeline '
                             '%s but no application is defined '
                             'with that name' % (app_name, pipeline_str))
        factory_name = app_def['factory']
        factory = import_by_name(factory_name)
        config_file = app_def['config']
        app_factory = factory(config_file)
        app = app_factory(next)
        next = app

    if next is None:
        raise ValueError('no apps defined in pipeline')
    return next

def import_by_name(name):
    if not "." in name:
        raise ValueError("unloadable name: " + repr(name))
    components = name.split('.')
    start = components[0]
    g = globals()
    package = __import__(start, g, g)
    modulenames = [start]
    for component in components[1:]:
        modulenames.append(component)
        try:
            package = getattr(package, component)
        except AttributeError:
            n = '.'.join(modulenames)
            package = __import__(n, g, g, component)
    return package
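
The core of the loop above, reduced to a Python 3 sketch that can be run on its own. It assumes the factories are passed in as callables instead of being imported by name, and factory1 and factory2 below are illustrative stand-ins, not the contents of wsgiconfig.tests.sample_components.

```python
def build_pipeline(app_names, app_defs):
    """Chain the named apps right to left; the last name is the end point."""
    next_app = None
    for name in reversed(app_names):
        definition = app_defs[name]
        # First stage: bind the config file.  Second stage: bind the next app.
        app_factory = definition["factory"](definition["config"])
        next_app = app_factory(next_app)
    if next_app is None:
        raise ValueError("no apps defined in pipeline")
    return next_app


def factory1(config_file):
    def make(next_app):
        def middleware(environ, start_response):
            environ["sample1"] = config_file
            return next_app(environ, start_response)
        return middleware
    return make


def factory2(config_file):
    def make(next_app):  # next_app is None for the end-point app
        def endpoint(environ, start_response):
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"hello from sample2"]
        return endpoint
    return make


app_defs = {
    "sample1": {"config": "sample1.conf", "factory": factory1},
    "sample2": {"config": "sample2.conf", "factory": factory2},
}
appchain = build_pipeline(["sample1", "sample2"], app_defs)
```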

  We configure a pipeline based on a config file, which
  creates and chains two sample WSGI applications together.

  To do this, we use a ConfigParser-format config file named
  'myapplication.conf' that looks like this::

[application:sample1]
config = sample1.conf
factory = wsgiconfig.tests.sample_components.factory1

[application:sample2]
config = sample2.conf
factory = wsgiconfig.tests.sample_components.factory2

[pipeline]
apps = sample1 sample2

  The configurator exposes a function, configure, that accepts a single
  argument: the path to the config file, or an open file object.

   >>> from wsgiconfig.configurator import configure
   >>> appchain = configure('myapplication.conf')

  The sample_components module referred to in the
  'myapplication.conf' file application definitions might look like
  this::

      class sample1:
          """ middleware """
          def __init__(self, app):
              self.app = app
          def __call__(self, environ, start_response):
              environ['sample1'] = True
              return self.app(environ, start_response)

      class sample2:
          """ end-point app """
          def __init__(self, app):
              self.app = app

          def __call__(self, environ, start_response):
              environ['sample2'] = True
              return ['return value 

Re: [Web-SIG] Standardized configuration

2005-07-19 Thread Chris McDonough
On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
 In addition to the examples I gave in response to Graham, I wrote a 
 document on this a while ago: 
 http://pythonpaste.org/docs/url-parsing-with-wsgi.html
 
 The hard part about this is configuration; it's easy to configure a 
 non-branching chain of middleware.  Once it branches the configuration 
 becomes hard (like programming-hard; which isn't *hard*, but it quickly 
 stops feeling like configuration).

Yep.  I think I'm getting it.  For example, I see that Paste's URLParser
seems to *construct* applications if they don't already exist based on
the URL.  And I assume that these applications could themselves be
middleware.  I don't think that is configurable declaratively if you
want to decide which app to use based on arbitrary request parameters.

But if we already had the config for each app instance that URLParser
wanted to consult laying around as files on disk, wouldn't it be just as
easy to construct these app objects eagerly at startup time?  Then your
URLParser could choose an already-configured app based on some sort of
configuration file in the URLParser component itself.  The apps
themselves may be pipelines, too, I realize that, but that is still
configurable without coding.

Maybe there'd be some concern about needing to stop the process in order
to add new applications.  That's a use case I hadn't really considered.
I suspect this could be done with a signal handler, though, which could
tell the URLParser to reload its config file instead of potentially
locating and creating a new application within every request.

This would make URLParser a kind of decision middleware, but it would
choose from a static set of existing applications (or pipelines) for the
lifetime of the process as opposed to constructing them lazily.
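
A sketch of that arrangement, assuming a hypothetical load_apps callable that reads whatever configuration exists and returns a prefix-to-application dict. The class name StaticDispatcher is made up, and the first-match prefix lookup is deliberately naive.

```python
import signal


class StaticDispatcher(object):
    """Choose among apps built eagerly at startup; rebuild only on request."""

    def __init__(self, load_apps):
        self.load_apps = load_apps
        self.apps = load_apps()  # constructed once, at startup
        self.reload_requested = False

    def request_reload(self, signum=None, frame=None):
        # Signal handlers should do very little: just set a flag.
        self.reload_requested = True

    def install(self):
        # SIGHUP is not available on every platform.
        sighup = getattr(signal, "SIGHUP", None)
        if sighup is not None:
            signal.signal(sighup, self.request_reload)

    def __call__(self, environ, start_response):
        if self.reload_requested:
            self.reload_requested = False
            self.apps = self.load_apps()
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.apps.items():
            if path.startswith(prefix):
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
```

Between reloads, no request ever triggers application construction, which is the difference from building apps lazily per URL.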

  OTOH, I'm not sure that I want my framework to find an app for me.
  I'd like to be able to define pipelines that include my app, but I'd
  typically just want to statically declare it as the end point of a
  pipeline composed of service middleware.  I should look at Paste a
  little more to see if it has the same philosophy or if I'm
  misunderstanding you.
 
 Mostly I wanted to avoid lots of magical incantations for the simple 
 case.  If you are used to Webware, well it has a very straight-forward 
 way of finding your application -- you give it a directory name.  If 
 Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
 ZEO connection string, and so on.

I think I understand now.

In general, I think I'd rather create "instance" locations of WSGI
applications (which would essentially consist of a config file on disk
plus any state info required by the app), configure and construct Python
objects out of those instances eagerly at startup time, and, if decisions
need to be made about which app to use, just choose between
already-constructed apps in "decision middleware" that has its own
declarative configuration.

This is mostly because I want the configuration info to live within the
application/middleware instance and have some other starter import
those configurations from application/middleware instance locations on
the filesystem.  The starter would construct required instances as
Python objects, and chain them together arbitrarily based on some other
pipeline configuration file that lives with the starter.  The first
part of that (construct required instances) is described in a post I
made to this list yesterday.

This is probably because I'd like there to be one well-understood way to
declaratively configure pipelines as opposed to each piece of middleware
potentially needing to manage app construction and having its own
configuration to do so.

I don't know if this is reasonable for simpler requirements.  This is
more of a formal deployment spec idea and of course is likely flawed
in some subtle way I don't understand yet.

  I'm pretty sure you're not advocating it, but in case you are, I'm not
  sure it adds as much value as it removes to be able to have a dynamic
  middleware chain whereby new middleware elements can be added on the
  fly to a pipeline after a request has begun.  That is *very* late
  binding to me and it's impossible to configure declaratively.
 
 I'm comfortable with a little of both.  I don't even know *how* I'd stop 
 dynamic middleware.  For instance, one of the methods I added to Wareweb 
 recently allows any servlet to forward to any WSGI application; but from 
 the outside the servlet looks like a normal WSGI application just like 
 before.

It's obviously fine if applications themselves want to do this.  I'm not
sure that it would be possible to create a deployment spec that
canonized *how* to do it because as you mentioned it's not really a
configuration task, it's a programming task.

  I agree!  I'm a bit confused because one of the canonical examples of
  how WSGI middleware is useful seems to be the example of implementing a
  framework-agnostic 

Re: [Web-SIG] Standardized configuration

2005-07-19 Thread Ian Bicking
Chris McDonough wrote:
 On Mon, 2005-07-18 at 22:49 -0500, Ian Bicking wrote:
 
In addition to the examples I gave in response to Graham, I wrote a 
document on this a while ago: 
http://pythonpaste.org/docs/url-parsing-with-wsgi.html

The hard part about this is configuration; it's easy to configure a 
non-branching chain of middleware.  Once it branches the configuration 
becomes hard (like programming-hard; which isn't *hard*, but it quickly 
stops feeling like configuration).
 
 
 Yep.  I think I'm getting it.  For example, I see that Paste's URLParser
 seems to *construct* applications if they don't already exist based on
 the URL.  And I assume that these applications could themselves be
 middleware.  I don't think that is configurable declaratively if you
 want to decide which app to use based on arbitrary request parameters.
 
 But if we already had the config for each app instance that URLParser
 wanted to consult laying around as files on disk, wouldn't it be just as
 easy to construct these app objects eagerly at startup time?  Then your
 URLParser could choose an already-configured app based on some sort of
 configuration file in the URLParser component itself.  The apps
 themselves may be pipelines, too, I realize that, but that is still
 configurable without coding.

That's what paste.urlmap is for:

   http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py

(I haven't actually tried using it much for practical things, so it's 
quite possible it has design mistakes in it)

The idea being that you do:

   urlmap['/myapp'] = MyApp()

But additionally (in PathProxyURLMap):

   urlmap['/myapp'] = 'myapp.conf'

And it builds the application from the configuration file.
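
The two assignment styles can be sketched independently of Paste as follows. This is not paste.urlmap's implementation: the class name ConfigURLMap and the build_from_config callable are assumptions made for the example.

```python
class ConfigURLMap(dict):
    """Prefix -> application mapping that also accepts config file names."""

    def __init__(self, build_from_config):
        super().__init__()
        # Hypothetical callable that turns a config path into a WSGI app.
        self.build_from_config = build_from_config

    def __setitem__(self, prefix, app_or_conf):
        if isinstance(app_or_conf, str):
            # A string is taken to be a config file; build the app now.
            app_or_conf = self.build_from_config(app_or_conf)
        super().__setitem__(prefix, app_or_conf)


def fake_builder(filename):
    # Stand-in for a real loader; it just remembers where it came from.
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [("built from " + filename).encode("ascii")]
    return app


urlmap = ConfigURLMap(fake_builder)
urlmap["/myapp"] = "myapp.conf"
```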

 Maybe there'd be some concern about needing to stop the process in order
 to add new applications.  That's a use case I hadn't really considered.
 I suspect this could be done with a signal handler, though, which could
 tell the URLParser to reload its config file instead of potentially
 locating and creating a new application within every request.
 
 This would make URLParser a kind of decision middleware, but it would
 choose from a static set of existing applications (or pipelines) for the
 lifetime of the process as opposed to constructing them lazily.

URLParser itself is just one parsing implementation, though maybe named 
too generically.  I don't think that particular code needs to grow many 
more features, but there's also room for many other parsers.  And it's 
also fairly easy to wrestle control from URLParser if that gets put in 
the stack (for instance, putting an application function in __init__.py 
will basically take over URL parsing for that directory).

OTOH, I'm not sure that I want my framework to find an app for me.
I'd like to be able to define pipelines that include my app, but I'd
typically just want to statically declare it as the end point of a
pipeline composed of service middleware.  I should look at Paste a
little more to see if it has the same philosophy or if I'm
misunderstanding you.

Mostly I wanted to avoid lots of magical incantations for the simple 
case.  If you are used to Webware, well it has a very straight-forward 
way of finding your application -- you give it a directory name.  If 
Quixote or CherryPy, you give it a root object.  Maybe Zope would take a 
ZEO connection string, and so on.
 
 
 I think I understand now.
 
 In general, I think I'd rather create "instance" locations of WSGI
 applications (which would essentially consist of a config file on disk
 plus any state info required by the app), configure and construct Python
 objects out of those instances eagerly at startup time, and, if decisions
 need to be made about which app to use, just choose between
 already-constructed apps in "decision middleware" that has its own
 declarative configuration.

I think this is a laudable goal.  Right now, when I'm deploying 
applications written for Paste, I am reluctant to intermingle them in 
the same process and configuration... but that's because Paste is young, 
not because that's a bad idea.  But as a result I haven't tried it, and 
I only have a moderate concept of what it would mean in practice.

A neat feature would be to configure fairly seamlessly across process 
boundaries.  E.g., add a fork=True parameter to an application's 
configuration, and the server would fork a process (or delegate to an 
already forked worker process) for that application.  That's the sort of 
thing that could move Python into PHP-style hosting situations.

 This is mostly because I want the configuration info to live within the
 application/middleware instance and have some other starter import
 those configurations from application/middleware instance locations on
 the filesystem.  The starter would construct required instances as
 Python objects, and chain them together arbitrarily based on some other
 pipeline configuration file that lives with the starter.  The first
 part of that (construct required 

Re: [Web-SIG] Standardized configuration

2005-07-19 Thread Ian Bicking
Phillip J. Eby wrote:
 In many cases, the middleware is modifying or watching the 
 application's output.  For instance, catching a 401 and turning that 
 into the appropriate login -- which might mean producing a 401, a 
 redirect, a login page via internal redirect, or whatever.
 
 
 And that would be legitimate middleware, except I don't think that's 
 what you really want for that use case.  What you want is an 
 authentication service that you just call to say, "I need a login" and 
 get the login information from, and return its return value so that it 
 does start_response for you and sends the right output.

Like I mentioned in my response to Chris, this kind of contract about 
return values is a difficult one to implement.  A "return 401 status" 
contract is pretty simple, in that it's vague in a way that fits with 
typical frameworks -- they all have a way of changing the status, and 
most have a way of aborting with that kind of error.
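
For illustration, middleware honouring that "return 401 status" contract might look roughly like this. The class name LoginOn401 and the login_app parameter are hypothetical, and the sketch buffers the whole response and ignores the legacy write() callable, which a real implementation could not afford to do.

```python
class LoginOn401(object):
    """Replace any 401 response from `app` with the output of `login_app`."""

    def __init__(self, app, login_app):
        self.app = app
        self.login_app = login_app

    def __call__(self, environ, start_response):
        captured = []

        def capture(status, headers, exc_info=None):
            captured[:] = [status, headers]
            # The legacy write() callable is ignored in this sketch.
            return lambda data: None

        result = self.app(environ, capture)
        try:
            body = list(result)  # buffer, so the status is known for sure
        finally:
            if hasattr(result, "close"):
                result.close()

        status, headers = captured
        if status.startswith("401"):
            # Hand the whole request over to the login application.
            return self.login_app(environ, start_response)
        start_response(status, headers)
        return body
```

The wrapped application only has to be able to set a status, which is why this kind of contract fits existing frameworks so easily.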

 The difference is obliviousness; if you want to *wrap* an application 
 not written to use WSGI services, then it makes sense to make it 
 middleware.  If you're writing a new application, just have it use 
 components instead of mocking up a 401 just so you can use the existing 
 middleware.

Who's writing new applications?  OK... I guess a lot of people are.  I 
may be more focused on retrofitting compared to other people.

 Notice, by the way, that it's trivial to create middleware that detects 
 the 401 and then *invokes the service*.  So, it's more reusable to make 
 services be services, and middleware be wrappers to apply services to 
 oblivious applications.

Yes, this would be the single-middleware-multiple-service model.  I 
don't understand exactly how services work myself, so I can't write 
that, but I'm certainly interested in examples.  Well... I'll throw out 
one just for the heck of it:

class ServiceMiddleware(object):

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        context = environ['webapp.service_context'] = ServiceContext()
        # You could also do some thread-local registering of this
        # context at this point
        def replacement_start_response(status, headers):
            # Let the services rewrite status and headers, then hand
            # back the writer from the real start_response
            return context.start_response(
                start_response, status, headers)
        # Pass the replacement so the services actually see the response
        app_iter = self.app(environ, replacement_start_response)
        return context.app_iter(app_iter)

class ServiceContext(object):

    def __init__(self):
        self.services = []

    def get_service(self, name):
        # ... something I don't understand ...
        self.services.append(service)
        return service

    def start_response(self, start_response, status, headers):
        for service in self.services:
            if hasattr(service, 'munge_start_response'):
                status, headers = service.munge_start_response(
                    status, headers)
        return start_response(status, headers)

    def app_iter(self, app_iter):
        return app_iter


And ServiceContext should also ask services if they care to munge_body 
or something, and then pipe all calls to the writer and all the parts of 
app_iter into that service if so.  And it should let services catch 
exceptions.
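
To make the body-munging idea concrete, here is a minimal sketch of a context that pipes both write() calls and app_iter chunks through any interested service.  The names register, munge_body, munge_chunk and UppercaseService are made up for illustration, how a service is looked up by name is still left open, and exception handling by services is not covered:

```python
class UppercaseService:
    """Hypothetical service that rewrites headers and body."""

    def munge_start_response(self, status, headers):
        # The body may change size, so drop any Content-Length header.
        headers = [(k, v) for k, v in headers
                   if k.lower() != "content-length"]
        return status, headers

    def munge_body(self, chunk):
        return chunk.upper()


class ServiceContext:
    def __init__(self):
        self.services = []

    def register(self, service):
        # Hypothetical stand-in for get_service(); lookup is not modelled.
        self.services.append(service)
        return service

    def start_response(self, start_response, status, headers, exc_info=None):
        for service in self.services:
            if hasattr(service, "munge_start_response"):
                status, headers = service.munge_start_response(status, headers)
        return start_response(status, headers, exc_info)

    def munge_chunk(self, chunk):
        for service in self.services:
            if hasattr(service, "munge_body"):
                chunk = service.munge_body(chunk)
        return chunk

    def app_iter(self, app_iter):
        try:
            for chunk in app_iter:
                yield self.munge_chunk(chunk)
        finally:
            # Honour the WSGI close() convention on the wrapped iterable.
            close = getattr(app_iter, "close", None)
            if close is not None:
                close()


class ServiceMiddleware:
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        context = environ["webapp.service_context"] = ServiceContext()

        def replacement_start_response(status, headers, exc_info=None):
            writer = context.start_response(
                start_response, status, headers, exc_info)
            # Route write() calls through the same body hooks.
            return lambda data: writer(context.munge_chunk(data))

        return context.app_iter(self.app(environ, replacement_start_response))
```

An application would call environ['webapp.service_context'].register(...) before calling start_response, so the service is in place by the time the headers go out.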

 I guess you could make one Uber Middleware that could handle the 
 services' needs to rewrite output, watch for errors and finalize 
 resources, etc.
 
 
 Um, it's called a library of functions.  :)  WSGI was designed to make 
 it easy to use library calls to do stuff.  If you don't need the 
 obliviousness, then library calls (or service calls) are the Obvious Way 
 To Do It.

I do use library calls when possible; and even when not possible I 
(generally) try to make the middleware as small as possible, just 
handling the logic of the transformation.  But mostly libraries don't 
need to be discussed here, because they are simple ;)

There are perhaps a few places where standardization of some library 
manipulations would be useful.  E.g., get_cookies() and 
parse_querystring() in paste.wsgilib 
(http://svn.pythonpaste.org/Paste/trunk/paste/wsgilib.py) could be 
standardized, and then WSGI-based libraries that were interested in the 
request could probably retrieve the frameworks' parsed version of URL 
and cookie parameters.
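
As a rough illustration of what such a shared convention could look like, here is a sketch in which the parsed values are cached in the environ so that any library further down the stack can reuse them.  This is not the paste.wsgilib code; the environ keys example.parsed_querystring and example.cookies are invented for the example:

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl


def parse_querystring(environ):
    """Return the query string as a list of (name, value) pairs, cached."""
    source = environ.get("QUERY_STRING", "")
    cached = environ.get("example.parsed_querystring")
    # Reuse the cached value only if QUERY_STRING has not been rewritten.
    if cached is not None and cached[1] == source:
        return cached[0]
    parsed = parse_qsl(source, keep_blank_values=True)
    environ["example.parsed_querystring"] = (parsed, source)
    return parsed


def get_cookies(environ):
    """Return the request cookies as a SimpleCookie, cached."""
    source = environ.get("HTTP_COOKIE", "")
    cached = environ.get("example.cookies")
    if cached is not None and cached[1] == source:
        return cached[0]
    cookies = SimpleCookie()
    cookies.load(source)
    environ["example.cookies"] = (cookies, source)
    return cookies
```

Storing the source string next to the parsed value means middleware that rewrites QUERY_STRING or HTTP_COOKIE does not leave a stale cache behind.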

 Really, the only stuff that actually needs to be middleware, is stuff 
 that wraps an *oblivious* application; i.e., the application doesn't 
 know it's there.  If it's a service the application uses, then it 
 makes more sense to create a service management mechanism for 
 configuration and deployment of WSGI applications.


 Applications always care about the things around them, so any 
 convention that middleware and applications be unaware of each other 
 would rule out most middleware.
 
 
 Yes, exactly!  Now you understand me.  :)  If the application is what 
 wants the service, let it just call the service.  Middleware is 
 *overhead* in that case.


Re: [Web-SIG] Standardized configuration

2005-07-19 Thread mike bayer

While I'm not following every detail of this discussion, this line caught
my attention -

Ian Bicking said:
 Really, if you are building user-visible standard libraries, you are
 building a framework.

only because Fowler recently posted something that made me think about
this, where he distinguishes a framework as being something which
employs the inversion of control principle, as Paste does, versus a
library which does not: 
http://martinfowler.com/bliki/InversionOfControl.html .

I know there's a lot of discussion over "A Framework? Not a Framework?"
lately, largely in response to the recent meme "more frameworks == BAD"
that seems to be getting around these days; perhaps Fowler's distinction
is helpful...I hadn't thought of it that way before.


Re: [Web-SIG] Standardized configuration

2005-07-19 Thread ChunWei Ho
 (b)
 Have chain application = authmiddleware(fileserverapp)
 Use Handlers, as Ian suggested, and in the fileserverapp's init:
 Handlers(
   IfTest(method=GET,MimeOkForGzip=True, RunApp=gzipmiddleware(doGET)),
   IfTest(method=GET,MimeOkForGzip=False, RunApp=doGET),
   IfTest(method=POST,MimeOkForGzip=True, RunApp=gzipmiddleware(doPOST)),
   IfTest(method=POST,MimeOkForGzip=False, RunApp=doPOST),
   IfTest(method=PUT, RunApp=doPOST)
 )

It was Graham who suggested the use of Handlers initially. Sincere
apologies for my confusion.

 (c)
 Make gzipmiddleware a service in the following form:
 class gzipmiddleware:
     def __init__(self, application=None, configparam=None):
         self._application = application

     def __call__(self, environ, start_response, application=None,
                  configparam=None):
         if application and configparam are specified, use them instead
         of the init values
         do start_response
         call self._application(environ, start_response) as iterable
         get each iterator output and zip and yield it.
 
 This middleware is still compatible with PEP-333, but can also be used as:
 #on main application initialization, create a gzipservice and put it
 in environ without
 #specifying application or configparams for init():
 environ['service.gzip'] = gzipmiddleware()
 
 Modify fileserverapp to:
 def fileserverapp(environ, start_response):
     if(GET):
         if(mimetype ok for gzip):
             gzipservice = environ['service.gzip']
             return gzipservice(environ, start_response, doGET,
                                gzipconfigparams)
         else: return doGET(environ, start_response)
     if(POST):
         if(mimetype ok for gzip):
             gzipservice = environ['service.gzip']
             return gzipservice(environ, start_response, doPOST,
                                gzipconfigparams)
         else: return doPOST(environ, start_response)
     if(PUT): return doPUT(environ, start_response)
 
 The main difference here is that you don't have to initialize full
 application chains for each possible middleware-path for the request.
 This would be very useful if you had many middleware in the chain with
 many permutations as to which middleware are needed

 You could also instead put a service factory object into environ, it
 will return the gzipmiddleware object as a service if already exist,
 otherwise it will create it and then return it.
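
A minimal sketch of that service-factory idea, under some assumptions: the names ServiceFactory, GzipService and the environ key service.factory are invented here, and the gzip service buffers the whole body instead of zipping chunk by chunk as described above:

```python
import gzip


class GzipService:
    """Hypothetical gzip service, called with the application to wrap."""

    def __init__(self, level=6):
        self.level = level

    def __call__(self, environ, start_response, application):
        captured = {}

        def capture(status, headers, exc_info=None):
            captured["status"], captured["headers"] = status, headers
            return lambda data: None  # write() is not supported in this sketch

        # Buffer the whole response, then compress it in one go.
        body = b"".join(application(environ, capture))
        compressed = gzip.compress(body, compresslevel=self.level)
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() not in ("content-length", "content-encoding")]
        headers += [("Content-Encoding", "gzip"),
                    ("Content-Length", str(len(compressed)))]
        start_response(captured["status"], headers)
        return [compressed]


class ServiceFactory:
    """Creates each service on first request and reuses it afterwards."""

    def __init__(self):
        self._constructors = {}
        self._instances = {}

    def register(self, name, constructor):
        self._constructors[name] = constructor

    def get(self, name):
        if name not in self._instances:
            self._instances[name] = self._constructors[name]()
        return self._instances[name]
```

The main application would put one factory into environ['service.factory'] at startup, and fileserverapp would then call environ['service.factory'].get('gzip') only on the code paths that need it.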



Re: [Web-SIG] Standardized configuration

2005-07-18 Thread Ian Bicking
Graham Dumpleton wrote:
 My understanding from reading the WSGI PEP and examples like that above is
 that the WSGI middleware stack concept is very much tree like, but where at
 any specific node within the tree, one can only traverse into one child.
 I.e., a parent middleware component could make a decision to defer to one
 child or another, but there is no means of really trying out multiple
 choices until you find one that is prepared to handle the request. The only
 way around it seems to be to make the linear chain of nested applications
 longer and longer, something which to me just doesn't sit right. In some
 respects the need for the configuration scheme is in part to make that less
 unwieldy.

It's not at all limited to this, but these are simply the ones that are 
easy to configure, and can be inserted into a stack without changing the 
stack very much.

 What I am doing is making it acceptable for a handler to also return None.
 If this were returned by the highest level handler, it would equate to
 being the same as DECLINED, but within the context of middleware components
 it has a slightly relaxed meaning. Specifically, it indicates that that
 handler isn't returning a response, but not that it is indicating that the
 request as a whole is being DECLINED causing a return to Apache.

Incidentally, I'd typically use an exception when the return value 
didn't include the semantics I wanted, but that might not be a problem here.

 One last example is what a session based login mechanism might look like,
 since this was one of the examples posed in the initial discussion. Here
 you might have a handler for a whole directory which contains:
 
 _userDatabase = _users.UserDatabase()

 handler = Handlers(
     IfLocationMatches(r"\.bak(/.*)?$",NotFound()),
     IfLocationMatches(r"\.tmpl(/.*)?$",NotFound()),

     IfLocationIsADirectory(ExternalRedirect('index.html')),

     # Create session and stick it in request object.
     CreateUserSession(),

     # Login form shouldn't require user to be logged in to access it.
     IfLocationMatches(r"^/login\.html(/.*)?$",CheetahModule()),

     # Serve requests against login/logout URLs and otherwise
     # don't let request proceed if user not yet authenticated.
     # Will redirect to login form if not authenticated.
     FormAuthentication(_userDatabase,"login.html"),

     SetResponseHeader('Pragma','no-cache'),
     SetResponseHeader('Cache-Control','no-cache'),
     SetResponseHeader('Expires','-1'),

     IfLocationMatches(r"/.*\.html(/.*)?$",CheetahModule()),
 )
 
 Again, one has done away with the need for a configuration file, as the
 code itself specifies what is required, along with the constraints as to
 what order things should be done in.
 
 Another thing this example shows is that handlers, when they return None
 due to not returning an actual response, can still add to the response
 headers in the way of special cookies as required by sessions, or headers
 controlling caching etc.

This is not possible in WSGI middleware if handled in a chain-like 
fashion.  Nested middleware can do this, of course.

This kind of chaining would be necessary if services were used, as 
many services have to affect the response, and there's no WSGI-related 
spec about where or how they would do that.  Though I haven't digested 
all the long emails lately...

 In terms of late binding of which handler is executed, the PythonModule
 handler is one example in that it selects which Python module to load only
 when the request is being handled. Another example of late construction of
 an instance of a handler in what I am doing, albeit the same type, is:
 
   class Handler:

     def __init__(self,req):
       self.__req = req

     def __call__(self,name="value"):
       self.__req.content_type = "text/html"
       self.__req.send_http_header()
       self.__req.write("<html><body>")
       self.__req.write("<p>name=%r</p>"%cgi.escape(name))
       self.__req.write("</body></html>")
       return apache.OK

   handler = IfExtensionEquals("html",HandlerInstance(Handler))
 
 First off the HandlerInstance object is only triggered if the request
 against this specific file based resource was by way of a .html
 extension. When it is triggered, it is only at that point that an instance
 of Handler is created, with the request object being supplied to the
 constructor.

Incidentally, I'm doing something a little like that with the 
filebrowser example in Paste:

http://svn.pythonpaste.org/Paste/trunk/examples/filebrowser/web/__init__.py

Looking at it now, it's not clear where that's happening, but (in 
application()) context.path(path) creates a WSGI application using a 
class based on the extension/expected mime type.  So the dispatching is 
similar.

 To round this off, the special Handlers handler only contains the
 following code. Pretty simple, but makes construction of the component
 hierarchy a bit easier in my mind when multiple things need to be done in
 turn where nesting
 

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Ian Bicking
Chris McDonough wrote:
Because middleware can't be introspected (generally), this makes things 
like configuration schemas very hard to implement.  It all needs to be 
late-bound.
 
 
 The pipeline itself isn't really late bound.  For instance, if I was to
 create a WSGI middleware pipeline something like this:
 
   server --> session --> identification --> authentication -->
          --> challenge --> application
 
 ... session, identification, authentication, and challenge are
 middleware components (you'll need to imagine their implementations).
 And within a module that started a server, you might end up doing
 something like:
 
 def configure_pipeline(app):
     return SessionMiddleware(
         IdentificationMiddleware(
             AuthenticationMiddleware(
                 ChallengeMiddleware(app))))

 if __name__ == '__main__':
     app = Application()
     pipeline = configure_pipeline(app)
     server = Server(pipeline)
     server.serve()

This is what Paste does in configuration, like:

middleware.extend([
 SessionMiddleware, IdentificationMiddleware,
 AuthenticationMiddleware, ChallengeMiddleware])

This kind of middleware takes a single argument, which is the 
application it will wrap.  In practice, this means all the other 
parameters go into lazily-read configuration.
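
Turning such a list into the nested pipeline is only a few lines.  A sketch, assuming the first entry in the list should end up outermost (I'm not claiming this is the order Paste itself uses); build_pipeline and tagging_middleware are names made up for the example:

```python
def build_pipeline(app, middleware):
    """Wrap app in each single-argument middleware factory.

    The first factory in the list becomes the outermost wrapper, so a
    request passes through the list in the order it is written.
    """
    for factory in reversed(middleware):
        app = factory(app)
    return app


def tagging_middleware(name):
    """Hypothetical middleware factory that records its name in environ."""
    def factory(app):
        def wrapped(environ, start_response):
            environ.setdefault("example.trace", []).append(name)
            return app(environ, start_response)
        return wrapped
    return factory
```

With middleware written this way, build_pipeline(app, [A, B, C]) is equivalent to A(B(C(app))).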

You can also define a framework (a plugin to Paste), which in addition 
to finding an app can also add middleware; basically embodying all the 
middleware that is typical for a framework.

Paste is really a deployment configuration.  Well, that as well as stuff 
to deploy.  And two frameworks.  And whatever else I feel a need or 
desire to throw in there.


Note also that parts of the pipeline are very much late bound.  For 
instance, the way I implemented Webware (and Wareweb) each servlet is a 
WSGI application.  So while there's one URLParser application, the 
application that actually handles the request differs per request.  If 
you start hanging more complete applications (that might have their own 
middleware) at different URLs, then this happens more generally.

There's a newish poorly tested feature where you can do urlmap['/path'] 
= 'config_file.conf' and it'll hang the application described by that 
configuration file at that URL.

 The pipeline is static.  When a request comes in, the pipeline itself is
 already constructed.  I don't really want a way to prevent improper
 pipeline construction at startup time (right now anyway), because
 failures due to missing dependencies will be fairly obvious.

I think that's reasonable too; it's what Paste implements now.

 But some elements of the pipeline at this level of factoring do need to
 have dependencies on availability and pipeline placement of the other
 elements.  In this example, proper operation of the authentication
 component depends on the availability and pipeline placement of the
 identification component.  Likewise, the identification component may
 depend on values that need to be retrieved from the session component.

Yes; and potentially you could have several middlewares implementing the 
same functionality for a single request, e.g., if you had different kinds 
of authentication for part of your site/application; that might shadow 
authentication further up the stack.

 I've just seen Phillip's post where he implies that this kind of
 fine-grained component factoring wasn't really the initial purpose of
 WSGI middleware.  That's kind of a bummer. ;-)

Well, I don't understand the services he's proposing yet.  I'm quite 
happy with using middleware the way I have been, so I'm not seeing a 
problem with it, and there's lots of benefits.

 Factoring middleware components in this way seems to provide clear
 demarcation points for reuse and maintenance.  For example, I imagined a
 declarative security module that might be factored as a piece of
 middleware here:  http://www.plope.com/Members/chrism/decsec_proposal .

Yes, I read that before; I haven't quite figured out how to digest it, 
though.  This is probably in part because of the resource-based 
orientation of Zope, and WSGI is application-based, where applications 
are rather opaque and defined only in terms of function.

 Of course, this sort of thing doesn't *need* to be middleware.  But
 making it middleware feels very right to me in terms of being able to
 deglom nice features inspired by Zope and other frameworks into pieces
 that are easy to recombine as necessary.  Implementations as WSGI
 middleware seems a nice way to move these kinds of features out of our
 respective applications and into more application-agnostic pieces that
 are very loosely coupled, but perhaps I'm taking it too far.

Certainly these pieces of code can apply to multiple applications and 
disparate systems.  The most obvious instance right now that I think of 
is a WSGI WebDAV server (and someone's working on that for Google Summer 
of Code), which should be implemented pretty framework-free, simply 
because a 

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Ian Bicking
Phillip J. Eby wrote:
 At 01:57 PM 7/11/2005 -0500, Ian Bicking wrote:
 
 Lately I've been thinking about the role of Paste and WSGI and whatnot.
   Much of what makes a Paste component Pastey is configuration;
 otherwise the bits are just independent pieces of middleware, WSGI
 applications, etc.  So, potentially if we can agree on configuration, we
 can start using each other's middleware more usefully.
 
 
 I'm going to go ahead and throw my hat in the ring here, even though 
 I've been trying to avoid it.
 
 Most of the stuff you are calling middleware really isn't, or at any 
 rate it has no reason to be middleware.

Well, it is if you implement it that way ;)  I think I'd prefer the term 
filter actually; less bad connotations for people.  But that's really 
unrelated to your point.

 What I think you actually need is a way to create WSGI application 
 objects with a context object.  The context object would have a 
 method like get_service(name), and if it didn't find the service, it 
 would ask its parent context, and so on, until there's no parent context 
 to get it from.  The web server would provide a way to configure a root 
 or default context.

I guess I'm treating the request environment as that context.  I don't 
really see the problem with that...?

 This would allow you to do early binding of services without needing to 
 do lookups on every web hit.  E.g.::
 
 class MyApplication:
     def __init__(self, context):
         self.authenticate = context.get_service(
             'security.authentication')
     def __call__(self, environ, start_response):
         user = self.authenticate(environ)
 
 So, you would simply register an application *factory* with the web 
 server instead of an application instance, and it invokes it on the 
 context object in order to get the right thing.
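
For concreteness, a minimal sketch of the context lookup being described, with the parent-chain fallback spelled out.  The Context class and its constructor arguments are assumptions for illustration, not an existing API:

```python
class Context:
    """Hypothetical service context that falls back to a parent context."""

    def __init__(self, parent=None, services=None):
        self.parent = parent
        self.services = dict(services or {})

    def get_service(self, name):
        # Walk up the chain until a context knows the service.
        context = self
        while context is not None:
            if name in context.services:
                return context.services[name]
            context = context.parent
        raise LookupError("no service named %r" % name)


class MyApplication:
    def __init__(self, context):
        # Early binding: the lookup happens once, not on every web hit.
        self.authenticate = context.get_service("security.authentication")

    def __call__(self, environ, start_response):
        user = self.authenticate(environ)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [("hello %s" % user).encode("utf-8")]
```

The server would hold the root context and call MyApplication(context) itself, which is what makes MyApplication a factory here.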

I don't see the distinction between a factory and an instance.  Or at 
least, it's easy to translate from one to the other.

In many cases, the middleware is modifying or watching the application's 
output.  For instance, catching a 401 and turning that into the 
appropriate login -- which might mean producing a 401, a redirect, a 
login page via internal redirect, or whatever.

I guess you could make one Uber Middleware that could handle the 
services' needs to rewrite output, watch for errors and finalize 
resources, etc.  This isn't unreasonable, and I've kind of expected one 
to evolve at some point.  But you'll have to say more to get me to see 
how services is a better way to manage this.

 Really, the only stuff that actually needs to be middleware, is stuff 
 that wraps an *oblivious* application; i.e., the application doesn't 
 know it's there.  If it's a service the application uses, then it makes 
 more sense to create a service management mechanism for configuration 
 and deployment of WSGI applications.

Applications always care about the things around them, so any convention 
that middleware and applications be unaware of each other would rule out 
most middleware.

 However, I think that, again, the key part of configuration that 
 actually relates to WSGI here is *deployment* configuration, such as 
 which service implementations to use for the various kinds of services.  
 Configuration *of* the services can and should be private to those 
 services, since they'll have implementation-specific needs.  (This 
 doesn't mean, however, that a configuration service couldn't be part 
 of the family of WSGI service interfaces.)
 
 I hope this isn't too vague; I've been wanting to say something about 
 this since I saw your blog post about doing transaction services in 
 WSGI, as that was when I first understood why you were making everything 
 into middleware.  (i.e., to create a poor man's substitute for 
 placeful services and utilities as found in PEAK and Zope 3.)

What do they provide that middleware does not?

 Anyway, I don't have a problem with trying to create a framework-neutral 
 (in theory, anyway) component system, but I think it would be a good 
 idea to take lessons from ones that have solved this problem well, and 
 then create an extremely scaled-down version, rather than kludging 
 application configuration into what's really per-request data.

Per-request or not, from the application's side I don't see the 
difference.  It is convenient to put configuration into the request, 
though paste.CONFIG is also provided as a global variable that 
represents the current request's configuration.

In practice the configuration is usually identical for all requests, but 
I haven't seen any reason to enforce this.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  / http://blog.ianbicking.org


Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Graham Dumpleton

On 17/07/2005, at 6:16 PM, Ian Bicking wrote:
 The pipeline itself isn't really late bound.  For instance, if I was to
 create a WSGI middleware pipeline something like this:

   server --> session --> identification --> authentication -->
          --> challenge --> application

 ... session, identification, authentication, and challenge are
 middleware components (you'll need to imagine their implementations).
 And within a module that started a server, you might end up doing
 something like:

 def configure_pipeline(app):
     return SessionMiddleware(
         IdentificationMiddleware(
             AuthenticationMiddleware(
                 ChallengeMiddleware(app))))

 This is what Paste does in configuration, like:

 middleware.extend([
  SessionMiddleware, IdentificationMiddleware,
  AuthenticationMiddleware, ChallengeMiddleware])

 This kind of middleware takes a single argument, which is the
 application it will wrap.  In practice, this means all the other
 parameters go into lazily-read configuration.

Sorry, but you have given me a nice opening here to hijack this
conversation a bit and make some comments and pose some questions about
WSGI that I have been thinking on for a while.

My understanding from reading the WSGI PEP and examples like that above
is that the WSGI middleware stack concept is very much tree like, but
where at any specific node within the tree, one can only traverse into
one child. I.e., a parent middleware component could make a decision to
defer to one child or another, but there is no means of really trying
out multiple choices until you find one that is prepared to handle the
request. The only way around it seems to be to make the linear chain of
nested applications longer and longer, something which to me just
doesn't sit right. In some respects the need for the configuration
scheme is in part to make that less unwieldy.

To explain what I am going on about, I am going to use examples from
some work I have been doing with componentised construction of request
handler stacks in mod_python. I will not use the term middleware here,
as I note that someone in this discussion has already made the point
that the components being talked about here aren't really middleware,
and in what I have been doing I have been taking it to an even more
fine grained level.

I believe I can draw a reasonable analogy to mod_python as, at the
simplest, a mod_python request handler and a WSGI application are both
providing the most basic function of responding to a request; they just
do so in different ways.

Normally in mod_python a handler can return an OK response, an error
response or a DECLINED response. The DECLINED response is special and
indicates to mod_python that any further content handlers defined by
mod_python should be skipped and control passed back up to Apache so
that it can potentially serve up a matched static file.

What I am doing is making it acceptable for a handler to also return
None. If this were returned by the highest level handler, it would
equate to being the same as DECLINED, but within the context of
middleware components it has a slightly relaxed meaning. Specifically,
it indicates that that handler isn't returning a response, but not that
it is indicating that the request as a whole is being DECLINED causing
a return to Apache.

Doing this means that within the context of a tree based middleware
stack, one can introduce a list of handlers at a particular node. Each
handler in the list will in turn be tried to see if it wishes to handle
the request, returning either an error or valid response, or None. If
it doesn't return a response, the next handler in the list would be
tried until one is found, and if one isn't, then None is passed back to
the parent middleware component.
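
The try-each-handler-in-turn rule can be sketched in a few lines.  This is only an illustration and not the mod_python code referred to in this message: the request is modelled as a plain dict with a 'path' key, and responses are bare integer status codes, both of which are assumptions made for the example:

```python
import re


class Handlers:
    """Try each handler in turn until one returns something other than None."""

    def __init__(self, *handlers):
        self.handlers = handlers

    def __call__(self, req):
        for handler in self.handlers:
            result = handler(req)
            if result is not None:
                return result
        # Nobody produced a response; let the parent node decide.
        return None


class IfLocationMatches:
    """Run the wrapped handler only when the request path matches."""

    def __init__(self, pattern, handler):
        self.pattern = re.compile(pattern)
        self.handler = handler

    def __call__(self, req):
        if self.pattern.search(req["path"]):
            return self.handler(req)
        return None
```

Because Handlers itself returns None when nothing matched, one Handlers node can be nested inside another without any special casing.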

This all means I could write something like:

   handler = Handlers(
     IfLocationMatches(r"/_",NotFound()),
     IfLocationMatches(r"\.py(/.*)?$",NotFound()),
     PythonModule(),
   )

This handler might be associated with any access to a directory as a
whole. In iterating over each of the handlers it filters out requests
to files that we don't want to provide access to, with the final
handler deferring to a handler within a Python module associated with
the actual resource being requested. Although Apache provides means of
filtering out requests, it only works properly for physical files and
not virtual resources specified by way of the path info.

For example, a file page.tmpl (a Cheetah file) could have a page.py
file that defines:

   handler = Handlers(
     IfLocationMatches(r"\.bak(/.*)?$",NotFound()),
     IfLocationMatches(r"\.tmpl(/.*)?$",NotFound()),
     IfLocationMatches(r"/.*\.html(/.*)?$",CheetahModule()),
   )

Again, more filtering and finally a handler is triggered which knows how
to trigger a precompiled Cheetah template stored as a Python module.

All in all a similar tree 

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Chris McDonough
On Sun, 2005-07-17 at 03:16 -0500, Ian Bicking wrote:
 This is what Paste does in configuration, like:
 
 middleware.extend([
  SessionMiddleware, IdentificationMiddleware,
  AuthenticationMiddleware, ChallengeMiddleware])
 
 This kind of middleware takes a single argument, which is the 
 application it will wrap.  In practice, this means all the other 
 parameters go into lazily-read configuration.

I'm finding it hard to imagine a reason to have another kind of
middleware.

Well, actually that's not true.  In noodling about this, I did think it
would be kind of neat in a twisted way to have decision middleware
like:

class DecisionMiddleware:
    def __init__(self, apps):
        self.apps = apps

    def __call__(self, environ, start_response):
        app = self.choose(environ)
        for chunk in app(environ, start_response):
            yield chunk

    def choose(self, environ):
        # Return the chosen app so __call__ can invoke it
        return some_decision_function(self.apps, environ)

I can imagine using this pattern as a decision point for a WSGI pipeline
serving multiple application end-points (perhaps based on URL matching
of the PATH_INFO in environ).
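
One possible decision function, sketched here as longest-prefix matching on PATH_INFO.  The class name PrefixDecisionMiddleware and the (prefix, app) pair format are assumptions for the example; a real dispatcher would probably also move the matched prefix from PATH_INFO to SCRIPT_NAME, which this sketch does not do:

```python
def not_found(environ, start_response):
    """Fallback application used when no prefix matches."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not Found"]


class PrefixDecisionMiddleware:
    def __init__(self, apps):
        # apps is a list of (prefix, app) pairs; try longer prefixes first.
        self.apps = sorted(apps, key=lambda item: len(item[0]), reverse=True)

    def __call__(self, environ, start_response):
        app = self.choose(environ)
        return app(environ, start_response)

    def choose(self, environ):
        path = environ.get("PATH_INFO", "")
        for prefix, app in self.apps:
            # Match the prefix itself or anything below it, so that
            # "/blog" matches "/blog/2005" but not "/blogroll".
            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
                return app
        return not_found
```

All the candidate applications are constructed up front, so the only per-request work is the prefix scan.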

But by and large, most middleware components seem to be just wrappers
for the next application in the chain.  There seem to be two types of
middleware that take a single application object as a parameter to
their constructor.  There is "decorator" middleware, where you want to
add something to the environment for an application to find later, and
"action" middleware, which does some rewriting of the body or the
response headers before the response is sent back to the client.  Some
of this kind of middleware does both.
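
To illustrate the two flavours side by side, a minimal sketch.  The environ key example.user and the X-Example header are made up, and the action middleware skips the close() handling a complete implementation would want:

```python
def decorator_middleware(app):
    """Adds something to environ for the application to find later."""
    def wrapped(environ, start_response):
        environ["example.user"] = environ.get("REMOTE_USER", "anonymous")
        return app(environ, start_response)
    return wrapped


def action_middleware(app):
    """Rewrites the response headers and body on the way out."""
    def wrapped(environ, start_response):
        def replacement(status, headers, exc_info=None):
            # Add a header before passing the call on to the real one.
            headers = list(headers) + [("X-Example", "1")]
            return start_response(status, headers, exc_info)

        def body():
            for chunk in app(environ, replacement):
                yield chunk.upper()

        return body()
    return wrapped
```

Both take only the application to wrap, so they stack in either order: action_middleware(decorator_middleware(app)).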

 You can also define a framework (a plugin to Paste), which in addition 
 to finding an app can also add middleware; basically embodying all the 
 middleware that is typical for a framework.

This appears to be what I'm trying to do too, which is why I'm intrigued
by Paste.

OTOH, I'm not sure that I want my framework to find an app for me.
I'd like to be able to define pipelines that include my app, but I'd
typically just want to statically declare it as the end point of a
pipeline composed of service middleware.  I should look at Paste a
little more to see if it has the same philosophy or if I'm
misunderstanding you.

 Paste is really a deployment configuration.  Well, that as well as stuff 
 to deploy.  And two frameworks.  And whatever else I feel a need or 
 desire to throw in there.

Yeah.  FWIW, as someone who has recently taken a brief look at Paste, I
think it would be helpful (at least for newbies) to partition out the
bits of Paste which are meant to be deployment configuration from the
bits that are meant to be deployed.  Zope 2 fell into the same trap
early on, and never recovered.  For example, ZPublisher (nee Bobo) was
always meant to be able to be useful outside of Zope, but in practice it
never happened because nobody could figure out how to disentangle it
from its ever-increasing dependencies on other software only found in a
Zope checkout.  In the end, nobody even remembered what its dependencies
were *supposed* to be.  If you ask ten people, you'd get ten different
answers.

I also think that the rigor of separating out different components helps
to make the software stronger and more easily understood in bite-sized
pieces.  Unfortunately, separating them makes configuration tough, but I
think finding the right way to do that is exactly what we're trying to
work out here.

 Note also that parts of the pipeline are very much late bound.  For 
 instance, the way I implemented Webware (and Wareweb) each servlet is a 
 WSGI application.  So while there's one URLParser application, the 
 application that actually handles the request differs per request.  If 
 you start hanging more complete applications (that might have their own 
 middleware) at different URLs, then this happens more generally.

Well, if you put the decider in middleware itself, all of the
middleware components in each pipeline could still be at least
constructed early.  I'm pretty sure this doesn't really strictly qualify
as early binding but it's not terribly dynamic either.  It also makes
configuration pretty straightforward.  At least I can imagine a
declarative syntax for configuring pipelines this way.

I'm pretty sure you're not advocating it, but in case you are, I'm not
sure it adds as much value as it removes to be able to have a dynamic
middleware chain whereby new middleware elements can be added on the
fly to a pipeline after a request has begun.  That is *very* late
binding to me and it's impossible to configure declaratively.

  But some elements of the pipeline at this level of factoring do need to
  have dependencies on availability and pipeline placement of the other
  elements.  In this example, proper operation of the authentication
  component depends on the availability and pipeline placement of the
  identification component.  Likewise, the identification component may
  

Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Phillip J. Eby
At 07:29 AM 7/17/2005 -0400, Chris McDonough wrote:
I'm a bit confused because one of the canonical examples of
how WSGI middleware is useful seems to be the example of implementing a
framework-agnostic sessioning service.  And for that sessioning service
to be useful, your application has to be able to depend on its
availability so it can't be oblivious.

Exactly.  As soon as you start trying to have configured services, you are 
creating Yet Another Framework.  Which isn't a bad thing per se, except 
that it falls outside the scope of PEP 333.  It deserves a separate PEP, I 
think, and a separate implementation mechanism than being crammed into the 
request environment.  These things should be allowed to be static, so that 
an application can do some reasonable setup, and so that you don't have 
per-request overhead to shove ninety services into the environment.

Also, because we are dealing not with basic plumbing but with making a nice 
kitchen, it seems to me we can afford to make the fixtures nice.  That is, 
for an add-on specification to WSGI we don't need to adhere to the "let it 
be ugly for apps if it makes the server easier" principle that guided PEP 
333.  The assumption there was that people would mostly port existing 
wrappers over HTTP/CGI to be wrappers over WSGI.  But for services, we are 
talking about an actual framework to be used by application developers 
directly, so more user-friendliness is definitely in order.

For WSGI itself, the server-side implementation has to be very server 
specific.  But the bulk of a service stack could be implemented once (e.g. 
as part of wsgiref), and then just used by servers.  So, we don't have to 
worry as much about making it easy for server people to implement, except 
for any server-specific choices about how configuration might be 
stacked.  (For example, in a filesystem-oriented server like Apache, you 
might want subdirectories to inherit services defined in parent directories.)


OTOH, the primary benefit -- to me, at least -- of modeling services as
WSGI middleware is the fact that someone else might be able to use my
service outside the scope of my projects (and thus help maintain it and
find bugs, etc).  So if I've got the wrong concept of what kinds of
middleware that I can expect normal people to use, I don't want to go
very far down that road without listening carefully to Phillip.  Perhaps
I'll have a shot at influencing the direction of WSGI to make it more
appropriate for this sort of thing or maybe we'll come up with a better
way of doing it.

Zope 3 is a component system much like what I'm after, and I may just
end up using it wholesale.  But my immediate problem with Zope 3 is that
like Zope 2, it's a collection of libraries that have dependencies on
other libraries that are only included within its own checkout and don't
yet have much of a life of their own.  It's not really a technical
problem, it's a social one... I'd rather have a somewhat messy framework
with a lot of diversity composed of wildly differing component
implementations that have a life of their own than to be trapped in a
clean, pure world where all the components are used only within that
world.

I suspect there's a middle ground here somewhere.

Right; I'm suggesting that we grow a "WSGI Deployment" or "WSGI Stack" 
specification that includes a simple way to obtain services (using the Zope 
3 definition of "service" as simply a named component).  This would form 
the basis for various "WSGI Service" specifications.  And, for existing 
frameworks there's at least some possibility of integrating with 
this stack, since PEAK and Zope 3 both already have ways to define and 
acquire named services, so it might be possible to define the spec in such 
a way that their implementations could be reused by wrapping them in a thin 
WSGI Stack adapter.  Similarly, if there are any other frameworks out 
there that offer similar functionality, then they ought to be able to play 
too, at least in principle.



Re: [Web-SIG] Standardized configuration

2005-07-17 Thread Phillip J. Eby
At 03:28 AM 7/17/2005 -0500, Ian Bicking wrote:
Phillip J. Eby wrote:
What I think you actually need is a way to create WSGI application 
objects with a context object.  The context object would have a 
method like get_service(name), and if it didn't find the service, it 
would ask its parent context, and so on, until there's no parent context 
to get it from.  The web server would provide a way to configure a root 
or default context.

I guess I'm treating the request environment as that context.  I don't 
really see the problem with that...?

It puts a layer in the request call stack for each service you want to 
offer, versus *no* layers for an arbitrary number of services.  It adds 
work to every request to put stuff into the environment, then take it out 
again, versus just getting what you want in the first place.


In many cases, the middleware is modifying or watching the application's 
output.  For instance, catching a 401 and turning that into the 
appropriate login -- which might mean producing a 401, a redirect, a login 
page via internal redirect, or whatever.

And that would be legitimate middleware, except I don't think that's what 
you really want for that use case.  What you want is an authentication 
service that you just call to say, "I need a login" and get the login 
information from, and return its return value so that it does 
start_response for you and sends the right output.

The difference is obliviousness; if you want to *wrap* an application not 
written to use WSGI services, then it makes sense to make it 
middleware.  If you're writing a new application, just have it use 
components instead of mocking up a 401 just so you can use the existing 
middleware.

Notice, by the way, that it's trivial to create middleware that detects the 
401 and then *invokes the service*.  So, it's more reusable to make 
services be services, and middleware be wrappers to apply services to 
oblivious applications.
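
A rough sketch of that last point, under some loud assumptions: the
"service" is modeled as a callable taking (environ, start_response) and
returning the response body, the wrapped application calls
start_response before it returns (so this would not handle generator
apps that start the response lazily, nor the write() callable), and the
names LoginOn401, oblivious_app and login_service are invented for the
example:

```python
class LoginOn401:
    """Hypothetical wrapper applying a login service to an oblivious app."""

    def __init__(self, app, login_service):
        self.app = app
        self.login_service = login_service

    def __call__(self, environ, start_response):
        captured = []

        def capture(status, headers, exc_info=None):
            # Hold back the response start until we know if it is a 401.
            captured[:] = [status, headers]

        body = self.app(environ, capture)
        if captured and captured[0].startswith('401'):
            # Throw away the app's output and let the service respond.
            if hasattr(body, 'close'):
                body.close()
            return self.login_service(environ, start_response)
        # Not a 401: pass the original response through untouched.
        start_response(captured[0], captured[1])
        return body


def oblivious_app(environ, start_response):
    # Knows nothing about services; it just says "401".
    start_response('401 Unauthorized', [('Content-Type', 'text/plain')])
    return [b'go away']


def login_service(environ, start_response):
    # Stand-in service: redirect to a made-up login URL.
    start_response('302 Found', [('Location', '/login')])
    return [b'']


wrapped = LoginOn401(oblivious_app, login_service)
```

An application written with the service in mind would skip the wrapper
and simply return login_service(environ, start_response) itself.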


I guess you could make one Uber Middleware that could handle the services' 
needs to rewrite output, watch for errors and finalize resources, etc.

Um, it's called a library of functions.  :)  WSGI was designed to make it 
easy to use library calls to do stuff.  If you don't need the 
obliviousness, then library calls (or service calls) are the Obvious Way To 
Do It.


   This isn't unreasonable, and I've kind of expected one to evolve at 
 some point.  But you'll have to say more to get me to see how services 
 is a better way to manage this.

I'm saying that middleware can use services, and applications can use 
services.  Making applications *have to* use middleware in order to use the 
services is wasteful of both computer time and developer brainpower.  Just 
let them use services directly when the situation calls for it, and you can 
always write middleware to use the services when you encounter the 
occasional (and ever-rarer with time) oblivious application.


Really, the only stuff that actually needs to be middleware, is stuff 
that wraps an *oblivious* application; i.e., the application doesn't know 
it's there.  If it's a service the application uses, then it makes more 
sense to create a service management mechanism for configuration and 
deployment of WSGI applications.

Applications always care about the things around them, so any convention 
that middleware and applications be unaware of each other would rule out 
most middleware.

Yes, exactly!  Now you understand me.  :)  If the application is what wants 
the service, let it just call the service.  Middleware is *overhead* in 
that case.


I hope this isn't too vague; I've been wanting to say something about 
this since I saw your blog post about doing transaction services in WSGI, 
as that was when I first understood why you were making everything into 
middleware.  (i.e., to create a poor man's substitute for placeful 
services and utilities as found in PEAK and Zope 3.)

What do they provide that middleware does not?

Well, some services may be things the application needs only when it's 
being initially configured.  Or maybe the service is something like a 
scheduler that gives timed callbacks.  There are lots of non-per-request 
services that make sense, so forcing service access to be only through the 
environment makes for cruftier code, since you now have to keep track of 
whether you've been called before, and then do any setup during your first 
web hit.  For that matter, some service configuration might need to be 
dynamically determined, based on the application object requesting it.

But the main thing they provide that middleware does not is simplicity and 
ease of use.  I understand your desire to preserve the appearance of 
neutrality, but you are creating new web frameworks here, and making them 
ugly doesn't make them any less of a framework.  :)

What's worse is that by tying the service access mechanism to the request 
environment, you're effectively locking out frameworks like PEAK and Zope 3 
from 

Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Jp Calderone
http://twistedmatrix.com/pipermail/twisted-python/2005-July/010902.html might 
be of interest on this topic.

Jp


Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Ian Bicking
Chris McDonough wrote:
 I've also been putting a bit of thought into middleware configuration,
 although maybe in a different direction.  I'm not too concerned yet
 about being able to introspect the configuration of an individual
 component.  Maybe that's because I haven't thought about the problem
 enough to be concerned about it.  In the meantime, though, I *am*
 concerned about being able to configure a middleware pipeline easily
 and have it work.

There's nothing in WSGI to facilitate introspection.  Sometimes that 
seems annoying, though I suspect lots of headaches are removed because 
of it, and I haven't found it to be a stopper yet.  The issue I'm 
interested in is just how to deliver configuration to middleware.

Because middleware can't be introspected (generally), this makes things 
like configuration schemas very hard to implement.  It all needs to be 
late-bound.

 I've been attempting to divine a declarative way to configure a pipeline
 of WSGI middleware components.  This is simple enough through code,
 except that at least in terms of how I'm attempting to factor my
 middleware, some components in the pipeline may have dependencies on
 other pipeline components.

At least in Paste, you just have to set up the stack properly.  It would 
be cool if middleware could detect the presence of its prerequisites, 
and add the prerequisites if they weren't present; I don't think that's 
terribly complicated, but I haven't actually tried it.  Mostly you'd 
test for a key, and if not present then you'd instantiate the middleware 
and reinvoke.
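
As an untested-in-production sketch of that idea: the environ keys
('example.session', 'example.user') and both class bodies below are
invented for illustration and are not Paste's actual middleware.  The
dependent component tests for its prerequisite's key and, if it is
missing, wraps itself in the prerequisite and reinvokes:

```python
class SessionMiddleware:
    """Stand-in prerequisite that publishes a session dict in environ."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        environ.setdefault('example.session', {})
        return self.app(environ, start_response)


class IdentificationMiddleware:
    """Depends on SessionMiddleware having run earlier in the stack."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if 'example.session' not in environ:
            # Prerequisite missing: instantiate it around ourselves and
            # reinvoke.  The second pass finds the key and falls through.
            return SessionMiddleware(self)(environ, start_response)
        environ['example.user'] = environ['example.session'].get('user')
        return self.app(environ, start_response)


def show_user(environ, start_response):
    # Tiny application that reports what identification found.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [repr(environ['example.user']).encode('ascii')]


stack = IdentificationMiddleware(show_user)
```

This version builds a new SessionMiddleware on every request that lacks
one; caching the wrapper after the first miss would be an obvious
refinement.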

 For example, it would be useful in some circumstances to create separate
 WSGI components for user identification and user authorization.  The
 process of identification -- obtaining user credentials from a request
 -- and user authorization  -- ensuring that the user is who he says he
 is by comparing the credentials against a data source -- are really
 pretty much distinct operations.  There might also be a challenge
 component which forces a login dialog.

I've always thought that a 401 response is a good way of indicating 
that, but not everyone agrees.  (The idea being that the middleware 
catches the 401 and possibly translates it into a redirect or something.)

 In practice, I don't know if this is a truly useful separation of
 concerns that need to be implemented in terms of separate components in
 the middleware pipeline (I see that paste.login conflates them), it's
 just an example.  

Do you mean identification and authentication (you mention authorization 
above)?  I think authorization is different, and is conflated in 
paste.login, but I don't have many use cases where it's a useful 
distinction.  I guess there's a number of ways of getting a username and 
password; and to some degree the  authenticator object works at that 
level of abstraction.  And there's a couple other ways of authenticating 
a user as well (public keys, IP address, etc).  I've generally used a 
user manager object for this kind of abstraction, with subclassing for 
different kinds of generality (e.g., the basic abstract class makes 
username/password logins simple, but a subclass can override that and 
authenticate based on anything in the request).

Maybe there's a better term; the fact that these two words start with "auth" 
causes all kinds of confusion.  Conflating identification and 
authentication isn't so bad, but authentication and authorization is 
really bad (but common).

 But at very least it would keep each component simpler
 if the concerns were factored out into separate pieces.
 
 But in the example I present, the authentication component depends
 entirely on the result of the identification component.  It would be
 simple enough to glom them together by using a distinct environment key
 for the identification component results and have the authentication
 component look for that key later in the middleware result chain, but
 then it feels like you might as well have written the whole process
 within one middleware component because the coupling is pretty strong.
 
 I have a feeling that adapters fit in here somewhere, but I haven't
 really puzzled that out yet.  I'm sure this has been discussed somewhere
 in the lifetime of WSGI but I can't find much in this list's archives.

No, I don't think so.  It was something I experimented with in 
paste.login (purely intellectually, I haven't used it in a real app), 
and Aaron Lav did a little work on it as well, but until it gets some 
use it's hard to know how complete it is.

As long as it's properly partitioned, I don't think it's a terribly hard 
problem.  That is, with proper partitioning the pieces can be 
recombined, even if the implementations aren't general enough for all 
cases.  Apache and Zope 2 authentication being examples where the 
partitioning was done improperly.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  / http://blog.ianbicking.org

Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Phillip J. Eby
At 01:57 PM 7/11/2005 -0500, Ian Bicking wrote:
Lately I've been thinking about the role of Paste and WSGI and whatnot.
   Much of what makes a Paste component Pastey is configuration;
otherwise the bits are just independent pieces of middleware, WSGI
applications, etc.  So, potentially if we can agree on configuration, we
can start using each other's middleware more usefully.

I'm going to go ahead and throw my hat in the ring here, even though I've 
been trying to avoid it.

Most of the stuff you are calling middleware really isn't, or at any rate 
it has no reason to be middleware.

What I think you actually need is a way to create WSGI application objects 
with a context object.  The context object would have a method like 
get_service(name), and if it didn't find the service, it would ask its 
parent context, and so on, until there's no parent context to get it 
from.  The web server would provide a way to configure a root or default 
context.

This would allow you to do early binding of services without needing to do 
lookups on every web hit.  E.g.::

    class MyApplication:
        def __init__(self, context):
            self.authenticate = context.get_service('security.authentication')

        def __call__(self, environ, start_response):
            user = self.authenticate(environ)

So, you would simply register an application *factory* with the web server 
instead of an application instance, and it invokes it on the context object 
in order to get the right thing.
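
To make the context idea concrete, here is one possible shape for it.
Nothing below comes from PEP 333, PEAK or Zope 3: the Context class, its
register() method and the toy authentication service are all assumptions
made for the sake of a runnable sketch.  Only get_service(name) and the
parent-chain lookup follow the description above:

```python
class Context:
    """Hypothetical service context with parent-chain lookup."""

    def __init__(self, parent=None):
        self.parent = parent
        self._services = {}

    def register(self, name, service):
        self._services[name] = service

    def get_service(self, name):
        # Ask this context first, then each parent in turn.
        context = self
        while context is not None:
            if name in context._services:
                return context._services[name]
            context = context.parent
        raise LookupError('no service named %r' % (name,))


class MyApplication:
    def __init__(self, context):
        # Early binding: the lookup happens once, not on every web hit.
        self.authenticate = context.get_service('security.authentication')

    def __call__(self, environ, start_response):
        user = self.authenticate(environ)
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [('Hello, %s\n' % user).encode('utf-8')]


# The server would own the root context; a child context (say, for a
# subdirectory) inherits whatever it does not override.
root = Context()
root.register('security.authentication',
              lambda environ: environ.get('REMOTE_USER', 'anonymous'))
child = Context(parent=root)

# MyApplication is the *factory*; the server invokes it on a context.
app = MyApplication(child)
```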

Really, the only stuff that actually needs to be middleware, is stuff that 
wraps an *oblivious* application; i.e., the application doesn't know it's 
there.  If it's a service the application uses, then it makes more sense to 
create a service management mechanism for configuration and deployment of 
WSGI applications.

However, I think that, again, the key part of configuration that actually 
relates to WSGI here is *deployment* configuration, such as which service 
implementations to use for the various kinds of services.  Configuration 
*of* the services can and should be private to those services, since 
they'll have implementation-specific needs.  (This doesn't mean, however, 
that a configuration service couldn't be part of the family of WSGI 
service interfaces.)

I hope this isn't too vague; I've been wanting to say something about this 
since I saw your blog post about doing transaction services in WSGI, as 
that was when I first understood why you were making everything into 
middleware.  (i.e., to create a poor man's substitute for placeful 
services and utilities as found in PEAK and Zope 3.)

Anyway, I don't have a problem with trying to create a framework-neutral 
(in theory, anyway) component system, but I think it would be a good idea 
to take lessons from ones that have solved this problem well, and then 
create an extremely scaled-down version, rather than kludging application 
configuration into what's really per-request data.

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] Standardized configuration

2005-07-16 Thread Chris McDonough
On Sat, 2005-07-16 at 23:29 -0500, Ian Bicking wrote:
 There's nothing in WSGI to facilitate introspection.  Sometimes that 
 seems annoying, though I suspect lots of headaches are removed because 
 of it, and I haven't found it to be a stopper yet.  The issue I'm 
 interested in is just how to deliver configuration to middleware.

Whew, I hoped you'd respond. ;-)

It appears that I haven't gotten as far as to want introspection into
the implementation or configuration of a middleware component.  Instead,
I want the ability to declaratively construct a pipeline out of largely
opaque and potentially interdependent (but loosely coupled) WSGI
middleware components, which is another problem entirely.  It seemed
cogent, so I just somewhat belligerently coopted this thread, sorry!

 Because middleware can't be introspected (generally), this makes things 
 like configuration schemas very hard to implement.  It all needs to be 
 late-bound.

The pipeline itself isn't really late bound.  For instance, if I was to
create a WSGI middleware pipeline something like this:

   server -- session -- identification -- authentication -- 
   -- challenge -- application

... session, identification, authentication, and challenge are
middleware components (you'll need to imagine their implementations).
And within a module that started a server, you might end up doing
something like:

    def configure_pipeline(app):
        # The outermost component sees the request first.
        return SessionMiddleware(
            IdentificationMiddleware(
                AuthenticationMiddleware(
                    ChallengeMiddleware(app))))

    if __name__ == '__main__':
        app = Application()
        pipeline = configure_pipeline(app)
        server = Server(pipeline)
        server.serve()

The pipeline is static.  When a request comes in, the pipeline itself is
already constructed.  I don't really want a way to prevent improper
pipeline construction at startup time (right now anyway), because
failures due to missing dependencies will be fairly obvious.

But some elements of the pipeline at this level of factoring do need to
have dependencies on availability and pipeline placement of the other
elements.  In this example, proper operation of the authentication
component depends on the availability and pipeline placement of the
identification component.  Likewise, the identification component may
depend on values that need to be retrieved from the session component.
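
For illustration only, here is roughly how such a placement dependency
could show up in code.  The environ keys ('example.login',
'example.password', 'example.credentials'), the in-memory user table and
the class bodies are assumptions invented for this sketch, not an
existing implementation:

```python
class IdentificationMiddleware:
    """Extracts credentials from the request into environ."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Real code would parse a cookie or Authorization header here.
        environ['example.credentials'] = (
            environ.get('example.login'), environ.get('example.password'))
        return self.app(environ, start_response)


class AuthenticationMiddleware:
    """Checks credentials; only works *after* identification has run."""

    def __init__(self, app, users):
        self.app = app
        self.users = users  # mapping of login -> password

    def __call__(self, environ, start_response):
        try:
            login, password = environ['example.credentials']
        except KeyError:
            # Misplaced in the pipeline: fail loudly and obviously.
            raise RuntimeError(
                'IdentificationMiddleware must come earlier in the pipeline')
        if login in self.users and self.users[login] == password:
            environ['REMOTE_USER'] = login
        return self.app(environ, start_response)


def whoami(environ, start_response):
    # Tiny application that reports the authenticated user, if any.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ.get('REMOTE_USER', 'anonymous').encode('utf-8')]


pipeline = IdentificationMiddleware(
    AuthenticationMiddleware(whoami, {'chris': 's3cret'}))
```

The two classes never import each other; the only coupling is the
environ key and the order in which they are stacked.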

I've just seen Phillip's post where he implies that this kind of
fine-grained component factoring wasn't really the initial purpose of
WSGI middleware.  That's kind of a bummer. ;-)

Factoring middleware components in this way seems to provide clear
demarcation points for reuse and maintenance.  For example, I imagined a
declarative security module that might be factored as a piece of
middleware here:  http://www.plope.com/Members/chrism/decsec_proposal .

Of course, this sort of thing doesn't *need* to be middleware.  But
making it middleware feels very right to me in terms of being able to
deglom nice features inspired by Zope and other frameworks into pieces
that are easy to recombine as necessary.  Implementation as WSGI
middleware seems a nice way to move these kinds of features out of our
respective applications and into more application-agnostic pieces that
are very loosely coupled, but perhaps I'm taking it too far.

  For example, it would be useful in some circumstances to create separate
  WSGI components for user identification and user authorization.  The
  process of identification -- obtaining user credentials from a request
  -- and user authorization  -- ensuring that the user is who he says he
  is by comparing the credentials against a data source -- are really
  pretty much distinct operations.  There might also be a challenge
  component which forces a login dialog.
 
 I've always thought that a 401 response is a good way of indicating 
 that, but not everyone agrees.  (The idea being that the middleware 
 catches the 401 and possibly translates it into a redirect or something.)

Yep.  That'd be a fine signaling mechanism.

  In practice, I don't know if this is a truly useful separation of
  concerns that need to be implemented in terms of separate components in
  the middleware pipeline (I see that paste.login conflates them), it's
  just an example.  
 
 Do you mean identification and authentication (you mention authorization 
 above)? 

Aggh.  Yes, I meant to write authentication, sorry.

  I think authorization is different, and is conflated in 
 paste.login, but I don't have many use cases where it's a useful 
 distinction.  I guess there's a number of ways of getting a username and 
 password; and to some degree the  authenticator object works at that 
 level of abstraction.  And there's a couple other ways of authenticating 
 a user as well (public keys, IP address, etc).  I've generally used a 
 user manager object for this kind of abstraction, with subclassing for 
 different kinds of generality (e.g., the basic abstract