Re: [Web-SIG] WSGI deployment use case

2005-07-26 Thread Ian Bicking
Well, the stack is really just an example, meant to be more realistic 
than sample1 and sample2.  I actually think it's a very reasonable 
example, but that's not really the point.  Presuming this stack, how 
would you configure it?


Chris McDonough wrote:
 Just for a frame of reference, I'll say how I might do these things.
 These all assume I'd use Apache and mod_python, for better or worse:
 
 
I'm not clear exactly what you are proposing.  Let's use a more 
realistic example.  Components:

* Exception catcher.  Takes email_errors, which is a list of addresses 
to email exceptions to.  I want to apply this globally.
 
 
 I'd likely do this in my endpoint apps (maybe share some sort of library
 between them to do it).  Errors that occur in middleware would be
 diagnosable/detectable via mod_python's error logging facility and
 something like snort.
 
 
* An application mounted on /, which takes document_root and serves up 
those files directly.
 
 
 Use the webserver.
 
 
* An application mounted at /blog, takes database (a string) where all 
its information is kept.
 
 
 Separate WSGI pipeline descriptor with rewrite rules or whatever
 aliasing /blog to it.
 
 
* An application mounted at /admin.  Takes document_root, which is 
where the editable files are located.  Around it goes two pieces of 
middleware...
 
 
 Same as above...
 
 
* An authentication middleware, which takes database, which is where 
user information is kept.  And...
 
 
 I'd probably make this into a service that would be consumable by
 applications with a completely separate configuration outside of any
 deployment spec.  For example, I might try to pull Zope's Pluggable
 Authentication Utility out of Zope 3, leaving intact its
 configurability through ZCML.
 
 But if I did put it in middleware, I'd put it in each of my application
 pipelines (implied by /blog, /admin) in an appropriate place.
 
 
* An authorization middleware, that takes allowed_roles, and checks it 
against what the authentication middleware puts in.
 
 
 This one I know I wouldn't make into middleware.  Instead, I'd use a
 library much like the thing I proposed as decsec (although at the time
 I wrote that proposal, I did think it would be middleware; I changed my
 mind).
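For reference, a minimal sketch of the example stack discussed above, wired together by hand in plain Python. Every function name, the `example.roles` environ key, and the in-memory user table standing in for `database` are assumptions made for illustration; none of this is Paste's API or a proposed deployment format.

```python
import sys

def make_app(label, **config):
    """Stand-in endpoint app: reports which app ran and with what config."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        text = "%s %s path=%s" % (label, sorted(config.items()), environ["PATH_INFO"])
        return [text.encode("utf-8")]
    return app

def authentication(app, database):
    """Look the user up in `database` (here just a dict of user -> roles)."""
    def middleware(environ, start_response):
        environ["example.roles"] = database.get(environ.get("REMOTE_USER"), [])
        return app(environ, start_response)
    return middleware

def authorization(app, allowed_roles):
    """Reject requests whose roles do not intersect allowed_roles."""
    def middleware(environ, start_response):
        if not set(environ.get("example.roles", [])) & set(allowed_roles):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"forbidden"]
        return app(environ, start_response)
    return middleware

def exception_catcher(app, email_errors):
    """Turn exceptions raised while calling the app into a 500 response.
    A real version would mail the traceback to email_errors."""
    def middleware(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            environ["example.notified"] = list(email_errors)
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")], sys.exc_info())
            return [b"internal error"]
    return middleware

def url_map(mounts):
    """Dispatch to the longest matching prefix, shifting SCRIPT_NAME/PATH_INFO."""
    ordered = sorted(mounts.items(), key=lambda item: len(item[0]), reverse=True)
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in ordered:
            prefix = prefix.rstrip("/")
            if path == prefix or path.startswith(prefix + "/"):
                environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                environ["PATH_INFO"] = path[len(prefix):]
                return app(environ, start_response)
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    return dispatcher

users = {"alice": ["editor"]}  # hypothetical user table
admin = authentication(
    authorization(make_app("admin", document_root="/srv/www"), ["editor"]), users)
stack = exception_catcher(
    url_map({"/": make_app("static", document_root="/srv/www"),
             "/blog": make_app("blog", database="blogdb"),
             "/admin": admin}),
    email_errors=["ops@example.com"])

def call(app, path, user=None):
    """Tiny test client returning (status, body)."""
    seen = {}
    def start_response(status, headers, exc_info=None):
        seen["status"] = status
    environ = {"PATH_INFO": path, "SCRIPT_NAME": ""}
    if user:
        environ["REMOTE_USER"] = user
    body = b"".join(app(environ, start_response))
    return seen["status"], body.decode("utf-8")
```

Written this way, the composition is ordinary code; the open question in the thread is how much of that final `stack = ...` expression a declarative configuration should be able to express.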
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: 
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com


Re: [Web-SIG] WSGI deployment use case

2005-07-26 Thread Chris McDonough
On Tue, 2005-07-26 at 01:18 -0500, Ian Bicking wrote:
 Well, the stack is really just an example, meant to be more realistic 
 than sample1 and sample2.  I actually think it's a very reasonable 
 example, but that's not really the point.  Presuming this stack, how 
 would you configure it?

I typically roll out software to clients using a build mechanism (I
happen to use pymake at http://www.plope.com/software/pymake/ but
anything dependency-based works).

I write generic build scripts for all of the software components.  For
example, I might write makefiles that check out and build python,
openldap, mysql and so on (each into a non-system location).  I leave
a bit of room for customization in their build definitions that I can
override from within a profile.  A profile is a set of customized
software builds for a specific purpose.

I might have, maybe, 3 different profiles for each customer where the
profile usually works out to be tied to machine function (load balancer,
app server, database server).  I maintain these build scripts and the
profiles in CVS for each customer.  I never install anything by hand, I
always change the buildout and rerun it if I need to get something set
up.

This usually works out pretty well because to roll out a new major
version of software, I just rerun the build scripts for a particular
profile and move the data over.  Usually the only things that need to
change frequently are a few bits of software that are checked out of
version control, so doing cvs up on those bits typically gets me where
I need to be unless it's a major revision.

So in this case, I'd likely write a build that either built Apache from
source or at least created an httpd-includes file meant to be
referenced from within the system Apache config file with the proper
stuff in it given the profile's purpose.  The build would also download
and install Python, it would get the proper eggs and/or Python
software and the database, and so forth.  All the configuration would be
done via the profile which is in version control.

I don't know if this kind of thing works for everybody, but it has
worked well for me so far.  I do this all the time, and I have a good
library of buildout scripts already so it's less painful for me than it
might be for someone who is starting from scratch.  That said, it is
time-consuming and imperfect... upgrades are the most painful.  New
installs are simple, though.

So, anyway, the short answer is I write a script to do the config for
me so I can repeat it on demand.
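As a rough illustration of "a script to do the config", the sketch below renders an httpd-includes fragment from a profile held in version control. The profile dictionary, its keys, and `render_httpd_include` are all hypothetical; this is not how pymake represents profiles, which the message does not show.

```python
def render_httpd_include(profile):
    """Return Apache directives mounting each WSGI script named in the profile."""
    lines = ["# Generated from profile %r; do not edit by hand." % profile["name"]]
    # Sort so that rerunning the build produces byte-identical output.
    for url, script in sorted(profile["mounts"].items()):
        lines.append("ScriptAlias %s %s" % (url, script))
    return "\n".join(lines) + "\n"

# Hypothetical profile for the "app server" machine function.
app_server = {
    "name": "app-server",
    "mounts": {
        "/blog": "/home/chrism/blog.wsgi",
        "/viewcvs": "/home/chrism/viewcvs.wsgi",
    },
}

include_text = render_httpd_include(app_server)
```

The build step would write `include_text` to a file that the system Apache config pulls in with an Include directive, so the whole setup can be regenerated on demand.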

- C




Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Phillip J. Eby
At 06:40 PM 7/25/2005 -0500, Ian Bicking wrote:
 But configuration and composition of multiple independent applications
 into a single process isn't.  I don't think we can solve these
 separately, because the Hard Problem is how to handle configuration
 alongside composition.  How can I apply configuration to a set of
 applications?  How can I make exceptions?  How can an application
 consume configuration as well as delegate configuration to a
 subapplication?  The pipeline is often more like a tree, so the logic is
 a little complex.  Or, rather, there's actual *logic* in how
 configuration is applied, almost all of which are viable.

We probably need something like a site map configuration, that can handle 
tree structure, and can specify pipelines on a per location basis, 
including the ability to specify pipeline components to be applied above 
everything under a certain URL pattern.  This is more or less the same as 
my container API concept, but we are a little closer to being able to 
think about such a thing.

Of course, I still think it's something that can be added *after* having a 
basic deployment spec.
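One way to picture the site map idea is a tree keyed by URL segment, where filters declared on a node apply to everything beneath it. The sketch below is only an assumption about what such a structure might look like; the `filters`/`app`/`children` keys and the `resolve` function are invented for illustration and are not part of any proposal in this thread.

```python
# Hypothetical site map: names are strings here; a real one would hold factories.
SITE_MAP = {
    "filters": ["exception_catcher"],          # applies to the whole site
    "app": "static_files",
    "children": {
        "blog": {"app": "blog"},
        "admin": {"filters": ["authentication", "authorization"], "app": "admin"},
    },
}

def resolve(site_map, path):
    """Return (filters, app, script_name, path_info) for a request path.

    Filters accumulate from the root down, so outer filters wrap inner ones.
    Trailing slashes are dropped to keep the sketch short.
    """
    node = site_map
    filters = list(node.get("filters", []))
    app = node.get("app")
    segments = [s for s in path.split("/") if s]
    consumed = []
    for segment in segments:
        child = node.get("children", {}).get(segment)
        if child is None:
            break
        node = child
        consumed.append(segment)
        filters.extend(node.get("filters", []))
        app = node.get("app", app)
    script_name = "".join("/" + s for s in consumed)
    path_info = "".join("/" + s for s in segments[len(consumed):])
    return filters, app, script_name, path_info
```

For example, `resolve(SITE_MAP, "/admin/edit/page")` picks up the global exception catcher plus the two admin-only filters, while `/blog` gets only the global one.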


 I can figure out a bunch of ad hoc and formal ways of accomplishing this
 in Paste; most of it is already possible, and entry points alone clean
 up a lot of what's there (encouraging a separation between how an
 application is invoked generally, and install-specific configuration).
 But with a more limited and declarative configuration it is harder.

But the tradeoff is greater ability to build tools that operate on the 
configuration to do something -- like James Gardner's ideas about 
backup/restore and documentation tools.


 Also when configuration is pushed into factories as keyword arguments,
 instead of being pulled out of a dictionary, it is much harder -- the
 configuration becomes unhackable.

But a **kw argument *is* a dictionary, so I don't understand what you mean 
here.
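A small sketch that may help locate the disagreement. It is an editorial interpretation, not something either poster wrote, and both factory names are hypothetical: a factory with an explicit signature rejects keys it does not know, whereas a `**kw` factory receives an ordinary dictionary and can pull out what it wants.

```python
def make_filter_explicit(app, template_dir):
    """Configuration pushed in as a fixed set of keyword arguments."""
    return ("explicit", template_dir)

def make_filter_kw(app, **kw):
    """Configuration arrives as a dict; the factory pulls out what it needs."""
    return ("kw", kw.get("template_dir"), sorted(kw))

# A shared, flat configuration containing a key only some components use.
config = {"template_dir": "/srv/templates", "smtp_server": "localhost"}

kw_result = make_filter_kw(None, **config)

try:
    make_filter_explicit(None, **config)
    explicit_error = None
except TypeError as exc:
    # The explicit signature refuses the unrelated smtp_server key.
    explicit_error = str(exc)
```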



Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Chris McDonough
On Mon, 2005-07-25 at 20:29 -0500, Ian Bicking wrote:
  We probably need something like a site map configuration, that can 
  handle tree structure, and can specify pipelines on a per location 
  basis, including the ability to specify pipeline components to be 
  applied above everything under a certain URL pattern.  This is more or 
  less the same as my container API concept, but we are a little closer 
  to being able to think about such a thing.
 
 It could also be something based on general matching rules, with some 
 notion of precedence and how the rule affects SCRIPT_NAME/PATH_INFO.  Or 
 something like that.
How much of this could be solved by using a web server's
directory/alias-mapping facility?

For instance, if you needed a single Apache webserver to support
multiple pipelines based on URL mapping, wouldn't it be possible in many
cases to compose that out of things like rewrite rules and script
aliases (the below assumes running them just as CGI scripts, obviously
it would be different with something using mod_python or what-have-you):

<VirtualHost *:80>
 ServerAdmin [EMAIL PROTECTED]
 ServerName plope.com
 ServerAlias plope.com
 ScriptAlias /viewcvs /home/chrism/viewcvs.wsgi
 ScriptAlias /blog /home/chrism/blog.wsgi
 RewriteEngine On
 RewriteRule ^/[^/]viewcvs*$ /home/chrism/viewcvs.wsgi [PT]
 RewriteRule ^/[^/]blog*$ /home/chrism/blog.wsgi [PT]
</VirtualHost>

Obviously it would mean some repetition in wsgi files if you needed to
repeat parts of a pipeline for each URL mapping.  But it does mean we
wouldn't need to invent more software.
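To make the repetition concrete, here is a sketch of what one such script (say, blog.wsgi run as a CGI) might contain. It uses `wsgiref.handlers.CGIHandler` from the standard library, which was added to Python after this thread; `shared_pipeline`, `blog_app`, and the `example.pipeline` environ key are hypothetical names.

```python
from wsgiref.handlers import CGIHandler

def shared_pipeline(app):
    """Middleware that every mounted script would have to repeat.
    Here it only tags the environ; a real one might do auth or sessions."""
    def middleware(environ, start_response):
        environ["example.pipeline"] = "shared"
        return app(environ, start_response)
    return middleware

def blog_app(environ, start_response):
    """Stand-in endpoint that reports where Apache mounted it."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"blog at " + environ.get("SCRIPT_NAME", "").encode("utf-8")]

# Each *.wsgi file would rebuild its own copy of the pipeline like this.
application = shared_pipeline(blog_app)

def run_as_cgi():
    """What the script would call when the web server executes it as a CGI."""
    CGIHandler().run(application)
```

Putting `shared_pipeline` in an importable module would keep each script to a few lines, at the cost of the per-URL wiring living in Apache rather than in Python.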


 
  Of course, I still think it's something that can be added *after* having 
  a basic deployment spec.
 
 I feel a very strong need that this be resolved before settling on 
 anything deployment related.  Not necessarily as a standard, but 
 possibly as a set of practices.  Even a realistic and concrete use case 
 might be enough.


I *think* more complicated use cases may revolve around attempting to
use middleware as services that dynamize the pipeline instead of as
oblivious things.  I don't think there's anything really wrong with
that but I also don't think it can ever be specified with as much
clarity as what we've already got because IMHO it's a programming task.

I'm repeating myself, I'm sure, but I'm more apt to put a service
manager piece of middleware in the pipeline (or maybe just implement it
as a library) which would allow my endpoint app to use it to do
sessioning and auth and whatnot.  I realize that is essentially
building a framework (which is reviled lately) but since the endpoint
app needs to collaborate anyway, I don't see a better way to do it
except to rely completely on convention for service lookup (which is
what you seem to be struggling with in the later bits of your post).

- C






Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Ian Bicking
Phillip J. Eby wrote:
 At 08:29 PM 7/25/2005 -0500, Ian Bicking wrote:
 
 Right now Paste hands around a fairly flat dictionary.  This 
 dictionary is passed around in full (as part of the WSGI environment) 
 to every piece of middleware, and actually to everything (via an 
 import and threadlocal storage).  It gets used all over the place, and 
 the ability to draw in configuration without passing it around is very 
 important.  I know it seems like heavy coupling, but in practice it 
 causes unstable APIs if it is passed around explicitly, and as long as 
 you keep clever dynamic values out of the configuration it isn't a 
 problem.

 Anyway, every piece gets the full dictionary, so if any piece expected 
 a constrained set of keys it would break.  Even ignoring that there 
 are multiple consumers with different keys that they pull out, it is 
 common to create intermediate configuration values to make the 
 configuration more abstract.  E.g., I set a base_dir, then derive 
 publish_dir and template_dir from that.  Apache configuration is a 
 good anti-example here; its lack of variables hurts me daily.  While 
 some variables could be declared abstract somehow, that adds 
 complexity where the unconstrained model avoids that complexity.
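A minimal sketch of the pattern described above: a flat dictionary with derived values, reachable through thread-local storage rather than passed explicitly. The function names and the directory layout are assumptions for illustration; this is not Paste's actual configuration code.

```python
import posixpath
import threading

_local = threading.local()

def load_config(base_dir, **overrides):
    """Build a flat config dict, deriving more specific values from base_dir."""
    config = {"base_dir": base_dir}
    config["publish_dir"] = posixpath.join(base_dir, "publish")
    config["template_dir"] = posixpath.join(base_dir, "templates")
    config.update(overrides)  # explicit settings win over derived ones
    return config

def install_config(config):
    """Make the config visible to any code running in this thread."""
    _local.config = config

def current_config():
    return _local.config

def template_path(name):
    """Deep library code pulls configuration in without it being passed down."""
    return posixpath.join(current_config()["template_dir"], name)

install_config(load_config("/srv/site"))
```

Every consumer sees the full dictionary, which is what makes the approach convenient and also what leaves it open to the name conflicts discussed further on.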
 
 
 *shudder* I think someone just walked over my grave.  ;)
 
 I'd rather add complexity to the deployment format (e.g. variables, 
 interpolation, etc.) to handle this sort of thing than add complexity to 
 the components.  I also find it hard to understand why e.g. multiple 
 components would need the same template_dir.  Why isn't there a 
 template service component, for example?

In that case, no, multiple components are unlikely to usefully share 
template_dir.  But that's not an issue I'm really hitting -- though it 
does start to add importance to the order in which configuration files 
are loaded.

 When one piece delegates to another, it passes the entire dictionary 
 through (by convention, and by the fact it gets passed around 
 implicitly).  It is certainly possible in some circumstances that a 
 filtered version of the configuration should be passed in; that hasn't 
 happened to me yet, but I can certainly imagine it being necessary 
 (especially when a larger amount of more diverse software is running 
 in the same process).

 One downside of this is that there's no protection from name 
 conflicts.  Though name conflicts can go both ways.  The Happy 
 Coincidence is when two pieces use the same name for the same purpose 
 (e.g., it's highly likely smtp_server would be the subject of a 
 Happy Coincidence).  An Unhappy Coincidence is when two pieces use the 
 same value for different purposes (publish_dir perhaps).  An 
 Expected Coincidence is when the same code, invoked in two separate 
 call stacks, consumes the same value.  Of course, I allow 
 configuration to be overwritten depending on the request, so high 
 collision names (like publish_dir) in practice are unlikely to be a 
 problem.
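The per-request overriding mentioned above can be sketched with `collections.ChainMap`, which looks keys up in a stack of dictionaries. The keys, paths, and the `config_for_request` helper are hypothetical; the point is only that a high-collision name such as publish_dir can be rebound for part of the URL space while a shared name such as smtp_server stays global.

```python
from collections import ChainMap

# Site-wide configuration shared by every component.
global_config = {
    "smtp_server": "mail.example.com",
    "publish_dir": "/srv/site/publish",
}

def config_for_request(path):
    """Return a view of the config with request-specific overrides on top."""
    overrides = {}
    if path.startswith("/blog"):
        # The blog uses the same key for its own directory.
        overrides["publish_dir"] = "/srv/blog/publish"
    return ChainMap(overrides, global_config)
```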
 
 
 I think you've just explained why this approach doesn't scale very well, 
 even to a large team, let alone to inter-organization collaboration 
 (i.e. open source projects).

I admit there are problems.  On the other hand, it's similar to the 
fact that attributes on objects don't have namespaces.  That causes 
problems too, but those problems aren't so bad in practice.

If you can offer something where configuration can be applied to a set 
of components without exposing the internal structure of those 
components, and without the frontend copying each piece destined for an 
internal application explicitly, then great.  I'm not closed to other 
ideas, but I'm not happy putting it off either.  Back when I started up 
this WSGI thread, it was about just this issue, so it's one of the 
things I'm fairly concerned about.

Unlike deployment, this issue of configuration touches all of my code. 
So I'm happier putting off deployment: though it is currently 
suboptimal, I suspect my code can be made forward-compatible with a 
later deployment spec without great effort.

   For instance an application-specific middleware that could plausibly 
 be used more widely -- does it consume the application configuration, 
 or does it take its own configuration?  But even excluding those 
 ambiguous situations, the way my middleware is factored is an internal 
 implementation detail, and I don't feel comfortable pushing that 
 structure into the configuration.
 
 
 That's what encapsulation is for.  Just create a factory that takes a 
 set of application-level parameters (like template_dir, publish_dir, 
 etc.) and then *passes* them to the lower level components.
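In plain Python, the encapsulating factory described here might look like the sketch below. The names filter1, filter2, some_param, and other_param follow the template example in this message; the factory bodies, the endpoint, and the `example.*` environ keys are assumptions added so the sketch runs.

```python
def make_filter1(app, some_param):
    """Lower-level component; knows nothing about application-level names."""
    def middleware(environ, start_response):
        environ["example.some_param"] = some_param
        return app(environ, start_response)
    return middleware

def make_filter2(app, other_param):
    def middleware(environ, start_response):
        environ["example.other_param"] = other_param
        return app(environ, start_response)
    return middleware

def endpoint(environ, start_response):
    """Stand-in application that echoes what the filters were given."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    seen = "%s;%s" % (environ["example.some_param"], environ["example.other_param"])
    return [seen.encode("utf-8")]

def make_application(template_dir, publish_dir):
    """Application-level factory: the only place that knows the internal wiring."""
    app = endpoint
    app = make_filter2(app, other_param=publish_dir)
    app = make_filter1(app, some_param=template_dir)
    return app
```

A deployer then supplies only template_dir and publish_dir, and how the middleware is factored stays an implementation detail of `make_application`.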
 
 Heck, we could even add that to the .wsgi format...
 
# app template file
[WSGI options]
parameters = template_dir, publish_dir, ...
 
[filter1 from foo]
some_param = template_dir
 
[filter2 from bar]
other_param = publish_dir
 
 
# deployment file
[use file app_template.wsgi]
template_dir = 

Re: [Web-SIG] WSGI deployment use case

2005-07-25 Thread Ian Bicking
Chris McDonough wrote:
 How much of this could be solved by using a web server's
 directory/alias-mapping facility?
 
 For instance, if you needed a single Apache webserver to support
 multiple pipelines based on URL mapping, wouldn't it be possible in many
 cases to compose that out of things like rewrite rules and script
 aliases (the below assumes running them just as CGI scripts, obviously
 it would be different with something using mod_python or what-have-you):
 
 <VirtualHost *:80>
  ServerAdmin [EMAIL PROTECTED]
  ServerName plope.com
  ServerAlias plope.com
  ScriptAlias /viewcvs /home/chrism/viewcvs.wsgi
  ScriptAlias /blog /home/chrism/blog.wsgi
  RewriteEngine On
  RewriteRule ^/[^/]viewcvs*$ /home/chrism/viewcvs.wsgi [PT]
  RewriteRule ^/[^/]blog*$ /home/chrism/blog.wsgi [PT]
 </VirtualHost>
 
 Obviously it would mean some repetition in wsgi files if you needed to
 repeat parts of a pipeline for each URL mapping.  But it does mean we
 wouldn't need to invent more software.

No, we already have templating languages to generate those configuration 
files so it's no problem ;)

Messy configuration files (and RewriteRule for that matter) are my bane.

To be fair, in a shared hosting situation (websites maintained by 
customers, not the host) this would seem more workable than a 
centralized configuration.  Perhaps... it's not the kind of situation I 
deal with much anymore, so I've lost touch with that case.  And would 
that mean we'd start seeing .wsgi in URLs?  Hrm.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  / http://blog.ianbicking.org