On 2011-04-13 18:16:36 -0700, Ian Bicking said:
While I was initially reluctant to use zip files, after further
discussion and thought they seem fine to me, so long as any tool that takes a zip
file can also take a directory. The reverse might not be true -- for
instance, I'd like a way to install or update a library for (and
inside) an application, but I doubt I would make pip rewrite zip files
to do this ;) But it could certainly work on directories. Supporting
both isn't a big deal except that you can't do symlinks in a zip file.
I'm not talking about using zip files as per eggs, where the code is
maintained within the zip file during execution. It is merely a
packaging format with the software itself extracted from the zip during
installation / upgrade. A transitory container format. (Folders in
the end.)
Symlinks are an OS-specific feature, so those are out as a core
requirement. ;)
I don't think we're talking about something like a buildout recipe.
Well, Eric kind of brought something like that up... but otherwise I
think the consensus is in that direction.
Ambiguous statements FTW, but I think I know what you meant. ;)
So specifically if you need something like lxml the application
specifies that somehow, but doesn't specify *how* that library is
acquired. There is some disagreement on whether this is generally
true, or only true for libraries that are not portable.
+1
I think something along the lines of autoconf (those lovely ./configure
scripts you run when building GNU-style software from source) with
published base 'checkers' (predicates as I referred to them previously)
would be great. It would give an application a clear way to declare a
dependency, have the application server check those dependencies, and
then notify the administrator installing the package of any that are missing.
I've seen several Python libraries that include the C library code that
they expose; while not so terribly efficient (i.e. you can't install
the C library once, then share it amongst venvs), it is effective for
small packages.
Larger libraries (i.e. global or application-local ones) would require
the intervention of a systems administrator.
Something like a database takes this a bit further. We haven't really
discussed it, but I think this is where it gets interesting. Silver
Lining has one model for this. The general rule in Silver Lining is
that you can't have anything with persistence without asking for it as
a service, including an area to write files (except temporary files?)
+1
Databases are slightly more difficult; an application could ask for:
:: (Very Generic) A PEP-249 database connection.
:: (Generic) A relational database connection string.
:: (Specific) A connection string to a specific vendor of database.
:: (Odd) A NoSQL database connection string.
I've been making heavy use of MongoDB over the last year and a half,
but AFAIK each NoSQL database engine does its own thing API-wise. (Then
there are ORMs on top of that, but passing a connection string like
mysql://user:pass@host/db or mongo://host/db is pretty universal.)
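For illustration, a parser for connection strings of that shape needs
nothing beyond the standard library (the host and database names below
are made up):

```python
from urllib.parse import urlsplit

def parse_dsn(dsn):
    """Split a connection string like mysql://user:pass@host/db
    into its component parts."""
    parts = urlsplit(dsn)
    return {
        'scheme': parts.scheme,          # e.g. 'mysql' or 'mongo'
        'user': parts.username,
        'password': parts.password,
        'host': parts.hostname,
        'port': parts.port,
        'database': parts.path.lstrip('/'),
    }

info = parse_dsn('mysql://user:secret@db.example.com:3306/appdb')
```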
It is my intention to write an application server that is capable of
creating and securing databases on-the-fly. This would require fairly
high-level privileges in the database engine, but would result in far
more "plug-and-play" configuration. Obviously when deleting an
application you will have the opportunity to delete the database and
associated user.
I assume everyone agrees that an application can't write to its own
files (but of course it could execfile something in another location).
+1; that _almost_ goes without saying. :) At the same time, an
application server /must not/ require root access to do its work, thus
no mandating of (real) chroots, on-the-fly user creation, etc.
There are ways around almost all security policies, but where possible
setting the read-only flag (Windows) or removing write (chmod -w on
POSIX systems) should be enough to prevent casual abuse.
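A rough sketch of what that could look like on the app server side
(POSIX-flavoured; on Windows you'd set the read-only attribute instead,
and os.chmod only affects the read-only flag there):

```python
import os
import stat

def make_read_only(root):
    """Strip the write bits from every file and directory under root --
    roughly a recursive 'chmod -w', enough to prevent casual abuse."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            mode = os.stat(path).st_mode
            os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
```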
I suspect there's some disagreement about how the Python environment
gets setup, specifically sys.path and any other application-specific
customizations (e.g., I've set environ['DJANGO_SETTINGS_MODULE'] in
silvercustomize.py, and find it helpful).
Similar to Paste's "here" variable for INI files, some method for the
application to define environment variables containing base path
references would be needed.
I've tossed out my idea of sharing dependencies, BTW, so a simple
extraction of the zipped application into one package folder (linked in
using a .pth file) with the dependencies installed into an app-packages
folder in the path (like site-packages) would be ideal. At least, for
me. ;)
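A sketch of what activating such a layout might look like, assuming a
hypothetical app-packages folder alongside the application root (the
function name and layout are my own invention):

```python
import os
import site
import sys

def activate_app(app_root):
    """Add an application's app-packages folder to sys.path,
    processing any .pth files it contains (like site-packages),
    then put the application itself first on the path."""
    app_packages = os.path.join(app_root, 'app-packages')
    if os.path.isdir(app_packages):
        site.addsitedir(app_packages)   # honours .pth files
    sys.path.insert(0, app_root)        # the application package itself
```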
Describing the scope of this, it seems kind of boring. In, for
example, App Engine you do all your setup in your runner -- I find this
deeply annoying because it makes the runner the only entry point, and
thus makes testing, scripts, etc. hard.
I agree; that's a short-sighted approach to an application container
format. There should be some way to advertise a test suite and, for
example, have the suite run before installation or during upgrade.
(Rolling back the upgrade process thus far if there is a failure.)
My shiny end goal would be a form of continuous deployment: a git-based
application which gets a post-commit notification, pulls the latest,
runs the tests, rolls back on failure or fully deploys the update on
success.
We would start with just WSGI. Other things could follow, but I don't
see any reason to worry about that now. Maybe we should just punt on
aggregate applications now too. I don't feel like there's anything we
would do that would prevent other kinds of runtime models (besides the
starting point, container-controlled WSGI), and the places to add
support for new things are obvious enough (e.g., something like Silver
Lining's platform setting). I would define a server with accompanying
daemon processes as an "aggregate".
Since in my model the application server does not proxy requests to the
instantiated applications (each running in its own process), I'm not
sure I'm interpreting what you mean by an aggregate application
properly.
If "my" application server managed Nginx or Apache configurations,
dispatch to applications based on base path would be very easy to do
while still keeping the applications isolated.
An important distinction to make, I believe, is application concerns
and deployment concerns. For instance, what you do with logging is a
deployment concern. Generating logging messages is of course an
application concern. In practice these are often conflated, especially
in the case of bespoke applications where the only person deploying the
application is the person (or team) developing the application. It
shouldn't be annoying for these users, though. Maybe it makes sense
for people to be able to include tool-specific default settings in an
application -- things that could be overridden, but especially for the
case when the application is not widely reused it could be useful. (An
example where Silver Lining gets it all backwards is that I created a
[production] section in app.ini when the very concept of "production"
is not meaningful in that context -- but these kind of named profiles
would make sense for actual application deployment tools.)
Having an application define default logging levels for different
scopes would be very useful. The application server could take those
defaults, and allow an administrator to modify them or define
additional scopes quite easily.
There's actually a kind of layered way of thinking of this:
1. The first, maybe most important part, is how you get a proper Python
environment. That includes sys.path of course, with all the
accompanying libraries, but it also includes environment description.
Virtualenv-like, with the application itself linked in via a .pth file
(a la setup.py develop, allowing inline upgrades via SCM) and
dependencies extracted from the zip distributable into an app-packages
folder a la site-packages.
I don't install global Python modules on any of my servers, so the
--no-site-packages option is somewhat unnecessary for me, but having
something similar would be useful, too. Unfortunately, that one
feature seems to require a lot of additional work.
In Silver Lining there are a few stages -- first, set some environment
variables (both general ones like $SILVER_CANONICAL_HOST and
service-specific ones like $CONFIG_MYSQL_DBNAME), then get sys.path
proper, then import silvercustomize, by which an environment can do any
further customization it wants (e.g., set $DJANGO_SETTINGS_MODULE).
Environment variables are typeless (raw strings) and thus less than
optimal for sharing rich configuration.
Host names depend on how the application is mounted, and a single
application may be mounted to multiple domains or paths, so utilizing
the front end web server's rewriting capability is probably the best
solution for that.
What about multiple database connections? Environment variables are
also not so good for repeated values.
A /few/ environment variables are a good idea, though:
:: TMPDIR — when don't you need temporary files?
:: APP_CONFIG_PATH — the path to a YAML file containing the real configuration.
The configuration file would even include a dict-based logging
configuration routing all messages to the parent app server for final
delivery, removing the need for per-app logging files, etc.
2. Define some basic generic metadata. "app_name" being the most obvious one.
The standard Python setup metadata is pretty good:
:: Application title.
:: Application (package) name.
:: Short description.
:: Long description / documentation.
:: Author information.
:: License.
:: Source information (URL, download URL).
:: Dependencies.
:: Entry point-style hooks. (Post-install, pre/post upgrade,
pre-removal, etc.)
Likely others.
3. Define how to get the WSGI app. This is WSGI specific, but (1) is
*not* WSGI specific (it's only Python specific, and would apply well to
other platforms).
I could imagine there would be multiple "application types":
:: WSGI application. Define a package dot-notation entry point to a
WSGI application factory.
:: Networked daemon. This would allow deployment of Twisted services,
for example. Define a package dot-notation entry point to the 'main'
callable.
Again, there are likely others, but those are the big two. In both of
these cases the configuration (loaded automatically) could be passed as
a dict to the callable.
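Resolving such a dot-notation entry point is only a few lines; this
sketch accepts both the 'module:attr' form and a plain dotted path
(the example app name is hypothetical):

```python
import importlib

def resolve(entry_point):
    """Resolve an entry point like 'myapp.wsgi:make_app'
    (or 'myapp.wsgi.make_app') to the callable it names."""
    if ':' in entry_point:
        module_name, attr = entry_point.split(':', 1)
    else:
        module_name, _, attr = entry_point.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# app_factory = resolve('myapp.wsgi:make_app')
# application = app_factory(config_dict)
```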
4. Define some *web specific* metadata, like static files to serve.
This isn't necessarily WSGI or even Python specific (not that we should
bend backwards to be agnostic -- but in practice I think we'd have to
bend backwards to make it Python-specific).
Explicitly defining the paths to static files is not just a good idea,
it's The Slaw™.
5. Define some lifecycle metadata, like update_fetch. These are
generally commands to invoke. IMHO these can be ad hoc, but exist in
the scope of (1) and a full "environment". So it's not radically
different than anything else the app does, it's just we declare
specific times these actions happen.
Script name, dot-notation callable, or URL. I see those as the 'big
three' to support. Using a dot-notation callable has the same benefit
as my comments to #3.
The URL would be relative to wherever the application is mounted within
a domain, of course.
6. Define services (or "resources" or whatever -- the name "resource"
doesn't make as much sense to me, but that's bike shedding). These are
things the app can't provide for itself, but requires (or perhaps only
wants; e.g., an app might be able to use SQLite, but could also use
PostgreSQL). While the list of services will increase over time,
without a basic list most apps can't run at all. We also need a core
set as a kind of reference implementation of what a fully-specified
service *is*.
I touched on this up above; any DB-API-compliant database connection or
one of the various configuration strings. (I'd implement this as a
string-like object with accessor properties so you can pass it to
SQLAlchemy straight, or dissect it to do something custom.)
More below.
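A minimal sketch of that string-like object (the class and property
names are my own invention):

```python
from urllib.parse import urlsplit

class ConnectionString(str):
    """A string that can be handed straight to SQLAlchemy (or anything
    else expecting a plain DSN) while also exposing its parts."""

    @property
    def _parts(self):
        return urlsplit(self)

    @property
    def scheme(self):
        return self._parts.scheme

    @property
    def username(self):
        return self._parts.username

    @property
    def password(self):
        return self._parts.password

    @property
    def host(self):
        return self._parts.hostname

    @property
    def database(self):
        return self._parts.path.lstrip('/')

dsn = ConnectionString('postgresql://app:secret@db.internal/appdb')
```

Because it subclasses str, it compares equal to the raw string and can
be passed anywhere a plain connection string is expected.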
7. In Silver Lining I've distinguished active services (like a running
database) from passive resources (like an installed binary library). I
don't see a reason to conflate these, as they are so very different.
Maybe this is part of why "resource" strikes me as an odd name for
something like a database.
You hit the terminology perfectly: active services (such as databases)
are just that, services. Installed binary libraries are resources. :)
So... there's kind of some thoughts about process.
Good stuff.
— Alice.
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe:
http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com