On 2011-04-13 18:16:36 -0700, Ian Bicking said:

While initially reluctant to use zip files, after further discussion and thought they seem fine to me, so long as any tool that takes a zip file can also take a directory.  The reverse might not be true -- for instance, I'd like a way to install or update a library for (and inside) an application, but I doubt I would make pip rewrite zip files to do this ;)  But it could certainly work on directories.  Supporting both isn't a big deal except that you can't do symlinks in a zip file.

I'm not talking about using zip files as eggs do, where the code is maintained within the zip file during execution. It is merely a packaging format: the software itself is extracted from the zip during installation or upgrade. A transitory container format. (Folders in the end.)

Symlinks are an OS-specific feature, so those are out as a core requirement. ;)

I don't think we're talking about something like a buildout recipe.  Well, Eric kind of brought something like that up... but otherwise I think the consensus is in that direction.

Ambiguous statements FTW, but I think I know what you meant.  ;)

So specifically if you need something like lxml the application specifies that somehow, but doesn't specify *how* that library is acquired.  There is some disagreement on whether this is generally true, or only true for libraries that are not portable.  

+1

I think something along the lines of autoconf (those lovely ./configure scripts you run when building GNU-style software from source) with published base 'checkers' (predicates, as I referred to them previously) would be great: a clear way for an application to declare a dependency, have the application server check those dependencies, then notify the administrator installing the package of anything missing.
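A minimal sketch of what such published checkers might look like, assuming a predicate is just a zero-argument callable returning True or False (all names here are hypothetical, not from any existing tool):

```python
# Hypothetical autoconf-style dependency "checkers" (predicates).
import importlib.util
import shutil


def python_package(name):
    """Predicate: is an importable Python package available?"""
    return lambda: importlib.util.find_spec(name) is not None


def executable(name):
    """Predicate: is a command available on the PATH?"""
    return lambda: shutil.which(name) is not None


def check(requirements):
    """Run each named predicate; return the names that failed."""
    return [name for name, predicate in requirements.items() if not predicate()]


# What an application might declare; the application server runs check()
# and reports the missing entries to the administrator.
requirements = {
    "lxml": python_package("lxml"),
    "git": executable("git"),
}
missing = check(requirements)
```

The application server stays in control of *how* a failed check is resolved; the application only declares *what* it needs.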

I've seen several Python libraries that include the C library code that they expose; while not so terribly efficient (i.e. you can't install the C library once, then share it amongst venvs), it is effective for small packages.

Larger libraries (i.e. globally or application-locally installed ones) would require the intervention of a systems administrator.

Something like a database takes this a bit further.  We haven't really discussed it, but I think this is where it gets interesting.  Silver Lining has one model for this.  The general rule in Silver Lining is that you can't have anything with persistence without asking for it as a service, including an area to write files (except temporary files?)

+1

Databases are slightly more difficult; an application could ask for:

:: (Very Generic) A PEP-249 database connection.

:: (Generic) A relational database connection string.

:: (Specific) A connection string to a specific vendor of database.

:: (Odd) A NoSQL database connection string.

I've been making heavy use of MongoDB over the last year and a half, but AFAIK each NoSQL database engine does its own thing API-wise. (Then there are ORMs on top of that, but passing a connection string like mysql://user:pass@host/db or mongo://host/db is pretty universal.)
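One reason such connection strings are so universal: they decompose uniformly with the standard library, whatever the backend. A quick illustration (the credentials and host are made up):

```python
# Connection strings like mysql://user:pass@host/db split into their
# parts with urllib.parse, regardless of the database vendor.
from urllib.parse import urlparse

dsn = urlparse("mysql://user:secret@dbhost:3306/appdb")
assert dsn.scheme == "mysql"
assert dsn.username == "user" and dsn.password == "secret"
assert dsn.hostname == "dbhost" and dsn.port == 3306
assert dsn.path.lstrip("/") == "appdb"
```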

It is my intention to write an application server that is capable of creating and securing databases on-the-fly. This would require fairly high-level privileges in the database engine, but would result in far more "plug-and-play" configuration. Obviously when deleting an application you will have the opportunity to delete the database and associated user.

I assume everyone agrees that an application can't write to its own files (but of course it could execfile something in another location).

+1; that _almost_ goes without saying. :) At the same time, an application server /must not/ require root access to do its work, thus no mandating of (real) chroots, on-the-fly user creation, etc.

There are ways around almost all security policies, but where possible setting the read-only flag (Windows) or removing write permission (chmod -w on POSIX systems) should be enough to prevent casual abuse.
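The POSIX half of that is a one-liner per file; a sketch of what the application server might do after extraction (root can still write regardless, hence "casual" abuse only):

```python
# Strip write permission from a file after install -- the chmod -w
# equivalent. Enough to stop an application casually writing to itself.
import os
import stat
import tempfile


def make_read_only(path):
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


# Demonstrate on a throwaway file.
fd, path = tempfile.mkstemp()
os.close(fd)
make_read_only(path)
os.chmod(path, stat.S_IWUSR)  # restore write so we can clean up
os.remove(path)
```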

I suspect there's some disagreement about how the Python environment gets setup, specifically sys.path and any other application-specific customizations (e.g., I've set environ['DJANGO_SETTINGS_MODULE'] in silvercustomize.py, and find it helpful).

Similar to Paste's "here" variable for INI files, some method for the application to define environment variables containing base-path references would be needed.

I've tossed out my idea of sharing dependencies, BTW, so a simple extraction of the zipped application into one package folder (linked in using a .pth file) with the dependencies installed into an app-packages folder in the path (like site-packages) would be ideal. At least, for me. ;)

Describing the scope of this, it seems kind of boring.  In, for example, App Engine you do all your setup in your runner -- I find this deeply annoying because it makes the runner the only entry point, and thus makes testing, scripts, etc. hard.

I agree; that's a short-sighted approach to an application container format. There should be some way to advertise a test suite and, for example, have the suite run before installation or during upgrade. (Rolling back the upgrade process thus far if there is a failure.)

My shiny end goal would be a form of continuous deployment: a git-based application which gets a post-commit notification, pulls the latest, runs the tests, rolls back on failure or fully deploys the update on success.

We would start with just WSGI.  Other things could follow, but I don't see any reason to worry about that now.  Maybe we should just punt on aggregate applications now too.  I don't feel like there's anything we would do that would prevent other kinds of runtime models (besides the starting point, container-controlled WSGI), and the places to add support for new things are obvious enough (e.g., something like Silver Lining's platform setting).  I would define a server with accompanying daemon processes as an "aggregate".

Since in my model the application server does not proxy requests to the instantiated applications (each running in its own process), I'm not sure I'm interpreting what you mean by an aggregate application properly.

If "my" application server managed Nginx or Apache configurations, dispatch to applications based on base path would be very easy to do while still keeping the applications isolated.

An important distinction to make, I believe, is application concerns and deployment concerns.  For instance, what you do with logging is a deployment concern.  Generating logging messages is of course an application concern.  In practice these are often conflated, especially in the case of bespoke applications where the only person deploying the application is the person (or team) developing the application.  It shouldn't be annoying for these users, though.  Maybe it makes sense for people to be able to include tool-specific default settings in an application -- things that could be overridden, but especially for the case when the application is not widely reused it could be useful.  (An example where Silver Lining gets it all backwards is I created a [production] section in app.ini when the very concept of "production" is not meaningful in that context -- but these kind of named profiles would make sense for actual application deployment tools.)

Having an application define default logging levels for different scopes would be very useful. The application server could take those defaults, and allow an administrator to modify them or define additional scopes quite easily.

There's actually a kind of layered way of thinking of this:

1. The first, maybe most important part, is how you get a proper Python environment.  That includes sys.path of course, with all the accompanying libraries, but it also includes environment description.

Virtualenv-like, with the application itself linked in via a .pth file (a la setup.py develop, allowing inline upgrades via SCM) and dependencies extracted from the zip distributable into an app-packages folder a la site-packages.

I don't install global Python modules on any of my servers, so the --no-site-packages option is somewhat unnecessary for me, but having something similar would be useful, too. Unfortunately, that one feature seems to require a lot of additional work.

In Silver Lining there are two stages -- first, set some environment variables (both general ones like $SILVER_CANONICAL_HOST and service-specific ones like $CONFIG_MYSQL_DBNAME), then get sys.path proper, then import silvercustomize, by which an environment can do any more customization it wants (e.g., set $DJANGO_SETTINGS_MODULE)

Environment variables are typeless (raw strings) and thus less than optimal for sharing rich configurations.

Host names depend on how the application is mounted, and a single application may be mounted to multiple domains or paths, so utilizing the front end web server's rewriting capability is probably the best solution for that.

What about multiple database connections? Environment variables are also not so good for repeated values.

A /few/ environment variables are a good idea, though:

:: TMPDIR — when don't you need temporary files?

:: APP_CONFIG_PATH — the path to a YAML file containing the real configuration.

The configuration file would even include a dict-based logging configuration routing all messages to the parent app server for final delivery, removing the need for per-app logging files, etc.
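To make that concrete, the shape of such a dict-based logging configuration, applied with the standard `logging.config.dictConfig` machinery. Routing to a parent app-server process would just swap the handler class; the handler, logger names, and levels below are illustrative:

```python
# A dict-based logging configuration of the kind a config file would
# carry; applied via the stdlib dictConfig machinery.
import logging
import logging.config

LOGGING = {
    "version": 1,
    "formatters": {"plain": {"format": "%(name)s %(levelname)s %(message)s"}},
    "handlers": {
        # The app server would substitute its own handler here to collect
        # messages centrally instead of per-application log files.
        "default": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "root": {"handlers": ["default"], "level": "INFO"},
    # Per-scope defaults the application ships; the administrator overrides.
    "loggers": {"myapp.db": {"level": "WARNING"}},
}

logging.config.dictConfig(LOGGING)
```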

2. Define some basic generic metadata.  "app_name" being the most obvious one.

The standard Python setup metadata is pretty good:

:: Application title.
:: Application (package) name.
:: Short description.
:: Long description / documentation.
:: Author information.
:: License.
:: Source information (URL, download URL).
:: Dependencies.
:: Entry point-style hooks. (Post-install, pre/post upgrade, pre-removal, etc.)

Likely others.
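For the sake of discussion, those fields might land in something like the app.ini Ian mentions; this fragment is purely illustrative, and every field name is an assumption rather than a spec:

```ini
; hypothetical application metadata, mirroring the list above
[application]
title = Example Wiki
name = examplewiki
summary = A small wiki application.
description = docs/README.rst
author = Alice <alice@example.com>
license = MIT
url = http://example.com/wiki
dependencies = lxml, sqlalchemy
post-install = examplewiki.hooks:post_install
```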

3. Define how to get the WSGI app.  This is WSGI specific, but (1) is *not* WSGI specific (it's only Python specific, and would apply well to other platforms)

I could imagine there would be multiple "application types":

:: WSGI application. Define a package dot-notation entry point to a WSGI application factory.

:: Networked daemon. This would allow deployment of Twisted services, for example. Define a package dot-notation entry point to the 'main' callable.

Again, there are likely others, but those are the big two. In both of these cases the configuration (loaded automatically) could be passed as a dict to the callable.

4. Define some *web specific* metadata, like static files to serve.  This isn't necessarily WSGI or even Python specific (not that we should bend backwards to be agnostic -- but in practice I think we'd have to bend backwards to make it Python-specific).

Explicitly defining the paths to static files is not just a good idea, it's The Law™.

5. Define some lifecycle metadata, like update_fetch.  These are generally commands to invoke.  IMHO these can be ad hoc, but exist in the scope of (1) and a full "environment".  So it's not radically different than anything else the app does, it's just we declare specific times these actions happen.

Script name, dot-notation callable, or URL. I see those as the 'big three' to support. Using a dot-notation callable has the same benefit as my comments to #3.

The URL would be relative to wherever the application is mounted within a domain, of course.

6. Define services (or "resources" or whatever -- the name "resource" doesn't make as much sense to me, but that's bike shedding).  These are things the app can't provide for itself, but requires (or perhaps only wants; e.g., an app might be able to use SQLite, but could also use PostgreSQL).  While the list of services will increase over time, without a basic list most apps can't run at all.  We also need a core set as a kind of reference implementation of what a fully-specified service *is*.

I touched on this up above; any DBAPI compliant database or various configuration strings. (I'd implement this as a string-like object with accessor properties so you can pass it to SQLAlchemy straight, or dissect it to do something custom.)
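A minimal sketch of that string-like object: it *is* the connection string (so it can be handed to SQLAlchemy and friends unchanged) while also exposing the parsed parts as properties. The class and property names are my own invention:

```python
# A connection string that is a real str, with accessor properties for
# the parsed components.
from urllib.parse import urlparse


class DSN(str):
    @property
    def _parts(self):
        return urlparse(self)

    @property
    def scheme(self):
        return self._parts.scheme

    @property
    def host(self):
        return self._parts.hostname

    @property
    def database(self):
        return self._parts.path.lstrip("/")


dsn = DSN("mysql://user:secret@dbhost/appdb")
assert isinstance(dsn, str)  # passes straight through to SQLAlchemy
assert dsn.scheme == "mysql"
assert dsn.database == "appdb"
```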

More below.

7. In Silver Lining I've distinguished active services (like a running database) from passive resources (like an installed binary library).  I don't see a reason to conflate these, as they are so very different.  Maybe this is part of why "resource" strikes me as an odd name for something like a database.

You hit the terminology perfectly: active services (such as databases) are just that, services. Installed binary libraries are resources. :)

So... there's kind of some thoughts about process. 

Good stuff.

        — Alice.


_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig