On Apr 11, 2011, at 2:48 PM, Alice Bevan–McGregor wrote:

> On 2011-04-11 00:53:02 -0700, Eric Larson said:
> 
>> Hi,
>> On Apr 10, 2011, at 10:29 PM, Alice Bevan–McGregor wrote:
>>> However, the package format I describe in that gist does include the source 
>>> for the dependencies as "snapshotted" during bundling.  If your application 
>>> is working in development, after snapshotting it /will/ work on sandbox or 
>>> production deployments.
>> I wanted to chime in on this one aspect b/c I think the concept is somewhat 
>> flawed. If your application is working in development and you "snapshot" the 
>> dependencies, that is no guarantee that things will work in production. The 
>> only way to say that a snapshot or bundle is guaranteed to work is if you 
>> snapshot the entire system and make it available as a production system.
> 
> `pwaf bundle` bundles the source tarballs, effectively, of your application 
> and dependencies into a single file.  Not unlike a certain feature of pip.
> 
> And… wait, am I the only one who uses built-from-snapshot virtual servers for 
> sandbox and production deployment?  I can't be the only one who likes things 
> to work as expected.
> 
>> Using a real world example, say you develop your application on OS X and you 
>> deploy on Ubuntu 8.04 LTS. Right away you are dealing with two different 
>> operating systems with entirely different system calls. If you use something 
>> like lxml and simplejson, you have no choice but to repackage or install 
>> from source on the production server.
> 
> Installing from source is what I was suggesting.  Also, Ubuntu on a server?  
> All your `linux single` (root) are belong to me.  ;^P
> 

I realize your intent was to install from source, and I'm saying that is the 
problem. Not from the standpoint of a Python web application, of course, but 
from the standpoint of a Python web application working within the context of 
a larger system. A sandbox is nice b/c it gives you a place to do whatever you 
want and be somewhat oblivious to the rest of the world. My point is not that 
it's incorrect to install Python packages from source, but that assuming all 
dependencies should be installed from source is flawed. Just b/c a C extension 
needs a compiler and development headers to build, it doesn't mean the same 
things are necessary at run time. It is generally a good idea to keep compilers 
off of production machines. 
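
To make the build-time vs. run-time split concrete, here is a rough sketch. The 
package names are Debian/Ubuntu-style examples for lxml and purely illustrative; 
exact names vary by distro:

# Purely illustrative: the same C extension's system needs, split by phase.
# Only the "run" set should ever be required on a production box.
LXML_SYSTEM_DEPS = {
    "build": ["gcc", "libxml2-dev", "libxslt1-dev"],  # compiler + headers, build host only
    "run": ["libxml2", "libxslt1.1"],                 # shared libraries, production host
}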

>> While it is fair to say that generally you could avoid packages that use C, 
>> both lxml and simplejson are rather obvious choices for web development.
> 
> Except that json is built-in in 2.6 (admittedly with fewer features, but I've 
> never needed the extras) and there are alternate xml parsers, too.
> 

Ok, you are correct that there are other parsers and that the json module is 
built in. But we've already made a conscious decision to use lxml and simplejson 
instead of other tools (including the json module) because the alternatives are 
slower. These compiled packages have been very frustrating to deal with in 
production because they need to be compiled on the server. Along similar lines, 
we have our own Python apps that use C, and these are similarly difficult to 
deploy because our deployment system is built on setuptools and eggs (no zip). 
That is generally not a bad thing and speaks to the quality of Python as a 
platform, but the pain of having a very Python-centric system is substantial. 
My point is that while it is very convenient to install Python packages and let 
pip (and setuptools) handle our dependencies, that approach doesn't offer a way 
to interact with the host system that is housing our sandbox. 

>> It sounds like Ian doesn't want to have any build steps, which I think is a 
>> bad mantra. A build step lets you prepare things for deployment. A 
>> deployment package is different from a development package, and mixing the 
>> two by forcing builds on the server seems like asking for trouble.
> 
> I'm having difficulty following this statement: build steps good, building on 
> server bad?  So I take it you know the exact target architecture and have 
> cross-compilers installed in your development environment?  That's not 
> practical (or simple) at all!
> 

I'd think it is pretty bad practice to release software to production machines 
with no assumptions made about the target machine. 

It doesn't have to be impractical. All it takes is an acknowledgement that the 
host system might need to supply some requirements, and a way of stating those 
requirements that makes sense for your system. That is it. A list of package 
names installable via some system-level package manager might be more than 
enough. URLs to source packages might be fine. The idea is that we as Python 
application developers can make the lives of others who work with the system 
easier by providing a mechanism for communicating system-level dependencies. 
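
As a rough sketch of what that communication could look like (the file name, 
keys and layout here are invented for illustration, not an existing standard), 
a bundle might simply ship a small manifest that a sysadmin or a tool can read:

# sysdeps.json is a hypothetical manifest shipped alongside the application, e.g.:
#   {"packages": ["libxml2", "libxslt1.1"],
#    "sources": ["http://mydomain.com/canonical/repo/dependency.tar.gz"]}
import json

with open("sysdeps.json") as manifest:
    deps = json.load(manifest)

print("System packages the host must provide: " + ", ".join(deps.get("packages", [])))
print("Source tarballs the host must fetch: " + ", ".join(deps.get("sources", [])))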

>> I'm not saying this is what you (Alice) are suggesting, but rather pointing 
>> out that as a model, depending on virtualenv + pip's bundling capabilities 
>> seems slightly flawed.
> 
> Virtualenv (or something utilizing a similar Python path 'chrooting' 
> capability) and pip using the extracted "deps" as the source for "offline" 
> installation actually seems quite reasonable to me.  The benefit is a known 
> set of working packages (i.e. specific version numbers, tested in 
> development) and the ability to compile C extensions in-place.  (Because sure 
> as hell you can't reliably compile them before-hand if they have any form of 
> system library dependency!)
> 

I understand that this is not always that easy, so I agree it is not something 
I would prescribe out of the gate. But I would make the system agnostic as to 
whether or not you have to compile things on the server. Operating system 
vendors have all conquered the problem of releasing software to machines with 
far more variety than you'll ever see in a single production environment. It 
isn't an impossible or even particularly difficult idea to support. That said, 
I'm not suggesting creating the tools, or requiring that pre-built binary 
Python modules be delivered. Instead, my point is to make sure that delivering 
them is possible and supported. 

>> I think it should offer hooks for running tests, learning basic status and 
>> allow simple configuration for typical sysadmin needs (logging via syslog, 
>> process management, nagios checks, etc.). Instead of focusing on what format 
>> that should take in terms of packages, it seems more effective to spend time 
>> defining a standard means of managing WSGI apps and piggyback or plain old 
>> copy some format like RPMs or dpkg.
> 
> RPMs are terrible, dpkg is terrible.  Binary package distribution, in 
> general, is terrible.  I got the distinct impression at PyCon that binary 
> distributable .eggs were thought of as terrible and should be phased out.
> 

RPMs and dpkg packages are both just tar files. You untar them at the root of 
the file system and the files in the tar are "installed" in the correct place 
on the file system. Pip does the same basic thing, except that you are 
untarring into $prefix/lib/ instead. I think that model is excellent, and I 
said to copy it if need be. My only point is to recognize that you are 
installing the package into a guest sandbox, and to include some facility for 
communicating the dependencies the host system might need to meet. 
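
A rough illustration of that analogy (the file name and prefixes are made up; 
nothing here is a real installer):

# Both models boil down to "unpack an archive under a prefix"; only the prefix differs.
import tarfile

archive = tarfile.open("mypkg.tar.gz")
for prefix in ("/", "/srv/app/lib/python2.6/site-packages/"):
    print("--- installing under %s would place:" % prefix)
    for member in archive.getnames():
        print(prefix + member)  # rpm/dpkg-style at "/", pip/virtualenv-style under the sandbox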

> Also, nobody so far seems to have noticed the centralized logging management 
> or daemon management lines from my notes.
> 
>> Just my .02. Again, I haven't offered code, so feel free to ignore me. But I 
>> do hope that others who suspect this model of putting source on the server 
>> is a problem pipe up. If I were to add a requirement, it would be that 
>> Python web applications help system administrators become more effective. 
>> That means finding consistent ways of deploying apps that play well with 
>> other languages / platforms. After all, keeping a C compiler on a public 
>> server is rarely a good idea.
> 
> If you could demonstrate a fool-proof way to install packages with system 
> library dependencies using cross-compilation from a remote machine, I'm all 
> ears.  ;)
> 

pre-install-hooks: [
  "apt-get install libxml2",  # the person deploying the package assumes apt-get is available
  "run-some-shell-script.sh",  # the shell script might do the following on a list of URLs
  "wget http://mydomain.com/canonical/repo/dependency.tar.gz && tar zxf dependency.tar.gz && rm dependency.tar.gz"
]

Does that make some sense? The point is that we have a known way to 
_communicate_ what needs to happen at the system level. I agree that there 
isn't a fool-proof way. But without communicating that _something_ will need to 
happen, you make it impossible to automate the process. You also make it very 
difficult to roll back if there is a problem, or to upgrade later. And you make 
it impossible to recognize that the library your C extension uses will break 
some other software on the system. Sure, you could use virtual machines, but if 
we don't want to tie ourselves to RPM or dpkg, then why tie yourself to VMware, 
VirtualBox, Xen or any of the other hypervisors and cloud vendors? 
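
To show the kind of automation such a declaration makes possible, here is a 
minimal sketch; the hook strings are just the ones from the example above, and 
how a failure triggers a rollback is left to the deployment tool:

# Minimal sketch: a deployment tool reads the declared hooks and runs them in
# order, stopping (and giving the operator a chance to roll back) on failure.
import subprocess

pre_install_hooks = [
    "apt-get install libxml2",
    "run-some-shell-script.sh",
]

for hook in pre_install_hooks:
    status = subprocess.call(hook, shell=True)
    if status != 0:
        raise SystemExit("pre-install hook failed: %r -- abort or roll back here" % hook)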

I hope I've made my point clearer. The idea is not to implement everything. But 
just as setuptools has provided helpful hooks, like entry points, that 
facilitate functionality, I'm suggesting that if this idea moves forward, 
similar hooks be made available to help the host systems that will house our 
sandboxes. 

Eric


>       — Alice.