Phillip J. Eby wrote:
At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
1. Many felt the existing dependency resolver was not correct.  They wanted a
    full tree traversal resulting in an intersection of all restrictions,
    instead of the first-acceptable-solution approach taken now, which can
    result in top-level dependencies not being enforced upon lower levels. The
    latter is faster, however.  One solution would be to make the resolver
    pluggable.

Patches welcome, on both counts. Personally, Bob and I originally wanted a full-tree intersection, too, but it turned out to be hairier to implement than it seems at first. My guess is that none of the people who want it, have actually tried to implement it without a factorial or exponential O(). But that doesn't mean I'll be unhappy if somebody succeeds. :)

I think we'd make significant progress by just intersecting the dependencies we know about as we progress through the dependency tree. For example, if A requires B==2 and C==3, and if B requires C>=2,<=4, then at the time we install A we'd pick C==3 and also at the time we install B we'd pick C==3, as opposed to the current scheme, which would choose C==4 in the latter case. This would allow dependent projects (think applications here) to better control the versions of the full set of libraries they use. Things would still fail (like they do now) if you ran across dependencies that had no intersection or if you encountered a new requirement after the target project was already installed.
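To make the A/B/C example concrete, here is a minimal sketch of "intersect what we know so far" in Python. It is not how easy_install is implemented; the integer version numbers, the `resolve` and `satisfies` names, and the dictionary layout are all assumptions made for illustration.

```python
import operator

# Comparison operators understood by this toy requirement format.
OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def satisfies(version, constraints):
    """Return True if an integer version meets every (op, bound) pair."""
    return all(OPS[op](version, bound) for op, bound in constraints)


def resolve(root, requirements, available):
    """Walk the tree from root, accumulating constraints per project.

    requirements: (project, version) -> {dependency: [(op, bound), ...]}
    available:    project -> list of released versions (hypothetical data)
    """
    known = {}    # project -> every constraint seen so far
    chosen = {}   # project -> version already picked ("installed")
    pending = [root]
    while pending:
        project, version = pending.pop(0)
        deps = requirements.get((project, version), {})
        for dep, constraints in deps.items():
            known.setdefault(dep, []).extend(constraints)
            candidates = [v for v in available[dep] if satisfies(v, known[dep])]
            if not candidates:
                raise LookupError("no version of %s satisfies %r" % (dep, known[dep]))
            if dep in chosen:
                # A new requirement arrived after the project was picked.
                if not satisfies(chosen[dep], known[dep]):
                    raise LookupError("%s %s is already installed" % (dep, chosen[dep]))
                continue
            chosen[dep] = max(candidates)
            pending.append((dep, chosen[dep]))
    return chosen


requirements = {
    ("A", 1): {"B": [("==", 2)], "C": [("==", 3)]},
    ("B", 2): {"C": [(">=", 2), ("<=", 4)]},
}
available = {"B": [1, 2, 3], "C": [2, 3, 4]}

with_pin = resolve(("A", 1), requirements, available)
without_pin = resolve(("B", 2), requirements, available)
```

Starting from A, the pin on C is already known when B's wider range is processed, so C stays at 3. Starting from B alone, the widest match (4) is picked, which is the behaviour described for the current scheme.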


If you really wanted to do a full-tree intersection, it seems to me that the problem is detecting all the dependencies without having to spend significant time downloading/building in order to find them out. This could be solved by simply extending the cheeseshop interface to export the set of requirements outside of the egg / tarball / etc. We've done this for our own egg repository by extracting the appropriate meta-data files out of EGG-INFO and putting them into a separate file. This info is also useful for users as it gives them an idea of how much *new* stuff is going to be installed (a la yum, apt-get, etc.)
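As a rough sketch of the extraction step (not our actual repository tooling), the requirements can be pulled out of a zipped egg with nothing but the standard library. The `read_requirements` name and the in-memory stand-in egg are assumptions for illustration; `EGG-INFO/requires.txt` is the file setuptools writes for declared requirements.

```python
import io
import zipfile


def read_requirements(egg_file):
    """Return the text of EGG-INFO/requires.txt from a zipped egg, or ''."""
    with zipfile.ZipFile(egg_file) as egg:
        try:
            return egg.read("EGG-INFO/requires.txt").decode("utf-8")
        except KeyError:
            return ""  # the egg declares no requirements


# Build a tiny stand-in egg in memory so the sketch runs without downloads.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as egg:
    egg.writestr("EGG-INFO/PKG-INFO", "Name: A\nVersion: 1.0\n")
    egg.writestr("EGG-INFO/requires.txt", "B==2\nC==3\n")
buffer.seek(0)

requirements_text = read_requirements(buffer)
```

The resulting text could then be published next to the egg, so a resolver can read the requirements without fetching the distribution itself.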


In other words, we attempt to achieve heuristically what's being proposed to do algorithmically. And my guess is that whatever cases the heuristic is failing at, would probably not be helped by an algorithmic approach either. But I would welcome some actual data, either way.

With our ETS projects, we've run into problems with the current heuristic. Perhaps we just don't know how to make it work like we want? We have a set of projects that we want to be individually installable (to the extent that we limit cross-project dependencies) but we also want to make it easy to install the complete set. We use a meta-egg for the latter. Its purpose is only to specify the exact versions of each project that have been explicitly tested to work together -- you could almost think of it as a source control system tag. Whereas on the individual projects, we explicitly want to ensure that people get the latest possible release of each required API so the version requirements are wider here. This setup causes problems whenever we release new versions of projects because it seems easy_install ignores the meta-egg exact versions when it gets down into a project and comes across a wider cross-project dependency. We ended up having to give up on the ranges in the cross-project dependencies and synchronize them to the same values in the meta-egg dependencies. There are numerous side-effects of this that we don't like but we haven't found a way around it.

Again, though, patches are welcome. :) (Specifically, for the trunk; I don't see a resolver overhaul as being suitable for the 0.6 stable branch.)

We're planning to pursue this (for the above mentioned strategy) as soon as we work ourselves out of a bit of a backlog of other things to do.



2. People want a solution for the handling of documentation.  The distutils
    module has had commented out sections related to this for several years.

As with so many other things, this gets tossed around the distutils-sig every now and then. A couple of times I've thrown out some options for how this might be done, but then the conversation peters out around the time anybody would have to actually do some work on it. (Me included, since I don't have an itch that needs scratching in this area.)

In particular, if somebody wants to come up with a metadata standard for including documentation in eggs, we've got a boatload of hooks by which it could be done. Nothing's stopping anybody from proposing a standard and building a tool, here. (e.g. using the setuptools command hook, .egg-info writer hook, etc.)

Enthought has started an effort (it's currently one of two things in our ETSProjectTools project at https://svn.enthought.com/svn/enthought/ETSProjectTools/trunk) and we're experimenting with our solution before proposing it as a patch. We'd love some more help if anyone wants to participate.


3. A more flexible internal handling of the different types of files is needed.
    Currently the code, data, lib, etc. files are aggregated at build time and
    people would like them to be kept separate until install/packaging time.

I don't know what this means, exactly.

A number of projects want to provide various types of files besides code in their distributable, and they'd like these to end up in standard locations for that type of file. Think documentation, sample data, web templates, configuration settings, etc. Each of these should be treated differently at installation time depending on platform. On *nix, docs should go in /usr/share/doc whereas we might need to create a C:\Python2.5\docs on Windows. With sample data and templates, you probably just want it accessible outside of the zipped egg so users can easily look at it, add to it, edit it, etc. Configuration settings should be installed with some defaults into a standard configuration directory like /etc on *nix, etc.
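A small sketch of what "treated differently depending on platform" could look like, assuming a hypothetical table that maps a file category to a destination. Only `/usr/share/doc`, `/etc`, and the Windows docs directory come from the paragraph above; the `DESTINATIONS` table, the `destination` function, and the fallback rule are made up for illustration.

```python
import ntpath
import os
import posixpath

# Hypothetical per-platform destinations for each category of file.
DESTINATIONS = {
    "posix": {"doc": "/usr/share/doc", "config": "/etc"},
    "nt": {"doc": r"C:\Python2.5\docs"},
}


def destination(kind, project, platform=os.name, fallback="."):
    """Return the directory where files of this kind would be installed.

    Categories without a platform-wide location (sample data, templates)
    land under the fallback directory, outside the zipped egg.
    """
    pathmod = ntpath if platform == "nt" else posixpath
    base = DESTINATIONS.get(platform, {}).get(kind)
    if base is None:
        base = pathmod.join(fallback, kind)
    return pathmod.join(base, project)


doc_posix = destination("doc", "foo", platform="posix")
doc_windows = destination("doc", "foo", platform="nt")
templates = destination("template", "foo", platform="posix", fallback="/tmp/x")
```

The point is only that the project declares the category, and the installer (or an OS packager) decides the location.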

Basically the issue is that it needs to be easier to include different sets of files into an egg for different actions to be taken during installation or packaging into an OS-specific distribution format.


The other is the use of a single .pth file to control the list of activated
    packages.  Those who produce distributions would prefer a magic directory
    into which links to distributions could be dropped, similar to the current
    best practices for Linux, with /etc/conf.d/, /etc/profile.d/,
    /etc/xinetd.d/ and so forth.

site-packages is that directory, and has been since long before setuptools. Just drop uniquely-named .pth files there, and you're good to go.
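For anyone who hasn't tried it, the mechanism can be sketched with the standard `site` module. A temporary directory stands in for site-packages here, and the `myproject` names are hypothetical; `site.addsitedir` processes `.pth` files in a directory the same way the interpreter does for site-packages at startup.

```python
import os
import site
import sys
import tempfile

# Stand-ins for site-packages and for one installed distribution.
site_dir = tempfile.mkdtemp()
dist_dir = os.path.join(site_dir, "myproject-1.0")  # hypothetical name
os.mkdir(dist_dir)

# Each line of a .pth file names a directory to add to sys.path;
# relative entries are resolved against the directory holding the file.
with open(os.path.join(site_dir, "myproject.pth"), "w") as pth:
    pth.write("myproject-1.0\n")

# Process the directory and every .pth file found in it.
site.addsitedir(site_dir)

activated = os.path.abspath(dist_dir) in [os.path.abspath(p) for p in sys.path]
```

Entries that do not exist on disk are skipped, so a stale line in a `.pth` file is harmless but does nothing.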

But the docs for easy_install claim that the list of active eggs is maintained in easy-install.pth. Also, if I create my own .pth file, and the user tries to update my version to a new one, will the easy_install tool modify my .pth file to remove the mention of the old version from my sys.path and put the new version in the same .pth file? Or will it now be listed in both places? Or will it only be in easy-install.pth?

7. Many wanted the ability to install files anywhere in the install tree and
    not just under the Python package.  Under distutils this was possible but
    it was removed in setuptools for security reasons.

It wasn't security, it was manageability. Egg-based installation means containment (analogous to GNU stow), and therefore portability and disposability of plugins. (Which again is what eggs were really developed for in the first place.)

Yes, but as you've already pointed out, they've escaped into a larger ecosystem and this restriction is a severe limitation -- leading to significant frustration. Especially as projects evolve and want to do something more complex than simply install pure Python code. Here at Enthought, we use and ship a number of projects that have extensions and thus dynamic libraries that need to either be modified during installation to work from the user's installed location, or copied elsewhere on the system to avoid the need to modify (which we also can't do via an egg install) env variables, registries, etc. We'd also love to be able to ship end-user enterprise-scale applications via eggs so that bug fixes and updates don't require downloading a monolithic 100MB+ installer. But doing that requires the ability to update desktop icons, menus, etc. which we also can't do automatically via an egg.

If you don't want the burden on setuptools to support, much less track, all these options, then perhaps it could just support automatic execution of a post-install script (and pre-uninstall script if uninstallation ever happens) that allows individual project developers to do what they need to do? Let the burden of describing how those things happen and how to uninstall/relocate/update them fall to the provider of the projects that do them.

Also, IIUC, stow only tries to "contain" the hard files. It puts links in multiple standard locations (for man pages, executables, libraries, etc.) If setuptools supported these options, I don't think there'd be any discussion here except for things like "how do I extend the set of things the tool supports so that my foo-type files get linked into the standard /os/path/to/foo for the X os?"


  Custom code can still
    be written to do this explicitly but this is not popular.

No kidding. :) Current best practice is to include a script or module in the package that can install other files to a designated location. Personally, though, I tend to view applications and libraries that target specific install locations to be overreaching their bounds, and stepping into sysadmin territory. Give me the tools to install the data, don't just dump it somewhere on my system where *you* think it should go, in other words.

I should have read ahead. This sounds close to what I've been describing except that this leads me to picture a script that prompts for install locations and allows the user to customize the destinations rather than one that assumes everything goes in a standard place. I'm all for this, and the continuation of the ability to install an egg into a user-environment vs. a system-environment. The only thing missing here is the ability for the installer to automatically run that script so that installation isn't a disjointed, two-step manual process that a user is prone to forget to complete. One of the features of Enthought's Enstaller extension to easy_install was that it looks for a post_install.py script in EGG-INFO and if one is found, runs it. I would think that getting this into setuptools would be a significant step forward but I believe you previously rejected that idea. We'll take a stab at creating a patch for you if you're more receptive to that idea now. Just let me know.
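Roughly, the hook amounts to something like the sketch below. This is not Enstaller's actual code: the `run_post_install` name and the `EGG_DIR` global handed to the script are assumptions for illustration, and it only handles an unpacked egg directory.

```python
import os
import runpy
import tempfile


def run_post_install(egg_dir):
    """Run EGG-INFO/post_install.py from an unpacked egg, if present.

    Returns True when a script was found and executed.
    """
    script = os.path.join(egg_dir, "EGG-INFO", "post_install.py")
    if not os.path.isfile(script):
        return False
    # Execute the script as __main__, telling it where the egg lives.
    runpy.run_path(script, init_globals={"EGG_DIR": egg_dir}, run_name="__main__")
    return True


# Build a throwaway unpacked egg whose script just drops a marker file.
egg_dir = tempfile.mkdtemp(suffix=".egg")
os.mkdir(os.path.join(egg_dir, "EGG-INFO"))
with open(os.path.join(egg_dir, "EGG-INFO", "post_install.py"), "w") as f:
    f.write(
        "import os\n"
        "open(os.path.join(EGG_DIR, 'installed.marker'), 'w').close()\n"
    )

ran = run_post_install(egg_dir)
```

A matching pre-uninstall script could be looked up the same way, leaving the details of what gets installed or removed to the project that ships the script.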


On the other hand, I've been puzzling over how to handle legitimate post-install features. On Windows, both wx and pywin32 have a real need to do some actual "install" operations. Some is just copying files, but pywin32 also has to do some registry stuff. I don't know how to allow just what's sensible, without opening up a huge can of worms, though.

I think there are lots of situations that are legitimate (projects with extensions, projects that want to put icons on the desktop or in menus, projects that need to interact with a registry, projects that want to put configuration information somewhere other than in a zip file in a site-packages dir, etc.) I think we should worry less about preventing developers from shooting themselves in the foot and more about ensuring that they can hunt for food for their survival. We can always tighten things down after seeing the use cases that develop, right?


-- Dave
_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
