Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns
On Wed, Mar 19, 2008 at 6:15 PM, Jeff Rush [EMAIL PROTECTED] wrote:

> Frankly I'd like to see setuptools exploded, with those parts of general
> use folded back into the standard library, the creation of a set of
> non-implementation-specific documents of the distribution formats and
> behavior, leaving a small core of one implementation of how to do it and
> the door open for others to compete with their own implementation.

If I may hazard an opinion, I second this sentiment. In my use of setuptools, it definitely feels like it wants to be three (mostly) independent projects:

1) The project that standardizes the concept now embodied by eggs and provides the basic machinery to work with them (find them, introspect metadata, import them, etc.), but not install them per se. This is generally useful as a common plug-in framework, if nothing else. Currently, this run-time support functionality lives in pkg_resources.

2) The tool for building eggs (but not installing them per se). Currently this is the setuptools extension to distutils.

3) The tool for installing eggs (or their equivalent) and (optionally) their dependencies (optionally using remote hosts), as well as uninstalling them. Currently this is easy_install (well, except for uninstalling, which is understandably quite difficult).

Finally, there is the fourth and already separate project of PyPI:

4) The hosted repository of publicly available eggs (or their equivalent). This should export any metadata required to resolve dependencies relatively cheaply.

Breaking them apart will make it easier to have two separate projects for building eggs (or their equivalents) -- one based on distutils and the other replacing it. Even more importantly, it will make it possible for multiple installers to be developed that scratch particular itches. Hopefully one would eventually emerge as the de facto standard, but this will ultimately be decided by community adoption.
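[To illustrate point 1: the kind of runtime metadata introspection pkg_resources provides can be sketched without pkg_resources itself, since egg PKG-INFO metadata is plain RFC 822 headers. The metadata content below is invented for illustration.]

```python
from email.parser import Parser

# Illustrative PKG-INFO content; field names follow PEP 241/314.
PKG_INFO = """\
Metadata-Version: 1.0
Name: example-plugin
Version: 0.1
Summary: Illustrative metadata only
"""

def egg_metadata(text):
    """Return an egg's PKG-INFO headers as a dict."""
    return dict(Parser().parsestr(text).items())

meta = egg_metadata(PKG_INFO)
print(meta["Name"], meta["Version"])  # example-plugin 0.1
```

[This is roughly what pkg_resources does when a plug-in framework asks an installed distribution for its name and version.]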
Alex

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns
Phillip J. Eby wrote:

> At 03:57 AM 3/19/2008 -0500, Jeff Rush wrote:
>> I'd be willing to help out, and keep a carefully balanced hand in what
>> is accepted. I'm not sure exactly how to go about such a handoff though.
>
> My guess is that we need a bug/patch tracker, and a few people to
> review, test, and apply. Maybe a transitional period during which I just
> say yea or nay and let others do the test and apply, before opening it
> up entirely. That way, we can perhaps solidify a few principles that I'd
> like to have stay in place. (Like no arbitrary post-install code hooks.)

+1 to blessing more people to commit. +1 to the transition period idea. These two ought to enable things to move a bit quicker than taking a year to accept a patch. :-)

In addition to a bug tracker and patch manager, it seems like a wiki to help document some of these solidified principles and other notes would be a good thing. (Like: a patch should almost always include at least one test, possibly more.)

Given that the source for setuptools is in the python.org svn, couldn't we just use the python.org roundup and wiki for these facilities? Though looking at the list of components, it seems that things in the sandbox generally aren't tracked in this infrastructure. In that case, I'm sure we could use sf, launchpad, or some such external provider. Enthought could even host this stuff.

Like Jeff Rush, I'm also willing to help out as both a writer and reviewer of patches. As you can see from my earlier posts, there are a number of things (besides running an arbitrary post-install script) that we'd like to be able to get into the codebase.

-- Dave
Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns
Marius Gedminas wrote:

> On Mon, Mar 17, 2008 at 08:37:30PM -0400, Phillip J. Eby wrote:
>> At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
>>> People also want a greater variety of file_finders to be included
>>> with setuptools. Instead of just CVS and SVN, they want it to
>>> comprehend Mercurial, Bazaar, Git and so forth.
>>
>> Did you point them to the Cheeseshop? There are plugins already
>> available for all the systems you mentioned, plus Darcs and Monotone.
>> If you mean included as in bundled, this doesn't make a whole lot of
>> sense to me.
>
>>> They knew there were plugins out there, of various quality and
>>> availability, but wanted them bundled. ;-) It's a pain to track them
>>> down. Perhaps the RPM format should be broken out from setuptools, as
>>> the inclusion of some formats leads them to believe the set is just
>>> incomplete, not intentionally sparse.
>>
>> I'd think that if you're using setuptools as a developer (the only
>> reason you need the file finders, since source distributions include a
>> prebuilt manifest), you'd not have a problem saying easy_install
>> setuptools-git or adding a setup_requires='setuptools-git' line to
>> your setup.py. (Although the latter would only be needed for
>> *development*, not deployment.)
>
> setup_requires looks like a solution, but it requires extra attention
> from the developers who write the setup.py. Writing a setup.py is
> already quite complicated -- I usually end up copying an existing one
> and modifying it.

As a compromise -- making new formats easily available but not bundled, and requiring no special action within setup.py -- setuptools could treat --formats=dpkg as an implicit setup_requires and pull it from PyPI. And the --list-formats option could query PyPI for the possibilities, just as --list-classifiers does today. It would require a few standards in keywording/classifying those format eggs, but we already need those standards for other projects, such as locating recipes for buildout and plugins for trac.
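[The implicit-setup_requires compromise proposed above could be sketched as a simple mapping from requested formats to plugin requirements. The `setuptools-<format>` naming convention and the bundled-format set below are assumptions for illustration, not an existing standard.]

```python
# Formats setuptools already handles itself (illustrative, not exhaustive).
BUNDLED_FORMATS = {"sdist", "bdist_egg", "bdist_rpm", "bdist_wininst"}

def implied_setup_requires(formats):
    """Map each non-bundled --formats value to a plugin requirement
    that could be pulled automatically from PyPI."""
    return ["setuptools-%s" % f for f in formats if f not in BUNDLED_FORMATS]

print(implied_setup_requires(["bdist_egg", "dpkg"]))  # ['setuptools-dpkg']
```

[The developer never touches setup.py; only users who actually request an exotic format pay the cost of fetching its plugin.]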
-Jeff
Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns
Phillip J. Eby wrote:

> At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
>> 1. Many felt the existing dependency resolver was not correct. They
>> wanted a full tree traversal resulting in an intersection of all
>> restrictions, instead of the first-acceptable-solution approach taken
>> now, which can result in top-level dependencies not being enforced upon
>> lower levels. The latter is faster, however. One solution would be to
>> make the resolver pluggable.
>
> Patches welcome, on both counts. Personally, Bob and I originally wanted
> a full-tree intersection too, but it turned out to be hairier to
> implement than it seems at first. My guess is that none of the people
> who want it have actually tried to implement it without a factorial or
> exponential O(). But that doesn't mean I'll be unhappy if somebody
> succeeds. :)

I think we'd make significant progress by just intersecting the dependencies we know about as we progress through the dependency tree. For example, if A requires B==2 and C==3, and if B requires C>=2,<=4, then at the time we install A we'd pick C==3, and also at the time we install B we'd pick C==3 -- as opposed to the current scheme, which would choose C==4 in the latter case. This would allow dependent projects (think applications here) to better control the versions of the full set of libraries they use. Things would still fail (as they do now) if you ran across dependencies that had no intersection, or if you encountered a new requirement after the target project was already installed.

If you really wanted to do a full-tree intersection, it seems to me that the problem is detecting all the dependencies without having to spend significant time downloading/building in order to find them out. This could be solved by simply extending the cheeseshop interface to export the set of requirements outside of the egg / tarball / etc. We've done this for our own egg repository by extracting the appropriate meta-data files out of EGG-INFO and putting them into a separate file.
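[The intersect-as-you-go resolution described above can be sketched as follows. This is not setuptools' actual resolver; versions are simplified to integers for illustration.]

```python
import operator

# Map requirement operators to comparison functions.
OPS = {"==": operator.eq, "!=": operator.ne,
       ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}

def satisfies(version, constraints):
    """True if `version` meets every (op, bound) constraint."""
    return all(OPS[op](version, bound) for op, bound in constraints)

def pick(candidates, *constraint_sets):
    """Choose the highest candidate satisfying the intersection of
    all constraint sets accumulated so far while walking the tree."""
    merged = [c for cs in constraint_sets for c in cs]
    ok = [v for v in candidates if satisfies(v, merged)]
    return max(ok) if ok else None

# A requires C==3; B requires C>=2,<=4.
a_wants_c = [("==", 3)]
b_wants_c = [(">=", 2), ("<=", 4)]

# Intersecting both constraints picks C==3 at every level...
print(pick([1, 2, 3, 4], a_wants_c, b_wants_c))  # 3
# ...whereas honoring only B's constraint (the current behavior when
# installing B) would pick C==4.
print(pick([1, 2, 3, 4], b_wants_c))             # 4
```

[The key point is that the intersection is still cheap per node; the hard full-tree problem only arises when you must also backtrack over which *versions* of B to try.]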
This info is also useful for users, as it gives them an idea of how much *new* stuff is going to be installed (a la yum, apt-get, etc.)

> In other words, we attempt to achieve heuristically what's being
> proposed to do algorithmically. And my guess is that whatever cases the
> heuristic is failing at would probably not be helped by an algorithmic
> approach either. But I would welcome some actual data, either way.

With our ETS projects, we've run into problems with the current heuristic. Perhaps we just don't know how to make it work like we want? We have a set of projects that we want to be individually installable (to the extent that we limit cross-project dependencies), but we also want to make it easy to install the complete set. We use a meta-egg for the latter. Its purpose is only to specify the exact versions of each project that have been explicitly tested to work together -- you could almost think of it as a source control system tag. On the individual projects, by contrast, we explicitly want to ensure that people get the latest possible release of each required API, so the version requirements are wider there.

This setup causes problems whenever we release new versions of projects, because it seems easy_install ignores the meta-egg's exact versions when it gets down into a project and comes across a wider cross-project dependency. We ended up having to give up on the ranges in the cross-project dependencies and synchronize them to the same values as the meta-egg dependencies. There are numerous side effects of this that we don't like, but we haven't found a way around it.

> Again, though, patches are welcome. :) (Specifically, for the trunk; I
> don't see a resolver overhaul as being suitable for the 0.6 stable
> branch.)

We're planning to pursue this (for the above-mentioned strategy) as soon as we work ourselves out of a bit of a backlog of other things to do.

>> 2. People want a solution for the handling of documentation.
>> The distutils module has had commented-out sections related to this
>> for several years.
>
> As with so many other things, this gets tossed around the distutils-sig
> every now and then. A couple of times I've thrown out some options for
> how this might be done, but then the conversation peters out around the
> time anybody would have to actually do some work on it. (Me included,
> since I don't have an itch that needs scratching in this area.) In
> particular, if somebody wants to come up with a metadata standard for
> including documentation in eggs, we've got a boatload of hooks by which
> it could be done. Nothing's stopping anybody from proposing a standard
> and building a tool here (e.g. using the setuptools command hook,
> .egg-info writer hook, etc.)

Enthought has started an effort (it's currently one of two things in our ETSProjectTools project at
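[The .egg-info writer hook mentioned above can be sketched roughly as follows. The "documentation" metadata key, the file name, and the manifest format are invented here for illustration; only the hook signature and the `egg_info.writers` entry point group follow setuptools' actual convention. A fake command object stands in for the real egg_info command so the sketch is self-contained.]

```python
def write_doc_manifest(cmd, basename, filename):
    """egg_info.writers-style hook: record the doc files shipped in
    the egg (file list invented for illustration)."""
    docs = ["docs/index.html", "docs/api.html"]
    data = "".join(path + "\n" for path in sorted(docs))
    cmd.write_file("documentation", filename, data)

# In setup.py, the hook would be registered via something like:
# entry_points={"egg_info.writers":
#               ["documentation.txt = mypkg.hooks:write_doc_manifest"]}

class FakeEggInfoCmd:
    """Stand-in for the real egg_info command object."""
    def __init__(self):
        self.files = {}
    def write_file(self, what, filename, data):
        self.files[filename] = data

cmd = FakeEggInfoCmd()
write_doc_manifest(cmd, "documentation", "EGG-INFO/documentation.txt")
print(cmd.files["EGG-INFO/documentation.txt"])
```

[An installer or documentation tool could then read this manifest from EGG-INFO without importing the package at all.]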
Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns
We should probably move this off of Python-Dev, as we're getting into deep details now...

At 07:27 PM 3/18/2008 -0500, Dave Peterson wrote:

> If you really wanted to do a full-tree intersection, it seems to me that
> the problem is detecting all the dependencies without having to spend
> significant time downloading/building in order to find them out. This
> could be solved by simply extending the cheeseshop interface to export
> the set of requirements outside of the egg / tarball / etc. We've done
> this for our own egg repository by extracting the appropriate meta-data
> files out of EGG-INFO and putting them into a separate file. This info
> is also useful for users, as it gives them an idea of how much *new*
> stuff is going to be installed (a la yum, apt-get, etc.)

...and now we're more directly competing with them, too. The original idea Bob and I had was to do XML files a la Eclipse feature repositories, but then later I realized that for what we were doing, HTML was both adequate and already available. However, I don't see a problem in principle with having header files available for this sort of thing.

> With our ETS projects, we've run into problems with the current
> heuristic. Perhaps we just don't know how to make it work like we want?
> We have a set of projects that we want to be individually installable
> (to the extent that we limit cross-project dependencies), but we also
> want to make it easy to install the complete set. We use a meta-egg for
> the latter. Its purpose is only to specify the exact versions of each
> project that have been explicitly tested to work together -- you could
> almost think of it as a source control system tag.

I would think that as long as that meta-egg specifies *all* the required versions (right down to recursive dependencies), then there shouldn't be any problem. Maybe it's me who's not understanding something? I would think that you could get the appropriate data by running the tl.eggdeps tool.
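[The meta-egg approach described above amounts to a requirements list containing only exact pins, acting like a source-control tag for the tested-together set. A minimal sketch, with project names and versions invented for illustration:]

```python
# Hypothetical meta-egg install_requires list: exact pins only.
META_EGG_REQUIRES = [
    "ets.project.a==2.0.4",
    "ets.project.b==2.0.4",
    "ets.project.c==1.1.0",
]

def all_exact(requires):
    """A meta-egg should pin exactly; ranges defeat its purpose as a
    tag of explicitly-tested-together versions."""
    return all("==" in r and
               not any(op in r for op in (">=", "<=", ">", "<"))
               for r in requires)

print(all_exact(META_EGG_REQUIRES))            # True
print(all_exact(["ets.project.a>=2.0"]))       # False
```

[The friction Dave describes arises because the individual projects carry wider ranges, which the resolver consults instead of deferring to these pins.]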
> A number of projects want to provide various types of files besides
> code in their distributable, and they'd like these to end up in
> standard locations for that type of file. Think documentation, sample
> data, web templates, configuration settings, etc. Each of these should
> be treated differently at installation time depending on the platform.
> On *nix, docs should go in /usr/share/doc, whereas we might need to
> create a C:\Python2.5\docs on Windows. With sample data and templates,
> you probably just want them accessible outside of the zipped egg so
> users can easily look at them, add to them, edit them, etc.
> Configuration settings should be installed with some defaults into a
> standard configuration directory like /etc on *nix. Basically, the
> issue is that it needs to be easier to include different sets of files
> in an egg for different actions to be taken during installation or
> packaging into an OS-specific distribution format.

Yes, it would be nice to define a metadata standard for including installable datasets, either through copying or symlinking, optionally with entry points for running some code, too. When you install an egg, these things could get added to a post-install to-do list that you could then read to find out what steps to do, or invoke a tool on to actually do some of those steps.

> But the docs for easy_install claim that the list of active eggs is
> maintained in easy-install.pth. Also, if I create my own .pth file and
> the user tries to update my version to a new one, will the easy_install
> tool modify my .pth file to remove the mention of the old version from
> my sys.path and put the new version in the same .pth file? Or will it
> now be listed in both places? Or will it only be listed in
> easy-install.pth?

My understanding of the context of the question was that it applied to *system* packaging tools, which would be exclusively maintaining the .pth entries for the packages they installed, i.e., a scenario with *no* easy-install.pth.
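[For readers unfamiliar with the .pth mechanism being discussed: Python's site module reads each .pth file in site-packages and adds its lines to sys.path. A simplified sketch of that processing (the real site.py also executes `import` lines and skips paths that don't exist on disk; the egg file names below are invented):]

```python
def pth_entries(text):
    """Parse .pth file content roughly the way site.py does:
    blank lines and '#' comments are skipped, lines starting with
    'import' are executed by the real site.py (ignored here), and
    every other line is treated as a path to add to sys.path."""
    paths = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "import\t")):
            continue
        paths.append(line)
    return paths

sample = ("# installed by easy_install (illustrative)\n"
          "./setuptools-0.6c8-py2.5.egg\n"
          "import os\n"
          "./nose-0.10.1-py2.5.egg\n")
print(pth_entries(sample))
```

[This is why a system packager maintaining its own per-package .pth files and easy_install maintaining easy-install.pth can coexist: sys.path is the union of whatever all the .pth files contribute.]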
> Setuptools will still detect the presence of their eggs, regardless of
> the means by which they're added to sys.path. But it would not
> *maintain* those .pth files.

Yes, but as you've already pointed out, they've escaped into a larger ecosystem, and this restriction is a severe limitation -- leading to significant frustration, especially as projects evolve and want to do something more complex than simply install pure Python code. Here at Enthought, we use and ship a number of projects that have extensions, and thus dynamic libraries that need either to be modified during installation to work from the user's installed location, or copied elsewhere on the system to avoid the need to modify env variables, registries, etc. (which we also can't do via an egg install).

By the way, there *is* experimental shared library building support in setuptools, and I recently heard from Andi Vajda that he was successful in using it in his JCC project to make available a C++ library for linkage from