Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns

2008-03-20 Thread Alexander Michael
On Wed, Mar 19, 2008 at 6:15 PM, Jeff Rush [EMAIL PROTECTED] wrote:
  Frankly I'd like to see setuptools exploded, with those parts of general use
  folded back into the standard library, the creation of a set of
  non-implementation-specific documents of the distribution formats and
  behavior, leaving a small core of one implementation of how to do it and the
  door open for others to compete with their own implementation.

If I may hazard an opinion, I second this sentiment. In my use of
setuptools, it definitely feels like it wants to be three (mostly)
independent projects:

1) The project that standardizes the concept now embodied by eggs and
provides the basic machinery to work with them (find them, introspect
metadata, import them, etc.), but not install them per se. This is
generally useful as a common plug-in framework, if nothing else.
Currently, this run-time support functionality is in pkg_resources.
2) The tool you can use to build eggs (but not install them per se).
Currently this is the setuptools extension to distutils.
3) The tool for installing eggs (or their equivalent) and (optionally)
their dependencies (optionally using remote hosts) as well as
uninstalling. Currently this is easy_install (well, except for
uninstalling, which is understandably quite difficult).

Finally, there is the fourth and already separate project of PyPI:
4) The hosted repository of publicly available eggs (or their
equivalent). This should export any metadata required to resolve
dependencies relatively cheaply.
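
For a feel of what item 1 covers, here is a minimal sketch of the "find them, introspect metadata" part. It deliberately uses the standard library's importlib.metadata (Python 3.8+) as a stand-in rather than pkg_resources, and installed_distributions() is only an illustrative helper name, not something setuptools provides:

```python
from importlib import metadata


def installed_distributions():
    """Return {project name: version} for distributions found on sys.path."""
    found = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip entries whose metadata is missing or has no Name field.
        name = meta.get("Name") if meta is not None else None
        if name:
            found[name] = dist.version
    return found


for name, version in sorted(installed_distributions().items()):
    print(name, version)
```

Nothing here installs or builds anything, which is the point of keeping the run-time piece separate from the other two.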

Breaking them apart will make it easier to have two separate projects
for building eggs (or their equivalents) -- one based on distutils and
the other replacing it. Even more importantly, it will make it
possible for multiple installers to be developed that scratch
particular itches. Hopefully one would eventually emerge as the
de-facto standard, but this will ultimately be decided by community
adoption.

Alex
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns

2008-03-19 Thread Dave Peterson

Phillip J. Eby wrote:

At 03:57 AM 3/19/2008 -0500, Jeff Rush wrote:
  
  I'd be willing to help out, and keep a carefully balanced hand in 
what is accepted.




I'm not sure exactly how to go about such a handoff though.  My guess 
is that we need a bug/patch tracker, and a few people to review, 
test, and apply.  Maybe a transitional period during which I just say 
yea or nay and let others do the test and apply, before opening it up 
entirely.  That way, we can perhaps solidify a few principles that 
I'd like to have stay in place.  (Like no arbitrary post-install code hooks.)
  


+1 to blessing more people to commit.
+1 to the transition period idea.

These two ought to enable things to move a bit quicker than taking a 
year to accept a patch. :-)


In addition to a bug tracker and patch manager, seems like perhaps a 
wiki to help document some of these solidified principles and other 
notes would be a good thing.  (Like a patch should almost always include 
at least one test, possibly more.)  Given that the source for setuptools 
is in the python.org svn, couldn't we just use the python.org roundup 
and wiki for these facilities?  Though looking at the list of 
components, it seems that things in the sandbox generally aren't tracked 
in this infrastructure.  In which case, I'm sure we could use sf, 
launchpad, or some such external provider.  Enthought could even host 
this stuff.


Like Jeff Rush, I'm also willing to help out as both a writer and 
reviewer of patches.  As you can see from my earlier posts there are a 
number of things (besides running an arbitrary post-install script) that 
we'd like to be able to get into the codebase.



-- Dave


Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns

2008-03-18 Thread Jeff Rush
Marius Gedminas wrote:
 On Mon, Mar 17, 2008 at 08:37:30PM -0400, Phillip J. Eby wrote:
 At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
 People also want a greater variety of file_finders to be included with
 setuptools.  Instead of just CVS and SVN, they want it to comprehend
 Mercurial, Bazaar, Git and so forth.
 Did you point them to the Cheeseshop?  There are plugins already 
 available for all the systems you mentioned, plus Darcs and 
 Monotone.  If you mean included as in bundled, this doesn't make 
 a whole lot of sense to me.

They knew there were plugins out there, of varying quality and availability, 
but wanted them bundled. ;-)  It's a pain to track them down.  Perhaps it would 
help if the RPM format were broken out from setuptools, since the inclusion of 
some formats leads them to believe the set is just incomplete, not 
intentionally sparse.


 I'd think that if you're using 
 setuptools as a developer (the only reason you need the file finders, 
 since source distributions include a prebuilt manifest), you'd not 
 have a problem saying easy_install setuptools-git or adding a 
 setup_requires='setuptools-git' line to your setup.py.  (Although 
 the latter would only be needed for *development*, not deployment.)
 
 setup_requires looks like a solution, but it requires extra attention
 from the developers who write the setup.py.  Writing a setup.py is
 already quite complicated -- I usually end up copying an existing one
 and modifying it.

As a compromise that makes new formats easily available without bundling them 
and without requiring special action within setup.py, setuptools could treat 
--formats=dpkg as an implicit setup_requires= and pull it from PyPI.  And the 
--list-formats option could query PyPI for the possibilities, just as 
--list-classifiers does today.  It would require a few standards in 
keywording/classifying those format eggs but we already need those standards 
for other projects, such as locating recipes for buildout and plugins for trac.
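
A rough sketch of that compromise, purely illustrative: BUNDLED_FORMATS, the "setuptools-format-<name>" naming convention and implicit_setup_requires() are all made-up names for this example, not part of setuptools or PyPI today.

```python
# Formats assumed to ship with the tool itself (illustrative subset only).
BUNDLED_FORMATS = {"gztar", "zip", "rpm"}


def implicit_setup_requires(formats):
    """Map a --formats value to the plugin projects that would be fetched.

    Formats that are bundled need nothing extra; any other format is
    treated as an implicit requirement on a hypothetical plugin project.
    """
    requested = [f.strip() for f in formats.split(",") if f.strip()]
    return ["setuptools-format-" + f for f in requested
            if f not in BUNDLED_FORMATS]


print(implicit_setup_requires("gztar,dpkg"))
```

The naming convention is where the keywording/classifying standard would come in; any agreed-upon scheme would do.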

-Jeff



Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns

2008-03-18 Thread Dave Peterson

Phillip J. Eby wrote:

At 05:10 PM 3/17/2008 -0500, Jeff Rush wrote:
  


1. Many felt the existing dependency resolver was not correct.  They wanted a
full tree traversal resulting in an intersection of all restrictions,
instead of the first-acceptable-solution approach taken now, which can
result in top-level dependencies not being enforced upon lower levels.  The
latter is faster however.  One solution would be to make the resolver
pluggable.



Patches welcome, on both counts.  Personally, Bob and I originally 
wanted a full-tree intersection, too, but it turned out to be hairier 
to implement than it seems at first.  My guess is that none of the 
people who want it, have actually tried to implement it without a 
factorial or exponential O().  But that doesn't mean I'll be unhappy 
if somebody succeeds.  :)
  


I think we'd make significant progress by just intersecting the 
dependencies we know about as we progress through the dependency tree.  
For example, if A requires B==2 and C==3, and if B requires C>=2,<=4, 
then at the time we install A we'd pick C==3 and also at the time we 
install B we'd pick C==3.   As opposed to the current scheme that would 
choose C==4 for the latter case.   This would allow dependent projects 
(think applications here) to better control the versions of the full set 
of libraries they use.   Things would still fail (like they do now) if 
you ran across dependencies that had no intersection or if you 
encountered a new requirement after the target project was already 
installed.
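
A minimal sketch of that running intersection, under simplifying assumptions: versions are plain integers, requirements hang off the project rather than a specific release, and resolve(), pick(), allowed() and the REQUIRES/AVAILABLE tables are hypothetical names rather than anything in easy_install.

```python
import operator
from collections import deque

OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def allowed(version, constraints):
    """True if version meets every (operator, bound) pair."""
    return all(OPS[op](version, bound) for op, bound in constraints)


def pick(project, constraints, available):
    """Newest available version that meets all constraints known so far."""
    candidates = [v for v in available[project] if allowed(v, constraints)]
    if not candidates:
        raise LookupError("no version of %s satisfies %r"
                          % (project, constraints))
    return max(candidates)


def resolve(root, requires, available):
    known = {root: []}   # project -> constraints gathered while walking
    chosen = {}          # project -> version picked ("installed")
    queue = deque([root])
    while queue:
        project = queue.popleft()
        if project in chosen:
            continue
        chosen[project] = pick(project, known[project], available)
        for dep, constraints in requires.get(project, {}).items():
            known.setdefault(dep, []).extend(constraints)
            if dep not in chosen:
                queue.append(dep)
            elif not allowed(chosen[dep], constraints):
                # A requirement that shows up too late still fails.
                raise LookupError("%s==%s conflicts with %r"
                                  % (dep, chosen[dep], constraints))
    return chosen


REQUIRES = {
    "A": {"B": [("==", 2)], "C": [("==", 3)]},
    "B": {"C": [(">=", 2), ("<=", 4)]},
}
AVAILABLE = {"A": [1], "B": [1, 2], "C": [1, 2, 3, 4, 5]}

print(resolve("A", REQUIRES, AVAILABLE))
print(resolve("B", REQUIRES, AVAILABLE))
```

Resolving from A pins C to 3 because A's constraint is already known when B's range is added; resolving B on its own takes the newest C in its range, which is 4.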



If you really wanted to do a full-tree intersection, it seems to me that 
the problem is detecting all the dependencies without having to spend 
significant time downloading/building in order to find them out.   This 
could be solved by simply extending the cheeseshop interface to export 
the set of requirements outside of the egg / tarball / etc.  We've done 
this for our own egg repository by extracting the appropriate meta-data 
files out of EGG-INFO and putting it into a separate file.  This info is 
also useful for users as it gives them an idea of how much *new* stuff 
is going to be installed (a la yum, apt-get, etc.)
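
Roughly what such an extraction could look like, as a sketch only: it assumes a zipped egg that carries its requirements in EGG-INFO/requires.txt, and read_egg_requirements() plus the in-memory demo egg are invented for illustration (this is not our actual repository code).

```python
import io
import zipfile


def read_egg_requirements(egg_file):
    """Return {section: [requirement, ...]} from EGG-INFO/requires.txt.

    The leading unnamed section (plain install requirements) is keyed
    by the empty string; "[name]" headers start a named section.
    """
    sections = {"": []}
    current = ""
    with zipfile.ZipFile(egg_file) as egg:
        try:
            raw = egg.read("EGG-INFO/requires.txt").decode("utf-8")
        except KeyError:
            return sections  # the egg declares no requirements
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return sections


# Build a tiny in-memory egg so the sketch runs without a real download.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as egg:
    egg.writestr("EGG-INFO/requires.txt",
                 "B==2\nC>=2,<=4\n\n[docs]\nSphinx\n")
buf.seek(0)
demo = read_egg_requirements(buf)
print(demo)
```

The resulting mapping is small enough to publish next to each egg, so a resolver can walk the whole tree without downloading or building anything.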



In other words, we attempt to achieve heuristically what's being 
proposed to do algorithmically.  And my guess is that whatever cases 
the heuristic is failing at, would probably not be helped by an 
algorithmic approach either.  But I would welcome some actual data, either way.
  


With our ETS projects, we've run into problems with the current 
heuristic.  Perhaps we just don't know how to make it work like we want? 

We have a set of projects that we want to be individually installable 
(to the extent that we limit cross-project dependencies) but we also 
want to make it easy to install the complete set.  We use a meta-egg for 
the latter.  Its purpose is only to specify the exact versions of each 
project that have been explicitly tested to work together -- you could 
almost think of it as a source control system tag.  Whereas on the 
individual projects, we explicitly want to ensure that people get the 
latest possible release of each required API so the version requirements 
are wider here.   This setup causes problems whenever we release new 
versions of projects because it seems easy_install ignores the meta-egg 
exact versions when it gets down into a project and comes across a wider 
cross-project dependency.   We ended up having to give up on the ranges 
in the cross-project dependencies and synchronize them to the same 
values in the meta-egg dependencies.   There are numerous side-effects 
of this that we don't like but we haven't found a way around it.


Again, though, patches are welcome.  :)  (Specifically, for the 
trunk; I don't see a resolver overhaul as being suitable for the 0.6 
stable branch.)
  


We're planning to pursue this (for the above mentioned strategy) as soon 
as we work ourselves out of a bit of a backlog of other things to do.





2. People want a solution for the handling of documentation.  The distutils
module has had commented out sections related to this for several years.



As with so many other things, this gets tossed around the 
distutils-sig every now and then.  A couple of times I've thrown out 
some options for how this might be done, but then the conversation 
peters out around the time anybody would have to actually do some 
work on it.  (Me included, since I don't have an itch that needs 
scratching in this area.)


In particular, if somebody wants to come up with a metadata standard 
for including documentation in eggs, we've got a boatload of hooks by 
which it could be done.  Nothing's stopping anybody from proposing a 
standard and building a tool, here.  (e.g. using the setuptools 
command hook, .egg-info writer hook, etc.)


Enthought has started an effort (it's currently one of two things in our 
ETSProjectTools project at 

Re: [Python-Dev] [Distutils] Capsule Summary of Some Packaging/Deployment Technology Concerns

2008-03-18 Thread Phillip J. Eby
We should probably move this off of Python-Dev, as we're getting into 
deep details now...

At 07:27 PM 3/18/2008 -0500, Dave Peterson wrote:
If you really wanted to do a full-tree intersection, it seems to me 
that the problem is detecting all the dependencies without having to 
spend significant time downloading/building in order to find them 
out.   This could be solved by simply extending the cheeseshop 
interface to export the set of requirements outside of the egg / 
tarball / etc.  We've done this for our own egg repository by 
extracting the appropriate meta-data files out of EGG-INFO and 
putting it into a separate file.  This info is also useful for users 
as it gives them an idea of how much *new* stuff is going to be 
installed (a la yum, apt-get, etc.)

...and now we're more directly competing with them, too.  The 
original idea Bob and I had was to do XML files ala Eclipse feature 
repositories, but then later I realized that for what we were doing, 
HTML was both adequate and already available.  However, I don't see a 
problem in principle with having header files available for this 
sort of thing.


With our ETS projects, we've run into problems with the current 
heuristic.  Perhaps we just don't know how to make it work like we want?

We have a set of projects that we want to be individually 
installable (to the extent that we limit cross-project dependencies) 
but we also want to make it easy to install the complete set.  We 
use a meta-egg for the latter.  Its purpose is only to specify the 
exact versions of each project that have been explicitly tested to 
work together -- you could almost think of it as a source control system tag.

I would think that as long as that meta-egg specifies *all* the 
required versions (right down to recursive dependencies), then there 
shouldn't be any problem.  Maybe it's me who's not understanding something?

I would think that you could get the appropriate data by running the 
tl.eggdeps tool.


A number of projects want to provide various types of files besides 
code in their distributable, and they'd like these to end up in 
standard locations for that type of file.  Think documentation, 
sample data, web templates, configuration settings, etc.   Each of 
these should be treated differently at installation time depending 
on platform.  On *nix, docs should go in /usr/share/doc whereas we 
might need to create a C:\Python2.5\docs on Windows.   With sample 
data and templates, you probably just want it accessible outside of 
the zipped egg so users can easily look at it, add to it, edit it, 
etc.  Configuration settings should be installed with some defaults 
into a standard configuration directory like /etc on *nix, etc.

Basically the issue is that it needs to be easier to include 
different sets of files into an egg for different actions to be 
taken during installation or packaging into an OS-specific distribution format.

Yes, it would be nice to define a metadata standard for including 
installable datasets either through copying or symlinking, 
optionally with entry points for running some code, too.  When you 
install an egg, these things could get added to a post-install 
to-do list, that you could then read to find out what steps to do, 
or invoke a tool on to actually do some of those steps.


But the docs for easy_install claim that the list of active eggs is 
maintained in easy-install.pth.  Also, if I create my own .pth file, 
and the user tries to update my version to a new one, will the 
easy_install tool modify my .pth file to remove the mention of the 
old version from my sys.path and put the new version in the same 
.pth file?  Or will it now be listed in both places?  Or will it 
only be in easy-install.pth?

My understanding of the context of the question was that it applied 
to *system* packaging tools, which would be exclusively maintaining 
the .pth entries for the packages they installed.  i.e., a scenario 
with *no* easy-install.pth.  Setuptools will still detect the 
presence of their eggs, regardless of the means by which they're 
added to sys.path.  But it would not *maintain* those .pth files.
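
To illustrate the mechanism only: a sketch of a packaging tool keeping its own .pth file, using the standard site module. The temporary directory, the "vendor-packages.pth" file name and the "demo-1.0.egg" directory are made-up stand-ins, and nothing here involves easy-install.pth.

```python
import os
import site
import sys
import tempfile

# Stand-in for a site directory owned by a system packaging tool.
site_dir = tempfile.mkdtemp()
egg_dir = os.path.join(site_dir, "demo-1.0.egg")
os.mkdir(egg_dir)

# The tool writes and maintains its own .pth file; each line names one
# directory, relative to the site directory, to be put on sys.path.
with open(os.path.join(site_dir, "vendor-packages.pth"), "w") as pth:
    pth.write("demo-1.0.egg\n")

# Process the directory the way site-packages is processed at startup.
site.addsitedir(site_dir)

egg_on_path = os.path.abspath(egg_dir) in sys.path
print(egg_on_path)
```

Once the egg directory is on sys.path this way, it is found like any other path entry; who wrote the .pth line does not matter to the lookup.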


Yes, but as you've already pointed out, they've escaped into a 
larger ecosystem and this restriction is a severe limitation -- 
leading to significant frustration.  Especially as projects evolve 
and want to do something more complex than simply install pure 
Python code.  Here at Enthought, we use and ship a number of 
projects that have extensions and thus dynamic libraries that need 
to either be modified during installation to work from the user's 
installed location, or copied elsewhere on the system to avoid the 
need to modify (which we also can't do via an egg install) env 
variables, registries, etc.

By the way, there *is* experimental shared library building support 
in setuptools, and I recently heard from Andi Vajda that he was 
successful in using it in his JCC project to make available a C++ 
library for linkage from