Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-04 Thread Dag Sverre Seljebotn
Nathaniel Smith n...@pobox.com wrote:
 On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau courn...@gmail.com
 wrote:
 Another way is to provide our own repository for a few major
 distributions, with automatically built packages. This is how most
 open source providers work. Miguel de Icaza explains this well:

 http://tirania.org/blog/archive/2007/Jan-26.html

 I hope we will be able to reuse much of the opensuse build service
 infrastructure.

 Sure, I'm aware of the opensuse build service, have built third-party
 packages for my projects, etc. It's a good attempt, but also has a lot
 of problems, and when talking about scientific software it's totally
 useless to me :-). First, I don't have root on our compute cluster.

I use Sage for this very reason, and others use EPD or FEMHub or
Python(x,y) for the same reasons.

Rolling this into the Python package distribution scheme seems backwards
though, since a lot of binary packages that have nothing to do with Python
are used as well -- the Python stuff is simply a thin wrapper around what
should ideally be located in /usr/lib or similar (but is nowadays
compiled into the Python extension .so because of distribution problems).

To solve the exact problem you (and I) have, I think the best solution is
to integrate the tools mentioned above with what David is planning (SciPI
etc.). Or, if that isn't good enough, find a generic userland package
manager that has nothing to do with Python (I'm sure a dozen
half-finished ones must have been written, though I haven't looked), finish
it, and connect it to SciPI.

What David does (I think) is separate the concerns. This makes the task
feasible, and also has the advantage of convenience for the people who
*do* want to use Ubuntu, Red Hat or whatever to roll out scientific
software on hundreds of clients.

Dag Sverre


--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
___
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel


Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-04 Thread David Cournapeau
On Mon, Jan 4, 2010 at 5:48 PM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:


 Rolling this into the Python package distribution scheme seems backwards
 though, since a lot of binary packages that have nothing to do with Python
 are used as well

Yep, exactly.


 To solve the exact problem you (and me) have I think the best solution is
 to integrate the tools mentioned above with what David is planning (SciPI
 etc.). Or if that isn't good enough, find generic userland package
 manager that has nothing to do with Python (I'm sure a dozen
 half-finished ones must have been written but didn't look), finish it, and
 connect it to SciPI.

You have 0install, autopackage and klik, to cite the ones I know
about. I wish people had looked at those before rolling toy solutions
to complex problems.


 What David does (I think) is separate the concerns.

Exactly - you've described this better than I did.

David



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-03 Thread Gael Varoquaux
On Sun, Jan 03, 2010 at 03:05:54AM -0800, Nathaniel Smith wrote:
 What I do -- and documented for people in my lab to do -- is set up
 one virtualenv in my user account, and use it as my default python. (I
 'activate' it from my login scripts.) The advantage of this is that
 easy_install (or pip) just works, without any hassle about permissions
 etc. This should be easier, but I think the basic approach is sound.
 Integration with the package system is useless; the advantage of
 distribution packages is that distributions can provide a single
 coherent system with consistent version numbers across all packages,
 etc., and the only way to integrate with that is to, well, get the
 packages into the distribution.

That works because either you use packages that don't have many hard-core
compiled dependencies, or those dependencies are already installed.

Think about installing VTK or ITK this way, or even something simpler such
as UMFPACK. I think that you would lose most of your users. In my lab,
I do lose users on such packages, actually.

Besides, what you are describing is possible without package isolation; it
is simply the use of a per-user local site-packages, which is now
semi-automatic in Python 2.6 using the '.local' directory. I do agree
that, in a research lab, this is a best practice.
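For reference, the per-user scheme mentioned here is PEP 370. On newer
Pythons the standard site module exposes the directories directly (on 2.6
the same values live in site.USER_BASE and site.USER_SITE; exact paths
vary per platform):

```python
# Inspect the PEP 370 per-user site-packages scheme (Python >= 2.6).
# "easy_install --user" / "pip install --user" installs here without root.
import site

print(site.getuserbase())          # e.g. ~/.local on Linux
print(site.getusersitepackages())  # e.g. ~/.local/lib/pythonX.Y/site-packages
```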

Gaël



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-03 Thread Nathaniel Smith
On Tue, Dec 29, 2009 at 6:34 AM, David Cournapeau courn...@gmail.com wrote:
 Buildout, virtualenv all work by sandboxing from the system python:
 each of them do not see each other, which may be useful for
 development, but as a deployment solution to the casual user who may
 not be familiar with python, it is useless. A scientist who installs
 numpy, scipy, etc... to try things out wants to have everything
 available in one python interpreter, and does not want to jump to
 different virtualenvs and whatnot to try different packages.

What I do -- and documented for people in my lab to do -- is set up
one virtualenv in my user account, and use it as my default python. (I
'activate' it from my login scripts.) The advantage of this is that
easy_install (or pip) just works, without any hassle about permissions
etc. This should be easier, but I think the basic approach is sound.
Integration with the package system is useless; the advantage of
distribution packages is that distributions can provide a single
coherent system with consistent version numbers across all packages,
etc., and the only way to integrate with that is to, well, get the
packages into the distribution.

On another note, I hope toydist will provide a source prepare step,
that allows arbitrary code to be run on the source tree. (For, e.g.,
Cython-to-C conversion, ad-hoc template languages, etc.) IME this is a
very common pain point with distutils; there is just no good way to do
it, and it has to be supported in the distribution utility in order to
get everything right. In particular:
  -- Generated files should never be written to the source tree
itself, but only the build directory
  -- Building from a source checkout should run the source prepare
step automatically
  -- Building a source distribution should also run the source
prepare step, and stash the results in such a way that when later
building from the source distribution, this step can be skipped. This is a
common requirement for user convenience, and necessary if you want to
avoid arbitrary code execution during builds.
And if you just set up the distribution util so that the only place
you can specify arbitrary code execution is in the source prepare
step, then even people who know nothing about packaging will
automatically get all of the above right.
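As a sketch of what such a hook could look like (the function name and
layout are invented, and a real tool would run Cython instead of the stub
below): generated files land in a separate build directory, never in the
source tree.

```python
# Hypothetical "source prepare" hook: derived files (e.g. .pyx -> .c) are
# written out of tree, so the source checkout stays pristine.
import os

def source_prepare(src_dir, build_dir):
    """Generate derived files into build_dir; return the paths written."""
    os.makedirs(build_dir, exist_ok=True)
    generated = []
    for name in sorted(os.listdir(src_dir)):
        if name.endswith(".pyx"):
            out = os.path.join(build_dir, name[:-4] + ".c")
            with open(out, "w") as f:
                # stand-in for Cython; a real tool would translate the source
                f.write("/* generated from %s */\n" % name)
            generated.append(out)
    return generated
```

Running this from a checkout would populate build/ while leaving the
checkout untouched; an sdist could then ship the generated .c files so end
users never run the hook at all.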

Cheers,
-- Nathaniel



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-03 Thread David Cournapeau
On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith n...@pobox.com wrote:
 On Tue, Dec 29, 2009 at 6:34 AM, David Cournapeau courn...@gmail.com wrote:
 Buildout, virtualenv all work by sandboxing from the system python:
 each of them do not see each other, which may be useful for
 development, but as a deployment solution to the casual user who may
 not be familiar with python, it is useless. A scientist who installs
 numpy, scipy, etc... to try things out wants to have everything
 available in one python interpreter, and does not want to jump to
 different virtualenvs and whatnot to try different packages.

 What I do -- and documented for people in my lab to do -- is set up
 one virtualenv in my user account, and use it as my default python. (I
 'activate' it from my login scripts.) The advantage of this is that
 easy_install (or pip) just works, without any hassle about permissions
 etc.

It just works if you happen to be able to build everything from
sources. That alone means you ignore the majority of users I intend to
target.

No other community (except maybe Ruby) pushes those isolated install
solutions as general deployment solutions. If it were such a great
idea, other people would have picked them up.

 This should be easier, but I think the basic approach is sound.
 Integration with the package system is useless; the advantage of
 distribution packages is that distributions can provide a single
 coherent system with consistent version numbers across all packages,
 etc., and the only way to integrate with that is to, well, get the
 packages into the distribution.

Another way is to provide our own repository for a few major
distributions, with automatically built packages. This is how most
open source providers work. Miguel de Icaza explains this well:

http://tirania.org/blog/archive/2007/Jan-26.html

I hope we will be able to reuse much of the opensuse build service
infrastructure.


 On another note, I hope toydist will provide a source prepare step,
 that allows arbitrary code to be run on the source tree. (For, e.g.,
 cython-C conversion, ad-hoc template languages, etc.) IME this is a
 very common pain point with distutils; there is just no good way to do
 it, and it has to be supported in the distribution utility in order to
 get everything right. In particular:
  -- Generated files should never be written to the source tree
 itself, but only the build directory
  -- Building from a source checkout should run the source prepare
 step automatically
  -- Building a source distribution should also run the source
 prepare step, and stash the results in such a way that when later
 building the source distribution, this step can be skipped. This is a
 common requirement for user convenience, and necessary if you want to
 avoid arbitrary code execution during builds.

Build directories are hard to implement right. I don't think toydist
will support this directly. IMO, those advanced builds warrant a real
build tool - one main goal of toydist is to make integration with waf
or scons much easier. Both waf and scons have the concept of a build
directory, which should do everything you described.

David



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-03 Thread Nathaniel Smith
On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau courn...@gmail.com wrote:
 On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith n...@pobox.com wrote:
 What I do -- and documented for people in my lab to do -- is set up
 one virtualenv in my user account, and use it as my default python. (I
 'activate' it from my login scripts.) The advantage of this is that
 easy_install (or pip) just works, without any hassle about permissions
 etc.

 It just works if you happen to be able to build everything from
 sources. That alone means you ignore the majority of users I intend to
 target.

 No other community (except maybe Ruby) push those isolated install
 solutions as a general deployment solutions. If it were such a great
 idea, other people would have picked up those solutions.

AFAICT, R works more-or-less identically (once I convinced it to use a
per-user library directory); install.packages() builds from source,
and doesn't automatically pull in and build random C library
dependencies.

I'm not advocating the 'every app in its own world' model that
virtualenv's designers had in mind, but virtualenv is very useful for
giving each user their own world. Normally I only use a fraction of
virtualenv's power this way, but sometimes it's handy that they've
solved the more general problem -- I can easily move my environment
out of the way and rebuild if I've done something stupid, or
experiment with new python versions in isolation, or whatever. And
when you *do* have to reproduce some old environment -- if only to
test that the new improved environment gives the same results -- then
it's *really* handy.

 This should be easier, but I think the basic approach is sound.
 Integration with the package system is useless; the advantage of
 distribution packages is that distributions can provide a single
 coherent system with consistent version numbers across all packages,
 etc., and the only way to integrate with that is to, well, get the
 packages into the distribution.

 Another way is to provide our own repository for a few major
 distributions, with automatically built packages. This is how most
 open source providers work. Miguel de Icaza explains this well:

 http://tirania.org/blog/archive/2007/Jan-26.html

 I hope we will be able to reuse much of the opensuse build service
 infrastructure.

Sure, I'm aware of the opensuse build service, have built third-party
packages for my projects, etc. It's a good attempt, but also has a lot
of problems, and when talking about scientific software it's totally
useless to me :-). First, I don't have root on our compute cluster.
Second, even if I did I'd be very leery about installing third-party
packages because there is no guarantee that the version numbering will
be consistent between the third-party repo and the real distro repo --
suppose that the distro packages 0.1, then the third party packages
0.2, then the distro packages 0.3, will upgrades be seamless? What if
the third party screws up the version numbering at some point? Debian
has epochs to deal with this, but third-parties can't use them and
maintain compatibility. What if the person making the third party
packages is not an expert on these random distros that they don't even
use? Will bug reporting tools work properly? Distros are complicated.
Third, while we shouldn't advocate that people screw up backwards
compatibility, version skew is a real issue. If I need one version of
a package and my lab-mate needs another and we have submissions due
tomorrow, then filing bugs is a great idea but not a solution. Fourth,
even if we had expert maintainers taking care of all these third-party
packages and all my concerns were answered, I couldn't convince our
sysadmin of that; he's the one who'd have to clean up if something
went wrong, and we don't have a big budget for overtime.

Let's be honest -- scientists, on the whole, suck at IT
infrastructure, and small individual packages are not going to be very
expertly put together. IMHO any real solution should take this into
account, keep them sandboxed from the rest of the system, and focus on
providing the most friendly and seamless sandbox possible.

 On another note, I hope toydist will provide a source prepare step,
 that allows arbitrary code to be run on the source tree. (For, e.g.,
 cython-C conversion, ad-hoc template languages, etc.) IME this is a
 very common pain point with distutils; there is just no good way to do
 it, and it has to be supported in the distribution utility in order to
 get everything right. In particular:
  -- Generated files should never be written to the source tree
 itself, but only the build directory
  -- Building from a source checkout should run the source prepare
 step automatically
  -- Building a source distribution should also run the source
 prepare step, and stash the results in such a way that when later
 building the source distribution, this step can be skipped. This is a
 common requirement for user convenience, and necessary if you want to

Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-03 Thread David Cournapeau
On Mon, Jan 4, 2010 at 8:42 AM, Nathaniel Smith n...@pobox.com wrote:
 On Sun, Jan 3, 2010 at 4:23 AM, David Cournapeau courn...@gmail.com wrote:
 On Sun, Jan 3, 2010 at 8:05 PM, Nathaniel Smith n...@pobox.com wrote:
 What I do -- and documented for people in my lab to do -- is set up
 one virtualenv in my user account, and use it as my default python. (I
 'activate' it from my login scripts.) The advantage of this is that
 easy_install (or pip) just works, without any hassle about permissions
 etc.

 It just works if you happen to be able to build everything from
 sources. That alone means you ignore the majority of users I intend to
 target.

 No other community (except maybe Ruby) push those isolated install
 solutions as a general deployment solutions. If it were such a great
 idea, other people would have picked up those solutions.

 AFAICT, R works more-or-less identically (once I convinced it to use a
 per-user library directory); install.packages() builds from source,
 and doesn't automatically pull in and build random C library
 dependencies.

As mentioned by Robert, this is different from the usual virtualenv
approach. Per-user app installation is certainly a useful (and
uncontroversial) feature.

And R does support automatically-built binary installers.


 Sure, I'm aware of the opensuse build service, have built third-party
 packages for my projects, etc. It's a good attempt, but also has a lot
 of problems, and when talking about scientific software it's totally
 useless to me :-). First, I don't have root on our compute cluster.

True, non-root install is a problem. Nothing *prevents* dpkg from running
in a non-root environment in principle, if the package itself does not
require it, but it is not really supported by the tools at the moment.

 Second, even if I did I'd be very leery about installing third-party
 packages because there is no guarantee that the version numbering will
 be consistent between the third-party repo and the real distro repo --
 suppose that the distro packages 0.1, then the third party packages
 0.2, then the distro packages 0.3, will upgrades be seamless? What if
 the third party screws up the version numbering at some point? Debian
 has epochs to deal with this, but third-parties can't use them and
 maintain compatibility.

Actually, at least with .deb-based distributions, this issue has a
solution: since packages have their own version in addition to the
upstream version, PPA-built packages can carry their own versions.

https://help.launchpad.net/Packaging/PPA/BuildingASourcePackage

Of course, this assumes a simple versioning scheme in the first place,
instead of the cluster-fck that versioning has become within python
packages (again, the scheme used in python is much more complicated
than everyone else's, and it seems that nobody has ever stopped and
thought for 5 minutes about the consequences, and whether this complexity
was a good idea in the first place).

 What if the person making the third party
 packages is not an expert on these random distros that they don't even
 use?

I think simple rules/conventions + build farms would solve most
issues. The problem is that if you allow total flexibility as input,
automatic and simple solutions become impossible. Certainly, PPA and
the build service provides for a much better experience than anything
pypi has ever given to me.

 Third, while we shouldn't advocate that people screw up backwards
 compatibility, version skew is a real issue. If I need one version of
 a package and my lab-mate needs another and we have submissions due
 tomorrow, then filing bugs is a great idea but not a solution.

Nothing prevents you from using virtualenv in that case. (I may sound
dismissive of those tools, but I am really not; I use them myself.
What I strongly react to is when they are pushed as the de facto,
standard method.)

 Fourth,
 even if we had expert maintainers taking care of all these third-party
 packages and all my concerns were answered, I couldn't convince our
 sysadmin of that; he's the one who'd have to clean up if something
 went wrong we don't have a big budget for overtime.

I am not advocating using only packaged, binary installers. I am
advocating using them as much as possible where it makes sense - on
windows and mac os x in particular.

Toydist also aims at making it easier to build customized installs.
Although not yet implemented, a --user-like scheme would be quite simple
to implement, because the toydist installer internally uses an
autoconf-like description of directories (of which --user is a special
case).
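As an illustration of what an autoconf-like directory description buys
you (everything below -- names, defaults, the expansion rule -- is a
hypothetical sketch, not toydist's actual implementation): install paths
are expressed via substitutable variables, so a --user or --libdir remap
touches only the table, never the package.

```python
# Hypothetical autoconf-style directory scheme: paths reference variables
# that are expanded at install time.
SCHEME = {
    "prefix":  "/usr/local",
    "libdir":  "$prefix/lib",
    "datadir": "$prefix/share",
}

def expand(path, scheme):
    # Repeatedly substitute variables until nested references are resolved
    # (e.g. $datadir -> $prefix/share -> /usr/local/share).
    for _ in range(len(scheme) + 1):
        for key, value in scheme.items():
            path = path.replace("$" + key, value)
    return path
```

A --user install would then just swap "prefix" for something under the
user's home directory, leaving every package-relative path unchanged.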

If you need sandboxed installs, customized installs, toydist will not
prevent it. It is certainly my intention to make it possible to use
virtualenv and co (you already can by building eggs, actually). I hope
that by having our own SciPi, we can actually have a more reliable
approach. For example, the static dependency description + mandated
metadata would make this much easier and more robust, as there would
not be a need to run a 

Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-02 Thread Dag Sverre Seljebotn
David wrote:
 Repository
 

 The goal here is to have something like CRAN
 (http://cran.r-project.org/web/views/), ideally with a build farm so
 that whenever anyone submits a package to our repository, it would
 automatically be checked, and built for windows/mac os x and maybe a
 few major linux distributions. One could investigate the build service
 from open suse to that end (http://en.opensuse.org/Build_Service),
 which is based on xen VM to build installers in a reproducible way.

Do you here mean automatic generation of Ubuntu debs, Debian debs, Windows
MSI installer, Windows EXE installer, and so on? (If so then great!)

If this is the goal, I wonder whether, looking outside of Python-land, one
might find something that already does this -- there are a lot of different
package formats, Linux meta-distributions, "install everywhere" packages,
and so on.

Of course, toydist could have any such tool as a backend or in a pipeline.

 What's next ?
 ==

 At this point, I would like to ask for help and comments, in particular:
  - Does all this make sense, or hopelessly intractable ?
  - Besides the points I have mentioned, what else do you think is needed ?

Hmm. What I miss is a discussion of the other native libraries that the
Python libraries need to bundle. Is it assumed that one wants to continue
linking C and Fortran code directly into Python .so modules, as the
scipy library currently does?

Let me take CHOLMOD (sparse Cholesky) as an example.

 - The Python package cvxopt uses it, simply by linking about 20 C files
directly into the Python-loadable module (.so) which goes into the Python
site-packages (or wherever). This makes sure it just works, but it
doesn't feel like the right way at all.

 - scikits.sparse.cholmod, OTOH, simply specifies libraries=['cholmod'] and
leaves it up to the end-user to make sure it is installed. Linux users
with root access can simply apt-get it, but it is a pain for everybody else
(Windows, Mac, non-root Linux).

 - Currently I'm making a Sage SPKG for CHOLMOD. This essentially gets the
job done by not bothering about the problem, not even using the
OS-installed Python.
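The first two approaches can be sketched with setuptools; the module and
file names here are made up, and the real cvxopt and scikits.sparse setups
differ in detail:

```python
from setuptools import Extension

# (a) cvxopt-style: compile the library's C files straight into the
# Python-loadable module, so it "just works" but bundles a copy of CHOLMOD.
bundled = Extension(
    "pkg._cholmod",
    sources=["src/wrap.c", "src/cholmod_core.c"],  # ...and ~20 more C files
)

# (b) scikits.sparse-style: build only the thin wrapper and link against a
# CHOLMOD already installed on the system (apt-get, etc.).
linked = Extension(
    "pkg._cholmod",
    sources=["src/wrap.c"],
    libraries=["cholmod"],
)
```

Option (a) trades duplication for reliability; option (b) is cleaner but
pushes the installation burden onto the end-user, as described above.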

Something that could spit out Sage SPKGs, Ubuntu debs, and Windows
installers, with Python code, C/Fortran code, or a mix (and put
each in the place preferred by the system in question), seems ideal. Of
course one would still need to make sure that the code builds properly
everywhere, but just solving the distribution part of this would be a huge
step ahead.

What I'm saying is that this is a software distribution problem in
general, and I'm afraid that Python-specific solutions are too narrow.

Dag Sverre




Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2010-01-01 Thread Pierre Raybaut
Hi David,

Following your announcement of the 'toydist' module, I think that
your project is very promising: this is certainly a great idea, and it
will be very controversial, but that's because people's expectations are
high on this matter (distutils is so disappointing indeed).

Anyway, if I may be useful, I'll gladly contribute to it.
In time, I could change the whole Python(x,y) packaging system (which
is currently quite ugly... but easy/quick to manage/maintain) to
use/promote this new module.

Happy New Year!
and Long Live Scientific Python! ;-)

Cheers,
Pierre



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-30 Thread David Cournapeau
On Wed, Dec 30, 2009 at 8:15 PM, René Dudfield ren...@gmail.com wrote:


 Sitting down with Tarek(who is one of the current distutils
 maintainers) in Berlin we had a little discussion about packaging over
 pizza and beer... and he was quite mindful of OS packagers problems
 and issues.

This has been said many times on distutils-sig, but no concrete action
has ever been taken in that direction. For example, toydist already
supports the FHS better than distutils, and is more flexible. I have
tried several times to explain why this matters on distutils-sig, but
you then have the peanut gallery interfering with unrelated nonsense
(like "it would break Windows", as if it could not be implemented
independently).

Also, retrofitting support for --*dir options in distutils would be *very*
difficult, unless you are ready to break backward compatibility (there
are 6 ways to install data files, and each of them has some corner
cases, for example - it is a real pain to support this correctly in
the convert command of toydist, and you simply cannot recover the missing
information needed to comply with the FHS in every case).


 However these systems were developed by the zope/plone/web crowd, so
 they are naturally going to be thinking a lot about zope/plone/web
 issues.

Agreed - it is natural that they care about their problems first,
that's how it works in open source. What I find difficult is when our
concern are constantly dismissed by people who have no clue about our
issues - and later claim we are not cooperative.

  Debian and Ubuntu packages for them are mostly useless
 because of their age.

That's where the build farm enters. This is a known issue; that's why
the build service and PPAs exist in the first place.

 I think
 perhaps if toydist included something like stdeb as not an extension
 to distutils, but a standalone tool (like toydist), there would be fewer
 problems with it.

That's pretty much how I intend to do things. Currently, in toydist,
you can do something like:

from toydist.core import PackageDescription

pkg = PackageDescription.from_file("toysetup.info")
# pkg now gives you access to metadata, as well as extensions, python
# modules, etc...

I think this gives almost everything that is needed to implement a
sdist_dsc command. Contrary to the Distribution class in distutils,
this class would not need to be subclassed/monkey-patched by
extensions, as it only cares about the description, and is 100 %
uncoupled from the build part.
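For illustration, a sdist_dsc command built on such a description might
render the static metadata into a debian/control stanza; the Metadata
stand-in below is invented and is not toydist's actual API:

```python
class Metadata:
    """Invented stand-in for what a PackageDescription would expose."""
    def __init__(self, name, version, summary):
        self.name, self.version, self.summary = name, version, summary

def to_debian_control(meta):
    # Minimal debian/control stanza; a real generator would also emit
    # Build-Depends, Maintainer, Architecture, and so on.
    return ("Source: python-%s\n"
            "Section: python\n"
            "Package: python-%s\n"
            "Description: %s\n" % (meta.name, meta.name, meta.summary))
```

The point is the decoupling: because the description is pure data, such a
generator needs no hooks into the build machinery.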

 yes, I have also battled with distutils over the years.  However it is
 simpler than autotools (for me... maybe distutils has perverted my
 fragile mind), and works on more platforms for python than any other
 current system.

Autotools certainly work on more platforms (Windows notwithstanding),
if only because python itself is built with autoconf. Distutils'
simplicity is a trap: it is simpler only if you restrict yourself to what
distutils gives you. Don't get me wrong, the autotools are horrible, but I
have never encountered cases where I had to spend hours to do trivial
tasks, as has been the case with distutils. Numpy's build system would
be much, much easier to implement through autotools, and would be much
more reliable.

 However
 distutils has had more tests and testing systems added, so that
 refactoring/cleaning up of distutils can happen more so.

You can't refactor distutils without breaking backward compatibility,
because distutils has no API. The whole implementation is the API.
That's one of the fundamental disagreements I and other scipy devs have
with the current contributors on distutils-sig: the starting point
(distutils) and the goal are so far away from each other that getting
there step by step is hopeless.

 I agree with many things in that post.  Except your conclusion on
 multiple versions of packages in isolation.  Package isolation is like
 processes, and package sharing is like threads - and threads are evil!

I don't find the comparison very helpful (for one, you can share data
between processes, whereas virtualenvs cannot see each other, AFAIK).

 Science is supposed to allow repeatability.  Without the same versions
 of packages, repeating experiments is harder.  This is a big problem
 in science, and multiple versions of packages in _isolation_ can help
 lead to a solution to the repeatability problem.

I don't think that's true - at least it does not reflect my experience
at all. But then, I don't pretend to have an extensive experience
either. From most of my discussions at scipy conferences, I know most
people are dissatisfied with the current python solutions.


 Plenty of good work is going on with python packaging.

 That's the opposite of my experience. What I care about is:
  - tools which are hackable and easily extensible
  - robust install/uninstall
  - real, DAG-based build system
  - explicitness and repeatability

 None of this is supported by the tools, and the current directions go
 even further away. When I have to explain at length why the
 command-based design of 

Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-30 Thread David Cournapeau
On Wed, Dec 30, 2009 at 11:26 PM, Darren Dale dsdal...@gmail.com wrote:
 Hi David,

 On Mon, Dec 28, 2009 at 9:03 AM, David Cournapeau courn...@gmail.com wrote:
 Executable: grin
    module: grin
    function: grin_main

 Executable: grind
    module: grin
    function: grind_main

 Have you thought at all about operations that are currently performed
 by post-installation scripts? For example, it might be desirable for
 the ipython or MayaVi windows installers to create a folder in the
 Start menu that contains links to the executable and the
 documentation. This is probably a secondary issue at this point in
 toydist's development, but I think it is an important feature in the
 long run.

 Also, have you considered support for package extras (package variants
 in Ports, allowing you to specify features that pull in additional
 dependencies like traits[qt4])? Enthought makes good use of them in
 ETS, and I think they would be worth keeping.

Does this example cover what you have in mind? I am not so familiar
with this feature of setuptools:

Name: hello
Version: 1.0

Library:
    BuildRequires: paver, sphinx, numpy
    if os(windows)
        BuildRequires: pywin32
    Packages:
        hello
    Extension: hello._bar
        sources:
            src/hellomodule.c
    if os(linux)
        Extension: hello._linux_backend
            sources:
                src/linbackend.c

Note that instead of os(os_name), you can use flag(flag_name), where
flags are boolean variables which can be user defined:

http://github.com/cournape/toydist/blob/master/examples/simples/conditional/toysetup.info

http://github.com/cournape/toydist/blob/master/examples/var_example/toysetup.info
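For readers unfamiliar with the setuptools feature Darren refers to: a
requirement like traits[qt4] names a base package plus an optional "extra"
that pulls in additional dependencies. A rough sketch of the semantics in
plain Python - the metadata layout and package names here are invented for
illustration, loosely mirroring setuptools' extras_require, and are not
toydist's actual design:

```python
import re

# Hypothetical metadata: base requirements plus per-extra requirements,
# loosely mirroring setuptools' extras_require. Names are examples only.
METADATA = {
    "traits": {
        "requires": ["numpy"],
        "extras": {"qt4": ["PyQt4"], "wx": ["wxPython"]},
    },
}

def resolve(requirement):
    """Expand 'pkg[extra1,extra2]' into the full dependency list."""
    m = re.match(r"^(\w+)(?:\[([\w,]+)\])?$", requirement)
    if not m:
        raise ValueError("bad requirement: %r" % requirement)
    name, extras = m.group(1), m.group(2)
    meta = METADATA[name]
    deps = list(meta["requires"])
    for extra in (extras or "").split(","):
        if extra:
            deps += meta["extras"][extra]
    return deps

print(resolve("traits[qt4]"))  # base deps plus the qt4 extra
```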

David

--
This SF.Net email is sponsored by the Verizon Developer Community
Take advantage of Verizon's best-in-class app development support
A streamlined, 14 day to market process makes app distribution fast and easy
Join now and get one step closer to millions of Verizon customers
http://p.sf.net/sfu/verizon-dev2dev 
___
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel


Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-29 Thread David Cournapeau
On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield ren...@gmail.com wrote:
 Hi,

 In the toydist proposal/release notes, I would address 'what does
 toydist do better' more explicitly.



  A big problem for science users is that numpy does not work with
 pypi + (easy_install, buildout or pip) and python 2.6. 



 Working with the rest of the python community as much as possible is
 likely a good goal.

Yes, but it is hopeless. Most of what is being discussed on
distutils-sig is useless for us, and what matters is ignored at best.
I think most people on distutils-sig are misguided, and I don't think
that community is representative of the people concerned with packaging
anyway - most of the participants seem to come from web development,
and are mostly dismissive of others' concerns (OS packagers, etc...).

I want to note that I am not starting this out of thin air - I know
most of the distutils code very well, and I have been mostly the sole
maintainer of numpy.distutils for 2 years now. I have written
extensive distutils extensions, in particular numscons, which is able
to fully build numpy, scipy and matplotlib on every platform that
matters.

Simply put, distutils code is horrible (this is an objective fact) and
flawed beyond repair (this is more controversial). IMHO, it has
almost no useful features, except being standard.

If you want a more detailed explanation of why I think distutils and
all tools on top are deeply flawed, you can look here:

http://cournape.wordpress.com/2009/04/01/python-packaging-a-few-observations-cabal-for-a-solution/

 numpy used to work with buildout in python2.5, but not with 2.6.
 buildout lets other team members get up to speed with a project by
 running one command.  It installs things in the local directory, not
 system wide.  So you can have different dependencies per project.

I don't think it is a very useful feature, honestly. It seems to me
that they created a huge infrastructure to split packages into tiny
pieces, and then try to get them back together, imagining that
multiple installed versions are a replacement for backward
compatibility. Anyone with extensive packaging experience knows that's
a deeply flawed model in general.

 Plenty of good work is going on with python packaging.

That's the opposite of my experience. What I care about is:
  - tools which are hackable and easily extensible
  - robust install/uninstall
  - real, DAG-based build system
  - explicitness and repeatability

None of this is supported by the tools, and the current directions go
even further away. When I have to explain at length why the
command-based design of distutils is a nightmare to work with, I don't
feel very confident that the current maintainers are aware of the
issues, for example. It shows that they never had to extend distutils
much.
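To make the "real, DAG-based build system" point concrete: targets declare
their dependencies, and the build order falls out of a topological sort of
the graph, rather than out of a fixed sequence of command classes. A
minimal sketch - all target names here are illustrative, and this is not
toydist's actual API:

```python
# Illustrative DAG-based build core: targets declare dependencies, and
# the build order is a topological sort of the graph. Names are made up.

def topo_order(graph):
    """Return targets so that every dependency precedes its dependents."""
    order, seen = [], set()

    def visit(node, stack):
        if node in stack:
            raise ValueError("dependency cycle at %r" % node)
        if node in seen:
            return
        for dep in graph.get(node, ()):
            visit(dep, stack | {node})
        seen.add(node)
        order.append(node)

    for node in graph:
        visit(node, set())
    return order

# The final package depends on two extensions, each of which depends on
# its C source; rebuilding only walks the affected part of the graph.
graph = {
    "hello.pkg": ["hello._bar.so", "hello._linux_backend.so"],
    "hello._bar.so": ["src/hellomodule.c"],
    "hello._linux_backend.so": ["src/linbackend.c"],
}

print(topo_order(graph))
```

A real tool would attach a build action and an up-to-date check to each
node; the point is that the ordering logic is data, not a hard-coded
command sequence.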


 There are build farms for windows packages and OSX uploaded to pypi.
 Start uploading pre releases to pypi, and you get these for free (once
 you make numpy compile out of the box on those compile farms).  There
 are compile farms for other OSes too... like ubuntu/debian, macports
 etc.  Some distributions even automatically download, compile and
 package new releases once they spot a new file on your ftp/web site.

I am familiar with some of those systems (PPA and the opensuse build
service in particular). One of the goals of my proposal is to make it
easier to interoperate with those tools.

I think Pypi is mostly useless. The lack of enforced metadata is a big
no-no IMHO. The fact that Pypi is miles behind CRAN, for example, is
quite significant. I want CRAN for scientific python, and I don't see
Pypi becoming it in the near future.

The point of having our own Pypi-like server is that we could do the following:
 - enforce metadata
 - make it easy to extend the service to support our needs
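"Enforcing metadata" can be as simple as rejecting uploads whose metadata
is missing required fields, the way CRAN refuses packages that fail its
checks. A toy sketch - the required-field list is illustrative, not an
actual SciPI specification:

```python
# Toy upload check: reject a package whose metadata is incomplete.
# The required-field list here is made up for illustration.
REQUIRED_FIELDS = ("Name", "Version", "License", "Summary", "Author")

def check_metadata(meta):
    """Return the list of missing or empty required fields ([] = OK)."""
    return [f for f in REQUIRED_FIELDS if not meta.get(f)]

good = {"Name": "hello", "Version": "1.0", "License": "BSD",
        "Summary": "example package", "Author": "David"}
bad = {"Name": "hello", "Version": "1.0"}

print(check_metadata(good))  # []
print(check_metadata(bad))   # ['License', 'Summary', 'Author']
```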


 pypm:  http://pypm.activestate.com/list-n.html#numpy

It is interesting to note that one of the maintainers of pypm has
recently quit the discussion about Pypi, most likely out of
frustration with the other participants.

 Documentation projects are being worked on to document, give tutorials
 and make python packaging easier all round.  As witnessed by the 20 or
 so releases on pypi every day (and growing), lots of people are using
 the python packaging tools successfully.

This does not mean much IMO. Uploading to Pypi is almost required to
use virtualenv, buildout, etc. An interesting metric is not how many
packages are uploaded, but how much they are used outside the
developer community.


 I'm not sure making a separate build tool is a good idea.  I think
 going with the rest of the python community, and improving the tools
 there is a better idea.

It has been tried, and IMHO it has proved a failure. You can
look at the recent discussion (the one started by Guido in
particular).

 pps. some notes on toydist itself.
 - toydist convert is cool for people converting a setup.py .  This
 means that most people can try out toydist right away.  but what does
 it gain 

Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-29 Thread David Cournapeau
On Tue, Dec 29, 2009 at 10:27 PM, René Dudfield ren...@gmail.com wrote:

 Buildout is what a lot of the python community are using now.

I would like to note that buildout is a solution to a problem that I
don't care to solve. This issue is particularly difficult to explain
to people accustomed to buildout, in my experience - I have not found
a way to explain it very well yet.

Buildout and virtualenv both work by sandboxing from the system python:
the sandboxes do not see each other, which may be useful for
development, but as a deployment solution for the casual user who may
not be familiar with python, it is useless. A scientist who installs
numpy, scipy, etc... to try things out wants to have everything
available in one python interpreter, and does not want to jump between
different virtualenvs and whatnot to try different packages.

This has strong consequences on how you look at things from a packaging POV:
 - uninstall is crucial
 - a package bringing down python is a big no-no (this happens way too
often when you install things through setuptools)
 - if something fails, the recovery should be trivial - the person
doing the installation may not know much about python
 - you cannot use sandboxing as a replacement for backward
compatibility (that's why I don't care much about all the discussion
about versioning - I don't think it is very useful as long as python
itself does not support it natively).
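The uninstall and trivial-recovery points are essentially about record
keeping: if the installer writes a manifest of every file it creates,
uninstalling is just replaying that list. A minimal illustration - nothing
here is toydist's real behavior, and the file layout is invented:

```python
import os
import tempfile

# Minimal install/uninstall sketch: the installer records every file it
# writes into a manifest, so uninstall is just deleting what is listed.
def install(files, prefix, manifest):
    installed = []
    for relpath, content in files.items():
        dest = os.path.join(prefix, relpath)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with open(dest, "w") as f:
            f.write(content)
        installed.append(dest)
    with open(manifest, "w") as f:
        f.write("\n".join(installed))
    return installed

def uninstall(manifest):
    with open(manifest) as f:
        for path in f.read().splitlines():
            if os.path.exists(path):
                os.remove(path)

prefix = tempfile.mkdtemp()
manifest = os.path.join(prefix, "hello-1.0.files")
install({"lib/hello/__init__.py": "print('hello')"}, prefix, manifest)
print(os.path.exists(os.path.join(prefix, "lib/hello/__init__.py")))  # True
uninstall(manifest)
print(os.path.exists(os.path.join(prefix, "lib/hello/__init__.py")))  # False
```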

In the context of ruby, this article makes a similar point:
http://www.madstop.com/ruby/ruby_has_a_distribution_problem.html

David



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-29 Thread Gael Varoquaux
On Tue, Dec 29, 2009 at 11:34:44PM +0900, David Cournapeau wrote:
 Buildout, virtualenv all work by sandboxing from the system python:
 each of them do not see each other, which may be useful for
 development, but as a deployment solution to the casual user who may
 not be familiar with python, it is useless. A scientist who installs
 numpy, scipy, etc... to try things out want to have everything
 available in one python interpreter, and does not want to jump to
 different virtualenvs and whatnot to try different packages.

I think that you are pointing out a large source of misunderstanding
in packaging discussions. The people behind setuptools, pip or buildout
care about having a working ensemble of packages that delivers an
application (often a web application)[1]. You and I, and many scientific
developers, see libraries as building blocks that need to be assembled
by the user, the scientist using them to do new science. Thus the idea
of isolation is not something that we can accept, because it means that
we are restricting the user to a set of libraries.

Our definition of user is not the same as the user targeted by buildout.
Our user does not push buttons, but he writes code. However, unlike the
developer targeted by buildout and distutils, our user does not want or
need to learn about packaging.

Trying to make the debate clearer...

Gaël

[1] I know your position on why simply focusing on sandboxing working
ensembles of libraries is not a replacement for backward compatibility,
and will only create impossible problems in the long run. While I agree
with you, this is not my point here.



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-29 Thread Christopher Barker
David Cournapeau wrote:

 Buildout, virtualenv all work by sandboxing from the system python:
 each of them do not see each other, which may be useful for
 development,

And certain kinds of deployment, like web servers or installed tools.

 but as a deployment solution to the casual user who may
 not be familiar with python, it is useless. A scientist who installs
 numpy, scipy, etc... to try things out want to have everything
 available in one python interpreter, and does not want to jump to
 different virtualenvs and whatnot to try different packages.

Absolutely true -- which is why Python desperately needs package version 
selection of some sort. I've been tooting this horn on and off for years, 
but have never gotten any interest at all from the core python developers.

I see putting packages in with no version like having non-versioned 
dynamic libraries in a system -- i.e. dll hell. If I have a bunch of 
stuff running just fine with the various package versions I've 
installed, but then I start working on something (maybe just testing, 
maybe something more real) that requires the latest version of a 
package, I have a few choices:
   - install the new package and hope I don't break too much
   - use something like virtualenv, which requires a lot of overhead to 
set up and use (my evidence is personal: despite working with a team that 
uses it, somehow I've never gotten around to using it for my dev work, even 
though, in theory, it should be a good solution)
   - setuptools does supposedly support multiple-version installs and 
selection, but it's ugly and poorly documented enough that I've never 
figured out how to use it.

This has been addressed with a handful of ad-hoc solutions: wxPython has 
wxversion.select, and I think PyGTK has something, and who knows what 
else. It would be really nice to have a standard solution available.
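A wxversion-style selector is not much code; the core idea is just choosing
which installed copy of a package goes onto sys.path before the first
import. A self-contained sketch - the registry layout and paths here are
invented for illustration (the real wxversion discovers wx-* directories
on disk):

```python
import sys

# Hypothetical registry mapping package versions to install directories,
# in the spirit of wxversion.select(); real tools discover these on disk.
REGISTRY = {
    "mypkg": {"1.0": "/opt/mypkg-1.0", "2.0": "/opt/mypkg-2.0"},
}

def select(package, version):
    """Put the requested version's directory at the front of sys.path."""
    try:
        path = REGISTRY[package][version]
    except KeyError:
        raise ValueError("%s %s is not installed" % (package, version))
    # Drop any other registered version of this package first.
    for other in REGISTRY[package].values():
        if other in sys.path:
            sys.path.remove(other)
    sys.path.insert(0, path)
    return path

select("mypkg", "2.0")
print(sys.path[0])  # /opt/mypkg-2.0
```

Like wxversion, this only works if called before the package is first
imported; nothing un-imports an already-loaded module.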

Note that the usual response I've gotten is to use py2exe or something 
to distribute, so you're defining the whole stack. That's good for some 
things, but not all (though py2app's alias bundles are nice), and 
really pretty worthless for development. Also, many, many packages are a 
pain to use with py2exe and friends anyway (see my forthcoming other 
long post...)

  - you cannot use sandboxing as a replacement for backward
 compatibility (that's why I don't care much about all the discussion
 about versioning - I don't think it is very useful as long as python
 itself does not support it natively).

could be -- I'd love to have Python support it natively, though 
wxversion isn't too bad.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov



Re: [matplotlib-devel] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation

2009-12-28 Thread David Cournapeau
On Tue, Dec 29, 2009 at 3:49 AM, Dag Sverre Seljebotn
da...@student.matnat.uio.no wrote:


 Do you here mean automatic generation of Ubuntu debs, Debian debs, Windows
 MSI installer, Windows EXE installer, and so on? (If so then great!)

Yes (although this is not yet implemented). In particular on windows,
I want to implement a scheme so that you can convert from eggs to .exe
and vice versa, so people can still install as exe (or msi), even
though the method would default to eggs.

 If this is the goal, I wonder if one looks outside of Python-land one
 might find something that already does this -- there's a lot of different
 package format, Linux meta-distributions, install everywhere packages
 and so on.

Yes, there are things like 0install or autopackage. I think those are
doomed to fail as long as they are not supported thoroughly by the
distribution. Instead, my goal here is much simpler: producing
rpm/deb. It does not solve every issue (install by non-root, multiple
parallel versions), but one has to be realistic :)

I think automatically built rpm/deb and easy integration with native
methods can solve a lot of issues already.


  - Currently I'm making a Sage SPKG for CHOLMOD. This essentially gets the
 job done by not bothering about the problem, not even using the
 OS-installed Python.

 Something that would spit out both Sage SPKGs, Ubuntu debs, Windows
 installers, both with Python code and C/Fortran code or a mix (and put
 both in the place preferred by the system in question), seems ideal. Of
 course one would still need to make sure that the code builds properly
 everywhere, but just solving the distribution part of this would be a huge
 step ahead.

On windows, this issue may be solved using eggs: enstaller has a
feature where DLLs put in a special location of an egg are installed in
python such that they are found by the OS loader. One could have
mechanisms based on $ORIGIN + rpath to solve this issue for
local installs on Linux, etc...

But again, one has to be realistic about the goals. With toydist, I want
to remove the whole pile of magic and hacks built on top of distutils so
that people can again hack their own solutions, as it should have been
from the start (that's a big plus of python in general). It won't
magically solve every issue out there, but it would hopefully help
people to make their own solutions.

Bundling solutions like SAGE, EPD, etc... are still the most robust
ways to deal with those issues in general, and I do not intend to
replace those.

 What I'm saying is that this is a software distribution problem in
 general, and I'm afraid that Python-specific solutions are too narrow.

Distribution is a hard problem. Instead of pushing a very narrow (and
mostly ill-founded) view of how people should do things, like
distutils/setuptools/pip/buildout do, I want people to be able
to build their own solutions. No more "use this magic stick v
4.0.3.3.14svn1234, trust me it works, you don't have to understand",
which is too prevalent with those tools, and has always felt deeply
unpythonic to me.

David
