David Cournapeau <cournape <at> gmail.com> writes:

> Note that being able to convert a package does not mean the conversion
> is working. You need to make sure that installing something from this
> new format gives the same thing as installing from the setup.py.
> That's harder to test, obviously.

Right, and I've considered this. The most basic additional test I've done is to
do the equivalent of "python setup.py sdist" using *only* the package.yaml,
then actually run "python setup.py sdist" and compare the two archives.

While I don't have the stats to hand, most of the 18,000 PyPI packages are such
that the comparison of archives shows no meaningful differences. Many of the
failures are due to custom code in setup.py, such as command classes.
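
A rough sketch of the archive comparison, for illustration only and not the
actual script: it compares just the member listings of two archives, and the
function names archive_members and compare_archives are made up for this
example.

```python
import tarfile
import zipfile


def archive_members(path):
    """Return the set of regular-file names stored in a tar or zip archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            # Directory entries in a zip end with "/"; skip them.
            return {n for n in zf.namelist() if not n.endswith("/")}
    # tarfile.open() detects gzip/bzip2 compression automatically.
    with tarfile.open(path) as tf:
        return {m.name for m in tf.getmembers() if m.isfile()}


def compare_archives(reference, candidate):
    """Return (missing, extra): names only in reference, names only in candidate."""
    ref = archive_members(reference)
    cand = archive_members(candidate)
    return sorted(ref - cand), sorted(cand - ref)
```

Two empty lists mean the listings agree. Deciding which differences count as
"meaningful" (generated files, for example) is left out of this sketch.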

My work so far also doesn't faithfully mock e.g. Cython or numpy, so calls
from setup.py to their distutils customisations aren't captured, and in
those cases the resulting package.yaml is incomplete. However, that's
perhaps just a matter of spending some more time on the mocking approach.
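
To show the general idea of mocking and capture, here is a minimal sketch. It
is an assumption about how such a capture could work, not the code described
above: it swaps in a stub module named setuptools whose setup() records its
keyword arguments, then executes the setup.py source. The names
capturing_setup and capture_metadata are hypothetical, and a setup.py that
imports from distutils.core, Cython or numpy would need further stubs.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def capturing_setup(captured):
    """Temporarily replace `setuptools` with a stub whose setup() records kwargs."""
    stub = types.ModuleType("setuptools")
    stub.setup = lambda **kwargs: captured.update(kwargs)
    # Placeholder only: a real mock would have to discover packages.
    stub.find_packages = lambda *args, **kwargs: []
    saved = sys.modules.get("setuptools")
    sys.modules["setuptools"] = stub
    try:
        yield
    finally:
        # Restore whatever was there before, so later imports are unaffected.
        if saved is None:
            del sys.modules["setuptools"]
        else:
            sys.modules["setuptools"] = saved


def capture_metadata(setup_source):
    """Execute setup.py source text and return the arguments passed to setup()."""
    captured = {}
    with capturing_setup(captured):
        exec(compile(setup_source, "setup.py", "exec"), {"__name__": "__main__"})
    return captured
```

Note that this still runs arbitrary code from setup.py, so it is only suitable
for a sandboxed environment.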

Obviously checking sdist is just a first step, but it's disappointing to
see how many packages on PyPI just fail the "download from PyPI, then
run python setup.py sdist" test because of e.g. importing packages which
don't exist.

To my mind, any source package on PyPI should be downloadable and be able to
have an sdist run on it to regenerate the archive, without needing any other
packages to be present: it's a source package already, right? But perhaps
that's what you get by allowing arbitrary code in setup.py, and eliminating
setup.py is definitely a step in the right direction.

> >
> > Is there a good "download the latest versions of everything hosted on
> > pypi" script? Mine was pretty terrible as it could not resume after a
> > crash or after the data got stale.
> 
> I would be interested in that as well, I wanted to do the same kind of
> analysis for Bento's convert command.
> 

Not as a standalone script that anyone can use, unfortunately. I don't have the
space to store all those downloads, and stuff on PyPI keeps getting updated,
anyway: so what I did was to run a first pass using the XML-RPC API to get
a list of package, version and archive URLs into a text file; my scripts then
pick up individual packages, download them and do the mocking/capture to
package.yaml, followed by the sdist comparison. I just use grep to filter the
archive list to determine which packages to process.
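
The first pass and the grep step could look roughly like the sketch below. The
XML-RPC method names package_releases and release_urls are the ones PyPI has
historically documented (much of that API has since been deprecated), and
their use here is an assumption; the one-line "name version url" format and
the helper names listing_lines and select are invented for this example.

```python
import re


def listing_lines(client, names):
    """Yield 'name version url' lines for one release of each package.

    `client` is anything exposing package_releases(name) and
    release_urls(name, version), e.g. an xmlrpc.client.ServerProxy.
    """
    for name in names:
        versions = client.package_releases(name)
        if not versions:
            continue
        version = versions[0]  # assumption: the newest release is listed first
        for info in client.release_urls(name, version):
            # Keep source archives only; skip eggs, wheels, installers.
            if info.get("packagetype") == "sdist":
                yield "%s %s %s" % (name, version, info["url"])


def select(lines, pattern):
    """grep-like filter: keep the lines matching a regular expression."""
    rx = re.compile(pattern)
    return [line for line in lines if rx.search(line)]
```

Passing the client in as a parameter keeps the listing logic testable without
network access; in real use it would be a ServerProxy pointed at the PyPI
XML-RPC endpoint.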

Every now and again I update that text file of archives and versions, see what
changed and arrange to run my code over updated and new packages. I just keep
the package.yaml files and the listings of the archives produced by the
distutils/setuptools dist operation and my package.yaml-using version.
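
Working out what changed between two snapshots of that text file can be
sketched as follows, assuming (hypothetically) that each line starts with the
package name and version separated by whitespace. The names parse_listing and
changed_packages are illustrative.

```python
def parse_listing(lines):
    """Map package name to version from 'name version ...' lines."""
    entries = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            entries[parts[0]] = parts[1]
    return entries


def changed_packages(old_lines, new_lines):
    """Return names that are new, or whose version differs, in the new listing."""
    old = parse_listing(old_lines)
    new = parse_listing(new_lines)
    return sorted(name for name, version in new.items() if old.get(name) != version)
```

Only the packages returned by changed_packages need to be downloaded and
processed again.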

Regards,

Vinay Sajip

_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
