On 12/23/2009 1:33 PM, Lennart Regebro wrote:
On Wed, Dec 23, 2009 at 20:24, Sridhar Ratnakumar
<[email protected]>  wrote:
The reason why PyPI does not have such third-party services - I think - is
that it lacks the CPAN-like simple directory structure that can be easily
mirrored using ftp/rsync, to wit:

Nah, you can do that via /packages/, there is also an API to get the
metadata for a package. I think in general it's not an API problem.

It is indeed technically possible to do that with the PyPI XML-RPC API alone; but what I was referring to is enabling a mindset: a simple, *self-contained* directory structure (i.e., one where you do not need an API call to get the metadata), mirrorable with existing tools like rsync, could *enable* developers interested in extending packaging functionality (testing, quality measurements, documentation, search, and so on) to easily create and maintain such sites.
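For illustration, this is the sort of one-command mirroring workflow being described. The rsync URL below is hypothetical; PyPI exposes no such rsync endpoint, which is exactly the gap being pointed out:

```shell
# Hypothetical: mirror a self-contained PyPI tree (source tarballs plus
# per-release metadata files sitting beside them) the way CPAN mirrors
# are made, using nothing but rsync.
rsync -av rsync://pypi.example.org/packages/ ./pypi-mirror/
```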

At least, this is what - I understand - happened in the Perl community.

I think it's partly a problem that nobody has thunk the thought. I
think the idea of a site with automatically generated documentation
for *every* package is interesting. But I don't have time to work on
that right now. Talk to me again in six months, then I might have time
for another free-time project. :)

1/ Missing packages (eg: Twisted is not there)

The Twisted guys do not upload their packages to PyPI. I think that's
a mistake, but it's hardly PyPI's fault. There is no law saying you
have to use CPAN either.

Yes, as I said, it is more of a community issue than a PyPI one. What I also mentioned was that, because of this community issue, tools like easy_install/pip have had to resort to scraping project web pages to guess download links in an ad hoc fashion. Further down in the mail, I also suggested that PyPI disallow mere project listings (without sources) and require the sources themselves to be stored on the server. One way to achieve this is to require package authors to use the `sdist upload` toolchain, which automatically creates a source tarball that includes the metadata (in case one forgets to include it).
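The toolchain referred to is the standard distutils one: `sdist` writes a generated PKG-INFO file into the tarball it builds, so requiring this path would guarantee that every upload carries its metadata (the `upload` step assumes a configured PyPI account):

```shell
# Build a source tarball (PKG-INFO is generated inside it automatically)
# and upload it to PyPI in one step.
python setup.py sdist upload
```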

2/ No metadata: When only source tarballs are stored
[pypi.python.org/packages/source/P/Pylons/], what is the reliable way to a)
get the source for latest version

Download it from the above location.

b) get the source for a particular

Download it from the above location.

Perhaps if I rephrase the question it will be clear this time: when only source tarballs are stored [pypi.python.org/packages/source/P/Pylons/], what is the reliable way to a) get the source for the latest version (when /P/Pylons/ contains multiple versions; in other words, how do I find the latest version in the first place?), and b) get the source for a particular version (without having to construct the filename, or do ad hoc matching against filenames to guess that Pylons-1.2.3.tar.gz corresponds to version 1.2.3)? If the answer is to do an HTTP GET first, then please see the next response.
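To make the "ad hoc matching" concrete, here is a sketch of the filename-splitting guesswork a tool is forced into when no metadata file sits beside the tarballs. The helper name and the rule (split at the last hyphen) are my own illustration, not anything PyPI specifies, and the docstring notes where the guess breaks down:

```python
import re

def split_sdist_filename(filename):
    """Guess (name, version) from an sdist filename like 'Pylons-1.2.3.tar.gz'.

    This is exactly the kind of ad hoc matching complained about above:
    it assumes the last hyphen separates name from version, which happens
    to work for 'zope.interface-3.5.2.tar.gz' but is only a heuristic;
    a proper metadata file would make the guess unnecessary.
    """
    m = re.match(r"^(?P<name>.+)-(?P<version>[^-]+)\.(tar\.gz|tgz|zip)$", filename)
    if not m:
        return None
    return m.group("name"), m.group("version")
```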

version? In CPAN [cpan.org/modules/by-module/AppConfig/ABW/], each tarball
has a .meta file describing the module metadata (similar to PKG-INFO).

http://pypi.python.org/pypi?:action=doap&name=Twisted%20Mail&version=9.0.0

This is not a problem about missing API or functionality, but that you
don't know about it. In the last case that link exists at the bottom
of every package page. And you see how it works.

As the CPAN .meta example was given in the context of having a simple directory structure that can be mirrored using existing tools like rsync, what I was pointing out is the lack of such an implementation, not of the functionality itself (which, as you have shown, is currently available via an HTTP GET that returns XML content; not something that is rsync-friendly).
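For contrast with the XML served over HTTP, the PKG-INFO that `sdist` embeds is a plain RFC 822-style text file, readable with the standard library alone; had such files sat beside the tarballs, an rsync'ed mirror would be self-describing. A minimal sketch (the sample content below is made up to match the Pylons example):

```python
from email.parser import HeaderParser

# Illustrative PKG-INFO content; a real one is generated by `sdist`.
PKG_INFO = """\
Metadata-Version: 1.0
Name: Pylons
Version: 1.2.3
Summary: Pylons Web Framework
"""

def parse_pkg_info(text):
    # PKG-INFO uses RFC 822-style headers, so the stdlib email parser applies.
    headers = HeaderParser().parsestr(text)
    return {"name": headers["Name"], "version": headers["Version"]}
```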

don't want XmlRpc, but just files/directories (note simplicity in Steffen's
post).

It's not XML-RPC just because the metadata file is in XML format.

While the specific case mentioned above (metadata for a specific or the latest version of a package) uses HTTP GET and XML, generally speaking, to get a) the list of recent releases and b) the list of all versions of a package, one has to use the XML-RPC API methods `changelog` and `package_releases` respectively.
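A minimal sketch of those two XML-RPC calls, written against today's stdlib module name `xmlrpc.client` (the 2009-era name was `xmlrpclib`). The method names are the ones cited above; the client is passed in as a parameter so the calls can be exercised without network access, which is my own structuring choice:

```python
import xmlrpc.client

PYPI_XMLRPC = "https://pypi.org/pypi"  # endpoint name assumed for illustration

def all_versions(client, name):
    # package_releases(name, show_hidden) lists release versions of one package.
    return client.package_releases(name, True)

def recent_changes(client, since):
    # changelog(since) lists (name, version, timestamp, action) tuples
    # for everything that changed after the given Unix timestamp.
    return client.changelog(since)

# Real use would be:
#   client = xmlrpc.client.ServerProxy(PYPI_XMLRPC)
#   all_versions(client, "Pylons")
```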

But yes, you can't duplicate both the files and the metadata in one
go, you have to do it separately. But that then begs the question: How
often do you need to do both?

As often as the mirror sites would update their content (i.e., one or more times a day).

As often as the (future) third-party sites update their PyPI content (source + metadata). One such user is the PyPM backend itself, which at the moment uses XML-RPC to pull data from PyPI on a daily basis.

-srid
_______________________________________________
Distutils-SIG maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/distutils-sig
