On 12/30/2009 10:57 AM, [email protected] wrote:
On Dec 30, 2009, at 1:48 PM, Sebastien Douche wrote:
> On Sun, Dec 27, 2009 at 11:47, Lennart Regebro<[email protected]> wrote:
>
>> Out of a total of 8522 packages on PyPI, there are 203 packages (2.4%)
>> whose latest release provides neither a package on PyPI nor a
>> download URL. Of these, 16 do not provide any contact data.
>
> Hi Lennart,
> Glad to see someone is interested in a PyPI mirror. I have one here
> and it's a pity.
>
> Statistics (from the creation of the mirror/proxy; the goal is to
> avoid external downloads, like an internal Debian mirror):
> 2009-12-15 21:37:20,855 DEBUG Found (cached): 0
> 2009-12-15 21:37:20,855 DEBUG Stored (downloaded): 15367
> 2009-12-15 21:37:20,855 DEBUG Not found (404): 188
> 2009-12-15 21:37:20,855 DEBUG Invalid packages: 0
> 2009-12-15 21:37:20,855 DEBUG Invalid URLs: 54
> 2009-12-15 21:37:20,855 DEBUG Runtime: 208m38s
>
> The root issue (for me) is packages hosted outside PyPI: a lot of broken
> links, broken HTML pages or stupid scripts (cf. old SourceForge).

I will put a way of getting this data out; thanks for the heads-up.

Greetings Sebastien and Steve,

The way of getting [external packages] was already implemented. It is
called `setuptools.package_index`, which is what we use in our internal
mirror program (which we plan to open-source and perhaps also host
publicly). That program also handles the metadata extraction (PKG-INFO,
requires.txt) and the index files that I mentioned earlier.
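
A minimal sketch of what such metadata extraction might look like, in
Python. This is not the code of the mirror program mentioned above. It
assumes only that PKG-INFO uses RFC 822-style headers and that
requires.txt lists one requirement per line with optional [section]
headers. The function names and the sample data are hypothetical.

```python
from email.parser import Parser


def parse_pkg_info(text):
    """Parse PKG-INFO text (RFC 822-style headers) into {field: [values]}."""
    message = Parser().parsestr(text, headersonly=True)
    fields = {}
    for key, value in message.items():
        # Some fields (e.g. Classifier) may repeat, so collect lists.
        fields.setdefault(key, []).append(value)
    return fields


def parse_requires_txt(text):
    """Parse requires.txt text into {section: [requirement, ...]}.

    Requirements listed before any [section] header are stored under None.
    """
    sections = {None: []}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return sections


# Hypothetical sample data, for illustration only.
SAMPLE_PKG_INFO = """\
Metadata-Version: 1.0
Name: example-dist
Version: 0.1
Summary: A hypothetical distribution used for illustration
"""

SAMPLE_REQUIRES = """\
zope.interface
[test]
nose
"""

metadata = parse_pkg_info(SAMPLE_PKG_INFO)
requirements = parse_requires_txt(SAMPLE_REQUIRES)
```

The sketch works on text that has already been read from an unpacked
source distribution; locating and downloading the distribution itself is
left out on purpose.
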

It is of no use to pity z3c.pypimirror or any other mirror program,
because the issue is not with those programs, but with the lack of a
central archive from which all sources and metadata can be reliably
mirrored.

I will, once again, draw the reader's attention to the following:

[Steffen Mueller]
My thesis is that the huge success of the CPAN has been facilitated by
two factors[2]. The first is simplicity. When Jarkko Hietaniemi
originally came up with it, the CPAN was (and mostly still is) just an
FTP archive with a by-author directory structure that is mirrored many
times.

http://www.mail-archive.com/[email protected]/msg10537.html

-srid

_______________________________________________
Distutils-SIG maillist - [email protected]
http://mail.python.org/mailman/listinfo/distutils-sig