Interesting! I produced that dump as part of a demo of using Xapian for cheese shop search (still a work in progress, when I get a free moment). Adding e.g. a "depends:" operator is something I'd like, and your database sounds very useful for achieving that goal.
Thanks for the link. I may be e-mailing you shortly ;)

On 17 May 2013 02:50, Daniel Holth <[email protected]> wrote:

> On Thu, May 16, 2013 at 3:46 PM, David Wilson <[email protected]> wrote:
> > Would something like http://pypi.h1.botanicus.net/static/dump.txt.gz be
> > useful to you? (warning: 57mb expanding to 540mb). Each line is a
> > JSON-encoded dict containing a single package release.
> >
> > for line in gzip.open('dump.txt.gz'):
> >     dct = json.loads(line)
> >     ....
> >
> > etc
> >
> > The code for it is very simple, would be willing to clean it up and turn it
> > into a cron job if people found it useful.
> >
> > Note the dump above is outdated, I only made it as a test.
>
> Seems like a useful format.
>
> https://bitbucket.org/dholth/pypi_stats is a prototype that parses
> requires.txt and other metadata out of all the sdists in a folder,
> putting them into a sqlite3 database. It may be interesting for
> experimentation. For example, I can easily tell you how many different
> version numbers there are and which are the most popular, or I can
> tell you which metadata keys and version numbers have been used. The
> database winds up being 1.6 GB or about 200MB if you delete the
> unparsed files.
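
For anyone who wants to experiment with the dump format quoted above (one JSON-encoded dict per line, gzip-compressed), here is a minimal sketch of tallying version numbers straight from the file. It is only an illustration: the "version" key is an assumption, since the fields in each dict are not listed in this thread, and the helper names iter_releases and count_versions are made up for the example.

```python
import gzip
import json
from collections import Counter


def iter_releases(path):
    """Yield one dict per non-empty line of a gzip-compressed JSON-lines file."""
    # Text mode ("rt") so each line is a str rather than bytes.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_versions(path, key="version"):
    """Count how often each value of `key` appears across all releases."""
    # ASSUMPTION: "version" is a guessed key name; adjust it to whatever
    # the real dump uses. Records without the key are skipped.
    counts = Counter()
    for release in iter_releases(path):
        value = release.get(key)
        if value is not None:
            counts[value] += 1
    return counts
```

Something like count_versions('dump.txt.gz').most_common(10) would then list the ten most frequent version strings, provided the key name guess is right. The file is streamed line by line, so the 540mb expanded size never has to fit in memory at once.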
_______________________________________________
Distutils-SIG maillist - [email protected]
http://mail.python.org/mailman/listinfo/distutils-sig
