I like this kind of study. Fixing 1,300 packages sounds a lot more manageable than fixing 18,000. (I took a similar look at setup.py, but with the ast module instead of actually running the things. Your method is probably more accurate.) It would be very cool to know how many packages use if statements to change install_requires...
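For what it's worth, the ast approach can be sketched roughly like this. It is only a heuristic, not the script I actually ran: the names has_conditional_requires, mentions_target and SAMPLE are made up for illustration, and it only catches install_requires being named directly inside an if/else branch, so a list built under another variable name and passed to setup() later would be missed.

```python
import ast

TARGET = "install_requires"  # the setup() keyword we are looking for


def mentions_target(node):
    """Return True if anything under `node` names install_requires."""
    for child in ast.walk(node):
        # A plain variable, e.g. install_requires.append(...)
        if isinstance(child, ast.Name) and child.id == TARGET:
            return True
        # A keyword argument, e.g. setup(install_requires=[...])
        if isinstance(child, ast.keyword) and child.arg == TARGET:
            return True
    return False


def has_conditional_requires(source):
    """Heuristic: does any if/else branch in `source` touch install_requires?"""
    tree = ast.parse(source)  # parses only; setup.py is never executed
    for node in ast.walk(tree):
        if isinstance(node, ast.If):
            branches = node.body + node.orelse
            if any(mentions_target(stmt) for stmt in branches):
                return True
    return False


# Hypothetical setup.py contents, used only to exercise the function.
SAMPLE = '''
import sys
from setuptools import setup

install_requires = ["requests"]
if sys.version_info < (2, 7):
    install_requires.append("argparse")

setup(name="demo", install_requires=install_requires)
'''

print(has_conditional_requires(SAMPLE))
```

Because the source is only parsed, a broken or hostile setup.py cannot do any harm, but anything computed at run time is invisible to it, which is why running the files is probably more accurate.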
I have tried to include the vital setuptools metadata in Metadata 1.3 without the JSON. It maps to an ordered dict. A few files like entry_points.txt stay in their own files, and are better off (more performant) that way, since you may be able to avoid parsing METADATA at all if you just want to know whether a package has entry points (os.path.exists("entry_points.txt")).

Did you look at the bento ipkg (internal package metadata) format? A barebones one is at https://gist.github.com/3715068

Is there a good "download the latest versions of everything hosted on PyPI" script? Mine was pretty terrible, as it could not resume after a crash or after the data got stale.

_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig