You raise a really good point, which is especially relevant in light of PyPI performance issues and discussions.
I'm copying the distutils and catalog SIGs to get some wider discussion. I apologize for the cross-posting.

I'm beginning to wonder about the strategy that setuptools uses, or maybe about the way we are using the index. It's important to note that there is nothing specific to the buildout package here.

It is very important to make multiple versions available to support requirements for specific package versions. It makes builds/installs repeatable, whether talking about buildout or other systems built on setuptools. When someone has tested and wants to release an application built from a collection of distributions, they will want to specify those *specific* versions for future builds or installs. This means that we need to retain every published version indefinitely, in a way that setuptools can find.

Currently, the only way to support multiple versions with the cheeseshop is to unhide past releases. This has a fairly severe effect on performance. As the example below shows, setuptools will fetch the package page and then fetch the pages for each release. That's a lot of requests. What makes it worse is that the individual package pages can be fairly long. I've gotten in the habit of including full documentation on every release page. For example, recent release pages for zc.buildout are around 200K. This is a fairly significant amount of data to transfer, and it will certainly make the scanning process take a long time for clients. (Obviously, if we keep doing things the way we are, I'll need to stop doing that.) All of this aggravates any performance problems we might have.

Up to now, setuptools has tried hard to use existing systems without change. This means that it reuses systems designed primarily for people, not software. I think that setuptools rightly took the approach it has up to now so that progress could be made without making people change other systems.
This was appropriate when setuptools was evolving and people were figuring out ways to use it.

I think it is time to take a step back and think a lot harder about how we'd want to structure an index to support setuptools. IMO, a setuptools-aware index would have a single page for each package:

- The single page would be published in a case-insensitive way. It would be nice to find a way to avoid this, or maybe we should use a Windows-based web server. :) It would also be served very cheaply, for example statically.

- The single page would list links for all available distributions, which should include all distributions published. It would also list any other URLs that should be scanned for releases, when releases aren't all uploaded to PyPI.

- The single page would contain very little additional information. It would be for use by software, not humans.

In addition, the root page with a trailing / would be empty and very cheap.

There are a lot of ways we could achieve this pretty cheaply while keeping the existing system pretty much as it is. For example, the current effort to bake static pages could bake these pages instead. We could make the new index available at a different URL for people to play with while we worked the kinks out of the process.

Of course, those of us who use the cheeseshop and setuptools extensively can also achieve much of this by changing the way we work.

Thoughts?

Jim

On Jul 10, 2007, at 8:44 AM, Philipp von Weitershausen wrote:

> When easy_installing zc.buildout I realized that the CheeseShop
> still lists a gazillion old versions of zc.buildout. That makes it
> take quite some time to install zc.buildout (see below), and I
> reckon the same sort of check has to happen each time it looks for
> a new version of that egg...
>
> Is there any reason for having so many old versions around?
>
>
> $ easy_install zc.buildout
> Searching for zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b19
> Reading http://svn.zope.org/zc.buildout
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b22
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b23
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b20
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b21
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b26
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b27
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b24
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b25
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b28
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b17
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b16
> Reading http://cheeseshop.python.org/pypi/zc.buildout/1.0.0b18
> Best match: zc.buildout 1.0.0b28
> ...

--
Jim Fulton          mailto:[EMAIL PROTECTED]    Python Powered!
CTO                 (540) 361-1714              http://www.python.org
Zope Corporation    http://www.zope.com         http://www.zope.org
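
To make the single-page idea concrete, here is a minimal sketch in Python of how a client might consume such a page with one request instead of one per release. The page markup, the file names in SAMPLE_PAGE, and the helper names (LinkCollector, distribution_links, links_for_version) are all assumptions made for illustration; this is not setuptools' actual scanning code, and the filename matching is deliberately naive.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def distribution_links(page_html):
    """Return all links found on a (hypothetical) per-package index page."""
    parser = LinkCollector()
    parser.feed(page_html)
    return parser.links


def links_for_version(links, project, version):
    """Pick links whose filename is '<project>-<version>' followed by '.' or '-'.

    Naive matching, for illustration only: it does not normalize case or
    parse version numbers.
    """
    prefix = f"{project}-{version}"
    matches = []
    for link in links:
        filename = link.rsplit("/", 1)[-1]
        next_char = filename[len(prefix):len(prefix) + 1]
        if filename.startswith(prefix) and next_char in (".", "-"):
            matches.append(link)
    return matches


# Hypothetical page content: just links, with nothing meant for human readers.
SAMPLE_PAGE = """
<html><body>
<a href="zc.buildout-1.0.0b27.tar.gz">zc.buildout-1.0.0b27.tar.gz</a>
<a href="zc.buildout-1.0.0b28.tar.gz">zc.buildout-1.0.0b28.tar.gz</a>
<a href="zc.buildout-1.0.0b28-py2.4.egg">zc.buildout-1.0.0b28-py2.4.egg</a>
</body></html>
"""

links = distribution_links(SAMPLE_PAGE)
pinned = links_for_version(links, "zc.buildout", "1.0.0b28")
```

With a page like this, pinning a specific version costs a single small fetch regardless of how many old releases remain published.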
