On Friday, 21 March 2008 19:45:22, Martin v. Löwis wrote:
> > I did some research on which interfaces are available to retrieve
> > data from the index. The two most promising interfaces are the
> > XML-RPCs and http://pypi.python.org/simple. But both lack a compact
> > index (information that I can download with only one request) that
> > contains at least the package names and the available versions.
> 
> I can't quite understand where the need for a *single* request comes
> from. The information is surely available by use of multiple requests.
> 
> > Would it be possible to extend the existing interfaces so that they
> > fit these needs?
> > 
> > I'll gladly help working on those new interfaces if help is wanted. 
> > Thanks in advance for any suggestions.
> 
> Indeed, such an interface could only become reality by means of
> somebody contributing it. However, before you start doing so:
> have you considered alternatives, such as using multiple requests,
> along with incremental updates?
> 
> If the interface were available, how would you use it? (e.g. how
> often, and what for)
> 
> Regards,
> Martin
> 
> 

The need for a single request is basically a matter of efficiency, as the
use case further down shows.

Usually, if all the metadata is readily available (i.e. without downloading
all packages), the package manager periodically (at most once a day, typically
once a week or less) synchronises all "repositories" (containing all necessary
package metadata). This means the metadata is stored on the user's disk for
further use.
Considering the server load, synchronising does not seem to be an option at
the moment (IMHO), since with the current interfaces it would mean one request
for each package in the repository.

The problems with the currently available interfaces are best shown by a use
case.
Let's say the user wants to update all installed packages:

Without the ability to sync (as described above) this would mean:
1. requesting one page per installed package to determine which versions are
available
2. downloading the new versions
3. resolving the dependencies and starting again at step 1 for every new
dependency
4. installing the packages
This results in m+n+2*q requests to the server (m = number of installed
packages, n = number of updateable packages, q = number of new dependencies).
The factor 2 arises because each new dependency costs one version lookup and
one download. Typically m is by far the largest number.

With the ability to sync this would be:
1. syncing the repository
2. determining newer versions and resolving their dependencies
3. downloading and installing the list of packages
This results in 1+n+q requests (one for the sync, plus one download per
updated package and per new dependency).
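
To make the comparison concrete, here is a small Python sketch of the two
counts. The function names and the sample numbers are made up for
illustration; only the formulas come from the steps above.

```python
def requests_without_sync(m: int, n: int, q: int) -> int:
    """Requests needed when every package has to be queried individually.

    m: installed packages, n: updateable packages, q: new dependencies.
    """
    # m version lookups, n downloads, plus one lookup and one download
    # for each new dependency.
    return m + n + 2 * q


def requests_with_sync(n: int, q: int) -> int:
    """Requests needed when a compact index can be fetched in one go."""
    # One request for the index, then one download per updated package
    # and per new dependency.
    return 1 + n + q


if __name__ == "__main__":
    # Hypothetical numbers, not measurements.
    m, n, q = 200, 10, 3
    print("without sync:", requests_without_sync(m, n, q))
    print("with sync:   ", requests_with_sync(n, q))
```

The difference between the two is m+q-1 requests, so the saving grows with
the number of installed packages.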

This shows that it would vastly improve the situation if at least the
available versions were provided in a way similar to the simple index
http://pypi.python.org/simple or the corresponding XML-RPC.
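
As a rough sketch of how a client might use such an index once it is stored
locally: the code below assumes the index has already been parsed into a
mapping from package name to a list of version strings. That format, the
function names and the naive version comparison (dot-separated integers only)
are my assumptions for illustration, not something the current interfaces
provide.

```python
def parse_version(version: str) -> tuple:
    # Naive assumption: versions are dot-separated integers such as "1.2.10".
    # Real version strings are messier and would need a proper parser.
    return tuple(int(part) for part in version.split("."))


def find_updates(installed: dict, index: dict) -> dict:
    """Return {name: newest_version} for packages with a newer version.

    installed: {name: installed_version}
    index:     {name: [available_version, ...]} (hypothetical compact index)
    """
    updates = {}
    for name, current in installed.items():
        available = index.get(name, [])
        if not available:
            continue  # package not listed in the index
        newest = max(available, key=parse_version)
        if parse_version(newest) > parse_version(current):
            updates[name] = newest
    return updates


if __name__ == "__main__":
    installed = {"foo": "1.0", "bar": "2.1"}
    index = {"foo": ["0.9", "1.0", "1.10"], "bar": ["2.0", "2.1"]}
    print(find_updates(installed, index))
```

No request to the server is needed for this step; only the downloads that
follow cause network traffic.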

I hope this clears things up a bit.

Regards,
Roman
_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig
