On May 15, 2015, at 9:19 PM, Donald Stufft <don...@stufft.io> wrote:

>> 
>> On May 15, 2015, at 2:57 PM, Robert Collins <robe...@robertcollins.net> 
>> wrote:
>> 
>> So, I am working on pip issue 988: pip doesn't resolve packages at all.
>> 
>> This is O(alternatives_per_package ^ packages): if you are resolving 10
>> packages with 10 versions each, there are approximately 10^10 or 10G
>> combinations. 10 packages with 100 versions each gives 100^10 = 10^20.
>> 
>> So - it's going to depend pretty heavily on some good heuristics in
>> whatever final algorithm makes its way in, but the problem is
>> exacerbated by PyPI's nature.
>> 
>> Most Linux (all that i'm aware of) distributions have at most 5
>> versions of a package to consider at any time - installed(might be
>> None), current release, current release security updates, new release
>> being upgraded to, new release being upgraded to's security updates.
>> And their common worst case is actually 2 versions: installed==current
>> release and one new release present. They map alternatives out into
>> separate packages (e.g. when an older soname is deliberately kept
>> across an ABI incompatibility, you end up with 2 packages, not 2
>> versions of one package). So when comparing pip's challenge to apt's:
>> apt has ~20-30K packages, with alternatives ~= 2, while
>> pip has ~60K packages, with alternatives ~= 5.7 (I asked dstufft).
>> 
>> Scaling the number of packages is relatively easy; scaling the number
>> of alternatives is harder. Even 300 packages (the dependency tree for
>> OpenStack) still means ~2.4T combinations to probe.
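
The arithmetic above can be sketched in a couple of lines of Python (the
`search_space` helper is my own illustration, not anything in pip):

```python
# Back-of-envelope: a naive resolver's search space is
# versions_per_package ** packages (one version chosen per package).

def search_space(packages: int, versions_per_package: int) -> int:
    """Combinations a brute-force resolver could have to probe."""
    return versions_per_package ** packages

print(search_space(10, 10))    # 10**10: ten billion combinations
print(search_space(10, 100))   # 100**10 == 10**20
```

The exponent is the number of packages, which is why adding versions per
package hurts so much more than adding packages.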
>> 
>> I wonder if it makes sense to give some back-pressure to people, or at
>> the very least encourage them to remove distributions that:
>> - they don't support anymore
>> - have security holes
>> 
>> If folk consider PyPI a sort of historical archive then perhaps we
>> could have a feature to select 'supported' versions by the author, and
>> allow a query parameter to ask for all the versions.
>> 
> 
> There have been a handful of projects which would only keep the latest N
> versions uploaded to PyPI. I know this primarily because it has caused
> people a decent amount of pain over time. It's common for people's
> deployments to use a requirements.txt file like ``foo==1.0`` and to just
> continue to pull from PyPI. Deleting the old files breaks anyone doing
> that, so it would require either having people bundle their deps in their
> repositories or some way to get at those old versions. Personally I think
> we shouldn't go deleting the old versions or encouraging people to do that.

+1 for this. While I appreciate why Linux distros purge old versions, it is 
absolutely hellish for reproducibility. If you are looking for prior art, check 
out the Molinillo project (https://github.com/CocoaPods/Molinillo) used by 
Bundler and CocoaPods. It is not as complex as the Solve gem used in Chef, but 
it strikes a good balance between constraint-solving performance and avoiding 
false negatives when reporting solution failures.
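
To give a feel for the approach, here is a minimal backtracking resolver in
Python. This is my own toy sketch of the general technique resolvers like
Molinillo use, not its actual API; the `INDEX` data and `resolve` function
are invented for illustration:

```python
# Toy package index: package -> {version: {dependency: allowed_versions}}.
# All names, versions, and constraints here are made up for the example.
INDEX = {
    "app":  {2: {"lib": {1, 2}}, 1: {"lib": {1}}},
    "lib":  {2: {"util": {2}}, 1: {"util": {1, 2}}},
    "util": {2: {}, 1: {}},
}

def resolve(requirements, chosen=None):
    """Return a {package: version} assignment satisfying all constraints,
    or None if the requirements are unsatisfiable."""
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, allowed), *rest = list(requirements.items())
    if name in chosen:
        # Already pinned: it must also satisfy this new constraint.
        return resolve(dict(rest), chosen) if chosen[name] in allowed else None
    for version in sorted(INDEX[name], reverse=True):  # prefer newest
        if version not in allowed:
            continue
        chosen[name] = version
        # Merge this version's dependencies into the outstanding requirements,
        # intersecting with any constraints already collected.
        merged = dict(rest)
        ok = True
        for dep, dep_allowed in INDEX[name][version].items():
            merged[dep] = merged.get(dep, dep_allowed) & dep_allowed
            if not merged[dep]:
                ok = False  # empty constraint set: prune this branch early
                break
        if ok:
            result = resolve(merged, chosen)
            if result is not None:
                return result
        del chosen[name]  # backtrack and try an older version
    return None

print(resolve({"app": {1, 2}}))  # {'app': 2, 'lib': 2, 'util': 2}
```

The worst case is still exponential, which is exactly Robert's point; the
heuristics (newest-first ordering, failing fast on empty constraint sets)
are what keep the common cases tractable.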

--Noah


_______________________________________________
Distutils-SIG maillist  -  Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig
