On Mar 8, 2013, at 8:13 AM, Donald Stufft <don...@stufft.io> wrote:
>
> On Mar 8, 2013, at 8:07 AM, Jesse Noller <jnol...@gmail.com> wrote:
>
>> As long as external URLs eventually are completely removed I'm okay with
>> caching things
>
> So I have mixed feelings on caching the urls. I'm not completely against it
> however it does present a problem of "Well how do we know if the url we are
> fetching is the accurate url for that package". Downloading and caching them
> and presenting them the same as if someone uploaded them directly to PyPI
> loses a point of distinction between "PyPI can verify this is the package
> that the author intended to release" and "This is something we think that the
> author releases, maybe, probably?".

The distinction can be kept with a rel="external" or rel="cached" or whatever. I believe all the tools will still find them as downloadable targets, and they can be adapted to print a warning if that's desired.

We *might* be caching a package that has already been replaced by an attacker, but by caching and centralizing it we have a better way of removing it once it's found.

The legal issues are something we'd probably need to ask VanL about.

So that's an OK, Neutral, and Unknown for my 3 major complaints.

>
> It does solve the backwards compatibility issue of killing external urls
> immediately so I'm not flat out against it, but there may be legal issues
> involved too?
>
>>
>> On Mar 8, 2013, at 6:49 AM, "M.-A. Lemburg" <m...@egenix.com> wrote:
>>
>>> On 08.03.2013 02:40, Donald Stufft wrote:
>>>> So I updated my script (had to remove eventlet) and I believe it's now
>>>> accurate. The total time was ~54 hours so this is hardly scientific but it
>>>> should give a good idea what sort of impact we are talking about.
>>>>
>>>> This is a list of versions that pip's PackageFinder (what it uses to
>>>> locate packages to install) could find that were not available on PyPI.
>>>>
>>>> The results and script is available at:
>>>> https://gist.github.com/dstufft/5088915
>>>>
>>>> Some statistics:
>>>>
>>>> Projects affected (with dev): 2269
>>>> Versions affected (with dev): 8006
>>>>
>>>> Projects affected (without dev): 1880
>>>> Versions affected (without dev): 7586
>>>>
>>>> These numbers are if all external urls were immediately removed from PyPI,
>>>> so this would be the total affected. This does not test if the actual
>>>> package is installable, just if pip is able to locate an url that it
>>>> thinks represents a version for that project.
>>>
>>> Thanks for running the test.
>>>
>>> About 10% of all packages. The numbers are already impressive,
>>> but if you factor in the popularity of some of those
>>> packages, the situation becomes worse.
>>>
>>> I'm beginning to wonder whether caching the external link content
>>> on the PyPI CDN wouldn't be a better idea.
>>>
>>> We'd have to make that legally waterproof and also have an opt-out
>>> mechanism, but it would get us from here to there a lot faster.
>>>
>>> Together with the added hash tag on the download file URLs (*),
>>> this would solve the availability and the security aspects.
>>> Instead of deprecating external links altogether, we could then
>>> deprecate non-compliant download links and get an overall
>>> very flexible system for Python package distribution.
>>>
>>> (*) Yes, I know, I still have to deliver the updated proposal -
>>> been working on getting our indexes ready to serve as example :-)
>>>
>>> --
>>> Marc-Andre Lemburg
>>> eGenix.com
>>>
>>> Professional Python Services directly from the Source (#1, Mar 07 2013)
>>>>>> Python Projects, Consulting and Support ... http://www.egenix.com/
>>>>>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/
>>>>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
>>> ________________________________________________________________________
>>>
>>> ::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::
>>>
>>> eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
>>> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
>>> Registered at Amtsgericht Duesseldorf: HRB 46611
>>> http://www.egenix.com/company/contact/
>>> _______________________________________________
>>> Catalog-SIG mailing list
>>> Catalog-SIG@python.org
>>> http://mail.python.org/mailman/listinfo/catalog-sig

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
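
To make the rel="external" / rel="cached" idea in the reply above a bit more concrete, here is a rough sketch of how an installer could tell the three kinds of links apart and print a warning for the ones PyPI cannot vouch for. It is only an illustration under assumptions: the rel values are the ones floated in this thread, nothing here says they are an agreed convention, and the names LinkClassifier, classify_links, warnings_for and the sample page are all made up for the example. It is not how pip's PackageFinder works.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Collect (href, kind) pairs from the <a> tags of an index page.

    Hypothetical helper: "cached" and "external" are the rel values
    suggested in the thread, not an established standard.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, or be missing.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "cached" in rel_tokens:
            kind = "cached"
        elif "external" in rel_tokens:
            kind = "external"
        else:
            # Assumption: no marker means the file was uploaded to the index.
            kind = "hosted"
        self.links.append((href, kind))


def classify_links(page_html):
    """Return a list of (href, kind) for every link on the page."""
    parser = LinkClassifier()
    parser.feed(page_html)
    parser.close()
    return parser.links


def warnings_for(links):
    """Build one warning line per link that was not uploaded directly."""
    return [
        f"warning: {href} is {kind}, not uploaded directly to the index"
        for href, kind in links
        if kind != "hosted"
    ]


# Made-up sample page with one link of each kind.
SAMPLE_PAGE = """
<a href="/packages/example-1.0.tar.gz">example-1.0.tar.gz</a>
<a rel="cached" href="/cache/example-0.9.tar.gz">example-0.9.tar.gz</a>
<a rel="external" href="http://example.com/example-0.8.tar.gz">example-0.8.tar.gz</a>
"""

if __name__ == "__main__":
    for line in warnings_for(classify_links(SAMPLE_PAGE)):
        print(line)
```

All three links stay discoverable as download targets; the marker only changes whether a warning is shown, which is the point of keeping the distinction visible.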