On May 9, 2014, at 1:41 PM, Paul Moore <p.f.mo...@gmail.com> wrote:

> On 9 May 2014 16:56, Donald Stufft <don...@stufft.io> wrote:
>> Right, but I think a similar win can be had just by folding --allow-external
>> into --allow-unverifiable and make it --allow-off-pypi (needs a better name,
>> maybe just keep it as --allow-external?). This would effectively mean that
>> an end user cannot say "allow safe downloading X externally but disallow
>> downloading it unsafely externally".
>
> I still find this hard to understand. If I get what you're saying, you
> would rather have a single flag that claims to be to allow externally
> hosted files to be downloaded, regardless of whether they are safe or
> not than have a clean security model that says you need to opt into
> downloading unverifiable files simply to avoid allowing users to
> download argparse (or any of the other 0.x% of files that are safe but
> external) by default?
>
> Once again, I'm struggling to see why *safe* externally hosted files
> are such a bad thing.

- Introduces the chance of package-specific random failures that the Python
  Infra team have no ability to fix (We have someone on call all the time).
- Makes it harder for people to install some packages in restricted
  environments where they need to ask special permission to add individual
  hosts to a firewall.
- Even when it does work, there is a good chance people who use projects that
  are hosted externally are going to experience slower, higher-latency
  downloads, as it's unlikely that they are going to be hosted behind a
  geo-distributed CDN like Fastly.
- Makes it harder for people to host their own mirrors of PyPI. If it's hosted
  on PyPI, people can legally download it and distribute it; if it's hosted
  externally, they may or may not be able to do that. This means that people
  must manually mirror packages that are not hosted on PyPI instead of having
  software like bandersnatch handle it all completely.
- Is surprising behavior to most people.
- Is complicated to explain and implement.
- Is useful to practically nobody.

>
>> I'm normally someone who advocates towards better decisions on the security
>> side of things, however if most people are going to need to use the
>> --allow-unverifiable flag anyways then I think the benefits of having the
>> two separated isn't very large. There is still a benefit to not installing
>> externally hosted things by default which is why I think that just rolling
>> the two options together is better.
>
> This is what bothers me about your position. I would expect you to be
> insisting that unverifiable downloads *have* to be opt-in, and that's
> why I've never advocated removing or changing the meaning of the
> --allow-unverifiable flag. I agree with that position, and want things
> to stay as they are for unverifiable links.
> And yet you seem to be in
> favour of diluting that straightforward, strong security message just
> to make users opt into a tiny minority of files that are completely
> safe to download, but which are not hosted on PyPI.

So I do unequivocally believe that unsafe downloads *must* be opt in by
default. I also believe that external downloads *should* be opt in by default.

In the current situation we have two knobs that control these independently.
The paranoid security person in me loves this because it means that for some
set of projects I can still opt in to the reliability hit but not the security
hit. However, the UX person in me hates this because more knobs means more
confusion, especially in this situation, because the line between what is
external+safe and what is external+unsafe isn't very easy to explain.

So in my mind I've had to reconcile these two viewpoints, and when I look at
the set of projects which are utilizing the external+safe hosting option I
cannot find anything that tells me that many people are ever going to utilize
it, because the fact is projects simply are not using it in any meaningful
numbers.

There are currently 0.06% (23 total) of projects on PyPI that have *all* of
their files hosted off of PyPI but done so safely. Looking closer at them, the
number that have files that will actually be installed by pip specifically
drops down to 0.04% (15).

Originally I had pointed out that 0.2% of projects host *any* files externally
but safely. Looking again closer at them and removing projects which have also
uploaded all files to PyPI, or for which the file(s) that are safely hosted
externally are not otherwise suitable, I've determined that only 0.08% (32) of
projects which I was able to discover any files for would be helped *in any
way* by the external+safe option.
And looking even closer at those, only 0.07% (26) of them will have the
outcome of ``pip install whatever`` change (in other words, the latest version
requires external+safe).

So when I look at the data, I cannot make a very good claim to the UX side of
me that external+safe deserves its own option when the number of projects
which would ever use it instead of external+unsafe is minuscule, and of those
projects I can only point to argparse and mysql-connector-python as ones which
are likely to affect many people at all.

So my beliefs are (in order of priority/conviction):

1. Unsafe downloads *MUST* be opt in.
2. External downloads *SHOULD* be opt in.
3. There are not enough potential users for separate knobs that allow
   external+safe or external+unsafe individually and they should be collapsed
   into a single option.

So following those beliefs leads me to conclude that the best results for these
options are (in order of preference):

A. Collapse the two options into a single option and have it off by default.
    - Satisfies #1 and #2 because they are opt in.
    - Satisfies #3 because users don't have to futz with two different options.
    - Slightly makes me sad that people can't install external things safely,
      but the UX win makes up for it.
B. Leave the situation as is, two options and one off by default.
    - Satisfies #1 and #2 because they are opt in.
    - Throws away #3.
C. Keep the options separate, but enable --allow-all-external by default and
   re-add the --no-allow-external option.
    - Satisfies #1 because it is opt in.
    - Makes #2 sad because it is opt out.
    - Throws away #3.
D. Remove the --allow-external family of options and enable them by default
   and always.
    - Satisfies #1 because it is opt in.
    - Throws away #2.
    - Throws away #3.

There are other benefits to option (A) of course. I'm something of a public
figure with packaging so people often come to me with questions, concerns,
comments, etc.
In that capacity I've had a number of people confused about the difference
between an external file and an unverifiable file. In these cases I've had
some difficulty explaining what the difference is, especially since a lot of
people have zero idea how the installer API even works. In the words of the Zen
-> "If the implementation is hard to explain, it's a bad idea." and I can tell
you that it is quite difficult to explain it to people.

The rules are:

    If there is a <meta api-version="2"> tag:
        Trust all URLs with rel="internal" on the simple page

        Require --allow-external for any URL on the simple page that links
        directly to something that looks installable to pip and that also
        includes a hash fragment like #<hash_name>=<hash_value> which can be
        either md5, sha1, or in the sha-2 family.

        Require --allow-unverifiable for any URL on that simple page that
        directly links to something that looks installable to pip and that
        does not include a hash fragment with a hash in it. Also any URL that
        can be found by looking for URLs on the simple page that are linked
        with a rel=download or rel=homepage, fetching that page, and
        processing its HTML looking for direct links to files that look
        installable to pip.
    else:
        Trust all URLs that directly link to a file on the simple page

        Trust all URLs that can be found by looking for links with
        rel=download or rel=homepage, fetching that page, and processing its
        HTML looking for direct links that look installable to pip.

That's complex, and quite often when I explain that the first response is
"what's a simple index?". Mostly I only need to explain the first part,
though; I can skip the "else" part because most people don't install from
anything other than PyPI.

On the flip side option (A) allows us to make this much simpler overall. We can
simply do:

    If it's hosted on PyPI:
        Trust it.
    else if it's not hosted on PyPI:
        Require a --allow-external-and-unverifiable [*]

This is *much*, *much*, *much* easier to explain, and I think it may be a good
idea a la the Zen.

>
> I'm genuinely concerned here that I'm missing a glaringly obvious
> reason why off-PyPI safe files are such a bad thing. You (and Nick,
> and the authors of PEP 438) seem to be willing to accept a lot of
> negative feeling and user unhappiness to defend making pip a
> PyPI-only-by-default tool. I'd much rather that PyPI stand on its own
> merits (which are many and compelling) rather than need a "use us or
> pip will make your life inconvenient" crutch, which is what the
> current behaviour feels like.

Actually my opinion is that allowing external+safe files by default is not
going to have any meaningful impact on *any* (or at the very least, 99.9%) of
pip's users.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
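
For comparison, the two rule sets described in this message can be sketched in
Python. This is only an illustration of the decision logic as stated in the
text, not pip's actual code: the `Link` class, its fields, the `SAFE_HASHES`
set (my reading of "md5, sha1, or in the sha-2 family"), and both function
names are hypothetical, and --allow-external-and-unverifiable is the
placeholder flag name used in the message.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed reading of "md5, sha1, or in the sha-2 family".
SAFE_HASHES = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


@dataclass
class Link:
    """Hypothetical record for one candidate file link."""
    url: str
    rel: Optional[str] = None  # rel attribute on the simple page, if any
    direct: bool = True        # False if found by scraping a rel=download
                               # or rel=homepage page


def has_hash_fragment(url: str) -> bool:
    """True if the URL carries a #<hash_name>=<hash_value> fragment."""
    _, _, fragment = url.partition("#")
    name, sep, value = fragment.partition("=")
    return bool(sep) and name in SAFE_HASHES and bool(value)


def current_rule(link: Link, api_version_2: bool) -> str:
    """What the current two-flag rules require for a link."""
    if not api_version_2:
        # The "else" branch: everything found is trusted.
        return "trusted"
    if link.rel == "internal":
        return "trusted"
    if link.direct and has_hash_fragment(link.url):
        return "--allow-external"
    # No hash, or only discovered by scraping another page.
    return "--allow-unverifiable"


def proposed_rule(hosted_on_pypi: bool) -> str:
    """What option (A) would require."""
    if hosted_on_pypi:
        return "trusted"
    return "--allow-external-and-unverifiable"
```

The current rules need three inputs (the api-version tag, the rel attribute,
and the hash fragment) plus knowledge of how the link was discovered, while
option (A) needs only one yes/no question.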
_______________________________________________
Distutils-SIG maillist - Distutils-SIG@python.org
https://mail.python.org/mailman/listinfo/distutils-sig