Re: [Distutils] What to do about the PyPI mirrors
Hi,

looks like I'm late to the party to figure out that I'm going to be hurt again.

I'd like to suggest explicitly considering what is going to break due to this and how much work you are forcefully inflicting on others.

My whole experience around packaging (distribute/setuptools) and mirroring/CDN this year puts the cost for my company somewhere between 10k and 20k EUR just for keeping up with the breakage those changes incur. It might be that we're wonderfully stupid (..enough to contribute) and all of this causes no headaches for anybody else ….

Overall, guessing that the packaging infrastructure is used by probably multiple thousands of companies, I'd expect that at least 100 of them might be experiencing problems like us. Juggling arbitrary numbers, I can see that we're inflicting around a million EUR of cost that nobody asked for.

More specific statements below.

On 2013-08-04 22:25:01 +, Donald Stufft said:

> Here's my PEP for Deprecating and Removing the Official Public Mirrors
>
> Its source is at: https://github.com/dstufft/peps/blob/master/mirror-removal.rst
>
> Abstract
> ========
>
> This PEP provides a path to deprecate and ultimately remove the official public mirroring infrastructure for `PyPI`_. It does not propose the removal of mirroring support in general.

-1 - maybe I don't have the right to speak up on CDN usage, but personally I feel it's a bad idea to delegate overall PyPI availability exclusively to a commercial third party. It's OK for me that we're using them to improve PyPI availability, but completely putting our faith in their hands doesn't sound right to me.

> Rationale
> =========
>
> The PyPI mirroring infrastructure (defined in `PEP381`_) provides a means to mirror the content of PyPI used by the automatic installers. It also provides a method for autodiscovery of mirrors and a consistent naming scheme.
>
> There are a number of problems with the official public mirrors:
>
> * They give control over a \*.python.org domain name to a third party, allowing that third party to set or read cookies on the pypi.python.org and python.org domain name.

Agreed, that's a problem.

> * The use of a sub domain of pypi.python.org means that the mirror operators will never be able to get a certificate of their own, and giving them one for a python.org domain name is unlikely to happen.

Agreed.

> * They are often out of date, most often by several hours to a few days, but regularly several days and even months.

That's something that the mirroring infrastructure should have been constructed for. I completely agree that the way the mirroring was established was way sub-optimal. I think we can do better.

> * With the introduction of the CDN on PyPI the public mirroring infrastructure is not as important as it once was as the CDN is also a globally distributed network of servers which will function even if PyPI is down.

Well, now we have one breakage point more which keeps annoying me. This argument is not completely true. They may be getting better over time but we have invested heavily to accommodate the breakage - that needs to be balanced with some benefit in the near future.

> * Although there are provisions in place for it, there is currently no known installer which uses the authenticity checks discussed in `PEP381`_ which means that any download from a mirror is subject to attack by a malicious mirror operator, but furthermore, due to the lack of TLS, it also means that any download from a mirror is also subject to a MITM attack.

Again, I think that was a mistake during the introduction of the mirroring infrastructure: too few people, too confusing a PEP.

> * They have only ever been implemented by one installer (pip), and its implementation, besides being insecure, has serious issues with performance and is slated for removal with its next release (1.5).

Only if you consider the mirror auto-discovery protocol. I'm not sure whether using DNS was such a smart move. A simple HTTP request to find mirrors would have been nice. I think we can still do that.

Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that.

We are also thinking about providing system-level default configuration to hint tools like pip and setuptools to a different default index that is closer from a network perspective. From a customer perspective this should be PyPI.

I'd like to avoid breakage. Again, if you don't let me choose where to spend my time, I'd rather invest the time I need for cleaning up the breakage into something constructive.

The indices are in active use. f.pypi.python.org is seeing between 150-300GB of traffic per month, the patterns widely ranging over the last month. This is traffic that is not used internally from gocept.

> Due to the number of issues, some of them very serious, and the CDN which more or less
Re: [Distutils] What to do about the PyPI mirrors
Two more things: why is the CDN not suffering from the security problems you describe for the mirrors?

a) Fastly seems to be the one owning the certificate for pypi.python.org. What?!?

b) What does stop Fastly from introducing incorrect/rogue code in package downloads?

Christian

___
Distutils-SIG maillist - Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On Aug 5, 2013, at 11:11 PM, Christian Theune c...@gocept.com wrote:

> Two more things: why is the CDN not suffering from the security problems you describe for the mirrors?
>
> a) Fastly seems to be the one owning the certificate for pypi.python.org. What?!?

They have a delegated SAN for it, which DigiCert (the CA) authorizes with the domain contact (the board in this case).

> b) What does stop Fastly from introducing incorrect/rogue code in package downloads?

Basically this one boils down to personal trust from me to the Fastly team, combined with the other companies using them being very reputable. At the end of the day, there is not currently any cryptographic mechanism preventing Fastly from doing bad things.

--Noah
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 2:31 AM, Noah Kantrowitz n...@coderanger.net wrote:

> On Aug 5, 2013, at 11:11 PM, Christian Theune c...@gocept.com wrote:
>
>> Two more things: why is the CDN not suffering from the security problems you describe for the mirrors?
>>
>> a) Fastly seems to be the one owning the certificate for pypi.python.org. What?!?
>
> They have a delegated SAN for it, which DigiCert (the CA) authorizes with the domain contact (the board in this case).
>
>> b) What does stop Fastly from introducing incorrect/rogue code in package downloads?
>
> Basically this one boils down to personal trust from me to the Fastly team, combined with the other companies using them being very reputable. At the end of the day, there is not currently any cryptographic mechanism preventing Fastly from doing bad things.

To further expand on this answer, you need to trust *someone*. If we cut out Fastly here you could say, well, what prevents Dyn Inc (the DNS host) from simply redirecting the DNS to a different host? What prevents OSUOSL from simply accessing the machines stored there and doing bad things (™)? Hell, how many people here know the entire infrastructure team and have personally decided to trust them?

At the end of the day you need to pick and choose who you trust. Right now we're working on narrowing down the number of people trusted. The Python Infrastructure has decided it is willing to extend trust to Fastly to cover PyPI, the same as it was willing to extend trust to Dyn, and OSUOSL, and even the members of the Infra team.

Now, that being said, narrowing the list of people you need to trust is an ongoing goal, and one that isn't going to stop with limiting the number of places that are able to publish at varying python.org domain names but don't need to be. We're not in a particularly well-off position yet, but we are getting better all the time.

> --Noah

- Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
On Aug 5, 2013, at 11:09 PM, Christian Theune c...@gocept.com wrote:

> Hi,
>
> looks like I'm late to the party to figure out that I'm going to be hurt again.
>
> I'd like to suggest explicitly considering what is going to break due to this and how much work you are forcefully inflicting on others.
>
> My whole experience around packaging (distribute/setuptools) and mirroring/CDN this year puts the cost for my company somewhere between 10k and 20k EUR just for keeping up with the breakage those changes incur. It might be that we're wonderfully stupid (..enough to contribute) and all of this causes no headaches for anybody else ….
>
> Overall, guessing that the packaging infrastructure is used by probably multiple thousands of companies, I'd expect that at least 100 of them might be experiencing problems like us. Juggling arbitrary numbers, I can see that we're inflicting around a million EUR of cost that nobody asked for.
>
> More specific statements below.
>
> On 2013-08-04 22:25:01 +, Donald Stufft said:
>
>> Here's my PEP for Deprecating and Removing the Official Public Mirrors
>>
>> Its source is at: https://github.com/dstufft/peps/blob/master/mirror-removal.rst
>>
>> Abstract
>> ========
>>
>> This PEP provides a path to deprecate and ultimately remove the official public mirroring infrastructure for `PyPI`_. It does not propose the removal of mirroring support in general.
>
> -1 - maybe I don't have the right to speak up on CDN usage, but personally I feel it's a bad idea to delegate overall PyPI availability exclusively to a commercial third party. It's OK for me that we're using them to improve PyPI availability, but completely putting our faith in their hands doesn't sound right to me.
>
>> Rationale
>> =========
>>
>> The PyPI mirroring infrastructure (defined in `PEP381`_) provides a means to mirror the content of PyPI used by the automatic installers. It also provides a method for autodiscovery of mirrors and a consistent naming scheme.
>>
>> There are a number of problems with the official public mirrors:
>>
>> * They give control over a \*.python.org domain name to a third party, allowing that third party to set or read cookies on the pypi.python.org and python.org domain name.
>
> Agreed, that's a problem.
>
>> * The use of a sub domain of pypi.python.org means that the mirror operators will never be able to get a certificate of their own, and giving them one for a python.org domain name is unlikely to happen.
>
> Agreed.
>
>> * They are often out of date, most often by several hours to a few days, but regularly several days and even months.
>
> That's something that the mirroring infrastructure should have been constructed for. I completely agree that the way the mirroring was established was way sub-optimal. I think we can do better.
>
>> * With the introduction of the CDN on PyPI the public mirroring infrastructure is not as important as it once was as the CDN is also a globally distributed network of servers which will function even if PyPI is down.
>
> Well, now we have one breakage point more which keeps annoying me. This argument is not completely true. They may be getting better over time but we have invested heavily to accommodate the breakage - that needs to be balanced with some benefit in the near future.

To be clear, the CDN and other server-side improvements are not a hard-HA replacement like a local company mirror. You are exactly the use case that can and should be using a mirror for your own use. We are doing _nothing_ that disrupts this use case and will support it exactly as before.

>> * Although there are provisions in place for it, there is currently no known installer which uses the authenticity checks discussed in `PEP381`_ which means that any download from a mirror is subject to attack by a malicious mirror operator, but furthermore, due to the lack of TLS, it also means that any download from a mirror is also subject to a MITM attack.
>
> Again, I think that was a mistake during the introduction of the mirroring infrastructure: too few people, too confusing a PEP.
>
>> * They have only ever been implemented by one installer (pip), and its implementation, besides being insecure, has serious issues with performance and is slated for removal with its next release (1.5).
>
> Only if you consider the mirror auto-discovery protocol. I'm not sure whether using DNS was such a smart move. A simple HTTP request to find mirrors would have been nice. I think we can still do that.
>
> Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that.
>
> We are also thinking about providing system-level default configuration to hint tools like pip and setuptools to a different default index that is closer from a network perspective. From a customer perspective this should be PyPI.
>
> I'd like to avoid breakage. Again, if you
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 2:49 AM, Noah Kantrowitz n...@coderanger.net wrote:

> I am also hoping that pypi-mirrors.org will continue to operate as a community project (side note, I would be happy to assist with hosting for it if Ken reads this list and if that's a concern of his) and that the mirror operators can develop policies for things like this.

Additionally, if anyone else wants to maintain a list like this, I think it would be more than appropriate to link to it in addition to pypi-mirrors.org on the page about mirroring.

- Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
On Mon, Aug 05, 2013 at 23:31 -0700, Noah Kantrowitz wrote:

> On Aug 5, 2013, at 11:11 PM, Christian Theune c...@gocept.com wrote:
>
>> Two more things: why is the CDN not suffering from the security problems you describe for the mirrors?
>>
>> a) Fastly seems to be the one owning the certificate for pypi.python.org. What?!?
>
> They have a delegated SAN for it, which DigiCert (the CA) authorizes with the domain contact (the board in this case).
>
>> b) What does stop Fastly from introducing incorrect/rogue code in package downloads?
>
> Basically this one boils down to personal trust from me to the Fastly team, combined with the other companies using them being very reputable. At the end of the day, there is not currently any cryptographic mechanism preventing Fastly from doing bad things.

The problem is not so much trusting individuals but that the companies in question are based in the US. If its government wants to temporarily serve backdoored packages to select regions, they could silently force Fastly to do it. I guess the only way around this is to work with pypi- and eventually author/maintainer-signatures and verification.

best,
holger
Re: [Distutils] What to do about the PyPI mirrors
On Aug 5, 2013, at 11:56 PM, holger krekel hol...@merlinux.eu wrote:

> On Mon, Aug 05, 2013 at 23:31 -0700, Noah Kantrowitz wrote:
>
>> On Aug 5, 2013, at 11:11 PM, Christian Theune c...@gocept.com wrote:
>>
>>> Two more things: why is the CDN not suffering from the security problems you describe for the mirrors?
>>>
>>> a) Fastly seems to be the one owning the certificate for pypi.python.org. What?!?
>>
>> They have a delegated SAN for it, which DigiCert (the CA) authorizes with the domain contact (the board in this case).
>>
>>> b) What does stop Fastly from introducing incorrect/rogue code in package downloads?
>>
>> Basically this one boils down to personal trust from me to the Fastly team, combined with the other companies using them being very reputable. At the end of the day, there is not currently any cryptographic mechanism preventing Fastly from doing bad things.
>
> The problem is not so much trusting individuals but that the companies in question are based in the US. If its government wants to temporarily serve backdoored packages to select regions, they could silently force Fastly to do it. I guess the only way around this is to work with pypi- and eventually author/maintainer-signatures and verification.

No, I have carefully selected whom I trust to work with on the PSF infrastructure. I can promise you there is a 100% chance that the head of Fastly would sooner shut down the company than allow a government interdiction of any kind. I extend this trust to Dyn and OSL as well, and I do not do so lightly.

--Noah
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 2:56 AM, holger krekel hol...@merlinux.eu wrote:

> On Mon, Aug 05, 2013 at 23:31 -0700, Noah Kantrowitz wrote:
>
>> On Aug 5, 2013, at 11:11 PM, Christian Theune c...@gocept.com wrote:
>>
>>> Two more things: why is the CDN not suffering from the security problems you describe for the mirrors?
>>>
>>> a) Fastly seems to be the one owning the certificate for pypi.python.org. What?!?
>>
>> They have a delegated SAN for it, which DigiCert (the CA) authorizes with the domain contact (the board in this case).
>>
>>> b) What does stop Fastly from introducing incorrect/rogue code in package downloads?
>>
>> Basically this one boils down to personal trust from me to the Fastly team, combined with the other companies using them being very reputable. At the end of the day, there is not currently any cryptographic mechanism preventing Fastly from doing bad things.
>
> The problem is not so much trusting individuals but that the companies in question are based in the US. If its government wants to temporarily serve backdoored packages to select regions, they could silently force Fastly to do it. I guess the only way around this is to work with pypi- and eventually author/maintainer-signatures and verification.

PyPI is hosted in the US. Anything the government could do to Fastly it could do to OSUOSL, where PyPI is hosted. The solution to that is signature validation, but I think it's premature to worry too much about that when there is lower-hanging fruit that doesn't require the US government deciding to backdoor packages.

> best,
> holger

- Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
On 6 August 2013 16:09, Christian Theune c...@gocept.com wrote:

> Hi,
>
> looks like I'm late to the party to figure out that I'm going to be hurt again.

That's why I asked for this to be put through the PEP process: to give it more visibility, and provide more opportunity for people potentially affected to have a chance to comment and offer alternatives.

Giving third parties the opportunity to read python.org cookies indefinitely isn't an option. Everything else is negotiable.

> I'd like to suggest explicitly considering what is going to break due to this and how much work you are forcefully inflicting on others.
>
> My whole experience around packaging (distribute/setuptools) and mirroring/CDN this year puts the cost for my company somewhere between 10k and 20k EUR just for keeping up with the breakage those changes incur. It might be that we're wonderfully stupid (..enough to contribute) and all of this causes no headaches for anybody else ….
>
> Overall, guessing that the packaging infrastructure is used by probably multiple thousands of companies, I'd expect that at least 100 of them might be experiencing problems like us. Juggling arbitrary numbers, I can see that we're inflicting around a million EUR of cost that nobody asked for.
>
> More specific statements below.
>
> On 2013-08-04 22:25:01 +, Donald Stufft said:
>
>> Here's my PEP for Deprecating and Removing the Official Public Mirrors
>>
>> Its source is at: https://github.com/dstufft/peps/blob/master/mirror-removal.rst
>>
>> Abstract
>> ========
>>
>> This PEP provides a path to deprecate and ultimately remove the official public mirroring infrastructure for `PyPI`_. It does not propose the removal of mirroring support in general.
>
> -1 - maybe I don't have the right to speak up on CDN usage, but personally I feel it's a bad idea to delegate overall PyPI availability exclusively to a commercial third party. It's OK for me that we're using them to improve PyPI availability, but completely putting our faith in their hands doesn't sound right to me.

Would you be happier if it said the current incarnation of the public mirroring infrastructure? I have no objections to somebody proposing a *new* less broken mirroring process.

> That's something that the mirroring infrastructure should have been constructed for. I completely agree that the way the mirroring was established was way sub-optimal. I think we can do better.

As noted above, this PEP is about killing off the *current* public mirroring system as being irredeemably broken. If that inspires somebody to come up with a more sensible alternative, so much the better.

>> * With the introduction of the CDN on PyPI the public mirroring infrastructure is not as important as it once was as the CDN is also a globally distributed network of servers which will function even if PyPI is down.
>
> Well, now we have one breakage point more which keeps annoying me. This argument is not completely true. They may be getting better over time but we have invested heavily to accommodate the breakage - that needs to be balanced with some benefit in the near future.

That's why explicit mirror usage is still supported and recommended.

>> * Although there are provisions in place for it, there is currently no known installer which uses the authenticity checks discussed in `PEP381`_ which means that any download from a mirror is subject to attack by a malicious mirror operator, but furthermore, due to the lack of TLS, it also means that any download from a mirror is also subject to a MITM attack.
>
> Again, I think that was a mistake during the introduction of the mirroring infrastructure: too few people, too confusing a PEP.

Which is why *this* incarnation of it needs to go away.

>> * They have only ever been implemented by one installer (pip), and its implementation, besides being insecure, has serious issues with performance and is slated for removal with its next release (1.5).
>
> Only if you consider the mirror auto-discovery protocol. I'm not sure whether using DNS was such a smart move. A simple HTTP request to find mirrors would have been nice. I think we can still do that.

And can be done regardless of what happens to the current system.

> Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that.

Which will be unaffected for anyone not relying on a pypi.python.org subdomain.

> We are also thinking about providing system-level default configuration to hint tools like pip and setuptools to a different default index that is closer from a network perspective. From a customer perspective this should be PyPI.
>
> I'd like to avoid breakage. Again, if you don't let me choose where to spend my time, I'd rather invest the time I need for cleaning up the breakage into something constructive.
>
> The indices are in active use. f.pypi.python.org is seeing between
Re: [Distutils] What to do about the PyPI mirrors
> -1 - maybe I don't have the right to speak up on CDN usage, but personally I feel it's a bad idea to delegate overall PyPI availability exclusively to a commercial third party.

Well, it's been done, and it was always a better idea than the way mirrors were implemented.

> It's OK for me that we're using them to improve PyPI availability, but completely putting our faith in their hands doesn't sound right to me.

We must put our faith in somebody's hands with regards to PyPI. That hasn't changed.

> That's something that the mirroring infrastructure should have been constructed for. I completely agree that the way the mirroring was established was way sub-optimal. I think we can do better.

Only by building our own CDN. We won't do better than the ones that exist.

> Well, now we have one breakage point more which keeps annoying me.

We do? How?

> Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that.

I agree that this is probably the best choice, and you can still do that.

> I'd like to avoid breakage. Again, if you don't let me choose where to spend my time, I'd rather invest the time I need for cleaning up the breakage into something constructive.

The only breakage I can see in this proposal is that the [a-z] DNS names go away. That would take four months. I think perhaps that's a bit short. I don't see why we can't keep them around for much longer.

A way to find mirrors is needed, though perhaps not an automatic one, for when PyPI goes down.

//Lennart
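The "hand-pick a mirror and keep using that" approach discussed above does not depend on the `[a-z].pypi.python.org` names at all. A minimal sketch of what that could look like for pip, assuming a pip configuration file with an `index-url` entry in its `[global]` section; the mirror URL and the helper name `render_pip_config` are made up for illustration:

```python
import configparser
import io


def render_pip_config(index_url: str) -> str:
    """Render pip configuration text that pins one explicit index."""
    parser = configparser.ConfigParser()
    # pip reads `index-url` from the [global] section of its config file.
    parser["global"] = {"index-url": index_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical mirror URL, used only for illustration.
config_text = render_pip_config("https://pypi.mirror.example/simple/")
print(config_text)
```

The resulting text would go into the per-user or system-wide pip configuration file; a system-level default like the one Christian mentions could be provisioned the same way.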
Re: [Distutils] What to do about the PyPI mirrors
Quoting holger krekel hol...@merlinux.eu:

> The problem is not so much trusting individuals but that the companies in question are based in the US. If its government wants to temporarily serve backdoored packages to select regions, they could silently force Fastly to do it. I guess the only way around this is to work with pypi- and eventually author/maintainer-signatures and verification.

Both are actually in place, just not widely used. Each simple page gets a PyPI signature, in /serversig, which makes it possible to validate that a mirror or the CDN has the copy that is also on the master.

For author signatures, PGP has been available for quite some time. As with any author signature, you then need to convince yourself that the key actually belongs to the author.

Regards,
Martin
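Checking the /serversig signature itself needs a public-key signature library, so it is not shown here. As a deliberately weaker stand-in, the sketch below only compares a downloaded file against the digest fragment (`#<algorithm>=<hexdigest>`) that PyPI simple-page links carry; it is useful only if the link comes from a copy of the page you already trust. The function name and the example URL are assumptions for illustration:

```python
import hashlib
from urllib.parse import urlsplit


def matches_link_digest(link: str, payload: bytes) -> bool:
    """Compare payload against the '#<algorithm>=<hexdigest>' fragment of link."""
    fragment = urlsplit(link).fragment
    algorithm, _, expected = fragment.partition("=")
    if not expected or algorithm not in hashlib.algorithms_available:
        raise ValueError("link carries no usable digest fragment")
    actual = hashlib.new(algorithm, payload).hexdigest()
    return actual == expected.lower()


# Illustration with made-up content; the digest is computed, not hard-coded.
payload = b"pretend this is a source distribution"
link = (
    "https://pypi.mirror.example/packages/demo-1.0.tar.gz#sha256="
    + hashlib.sha256(payload).hexdigest()
)
print(matches_link_digest(link, payload))
print(matches_link_digest(link, payload + b"tampered"))
```

This catches a mirror or cache serving different bytes than the trusted page advertises, but unlike a signature it says nothing about who produced that page.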
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 2:36 AM, Lennart Regebro rege...@gmail.com wrote:

> The only breakage I can see in this proposal is that the [a-z] dns names go away. That would take four months. I think perhaps that's a bit short. I don't see why we can't keep them around for much longer.

I think we're all willing to increase the time :) The current timeframe was somewhat arbitrarily suggested by Noah, and I just rolled with it for a first draft of the PEP, figuring that if it was too short someone would (hopefully!) speak up.

The other breakage is people relying on --use-mirrors in pip, but that is a no-op in the upcoming pip 1.5, and the side effect of removing these names is that older versions of pip simply won't get mirroring support (its mirroring support still hits PyPI itself).

> A way to find mirrors is needed, but perhaps not automatic, but for when pypi goes down.

Thank you for calling this out. I forgot to include in the PEP that moving the mirror listing off of PyPI itself means that we increase the chances that the listing will be available if PyPI itself happens to be down.

- Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
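For readers unfamiliar with the names being retired: as I understand PEP 381, mirrors are named `a.pypi.python.org`, `b.pypi.python.org`, and so on, and a client learns how far the sequence runs from the name that `last.pypi.python.org` points at. A small sketch of that enumeration, restricted to single-letter names and doing no DNS lookup itself; `mirror_hostnames` is a hypothetical helper, not pip's code:

```python
import string
from typing import List


def mirror_hostnames(last_mirror: str, base: str = "pypi.python.org") -> List[str]:
    """List a.<base> up to the given last mirror (single-letter names only)."""
    label = last_mirror.split(".", 1)[0].lower()
    if len(label) != 1 or label not in string.ascii_lowercase:
        raise ValueError("only single-letter mirror names are handled here")
    end = string.ascii_lowercase.index(label)
    return [f"{letter}.{base}" for letter in string.ascii_lowercase[: end + 1]]


# Assuming the last mirror resolved to the f mirror mentioned in this thread.
for host in mirror_hostnames("f.pypi.python.org"):
    print(host)
```

Once the subdomains are removed, a client relying on this scheme finds no mirrors and falls back to PyPI itself, which matches the breakage described above.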
Re: [Distutils] What to do about the PyPI mirrors
On Mon, Aug 05, 2013 at 23:49 -0700, Noah Kantrowitz wrote:

> On Aug 5, 2013, at 11:09 PM, Christian Theune c...@gocept.com wrote:
>
> (...)
>
> Between now and the first DNS change, I would absolutely recommend any current public mirrors to redirect users to their new domain name if they intend to have one, and we'll do whatever we can to help make users aware of the switch. I would rather have a clear timeline with fewer steps than add another stage where we (PSF) are issuing redirects to non-PSF servers. Very very +1 on the easier bandersnatch-ing though, I really would love to see more mirrors out there, I just don't want them associated with PyPI or python.org, and I don't want pip to be trying to auto-discover them.

PyPI mirrors _are_ associated with PyPI and pypi.python.org. (Why) do you want to flatly rule out pip/pypi.python.org support for managing mirrors? The Perl CPAN mirroring provides this nice little machine-readable file:

http://www.cpan.org/indices/mirrors.json

and a Python equivalent could be consumed by pip, I guess.

best,
holger
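To make the idea of a machine-readable mirror list concrete, here is a rough sketch of how an installer might consume one. The JSON layout (`url`, `country`, `age_hours`) is invented for this example and is not CPAN's actual format, and the mirror URLs are placeholders:

```python
import json
from typing import List

# Hypothetical mirror list; the schema is an assumption for illustration.
SAMPLE_MIRRORS = """
[
  {"url": "https://mirror-a.example/simple/", "country": "DE", "age_hours": 1.5},
  {"url": "https://mirror-b.example/simple/", "country": "US", "age_hours": 80.0},
  {"url": "https://mirror-c.example/simple/", "country": "AU", "age_hours": 0.5}
]
"""


def pick_mirrors(document: str, max_age_hours: float) -> List[str]:
    """Return URLs of mirrors synced recently enough, freshest first."""
    mirrors = json.loads(document)
    fresh = [m for m in mirrors if m["age_hours"] <= max_age_hours]
    return [m["url"] for m in sorted(fresh, key=lambda m: m["age_hours"])]


for url in pick_mirrors(SAMPLE_MIRRORS, max_age_hours=24):
    print(url)
```

Publishing the last sync time alongside each mirror would also let clients skip the stale mirrors that the PEP lists as one of the problems with the current setup.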
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 12:01 AM, Nick Coghlan ncogh...@gmail.com wrote: On 6 August 2013 16:09, Christian Theune c...@gocept.com wrote: Hi, looks like I'm late to the party to figure out that I'm going to be hurt again. That's why I asked for this to be put through the PEP process: to give it more visibility, and provide more opportunity for people potentially affected to have a chance to comment and offer alternatives. Giving third parties the opportunity to read python.org cookies indefinitely isn't an option. Everything else is negotiable. I'd like to suggest explicitly considering what is going to break due to this and how much work you are forcefully inflicting on others. My whole experience around the packaging (distribute/setuptools) and mirroring/CDN in this year estimates cost for my company somewhere between 10k-20k EUR just for keeping up with the breakage those changes incure. It might be that we're wonderfully stupid (..enough to contribute) and all of this causes no headaches for anybody else …. Overall, guessing that the packaging infrastructure is used by probably multiple thousands of companies then I'd expect that at least 100 of them might be experiencing problems like us. Juggling arbritrary numbers I can see that we're inflicting around a million EURs of cost that nobody asked for. More specific statements below. On 2013-08-04 22:25:01 +, Donald Stufft said: Here's my PEP for Deprecating and Removing the Official Public Mirrors It's source is at: https://github.com/dstufft/peps/blob/master/mirror-removal.rst Abstract === This PEP provides a path to deprecate and ultimately remove the official public mirroring infrastructure for `PyPI`_. It does not propose the removal of mirroring support in general. -1 - maybe I don't have the right to speak up on CDN usage, but personally I feel it's a bad idea to delegate overall PyPI availability exclusively to a commercial third party. 
It's OK for me that we're using them to improve PyPI availability, but completely putting our faith in their hands, doesn't sound right to me. Would you be happier if it said the current incarnation of the public mirroring infrastructure? I have no objections to somebody proposing a *new* less broken mirroring process. That's something that the mirroring infrastructure should have been constructed for. I completely agree that the way the mirroring was established was way sub-optimal. I think we can do better. As noted above, this PEP is about killing off the *current* public mirroring system as being irredeemably broken. If that inspires somebody to come up with a more sensible alternative, so much the better. * With the introduction of the CDN on PyPI the public mirroring infrastructure is not as important as it once was as the CDN is also a globally distributed network of servers which will function even if PyPI is down. Well, now we have one breakage point more which keeps annoying me. This argument is not completely true. They may be getting better over time but we have invested heavily to accomodate the breakage - that needs to be balanced with some benefit in the near future. That's why explicit mirror usage is still supported and recommended. * Although there is provisions in place for it, there is currently no known installer which uses the authenticity checks discussed in `PEP381`_ which means that any download from a mirror is subject to attack by a malicious mirror operator, but further more due to the lack of TLS it also means that any download from a mirror is also subject to a MITM attack. Again, I think that was a mistake during the introduction of the mirroring infrastructure: too few people, too confusing PEP. Which is why *this* incarnation of it needs to go away. 
* They have only ever been implemented by one installer (pip), and its implementation, besides being insecure, has serious issues with performance and is slated for removal with its next release (1.5). Only if you consider the mirror auto-discovery protocol. I'm not sure whether using DNS was such a smart move. A simple HTTP request to find mirrors would have been nice. I think we can still do that. And can be done regardless of what happens to the current system. Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that. Which will be unaffected for anyone not relying on a pypi.python.org subdomain. We are also thinking about providing system-level default configuration to hint tools like pip and setuptools to a different default index that is closer from a network perspective. From a customer perspective this should be PyPI. I'd like to avoid breakage. Again, if you don't let me choose where to spend my time, I'd rather invest the time I need
Re: [Distutils] What to do about the PyPI mirrors
On Tue, Aug 06, 2013 at 08:36 +0200, Lennart Regebro wrote: Well, now we have one breakage point more which keeps annoying me. We do? How? Christian, Donald and I invested considerable debugging time, repeatedly, to accommodate Fastly/CDN issues. It required multiple rounds of changes to bandersnatch, devpi and the pypi.python.org source code. Apart from that there have been intermittent install/cache-inconsistency failures. Due to the fast response times of the people involved most of these issues didn't last for too long but the CDN did introduce a new breakage point. holger ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
Quoting Nick Coghlan ncogh...@gmail.com: On 6 August 2013 16:09, Christian Theune c...@gocept.com wrote: Hi, looks like I'm late to the party to figure out that I'm going to be hurt again. That's why I asked for this to be put through the PEP process: to give it more visibility, and provide more opportunity for people potentially affected to have a chance to comment and offer alternatives. Giving third parties the opportunity to read python.org cookies indefinitely isn't an option. Define third party. There are a number of organisations other than the PSF that can read python.org cookies. As Noah explains, it's a matter of trust. Noah chooses to trust Fastly, I choose to trust Christian Theune. We both have then imposed our trust on the community. In any case, I consider the cookie issue a red herring. Mirror operators could only steal cookies if users actually pointed their web browsers to the mirrors. They typically don't, since they use setuptools or pip, which doesn't even have access to the cookies. And, if a mirror operator actually does request cookies, there is a high risk in being caught in doing so. If that happens, the mirror operator will not only lose the mirror, but also lose community trust. Regards, Martin ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 3:03 AM, mar...@v.loewis.de wrote: Quoting holger krekel hol...@merlinux.eu: The problem is not so much trusting individuals but that the companies in question are based in the US. If its government wants to temporarily serve backdoored packages to select regions, they could silently force Fastly to do it. I guess the only way around this is to work with pypi- and eventually author/maintainer-signatures and verification. Both are actually in place, just not widely used. Each simple page gets a pypi signature, in /serversig, which would allow validating that a mirror or the CDN has the copy that is also on the master. Unless I'm forgetting something there's no real way to get the server key without going through Fastly, and even if there was, Fastly could just hijack an upload (and murder their entire business in the process). Regards, Martin - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
On 6 August 2013 17:13, Noah Kantrowitz n...@coderanger.net wrote: Also, CPAN, like Linux distro trees, can be mirrored with rsync rather than needing a custom client. It's much easier to maintain backwards compatibility when the only required server API is the ability to serve static files. I will fight any attempt to do this with every fiber of my being. This kind of dumb server API means that any metadata indexing or searching either needs to be precomputed or implemented in a much more intelligent client. This is already somewhat the case with pip, and as someone that has to deal with multiple client implementations it makes me very sad that I can't just call a REST endpoint to know what will be installed when I do a thing. This is neither here nor there, but I wanted to stake out my grounds so I can growl when people get too close :) I agree having a smart server is good, I just think exposing a dumb, easy to mirror, signed data store is good, too :) Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 12:10 AM, holger krekel hol...@merlinux.eu wrote: On Mon, Aug 05, 2013 at 23:49 -0700, Noah Kantrowitz wrote: On Aug 5, 2013, at 11:09 PM, Christian Theune c...@gocept.com wrote: (...) Between now and the first DNS change, I would absolutely recommend any current public mirrors to redirect users to their new domain name if they intend to have one, and we'll do whatever we can to help make users aware of the switch. I would rather have a clear timeline with fewer steps than add another stage where we (PSF) are issuing redirects to non-PSF servers. Very very +1 on the easier bandersnatch-ing though, I really would love to see more mirrors out there, I just don't want them associated with PyPI or python.org, and I don't want pip to be trying to auto-discover them. PyPI mirrors _are_ associated with PyPI and pypi.python.org. (Why) Do do want to flatly rule out pip/pypi.python.org support for managing mirrors? The perl CPAN mirroring provides this nice little machine-readable file: http://www.cpan.org/indices/mirrors.json and a python-equivalent could be consumed by pip, i guess. Because at this time there is no Python package installer that can install from a public mirror in a way that makes me comfortable supporting it as an official resource. This could be addressed in pip by verifying the /simple signatures, but this mostly precludes improved mirroring mechanisms like that used by Crate. More to the point, I as the head of infrastructure am responsible for *.python.org, but if there is an issue with a mirror, be it downtime, server compromise, or anything else, me and my team can't do anything to fix that. This is, again, not a situation I am comfortable with. --Noah signature.asc Description: Message signed with OpenPGP using GPGMail ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 3:15 AM, mar...@v.loewis.de wrote: Quoting Nick Coghlan ncogh...@gmail.com: On 6 August 2013 16:09, Christian Theune c...@gocept.com wrote: Hi, looks like I'm late to the party to figure out that I'm going to be hurt again. That's why I asked for this to be put through the PEP process: to give it more visibility, and provide more opportunity for people potentially affected to have a chance to comment and offer alternatives. Giving third parties the opportunity to read python.org cookies indefinitely isn't an option. Define third party. There are a number of organisations other than the PSF that can read python.org cookies. As Noah explains, it's a matter of trust. Noah chooses to trust Fastly, I choose to trust Christian Theune. We both have then imposed our trust on the community. Sure, but there's also the matter of the *number* of people trusted: each new person to trust is another potential pain point. There's really no requirement to have the mirrors hosted on N.pypi.python.org. The fact they do is a legacy issue that can be corrected with a much better story for reliability and security. In any case, I consider the cookie issue a red herring. Mirror operators could only steal cookies if users actually pointed their web browsers to the mirrors. They typically don't, since they use setuptools or pip, which doesn't even have access to the cookies. And, if a mirror operator actually does request cookies, there is a high risk in being caught in doing so. If that happens, the mirror operator will not only lose the mirror, but also lose community trust. The cookie issue is very serious because it does not require someone to knowingly point their browser at N.pypi.python.org. A mirror operator could simply inline an image tag in a package, someone views the package page, and automatically makes a request to N.pypi.python.org which is sent the cookie and a script on N.pypi.python.org can read it.
As for the claim that there is a high risk of being caught: there isn't really. It would be very easy to do this near silently. Regards, Martin - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
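The cookie concern in the message above rests on how browsers decide which hosts receive a cookie. A minimal sketch of the domain-match rule from RFC 6265 (section 5.1.3) follows. It assumes the cookie carries a Domain attribute covering pypi.python.org or python.org; the thread does not state how the cookies were actually scoped, and the function name is chosen here for illustration only.

```python
import ipaddress


def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Return True if a cookie scoped to cookie_domain would be sent to request_host.

    Sketch of the RFC 6265 domain-match rule; assumes the cookie was set
    with an explicit Domain attribute (host-only cookies are not covered).
    """
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    if host == domain:
        return True
    try:
        # IP addresses only ever match exactly, never as a suffix.
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass
    # The host must end with the domain, preceded by a dot.
    return host.endswith("." + domain)


# A cookie scoped to pypi.python.org also reaches a delegated mirror host.
sent_to_mirror = domain_match("f.pypi.python.org", "pypi.python.org")
```

Under that assumption, any request a browser makes to a delegated N.pypi.python.org host, including one triggered by an embedded image, would carry the cookie along.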
Re: [Distutils] What to do about the PyPI mirrors
On Tue, Aug 06, 2013 at 17:19 +1000, Nick Coghlan wrote: On 6 August 2013 17:13, Noah Kantrowitz n...@coderanger.net wrote: Also, CPAN, like Linux distro trees, can be mirrored with rsync rather than needing a custom client. It's much easier to maintain backwards compatibility when the only required server API is the ability to serve static files. I will fight any attempt to do this with every fiber of my being. This kind of dumb server API means that any metadata indexing or searching either needs to be precomputed or implemented in a much more intelligent client. This is already somewhat the case with pip, and as someone that has to deal with multiple client implementations it makes me very sad that I can't just call a REST endpoint to know what will be installed when I do a thing. This is neither here nor there, but I wanted to stake out my grounds so I can growl when people get too close :) I agree having a smart server is good, I just think exposing a dumb, easy to mirror, signed data store is good, too :) FWIW I think CPAN is structured such that search sites operate on mirrored data. The master server thus can remain dumb. Sounds like a good recipe to me. It's a bit sad but i think even now we are struggling to meet CPAN's architecture and ease-of-use, let alone improve on it. holger ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
Well, now we have one breakage point more which keeps annoying me. We do? How? Installations that mention a specific mirror in their configuration file (such as f.pypi.python.org) will break when the DNS name is removed. Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that. I agree that this is probably the best choice, and you can still do that. See above. He did that, and the PyPI maintainers will break it. Regards, Martin ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 3:24 AM, holger krekel hol...@merlinux.eu wrote: FWIW I think CPAN is structured such that search sites operate on mirrored data. The master server thus can remain dumb. Sounds like a good recipe to me. It's a bit sad but i think even now we are struggling to meet CPAN's architecture and ease-of-use, let alone improve on it. Changes like this aren't off the table in the future. Right now a lot of the work is being done to bring some level of sanity to what is already there and then make it possible to iterate and hash out what we want a python package index to look like. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
Quoting Donald Stufft don...@stufft.io: Unless I'm forgetting something there's no real way to get the server key without going through Fastly You should have a copy of the server key upfront, on your disk. You can still get it directly from pypi with an HTTP request to pypi.int.python.org/serverkey. and even if there was Fastly could just hijack an upload (and murder their entire business in the process). Couldn't you also use pypi.int.python.org for uploading? Regards, Martin
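The suggestion above, keeping a copy of the server key on disk ahead of time, amounts to pinning. A minimal sketch of that idea: store a fingerprint of the trusted key locally and refuse any later-fetched key that does not match it. The use of SHA-256, the function names, and the placeholder key bytes are assumptions made for illustration; none of them come from PEP 381 or from the thread.

```python
import hashlib
import hmac


def fingerprint(key_bytes: bytes) -> str:
    """Hex SHA-256 digest of the raw key material (illustrative choice of hash)."""
    return hashlib.sha256(key_bytes).hexdigest()


def key_matches_pin(fetched_key: bytes, pinned_fingerprint: str) -> bool:
    """Compare a freshly fetched key against the fingerprint stored on disk."""
    return hmac.compare_digest(fingerprint(fetched_key), pinned_fingerprint.lower())


# Placeholder bytes standing in for a real server key obtained out of band.
trusted_key = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"
pin = fingerprint(trusted_key)

# A key swapped in transit no longer matches the pin.
tampered_key = trusted_key.replace(b"placeholder", b"attacker")
```

Pinning only helps if the first copy of the key arrived over a channel the user trusts, which is exactly the point under dispute in this exchange.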
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 3:20 AM, mar...@v.loewis.de wrote: Well, now we have one breakage point more which keeps annoying me. We do? How? Installations that mention a specific mirror in their configuration file (such as f.pypi.python.org) will break when the DNS name is removed. I don't see how this is relevant to his statement? He was talking about the CDN and I was asking him to clarify. Also, not everyone wants or needs auto-detection the way that the protocol describes it. I personally just hand-pick a mirror (my own, hah) and keep using that. I agree that this is probably the best choice, and you can still do that. See above. He did that, and the PyPI maintainers will break it. I don't think anyone's claimed that removing the names won't break things for people who directly referenced them, but it's an important step that we do that. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
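For anyone wanting to find out whether they are among the installations that hard-code a mirror name, a rough sketch of a configuration scan follows. It assumes an INI-style pip configuration with `index-url` and `extra-index-url` options; the sample text, the function name and the hostname pattern are illustrative, and real setups may keep such URLs in other files (buildout configs, requirements files, shell scripts) that this does not cover.

```python
import configparser
import re
from urllib.parse import urlsplit

# Hostnames such as f.pypi.python.org or last.pypi.python.org.
MIRROR_HOST = re.compile(r"^[a-z0-9-]+\.pypi\.python\.org$")
URL_OPTIONS = ("index-url", "extra-index-url")


def find_mirror_urls(config_text: str) -> list[tuple[str, str, str]]:
    """Return (section, option, url) for every URL pointing at a mirror subdomain."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    hits = []
    for section in parser.sections():
        for option in URL_OPTIONS:
            if not parser.has_option(section, option):
                continue
            # Values may list several URLs separated by whitespace or newlines.
            for url in parser.get(section, option).split():
                host = urlsplit(url).hostname or ""
                if MIRROR_HOST.match(host):
                    hits.append((section, option, url))
    return hits


# Hypothetical configuration used only to exercise the sketch.
SAMPLE = """
[global]
index-url = http://f.pypi.python.org/simple/
[install]
extra-index-url =
    https://pypi.python.org/simple/
    http://last.pypi.python.org/simple/
"""

hits = find_mirror_urls(SAMPLE)
```

Each hit is a setting that would stop working once the corresponding DNS name is removed.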
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 3:29 AM, mar...@v.loewis.de wrote: Quoting Donald Stufft don...@stufft.io: Unless I'm forgetting something there's no real way to get the server key without going through Fastly You should have a copy of the server key upfront, on your disk. You can still get it directly from pypi with an HTTP request to pypi.int.python.org/serverkey. and even if there was Fastly could just hijack an upload (and murder their entire business in the process). Couldn't you also use pypi.int.python.org for uploading? Regards, Martin pypi.int.python.org is not a public name and has no promise of existing tomorrow. Even if it were, it's HTTP only, so now you have an attacker who can substitute his own key for the server key and his own serversig for packages downloaded over HTTP from a mirror. The same thing applies to uploading, so you remove the possibility of Fastly attacking you and open up the much wider chance that a MITM would attack you. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
On 6 August 2013 17:30, Donald Stufft don...@stufft.io wrote: On Aug 6, 2013, at 3:20 AM, mar...@v.loewis.de wrote: See above. He did that, and the PyPI maintainers will break it. I don't think anyones claimed that removing the names won't break things for people who directly referenced them, but it's an important step that we do that.

Right, but I think it's one where we can offer responsive mirror maintainers a generous time frame. We're down to only 5 mirrors using the *.pypi.python.org naming scheme anyway, so we should probably include contacting the maintainers directly in the transition plan. That makes the process:

- we immediately stop handing out any new *.pypi.python.org mirror names (this has effectively happened already, the PEP will just be making it official)
- the operators of the 5 current *.pypi.python.org mirrors are contacted directly, informing them of the plan to deprecate and remove those domain names, and offering the choice of two alternatives:
  1. After 2 months (or earlier if requested), the domain name is redirected to the PyPI CDN and the mirror is effectively retired. 2 months after the release of pip 1.5, the name is removed entirely
  2. The mirror operator establishes a 301 redirect to a HTTPS capable domain name they control and negotiates the time frame for retirement and removal of the *.pypi.python.org domain record with the PSF infrastructure team
- after two months, last.pypi.python.org and any *.pypi.python.org mirror names which didn't request option 2 above are redirected to the CDN
- two months after the release of pip 1.5, last.pypi.python.org and any *.pypi.python.org mirror names which didn't request option 2 above are removed from the DNS
- the exact time frames for option 2 above will be worked out individually with the mirror operators that request it (that would be at least Christian for f.pypi.python.org, and perhaps some of the other mirror operators if they also choose option 2)

Cheers, Nick.
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On Tue, Aug 6, 2013 at 9:10 AM, holger krekel hol...@merlinux.eu wrote: PyPI mirrors _are_ associated with PyPI and pypi.python.org. (Why) Do do want to flatly rule out pip/pypi.python.org support for managing mirrors? Automatic mirror discovery opens extra security holes until we have found some way to tighten up the security in general. Once we have a way of verifying packages that work and that doesn't rely on the mirror you are using, we could add it back. Indeed, just having a json list makes sense. //Lennart ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
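The idea of a plain JSON mirror list, raised here and in holger's earlier pointer to CPAN's mirrors.json, could be consumed by a client roughly as sketched below. The schema (a list of objects with a `url` field) and the hostnames are invented for illustration; neither CPAN's actual file format nor any PyPI format is implied. The sketch keeps only HTTPS mirrors, reflecting the security concern above.

```python
import json
from urllib.parse import urlsplit


def https_mirrors(mirror_list_json: str) -> list[str]:
    """Return the mirror URLs from a (hypothetical) JSON list that use HTTPS."""
    entries = json.loads(mirror_list_json)
    urls = []
    for entry in entries:
        url = entry.get("url", "")
        parts = urlsplit(url)
        # Skip plain-HTTP mirrors and malformed entries without a hostname.
        if parts.scheme == "https" and parts.hostname:
            urls.append(url)
    return urls


# Hypothetical list contents, used only to exercise the sketch.
SAMPLE_LIST = json.dumps([
    {"url": "https://pypi.example.org/simple/", "operator": "example"},
    {"url": "http://mirror.example.net/simple/", "operator": "example"},
    {"operator": "entry without a url"},
])

usable = https_mirrors(SAMPLE_LIST)
```

Filtering on the scheme protects the transport only; the list itself would still need to be fetched from a trusted location, and package verification remains a separate problem.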
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 3:47 AM, Nick Coghlan ncogh...@gmail.com wrote: On 6 August 2013 17:30, Donald Stufft don...@stufft.io wrote: On Aug 6, 2013, at 3:20 AM, mar...@v.loewis.de wrote: See above. He did that, and the PyPI maintainers will break it. I don't think anyones claimed that removing the names won't break things for people who directly referenced them, but it's an important step that we do that. Right, but I think it's one where we can offer responsive mirror maintainers a generous time frame. We're down to only 5 mirrors using the *.pypi.python.org naming scheme anyway, so we should probably include contacting the maintainers directly in the transition plan. That makes the process: - we immediately stop handing out any new *.pypi.python.org mirror names (this has effectively happened already, the PEP will just be making it official) - the operators of the 5 current *.pypi.python.org mirrors are contacted directly, informing them of the plan to deprecate and remove those domain names, and offering the choice of two alternatives: Minor point but it's 4 mirrors. The a mirror is simply an alias for PyPI itself which leaves, c, e, f, g. 1. After 2 months (or earlier if requested), the domain name is redirected to the PyPI CDN and the mirror is effectively retired. 2 months after the release of pip 1.5, the name is removed entirely 2. 
The mirror operator establishes a 301 redirect to a HTTPS capable domain name they control and negotiates the time frame for retirement and removal of the *.pypi.python.org domain record with the PSF infrastructure team - after two months, last.pypi.python.org and any *.pypi.python.org mirror names which didn't request option 2 above are redirected to the CDN - two months after the release of pip 1.5, last.pypi.python.org and any *.pypi.python.org mirror names which didn't request option 2 above are removed from the DNS - the exact time frames for option 2 above will be worked out individually with the mirror operators that request it (that would be at least Christian for f.pypi.python.org, and perhaps some of the other mirror operators if they also choose option 2) It's probably simpler to just lengthen the timeframe and allow early opt in to having the N.pypi.python.org redirected back to PyPI (Minor point, it doesn't actually go directly through the CDN because the CDN is configured to require SSL). I would much rather have the details laid out in the PEP than have the Infra team being placed in the line of fire. I think it would even be reasonable to not have a forced redirect to the CDN and instead say in N amount of time the DNS entries will be removed, and allow mirror operators to ask us to redirect their N.pypi.python.org back to the CDN if they've felt their migration is complete before N amount of time happens. The big question then becomes what is a reasonable value for N amount of time, the original proposal essentially used 4 months for no real reason. Would 6 months be better? 8? I think making this window _too_ long doesn't really do anything except delay the inevitable and the window should be decided on for what's a reasonable amount of time for people to move away from pointing directly at the N.pypi.python.org not delaying the need to do it until a later date. Cheers, Nick. 
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA signature.asc Description: Message signed with OpenPGP using GPGMail ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On 6 August 2013 17:59, Donald Stufft don...@stufft.io wrote: It's probably simpler to just lengthen the timeframe and allow early opt in to having the N.pypi.python.org redirected back to PyPI (Minor point, it doesn't actually go directly through the CDN because the CDN is configured to require SSL). I would much rather have the details laid out in the PEP than have the Infra team being placed in the line of fire. I think it would even be reasonable to not have a forced redirect to the CDN and instead say in N amount of time the DNS entries will be removed, and allow mirror operators to ask us to redirect their N.pypi.python.org back to the CDN if they've felt their migration is complete before N amount of time happens. Sounds good to me. The big question then becomes what is a reasonable value for N amount of time, the original proposal essentially used 4 months for no real reason. Would 6 months be better? 8? I think making this window _too_ long doesn't really do anything except delay the inevitable and the window should be decided on for what's a reasonable amount of time for people to move away from pointing directly at the N.pypi.python.org not delaying the need to do it until a later date. I believe Christian gets to define reasonable on this point :) Plucking a date out of the air, though, why not: July 1, 2014, with 6 month, 3 month and 1 month warnings sent to the operators of mirrors that haven't yet been redirected back to PyPI. That's nearly 11 months away, and hopefully other changes will have settled down by then. If the mirror operators are happy their transition is complete before then, cool, otherwise they have a hard deadline to work with. Cheers, Nick. 
-- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
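The dates in Nick's proposal (a hard deadline of July 1, 2014, with warnings 6, 3 and 1 months ahead, described as nearly 11 months away) can be checked with a little date arithmetic. This is only a worked example; the helper name is made up, and the starting date of August 6, 2013 is taken from the message timestamps in this thread.

```python
from datetime import date


def months_before(deadline: date, months: int) -> date:
    """Return the date `months` calendar months before a first-of-month deadline."""
    if deadline.day != 1:
        raise ValueError("this sketch only handles deadlines on the 1st")
    # Count months since year 0, subtract, then convert back to year/month.
    index = deadline.year * 12 + (deadline.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


DEADLINE = date(2014, 7, 1)
PROPOSED_ON = date(2013, 8, 6)

# Warning dates for the 6, 3 and 1 month notices.
warnings = {m: months_before(DEADLINE, m) for m in (6, 3, 1)}

# Days between the proposal and the deadline.
days_left = (DEADLINE - PROPOSED_ON).days
```

The warnings fall on January 1, April 1 and June 1, 2014, and the gap is 329 days, a little under 11 months.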
Re: [Distutils] What to do about the PyPI mirrors
On 06.08.13 09:59, Donald Stufft wrote: The big question then becomes what is a reasonable value for N amount of time, the original proposal essentially used 4 months for no real reason. Would 6 months be better? 8? I think making this window _too_ long doesn't really do anything except delay the inevitable and the window should be decided on for what's a reasonable amount of time for people to move away from pointing directly at the N.pypi.python.org not delaying the need to do it until a later date. Assuming the main breakage comes from people having hard-coded the mirror names in configuration files: Why not leave the *.pypi names available forever (ten years), all pointing to the master? Regards, Martin
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 6:45 AM, Martin v. Löwis mar...@v.loewis.de wrote: Assuming the main breakage comes from people having hard-coded the mirror names in configuration files: Why not leave the *.pypi names available forever (ten years), all pointing to the master? The major reason (for me, Noah might have others as Infra lead) is that they have never been available via TLS, so everyone who has hard-coded them has hard-coded them as HTTP. A lot of those people likely don't realize that by using them they are risking a man in the middle attack. So by continuing to support them we are essentially continuing to enable a grossly insecure setting, and very likely the folks vulnerable to it have not made an informed decision to accept that risk but have merely done what they thought was best practice. Ensuring that the transport is safe is one of my primary goals right now. A secondary (but minor) reason is simply one of logistics. Throughout various migrations, as things on PyPI settled, the names that do point back to PyPI have randomly become broken, sometimes for weeks or months. It's easy to miss checking that all of them continue to work, and I believe that it's better to have a clean break than to half-ass support for those names. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
Hi, Thanks for all the feedback, I'll calm down a bit and ponder some more structured reply. However, you're responding to the technicalities. I didn't see any consideration to the user pain. It seems irrelevant. Almost like arguing with the TSA about taking off your shoes. f.pypi.python.org is going to go away. And *everyone* using it needs to change it. Manually - or else. Other communities, like the Linux distributions, are doing simple, file-based stuff for ages. They did not learn from us, and AFAICT we didn't learn from them? My overall feeling is that we're telling the story to the outside world that they should not rely on Python. We can break anything, will find a good technical argument, and be done. If we're getting so much better with all the changes, then this goodness should be available to anyone invested with the platform already. Currently this seems to be: hurt the people who are with us now, so we can get more new ones when they leave. Sorry for the sarcasm, as promised, I'll come back with a more structured technical response later - need to go and calm down. Christian -- Christian Theune · c...@gocept.com gocept gmbh co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany http://gocept.com · Tel +49 345 1229889-7 Python, Pyramid, Plone, Zope · consulting, development, hosting, operations signature.asc Description: Message signed with OpenPGP using GPGMail ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] What to do about the PyPI mirrors
On 6 August 2013 16:59, Christian Theune c...@gocept.com wrote: Hi, Thanks for all the feedback, I'll calm down a bit and ponder some more structured reply. However, you're responding to the technicalities. I didn't see any consideration to the user pain. It seems irrelevant. Almost like arguing with the TSA about taking off your shoes. User pain is the only reason for not making the change tomorrow. People need time to adjust, or to propose alternative solutions. f.pypi.python.org is going to go away. And *everyone* using it needs to change it. Manually - or else. Delegating subdomains of python.org without a contractual relationship in place was a fundamental mistake. It should never have happened. We can either admit "We screwed up and set up a seriously flawed mirroring system" and take steps to fix it, or we can leave the HTTP-only mirrors open as a security hole forever. One means by which I could see an f.pypi.python.org DNS record being left in place indefinitely is if the TUF folks are able to come up with a scheme for offering end-to-end security for the *existing* PyPI metadata, *and* the TUF metadata is mirrored by bandersnatch *and* the TUF client side integrity checks are invoked by pip. In that case, the security argument regarding the lack of TLS on the subdomains would be rendered moot, and the backwards compatibility argument for keeping it active would win. Another potential alternative might be for Gocept to approach the PSF about getting an SSL certificate for that domain, ensuring pip and setuptools both support HSTS, and then switching that mirror over to using HSTS (so even configurations hardcoded to use http://f.pypi.python.org will still get a validated secure connection). Both of those approaches would close the security hole, while leaving the domain in place. 
If upgrading pip and easy_install clients is a more acceptable solution than updating affected configurations to use a different domain name, then these are certainly options we should discuss. The only option which I consider completely out of the question is leaving f.pypi.python.org (or any other *.pypi.python.org subdomain) in place indefinitely as an insecure HTTP-only endpoint. Other communities, like the Linux distributions, are doing simple, file-based stuff for ages. They did not learn from us, and AFAICT we didn't learn from them? This case *is* a matter of us learning from other mirroring systems: none of them are based on delegating subdomains to third parties, they're all based on lists of mirror URLs, and some mechanism for retrieving that list. However, as long as the flawed way remains blessed as the official mirroring network, it's difficult for an alternative model to gain any traction. Regards, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
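The list-based model described above (a list of mirror URLs plus some way to retrieve it) could look roughly like the sketch below. It is only an illustration: the one-URL-per-line list format, the example hostnames, and the function name `secure_mirrors` are all assumptions, not anything defined by PyPI, pip, or a PEP.

```python
from urllib.parse import urlsplit


def secure_mirrors(mirror_list_text):
    """Return only the HTTPS URLs from a newline-separated mirror list.

    The list format (one URL per line, '#' for comments) is hypothetical.
    """
    mirrors = []
    for line in mirror_list_text.splitlines():
        url = line.strip()
        if not url or url.startswith("#"):
            continue  # skip blank lines and comments
        parts = urlsplit(url)
        # Keep a mirror only if it is reachable over TLS and names a host.
        if parts.scheme == "https" and parts.hostname:
            mirrors.append(url)
    return mirrors


# Hypothetical list content; none of these hosts are real mirrors.
EXAMPLE_LIST = """\
# example mirror list
https://pypi.mirror.example.org/simple/
http://insecure.mirror.example.net/simple/
https://mirror.example.com/pypi/simple/
"""
```

Calling `secure_mirrors(EXAMPLE_LIST)` drops the plain-HTTP entry, which is the property the thread cares about: a client never falls back to an unauthenticated transport just because a mirror is listed.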
Re: [Distutils] What to do about the PyPI mirrors
One means by which I could see an f.pypi.python.org DNS record being left in place indefinitely is if the TUF folks are able to come up with a scheme for offering end-to-end security for the *existing* PyPI metadata, *and* the TUF metadata is mirrored by bandersnatch *and* the TUF client side integrity checks are invoked by pip. In that case, the security argument regarding the lack of TLS on the subdomains would be rendered moot, and the backwards compatibility argument for keeping it active would win. It seems like you've been reading our minds (or at least our mailing list)! Thanks, Justin
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 8:22 AM, Nick Coghlan ncogh...@gmail.com wrote: On 6 August 2013 16:59, Christian Theune c...@gocept.com wrote: Hi, Thanks for all the feedback, I'll calm down a bit and ponder some more structured reply. However, you're responding to the technicalities. I didn't see any consideration to the user pain. It seems irrelevant. Almost like arguing with the TSA about taking off your shoes. User pain is the only reason for not making the change tomorrow. People need time to adjust, or to propose alternative solutions. f.pypi.python.org is going to go away. And *everyone* using it needs to change it. Manually - or else. Delegating subdomains of python.org without a contractual relationship in place was a fundamental mistake. It should never have happened. We can either admit "We screwed up and set up a seriously flawed mirroring system" and take steps to fix it, or we can leave the HTTP-only mirrors open as a security hole forever. One means by which I could see an f.pypi.python.org DNS record being left in place indefinitely is if the TUF folks are able to come up with a scheme for offering end-to-end security for the *existing* PyPI metadata, *and* the TUF metadata is mirrored by bandersnatch *and* the TUF client side integrity checks are invoked by pip. In that case, the security argument regarding the lack of TLS on the subdomains would be rendered moot, and the backwards compatibility argument for keeping it active would win. It would be rendered moot as far as any tooling that was updated to work with it (such as pip etc.) is concerned; however, the browser-level attacks would still be in play. Solving those would essentially require giving mirror operators an SSL certificate for N.pypi.python.org (which still does not preclude a malicious mirror operator or a compromised mirror from being used to steal logins on PyPI). 
Another potential alternative might be for Gocept to approach the PSF about getting an SSL certificate for that domain, ensuring pip and setuptools both support HSTS, and then switching that mirror over to using HSTS (so even configurations hardcoded to use http://f.pypi.python.org will still get a validated secure connection). FWIW this doesn't make it secure unless they change their configuration to point to https://f.pypi.python.org/simple instead of http://f.pypi.python.org/simple. [*] Programmatic libraries typically don't support HSTS (as HSTS is primarily used to prevent attacks that don't typically apply to command line clients), and certainly none of the existing tools support it. So, given that we'd be relying on the redirect that upgrades the connection to HTTPS, an attacker could simply return HTML instead of the redirect to HTTPS. That is why it makes sense to remove the names: either way users will need to edit their existing configurations if they intend to be secure, and moving to a different domain will prevent the in-browser attacks as well. [*] This is also true of anyone who has hard coded a URL to http://pypi.python.org/simple/; however, there's no reasonable way to fix that. Both of those approaches would close the security hole, while leaving the domain in place. If upgrading pip and easy_install clients is a more acceptable solution than updating affected configurations to use a different domain name, then these are certainly options we should discuss. The only option which I consider completely out of the question is leaving f.pypi.python.org (or any other *.pypi.python.org subdomain) in place indefinitely as an insecure HTTP-only endpoint. Other communities, like the Linux distributions, are doing simple, file-based stuff for ages. They did not learn from us, and AFAICT we didn't learn from them? 
This case *is* a matter of us learning from other mirroring systems: none of them are based on delegating subdomains to third parties, they're all based on lists of mirror URLs, and some mechanism for retrieving that list. However, as long as the flawed way remains blessed as the official mirroring network, it's difficult for an alternative model to gain any traction. Regards, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] How to disable PYTHONPATH checking when installing packages using distribute
On Fri, Aug 2, 2013 at 8:53 AM, lukshun...@gmail.com wrote: Hi, While installing a package which uses distribute (matplotlib in this case), it refuses to work with this message: running install Checking .pth file support in /usr/local/stow/matplotlib-1.3.0/lib/python2.7/site-packages/ /usr/bin/python -E -c pass TEST FAILED: /usr/local/stow/matplotlib-1.3.0/lib/python2.7/site-packages/ does NOT support .pth files error: bad install directory or PYTHONPATH ... Please make the appropriate changes for your system and try again. I install local packages using the stow approach, in which each package is installed under its own sub-directory and later stowed (https://www.gnu.org/software/stow/). This error becomes a nuisance, as a different PYTHONPATH has to be set for each installation of a package. How can the checking be disabled? I don't seem to be able to find anything in the documentation and would be grateful for any pointer. Or maybe it would be better turned into a warning, with users reminded to add the install directory to PYTHONPATH. By default setuptools (distribute) installs packages as eggs, and loading eggs requires the ability to write a .pth file to a directory that will be checked for .pth files at start up (i.e. is in PYTHONPATH or otherwise on sys.path by default). You can avoid doing an egg-based install by instead running: python setup.py install --single-version-externally-managed --prefix /usr/local/stow/matplotlib-1.3.0 or something to that effect. I think if you do this you also need to make sure to manually add the .egg-info directory as well. I think you can do this with python setup.py install_egg_info --install-dir /usr/local/stow/matplotlib-1.3.0/lib/python2.7/site-packages/ but YMMV. You might also try just installing with pip since it will basically install the package in the same way by default. Erik
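For context on what ".pth file support" means in the error above: when a directory is treated as a site directory, Python's standard `site` module reads the `*.pth` files in it and appends each listed directory that exists to `sys.path`. The sketch below demonstrates that mechanism with `site.addsitedir` and temporary directories. It is not the check setuptools itself runs, and the function and file names (`pth_demo`, `example.pth`) are made up for illustration.

```python
import os
import site
import sys
import tempfile


def pth_demo():
    """Create a throwaway site directory with a .pth file and activate it."""
    site_dir = tempfile.mkdtemp()  # stands in for .../site-packages
    pkg_dir = tempfile.mkdtemp()   # stands in for an installed egg's location

    # A .pth file lists one path per line; existing paths are added to sys.path.
    with open(os.path.join(site_dir, "example.pth"), "w") as f:
        f.write(pkg_dir + "\n")

    # Adds site_dir to sys.path and processes any .pth files found in it.
    site.addsitedir(site_dir)
    return site_dir, pkg_dir
```

A directory that is merely listed on the command line as an install target, but is not on `sys.path` or processed as a site directory at startup, never gets this treatment, which is why an egg-based install into a stow prefix fails the check.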
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 5:22 AM, Nick Coghlan wrote: On 6 August 2013 16:59, Christian Theune c...@gocept.com wrote: Hi, Thanks for all the feedback, I'll calm down a bit and ponder some more structured reply. However, you're responding to the technicalities. I didn't see any consideration to the user pain. It seems irrelevant. Almost like arguing with the TSA about taking off your shoes. User pain is the only reason for not making the change tomorrow. People need time to adjust, or to propose alternative solutions. My reasoning for picking 4 months total for the migration is that an individual user switching their mirror hostnames is a relatively quick process (maybe a few days in a really big case), and anyone who doesn't hear about this change within a few months is highly unlikely to learn about it over a longer period of time. Humans are generally deadline-driven, so moving the deadline back doesn't get us much except moving the conversion work back with it. Basically I think that past the 6-8 week mark we are just hitting the long tail in terms of actual benefit to users, and it is better to just break the system and force them to notice they need to fix things (since one reason for doing this is that the current system is unsafe, and allowing that to exist for another year is not really on my list). --Noah
Re: [Distutils] What to do about the PyPI mirrors
On 7 August 2013 01:58, Donald Stufft don...@stufft.io wrote: On Aug 6, 2013, at 8:22 AM, Nick Coghlan ncogh...@gmail.com wrote: One means by which I could see an f.pypi.python.org DNS record being left in place indefinitely is if the TUF folks are able to come up with a scheme for offering end-to-end security for the *existing* PyPI metadata, *and* the TUF metadata is mirrored by bandersnatch *and* the TUF client side integrity checks are invoked by pip. In that case, the security argument regarding the lack of TLS on the subdomains would be rendered moot, and the backwards compatibility argument for keeping it active would win. It would be rendered moot as far as any tooling that was updated to work with it (such as pip etc) however the browser level attacks would still be in play. If those can be solved essentially require giving mirror operators a SSL certificate for N.pypi.python.org (which still does not preclude a malicious mirror operator or a compromised mirror from being used to steal logins on PyPI). We're not talking about mirror operators in an abstract sense any more: we're talking specifically about the possibility of a formal relationship with Gocept to keep the f.pypi.python.org domain alive indefinitely. It isn't Gocept's fault that the upstream mirroring system design was broken, so the burden is on us to work with Christian to come up with a transition plan that is acceptable to both sides. That may mean retiring the subdomain and Gocept changing the configuration at affected sites, or it may mean Gocept negotiating a more formal relationship with the PSF to continue operating f.pypi.python.org in particular. Both options need to be on the table from the upstream side, giving Gocept a chance to assess the alternatives (i.e. 
change existing sites to point to a new domain, or accept that some existing sites will remain insecure until they have been upgraded, just as we have to accept that old clients will remain insecure when accessing the main site over HTTP). Another potential alternative might be for Gocept to approach the PSF about getting an SSL certificate for that domain, ensuring pip and setuptools both support HSTS, and then switching that mirror over to using HSTS (so even configurations hardcoded to use http://f.pypi.python.org will still get a validated secure connection). FWIW this doesn't make it secure unless they change their configuration to point to https://f.pypi.python.org/simple instead of http://f.pypi.python.org/simple.* Programmatic libraries typically don't support HSTS (as HSTS it primarily used to prevent attacks that don't typically apply to command line clients). And certainly none of the existing tools support it. So given that we'd be relying on the redirect that upgrades the connection to HTTPS an attacker could simply return HTML instead of the redirect to HTTPS. Yeah, I realised this flaw after posting. An alternative hack that would allow the problem to be solved through an "upgrade pip/easy_install" solution rather than an "ensure all configs have been edited" approach would be a simple URL translation map that converts legacy "http://f.pypi.python.org" references to a new URL. Hence why it makes sense to remove, because either way users will need to edit their existing configurations if they intend to be secure and moving to a different domain will prevent the in browser attacks as well. * This is also true of anyone who has hard coded an url to http://pypi.python.org/simple/ however there's no reasonable way to fix that. Hardcoded references to http://f.pypi.python.org/simple/ aren't *that* different from hardcoded references to the main site. The only addition is the inclusion of Gocept in the chain of trust. 
Given that Christian wrote the now recommended mirroring client, trusting Christian/Gocept is fairly unavoidable at this point :) We don't have a hard deadline for fixing this on the upstream side - it's in the "important but not currently urgent" category. If we can get the active legacy mirrors down to just f.pypi.python.org that will be solid progress, and then we can work out a specific arrangement for that last mirror which works for Gocept as well. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
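The URL translation map floated in this message could be as small as the sketch below. This is only an illustration of the idea, not code from pip or setuptools: the function name `translate_index_url`, the `LEGACY_HOST_MAP` table, and the replacement host `pypi.mirror.example.org` are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from legacy mirror hostnames to replacement hosts.
LEGACY_HOST_MAP = {
    "f.pypi.python.org": "pypi.mirror.example.org",
}


def translate_index_url(url, host_map=LEGACY_HOST_MAP):
    """Rewrite a legacy mirror URL to its HTTPS replacement, if one is mapped.

    URLs whose host is not in the map are returned unchanged.
    """
    parts = urlsplit(url)
    new_host = host_map.get(parts.hostname or "")
    if new_host is None:
        return url
    # Force HTTPS and drop any explicit port; keep path, query and fragment.
    return urlunsplit(("https", new_host, parts.path, parts.query, parts.fragment))
```

Because the rewrite happens inside the client before any network request is made, there is no plain-HTTP redirect for an attacker to intercept, which is the weakness of the HSTS approach discussed earlier. The cost is that only upgraded clients benefit.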
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 9:10 PM, Nick Coghlan ncogh...@gmail.com wrote: On 7 August 2013 01:58, Donald Stufft don...@stufft.io wrote: On Aug 6, 2013, at 8:22 AM, Nick Coghlan ncogh...@gmail.com wrote: One means by which I could see an f.pypi.python.org DNS record being left in place indefinitely is if the TUF folks are able to come up with a scheme for offering end-to-end security for the *existing* PyPI metadata, *and* the TUF metadata is mirrored by bandersnatch *and* the TUF client side integrity checks are invoked by pip. In that case, the security argument regarding the lack of TLS on the subdomains would be rendered moot, and the backwards compatibility argument for keeping it active would win. It would be rendered moot as far as any tooling that was updated to work with it (such as pip etc) however the browser level attacks would still be in play. If those can be solved essentially require giving mirror operators a SSL certificate for N.pypi.python.org (which still does not preclude a malicious mirror operator or a compromised mirror from being used to steal logins on PyPI). We're not talking about mirror operators in an abstract sense any more: we're talking specifically about the possibility of a formal relationship with Gocept to keep the f.pypi.python.org domain alive indefinitely. It isn't Gocept's fault that the upstream mirroring system design was broken, so the burden is on us to work with Christian to come up with a transition plan that is acceptable to both sides. I recognize this. The problem is that Gocept isn't the sole user of their mirror. If they were (or we knew the set of users who were) then we could approach each one of them. That may mean retiring the subdomain and Gocept changing the configuration at affected sites, or it may mean Gocept negotiating a more formal relationship with the PSF to continue operating f.pypi.python.org in particular. 
Both options need to be on the table from the upstream side, giving Gocept a chance to assess the alternatives (i.e. change existing sites to point to a new domain, or accept that some existing sites will remain insecure until they have been upgraded, just as we have to accept that old clients will remain insecure when accessing the main site over HTTP). Gocept can't make that decision for installations other than their own. Christian said that his mirror is seeing 150-300GB of traffic besides the traffic Gocept generates. Some of that is coming from ``pip --use-mirrors``, I'm sure, but some of it is also coming from other people who have hardcoded f.pypi.python.org in their config. We don't know who they are, and we don't have a good way of informing them so that they can give informed consent to using an insecure transport. Another potential alternative might be for Gocept to approach the PSF about getting an SSL certificate for that domain, ensuring pip and setuptools both support HSTS, and then switching that mirror over to using HSTS (so even configurations hardcoded to use http://f.pypi.python.org will still get a validated secure connection). FWIW this doesn't make it secure unless they change their configuration to point to https://f.pypi.python.org/simple instead of http://f.pypi.python.org/simple.* Programmatic libraries typically don't support HSTS (as HSTS it primarily used to prevent attacks that don't typically apply to command line clients). And certainly none of the existing tools support it. So given that we'd be relying on the redirect that upgrades the connection to HTTPS an attacker could simply return HTML instead of the redirect to HTTPS. Yeah, I realised this flaw after posting. 
An alternative hack that would allow the problem to be solved through an "upgrade pip/easy_install" solution rather than an "ensure all configs have been edited" approach would be a simple URL translation map that converts legacy "http://f.pypi.python.org" references to a new URL. Hence why it makes sense to remove, because either way users will need to edit their existing configurations if they intend to be secure and moving to a different domain will prevent the in browser attacks as well. * This is also true of anyone who has hard coded an url to http://pypi.python.org/simple/ however there's no reasonable way to fix that. Hardcoded references to http://f.pypi.python.org/simple/ aren't *that* different from hardcoded references to the main site. The only addition is the inclusion of Gocept in the chain of trust. Given that Christian wrote the now recommended mirroring client, trusting Christian/Gocept is fairly unavoidable at this point :) The main difference is that hard-coded references to PyPI are unlikely. The clients defaulted to PyPI, so there was no reason to point to it in your configuration. This matches the traffic we see coming into PyPI and the decline of HTTP connections to /simple/. We don't have a hard deadline for
Re: [Distutils] What to do about the PyPI mirrors
How about building a deprecation period into the tooling? pip 1.5+ could warn users who are using *.pypi.python.org of the error in their ways and encourage them to switch to the new system and gives a date of total removal. After removal the code could also be removed from pip 1.x+. - Michael On Tue, Aug 6, 2013 at 8:36 PM, Donald Stufft don...@stufft.io wrote: On Aug 6, 2013, at 9:10 PM, Nick Coghlan ncogh...@gmail.com wrote: On 7 August 2013 01:58, Donald Stufft don...@stufft.io wrote: On Aug 6, 2013, at 8:22 AM, Nick Coghlan ncogh...@gmail.com wrote: One means by which I could see an f.pypi.python.org DNS record being left in place indefinitely is if the TUF folks are able to come up with a scheme for offering end-to-end security for the *existing* PyPI metadata, *and* the TUF metadata is mirrored by bandersnatch *and* the TUF client side integrity checks are invoked by pip. In that case, the security argument regarding the lack of TLS on the subdomains would be rendered moot, and the backwards compatibility argument for keeping it active would win. It would be rendered moot as far as any tooling that was updated to work with it (such as pip etc) however the browser level attacks would still be in play. If those can be solved essentially require giving mirror operators a SSL certificate for N.pypi.python.org (which still does not preclude a malicious mirror operator or a compromised mirror from being used to steal logins on PyPI). We're not talking about mirror operators in an abstract sense any more: we're talking specifically about the possibility of a formal relationship with Gocept to keep the f.pypi.python.org domain alive indefinitely. It isn't Gocept's fault that the upstream mirroring system design was broken, so the burden is on us to work with Christian to come up with a transition plan that is acceptable to both sides. I recognize this. The problem is that Gocept isn't the sole user of their mirror. 
If they were (or we knew the set of users who were) then we could approach each one of them. That may mean retiring the subdomain and Gocept changing the configuration at affected sites, or it may mean Gocept negotiating a more formal relationship with the PSF to continue operating f.pypi.python.org in particular. Both options need to be on the table from the upstream side, giving Gocept a chance to assess the alternatives (i.e. change existing sites to point to a new domain, or accept that some existing sites will remain insecure until they have been upgraded, just as we have to accept that old clients will remain insecure when accessing the main site over HTTP). Gocept can't make that decision for installations other than their own. Christian said that his mirror is seeing 150-300GB of traffic besides the traffic Gocept generates. Some of that is coming from ``pip --use-mirrors`` I'm sure, but some of it is also coming from other people who have hardcoded f.pypi.python.org in their config for whom we don't know who they are and we don't have a good way of informing them so they can make an informed consent on using an insecure transport. Another potential alternative might be for Gocept to approach the PSF about getting an SSL certificate for that domain, ensuring pip and setuptools both support HSTS, and then switching that mirror over to using HSTS (so even configurations hardcoded to use http://f.pypi.python.org will still get a validated secure connection). FWIW this doesn't make it secure unless they change their configuration to point to https://f.pypi.python.org/simple instead of http://f.pypi.python.org/simple.* Programmatic libraries typically don't support HSTS (as HSTS it primarily used to prevent attacks that don't typically apply to command line clients). And certainly none of the existing tools support it. 
So given that we'd be relying on the redirect that upgrades the connection to HTTPS an attacker could simply return HTML instead of the redirect to HTTPS. Yeah, I realised this flaw after posting. An alternative hack that would allow the problem to be solved through an "upgrade pip/easy_install" solution rather than an "ensure all configs have been edited" approach would be a simple URL translation map that converts legacy "http://f.pypi.python.org" references to a new URL. Hence why it makes sense to remove, because either way users will need to edit their existing configurations if they intend to be secure and moving to a different domain will prevent the in browser attacks as well. * This is also true of anyone who has hard coded an url to http://pypi.python.org/simple/ however there's no reasonable way to fix that. Hardcoded references to http://f.pypi.python.org/simple/ aren't *that* different from hardcoded references to the main site. The only addition is the inclusion of Gocept in the chain of trust.
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 9:50 PM, Michael Merickel mmeri...@gmail.com wrote: How about building a deprecation period into the tooling? pip 1.5+ could warn users who are using *.pypi.python.org of the error in their ways and encourage them to switch to the new system and gives a date of total removal. After removal the code could also be removed from pip 1.x+. - Michael pip 1.5 already warns if you use ``--use-mirrors`` or ``--mirrors``. I suppose a warning could be added if you use -i N.pypi.python.org as well. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
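A warning for `-i N.pypi.python.org` might be implemented along the lines of the sketch below. It is a hedged illustration, not pip's actual code: the helper names `is_legacy_mirror` and `warn_if_legacy_mirror` are hypothetical, and the pattern simply assumes that legacy mirrors are lowercase-letter subdomains of pypi.python.org (a, b, ..., f, and so on).

```python
import re
import warnings
from urllib.parse import urlsplit

# Assumed naming scheme: one or more lowercase letters under pypi.python.org.
LEGACY_MIRROR_RE = re.compile(r"[a-z]+\.pypi\.python\.org")


def is_legacy_mirror(index_url):
    """Return True if index_url points at a *.pypi.python.org mirror name."""
    if "://" not in index_url:
        # Allow bare "f.pypi.python.org/simple" style values.
        index_url = "//" + index_url
    host = urlsplit(index_url).hostname or ""
    return LEGACY_MIRROR_RE.fullmatch(host) is not None


def warn_if_legacy_mirror(index_url):
    """Emit a deprecation warning for legacy mirror URLs; return whether it fired."""
    if is_legacy_mirror(index_url):
        warnings.warn(
            "%s is a deprecated public mirror name; please switch to a "
            "different index URL." % index_url,
            DeprecationWarning,
            stacklevel=2,
        )
        return True
    return False
```

The main index, `pypi.python.org`, does not match because the pattern requires a subdomain label. As noted in the follow-up messages, a check like this only helps users of the tool that implements it; other installers pointed at the same hostname would see no warning.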
Re: [Distutils] What to do about the PyPI mirrors
On Aug 6, 2013, at 10:11 PM, Trishank Karthik Kuppusamy t...@students.poly.edu wrote: On 08/06/2013 09:59 PM, Donald Stufft wrote: On Aug 6, 2013, at 9:50 PM, Michael Merickel mmeri...@gmail.com wrote: How about building a deprecation period into the tooling? pip 1.5+ could warn users who are using *.pypi.python.org of the error in their ways and encourage them to switch to the new system and gives a date of total removal. After removal the code could also be removed from pip 1.x+. - Michael pip 1.5 already warns if you use ``--use-mirrors`` or ``--mirrors``. I suppose a warning could be added if you use -i N.pypi.python.org as well. Does anyone use anything other than pip to download from N.pypi.python.org? Yes. Other tooling can be pointed to N.pypi.python.org by specifying it as an index. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] What to do about the PyPI mirrors
On 08/06/2013 09:59 PM, Donald Stufft wrote: On Aug 6, 2013, at 9:50 PM, Michael Merickel mmeri...@gmail.com wrote: How about building a deprecation period into the tooling? pip 1.5+ could warn users who are using *.pypi.python.org of the error in their ways and encourage them to switch to the new system and gives a date of total removal. After removal the code could also be removed from pip 1.x+. - Michael pip 1.5 already warns if you use ``--use-mirrors`` or ``--mirrors``. I suppose a warning could be added if you use -i N.pypi.python.org as well. Does anyone use anything other than pip to download from N.pypi.python.org?