Re: [Distutils] Draft PEP for JSON based metadata published
A couple of significant upcoming changes:

* "build label" will be renamed as "source label" (since it refers to the common unbuilt source rather than a specific build)
* "version URL" will be renamed as "source URL" (same rationale)

The current names are ambiguous as to whether they refer to the source code for the version or can be used to refer to built versions. Since they're specifically for source references (you need to add at least PEP 425 compatibility tags to construct a built reference), it makes sense to change the names.

A more minor change is that the "organization" type/role for contacts will go away. Organizations will be able to have any of the defined roles (author, maintainer, contributor) and if we later decide we need a programmatic means to distinguish abstract organisations from flesh and blood humans we can consider adding a new mechanism.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
___
Distutils-SIG maillist - Distutils-SIG@python.org
http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Draft PEP for JSON based metadata published
On Tue, May 28, 2013 at 7:28 AM, Donald Stufft wrote:
> On May 27, 2013, at 10:44 AM, Ronald Oussoren wrote:
>
> The versioning spec mentions that distribution tools may refuse to publish
> distributions that pin the versions of dependencies. I understand why this
> is needed, and agree in general, but have a use case that I don't know how to
> express without pinning.
>
> In particular, PyObjC consists of a number of distributions (pyobjc-core,
> pyobjc-framework-Cocoa, ...) and an umbrella package (pyobjc) that depends
> on the various distributions to make it easier to install all of PyObjC. The
> umbrella package currently pins the versions of subpackages to ensure that
> "pip install pyobjc==2.5.1" installs exactly that version of the entire
> project. When I'd use the "compatible release" specifier I can no longer
> easily ensure that users can install an exact version of the entire project,
> other than by hacking the system: specify a compatible version with an
> additional level that isn't used by the project (for example ~=2.5.2.0).
> What is the correct way to create an umbrella project without getting
> yelled at by distribution tools?
>
>
> It's unlikely PyPI will get more than a warning for ``==``, `is` comparisons
> might be disallowed? Not sure.

I think Ronald's example of publishing metadistributions that pin particular versions of subdistributions is a valid one (I do exactly the same thing myself with RPM, it just didn't occur to me as a use case while updating the PEPs), so I need to reconsider some of the index server restrictions currently proposed in the PEPs. However, I'd also still like to not-so-gently steer users away from overly restrictive dependencies in the general case.

This is a case where in a *technical* sense there's no difference between "We are making these distributions we maintain easier to install all at once" and "Our distribution needs a compatible version of this other distribution in order to work", but *semantically* they're two quite different operations.

So, what do people think of the idea of a new top level "distributes" field? Syntax identical to "requires", but *semantically* distinguished in that version pinning in "distributes" would be not only allowed, but encouraged.

A metapackage like PyObjC would then have just entries in the "distributes" field, and no direct dependencies of its own.

Thoughts?

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
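To make the "distributes" idea concrete, here is a rough sketch of what umbrella metadata might look like. It is an illustration only: the "distributes" key is just the proposal from this message, and the entry syntax (a list of "name (== version)" strings), the other keys shown, and the helper `unpinned` are assumptions made for the example, not something taken from the PEP drafts.

```python
import json

# Hypothetical metadata for an umbrella distribution. "distributes" is only
# the field proposed in this thread; the entry format is an assumption.
metadata = {
    "metadata_version": "2.0",
    "name": "pyobjc",
    "version": "2.5.1",
    "distributes": [
        "pyobjc-core (== 2.5.1)",
        "pyobjc-framework-Cocoa (== 2.5.1)",
    ],
}


def unpinned(entries):
    """Return the entries that do not pin an exact version with ==."""
    return [entry for entry in entries if "==" not in entry]


# A publishing tool could encourage pinning in "distributes" by reporting
# any entry that is not pinned.
print(json.dumps(metadata, indent=2))
print("unpinned entries:", unpinned(metadata["distributes"]))
```

Under this reading, a tool would apply the opposite check to "requires" (discouraging `==`) and this check to "distributes" (encouraging it).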
Re: [Distutils] Draft PEP for JSON based metadata published
On Wed, May 29, 2013 at 11:04 AM, Donald Stufft wrote:
>
> On May 28, 2013, at 9:00 PM, Nick Coghlan wrote:
>
> On Mon, May 27, 2013 at 9:36 PM, Nick Coghlan wrote:
>
> After preliminary reviews by Donald and Daniel, I have now pushed the
> first complete draft of the JSON-based metadata 2.0 proposal to
> python.org
>
> PEP 426 (metadata 2.0): http://www.python.org/dev/peps/pep-0426/
> PEP 440 (versioning): http://www.python.org/dev/peps/pep-0440/
>
>
> Based on some offline feedback from Daniel, I'm going to change the
> current "type" field in the contact metadata to "role". The name of
> the default role will change from "individual" to "contributor", and
> projects will be given freedom to define their own roles beyond the
> predefined ones. (We're actually stealing this from the way contact
> metadata works in PHP's composer).

Hmm, I may actually drop the extensibility idea - it makes the tooling harder without providing a significant benefit. So just the name changes for the field and the default value.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
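As a rough illustration of the rename described above, the sketch below converts a contact entry from the old "type" key to the new "role" key. The role names (author, maintainer, contributor) are the ones mentioned elsewhere in this thread; the shape of a contact entry, the function name `migrate_contact`, and the choice to reject anything outside the predefined roles are assumptions for the example only.

```python
# Roles named in this thread; with the extensibility idea dropped, these
# would be the only valid values (an assumption, pending the PEP text).
PREDEFINED_ROLES = {"author", "maintainer", "contributor"}


def migrate_contact(contact):
    """Return a copy of a contact entry that uses "role" instead of "type"."""
    migrated = {key: value for key, value in contact.items() if key != "type"}
    old_type = contact.get("type", "individual")
    # "individual" was the old default value; "contributor" is the new one.
    role = "contributor" if old_type == "individual" else old_type
    if role not in PREDEFINED_ROLES:
        raise ValueError("unsupported contact role: %r" % role)
    migrated["role"] = role
    return migrated


print(migrate_contact({"name": "A. Developer"}))
print(migrate_contact({"name": "B. Maintainer", "type": "maintainer"}))
```

Keeping the set of roles closed is what makes the tooling simple: a validator only needs a membership test.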
Re: [Distutils] Draft PEP for JSON based metadata published
On May 28, 2013, at 9:00 PM, Nick Coghlan wrote:

> On Mon, May 27, 2013 at 9:36 PM, Nick Coghlan wrote:
>> After preliminary reviews by Donald and Daniel, I have now pushed the
>> first complete draft of the JSON-based metadata 2.0 proposal to
>> python.org
>>
>> PEP 426 (metadata 2.0): http://www.python.org/dev/peps/pep-0426/
>> PEP 440 (versioning): http://www.python.org/dev/peps/pep-0440/
>
> Based on some offline feedback from Daniel, I'm going to change the
> current "type" field in the contact metadata to "role". The name of
> the default role will change from "individual" to "contributor", and
> projects will be given freedom to define their own roles beyond the
> predefined ones. (We're actually stealing this from the way contact
> metadata works in PHP's composer).
>
> Cheers,
> Nick.

Please define what the valid values for the role field are when you include it.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] Draft PEP for JSON based metadata published
On Mon, May 27, 2013 at 9:36 PM, Nick Coghlan wrote:
> After preliminary reviews by Donald and Daniel, I have now pushed the
> first complete draft of the JSON-based metadata 2.0 proposal to
> python.org
>
> PEP 426 (metadata 2.0): http://www.python.org/dev/peps/pep-0426/
> PEP 440 (versioning): http://www.python.org/dev/peps/pep-0440/

Based on some offline feedback from Daniel, I'm going to change the current "type" field in the contact metadata to "role". The name of the default role will change from "individual" to "contributor", and projects will be given freedom to define their own roles beyond the predefined ones. (We're actually stealing this from the way contact metadata works in PHP's composer).

Cheers,
Nick.
[Distutils] [ANN] pypiserver 1.1.1 - minimal private pypi server
Hi,

I've just uploaded pypiserver 1.1.1 to the python package index.

pypiserver is a minimal PyPI compatible server. It can be used to serve a set of packages and eggs to easy_install or pip.

pypiserver is easy to install (i.e. just 'pip install pypiserver'). It doesn't have any external dependencies.

https://pypi.python.org/pypi/pypiserver/ should contain enough information to easily get you started running your own PyPI server in a few minutes.

The code is available on github: https://github.com/schmir/pypiserver

Changes in this version
-----------------------
- add 'overwrite' option to allow overwriting existing package files (default: false)
- show names with hyphens instead of underscores on the "/simple" listing
- make the standalone version work with jython 2.5.3
- upgrade waitress to 0.8.5 in the standalone version
- workaround broken xmlrpc api on pypi.python.org by using HTTPS

--
Cheers
Ralf
Re: [Distutils] Draft PEP for JSON based metadata published
On Tue, May 28, 2013 at 2:07 PM, Erik Bray wrote:
> On Mon, May 27, 2013 at 7:36 AM, Nick Coghlan wrote:
>> After preliminary reviews by Donald and Daniel, I have now pushed the
>> first complete draft of the JSON-based metadata 2.0 proposal to
>> python.org
>>
>> PEP 426 (metadata 2.0): http://www.python.org/dev/peps/pep-0426/
>> PEP 440 (versioning): http://www.python.org/dev/peps/pep-0440/
>>
>> With the rationale and commentary, they're over 3000 lines between
>> them, so I'm not attaching them here.
>>
>> The rationale for many of the changes is at the end of each PEP, along
>> with some comments on features that I have either rejected or
>> deliberately chosen to defer to the next revision of the metadata (at
>> the earliest).
>>
>> Those with BitBucket accounts may also comment inline on the drafts here:
>>
>> PEP 426:
>> https://bitbucket.org/ncoghlan/misc/src/05d3586464b10d6a04a35409468269d7c89a87ba/pep_drafts/pep-0426.txt?at=default
>> PEP 440:
>> https://bitbucket.org/ncoghlan/misc/src/05d3586464b10d6a04a35409468269d7c89a87ba/pep_drafts/pep-0440.txt?at=default
>
> This is looking fantastic so far--thanks to Nick, Daniel, and Donald
> for their continued work on this. For now I just have a handful of
> minor notes on the latest draft of PEP 426:
>
> Typos:
>
> Under "Essential dependency resolution metadata" the "may_require" and
> related metadata keywords are spelled with hyphens instead of
> underscores.
>
> Under "Metabuild system" in the first example I think
> "some_test_harness.metabuild_hook" was meant to read
> "some_test_harness:metabuild_hook"
>
> Under "Development, build and deployment dependencies": "allow" -> "allows"
>
> Under "Support for metabuild hooks": "by allows projects" -> "by
> allowing projects"
>
> Comment:
>
> I'm not sure if this PEP is the best place for this, but I wonder if
> the description of the "Keywords" format could provide some
> clarification on how that field should be formatted in older metadata
> versions (specifically when including version 1.x metadata for
> backwards compatibility). In the past its format has never been
> specified. Some tools treat it as a space-separated field. Others
> have treated it as a comma-separated field. Sometimes one or the
> other depending on whether commas are present. It's a very annoying
> field.

I suggest treating it as a space-separated field for converting from 2.0 to 1.0. To convert from 1.0 to 2.0 you should just split on "not a letter" or, if you are feeling ambitious, "not some larger set of characters, probably resembling the identifier or package name rules".
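The conversion suggested above can be sketched in a few lines. This is only an illustration: it assumes the 2.0 keywords are a list of strings and the 1.x field is a single string, and the function names are made up for the example.

```python
import re

# Runs of letters: word characters that are neither digits nor underscores.
_LETTERS = re.compile(r"[^\W\d_]+")


def keywords_2_to_1(keywords):
    """Join a 2.0-style keyword list into a space-separated 1.x field."""
    return " ".join(keywords)


def keywords_1_to_2(field):
    """Split a 1.x Keywords field on anything that is not a letter."""
    return _LETTERS.findall(field)


# Handles space-separated, comma-separated and mixed legacy values alike.
print(keywords_1_to_2("packaging, metadata json"))
print(keywords_2_to_1(["packaging", "metadata", "json"]))
```

Note that the strict "not a letter" rule also splits keywords such as "python3" or "c-extension", which is presumably why the message mentions the more ambitious option of a larger character set.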
Re: [Distutils] Good news everyone, PyPI is behind a CDN
On May 28, 2013, at 2:21 PM, Erik Bray wrote: > On Mon, May 27, 2013 at 1:19 AM, Lennart Regebro wrote: >> On Sun, May 26, 2013 at 7:34 PM, Noah Kantrowitz wrote: >>> >>> >>> but seriously, at long last today it was my honor to throw the DNS switch >>> to move PyPI to the Fastly caching CDN. I would like to thank Donald Stufft >>> for doing much of the heavy lifting on the PyPI side, and to Fastly for >>> graciously offering to host us. What does this mean for everyone? Well the >>> biggest change is PyPI should get a whole lot faster. There are two major >>> downsides however. There will now be a delay of several minutes in some >>> cases between updating a package and having it be installable, and download >>> counts will now be even more incorrect than they were before. The PyPI >>> admins are discussing what to do about download counts long-term, but for >>> now we all feel that the performance and availability benefits outweigh the >>> loss. If anyone has any questions, or hears anything about issues with PyPI >>> please don't hesitate to contact me. >> >> This is going to spell disaster for the coffee industry, as you no >> longer have to take a coffee break when re-running a buildout. >> >> Thanks! > > I always test pip installation from PyPI "just in case" after > uploading a new package, so the new cache delay still leaves some time > for a coffee break (until Daniel gets the cache invalidation > integrated :/). But yes, so many hoorays for this \o/ I already enabled Cache Invalidation. > ___ > Distutils-SIG maillist - Distutils-SIG@python.org > http://mail.python.org/mailman/listinfo/distutils-sig - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA signature.asc Description: Message signed with OpenPGP using GPGMail ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Good news everyone, PyPI is behind a CDN
On Mon, May 27, 2013 at 1:19 AM, Lennart Regebro wrote: > On Sun, May 26, 2013 at 7:34 PM, Noah Kantrowitz wrote: >> >> >> but seriously, at long last today it was my honor to throw the DNS switch to >> move PyPI to the Fastly caching CDN. I would like to thank Donald Stufft for >> doing much of the heavy lifting on the PyPI side, and to Fastly for >> graciously offering to host us. What does this mean for everyone? Well the >> biggest change is PyPI should get a whole lot faster. There are two major >> downsides however. There will now be a delay of several minutes in some >> cases between updating a package and having it be installable, and download >> counts will now be even more incorrect than they were before. The PyPI >> admins are discussing what to do about download counts long-term, but for >> now we all feel that the performance and availability benefits outweigh the >> loss. If anyone has any questions, or hears anything about issues with PyPI >> please don't hesitate to contact me. > > This is going to spell disaster for the coffee industry, as you no > longer have to take a coffee break when re-running a buildout. > > Thanks! I always test pip installation from PyPI "just in case" after uploading a new package, so the new cache delay still leaves some time for a coffee break (until Daniel gets the cache invalidation integrated :/). But yes, so many hoorays for this \o/ ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Draft PEP for JSON based metadata published
On Mon, May 27, 2013 at 7:36 AM, Nick Coghlan wrote:
> After preliminary reviews by Donald and Daniel, I have now pushed the
> first complete draft of the JSON-based metadata 2.0 proposal to
> python.org
>
> PEP 426 (metadata 2.0): http://www.python.org/dev/peps/pep-0426/
> PEP 440 (versioning): http://www.python.org/dev/peps/pep-0440/
>
> With the rationale and commentary, they're over 3000 lines between
> them, so I'm not attaching them here.
>
> The rationale for many of the changes is at the end of each PEP, along
> with some comments on features that I have either rejected or
> deliberately chosen to defer to the next revision of the metadata (at
> the earliest).
>
> Those with BitBucket accounts may also comment inline on the drafts here:
>
> PEP 426:
> https://bitbucket.org/ncoghlan/misc/src/05d3586464b10d6a04a35409468269d7c89a87ba/pep_drafts/pep-0426.txt?at=default
> PEP 440:
> https://bitbucket.org/ncoghlan/misc/src/05d3586464b10d6a04a35409468269d7c89a87ba/pep_drafts/pep-0440.txt?at=default

This is looking fantastic so far--thanks to Nick, Daniel, and Donald for their continued work on this. For now I just have a handful of minor notes on the latest draft of PEP 426:

Typos:

Under "Essential dependency resolution metadata" the "may_require" and related metadata keywords are spelled with hyphens instead of underscores.

Under "Metabuild system" in the first example I think "some_test_harness.metabuild_hook" was meant to read "some_test_harness:metabuild_hook"

Under "Development, build and deployment dependencies": "allow" -> "allows"

Under "Support for metabuild hooks": "by allows projects" -> "by allowing projects"

Comment:

I'm not sure if this PEP is the best place for this, but I wonder if the description of the "Keywords" format could provide some clarification on how that field should be formatted in older metadata versions (specifically when including version 1.x metadata for backwards compatibility). In the past its format has never been specified. Some tools treat it as a space-separated field. Others have treated it as a comma-separated field. Sometimes one or the other depending on whether commas are present. It's a very annoying field.

Erik
Re: [Distutils] changelog / CDN inconsistency
On 28.05.13 11:04, Christian Theune wrote:
> So - what's the next step that can happen ASAP?

In addition to the changes Donald already did, I think it would be wise to restart mirroring at min(last_serial, last_mirroring - 1 minute). This will cause any simple pages to be re-downloaded that had been updated just before the last mirror run completed.

If you aren't checking md5sums of files after download, you should (I always wanted to put this into pep381client). Then, if you re-download the simple page, you can skip files that you already have downloaded, and whose md5sum did not change.

Regards,
Martin
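A rough sketch of both suggestions follows. It is one possible reading of the message, not code from pep381client or any other mirroring tool: the changelog is assumed to be a list of (package, serial, timestamp) tuples with timestamps in seconds, and the names `entries_to_replay` and `file_is_current` are hypothetical.

```python
import hashlib
import os

OVERLAP_SECONDS = 60  # the "1 minute" safety margin from the message above


def entries_to_replay(changelog, last_serial, last_run_started):
    """Select changelog entries to (re)process on the next mirror run.

    An entry is selected if it is newer than the last processed serial, or
    if it happened within OVERLAP_SECONDS before the previous run started.
    """
    cutoff = last_run_started - OVERLAP_SECONDS
    return [
        entry for entry in changelog
        if entry[1] > last_serial or entry[2] >= cutoff
    ]


def file_is_current(path, expected_md5, chunk_size=64 * 1024):
    """Return True if the local file exists and matches the expected md5."""
    if not os.path.isfile(path):
        return False
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5
```

Because replayed entries may point at files that are already mirrored, `file_is_current` keeps the overlap cheap: only files that are missing or whose md5sum differs need to be downloaded again.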
Re: [Distutils] pypi protocol
On 26.05.13 22:08, Jonas Geiregat wrote:
> I ended up reading pypiserver's source code to find out the internals.

Note that this is the wrong source code. The real PyPI source code is in https://bitbucket.org/pypa/pypi/src

Regards,
Martin
Re: [Distutils] [Infrastructure] changelog / CDN inconsistency
On May 28, 2013, at 12:00 PM, "Martin v. Löwis" wrote:

> On 28.05.13 16:23, Donald Stufft wrote:
>> Option 4: We add the expected hash of the simple page to the change log.
>> Mirror clients can then assert their state consistent.
>
> That would work. It would also cover the case where a new release
> happens while the mirroring is in progress.
>
> On the other hand, it's difficult to advise what to do if you find that
> the simple page does *not* match the most recent hashsum. You'd have to
> wait a little bit, and hope that the CDN will eventually provide the
> current version.
>
>> Should also probably assert the file hashes that are in the simple index.
>
> Indeed, with the same limitation as above: if you find that the CDN
> gives the old version, you'll have to wait (or bypass the CDN).
>
> Regards,
> Martin

Immediately after committing the database transaction, PyPI tells the CDN to purge its cache for the packages that have been affected. Fastly advertises "instant" purging, and in practice this means that the CDN will be serving the current version less than a second after the database transaction has been committed, and certainly within a handful of seconds at most.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] [Infrastructure] changelog / CDN inconsistency
On 28.05.13 16:23, Donald Stufft wrote:
> Option 4: We add the expected hash of the simple page to the change log.
> Mirror clients can then assert their state consistent.

That would work. It would also cover the case where a new release happens while the mirroring is in progress.

On the other hand, it's difficult to advise what to do if you find that the simple page does *not* match the most recent hashsum. You'd have to wait a little bit, and hope that the CDN will eventually provide the current version.

> Should also probably assert the file hashes that are in the simple index.

Indeed, with the same limitation as above: if you find that the CDN gives the old version, you'll have to wait (or bypass the CDN).

Regards,
Martin
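The "wait a little bit" behaviour for option 4 could look roughly like the sketch below. Everything in it is an assumption made for illustration: the thread does not say which hash the change log would carry (SHA-256 is used here), and `fetch_consistent_page` and its parameters are hypothetical names. The `fetch` argument is any callable that returns the page body as bytes, so the sketch makes no claim about how the page is downloaded.

```python
import hashlib
import time


def fetch_consistent_page(fetch, expected_sha256, attempts=5, delay=1.0,
                          sleep=time.sleep):
    """Call fetch() until the page matches the hash from the change log.

    Raises RuntimeError if the page is still stale after all attempts, at
    which point a mirror could give up for this run or bypass the CDN.
    """
    for attempt in range(attempts):
        body = fetch()
        if hashlib.sha256(body).hexdigest() == expected_sha256:
            return body
        if attempt < attempts - 1:
            sleep(delay)  # give the CDN time to serve the current version
    raise RuntimeError("simple page still stale after %d attempts" % attempts)
```

The same loop would work for the release files listed on the simple page, using the file hashes instead of the page hash.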
Re: [Distutils] [Infrastructure] changelog / CDN inconsistency
On 28.05.13 16:39, Donald Stufft wrote:
> On May 28, 2013, at 10:36 AM, holger krekel wrote:
>> yes, i also thought of option 4. Is that easy to implement on the side of pypi?
>> If we checksum the simple-page, we need idem-potent generation of simple pages
>> and ordering to begin with -- which is probably anyway a good idea.
>> It doesn't need to be version-ordering, just some consistent ordering.
>
> Check summing is easy yes.

And there is already a guarantee of a stable checksum for simple pages, because of the server signing of simple pages (which also computes a hash of the simple page already).

Regards,
Martin
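The point about consistent ordering can be shown with a small sketch: if the links are always emitted in the same order, the same set of files always yields the same checksum. This is not PyPI's actual page rendering; the HTML layout, the SHA-256 choice, and the function names are assumptions for the example.

```python
import hashlib
from html import escape


def render_simple_page(project, files):
    """Render a simple-style page from (filename, url) pairs in any order."""
    lines = ["<html><head><title>Links for %s</title></head><body>" % escape(project)]
    # Sorting gives "some consistent ordering"; it need not be version order.
    for filename, url in sorted(files):
        lines.append('<a href="%s">%s</a><br/>' % (escape(url, quote=True), escape(filename)))
    lines.append("</body></html>")
    return "\n".join(lines).encode("utf-8")


def page_checksum(project, files):
    """Checksum that depends only on the set of files, not on their order."""
    return hashlib.sha256(render_simple_page(project, files)).hexdigest()
```

With this property a mirror can compare the checksum of the page it downloaded against the one recorded for a given serial without worrying about incidental differences in link order.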
Re: [Distutils] [Infrastructure] changelog / CDN inconsistency
> Mirroring is in a bad state because it comes (and has always) with
> absolutely no guarantees of consistency.

That is not true. There are no absolute guarantees, but certainly partial guarantees of consistency. It's a kind of "eventual consistency": if the releases are all older than the mirror frequency, the mirror will be consistent.

> You dismiss the issues of having serial n+1 changes, but that is a
> serious problem. If you fetch up to serial N of package1 which has
> the released version of 1.0, and then you fetch serial N+2 of
> package2 which has a hard requirement on package 1.1 (which was
> released in serial N+1) you now have packages that are not
> installable via your mirror because of inconsistent state.

Sure, but that is only a temporary problem, with an inconsistency window of a few minutes in the worst case - and it only occurs if serials N, N+1, and N+2 all happen within 5 minutes (i.e. two releases of package1, and one release of package2). When the mirror script runs again, it will find that serial N+1 already happened, and fetch package1 and package2 again.

> If someone comes up with a better option that doesn't require a
> large rearch of the storage code in PyPI I'm happy to review and
> deploy it.

This could be fixed by having PyPI provide old versions of the simple page. It would not be possible to do so exactly currently. However, excluding releases newer than a given date would be possible, by inspecting the journal.

Regards,
Martin
Re: [Distutils] [Infrastructure] Good news everyone, PyPI is behind a CDN
On 28.05.13 14:48, M.-A. Lemburg wrote:
> We've had the CDN discussion for quite a while and I even setup
> a test CDN some months ago. No one ever mentioned the HTTP/1.0
> problem and so it simply wasn't on the radar.

On the other hand, the other problems *were* mentioned with respect to CDNs multiple times over the recent years, so this shouldn't have surprised anybody.

Regards,
Martin
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
On May 28, 2013, at 10:36 AM, holger krekel wrote: > On Tue, May 28, 2013 at 10:23 -0400, Donald Stufft wrote: >> On May 28, 2013, at 8:20 AM, Donald Stufft wrote: >> >>> >>> On May 28, 2013, at 5:04 AM, Christian Theune wrote: >>> Hi, On 27. May2013, at 10:41 PM, Donald Stufft wrote: > Just to assure folks. I do consider Mirroring a first class citizen and > an important feature. Thanks for that acknowledgement. Lets sort out what to do now - this is becoming urgent for me as the author of the currently recommended mirroring tool for public mirrors and as an operator of a mirror that is being relied upon. I agree with Holgers points. I don't think the mirroring is completely backwards right now. I agree there's been an incomplete PEP that's been hanging around too long. My current client implementation is pretty simple and has had reliable semantics until now. A couple of things I noticed in the discussion that I'd like to point out: - We mirror simple pages because the PEP requires us to - this is part of the existing validation approach. I can drop that to get mirrors not to rely on simple pages from the CDN but then authentication of the simple pages will be broken. - Release files are replaced all the time. The semantics that I like to keep with the mirrors is this: When I get a changelog for serial X and I start copying simple pages and files then I (as a mirror) promise my clients that I have incorporated *at least* all changes up until serial X (but maybe also partial changes from X+n). I'm afraid that the mirrors data are now inconsistent - we can repair that once we have a stable mirroring approach again, but until then people will start getting annoyed again. I'm also concerned that I don't really have time to follow up on what's happening with TUF regarding mirroring on top of what happened regarding the CDN. My feeling is that will result in more fire fighting. So - what's the next step that can happen ASAP? 
>>> >>> Options) >>> >>> 1) When mirroring retain N minutes worth of old serials and redo them. >>> Mirroring is idempotent you can repeat it with no negative side effects. >>> Conditional HTTP requests should also be supported to minimize the >>> bandwidth. >>> 2) Wait a few seconds after fetching the change log to begin processing. >>> 3) Use front.python.org with the pypi.python.org HOST header with the >>> caveat this is not guaranteed to be stable in the long term. >>> 4) ??? >> >> Option 4: We add the expected hash of the simple page to the change log. >> Mirror clients can then assert their state consistent. >> >> Should also probably assert the file hashes that are in the simple index. > > yes, i also thought of option 4. Is that easy to implement on the side of > pypi? > If we checksum the simple-page, we need idem-potent generation of simple pages > and ordering to begin with -- which is probably anyway a good idea. > It doesn't need to be version-ordering, just some consistent ordering. Check summing is easy yes. > > As mentioned in the other mail, for the short-term i'd go for 3) once Noah > and you confirm you are not going to kill it before we have settled on > a new solution (maybe option 4). > > best, > holger > > >>> Of them 1) is more likely to give you the best >>> resultshttp://mail.python.org/pipermail/distutils-sig/2013-May/020855.html >>> the constraints of HTTP. All it takes is someone to run your mirroring >>> script behind a caching proxy and pre-CDN you'd have the exact situation we >>> have now. >>> >>> Mirroring is in a bad state because it comes (and has always) with >>> absolutely no guarantees of consistency. You dismiss the issues of having >>> serial n+1 changes, but that is a serious problem. 
If you fetch up to >>> serial N of package1 which has the released version of 1.0, and then you >>> fetch serial N+2 of package2 which has a hard requirement on package 1.1 >>> (which was released in serial N+1) you now have packages that are not >>> installable via your mirror because of inconsistent state. >>> >>> If someone comes up with a better option that doesn't require a large >>> rearch of the storage code in PyPI I'm happy to review and deploy it. >>> Christian -- Christian Theune · c...@gocept.com gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany http://gocept.com · Tel +49 345 1229889-7 Python, Pyramid, Plone, Zope · consulting, development, hosting, operations >>> >>> >>> - >>> Donald Stufft >>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA >>> >>> ___ >>> Distutils-SIG maillist - Distutils-SIG@python
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
On May 28, 2013, at 10:36 AM, holger krekel wrote: > On Tue, May 28, 2013 at 10:23 -0400, Donald Stufft wrote: >> On May 28, 2013, at 8:20 AM, Donald Stufft wrote: >> >>> >>> On May 28, 2013, at 5:04 AM, Christian Theune wrote: >>> Hi, On 27. May2013, at 10:41 PM, Donald Stufft wrote: > Just to assure folks. I do consider Mirroring a first class citizen and > an important feature. Thanks for that acknowledgement. Lets sort out what to do now - this is becoming urgent for me as the author of the currently recommended mirroring tool for public mirrors and as an operator of a mirror that is being relied upon. I agree with Holgers points. I don't think the mirroring is completely backwards right now. I agree there's been an incomplete PEP that's been hanging around too long. My current client implementation is pretty simple and has had reliable semantics until now. A couple of things I noticed in the discussion that I'd like to point out: - We mirror simple pages because the PEP requires us to - this is part of the existing validation approach. I can drop that to get mirrors not to rely on simple pages from the CDN but then authentication of the simple pages will be broken. - Release files are replaced all the time. The semantics that I like to keep with the mirrors is this: When I get a changelog for serial X and I start copying simple pages and files then I (as a mirror) promise my clients that I have incorporated *at least* all changes up until serial X (but maybe also partial changes from X+n). I'm afraid that the mirrors data are now inconsistent - we can repair that once we have a stable mirroring approach again, but until then people will start getting annoyed again. I'm also concerned that I don't really have time to follow up on what's happening with TUF regarding mirroring on top of what happened regarding the CDN. My feeling is that will result in more fire fighting. So - what's the next step that can happen ASAP? 
>>> >>> Options) >>> >>> 1) When mirroring retain N minutes worth of old serials and redo them. >>> Mirroring is idempotent you can repeat it with no negative side effects. >>> Conditional HTTP requests should also be supported to minimize the >>> bandwidth. >>> 2) Wait a few seconds after fetching the change log to begin processing. >>> 3) Use front.python.org with the pypi.python.org HOST header with the >>> caveat this is not guaranteed to be stable in the long term. >>> 4) ??? >> >> Option 4: We add the expected hash of the simple page to the change log. >> Mirror clients can then assert their state consistent. >> >> Should also probably assert the file hashes that are in the simple index. > > yes, i also thought of option 4. Is that easy to implement on the side of > pypi? > If we checksum the simple-page, we need idem-potent generation of simple pages > and ordering to begin with -- which is probably anyway a good idea. > It doesn't need to be version-ordering, just some consistent ordering. > > As mentioned in the other mail, for the short-term i'd go for 3) once Noah > and you confirm you are not going to kill it before we have settled on > a new solution (maybe option 4). #3 is how fastly connects. > > best, > holger > > >>> Of them 1) is more likely to give you the best >>> resultshttp://mail.python.org/pipermail/distutils-sig/2013-May/020855.html >>> the constraints of HTTP. All it takes is someone to run your mirroring >>> script behind a caching proxy and pre-CDN you'd have the exact situation we >>> have now. >>> >>> Mirroring is in a bad state because it comes (and has always) with >>> absolutely no guarantees of consistency. You dismiss the issues of having >>> serial n+1 changes, but that is a serious problem. 
If you fetch up to >>> serial N of package1 which has the released version of 1.0, and then you >>> fetch serial N+2 of package2 which has a hard requirement on package 1.1 >>> (which was released in serial N+1) you now have packages that are not >>> installable via your mirror because of inconsistent state. >>> >>> If someone comes up with a better option that doesn't require a large >>> rearch of the storage code in PyPI I'm happy to review and deploy it. >>> Christian -- Christian Theune · c...@gocept.com gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany http://gocept.com · Tel +49 345 1229889-7 Python, Pyramid, Plone, Zope · consulting, development, hosting, operations >>> >>> >>> - >>> Donald Stufft >>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA >>> >>> ___ >>> Distutils-SIG maillist - Distutils-SIG@pytho
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
On Tue, May 28, 2013 at 10:23 -0400, Donald Stufft wrote: > On May 28, 2013, at 8:20 AM, Donald Stufft wrote: > > > > > On May 28, 2013, at 5:04 AM, Christian Theune wrote: > > > >> Hi, > >> > >> > >> On 27. May2013, at 10:41 PM, Donald Stufft wrote: > >>> Just to assure folks. I do consider Mirroring a first class citizen and > >>> an important feature. > >> > >> Thanks for that acknowledgement. Lets sort out what to do now - this is > >> becoming urgent for me as the author of the currently recommended > >> mirroring tool for public mirrors and as an operator of a mirror that is > >> being relied upon. > >> > >> I agree with Holgers points. > >> > >> I don't think the mirroring is completely backwards right now. I agree > >> there's been an incomplete PEP that's been hanging around too long. > >> > >> My current client implementation is pretty simple and has had reliable > >> semantics until now. > >> > >> A couple of things I noticed in the discussion that I'd like to point out: > >> > >> - We mirror simple pages because the PEP requires us to - this is part of > >> the existing validation approach. I can drop that to get mirrors not to > >> rely on simple pages from the CDN but then authentication of the simple > >> pages will be broken. > >> > >> - Release files are replaced all the time. > >> > >> The semantics that I like to keep with the mirrors is this: > >> > >> When I get a changelog for serial X and I start copying simple pages and > >> files then I (as a mirror) promise my clients that I have incorporated *at > >> least* all changes up until serial X (but maybe also partial changes from > >> X+n). > >> > >> I'm afraid that the mirrors data are now inconsistent - we can repair that > >> once we have a stable mirroring approach again, but until then people will > >> start getting annoyed again. 
> >> > >> I'm also concerned that I don't really have time to follow up on what's > >> happening with TUF regarding mirroring on top of what happened regarding > >> the CDN. My feeling is that will result in more fire fighting. > >> > >> So - what's the next step that can happen ASAP? > > > > Options) > > > > 1) When mirroring retain N minutes worth of old serials and redo them. > > Mirroring is idempotent you can repeat it with no negative side effects. > > Conditional HTTP requests should also be supported to minimize the > > bandwidth. > > 2) Wait a few seconds after fetching the change log to begin processing. > > 3) Use front.python.org with the pypi.python.org HOST header with the > > caveat this is not guaranteed to be stable in the long term. > > 4) ??? > > > > Option 4: We add the expected hash of the simple page to the change log. > Mirror clients can then assert their state consistent. > > Should also probably assert the file hashes that are in the simple index. Yes, I also thought of option 4. Is that easy to implement on the PyPI side? If we checksum the simple page, we need idempotent generation of simple pages and ordering to begin with, which is probably a good idea anyway. It doesn't need to be version ordering, just some consistent ordering. As mentioned in the other mail, for the short term I'd go for 3) once Noah and you confirm you are not going to kill it before we have settled on a new solution (maybe option 4). best, holger > > Of them 1) is more likely to give you the best > > results within > > the constraints of HTTP. All it takes is someone to run your mirroring > > script behind a caching proxy and pre-CDN you'd have the exact situation we > > have now. > > > > Mirroring is in a bad state because it comes (and has always) with > > absolutely no guarantees of consistency. You dismiss the issues of having > > serial n+1 changes, but that is a serious problem.
If you fetch up to > > serial N of package1 which has the released version of 1.0, and then you > > fetch serial N+2 of package2 which has a hard requirement on package 1.1 > > (which was released in serial N+1) you now have packages that are not > > installable via your mirror because of inconsistent state. > > > > If someone comes up with a better option that doesn't require a large > > rearch of the storage code in PyPI I'm happy to review and deploy it. > > > >> > >> Christian > >> > >> -- > >> Christian Theune · c...@gocept.com > >> gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany > >> http://gocept.com · Tel +49 345 1229889-7 > >> Python, Pyramid, Plone, Zope · consulting, development, hosting, operations > > > > > > - > > Donald Stufft > > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > > > > ___ > > Distutils-SIG maillist - Distutils-SIG@python.org > > http://mail.python.org/mailman/listinfo/distutils-sig > ___ > D
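Holger's point above is that a checksum of the simple page is only useful if the page is generated deterministically. A minimal sketch of that idea follows, assuming a simple page is just a list of file links. The function names `render_simple_page` and `page_digest`, the HTML shape, and the choice of SHA-256 are illustrative assumptions, not how PyPI actually renders or hashes its pages.

```python
import hashlib
from html import escape


def render_simple_page(project, files):
    """Render a simple-index page whose bytes depend only on its inputs.

    `files` maps filename -> URL (hypothetical shape). Sorting by filename
    gives "some consistent ordering"; it is deliberately not version ordering.
    """
    lines = ["<html><body>", "<h1>Links for %s</h1>" % escape(project)]
    for name in sorted(files):
        href = escape(files[name], quote=True)
        lines.append('<a href="%s">%s</a><br/>' % (href, escape(name)))
    lines.append("</body></html>")
    return "\n".join(lines).encode("utf-8")


def page_digest(page_bytes):
    """Hex digest that could be published next to the page (assumed SHA-256)."""
    return hashlib.sha256(page_bytes).hexdigest()
```

Because the links are sorted before rendering, two runs over the same set of files produce identical bytes and therefore the same digest, regardless of the order in which the files were stored or uploaded.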
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
On May 28, 2013, at 8:20 AM, Donald Stufft wrote: > > On May 28, 2013, at 5:04 AM, Christian Theune wrote: > >> Hi, >> >> >> On 27. May2013, at 10:41 PM, Donald Stufft wrote: >>> Just to assure folks. I do consider Mirroring a first class citizen and an >>> important feature. >> >> Thanks for that acknowledgement. Lets sort out what to do now - this is >> becoming urgent for me as the author of the currently recommended mirroring >> tool for public mirrors and as an operator of a mirror that is being relied >> upon. >> >> I agree with Holgers points. >> >> I don't think the mirroring is completely backwards right now. I agree >> there's been an incomplete PEP that's been hanging around too long. >> >> My current client implementation is pretty simple and has had reliable >> semantics until now. >> >> A couple of things I noticed in the discussion that I'd like to point out: >> >> - We mirror simple pages because the PEP requires us to - this is part of >> the existing validation approach. I can drop that to get mirrors not to rely >> on simple pages from the CDN but then authentication of the simple pages >> will be broken. >> >> - Release files are replaced all the time. >> >> The semantics that I like to keep with the mirrors is this: >> >> When I get a changelog for serial X and I start copying simple pages and >> files then I (as a mirror) promise my clients that I have incorporated *at >> least* all changes up until serial X (but maybe also partial changes from >> X+n). >> >> I'm afraid that the mirrors data are now inconsistent - we can repair that >> once we have a stable mirroring approach again, but until then people will >> start getting annoyed again. >> >> I'm also concerned that I don't really have time to follow up on what's >> happening with TUF regarding mirroring on top of what happened regarding the >> CDN. My feeling is that will result in more fire fighting. >> >> So - what's the next step that can happen ASAP? 
> > Options) > > 1) When mirroring retain N minutes worth of old serials and redo them. > Mirroring is idempotent you can repeat it with no negative side effects. > Conditional HTTP requests should also be supported to minimize the bandwidth. > 2) Wait a few seconds after fetching the change log to begin processing. > 3) Use front.python.org with the pypi.python.org HOST header with the caveat > this is not guaranteed to be stable in the long term. > 4) ??? > Option 4: We add the expected hash of the simple page to the change log. Mirror clients can then assert that their state is consistent. They should probably also verify the file hashes that are listed in the simple index. > Of them 1) is more likely to give you the best > results within > the constraints of HTTP. All it takes is someone to run your mirroring script > behind a caching proxy and pre-CDN you'd have the exact situation we have now. > > Mirroring is in a bad state because it comes (and has always) with absolutely > no guarantees of consistency. You dismiss the issues of having serial n+1 > changes, but that is a serious problem. If you fetch up to serial N of > package1 which has the released version of 1.0, and then you fetch serial N+2 > of package2 which has a hard requirement on package 1.1 (which was released > in serial N+1) you now have packages that are not installable via your mirror > because of inconsistent state. > > If someone comes up with a better option that doesn't require a large rearch > of the storage code in PyPI I'm happy to review and deploy it. > >> >> Christian >> >> -- >> Christian Theune · c...@gocept.com >> gocept gmbh & co.
kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany >> http://gocept.com · Tel +49 345 1229889-7 >> Python, Pyramid, Plone, Zope · consulting, development, hosting, operations > > > - > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA > > ___ > Distutils-SIG maillist - Distutils-SIG@python.org > http://mail.python.org/mailman/listinfo/distutils-sig ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
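Option 4 as proposed in this message could look roughly like the following on the mirror side. This is only a sketch under stated assumptions: the change log does not carry such a hash today, so the entry shape (`serial`, `project`, `simple_page_sha256`), the exception name and the use of SHA-256 are all hypothetical.

```python
import hashlib


class StalePageError(Exception):
    """Raised when a fetched simple page does not match the change log."""


def verify_simple_page(entry, page_bytes):
    """Check fetched page bytes against the hash announced in the change log.

    `entry` is a hypothetical change log record such as
    {"serial": 42, "project": "example", "simple_page_sha256": "<hex>"}.
    """
    expected = entry["simple_page_sha256"]
    actual = hashlib.sha256(page_bytes).hexdigest()
    if actual != expected:
        # Most likely a cached (stale) copy was served; the mirror can
        # retry later instead of recording the serial as synced.
        raise StalePageError(
            "%s at serial %d: expected %s, got %s"
            % (entry["project"], entry["serial"], expected, actual)
        )
    return actual
```

A mirror would call this after each download and only treat the serial as processed when no error is raised. The same pattern would apply to the release files whose hashes are listed in the simple index.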
Re: [Distutils] [Infrastructure] Good news everyone, PyPI is behind a CDN
On 28.05.2013 14:26, Nick Coghlan wrote: > On Tue, May 28, 2013 at 10:07 PM, Donald Stufft wrote: >> Moving to a CDN has been discussed before on either catalog-sig or >> distutils-sig (Can't recall which offhand). >> >> Weekly status updates were posted to the infrastructure list as well as the >> communication between us and Fastly as we ironed out SSL issues. >> >> The mirroring issue pre-invalidation was quickly corrected. We now >> invalidate and we are looking at a window that is at most a few seconds >> large. > > One of the things I (successfully) advocated for at PyCon US was to > open up the PEP process to cover things where python-dev aren't > directly involved, but we need an official avenue for publication of > significant changes in the Python ecosystem (with my main aim being to > empower distutils-sig as a place where we could actually making final > decisions about the evolution of the packaging ecosystem). > > Given that, an Informational PEP with Discussion-To set to > infrastructure-sig and Noah as BDFL-Delegate would be an eminently > suitable way of keeping PyPI users and mirror operators that *aren't* > following infrastructure-sig informed of upcoming changes that may > impact the operation of PyPI clients. > > infrastructure-sig has historically just been for backend hosting > details, without significant impact to *client* facing behaviour - > while I think it's fine to change that, it's also understandable that > most developers of PyPI clients wouldn't be aware of upcoming changes > that have only been discussed in detail on that list. I don't think the infra sig is the right host for such discussions and decisions. I'd suggest to use the distutils-sig and make Donald/Richard the PEP master for PyPI things, as they are maintaining it. > So, as Holger said, great work and thanks for your efforts, but good > communication does matter with these things. 
People don't like > surprises, even well intentioned ones :) We've had the CDN discussion for quite a while and I even setup a test CDN some months ago. No one ever mentioned the HTTP/1.0 problem and so it simply wasn't on the radar. -- Marc-Andre Lemburg eGenix.com
Re: [Distutils] Good news everyone, PyPI is behind a CDN
On May 28, 2013, at 8:26 AM, Nick Coghlan wrote: > On Tue, May 28, 2013 at 10:07 PM, Donald Stufft wrote: >> Moving to a CDN has been discussed before on either catalog-sig or >> distutils-sig (Can't recall which offhand). >> >> Weekly status updates were posted to the infrastructure list as well as the >> communication between us and Fastly as we ironed out SSL issues. >> >> The mirroring issue pre-invalidation was quickly corrected. We now >> invalidate and we are looking at a window that is at most a few seconds >> large. > > One of the things I (successfully) advocated for at PyCon US was to > open up the PEP process to cover things where python-dev aren't > directly involved, but we need an official avenue for publication of > significant changes in the Python ecosystem (with my main aim being to > empower distutils-sig as a place where we could actually making final > decisions about the evolution of the packaging ecosystem). > > Given that, an Informational PEP with Discussion-To set to > infrastructure-sig and Noah as BDFL-Delegate would be an eminently > suitable way of keeping PyPI users and mirror operators that *aren't* > following infrastructure-sig informed of upcoming changes that may > impact the operation of PyPI clients. > > infrastructure-sig has historically just been for backend hosting > details, without significant impact to *client* facing behaviour - > while I think it's fine to change that, it's also understandable that > most developers of PyPI clients wouldn't be aware of upcoming changes > that have only been discussed in detail on that list. > It is only a significant change if you make invalid assumptions about HTTP and consistent state between two requests. If you want to rely on that then let's talk about a system where we can reliably promise that. > So, as Holger said, great work and thanks for your efforts, but good > communication does matter with these things. 
People don't like > surprises, even well intentioned ones :) > Point taken. In the future I'll post any infrastructure upgrades I'm involved in not only to the infrastructure list but also to distutils-sig. > Cheers, > Nick. > > -- > Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] Good news everyone, PyPI is behind a CDN
On Tue, May 28, 2013 at 10:07 PM, Donald Stufft wrote: > Moving to a CDN has been discussed before on either catalog-sig or > distutils-sig (Can't recall which offhand). > > Weekly status updates were posted to the infrastructure list as well as the > communication between us and Fastly as we ironed out SSL issues. > > The mirroring issue pre-invalidation was quickly corrected. We now > invalidate and we are looking at a window that is at most a few seconds > large. One of the things I (successfully) advocated for at PyCon US was to open up the PEP process to cover things where python-dev isn't directly involved, but we need an official avenue for publication of significant changes in the Python ecosystem (with my main aim being to empower distutils-sig as a place where we could actually make final decisions about the evolution of the packaging ecosystem). Given that, an Informational PEP with Discussion-To set to infrastructure-sig and Noah as BDFL-Delegate would be an eminently suitable way of keeping PyPI users and mirror operators that *aren't* following infrastructure-sig informed of upcoming changes that may impact the operation of PyPI clients. infrastructure-sig has historically just been for backend hosting details, without significant impact on *client* facing behaviour - while I think it's fine to change that, it's also understandable that most developers of PyPI clients wouldn't be aware of upcoming changes that have only been discussed in detail on that list. So, as Holger said, great work and thanks for your efforts, but good communication does matter with these things. People don't like surprises, even well intentioned ones :) Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
On May 28, 2013, at 5:04 AM, Christian Theune wrote: > Hi, > > > On 27. May 2013, at 10:41 PM, Donald Stufft wrote: >> Just to assure folks. I do consider Mirroring a first class citizen and an >> important feature. > > Thanks for that acknowledgement. Let's sort out what to do now - this is > becoming urgent for me as the author of the currently recommended mirroring > tool for public mirrors and as an operator of a mirror that is being relied > upon. > > I agree with Holger's points. > > I don't think the mirroring is completely backwards right now. I agree > there's been an incomplete PEP that's been hanging around too long. > > My current client implementation is pretty simple and has had reliable > semantics until now. > > A couple of things I noticed in the discussion that I'd like to point out: > > - We mirror simple pages because the PEP requires us to - this is part of the > existing validation approach. I can drop that to get mirrors not to rely on > simple pages from the CDN but then authentication of the simple pages will be > broken. > > - Release files are replaced all the time. > > The semantics that I like to keep with the mirrors is this: > > When I get a changelog for serial X and I start copying simple pages and > files then I (as a mirror) promise my clients that I have incorporated *at > least* all changes up until serial X (but maybe also partial changes from > X+n). > > I'm afraid that the mirrors' data are now inconsistent - we can repair that > once we have a stable mirroring approach again, but until then people will > start getting annoyed again. > > I'm also concerned that I don't really have time to follow up on what's > happening with TUF regarding mirroring on top of what happened regarding the > CDN. My feeling is that will result in more fire fighting. > > So - what's the next step that can happen ASAP?

Options)

1) When mirroring, retain N minutes' worth of old serials and redo them. Mirroring is idempotent: you can repeat it with no negative side effects. Conditional HTTP requests should also be supported to minimize the bandwidth.
2) Wait a few seconds after fetching the change log before beginning processing.
3) Use front.python.org with the pypi.python.org Host header, with the caveat that this is not guaranteed to be stable in the long term.
4) ???

Of these, 1) is most likely to give you the best results within the constraints of HTTP. All it takes is someone running your mirroring script behind a caching proxy, and pre-CDN you'd have had the exact situation we have now.

Mirroring is in a bad state because it comes (and always has) with absolutely no guarantees of consistency. You dismiss the issue of picking up serial n+1 changes, but that is a serious problem. If you fetch up to serial N of package1, which has the released version 1.0, and then you fetch serial N+2 of package2, which has a hard requirement on package1 1.1 (released in serial N+1), you now have packages that are not installable via your mirror because of inconsistent state.

If someone comes up with a better option that doesn't require a large re-architecture of the storage code in PyPI, I'm happy to review and deploy it.

> > Christian > > -- > Christian Theune · c...@gocept.com > gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany > http://gocept.com · Tel +49 345 1229889-7 > Python, Pyramid, Plone, Zope · consulting, development, hosting, operations >

- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
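Option 1 from the message above can be sketched as two small helpers: one that picks the projects touched within the last N seconds so they can be re-synced, and one that builds conditional request headers so that unchanged pages cost little bandwidth. The change log tuple shape `(project, timestamp, serial)` and both function names are assumptions made for illustration, and whether the server actually sends `ETag` or `Last-Modified` validators is not something this thread confirms.

```python
def projects_to_replay(changelog, now, window_seconds):
    """Return projects changed within the replay window, in first-seen order.

    `changelog` is an iterable of (project, timestamp, serial) tuples
    (hypothetical shape). Re-syncing these is safe because mirroring a
    project twice has no negative side effects.
    """
    cutoff = now - window_seconds
    replay = []
    for project, timestamp, _serial in changelog:
        if timestamp >= cutoff and project not in replay:
            replay.append(project)
    return replay


def conditional_headers(cached_etag=None, cached_last_modified=None):
    """Build headers for a conditional GET from previously stored validators."""
    headers = {}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    if cached_last_modified:
        headers["If-Modified-Since"] = cached_last_modified
    return headers
```

On each run the mirror would process new change log entries as usual, then additionally re-fetch everything returned by `projects_to_replay`, sending the stored validators so that a `304 Not Modified` response ends the request early.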
Re: [Distutils] Good news everyone, PyPI is behind a CDN
On May 28, 2013, at 2:57 AM, holger krekel wrote: > On Tue, May 28, 2013 at 07:42 +0100, Paul Moore wrote: >> On 28 May 2013 02:53, Donald Stufft wrote: >> >>> Figured it out. >>> >>> Use HTTPS. >>> >> >> Can I suggest that if the new CDN means that there are additional >> restrictions on what is supported (I've used the XMLRPC API without https >> in one-off scripts in the past) then the officially supported API should be >> properly documented once and for all in a PEP, including some sort of >> "what's new" or "rationale" section describing the various changes that >> have occurred recently and their impact on user code? > > I second this. I am building tools that interact with PyPI and people > and customers are using them. I don't want to find a switch announced > which breaks them and then hear "sorry, that's the future now" without > this future being documented and discussed before the fact. The PyPI > infrastructure and its supported tool interactions today are as important as > evolving the language itself so PEPs are warranted. As with PEP438 i am > willing to help this process. > >> I'm purely a casual user of the PyPI API and the discussion of these >> changes haa mostly gone over my head. The one thing I've taken away from it >> is that I may get problems if I just google for sample code to use. For >> example, the above comment implies that >> http://wiki.python.org/moin/PyPIXmlRpc (AIUI, the nearest to formal >> documentation that the XMLRPC API has) is wrong (as it uses http). >> >> I do appreciate all the work that is going on to improve the PyPI >> infrastructure. I'm not saying the changes should be reverted, just that >> the consequences should be clearly explained. > > I also appreciate Noah's and Donald's CDN work here, up to the point where > it breaks things for unclear reasons. Reasons which might very well > be valid, nevertheless! 
> > best, > holger Moving to a CDN has been discussed before on either catalog-sig or distutils-sig (Can't recall which offhand). Weekly status updates were posted to the infrastructure list as well as the communication between us and Fastly as we ironed out SSL issues. The mirroring issue pre-invalidation was quickly corrected. We now invalidate and we are looking at a window that is at most a few seconds large. - Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
Hi, On 27. May 2013, at 10:41 PM, Donald Stufft wrote: > Just to assure folks. I do consider Mirroring a first class citizen and an > important feature. Thanks for that acknowledgement. Let's sort out what to do now - this is becoming urgent for me as the author of the currently recommended mirroring tool for public mirrors and as an operator of a mirror that is being relied upon. I agree with Holger's points. I don't think the mirroring is completely backwards right now. I agree there's been an incomplete PEP that's been hanging around too long. My current client implementation is pretty simple and has had reliable semantics until now. A couple of things I noticed in the discussion that I'd like to point out: - We mirror simple pages because the PEP requires us to - this is part of the existing validation approach. I can drop that to get mirrors not to rely on simple pages from the CDN, but then authentication of the simple pages will be broken. - Release files are replaced all the time. The semantics that I would like to keep with the mirrors are these: When I get a changelog for serial X and I start copying simple pages and files, then I (as a mirror) promise my clients that I have incorporated *at least* all changes up until serial X (but maybe also partial changes from X+n). I'm afraid that the mirrors' data are now inconsistent - we can repair that once we have a stable mirroring approach again, but until then people will start getting annoyed again. I'm also concerned that I don't really have time to follow up on what's happening with TUF regarding mirroring on top of what happened regarding the CDN. My feeling is that this will result in more fire fighting. So - what's the next step that can happen ASAP? Christian -- Christian Theune · c...@gocept.com gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany http://gocept.com · Tel +49 345 1229889-7 Python, Pyramid, Plone, Zope · consulting, development, hosting, operations
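The "at least all changes up until serial X" promise described in this message can be sketched as a small sync step. This is an illustration under assumptions, not the actual mirroring client: `sync_once`, the `state` dictionary, and the `fetch_changelog` and `copy_project` callables are hypothetical names standing in for whatever the real tool uses.

```python
def sync_once(state, fetch_changelog, copy_project):
    """Run one mirror pass and advance the recorded serial.

    `state` holds the last fully incorporated serial under "synced_serial".
    `fetch_changelog(since)` is assumed to return (head_serial, projects)
    for everything changed after `since`. `copy_project(name)` copies the
    simple page and files for one project.
    """
    head_serial, projects = fetch_changelog(state["synced_serial"])
    for project in projects:
        # A copy may already include changes newer than head_serial; that
        # is the "partial changes from X+n" part of the promise.
        copy_project(project)
    # Only advance once every copy succeeded, so the mirror never claims
    # a serial it has not fully incorporated.
    state["synced_serial"] = head_serial
    return state
```

The guarantee breaks down exactly where the thread says it does: if `copy_project` is served a stale cached page, the serial still advances even though the change was not incorporated.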
Re: [Distutils] PyPI Download Counts
I'll also be at EP and can help to explain how Nix could solve the isolation problem. On Mon, May 27, 2013 at 5:46 PM, holger krekel wrote: > Hi Florian, > > On Mon, May 27, 2013 at 10:36 +0200, Florian Friesdorf wrote: > > Hi Holger, > > > > holger krekel writes: > > > On Mon, May 27, 2013 at 17:41 +1000, Nick Coghlan wrote: > > >> On Mon, May 27, 2013 at 5:27 PM, holger krekel > wrote: > > >> > Not having download counts maybe lets us think harder about > > >> > better metrics. The number of projects using a package as a dep > > >> > might be one. > > >> > > >> With the current downside being that it's hard for PyPI to figure out > > >> that number, too :) > > > > > > Yip. But something like Vinaj's red-dove approach or Marius' > get_deps.py > > > could provide a base. We might think about a docker instance which > > > could allow to quickly spawn new light VMs so we can isolate setup.py > runs. > > > (Yes, it's only Linux but it'd be a start). > > > > nix and nixpkgs allow this isolation on-top off linux, freebsd, OS X and > > theoretically also cygwin (not sure how good cygwin is supported at the > > moment). > > > > http://nixos.org/nix/ > > http://nixos.org/nixpkgs/ > > > > From nixos.org: > > Nix is a purely functional package manager. This means that it can > > ensure that an upgrade to one package cannot break others, that you can > > always roll back to previous version, that multiple versions of a > > package can coexist on the same system, and much more. > > > > Nixpkgs is a large collection of packages that can be installed with the > > Nix package manager. > > Interesting stuff, didn't know about it. > > Did you post this as a suggestion for provisioning an environment to > run setup.py (on nix-supported platforms)? If so, i am not sure how it > would help exactly. I guess myself i'd aim for a 80% solution for > discovering > dependencies first. 
Simplest/quickest wins there :) > > > >> Agreed it would be a good number to publish once it's more readily > > >> available, too. > > > > > > I think "dep" numbers are mostly interesting for libraries, not so > > > much for applications like django or pyramid or tools like nose/pytest. > > > > > > Another more practical data point would be "does this package even > > > install on win32/linux/osx py26/py27/py33" and even better, do its > automated > > > tests pass? > > > > http://hydra.nixos.org/build/5062796 > > > > > If we could evolve to have this info published on pypi.python.org > > > it would be quite useful i think. I am actually currently implementing > > > a system which enables this (the "devpi" system) so i don't mean this > all just > > > as "nice to have" theory. I aim to present the status of this work > > > at EuroPython. > > > > Nice! Looking forward to that. > > > > If you have any questions about nix/nixpkgs/nixos, especially about the > > way python packages are packaged, please let me know. Also, it's not set > > in stone. > > are you going to be at EP? It's a long conference and i am more than > happy to sit together on this topic for a bit sometimes. > > best, > holger > > > > Personally, I'd love to see hydra.python.org providing builds of all > > pypi packages and would be happy to help. Also including Domen and Rok > > for whome I assume the same. > > > > > > You might have other tools that are better suited for you. > > > > regards > > florian > > -- > > Florian Friesdorf > > GPG FPR: 7A13 5EEE 1421 9FC2 108D BAAF 38F8 99A3 0C45 F083 > > Jabber/XMPP: f...@chaoflow.net > > IRC: chaoflow on freenode,ircnet,blafasel,OFTC > > > ___ Distutils-SIG maillist - Distutils-SIG@python.org http://mail.python.org/mailman/listinfo/distutils-sig
Re: [Distutils] Good news everyone, PyPI is behind a CDN
On May 28, 2013, at 2:42 AM, Paul Moore wrote: > On 28 May 2013 02:53, Donald Stufft wrote: > Figured it out. > > Use HTTPS. > > Can I suggest that if the new CDN means that there are additional > restrictions on what is supported (I've used the XMLRPC API without https in > one-off scripts in the past) then the officially supported API should be > properly documented once and for all in a PEP, including some sort of "what's > new" or "rationale" section describing the various changes that have occurred > recently and their impact on user code? > > I'm purely a casual user of the PyPI API and the discussion of these changes > has mostly gone over my head. The one thing I've taken away from it is that I > may get problems if I just google for sample code to use. For example, the > above comment implies that http://wiki.python.org/moin/PyPIXmlRpc (AIUI, the > nearest to formal documentation that the XMLRPC API has) is wrong (as it uses > http). > > I do appreciate all the work that is going on to improve the PyPI > infrastructure. I'm not saying the changes should be reverted, just that the > consequences should be clearly explained. > > Paul. To be quite honest, the HTTP 1.0 over plain HTTP issue simply wasn't discovered in testing. The http URL works fine on Python 2.7 (which I'm assuming uses HTTP 1.1). I'm not completely happy that HTTP is broken in Python 2.6 (and, I'm assuming, earlier) and have it on my list to see if there's anything that can be done. That being said, the most future-compatible way will be to use the HTTPS URL for any interaction (and ideally verify the SSL certificate, but the built-in XMLRPC library doesn't do that). My "Use HTTPS" was more about how to solve the issue *right now*. Documentation should be updated to point to HTTPS though.
- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
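The "use the HTTPS URL and ideally verify the certificate" advice above can be sketched with the standard library. Note the assumptions: this uses Python 3's `xmlrpc.client` rather than the Python 2 `xmlrpclib` that was current when this thread was written, the helper names `force_https` and `make_pypi_proxy` are made up for illustration, and the endpoint is the one quoted in the thread, which may no longer be the right one. No request is sent by this code; it only constructs the proxy.

```python
import ssl
import xmlrpc.client
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Rewrite an http:// URL to https://, leaving other URLs untouched."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


def make_pypi_proxy(url="http://pypi.python.org/pypi"):
    """Build an XML-RPC proxy that talks HTTPS with certificate checks.

    ssl.create_default_context() enables certificate and hostname
    verification; passing it explicitly makes that choice visible.
    """
    context = ssl.create_default_context()
    return xmlrpc.client.ServerProxy(force_https(url), context=context)
```

Old sample code that hard-codes the `http://` endpoint can be kept working by passing its URL through `force_https` before building the proxy.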
Re: [Distutils] changelog / CDN inconsistency (was: Re: Good news everyone, PyPI is behind a CDN)
On Tue, May 28, 2013 at 11:04 +0200, Christian Theune wrote: > On 27. May2013, at 10:41 PM, Donald Stufft wrote: > > Just to assure folks. I do consider Mirroring a first class citizen and an > > important feature. > > Thanks for that acknowledgement. Lets sort out what to do now - this is > becoming urgent for me as the author of the currently recommended mirroring > tool for public mirrors and as an operator of a mirror that is being relied > upon. > > I agree with Holgers points. > > I don't think the mirroring is completely backwards right now. I agree > there's been an incomplete PEP that's been hanging around too long. > > My current client implementation is pretty simple and has had reliable > semantics until now. > > A couple of things I noticed in the discussion that I'd like to point out: > > - We mirror simple pages because the PEP requires us to - this is part of the > existing validation approach. I can drop that to get mirrors not to rely on > simple pages from the CDN but then authentication of the simple pages will be > broken. > > - Release files are replaced all the time. > > The semantics that I like to keep with the mirrors is this: > > When I get a changelog for serial X and I start copying simple pages and > files then I (as a mirror) promise my clients that I have incorporated *at > least* all changes up until serial X (but maybe also partial changes from > X+n). > > I'm afraid that the mirrors data are now inconsistent - we can repair that > once we have a stable mirroring approach again, but until then people will > start getting annoyed again. > > I'm also concerned that I don't really have time to follow up on what's > happening with TUF regarding mirroring on top of what happened regarding the > CDN. My feeling is that will result in more fire fighting. > > So - what's the next step that can happen ASAP? 
The immediate way to get around the CDN/mirroring problems, and to revert to the pre-CDN consistency level, is to use the same access that Fastly uses to get updates from pypi.python.org, namely a request to front.python.org with a Host header. I have this info from Donald with the caveat that it's not guaranteed to remain possible. Maybe Noah could agree not to remove this facility without the current actors being on board with the change? (I am also fine with having a dedicated domain instead, of course.) Once this is settled, we can move on to fixing current tools and deployments, and afterwards think about future improvements without the current urgency. holger > Christian > > -- > Christian Theune · c...@gocept.com > gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany > http://gocept.com · Tel +49 345 1229889-7 > Python, Pyramid, Plone, Zope · consulting, development, hosting, operations >
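The workaround described above, a request to front.python.org carrying the pypi.python.org Host header, could be built like this with the standard library. It is a sketch only: the helper name `origin_request`, the default `https` scheme and the `/simple/<project>/` path are assumptions, the host names are the ones quoted in the thread and (as Donald warned) were never guaranteed to stay available, and TLS certificate checks against the origin name may fail. The code only constructs the request object and does not send it.

```python
import urllib.request


def origin_request(path, origin="front.python.org",
                   public_host="pypi.python.org", scheme="https"):
    """Build a request that connects to `origin` but asks for `public_host`.

    The connection goes to the origin server, bypassing the CDN cache,
    while the explicit Host header tells it which site is wanted.
    """
    url = "%s://%s%s" % (scheme, origin, path)
    return urllib.request.Request(url, headers={"Host": public_host})
```

A mirroring tool would pass the resulting object to `urllib.request.urlopen` in place of its usual URL, leaving the rest of its logic unchanged.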