On Tue, May 28, 2013 at 10:23 -0400, Donald Stufft wrote:
> On May 28, 2013, at 8:20 AM, Donald Stufft <don...@stufft.io> wrote:
> >
> > On May 28, 2013, at 5:04 AM, Christian Theune <c...@gocept.com> wrote:
> >
> >> Hi,
> >>
> >>
> >> On 27. May 2013, at 10:41 PM, Donald Stufft <don...@stufft.io> wrote:
> >>> Just to assure folks. I do consider Mirroring a first class citizen and
> >>> an important feature.
> >>
> >> Thanks for that acknowledgement. Let's sort out what to do now - this is
> >> becoming urgent for me as the author of the currently recommended
> >> mirroring tool for public mirrors and as an operator of a mirror that is
> >> being relied upon.
> >>
> >> I agree with Holger's points.
> >>
> >> I don't think the mirroring is completely backwards right now. I agree
> >> there's been an incomplete PEP that's been hanging around too long.
> >>
> >> My current client implementation is pretty simple and has had reliable
> >> semantics until now.
> >>
> >> A couple of things I noticed in the discussion that I'd like to point out:
> >>
> >> - We mirror simple pages because the PEP requires us to - this is part of
> >> the existing validation approach. I can drop that to get mirrors not to
> >> rely on simple pages from the CDN but then authentication of the simple
> >> pages will be broken.
> >>
> >> - Release files are replaced all the time.
> >>
> >> The semantics that I'd like to keep with the mirrors are these:
> >>
> >> When I get a changelog for serial X and I start copying simple pages and
> >> files then I (as a mirror) promise my clients that I have incorporated *at
> >> least* all changes up until serial X (but maybe also partial changes from
> >> X+n).
> >>
> >> I'm afraid that the mirrors' data are now inconsistent - we can repair that
> >> once we have a stable mirroring approach again, but until then people will
> >> start getting annoyed again.
> >>
> >> I'm also concerned that I don't really have time to follow up on what's
> >> happening with TUF regarding mirroring on top of what happened regarding
> >> the CDN. My feeling is that will result in more fire fighting.
> >>
> >> So - what's the next step that can happen ASAP?
> >
> > Options)
> >
> > 1) When mirroring retain N minutes' worth of old serials and redo them.
> > Mirroring is idempotent; you can repeat it with no negative side effects.
> > Conditional HTTP requests should also be supported to minimize the
> > bandwidth.
> > 2) Wait a few seconds after fetching the change log to begin processing.
> > 3) Use front.python.org with the pypi.python.org HOST header with the
> > caveat this is not guaranteed to be stable in the long term.
> > 4) ???
> >
>
> Option 4: We add the expected hash of the simple page to the change log.
> Mirror clients can then assert their state is consistent.
>
> Should also probably assert the file hashes that are in the simple index.
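For illustration only, option 4 could look roughly like the sketch below on the mirror side. Everything named here is an assumption and not part of PyPI or any existing mirroring client: the changelog entry is taken to be a dict with a hypothetical `simple_page_sha256` field, the page layout is simplified, and SHA-256 and MD5 are arbitrary choices for the page and file digests. The point is only that a checksum is usable if the page is rendered with some consistent ordering.

```python
import hashlib


def render_simple_page(project, files):
    """Render a simplified simple page deterministically.

    `files` maps filename -> hex digest of the release file. Sorting the
    entries means the same content always yields the same bytes.
    (Hypothetical layout, not the real PyPI page format.)
    """
    lines = [f"<h1>Links for {project}</h1>"]
    for filename, digest in sorted(files.items()):
        lines.append(f'<a href="{filename}#md5={digest}">{filename}</a><br/>')
    return "\n".join(lines).encode("utf-8")


def page_hash(page_bytes):
    """Hex SHA-256 of a rendered page (the digest choice is an assumption)."""
    return hashlib.sha256(page_bytes).hexdigest()


def page_matches(entry, fetched_page):
    """True if the fetched page matches the hash announced in the
    (hypothetical) changelog entry field `simple_page_sha256`."""
    return page_hash(fetched_page) == entry["simple_page_sha256"]


def file_matches(data, expected_md5):
    """True if a downloaded release file matches the digest in the index."""
    return hashlib.md5(data).hexdigest() == expected_md5
```

A mirror that sees a mismatch would treat the page as stale (for example, served from a cache) and fetch it again later instead of recording the serial as done.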
yes, i also thought of option 4. Is that easy to implement on the
side of pypi? If we checksum the simple-page, we need idempotent
generation of simple pages and ordering to begin with -- which is
probably a good idea anyway. It doesn't need to be version-ordering,
just some consistent ordering.

As mentioned in the other mail, for the short-term i'd go for 3) once
Noah and you confirm you are not going to kill it before we have settled
on a new solution (maybe option 4).

best,
holger

> > Of them 1) is more likely to give you the best
> > results http://mail.python.org/pipermail/distutils-sig/2013-May/020855.html
> > the constraints of HTTP. All it takes is someone to run your mirroring
> > script behind a caching proxy and pre-CDN you'd have the exact situation we
> > have now.
> >
> > Mirroring is in a bad state because it comes (and always has) with
> > absolutely no guarantees of consistency. You dismiss the issues of having
> > serial n+1 changes, but that is a serious problem. If you fetch up to
> > serial N of package1 which has the released version of 1.0, and then you
> > fetch serial N+2 of package2 which has a hard requirement on package1 1.1
> > (which was released in serial N+1) you now have packages that are not
> > installable via your mirror because of inconsistent state.
> >
> > If someone comes up with a better option that doesn't require a large
> > rearch of the storage code in PyPI I'm happy to review and deploy it.
> >
> >>
> >> Christian
> >>
> >> --
> >> Christian Theune · c...@gocept.com
> >> gocept gmbh & co.
kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany
> >> http://gocept.com · Tel +49 345 1229889-7
> >> Python, Pyramid, Plone, Zope · consulting, development, hosting, operations
> >
> >
> > -----------------
> > Donald Stufft
> > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> >
> > _______________________________________________
> > Distutils-SIG maillist - Distutils-SIG@python.org
> > http://mail.python.org/mailman/listinfo/distutils-sig