On Tue, May 28, 2013 at 10:23 -0400, Donald Stufft wrote:
> On May 28, 2013, at 8:20 AM, Donald Stufft <don...@stufft.io> wrote:
> 
> > 
> > On May 28, 2013, at 5:04 AM, Christian Theune <c...@gocept.com> wrote:
> > 
> >> Hi,
> >> 
> >> 
> >> On 27. May 2013, at 10:41 PM, Donald Stufft <don...@stufft.io> wrote:
> >>> Just to assure folks. I do consider Mirroring a first class citizen and 
> >>> an important feature.
> >> 
> >> Thanks for that acknowledgement. Let's sort out what to do now - this is 
> >> becoming urgent for me as the author of the currently recommended 
> >> mirroring tool for public mirrors and as an operator of a mirror that is 
> >> being relied upon.
> >> 
> >> I agree with Holger's points.
> >> 
> >> I don't think the mirroring is completely backwards right now. I agree 
> >> there's been an incomplete PEP that's been hanging around too long. 
> >> 
> >> My current client implementation is pretty simple and has had reliable 
> >> semantics until now.
> >> 
> >> A couple of things I noticed in the discussion that I'd like to point out:
> >> 
> >> - We mirror simple pages because the PEP requires us to - this is part of 
> >> the existing validation approach. I can drop that to get mirrors not to 
> >> rely on simple pages from the CDN but then authentication of the simple 
> >> pages will be broken.
> >> 
> >> - Release files are replaced all the time.
> >> 
> >> The semantics that I would like to keep with the mirrors is this:
> >> 
> >> When I get a changelog for serial X and I start copying simple pages and 
> >> files then I (as a mirror) promise my clients that I have incorporated *at 
> >> least* all changes up until serial X (but maybe also partial changes from 
> >> X+n).
> >> 
> >> I'm afraid that the mirrors' data are now inconsistent - we can repair that 
> >> once we have a stable mirroring approach again, but until then people will 
> >> start getting annoyed again. 
> >> 
> >> I'm also concerned that I don't really have time to follow up on what's 
> >> happening with TUF regarding mirroring on top of what happened regarding 
> >> the CDN. My feeling is that will result in more fire fighting.
> >> 
> >> So - what's the next step that can happen ASAP?
> > 
> > Options:
> > 
> > 1) When mirroring, retain N minutes' worth of old serials and redo them. 
> > Mirroring is idempotent, so you can repeat it with no negative side effects. 
> > Conditional HTTP requests should also be supported to minimize the 
> > bandwidth.
> > 2) Wait a few seconds after fetching the change log to begin processing.
> > 3) Use front.python.org with the pypi.python.org Host header, with the 
> > caveat that this is not guaranteed to be stable in the long term.
> > 4) ???
> > 
> 
> Option 4: We add the expected hash of the simple page to the change log. 
> Mirror clients can then assert that their state is consistent.
> 
> They should probably also verify the file hashes that are in the simple index. 

yes, i also thought of option 4.  Is that easy to implement on the side of pypi?
If we checksum the simple-page, we need idempotent generation of simple pages
and ordering to begin with -- which is probably a good idea anyway.  
It doesn't need to be version-ordering, just some consistent ordering.
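
To illustrate the point about consistent ordering, here is a minimal sketch of
deterministic simple-page generation plus the check a mirror could run against
a hash published in the changelog. Everything in it is an assumption made for
illustration: the function names, the page markup, the href layout and the
choice of SHA-256 are hypothetical and are not taken from PyPI's code.

```python
import hashlib
from html import escape


def render_simple_page(project, files):
    """Render a simple page for `project` (hypothetical markup).

    `files` is an iterable of (filename, md5_hex) pairs. The entries are
    sorted by filename so that the same set of files always yields the
    same bytes, whatever order the storage layer returns them in.
    """
    lines = ["<html><head><title>Links for %s</title></head><body>"
             % escape(project)]
    for filename, digest in sorted(files):
        lines.append('<a href="%s#md5=%s">%s</a><br/>'
                     % (escape(filename), escape(digest), escape(filename)))
    lines.append("</body></html>")
    return "\n".join(lines).encode("utf-8")


def page_digest(page_bytes):
    """Hex digest of the page as it would appear in the changelog (assumed SHA-256)."""
    return hashlib.sha256(page_bytes).hexdigest()


def mirror_is_consistent(fetched_page, expected_digest):
    """True if the page a mirror fetched matches the digest from the changelog."""
    return page_digest(fetched_page) == expected_digest
```

With this, two renderings of the same file set compare equal byte for byte
regardless of input order, so a mirror that fetched a stale page from a cache
would see a digest mismatch and could retry instead of recording the serial as
done.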

As mentioned in the other mail, for the short-term i'd go for 3) once Noah
and you confirm you are not going to kill it before we have settled on
a new solution (maybe option 4). 

best,
holger


> > Of them 1) is more likely to give you the best results within 
> > the constraints of HTTP. All it takes is someone to run your mirroring 
> > script behind a caching proxy and pre-CDN you'd have the exact situation we 
> > have now.
> > 
> > Mirroring is in a bad state because it comes (and always has) with 
> > absolutely no guarantees of consistency. You dismiss the issues of having 
> > serial N+1 changes, but that is a serious problem. If you fetch up to 
> > serial N of package1, which has a released version of 1.0, and then you 
> > fetch serial N+2 of package2, which has a hard requirement on package1 1.1 
> > (which was released in serial N+1), you now have packages that are not 
> > installable via your mirror because of inconsistent state.
> > 
> > If someone comes up with a better option that doesn't require a large 
> > re-architecture of the storage code in PyPI, I'm happy to review and deploy it.
> > 
> >> 
> >> Christian
> >> 
> >> -- 
> >> Christian Theune · c...@gocept.com
> >> gocept gmbh & co. kg · Forsterstraße 29 · 06112 Halle (Saale) · Germany
> >> http://gocept.com · Tel +49 345 1229889-7
> >> Python, Pyramid, Plone, Zope · consulting, development, hosting, operations
> > 
> > 
> > -----------------
> > Donald Stufft
> > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> > 
> > _______________________________________________
> > Distutils-SIG maillist  -  Distutils-SIG@python.org
> > http://mail.python.org/mailman/listinfo/distutils-sig
