On May 27, 2013, at 2:54 PM, holger krekel <hol...@merlinux.eu> wrote:
> On Mon, May 27, 2013 at 13:50 -0400, Donald Stufft wrote:
>> On May 27, 2013, at 12:39 PM, Donald Stufft <don...@stufft.io> wrote:
>>
>>>
>>> On May 27, 2013, at 8:08 AM, holger krekel <hol...@merlinux.eu> wrote:
>>>
>>>> Hi Noah, Donald, (CC also Richard, Christian),
>>>>
>>>> i just checked with a test package and think we might have a cache
>>>> consistency / changelog API problem. It took me a while but here is
>>>> the basic thing: I uploaded a test package, changelog API reports it has
>>>> changed, then i go to its simple page, and some of the time the new release
>>>> file shows up, sometimes not.
>>>>
>>>> Tools like bandersnatch, pep381 and devpi-server (and probably others)
>>>> use PyPI's changelog API to determine if there are changes. It seems
>>>> those changes are signalled faster than they become consistently
>>>> accessible through the CDN. This can lead to inconsistent mirrors because
>>>> when the CDN has the files there is no change event anymore. Such mirrors
>>>> are run by companies in-house so i think it's a real problem.
>>>>
>>>> Even without mirroring there can be problems because installs are not
>>>> directly repeatable: "pip install XYZ>=2.0" can give you first 2.0.1,
>>>> then 2.0.0 a minute later. I had hoped that a particular ip address
>>>> sees things consistently.
>>>>
>>>> I am not familiar with Fastly's caching properties -- can they notify
>>>> about the fact that a page/file is consistently up-to-date everywhere?
>>>> Or can the cache be globally invalidated for a particular page/file?
>>>> Any other ideas?
>>>>
>>>> Failing customizing Fastly usage and also maybe for the short term,
>>>> is/could there be a special location provided by pypi.python.org which
>>>> the above tools could use to get at the actual non-cached data? We
>>>> could then maybe mitigate the problem through updates of the respective
>>>> tools.
>>>> That would at least solve the problem for one of my customers i think.
>>>>
>>>> best,
>>>> holger
>>>>
>>>>
>>>> On Sun, May 26, 2013 at 10:34 -0700, Noah Kantrowitz wrote:
>>>>> </farnsworth>
>>>>>
>>>>> but seriously, at long last today it was my honor to throw the DNS switch
>>>>> to move PyPI to the Fastly caching CDN. I would like to thank Donald
>>>>> Stufft for doing much of the heavy lifting on the PyPI side, and to
>>>>> Fastly for graciously offering to host us. What does this mean for
>>>>> everyone? Well the biggest change is PyPI should get a whole lot faster.
>>>>> There are two major downsides however. There will now be a delay of
>>>>> several minutes in some cases between updating a package and having it be
>>>>> installable, and download counts will now be even more incorrect than
>>>>> they were before. The PyPI admins are discussing what to do about
>>>>> download counts long-term, but for now we all feel that the performance
>>>>> and availability benefits outweigh the loss. If anyone has any questions,
>>>>> or hears anything about issues with PyPI please don't hesitate to contact
>>>>> me.
>>>>>
>>>>> --Noah
>>>>>
>>>>
>>>>
>>>>
>>>>> _______________________________________________
>>>>> Distutils-SIG maillist - Distutils-SIG@python.org
>>>>> http://mail.python.org/mailman/listinfo/distutils-sig
>>>>
>>>
>>> I mentioned it on twitter but might as well mention it here as well.
>>>
>>> Currently there is no invalidation going on. The effect on the mirroring
>>> was unanticipated and I'm currently getting the invalidation API setup
>>> within PyPI.
>>>
>>> -----------------
>>> Donald Stufft
>>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>>>
>>
>>
>>
>> /simple/ Pages should now be immediately invalidated when a new package is
>> released.
>
> thanks Donald. Looking at the implementation, i wonder what happens if
> after ``self._conn.commit()`` a changelog API call arrives, returns changes
> and a client uses it to retrieve changes before the fastly-purging takes
> place. It's still a potential race-condition or am i missing something?
>
> best,
> holger
>
>> -----------------
>> Donald Stufft
>> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>>
>
>

There's no way around a race condition. ``self._conn.commit()`` is what
makes the changes available. If we purge before committing and someone hits
the page between the purge and the ``self._conn.commit()``, the page gets
cached again in its pre-update state, and clients will see that stale page
while the changelog appears to be updated. That is essentially the same
problem we have now. The current implementation does mean that a client who
happens to hit the page between the commit and the purge will see old data;
however, that's pretty unlikely.

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
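
The ordering argument in the reply above can be checked with a toy model.
This is only a sketch under stated assumptions: Origin, EdgeCache,
run_scenario and the file names are hypothetical stand-ins, not PyPI's or
Fastly's actual code, and the cache has a single node and no expiry. It runs
the two steps in either order with one client request landing in between.

```python
"""Toy model of the commit/purge ordering discussed in this thread.

All names here are hypothetical; this is not PyPI's or Fastly's code.
"""

OLD_FILE = "XYZ-2.0.0.tar.gz"
NEW_FILE = "XYZ-2.0.1.tar.gz"


class Origin:
    """Stands in for the PyPI database plus its changelog."""

    def __init__(self):
        self.files = [OLD_FILE]
        self.changelog_serial = 0

    def commit(self, filename):
        # After this call both the simple page and the changelog show the release.
        self.files.append(filename)
        self.changelog_serial += 1


class EdgeCache:
    """Stands in for one CDN node caching the /simple/ page (no expiry modelled)."""

    def __init__(self, origin):
        self.origin = origin
        self.page = None

    def get(self):
        if self.page is None:  # cache miss: fetch from the origin and store it
            self.page = list(self.origin.files)
        return list(self.page)

    def purge(self):
        self.page = None


def run_scenario(order):
    """Run the two steps in `order` with one client request between them.

    Returns (changelog serial seen by that client, page seen by that client,
    page served after both steps have finished).
    """
    origin = Origin()
    cache = EdgeCache(origin)
    cache.get()  # warm the cache with the pre-release page
    steps = {"commit": lambda: origin.commit(NEW_FILE), "purge": cache.purge}

    first, second = order
    steps[first]()
    seen_serial, seen_page = origin.changelog_serial, cache.get()
    steps[second]()
    return seen_serial, seen_page, cache.get()


if __name__ == "__main__":
    for order in (("commit", "purge"), ("purge", "commit")):
        serial, during, after = run_scenario(order)
        print(order, "| changelog:", serial, "| during:", during, "| after:", after)
```

In this model, commit-then-purge shows the in-between client an updated
changelog with a stale page, but the page is fresh once the purge has run.
Purge-then-commit lets the in-between request refill the cache with the old
page, which then stays stale after the commit until something else evicts it.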