Another benefit of Winston's proposal is that it makes it easy to install specific package versions from source. For the time being I'm using a construct like https://github.com/inbo/Rstable/blob/master/cran_install.sh to generate a Docker image.
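To illustrate why a single directory would help: with the present layout, a script that wants one specific source version has to look in two places. Below is a minimal sketch in Python. The function name is hypothetical and this is not what cran_install.sh does. It assumes the usual CRAN layout, with the current version in src/contrib/ and older versions in src/contrib/Archive/<package>/.

```python
# Sketch: where a given source tarball may live on a CRAN mirror.
# Assumption: current version in src/contrib/, older versions in
# src/contrib/Archive/<package>/. The function name is hypothetical.

def candidate_source_urls(package, version, mirror="https://cloud.r-project.org"):
    """Return the URLs to try, in order, for one package version."""
    tarball = f"{package}_{version}.tar.gz"
    base = mirror.rstrip("/") + "/src/contrib"
    return [
        f"{base}/{tarball}",                    # if this is the current version
        f"{base}/Archive/{package}/{tarball}",  # if it has been archived
    ]


for url in candidate_source_urls("digest", "0.6.14"):
    print(url)
```

A caller would try the first URL and fall back to the second on a 404. If current and old versions lived in the same directory, the first URL alone would be enough.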

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

2018-02-03 20:31 GMT+01:00 Winston Chang <winstoncha...@gmail.com>:

> Although it may not have been the cause of this particular index
> inconsistency, there are other causes of intermittent index
> inconsistencies. They could be avoided if there were a different
> directory structure on CRAN servers.
>
> One of the causes of inconsistencies is caching. With
> cloud.r-project.org (note that this is not cran.r-project.org),
> there is a CDN in front of the server; the CDN has caching endpoints
> around the world, and will serve files to the user from the nearest
> endpoint.
>
> The cache timeout for each file is 30 minutes. Suppose a user
> downloads file X from some endpoint at 1:00. If the endpoint doesn't
> already have X in the cache, then it will fetch the file from the
> server, and then send it to the user. The endpoint will consider the
> cached file valid until 1:30. If another user requests X at 1:20, the
> endpoint will serve up the file from its cache without checking with
> the server.
> If someone requests X at 1:40, the endpoint will check
> with the server to see if its cached version is still valid (and
> download an updated version if necessary), then it will send the file
> to the user.
>
> Because the caching is on a per-file basis, this can lead to a
> situation where the PACKAGES file served by an endpoint is out of sync
> with the .tgz package files. Imagine this scenario:
>
> 1:00 Someone downloads PACKAGES. It is not yet in the endpoint's
> cache, so it fetches it from the server. This version of PACKAGES says
> that the current version of PkgA is 1.0.
> 1:10 The server performs an rsync from the central CRAN mirror. It
> gets an updated version of PACKAGES, which says that the current
> version of PkgA is 2.0. The rsync also removes the PkgA_1.0.tgz file
> and adds PkgA_2.0.tgz.
> 1:20 Someone else wants to install PkgA, so their R session first
> downloads PACKAGES, which points to PkgA_1.0.tgz. Then R tries to
> download PkgA_1.0.tgz; it is not in the endpoint's cache, so the
> endpoint tries to fetch it from the server, but the file is not
> present there so it sends a 404 missing message. The endpoint passes
> this to the R session, and the package installation fails.
>
> Anyone else who tries to install PkgA (and hits the same CDN endpoint)
> will get the same installation failure, until the cache for PACKAGES
> expires at 1:30. However, another person who happens to hit another
> endpoint may be able to install PkgA, because each endpoint does its
> caching independently.
>
> Something similar can happen even without a CDN, because
> download.packages() caches the contents of PACKAGES. However, that can
> be worked around by telling download.packages() to not use the cache,
> or by simply restarting R.
>
> One reason that package installations fail in these cases is that the
> current version of a package is in one directory, and the old
> (archived) versions of a package are in another directory.
> If current
> and old versions were in the same directory, then package installation
> would not fail.
>
>
> -Winston
>
>
>
> On Tue, Jan 30, 2018 at 1:19 PM, Dirk Eddelbuettel <e...@debian.org> wrote:
>>
>> I have received three distinct (non-)bug reports where someone claimed a
>> recent package of mine was broken ... simply because the macOS binary was not
>> there.
>>
>> Is there something wrong with the cronjob providing the indices? Why is it
>> pointing people to binaries that do not exist?
>>
>> Concretely, file
>>
>>   https://cloud.r-project.org/bin/macosx/el-capitan/contrib/3.4/PACKAGES
>>
>> contains
>>
>>   Package: digest
>>   Version: 0.6.15
>>   Title: Create Compact Hash Digests of R Objects
>>   Depends: R (>= 2.4.1)
>>   Suggests: knitr, rmarkdown
>>   Built: R 3.4.3; x86_64-apple-darwin15.6.0; 2018-01-29 05:21:06 UTC; unix
>>   Archs: digest.so.dSYM
>>
>> yet the _same directory_ only has:
>>
>>   digest_0.6.14.tgz  15-Jan-2018 21:36  157K
>>
>> I presume this is a temporary accident.
>>
>> We are all spoiled by you all providing such a wonderfully robust and
>> well-oiled service---so again big THANKS for that--but today something is out
>> of order.
>>
>> Dirk
>>
>> --
>> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
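
The per-file caching scenario in Winston's quoted message can be replayed with a small simulation. This is a rough sketch in Python, not a description of how the CDN is actually implemented. The class and function names are made up for illustration, times are minutes after 1:00, and two simplifications are assumed: an expired entry is simply re-fetched rather than revalidated, and 404 responses are not cached.

```python
TTL = 30  # cache timeout in minutes, as stated in the quoted message


class Server:
    """Origin server: a mapping of file name -> contents."""

    def __init__(self, files):
        self.files = dict(files)

    def get(self, name):
        return self.files.get(name)  # None stands in for a 404


class Endpoint:
    """One CDN endpoint with its own independent per-file cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # name -> (contents, expiry time)

    def get(self, name, now):
        hit = self.cache.get(name)
        if hit is not None and now < hit[1]:
            return hit[0]  # still valid: served without asking the server
        contents = self.server.get(name)
        if contents is not None:
            self.cache[name] = (contents, now + TTL)
        else:
            self.cache.pop(name, None)  # assumption: 404s are not cached
        return contents


def install(endpoint, pkg, now):
    """Mimic an install: read the index, then fetch the file it names."""
    version = endpoint.get("PACKAGES", now)[pkg]
    filename = f"{pkg}_{version}.tgz"
    return filename if endpoint.get(filename, now) is not None else None


server = Server({"PACKAGES": {"PkgA": "1.0"}, "PkgA_1.0.tgz": b"old"})
near, far = Endpoint(server), Endpoint(server)

near.get("PACKAGES", now=0)                  # 1:00 index cached at `near`
server.files["PACKAGES"] = {"PkgA": "2.0"}   # 1:10 rsync updates the index,
del server.files["PkgA_1.0.tgz"]             #      removes the old file
server.files["PkgA_2.0.tgz"] = b"new"        #      and adds the new one

print(install(near, "PkgA", now=20))  # None: stale index names a missing file
print(install(far, "PkgA", now=20))   # PkgA_2.0.tgz: other endpoint, fresh index
print(install(near, "PkgA", now=40))  # PkgA_2.0.tgz: index cache has expired
```

If the server had kept PkgA_1.0.tgz next to PkgA_2.0.tgz, the 1:20 request at the first endpoint would have succeeded with the older version instead of failing.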