Hi all, I want to raise the CRAN failure issue because it has started to block Spark PRs from time to time. Since the number of PRs in the Spark community keeps growing, it is critical that this does not block other PRs.
There has been a problem at CRAN (see https://github.com/apache/spark/pull/20005 for the analysis). In short, the root cause is malformed package info served from https://cran.r-project.org/src/contrib/PACKAGES on the server side, and each time it had to be fixed by asking the CRAN sysadmin for help.

https://issues.apache.org/jira/browse/SPARK-24152 <- newly open. I am pretty sure it's the same issue
https://issues.apache.org/jira/browse/SPARK-25923 <- reopened/resolved 2 times
https://issues.apache.org/jira/browse/SPARK-22812

This has happened 5 times in roughly 10 months, each time blocking almost all PRs in Apache Spark. On one occasion it blocked all PRs for a few days, and the whole Spark community had to stop working.

I assume this has not been a big issue so far for other projects or other people because newer versions of R apparently have some logic to handle this malformed document (at least I verified that R 3.4.0 works fine). On our side, Jenkins has an old R version (R 3.1.1, if it has not been updated since I last checked), which is unable to parse the malformed server response.

So, I want to talk about how we are going to handle this. Possible solutions are:

1. We start a conversation with the CRAN sysadmin to prevent this issue permanently
2. We upgrade R to 3.4.0 in Jenkins (however, we will then not be able to test older R versions)
3. ...

If we are fine with this, I would like to suggest forwarding this email to the CRAN sysadmin to discuss this further.

Adding Liang-Chi, Felix and Shivaram, with whom I have already talked about this a few times before.

Thanks all.