Hi all,

I want to raise the CRAN failure issue because it has started to block
Spark PRs from time to time. Since the number of PRs in the Spark
community keeps growing, it is critical that this does not block other
PRs.

There has been a problem at CRAN (see
https://github.com/apache/spark/pull/20005 for analysis).
In short, the root cause is malformed package info served from
https://cran.r-project.org/src/contrib/PACKAGES
on the server side, and this had to be fixed by asking the CRAN
sysadmin for help.

https://issues.apache.org/jira/browse/SPARK-24152 <- newly opened. I am
pretty sure it's the same issue
https://issues.apache.org/jira/browse/SPARK-25923 <- reopened/resolved twice
https://issues.apache.org/jira/browse/SPARK-22812

This has happened 5 times in roughly 10 months, each time blocking almost
all PRs in Apache Spark.
On one occasion it blocked all PRs for a few days, and the whole Spark
community had to stop working.

I assume this has not been a big issue so far for other projects or other
people because, apparently, higher versions of R have some logic to handle
this malformed document (at least I verified that R 3.4.0 works fine).

On our side, Jenkins has a low R version (R 3.1.1, if it has not been
updated since I last checked),
which is unable to parse the malformed server response.

So, I want to talk about how we are going to handle this. Possible
solutions are:

1. We start a discussion with the CRAN sysadmin to permanently prevent
this issue
2. We upgrade R to 3.4.0 in Jenkins (however, we will then not be able to
test lower R versions)
3. ...

If we are fine with this, I would like to suggest forwarding this email to
the CRAN sysadmin to discuss this further.

Adding Liang-Chi, Felix and Shivaram, with whom I have already talked
about this a few times.

Thanks all.
