Hi all. CI is back to normal after Jake's commit:
https://github.com/apache/incubator-mxnet/pull/16968 so please merge from
master. It would be great if someone could look into the TVM building
issues described above.
On Tue, Dec 3, 2019 at 11:11 AM Pedro Larroy
wrote:
Some PRs were experiencing build timeouts in the past. I have diagnosed
this as saturation of the EFS volume holding the compilation cache. Once
CI is back online this problem is very likely to be solved and you should
not see any more build timeout issues.
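For anyone who wants to double-check this diagnosis on their own setup, a minimal sketch of inspecting the shared compilation cache follows. It assumes the cache is ccache stored on the EFS mount; the path `/efs/.ccache` is hypothetical and not taken from the MXNet CI scripts.

```shell
# Sketch: check whether the shared compilation cache looks saturated.
# /efs/.ccache is an assumed path; adjust to your CI's actual mount.
CACHE_DIR="${CCACHE_DIR:-/efs/.ccache}"
if command -v ccache >/dev/null 2>&1; then
  ccache -s                       # hit/miss stats; a cache at max size forces constant evictions
else
  echo "ccache not installed"
fi
df -h "$CACHE_DIR" 2>/dev/null || true   # EFS throughput throttles once burst credits are spent
```

A consistently low hit rate, or a cache pinned at its maximum size, would point at the same saturation described above.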
On Tue, Dec 3, 2019 at 10:18 AM
Also please note that there is a stage building TVM which compiles
serially and takes a long time, which impacts CI turnaround time:
https://github.com/apache/incubator-mxnet/issues/16962
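A hedged sketch of the kind of fix the issue above suggests: run the TVM build across all available cores instead of serially. The exact invocation in the MXNet CI scripts may differ; `-j` is just the standard make/cmake parallelism flag.

```shell
# Sketch: parallelize a serial build step using all available cores.
NUM_JOBS="$(nproc)"
echo "building with ${NUM_JOBS} parallel jobs"
# cmake --build build -- -j"${NUM_JOBS}"    # or: make -j"${NUM_JOBS}"
```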
Pedro
On Tue, Dec 3, 2019 at 9:49 AM Pedro Larroy
wrote:
Hi MXNet community. We are in the process of updating the base AMIs for CI
with an updated CUDA driver to fix the CI blockage.
We would need help from the community to diagnose some of the build errors
which don't seem related to the infrastructure.
I have observed this build failure with TVM
Small update about CI, which is blocked.
It seems there is an NVIDIA driver compatibility problem between the base
AMI running on the GPU instances and the NVIDIA Docker images that we use
for building and testing.
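For context, a driver/image mismatch of this kind can be checked with a simple version comparison. The sketch below is hypothetical and not from the MXNet CI scripts; the version numbers are illustrative (NVIDIA's compatibility table lists CUDA 10.2 as needing driver >= 440.33), and on a real GPU host the driver version would come from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.

```shell
# Hypothetical check: is the host NVIDIA driver new enough for the CUDA
# runtime inside the Docker image? Versions here are illustrative.
required="440.33"                      # minimum driver for the image's CUDA version
driver="${DRIVER_VERSION:-418.87}"     # would come from nvidia-smi on a GPU host
lowest="$(printf '%s\n%s\n' "$required" "$driver" | sort -V | head -n1)"
if [ "$lowest" = "$required" ]; then
  echo "driver ${driver} OK for this CUDA image"
else
  echo "driver ${driver} too old for this CUDA image"
fi
```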
We are working on providing a fix by updating the base images, as the
problem doesn't seem to be