Hello, MXNet dev community,
As you all know, the experience with our CI infrastructure isn’t ideal in spite 
of its high cost. For this reason, we’re proposing the following changes to 
improve stability, reduce cost, and grant more control to contributors. As we 
work on a refresh of the CI, we believe these changes will reduce the pain we 
all suffer when we try to push a PR through the system.

Following is the list of changes:
- Fix missing status reports between GH and Jenkins
- Update Jenkins permission groups to re-trigger builds
- Introduce per-PR CI bot

Details:

- Fix missing status reports
Currently, whenever a commit is added to a PR, CI is run on that commit. 
Sometimes the CI run status is missing from the commit on GitHub even though 
the run has completed in Jenkins. Example: CI run: 
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-17376/17/pipeline,
 commit status on GitHub (missing unix-cpu, unix-gpu and windows-gpu statuses): 
https://github.com/apache/incubator-mxnet/pull/17376#partial-pull-merging.
Problem: There seems to be a bug that causes some status reports to go missing 
on GitHub. The hypothesis is that there is some issue with GitHub hooks.
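
To make the symptom concrete, here is a minimal sketch of how one could detect 
which statuses are missing on a commit. It assumes the list of statuses has 
already been fetched (for example from GitHub's combined commit status 
endpoint, whose entries carry a "context" field). The function name and the 
short context names below are illustrative only; the real context strings 
reported by Jenkins may be longer.

```python
def missing_statuses(expected_contexts, reported_statuses):
    """Return the expected contexts that have no status on the commit.

    expected_contexts: list of context names we expect CI to report
                       (hypothetical short names, see note above).
    reported_statuses: list of dicts, each with at least a "context" key,
                       as found in a commit's list of statuses.
    """
    reported = {status["context"] for status in reported_statuses}
    # Preserve the order of expected_contexts in the result.
    return [ctx for ctx in expected_contexts if ctx not in reported]


if __name__ == "__main__":
    expected = ["centos-cpu", "unix-cpu", "unix-gpu", "windows-gpu"]
    reported = [{"context": "centos-cpu", "state": "success"}]
    print(missing_statuses(expected, reported))
```

A check like this only detects the gap; it does not explain why the report was 
lost.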

- Update Jenkins permission groups to re-trigger builds
Problem: Currently, only MXNet Committers and selected people from AWS have the 
ability to re-trigger CI runs on PRs. This leaves PR Authors waiting for 
authorized users to re-trigger the CI runs on their PRs for them.
Solution: Allow users in the following three categories to re-trigger PR 
builds: Jenkins Admins, MXNet Committers, and PR Authors.
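
The proposed rule can be sketched roughly as follows. This is only an 
illustration of the three categories, not the actual Jenkins configuration; 
the function name, its parameters, and the user names in the example are all 
hypothetical.

```python
def can_retrigger(user, pr_author, jenkins_admins, committers):
    """Return True if `user` may re-trigger the build for a given PR.

    A user qualifies by being the author of the PR, a Jenkins Admin,
    or an MXNet Committer. The admin and committer collections are
    assumed to be supplied by the caller.
    """
    return (
        user == pr_author
        or user in jenkins_admins
        or user in committers
    )


if __name__ == "__main__":
    admins = {"admin-a"}          # hypothetical user names
    committers = {"committer-b"}
    print(can_retrigger("author-c", "author-c", admins, committers))
    print(can_retrigger("someone-else", "author-c", admins, committers))
```

Note that a PR Author is authorized only for their own PR, which is why the 
check compares against the author of the PR in question.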

- Introduce per-PR CI bot
Problem: As of today, MXNet CI is triggered automatically. It runs every time a 
commit is pushed to your GitHub PR. This results in a lot of unnecessary CI 
runs and added cost.
Solution: Switch to a manual trigger. Users from the authorized groups (any of 
the 3 categories mentioned above) can trigger a CI run by adding a simple 
comment to the PR: “[mxnet-ci] run”.
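
As a rough sketch of how a bot might recognize the trigger comment (not the 
actual bot implementation), the check below looks for a line consisting of 
"[mxnet-ci] run". Matching on a whole line and tolerating surrounding 
whitespace are assumptions made for this example, and the function name is 
hypothetical.

```python
import re

# Matches a line that contains only "[mxnet-ci] run", allowing extra
# whitespace around and between the two tokens (an assumption).
TRIGGER_PATTERN = re.compile(r"^\s*\[mxnet-ci\]\s+run\s*$", re.MULTILINE)


def is_trigger_comment(comment_body):
    """Return True if the PR comment asks the bot to start a CI run."""
    return TRIGGER_PATTERN.search(comment_body) is not None


if __name__ == "__main__":
    print(is_trigger_comment("[mxnet-ci] run"))
    print(is_trigger_comment("Looks good to me.\n[mxnet-ci] run\n"))
    print(is_trigger_comment("Could someone run the CI?"))
```

The bot would still need to verify that the commenter belongs to one of the 
authorized groups before starting a run.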

--
Thank you,

AWS MXNet team

 
