josephevans commented on issue #19373: URL: https://github.com/apache/incubator-mxnet/issues/19373#issuecomment-712623241
Ok, I believe I finally found the culprit. Our AMIs that are used for Jenkins slaves have auto-update turned on, and based on the log files of the slave instances, it looks like Docker was being auto-updated and restarted, which was killing the log output of the containers (and therefore the Jenkins jobs).

I've created a new AMI for the mxnetlinux_cpu hosts with updated software versions, which also adds an option to the Docker config to hopefully prevent this in the future. See https://docs.docker.com/config/containers/live-restore/

Thanks @leezu for the recommendation.
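For anyone who wants to apply the same kind of change on their own hosts, here is a rough sketch of what it could look like. The comment above does not name the exact option that went into the new AMI; this assumes it is the `"live-restore": true` key described on the linked Docker page, which keeps containers running while the daemon is down. The helper names `with_live_restore` and `update_daemon_json` are made up for illustration.

```python
import json
from pathlib import Path


def with_live_restore(config: dict) -> dict:
    """Return a copy of a daemon.json mapping with live-restore enabled.

    Assumption: "live-restore" is the option meant in the comment above.
    Existing keys are preserved; the input mapping is not modified.
    """
    updated = dict(config)
    updated["live-restore"] = True
    return updated


def update_daemon_json(path: Path) -> dict:
    """Merge the setting into the JSON file at `path`.

    A missing or empty file is treated as an empty configuration.
    """
    text = path.read_text().strip() if path.exists() else ""
    existing = json.loads(text) if text else {}
    updated = with_live_restore(existing)
    path.write_text(json.dumps(updated, indent=2) + "\n")
    return updated
```

On Linux the daemon configuration normally lives at `/etc/docker/daemon.json`, so a call such as `update_daemon_json(Path("/etc/docker/daemon.json"))` would need root, and the daemon has to reload its configuration before the setting takes effect. This only addresses the container side; whether to also turn off auto-update on the AMI is a separate decision.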
