Hello community,

Since yesterday (Oct 15, 2019), the CI/CD system has been facing an issue related to auto-scaling. Thanks to Lai and Pedro's efforts, the root cause was found: a closed port on the Jenkins slave. The issue was temporarily resolved by removing the autoscaled instances and restarting the Jenkins master. As a result, the affected PRs need to be restarted.
We still need to do a post-mortem on this issue, but some take-home items that need to be fixed are:

- Monitor the number of slaves that failed to connect, and add an alarm with threshold > 0 on failed connections
- Fix the lambda error with a pending lifecycle (starting is not valid)
- Deploy the new lambda: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-lifecycle.html
- Fix throttling problems with EC2

We are actively working on retriggering the PRs for the community. Apologies for the inconvenience caused.

Thank you,
Chai

--
Chaitanya Prakash Bapat
+1 (973) 953-6299
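
P.S. To make the first two action items a bit more concrete, here is a rough sketch in plain Python. It is not the actual lambda code. The function names (`is_valid_ec2_state`, `should_alarm`) are hypothetical, and reading "starting is not valid" as "`starting` is not one of the EC2 instance state names" is my assumption. The state names themselves come from the EC2 instance lifecycle page linked above.

```python
# Rough sketch only; function names are hypothetical, not taken from the real lambda.

# Instance state names as listed in the EC2 instance lifecycle documentation.
VALID_EC2_STATES = frozenset(
    {"pending", "running", "stopping", "stopped", "shutting-down", "terminated"}
)


def is_valid_ec2_state(state):
    """Return True if `state` is a documented EC2 instance state name."""
    return state in VALID_EC2_STATES


def should_alarm(failed_to_connect, threshold=0):
    """Return True when the count of slaves that failed to connect exceeds the threshold.

    With the default threshold of 0, a single failed connection raises the alarm.
    """
    return failed_to_connect > threshold


if __name__ == "__main__":
    # "starting" is rejected, while "pending" is accepted.
    print(is_valid_ec2_state("starting"), is_valid_ec2_state("pending"))
    # One failed slave is enough to trigger the alarm.
    print(should_alarm(1), should_alarm(0))
```

In a real setup the alarm check would more likely live in the monitoring service than in our own code; the sketch only spells out the intended condition.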