Folks, Sometimes in our cluster upgrades start failing due to transient outages of dependencies or reasons unrelated to the new code being pushed out. Aurora hits its failure threshold and starts automatic rollback which may make a bad condition worse (e.g. if the outage was related to load rollback will increase load). Can Aurora distinguish between failures caused by the upgrade itself or other transient systemic issues (using e.g. reason code)? If not does this make sense as a new feature?
Mohit.