On Fri, Oct 19, 2018 at 8:25 AM Robert Butts <[email protected]> wrote:
>
> > simply restore pre-upgrade copy of the DB
>
> What about the data manually changed in-between? What if you've been running for a week before discovering a critical issue requiring rollback? All those changes would be lost?
Depending on the nature of the DB migrations run during the upgrade, it's possible that certain data would be lost using either method. As part of this restoration process, we could save off a copy of the upgraded DB before restoring the original DB. Data could then be pulled out of the upgraded DB and added back into the original DB once the issue is fixed. This is really about getting everything back to a known good working state in the event of a major issue with the upgrade.

If the issue is actually being caused by bad data added post-upgrade that isn't handled correctly by either the new or the previous version of the code, then you don't want to preserve that problematic data via `goose down`, because you will still be experiencing the issue post-rollback. By saving off both the pre-upgrade and post-upgrade versions of the DB, you won't lose data, because you can still dissect the post-upgrade DB to extract any new data.

If the upgrade is currently causing a service outage but you don't know the root cause yet, you can do a safe, immediate rollback of the upgrade to bring the service back to a known working state, find the root cause, fix it, then pull any post-upgrade data back into the current DB. If it turns out the new data wasn't the problem and the issue was really just how the upgraded code handled it, you could probably just restore the post-upgrade DB after fixing the upgrade. And if the issue takes a full week to surface after the upgrade, maybe it's not a major issue requiring a full rollback in the first place, and a hotfix would do instead.

Does that sound like a reasonable process?

- Rawlin
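P.S. For concreteness, here is a rough sketch of what the save-off-then-restore step might look like, assuming a PostgreSQL backend. The database name (`traffic_ops`), the scratch DB, and the backup paths are all placeholders for illustration, not a prescription for any particular deployment:

    # 1) Before rolling back, save off a copy of the upgraded (post-upgrade) DB.
    pg_dump --format=custom --file=/var/backups/to_post_upgrade.dump traffic_ops

    # 2) Restore the backup taken before the upgrade started.
    dropdb traffic_ops
    createdb traffic_ops
    pg_restore --dbname=traffic_ops /var/backups/to_pre_upgrade.dump

    # 3) Later, once the root cause is fixed, load the post-upgrade dump into
    #    a scratch DB and extract any data that was added after the upgrade.
    createdb to_scratch
    pg_restore --dbname=to_scratch /var/backups/to_post_upgrade.dump

The important part is that the dump in step 1 is taken before anything is dropped, so data added after the upgrade is never unrecoverable.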
