Re: Backup juju with/without interrupting juju
On Thu, Jun 5, 2014 at 7:32 PM, Nate Finch wrote:
> I guess what I don't understand is, why does it matter which mongo DB you
> back up? They should all be identical, right?

Replication lag is a real thing, and shouldn't be discounted. We want to be sure that the backup corresponds to a point in time no earlier than when the user issued the backup command, and I think the right way to do that is `mongodump --oplog` on the primary, which, conveniently, doesn't involve taking the db down.

Cheers
William

--
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/juju-dev
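A minimal sketch of the "`mongodump --oplog` on the primary" idea, for illustration only. The helper names `find_primary` and `mongodump_oplog_command` are hypothetical and not part of juju. The member dictionaries are assumed to mirror the `name` and `stateStr` fields reported by MongoDB's `replSetGetStatus`, and the addresses are made up. Any authentication or TLS options a real juju state server needs are deliberately left out.

```python
from typing import Dict, List


def find_primary(members: List[Dict[str, str]]) -> str:
    """Return the "host:port" name of the member reporting PRIMARY."""
    for member in members:
        if member.get("stateStr") == "PRIMARY":
            return member["name"]
    raise ValueError("replica set status contains no primary")


def mongodump_oplog_command(primary: str, out_dir: str) -> List[str]:
    """Build the argv for a point-in-time dump taken from the primary.

    --oplog makes mongodump also capture oplog entries written while the
    dump runs, so the database does not have to be stopped.
    """
    host, sep, port = primary.rpartition(":")
    if not sep or not host or not port:
        raise ValueError("expected host:port, got %r" % primary)
    return ["mongodump", "--host", host, "--port", port,
            "--oplog", "--out", out_dir]


# Hypothetical three-member HA environment.
status_members = [
    {"name": "10.0.0.2:37017", "stateStr": "SECONDARY"},
    {"name": "10.0.0.1:37017", "stateStr": "PRIMARY"},
    {"name": "10.0.0.3:37017", "stateStr": "SECONDARY"},
]
command = mongodump_oplog_command(find_primary(status_members), "/tmp/juju-backup")
```

The resulting list could be handed to `subprocess.run`. A dump taken this way is normally restored with `mongorestore --oplogReplay` so that the captured oplog entries are applied.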
Re: Backup juju with/without interrupting juju
I guess what I don't understand is, why does it matter which mongo DB you back up? They should all be identical, right?

On Thu, Jun 5, 2014 at 12:58 PM, Curtis Hovey-Canonical <cur...@canonical.com> wrote:
> On Wed, Jun 4, 2014 at 4:36 PM, Nate Finch wrote:
> > I'm not sure I understand the distinction, Curtis. Backing up the state
> > data in an HA environment
>
> This test exists to verify a feature we decided to support:
> http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore-devel/
> "Verify that a HA state-server can be backed up and then restored if
> there is a catastrophic failure of all state-servers."
>
> If you back up an HA state-server, there is some juju downtime. Clearly
> we expect enterprises to use both strategies; otherwise we would have
> changed the backup script to reject the backup of an HA state-server.
>
> --
> Curtis Hovey
> Canonical Cloud Development and Operations
> http://launchpad.net/~sinzui
Re: Backup juju with/without interrupting juju
On Wed, Jun 4, 2014 at 4:36 PM, Nate Finch wrote:
> I'm not sure I understand the distinction, Curtis. Backing up the state
> data in an HA environment

This test exists to verify a feature we decided to support:
http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore-devel/
"Verify that a HA state-server can be backed up and then restored if there is a catastrophic failure of all state-servers."

If you back up an HA state-server, there is some juju downtime. Clearly we expect enterprises to use both strategies; otherwise we would have changed the backup script to reject the backup of an HA state-server.

--
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui
Re: Backup juju with/without interrupting juju
I'm not sure I understand the distinction, Curtis. Backing up the state data in an HA environment is what I was talking about for large environments.

On Wed, Jun 4, 2014 at 4:12 PM, Curtis Hovey-Canonical wrote:
> On Wed, Jun 4, 2014 at 3:58 PM, Nate Finch wrote:
> ...
> > Then the only case that is really a problem is large environments which are
> > not using HA, which should be something we discourage, and uninterrupted
> > backup can be a way to show the benefits of HA.
>
> There is one other scenario: a backup and restore of an HA env. We
> support this as a final fallback for cases where disaster strikes all
> state-servers. We test that this can always be done. CI does see the
> downtime. I think this is acceptable since a human chooses to bring
> down the state-server and none of the other services running in the
> environment are affected.
>
> In the future, when charms can report the health of services,
> consumers of health data may have some sense of downtime.
>
> --
> Curtis Hovey
> Canonical Cloud Development and Operations
> http://launchpad.net/~sinzui
Re: Backup juju with/without interrupting juju
On Wed, Jun 4, 2014 at 3:58 PM, Nate Finch wrote:
...
> Then the only case that is really a problem is large environments which are
> not using HA, which should be something we discourage, and uninterrupted
> backup can be a way to show the benefits of HA.

There is one other scenario: a backup and restore of an HA env. We support this as a final fallback for cases where disaster strikes all state-servers. We test that this can always be done. CI does see the downtime. I think this is acceptable since a human chooses to bring down the state-server and none of the other services running in the environment are affected.

In the future, when charms can report the health of services, consumers of health data may have some sense of downtime.

--
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui
Backup juju with/without interrupting juju
We currently have a backup plugin and a restore script that are entirely divorced from the rest of the codebase, and are therefore prone to breaking if core changes. Moonstone has been tasked with recreating backup and restore as first class citizens of juju.

One of the complaints about the current backup is that it requires you to stop your mongo database while we copy the data out of it. This means you can't modify your juju environment until backup is complete (note that it would not actually interrupt the deployed services).

There has been some talk about various ways we could avoid the disruption. Here's my proposal:

Let's leave it as-is.

Here's why I think it's ok:

For small environments, the amount of data to back up is going to be small, and therefore the interruption will be short. In addition, small environments aren't likely to be adversely affected by a short outage, since it only affects the juju state server, not any of the deployed services. And small environments should need less interaction with juju in general.

Large environments, which may take a long time to back up, will likely be in HA mode, which means they have 2 or more replica servers in addition to the main server. In this case, you can just stop one of the replica servers and back that up, with no actual interruption to juju at all.

Then the only case that is really a problem is large environments which are not using HA, which should be something we discourage, and uninterrupted backup can be a way to show the benefits of HA.

Thoughts?
-Nate
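As an illustration of the HA case in the proposal above (stop one replica and back that up), here is a rough sketch of how a replica might be chosen. The function name `pick_secondary_for_backup` is hypothetical and not part of juju. The member dictionaries are assumed to mirror the `name`, `stateStr` and `optimeDate` fields from MongoDB's `replSetGetStatus`, and the sample data is made up. The sketch also reports how far the chosen replica trails the primary, since a stopped replica only holds what it had replicated up to that moment.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


def pick_secondary_for_backup(members: List[Dict[str, Any]]) -> Tuple[str, float]:
    """Pick the most up-to-date secondary and report its lag in seconds.

    Returns (member name, seconds behind the primary's last applied write).
    """
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("replica set status contains no primary")
    secondaries = [m for m in members if m["stateStr"] == "SECONDARY"]
    if not secondaries:
        raise ValueError("no secondary available; not an HA environment?")
    # The secondary with the newest optimeDate has replicated the most.
    best = max(secondaries, key=lambda m: m["optimeDate"])
    lag = (primary["optimeDate"] - best["optimeDate"]).total_seconds()
    return best["name"], max(lag, 0.0)


# Hypothetical status for a three-member HA environment.
sample_members = [
    {"name": "10.0.0.1:37017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2014, 6, 5, 12, 0, 10)},
    {"name": "10.0.0.2:37017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2014, 6, 5, 12, 0, 3)},
    {"name": "10.0.0.3:37017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2014, 6, 5, 12, 0, 8)},
]
target, lag_seconds = pick_secondary_for_backup(sample_members)
```

A non-zero lag means the copy taken from that replica may predate the moment the backup was requested, which is the trade-off against interrupting the main server.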