Re: Backup juju with/without interrupting juju

2014-06-05 Thread William Reade
On Thu, Jun 5, 2014 at 7:32 PM, Nate Finch  wrote:

> I guess what I don't understand is, why does it matter which mongo DB you
> back up?  They should all be identical, right?
>

Replication lag is a real thing, and shouldn't be discounted. We want to be
sure that the backup corresponds to a point in time no earlier than when the
user issued the backup command, and I think the right way to do that is
`mongodump --oplog` on the primary, which, conveniently, doesn't involve
taking the db down.
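
To make that concrete, here is a minimal sketch of how a backup tool might
assemble that command. The host name, the port 37017, and the output
directory are assumptions for illustration, and a real state server would
very likely also need authentication and SSL options, which are left out
here. Only flags I know mongodump accepts (`--host`, `--port`, `--oplog`,
`--out`) are used; note that `--oplog` dumps the whole instance rather than
a single database.

```python
import shlex


def mongodump_oplog_command(host, port, out_dir):
    """Build a `mongodump --oplog` command line aimed at the primary.

    The caller is assumed to have already worked out which member is
    the primary; this function only assembles the arguments.
    """
    return [
        "mongodump",
        "--host", host,
        "--port", str(port),
        "--oplog",        # also capture oplog entries written during the dump
        "--out", out_dir,
    ]


# Hypothetical values, not taken from any real environment.
cmd = mongodump_oplog_command("primary.example", 37017, "/tmp/juju-backup")
print(shlex.join(cmd))
```

The list form can be handed straight to `subprocess.run(cmd, check=True)`,
which avoids any shell quoting issues.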

Cheers
William
-- 
Juju-dev mailing list
Juju-dev@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/juju-dev


Re: Backup juju with/without interrupting juju

2014-06-05 Thread Nate Finch
I guess what I don't understand is, why does it matter which mongo DB you
back up?  They should all be identical, right?


On Thu, Jun 5, 2014 at 12:58 PM, Curtis Hovey-Canonical <
cur...@canonical.com> wrote:

> On Wed, Jun 4, 2014 at 4:36 PM, Nate Finch 
> wrote:
> > I'm not sure I understand the distinction, Curtis.  Backing up the state
> > data in an HA environment
>
> This test exists to verify a feature we decided to support
> http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore-devel/
> "Verify that a HA state-server can be backed up and then restored if
> there is a catastrophic failure of all state-servers."
>
> If you back up an HA state-server, there is some juju downtime. Clearly
> we expect enterprises to use both strategies...otherwise we would have
> changed the backup script to reject the backup of an HA state-server.
>
> --
> Curtis Hovey
> Canonical Cloud Development and Operations
> http://launchpad.net/~sinzui
>


Re: Backup juju with/without interrupting juju

2014-06-05 Thread Curtis Hovey-Canonical
On Wed, Jun 4, 2014 at 4:36 PM, Nate Finch  wrote:
> I'm not sure I understand the distinction, Curtis.  Backing up the state
> data in an HA environment

This test exists to verify a feature we decided to support
http://juju-ci.vapour.ws:8080/job/functional-ha-backup-restore-devel/
"Verify that a HA state-server can be backed up and then restored if
there is a catastrophic failure of all state-servers."

If you back up an HA state-server, there is some juju downtime. Clearly
we expect enterprises to use both strategies...otherwise we would have
changed the backup script to reject the backup of an HA state-server.

-- 
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui



Re: Backup juju with/without interrupting juju

2014-06-04 Thread Nate Finch
I'm not sure I understand the distinction, Curtis.  Backing up the state
data in an HA environment is what I was talking about for large
environments.


On Wed, Jun 4, 2014 at 4:12 PM, Curtis Hovey-Canonical  wrote:

> On Wed, Jun 4, 2014 at 3:58 PM, Nate Finch 
> wrote:
> ...
> > Then the only case that is really a problem is large environments which are
> > not using HA, which should be something we discourage, and uninterrupted
> > backup can be a way to show the benefits of HA.
>
> There is one other scenario. A backup and restore of an HA env. We
> support this as a final fallback for cases where disaster strikes all
> state-servers. We test that this can always be done. CI does see the
> downtime. I think this is acceptable since a human chooses to bring
> down the state-server and none of the other services running in the
> environment are affected.
>
> In the future, when charms can report the health of services,
> consumers of health data may have some sense of downtime.
>
>
> --
> Curtis Hovey
> Canonical Cloud Development and Operations
> http://launchpad.net/~sinzui
>


Re: Backup juju with/without interrupting juju

2014-06-04 Thread Curtis Hovey-Canonical
On Wed, Jun 4, 2014 at 3:58 PM, Nate Finch  wrote:
...
> Then the only case that is really a problem is large environments which are
> not using HA, which should be something we discourage, and uninterrupted
> backup can be a way to show the benefits of HA.

There is one other scenario. A backup and restore of an HA env. We
support this as a final fallback for cases where disaster strikes all
state-servers. We test that this can always be done. CI does see the
downtime. I think this is acceptable since a human chooses to bring
down the state-server and none of the other services running in the
environment are affected.

In the future, when charms can report the health of services,
consumers of health data may have some sense of downtime.


-- 
Curtis Hovey
Canonical Cloud Development and Operations
http://launchpad.net/~sinzui



Backup juju with/without interrupting juju

2014-06-04 Thread Nate Finch
We currently have a backup plugin and a restore script that are entirely
divorced from the rest of the codebase, and are therefore prone to breaking
if core changes.

Moonstone has been tasked with recreating backup and restore as first class
citizens of juju.

One of the complaints about the current backup is that it requires you to
stop your mongo database while we copy the data out of it.  This means you
can't modify your juju environment until backup is complete (note that it
would not actually interrupt the deployed services).

There has been some talk about various ways to avoid the disruption.

Here's my proposal:

Let's leave it as-is.

Here's why I think it's ok:

For small environments, the amount of data to back up is going to be small,
and therefore the interruption will be short.  In addition, small
environments aren't likely to be adversely affected by a short outage,
since it only affects the juju state server, not any of the deployed
services.  And small environments should need less interaction with juju in
general.

Large environments, which may take a long time to back up, will likely be
in HA mode, which means they have 2 or more replica servers in addition to
the main server.  In this case, you can just stop one of the replica
servers and back that up, with no actual interruption to juju at all.
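
As a rough sketch of how the replica could be chosen: the function below
takes a member list shaped like the `members` array that mongo's
`replSetGetStatus` returns (the `name`, `stateStr` and `optimeDate` fields)
and picks the secondary that trails the primary the least. The addresses,
the timestamps and the 10-second `max_lag` threshold are all made up for
illustration, and comparing `optimeDate` values is only an approximation of
how far behind a replica really is.

```python
from datetime import datetime, timedelta


def secondary_lags(members):
    """Map each secondary's name to how far its last applied op trails the primary's."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def pick_backup_source(members, max_lag=timedelta(seconds=10)):
    """Return the least-lagged secondary within max_lag, or None if there is none."""
    lags = secondary_lags(members)
    fresh = {name: lag for name, lag in lags.items() if lag <= max_lag}
    return min(fresh, key=fresh.get) if fresh else None


# Hypothetical replica set status, not real data.
now = datetime(2014, 6, 5, 12, 0, 0)
members = [
    {"name": "10.0.0.1:37017", "stateStr": "PRIMARY", "optimeDate": now},
    {"name": "10.0.0.2:37017", "stateStr": "SECONDARY",
     "optimeDate": now - timedelta(seconds=2)},
    {"name": "10.0.0.3:37017", "stateStr": "SECONDARY",
     "optimeDate": now - timedelta(seconds=45)},
]
print(pick_backup_source(members))
```

Returning None when every replica is too far behind leaves it to the caller
to decide whether to wait, or to fall back to backing up the main server.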

Then the only case that is really a problem is large environments which are
not using HA, which should be something we discourage, and uninterrupted
backup can be a way to show the benefits of HA.

Thoughts?

-Nate