It would be great if each OpenStack component could provide a maintenance mode 
like this. Some work was being considered on Cells 
(https://blueprints.launchpad.net/nova/+spec/disable-child-cell-support) which 
would have allowed parts of Nova to indicate they were in maintenance.

Something generic would be very useful. Some operators have also asked for 
‘read-only’ modes, where queries are allowed but updates are not permitted.
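
As a rough illustration of such a read-only mode, here is a minimal WSGI middleware sketch. The class name, the `read_only` flag and the choice of a 503 response are all assumptions made for this example, not an existing OpenStack feature or option.

```python
# Methods that only query state and are safe to allow in read-only mode.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


class ReadOnlyMiddleware:
    """Hypothetical WSGI middleware: allow queries, reject updates."""

    def __init__(self, app, read_only=False):
        self.app = app
        # Hypothetical switch; a real service would read this from config.
        self.read_only = read_only

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET").upper()
        if self.read_only and method not in SAFE_METHODS:
            body = b"Service is in read-only mode; updates are disabled.\n"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Queries (and everything else when not read-only) pass through.
        return self.app(environ, start_response)
```

Wrapping an API application with `ReadOnlyMiddleware(app, read_only=True)` would let GET requests through while refusing POST, PUT and DELETE.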

Tim

From: Mike Scherbakov [mailto:mscherba...@mirantis.com]
Sent: 09 September 2014 23:20
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [All] Maintenance mode in OpenStack during 
patching/upgrades

Sergii, Clint,
to rephrase what you are saying: there might be situations when our 
OpenStack API will not respond, simply because services are down for an 
upgrade.
Do we want to support that somehow? For example, if we know that Nova is going 
to be down, can we respond with HTTP 503 and an appropriate Retry-After time in 
the header?

The idea is not to simply deny or hang requests from clients, but to tell them 
"we are in maintenance mode, retry in X seconds".
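
A minimal sketch of that idea as WSGI middleware follows. The class name, the `maintenance_until` parameter and the injectable `clock` are assumptions for illustration only. The Retry-After header itself is standard HTTP and accepts a delay in seconds.

```python
import math
import time


class MaintenanceMiddleware:
    """Hypothetical WSGI middleware: answer 503 + Retry-After during maintenance."""

    def __init__(self, app, maintenance_until=None, clock=time.time):
        self.app = app
        # Epoch seconds when maintenance is expected to end (None = not in maintenance).
        self.maintenance_until = maintenance_until
        self.clock = clock  # injectable so the behaviour can be tested

    def __call__(self, environ, start_response):
        now = self.clock()
        if self.maintenance_until is not None and now < self.maintenance_until:
            # Round up so clients never retry before the window closes.
            retry_after = max(1, math.ceil(self.maintenance_until - now))
            body = (
                "We are in maintenance mode, retry in %d seconds\n" % retry_after
            ).encode("utf-8")
            start_response("503 Service Unavailable", [
                ("Retry-After", str(retry_after)),
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return self.app(environ, start_response)
```

A client that honours Retry-After can then back off for the advertised number of seconds instead of hanging or failing outright.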

> Turbo Hipster was added to the gate
great idea, I think we should use it in Fuel too

> You probably would want 'nova host-servers-migrate <host>'
yeah, for migrations. But as far as I understand, it doesn't help with 
disabling this host in the scheduler, so there is still a chance that some 
workloads will be scheduled to the host.
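
To make the ordering concern concrete, here is a small sketch of a drain procedure that disables the host for scheduling first and only then migrates instances off it. The three callables are hypothetical placeholders for whatever CLI or API calls a deployment uses; none of them is a real novaclient function.

```python
def drain_host(host, disable_service, list_servers, migrate_server):
    """Disable scheduling on `host`, then migrate its instances away.

    All three callables are hypothetical hooks supplied by the caller:
      disable_service(host)     -> stop the scheduler placing new work on host
      list_servers(host)        -> iterable of instance IDs running on host
      migrate_server(server_id) -> move one instance elsewhere, may raise
    Returns a list of (server_id, error message) for failed migrations.
    """
    # Disable first, so nothing new lands on the host while we empty it.
    disable_service(host)

    failed = []
    for server_id in list_servers(host):
        try:
            migrate_server(server_id)
        except Exception as exc:  # keep going; report failures at the end
            failed.append((server_id, str(exc)))
    return failed
```

If the returned list is empty, the host holds no instances and receives no new ones, so it can be patched.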


On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum <cl...@fewbar.com> wrote:
Excerpts from Mike Scherbakov's message of 2014-09-09 00:35:09 -0700:
> Hi all,
> please see the original email below from Dmitry. I've modified the
> subject to bring the issue to a larger audience.
>
> I'd like to split the issue into two parts:
>
>    1. Maintenance mode for OpenStack controllers in HA mode (HA-ed
>    Keystone, Glance, etc.)
>    2. Maintenance mode for OpenStack computes/storage nodes (no HA)
>
> For the first category, we might not need a maintenance mode at all. For
> example, if we patch/upgrade a 3-node HA cluster one node at a time,
> 2 nodes will serve requests normally. Is that possible for our HA solutions
> in Fuel, TripleO, and other frameworks?

You may have a broken cloud if you are pushing out an update that
requires a new schema. Some services are better than others about
handling old schemas, and can be upgraded before doing schema upgrades.
But most of the time you have to do at least a brief downtime:

 * turn off DB accessing services
 * update code
 * run db migration
 * turn on DB accessing services

It is for this very reason, I believe, that Turbo Hipster was added to
the gate, so that deployers running against the upstream master branches
can have a chance at performing these upgrades in a reasonable amount of
time.
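
The four-step downtime sequence above could be sketched roughly as below. The function and its callables are hypothetical stand-ins for a deployer's own tooling. One assumption worth flagging: if any step fails, the services are left off rather than restarted against a possibly half-migrated schema.

```python
import time


def upgrade_with_downtime(stop_services, update_code, run_db_migration,
                          start_services, clock=time.monotonic):
    """Run the stop / update / migrate / start sequence in order.

    Every callable is a hypothetical hook supplied by the deployer.
    Any exception propagates immediately, so services are only turned
    back on when all earlier steps succeeded.
    Returns the downtime in seconds as measured by `clock`.
    """
    started = clock()
    stop_services()      # turn off DB accessing services
    update_code()        # update code
    run_db_migration()   # run db migration
    start_services()     # turn on DB accessing services
    return clock() - started
```

The returned duration is the window during which clients would see the API as unavailable, so migration speed bounds the downtime.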

>
> For the second category, can't we simply run "nova-manage service disable...",
> so the scheduler simply stops scheduling new workloads on the particular host
> that we want to do maintenance on?
>

You probably would want 'nova host-servers-migrate <host>' at that
point, assuming you have migration set up.

http://docs.openstack.org/user-guide/content/novaclient_commands.html

> On Thu, Aug 28, 2014 at 6:44 PM, Dmitry Pyzhov <dpyz...@mirantis.com> wrote:
>
> > All,
> >
> > I'm not sure whether this deserves a mention in our documentation, since it
> > seems to be common practice. If an administrator wants to patch his
> > environment, he should be prepared for a temporary downtime of OpenStack
> > services. And he should plan to perform patching in advance: choose a time
> > with minimal load and warn users about possible interruptions of service
> > availability.
> >
> > Our current implementation of patching does not protect from downtime
> > during the patching procedure. HA deployments seem to be more or less
> > stable. But it looks like it is possible to schedule an action on a compute
> > node and get an error because of service restart. Deployments with one
> > controller... well, you won’t be able to use your cluster until the
> > patching is finished. There is no way to get rid of downtime here.
> >
> > As I understand, we can get rid of possible issues with computes in HA.
> > But it will require migration of instances and stopping of nova-compute
> > service before patching. And it will make the overall patching procedure
> > much longer. Do we want to investigate this process?
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
>


--
Mike Scherbakov
#mihgen