In the design, rename the repair daemon to the maintenance daemon.
This better reflects the daemon's role: besides repairs, it will
also take care of other maintenance operations, like balancing.

Signed-off-by: Klaus Aehlig <[email protected]>
---
 doc/design-repaird.rst | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/doc/design-repaird.rst b/doc/design-repaird.rst
index 3cb6bd8..bd91cc0 100644
--- a/doc/design-repaird.rst
+++ b/doc/design-repaird.rst
@@ -1,11 +1,13 @@
-====================
-Ganeti Repair Daemon
-====================
+=========================
+Ganeti Maintenance Daemon
+=========================
 
 .. contents:: :depth: 4
 
 This design document outlines the implementation of a new Ganeti
-daemon coordinating repairs on a cluster.
+daemon coordinating all maintenance operations on a cluster
+(rebalancing, disk activation, ERROR_down handling, node-repair
+actions).
 
 
 Current state and shortcomings
@@ -32,7 +34,7 @@ swap.
 Proposed changes
 ================
 
-We propose the addition of an additional daemon, called ``repaird`` that will
+We propose the addition of a new daemon, called ``maintd``, that will
 coordinate the work for repair needs of individual nodes. The information
 about the work to be done will be obtained from a dedicated data collector
 via the :doc:`design-monitoring-agent`.
@@ -70,7 +72,7 @@ attempting live migrations, respectively.
 details
 .......
 
-An opaque JSON value that the repair daemon will just pass through and
+An opaque JSON value that the maintenance daemon will just pass through and
 export. It is intended to contain information about the type of repair
 that needs to be done after the respective Ganeti action is finished.
 E.g., it might contain information which piece of hardware is to be
@@ -99,7 +101,7 @@ directory will be ``/etc/ganeti/node-diagnose-commands``.
 Result forging
 ..............
 
-As the repair daemon will take real Ganeti actions based on the diagnose
+As the maintenance daemon will take real Ganeti actions based on the diagnose
 reported by the self-diagnose script through the monitoring daemon, we
 need to verify integrity of such reports to avoid denial-of-service by
 fraudulent error reports. Therefore, the monitoring daemon will sign
@@ -117,27 +119,27 @@ being requested) for this event and forget about it, as soon as it is
 no longer observed.
 
 Corresponding Ganeti actions will be initiated and success or failure of
-these Ganeti jobs monitored. All jobs submitted by the repair daemon
-will have the string ``gnt:daemon:repaird`` and the event identifier
+these Ganeti jobs monitored. All jobs submitted by the maintenance daemon
+will have the string ``gnt:daemon:maintd`` and the event identifier
 in the reason trail, so that :doc:`design-optables` is possible.
 Once a job fails, no further jobs will be submitted for this event
 to avoid further damage; the repair action is considered failed in this case.
 
 Once all requested actions succeeded, or one failed, the node where the
-event as observed will be tagged by a tag starting with ``repaird:repairready:``
-or ``repaird:repairfailed:``, respectively, where the event identifier is
+event was observed will be tagged by a tag starting with ``maintd:repairready:``
+or ``maintd:repairfailed:``, respectively, where the event identifier is
 encoded in the rest of the tag. On the one hand, it can be used as an
 additional verification whether a node is ready for a specific repair.
 However, the main purpose is to provide a simple and uniform interface
-to acknowledge an event; once that tag is removed, the repair daemon
+to acknowledge an event; once that tag is removed, the maintenance daemon
 will forget about this event, as soon as it is no longer observed by
 any monitoring daemon.
 
 
-Repair daemon
--------------
+Maintenance daemon
+------------------
 
-The new daemon ``repaird`` will be running on the master node only. It will
+The new daemon ``maintd`` will be running on the master node only. It will
 verify the master status of its node by popular vote in the same way as all the
 other master-only daemons. If started on a non-master node, it will exit
 immediately with exit code ``exitNotmaster``, i.e., 11.
@@ -180,7 +182,7 @@ as a JSON object with at least the following information.
     is still observed.
 
   + ``failed`` At least one of the submitted jobs has failed. To avoid further
-    damage, the repair daemon will not take any further action for this event.
+    damage, the maintenance daemon will not take any further action for this event.
 
   + ``completed`` All Ganeti actions associated with this event have been
     completed successfully, including tagging the node.
@@ -195,7 +197,7 @@ State
 ~~~~~
 
 As repairs, especially those involving physically swapping hardware, can take
-a long time, the repair daemon needs to store its state persistently. As we
+a long time, the maintenance daemon needs to store its state persistently. As we
 cannot exclude master-failovers during a repair cycle, it does so by storing
 it as part of the Ganeti configuration.
 
@@ -205,9 +207,9 @@ The SSConf will not be changed.
 Superseding ``harep`` and implicit balancing
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-To have a single point coordinating all repair actions, the new repair daemon
+To have a single point coordinating all repair actions, the new daemon
 will also have the ability to take over the work currently done by ``harep``.
-To allow a smooth transition, ``repaird`` when carrying out ``harep``'s duties
+To allow a smooth transition, ``maintd`` when carrying out ``harep``'s duties
 will add tags in precisely the same way as ``harep`` does.
 As the new daemon will have to move instances, it will also have the ability
 to balance the cluster in a way coordinated with the necessary evacuation
@@ -222,7 +224,7 @@ continue to exist unchanged as part of the ``htools``.
 Mode of operation
 ~~~~~~~~~~~~~~~~~
 
-The repair daemon will at fixed interval poll the monitoring daemons for
+The maintenance daemon will poll the monitoring daemons at fixed intervals for
 the value of the self-diagnose data collector; if load-based balancing is
 enabled, it will also collect the load data needed.
 
@@ -232,7 +234,7 @@ A new round will be started if all jobs of the old round have finished, and
 there is an unhandled repair event or the cluster is unbalanced enough (provided
 that autobalancing is enabled).
 
-In each round, ``repaird`` will first determine the most invasive action for
+In each round, ``maintd`` will first determine the most invasive action for
 each node; despite the self-diagnose collector summing observations in a single
 action recommendation, a new, more invasive recommendation can be issued before
 the handling of the first recommendation is finished. For all nodes to be
-- 
2.4.3.573.g4eafbef
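The event lifecycle in the design above (jobs submitted per event, a single
failure stopping all further action, and tag-based acknowledgement) can be
sketched as a small state machine. The following Python model is purely
illustrative and not part of this patch: the class and method names are
hypothetical, and only the tag prefixes, the ``gnt:daemon:maintd``
reason-trail string, and the status names are taken from the design text.

```python
from enum import Enum


class EventStatus(Enum):
    """Lifecycle of one repair event, following the design's status values."""
    OBSERVED = "observed"    # reported by a monitoring daemon; no jobs yet
    SUBMITTED = "submitted"  # Ganeti jobs have been submitted for this event
    FAILED = "failed"        # a job failed; no further jobs for this event
    COMPLETED = "completed"  # all jobs succeeded and the node was tagged


class RepairEvent:
    """Toy model of how the maintenance daemon could track a single event."""

    TAG_READY = "maintd:repairready:"
    TAG_FAILED = "maintd:repairfailed:"

    def __init__(self, event_id, node):
        self.event_id = event_id
        self.node = node
        self.status = EventStatus.OBSERVED
        self.tags = set()  # tags the daemon would place on the node

    def reason_trail_entry(self):
        # Every submitted job carries the daemon name and the event
        # identifier in its reason trail, so job filtering is possible.
        return ("gnt:daemon:maintd", self.event_id)

    def job_finished(self, success):
        # Once one job has failed, no further jobs are submitted and
        # later results are ignored, to avoid further damage.
        if self.status is EventStatus.FAILED:
            return
        if success:
            self.status = EventStatus.COMPLETED
            self.tags.add(self.TAG_READY + self.event_id)
        else:
            self.status = EventStatus.FAILED
            self.tags.add(self.TAG_FAILED + self.event_id)

    def acknowledge(self):
        # Removing the tag is the uniform acknowledgement interface; the
        # daemon then forgets the event once it is no longer observed.
        self.tags = {t for t in self.tags if not t.endswith(self.event_id)}
```

For example, after ``RepairEvent("4711", "node1").job_finished(True)`` the
model holds the tag ``maintd:repairready:4711``, and ``acknowledge()``
clears it, mirroring the operator workflow the design describes.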
