Thanks for the review. One comment inline.

On 2024-05-23 14:23, Alexander Zeidler wrote:
On Wed, 2024-05-22 at 10:33 +0200, Aaron Lauterer wrote:
Signed-off-by: Aaron Lauterer <a.laute...@proxmox.com>
---
changes since v1:
* incorporated suggested changes in phrasing to fix grammar and
   distinguish the steps on how to power down the nodes better

  pveceph.adoc | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 50 insertions(+)

diff --git a/pveceph.adoc b/pveceph.adoc
index 089ac80..04bf462 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -1080,6 +1080,56 @@ scrubs footnote:[Ceph scrubbing {cephdocs-url}/rados/configuration/osd-config-re
  are executed.
+[[pveceph_shutdown]]
+Shutdown {pve} + Ceph HCI cluster
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To shut down the whole {pve} + Ceph cluster, first stop all Ceph clients. This
Rather s/This/These/ ?

+will mainly be VMs and containers. If you have additional clients that might
+access a Ceph FS or an installed RADOS GW, stop these as well.
+Highly available guests will switch their state to 'stopped' when powered down
+via the {pve} tooling.
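Maybe it would also help to show the CLI counterpart for stopping guests,
e.g. (just a sketch, '<vmid>'/'<ctid>' are placeholders):

----
qm shutdown <vmid>
pct shutdown <ctid>
----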
+
+Once all clients, VMs and containers are off or not accessing the Ceph cluster
+anymore, verify that the Ceph cluster is in a healthy state. Either via the Web UI
+or the CLI:
+
+----
+ceph -s
+----
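Perhaps also mention what to look for: 'ceph health' prints just the status
string, which should read HEALTH_OK before proceeding (sketch):

----
ceph health
----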
+
+Then enable the following OSD flags in the Ceph -> OSD panel or the CLI:
For style consistency: **Ceph -> OSD panel**

Maybe: s/or the CLI/or via CLI/

+
+----
+ceph osd set noout
+ceph osd set norecover
+ceph osd set norebalance
+ceph osd set nobackfill
+ceph osd set nodown
+ceph osd set pause
Maybe sort alphabetically as in the UI.

I don't think this is a good idea. The order roughly goes from "should be set" to "would be good if set". While this particular case would not be affected, since 'p' comes after 'n', it is still an example of why a purely alphabetical order by default can be problematic: 'pause' must be set last, as it halts any IO in the cluster.

With that in mind, I realized that the sorting in the unset part should be reversed (see the sketch after the unset block below).


+----
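If the order is kept, a short loop might make it harder for readers to
accidentally reorder the flags (just a sketch):

----
# set the flags in the intended order, 'pause' last
for flag in noout norecover norebalance nobackfill nodown pause; do
    ceph osd set "$flag"
done
----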
+
+This will halt all self-healing actions for Ceph and the 'pause' will stop any client IO.
Perhaps state the goal/result beforehand, e.g.:
Then enable the following OSD flags in the **Ceph -> OSD panel** or via CLI,
which halt all self-healing actions for Ceph and 'pause' any client IO:

+
+Start powering down your nodes without a monitor (MON). After these nodes are
+down, continue shutting down hosts with monitors on them.
Since the continuation does not apply to the "hosts with monitors":
s/continue/continue by/

Maybe: s/hosts/nodes/
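Maybe also mention how to check which nodes run a MON, e.g. via the
**Ceph -> Monitor panel** or via CLI (sketch):

----
ceph mon dump
----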

+
+When powering on the cluster, start the nodes with Monitors (MONs) first. Once
s/Monitors/monitors/

+all nodes are up and running, confirm that all Ceph services are up and running
+before you unset the OSD flags:
Maybe stay with either enable/disable or set/unset.

s/flags:/flags again:/
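Perhaps show that check explicitly, e.g. 'ceph -s' should report all MONs in
quorum, and 'ceph osd stat' prints how many OSDs are up/in (sketch):

----
ceph -s
ceph osd stat
----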

+
+----
+ceph osd unset noout
+ceph osd unset norecover
+ceph osd unset norebalance
+ceph osd unset nobackfill
+ceph osd unset nodown
+ceph osd unset pause
Above mentioned sorting.

+----
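As mentioned above, the unset order could simply be the reverse of the set
order, lifting 'pause' first so client IO resumes before the self-healing
flags are cleared (just a sketch):

----
# unset the flags in reverse order, 'pause' first
for flag in pause nodown nobackfill norebalance norecover noout; do
    ceph osd unset "$flag"
done
----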
+
+You can now start up the guests. Highly available guests will change their state
+to 'started' when they power on.
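Maybe also show the CLI counterpart for starting guests ('<vmid>'/'<ctid>'
again placeholders, just a sketch):

----
qm start <vmid>
pct start <ctid>
----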
+
  Ceph Monitoring and Troubleshooting
  -----------------------------------



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



