Todd,
Thanks for that. It seems along the lines of what I need.
The problem though is that I have an additional problem of the heketi
pod not starting because of a messed up database configuration.
These two problems happened independently, but on the same OpenShift
environment.
This means I'm unable to run the heketi-cli until that is fixed.
Could I modify the heketi database, as described in the troubleshooting
guide [1], so that it only knows about the two good gluster nodes, and
then add the third one back?
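(For reference, the edit described in [1] amounts to `heketi db export` to dump the database to JSON, editing that JSON, then `heketi db import` to write it back while heketi is stopped. Below is only a toy sketch of the edit step, with made-up node ids and a deliberately simplified export structure; real exports contain many more sections, and a real edit must also prune the dead node's devices and bricks.)

```shell
# Make a toy 'exported' db; real output of `heketi db export` has far more fields.
cat > /tmp/heketi-export.json <<'EOF'
{"nodeentries": {"fb344a2e": {"Info": {"id": "fb344a2e"}}, "aaaa1111": {"Info": {"id": "aaaa1111"}}}}
EOF

# Drop the failed node's entry (id is made up here).
python3 - <<'EOF'
import json

with open('/tmp/heketi-export.json') as f:
    db = json.load(f)

# Remove the dead node; in a real export you would also remove its
# deviceentries and any brickentries that referenced those devices.
db['nodeentries'].pop('fb344a2e', None)

with open('/tmp/heketi-export.json', 'w') as f:
    json.dump(db, f)
EOF
```

The edited JSON would then go back with something like `heketi db import --jsonfile /tmp/heketi-export.json --dbfile heketi.db` before restarting the heketi pod.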
Any thoughts?
Tim
[1] https://github.com/heketi/heketi/blob/master/docs/troubleshooting.md
On 23/08/18 17:14, Walters, Todd wrote:
Tim,
I have had this issue with a 3 node cluster. I created a new node with new
devices, ran scaleup, ran the gluster playbook with some changes, then ran
heketi-cli commands to add the new node and remove the old one.
For your other question, I’ve restarted all the glusterfs pods and the heketi
pod and resolved that issue before. I guess you could also restart glusterd in
each pod?
Here’s a doc I wrote on node replacement. I’m not sure if it is the proper
procedure, but it works, and I wasn’t able to find any decent solution in the
docs.
# ----- Replacing a Failed Node ---- #
Disable Node to simulate failure
Get node id with heketi-cli node list or topology info
heketi-cli node disable fb344a2ea889c7e25a772e747eeeec2a -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"
Node fb344a2ea889c7e25a772e747eeeec2a is now offline
Stop Node in AWS Console
Scale up another node (4) for Gluster via Terraform
Run scaleup_node.yml playbook
Add New Node and Device
heketi-cli node add --zone=1 --cluster=441248c1b2f032a93aca4a4e03648b28 --management-host-name=ip-new-node.ec2.internal --storage-host-name=newnodeIP -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"
heketi-cli device add --name /dev/xvdc --node 8973b41d8a4e437bd8b36d7df1a93f06 -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"
Run deploy_gluster playbook, with the following changes in OSEv3
- openshift_storage_glusterfs_wipe: False
- openshift_storage_glusterfs_is_missing: False
- openshift_storage_glusterfs_heketi_is_missing: False
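(In the Ansible inventory those flags sit in the [OSEv3:vars] section; a sketch of the relevant fragment, variable names as above:)

```ini
[OSEv3:vars]
openshift_storage_glusterfs_wipe=False
openshift_storage_glusterfs_is_missing=False
openshift_storage_glusterfs_heketi_is_missing=False
```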
Verify topology
rsh into heketi pod
run heketi-exports (a file I created with export commands)
get old and new node info (id)
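(The heketi-exports file mentioned above is presumably just a set of environment exports for heketi-cli; a sketch with assumed values, where the secret path is made up and should point at wherever your admin key actually lives:)

```shell
# Presumed contents of the 'heketi-exports' helper file; source it inside
# the heketi pod so heketi-cli picks up server, user, and key from the env.
export HEKETI_CLI_SERVER=http://localhost:8080
export HEKETI_CLI_USER=admin
export HEKETI_CLI_KEY="$(cat /etc/heketi/admin-key 2>/dev/null)"
```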
Remove Node
sh-4.4# heketi-cli node remove fb344a2ea889c7e25a772e747eeeec2a -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"
Node fb344a2ea889c7e25a772e747eeeec2a is now removed
Remove All Devices (check the topology)
sh-4.4# heketi-cli device delete ea85942eaec73cb666c4e3dcec8b3702 -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"
Device ea85942eaec73cb666c4e3dcec8b3702 deleted
Delete the Node
sh-4.4# heketi-cli node delete fb344a2ea889c7e25a772e747eeeec2a -s http://localhost:8080 --user admin --secret "$HEKETI_CLI_KEY"
Node fb344a2ea889c7e25a772e747eeeec2a deleted
Verify New Topology
$ heketi-cli topology info
Make sure the new node and device are listed.
Thanks,
Todd
# -----------------------
Check that any existing PVCs are still accessible.
------------------------------
Date: Thu, 23 Aug 2018 15:40:29 +0100
From: Tim Dudgeon <tdudgeon...@gmail.com>
To: users <users@lists.openshift.redhat.com>
Subject: Replacing failed gluster node
I have a 3 node containerised glusterfs setup, and one of the nodes has
just died.
I believe I can recover the disks that were used for the gluster storage.
What is the best approach to replacing that node with a new one?
Can I just create a new node with empty disks mounted and use the
scaleup.yml playbook and [new_nodes] section, or should I be creating a
node that re-uses the existing drives?
Tim
------------------------------
_______________________________________________
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
End of users Digest, Vol 73, Issue 44