Hi! Generally I use "crm_mon -1Arfj" to see the cluster status, and I suspect it may be location constraints or stickiness preventing resource balancing. Without the configuration it's hard to guess, however.
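To illustrate how a location preference could produce this kind of result, here is a deliberately simplified toy model in Python. It is not Pacemaker's scheduler: the function `place`, the score of 50, and the starting loads (counting only the Filesystem resources shown in the "Before" status) are assumptions made for illustration. It relies only on the documented behaviour that, with equal scores, resources are spread evenly across nodes.

```python
from collections import Counter

def place(resources, nodes, scores, initial_load):
    """Toy placement: highest score wins; ties go to the least-loaded node.

    This is a hypothetical simplification, not Pacemaker's real algorithm.
    """
    load = dict(initial_load)
    placement = {}
    for res in resources:
        res_scores = scores.get(res, {})
        # Prefer the higher location score, then the node with fewer resources.
        best = max(nodes, key=lambda n: (res_scores.get(n, 0), -load[n]))
        placement[res] = best
        load[best] += 1
    return placement

nodes = ["lustre-oss-node31", "lustre-oss-node135"]
# The eight OSTs that were running on lustre-oss-node144: ost-1, ost-4, ..., ost-22.
orphans = [f"ost-{i}" for i in range(1, 23, 3)]
# Filesystem resources already running on the survivors (mgt + 8 OSTs, and 8 OSTs).
initial_load = {"lustre-oss-node31": 9, "lustre-oss-node135": 8}

# Case 1: no location preference, so the orphans are split 4 and 4.
even = place(orphans, nodes, {}, initial_load)
even_counts = dict(Counter(even.values()))
print("no preference:", even_counts)

# Case 2: a hypothetical score of 50 for lustre-oss-node31 on every orphan.
biased_scores = {res: {"lustre-oss-node31": 50} for res in orphans}
biased = place(orphans, nodes, biased_scores, initial_load)
biased_counts = dict(Counter(biased.values()))
print("with preference:", biased_counts)
```

In the second case every orphaned resource lands on lustre-oss-node31, which matches the symptom. Whether the real configuration contains such scores can only be confirmed from the constraints themselves.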
Kind regards,
Ulrich Windl

From: Users <users-boun...@clusterlabs.org> On Behalf Of chenzu...@gmail.com
Sent: Friday, June 6, 2025 10:20 AM
To: users <users@clusterlabs.org>
Subject: [EXT] [ClusterLabs] Resource is Unbalanced After Powering Off One Node

Hi all,

I am writing to report an issue with uneven resource migration in our Lustre cluster. Below are the details:

1. Background:
We have 3 physical nodes, each hosting 2 virtual machines: lustre-mds-nodexx (containing 2 MDTs) and lustre-oss-nodexx (containing 8 OSTs and MGS on one of them). We are using Lustre version 2.15.5 along with Pacemaker (2.1.0) for cluster management.

2. Problem:
After powering off lustre-oss-node144 using the command virsh destroy lustre-oss-node144, the resources from lustre-oss-node144 did not migrate evenly. All resources migrated to lustre-oss-node31.

3. Resource Status Before and After powering off lustre-oss-node144:

Before:

[root@lustre-oss-node31 ~]# pcs status
Cluster name: oss_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: lustre-oss-node144 (version 2.1.7-5.el8_10-0f7f88312) - partition with quorum
  * Last updated: Fri Jun 6 14:10:54 2025 on lustre-oss-node31
  * Last change: Fri Jun 6 14:06:46 2025 by root via root on lustre-oss-node31
  * 3 nodes configured
  * 28 resource instances configured

Node List:
  * Online: [ lustre-oss-node31 lustre-oss-node135 lustre-oss-node144 ]

Full List of Resources:
  * vmfence_lustre-oss-node31 (stonith:fence_xvm): Started lustre-oss-node144
  * vmfence_lustre-oss-node144 (stonith:fence_xvm): Started lustre-oss-node135
  * vmfence_lustre-oss-node135 (stonith:fence_xvm): Started lustre-oss-node31
  * mgt (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-0 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-3 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-6 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-9 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-12 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-15 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-18 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-21 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-1 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-4 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-7 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-10 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-13 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-16 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-19 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-22 (ocf::heartbeat:Filesystem): Started lustre-oss-node144
  * ost-2 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-5 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-8 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-11 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-14 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-17 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-20 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-23 (ocf::heartbeat:Filesystem): Started lustre-oss-node31

After:

[root@lustre-oss-node31 ~]# date;pcs status
Fri Jun 6 14:12:50 CST 2025
Cluster name: oss_cluster
Cluster Summary:
  * Stack: corosync (Pacemaker is running)
  * Current DC: lustre-oss-node135 (version 2.1.7-5.el8_10-0f7f88312) - partition with quorum
  * Last updated: Fri Jun 6 14:12:50 2025 on lustre-oss-node31
  * Last change: Fri Jun 6 14:06:46 2025 by root via root on lustre-oss-node31
  * 3 nodes configured
  * 28 resource instances configured

Node List:
  * Online: [ lustre-oss-node31 lustre-oss-node135 ]
  * OFFLINE: [ lustre-oss-node144 ]

Full List of Resources:
  * vmfence_lustre-oss-node31 (stonith:fence_xvm): Started lustre-oss-node135
  * vmfence_lustre-oss-node144 (stonith:fence_xvm): Started lustre-oss-node135
  * vmfence_lustre-oss-node135 (stonith:fence_xvm): Started lustre-oss-node31
  * mgt (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-0 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-3 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-6 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-9 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-12 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-15 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-18 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-21 (ocf::heartbeat:Filesystem): Started lustre-oss-node135
  * ost-1 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-4 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-7 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-10 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-13 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-16 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-19 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-22 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-2 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-5 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-8 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-11 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-14 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-17 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-20 (ocf::heartbeat:Filesystem): Started lustre-oss-node31
  * ost-23 (ocf::heartbeat:Filesystem): Started lustre-oss-node31

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

4. Logs:

Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-1 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-4 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-7 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-10 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-13 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-16 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-19 ( lustre-oss-node144 -> lustre-oss-node31 )
Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] (log_list_item) notice: Actions: Move ost-22 ( lustre-oss-node144 -> lustre-oss-node31 )

5. Attachments:
The attached files include the configuration (config.txt) and logs (node135.log) during the uneven migration.

Thank you for your attention and support.

Best regards
________________________________
chenzu...@gmail.com
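The scheduler log lines quoted above can be tallied mechanically, which helps when the full node135.log is much longer than this excerpt. The following Python sketch is only an illustration: the regular expression is an assumption based on the eight "Actions: Move" lines in this message, and the names `MOVE_RE` and `tally_moves` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the "Actions: Move" lines quoted in this message.
MOVE_RE = re.compile(r"Actions: Move\s+(\S+)\s+\(\s*(\S+)\s*->\s*(\S+)\s*\)")

PREFIX = ("Jun 06 14:11:12 lustre-oss-node135 pacemaker-schedulerd[1069268] "
          "(log_list_item) notice: Actions: Move ")

# The eight moves reported in the message, rebuilt from the shared prefix.
LOG_EXCERPT = "\n".join(
    f"{PREFIX}ost-{i} ( lustre-oss-node144 -> lustre-oss-node31 )"
    for i in (1, 4, 7, 10, 13, 16, 19, 22)
)

def tally_moves(log_text):
    """Count scheduler Move actions per (source, destination) pair."""
    moves = Counter()
    for match in MOVE_RE.finditer(log_text):
        _resource, src, dst = match.groups()
        moves[(src, dst)] += 1
    return moves

moves = tally_moves(LOG_EXCERPT)
for (src, dst), count in moves.items():
    print(f"{src} -> {dst}: {count}")
```

For this excerpt the tally shows all eight moves going from lustre-oss-node144 to lustre-oss-node31 and none to lustre-oss-node135.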