Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
On 03/14/2018 08:35 AM, Muhammad Sharfuddin wrote:
> Hi Andrei,
>
>> Somehow I miss corosync configuration in this thread. Do you know
>> wait-for-all is set (how?) or do you just assume it?
>
> Solution found. I was not using the "wait_for_all" option; I was
> assuming that "two_node: 1" would be sufficient:
>
> nodelist {
>     node { ring0_addr: 10.8.9.151 }
>     node { ring0_addr: 10.8.9.152 }
> }
> ### previously:
> quorum {
>     two_node: 1
>     provider: corosync_votequorum
> }
> ### now/fix:
> quorum {
>     two_node: 1
>     provider: corosync_votequorum
>     wait_for_all: 0
> }
>
> My observation: when I was not using "wait_for_all: 0" in
> corosync.conf, only the ocfs2 resources were not running; the rest of
> the resources were running fine because of:
> a - "two_node: 1" in the corosync.conf file.
> b - "no-quorum-policy=ignore" in the cib.

If you now lose the network connection between the two nodes, one node
might be lucky enough to fence the other. If fencing is set to just
power off the other node, you are probably fine. (With sbd you can
achieve this behavior if you configure it to only come up if the
corresponding slot is clean.) If fencing reboots the other node, that
node would come up and right away fence the first one via
startup-fencing.

> @ Klaus
>
>> what I tried to point out is that "no-quorum-policy=ignore" is
>> dangerous for services that do require a resource-manager. If you
>> don't have any of those, go with a systemd startup.
>
> Running a single node is obviously unacceptable, but say both nodes
> crash and only one node comes back: if I start the resources via
> systemd, then the day the other node comes back I have to stop the
> services via systemd in order to start the resources via the cluster.
> Whereas if a single-node cluster was running, the other node would
> simply join the cluster and no downtime would occur.

What I had meant (a little bit provocatively ;-) ) was: consider
whether you need the resources to be started via a resource-manager at
all.

Klaus

> --
> Regards,
> Muhammad Sharfuddin
>
> On 3/13/2018 11:20 PM, Andrei Borzenkov wrote:
>> 13.03.2018 17:32, Klaus Wenninger wrote:
>>> On 03/13/2018 02:30 PM, Muhammad Sharfuddin wrote:
>>>> Yes, by saying pacemaker, I meant to say corosync as well.
>>>> Is there any fix? Or can a two-node cluster not run ocfs2
>>>> resources when one node is offline?
>>> Actually there can't be a "fix", as 2 nodes are just not enough
>>> for a partial cluster to be quorate in the classical sense
>>> (more votes than half of the cluster nodes).
>>>
>>> So to still be able to use it we have this 2-node config that
>>> permanently sets quorum. But to not run into issues on
>>> startup we need it to require both nodes seeing each
>>> other once.
>>>
>> I'm rather confused. I have run quite a lot of 2-node clusters and
>> the standard way to resolve it is to require fencing on startup.
>> Then a single node may assume it can safely proceed with starting
>> resources. So it is rather unexpected to suddenly read "cannot be
>> fixed".
>>
>>> So this is definitely nothing that is specific to ocfs2.
>>> It just looks specific to ocfs2 because you've disabled
>>> quorum for pacemaker.
>>> To be honest, doing this you wouldn't need a resource-manager
>>> at all and could just start up your services using systemd.
>>>
>>> If you don't want a full 3rd node, and still want to handle cases
>>> where one node doesn't come up after a full shutdown of
>>> all nodes, you probably could go for a setup with qdevice.
>>>
>>> Regards,
>>> Klaus
>>>
>>>> --
>>>> Regards,
>>>> Muhammad Sharfuddin
>>>>
>>>> On 3/13/2018 6:16 PM, Klaus Wenninger wrote:
>>>>> On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
>>>>>> Hi,
>>>>>>
>>>>>> 1 - if I put a node (node2) offline, ocfs2 resources keep
>>>>>> running on the online node (node1)
>>>>>>
>>>>>> 2 - while node2 was offline, via the cluster I stopped/started
>>>>>> the ocfs2 resource group successfully many times in a row.
>>>>>>
>>>>>> 3 - while node2 was offline, I restarted the pacemaker service
>>>>>> on node1 and then tried to start the ocfs2 resource group; dlm
>>>>>> started but the ocfs2 file system resource does not start.
>>>>>>
>>>>>> Nutshell:
>>>>>>
>>>>>> a - both nodes must be online to start the ocfs2 resource.
>>>>>>
>>>>>> b - if one crashes or goes offline (gracefully), the ocfs2
>>>>>> resource keeps running on the other/surviving node.
>>>>>>
>>>>>> c - while one node was offline, we can stop/start the ocfs2
>>>>>> resource group on the surviving node, but if we stop the
>>>>>> pacemaker service, then the ocfs2 file system resource does
>>>>>> not start, with the following info in the logs:
>>>>> From the logs I would say startup of dlm_controld times out
>>>>> because it is waiting for quorum - which doesn't happen because
>>>>> of wait-for-all.
>> Somehow I miss corosync configuration in this thread. Do you know
>> wait-for-all is set (how?) or do you just assume it?
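The sbd behavior Klaus refers to ("only come up if the corresponding
slot is clean") is controlled by sbd's start mode. A minimal sketch of
/etc/sysconfig/sbd, using the device path from this thread; the
watchdog line is illustrative:

    # /etc/sysconfig/sbd (sketch, not a complete file)
    SBD_DEVICE="/dev/mapper/sbd"
    SBD_WATCHDOG_DEV="/dev/watchdog"
    # "clean": refuse to start if this node's own slot still holds a
    # fencing message, i.e. a fenced node stays down until an admin
    # clears the slot instead of rejoining and fencing the survivor
    SBD_STARTMODE="clean"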
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
On Wed, Mar 14, 2018 at 10:35 AM, Muhammad Sharfuddin wrote:
> Hi Andrei,
>
>> Somehow I miss corosync configuration in this thread. Do you know
>> wait-for-all is set (how?) or do you just assume it?
>
> Solution found. I was not using the "wait_for_all" option; I was
> assuming that "two_node: 1" would be sufficient:
>
> nodelist {
>     node { ring0_addr: 10.8.9.151 }
>     node { ring0_addr: 10.8.9.152 }
> }
> ### previously:
> quorum {
>     two_node: 1
>     provider: corosync_votequorum
> }
> ### now/fix:
> quorum {
>     two_node: 1
>     provider: corosync_votequorum
>     wait_for_all: 0
> }
>
> My observation: when I was not using "wait_for_all: 0" in
> corosync.conf, only the ocfs2 resources were not running; the rest of
> the resources were running fine because of:

OK, I tested it and indeed, when wait_for_all is (explicitly) disabled,
a single node comes up quorate (immediately). It still requests fencing
of the other node. So, trying to wrap my head around it:

1. two_node=1 appears to only permanently set the "quorate" state for
each node. So whether you have 1 or 2 nodes, you are in quorum. E.g.
with expected_votes=2, even if I kill one node I am left with a single
node that believes it is in a "partition with quorum".

2. two_node=1 implicitly sets wait_for_all, which prevents corosync
from entering the quorate state until all nodes have been up. Once they
have been up, we are left in quorum.

As long as OCFS2 requires quorum to be attained, this also explains
your observation.

> a - "two_node: 1" in the corosync.conf file.
> b - "no-quorum-policy=ignore" in the cib.

If my reasoning above is correct, I question the value of
wait_for_all=1 with two_node. This is the difference between
"pretending we have quorum" and "ignoring we have no quorum", but split
between different layers. The end effect is the same as long as the
corosync quorum state is not queried directly.

> @ Klaus
>
>> what I tried to point out is that "no-quorum-policy=ignore" is
>> dangerous for services that do require a resource-manager. If you
>> don't have any of those, go with a systemd startup.
>
> Running a single node is obviously unacceptable, but say both nodes
> crash and only one node comes back: if I start the resources via
> systemd, then the day the other node comes back I have to stop the
> services via systemd in order to start the resources via the cluster.
> Whereas if a single-node cluster was running, the other node would
> simply join the cluster and no downtime would occur.

Exactly. There is simply no other way to sensibly use a two-node
cluster without it, and I argue that the notion of quorum is not
relevant to most parts of pacemaker operation at all, as long as
stonith works properly. Again - if you use two_node=1, your cluster is
ALWAYS in quorum except during initial startup. So
no-quorum-policy=ignore is redundant. It is only needed because of the
implicit wait_for_all=1. But if everyone ignores the implicit
wait_for_all=1 anyway, what's the point of setting it by default?
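Andrei's test can be checked directly against corosync's own view of
quorum on the surviving node. A sketch; the output shown is abbreviated
and illustrative, exact fields vary by corosync version:

    corosync-quorumtool -s
    # Quorate:  Yes
    # Flags:    2Node Quorate WaitForAll
    #           (the WaitForAll flag disappears with "wait_for_all: 0")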
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
Hi Andrei,

> Somehow I miss corosync configuration in this thread. Do you know
> wait-for-all is set (how?) or do you just assume it?

Solution found. I was not using the "wait_for_all" option; I was
assuming that "two_node: 1" would be sufficient:

nodelist {
    node { ring0_addr: 10.8.9.151 }
    node { ring0_addr: 10.8.9.152 }
}
### previously:
quorum {
    two_node: 1
    provider: corosync_votequorum
}
### now/fix:
quorum {
    two_node: 1
    provider: corosync_votequorum
    wait_for_all: 0
}

My observation: when I was not using "wait_for_all: 0" in
corosync.conf, only the ocfs2 resources were not running; the rest of
the resources were running fine because of:
a - "two_node: 1" in the corosync.conf file.
b - "no-quorum-policy=ignore" in the cib.

@ Klaus

> what I tried to point out is that "no-quorum-policy=ignore" is
> dangerous for services that do require a resource-manager. If you
> don't have any of those, go with a systemd startup.

Running a single node is obviously unacceptable, but say both nodes
crash and only one node comes back: if I start the resources via
systemd, then the day the other node comes back I have to stop the
services via systemd in order to start the resources via the cluster.
Whereas if a single-node cluster was running, the other node would
simply join the cluster and no downtime would occur.

--
Regards,
Muhammad Sharfuddin

On 3/13/2018 11:20 PM, Andrei Borzenkov wrote:
> 13.03.2018 17:32, Klaus Wenninger wrote:
>> On 03/13/2018 02:30 PM, Muhammad Sharfuddin wrote:
>>> Yes, by saying pacemaker, I meant to say corosync as well.
>>> Is there any fix? Or can a two-node cluster not run ocfs2 resources
>>> when one node is offline?
>> Actually there can't be a "fix", as 2 nodes are just not enough for
>> a partial cluster to be quorate in the classical sense (more votes
>> than half of the cluster nodes).
>>
>> So to still be able to use it we have this 2-node config that
>> permanently sets quorum. But to not run into issues on startup we
>> need it to require both nodes seeing each other once.
>
> I'm rather confused. I have run quite a lot of 2-node clusters and
> the standard way to resolve it is to require fencing on startup. Then
> a single node may assume it can safely proceed with starting
> resources. So it is rather unexpected to suddenly read "cannot be
> fixed".
>
>> So this is definitely nothing that is specific to ocfs2. It just
>> looks specific to ocfs2 because you've disabled quorum for
>> pacemaker.
>> To be honest, doing this you wouldn't need a resource-manager at all
>> and could just start up your services using systemd.
>>
>> If you don't want a full 3rd node, and still want to handle cases
>> where one node doesn't come up after a full shutdown of all nodes,
>> you probably could go for a setup with qdevice.
>>
>> Regards,
>> Klaus
>>
>>> --
>>> Regards,
>>> Muhammad Sharfuddin
>>>
>>> On 3/13/2018 6:16 PM, Klaus Wenninger wrote:
>>>> On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
>>>>> Hi,
>>>>>
>>>>> 1 - if I put a node (node2) offline, ocfs2 resources keep running
>>>>> on the online node (node1)
>>>>>
>>>>> 2 - while node2 was offline, via the cluster I stopped/started
>>>>> the ocfs2 resource group successfully many times in a row.
>>>>>
>>>>> 3 - while node2 was offline, I restarted the pacemaker service on
>>>>> node1 and then tried to start the ocfs2 resource group; dlm
>>>>> started but the ocfs2 file system resource does not start.
>>>>>
>>>>> Nutshell:
>>>>>
>>>>> a - both nodes must be online to start the ocfs2 resource.
>>>>>
>>>>> b - if one crashes or goes offline (gracefully), the ocfs2
>>>>> resource keeps running on the other/surviving node.
>>>>>
>>>>> c - while one node was offline, we can stop/start the ocfs2
>>>>> resource group on the surviving node, but if we stop the
>>>>> pacemaker service, then the ocfs2 file system resource does not
>>>>> start, with the following info in the logs:
>>>> From the logs I would say startup of dlm_controld times out
>>>> because it is waiting for quorum - which doesn't happen because of
>>>> wait-for-all.
>
> Somehow I miss corosync configuration in this thread. Do you know
> wait-for-all is set (how?) or do you just assume it?
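Since two_node implicitly enables wait_for_all, the config file alone
can be misleading; the options actually in effect can be read from
corosync's runtime configuration map. A sketch - key names and output
format vary by corosync version, the values shown are illustrative:

    # dump all quorum-related keys from the runtime config map
    corosync-cmapctl | grep -i quorum
    # e.g.  quorum.provider (str) = corosync_votequorum
    #       quorum.two_node (u8) = 1
    #       quorum.wait_for_all (u8) = 0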
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
On 03/13/2018 03:43 PM, Muhammad Sharfuddin wrote:
> Thanks a lot for the explanation. But other than the ocfs2 resource
> group, this cluster starts all other resources on a single node
> without any issue, just because of the "no-quorum-policy=ignore"
> option.

Yes, I know. And what I tried to point out is that
"no-quorum-policy=ignore" is dangerous for services that do require a
resource-manager. If you don't have any of those, go with a systemd
startup.

Regards,
Klaus

> --
> Regards,
> Muhammad Sharfuddin
>
> On 3/13/2018 7:32 PM, Klaus Wenninger wrote:
>> On 03/13/2018 02:30 PM, Muhammad Sharfuddin wrote:
>>> Yes, by saying pacemaker, I meant to say corosync as well.
>>>
>>> Is there any fix? Or can a two-node cluster not run ocfs2 resources
>>> when one node is offline?
>> Actually there can't be a "fix", as 2 nodes are just not enough
>> for a partial cluster to be quorate in the classical sense
>> (more votes than half of the cluster nodes).
>>
>> So to still be able to use it we have this 2-node config that
>> permanently sets quorum. But to not run into issues on
>> startup we need it to require both nodes seeing each
>> other once.
>>
>> So this is definitely nothing that is specific to ocfs2.
>> It just looks specific to ocfs2 because you've disabled
>> quorum for pacemaker.
>> To be honest, doing this you wouldn't need a resource-manager
>> at all and could just start up your services using systemd.
>>
>> If you don't want a full 3rd node, and still want to handle cases
>> where one node doesn't come up after a full shutdown of
>> all nodes, you probably could go for a setup with qdevice.
>>
>> Regards,
>> Klaus
>>
>>> --
>>> Regards,
>>> Muhammad Sharfuddin
>>>
>>> On 3/13/2018 6:16 PM, Klaus Wenninger wrote:
>>>> On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
>>>>> Hi,
>>>>>
>>>>> 1 - if I put a node (node2) offline, ocfs2 resources keep running
>>>>> on the online node (node1)
>>>>>
>>>>> 2 - while node2 was offline, via the cluster I stopped/started
>>>>> the ocfs2 resource group successfully many times in a row.
>>>>>
>>>>> 3 - while node2 was offline, I restarted the pacemaker service on
>>>>> node1 and then tried to start the ocfs2 resource group; dlm
>>>>> started but the ocfs2 file system resource does not start.
>>>>>
>>>>> Nutshell:
>>>>>
>>>>> a - both nodes must be online to start the ocfs2 resource.
>>>>>
>>>>> b - if one crashes or goes offline (gracefully), the ocfs2
>>>>> resource keeps running on the other/surviving node.
>>>>>
>>>>> c - while one node was offline, we can stop/start the ocfs2
>>>>> resource group on the surviving node, but if we stop the
>>>>> pacemaker service, then the ocfs2 file system resource does not
>>>>> start, with the following info in the logs:
>>>> From the logs I would say startup of dlm_controld times out
>>>> because it is waiting for quorum - which doesn't happen because of
>>>> wait-for-all.
>>>> Question is if you really just stopped pacemaker or if you stopped
>>>> corosync as well. In the latter case I would say it is the
>>>> expected behavior.
>>>>
>>>> Regards,
>>>> Klaus
>>>>
>>>>> lrmd[4317]: notice: executing - rsc:p-fssapmnt action:start call_id:53
>>>>> Filesystem(p-fssapmnt)[5139]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
>>>>> kernel: [ 706.162676] dlm: Using TCP for communications
>>>>> kernel: [ 706.162916] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
>>>>> dlm_controld[5105]: 759 fence work wait for quorum
>>>>> dlm_controld[5105]: 764 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>>> lrmd[4317]: warning: p-fssapmnt_start_0 process (PID 5139) timed out
>>>>> lrmd[4317]: warning: p-fssapmnt_start_0:5139 - timed out after 6ms
>>>>> lrmd[4317]: notice: finished - rsc:p-fssapmnt action:start call_id:53 pid:5139 exit-code:1 exec-time:60002ms queue-time:0ms
>>>>> kernel: [ 766.056514] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
>>>>> kernel: [ 766.056528] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
>>>>> crmd[4320]: notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
>>>>> crmd[4320]: notice: Initiating stop operation dlm_stop_0 locally on pipci001
>>>>> lrmd[4317]: notice: executing - rsc:dlm action:stop call_id:56
>>>>> dlm_controld[5105]: 766 shutdown ignored, active lockspaces
>>>>> lrmd[4317]: warning: dlm_stop_0 process (PID 5326) timed out
>>>>> lrmd[4317]: warning: dlm_stop_0:5326 - timed out after 10ms
>>>>> lrmd[4317]: notice: finished - rsc:dlm action:stop call_id:56 pid:5326 exit-code:1 exec-time:13ms queue-time:0ms
>>>>> crmd[4320]: error: Result of stop operation for dlm on pipci001: Timed Out
>>>>> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>>> crmd[4320]: notice: Transition aborted by operation dlm_stop_0 'modify' on pipci001: Event failed
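For comparison, DLM/OCFS2 stacks are often run with "freeze" rather
than "ignore": on quorum loss the node keeps what it already has
running but starts or stops nothing, which avoids the split-brain
hazard Klaus describes while still allowing a lone node to keep
serving. A crmsh sketch; whether it fits a given cluster depends on the
fencing setup:

    # leave running resources untouched on quorum loss
    crm configure property no-quorum-policy=freeze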
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
Thanks a lot for the explanation. But other than the ocfs2 resource
group, this cluster starts all other resources on a single node without
any issue, just because of the "no-quorum-policy=ignore" option.

--
Regards,
Muhammad Sharfuddin

On 3/13/2018 7:32 PM, Klaus Wenninger wrote:
> On 03/13/2018 02:30 PM, Muhammad Sharfuddin wrote:
>> Yes, by saying pacemaker, I meant to say corosync as well.
>>
>> Is there any fix? Or can a two-node cluster not run ocfs2 resources
>> when one node is offline?
> Actually there can't be a "fix", as 2 nodes are just not enough for a
> partial cluster to be quorate in the classical sense (more votes than
> half of the cluster nodes).
>
> So to still be able to use it we have this 2-node config that
> permanently sets quorum. But to not run into issues on startup we
> need it to require both nodes seeing each other once.
>
> So this is definitely nothing that is specific to ocfs2. It just
> looks specific to ocfs2 because you've disabled quorum for pacemaker.
> To be honest, doing this you wouldn't need a resource-manager at all
> and could just start up your services using systemd.
>
> If you don't want a full 3rd node, and still want to handle cases
> where one node doesn't come up after a full shutdown of all nodes,
> you probably could go for a setup with qdevice.
>
> Regards,
> Klaus
>
>> --
>> Regards,
>> Muhammad Sharfuddin
>>
>> On 3/13/2018 6:16 PM, Klaus Wenninger wrote:
>>> On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
>>>> Hi,
>>>>
>>>> 1 - if I put a node (node2) offline, ocfs2 resources keep running
>>>> on the online node (node1)
>>>>
>>>> 2 - while node2 was offline, via the cluster I stopped/started the
>>>> ocfs2 resource group successfully many times in a row.
>>>>
>>>> 3 - while node2 was offline, I restarted the pacemaker service on
>>>> node1 and then tried to start the ocfs2 resource group; dlm
>>>> started but the ocfs2 file system resource does not start.
>>>>
>>>> Nutshell:
>>>>
>>>> a - both nodes must be online to start the ocfs2 resource.
>>>>
>>>> b - if one crashes or goes offline (gracefully), the ocfs2
>>>> resource keeps running on the other/surviving node.
>>>>
>>>> c - while one node was offline, we can stop/start the ocfs2
>>>> resource group on the surviving node, but if we stop the pacemaker
>>>> service, then the ocfs2 file system resource does not start, with
>>>> the following info in the logs:
>>> From the logs I would say startup of dlm_controld times out because
>>> it is waiting for quorum - which doesn't happen because of
>>> wait-for-all.
>>> Question is if you really just stopped pacemaker or if you stopped
>>> corosync as well. In the latter case I would say it is the expected
>>> behavior.
>>>
>>> Regards,
>>> Klaus
>>>
>>>> lrmd[4317]: notice: executing - rsc:p-fssapmnt action:start call_id:53
>>>> Filesystem(p-fssapmnt)[5139]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
>>>> kernel: [ 706.162676] dlm: Using TCP for communications
>>>> kernel: [ 706.162916] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
>>>> dlm_controld[5105]: 759 fence work wait for quorum
>>>> dlm_controld[5105]: 764 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>> lrmd[4317]: warning: p-fssapmnt_start_0 process (PID 5139) timed out
>>>> lrmd[4317]: warning: p-fssapmnt_start_0:5139 - timed out after 6ms
>>>> lrmd[4317]: notice: finished - rsc:p-fssapmnt action:start call_id:53 pid:5139 exit-code:1 exec-time:60002ms queue-time:0ms
>>>> kernel: [ 766.056514] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
>>>> kernel: [ 766.056528] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
>>>> crmd[4320]: notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
>>>> crmd[4320]: notice: Initiating stop operation dlm_stop_0 locally on pipci001
>>>> lrmd[4317]: notice: executing - rsc:dlm action:stop call_id:56
>>>> dlm_controld[5105]: 766 shutdown ignored, active lockspaces
>>>> lrmd[4317]: warning: dlm_stop_0 process (PID 5326) timed out
>>>> lrmd[4317]: warning: dlm_stop_0:5326 - timed out after 10ms
>>>> lrmd[4317]: notice: finished - rsc:dlm action:stop call_id:56 pid:5326 exit-code:1 exec-time:13ms queue-time:0ms
>>>> crmd[4320]: error: Result of stop operation for dlm on pipci001: Timed Out
>>>> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>> crmd[4320]: notice: Transition aborted by operation dlm_stop_0 'modify' on pipci001: Event failed
>>>> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>> pengine[4319]: notice: Watchdog will be used via SBD if fencing is required
>>>> pengine[4319]: notice: On loss of CCM Quorum: Ignore
>>>> pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
>>>> pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
>>>> pengine[4319]: warning: Cluster node pipci001 will be fenced: dlm:0 failed there
>>>> pengine[4319]: warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>>> pengine[4319]: notice: Stop of failed resource dlm:0 is implicit after pipci001 is fenced
>>>> pengine[4319]: notice: * Fence pipci001
>>>> pengine[4319]: notice: Stop sbd-stonith#011(pipci001)
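The resource definitions behind these logs are not shown anywhere in
the thread. A hypothetical crmsh layout consistent with the names that
do appear (dlm, p-fssapmnt, /dev/mapper/sapmnt, /sapmnt) might look
like the sketch below; the agents are the usual ones for such a stack,
and the timeouts are illustrative:

    # cloned "base" stack: DLM must be running on a node before any
    # OCFS2 filesystem can be mounted there
    primitive dlm ocf:pacemaker:controld \
        op monitor interval=60 timeout=60
    primitive p-fssapmnt ocf:heartbeat:Filesystem \
        params device="/dev/mapper/sapmnt" directory="/sapmnt" fstype=ocfs2 \
        op monitor interval=20 timeout=40
    group base-group dlm p-fssapmnt
    clone base-clone base-group meta interleave=true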
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
On 03/13/2018 02:30 PM, Muhammad Sharfuddin wrote:
> Yes, by saying pacemaker, I meant to say corosync as well.
>
> Is there any fix? Or can a two-node cluster not run ocfs2 resources
> when one node is offline?

Actually there can't be a "fix", as 2 nodes are just not enough for a
partial cluster to be quorate in the classical sense (more votes than
half of the cluster nodes).

So to still be able to use it we have this 2-node config that
permanently sets quorum. But to not run into issues on startup we need
it to require both nodes seeing each other once.

So this is definitely nothing that is specific to ocfs2. It just looks
specific to ocfs2 because you've disabled quorum for pacemaker. To be
honest, doing this you wouldn't need a resource-manager at all and
could just start up your services using systemd.

If you don't want a full 3rd node, and still want to handle cases where
one node doesn't come up after a full shutdown of all nodes, you
probably could go for a setup with qdevice.

Regards,
Klaus

> --
> Regards,
> Muhammad Sharfuddin
>
> On 3/13/2018 6:16 PM, Klaus Wenninger wrote:
>> On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
>>> Hi,
>>>
>>> 1 - if I put a node (node2) offline, ocfs2 resources keep running
>>> on the online node (node1)
>>>
>>> 2 - while node2 was offline, via the cluster I stopped/started the
>>> ocfs2 resource group successfully many times in a row.
>>>
>>> 3 - while node2 was offline, I restarted the pacemaker service on
>>> node1 and then tried to start the ocfs2 resource group; dlm started
>>> but the ocfs2 file system resource does not start.
>>>
>>> Nutshell:
>>>
>>> a - both nodes must be online to start the ocfs2 resource.
>>>
>>> b - if one crashes or goes offline (gracefully), the ocfs2 resource
>>> keeps running on the other/surviving node.
>>>
>>> c - while one node was offline, we can stop/start the ocfs2
>>> resource group on the surviving node, but if we stop the pacemaker
>>> service, then the ocfs2 file system resource does not start, with
>>> the following info in the logs:
>> From the logs I would say startup of dlm_controld times out because
>> it is waiting for quorum - which doesn't happen because of
>> wait-for-all.
>> Question is if you really just stopped pacemaker or if you stopped
>> corosync as well. In the latter case I would say it is the expected
>> behavior.
>>
>> Regards,
>> Klaus
>>
>>> lrmd[4317]: notice: executing - rsc:p-fssapmnt action:start call_id:53
>>> Filesystem(p-fssapmnt)[5139]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
>>> kernel: [ 706.162676] dlm: Using TCP for communications
>>> kernel: [ 706.162916] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
>>> dlm_controld[5105]: 759 fence work wait for quorum
>>> dlm_controld[5105]: 764 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>> lrmd[4317]: warning: p-fssapmnt_start_0 process (PID 5139) timed out
>>> lrmd[4317]: warning: p-fssapmnt_start_0:5139 - timed out after 6ms
>>> lrmd[4317]: notice: finished - rsc:p-fssapmnt action:start call_id:53 pid:5139 exit-code:1 exec-time:60002ms queue-time:0ms
>>> kernel: [ 766.056514] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
>>> kernel: [ 766.056528] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
>>> crmd[4320]: notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
>>> crmd[4320]: notice: Initiating stop operation dlm_stop_0 locally on pipci001
>>> lrmd[4317]: notice: executing - rsc:dlm action:stop call_id:56
>>> dlm_controld[5105]: 766 shutdown ignored, active lockspaces
>>> lrmd[4317]: warning: dlm_stop_0 process (PID 5326) timed out
>>> lrmd[4317]: warning: dlm_stop_0:5326 - timed out after 10ms
>>> lrmd[4317]: notice: finished - rsc:dlm action:stop call_id:56 pid:5326 exit-code:1 exec-time:13ms queue-time:0ms
>>> crmd[4320]: error: Result of stop operation for dlm on pipci001: Timed Out
>>> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>> crmd[4320]: notice: Transition aborted by operation dlm_stop_0 'modify' on pipci001: Event failed
>>> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>> pengine[4319]: notice: Watchdog will be used via SBD if fencing is required
>>> pengine[4319]: notice: On loss of CCM Quorum: Ignore
>>> pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
>>> pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
>>> pengine[4319]: warning: Cluster node pipci001 will be fenced: dlm:0 failed there
>>> pengine[4319]: warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
>>> pengine[4319]: notice: Stop of failed resource dlm:0 is implicit after pipci001 is fenced
>>> pengine[4319]: notice: * Fence pipci001
>>> pengine[4319]:
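Klaus's qdevice suggestion, sketched as a corosync.conf fragment. The
qnetd host address is hypothetical, and note that two_node is dropped
here because the quorum device itself supplies the tie-breaking third
vote:

    quorum {
        provider: corosync_votequorum
        # no two_node: the qdevice provides the extra vote instead
        device {
            model: net
            net {
                host: 10.8.9.153      # hypothetical host running corosync-qnetd
                algorithm: ffsplit    # deterministic winner on an even split
            }
        }
    }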
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
On 03/13/2018 02:03 PM, Muhammad Sharfuddin wrote:
> Hi,
>
> 1 - if I put a node (node2) offline, ocfs2 resources keep running on
> the online node (node1)
>
> 2 - while node2 was offline, via the cluster I stopped/started the
> ocfs2 resource group successfully many times in a row.
>
> 3 - while node2 was offline, I restarted the pacemaker service on
> node1 and then tried to start the ocfs2 resource group; dlm started
> but the ocfs2 file system resource does not start.
>
> Nutshell:
>
> a - both nodes must be online to start the ocfs2 resource.
>
> b - if one crashes or goes offline (gracefully), the ocfs2 resource
> keeps running on the other/surviving node.
>
> c - while one node was offline, we can stop/start the ocfs2 resource
> group on the surviving node, but if we stop the pacemaker service,
> then the ocfs2 file system resource does not start, with the
> following info in the logs:

From the logs I would say startup of dlm_controld times out because it
is waiting for quorum - which doesn't happen because of wait-for-all.
Question is if you really just stopped pacemaker or if you stopped
corosync as well. In the latter case I would say it is the expected
behavior.

Regards,
Klaus

> lrmd[4317]: notice: executing - rsc:p-fssapmnt action:start call_id:53
> Filesystem(p-fssapmnt)[5139]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
> kernel: [ 706.162676] dlm: Using TCP for communications
> kernel: [ 706.162916] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
> dlm_controld[5105]: 759 fence work wait for quorum
> dlm_controld[5105]: 764 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
> lrmd[4317]: warning: p-fssapmnt_start_0 process (PID 5139) timed out
> lrmd[4317]: warning: p-fssapmnt_start_0:5139 - timed out after 6ms
> lrmd[4317]: notice: finished - rsc:p-fssapmnt action:start call_id:53 pid:5139 exit-code:1 exec-time:60002ms queue-time:0ms
> kernel: [ 766.056514] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
> kernel: [ 766.056528] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
> crmd[4320]: notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
> crmd[4320]: notice: Initiating stop operation dlm_stop_0 locally on pipci001
> lrmd[4317]: notice: executing - rsc:dlm action:stop call_id:56
> dlm_controld[5105]: 766 shutdown ignored, active lockspaces
> lrmd[4317]: warning: dlm_stop_0 process (PID 5326) timed out
> lrmd[4317]: warning: dlm_stop_0:5326 - timed out after 10ms
> lrmd[4317]: notice: finished - rsc:dlm action:stop call_id:56 pid:5326 exit-code:1 exec-time:13ms queue-time:0ms
> crmd[4320]: error: Result of stop operation for dlm on pipci001: Timed Out
> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
> crmd[4320]: notice: Transition aborted by operation dlm_stop_0 'modify' on pipci001: Event failed
> crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
> pengine[4319]: notice: Watchdog will be used via SBD if fencing is required
> pengine[4319]: notice: On loss of CCM Quorum: Ignore
> pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
> pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
> pengine[4319]: warning: Cluster node pipci001 will be fenced: dlm:0 failed there
> pengine[4319]: warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
> pengine[4319]: notice: Stop of failed resource dlm:0 is implicit after pipci001 is fenced
> pengine[4319]: notice: * Fence pipci001
> pengine[4319]: notice: Stop sbd-stonith#011(pipci001)
> pengine[4319]: notice: Stop dlm:0#011(pipci001)
> crmd[4320]: notice: Requesting fencing (reboot) of node pipci001
> stonith-ng[4316]: notice: Client crmd.4320.4c2f757b wants to fence (reboot) 'pipci001' with device '(any)'
> stonith-ng[4316]: notice: Requesting peer fencing (reboot) of pipci001
> stonith-ng[4316]: notice: sbd-stonith can fence (reboot) pipci001: dynamic-list
>
> --
> Regards,
> Muhammad Sharfuddin | +923332144823 | nds.com.pk
>
> On 3/13/2018 1:04 PM, Ulrich Windl wrote:
>> Hi!
>>
>> I'd recommend this: Cleanly boot your nodes, avoiding any manual
>> operation with cluster resources. Keep the logs. Then start your
>> tests, keeping the logs for each. Try to fix issues by reading the
>> logs and adjusting the cluster configuration, and not by starting
>> commands that the cluster should start.
>>
>> We had a 2-node OCFS2 cluster running for quite some time with
>> SLES11, but now the cluster is three nodes. To me the output of
>> "crm_mon -1Arfj" combined with having set record-pending=true was
>> very valuable for finding problems.
>>
>> Regards,
>> Ulrich
>>
>>> Muhammad Sharfuddin wrote on 13.03.2018 at 08:43 in message
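Ulrich's two diagnostic aids can be reproduced as follows; a sketch,
since the exact set of columns crm_mon prints depends on the
pacemaker/crmsh versions in use:

    # record pending operations in the CIB status section, so that
    # "stuck" starts/stops are visible while they are still running
    crm configure op_defaults record-pending=true

    # one-shot cluster overview, exactly as quoted above
    crm_mon -1Arfj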
Re: [ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
Hi,

1 - if I put a node (node2) offline, ocfs2 resources keep running on
the online node (node1)

2 - while node2 was offline, via the cluster I stopped/started the
ocfs2 resource group successfully many times in a row.

3 - while node2 was offline, I restarted the pacemaker service on node1
and then tried to start the ocfs2 resource group; dlm started but the
ocfs2 file system resource does not start.

Nutshell:

a - both nodes must be online to start the ocfs2 resource.

b - if one crashes or goes offline (gracefully), the ocfs2 resource
keeps running on the other/surviving node.

c - while one node was offline, we can stop/start the ocfs2 resource
group on the surviving node, but if we stop the pacemaker service, then
the ocfs2 file system resource does not start, with the following info
in the logs:

lrmd[4317]: notice: executing - rsc:p-fssapmnt action:start call_id:53
Filesystem(p-fssapmnt)[5139]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
kernel: [ 706.162676] dlm: Using TCP for communications
kernel: [ 706.162916] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
dlm_controld[5105]: 759 fence work wait for quorum
dlm_controld[5105]: 764 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
lrmd[4317]: warning: p-fssapmnt_start_0 process (PID 5139) timed out
lrmd[4317]: warning: p-fssapmnt_start_0:5139 - timed out after 6ms
lrmd[4317]: notice: finished - rsc:p-fssapmnt action:start call_id:53 pid:5139 exit-code:1 exec-time:60002ms queue-time:0ms
kernel: [ 766.056514] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
kernel: [ 766.056528] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
crmd[4320]: notice: Result of stop operation for p-fssapmnt on pipci001: 0 (ok)
crmd[4320]: notice: Initiating stop operation dlm_stop_0 locally on pipci001
lrmd[4317]: notice: executing - rsc:dlm action:stop call_id:56
dlm_controld[5105]: 766 shutdown ignored, active lockspaces
lrmd[4317]: warning: dlm_stop_0 process (PID 5326) timed out
lrmd[4317]: warning: dlm_stop_0:5326 - timed out after 10ms
lrmd[4317]: notice: finished - rsc:dlm action:stop call_id:56 pid:5326 exit-code:1 exec-time:13ms queue-time:0ms
crmd[4320]: error: Result of stop operation for dlm on pipci001: Timed Out
crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
crmd[4320]: notice: Transition aborted by operation dlm_stop_0 'modify' on pipci001: Event failed
crmd[4320]: warning: Action 15 (dlm_stop_0) on pipci001 failed (target: 0 vs. rc: 1): Error
pengine[4319]: notice: Watchdog will be used via SBD if fencing is required
pengine[4319]: notice: On loss of CCM Quorum: Ignore
pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
pengine[4319]: warning: Processing failed op stop for dlm:0 on pipci001: unknown error (1)
pengine[4319]: warning: Cluster node pipci001 will be fenced: dlm:0 failed there
pengine[4319]: warning: Processing failed op start for p-fssapmnt:0 on pipci001: unknown error (1)
pengine[4319]: notice: Stop of failed resource dlm:0 is implicit after pipci001 is fenced
pengine[4319]: notice: * Fence pipci001
pengine[4319]: notice: Stop sbd-stonith#011(pipci001)
pengine[4319]: notice: Stop dlm:0#011(pipci001)
crmd[4320]: notice: Requesting fencing (reboot) of node pipci001
stonith-ng[4316]: notice: Client crmd.4320.4c2f757b wants to fence (reboot) 'pipci001' with device '(any)'
stonith-ng[4316]: notice: Requesting peer fencing (reboot) of pipci001
stonith-ng[4316]: notice: sbd-stonith can fence (reboot) pipci001: dynamic-list

--
Regards,
Muhammad Sharfuddin | +923332144823 | nds.com.pk

On 3/13/2018 1:04 PM, Ulrich Windl wrote:
> Hi!
>
> I'd recommend this: Cleanly boot your nodes, avoiding any manual
> operation with cluster resources. Keep the logs. Then start your
> tests, keeping the logs for each. Try to fix issues by reading the
> logs and adjusting the cluster configuration, and not by starting
> commands that the cluster should start.
>
> We had a 2-node OCFS2 cluster running for quite some time with
> SLES11, but now the cluster is three nodes. To me the output of
> "crm_mon -1Arfj" combined with having set record-pending=true was
> very valuable for finding problems.
>
> Regards,
> Ulrich
>
>>>> Muhammad Sharfuddin wrote on 13.03.2018 at 08:43 in message
>>>> <7b773ae9-4209-d246-b5c0-2c8b67e62...@nds.com.pk>:
>> Dear Klaus,
>>
>> If I understand you properly then it's a fencing issue, and whatever
>> I am facing is "natural" or "by design" in a two-node cluster where
>> quorum is incomplete.
>>
>> I am quite convinced that you have pointed it out right because,
>> when I start the dlm resource via the cluster and then try to start
>> the ocfs2 file system manually from the command line, the mount
>> command remains hung and the following events are reported in the
>> logs:
>>
>> kernel: [62622.864828] ocfs2: Registered cluster interface user
>> kernel:
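The manual experiment described above can be repeated step by step for
log collection; a sketch, using the device and mount point from this
thread:

    # with only the cluster's dlm resource running, try the mount by
    # hand; it is expected to hang while dlm_controld waits for quorum
    mount -t ocfs2 /dev/mapper/sapmnt /sapmnt &

    # see where the mount helper is blocked in the kernel
    cat /proc/$(pidof mount.ocfs2)/stack

    # and what dlm_controld thinks of its lockspaces
    dlm_tool ls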
[ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
Hi!

I'd recommend this: Cleanly boot your nodes, avoiding any manual
operation with cluster resources. Keep the logs. Then start your tests,
keeping the logs for each. Try to fix issues by reading the logs and
adjusting the cluster configuration, and not by starting commands that
the cluster should start.

We had a 2-node OCFS2 cluster running for quite some time with SLES11,
but now the cluster is three nodes. To me the output of "crm_mon
-1Arfj" combined with having set record-pending=true was very valuable
for finding problems.

Regards,
Ulrich

>>> Muhammad Sharfuddin wrote on 13.03.2018 at 08:43 in message
>>> <7b773ae9-4209-d246-b5c0-2c8b67e62...@nds.com.pk>:
> Dear Klaus,
>
> If I understand you properly then it's a fencing issue, and whatever
> I am facing is "natural" or "by design" in a two-node cluster where
> quorum is incomplete.
>
> I am quite convinced that you have pointed it out right because, when
> I start the dlm resource via the cluster and then try to start the
> ocfs2 file system manually from the command line, the mount command
> remains hung and the following events are reported in the logs:
>
> kernel: [62622.864828] ocfs2: Registered cluster interface user
> kernel: [62622.884427] dlm: Using TCP for communications
> kernel: [62622.884750] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group...
> dlm_controld[17655]: 62627 fence work wait for quorum
> dlm_controld[17655]: 62680 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>
> and then the following messages keep being reported every 5-10
> minutes, until I kill the mount.ocfs2 process:
>
> dlm_controld[17655]: 62627 fence work wait for quorum
> dlm_controld[17655]: 62680 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>
> I am also very much confused, because yesterday I did the same and
> was able to mount the ocfs2 file system manually from the command
> line (at least once), and then unmount the file system manually, stop
> the dlm resource from the cluster, and then start/stop the complete
> ocfs2 resource stack (dlm, file systems) successfully via the cluster
> even when only one machine was online.
>
> In a two-node cluster which has ocfs2 resources, can't we run the
> ocfs2 resources when quorum is incomplete (one node is offline)?
>
> --
> Regards,
> Muhammad Sharfuddin
>
> On 3/12/2018 5:58 PM, Klaus Wenninger wrote:
>> On 03/12/2018 01:44 PM, Muhammad Sharfuddin wrote:
>>> Hi Klaus,
>>>
>>> primitive sbd-stonith stonith:external/sbd \
>>>     op monitor interval=3000 timeout=20 \
>>>     op start interval=0 timeout=240 \
>>>     op stop interval=0 timeout=100 \
>>>     params sbd_device="/dev/mapper/sbd" \
>>>     meta target-role=Started
>> Makes more sense now.
>> Using pcmk_delay_max would probably be useful here to prevent a
>> fence-race.
>> That stonith-resource was not in your resource-list below ...
>>
>>> property cib-bootstrap-options: \
>>>     have-watchdog=true \
>>>     stonith-enabled=true \
>>>     no-quorum-policy=ignore \
>>>     stonith-timeout=90 \
>>>     startup-fencing=true
>> You've set no-quorum-policy=ignore for pacemaker. Whether this is a
>> good idea or not in your setup is written on another page.
>> But isn't dlm interfacing directly with corosync, so that it would
>> get the quorum state from there?
>> As you have 2-node set - probably on a 2-node cluster - this would,
>> after both nodes were down, wait for all nodes to come up first.
>>
>> Regards,
>> Klaus
>>
>>> # ps -eaf | grep sbd
>>> root 6129    1 0 17:35 ? 00:00:00 sbd: inquisitor
>>> root 6133 6129 0 17:35 ? 00:00:00 sbd: watcher: /dev/mapper/sbd - slot: 1 - uuid: 6e80a337-95db-4608-bd62-d59517f39103
>>> root 6134 6129 0 17:35 ? 00:00:00 sbd: watcher: Pacemaker
>>> root 6135 6129 0 17:35 ? 00:00:00 sbd: watcher: Cluster
>>>
>>> This cluster does not start ocfs2 resources when I first
>>> intentionally crashed (rebooted) both the nodes, then try to start
>>> the ocfs2 resource while one node is offline.
>>>
>>> To fix the issue, I have one permanent solution: bring the other
>>> (offline) node online and things get fixed automatically, i.e. the
>>> ocfs2 resources mount.
>>>
>>> --
>>> Regards,
>>> Muhammad Sharfuddin
>>>
>>> On 3/12/2018 5:25 PM, Klaus Wenninger wrote:
>>>> Hi Muhammad!
>>>>
>>>> Could you be a little bit more elaborate on your fencing setup!
>>>> I read about you using SBD but I don't see any
>>>> sbd-fencing-resource.
>>>> For the case you wanted to use watchdog-fencing with SBD, this
>>>> would require the stonith-watchdog-timeout property to be set.
>>>> But watchdog-fencing relies on quorum (without 2-node trickery)
>>>> and thus wouldn't work on a 2-node cluster anyway.
>>>> Didn't read through the whole thread - so I might be missing
>>>> something ...
>>>>
>>>> Regards,
>>>> Klaus
>>>>
>>>> On 03/12/2018 12:51 PM, Muhammad Sharfuddin wrote:
>>>>> Hello
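Klaus's pcmk_delay_max remark, applied to the sbd-stonith primitive
quoted above; the 30s maximum is an arbitrary illustration:

    primitive sbd-stonith stonith:external/sbd \
        op monitor interval=3000 timeout=20 \
        op start interval=0 timeout=240 \
        op stop interval=0 timeout=100 \
        params sbd_device="/dev/mapper/sbd" pcmk_delay_max=30s \
        meta target-role=Started
    # pcmk_delay_max adds a random delay (here 0-30s) before this
    # device fences, so in a split both nodes don't shoot each other
    # simultaneously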
[ClusterLabs] Antw: Re: single node fails to start the ocfs2 resource
Hi!

I didn't read the logs carefully, but I remember one pitfall (SLES 11):
if I formatted the filesystem when the OCFS2 services were not running,
I was unable to mount it; I had to reformat the filesystem while the
OCFS2 services were running. Maybe that helps.

Regards,
Ulrich

>>> "Gang He" wrote on 12.03.2018 at 06:59 in message
>>> <5aa687c802f9000ae...@prv-mh.provo.novell.com>:
> Hello Muhammad,
>
> Usually, an ocfs2 resource startup failure is caused by the mount
> command timing out (or hanging). The simple debugging method is:
> remove the ocfs2 resource from crm first, then mount this file system
> manually and see if the mount command times out or hangs.
> If this command hangs, please watch where the mount.ocfs2 process is
> hung via the "cat /proc/xxx/stack" command.
> If the backtrace stops in the DLM kernel module, the root cause is
> usually a cluster configuration problem.
>
> Thanks
> Gang
>
>> On 3/12/2018 7:32 AM, Gang He wrote:
>>> Hello Muhammad,
>>>
>>> I think this problem is not in ocfs2; the cause looks like the
>>> cluster quorum is missing.
>>> For a two-node cluster (unlike a three-node cluster), if one node
>>> is offline, quorum will be lost by default.
>>> So, you should configure the two-node related quorum settings
>>> according to the pacemaker manual.
>>> Then DLM can work normally, and the ocfs2 resource can start up.
>> Yes, it's configured accordingly; no-quorum-policy is set to
>> "ignore".
>>
>> property cib-bootstrap-options: \
>>     have-watchdog=true \
>>     stonith-enabled=true \
>>     stonith-timeout=80 \
>>     startup-fencing=true \
>>     no-quorum-policy=ignore
>>
>>> Thanks
>>> Gang
>>>
>>>> Hi,
>>>>
>>>> This two-node cluster starts resources when both nodes are online,
>>>> but does not start the ocfs2 resources when one node is offline.
>>>> E.g. if I gracefully stop the cluster resources, then stop the
>>>> pacemaker service on either node, and try to start the ocfs2
>>>> resource on the online node, it fails.
>>>>
>>>> logs:
>>>>
>>>> pipci001 pengine[17732]: notice: Start dlm:0#011(pipci001)
>>>> pengine[17732]: notice: Start p-fssapmnt:0#011(pipci001)
>>>> pengine[17732]: notice: Start p-fsusrsap:0#011(pipci001)
>>>> pipci001 pengine[17732]: notice: Calculated transition 2, saving inputs in /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>> pipci001 crmd[17733]: notice: Processing graph 2 (ref=pe_calc-dc-1520613202-31) derived from /var/lib/pacemaker/pengine/pe-input-339.bz2
>>>> crmd[17733]: notice: Initiating start operation dlm_start_0 locally on pipci001
>>>> lrmd[17730]: notice: executing - rsc:dlm action:start call_id:69
>>>> dlm_controld[19019]: 4575 dlm_controld 4.0.7 started
>>>> lrmd[17730]: notice: finished - rsc:dlm action:start call_id:69 pid:18999 exit-code:0 exec-time:1082ms queue-time:1ms
>>>> crmd[17733]: notice: Result of start operation for dlm on pipci001: 0 (ok)
>>>> crmd[17733]: notice: Initiating monitor operation dlm_monitor_6 locally on pipci001
>>>> crmd[17733]: notice: Initiating start operation p-fssapmnt_start_0 locally on pipci001
>>>> lrmd[17730]: notice: executing - rsc:p-fssapmnt action:start call_id:71
>>>> Filesystem(p-fssapmnt)[19052]: INFO: Running start for /dev/mapper/sapmnt on /sapmnt
>>>> kernel: [ 4576.529938] dlm: Using TCP for communications
>>>> kernel: [ 4576.530233] dlm: BFA9FF042AA045F4822C2A6A06020EE9: joining the lockspace group.
>>>> dlm_controld[19019]: 4629 fence work wait for quorum
>>>> dlm_controld[19019]: 4634 BFA9FF042AA045F4822C2A6A06020EE9 wait for quorum
>>>> lrmd[17730]: warning: p-fssapmnt_start_0 process (PID 19052) timed out
>>>> kernel: [ 4636.418223] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group event done -512 0
>>>> kernel: [ 4636.418227] dlm: BFA9FF042AA045F4822C2A6A06020EE9: group join failed -512 0
>>>> lrmd[17730]: warning: p-fssapmnt_start_0:19052 - timed out after 6ms
>>>> lrmd[17730]: notice: finished - rsc:p-fssapmnt action:start call_id:71 pid:19052 exit-code:1 exec-time:60002ms queue-time:0ms
>>>> kernel: [ 4636.420628] ocfs2: Unmounting device (254,1) on (node 0)
>>>> crmd[17733]: error: Result of start operation for p-fssapmnt on pipci001: Timed Out
>>>> crmd[17733]: warning: Action 11 (p-fssapmnt_start_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>> crmd[17733]: notice: Transition aborted by operation p-fssapmnt_start_0 'modify' on pipci001: Event failed
>>>> crmd[17733]: warning: Action 11 (p-fssapmnt_start_0) on pipci001 failed (target: 0 vs. rc: 1): Error
>>>> crmd[17733]: notice: Transition 2 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=6, Source=/var/lib/pacemaker/pengine/pe-input-339.bz2): Complete
>>>> pengine[17732]: notice: Watchdog will be used via SBD if fencing is required
>>>> pengine[17732]: notice: On loss of CCM