[ClusterLabs] qdevice up and running -- but questions

2020-04-11 Thread Eric Robinson
  1.  What command can I execute on the qdevice node which tells me which
client nodes are connected and alive?

  2.  In the output of the pcs qdevice status command, what is the meaning of...

Vote:   ACK (ACK)

  3.  In the output of the pcs quorum status command, what is the meaning of...

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
         1          1    A,V,NMW 001db03a
         2          1    A,V,NMW 001db03b (local)


--Eric

Disclaimer : This email and any files transmitted with it are confidential and 
intended solely for intended recipients. If you are not the named addressee you 
should not disseminate, distribute, copy or alter this email. Any views or 
opinions presented in this email are solely those of the author and might not 
represent those of Physician Select Management. Warning: Although Physician 
Select Management has taken reasonable precautions to ensure no viruses are 
present in this email, the company cannot accept responsibility for any loss or 
damage arising from the use of this email or attachments.
___
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/


Re: [ClusterLabs] Integrate external alerts with pcs cluster

2020-04-11 Thread Ajay Srivastava
Hi Strahil,

Yes. It's simple, but monitoring is done by an external service which sends an
alarm when the state of a hardware component changes. It's complex code, and
multiple devices are monitored by it. If I ask pacemaker to monitor the device,
this code has to be replicated in the resource agent's monitor action, which is
not an option. Instead, I plan to examine the alert in the external service and
fail the resource if the alarm is critical. I hope that clarifies the scenario.

By the way, how could I fail the resource from outside? I see that option in
the crm_resource CLI (--fail) but not in the pcs CLI.

Regards,
Ajay

From: Strahil Nikolov hunter86...@yahoo.com
Sent: Sat, 11 Apr 2020 18:50:25
To: Cluster Labs - All topics related to open-source clustering welcomed
users@clusterlabs.org, Ajay Srivastava ajay_srivast...@rediffmail.com
Subject: Re: [ClusterLabs] Integrate external alerts with pcs cluster

> On April 11, 2020 2:04:23 PM GMT+03:00, Ajay Srivastava
> ajay_srivastava@rediffmail.com wrote:
> >Hi,
> >
> >In my environment I have a pacemaker cluster running which has various
> >software services as resource agents. These services have a dependency
> >on some hardware components. There is a service which monitors the
> >hardware and sends alerts if anything is wrong with the hardware.
> >
> >My plan is to add a resource agent for each hardware component with
> >proper dependencies and fail this resource agent if I get a critical
> >alert from the hardware monitoring service. Please note that I do not
> >want to use the monitor functionality of the resource agent, as there is
> >a service which is already doing the same thing. I have two queries here:
> >
> >1) Does the approach look good? Is there a better way to implement it
> >in a pacemaker cluster?
> >2) I can find the --fail option in crm_resource but not in the pcs CLI.
> >What would be the equivalent command in pcs, as I am using the pcs CLI
> >to configure the cluster?
> >
> >Regards,
> >Ajay
>
> Hi Ajay,
>
> Sadly I don't get the logic.
>
> Based on your e-mail you want to create a cluster resource that will fail
> based on a hardware issue. Yet, are you going to put other cluster resources
> as a dependency (for example, an ordering constraint that stops some
> application in case some CPU has been stuck for 120s)?
>
> All the cluster needs is to know how to:
> - Start the resource
> - Monitor the liveliness of the resource
> - Stop the resource
>
> Best Regards,
> Strahil Nikolov
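
Ajay's plan in this message (examine each alert in the external service and
fail the resource when the alarm is critical) might be sketched roughly as
follows. This is only an illustration under stated assumptions: the function
names, the severity string "critical", and the resource and node names are
hypothetical, and the sketch only builds the crm_resource command rather than
running it. The --fail option is the one named in the thread; --resource and
--node are the usual crm_resource selectors, but they should be checked
against the installed Pacemaker version.

```python
from typing import List, Optional

# Hypothetical severity label; the real monitoring service may use another.
CRITICAL = "critical"


def build_fail_command(resource: str, node: str) -> List[str]:
    """Build (but do not run) a crm_resource call that marks a resource failed."""
    return ["crm_resource", "--fail", "--resource", resource, "--node", node]


def handle_alert(severity: str, resource: str, node: str) -> Optional[List[str]]:
    """Return the command to run for a critical alert, or None otherwise."""
    if severity.lower() != CRITICAL:
        # Non-critical alerts leave the cluster resource untouched.
        return None
    return build_fail_command(resource, node)


if __name__ == "__main__":
    # "hw-psu1" and "node1" are placeholder names for illustration only.
    cmd = handle_alert("critical", "hw-psu1", "node1")
    if cmd is not None:
        print(" ".join(cmd))
```

To act on the result, the external service could pass the returned list to
subprocess.run(cmd, check=True) on a cluster node. Keeping the command
construction separate from its execution makes the decision logic testable
without a running cluster.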


[ClusterLabs] Verifying DRBD Run-Time Configuration

2020-04-11 Thread Eric Robinson
If I want to know the current DRBD runtime settings such as timeout, ping-int, 
or connect-int, how do I check that? I'm assuming they may not be the same as 
what shows in the config file.

--Eric






Re: [ClusterLabs] Why Do Nodes Leave the Cluster?

2020-04-11 Thread Eric Robinson


Hi Strahil --

I hope you won't mind if I revive this old question. In your comments below, 
you suggested using a 1s token with a 1.2s consensus. I currently have 2-node 
clusters (will soon install a qdevice). I was reading in the corosync.conf man 
page where it says...

"For two node clusters, a consensus larger than the join timeout but less 
than token is safe. For three node or larger clusters, consensus should be 
larger than token."

Do you still think the consensus should be 1.2 * token in a 2-node cluster? Why 
is a smaller consensus considered safe for 2-node clusters? Should I use a 
larger consensus anyway?

--Eric
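
The man-page sentence quoted above can be restated as a small check. This is
a sketch, not anything corosync ships: the function name is hypothetical, and
the 50 ms default for the join timeout is an assumption based on the
corosync.conf defaults as I understand them. It encodes only the quoted
guidance and does not settle whether a larger consensus is preferable on two
nodes.

```python
def meets_quoted_guidance(nodes: int, token_ms: int, consensus_ms: int,
                          join_ms: int = 50) -> bool:
    """Return True if consensus satisfies the quoted man-page sentence."""
    if nodes < 2:
        raise ValueError("a cluster needs at least two nodes")
    if nodes == 2:
        # Two nodes: consensus only has to exceed the join timeout;
        # it is allowed to be smaller than token.
        return consensus_ms > join_ms
    # Three or more nodes: consensus should be larger than token.
    return consensus_ms > token_ms


if __name__ == "__main__":
    # 1s token with 1.2s consensus, the values discussed in this thread.
    print(meets_quoted_guidance(2, 1000, 1200))
    print(meets_quoted_guidance(3, 1000, 1200))
    # A consensus below token passes only in the two-node case.
    print(meets_quoted_guidance(2, 1000, 800))
    print(meets_quoted_guidance(3, 1000, 800))
```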


> -----Original Message-----
> From: Strahil Nikolov 
> Sent: Thursday, February 6, 2020 1:07 PM
> To: Eric Robinson ; Cluster Labs - All topics
> related to open-source clustering welcomed ;
> Andrei Borzenkov 
> Subject: RE: [ClusterLabs] Why Do Nodes Leave the Cluster?
>
> On February 6, 2020 7:35:53 PM GMT+02:00, Eric Robinson
>  wrote:
> >Hi Nikolov --
> >
> >> Defaults are 1s token, 1.2s consensus, which is too small.
> >> In SUSE, token is 10s, while consensus is 1.2 * token -> 12s.
> >> With these settings, the cluster will not react for 22s.
> >>
> >> I think it's a good start for your cluster.
> >> Don't forget to put the cluster in maintenance (pcs property set
> >> maintenance-mode=true) before restarting the stack, or even better,
> >> get some downtime.
> >>
> >> You can use the following article to run a simulation before removing
> >> the maintenance:
> >> https://www.suse.com/support/kb/doc/?id=7022764
> >>
> >
> >
> >Thanks for the suggestions. Any thoughts on timeouts for DRBD?
> >
> >--Eric
> >
>
> Hi Eric,
>
> The timeouts can be treated as 'how much time to wait before taking any
> action'. The workload is not very important (HANA is something different).
>
> You can try with 10s (token), 12s (consensus), and adjust if needed.
>
> Warning: Use a 3 node cluster or at least 2 drbd nodes + qdisk. The 2 node
> cluster is vulnerable to split brain, especially when one of the nodes is
> syncing (for example after patching) and the source is
> fenced/lost/disconnected. It's very hard to extract data from a semi-synced
> drbd.
>
> Also, if you need guidance for SELinux, I can point you to my guide in the
> CentOS forum.
>
> Best Regards,
> Strahil Nikolov
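
The arithmetic in the quoted advice (consensus = 1.2 * token, and no cluster
reaction until token plus consensus has elapsed) can be worked through in a
few lines. This is a minimal sketch: the helper names are hypothetical, and
the "token + consensus" reading of the 22s figure is taken from the quoted
message rather than from corosync itself.

```python
from typing import Optional


def default_consensus_ms(token_ms: int) -> int:
    """Consensus as 1.2 * token, kept in whole milliseconds."""
    return token_ms * 12 // 10


def reaction_delay_ms(token_ms: int, consensus_ms: Optional[int] = None) -> int:
    """Time before the cluster reacts, read as token + consensus."""
    if consensus_ms is None:
        consensus_ms = default_consensus_ms(token_ms)
    return token_ms + consensus_ms


if __name__ == "__main__":
    # 1000 ms is the default token mentioned in the thread, 10000 ms the
    # SUSE value that was suggested as a starting point.
    for token in (1000, 10000):
        consensus = default_consensus_ms(token)
        delay = reaction_delay_ms(token)
        print(f"token={token} ms consensus={consensus} ms delay={delay} ms")
```

With the suggested 10s token this gives a 12s consensus and a 22s delay,
matching the figures in the quoted message.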


[ClusterLabs] Integrate external alerts with pcs cluster

2020-04-11 Thread Ajay Srivastava
Hi,

In my environment I have a pacemaker cluster running which has various
software services as resource agents. These services have a dependency on some
hardware components. There is a service which monitors the hardware and sends
alerts if anything is wrong with the hardware.

My plan is to add a resource agent for each hardware component with proper
dependencies and fail this resource agent if I get a critical alert from the
hardware monitoring service. Please note that I do not want to use the monitor
functionality of the resource agent, as there is a service which is already
doing the same thing. I have two queries here:

1) Does the approach look good? Is there a better way to implement it in a
pacemaker cluster?
2) I can find the --fail option in crm_resource but not in the pcs CLI. What
would be the equivalent command in pcs, as I am using the pcs CLI to configure
the cluster?

Regards,
Ajay