Re: [Pacemaker] crm_resource -L not trustable right after restart

Andrew Beekhof Wed, 15 Jan 2014 19:54:15 -0800

On 16 Jan 2014, at 1:13 pm, Brian J. Murrell (brian) <br...@interlinx.bc.ca> 
wrote:


> On Thu, 2014-01-16 at 08:35 +1100, Andrew Beekhof wrote:
>> 
>> I know, I was giving you another example of when the cib is not completely 
>> up-to-date with reality.
> 
> Yeah, I understood that.  I was just countering with why that example is
> actually more acceptable.
> 
>> It may very well be partially started.
> 
> Sure.
> 
>> It's almost certainly not stopped which is what is being reported.
> 
> Right.  But until it is completely started (and ready to do whatever
> it's supposed to do), it might as well be considered stopped.  If you
> have to make a binary state out of stopped, starting, started, I think
> most people will agree that the states are stopped and started, and
> stopped is anything < started, since most things are not useful until
> they are fully started.
> 
>> You're not using the output to decide whether to perform some logic?
> 
> Nope.  Just reporting the state.  But that's difficult when you have two
> participants making positive assertions about state when one is not
> really in a position to do so.
> 
>> Because crm_mon is the more usual command to run right after startup
> 
> The problem with crm_mon is that it doesn't tell you where a resource is
> running.

What crm_mon are you looking at?
I see stuff like:

 virt-fencing   (stonith:fence_xvm):    Started rhos4-node3 
 Resource Group: mysql-group
     mysql-vip  (ocf::heartbeat:IPaddr2):       Started rhos4-node3 
     mysql-fs   (ocf::heartbeat:Filesystem):    Started rhos4-node3 
     mysql-db   (ocf::heartbeat:mysql): Started rhos4-node3 
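
For reporting purposes, output like the above can be turned into a resource-to-node map. The following is only a sketch: it assumes each resource line has the shape "name (agent): State node" as in the sample, and the function name resource_locations is made up for illustration. Other crm_mon versions may format these lines differently.

```python
import re

# One resource line in the sample looks like: name (agent): State node
# The node is optional so that a line with no node after the state still parses.
RESOURCE_LINE = re.compile(r"^\s*(\S+)\s+\(([^)]+)\):\s+(\S+)(?:\s+(\S+))?\s*$")


def resource_locations(crm_mon_text):
    """Map each resource name to (state, node); node is None if absent."""
    locations = {}
    for line in crm_mon_text.splitlines():
        match = RESOURCE_LINE.match(line)
        if match:  # group headers such as "Resource Group: ..." do not match
            name, _agent, state, node = match.groups()
            locations[name] = (state, node)
    return locations


# The sample quoted in this message.
SAMPLE = """\
 virt-fencing   (stonith:fence_xvm):    Started rhos4-node3
 Resource Group: mysql-group
     mysql-vip  (ocf::heartbeat:IPaddr2):       Started rhos4-node3
     mysql-fs   (ocf::heartbeat:Filesystem):    Started rhos4-node3
     mysql-db   (ocf::heartbeat:mysql): Started rhos4-node3
"""

for name, (state, node) in resource_locations(SAMPLE).items():
    print(name, state, node)
```

Feeding it the text of a one-shot run (crm_mon -1) instead of SAMPLE should work the same way, provided the line format matches.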


> 
>> (which would give you enough context to know things are still syncing).
> 
> That's interesting.  Would polling crm_mon be more efficient than
> polling the remote CIB with cibadmin -Q?

crm_mon in interactive mode subscribes to updates from the cib,
which is more efficient than repeatedly calling cibadmin or crm_mon.

> 
>> DC election happens at the crmd.
> 
> So would it be fair to say then that I should not trust the local CIB
> until DC election has finished or could there be latency between that
> completing and the CIB being refreshed?

After the join completes (which happens after the election or when a new node 
is found), then it is safe.
You can tell this by running crmadmin -S -H `uname -n` and looking for S_IDLE, 
S_POLICY_ENGINE or S_TRANSITION_ENGINE iirc
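
A rough sketch of that check, for anyone scripting it. The crmadmin arguments are copied from the command above and the three state names are the ones listed from memory, so verify both against your installed version. The helper names join_completed and local_cib_is_safe are hypothetical, and the check only looks for the state names anywhere in the output rather than assuming an exact output format.

```python
import socket
import subprocess

# States listed above (from memory) as meaning the join has completed.
SETTLED_STATES = ("S_IDLE", "S_POLICY_ENGINE", "S_TRANSITION_ENGINE")


def join_completed(crmadmin_output):
    """Return True if the output mentions any of the settled states."""
    return any(state in crmadmin_output for state in SETTLED_STATES)


def local_cib_is_safe():
    """Run the crmadmin command from this message against the local node."""
    node = socket.gethostname()  # stand-in for `uname -n`
    result = subprocess.run(
        ["crmadmin", "-S", "-H", node],
        capture_output=True,
        text=True,
    )
    # Check both streams, since where the status line is printed may vary.
    return join_completed(result.stdout + result.stderr)
```

Calling local_cib_is_safe() on a cluster node and waiting until it returns True before reading the local CIB is the intended use. It requires crmadmin to be on the PATH.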

> 
> If DC election completion is accurate, what's the best way to determine
> that has completed?

Ideally it doesn't happen when a node joins an existing cluster.


